A past coaching client of mine interviewed for an entry-level Data Science job in Los Angeles that paid $110k per year.

Here are the 19 interview questions they were asked:

(how many of these could you answer?)

𝐒𝐭𝐚𝐭𝐢𝐬𝐭𝐢𝐜𝐬 🔢

1. Explain the difference between population variance and sample variance

2. Describe a p-value to a 13-year-old (8th grader)

3. What does an R-squared value of 0.9 mean?
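For question 1, here's a minimal sketch of the one thing that changes between the two formulas: the denominator. The data values are made up for illustration, and the standard library's `statistics.pvariance` and `statistics.variance` are only used as a cross-check.

```python
import statistics

# Hypothetical measurements, made up for illustration.
data = [2.0, 4.0, 4.0, 5.0, 10.0]

n = len(data)
mean = sum(data) / n
squared_deviations = sum((x - mean) ** 2 for x in data)

# Population variance: divide by n, because the data is the whole population.
population_variance = squared_deviations / n

# Sample variance: divide by n - 1 (Bessel's correction), because the mean
# was estimated from the same sample, which costs one degree of freedom.
sample_variance = squared_deviations / (n - 1)

# Cross-check against the standard library.
print(population_variance, statistics.pvariance(data))
print(sample_variance, statistics.variance(data))
```

The sample variance is always the larger of the two for the same data, and the gap shrinks as n grows.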
𝐒𝐭𝐚𝐭𝐢𝐬𝐭𝐢𝐜𝐬 (Continued) 🔢

4. What are some metrics to evaluate how predictive your logistic regression model is?

5. Can you explain precision vs. recall? How does that relate to an ROC curve?
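For question 5, one way to connect the pieces: recall is the same quantity as the true positive rate, which is the y-axis of an ROC curve, while the x-axis is the false positive rate. Precision does not appear on an ROC curve at all. A rough sketch, with labels, scores, and the 0.5 threshold all invented for illustration:

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, FN, TN when predicting positive for score >= threshold."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical labels and model scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]

tp, fp, fn, tn = confusion_counts(y_true, scores, threshold=0.5)

precision = tp / (tp + fp)            # of predicted positives, how many were right
recall = tp / (tp + fn)               # of actual positives, how many were found
false_positive_rate = fp / (fp + tn)  # of actual negatives, how many were flagged

# One point on the ROC curve is (false_positive_rate, recall).
# Sweeping the threshold from 1 down to 0 traces out the full curve.
print(precision, recall, false_positive_rate)
```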
𝐒𝐐𝐋 ⚙️

6. What is a database view?

7. What's the difference between a primary key & a foreign key?

8. Have you had to troubleshoot a slow query before? How did you do it?

9. When should you use a Sub-Query vs. CTE?
𝐒𝐐𝐋 (Continued) ⚙️

10. Can you write a SQL query to find the 3-day rolling average?

(very similar to this Twitter SQL interview question on DataLemur)

https://t.co/ssEQvcrKgo
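For question 10, one possible approach is a window function with a three-row frame. The sketch below runs the SQL through Python's built-in `sqlite3` module so it is easy to try. The `daily_tweets` table, its columns, and the data are hypothetical (not the schema from the linked question), and it assumes SQLite 3.25 or newer for window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per user per day.
conn.execute(
    "CREATE TABLE daily_tweets (user_id INTEGER, tweet_date TEXT, tweet_count INTEGER)"
)
conn.executemany(
    "INSERT INTO daily_tweets VALUES (?, ?, ?)",
    [
        (1, "2022-06-01", 2),
        (1, "2022-06-02", 1),
        (1, "2022-06-03", 3),
        (1, "2022-06-04", 4),
        (2, "2022-06-01", 5),
        (2, "2022-06-02", 1),
    ],
)

query = """
SELECT user_id,
       tweet_date,
       ROUND(AVG(tweet_count) OVER (
           PARTITION BY user_id
           ORDER BY tweet_date
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ), 2) AS rolling_avg_3d
FROM daily_tweets
ORDER BY user_id, tweet_date
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that `ROWS BETWEEN 2 PRECEDING AND CURRENT ROW` counts rows, not calendar days, so this only equals a true 3-day average when every user has exactly one row per day with no gaps. That caveat is worth raising in the interview.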
𝐒𝐐𝐋 (Continued) ⚙️

11. Can you write a SQL query to find the most frequently purchased pairs of items?

(very similar to the Walmart SQL Interview question on DataLemur)

https://t.co/J8rwcjp8P5
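For question 11, a common pattern is a self-join on the order ID, keeping each pair once by requiring the first item to sort before the second. This is a sketch under assumptions: the `order_items` table and its data are hypothetical (not the schema from the linked question), and each item is assumed to appear at most once per order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per item in each order.
conn.execute("CREATE TABLE order_items (order_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?)",
    [
        (1, "bread"), (1, "butter"), (1, "jam"),
        (2, "bread"), (2, "butter"),
        (3, "bread"), (3, "jam"),
        (4, "butter"), (4, "bread"),
    ],
)

query = """
SELECT a.item AS item_1,
       b.item AS item_2,
       COUNT(*) AS times_bought_together
FROM order_items AS a
JOIN order_items AS b
  ON a.order_id = b.order_id
 AND a.item < b.item          -- keeps (bread, butter), drops (butter, bread) and self-pairs
GROUP BY a.item, b.item
ORDER BY times_bought_together DESC, item_1, item_2
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Adding `LIMIT` to the query would return only the top pairs.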
𝐏𝐲𝐭𝐡𝐨𝐧 🐍

12. Which Python libraries do you use?

13. What are some steps you take to clean dirty data?

14. In this dataframe, how would you remove all rows that met X condition?
𝐏𝐲𝐭𝐡𝐨𝐧 (Continued) 🐍

15. Can you write code to find this value in a JSON object?

16. After loading the IRIS dataset via scikit-learn, can you build a simple regression model, and interpret the output?
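For question 15, the thread doesn't say what the JSON looked like, so here's a sketch that assumes the value sits under a known key at an unknown nesting depth. The `find_values` helper, the sample document, and the key names are all made up for illustration.

```python
import json


def find_values(node, target_key):
    """Yield every value stored under target_key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == target_key:
                yield value
            # Keep descending: the key may also appear deeper inside this value.
            yield from find_values(value, target_key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, target_key)


# Hypothetical JSON payload.
raw = """
{
  "user": {"id": 7, "profile": {"email": "a@example.com"}},
  "friends": [
    {"id": 9, "profile": {"email": "b@example.com"}}
  ]
}
"""

doc = json.loads(raw)
emails = list(find_values(doc, "email"))
ids = list(find_values(doc, "id"))
print(emails, ids)
```

If the path is known up front, plain indexing such as `doc["user"]["profile"]["email"]` is simpler. The recursive version is for when the structure is not known in advance.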
𝐌𝐢𝐬𝐜𝐞𝐥𝐥𝐚𝐧𝐞𝐨𝐮𝐬 🤷‍♂️

17. What are some ways you'd improve this sample Tableau dashboard, so a business stakeholder could get more value from it?

18. What metrics do you think are important to track for our business line? Do you see any potential flaws in those metrics?
𝐌𝐢𝐬𝐜𝐞𝐥𝐥𝐚𝐧𝐞𝐨𝐮𝐬 (Continued) 🤷‍♂️

19. Walk us through 2 of your past work projects.

What was the most challenging technical and non-technical part of each project?
How many of these 19 questions could you answer?

Would you be able to land the $110k Data Science job?

BTW if you enjoyed this thread:

1. Follow me @NickSinghTech for more data interview tips!
2. RT the tweet below to share this thread & challenge your followers! https://t.co/GUkhNwMcRu
Looking for resources to study for #DataScience & #MachineLearning interviews?

Follow this 12-Week Roadmap 🧵👇

https://t.co/NLJ4oU6NbL
Also, don't forget that #SQL is a KEY part of the Data Science and Data Analyst interview process.

Here's the 30-Day Plan to go from SQL Zero to SQL HERO!

🧵👇

https://t.co/7uRtIH1Y5F
