How to ace Machine Learning Interviews: My Personal Playbook
Resources that helped me to prepare for ML interview rounds
Getting ready for a machine learning interview at a big tech company or startup? It can feel overwhelming with all the different rounds, from testing your broad knowledge to diving deep into specific problems, system design, and coding challenges. When I first started, I felt completely lost and had to dig deep to piece together the right resources.
Thatโs why Iโm writing this article โ to share the resources that helped me, so you donโt have to struggle through the similar confusion and can confidently navigate your Machine Learning interviews. ๐
Note: This article won't cover general software engineering interview rounds like DSA coding or specialized roles like Data Analyst, where the focus is on SQL or data-specific questions.
Letโs get started โ but first, a quick disclaimer1.
The Four Types of Rounds:
๐ Machine Learning Breadth: This round tests your broad knowledge across various ML topics. I usually spend about 35% of my prep time here.
๐ Machine Learning Depth: This round requires about 25% of my prep time, focusing on specialized topics and detailed case studies.
๐ ๏ธ Machine Learning System Design: This round evaluates your ability to design scalable ML systems. I allocate around 40% of my prep time here.
๐ป Machine Learning Coding: In this round, you'll tackle coding challenges around basic algorithms. Since this round is generally rare, I dedicate about 0-10% of my prep time (shared with Breadth).
Note: This breakdown will vary depending on your experience level and background. Adapt your preparation strategy to best fit your needs.
๐ Machine Learning Breadth
This round assesses an engineer's understanding of ML fundamentals across various topics. It's often a rapid-fire session where interviewers may jump between different topics or might ask general questions within an area.
A common mistake is relying solely on online questionnaires. Interviewers often reframe questions, making it essential to understand concepts deeply. For example, I was once asked how to handle imbalanced classes in a user-click dataset (where one class, like user click, is much less common than the other class). I suggested "oversampling," which can distort the dataset distribution and affect the model's accuracy. Instead, with a large dataset, it's better not to change the data distribution.
Topics vary by company. For instance, self-driving car companies may ask computer vision fundamentals for junior or senior roles, while other companies might focus on general ML fundamentals.
Fundamental Topics (All Levels)
Supervised Learning: Classification (Logistic Regression, SVMs, Decision Trees), Regression (Linear and Ridge regression)
Unsupervised Learning: Clustering (k-means, hierarchical, DBSCAN), Dimensionality Reduction, Latent Semantic Analysis (LSA)
Ensemble Methods: Random Forest (Bagging), Gradient Boosting Machines
Deep Learning: Multi-layer Perceptron, CNNs
Model Evaluation Metrics: Classification vs Regression evaluation metrics, Bias vs Variance Tradeoff
Loss Functions: Different types of loss functions, Regularization, Overfitting vs Underfitting
Feature Selection and Importance: Techniques to identify important features like Correlation analysis, recursive feature elimination, Lasso
Statistics: P-values, r-square, regression analysis, Naive Bayes, distributions, maximum likelihood estimation, A/B testing
๐ก
Highly Recommended Resources
General ML:
Deep Learning:
Udacity's Deep Learning Nanodegree
Coursera's Deep Learning Specialization
Statistics: Statistics for Machine Learning
"The Hundred-Page Machine Learning Book" by Andriy Burkov
Specialized Topics (Senior/Specialized Roles)
Time-Series and Sequential Data:
RNNs, LSTMs, Seq2Seq models: Understanding architectures designed for handling time-dependent and sequential data.
๐ก
Resources:
Good tutorials for RNNs and LSTMs and Seq2Seq modelNatural Language Processing (NLP):
Word Embeddings: Techniques like Word2Vec, GloVe.
Attention Mechanisms and Transformer Models
LLM Fundamentals: Becoming increasingly more relevant.
๐ก
Great articles:
Computer Vision:
Not an expert on CV. You can consider this course: Udacity Computer Vision Nanodegree.
Reinforcement Learning [Rare]:
Relevant for robotics company roles
๐ก
Resources: Reinforcement Learning Overview, Tutorial
๐ Next Steps
Once you understand the fundamentals, practice with sample interview questions, Some good resources for practicing sample questions:
๐ Machine Learning Depth
This round is particularly interesting and the most open-ended of all interview types. It is generally aimed for above entry-level jobs with some experience. The goal of this interview is threefold:
Theoretical and practical understanding of a specific problem space
Designing a valid experiment and analyzing results
Communication skills
Expect interviewers to have a significant background knowledge in this area. While it's challenging to prepare specifically for this round, understanding the general directions it can take can be helpful. Here are some common directions this interview can take:
Past Experience: Discussing your academic or industrial projects and their shortcomings. Be prepared to discuss your projects in detail.
Theoretical Background: Delves into the theoretical aspects of a specialized field, especially if the role requires such specialization.
Some examples of specializations and their helpful resources:
NLP Resource:
NLP MetaBlog by
LLMs Overview by
Computer Vision Resource:
Coursera: Computer Vision Basics
Reinforcement Learning Resource:
Github: Reinforcement Learning Overview
Learning to Rank Resource:
Towardsdatascience: Introduction to Ranking
Time-Series Forecasting Resource:
๐ ๏ธ Machine Learning System Design
This round resembles a typical software engineering system design interview, applying similar principles. You are given a product space (e.g., designing a YouTube recommendation ML system) and asked to define the problem, outline the design process, and communicate your thoughts, including tradeoffs.
Interviewers look for your problem-solving approach, thought process, and high-level design skills. During the interview, you may be asked to dive deeper into specific components. Generally most ML systems can be broken down into 5 critical components. Hereโs main focus of each component:
1๏ธโฃ Problem Definition
Clearly define the problem and formulate your hypothesis.
2๏ธโฃ Evaluation Metrics
Understand the difference between model performance metrics and business metrics.
Model Metrics: Precision, Recall, MSE, etc.
Business Metrics: Revenue, number of clicks, number of frauds detected, etc.
3๏ธโฃ Feature and Data
Handle offline and real-time data streaming pipelines and preprocess raw data to create features.
4๏ธโฃ Model Development
Training:
Select appropriate models and discuss your rationale behind model choices. Provide details on the training process (one-time vs. recurrent vs. online training).
Evaluation:
Choose evaluation datasets and effectively evaluate model performance.
5๏ธโฃ Inference Service and Deployment
Develop a deployment strategy, including the choice of technologies and ensuring the scalability of the inference service. Discuss your approach for automated deployment, evaluation, and A/B testing.
๐ก
Highly Recommended Courses
๐ป Machine Learning Coding
I've only encountered a Machine Learning-specific coding round once. This round is uncommon but more frequently seen in startups. Unless it's explicitly mentioned, I don't typically prepare for it. Always confirm with your recruiter if there will be ML coding questions in any of the rounds. If not, focus your preparation on other sections. If there is a specific ML coding round, reallocate some time from ML breadth to practice coding for a few days. The strategy is simple: merge your ML coding preparation with the fundamentals in ML breadth, practicing coding for basic models.
๐ก
Helpful Resources
Github: 100 Days of ML Code
Kaggle: Great platform where people share their code. For example, House Prices Prediction using TFDF
AlgoExpert: For visual learners, check out AlgoExpert's ML Coding Questions. I tried it once and found it helpful.
Parting note ๐
Preparing for machine learning interviews can be a challenging journey, but with the right resources and planning, you can navigate it confidently. Remember, every interview is a learning experience, bringing you one step closer to your dream job. Stay curious, keep learning, and don't hesitate to ask for clarifications from your recruiter. Good luck! ๐
As a parting note, I want to share few more resources for quick references:
ML Cheatsheet for quick revisions: https://sites.google.com/view/datascience-cheat-sheetsย
ML Interview book: https://huyenchip.com/ml-interviews-book/ย by Chip Huyen
๐ Good Reads for the weekends
The problem with the problem with popularity bias by
TinyLlama paper read by
What does it take to be a data professional ? by
Poor Communication Blocking Promo? by
Is it still worth it for Big Tech ? by
Please let me know if I missed anything in comments. If you like the article, please hit โค๏ธ button. Consider subscribing if you havenโt already.
Disclaimer: This blog is based on personal experiences and publicly available resources. Please note, the opinions expressed are solely my own and do not represent those of my past or current employers. Always refer to official resources and guidelines from hiring companies for the most accurate information.
This was a great read! Simple, yet useful.
Am I understanding the ML breadth vs depth difference right:
1. In breadth, you cover most topics in a quick way
2. In depth, you dive into your resume and project, and maybe deep dive into a specific ML topic the company cares about? Feels more like a resume deep-dive round.