
Notes: Learning Machine Learning

February 1, 2023

Motivated by the ChatGPT release and the new gold rush, I've decided to spend some time refreshing my ML fundamentals. Back in 2015-2016 (before TensorFlow) I worked on what is now called MLOps for a large multi-label classification model for Google Maps, implementing distributed data processing pipelines, model versioning, deployment, and data and inference observability, all in C++. But I've since lost touch with the modern ecosystem. In grad school I took several ML and classical AI courses, but I haven't had to revisit those fundamentals in a long time.

I'll link notes and code below as I go along.

Edit: This project is complete. See the retrospective below.

Table of Contents

  1. Google's Machine Learning Crash Course
  2. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
  3. Independent practice
  4. Coursera Machine Learning Specialization
  5. Retrospective

Google's Machine Learning Crash Course

Course: https://developers.google.com/machine-learning/crash-course

Status: Done

Review:

This ~15-hour, video-heavy course provides a broad overview of the field. The emphasis is on linear and logistic regression, neural networks, and applied/engineering concerns. There are also some "programming exercises" that provide ~zero learning value: you do little more than read and execute existing TensorFlow code. Still, this is a very good and information-dense course that I would highly recommend as an introduction.
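To give a sense of the level, here's a minimal sketch of the kind of Keras linear regression the exercises have you run. This is my own illustration, not code from the course; the synthetic data and hyperparameters are made up:

    import numpy as np
    import tensorflow as tf

    # Synthetic data: y = 3x + 2 plus a little noise.
    x = np.random.uniform(-1, 1, size=(200, 1)).astype(np.float32)
    y = (3 * x + 2 + np.random.normal(0, 0.1, size=(200, 1))).astype(np.float32)

    # A single Dense unit is exactly a linear regression model.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
                  loss="mean_squared_error")
    model.fit(x, y, epochs=50, verbose=0)

    print(model.layers[0].get_weights())  # weights approach [[3.0]] and [2.0]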

Notes:

  1. Framing
  2. Descending into ML
  3. Reducing Loss
  4. First Steps with TensorFlow
  5. Real Datasets
  6. Generalization
  7. Training, Validation, Test Sets
  8. Representation
  9. Feature Crosses
  10. Regularization: Simplicity
  11. Logistic Regression
  12. Classification
  13. Regularization: Sparsity
  14. Neural Networks
  15. Training Neural Networks
  16. Multi-Class Neural Networks
  17. Embeddings
  18. ML Engineering
  19. ML Fairness
  20. Real-World Examples
  21. Guidelines

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow

Book: https://www.oreilly.com/library/view/hands-on-machine-learning/9781098125967/

Status: Abandoned. Fantastic book, but it goes much deeper into library/tool usage and covers far more applications than I was looking for. Still, I think the first few chapters and their exercises helped me develop a reasonable understanding of the ML workflow.
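To illustrate what I mean by workflow, here's a minimal scikit-learn sketch of the split/preprocess/train/evaluate loop those early chapters drill. Again my own illustration, not code from the book; the dataset is synthetic:

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Synthetic regression data standing in for a real dataset.
    rng = np.random.default_rng(42)
    X = rng.uniform(-1, 1, size=(500, 3))
    y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(0, 0.1, size=500)

    # Hold out a test set, chain preprocessing and the model in a
    # pipeline, train, and evaluate on the held-out data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    model = make_pipeline(StandardScaler(), LinearRegression())
    model.fit(X_train, y_train)
    rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
    print(f"test RMSE: {rmse:.3f}")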

Notes and colabs:

Independent practice

Coursera Machine Learning Specialization

Course: https://www.deeplearning.ai/courses/machine-learning-specialization/

Status: Done

Review:

Great course for learning the algorithms and ideas in more detail than the Google crash course. The heavier dose of math (backpropagation from scratch and so on) gave me a better sense of how things work than just playing around with Kaggle problems. Andrew Ng's comments and tips, drawn from his extensive experience, were also a nice feature. Unfortunately, the programming labs were almost entirely useless - you implement a tiny piece of an algorithm by literally translating a couple of equations into Python. It would have been much more interesting if the labs required you to build up the entire pipeline and algorithm from scratch (or with hints). If I hadn't worked through the first couple of chapters of HOML, I wouldn't have had any idea how the labs worked.
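For concreteness, a typical lab boils down to something like this NumPy sketch of gradient descent for univariate linear regression - my own reconstruction of the flavor, not the course's actual code; data and hyperparameters are made up:

    import numpy as np

    def gradient_descent(x, y, alpha=0.1, iters=1000):
        """Fit y ~ w*x + b by minimizing mean squared error."""
        w, b = 0.0, 0.0
        m = len(x)
        for _ in range(iters):
            err = (w * x + b) - y        # prediction error, f(x) - y
            w -= alpha * (err @ x) / m   # w := w - alpha * dJ/dw
            b -= alpha * err.sum() / m   # b := b - alpha * dJ/db
        return w, b

    x = np.linspace(-1, 1, 100)
    y = 3 * x + 2
    print(gradient_descent(x, y))  # converges to roughly (3.0, 2.0)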

Notes and colabs:

Retrospective

I spent a year on this project (Feb 2023 to Feb 2024), with almost all of the work happening in the first six months (before I relocated to France over the summer and acquired a new set of problems to deal with). The effort absolutely paid off professionally. Even though I didn't transition to AI work, my current company does a lot of ML, and the time I spent on this material made it possible for me to understand which approaches we use, why, and what their tradeoffs are; to identify the largest ML engineering challenges we're facing; and to have reasonable conversations with team members about their work, its difficulty, and its impact. It also helped me map my (existing but very outdated) ML knowledge and experience onto the modern ecosystem.

My next step is to dig further into Deep Learning and build stuff. I'll write a new post about that :)
