Newsletter #5: ML, MLOps, Scaling Engineering Teams and More
Similarity Clustering at Stripe, Airbnb ML-Powered Search Ranking, MLOps, Lessons Learned from Scaling Engineering Teams & More
Who knows you better than you do: your manager, a recruiter hiring for a new role, the promotion committee, or your partially out-of-date resume?
We have all answered the "tell me about yourself" question. We have all had to prove to someone at some point in our careers that we deserved something: a job, a promotion, a raise. And we'll prove it again and again.
I recently started keeping a list of my accomplishments and successes at my job to keep track of my growth and the things I'm proud of. I've learned that when it comes to telling people about me, nobody can do it better than I can.
An accomplishment list is an excellent reminder of one's successes. Keeping a list of your achievements prepares you to advocate for yourself and to craft a better resume.
As you read this technical newsletter today, consider cultivating a non-technical habit: write down your achievements every month, because you're your own best hype person.
Without further ado, let's dive into what I have got for you this month.
Machine Learning (ML)
Monitoring machine learning models
Monitoring machine learning models is not the same as traditional software monitoring and observability. In ML, you monitor not only your code and configurations but also your model and data. A change in any of these can affect the behavior of your ML system. In this post, you'll learn fundamental principles for monitoring ML systems.
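To make "monitor your data" concrete, here is a minimal sketch of one common drift check, the population stability index (PSI), which compares how a feature is distributed in training data versus live traffic. This is my own illustration, not something taken from the linked post. The function name, the bin count, and the 0.2 alert threshold (a widely used rule of thumb) are all assumptions.

```python
import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float],
    live: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Compare two samples of one numeric feature; larger means more drift."""
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if the range is zero.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range land in the first or last bin.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        total = len(values)
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / total, eps) for c in counts]

    ref_shares = bucket_shares(reference)
    live_shares = bucket_shares(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, live_shares))


# Hypothetical threshold: a common rule of thumb, not a universal constant.
DRIFT_ALERT_THRESHOLD = 0.2


def has_drifted(reference: Sequence[float], live: Sequence[float]) -> bool:
    return population_stability_index(reference, live) > DRIFT_ALERT_THRESHOLD
```

In practice you would run a check like this per feature on a schedule and alert when it trips, alongside the usual code and infrastructure metrics.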
Airbnb machine learning-powered search ranking
Mihajlo Grbovic, a principal data scientist at Airbnb, wrote about how Airbnb built and iterated on a machine learning search ranking platform for a new two-sided marketplace and how that platform helped the marketplace grow.
Using similarity clustering to catch fraud rings at Stripe
A payment system like Stripe is a natural target for payment fraud and cybercrime. Fraudsters have grown sophisticated, and to counter them at scale, you need a predictive machine learning model capable of detecting their schemes. That's why I like this article on how Stripe uses similarity clustering to catch fraud rings.
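For a feel of what similarity clustering means, here is a toy sketch. It is not Stripe's method: I have swapped in a simple Jaccard similarity over shared attributes plus union-find to group linked accounts, and the account names, attribute strings, and 0.5 threshold are all made up for illustration.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of attributes two accounts have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_accounts(
    accounts: Dict[str, Set[str]], threshold: float = 0.5
) -> List[Set[str]]:
    """Group accounts that are linked, directly or transitively, by similarity."""
    parent = {name: name for name in accounts}

    def find(x: str) -> str:
        # Walk up to the cluster root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Compare every pair; fine for a toy example, too slow for real traffic.
    for a, b in combinations(accounts, 2):
        if jaccard(accounts[a], accounts[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: Dict[str, Set[str]] = {}
    for name in accounts:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())
```

Accounts that share an IP address and card details end up in one cluster even if their emails differ, which is the intuition behind spotting a ring rather than judging each account on its own.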
Ensuring capacity safety of microservices using ML
An organization like Uber, with thousands of microservices, needs to predict each service's capacity requirements to prevent capacity-related outages. Here is another interesting read, from Ranjib Dey, on how Uber uses ML to plan capacity across its microservices.
Ten commandments of MLOps
MLOps is an intersection of ML, DevOps, and Data Engineering. You can see it as a set of practices that provide determinism, scalability, agility, and governance in the model development and deployment pipeline. Here are ten practices for infrastructure that allows you to deploy ML applications in a durable, hassle-free, and flexible manner.
How Stripe trains ML models with Kubernetes
How do you give every team the tools they need to train their models without requiring them to operate their own infrastructure at scale? If your organization is like Stripe, which serves millions of businesses worldwide, it's important that your teams have a stable and fast ML pipeline so they can continuously update and train new models in response to change. This is why you should read how Stripe rapidly trains machine learning models with Kubernetes.
Data selection for ML - A challenge for automated driving data
Computer vision applications depend heavily on large amounts of high-quality data, but some data adds more value to the training process than the rest. Given the sheer volume of data available for training, it's crucial to be able to identify the high-quality data and take advantage of it. In this post, Mark Pfeiffer writes about this data selection problem, one that computer vision engineers commonly face.
Engineering
Lessons from scaling engineering teams through euphoria and horror
In startups, nothing is more important than scale, and failure to scale can be a catastrophe. Tim Howes, a director of engineering at Facebook, helped scale one engineering team to 650 in just 18 months and, at another company, scaled a team from 150 to 300. In this post, he shares some impactful lessons he has learned through euphoria and horror.
Building Twitter's ad platform architecture for the future
Good software architecture is adaptable, but only up to a point. Eventually you have to rethink your architecture entirely to meet changing demands. Twitter ran its Adserver for ten years until adapting it was no longer an option. That's why I like this post on how Twitter built its new ad platform for the future.
Deleting data distributed throughout your microservices architecture
The GDPR introduces a right for individuals to have personal data erased. This implies that your architecture should be able to accommodate data erasure when users request it. Deleting data across microservices is not trivial. Data may be distributed across multiple services or exported to offline snapshots, so a full deletion often requires coordination between systems and teams. This post from the Twitter engineering blog presents some solutions that could spark ideas when designing your microservice architecture.
Think in tradeoffs
When it comes to making big engineering decisions, it's important not to make them superficially; instead, critical decisions should be viewed through the lens of tradeoffs.
Libraries
Diagrams
Diagrams is a tool that lets you draw cloud system architecture in Python code. It was created for prototyping a new system architecture without any design tools. You can also use it to describe or visualize an existing system architecture.
Thanks for reading
Thanks for reading! If you like this newsletter and want to support it, please share it with friends or buy me a coffee.
HubofML delivers a curated newsletter for machine learning engineers, data scientists, software engineers, architects, tech leads, and more. If you missed the last newsletter, you can catch up here.
Cheers 🥂