
MLOps Engineering at Scale
Ebook · 697 pages · 4 hours


About this ebook

Dodge costly and time-consuming infrastructure tasks, and rapidly bring your machine learning models to production with MLOps and pre-built serverless tools!

In MLOps Engineering at Scale you will learn:

    Extracting, transforming, and loading datasets
    Querying datasets with SQL
    Understanding automatic differentiation in PyTorch
    Deploying model training pipelines as a service endpoint
    Monitoring and managing your pipeline’s life cycle
    Measuring performance improvements

MLOps Engineering at Scale shows you how to put machine learning into production efficiently by using pre-built services from AWS and other cloud vendors. You’ll learn how to rapidly create flexible and scalable machine learning systems without laboring over time-consuming operational tasks or taking on the costly overhead of physical hardware. Following a real-world use case for calculating taxi fares, you will engineer an MLOps pipeline for a PyTorch model using AWS serverless capabilities.

About the technology
A production-ready machine learning system includes efficient data pipelines, integrated monitoring, and means to scale up and down based on demand. Using cloud-based services to implement ML infrastructure reduces development time and lowers hosting costs. Serverless MLOps eliminates the need to build and maintain custom infrastructure, so you can concentrate on your data, models, and algorithms.

About the book
MLOps Engineering at Scale teaches you how to implement efficient machine learning systems using pre-built services from AWS and other cloud vendors. This easy-to-follow book guides you step-by-step as you set up your serverless ML infrastructure, even if you’ve never used a cloud platform before. You’ll also explore tools like PyTorch Lightning, Optuna, and MLFlow that make it easy to build pipelines and scale your deep learning models in production.

What's inside

    Reduce or eliminate ML infrastructure management
    Learn state-of-the-art MLOps tools like PyTorch Lightning and MLFlow
    Deploy training pipelines as a service endpoint
    Monitor and manage your pipeline’s life cycle
    Measure performance improvements

About the reader
Readers need to know Python, SQL, and the basics of machine learning. No cloud experience required.

About the author
Carl Osipov implemented his first neural net in 2000 and has worked on deep learning and machine learning at Google and IBM.

Table of Contents

PART 1 - MASTERING THE DATA SET
1 Introduction to serverless machine learning
2 Getting started with the data set
3 Exploring and preparing the data set
4 More exploratory data analysis and data preparation
PART 2 - PYTORCH FOR SERVERLESS MACHINE LEARNING
5 Introducing PyTorch: Tensor basics
6 Core PyTorch: Autograd, optimizers, and utilities
7 Serverless machine learning at scale
8 Scaling out with distributed training
PART 3 - SERVERLESS MACHINE LEARNING PIPELINE
9 Feature selection
10 Adopting PyTorch Lightning
11 Hyperparameter optimization
12 Machine learning pipeline
Language: English
Publisher: Manning
Release date: Mar 22, 2022
ISBN: 9781638356509

    Book preview

    MLOps Engineering at Scale - Carl Osipov

    MLOps Engineering at Scale

    CARL OSIPOV

    To comment go to liveBook


    Manning

    Shelter Island

    For more information on this and other Manning titles go to

    www.manning.com

    Copyright

    For online information and ordering of these and other Manning books, please visit www.manning.com. The publisher offers discounts on these books when ordered in quantity.

    For more information, please contact

    Special Sales Department

    Manning Publications Co.

    20 Baldwin Road

    PO Box 761

    Shelter Island, NY 11964

    Email: [email protected]

    ©2022 by Manning Publications Co. All rights reserved.

    No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

    Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

    ♾ Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

    ISBN: 9781617297762

    contents

    preface

    acknowledgments

    about this book

    about the author

    about the cover illustration

    Part 1 Mastering the data set

    1 Introduction to serverless machine learning

    1.1 What is a machine learning platform?

    1.2 Challenges when designing a machine learning platform

    1.3 Public clouds for machine learning platforms

    1.4 What is serverless machine learning?

    1.5 Why serverless machine learning?

    Serverless vs. IaaS and PaaS

    Serverless machine learning life cycle

    1.6 Who is this book for?

    What you can get out of this book

    1.7 How does this book teach?

    1.8 When is this book not for you?

    1.9 Conclusions

    2 Getting started with the data set

    2.1 Introducing the Washington, DC taxi rides data set

    What is the business use case?

    What are the business rules?

    What is the schema for the business service?

    What are the options for implementing the business service?

    What data assets are available for the business service?

    Downloading and unzipping the data set

    2.2 Starting with object storage for the data set

    Understanding object storage vs. filesystems

    Authenticating with Amazon Web Services

    Creating a serverless object storage bucket

    2.3 Discovering the schema for the data set

    Introducing AWS Glue

    Authorizing the crawler to access your objects

    Using a crawler to discover the data schema

    2.4 Migrating to columnar storage for more efficient analytics

    Introducing column-oriented data formats for analytics

    Migrating to a column-oriented data format

    3 Exploring and preparing the data set

    3.1 Getting started with interactive querying

    Choosing the right use case for interactive querying

    Introducing AWS Athena

    Preparing a sample data set

    Interactive querying using Athena from a browser

    Interactive querying using a sample data set

    Querying the DC taxi data set

    3.2 Getting started with data quality

    From garbage in, garbage out to data quality

    Before starting with data quality

    Normative principles for data quality

    3.3 Applying VACUUM to the DC taxi data

    Enforcing the schema to ensure valid values

    Cleaning up invalid fare amounts

    Improving the accuracy

    3.4 Implementing VACUUM in a PySpark job

    4 More exploratory data analysis and data preparation

    4.1 Getting started with data sampling

    Exploring the summary statistics of the cleaned-up data set

    Choosing the right sample size for the test data set

    Exploring the statistics of alternative sample sizes

    Using a PySpark job to sample the test set

    Part 2 PyTorch for serverless machine learning

    5 Introducing PyTorch: Tensor basics

    5.1 Getting started with tensors

    5.2 Getting started with PyTorch tensor creation operations

    5.3 Creating PyTorch tensors of pseudorandom and interval values

    5.4 PyTorch tensor operations and broadcasting

    5.5 PyTorch tensors vs. native Python lists

    6 Core PyTorch: Autograd, optimizers, and utilities

    6.1 Understanding the basics of autodiff

    6.2 Linear regression using PyTorch automatic differentiation

    6.3 Transitioning to PyTorch optimizers for gradient descent

    6.4 Getting started with data set batches for gradient descent

    6.5 Data set batches with PyTorch Dataset and DataLoader

    6.6 Dataset and DataLoader classes for gradient descent with batches

    7 Serverless machine learning at scale

    7.1 What if a single node is enough for my machine learning model?

    7.2 Using IterableDataset and ObjectStorageDataset

    7.3 Gradient descent with out-of-memory data sets

    7.4 Faster PyTorch tensor operations with GPUs

    7.5 Scaling up to use GPU cores

    8 Scaling out with distributed training

    8.1 What if the training data set does not fit in memory?

    Illustrating gradient accumulation

    Preparing a sample model and data set

    Understanding gradient descent using out-of-memory data shards

    8.2 Parameter server approach to gradient accumulation

    8.3 Introducing logical ring-based gradient descent

    8.4 Understanding ring-based distributed gradient descent

    8.5 Phase 1: Reduce-scatter

    8.6 Phase 2: All-gather

    Part 3 Serverless machine learning pipeline

    9 Feature selection

    9.1 Guiding principles for feature selection

    Related to the label

    Recorded before inference time

    Supported by abundant examples

    Expressed as a number with a meaningful scale

    Based on expert insights about the project

    9.2 Feature selection case studies

    9.3 Feature selection using guiding principles

    Related to the label

    Recorded before inference time

    Supported by abundant examples

    Numeric with meaningful magnitude

    Bring expert insight to the problem

    9.4 Selecting features for the DC taxi data set

    10 Adopting PyTorch Lightning

    10.1 Understanding PyTorch Lightning

    Converting PyTorch model training to PyTorch Lightning

    Enabling test and reporting for a trained model

    Enabling validation during model training

    11 Hyperparameter optimization

    11.1 Hyperparameter optimization with Optuna

    Understanding loguniform hyperparameters

    Using categorical and log-uniform hyperparameters

    11.2 Neural network layers configuration as a hyperparameter

    11.3 Experimenting with the batch normalization hyperparameter

    Using Optuna study for hyperparameter optimization

    Visualizing an HPO study in Optuna

    12 Machine learning pipeline

    12.1 Describing the machine learning pipeline

    12.2 Enabling PyTorch-distributed training support with Kaen

    Understanding PyTorch-distributed training settings

    12.3 Unit testing model training in a local Kaen container

    12.4 Hyperparameter optimization with Optuna

    Enabling MLFlow support

    Using HPO for DcTaxiModel in a local Kaen provider

    Training with the Kaen AWS provider

    Appendix A Introduction to machine learning

    Appendix B Getting started with Docker

    index

    front matter

    preface

    A useful piece of feedback that I got from a reviewer of this book was that it became a cheat code for them to scale the steep MLOps learning curve. I hope that the content of this book will help you become a better informed practitioner of machine learning engineering and data science, as well as a more productive contributor to your projects, your team, and your organization.

    In 2021, major technology companies are vocal about their efforts to democratize artificial intelligence (AI) by making technologies like deep learning more accessible to a broader population of scientists and engineers. Regrettably, the democratization approach taken by the corporations focuses too much on core technologies and not enough on the practice of delivering AI systems to end users. As a result, machine learning (ML) engineers and data scientists are well prepared to create experimental, proof-of-concept AI prototypes but fall short in successfully delivering these prototypes to production. This is evident from a wide spectrum of issues: from unacceptably high failure rates of AI projects to ethical controversies about AI systems that make it to end users. I believe that, to become successful, the effort to democratize AI must progress beyond the myopic focus on core, enabling technologies like Keras, PyTorch, and TensorFlow. MLOps emerged as a unifying term for the practice of taking experimental ML code and running it effectively in production. Serverless ML is the leading cloud-native software development model for ML and MLOps, abstracting away infrastructure and improving productivity of the practitioners.

    I also encourage you to make use of the Jupyter notebooks that accompany this book. The DC taxi fare project used in the notebook code is designed to give you the practice you need to grow as a practitioner. Happy reading and happy coding!

    acknowledgments

    I am forever grateful to my daughter, Sophia. You are my eternal source of happiness and inspiration. My wife, Alla, was boundlessly patient with me while I wrote my first book. You were always there to support me and to cheer me along. To my father, Mikhael, I wouldn’t be who I am without you.

    I also want to thank the people at Manning who made this book possible: Marina Michaels, my development editor; Frances Buontempo, my technical development editor; Karsten Strøbaek, my technical proofreader; Deirdre Hiam, my project editor; Michele Mitchell, my copyeditor; and Keri Hales, my proofreader.

    Many thanks go to the technical peer reviewers: Conor Redmond, Daniela Zapata, Dianshuang Wu, Dimitris Papadopoulos, Dinesh Ghanta, Dr. Irfan Ullah, Girish Ahankari, Jeff Hajewski, Jesús A. Juárez-Guerrero, Trichy Venkataraman Krishnamurthy, Lucian-Paul Torje, Manish Jain, Mario Solomou, Mathijs Affourtit, Michael Jensen, Michael Wright, Pethuru Raj Chelliah, Philip Kirkbride, Rahul Jain, Richard Vaughan, Sayak Paul, Sergio Govoni, Srinivas Aluvala, Tiklu Ganguly, and Todd Cook. Your suggestions helped make this a better book.

    about this book

    Thank you for purchasing MLOps Engineering at Scale.

    Who should read this book

    To get the most value from this book, you’ll want to have existing skills in data analysis with Python and SQL, as well as have some experience with machine learning. I expect that if you are reading this book, you are interested in developing your expertise as a machine learning engineer, and you are planning to deploy your machine learning-based prototypes to production.

    This book is for information technology professionals or those in academia who have had some exposure to machine learning and are working on or are interested in launching a machine learning system in production. There is a refresher on machine learning prerequisites for this book in appendix A. Keep in mind that if you are brand new to machine learning you may find that studying both machine learning and cloud-based infrastructure for machine learning at the same time can be overwhelming.

    If you are a software or a data engineer, and you are planning on starting a machine learning project, this book can help you gain a deeper understanding of the machine learning project life cycle. You will see that although the practice of machine learning depends on traditional information technologies (i.e., computing, storage, and networking), it is different from traditional information technology in practice. The former is significantly more experimental and more iterative than you may have experienced as a software or a data professional, and you should be prepared for the outcomes to be less known in advance. When working with data, the machine learning practice is more like the scientific process, including forming hypotheses about data, testing alternative models to answer questions about the hypothesis, and ranking and choosing the best performing models to launch atop your machine learning platform.

    If you are a machine learning engineer or practitioner, or a data scientist, keep in mind that this book is not about making you a better researcher. The book is not written to educate you about the frontiers of science in machine learning. This book also will not attempt to reteach you the machine learning basics, although you may find the material in appendix A, targeted at information technology professionals, a useful reference. Instead, you should expect to use this book to become a more valuable collaborator on your machine learning team. The book will help you do more with what you already know about data science and machine learning so that you can deliver ready-to-use contributions to your project or your organization. For example, you will learn how to implement your insights about improving machine learning model accuracy and turn them into production-ready capabilities.

    How this book is organized: A road map

    This book is composed of three parts. In part 1, I chart out the landscape of what it takes to put a machine learning system in production, describe an engineering gap between experimental machine learning code and production machine learning systems, and explain how serverless machine learning can help bridge the gap. By the end of part 1, I’ll have taught you how to use serverless features of a public cloud (Amazon Web Services) to get started with a real-world machine learning use case, prepare a working machine learning data set for the use case, and ensure that you are prepared to apply machine learning to the use case.

    Chapter 1 presents a broad view of the field of machine learning systems engineering and what it takes to put these systems into production.

    Chapter 2 introduces you to the taxi trips data set for the Washington, DC, municipality and teaches you how to start using the data set for machine learning in the Amazon Web Services (AWS) public cloud.

    Chapter 3 applies the AWS Athena interactive query service to dig deeper into the data set, uncover data quality issues, and then address them through a rigorous and principled data quality assurance process.

    Chapter 4 demonstrates how to use statistical measures to summarize data set samples and to quantify their similarity to the entire data set. The chapter also covers how to pick the right size for your test, training, and validation data sets and use distributed processing in the cloud to prepare the data set samples for machine learning.

    In part 2, I teach you to use the PyTorch deep learning framework to develop models for a structured data set, explain how to distribute and scale up machine learning model training in the cloud, and show how to deploy trained machine learning models to scale with user demand. In the process, you’ll learn to evaluate and assess the performance of alternative machine learning model implementations and how to pick the right one for the use case.

    Chapter 5 covers the PyTorch fundamentals by introducing the core tensor application programming interface (API) and helping you gain a level of fluency with using the API.
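    To give a flavor of the tensor API that chapter 5 builds fluency with, here is a minimal sketch of tensor creation and broadcasting (illustrative only; the chapter's own listings differ):

```python
import torch

# Create tensors from Python data and from factory functions.
a = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])   # 2x3 tensor from a nested list
b = torch.ones(3)                     # 1-D tensor of three 1.0 values

# Broadcasting: the 1-D tensor b is stretched across each row of a.
c = a + b

print(c.shape)   # torch.Size([2, 3])
print(c[0])      # tensor([2., 3., 4.])
```

    Broadcasting is what lets per-feature operations (like adding a bias vector) be written without explicit loops over rows.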

    Chapter 6 focuses on the deep learning aspects of PyTorch, including support for automatic differentiation, alternative gradient descent algorithms, and supporting utilities.
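    The interplay between automatic differentiation and an optimizer that chapter 6 covers can be sketched with a hypothetical one-parameter model (a toy example, not the book's own code):

```python
import torch

# Fit y = w * x to a single data point using autograd and SGD.
x = torch.tensor(2.0)
y = torch.tensor(6.0)
w = torch.tensor(1.0, requires_grad=True)  # trainable parameter

opt = torch.optim.SGD([w], lr=0.1)
for _ in range(50):
    opt.zero_grad()
    loss = (w * x - y) ** 2   # squared error
    loss.backward()           # autodiff populates w.grad
    opt.step()                # gradient descent update

print(round(w.item(), 2))     # prints 3.0
```

    The same zero_grad/backward/step loop scales unchanged from this toy to deep networks, which is why the chapter dwells on it.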

    Chapter 7 explains how to scale up your PyTorch programs by teaching about the graphical processing unit (GPU) features and how to take advantage of them to accelerate your deep learning code.
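    The device-agnostic pattern behind GPU acceleration can be sketched as follows (a minimal example; chapter 7 goes much further):

```python
import torch

# Use a GPU when one is present; otherwise the identical code runs on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(256, 256, device=device)  # tensor allocated on the device
y = x @ x                                 # matrix multiply runs there too

print(tuple(y.shape))  # (256, 256)
```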

    Chapter 8 teaches data-parallel approaches for distributed PyTorch training and covers, in depth, the distinction between traditional parameter server-based approaches and ring-based distributed training (e.g., Horovod).
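    The two phases of ring-based gradient descent that chapter 8 walks through (reduce-scatter, then all-gather) can be simulated without any framework. This is a toy, single-process sketch; real implementations such as Horovod run the sends in parallel and overlap them with computation:

```python
def ring_allreduce(grads):
    """Toy simulation of ring all-reduce: each of n workers starts with
    its own gradient vector; after a reduce-scatter phase and an
    all-gather phase, every worker holds the elementwise sum."""
    n = len(grads)
    size = len(grads[0])
    assert size % n == 0, "toy sketch: length must be divisible by n"
    step = size // n
    chunks = [list(g) for g in grads]  # each worker's local buffer

    # Phase 1: reduce-scatter. In round r, worker i passes chunk
    # (i - r) % n to its ring neighbor, which accumulates it.
    for r in range(n - 1):
        outgoing = []  # snapshot: all sends in a round are simultaneous
        for i in range(n):
            c = (i - r) % n
            outgoing.append((c, chunks[i][c * step:(c + 1) * step]))
        for i, (c, data) in enumerate(outgoing):
            dst = (i + 1) % n
            for k, v in enumerate(data):
                chunks[dst][c * step + k] += v

    # Phase 2: all-gather. Each worker now owns one fully reduced
    # chunk; n - 1 more rounds circulate the reduced chunks so that
    # every worker ends up with the complete summed gradient.
    for r in range(n - 1):
        outgoing = []
        for i in range(n):
            c = (i + 1 - r) % n
            outgoing.append((c, chunks[i][c * step:(c + 1) * step]))
        for i, (c, data) in enumerate(outgoing):
            dst = (i + 1) % n
            chunks[dst][c * step:(c + 1) * step] = data

    return chunks

# Two workers, each holding a 4-element gradient.
print(ring_allreduce([[1, 2, 3, 4], [10, 20, 30, 40]]))
# [[11, 22, 33, 44], [11, 22, 33, 44]]
```

    The appeal of the ring topology is that each worker sends and receives only one chunk per round, so bandwidth per worker stays constant as the number of workers grows, unlike a parameter server that becomes a hotspot.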

    In part 3, I introduce you to the battle-tested techniques of machine learning practitioners and cover feature engineering, hyperparameter tuning, and machine learning pipeline assembly. By the conclusion of this book, you will have set up a machine learning platform that ingests raw data, prepares it for machine learning, applies feature engineering, and trains high-performance, hyperparameter-tuned machine learning models.

    Chapter 9 explores the use cases around feature selection and feature engineering, using case studies to build intuition about the features that can be selected or engineered for the DC taxi data set.

    Chapter 10 teaches how to eliminate boilerplate engineering code in your DC taxi PyTorch model implementation by adopting a framework called PyTorch Lightning. Also, the chapter navigates through the steps required to train, validate, and test your enhanced deep learning model.

    Chapter 11 integrates your deep learning model with an open-source hyperparameter optimization framework called Optuna, helping you train multiple models based on alternative hyperparameter values, and then ranking the trained models according to their loss and metric performance.
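    The log-uniform hyperparameter ranges chapter 11 relies on can be illustrated without Optuna. The random-search loop below is a hypothetical stand-in for an Optuna study, and the objective is a toy function whose best learning rate sits near 1e-2:

```python
import math
import random

def suggest_loguniform(low, high, rng):
    """Sample so that log(value) is uniform on [log(low), log(high)]:
    every decade of the range is equally likely to be explored."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(42)

def objective(lr):
    # Toy validation loss, minimized at lr = 1e-2.
    return (math.log10(lr) + 2.0) ** 2

trials = [suggest_loguniform(1e-4, 1e-1, rng) for _ in range(100)]
best_lr = min(trials, key=objective)
print(f"best lr across 100 trials: {best_lr:.4g}")
```

    A plain uniform sample over [1e-4, 1e-1] would spend roughly 90% of its trials above 1e-2; log-uniform sampling spreads trials evenly across the decades, which is why it suits scale-sensitive hyperparameters like learning rates.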

    Chapter 12 packages your deep learning model implementation into a Docker container in order to run it through the various stages of the entire machine learning pipeline, starting from the development data set all the way to a trained model ready for production deployment.

    About the code

    You can access the code for this book from my GitHub repository: github.com/osipov/smlbook. The code in this repository is packaged as Jupyter notebooks and is designed to be used in a Linux-based Jupyter notebook environment. This means that you have options when it comes to how you can execute the code. If you have your own local Jupyter environment, for example, with the Jupyter native client (JupyterApp: https://github.com/jupyterlab/jupyterlab_app) or a Conda distribution (https://jupyter.org/install), that’s great! If you do not use a local Jupyter distribution, you can run the code from the notebooks using a cloud-based service such as Google Colab or Binder. My GitHub repository README.md file includes badges and hyperlinks to help you launch chapter-specific notebooks in Google Colab.

    I strongly urge you to use a local Jupyter installation as opposed to a cloud service, especially if you are worried about the security of your AWS account credentials. Some steps of the code will require you to use your AWS credentials for tasks like creating storage buckets, launching AWS Glue extract-transform-load (ETL) jobs, and more. The code for chapter 12 must be executed on a node with Docker installed, so I recommend planning to use a local Jupyter installation on a laptop or a desktop where you have sufficient capacity to install Docker. You can find out more about Docker installation requirements in appendix B.

    liveBook discussion forum

    Purchase of MLOps Engineering at Scale includes free access to liveBook, Manning’s online reading platform. Using liveBook’s exclusive discussion features, you can attach comments to the book globally or to specific sections or paragraphs. It’s a snap to make notes for yourself, ask and answer technical questions, and receive help from the author and other users.

    To access the forum, go to https://livebook.manning.com/#!/book/mlops-engineering-at-scale/discussion. Be sure to join the forum and say hi! You can also learn more about Manning’s forums and the rules of conduct at https://livebook.manning.com/#!/discussion.

    Manning’s commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between readers and the author can take place. It is not a commitment to any specific amount of participation on the part of the author, whose contribution to the forum remains voluntary (and unpaid). We suggest you try asking the author some challenging questions lest his interest stray! The forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.

    about the author

    about the cover illustration

    The figure on the cover of MLOps Engineering at Scale is captioned Femme du Thibet, or a woman of Tibet. The illustration is taken from a collection of dress costumes from various countries by Jacques Grasset de Saint-Sauveur (1757–1810), titled Costumes de Différents Pays, published in France in 1797. Each illustration is finely drawn and colored by hand. The rich variety of Grasset de Saint-Sauveur’s collection reminds us vividly of how culturally apart the world’s towns and regions were just 200 years ago. Isolated from each other, people spoke different dialects and languages. In the streets or in the countryside, it was easy to identify where they lived and what their trade or station in life was just by their dress.

    The way we dress has changed since then and the diversity by region, so rich at the time, has faded away. It is now hard to tell apart the inhabitants of different continents, let alone different towns, regions, or countries. Perhaps we have traded cultural diversity for a more varied personal life—certainly for a more varied and fast-paced technological life.

    At a time when it is hard to tell one computer book from another, Manning celebrates the inventiveness and initiative of the computer business with book covers based on the rich diversity of regional life of two centuries ago, brought back to life by Grasset de Saint-Sauveur’s pictures.

    Part 1 Mastering the data set

    Engineering an effective machine learning system depends on a thorough understanding of the project data set. If you have prior experience building machine learning models, you might be tempted to skip this step. After all, shouldn’t the machine learning algorithms automate the learning of the patterns from the data? However, as you are going to observe throughout this book, machine learning systems that succeed in production depend on a practitioner who understands the project data set and then applies human insights about the data in ways that modern algorithms can’t.

    1 Introduction to serverless machine learning

    This chapter covers

    What serverless machine learning is and why you should care

    The difference between machine learning code and a machine learning platform

    How this book teaches about serverless machine learning

    The target audience for this book

    What you can learn from this book

    A Grand Canyon-like gulf separates experimental machine learning code and production machine learning systems. The scenic view across the canyon is magical: when a machine learning system is running successfully in production it can seem prescient. The first time I started typing a query into a machine learning-powered autocomplete search bar and saw the system anticipate my words, I was hooked. I must have tried dozens of different queries to see how well the system worked. So, what does it take to trek across the canyon?

    It is surprisingly easy to get started. Given the right data and less than an hour of coding time, it is possible to write the experimental machine learning code and re-create the remarkable experience I had using the search bar that predicted my words. In my conversations with information technology professionals, I find that many have started to experiment with machine learning. Online classes in machine learning, such as Andrew Ng’s on Coursera, have a wealth of information about how to get started with machine learning basics. Increasingly, companies that hire for information technology jobs expect entry-level experience with machine learning.¹

    While it is relatively easy to experiment with machine learning, building on the results of the experiments to deliver products, services, or features has proven to be difficult. Some companies have even started to use the word unicorn to describe the unreasonably hard-to-find machine learning practitioners with the skills needed to launch production machine learning systems. Practitioners with successful launch experience often have skills that span machine learning, software engineering, and many information technology specialties.

    This book is for those who are interested in trekking the journey from experimental machine learning code to a production machine learning system. In this book, I will teach you how to assemble the components for a machine learning platform and use them as a foundation for your production machine learning system. In the process, you will learn:

    How to use and integrate public cloud services, including the ones from Amazon Web Services (AWS), for machine learning, including data ingest, storage, and processing

    How to assess and achieve data quality standards for machine learning from structured data

    How to engineer synthetic features to improve machine learning effectiveness

    How to reproducibly sample structured data into experimental subsets for exploration and analysis

    How to implement machine learning models using PyTorch and Python in a Jupyter notebook environment

    How to implement data processing and machine learning pipelines to achieve both high throughput and low latency

    How to train and deploy machine learning models that depend on data processing pipelines

    How to monitor and manage the life cycle of your machine learning system once it is put in production

    Why should you invest the time to learn these skills? They will not make you a renowned machine learning researcher or help you discover the next ground-breaking machine learning algorithm. However, if you learn from this book, you can prepare yourself to deliver the results of your machine learning efforts sooner and more productively, and grow to be a more valuable contributor to your machine learning project, team, or organization.

    1.1 What is a machine learning platform?

    If you have never heard of the phrase yak shaving as it is used in the information technology industry,² here’s a hypothetical example of how it may show up during a day in the life of a machine learning practitioner:

    My company wants our machine learning system to launch in a month . . . but it is taking us too long to train our machine learning models . . . so I should speed things up by enabling graphical processing units (GPUs) for training . . . but our GPU device drivers are incompatible with our machine learning framework . . . so I need to upgrade to the latest Linux device drivers for compatibility . . . which means that I need to be on the new version of the Linux distribution.

    There are many more similar possibilities in which you need to shave a yak to speed up machine learning. The contemporary practice of launching machine learning-based systems in production and keeping them running has too much in common with the yak-shaving story. Instead of focusing on the features needed to make the product a resounding success, too much engineering time is spent on apparently unrelated activities like re-installing Linux device drivers or searching the web for the right cluster settings to configure the data processing middleware.

    Why is that? Even if you have the expertise of machine learning PhDs on your project, you still need the support of many information technology services and resources to launch the system. Hidden Technical Debt in Machine Learning Systems, a peer-reviewed article published in 2015 and based on insights from dozens of machine learning practitioners at Google, advises that mature machine learning systems end up being (at most) 5% machine learning code (http://mng.bz/01jl).

    This book uses the phrase machine learning platform to describe the 95% that plays a supporting yet critical role in the entire system. Having the right machine learning platform can make or break your product.

    If you take a closer look at figure 1.1, you should be able to describe some of the capabilities you need from a machine learning platform. Obviously, the platform needs to ingest and store data, process data (which includes applying machine learning and other computations to data), and serve the insights discovered by machine learning to the users of the platform. The less obvious observation is that the platform should be able to handle multiple, concurrent machine learning projects and enable multiple users to run the projects in isolation from each other. Otherwise, replacing only the machine learning code translates to reworking 95% of the system.


    Figure 1.1 Although machine learning code is what makes your machine learning system stand out, it amounts to only about 5% of the system code according to the experiences described in Hidden Technical Debt in Machine Learning Systems by Google’s Sculley et al. Serverless machine learning helps you assemble the other 95% using cloud-based infrastructure.

    1.2 Challenges when designing a machine learning platform

    How much data should the platform be able to store and process? AcademicTorrents.com is a website dedicated to helping machine learning practitioners get access to public data sets suitable for machine learning. The website lists over 50 TB of data sets, of which the largest are 1-5 TB in size. Kaggle, a website popular for hosting data science competitions, includes data sets as large as 3 TB. You might be tempted to ignore the largest data sets as outliers and focus on more common data sets that are at the scale of gigabytes. However, you should keep in mind that successes in machine learning are often due to reliance on larger data sets. The Unreasonable Effectiveness of Data, by Alon Halevy, Peter Norvig, and Fernando Pereira (http://mng.bz/5Zz4), argues in favor of machine learning systems that can take advantage of larger data sets: simple models and a lot of data trump more elaborate models based on less data.

    A machine learning platform that is expected to operate on a scale of terabytes to petabytes of data for storage and processing must be built as a distributed computing system, using multiple inter-networked servers in a cluster, each processing a part of the data set. Otherwise, a data set of hundreds of gigabytes to terabytes will cause out-of-memory problems when processed by a single server with a typical hardware configuration. Having a cluster of servers as part of a machine learning platform also addresses the input/output bandwidth limitations of individual servers: most servers can supply a CPU with just a few gigabytes of data per second. This means that most types of data processing performed by a machine learning platform can be sped up by splitting the data sets into chunks (sometimes called shards) that are processed in parallel by the servers in the cluster. This distributed systems design for a machine learning platform is commonly known as scaling out.
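    The scale-out idea can be sketched in a few lines of Python. This is an illustrative toy, not the platform's actual code: it splits a data set into shards and processes them in parallel with a process pool, where each worker process stands in for a server in a cluster.

```python
from concurrent.futures import ProcessPoolExecutor

def shard(records, num_shards):
    """Split a data set into roughly equal chunks (shards)."""
    size = (len(records) + num_shards - 1) // num_shards
    return [records[i:i + size] for i in range(0, len(records), size)]

def process_shard(records):
    """Stand-in for per-server work, such as feature extraction."""
    return sum(records)

if __name__ == "__main__":
    data = list(range(1_000_000))
    shards = shard(data, num_shards=4)
    # Each shard is handled by a separate worker process, mimicking a
    # cluster of servers that each hold only a part of the data set.
    with ProcessPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(process_shard, shards))
    print(sum(partials))  # same result as processing on a single server
```

    Because the shards are independent, adding workers (or servers) shortens the wall-clock time without changing the result, which is exactly what makes scaling out attractive.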

    A significant portion of figure 1.1 is the serving part of the infrastructure used in the platform. This is the part that exposes the data insights produced by the machine learning code to the users of the platform. If you have ever had your email provider classify your emails as spam or not spam, or used a product recommendation feature of your favorite e-commerce website, you have interacted as a user with the serving infrastructure of a machine learning platform. The serving infrastructure for a major email or e-commerce provider needs to be capable of making decisions for millions of users around the globe, millions of times a second. Of course, not every machine learning platform needs to operate at this scale. However, if you are planning to deliver a product based on machine learning, keep in mind that it is within the realm of possibility for digital products and services to reach hundreds of millions of users in months. For example, Pokémon Go, a machine learning-powered video game from Niantic, reached half a billion users in less than two months.

    Is it prohibitively expensive to launch and operate a machine learning platform at scale? As recently as the 2000s, running a scalable machine learning platform would have required a significant upfront investment in servers, storage, and networking, as well as the software and expertise needed to build one. The first machine learning platform I worked on for a customer, back in 2009, cost over $100,000 USD and was built using on-premises hardware and open source Apache Hadoop (and Mahout) middleware. In addition to upfront costs, machine learning platforms can be expensive to operate because of wasted resources: most machine learning code underutilizes the capacity of the platform. The training phase of machine learning is resource intensive, leading to high utilization of computing, storage, and networking. However, training runs are intermittent and relatively rare for a machine learning system in production, translating to low average utilization. Serving infrastructure utilization varies based on the specific use case for a machine learning system and fluctuates with factors like time of day, seasonality, marketing events, and more.
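    A back-of-the-envelope calculation makes the underutilization point concrete. All of the figures below are hypothetical, chosen for illustration rather than taken from any vendor's price list: a GPU server that is busy only during intermittent training runs sits idle most of the month, so paying just for the hours used can cut the bill dramatically even at a higher hourly rate.

```python
# Hypothetical figures, for illustration only.
HOURS_PER_MONTH = 730
training_runs_per_month = 4
hours_per_training_run = 6

dedicated_cost_per_hour = 3.00  # always-on GPU server
on_demand_cost_per_hour = 4.00  # metered per hour used (often pricier per hour)

used_hours = training_runs_per_month * hours_per_training_run
utilization = used_hours / HOURS_PER_MONTH

dedicated_monthly = HOURS_PER_MONTH * dedicated_cost_per_hour
on_demand_monthly = used_hours * on_demand_cost_per_hour

print(f"utilization: {utilization:.1%}")         # ~3.3%
print(f"dedicated:   ${dedicated_monthly:,.2f}") # $2,190.00
print(f"on demand:   ${on_demand_monthly:,.2f}") # $96.00
```

    With these illustrative numbers the always-on server is more than 20 times as expensive per month, despite its lower hourly rate, because it is idle about 97% of the time.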

    1.3 Public clouds for machine learning platforms

    The good news is that public cloud-computing infrastructure can help you create a machine learning platform and address the challenges described in the previous section. In particular, the approach described in this book will take advantage of public clouds from vendors like Amazon Web Services, Microsoft Azure, or Google Cloud to provide your machine learning platform with:

    Secure isolation so that multiple users of your platform can work in parallel with different machine learning projects and code

    Access to information technologies like data storage, computing, and networking when your projects need them and for as long as they are needed

    Metering based on consumption so that your machine learning projects are billed just for the resources you used

    This book will teach you how to create a machine learning platform from public cloud infrastructure using Amazon Web Services as the primary example. In particular, I will teach you:

    How to use public cloud services to cost-effectively store data sets, regardless of whether they are made of kilobytes or terabytes of data

    How to optimize the utilization and cost of your machine learning platform computing infrastructure so that you are using just the servers you need

    How to elastically scale your serving infrastructure to reduce the operational costs of your machine learning platform
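    The elasticity behind the last point can be illustrated with a toy autoscaling policy. The capacity figures, parameter names, and the policy itself are hypothetical (real cloud autoscalers are configured rather than hand-coded like this): given the current request rate, compute how many serving instances are needed and let the fleet grow and shrink with the load.

```python
import math

def desired_instances(requests_per_second: float,
                      capacity_per_instance: float = 500.0,
                      min_instances: int = 1,
                      max_instances: int = 100) -> int:
    """Toy autoscaling policy: run just enough instances to absorb the
    current load, clamped to a configured range."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))

# As load fluctuates with time of day, the fleet is resized,
# so you pay only for the instances you actually run.
for rps in (50, 2_000, 75_000):
    print(rps, "req/s ->", desired_instances(rps), "instances")
```

    The `max_instances` clamp in this sketch is a cost guardrail: it keeps a traffic spike, or a bug, from scaling the bill without bound.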

    1.4 What is serverless machine learning?

    Serverless machine learning is a software development model for machine learning code written to run on a machine learning platform hosted in cloud-computing infrastructure with consumption-based metering and billing.
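    Consumption-based metering can be made concrete with a small calculation in the style of function-as-a-service billing. The rates below are hypothetical, not any vendor's actual prices: charges accrue per invocation and per unit of memory-time consumed, so an idle system costs nothing.

```python
# Hypothetical serverless pricing, for illustration only.
PRICE_PER_INVOCATION = 0.0000002    # dollars per request
PRICE_PER_GB_SECOND = 0.0000166667  # dollars per GB-second of compute

def monthly_bill(invocations: int, memory_gb: float, seconds_per_call: float) -> float:
    """The bill is proportional to actual consumption: zero traffic, zero cost."""
    gb_seconds = invocations * memory_gb * seconds_per_call
    return invocations * PRICE_PER_INVOCATION + gb_seconds * PRICE_PER_GB_SECOND

print(f"${monthly_bill(1_000_000, 0.5, 0.2):.2f}")  # a million light requests
print(f"${monthly_bill(0, 0.5, 0.2):.2f}")          # $0.00 with no traffic
```

    Contrast this with the always-on server model, where the meter runs around the clock whether or not the machine learning system is doing any work.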

    If a machine learning system
