
    Machine Learning Engineering on AWS - Joshua Arvin Lat


    BIRMINGHAM—MUMBAI

    Machine Learning Engineering on AWS

    Copyright © 2022 Packt Publishing

    All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

    Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

    Publishing Product Manager: Ali Abidi

    Content Development Editor: Priyanka Soam

    Technical Editor: Devanshi Ayare

    Copy Editor: Safis Editing

    Project Coordinator: Farheen Fathima

    Proofreader: Safis Editing

    Indexer: Sejal Dsilva

    Production Designer: Ponraj Dhandapani

    Marketing Coordinator: Shifa Ansari

    First published: October 2022

    Production reference: 1290922

    Published by Packt Publishing Ltd.

    Livery Place

    35 Livery Street

    Birmingham

    B3 2PB, UK.

    ISBN 978-1-80324-759-5

    www.packt.com

    Contributors

    About the author

    Joshua Arvin Lat is the Chief Technology Officer (CTO) of NuWorks Interactive Labs, Inc. He previously served as the CTO of three Australian-owned companies, as well as Director of Software Development and Engineering for multiple e-commerce start-ups, experience that has helped him become a more effective leader. Years ago, he and his team won first place in a global cybersecurity competition with their published research paper. He is also an AWS Machine Learning Hero and has shared his knowledge at several international conferences, discussing practical strategies on machine learning, engineering, security, and management.

    About the reviewers

    Raphael Jambalos manages the Cloud-Native Development Team at eCloudValley, Philippines. His team architects and implements solutions that leverage AWS services to deliver reliable applications. He is also a community leader for the AWS user group MegaManila, organizing monthly meetups and growing the community. In his free time, he loves to read books and write about tech on his blog (https://dev.to/raphael_jambalos). He holds five AWS certifications and is an AWS APN Ambassador for the Philippines. He was also a technical reviewer for the Packt book Machine Learning with Amazon SageMaker Cookbook.

    Sophie Soliven is the General Manager of E-commerce Services and Dropship for BeautyMnl. As one of the pioneers and leaders of the company, she contributed to its growth from its humble beginnings to what it is today – the biggest homegrown e-commerce platform in the Philippines – by using a data-driven approach to scale its operations. She has obtained a number of certifications on data analytics and cloud computing, including Microsoft Power BI Data Analyst Associate, Tableau Desktop Specialist, and AWS Certified Cloud Practitioner. For the last couple of years, she has been sharing her knowledge and experience in data-driven operations at local and international conferences and events.

    Table of Contents

    Preface

    Part 1: Getting Started with Machine Learning Engineering on AWS

    1

    Introduction to ML Engineering on AWS

    Technical requirements

    What is expected from ML engineers?

    How ML engineers can get the most out of AWS

    Essential prerequisites

    Creating the Cloud9 environment

    Increasing Cloud9’s storage

    Installing the Python prerequisites

    Preparing the dataset

    Generating a synthetic dataset using a deep learning model

    Exploratory data analysis

    Train-test split

    Uploading the dataset to Amazon S3

    AutoML with AutoGluon

    Setting up and installing AutoGluon

    Performing your first AutoGluon AutoML experiment

    Getting started with SageMaker and SageMaker Studio

    Onboarding with SageMaker Studio

    Adding a user to an existing SageMaker Domain

    No-code machine learning with SageMaker Canvas

    AutoML with SageMaker Autopilot

    Summary

    Further reading

    2

    Deep Learning AMIs

    Technical requirements

    Getting started with Deep Learning AMIs

    Launching an EC2 instance using a Deep Learning AMI

    Locating the framework-specific DLAMI

    Choosing the instance type

    Ensuring a default secure configuration

    Launching the instance and connecting to it using EC2 Instance Connect

    Downloading the sample dataset

    Training an ML model

    Loading and evaluating the model

    Cleaning up

    Understanding how AWS pricing works for EC2 instances

    Using multiple smaller instances to reduce the overall cost of running ML workloads

    Using spot instances to reduce the cost of running training jobs

    Summary

    Further reading

    3

    Deep Learning Containers

    Technical requirements

    Getting started with AWS Deep Learning Containers

    Essential prerequisites

    Preparing the Cloud9 environment

    Downloading the sample dataset

    Using AWS Deep Learning Containers to train an ML model

    Serverless ML deployment with Lambda’s container image support

    Building the custom container image

    Testing the container image

    Pushing the container image to Amazon ECR

    Running ML predictions on AWS Lambda

    Completing and testing the serverless API setup

    Summary

    Further reading

    Part 2: Solving Data Engineering and Analysis Requirements

    4

    Serverless Data Management on AWS

    Technical requirements

    Getting started with serverless data management

    Preparing the essential prerequisites

    Opening a text editor on your local machine

    Creating an IAM user

    Creating a new VPC

    Uploading the dataset to S3

    Running analytics at scale with Amazon Redshift Serverless

    Setting up a Redshift Serverless endpoint

    Opening Redshift query editor v2

    Creating a table

    Loading data from S3

    Querying the database

    Unloading data to S3

    Setting up Lake Formation

    Creating a database

    Creating a table using an AWS Glue Crawler

    Using Amazon Athena to query data in Amazon S3

    Setting up the query result location

    Running SQL queries using Athena

    Summary

    Further reading

    5

    Pragmatic Data Processing and Analysis

    Technical requirements

    Getting started with data processing and analysis

    Preparing the essential prerequisites

    Downloading the Parquet file

    Preparing the S3 bucket

    Automating data preparation and analysis with AWS Glue DataBrew

    Creating a new dataset

    Creating and running a profile job

    Creating a project and configuring a recipe

    Creating and running a recipe job

    Verifying the results

    Preparing ML data with Amazon SageMaker Data Wrangler

    Accessing Data Wrangler

    Importing data

    Transforming the data

    Analyzing the data

    Exporting the data flow

    Turning off the resources

    Verifying the results

    Summary

    Further reading

    Part 3: Diving Deeper with Relevant Model Training and Deployment Solutions

    6

    SageMaker Training and Debugging Solutions

    Technical requirements

    Getting started with the SageMaker Python SDK

    Preparing the essential prerequisites

    Creating a service limit increase request

    Training an image classification model with the SageMaker Python SDK

    Creating a new Notebook in SageMaker Studio

    Downloading the training, validation, and test datasets

    Uploading the data to S3

    Using the SageMaker Python SDK to train an ML model

    Using the %store magic to store data

    Using the SageMaker Python SDK to deploy an ML model

    Using the Debugger Insights Dashboard

    Utilizing Managed Spot Training and Checkpoints

    Cleaning up

    Summary

    Further reading

    7

    SageMaker Deployment Solutions

    Technical requirements

    Getting started with model deployments in SageMaker

    Preparing the pre-trained model artifacts

    Preparing the SageMaker script mode prerequisites

    Preparing the inference.py file

    Preparing the requirements.txt file

    Preparing the setup.py file

    Deploying a pre-trained model to a real-time inference endpoint

    Deploying a pre-trained model to a serverless inference endpoint

    Deploying a pre-trained model to an asynchronous inference endpoint

    Creating the input JSON file

    Adding an artificial delay to the inference script

    Deploying and testing an asynchronous inference endpoint

    Cleaning up

    Deployment strategies and best practices

    Summary

    Further reading

    Part 4: Securing, Monitoring, and Managing Machine Learning Systems and Environments

    8

    Model Monitoring and Management Solutions

    Technical prerequisites

    Registering models to SageMaker Model Registry

    Creating a new notebook in SageMaker Studio

    Registering models to SageMaker Model Registry using the boto3 library

    Deploying models from SageMaker Model Registry

    Enabling data capture and simulating predictions

    Scheduled monitoring with SageMaker Model Monitor

    Analyzing the captured data

    Deleting an endpoint with a monitoring schedule

    Cleaning up

    Summary

    Further reading

    9

    Security, Governance, and Compliance Strategies

    Managing the security and compliance of ML environments

    Authentication and authorization

    Network security

    Encryption at rest and in transit

    Managing compliance reports

    Vulnerability management

    Preserving data privacy and model privacy

    Federated Learning

    Differential Privacy

    Privacy-preserving machine learning

    Other solutions and options

    Establishing ML governance

    Lineage Tracking and reproducibility

    Model inventory

    Model validation

    ML explainability

    Bias detection

    Model monitoring

    Traceability, observability, and auditing

    Data quality analysis and reporting

    Data integrity management

    Summary

    Further reading

    Part 5: Designing and Building End-to-end MLOps Pipelines

    10

    Machine Learning Pipelines with Kubeflow on Amazon EKS

    Technical requirements

    Diving deeper into Kubeflow, Kubernetes, and EKS

    Preparing the essential prerequisites

    Preparing the IAM role for the EC2 instance of the Cloud9 environment

    Attaching the IAM role to the EC2 instance of the Cloud9 environment

    Updating the Cloud9 environment with the essential prerequisites

    Setting up Kubeflow on Amazon EKS

    Running our first Kubeflow pipeline

    Using the Kubeflow Pipelines SDK to build ML workflows

    Cleaning up

    Recommended strategies and best practices

    Summary

    Further reading

    11

    Machine Learning Pipelines with SageMaker Pipelines

    Technical requirements

    Diving deeper into SageMaker Pipelines

    Preparing the essential prerequisites

    Running our first pipeline with SageMaker Pipelines

    Defining and preparing our first ML pipeline

    Running our first ML pipeline

    Creating Lambda functions for deployment

    Preparing the Lambda function for deploying a model to a new endpoint

    Preparing the Lambda function for checking whether an endpoint exists

    Preparing the Lambda function for deploying a model to an existing endpoint

    Testing our ML inference endpoint

    Completing the end-to-end ML pipeline

    Defining and preparing the complete ML pipeline

    Running the complete ML pipeline

    Cleaning up

    Recommended strategies and best practices

    Summary

    Further reading

    Index

    Other Books You May Enjoy

    Preface

    There is a growing need for professionals with experience in working on machine learning (ML) engineering requirements as well as those with knowledge of automating complex MLOps pipelines in the cloud. This book explores a variety of AWS services, such as Amazon Elastic Kubernetes Service, AWS Glue, AWS Lambda, Amazon Redshift, and AWS Lake Formation, which ML practitioners can leverage to meet various data engineering and ML engineering requirements in production.

    This machine learning book covers the essential concepts as well as step-by-step instructions that are designed to help you get a solid understanding of how to manage and secure ML workloads in the cloud. As you progress through the chapters, you’ll discover how to use several container and serverless solutions when training and deploying TensorFlow and PyTorch deep learning models on AWS. You’ll also delve into proven cost optimization techniques as well as data privacy and model privacy preservation strategies in detail as you explore best practices when using each AWS service.

    By the end of this AWS book, you'll be able to build, scale, and secure your own ML systems and pipelines, which will give you the experience and confidence needed to architect custom solutions using a variety of AWS services for ML engineering requirements.

    Who this book is for

    This book is for ML engineers, data scientists, and AWS cloud engineers interested in working on production data engineering, machine learning engineering, and MLOps requirements using a variety of AWS services such as Amazon EC2, Amazon Elastic Kubernetes Service (EKS), Amazon SageMaker, AWS Glue, Amazon Redshift, AWS Lake Formation, and AWS Lambda. All you need is an AWS account to get started. Prior knowledge of AWS, machine learning, and the Python programming language will help you to grasp the concepts covered in this book more effectively.

    What this book covers

    Chapter 1, Introduction to ML Engineering on AWS, focuses on helping you get set up, understand the key concepts, and get your feet wet quickly with several simplified AutoML examples.

    Chapter 2, Deep Learning AMIs, introduces AWS Deep Learning AMIs and how they are used to help ML practitioners perform ML experiments faster inside EC2 instances. Here, we will also dive a bit deeper into how AWS pricing works for EC2 instances so that you will have a better idea of how to optimize and reduce the overall costs of running ML workloads in the cloud.

    Chapter 3, Deep Learning Containers, introduces AWS Deep Learning Containers and how they are used to help ML practitioners perform ML experiments faster using containers. Here, we will also deploy a trained deep learning model inside an AWS Lambda function using Lambda’s container image support.

    Chapter 4, Serverless Data Management on AWS, presents several serverless solutions, such as Amazon Redshift Serverless and AWS Lake Formation, for managing and querying data on AWS.

    Chapter 5, Pragmatic Data Processing and Analysis, focuses on the different services available when working on data processing and analysis requirements, such as AWS Glue DataBrew and Amazon SageMaker Data Wrangler.

    Chapter 6, SageMaker Training and Debugging Solutions, presents the different solutions and capabilities available when training an ML model using Amazon SageMaker. Here, we dive a bit deeper into the different options and strategies when training and tuning ML models in SageMaker.

    Chapter 7, SageMaker Deployment Solutions, focuses on the relevant deployment solutions and strategies when performing ML inference on the AWS platform.

    Chapter 8, Model Monitoring and Management Solutions, presents the different monitoring and management solutions available on AWS.

    Chapter 9, Security, Governance, and Compliance Strategies, focuses on the relevant security, governance, and compliance strategies needed to secure production environments. Here, we will also dive a bit deeper into the different techniques to ensure data privacy and model privacy.

    Chapter 10, Machine Learning Pipelines with Kubeflow on Amazon EKS, focuses on using Kubeflow Pipelines, Kubernetes, and Amazon EKS to deploy an automated end-to-end MLOps pipeline on AWS.

    Chapter 11, Machine Learning Pipelines with SageMaker Pipelines, focuses on using SageMaker Pipelines to design and build automated end-to-end MLOps pipelines. Here, we will apply, combine, and connect the different strategies and techniques we learned in the previous chapters of the book.

    To get the most out of this book

    You will need an AWS account and a stable internet connection to complete the hands-on solutions in this book. If you still do not have an AWS account, feel free to check the AWS Free Tier page and click Create a Free Account: https://aws.amazon.com/free/.

    If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

    Download the example code files

    You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Machine-Learning-Engineering-on-AWS. If there’s an update to the code, it will be updated in the GitHub repository.

    We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

    Download the color images

    We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://packt.link/jeBII.

    Conventions used

    There are a number of text conventions used throughout this book.

    Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: ENTRYPOINT is set to /opt/conda/bin/python -m awslambdaric. The CMD command is then set to app.handler. The ENTRYPOINT and CMD instructions define which command is executed when the container starts to run.

    A block of code is set as follows:

    SELECT booking_changes, has_booking_changes, *
    FROM dev.public.bookings
    WHERE
        (booking_changes=0 AND has_booking_changes='True')
        OR
        (booking_changes>0 AND has_booking_changes='False');

    When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

    ---
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: kubeflow-eks-000
      region: us-west-2
      version: 1.21
    availabilityZones: [us-west-2a, us-west-2b, us-west-2c, us-west-2d]
    managedNodeGroups:
    - name: nodegroup
      desiredCapacity: 5
      instanceType: m5.xlarge
      ssh:
        enableSsm: true

    Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: After clicking the FILTER button, a drop-down menu should appear. Locate and select Greater than or equal to from the list of options under By condition. This should update the pane on the right side of the page and show the list of configuration options for the Filter values operation.

    Tips or Important Notes

    Appear like this.

    Get in touch

    Feedback from our readers is always welcome.

    General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

    Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.

    Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

    If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

    Share Your Thoughts

    Once you’ve read Machine Learning Engineering on AWS, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

    Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

    Part 1: Getting Started with Machine Learning Engineering on AWS

    In this section, readers will be introduced to the world of ML engineering on AWS.

    This section comprises the following chapters:

    Chapter 1, Introduction to ML Engineering on AWS

    Chapter 2, Deep Learning AMIs

    Chapter 3, Deep Learning Containers

    1

    Introduction to ML Engineering on AWS

    Most of us started our machine learning (ML) journey by training our first ML model using a sample dataset on our laptops or home computers. Things are somewhat straightforward until we need to work with much larger datasets and run our ML experiments in the cloud. It also becomes more challenging once we need to deploy our trained models to production-level inference endpoints or web servers. There are a lot of things to consider when designing and building ML systems and these are just some of the challenges data scientists and ML engineers face when working on real-life requirements. That said, we must use the right platform, along with the right set of tools, when performing ML experiments and deployments in the cloud.

    At this point, you might be wondering why we should even use a cloud platform when running our workloads. Can’t we build this platform ourselves? Perhaps you might be thinking that building and operating your own data center is a relatively easy task. In the past, different teams and companies tried setting up infrastructure within their own data centers and on-premises hardware. Over time, these companies started migrating their workloads to the cloud as they realized how hard and expensive it was to manage and operate data centers. A good example of this is the Netflix team, which migrated their resources to the AWS cloud. Migrating to the cloud allowed them to scale better and significantly increase service availability.

    The Amazon Web Services (AWS) platform provides a lot of services and capabilities that professionals and companies around the world can use to manage different types of workloads in the cloud. Over the past couple of years, AWS has announced and released a significant number of services, capabilities, and features that can be used for production-level ML experiments and deployments as well, driven by the growing volume of ML workloads being migrated to the cloud globally. As we go through each of the chapters in this book, we will gain a better understanding of how different services are used to solve the challenges of productionizing ML models.

    The following diagram shows the hands-on journey for this chapter:

    Figure 1.1 – Hands-on journey for this chapter

    Figure 1.1 – Hands-on journey for this chapter

    In this introductory chapter, we will focus on getting our feet wet by trying out different options when building an ML model on AWS. As shown in the preceding diagram, we will use a variety of AutoML services and solutions to build ML models that can help us predict if a hotel booking will be cancelled or not based on the information available. We will start by setting up a Cloud9 environment, which will help us run our code through an integrated development environment (IDE) in our browser. In this environment, we will generate a realistic synthetic dataset using a deep learning model called the Conditional Generative Adversarial Network. We will upload this dataset to Amazon S3 using the AWS CLI. Inside the Cloud9 environment, we will also install AutoGluon and run an AutoML experiment to train and generate multiple models using the synthetic dataset. Finally, we will use SageMaker Canvas and SageMaker Autopilot to run AutoML experiments using the uploaded dataset in S3. If you are wondering what these fancy terms are, keep reading as we demystify each of these in this chapter.
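    If you are curious about what the AutoGluon portion of this journey looks like in code, here is a minimal sketch of a tabular AutoML experiment. Note that the file names (bookings.train.csv and bookings.test.csv) and the label column (is_cancelled) are hypothetical placeholders for illustration; we will prepare the actual dataset and column names later in this chapter:

    from autogluon.tabular import TabularDataset, TabularPredictor

    # Load the training data (hypothetical file name)
    train_data = TabularDataset("bookings.train.csv")

    # Train and compare multiple candidate models automatically
    predictor = TabularPredictor(label="is_cancelled").fit(train_data)

    # Rank every trained model against held-out test data
    test_data = TabularDataset("bookings.test.csv")
    print(predictor.leaderboard(test_data))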

    In this chapter, we will cover the following topics:

    What is expected from ML engineers?

    How ML engineers can get the most out of AWS

    Essential prerequisites

    Preparing the dataset

    AutoML with AutoGluon

    Getting started with SageMaker and SageMaker Studio

    No-code machine learning with SageMaker Canvas

    AutoML with SageMaker Autopilot

    In addition to getting our feet wet using key ML services, libraries, and tools to perform AutoML experiments, this introductory chapter will help us gain a better understanding of several ML and ML engineering concepts that will be relevant to the succeeding chapters of this book. With this in mind, let’s get started!

    Technical requirements

    Before we start, we must have an AWS account. If you do not have an AWS account yet, simply create an account here: https://aws.amazon.com/free/. You may proceed with the next steps once the account is ready.

    The Jupyter notebooks, source code, and other files for each chapter are available in this book’s GitHub repository: https://github.com/PacktPublishing/Machine-Learning-Engineering-on-AWS.

    What is expected from ML engineers?

    ML engineering involves using ML and software engineering concepts and techniques to design, build, and manage production-level ML systems, along with pipelines. In a team working to build ML-powered applications, ML engineers are generally expected to build and operate the ML infrastructure that’s used to train and deploy models. In some cases, data scientists may also need to work on infrastructure-related requirements, especially if there is no clear delineation between the roles and responsibilities of ML engineers and data scientists in an organization.

    There are several things an ML engineer should consider when designing and building ML systems and platforms. These would include the quality of the deployed ML model, along with the security, scalability, evolvability, stability, and overall cost of the ML infrastructure used. In this book, we will discuss the different strategies and best practices to achieve the different objectives of an ML engineer.

    ML engineers should also be capable of designing and building automated ML workflows using a variety of solutions. Deployed models degrade over time and model retraining becomes essential in ensuring the quality of deployed ML models. Having automated ML pipelines in place helps enable automated model retraining and deployment.

    Important note

    If you are excited to learn more about how to build custom ML pipelines on AWS, then you should check out the last section of this book: Designing and building end-to-end MLOps pipelines. You should find several chapters dedicated to deploying complex ML pipelines on AWS!

    How ML engineers can get the most out of AWS

    There are many services and capabilities in the AWS platform that an ML engineer can choose from. Professionals who are already familiar with using virtual machines can easily spin up EC2 instances and run ML experiments using deep learning frameworks inside these virtual private servers. Services such as AWS Glue, Amazon EMR, and Amazon Athena can be utilized by ML engineers and data engineers for different data management and processing needs. Once the ML models need to be deployed into dedicated inference endpoints, a variety of options become available:

    Figure 1.2 – AWS machine learning stack

    Figure 1.2 – AWS machine learning stack

    As shown in the preceding diagram, data scientists, developers, and ML engineers can make use of multiple services and capabilities from the AWS machine learning stack. The services grouped under AI services can easily be used by developers with minimal ML experience. To use the services listed here, all we need is some experience working with data, along with the software development skills required to use SDKs and APIs. If we want to quickly build ML-powered applications with features such as language translation, text-to-speech, and product recommendation, then we can easily do that using the services under the AI Services bucket. In the middle, we have ML services and their capabilities, which help solve the more custom ML requirements of data scientists and ML engineers. To use the services and capabilities listed here, a solid understanding of the ML process is needed. The last layer, ML frameworks and infrastructure, offers the highest level of flexibility and customizability, as it includes the ML infrastructure and framework support needed by more advanced use cases.
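    To make this concrete, the following is a minimal sketch of what using the AI services layer looks like in practice, calling Amazon Translate through the boto3 SDK. This example is for illustration only and is not part of this chapter’s hands-on exercises; it assumes AWS credentials are already configured:

    import boto3

    # AI services are consumed through simple, high-level API calls
    translate = boto3.client("translate")

    response = translate.translate_text(
        Text="Machine learning in the cloud",
        SourceLanguageCode="en",
        TargetLanguageCode="es",
    )
    print(response["TranslatedText"])

    A single API call like this gives us language translation without training or deploying any model ourselves, which is exactly why this layer requires minimal ML experience.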

    So, how can ML engineers make the most out of the AWS machine learning stack? The ability of ML engineers to design, build, and manage ML systems improves as they become more familiar with the services, capabilities, and tools available in the AWS platform. They may start with AI services to quickly build AI-powered applications on AWS. Over time, these ML engineers will make use of the different services, capabilities, and infrastructure from the lower two layers as they become more comfortable dealing with intermediate ML engineering requirements.

    Essential prerequisites

    In this section, we will prepare the following:

    The Cloud9 environment

    The S3 bucket

    The synthetic dataset, which will be generated using a deep learning model

    Let’s get started.

    Creating the Cloud9 environment

    One of the more convenient options when performing ML experiments inside a virtual private server is to use the AWS Cloud9 service. AWS Cloud9 allows developers, data scientists, and ML engineers to manage and run code within a development environment using a browser. The code is stored and executed inside an EC2 instance, which provides an environment similar to what most developers have.

    Important note

    It is recommended to use an Identity and Access Management (IAM) user with limited permissions instead of the root account when running the examples in this book. We will discuss this along with other security best practices in detail in Chapter 9, Security, Governance, and Compliance Strategies. If you are just starting to use AWS, you may proceed with using the root account in the meantime.

    Follow these steps to create a Cloud9 environment where we will generate the synthetic dataset and run the AutoGluon AutoML experiment:

    Type cloud9 in the search bar. Select Cloud9 from the list of results:

    Figure 1.3 – Navigating to the Cloud9 console

    Figure 1.3 – Navigating to the Cloud9 console

    Here, we can see that the region is currently set to Oregon (us-west-2). Make sure that you change this to where you want the resources to be created.

    Next, click Create environment.

    Under the Name environment field, specify a name for the Cloud9 environment (for example, mle-on-aws) and click Next step.

    Under Environment type, choose Create a new EC2 instance for environment (direct access). Select m5.large for Instance type and then Ubuntu Server (18.04 LTS) for Platform:

    Figure 1.4 – Configuring the Cloud9 environment settings

    Figure 1.4 – Configuring the Cloud9 environment settings

    Here, we can see that there are other options for the instance type. In the meantime, we will stick with m5.large as it should be enough to run the hands-on solutions in this chapter.

    For the Cost-saving setting option, choose After four hours from the list of drop-down options. This means that the server where the Cloud9 environment is running will automatically shut down after 4 hours of inactivity.

    Under Network settings (advanced), select the default VPC of the region for the Network (VPC) configuration. It should have a format similar to vpc-abcdefg (default). For the Subnet option, choose the option that has a format similar to subnet-abcdefg | Default in us-west-2a.

    Important note

    It is recommended that you use the default VPC since the networking configuration is simple. This will help you avoid issues, especially if you’re just getting started with VPCs. If you encounter any VPC-related issues when launching a Cloud9 instance, you may need to check if the selected subnet has been configured with internet access via the route table configuration in the VPC console. You may retry launching the instance using another subnet or by using a new VPC altogether. If you are planning on creating a new VPC, navigate to https://go.aws/3sRSigt and create a VPC with a Single Public Subnet. If none of these options work, you may try launching the Cloud9 instance in another region. We’ll discuss Virtual Private Cloud (VPC) networks in detail in Chapter 9, Security, Governance, and Compliance Strategies.

    Click Next Step.

    On the review page, click Create environment. This should redirect you to the Cloud9 environment, which should take a minute or so to load. The Cloud9 IDE is shown in the following screenshot. This is where we can write our code and run the scripts and commands needed to work on some of the hands-on solutions in this book:

    Figure 1.5 – AWS Cloud9 interface

    Figure 1.5 – AWS Cloud9 interface

    Using this IDE is fairly straightforward as it looks very similar to code editors such as Visual Studio Code and Sublime Text. As shown in the preceding screenshot, we can find the menu bar at the top (A). The file tree can be found on the left-hand side (B). The editor covers a major portion of the screen in the middle (C). Lastly, we can find the terminal at the bottom (D).

    Important note

    If this is your first time using AWS Cloud9, here is a 4-minute introduction video from AWS to help you get started: https://www.youtube.com/watch?v=JDHZOGMMkj8.

    Now that we have our Cloud9 environment ready, it is time we configure it with a larger storage space.

    Increasing Cloud9’s storage

    When a Cloud9 instance is created, the attached volume starts with only 10GB of disk space. Given that we will be installing different libraries and frameworks while running ML experiments in this instance, we will need more than 10GB of disk space. We will resize the volume programmatically using the boto3 library.

    Important note

    If this is your first time using the boto3 library, it is the AWS SDK for Python, which gives us a way to programmatically manage the different AWS resources in our AWS accounts. It is a service-level SDK that helps us list, create, update, and delete AWS resources such as EC2 instances, S3 buckets, and EBS volumes.
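    For example, listing the S3 buckets in an account takes only a few lines of Python. This is a minimal sketch that assumes the environment already has the appropriate AWS credentials configured (as a Cloud9 environment does by default):

    import boto3

    # List every S3 bucket visible to the current credentials
    s3 = boto3.client("s3")
    for bucket in s3.list_buckets()["Buckets"]:
        print(bucket["Name"])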

    Follow these steps to download and run some scripts to increase the volume disk space from 10GB to 120GB:

    In the terminal of our Cloud9 environment (right after the $ sign at the bottom of the screen), run the following bash command:

    wget -O resize_and_reboot.py https://bit.ly/3ea96tW

    This will download the script file located at https://bit.ly/3ea96tW. Here, we are simply using a URL shortener, which maps the shortened link to https://raw.githubusercontent.com/PacktPublishing/Machine-Learning-Engineering-on-AWS/main/chapter01/resize_and_reboot.py.

    Important note

    Note that we are using the big O flag instead of a small o or a zero (0) when using the wget command.

    What’s inside the file we just downloaded? Let’s quickly inspect the file before we run the script. Double-click the resize_and_reboot.py file in the file tree (located on the left-hand side of the screen) to open the Python script file in the editor pane. As shown in the following screenshot, the resize_and_reboot.py script has three major sections. The first block of code focuses on importing the prerequisites needed to run the script. The second block of code focuses on resizing the volume of a selected EC2 instance using the boto3 library. It makes use of the describe_volumes() method to get the volume ID of the current instance, and then makes use of the modify_volume() method to update the volume size to 120GB. The last section involves a single line of code that simply reboots the EC2 instance. This line of code uses the os.system() method to run the sudo reboot shell command:

    Figure 1.6 – The resize_and_reboot.py script file

    Figure 1.6 – The resize_and_reboot.py script file

    You can find the resize_and_reboot.py script file in this book’s GitHub repository: https://github.com/PacktPublishing/Machine-Learning-Engineering-on-AWS/blob/main/chapter01/resize_and_reboot.py. Note that for this script to work, the EC2_INSTANCE_ID environment variable must be set to select the correct target instance. We’ll set this environment variable a few steps from now before we run the resize_and_reboot.py script.
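    If you would rather not open the file, the following sketch shows the general shape of such a resize script based on the description above. This is an approximation for illustration, not the exact contents of resize_and_reboot.py; refer to the repository for the authoritative version:

    import os
    import time

    import boto3

    # The target instance is selected through the EC2_INSTANCE_ID environment variable
    instance_id = os.environ["EC2_INSTANCE_ID"]

    ec2 = boto3.client("ec2")

    # Find the EBS volume attached to the target instance
    response = ec2.describe_volumes(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )
    volume_id = response["Volumes"][0]["VolumeId"]

    # Request a resize of the volume to 120GB
    ec2.modify_volume(VolumeId=volume_id, Size=120)

    # Give the modification a moment to start before rebooting
    # (the actual script may handle this differently)
    time.sleep(10)
    os.system("sudo reboot")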

    Next, run the following command in the terminal:

    python3 -m pip install --user --upgrade boto3

    This will upgrade the version of boto3 using pip.

    Important note

    If this is your first time using pip, it is the package installer for Python. It makes it convenient to install different packages and libraries using the command line.

    You may use python3 -m pip show boto3 to check the version you are using. This book assumes that you are using version 1.20.26 or later.
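    Alternatively, you can print the version directly from Python with a one-liner in the terminal:

    python3 -c "import boto3; print(boto3.__version__)"

    This prints the version string of whichever boto3 installation the python3 interpreter picks up.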

    The next set of commands focuses on getting the Cloud9 environment’s instance_id from the instance metadata service and storing this value in the EC2_INSTANCE_ID environment variable. Let’s run the following in the terminal:

    TARGET_METADATA_URL=http://169.254.169.254/latest/meta-data/instance-id
    export EC2_INSTANCE_ID=$(curl -s $TARGET_METADATA_URL)
    echo $EC2_INSTANCE_ID

    This should give us an EC2 instance ID with a format similar to i-01234567890abcdef.

    Now that we have the EC2_INSTANCE_ID environment variable set with the appropriate value, we can run the following command:

    python3 resize_and_reboot.py

    This will run the Python script we downloaded earlier using the wget command. After performing the volume resize operation using boto3, the script will reboot the instance. You should see a Reconnecting… notification at the top of the page while the Cloud9 environment’s EC2 instance is being restarted.

    Important note

    Feel free to run the lsblk command after the instance has been restarted. This should help you verify that the volume of the Cloud9 environment instance has been resized to 120GB.

    Now that we have successfully resized the volume to 120GB, we should be able to work on the next set of solutions without having to worry about disk space.
