How The Components Work Together?
According to Microsoft:
Machine learning is a data science technique that allows computers to use existing data to forecast
future behaviors, outcomes, and trends.
As you move forward, you will learn how to use Azure Machine Learning Studio as an integrated,
end-to-end data science and advanced analytics solution, enabling data scientists to prepare data,
develop experiments, and deploy models at cloud scale.
Closely related to Machine Learning Studio, Machine Learning Service is an Azure offering from Microsoft that provides functionality similar to Azure ML Studio.
It is currently in preview and provides an environment with much better support for open frameworks in Python such as TensorFlow, scikit-learn, etc.
It supports Jupyter Notebooks, Visual Studio Code Tools for AI, Azure Batch AI, and containerised deployment.
Azure AD Authentication
Azure Active Directory (Azure AD) simplifies authentication with support for industry-standard protocols such as:
- OAuth 2.0
- OpenID Connect
- SAML
- WS-Federation
- Basic Authentication
It supports two sign-in models:
- Synced Authentication (with or without Password Hash)
- Federated Authentication
What is it Based on?
Azure Machine Learning is built on top of:
Jupyter Notebook
Apache Spark
Docker
Kubernetes
Python
Conda
Microsoft Machine Learning Library for Apache Spark
Microsoft Cognitive Toolkit.
Other ML Tools
In addition to Azure Machine Learning Studio, there are a variety of options at
Microsoft to build, deploy, and manage your machine learning models.
Machine Learning Service
SQL Server Machine Learning Services
Microsoft Machine Learning Server
Azure Data Science Virtual Machine
Spark MLlib in HDInsight
Batch AI Training Service
Microsoft Cognitive Toolkit (CNTK)
Azure Cognitive Services
Moving Further
Azure ML Platform
This topic helps you understand how to use the Azure ML platform on the web to generate predictions by creating and deploying ML models. You will learn to:
- Create a Predictive Experiment from a Training Experiment, encapsulating the data transformations and the trained model.
- Use the Predictive Experiment to create a web service that generates predictions through an API endpoint and an API key.
Workspace
A Workspace is provisioned under an Azure subscription and can be thought of as a playground for machine-learning experimentation.
A Workspace contains a user's:
- Training Experiments,
- Predictive Experiments, and
- Web Service collections.
To explore the Azure ML platform, we must have an Azure subscription and must create a Workspace.
Getting Started
Watch the following video to learn to create and explore the features of a
Workspace.
Managing Workspaces
Watch the video to learn to manage multiple workspaces for multiple
projects/experiments.
Managing Datasets
Watch the video to learn how to get datasets into a workspace to start working on
them for various experiments.
Notebooks
The Azure ML platform has built-in support for executing R and Python scripts using Jupyter Notebooks.
Notebooks can be used to transform, clean, and visualise data, and to train models according to requirements.
They can be placed as part of an experiment, or used separately for visualisation/transformation of particular datasets.
Projects
The Projects tab summarises all the assets, i.e., the experiments, notebooks, and datasets used by us and added to the project.
Project layout enables easy management of various assets related to our experiments.
The Cortana Intelligence Gallery is a collection of resources people have shared on the Azure ML
platform.
Experiments can be copied from the gallery into a workspace and can be explored/modified for a
better understanding.
We can also share our work on experiments and web services for others to learn and explore.
Datasets
An Azure ML experiment requires at least one dataset on which the model is built. Data can be imported directly, or first landed in Azure Storage, where Hive or U-SQL jobs clean and prepare it for analysis.
Prepared data stored in Azure Storage can be easily imported into an Azure ML Workspace; data can also be imported from Hive or an Azure SQL database.
Further in the topic, you will learn the different types of data used in experiments and ways to use them.
Training Data
TRAINING DATA: the data used to train the experiment. It is mandatory and acts as the start of the dataflow.
Reference Data
REFERENCE DATA is not directly used in the experiment or for training a model. It only provides additional information.
For example, when a web service is published based on the ML model, a service endpoint is created to consume the web service.
Based on the client using the service, the reference data can be varied accordingly to provide customisation and flexibility to the web service.
In this topic, you have learnt ways to import data from various Azure data stores and the types of data used. A few other notable points:
- Data can also be imported from an on-premises SQL Server or several other online sources using the Import Data module.
- Multiple data formats such as .txt, .csv, .nh.csv, .tsv, .nh.tsv, Excel files, Azure tables, Hive tables, SQL database tables, .RData, etc. are supported.
- Data types recognised by ML Studio are String, Integer, Double, Boolean, DateTime, and TimeSpan.
Import Data
Prepare Data
Scenarios for advanced analytics.
The Lifecycle
A Data Science solution involves:
- Identifying a Business Problem
- Data Extraction
- Data Cleaning
- Data Transformation
- Exploratory Data Analysis
- Data Visualisation
- Machine Learning Modeling
- Publishing the Prediction Service
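The data-centric middle of this lifecycle can be illustrated with a minimal, self-contained sketch. This is plain Python with no Azure dependency; the raw data and the closed-form least-squares model are invented purely for illustration:

```python
# Toy walk-through of extract -> clean -> model -> predict,
# using ordinary least squares on a single feature.

# Extraction: in practice this would come from Azure Storage, Hive, etc.
raw = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, 7.8)]

# Cleaning: drop rows with missing values.
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# Modelling: closed-form simple linear regression, y ~ a*x + b.
n = len(clean)
mean_x = sum(x for x, _ in clean) / n
mean_y = sum(y for _, y in clean) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in clean)
var = sum((x - mean_x) ** 2 for x, _ in clean)
a = cov / var
b = mean_y - a * mean_x

def predict(x):
    """Prediction step, e.g. predict(5.0) -> 9.85 for this toy data."""
    return a * x + b
```

In Azure ML Studio, each of these steps would instead be a drag-and-drop module (Clean Missing Data, Linear Regression, Score Model, and so on).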
Adopting ML Studio
Azure Machine Learning Studio helps through the entire lifecycle of a Data Science
solution.
We already had a look at Data Import from various sources, which can be considered
as Data Extraction.
Data Cleaning and Transformation are done based on the business problem and the approach taken to solve it. You can see this in the videos over the following cards.
- Filters: transform and clean digital data; can help in speech processing.
- Manipulation: clean missing values, edit metadata, apply SQL transforms, etc.
- Scale and Reduce: normalisation, grouping, clipping, etc.
- Sample and Split: partition and sample data.
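The Manipulation, Scale and Reduce, and Sample and Split module families above correspond roughly to the following operations. This is a plain-Python sketch over invented data; in Studio these are drag-and-drop modules rather than code:

```python
import random

rows = [3.0, None, 10.0, 7.0, None, 1.0, 5.0, 9.0]

# Manipulation: clean missing values (here, simply by dropping them).
cleaned = [v for v in rows if v is not None]

# Scale and Reduce: min-max normalisation into [0, 1].
lo, hi = min(cleaned), max(cleaned)
scaled = [(v - lo) / (hi - lo) for v in cleaned]

# Sample and Split: shuffle, then a 75/25 train/test partition.
random.seed(42)          # fixed seed so the split is reproducible
shuffled = scaled[:]
random.shuffle(shuffled)
cut = int(len(shuffled) * 0.75)
train, test = shuffled[:cut], shuffled[cut:]
```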
Data Visualisation
Exploratory Data Analysis and Data Visualisation are facilitated by Notebooks in Azure ML Studio. Basic data visualisation is readily available in Azure ML Studio via a right-click on uploaded datasets.
However, Notebooks can be used to further visualise data as required, with Python/R scripts adding flexibility and functionality.
Watch the following video to learn how to use Notebooks for data visualisation.
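To give a flavour of the kind of exploratory summary a Notebook cell might compute, here is a stdlib-only sketch over invented price values; in a real notebook you would typically plot these with matplotlib (Python) or ggplot2 (R) instead:

```python
import statistics
from collections import Counter

# Invented sample of car prices for illustration only.
prices = [13495, 16500, 16500, 13950, 17450, 15250, 17710, 18920]

summary = {
    "count": len(prices),
    "mean": statistics.mean(prices),
    "median": statistics.median(prices),
    "stdev": statistics.stdev(prices),
}

# A crude text histogram: bucket prices into 2000-wide bins.
bins = Counter((p // 2000) * 2000 for p in prices)
for edge in sorted(bins):
    print(f"{edge:>6}-{edge + 1999:<6} {'#' * bins[edge]}")
```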
Machine learning algorithms are divided into four major classes, covering both structured and unstructured data:
- Regression
- Classification
- Anomaly Detection
- Clustering
Go through Machine Learning Axioms and other ML courses for model selection and feature selection.
Available ML algorithms
This video doesn't contain any audio.
Supervised Learning
Azure ML Platform is equipped with over 20 types of supervised learning methods, built in and ready to use.
Users can also write their own Python scripts and embed them into the ML workflow for customised and optimised models, using Notebooks and supported ML libraries such as scikit-learn, TensorFlow, etc.
Move to the next cards to check out examples of various supervised learning algorithms along with concepts of data cleaning and transformation.
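Custom scripts embedded this way usually follow the scikit-learn fit/predict convention. Here is a minimal pure-Python stand-in for such a custom model (the classifier and its data are invented; there is no Azure or scikit-learn dependency):

```python
class MeanClassifier:
    """Toy classifier: predicts the class whose feature mean is closest."""

    def fit(self, X, y):
        groups = {}
        for xi, yi in zip(X, y):
            groups.setdefault(yi, []).append(xi)
        # Per-class mean of the single feature.
        self.means_ = {label: sum(v) / len(v) for label, v in groups.items()}
        return self

    def predict(self, X):
        return [min(self.means_, key=lambda c: abs(self.means_[c] - xi))
                for xi in X]

model = MeanClassifier().fit([1.0, 1.2, 4.0, 4.4],
                             ["low", "low", "high", "high"])
print(model.predict([1.1, 4.2]))   # -> ['low', 'high']
```

A real embedded script would expose the same fit/predict shape, just backed by a scikit-learn or TensorFlow model.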
Regression Model
Watch this video to understand how to train a regression model from a sample
dataset using Azure ML Experimentation service.
Moving Further
Supervised learning is the most widely used ML modelling technique for structured data. Different algorithms have their own pros and cons, so your requirements should drive the choice of the optimal algorithm.
Refer to the following links for a better understanding of supervised learning methods w.r.t. the Azure ML platform: Feature Engineering, Algorithm Selection and Evaluating ML Model.
Unsupervised Learning
Unsupervised Learning is used to work on unstructured and unlabelled data.
Clustering is the most commonly used method: similar data points are grouped together based on their extracted feature sets.
Azure ML Platform supports K-Means Clustering for Unsupervised Learning.
K-Means Clustering
Watch the video to learn how to use the K-Means Clustering algorithm and visualise data after clustering.
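The idea behind K-Means can be sketched in a few lines of plain Python. One-dimensional points and k=2 are chosen for brevity, and the data and initial centroids are invented; the platform's module implements the same assign/update loop at scale:

```python
def kmeans_1d(points, centroids, iters=10):
    """Tiny 1-D K-Means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iters):
        clusters = {c: [] for c in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step; keep an empty cluster's centroid where it was.
        centroids = [sum(v) / len(v) if v else centroids[i]
                     for i, v in clusters.items()]
    return centroids, clusters

centroids, clusters = kmeans_1d([1.0, 1.5, 2.0, 10.0, 10.5, 11.0], [0.0, 5.0])
```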
Recommenders
Recommenders, as their name suggests, are used in sectors such as e-commerce, ads, and social platforms. They recommend related items of interest based on users' previous interactions.
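The core idea can be illustrated with a tiny co-occurrence recommender. This is only a sketch over invented purchase data; Azure ML's actual Matchbox Recommender is a far more sophisticated probabilistic model:

```python
from collections import Counter

# Invented purchase histories: user -> set of items bought.
history = {
    "u1": {"laptop", "mouse", "keyboard"},
    "u2": {"laptop", "mouse"},
    "u3": {"laptop", "monitor"},
    "u4": {"mouse", "keyboard"},
}

def recommend(user, k=2):
    """Suggest items bought by users who share items with `user`."""
    owned = history[user]
    scores = Counter()
    for other, items in history.items():
        if other != user and owned & items:    # overlapping taste
            scores.update(items - owned)       # score items user lacks
    return [item for item, _ in scores.most_common(k)]
```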
What's Next?
In this topic, you learned how to create an ML model from given datasets and the different types of ML models supported on the Azure ML Platform.
Move over to the next topic to learn to deploy ML models as web services.
Refer to the following links for further info on K-Means Clustering and the Matchbox Recommender.
Once the trained models are tested for accuracy and optimised, they need to be deployed for consumption through an API.
This topic helps you learn to deploy the trained model as a web service using a predictive experiment.
Project Experiments
Predictive Experiment
A Predictive Experiment is created from a successful Training Experiment.
It defines the workflow for processing web requests to generate predictions.
It is structured similarly to the training experiment, but data transformations such as normalisation are encapsulated into a single block so that the training experiment is not modified.
Web service input and output are the new blocks added to the experiment.
Webservice
A web service is created from a Predictive Experiment to consume the generated ML model.
Creating a web service generates API endpoints which can be accessed using the primary and secondary API keys.
The API endpoints take the inputs required by the ML model and return JSON output with predictions.
Webservice Workflow
The following workflow is adopted for building a web service with predictive
experiment:
Deploying a Webservice
Check the following video to understand how to deploy a web service.
Moving Further
Now that the basics of creating and deploying a web service are understood, move over to the next topic to learn more about managing and customising web services. You will also learn about metrics and logging regarding the usage of web services.
Consumption
A web service can be consumed in:
- Request-Response mode, or
- Batch mode, in an asynchronous way.
API endpoints and API keys are used according to the requirement.
The APIs are built as REST APIs and can be consumed by the required client application by passing the required parameters as an HTTPS request.
Microsoft Excel, along with the Azure ML plug-in, can also be used to consume the service.
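As a sketch of that REST consumption: the helper below assembles the HTTPS request. The endpoint URL, column names, and exact request schema here are illustrative assumptions; always copy the real schema from your web service's API help page:

```python
import json
import urllib.request

def build_request(url, api_key, columns, values):
    """Assemble an HTTPS request for a classic Azure ML
    Request-Response endpoint (body schema shown is illustrative)."""
    body = {
        "Inputs": {"input1": {"ColumnNames": columns, "Values": values}},
        "GlobalParameters": {},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # primary or secondary API key
    }
    return urllib.request.Request(url, json.dumps(body).encode("utf-8"), headers)

# Hypothetical usage -- no request is actually sent here:
# req = build_request("https://<region>.services.azureml.net/.../execute",
#                     "<API-KEY>", ["make", "horsepower"], [["audi", "110"]])
# with urllib.request.urlopen(req) as resp:   # response body is JSON predictions
#     predictions = json.load(resp)
```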
Consuming APIs
The following video explains how to consume the API service using API endpoints, and also using Excel.
Parameters
A web service can be customised to take extra parameters to access additional data.
For example, you can specify a database to fetch the data from; it is provided as an additional parameter apart from the parameters required for a prediction.
This enables customised client consumption: each client uses the same service for prediction, but data output/retrieval differs for each of them.
Monitoring
Azure ML Platform enables web service management through usage statistics and logging.
Dashboards provide an overview of the total number of requests made to the API and their success/fail rate over a selected period of time. They also give the average compute time and latency associated with the API.
Logging can be enabled for more detail: JSON response files are stored automatically on Azure Storage, providing a detailed report of each request and response.
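The dashboard figures described above (request counts, success/fail rate, average latency) can be derived from raw request logs. A stdlib-only sketch over invented log records:

```python
# Each record is (http_status, latency_ms); the values are invented.
logs = [(200, 120), (200, 95), (500, 310), (200, 140), (429, 20)]

total = len(logs)
succeeded = sum(1 for status, _ in logs if 200 <= status < 300)
success_rate = succeeded / total                 # fraction of 2xx responses
avg_latency = sum(ms for _, ms in logs) / total  # mean latency in ms
```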
You are the senior Cloud Engineer in your project. There are new joinees in your project, and you need to give them a demo of the following scenario in Azure Machine Learning Studio:
1. Create a Machine Learning Studio (classic) workspace — Location: South Central US, Pricing tier: DEVTEST Standard — and navigate to Azure ML Studio.
2. Create a blank experiment: add the sample dataset Automobile price data (Raw); add Select Columns in Dataset (exclude column: normalized-losses); add Clean Missing Data (remove the entire row); run the experiment to complete data preparation.
3. Add Select Columns in Dataset (include columns: make, body-style, wheel-base, engine-size, horsepower, peak-rpm, highway-mpg, price); add Split Data (fraction of rows in the first output dataset: 0.75).
4. Add Linear Regression and Train Model; connect the output of Linear Regression and the left output of Split Data to Train Model; add the price column in the Train Model module and run the experiment to train the regression model.
5. Add a Score Model and connect it with the Train Model and Split Data. Run the experiment and visualise the output of the Score Model for the prediction of price.
6. To test the result quality, add the Evaluate Model and connect the output of the Score Model to the input of the Evaluate Model.
Note: Use the credentials given in the hands-on to log in to the Azure Portal. Create a new resource group and use the same resource group for all resources. The username/password/service names can be as per your choice. For the module connections, refer to the diagram provided. After completing the hands-on, delete all the resources created.
However, if we want to use ML models in scenarios with large sets of data, i.e., Big Data, processing must be done in batches, at scheduled intervals and at optimal times, to reduce latency and promote an asynchronous way of obtaining predictions for our data.
Azure Data Factory and its pipelines come into play in these kinds of scenarios.
Azure ML in Pipeline
When Big Data batch processing is done by a pipeline, predictions can be part of the pipeline, with Azure ML used as a linked service.
The Azure ML Batch Execution activity is used to call a predictive web service from a pipeline.
The input dataset is passed to the web service input, and the predicted output from the web service is returned and passed on to the next activity in the pipeline.
Other linked services are data sources, like Azure Storage or Azure SQL Database, and compute services, like Azure HDInsight or Azure Data Lake Analytics.
In a Big Data scenario, retraining the model might be needed fairly frequently, since huge amounts of data are generated and may vary over short intervals of time.
We need fresh data to train the ML model to improve its accuracy with the changing inflow of data.
However, considering the amount of data to be handled while Big Data processes are ongoing, retraining manually is neither efficient nor recommended.
In this situation, the Azure Data Factory pipeline and the Azure ML Update Resource service come to the rescue to automate the retraining of ML models.
Automating Retraining
To automate the process of retraining, Azure ML provides a feature to publish the training experiment as a retraining web service.
The following activities can be executed in sequence, in the Azure Data Factory
Pipeline, to achieve the automation task:
1. The Azure ML Batch Execution activity is used to call the retraining web service and generate a new model as a file.
2. The model file is passed to an Azure ML Update Resource activity that updates the scoring experiment, replacing the existing model.
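That two-activity sequence can be sketched as an orchestration skeleton. Every name below is a stand-in and no real Azure Data Factory or Azure ML call is made; the `.ilearner` model-file extension follows classic Azure ML's retraining output, but treat the details as illustrative:

```python
def batch_execution_activity(retrain_url, fresh_data):
    """Stand-in for the Azure ML Batch Execution activity: call the
    retraining web service and return the new model file location."""
    return {"model_blob": f"retrained-{len(fresh_data)}-rows.ilearner"}

def update_resource_activity(scoring_url, model_file):
    """Stand-in for the Azure ML Update Resource activity: swap the
    model behind the scoring (predictive) web service."""
    return {"updated": True, "model": model_file["model_blob"]}

def retraining_pipeline(fresh_data):
    # The two activities run in sequence, as in the Data Factory pipeline.
    model_file = batch_execution_activity("https://example.invalid/retrain",
                                          fresh_data)
    return update_resource_activity("https://example.invalid/score", model_file)
```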
Moving Further
Refer to the following links for more info on Predictive Pipelines and the Update Resource activity.
Now that Big Data processing for predictions has been covered, move over to the next topic to learn how to process real-time data for predictions using the web services of the Azure ML Platform.
Streaming Process
Input: often Event Hubs or IoT Hubs, used to ingest real-time data at scale.
Streaming Job: used to process the data; generally an Azure Stream Analytics query.
Output: the expected result, which could be anything from a database update to a real-time dashboard with analysis.
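The three-part structure above can be mimicked with a small Python pipeline. The event stream and the scoring function are invented stand-ins for an Event Hub input and an Azure ML function called from the streaming query:

```python
def event_stream():
    """Stand-in for an Event Hub / IoT Hub input: yields raw events."""
    for reading in [12.0, 47.5, 19.3, 80.1, 22.8]:
        yield {"sensor": "temp", "value": reading}

def score(event):
    """Stand-in for an Azure ML function called from the streaming job;
    the 40.0 anomaly threshold is arbitrary."""
    return {**event, "anomaly": event["value"] > 40.0}

def streaming_job():
    # Output: here just a list; in practice a dashboard or database sink.
    return [score(e) for e in event_stream()]
```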
Azure ML Functions
Watch the following video to understand how an Azure ML web service works as a function in a Streaming query.
Real-Life Implementation
So Far...
Using Azure ML functionality to clean data, create ML models, and deploy them as web services.
Consuming the web services in real-time client applications.
This course helps you get started with creating solutions using the Azure Machine Learning Platform.