Hands-On Artificial Intelligence for Beginners: An introduction to AI concepts, algorithms, and their implementation
Ebook, 593 pages

About this ebook

Virtual assistants such as Alexa and Siri process our requests, Google's cars have started to read addresses, and Amazon's prices and Netflix's recommended videos are decided by AI. Artificial Intelligence is one of the most exciting technologies and is becoming increasingly significant in the modern world.
Hands-On Artificial Intelligence for Beginners will teach you what Artificial Intelligence is and how to design and build intelligent applications. This book will teach you to harness packages such as TensorFlow to create powerful AI systems. You will begin by reviewing the recent changes in AI and learning how artificial neural networks (ANNs) have enabled more intelligent AI. You'll explore feedforward, recurrent, convolutional, and generative neural networks (FFNNs, RNNs, CNNs, and GNNs), as well as reinforcement learning methods. In the concluding chapters, you'll learn how to implement these methods for a variety of tasks, such as generating text for chatbots, and playing board and video games.
By the end of this book, you will be able to understand exactly what you need to consider when optimizing ANNs and how to deploy and maintain AI applications.

Release date: Oct 31, 2018
    Hands-On Artificial Intelligence for Beginners - Patrick D. Smith

    An introduction to AI concepts, algorithms, and their implementation

    Patrick D. Smith


    Copyright © 2018 Packt Publishing

    About the author

    Patrick D. Smith is the Data Science Lead for Excella in Arlington, Virginia, where he founded the data science and machine learning team. Prior to Excella, Patrick was the lead instructor for the data science program at General Assembly in Washington, DC, as well as a data scientist with Booz Allen Hamilton's Strategic Innovations Group.

    He holds a bachelor's degree from The George Washington University in International Economics, and is currently a part-time masters student in software engineering at Harvard University.

    My journey into technology never would have been possible without my father, Curtis Griswold Smith, who was director of I.T. for one of the world's first pioneering computer companies, Digital Equipment Corporation. It was he who introduced me to computing at three years old, and it is from him that my love of all technology stems.

    About the reviewer

    David Dindi received an M.Sc. and a B.Sc. in chemical engineering with a focus on artificial intelligence from Stanford University. While at Stanford, David developed deep learning frameworks for predicting patient-specific adverse reactions to drugs at the Stanford Center for Biomedical Informatics. He currently advises a number of early-stage start-ups in Silicon Valley and in New York.

    Table of Contents

    Who this book is for

    What this book covers

    To get the most out of this book

    Download the example code files

    Conventions used

    Get in touch


    The History of AI

    The beginnings of AI – 1950–1974

    Rebirth – 1980–1987

    The modern era takes hold – 1997–2005

    Deep learning and the future – 2012–Present


    Machine Learning Basics

    Technical requirements

    Applied math basics

    The building blocks – scalars, vectors, matrices, and tensors





    Matrix math

    Scalar operations

    Element–wise operations

    Basic statistics and probability theory

    The probability space and general theory

    Probability distributions

    Probability mass functions

    Probability density functions

    Conditional and joint probability

    Chain rule for joint probability

    Bayes' rule for conditional probability

    Constructing basic machine learning algorithms

    Supervised learning algorithms

    Random forests

    Unsupervised learning algorithms

    Basic tuning

    Overfitting and underfitting

    K-fold cross-validation

    Hyperparameter optimization


    Platforms and Other Essentials

    Technical requirements

    TensorFlow, PyTorch, and Keras


    Basic building blocks

    The TensorFlow graph


    Basic building blocks

    The PyTorch graph


    Basic building blocks

    Wrapping up

    Cloud computing essentials

    AWS basics

    EC2 and virtual machines

    S3 Storage

    AWS Sagemaker

    Google Cloud Platform basics

    GCP cloud storage

    GCP Cloud ML Engine

    CPUs, GPUs, and other compute frameworks

    Installing GPU libraries and drivers

    With Linux (Ubuntu)

    With Windows

    Basic GPU operations

    The future – TPUs and more


    Your First Artificial Neural Networks

    Technical requirements

    Network building blocks

    Network layers

    Naming and sizing neural networks

    Setting up network parameters in our MNIST example

    Activation functions

    Historically popular activation functions

    Modern approaches to activation functions

    Weights and bias factors

    Utilizing weights and biases in our MNIST example

    Loss functions

    Using a loss function for simple regression

    Using cross-entropy for binary classification problems

    Defining a loss function in our MNIST example

    Stochastic gradient descent

    Learning rates

    Utilizing the Adam optimizer in our MNIST example


    The training process

    Putting it all together

    Forward propagation


    Forwardprop and backprop with MNIST

    Managing a TensorFlow model

    Saving model checkpoints


    Convolutional Neural Networks

    Overview of CNNs

    Convolutional layers

    Layer parameters and structure

    Pooling layers

    Fully connected layers

    The training process

    CNNs for image tagging


    Recurrent Neural Networks

    Technical requirements

    The building blocks of RNNs

    Basic structure

    Vanilla recurrent neural networks




    Backpropagation through time

    Memory units – LSTMs and GRUs



    Sequence processing with RNNs

    Neural machine translation

    Attention mechanisms

    Generating image captions

    Extensions of RNNs

    Bidirectional RNNs

    Neural Turing machines


    Generative Models

    Technical requirements

    Getting to AI – generative models


    Network architecture

    Building an autoencoder

    Variational autoencoders




    Training and optimizing VAEs

    Utilizing a VAE

    Generative adversarial networks

    Discriminator network

    Generator network

    Training GANs

    Other forms of generative models

    Fully visible belief nets

    Hidden Markov models

    Boltzmann machines



    Reinforcement Learning

    Technical requirements

    Principles of reinforcement learning

    Markov processes



    Value functions

    The Bellman equation


    Policy optimization

    Extensions on policy optimization


    Deep Learning for Intelligent Agents

    Technical requirements

    Word embeddings


    Training Word2vec models


    Constructing a basic agent


    Deep Learning for Game Playing

    Technical requirements


    Networks for board games

    Understanding game trees

    AlphaGo and intelligent game-playing AIs

    AlphaGo policy network

    AlphaGo value network

    AlphaGo in action

    Networks for video games

    Constructing a Deep Q–network

    Utilizing a target network

    Experience replay buffer

    Choosing action

    Training methods

    Training the network

    Running the network


    Deep Learning for Finance


    Introduction to AI in finance

    Deep learning in trading

    Building a trading platform

    Basic trading functions

    Creating an artificial trader

    Managing market data

    Price prediction utilizing LSTMs

    Backtesting your algorithm

    Event-driven trading platforms

    Gathering stock price data

    Generating word embeddings

    Neural Tensor Networks for event embeddings

    Predicting events with a convolutional neural network

    Deep learning in asset management


    Deep Learning for Robotics

    Technical requirements


    Setting up your environment

    MuJoCo physics engine

    Downloading the MuJoCo binary files

    Signing up for a free trial of MuJoCo

    Configuring your MuJoCo files

    Installing the MuJoCo Python package

    Setting up a deep deterministic policy gradients model

    Experience replay buffer

    Hindsight experience replay

    The actor–critic network

    The actor

    The critic

    Deep Deterministic Policy Gradients

    Implementation of DDPG



    Deploying and Maintaining AI Applications

    Technical requirements


    Deploying your applications

    Deploying models with TensorFlow Serving

    Utilizing Docker

    Building a TensorFlow client

    Training and deploying with the Google Cloud Platform

    Training on GCP

    Deploying for online learning on GCP

    Using an API to Predict

    Scaling your applications

    Scaling out with distributed TensorFlow

    Testing and maintaining your applications

    Testing deep learning algorithms



    Other Books You May Enjoy

    Leave a review - let other readers know what you think


    Virtual assistants such as Alexa and Siri process our requests, Google's cars have started to read addresses, and Amazon's prices and Netflix's recommended videos are decided by AI. AI is one of the most exciting technologies, and is becoming increasingly significant in the modern world.

    Hands-On Artificial Intelligence for Beginners will teach you what AI is and how to design and build intelligent applications. This book will teach you to harness packages such as TensorFlow to create powerful AI systems. You will begin by reviewing the recent changes in AI and learning how artificial neural networks (ANNs) have enabled more intelligent AI. You'll explore feedforward, recurrent, convolutional, and generative neural networks (FFNNs, RNNs, CNNs, and GNNs), as well as reinforcement learning methods. In the concluding chapters, you'll learn how to implement these methods for a variety of tasks, such as generating text for chatbots, directing self-driving cars, and playing board and video games.

    By the end of this book, you will be able to understand exactly what you need to consider when optimizing ANNs and how to deploy and maintain AI applications.

    Who this book is for

    This book is designed for beginners in AI, aspiring AI developers, and machine learning enthusiasts with an interest in leveraging various algorithms to build powerful AI applications.

    What this book covers

    Chapter 1, The History of AI, begins by discussing the mathematical basis of AI and how certain theorems evolved. Then, we'll look at the AI winter and at the research done in the 1980s and 90s to improve ANNs, and we'll finish off with how we arrived at where we are today.

    Chapter 2, Machine Learning Basics, introduces the fundamentals of machine learning and AI. Here, we will cover essential probability theory, linear algebra, and other elements that will lay the groundwork for the future chapters.

    Chapter 3, Platforms and Other Essentials, introduces the deep learning libraries Keras and TensorFlow and moves on to an introduction to basic AWS terminology and concepts that are useful for deploying your networks in production. We'll also introduce CPUs and GPUs, as well as other forms of compute architecture that you should be familiar with when building deep learning solutions.

    Chapter 4, Your First Artificial Neural Networks, explains how to build our first artificial neural network. We will learn about the core elements of ANNs and construct a simple single-layer network in both Keras and TensorFlow so that you understand how the two libraries work. With this simple network, we will do a basic classification task, such as the MNIST OCR task.

    Chapter 5, Convolutional Neural Networks, introduces the convolutional neural network and explains its inner workings. We'll touch upon the basic building blocks of convolutions, pooling layers, and other elements. Lastly, we'll construct a convolutional neural network for image tagging.

    Chapter 6, Recurrent Neural Networks, introduces one of the workhorses of deep learning and AI – the recurrent neural network. We'll first introduce the conceptual underpinnings of recurrent neural networks, with a specific focus on utilizing them for natural language processing tasks. We'll show how one can generate text utilizing one of these networks and see how they can be utilized for predictive financial models.

    Chapter 7, Generative Models, covers generative models primarily through the lens of GANs, and we'll look at how we can accomplish each of the above tasks with GANs.

    Chapter 8, Reinforcement Learning, introduces additional forms of neural networks. First, we'll take a look at autoencoders, which are unsupervised learning algorithms that help us recreate inputs when we don't have access to input data. Afterwards, we'll touch upon other forms of networks, such as the emerging geodesic neural networks.

    Chapter 9, Deep Learning for Intelligent Agents, focuses on utilizing our knowledge of various forms of neural networks from the previous section to make an intelligent assistant, along the lines of Amazon's Alexa or Apple's Siri. We'll learn about and utilize word embeddings, recurrent neural networks, and decoders.

    Chapter 10, Deep Learning for Game Playing, explains how to construct game-playing algorithms with reinforcement learning. We'll look at several different forms of games, from simple Atari-style games to more advanced board games. We'll touch upon the methods that DeepMind utilized to build AlphaGo.

    Chapter 11, Deep Learning for Finance, shows how to create an advanced market prediction system in TensorFlow utilizing RNNs.

    Chapter 12, Deep Learning for Robotics, uses deep learning to teach a robot to move objects. We will first train the neural network in simulated environments and then move on to real mechanical parts with images acquired from a camera.

    Chapter 13, Deploying and Maintaining AI Applications, introduces methods for creating and scaling training pipelines and deployment architectures for AI systems.

    To get the most out of this book

    The code in each chapter can be executed directly using Jupyter and Python. The code files for the book are available at the GitHub link provided in the following sections.

    The History of AI

    The term Artificial Intelligence (AI) carries a great deal of weight. AI has benefited from over 70 years of research and development. The history of AI is varied and winding, but one ground truth remains – tireless researchers have worked through funding growths and lapses, promise and doubt, to push us toward achieving ever more realistic AI.

    Before we begin, let's weed through the buzzwords and marketing and establish what AI really is. For the purposes of this book, we will rely on this definition:

    AI is a system or algorithm that allows computers to perform tasks without explicitly being programmed to do so.

    AI is an interdisciplinary field. While we'll focus largely on utilizing deep learning in this book, the field also encompasses elements of robotics and IoT, and has a strong overlap (if it hasn't consumed it yet) with generalized natural language processing research. It's also intrinsically linked with fields such as Human-Computer Interaction (HCI) as it becomes increasingly important to integrate AI with our lives and the modern world around us.

    AI goes through waves, and is bound to go through another (perhaps smaller) wave in the future. Each time, we push AI to the limits of the computational power available to us, and then research and development stalls. This day and age may be different, as we benefit from the confluence of increasingly large and efficient data stores, fast and cheap computing power, and the funding of some of the most profitable companies in the world. To understand how we ended up here, let's start at the beginning.

    In this chapter, we will cover the following topics:

    The beginnings of AI – 1950–1974

    Rebirth – 1980–1987

    The modern era takes hold – 1997–2005

    Deep learning and the future – 2012–Present

    The beginnings of AI – 1950–1974

    Since the time of the earliest mathematicians and thinkers, AI has been a long-sought-after concept. The ancient Greeks developed myths of the automata, a form of robot that would complete tasks for the gods that they considered menial, and throughout early history thinkers pondered what it meant to be human, and whether human intelligence could be replicated. While it's impossible to pinpoint an exact beginning for AI as a field of research, its development parallels the early advances of computer science. One could argue that computer science as a field developed out of this early desire to create self-thinking machines.

    During the Second World War, British mathematician and code breaker Alan Turing developed some of the first computers, conceived with the vision of AI in mind. Turing wanted to create a machine that would mimic human comprehension, utilizing all available information to reason and make decisions. In 1950, he published Computing Machinery and Intelligence, which introduced what we now call the Turing test of AI. The Turing test is a benchmark for measuring how well a machine can mimic human interaction: to pass, the machine must be able to fool a discerning judge as to whether it is a human or not. This might sound simple, but think about how many complex problems would have to be solved to reach this point. The machine would have to comprehend, store information on, and respond to natural language, all the while remembering knowledge and responding to situations with what we deem common sense.

    Turing could not move far beyond his initial developments; in his day, utilizing a computer for research cost almost $200,000 per month, and computers could not store commands. His research and devotion to the field, however, have earned him accolades. Today, he is widely considered the father of AI and of the academic study of computer science.

    It was in the summer of 1956, however, that the field was truly born. Just a few months before, researchers at the RAND Corporation developed the Logic Theorist – considered the world's first AI program – which proved 38 theorems of the Principia Mathematica. Spurred on by this development and others, John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon hosted the now famous Dartmouth Summer Research Project on AI, coining the term Artificial Intelligence itself and laying the groundwork for the field. With funding from the Rockefeller Foundation, these four friends brought together some of the preeminent researchers in AI over the course of the summer to brainstorm and attempt to provide a roadmap for the field. They came from the institutions and companies that were on the leading edge of the computing revolution at the time: Harvard, Dartmouth, MIT, IBM, Bell Labs, and the RAND Corporation. Their topics of discussion were forward-thinking for the time, and could easily have been those of an AI conference today: artificial neural networks (ANNs), natural language processing (NLP), theories of computation, and general computing frameworks. The Summer Research Project was seminal in creating the field of AI as we know it today, and many of its discussion topics spurred the growth of AI research and development through the 1950s and 1960s.

    After 1956, innovation kept up a rapid pace. Two years later, in 1958, a researcher at the Cornell Aeronautical Laboratory named Frank Rosenblatt invented one of the founding algorithms of AI, the Perceptron. The following diagram shows the Perceptron algorithm:

    The Perceptron algorithm

    Perceptrons are simple, single-layer networks that work as linear classifiers. They consist of four main architectural components:

    The input layer: The initial layer for reading in data

    Weight and bias vectors: Weights are the values learned during training for the connections between neurons, while biases shift the activation function to fit the desired output

    A summation function: A weighted sum of the inputs, plus the bias

    An activation function: A simple mapping of the summed weighted input to the output
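
    The four components above can be sketched in a few lines of Python. This is a minimal illustration rather than Rosenblatt's original formulation: the function names, the step activation with a threshold at zero, the learning rate of 1.0, and the logical AND training data are all assumptions chosen to keep the example small.

```python
def step(z):
    """Activation function: map the summed weighted input to a class label."""
    return 1 if z >= 0 else 0


def predict(weights, bias, inputs):
    """Summation function (weighted sum plus bias) followed by the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(weighted_sum)


def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Perceptron learning rule: adjust weights by the error on the class label."""
    weights = [0.0] * len(samples[0])  # one weight per input
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in zip(samples, labels):
            error = target - predict(weights, bias, inputs)  # -1, 0, or 1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Input layer: each sample is a pair of binary inputs.
# Logical AND is linearly separable, so a single perceptron can learn it.
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]

weights, bias = train_perceptron(and_inputs, and_labels)
predictions = [predict(weights, bias, x) for x in and_inputs]
print(predictions)  # [0, 0, 0, 1]
```

    Note that the update only uses the thresholded class label, so the weights stop changing as soon as every training example is classified correctly.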

    As you can see, these networks rely on nothing more than basic arithmetic: multiplication, addition, and a simple mapping to the output. They failed to live up to the hype, however, and the vast disappointment they created contributed significantly to the first AI winter.

    Another important development of this early era of research was Adaline (the adaptive linear neuron). Adaline attempted to improve upon the perceptron by utilizing continuous predicted values to learn the coefficients, unlike the perceptron, which utilizes class labels. The following diagram shows the Adaline algorithm:
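
    As a complement to the diagram, the key difference from the perceptron can be sketched in code: the error is computed from the continuous weighted sum rather than from the thresholded class label. The sketch below is a minimal illustration under stated assumptions. It uses a batch gradient-descent variant of the Widrow-Hoff (delta) rule on the mean squared error, and the function names, learning rate, epoch count, 0.5 decision threshold, and logical AND data are all hypothetical choices made for the example.

```python
def net_input(weights, bias, inputs):
    """Continuous output: the weighted sum before any thresholding."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def train_adaline(samples, targets, learning_rate=0.5, epochs=200):
    """Batch gradient descent on the mean squared error of the continuous output."""
    n_samples = len(samples)
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        # Errors use the raw weighted sum, not the predicted class label.
        errors = [t - net_input(weights, bias, x)
                  for x, t in zip(samples, targets)]
        for j in range(n_features):
            gradient = sum(e * x[j] for e, x in zip(errors, samples)) / n_samples
            weights[j] += learning_rate * gradient
        bias += learning_rate * sum(errors) / n_samples
    return weights, bias


def classify(weights, bias, inputs, threshold=0.5):
    """Threshold the continuous output only when a class label is needed."""
    return 1 if net_input(weights, bias, inputs) >= threshold else 0


and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [0, 0, 0, 1]

weights, bias = train_adaline(and_inputs, and_targets)
labels = [classify(weights, bias, x) for x in and_inputs]
print(labels)  # [0, 0, 0, 1]
```

    Because the error is continuous, the weights keep moving toward the least-squares fit even after every example is classified correctly, which is the behavior that distinguishes this rule from the perceptron's.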

    These golden years also brought us early advances such as the STUDENT program, which solved high school algebra problems, and the ELIZA chatbot. By 1963, the advances in the field convinced the newly formed Advanced Research Projects Agency (ARPA, later renamed DARPA) to begin funding AI research at MIT.

    By the late 1960s, funding in the US and the UK began to dry up. In 1969, the book Perceptrons, by MIT's Marvin Minsky and Seymour Papert, proved that these networks could compute only extremely basic functions. In fact, the authors went so far as to suggest that Rosenblatt had greatly exaggerated his findings and the importance of the perceptron. The conclusion that perceptrons were of limited use to the field effectively halted research into network structures.
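
    A minimal sketch of this limitation, assuming the XOR function as the test case (the example most often associated with Minsky and Papert's argument, though it is not named in the text above). The code searches a small grid of weights and biases for a single threshold unit that reproduces AND, OR, and XOR. The grid, the names, and the threshold at zero are illustrative assumptions, and a grid search is a demonstration, not a proof.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TRUTH_TABLES = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}


def unit_output(w1, w2, bias, x1, x2):
    """A single threshold unit: weighted sum followed by a step function."""
    return 1 if w1 * x1 + w2 * x2 + bias >= 0 else 0


def find_weights(targets, grid):
    """Return the first (w1, w2, bias) on the grid that matches the truth table."""
    for w1, w2, bias in product(grid, repeat=3):
        outputs = [unit_output(w1, w2, bias, x1, x2) for x1, x2 in INPUTS]
        if outputs == targets:
            return (w1, w2, bias)
    return None  # no setting on the grid reproduces the function


GRID = [i / 2 for i in range(-8, 9)]  # -4.0, -3.5, ..., 4.0
solutions = {name: find_weights(table, GRID)
             for name, table in TRUTH_TABLES.items()}

for name, found in solutions.items():
    print(name, "solvable" if found is not None else "no solution found")
```

    AND and OR each have a solution on the grid, while XOR has none: its two positive examples cannot be separated from its two negative examples by a single line, so no choice of weights and bias for one unit can work.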

    With both governments releasing reports that significantly criticized the usefulness of AI, the field was plunged into what has become known as the AI winter. AI research continued throughout the late 1960s and 1970s, mostly under different terminology. The terms machine learning, knowledge-based system, and pattern recognition all come from this period, when researchers had to think up creative names for their work in order to receive funding. Around this time, however, a student at the University of Cambridge named Geoffrey Hinton began exploring ANNs and how we could utilize them to mimic the brain's memory functions. We'll talk a lot more about Hinton in the following sections and throughout this book, as he has become one of the most important figures in AI today.

    Rebirth – 1980–1987

    The 1980s saw the birth of deep learning, the brain of AI that has become the focus of most modern AI research. With the revival of neural network research by John Hopfield and David Rumelhart, and several funding initiatives in Japan, the United States, and the United Kingdom, AI research was back on track.

    In the early 1980s, while the United States was still reeling from the effects of the AI winter, Japan was funding the Fifth Generation Computer Systems project to advance AI research. In the US, DARPA once again ramped up funding for AI research, with business regaining interest in AI applications. IBM's T.J. Watson Research Center

