A Primer on AI Machine Learning
What you need to know about machine learning at-a-glance
Posted February 4, 2019
Chances are that you are already using services powered by artificial intelligence (AI) every day. But what exactly powers machine learning? Let’s take a look at the engine under AI machine learning’s hood.
Machine learning is a subset of artificial intelligence in which a system performs tasks without being explicitly hard-coded (programmed) for them. Instead, machine learning algorithms are given large amounts of data to “learn” from and process. Machine learning can be supervised, unsupervised, semi-supervised, or reinforcement-based.
Supervised machine learning uses labeled training data: for each input, there is a known and associated output value. The goal of supervised learning is to learn a function that best estimates the relationship between the input and output data. In unsupervised learning, by contrast, no labeled outputs are associated with the input data, so the objective is for the machine to infer structure from the input training data alone, identifying the similarities and differences between data points. Semi-supervised machine learning sits between the two, combining some labeled training data with unlabeled data.
Reinforcement learning (RL) is a method in which learning is achieved through software agents interacting with their environment with the goal of maximizing reward. Markov Decision Processes (MDPs) are typically used for reinforcement learning. An MDP mathematically models decision making in uncertain environments.
At the heart of artificial intelligence are the mathematics and statistics used in computer algorithms, the procedures for solving a problem. Regression, classification, and clustering are examples of common machine learning tasks that such algorithms perform.
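To make the idea of an MDP a little more concrete, here is a minimal Python sketch of value iteration, one standard way to solve a small MDP when its probabilities and rewards are known. Everything in the toy model is invented for illustration: the two states, the two actions, the transition probabilities, the rewards, and the discount factor `gamma` are assumptions, not values from any real system.

```python
# Hypothetical toy MDP: every state, action, probability, and reward is made up.
# transitions[state][action] is a list of (probability, next_state, reward).
transitions = {
    "low": {
        "wait": [(1.0, "low", 0.0)],
        "invest": [(0.6, "high", 1.0), (0.4, "low", -1.0)],
    },
    "high": {
        "wait": [(0.9, "high", 2.0), (0.1, "low", 0.0)],
        "invest": [(1.0, "high", 1.0)],
    },
}


def action_value(state, action, values, gamma):
    """Expected reward plus discounted future value of taking one action."""
    return sum(
        prob * (reward + gamma * values[next_state])
        for prob, next_state, reward in transitions[state][action]
    )


def value_iteration(gamma=0.9, tolerance=1e-9):
    """Repeat the best-action update until the state values stop changing."""
    values = {state: 0.0 for state in transitions}
    while True:
        new_values = {
            state: max(
                action_value(state, action, values, gamma)
                for action in transitions[state]
            )
            for state in transitions
        }
        change = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if change < tolerance:
            return values


def greedy_policy(values, gamma=0.9):
    """Pick, in each state, the action with the highest expected value."""
    return {
        state: max(
            transitions[state],
            key=lambda action: action_value(state, action, values, gamma),
        )
        for state in transitions
    }


values = value_iteration()
policy = greedy_policy(values)
print(policy)
```

In this toy model the agent learns to "invest" while in the low state and to "wait" once it reaches the high state, because that combination yields the largest long-run reward. A reinforcement learning agent usually has to discover such a policy by trial and error, without being handed the transition table up front.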
The concept of regression was introduced by polymath Sir Francis Galton (Charles Darwin’s cousin) in his research on heredity: the paper “Regression towards mediocrity in hereditary stature,” published in 1886, and the book “Natural Inheritance,” published in 1889. “Regression toward the mean” is the phenomenon in which outlying measurements that are far from the norm tend to be closer to the average the next time they are measured. In statistical terms, regression to the mean arises from random variation in the measurements. It tends to be most noticeable when the sample size is too small or when samples are not randomly selected.
A way to think of this is in the context of the familiar adage to “walk away from the table” when you are ahead at a casino because winning is a random outlier, and over time, the outcome will regress towards the mean of losing. Winning streaks are uncommon outcomes and chances are high that over time you will eventually start losing if you keep playing.
Linear regression is the simplest form of regression that is used for predictive analysis in machine learning algorithms. The goal is to minimize the error between the actual values and the values the algorithm predicts. A cost function, such as the commonly used Mean Squared Error (MSE), measures the prediction errors.
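As a small illustration of what a cost function measures, the Python sketch below computes the Mean Squared Error of a straight-line model on a handful of points. The data points and the function names `predict` and `mean_squared_error` are invented for this example.

```python
def predict(x, slope, intercept):
    """Straight-line model: predicted y for a given x."""
    return slope * x + intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average of the squared gaps between actual and predicted values."""
    squared_errors = [
        (y - predict(x, slope, intercept)) ** 2 for x, y in zip(xs, ys)
    ]
    return sum(squared_errors) / len(squared_errors)


# Made-up data that lies exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

good_fit = mean_squared_error(xs, ys, slope=2.0, intercept=1.0)
poor_fit = mean_squared_error(xs, ys, slope=1.0, intercept=0.0)
print(good_fit, poor_fit)  # 0.0 7.5
```

The line that matches the data has zero error, while the poorer guess is penalized. Squaring the gaps means large misses count far more than small ones.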
Gradient descent is an optimization algorithm for machine learning used to identify the values of the coefficients (parameters) of a function that will minimize a cost function.
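A minimal sketch of that idea in Python, assuming a straight-line model and a Mean Squared Error cost: the loop repeatedly nudges the two coefficients in the direction that lowers the cost. The data, the learning rate, and the number of steps are illustrative choices, not recommended settings.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Fit y = slope * x + intercept by stepping down the MSE gradient."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gap between the current predictions and the actual values.
        residuals = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to each coefficient.
        grad_slope = (2.0 / n) * sum(r * x for r, x in zip(residuals, xs))
        grad_intercept = (2.0 / n) * sum(residuals)
        # Move each coefficient a small step against its gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up data that lies exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = gradient_descent(xs, ys)
print(round(slope, 4), round(intercept, 4))  # 2.0 1.0
```

If the learning rate is too large the steps overshoot and the error grows instead of shrinking. If it is too small, the loop needs many more steps to get close.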
Linear regression is relatively simple and straightforward. Often, however, the relationship between two variables in a given dataset is not linear, and therefore cannot be captured well by linear regression. In such cases, non-linear regression techniques are typically used. Nonlinear regression models are commonly fitted with iterative optimization methods such as gradient descent, Gauss-Newton, and Levenberg-Marquardt.
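For a sense of how one of these methods works, here is a bare-bones Gauss-Newton sketch in Python. It assumes, purely for illustration, that the data follow a curve of the form y = a * exp(b * x). The model form, the data, and the starting guesses are all made up, and the code omits the damping that Levenberg-Marquardt adds to make the steps more robust.

```python
import math


def gauss_newton_exponential(xs, ys, a, b, iterations=20):
    """Fit y = a * exp(b * x) by repeatedly solving a linearized problem."""
    for _ in range(iterations):
        # Build the 2x2 normal equations (J^T J) * delta = J^T r,
        # where J holds the model's partial derivatives and r the residuals.
        jtj_aa = jtj_ab = jtj_bb = 0.0
        jtr_a = jtr_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            residual = y - a * e
            d_a = e          # derivative of the model with respect to a
            d_b = a * x * e  # derivative of the model with respect to b
            jtj_aa += d_a * d_a
            jtj_ab += d_a * d_b
            jtj_bb += d_b * d_b
            jtr_a += d_a * residual
            jtr_b += d_b * residual
        # Solve the 2x2 system with Cramer's rule and update the parameters.
        det = jtj_aa * jtj_bb - jtj_ab * jtj_ab
        a += (jtj_bb * jtr_a - jtj_ab * jtr_b) / det
        b += (jtj_aa * jtr_b - jtj_ab * jtr_a) / det
    return a, b


# Made-up, noise-free data generated from y = 2 * exp(0.5 * x).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

a_fit, b_fit = gauss_newton_exponential(xs, ys, a=1.5, b=0.4)
print(round(a_fit, 4), round(b_fit, 4))  # 2.0 0.5
```

Starting from a reasonable guess, the method recovers the curve's parameters in a few iterations. With a poor starting guess or noisy data, plain Gauss-Newton can fail to converge, which is one reason damped variants are popular.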
Another common machine learning task is classification. Classification is supervised machine learning in which the computer learns from labeled training data and applies that learning with the goal of accurately predicting the class of new data. For example, on HBO’s comedy “Silicon Valley,” one of the enterprising characters, Mr. Jian-Yang, created an AI app called “Not Hotdog” to classify images as hot dogs or not hot dogs. In real life, Tim Anglade, the show’s lead technical advisor, created a Not Hotdog app. As with any machine learning, the quantity and quality of the training data are important. In this case, Anglade wrote in his blog post on Medium that due to biases in the initial dataset used, the app was “unable to recognize French-style hotdogs, Asian hotdogs, and more oddities we did not have immediate personal experience with,” and that AI is impacted “by the same human biases we fall prey to, via the training sets humans provide.”
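The learn-from-labels-then-predict loop can be sketched in a few lines of Python. The example below uses a nearest-centroid classifier, a deliberately simple technique chosen for readability. It is not how the Not Hotdog app works, and the two numeric features per example are invented stand-ins for whatever a real system would extract from an image.

```python
def train_centroids(examples):
    """Average the feature vectors of each label in the labeled training data."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in sums[label]] for label in sums
    }


def classify(features, centroids):
    """Predict the label whose average training example is closest."""
    def squared_distance(label):
        return sum((f - c) ** 2 for f, c in zip(features, centroids[label]))
    return min(centroids, key=squared_distance)


# Hypothetical labeled training data: (features, label) pairs.
training_data = [
    ([9.0, 1.0], "hot dog"),
    ([8.0, 1.5], "hot dog"),
    ([10.0, 1.2], "hot dog"),
    ([2.0, 2.0], "not hot dog"),
    ([3.0, 3.5], "not hot dog"),
    ([1.5, 2.5], "not hot dog"),
]

centroids = train_centroids(training_data)
print(classify([8.5, 1.0], centroids))  # hot dog
print(classify([2.5, 3.0], centroids))  # not hot dog
```

The classifier only knows what its training examples show it. An input unlike anything in the training data is still forced into one of the two classes, which is a small-scale version of the dataset bias Anglade describes.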
The third major type of machine learning task is clustering, the organization of unlabeled data into similar groups through unsupervised machine learning. To illustrate the concept of clustering, let’s look at an example of human-based statistical cluster analysis: the work done by John Snow, MD, one of the first epidemiologists. During a major cholera outbreak in 1854 in the London neighborhood of Soho, Dr. Snow theorized that cholera was a water-borne illness. He mapped cases of cholera and noticed that clusters of the outbreak were near a water pump. As it turns out, the water at that pump was polluted with the soiled diaper of a baby with cholera. Based on his detailed analysis, he concluded that cholera was not caused by “miasma” (“bad air”), as was the dominant thought at that time.
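A machine can do a rough version of the same grouping with k-means, one common clustering algorithm. The Python sketch below is a minimal version under stated assumptions: the points are invented map coordinates rather than Dr. Snow's data, the number of clusters is fixed at two, and the starting centers are chosen by hand instead of at random.

```python
def k_means(points, centers, iterations=10):
    """Alternate between assigning points to centers and moving the centers."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for point in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: (point[0] - centers[i][0]) ** 2
                + (point[1] - centers[i][1]) ** 2,
            )
            clusters[nearest].append(point)
        # Update step: move each center to the mean of its assigned points.
        centers = [
            (
                sum(p[0] for p in cluster) / len(cluster),
                sum(p[1] for p in cluster) / len(cluster),
            )
            if cluster
            else centers[i]  # keep a center in place if it attracted no points
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Made-up, unlabeled map coordinates forming two visible groups.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
    (5.0, 5.0), (5.2, 4.8), (4.8, 5.2),
]

centers, clusters = k_means(points, centers=[(0.0, 0.0), (6.0, 6.0)])
print(centers)
```

No labels are supplied at any point. The algorithm finds the two groups from the distances between points alone, and each final center lands in the middle of its group.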
The recent surge in artificial intelligence (AI) investments across many industry sectors is largely due to the pattern-recognition capabilities of deep learning, a machine learning method built on multi-layer neural networks. Deep learning networks consist of two or more layers that use nonlinear processing. Deep learning is state-of-the-art for the pattern recognition used in image and speech recognition. This technique works best when large data sets are available for training.
AI has been interwoven into social media apps, internet search, online shopping suggestions, customer service bots, personalized medicine, financial trading, industrial manufacturing management, medical drug discovery, fraud prevention, business intelligence analytics, human resources recruiting, virtual assistants, autonomous vehicles, translation engines, facial recognition, converting images to color, and even esports. The interdisciplinary fields of mathematics, statistics, data science and computer science converge in machine learning, which, in turn, is rapidly changing how we live, work and play.
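To show why the layers and the nonlinear processing matter, here is a tiny two-layer network in Python that computes the "exclusive or" (XOR) of two inputs, a pattern that no single straight-line rule can capture. The weights are picked by hand for illustration. A real deep learning system would learn its weights from training data and would use far more units and layers.

```python
def relu(x):
    """Nonlinear activation: pass positive values through, clip negatives to 0."""
    return max(0.0, x)


def tiny_network(x1, x2):
    """Two-layer network with hand-picked (not learned) weights."""
    # Hidden layer: two units, each a weighted sum followed by the nonlinearity.
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: a weighted sum of the hidden units.
    return 1.0 * h1 - 2.0 * h2


outputs = {(a, b): tiny_network(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)
# {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}
```

Without the `relu` step, the two layers would collapse into a single weighted sum and the network could not produce this pattern. The nonlinearity between layers is what lets stacked layers represent more complex relationships.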
Copyright © 2019 Cami Rosso All rights reserved.
Galton, Francis. “Regression towards mediocrity in hereditary stature.” Macmillan. 1886. Retrieved 2-4-2019 from http://galton.org/books/natural-inheritance/pdf/galton-nat-inh-1up-clea…
Galton, Francis. “Natural Inheritance.” Macmillan. 1889. Retrieved 2-4-2019 from http://galton.org/books/natural-inheritance/pdf/galton-nat-inh-1up-clea…
GeeksforGeeks. “Clustering in Machine Learning.” Retrieved 2-4-2019 from https://www.geeksforgeeks.org/
Anglade, Tim. “How HBO’s Silicon Valley built “Not Hotdog” with mobile TensorFlow, Keras & React Native.” Medium. June 26, 2017.
Rogers, Simon. “John Snow's data journalism: the cholera map that changed the world.” The Guardian. March 15, 2013.