Support Vector Regression

Definition of Support Vector Regression:

Support vector regression (SVR) is a machine learning algorithm that can be trained to learn the nonlinear relationship between input data and a target output variable without prior description of the underlying processes. SVR is based on support vector machine theory that classifies data by representing them in a multidimensional space.

This is the common definition for Support Vector Regression, other definitions can be discussed in the article

Short introduction

A support vector machine is a classifier that maps a subdomain of the input variables onto a subdomain of the output target variable(s). If the dependence of the output variable(s) on the input variables is nonlinear, it is not possible to define a linear classifier that separates input variables and corresponding output variables in distinct clusters. By performing a so-called nonlinear kernel transformation, a low dimensional data space is converted into a high dimensional space where a linear hyper-plane can classify the data points (i.e. define distinct clusters of input data and corresponding output data). Given the kernel function, the support vector machine does a systematic search to determine the hyperplane that most efficiently separates the training data into different input-output clusters. Support vectors correspond to the data points that are near to the hyperplane and help in orienting it. Once the hyperplane is known, the position of a new input data point (e.g. from the test data) relative to the hyperplane determines the cluster to which it belongs. Support vector regression assumes that clusters correspond to restricted value ranges of the target variable. A number of kernel functions exist such as Polynomial Functions (mapping data onto a finite-dimensional space) or Radial Basis Functions (mapping data onto an infinite-dimensional space) that enable non-linear classification. Support Vector Regression further assumes that the training and test data are independent and preprocessed in order to follow identical distributions (e.g., subtraction of the mean value and division by the square root of the variance). Being a highly sophisticated and mathematically sound algorithm, Support Vector Regression is one of the most accurate machine learning algorithms.

Analysis technique	Strengths	Limitations	Application example
Prediction tool based on machine learning from training data	* Handles unstructured data and nonlinear relationships in high dimensional spaces * Does classification and regression * Robust method based on sound mathematical principles * Efficient for small datasets * Overfitting can easily be avoided	* Black box, no easy interpretation of results, no probability estimates * Sensitivity to noise and outliers * Less efficient for large datasets * Not reliable outside the range of trained situations * Results influenced by the choice of the kernel transformation	Pattern recognition from images, e.g. interpretation remote sensing images

For more detailed explanations see:

StatQuest: Support Vector Machines Part 1: Main Ideas by Josh Starmer

Wikipedia Support Vector Machine

More

Navigation

Tools

Support Vector Regression

Short introduction

Related articles