Support Vector Regression

Definition of Support Vector Regression:
Support vector regression (SVR) is a machine learning algorithm that can be trained to learn the nonlinear relationship between input data and a target output variable without prior description of the underlying processes. SVR is based on support vector machine theory that classifies data by representing them in a multidimensional space.
This is the common definition for Support Vector Regression; other definitions can be discussed in the article.

Short introduction

A support vector machine is a classifier that maps subdomains of the input variables onto subdomains of the output (target) variable(s). If the output variable(s) depend nonlinearly on the input variables, it is not possible to define a linear classifier that separates the input variables and corresponding output variables into distinct clusters. A so-called nonlinear kernel transformation solves this by converting the low-dimensional data space into a high-dimensional space in which a linear hyperplane can classify the data points (i.e. define distinct clusters of input data and corresponding output data).

Given the kernel function, the support vector machine searches systematically for the hyperplane that most efficiently separates the training data into different input-output clusters. The support vectors are the data points closest to the hyperplane; they determine its orientation. Once the hyperplane is known, the position of a new input data point (e.g. from the test data) relative to the hyperplane determines the cluster to which it belongs. Support vector regression assumes that clusters correspond to restricted value ranges of the target variable.

Several kernel functions enable nonlinear classification, such as polynomial functions (mapping data onto a finite-dimensional space) and radial basis functions (mapping data onto an infinite-dimensional space).

Support vector regression further assumes that the training and test data are independent and identically distributed, and that the variables have been rescaled to a common scale beforehand (e.g. by subtracting the mean value and dividing by the square root of the variance, i.e. the standard deviation). Because it rests on sound mathematical principles, support vector regression often achieves high accuracy, although the result depends on the choice of the kernel and on how well the training data cover the situations to be predicted.
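The ideas in the introduction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a support vector machine: the toy data, the quadratic map `feature_map` and the hyperplane coefficients are hypothetical and chosen by hand, whereas a real support vector machine would find the hyperplane by optimisation from the training data. The sketch rescales the input by subtracting the mean and dividing by the square root of the variance, lifts the one-dimensional input into two dimensions, and reads off the cluster of each point from the side of the hyperplane on which it lies.

```python
import math


def mean_and_std(values):
    """Return the mean and the square root of the variance of the values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def standardize(values, mean, std):
    """Subtract the mean and divide by the standard deviation."""
    return [(v - mean) / std for v in values]


def feature_map(x):
    """Hypothetical quadratic map: lift a 1-D input x to the 2-D point (x, x^2)."""
    return (x, x * x)


def side_of_hyperplane(point, weights, bias):
    """Return +1 or -1 for the side of the hyperplane w.p + b = 0."""
    score = sum(w * p for w, p in zip(weights, point)) + bias
    return 1 if score >= 0 else -1


# Toy training data: outer points (+1) and inner points (-1).
# No single threshold on x separates the two clusters in one dimension.
raw_inputs = [-3.0, -1.0, 1.0, 3.0]
labels = [1, -1, -1, 1]

mean, std = mean_and_std(raw_inputs)
inputs = standardize(raw_inputs, mean, std)

# In the lifted space the clusters are separated by the line x^2 = 1,
# i.e. the hyperplane with weights (0, 1) and bias -1 (chosen by hand).
weights, bias = (0.0, 1.0), -1.0
predicted = [side_of_hyperplane(feature_map(x), weights, bias) for x in inputs]
print(predicted)  # [1, -1, -1, 1]

# New points are rescaled with the training mean and standard deviation.
new_inputs = standardize([0.5, 4.0], mean, std)
new_predictions = [
    side_of_hyperplane(feature_map(x), weights, bias) for x in new_inputs
]
print(new_predictions)  # [-1, 1]
```

Kernel functions make this lifting implicit: the algorithm only needs similarity values between pairs of points, so the coordinates in the high-dimensional space never have to be computed explicitly.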


Analysis technique: Prediction tool based on machine learning from training data

Strengths:
* Handles unstructured data and nonlinear relationships in high dimensional spaces
* Does classification and regression
* Robust method based on sound mathematical principles
* Efficient for small datasets
* Overfitting can easily be avoided

Limitations:
* Black box, no easy interpretation of results, no probability estimates
* Sensitivity to noise and outliers
* Less efficient for large datasets
* Not reliable outside the range of trained situations
* Results influenced by the choice of the kernel transformation

Application example: Pattern recognition from images, e.g. interpretation of remote sensing images
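The influence of the kernel choice can be made concrete with a small sketch. It assumes the commonly used forms of the polynomial kernel, (x·y + coef0)^degree, and of the radial basis function kernel, exp(-gamma·|x - y|^2); the parameter names `degree`, `coef0` and `gamma` and the three sample points are illustrative assumptions, not taken from the article. The code computes the matrix of pairwise similarities that each kernel assigns to the same points.

```python
import math


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * |x - y|^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(points, kernel):
    """Matrix of kernel values for every pair of points."""
    return [[kernel(p, q) for q in points] for p in points]


# Three illustrative points in a 2-D input space.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]

kernels = {
    "polynomial (degree 2)": lambda p, q: polynomial_kernel(p, q, degree=2),
    "RBF (gamma 0.1)": lambda p, q: rbf_kernel(p, q, gamma=0.1),
    "RBF (gamma 10)": lambda p, q: rbf_kernel(p, q, gamma=10.0),
}

gram_matrices = {name: gram_matrix(points, k) for name, k in kernels.items()}

for name, matrix in gram_matrices.items():
    print(name)
    for row in matrix:
        print("  ", [round(value, 3) for value in row])
```

With a small `gamma` the radial basis function kernel rates all three points as fairly similar, while with a large `gamma` each point is similar only to itself; the polynomial kernel ranks the same pairs differently again. Since the hyperplane search operates on these similarity values, a different kernel or kernel parameter can lead to a different fitted model.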


For more detailed explanations see:

StatQuest: Support Vector Machines Part 1: Main Ideas by Josh Starmer
Wikipedia Support Vector Machine


Related articles

Data analysis techniques for the coastal zone


The main author of this article is Job Dronkers
Please note that others may also have edited the contents of this article.

Citation: Job Dronkers (2024): Support Vector Regression. Available from http://www.coastalwiki.org/wiki/Support_Vector_Regression [accessed on 27-04-2024]