sklearn.linear_model.Ridge

class sklearn.linear_model.Ridge(alpha=1.0, fit_intercept=True, normalize=False, copy_X=True, max_iter=None, tol=0.001, solver='auto', random_state=None)
Linear least squares with l2 regularization.
This model solves a regression problem where the loss function is the linear least squares function and regularization is given by the l2-norm. Also known as Ridge Regression or Tikhonov regularization. This estimator has built-in support for multivariate regression (i.e., when y is a 2d-array of shape [n_samples, n_targets]).
Read more in the User Guide.
Parameters: alpha : {float, array-like}, shape (n_targets)
Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. Alpha corresponds to C^-1 in other linear models such as LogisticRegression or LinearSVC. If an array is passed, penalties are assumed to be specific to the targets, so the array must contain one value per target.
copy_X : boolean, optional, default True
If True, X will be copied; else, it may be overwritten.
fit_intercept : boolean
Whether to calculate the intercept for this model. If set to false, no intercept will be used in calculations (e.g. data is expected to be already centered).
max_iter : int, optional
Maximum number of iterations for conjugate gradient solver. For ‘sparse_cg’ and ‘lsqr’ solvers, the default value is determined by scipy.sparse.linalg. For ‘sag’ solver, the default value is 1000.
normalize : boolean, optional, default False
If True, the regressors X will be normalized before regression. This parameter is ignored when fit_intercept is set to False. Note that when the regressors are normalized, the learnt hyperparameters are more robust and almost independent of the number of samples. The same property is not valid for standardized data. However, if you wish to standardize, please use preprocessing.StandardScaler before calling fit on an estimator with normalize=False.
solver : {‘auto’, ‘svd’, ‘cholesky’, ‘lsqr’, ‘sparse_cg’, ‘sag’}
Solver to use in the computational routines:
 ‘auto’ chooses the solver automatically based on the type of data.
 ‘svd’ uses a Singular Value Decomposition of X to compute the Ridge coefficients. More stable for singular matrices than ‘cholesky’.
 ‘cholesky’ uses the standard scipy.linalg.solve function to obtain a closed-form solution.
 ‘sparse_cg’ uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than ‘cholesky’ for large-scale data (possibility to set tol and max_iter).
 ‘lsqr’ uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest but may not be available in old scipy versions. It also uses an iterative procedure.
 ‘sag’ uses a Stochastic Average Gradient descent. It also uses an iterative procedure, and is often faster than other solvers when both n_samples and n_features are large. Note that the fast convergence of ‘sag’ is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.
The last four solvers support both dense and sparse data. However, only ‘sag’ supports sparse input when fit_intercept is True.
New in version 0.17: Stochastic Average Gradient descent solver.
tol : float
Precision of the solution.
random_state : int seed, RandomState instance, or None (default)
The seed of the pseudo random number generator to use when shuffling the data. Used only in ‘sag’ solver.
New in version 0.17: random_state to support Stochastic Average Gradient.
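To make the role of alpha and the closed-form solvers more concrete, the sketch below compares Ridge against the textbook ridge solution w = (X^T X + alpha * I)^-1 X^T y. It assumes no intercept (fit_intercept=False), dense data, and the standard objective ||y - Xw||^2 + alpha * ||w||^2; the variable names and the random data are illustrative only.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Small random regression problem (illustrative data only).
rng = np.random.RandomState(0)
X = rng.randn(20, 4)
y = rng.randn(20)
alpha = 1.0

# Textbook closed-form ridge solution without an intercept:
#   w = (X^T X + alpha * I)^-1 X^T y
gram = X.T @ X + alpha * np.eye(X.shape[1])
w_closed = np.linalg.solve(gram, X.T @ y)

# Same problem solved by the estimator with a direct (non-iterative) solver.
model = Ridge(alpha=alpha, fit_intercept=False, solver="cholesky")
model.fit(X, y)

# Largest absolute difference between the two coefficient vectors.
max_abs_diff = np.max(np.abs(model.coef_ - w_closed))
print(max_abs_diff)
```

The two coefficient vectors should agree up to floating-point error. Iterative solvers such as ‘sparse_cg’ or ‘sag’ only approximate this solution to within tol.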
Attributes: coef_ : array, shape (n_features,) or (n_targets, n_features)
Weight vector(s).
intercept_ : float or array, shape = (n_targets,)
Independent term in decision function. Set to 0.0 if fit_intercept = False.
n_iter_ : array or None, shape (n_targets,)
Actual number of iterations for each target. Available only for sag and lsqr solvers. Other solvers will return None.
New in version 0.17.
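As a rough illustration of the attribute shapes in the multi-target case, the sketch below fits two targets with a separate penalty for each. The data are random, and the two target columns are deliberately identical so that only the penalty differs between them; none of the variable names come from the library.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(30, 5)
y_single = rng.randn(30)

# Two identical target columns, so only the penalty differs between them.
Y = np.column_stack([y_single, y_single])

# One alpha per target: a weak penalty for target 0, a strong one for target 1.
model = Ridge(alpha=np.array([0.1, 100.0]), solver="cholesky")
model.fit(X, Y)

coef_shape = model.coef_.shape            # (n_targets, n_features)
intercept_shape = model.intercept_.shape  # (n_targets,)

# Euclidean norm of each target's weight vector.
norms = np.linalg.norm(model.coef_, axis=1)
print(coef_shape, intercept_shape, norms)
```

The more heavily penalized target should end up with the smaller weight norm, since stronger regularization shrinks the coefficients.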
Examples
>>> from sklearn.linear_model import Ridge
>>> import numpy as np
>>> n_samples, n_features = 10, 5
>>> np.random.seed(0)
>>> y = np.random.randn(n_samples)
>>> X = np.random.randn(n_samples, n_features)
>>> clf = Ridge(alpha=1.0)
>>> clf.fit(X, y)
Ridge(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=None, normalize=False, random_state=None, solver='auto', tol=0.001)
Methods
decision_function(*args, **kwargs)    DEPRECATED: will be removed in 0.19.
fit(X, y[, sample_weight])            Fit Ridge regression model
get_params([deep])                    Get parameters for this estimator.
predict(X)                            Predict using the linear model
score(X, y[, sample_weight])          Returns the coefficient of determination R^2 of the prediction.
set_params(**params)                  Set the parameters of this estimator.
__init__(alpha=1.0, fit_intercept=True, normalize=False, copy_X=True, max_iter=None, tol=0.001, solver='auto', random_state=None)

decision_function(*args, **kwargs)
DEPRECATED: will be removed in 0.19.
Decision function of the linear model.
Parameters: X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Samples.
Returns: C : array, shape = (n_samples,)
Returns predicted values.

fit(X, y, sample_weight=None)
Fit Ridge regression model
Parameters: X : {array-like, sparse matrix}, shape = [n_samples, n_features]
Training data
y : array-like, shape = [n_samples] or [n_samples, n_targets]
Target values
sample_weight : float or numpy array of shape [n_samples]
Individual weights for each sample
Returns: self : returns an instance of self.
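One way to read sample_weight is as a per-sample multiplier on the squared error. The sketch below checks that giving one sample a weight of 2 yields the same coefficients as listing that sample twice. It assumes fit_intercept=False and a direct solver to keep the comparison simple; the data and variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(15, 3)
y = rng.randn(15)

# Give the first sample a weight of 2 and every other sample a weight of 1.
weights = np.ones(15)
weights[0] = 2.0
weighted = Ridge(alpha=1.0, fit_intercept=False, solver="cholesky")
weighted.fit(X, y, sample_weight=weights)

# Fit again on data in which the first sample simply appears twice.
X_dup = np.vstack([X, X[:1]])
y_dup = np.append(y, y[0])
duplicated = Ridge(alpha=1.0, fit_intercept=False, solver="cholesky")
duplicated.fit(X_dup, y_dup)

# Both fits minimize the same weighted objective, so the weights should match.
same_fit = np.allclose(weighted.coef_, duplicated.coef_)
print(same_fit)
```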

get_params(deep=True)
Get parameters for this estimator.
Parameters: deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params : mapping of string to any
Parameter names mapped to their values.

predict(X)
Predict using the linear model
Parameters: X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Samples.
Returns: C : array, shape = (n_samples,)
Returns predicted values.
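For a fitted single-target model, the prediction is just the linear function defined by coef_ and intercept_. The short sketch below, on illustrative random data, reproduces predict by hand under that assumption.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(12, 3)
y = rng.randn(12)
model = Ridge(alpha=0.5).fit(X, y)

# New samples to predict on (illustrative data).
X_new = rng.randn(4, 3)

# Manual evaluation of the linear model: X w + b.
manual = X_new @ model.coef_ + model.intercept_
predicted = model.predict(X_new)
print(np.allclose(manual, predicted))
```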

score(X, y, sample_weight=None)
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
Parameters: X : array-like, shape = (n_samples, n_features)
Test samples.
y : array-like, shape = (n_samples) or (n_samples, n_outputs)
True values for X.
sample_weight : array-like, shape = [n_samples], optional
Sample weights.
Sample weights.
Returns: score : float
R^2 of self.predict(X) wrt. y.
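The R^2 definition can be checked by hand. The sketch below computes 1 - u/v, with u the sum of squared residuals and v the sum of squared deviations of y from its mean, and compares the result with score. The synthetic data, the train/test split, and the variable names are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic linear data with a little noise (illustrative only).
rng = np.random.RandomState(0)
X = rng.randn(40, 3)
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.randn(40)

# Simple split: first 30 samples for training, last 10 for testing.
X_train, X_test = X[:30], X[30:]
y_train, y_test = y[:30], y[30:]

model = Ridge(alpha=1.0).fit(X_train, y_train)
y_pred = model.predict(X_test)

u = ((y_test - y_pred) ** 2).sum()         # sum of squared residuals
v = ((y_test - y_test.mean()) ** 2).sum()  # sum of squares about the mean
r2_manual = 1 - u / v

r2_from_score = model.score(X_test, y_test)
print(r2_manual, r2_from_score)
```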

set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns: self :
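The <component>__<parameter> form is easiest to see on a small pipeline. The sketch below assumes a pipeline built with make_pipeline, which names each step after its lowercased class name, so the Ridge step is addressed as "ridge". The StandardScaler step is included only to make the object nested.

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Nested object: a scaler followed by a Ridge estimator.
pipe = make_pipeline(StandardScaler(), Ridge(alpha=1.0))

# Update the alpha of the "ridge" step through the pipeline.
pipe.set_params(ridge__alpha=10.0)
alpha_after = pipe.get_params()["ridge__alpha"]

# On a simple estimator, parameters are set by their plain names.
# set_params returns the estimator itself, so calls can be chained.
standalone = Ridge().set_params(alpha=0.1, solver="svd")
print(alpha_after, standalone.alpha, standalone.solver)
```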