Ridge Regression in Python

Ridge Regression in Python (Step-by-Step)

Ridge regression is a method we can use to fit a regression model when multicollinearity is present in the data.

In a nutshell, least squares regression tries to find coefficient estimates that minimize the residual sum of squares (RSS):

RSS = Σ(yᵢ – ŷᵢ)²

  • Σ: a Greek symbol that means sum
  • yᵢ: the actual response value for the i-th observation
  • ŷᵢ: the predicted response value based on the multiple linear regression model

Conversely, ridge regression seeks to minimize the following:

RSS + λΣβⱼ²

where j ranges from 1 to p predictor variables and λ ≥ 0.

This second term in the equation is known as the shrinkage penalty. In ridge regression, we select the value of λ that produces the lowest possible test MSE (mean squared error).
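As a minimal sketch (not from the original tutorial; the data and coefficient values below are made up purely for illustration), the ridge objective above can be computed directly with numpy:

import numpy as np

# made-up example data and coefficient values, purely for illustration
X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5]])
y = np.array([3.0, 2.5, 5.0])
beta = np.array([1.0, 0.5])
lam = 0.5

y_hat = X @ beta                                  # predicted responses ŷ
rss = np.sum((y - y_hat) ** 2)                    # residual sum of squares
ridge_objective = rss + lam * np.sum(beta ** 2)   # RSS + λ * Σβⱼ²
print(ridge_objective)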

This tutorial provides a step-by-step example of how to perform ridge regression in Python.

Step 1: Import Necessary Packages

First, we'll import the necessary packages to perform ridge regression in Python:

import pandas as pd
from numpy import arange
from sklearn.linear_model import Ridge
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import RepeatedKFold

Step 2: Load the Data

For this example, we'll use a dataset called mtcars, which contains information about 32 different cars. We'll use hp as the response variable and the following variables as predictors: mpg, wt, drat, and qsec.

The following code shows how to load and view this dataset:

#define URL where data is located
url = "https://raw.githubusercontent.com/Statology/Python-Guides/main/mtcars.csv"

#read in data
data_full = pd.read_csv(url)

#select subset of data
data = data_full[["mpg", "wt", "drat", "qsec", "hp"]]

#view first six rows of data
data[0:6]

    mpg     wt   drat   qsec   hp
0  21.0  2.620   3.90  16.46  110
1  21.0  2.875   3.90  17.02  110
2  22.8  2.320   3.85  18.61   93
3  21.4  3.215   3.08  19.44  110
4  18.7  3.440   3.15  17.02  175
5  18.1  3.460   2.76  20.22  105

Step 3: Fit the Ridge Regression Model

Next, we'll use the RidgeCV() function from sklearn to fit the ridge regression model, and we'll use the RepeatedKFold() function to perform k-fold cross-validation to find the optimal alpha value to use for the penalty term.

Note: The term "alpha" is used instead of "lambda" in Python.

For this example, we'll choose k = 10 folds and repeat the cross-validation process 3 times.

Also note that, by default, RidgeCV() only tests the alpha values 0.1, 1, and 10. However, we can define our own alpha range from 0 to 1 in increments of 0.01:

#define predictor and response variables
X = data[["mpg", "wt", "drat", "qsec"]]
y = data["hp"]

#define cross-validation method to evaluate model
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)

#define model
model = RidgeCV(alphas=arange(0, 1, 0.01), cv=cv, scoring='neg_mean_absolute_error')

#fit model
model.fit(X, y)

#display lambda that produced the lowest test error
print(model.alpha_)

0.99

The lambda value that minimizes the cross-validated test error turns out to be 0.99.
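If you also want to inspect the fitted model at this alpha (not shown in the original tutorial), the standard coef_ and intercept_ attributes are available after fitting:

#coefficients for mpg, wt, drat, qsec
print(model.coef_)

#model intercept
print(model.intercept_)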

Step 4: Use the Model to Make Predictions

Lastly, we can use the final ridge regression model to make predictions on new observations. For example, the following code shows how to define a new car with these attributes: mpg = 24, wt = 2.5, drat = 3.5, qsec = 18.5.

The following code shows how to use the fitted ridge regression model to predict the hp value of this new observation:

#define new observation
new = [24, 2.5, 3.5, 18.5]

#predict hp value using ridge regression model
model.predict([new])

array([104.16398018])

Based on the input values, the model predicts this car to have an hp value of 104.16398018.
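As a small extension (not in the original tutorial), several new cars can be scored at once by passing a DataFrame whose columns match the predictors used to fit the model; the values below are made up:

#hypothetical new cars (columns must match the predictors used to fit the model)
new_cars = pd.DataFrame(
    {"mpg": [24, 15], "wt": [2.5, 3.8], "drat": [3.5, 3.0], "qsec": [18.5, 17.0]}
)

#predict hp for each new car
print(model.predict(new_cars))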

You can find the complete Python code used in this example here.

Source

sklearn.linear_model.ridge_regression

sklearn.linear_model.ridge_regression(X, y, alpha, *, sample_weight=None, solver='auto', max_iter=None, tol=0.0001, verbose=0, positive=False, random_state=None, return_n_iter=False, return_intercept=False, check_input=True)

Solve the ridge equation by the method of normal equations.

Parameters:

X {array-like, sparse matrix} of shape (n_samples, n_features)

y ndarray of shape (n_samples,) or (n_samples, n_targets)

alpha float or array-like of shape (n_targets,)

Constant that multiplies the L2 term, controlling regularization strength. alpha must be a non-negative float i.e. in [0, inf) .

When alpha = 0 , the objective is equivalent to ordinary least squares, solved by the LinearRegression object. For numerical reasons, using alpha = 0 with the Ridge object is not advised. Instead, you should use the LinearRegression object.

If an array is passed, penalties are assumed to be specific to the targets. Hence they must correspond in number.

sample_weight float or array-like of shape (n_samples,), default=None

Individual weights for each sample. If given a float, every sample will have the same weight. If sample_weight is not None and solver=’auto’, the solver will be set to ‘cholesky’.

solver {'auto', 'svd', 'cholesky', 'lsqr', 'sparse_cg', 'sag', 'saga', 'lbfgs'}, default='auto'

Solver to use in the computational routines:

  • ‘auto’ chooses the solver automatically based on the type of data.
  • ‘svd’ uses a Singular Value Decomposition of X to compute the Ridge coefficients. It is the most stable solver, in particular more stable for singular matrices than ‘cholesky’ at the cost of being slower.
  • ‘cholesky’ uses the standard scipy.linalg.solve function to obtain a closed-form solution via a Cholesky decomposition of dot(X.T, X)
  • ‘sparse_cg’ uses the conjugate gradient solver as found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than ‘cholesky’ for large-scale data (possibility to set tol and max_iter ).
  • ‘lsqr’ uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest and uses an iterative procedure.
  • ‘sag’ uses a Stochastic Average Gradient descent, and ‘saga’ uses its improved, unbiased version named SAGA. Both methods also use an iterative procedure, and are often faster than other solvers when both n_samples and n_features are large. Note that ‘sag’ and ‘saga’ fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.
  • ‘lbfgs’ uses L-BFGS-B algorithm implemented in scipy.optimize.minimize . It can be used only when positive is True.

All solvers except ‘svd’ support both dense and sparse data. However, only ‘lsqr’, ‘sag’, ‘sparse_cg’, and ‘lbfgs’ support sparse input when fit_intercept is True.
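As an illustrative sketch (the synthetic data below is made up), features can be standardized in a pipeline before using the 'sag' or 'saga' solvers, per the scaling note above:

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(200, 5) * np.array([1, 10, 100, 1000, 10000])  # features on very different scales
y = X @ rng.randn(5) + rng.randn(200)

# StandardScaler puts all features on the same scale, which 'sag'/'saga' need for fast convergence
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0, solver='sag'))
model.fit(X, y)
print(model.named_steps["ridge"].coef_)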

New in version 0.17: Stochastic Average Gradient descent solver.

New in version 0.19: SAGA solver.

max_iter int, default=None

Maximum number of iterations for the conjugate gradient solver. For the 'sparse_cg' and 'lsqr' solvers, the default value is determined by scipy.sparse.linalg. For the 'sag' and 'saga' solvers, the default value is 1000. For the 'lbfgs' solver, the default value is 15000.

tol float, default=1e-4

Precision of the solution. Note that tol has no effect for solvers ‘svd’ and ‘cholesky’.

Changed in version 1.2: Default value changed from 1e-3 to 1e-4 for consistency with other linear models.

verbose int, default=0

Verbosity level. Setting verbose > 0 will display additional information depending on the solver used.

positive bool, default=False

When set to True , forces the coefficients to be positive. Only ‘lbfgs’ solver is supported in this case.

random_state int, RandomState instance, default=None

Used when solver == ‘sag’ or ‘saga’ to shuffle the data. See Glossary for details.

return_n_iter bool, default=False

If True, the method also returns n_iter , the actual number of iteration performed by the solver.

return_intercept bool, default=False

If True and if X is sparse, the method also returns the intercept, and the solver is automatically changed to 'sag'. This is only a temporary fix for fitting the intercept with sparse data. For dense data, use sklearn.linear_model._preprocess_data before your regression.

check_input bool, default=True

If False, the input arrays X and y will not be checked.

Returns:

coef ndarray of shape (n_features,) or (n_targets, n_features)

Weight vector(s).

n_iter int, optional

The actual number of iteration performed by the solver. Only returned if return_n_iter is True.

intercept float or ndarray of shape (n_targets,)

The intercept of the model. Only returned if return_intercept is True and if X is a scipy sparse array.

This function won’t compute the intercept.

Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. Alpha corresponds to 1 / (2C) in other linear models such as LogisticRegression or LinearSVC . If an array is passed, penalties are assumed to be specific to the targets. Hence they must correspond in number.
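A minimal usage sketch of this function (not part of the official documentation; the synthetic data is made up):

import numpy as np
from sklearn.linear_model import ridge_regression

rng = np.random.RandomState(0)
X = rng.randn(50, 3)
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.randn(50)

# Solve the ridge normal equations directly; only the coefficient vector is returned
# (no intercept is fitted, per the note above)
coef = ridge_regression(X, y, alpha=1.0)
print(coef)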

Source

Ridge Regression in Python


Hello, readers! Today we will take a detailed look at an important regression technique: Ridge Regression in Python.

Understanding Ridge Regression

We are all aware that Linear Regression estimates the best-fit line and predicts the value of the numeric target variable. That is, it models the relationship between the independent and dependent variables of the dataset.

It estimates the coefficients of the model using the chosen fitting technique.

The issue with Linear Regression is that the estimated coefficients can become very large, which in turn makes the model sensitive to its inputs. This makes the model unstable.

This is where Ridge Regression comes into the picture!

Ridge regression, also known as L2 regression, adds a penalty to the existing model. It adds a penalty to the loss function, which in turn pushes the model toward smaller coefficient values. That is, it shrinks the coefficients of the variables that do not contribute much to the model.

It penalizes the model based on the sum of squared errors (SSE). Although it penalizes the coefficients, it does not exclude variables from the model; instead it shrinks their coefficients toward (but not exactly to) zero.

In addition, a hyperparameter called lambda is included in the L2 penalty to control how heavily the penalty is weighted.

In a nutshell, ridge regression can be framed as follows:

Ridge = loss + (lambda * l2_penalty)
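To make the shrinkage effect concrete, here is a small illustrative sketch (not from the article; the data is synthetic) comparing ordinary least squares and ridge coefficients on nearly collinear features:

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.RandomState(0)
x1 = rng.randn(100)
x2 = x1 + 0.01 * rng.randn(100)            # x2 is almost identical to x1 (near collinearity)
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.randn(100)

print(LinearRegression().fit(X, y).coef_)  # large, offsetting coefficients
print(Ridge(alpha=1.0).fit(X, y).coef_)    # shrunken, more stable coefficients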

Let us now focus on the implementation!

Ridge Regression in Python – A Practical Approach

In this example, we will be working on the Bike Rental Count dataset. You can find the dataset here!

First, we load the dataset into the Python environment using the read_csv() function. Then we split the data using the train_test_split() function.

Next, we define the error metric for the model. Here we use MAPE (mean absolute percentage error) as the error metric.

Finally, we apply ridge regression to the data using the Ridge() function.

import os
import pandas

#Changing the current working directory
os.chdir("D:/Ediwsor_Project - Bike_Rental_Count")
BIKE = pandas.read_csv("day.csv")
bike = BIKE.copy()

categorical_col_updated = ['season','yr','mnth','weathersit','holiday']
bike = pandas.get_dummies(bike, columns = categorical_col_updated)

#Separating the dependent and independent data variables into two dataframes.
from sklearn.model_selection import train_test_split
X = bike.drop(['cnt'], axis=1)
Y = bike['cnt']
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=.20, random_state=0)

import numpy as np
def MAPE(Y_actual, Y_Predicted):
    mape = np.mean(np.abs((Y_actual - Y_Predicted)/Y_actual))*100
    return mape

from sklearn.linear_model import Ridge
ridge_model = Ridge(alpha=1.0)
ridge = ridge_model.fit(X_train, Y_train)
ridge_predict = ridge.predict(X_test)

Ridge_MAPE = MAPE(Y_test, ridge_predict)
print("MAPE value: ", Ridge_MAPE)
Accuracy = 100 - Ridge_MAPE
print('Accuracy of Ridge Regression: {:.2f}%'.format(Accuracy))

Using the Ridge (L2) penalty, we obtain an accuracy of about 83.38%:

MAPE value: 16.62171367018922
Accuracy of Ridge Regression: 83.38%
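As a possible extension (not in the original article), the alpha value for this dataset could also be tuned with cross-validation, reusing the training split and the MAPE function defined above:

from numpy import arange
from sklearn.linear_model import RidgeCV

#search a small grid of alpha values with 10-fold cross-validation
ridge_cv = RidgeCV(alphas=arange(0.1, 10, 0.1), cv=10)
ridge_cv.fit(X_train, Y_train)
print("Best alpha:", ridge_cv.alpha_)
print("Test MAPE:", MAPE(Y_test, ridge_cv.predict(X_test)))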

Conclusion

With this, we have come to the end of this topic. Feel free to comment below in case you have any questions.

For more such posts related to Python, stay tuned with us.

Source
