Genetic Algorithm in Machine Learning using Python
One of the more advanced algorithms in computer science is the genetic algorithm, inspired by the biological process of passing genes from one generation to the next. It is a heuristic, generally used for optimization, and has many applications, for example solving NP-hard problems, game theory, and code-breaking.
Another trending and impactful modern technology is machine learning, which finds patterns in large amounts of data for classification and regression.
But can we somehow involve genetic algorithms in machine learning? How would that affect the results? Let's find out.
Here are quick steps for how the genetic algorithm works:
- Initial Population – Initialize the population randomly based on the data.
- Fitness function – Find the fitness value of each chromosome (a chromosome is a set of parameters that defines a proposed solution to the problem the genetic algorithm is trying to solve).
- Selection – Select the fittest chromosomes as parents to pass their genes on to the next generation and create a new population.
- Crossover – Create new chromosomes by combining the parents and add them to the new population set.
- Mutation – Perform mutation, which alters one or more gene values in the chromosomes of the newly generated population. Mutation helps maintain diversity in the search. The resulting population is used in the next generation.
Repeat steps 2-5 for each generation.
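Before turning to the library-based example, here is a minimal, self-contained sketch of these five steps on the classic OneMax toy problem (maximize the number of 1-bits in a chromosome); all names and parameter values below are illustrative assumptions, not taken from any library:

import random

def fitness(chromosome):
    # Step 2: fitness of a chromosome; here we simply count 1-bits (OneMax)
    return sum(chromosome)

def select(population, k=3):
    # Step 3: tournament selection, keep the fittest of k random chromosomes
    return max(random.sample(population, k), key=fitness)

def crossover(parent_a, parent_b):
    # Step 4: single-point crossover combines two parents into one child
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rate=0.01):
    # Step 5: flip each gene with a small probability to keep the search diverse
    return [1 - gene if random.random() < rate else gene for gene in chromosome]

# Step 1: a random initial population of 50 chromosomes with 20 genes each
population = [[random.randint(0, 1) for _ in range(20)] for _ in range(50)]

for generation in range(40):  # repeat steps 2-5 for each generation
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(len(population))]

best = max(population, key=fitness)
print("best fitness:", fitness(best))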
Now, let’s get our hands on the code:
Initially, we will run the logistic regression algorithm on the breast cancer data.
Import libraries
We will import the Python libraries required for this algorithm.
import numpy as np
import pandas as pd
import random
import matplotlib.pyplot as plt
%matplotlib inline
Import some other libraries required to implement the machine learning algorithm.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
Data
Import the dataset from the Python library scikit-learn.
# import the breast cancer dataset
from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()
df = pd.DataFrame(cancer['data'], columns=cancer['feature_names'])
label = cancer["target"]
Split the dataset into training and test sets.
# splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(df, label, test_size=0.30, random_state=101)
Train a model using the logistic regression technique:
# training a logistic regression model
logmodel = LogisticRegression()
logmodel.fit(X_train, y_train)
predictions = logmodel.predict(X_test)
print("Accuracy =", accuracy_score(y_test, predictions))
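To involve the genetic algorithm, the trained model can be wrapped in genetic feature selection using the GeneticSelectionCV class documented in the API reference below; the parameter values in this sketch are illustrative assumptions, not prescribed settings:

from genetic_selection import GeneticSelectionCV

# wrap logistic regression in a genetic feature selector (illustrative settings)
selector = GeneticSelectionCV(LogisticRegression(), cv=5, n_population=50, n_generations=20)
selector = selector.fit(X_train, y_train)

# evaluate on the test set using only the selected features
predictions = selector.predict(X_test)
print("Accuracy score after genetic algorithm is", accuracy_score(y_test, predictions))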
API Reference
class genetic_selection.GeneticSelectionCV(estimator, cv=None, scoring=None, fit_params=None, max_features=None, verbose=0, n_jobs=1, n_population=300, crossover_proba=0.5, mutation_proba=0.2, n_generations=40, crossover_independent_proba=0.1, mutation_independent_proba=0.05, tournament_size=3, n_gen_no_change=None, caching=False)
Feature selection with genetic algorithm.
- estimator (object) – A supervised learning estimator with a fit method.
- cv (int, cross-validation generator or an iterable, optional) – Determines the cross-validation splitting strategy. Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.
For integer/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used.
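For instance, passing a cross-validation generator object is equivalent to the integer form for a classifier; a brief illustrative sketch (the choice of estimator here is an assumption):

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from genetic_selection import GeneticSelectionCV

# for a classifier with binary/multiclass y, cv=5 resolves to the same strategy
selector = GeneticSelectionCV(LogisticRegression(), cv=StratifiedKFold(n_splits=5))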
Attributes:
- n_features_ (int) – The number of selected features with cross-validation.
- support_ (array of shape [n_features]) – The mask of selected features.
- generation_scores_ (array of shape [n_generations]) – The maximum cross-validation score for each generation.
- estimator_ (object) – The external estimator fit on the reduced dataset.
An example showing genetic feature selection.
>>> import numpy as np
>>> from sklearn import datasets, linear_model
>>> from genetic_selection import GeneticSelectionCV
>>> iris = datasets.load_iris()
>>> E = np.random.uniform(0, 0.1, size=(len(iris.data), 20))
>>> X = np.hstack((iris.data, E))
>>> y = iris.target
>>> estimator = linear_model.LogisticRegression(solver="liblinear", multi_class="ovr")
>>> selector = GeneticSelectionCV(estimator, cv=5)
>>> selector = selector.fit(X, y)
>>> selector.support_
array([ True,  True,  True,  True, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False], dtype=bool)
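Continuing the example above, the fitted attributes documented earlier can be inspected; a short illustrative sketch:

# inspect the fitted attributes, continuing from the example above
print(selector.n_features_)          # number of selected features
print(selector.generation_scores_)   # best cross-validation score per generation
print(selector.estimator_)           # the estimator refit on the selected features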
fit(X, y, groups=None)
Fit the GeneticSelectionCV model and the underlying estimator on the selected features.
- X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – The training input samples.
- y (array-like, shape = [n_samples]) – The target values.
- groups (array-like, shape = [n_samples], optional) – Group labels for the samples used while splitting the dataset into train/test set. Only used in conjunction with a “Group” cv instance (e.g., GroupKFold).
predict(X)
Reduce X to the selected features and then predict using the underlying estimator.
- X (array of shape [n_samples, n_features]) – The input samples.
Returns: y – The predicted target values.
score(X, y)
Reduce X to the selected features and return the score of the underlying estimator.
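As a brief usage sketch, continuing the iris example above, the two methods compose with the fitted selector (illustrative, not part of the reference):

# predict keeps only the features flagged True in selector.support_
y_pred = selector.predict(X)

# score applies the same feature reduction, then delegates to the estimator's score
print(selector.score(X, y))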