Fuzzy k means python


The repository includes a modular implementation of Fuzzy K-Means based on numpy, with an sklearn-like interface.

ammarSherif/Fuzzy-K-Means

The algorithm iteratively computes two values until convergence:

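  • $c_j$: the centroid of the $j$th cluster
  • $w_{ij}$: the degree to which a data point $x_i$ belongs to the cluster whose centroid is $c_j$;
    note $w_{ij} \in [0, 1]$ and $\sum_{j=1}^{n} w_{ij} = 1$

Given a fuzzification index, $m$, and the number of clusters, $n$, we compute the above values as below:

$$w_{ij} = \frac{1}{\sum_{k=1}^{n} \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}$$

The cluster centroid is a weighted mean of all the data points, where each point's weight is its membership in that cluster raised to the power $m$:

$$c_j = \frac{\sum_{i} w_{ij}^{m} \, x_i}{\sum_{i} w_{ij}^{m}}$$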

Therefore, we keep iterating on computing these two values until convergence.
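The two alternating updates can be sketched in plain numpy. This is a minimal illustration and not the repository's code: the function names, the random initialisation, the clipping of zero distances and the stopping rule (largest membership change below `eps`) are all assumptions made for the example.

```python
import numpy as np


def update_memberships(X, centers, m):
    """Membership of each point in each cluster; every row sums to 1."""
    # Pairwise Euclidean distances, shape (n_points, n_clusters).
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    # Assumption: clip tiny distances so a point lying on a centroid
    # does not cause a division by zero.
    dist = np.fmax(dist, 1e-12)
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def update_centroids(X, w, m):
    """Centroids as means of the points weighted by w ** m."""
    weights = w ** m                      # shape (n_points, n_clusters)
    return (weights.T @ X) / weights.sum(axis=0)[:, None]


def fuzzy_k_means(X, n_clusters, m=2.0, eps=1e-3, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Start from random memberships, normalised so each row sums to 1.
    w = rng.random((X.shape[0], n_clusters))
    w /= w.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        centers = update_centroids(X, w, m)
        w_new = update_memberships(X, centers, m)
        # Assumed stopping rule: largest membership change below eps.
        converged = np.abs(w_new - w).max() < eps
        w = w_new
        if converged:
            break
    return centers, w


# Two tight, well-separated pairs of points.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers, w = fuzzy_k_means(X, n_clusters=2)
print(w.argmax(axis=1))   # hard labels derived from the memberships
```

Dividing the inverse-distance powers by their row sum is algebraically the same as the membership formula, and it avoids computing every distance ratio explicitly.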

Our module has an interface similar to that of the standard KMeans provided by sklearn. The initializer accepts the parameters of KMeans plus:

  • m : the fuzziness index according to the above equations
  • eps : the threshold used to recognize convergence.
    The lower the value, the more accurate the results. Its default value is 0.001
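To get a feel for the fuzziness index, the sketch below evaluates the standard fuzzy membership formula for a single hypothetical point at distances 1 and 2 from two centroids. The helper `memberships` is written for this illustration only and is not part of the module.

```python
import numpy as np


def memberships(distances, m):
    """Fuzzy memberships of one point, given its distances to each centroid."""
    inv = np.asarray(distances, dtype=float) ** (-2.0 / (m - 1.0))
    return inv / inv.sum()


# Hypothetical point: distance 1 to the first centroid, 2 to the second.
distances = [1.0, 2.0]
for m in (1.1, 2.0, 5.0):
    print(m, memberships(distances, m))
```

With m = 2 the memberships are 0.8 and 0.2. As m approaches 1 the nearer centroid takes almost all of the membership (close to hard k-means), and as m grows the memberships drift toward an even split.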

Given that, the below code demonstrates how to use the module:

# ==============================================================================
# We assume that X holds the data samples that we will cluster
# ------------------------------------------------------------------------------
# We initialize the fuzziness index, m, with 2
# As well, we would like to have 3 clusters
# ==============================================================================
fkm = FuzzyKMeans(m=2, n_clusters=3)

# ==============================================================================
# Fit the model to the training data
# ==============================================================================
fkm = fkm.fit(X)

# ==============================================================================
# Get the fitting results
# - cluster_centers_: the centroids of the clusters
# - labels_: the data point labels, where each point belongs to the cluster
#            having the highest membership value
# - fmm_: the fuzzy membership value of each data point to each cluster, w
# ==============================================================================
fitted_centroids = fkm.cluster_centers_
X_labels = fkm.labels_
fmm = fkm.fmm_

# ==============================================================================
# You can as well predict, i.e. get the labels of other data, and get the
# membership values
# ==============================================================================
new_labels = fkm.predict(new_X)
new_fmm = fkm.compute_membership(new_X)

Fuzzy KMeans vs Scikit Learn KMeans

Algorithm Comparison

Please feel free to check out this notebook, which compares KMeans with our fuzzy implementation of it. Note that we vary the opacity to indicate how much a data point belongs to a cluster. Below are brief results at various values of m.
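One way to encode membership as opacity is to build an RGBA colour per point, taking the hue from the winning cluster and the alpha channel from its membership value. The sketch below is an assumption about how such a plot could be prepared, not the notebook's actual code; `membership_colors` and the sample values are hypothetical, and `fmm` stands in for a membership matrix such as `fmm_`.

```python
import numpy as np


def membership_colors(fmm, base_colors):
    """RGBA colour per point: hue of the winning cluster, alpha = its membership."""
    fmm = np.asarray(fmm, dtype=float)                   # (n_points, n_clusters)
    base_colors = np.asarray(base_colors, dtype=float)   # (n_clusters, 3), RGB in [0, 1]
    labels = fmm.argmax(axis=1)
    alpha = fmm.max(axis=1)
    return np.column_stack([base_colors[labels], alpha])


# Hypothetical memberships of three points in two clusters.
fmm = np.array([[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]])
base_colors = [(0.0, 0.4, 1.0), (1.0, 0.5, 0.0)]
rgba = membership_colors(fmm, base_colors)
print(rgba)
```

The resulting array can be passed as the `c` argument of matplotlib's `scatter`, so points with ambiguous membership appear fainter.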


mblondel / kmeans.py


# Copyright Mathieu Blondel December 2011
# License: BSD 3 clause

import numpy as np
import pylab as pl

from sklearn.base import BaseEstimator
from sklearn.utils import check_random_state
from sklearn.cluster import MiniBatchKMeans
from sklearn.cluster import KMeans as KMeansGood
from sklearn.metrics.pairwise import euclidean_distances, manhattan_distances
# make_blobs is imported from sklearn.datasets; the samples_generator
# module used by the original has been removed from scikit-learn
from sklearn.datasets import make_blobs

##############################################################################
# Generate sample data
np.random.seed(0)

batch_size = 45
centers = [[1, 1], [-1, -1], [1, -1]]
n_clusters = len(centers)
X, labels_true = make_blobs(n_samples=1200, centers=centers, cluster_std=0.3)


class KMeans(BaseEstimator):

    def __init__(self, k, max_iter=100, random_state=0, tol=1e-4):
        self.k = k
        self.max_iter = max_iter
        self.random_state = random_state
        self.tol = tol

    def _e_step(self, X):
        # Assign every sample to its nearest centroid
        self.labels_ = euclidean_distances(X, self.cluster_centers_,
                                           squared=True).argmin(axis=1)

    def _average(self, X):
        return X.mean(axis=0)

    def _m_step(self, X):
        X_center = None
        for center_id in range(self.k):
            center_mask = self.labels_ == center_id
            if not np.any(center_mask):
                # The centroid of empty clusters is set to the center of
                # everything
                if X_center is None:
                    X_center = self._average(X)
                self.cluster_centers_[center_id] = X_center
            else:
                self.cluster_centers_[center_id] = \
                    self._average(X[center_mask])

    def fit(self, X, y=None):
        n_samples = X.shape[0]
        vdata = np.mean(np.var(X, 0))

        random_state = check_random_state(self.random_state)
        # Pick k distinct samples as the initial centroids
        self.labels_ = random_state.permutation(n_samples)[:self.k]
        self.cluster_centers_ = X[self.labels_]

        for i in range(self.max_iter):  # `xrange` in the original Python 2 code
            centers_old = self.cluster_centers_.copy()

            self._e_step(X)
            self._m_step(X)

            # Stop once the centroids move less than tol, scaled by the data variance
            if np.sum((centers_old - self.cluster_centers_) ** 2) < self.tol * vdata:
                break

        return self


class KMedians(KMeans):

    def _e_step(self, X):
        # Assign every sample to its nearest centroid in L1 distance
        self.labels_ = manhattan_distances(X, self.cluster_centers_).argmin(axis=1)

    def _average(self, X):
        return np.median(X, axis=0)


class FuzzyKMeans(KMeans):

    def __init__(self, k, m=2, max_iter=100, random_state=0, tol=1e-4):
        """
        m > 1: fuzzy-ness parameter
        The closer m is to 1, the closer to hard kmeans.
        The bigger m, the fuzzier (converge to the global cluster).
        """
        self.k = k
        assert m > 1
        self.m = m
        self.max_iter = max_iter
        self.random_state = random_state
        self.tol = tol

    def _e_step(self, X):
        # Memberships proportional to (1 / squared distance) ** (1 / (m - 1))
        D = 1.0 / euclidean_distances(X, self.cluster_centers_, squared=True)
        D **= 1.0 / (self.m - 1)
        D /= np.sum(D, axis=1)[:, np.newaxis]
        # shape: n_samples x k
        self.fuzzy_labels_ = D
        self.labels_ = self.fuzzy_labels_.argmax(axis=1)

    def _m_step(self, X):
        weights = self.fuzzy_labels_ ** self.m
        # shape: n_clusters x n_features
        self.cluster_centers_ = np.dot(X.T, weights).T
        self.cluster_centers_ /= weights.sum(axis=0)[:, np.newaxis]

    def fit(self, X, y=None):
        n_samples, n_features = X.shape
        vdata = np.mean(np.var(X, 0))

        random_state = check_random_state(self.random_state)
        # Start from random memberships, normalised so each row sums to 1
        self.fuzzy_labels_ = random_state.rand(n_samples, self.k)
        self.fuzzy_labels_ /= self.fuzzy_labels_.sum(axis=1)[:, np.newaxis]
        self._m_step(X)

        for i in range(self.max_iter):  # `xrange` in the original Python 2 code
            centers_old = self.cluster_centers_.copy()

            self._e_step(X)
            self._m_step(X)

            if np.sum((centers_old - self.cluster_centers_) ** 2) < self.tol * vdata:
                break

        return self


kmeans = KMeans(k=3)
kmeans.fit(X)

kmedians = KMedians(k=3)
kmedians.fit(X)

fuzzy_kmeans = FuzzyKMeans(k=3, m=2)
fuzzy_kmeans.fit(X)

# One subplot per model: samples coloured by hard label, centroids as circles
fig = pl.figure()
colors = ['#4EACC5', '#FF9C34', '#4E9A06']

objects = (kmeans, kmedians, fuzzy_kmeans)

for i, obj in enumerate(objects):
    ax = fig.add_subplot(1, len(objects), i + 1)
    for k, col in zip(range(obj.k), colors):
        my_members = obj.labels_ == k
        cluster_center = obj.cluster_centers_[k]
        ax.plot(X[my_members, 0], X[my_members, 1], 'w',
                markerfacecolor=col, marker='.')
        ax.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
                markeredgecolor='k', markersize=6)
    ax.set_title(obj.__class__.__name__)

pl.show()
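All three `fit` loops in the listing stop when the squared movement of the centroids falls below `tol` times the mean per-feature variance of the data. The sketch below isolates that test in a standalone helper (the name `has_converged` is made up for this illustration) and shows that scaling by the variance makes the decision independent of the units of `X`.

```python
import numpy as np


def has_converged(centers_old, centers_new, X, tol=1e-4):
    """Stopping test from the listing: centroid shift relative to data variance."""
    vdata = np.mean(np.var(X, axis=0))
    return np.sum((centers_old - centers_new) ** 2) < tol * vdata


X = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 0.0]])
old = np.array([[1.0, 1.0]])
small_step = np.array([[1.001, 1.0]])
large_step = np.array([[1.1, 1.0]])

print(has_converged(old, small_step, X))    # True: shift is tiny relative to the variance
print(has_converged(old, large_step, X))    # False: centroids are still moving

# Rescaling data and centroids by the same factor leaves the decision unchanged.
scale = 1000.0
print(has_converged(scale * old, scale * small_step, scale * X))   # True
```

Without the variance factor, a fixed absolute tolerance would be too strict for data measured in large units and too loose for data measured in small ones.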


Fuzzy K-Means

The fuzzy k-means module has 3 separate models that can be imported as:

import sklearn_extensions as ske

mdl = ske.fuzzy_kmeans.FuzzyKMeans()
mdl.fit_predict(X, y)

mdl = ske.fuzzy_kmeans.KMeans()
mdl.fit_predict(X, y)

mdl = ske.fuzzy_kmeans.KMedians()
mdl.fit_predict(X, y)

Examples

import numpy as np
from sklearn_extensions.fuzzy_kmeans import KMedians, FuzzyKMeans, KMeans
# make_blobs is imported from sklearn.datasets; the samples_generator
# module used by the original has been removed from scikit-learn
from sklearn.datasets import make_blobs

np.random.seed(0)

batch_size = 45
centers = [[1, 1], [-1, -1], [1, -1]]
n_clusters = len(centers)
X, labels_true = make_blobs(n_samples=1200, centers=centers, cluster_std=0.3)

kmeans = KMeans(k=3)
kmeans.fit(X)

kmedians = KMedians(k=3)
kmedians.fit(X)

fuzzy_kmeans = FuzzyKMeans(k=3, m=2)
fuzzy_kmeans.fit(X)

print('KMEANS')
print(kmeans.cluster_centers_)

print('KMEDIANS')
print(kmedians.cluster_centers_)

print('FUZZY_KMEANS')
print(fuzzy_kmeans.cluster_centers_)

KMEANS
[[ 0.74279904  0.94377717]
 [ 1.22177014  1.00196511]
 [-0.00873034 -0.99593489]]
KMEDIANS
[[ 0.99538235 -1.01070379]
 [ 0.96275935  0.98959938]
 [-0.97974863 -0.99788949]]
FUZZY_KMEANS
[[ 0.98642164 -1.0000844 ]
 [ 0.97111065  0.99339691]
 [-0.98862482 -0.99082696]]
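Cluster indices come out in arbitrary order, so the printed centres are easiest to judge after matching each one to its nearest generating centre. The sketch below does this for the FUZZY_KMEANS centres, with the numbers copied from the printed output; the matching approach itself is an addition for illustration and is not part of the module.

```python
import numpy as np

# Centres used to generate the blobs in the example.
true_centers = np.array([[1, 1], [-1, -1], [1, -1]], dtype=float)

# FUZZY_KMEANS centres as printed in the example output.
fuzzy_centers = np.array([
    [0.98642164, -1.0000844],
    [0.97111065, 0.99339691],
    [-0.98862482, -0.99082696],
])

# Distance from every fitted centre to every true centre, shape (3, 3).
dist = np.linalg.norm(fuzzy_centers[:, None, :] - true_centers[None, :, :], axis=2)
nearest = dist.argmin(axis=1)   # index of the closest true centre per fitted centre
errors = dist.min(axis=1)       # distance to that closest true centre

print(nearest)        # [2 0 1]
print(errors.max())
```

Each fuzzy centre maps to a different true centre and lies within about 0.03 of it. Running the same check on the printed KMEANS centres maps two of them to the same true centre, which suggests that particular run settled on a poor solution.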

Third Party Docs

The original unmodified version of this module’s code can be found here: Fuzzy K-Means

© Copyright 2015, Will McGinnis.
