Machine Learning A-Z: Hands-On Python and Java

K-Means Clustering Concept

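K-Means is an unsupervised learning algorithm for clustering. It partitions the data into K non-overlapping clusters, where K is chosen in advance, so that points with similar feature values end up in the same cluster. Starting from K initial centroids, the algorithm alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. Each iteration reduces the within-cluster variance, that is, the sum of squared distances from the points to their cluster centroid.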
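
To make the idea of minimizing within-cluster variance concrete, here is a minimal pure-Python sketch of the classic assign-then-update loop (often called Lloyd's algorithm). It is an illustration only, not how Scikit-learn or Weka implement K-Means internally. The helper names `squared_distance`, `assign`, and `k_means`, the random initialization, and the toy points are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def k_means(points, k, max_iterations=100, seed=42):
    """Illustrative K-Means: random initial centroids, then assign/update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as starting centroids
    for _ in range(max_iterations):
        labels = assign(points, centroids)  # assignment step
        new_centroids = []
        for c in range(k):  # update step: move each centroid to the mean of its members
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        if new_centroids == centroids:  # nothing moved, so the loop has converged
            break
        centroids = new_centroids
    return assign(points, centroids), centroids


# Hypothetical toy data: two well-separated groups of three points each
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0),
          (100.0, 100.0), (101.0, 101.0), (102.0, 102.0)]
labels, centroids = k_means(points, k=2)
print(sorted(centroids))  # [(1.0, 1.0), (101.0, 101.0)]
```

Which group receives label 0 and which receives label 1 depends on the initial centroids, so only the grouping itself is meaningful, not the label numbers.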

Implementation in Python (Scikit-learn)

Step 1: Import Libraries

from sklearn.cluster import KMeans

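Step 2: Prepare Data and Create the Model

# X is the feature matrix of shape (n_samples, n_features); it must be loaded or built before fitting
kmeans = KMeans(n_clusters=3, random_state=42)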
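
The lesson does not show where the feature matrix `X` comes from. As a rough sketch, it could be built as a NumPy array with one row per sample and one column per feature. Because K-Means relies on distances, features on very different scales are often standardized first. That step is a common practice assumed here, not something the lesson prescribes, and the numbers below are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 6 samples, 2 features on very different scales
X_raw = np.array([
    [1.0, 200.0],
    [1.5, 180.0],
    [1.2, 220.0],
    [8.0, 900.0],
    [8.5, 950.0],
    [7.8, 880.0],
])

# Standardize each column to zero mean and unit variance so that
# no single feature dominates the distance computation
X = StandardScaler().fit_transform(X_raw)

print(X.shape)  # (6, 2)
```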

Step 3: Fit the Model

kmeans.fit(X)

Step 4: Predict Clusters

clusters = kmeans.predict(X)
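
Put together, the Python steps might look like the following sketch. The data points are hypothetical, and `n_init=10` is set explicitly as an assumption so that the result does not depend on the default of the installed Scikit-learn version. The example also assigns two previously unseen points to the learned clusters and reads back the fitted centroids and the final sum of squared distances.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: three well-separated groups of three points each
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.9, 8.1],
    [1.0, 8.0], [1.1, 8.2], [0.8, 7.9],
])

# Create and fit the model (n_init=10 reruns the algorithm from 10 starts
# and keeps the run with the lowest within-cluster sum of squares)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(X)

# Cluster index for each training sample
clusters = kmeans.predict(X)

# Cluster index for new, unseen samples (nearest learned centroid)
new_points = np.array([[1.0, 0.9], [8.1, 8.0]])
new_clusters = kmeans.predict(new_points)

print(kmeans.cluster_centers_)  # one row per centroid
print(kmeans.inertia_)          # within-cluster sum of squared distances
```

The label numbers themselves are arbitrary. What matters is which samples share a label.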

Implementation in Java (Weka)

Step 1: Load the Dataset

// Uses weka.core.Instances, java.io.BufferedReader and java.io.FileReader
Instances data = new Instances(new BufferedReader(new FileReader("data.arff")));

Step 2: Set Up K-Means

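SimpleKMeans kMeans = new SimpleKMeans();
kMeans.setNumClusters(3);
// Needed so that getAssignments() can be called after the clusterer is built
kMeans.setPreserveInstancesOrder(true);
kMeans.buildClusterer(data);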

Step 3: Assign Instances to Clusters

// assignments[i] is the cluster index of instance i; requires setPreserveInstancesOrder(true) before buildClusterer
int[] assignments = kMeans.getAssignments();