K-Means Clustering Concept
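K-Means is an unsupervised learning algorithm used for clustering. It partitions data into K distinct clusters based on feature similarity. The algorithm alternates between two steps: it assigns each data point to its nearest cluster centroid, then recomputes each centroid as the mean of the points assigned to it. Each round reduces the within-cluster variance, measured as the sum of squared distances from points to their centroid.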
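To make the idea concrete, here is a minimal pure-Python sketch of the assign-and-update loop. It is an illustration only, not how Scikit-learn or Weka implement K-Means; the function names (`squared_distance`, `mean_point`, `k_means`), the random initialization, and the tiny sample data are all assumptions made for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, iterations=100, seed=0):
    rng = random.Random(seed)
    # Initialization (an assumption): pick k data points at random as centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean_point(members) if members else centroids[j])
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical sample data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
print(labels)
print(centroids)
```

On data this well separated, the first three points should share one label and the last three the other. Which group receives label 0 depends on the random initialization.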
Implementation in Python (Scikit-learn)
Step 1: Import Libraries
    from sklearn.cluster import KMeans
Step 2: Configure the Model
    kmeans = KMeans(n_clusters=3, random_state=42)
Step 3: Fit the Model
    # X is the feature matrix: array-like of shape (n_samples, n_features)
    kmeans.fit(X)
Step 4: Predict Clusters
    clusters = kmeans.predict(X)
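The four steps above leave `X` undefined, so here is one way they might fit together end to end. This is a sketch under stated assumptions: the data is synthetic, generated with `make_blobs` purely for illustration, and `n_init=10` is set explicitly because the default for that parameter has changed between Scikit-learn versions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data (an assumption for illustration): 300 points around 3 centers.
X, _ = make_blobs(n_samples=300, centers=3, n_features=2, random_state=42)

# Configure and fit the model, as in Steps 2 and 3.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(X)

# Step 4: cluster index for each row of X.
clusters = kmeans.predict(X)

# Fitted attributes worth inspecting after training.
centers = kmeans.cluster_centers_  # one row per cluster centroid
inertia = kmeans.inertia_          # within-cluster sum of squared distances

print(clusters[:10])
print(centers)
print(inertia)
```

The `inertia_` value is the quantity K-Means tries to minimize, so it can be used to compare runs with different settings on the same data.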
Implementation in Java (Weka)
Step 1: Load the Dataset
    // Needs: weka.core.Instances, java.io.BufferedReader, java.io.FileReader
    Instances data = new Instances(new BufferedReader(new FileReader("data.arff")));
Step 2: Set Up K-Means
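    SimpleKMeans kMeans = new SimpleKMeans();
    kMeans.setNumClusters(3);
    // Must be enabled before building, or getAssignments() throws an exception
    kMeans.setPreserveInstancesOrder(true);
    kMeans.buildClusterer(data);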
Step 3: Assign Instances to Clusters
    // assignments[i] is the cluster index of the i-th instance in data
    int[] assignments = kMeans.getAssignments();