Overlapping of Computer Vision and Machine Learning Algorithm (Mean Shift and K-Means)

Johnny Chen
3 min readJun 4, 2021


Mean Shift in Computer Vision and K-Means in Machine Learning

In my 14 days quarantine period travelling back from New York to Taiwan, the spare time I spent is to do a project on computer vision. In the search of object tracking, I found out that Mean Shift clustering algorithm is quite similar to K-Means clustering taught in machine learning.

Before jumping into the computer vision world, let us review the content in K-Means clustering. So what is K-Means clustering?

K-Means clustering is an unsupervised learning in classification among two other popular clustering algorithms used: Hierarchical Cluster Analysis and Gaussian Mixture Models. The definition of clustering can be loosely define as clusters have objects that are “similar in some way”. Let us stop for a while, how do we calculate the distance of different data points? Euclidean distance! And that is the math used behind Hierarchical Cluster Algorithm.

Interactive website example: Visualizing K-Means Clustering (stanford.edu)

For K-Means clustering, the first step is to pick a k (kernel) input parameter, which means we want to identify k clusters. Next, we will randomly select k data point as initial points. And then we measured the distance between the 1st point and the k initial clusters and assign the 1st point to the nearest cluster. After all the points are colored in cluster, we calculate the mean of each clusters. As for how to determine the number “k”? we can use “elbow plot” to determine the number of cluster (k) has huge reduction and after that number k, the variation don’t go down as quickly [picture 2].

[Picture 2] Elbow plot

About Mean Shift algorithm, it also uses the information extraction technique from data point with mean vector operation and that is what’s similar to K-Means clustering. In Mean Shift every data is the cluster center and it works by updating the candidates for centroids to be the mean of the points. And based on kernel, the point would move to the highest density place. By doing so to cluster or track.

Let’s break down the steps.

1st: The direction to the closest cluster centroid is determined by where most of the points nearby are at.

2nd: So each iteration each blue point will move closer to where the most points are at, which is or will lead to the cluster center.

3rd: The red and blue data points overlap completely in the first iteration before the mean shift algorithm starts.

At the end of iteration 1, all the blue points move towards the clusters. Here is appears there will be either 3 or 4 clusters

To sum up, the complexity for mean shift is O(kN²) and for K-Means it is only O(kN), k here means the number of iteration steps and N is the number of data points. Since mean shift deals with the Euclidean distance it got larger complexity and makes it time-consuming if the dataset is huge. As for limitations, some outlier might not be able to correctly identify as noise, whereas K-Means often requires that the number of clusters to be pre-determined.

— Reference —

  1. Visualizing K-Means Clustering. https://stanford.edu/class/engr108/visualizations/kmeans/kmeans.html
  2. StatQuest: K-means clustering. https://www.youtube.com/watch?v=4b5d3muPQmA&t=376s
  3. On Mean Shift and K-Means Clustering. http://jamesxli.blogspot.com/2012/03/on-mean-shift-and-k-means-clustering.html#:~:text=Mean%20shift%20and%20K%2DMeans%20algorithm%20are%20two%20similar%20clustering,(e.g.%20for%20image%20segmentation.)
  4. Python for Computer Vision with OpenCV and Deep Learning. https://www.udemy.com/course/python-for-computer-vision-with-opencv-and-deep-learning/