Pjore
2 min readAug 9, 2021

--

K-Means Clustering Use Case

Hello Guys ๐Ÿ™‹โ€โ™€๏ธ In this article, we are going to look at a case study of the K-Means clustering algorithm in the security domain. but before that let's look at some concepts related to k means clustering

K -Means clustering is one of the algorithm used in clustering problems. clustering comes under the category of unsupervised learning

What is Unsupervised Learning??

In this type of Machine Learning, we only provide Input to the model and it gives us output by finding some pattern in given data. There is no supervisor or teacher for the unsupervised machine learning algorithm. There are mainly two categories of unsupervised machine learning:

Clustering and Association. In clustering, we classify the objects by creating clusters according to their similarities. Consider an example: There are different animals like Cat, Dog, and Cow. We will create a separate cluster for each type of animal and when new data comes (containing a particular animal) by finding a pattern, the algorithm assigns a particular cluster to that data.

What is Clustering ??

Clustering is the process of grouping the same objects. In clustering model tries to find some kind of pattern and accordingly creates a cluster of the same objects. Consider the following example:

Consider we need to classify or categorize Cat, Dog, Spider, and Sparrow. We will give data to the model which contains images of Cat, Dog, Spider, And Sparrow. In this technique, the model will try to find patterns in the data, and accordingly, it will create a cluster of the same categories. after training of the model if we provide an image of Dog then the model will put this object accurately in the class of Dogs.

K-Means Clustering

K-Means Clustering is one kind of algorithm used for the purpose of classifying the various objects in the form of clusters. It uses the concept of Euclidean distances.

Use Case:

One of the use case of the K-Means Clustering algorithm is system log clustering.Log files contain information about almost all events that take place in a system. deployed logging infrastructure automatically collects, aggregates and stores the logs that are continuously produced by most components and devices. A major issue with forensic log analysis is that problems are only detected in hindsight. It is a time- and resource-consuming task that requires domain knowledge about the system at hand. For these reasons, modern approaches in cybersecurity shift from a purely forensic to a proactive analysis. So in this case approach like K-Means Clustering helps to solve the issue.

Thanks For Reading๐Ÿ˜Š๐Ÿ˜Š๐Ÿ˜Š

--

--