Use Of Confusion Matrix In Cyber Crime

Pjore
4 min read · Jun 5, 2021

Hello Guys 🙋‍♀️ In this article we are going to look at how the concept of the Confusion Matrix is used in Cyber Crime.

Let us first discuss some basic terminology before going deeper into how Machine Learning and Cyber Crime come together.

Confusion Matrix is one of the most important terms used in Machine Learning, so let's first discuss what Machine Learning is.

What Is Machine Learning?

To understand it, let’s consider the example of a little kid. Try to picture Machine Learning as a little kid. A little kid always tries to explore the world around him and discover new things. At the start he faces many difficulties, but with experience he becomes better at understanding and exploring new things.

The same thing happens with Machine Learning. We need to provide it with lots of data by using programs or models, and multiple models are available in Machine Learning. Simply put, Machine Learning is a technology in which a machine learns from past experience by using a huge amount of data.

Cyber Crime

Crime that involves a computer and a network is known as Cyber Crime. It covers offences committed against individuals or groups of individuals with a criminal motive to intentionally harm the reputation of the victim, or to cause physical or mental harm, or loss, to the victim directly or indirectly, using modern telecommunication networks such as the internet (including chat rooms, emails, notice boards and groups) and mobile phones (SMS/Bluetooth/MMS).

Following are some types of Cyber Crime:

❇️ Hacking

❇️ Cyberstalking

❇️ Denial of Service

❇️ Dissemination of Malicious Software

❇️ Phishing

❇️ Spyware

Need Of Machine Learning In Cyber Crime

❇️ The number of attacks is increasing: In order to prevent various kinds of attacks, we need Machine Learning.

❇️ Cyber Crime for sale: Professional cybercriminals are selling customized hacking as a service and victimizing various consumers.

❇️ More network logs, which are difficult to analyze manually

❇️ Intrusion Prevention Systems/Intrusion Detection Systems are mostly pattern-based devices: These devices work much like supervised learning. They store the attacks they have already seen, and whenever a new packet arrives, they compare it with the stored attack patterns. Attacks that do not match any stored pattern have to be caught by anomaly detection.

❇️ More Sophisticated Attacks

❇️ Detecting Zero-Day Attacks

❇️ Protecting Big Data
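The difference between pattern-based detection and anomaly detection mentioned in the list above can be sketched in a few lines of Python. This is only a toy illustration, not how a real IDS/IPS is built: the signatures, the packet sizes and the function names `matches_signature` and `is_anomalous` are all made up for this example.

```python
from statistics import mean, stdev

# Hypothetical "already seen" attack patterns (assumption for illustration).
KNOWN_SIGNATURES = {b"' OR '1'='1", b"../../etc/passwd"}

def matches_signature(payload: bytes) -> bool:
    """Pattern-based check: is any stored attack pattern inside the payload?"""
    return any(sig in payload for sig in KNOWN_SIGNATURES)

def is_anomalous(value: float, baseline: list, threshold: float = 3.0) -> bool:
    """Anomaly check: is the value more than `threshold` standard deviations
    away from the mean of the normal (baseline) traffic?"""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical packet sizes (in bytes) observed during normal traffic.
baseline_sizes = [510, 498, 505, 492, 501, 507, 495, 500]

print(matches_signature(b"GET /?id=' OR '1'='1"))  # known pattern
print(matches_signature(b"GET /index.html"))       # nothing stored matches
print(is_anomalous(4000, baseline_sizes))          # far from normal sizes
print(is_anomalous(503, baseline_sizes))           # close to normal sizes
```

The pattern-based check can only catch what is already stored, while the anomaly check can flag traffic it has never seen before, at the cost of possible false alarms.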

Following are some Applications of Machine Learning in Cyber Crime:

❇️ Risk Assessment

❇️ Digital Forensics

❇️ Spam Filtering

❇️ Phishing Emails

❇️ Event Correlation

❇️ Network Monitoring

What Is Confusion Matrix?

Confusion Matrix is one of the important terms in Machine Learning. It is used to evaluate the performance of a Classification Model, for example by calculating its accuracy. Basically, there are two main parts of the confusion matrix, i.e. Actual values and Predicted values. Actual values are nothing but the real values, and Predicted values are the values predicted by our model.

Following are some components of Confusion Matrix:

TP (True Positive): Actual value is positive and predicted value is also positive

TN (True Negative): Actual value is negative and predicted value is also negative

FP (False Positive): Actual value is negative but the predicted value is positive

FN (False Negative): Actual value is positive but the predicted value is negative
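Here is a minimal Python sketch of how these four components can be counted. The labels are made-up toy data (1 = attack, 0 = normal traffic), and `confusion_counts` is just a helper name chosen for this example.

```python
# Hypothetical toy labels: 1 = attack (positive), 0 = normal traffic (negative).
actual    = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP and FN for a binary classifier."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1   # actual positive, predicted positive
        elif a != positive and p != positive:
            tn += 1   # actual negative, predicted negative
        elif a != positive and p == positive:
            fp += 1   # actual negative, predicted positive
        else:
            fn += 1   # actual positive, predicted negative
    return tp, tn, fp, fn

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=4 FP=1 FN=2
```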

Terms Related to Confusion Matrix:

Recall = TP / Actual Yes = TP / (TP + FN)

Accuracy = (TP + TN) / Total

Error Rate = 1 - Accuracy

Precision = TP / Predicted Yes = TP / (TP + FP)
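These formulas translate directly into code. The sketch below assumes we already have the four counts; the numbers passed in are hypothetical and `metrics` is a helper name invented for this example.

```python
def metrics(tp, tn, fp, fn):
    """Compute recall, precision, accuracy and error rate from the four counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    actual_yes = tp + fn      # everything that is really positive
    predicted_yes = tp + fp   # everything the model called positive
    accuracy = (tp + tn) / total
    return {
        "recall": tp / actual_yes if actual_yes else 0.0,
        "precision": tp / predicted_yes if predicted_yes else 0.0,
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
    }

# Hypothetical counts, for illustration only.
m = metrics(tp=3, tn=4, fp=1, fn=2)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

In cyber crime detection a False Negative (a missed attack) is usually more costly than a False Positive, so recall is often watched as closely as accuracy.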

How Confusion Matrix Is Used In Cyber Crime

With the help of the confusion matrix, we can see which categories were classified correctly and which were classified incorrectly. In the same way, we can use it to measure the accuracy of a model that identifies different types of cyberattacks.
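When there are more than two categories, the confusion matrix simply gets one row and one column per category. The sketch below builds such a matrix in plain Python. The three attack categories and all labels are hypothetical toy data, not taken from any case study.

```python
# Hypothetical attack categories and toy labels (assumptions for illustration).
labels = ["dos", "malware", "phishing"]
actual    = ["phishing", "phishing", "malware", "dos",
             "dos", "malware", "phishing", "dos"]
predicted = ["phishing", "malware", "malware", "dos",
             "phishing", "malware", "phishing", "dos"]

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>8}: {row}")
```

The diagonal holds the correctly classified samples, and every off-diagonal cell tells us which category got confused with which.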

Consider the following example of cybercrime from a case study. We need to classify the cases into these classes, and some features are selected accordingly.

Let's understand this case by looking at the result.

This is the result of the use case.

The criminal court case label can be predicted with an accuracy of 76%. This means 24% of all criminal court cases get misclassified as another class. However, since this figure is the weighted average of the F1 scores of the individual classes, it may be better to calculate the scores per class, as some classes perform better than others. It appears that ‘child pornography’ can be determined with high accuracy.
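To see why per-class numbers can tell a different story than one overall figure, here is a small sketch that computes precision, recall and F1 for each class, plus the overall accuracy and the support-weighted F1. The matrix values and class names are invented for this example and are not the results of the case study above.

```python
# Hypothetical 3-class confusion matrix (rows = actual, columns = predicted).
labels = ["dos", "malware", "phishing"]
matrix = [
    [8, 1, 1],
    [2, 5, 3],
    [0, 2, 8],
]

def per_class_report(matrix, labels):
    """Precision, recall, F1 and support for every class."""
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        actual_yes = sum(matrix[i])                     # row sum
        predicted_yes = sum(row[i] for row in matrix)   # column sum
        recall = tp / actual_yes if actual_yes else 0.0
        precision = tp / predicted_yes if predicted_yes else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall,
                         "f1": f1, "support": actual_yes}
    return report

report = per_class_report(matrix, labels)
total = sum(sum(row) for row in matrix)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / total
weighted_f1 = sum(r["f1"] * r["support"] for r in report.values()) / total

for label, r in report.items():
    print(f"{label:>8}: precision={r['precision']:.2f} "
          f"recall={r['recall']:.2f} f1={r['f1']:.2f}")
print(f"accuracy={accuracy:.2f} weighted_f1={weighted_f1:.2f}")
```

In this toy matrix one class is recognised noticeably worse than the other two, which a single overall score would hide.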

In this way, the Confusion Matrix is used in solving various challenges of Cyber Crime.

Thanks For Reading😊😊😊
