Machine Learning

Quentin Saguer

Created on November 27, 2024

Team Members : SAGUER Quentin, GUESSOUS Samy, CARRE Pablo, PONTHIEU Gabriel, TALLARON Matéo

Dataset Overview

  • Source : Cyber Threat Detection document
  • This dataset contains 1430 rows and 23 columns
  • Problem statement :
    • Goal : Classify network activities as either malicious or benign
    • Target : Label column, where 1 = malicious and 0 = benign
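
A first look at the dataset can be sketched as follows. This is only an illustration: the CSV text is a made-up stand-in for the real file (which has 1430 rows and 23 columns), and only the column names Packet_Length, Bytes_Sent and Label come from the slides.

```python
import io

import pandas as pd

# Made-up CSV text standing in for the real dataset file (assumption).
# With the real data this would be pd.read_csv("<path to the dataset>").
csv_text = """Packet_Length,Bytes_Sent,Label
120,1000,0
540,2500,1
980,400,0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Shape of the table: (rows, columns)
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")

# How many rows fall in each class of the target
label_counts = df["Label"].value_counts()
print(label_counts)
```

On the real dataset, the same two checks should report 1430 rows and 23 columns and show how balanced the Label column is.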

Exploring Dataset

Data Cleaning

  • Tasks performed :
    • Removed irrelevant columns
    • Verified there were no missing values in the dataset
    • Checked for duplicates : "No duplicate rows found"
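
The cleaning tasks above could look roughly like this with pandas. It is a minimal sketch, not the team's actual code: the helper `clean_dataset`, the column `Session_ID` and the demo values are hypothetical, since the slides do not say which columns were removed.

```python
import pandas as pd


def clean_dataset(df: pd.DataFrame, irrelevant_columns: list[str]) -> pd.DataFrame:
    """Drop irrelevant columns, then report missing values and duplicate rows."""
    # errors="ignore" skips any listed name that is not present in the frame
    cleaned = df.drop(columns=irrelevant_columns, errors="ignore")

    missing = int(cleaned.isnull().sum().sum())
    duplicates = int(cleaned.duplicated().sum())
    print(f"Missing values: {missing}")
    if duplicates == 0:
        print("No duplicate rows found")
    else:
        print(f"{duplicates} duplicate rows found")
    return cleaned


# Tiny made-up frame standing in for the real dataset (assumption)
demo = pd.DataFrame({
    "Session_ID": ["a", "b", "c"],  # hypothetical irrelevant column
    "Packet_Length": [120, 540, 980],
    "Bytes_Sent": [1000, 2500, 400],
    "Label": [0, 1, 0],
})
cleaned_demo = clean_dataset(demo, irrelevant_columns=["Session_ID"])
```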

Splitting features and target

  • Features : All relevant columns except Label
  • Target : The Label column

Data Visualization

  • Charts to include :
    • Bar chart : Distribution of Label values
    • Histogram : Distribution of Packet_Length or another numeric feature
    • Heatmap : Correlation between features (using seaborn or a similar library)

Splitting Data

  • Process :
    • Split dataset into training (80%) and testing (20%)
    • Use the Python code shown on the slide
  • Why ?
    • To ensure model is tested on unseen data
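
An 80/20 split of this kind is commonly done with scikit-learn's `train_test_split`. The sketch below uses made-up data, and the `stratify=y` and `random_state=42` arguments are assumptions added here (they keep the class ratio equal in both parts and make the split reproducible); the slides do not say whether they were used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up data (assumption): 100 rows, 70 of class 0 and 30 of class 1
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "Packet_Length": rng.integers(40, 1500, size=100),
    "Bytes_Sent": rng.integers(100, 10_000, size=100),
    "Label": np.repeat([0, 1], [70, 30]),
})

# Features: every column except Label; target: the Label column
X = df.drop(columns=["Label"])
y = df["Label"]

# 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```

The model never sees `X_test` during training, so its score on that part estimates performance on new traffic.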

Training Models

  • Models used :
    • Logistic Regression
    • Random Forest Classifier
    • Support Vector Machine (SVM)
  • Process :
    • Train each model using the training data
    • Use the Python code shown on the slide
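
Training the three models could look like the sketch below. The data comes from scikit-learn's `make_classification` as a stand-in for the real dataset, and the hyperparameters (`max_iter`, `n_estimators`, `kernel`) are illustrative assumptions, not the values used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the real features and labels (assumption)
X, y = make_classification(n_samples=300, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The three models named on the slide, with illustrative settings
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": SVC(kernel="rbf", random_state=42),
}

# Fit each model on the training data only
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name} trained")
```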

Model Evaluation

  • Key Insights :
    • Class imbalance is a challenge, but it can be managed with Random Forest
    • Key features such as Packet_Length and Bytes_Sent are critical for classification
    • Visualizing the data helped us understand the distributions and feature importance, which led to a better model
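
One way to check these points is to evaluate a Random Forest on an imbalanced test set and inspect its feature importances. This is a sketch under assumptions: the data is synthetic, the feature names are placeholders, and `class_weight="balanced"` is one common option for handling imbalance that the slides do not confirm.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (assumption): roughly 80% class 0, 20% class 1
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3,
    weights=[0.8, 0.2], random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# class_weight="balanced" reweights classes inversely to their frequency
model = RandomForestClassifier(class_weight="balanced", random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Per-class precision, recall and F1 say more than accuracy on imbalanced data
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred))

# Impurity-based importances, largest first
importances = pd.Series(
    model.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances)
```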

Conclusion

  • Best Model :
    • Random Forest showed the best overall performance for this classification task
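
A comparison of this kind can be reproduced by scoring the three models with the same metric. The sketch below uses 5-fold cross-validation and the F1 score, both of which are assumptions (the slides do not state how the models were compared), and the data is synthetic, so the ranking it prints says nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in data (assumption)
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.7, 0.3], random_state=1
)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=1),
    "SVM": SVC(random_state=1),
}

# Mean F1 over 5 folds for each model
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    for name, model in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean F1 = {score:.3f}")

best_name = max(scores, key=scores.get)
print("Best model on this synthetic data:", best_name)
```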

Any questions?

Thank you for listening