Team 7

Team Members

Sarah Nelson
David Lyder, Jr.
Danny Zarate-Romero
Azlan Sarwar
Tyler Barry

Faculty Advisor

Mukul Bansal

Machine Learning for Cybersecurity Applications

In this project we built a machine learning model that, given the log files from a server, can tell whether that server is under attack from certain cybersecurity threats. The threats our model can detect are SQL injection and denial-of-service (DoS) attacks. The project links two important and growing fields within computer science: machine learning and cybersecurity.

For our project we set up an Apache Tomcat server running a simple pet clinic website. The website let users enter data into a database, such as the names and addresses of clients, and search that same database. We then attacked the website using Kali Linux tools and ran Python scripts to simulate normal usage. During both the attacks and the simulated normal usage, JavaMelody captured the server logs that we fed to our machine learning model.
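A script that simulates normal usage could look roughly like the sketch below. It is only an illustration of the idea: the base URL, the `/owners` paths, the form field names, and the sample last names are all assumptions, since the site's actual routes are not described here.

```python
import random
import time
import urllib.request
from urllib.parse import urlencode

# Hypothetical base URL, routes, and sample names; adjust to the real site.
BASE_URL = "http://localhost:8080/petclinic"
LAST_NAMES = ["Davis", "Franklin", "Black", "Estaban"]


def plan_normal_traffic(n_requests, seed=0):
    """Return (method, url, form) tuples imitating ordinary visitors."""
    rng = random.Random(seed)  # seeded so a traffic run can be reproduced
    plan = []
    for _ in range(n_requests):
        name = rng.choice(LAST_NAMES)
        if rng.random() < 0.7:
            # Most visits search the client database.
            query = urlencode({"lastName": name})
            plan.append(("GET", f"{BASE_URL}/owners?{query}", None))
        else:
            # The rest enter a new client record.
            form = {"lastName": name, "address": f"{rng.randint(1, 999)} Main St."}
            plan.append(("POST", f"{BASE_URL}/owners/new", form))
    return plan


def send(plan, min_pause=0.5, max_pause=3.0):
    """Send each planned request, pausing a random time between requests."""
    for method, url, form in plan:
        data = urlencode(form).encode() if form else None
        request = urllib.request.Request(url, data=data, method=method)
        with urllib.request.urlopen(request, timeout=5) as response:
            response.read()
        time.sleep(random.uniform(min_pause, max_pause))
```

Calling `send(plan_normal_traffic(100))` against a running server would produce a mix of searches and inserts spaced out like human activity. Building the plan separately from sending it makes the traffic pattern easy to inspect and repeat.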

We preprocessed the data to keep only the attributes from the server logs that we found to be necessary. This kept the model from being distracted by attributes that had no correlation with whether the server was in an attacked or a normal state.
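One plausible way to carry out this kind of filtering is to measure each attribute's correlation with the attack label and drop the ones that fall below a cutoff. The sketch below shows that idea only; the attribute names, the toy values, and the 0.1 threshold are invented for illustration and are not the project's actual data or selection procedure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_attributes(rows, label_key, threshold=0.1):
    """Keep attributes whose absolute correlation with the label meets the threshold."""
    labels = [row[label_key] for row in rows]
    kept = []
    for key in rows[0]:
        if key == label_key:
            continue
        values = [row[key] for row in rows]
        if abs(pearson(values, labels)) >= threshold:
            kept.append(key)
    return kept


# Toy rows with hypothetical attribute names (attack: 1 = attacked, 0 = normal).
rows = [
    {"hits": 12, "mean_ms": 40, "heap_mb": 300, "attack": 0},
    {"hits": 15, "mean_ms": 42, "heap_mb": 300, "attack": 0},
    {"hits": 900, "mean_ms": 610, "heap_mb": 300, "attack": 1},
    {"hits": 870, "mean_ms": 580, "heap_mb": 300, "attack": 1},
]
print(select_attributes(rows, "attack"))  # heap_mb never changes, so it is dropped
```

Linear correlation misses attributes that matter only in combination with others, so a cutoff like this works best as a first pass rather than a final decision.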

Once the data was generated and preprocessed, we trained a Long Short-Term Memory (LSTM) model to identify whether the server was in an attacked state or a normal state. An LSTM is a recurrent neural network whose cells use gates to control how long each piece of information is retained, so different cells can hold onto information for different, arbitrarily long spans of time. This suits our project because the model needs to remember what attacks from long ago look like as well as what recent attacks look like.
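To make the memory behavior concrete, here is a minimal sketch of a single LSTM cell with one input and one hidden unit, written out from the standard LSTM equations. The weights are hand-picked to show how the forget gate governs retention; they are not trained values and say nothing about the architecture of the project's model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of an LSTM cell with a scalar input and a scalar hidden state.

    params maps each gate name ("f", "i", "o", "g") to (w_x, w_h, bias).
    """
    def pre_activation(gate):
        w_x, w_h, bias = params[gate]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre_activation("f"))    # forget gate: share of old cell state kept
    i = sigmoid(pre_activation("i"))    # input gate: share of new candidate written
    o = sigmoid(pre_activation("o"))    # output gate: share of cell state exposed
    g = math.tanh(pre_activation("g"))  # candidate value
    c = f * c_prev + i * g              # updated cell state (the long-term memory)
    h = o * math.tanh(c)                # updated hidden state (the output)
    return h, c


def run(params, steps, c0=1.0, x=0.5):
    """Feed the same input for several steps and return the final cell state."""
    h, c = 0.0, c0
    for _ in range(steps):
        h, c = lstm_step(x, h, c, params)
    return c


# Hand-picked weights: a large forget bias keeps memory, a negative one erases it.
remember = {"f": (0, 0, 10), "i": (0, 0, -10), "o": (0, 0, 0), "g": (1, 0, 0)}
forget = {"f": (0, 0, -10), "i": (0, 0, -10), "o": (0, 0, 0), "g": (1, 0, 0)}

print(run(remember, steps=100))  # stays close to the initial cell state of 1.0
print(run(forget, steps=1))      # collapses toward 0 after a single step
```

In a trained network these gate weights are learned, so each cell settles on its own retention span.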

Finally, after running data through the model, we scored its performance using a variety of methods, such as the receiver operating characteristic (ROC) curve and the F1 score. We visualized the data using principal component analysis (PCA) so that it could be viewed in two dimensions despite having over 20 dimensions. When running in real time, our project can send a notification when a server is under attack, based on our pre-trained model.
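For reference, both scores can be computed from first principles as in the sketch below. The labels and model scores are a toy example made up for illustration (1 = attacked, 0 = normal), not results from the project, and the 0.5 decision threshold is likewise an assumption.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (attack) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_points(y_true, scores):
    """(false positive rate, true positive rate) pairs, one per distinct threshold.

    Requires at least one positive and one negative label.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy labels and scores, invented for this example.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 alert threshold

print(f1_score(y_true, y_pred))
print(auc(roc_points(y_true, scores)))
```

The F1 score depends on the chosen threshold, while the ROC curve summarizes behavior across all thresholds, which is why the two are useful together when picking the point at which a real-time alert should fire.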