StackOverflow Data Pipeline

Project type

Machine Learning and Natural Language Processing for Hate Speech Detection

Date

May 2025

Location

Deakin University, Victoria, Australia

Built an end-to-end pipeline that processes StackOverflow XML data: it converts the raw XML into structured datasets, performs feature engineering, and trains multiple ML models (Logistic Regression, Random Forest, XGBoost, LightGBM) to predict whether an answer is accepted and how many views a post receives. Applied SMOTE and ADASYN to handle class imbalance, PCA for dimensionality reduction, and SHAP for model explainability.
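As a rough illustration of the first stage (XML to structured records), here is a minimal sketch using only the Python standard library. It is not the project's actual code: the function name `parse_posts`, the output field names, and the inline sample are hypothetical, and the attribute names (`Id`, `PostTypeId`, `AcceptedAnswerId`, `ViewCount`) are assumed to follow the public Stack Exchange data dump layout for `Posts.xml`.

```python
import io
import xml.etree.ElementTree as ET


def parse_posts(source):
    """Stream <row> elements from a Posts.xml-style file into plain dicts.

    Hypothetical helper: attribute names are assumed from the public
    Stack Exchange data dump schema, not taken from the project itself.
    """
    records = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "row":
            continue
        attrs = elem.attrib
        records.append({
            "id": int(attrs["Id"]),
            "post_type": int(attrs.get("PostTypeId", 0)),  # 1 = question, 2 = answer
            "view_count": int(attrs.get("ViewCount", 0)),
            "accepted_answer_id": (
                int(attrs["AcceptedAnswerId"]) if "AcceptedAnswerId" in attrs else None
            ),
        })
        elem.clear()  # drop the element's contents once its values are copied
    return records


# Tiny made-up sample standing in for a real dump file.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" AcceptedAnswerId="2" ViewCount="150" />
  <row Id="2" PostTypeId="2" ParentId="1" />
  <row Id="3" PostTypeId="1" ViewCount="12" />
</posts>"""

posts = parse_posts(io.BytesIO(SAMPLE))

# Derive the "accepted answer" label: an answer is accepted when some
# question lists its Id as AcceptedAnswerId.
accepted_ids = {
    p["accepted_answer_id"] for p in posts if p["accepted_answer_id"] is not None
}
answers = [
    {"id": p["id"], "is_accepted": p["id"] in accepted_ids}
    for p in posts
    if p["post_type"] == 2
]
```

Streaming with `iterparse` rather than loading the whole tree is one reasonable choice for dump files that can run to many gigabytes; the resulting list of dicts could then be handed to a DataFrame library for feature engineering.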
