cornelliusyudhawijaya/End_to_End_Spam_Classifier


This repository collects the code for an end-to-end NLP machine learning spam classifier project. The accompanying article will be linked here once it is published. The code is still being cleaned up and is not yet complete, so keep an eye on this repository for updates.

Aim

This project aims to develop a spam classifier model from the available email data and to simulate how to deploy and maintain it.

Procedure

This project follows the standard data science workflow, which includes:

Data Exploration
Data Preprocessing
Model Development
Model Evaluation
Model Deployment
Model Maintenance
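
To make the preprocessing and model development steps concrete, here is a minimal sketch using only the Python standard library. It is not the code from this repository: the multinomial naive Bayes approach, the function names (`tokenize`, `train_naive_bayes`, `predict`), and the toy emails are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Preprocessing: lowercase the text and extract alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(samples):
    """Model development: count tokens per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = {}  # label -> Counter of token frequencies
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocabulary.add(token)
    return {
        "label_counts": label_counts,
        "word_counts": word_counts,
        "vocabulary": vocabulary,
    }


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocabulary"])
    best_label, best_score = None, -math.inf
    for label, n in model["label_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + vocab_size  # Laplace smoothing
        score = math.log(n / total)  # log prior
        for token in tokenize(text):
            if token in model["vocabulary"]:  # ignore unseen tokens
                score += math.log((counts[token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the project's dataset.
TOY_EMAILS = [
    ("Win a free prize now", "spam"),
    ("Claim your free money today", "spam"),
    ("Meeting agenda for Monday", "ham"),
    ("Please review the project report", "ham"),
]
model = train_naive_bayes(TOY_EMAILS)
print(predict(model, "free prize money"))
```

A real pipeline would train on the full dataset and hold out part of it for evaluation; the toy data here only shows the shape of the train and predict steps.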

The overall project structure is shown in the flowchart image in the repository root (end-to-end-spam-classification-flowchart.png).

Tech Stack

This project uses open-source tools appropriate for each step, including:

Python
Jupyter Notebook
VS Code
Docker
Airflow
FastAPI

Additionally, we use Python packages common in data science projects. Some notable packages include:

MLflow
Evidently
NLTK
Wordcloud

Dataset

You can find the dataset in the storage folder (https://dagshub.com/cornelliusyudhawijaya/End_to_End_Spam_Classifier/src/master/storage/spam_emails.csv).
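
As a rough sketch of how the CSV could be read with the standard library, the helper below returns (text, label) pairs from an open file handle. The column names `text` and `label` are assumptions, since this README does not describe the file's schema; check the header row of spam_emails.csv and pass the actual names. The in-memory sample stands in for the real file.

```python
import csv
import io


def load_emails(handle, text_column="text", label_column="label"):
    """Read (text, label) pairs from an open CSV file handle.

    The default column names are hypothetical; adjust them to match
    the header row of the actual dataset.
    """
    reader = csv.DictReader(handle)
    missing = {text_column, label_column} - set(reader.fieldnames or [])
    if missing:
        raise KeyError(f"CSV is missing expected columns: {sorted(missing)}")
    return [(row[text_column], row[label_column]) for row in reader]


# In-memory stand-in for the real file, with made-up rows.
sample_csv = io.StringIO(
    'text,label\n"Win a free prize, now",spam\nMeeting at noon,ham\n'
)
rows = load_emails(sample_csv)
print(rows)
```

For the real dataset, the same function can be called on a handle opened with `open("storage/spam_emails.csv", newline="", encoding="utf-8")`; the encoding is also an assumption.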

Screenshots

Some screenshots of the project activity:

1. Email Statistics

2. Visualization of the Email Statistics

3. Word cleaning and word cloud visualization

4. Model development and evaluation experiment

5. Changes in model evaluation metrics across iterations

6. Simple Streamlit dashboard for the model front end

7. Building the model API and serving in the backend

8. Using Docker Compose to containerize the frontend and backend

9. Tracking the model experiments with MLflow
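
The README does not say which evaluation metrics are tracked across iterations. As an illustration only, the sketch below computes accuracy, precision, recall, and F1, which are common choices for spam classification. The function name `classification_metrics`, the `"spam"` positive label, and the sample labels are assumptions, not taken from the project.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration.
y_true = ["spam", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Computing these values after every training run and logging them to an experiment tracker is one way to see how the metrics change between iterations.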
