
README.md


This repository contains the code for the end-to-end NLP Machine Learning Spam Classifier project. The accompanying article will be linked here once it has been published. The code is still being cleaned up and is not yet complete, so keep an eye on this repository.

Aim

This project aims to develop a spam classifier model from the available email data and to simulate how to deploy and maintain it.

Procedure

This project follows the standard data science project workflow:

  1. Data Exploration
  2. Data Preprocessing
  3. Model Development
  4. Model Evaluation
  5. Model Deployment
  6. Model Maintenance

The overall project structure looks like the following:

image
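
To make steps 2 to 4 of the procedure concrete, here is a minimal sketch of preprocessing, model development, and model evaluation using only the Python standard library. It is an illustration under stated assumptions, not the code in this repository: the README does not say which model the project uses, so a small multinomial naive Bayes classifier stands in for it, the regex tokenizer stands in for the NLTK-based cleaning, and the function names (`tokenize`, `train`, `predict`, `evaluate`) and the toy messages are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only.

    A simple stand-in for the project's preprocessing (assumption).
    """
    return re.findall(r"[a-z]+", text.lower())


def train(messages, labels):
    """Count words per class for a multinomial naive Bayes model."""
    word_counts = {label: Counter() for label in set(labels)}
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocab = set().union(*word_counts.values())
    return {"word_counts": word_counts, "class_counts": class_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    tokens = [t for t in tokenize(text) if t in model["vocab"]]  # skip unseen words
    scores = {}
    for label, counts in model["word_counts"].items():
        total_words = sum(counts.values())
        score = math.log(model["class_counts"][label] / total_docs)
        for token in tokens:
            score += math.log((counts[token] + 1) / (total_words + vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)


def evaluate(model, messages, labels, positive="spam"):
    """Compute accuracy, precision, and recall for the positive class."""
    preds = [predict(model, m) for m in messages]
    pairs = list(zip(preds, labels))
    tp = sum(p == positive and y == positive for p, y in pairs)
    fp = sum(p == positive and y != positive for p, y in pairs)
    fn = sum(p != positive and y == positive for p, y in pairs)
    return {
        "accuracy": sum(p == y for p, y in pairs) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


if __name__ == "__main__":
    # Toy data invented for illustration; the real project uses spam_emails.csv.
    messages = [
        "win a free prize now",
        "free cash offer click now",
        "meeting moved to monday",
        "please review the attached report",
    ]
    labels = ["spam", "spam", "ham", "ham"]
    model = train(messages, labels)
    print(predict(model, "claim your free prize"))
    print(evaluate(model, messages, labels))
```

In a real run the evaluation would use a held-out test split rather than the training messages, and the resulting metrics are the kind of values an experiment tracker would record per iteration.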

Tech Stack

This project uses several open-source tools, each suited to a particular step:

  1. Python
  2. Jupyter Notebook
  3. VS Code
  4. Docker
  5. Airflow
  6. FastAPI

Additionally, the project uses Python packages that are common in data science projects. Some notable packages include:

  1. MLflow
  2. Evidently
  3. NLTK
  4. Wordcloud

Dataset

You can find the dataset in the storage folder: https://dagshub.com/cornelliusyudhawijaya/End_to_End_Spam_Classifier/src/master/storage/spam_emails.csv
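
Once the CSV has been downloaded, a quick sanity check of its columns and row count might look like the sketch below. It makes no assumption about the column names, since this README does not list them. The helper `summarize_csv` is hypothetical, and the local path `storage/spam_emails.csv` only mirrors the folder layout in the URL; adjust it to wherever the file is saved.

```python
import csv
from pathlib import Path


def summarize_csv(lines):
    """Return (column_names, row_count) for CSV text given as an iterable of lines."""
    reader = csv.DictReader(lines)
    row_count = sum(1 for _ in reader)  # the header row is not counted
    return list(reader.fieldnames or []), row_count


if __name__ == "__main__":
    # Assumed local path, mirroring the storage/ folder in the repository.
    path = Path("storage/spam_emails.csv")
    if path.exists():
        # newline="" lets the csv module handle quoted multi-line fields correctly.
        with path.open(newline="", encoding="utf-8", errors="replace") as f:
            columns, n_rows = summarize_csv(f)
        print(f"columns: {columns}")
        print(f"rows: {n_rows}")
    else:
        print(f"{path} not found; download it from the repository first.")
```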

Screenshots

Some screenshots of the project activities:

1. Email Statistics

image

2. Visualization of the Email Statistics

image

3. Word cleaning and word cloud visualization

image

4. Model development and evaluation experiment

image

5. Changes in the model evaluation metrics across iterations

image

6. Simple Streamlit dashboard for the model front end

image

7. Building the model API and serving it in the backend

image

8. Using Docker Compose to containerize the frontend and backend

image

9. Tracking the model experiments with MLflow

image
