
Project: NLP Binary Classification using Microservices Architecture for StackOverflow Tag Prediction with DVC Integration.

✨ Project information:

The project is a natural language processing (NLP) binary classification problem: predicting tags for a given StackOverflow question. For example, one classifier predicts whether a post is about the R language and, if so, tags it R. The project uses DVC (Data Version Control) to manage data, is built on a microservices architecture, and covers the workflow end to end. The dataset can be downloaded from this link.
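As a rough illustration of the "one classifier per tag" idea, here is a minimal sketch using Scikit-learn. It is not the project's actual training code: the toy posts, the `make_binary_labels` helper, and the choice of TF-IDF features with logistic regression are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in data (hypothetical): (question text, tags) pairs.
posts = [
    ("How do I merge two data frames in R with dplyr?", ["r", "dplyr"]),
    ("ggplot2 axis labels overlap in R plot", ["r", "ggplot2"]),
    ("Why does my Python list comprehension raise a NameError?", ["python"]),
    ("Java NullPointerException when calling a method", ["java"]),
    ("Apply a function over rows of a data frame in R", ["r"]),
    ("How to center a div with CSS flexbox?", ["css", "html"]),
]


def make_binary_labels(tag_lists, target_tag):
    """Return 1 where target_tag is among a post's tags, else 0."""
    return [1 if target_tag in tags else 0 for tags in tag_lists]


texts = [text for text, _ in posts]
labels = make_binary_labels([tags for _, tags in posts], "r")

# The default token pattern drops one-character tokens, which would
# discard the word "R" itself, so single characters are kept here.
model = make_pipeline(
    TfidfVectorizer(token_pattern=r"(?u)\b\w+\b"),
    LogisticRegression(),
)
model.fit(texts, labels)

new_questions = ["Reshape a data frame in R", "Python dict lookup is slow"]
predictions = model.predict(new_questions)
print(list(zip(new_questions, predictions)))
```

Each prediction is 1 if the model thinks the question should carry the R tag and 0 otherwise. With only six training examples the output is not meaningful; the point is the shape of the problem, where a multi-tag question is reduced to a yes/no target for a single tag.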

📚 Libraries used:

  • Scikit-learn
  • Pandas
  • Numpy
  • DVC

🚀 Project structure:

workflow workflow

🐨 DagsHub Data Pipeline

workflow

The complete project data pipeline is available at DagsHub Data Pipeline.

🔥 Technologies Used:

1. Python
2. Shell scripting
3. AWS (cloud provider)
4. DVC

🔌 Infrastructure:

1. AWS S3
2. GitHub
3. DagsHub

👷 Initial Setup:

conda create --prefix ./env python=3.9
conda activate ./env 
pip install -r requirements.txt
dvc init

Conclusion

This project is production-ready for similar use cases and provides automated, orchestrated pipelines for both training and serving.

Thanks for taking a look at this project. If you find it valuable, kindly rate it by clicking the star icon. Your support is highly appreciated! 😊🙏⭐

📃 License

MIT license © My Website
Let's connect on LinkedIn
