General:  transformers Data Domain:  nlp

README.md


Multilingual Sentiment Classification - Part 1

This project is the first part of a series of two blog posts. It leverages Hugging Face Transformers to perform multilingual sentiment analysis.

Motivation

More and more people across the globe share their opinions in different languages on social media platforms and review sites. To efficiently analyze the sentiment expressed about them, industries and organizations need technologies that are not limited to the English language.

This is the overall goal of this project: to develop a tool that understands sentiment expressed in different languages and ultimately helps those organizations make the right decisions.

The Dataset

The dataset used for this project is available license-free on Sentiment NLPRoc.

Acknowledgements

  1. Cross-Lingual Propagation for Deep Sentiment Analysis
  2. A Helping Hand: Transfer Learning for Deep Sentiment Analysis

Scope of the project

The previously cited papers cover 9 languages. For simplicity's sake, we will focus on the following 5 languages, as described in the table below.

Project data scope

Prerequisites

  • Python 3.6+
  • Transformers 3.1.0
  • All the requirements are specified in the requirements.txt file
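
As an illustration of what the Transformers prerequisite is used for, here is a minimal sketch of scoring sentences with a pretrained multilingual model. The model name `nlptown/bert-base-multilingual-uncased-sentiment` is an assumption chosen for the example (it labels text from "1 star" to "5 stars"); the source does not state which checkpoints the project evaluates. The helper names `stars_to_polarity` and `classify` are hypothetical.

```python
from typing import Dict, List

# Assumption: this checkpoint is only an example, not necessarily the one
# evaluated in this project. It returns labels from "1 star" to "5 stars".
EXAMPLE_MODEL = "nlptown/bert-base-multilingual-uncased-sentiment"


def stars_to_polarity(label: str) -> str:
    """Map a star label such as '4 stars' to a coarse polarity."""
    stars = int(label.split()[0])
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def classify(texts: List[str], model_name: str = EXAMPLE_MODEL) -> List[Dict[str, object]]:
    """Run the sentiment pipeline on texts and attach a coarse polarity."""
    # Imported lazily so stars_to_polarity works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis", model=model_name)
    results = classifier(texts)
    return [
        {"text": text, "polarity": stars_to_polarity(r["label"]), "score": r["score"]}
        for text, r in zip(texts, results)
    ]
```

Calling `classify(["I love this product!", "Ce film était vraiment décevant."])` downloads the model on first use and returns one dictionary per input sentence. The star-to-polarity thresholds are a design choice for this sketch and may need adjusting to match the labels in the dataset.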

Usage

Set up the project from its root directory

  • Create virtual environment
python3 -m venv your_virtual_environment
  • Start virtual environment
source your_virtual_environment/bin/activate

Run the experimentation

The automation script prepare_environment.sh performs the following tasks:

  1. Create the data folders: data/raw and data/processed
  2. Unzip the original data into the data/raw folder
  3. Run the processing to create the final data set
  4. Run the evaluation to track the pretrained models' performance using MLflow tracking
  • Run the automation script
chmod +x prepare_environment.sh
./prepare_environment.sh
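
For readers who want to see what the first two tasks listed above amount to, here is a rough Python equivalent. It is a sketch only: the real work is done by prepare_environment.sh, and the function names and the archive path passed in are hypothetical, since the source does not name the zipped data file.

```python
import zipfile
from pathlib import Path
from typing import List


def prepare_data_folders(project_root: Path) -> List[Path]:
    """Create data/raw and data/processed under the project root (task 1)."""
    folders = [project_root / "data" / "raw", project_root / "data" / "processed"]
    for folder in folders:
        # exist_ok makes the step safe to re-run on an already prepared project.
        folder.mkdir(parents=True, exist_ok=True)
    return folders


def unzip_raw_data(archive: Path, raw_dir: Path) -> List[str]:
    """Extract the original data archive into the raw folder (task 2)."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(raw_dir)
        return zf.namelist()
```

Both functions are idempotent, so running them twice leaves the folders and extracted files in the same state.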

After that, you should be able to see the tracked experiments from the Experimentation tab in the top left corner.
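
To make the evaluation step more concrete, the sketch below computes one accuracy value per language and hands each value to a logging callback. It is an assumption about what such an evaluation could look like, not the project's actual code: the function names, the metric naming scheme `accuracy_<language>`, and the choice of accuracy as the metric are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple


def accuracy_by_language(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Compute accuracy per language from (language, true_label, predicted_label) rows."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for language, truth, prediction in rows:
        total[language] += 1
        correct[language] += int(truth == prediction)
    return {language: correct[language] / total[language] for language in total}


def log_accuracies(
    accuracies: Dict[str, float],
    log_metric: Callable[[str, float], object] = print,
) -> None:
    """Send each accuracy to log_metric, in alphabetical order of language."""
    for language, value in sorted(accuracies.items()):
        log_metric(f"accuracy_{language}", value)
```

By default the values are simply printed. Inside a `with mlflow.start_run():` block, passing `mlflow.log_metric` as the `log_metric` argument would record them in MLflow instead, which keeps the metric computation independent of the tracking backend.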


About

The goal of this project is to perform sentiment classification on different languages using Hugging Face Transformers.
