
Multilingual Sentiment Classification - Part 1

This project is the first part of a two-part blog series. It leverages Hugging Face Transformers to perform multilingual sentiment analysis.

Motivation

More and more people across the globe are sharing their opinions on social media platforms and review sites, in many different languages. To efficiently analyze the sentiment of what is being said about them, companies and organizations need technologies that are not limited to English.

The overall goal of this project is to develop a tool able to understand sentiment expressed in different languages, ultimately helping those organizations make the right decisions.

The Dataset

The dataset used for this project is available license-free on Sentiment NLPRoc.

Acknowledgements

  1. Cross-Lingual Propagation for Deep Sentiment Analysis
  2. A Helping Hand: Transfer Learning for Deep Sentiment Analysis

Scope of the project

The papers cited above cover nine languages. For simplicity's sake, we will focus on the five languages described in the table below.

Project data scope

Prerequisites

  • Python 3.6+
  • Transformers 3.1.0
  • All of the requirements are specified in the requirements.txt file
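As a quick illustration of the approach, here is a minimal sketch of multilingual sentiment classification with the Transformers `pipeline` API. The checkpoint name and the star-to-sentiment mapping are assumptions for illustration, not necessarily what this project uses:

```python
def stars_to_sentiment(label):
    """Map a star-rating label such as '4 stars' to a coarse sentiment."""
    stars = int(label.split()[0])
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


def classify(texts):
    # Imported lazily so the mapping above is usable without the
    # (large) model download.
    from transformers import pipeline

    clf = pipeline(
        "sentiment-analysis",
        # Assumed multilingual checkpoint; substitute your own.
        model="nlptown/bert-base-multilingual-uncased-sentiment",
    )
    return [(t, stars_to_sentiment(r["label"])) for t, r in zip(texts, clf(texts))]


if __name__ == "__main__":
    # Works on non-English input, e.g. French and German reviews.
    print(classify(["Ce film est excellent !", "Das Essen war schrecklich."]))
```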

Usage

Set up the project from its root directory:

  • Create virtual environment
python3 -m venv your_virtual_environment
  • Start virtual environment
source your_virtual_environment/bin/activate

Run the experimentation

The prepare_environment.sh automation script performs the following tasks:

  1. Create the data folders: data/raw and data/processed
  2. Unzip the original data into the data/raw folder
  3. Run the processing to create the final dataset
  4. Run the evaluation to track the pretrained models' performance using MLflow tracking
  • Run the automation script
chmod +x prepare_environment.sh
./prepare_environment.sh

After that, you should be able to see the experiment tracking under the Experimentation tab in the top left corner.
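For reference, a hypothetical sketch of how one evaluation run could be recorded with MLflow tracking; the parameter and metric names are illustrative, not taken from the project's actual code:

```python
def build_run_params(model_name, language):
    """Parameters worth logging for one evaluation run."""
    return {"model": model_name, "language": language}


def log_evaluation(model_name, language, accuracy):
    # Imported lazily so the helper above works without mlflow installed.
    import mlflow

    with mlflow.start_run():
        mlflow.log_params(build_run_params(model_name, language))
        mlflow.log_metric("accuracy", accuracy)
```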
