Semantic_Similarity

Calculates the semantic similarity between a search query and the entries in a database, then retrieves the most similar entries and ranks them by similarity.
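
The retrieve-and-rank idea described above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the toy three-dimensional vectors stand in for sentence embeddings, cosine similarity is assumed as the similarity measure, and the names `cosine_similarity` and `rank_by_similarity` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus):
    """Return (text, score) pairs sorted from most to least similar.

    `corpus` maps each text to its embedding vector.
    """
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings, made up for illustration; a real run would use model output.
corpus = {
    "How do I sort a list in Python?": [0.9, 0.1, 0.0],
    "What is a segmentation fault?": [0.0, 0.2, 0.9],
    "Sorting a dictionary by value": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]

for text, score in rank_by_similarity(query, corpus):
    print(f"{score:.3f}  {text}")
```

In practice the vectors would come from whichever model is selected (BERT-based or FastText-based), and only the ranking step would look like the sketch.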

Prerequisites

Python 3.8+
Transformers
All packages listed in requirements.txt.

Usage

Clone this repository.
Install the dependencies with pip install -r requirements.txt.
Pull the files stored on the DagsHub remote storage by running dvc pull.
To run the BERT-based model, use the command:

python src/main.py --model_name 'BERT' --search_criteria './data/search.txt' --query_file './data/data.csv' --column_name 'Title'

To run the FastText-based model, use the command:

python src/main.py --model_name 'fasttext' --search_criteria './data/search.txt' --query_file './data/data.csv' --column_name 'Title'

The search criteria can be either a path to a file or a text string.
The other arguments can be customised to match your data.
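
The point that the search criteria may be a file or a text string can be illustrated with a small sketch. The four flag names come from the commands above. Everything else is an assumption for illustration and may differ from what `src/main.py` actually does: the `resolve_search_criteria` and `build_parser` helpers, the default values, and the one-query-per-line file layout.

```python
import argparse
import os


def resolve_search_criteria(value):
    """Return a list of queries from a file path or a literal string.

    Assumption: a file holds one query per line; anything that is not
    an existing file is treated as a single literal query.
    """
    if os.path.isfile(value):
        with open(value, encoding="utf-8") as handle:
            return [line.strip() for line in handle if line.strip()]
    return [value]


def build_parser():
    """Parser mirroring the four flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Semantic similarity search (sketch)")
    parser.add_argument("--model_name", default="BERT")  # default is an assumption
    parser.add_argument("--search_criteria", required=True)
    parser.add_argument("--query_file", default="./data/data.csv")  # assumption
    parser.add_argument("--column_name", default="Title")  # assumption
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--search_criteria", "how to sort a list"])
print(resolve_search_criteria(args.search_criteria))
```

Checking for an existing file first means that a plain string which happens to match a file name is read as a file. A dedicated flag for each input kind would avoid that ambiguity.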

Running the Streamlit App

To run the Streamlit app:

streamlit run app.py

Dataset

The dataset has been extracted from Kaggle and is based on questions from Stack Overflow.

Check out the App on AWS EC2

This project has been deployed to EC2; feel free to check it out here.


ShambhaviCodes / Semantic_Similarity
