
Hand Gesture Recognition for Alphabet Letters

Overview

This project recognizes alphabet letters from hand gestures captured by a webcam. The application combines computer vision and deep learning to interpret American Sign Language (ASL) gestures in real time.

Dataset

The dataset comes from Roboflow, specifically the American Sign Language Letters dataset. It contains images labeled with the ASL sign for each letter of the alphabet.

Model

The model is based on YOLOv8s, known for its efficiency and accuracy in real-time object detection tasks. The training pipeline is managed with DVC (Data Version Control), allowing for efficient tracking of dataset changes and model retraining based on specific user inputs.

Development and Monitoring Tools

  • DVC (Data Version Control): Manages data and model versioning to ensure reproducibility and facilitate incremental updates to the model.
  • DagsHub: Stores the data and code. It also provides MLflow hosting for tracking experiments and model performance.
  • MLflow: Monitors the training process and compares metrics to select the best-performing model for production deployment.

Application Architecture

The backend server is built using Flask, handling the integration with the machine learning model and the webcam. The frontend is developed using Vue.js 3, providing a responsive interface for real-time webcam access and display of the recognition results.

Installation and Usage

Prerequisites

  • Python 3.8+
  • Flask
  • Vue.js
  • MLflow
  • DVC
  • YOLOv8s

Setup

Clone the repository and navigate to the project directory:

git clone <repository-url>
cd <repository-dir>
pip install -r requirements.txt

Start

python server.py

Application Flow and Model Retraining Explanation

Model Selection and Download

When the application starts or when the model is required for gesture recognition, it follows a specific flow to ensure the best performing model is used:

  1. Finding the Best Model: The backend server queries all the model runs tracked by MLflow. It compares their performance based on the 'loss' metric, looking for the model with the lowest loss as this indicates the best performance.

  2. Model Download: If the best model (i.e., the one with the least loss) is not already present in the local directory, it is downloaded from the remote storage. This ensures that the application always uses the most accurate and up-to-date model available without manual intervention.
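The selection and download check described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the record layout, the helper names `select_best_run` and `needs_download`, and the `<run_id>.pt` file naming are all assumptions. In practice the run records would presumably be built from the MLflow tracking API rather than written by hand.

```python
from pathlib import Path


def select_best_run(runs):
    """Return the run record with the lowest 'loss', or None if no run reports one."""
    scored = [r for r in runs if r.get("loss") is not None]
    if not scored:
        return None
    return min(scored, key=lambda r: r["loss"])


def needs_download(run_id, model_dir):
    """Return True when no local weights file exists for the given run.

    The '<run_id>.pt' naming scheme is an assumption made for this sketch.
    """
    return not (Path(model_dir) / f"{run_id}.pt").exists()


# Hypothetical run records, used only to illustrate the comparison.
runs = [
    {"run_id": "a1", "loss": 0.42},
    {"run_id": "b2", "loss": 0.31},
    {"run_id": "c3", "loss": None},  # a run that never logged a loss
]

best = select_best_run(runs)
print(best["run_id"], needs_download(best["run_id"], "models"))
```

Runs without a logged loss are skipped rather than treated as zero, so an incomplete run cannot be selected by accident.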

Model Retraining

The application supports two methods for model retraining: through the UI and using DVC commands.

  1. UI-Driven Retraining:

    • Users can initiate retraining directly from the application's UI. This is particularly useful for non-technical users or for quick adjustments based on new data.
    • The UI includes options to upload new training data or modify existing datasets, and to start the training process with new parameters. Once the user submits the retraining request, the backend triggers a new DVC pipeline run.
    • The results, including updated metrics and models, are automatically tracked in MLflow and made available for comparison and deployment.
  2. Command Line Retraining via DVC:

    • Advanced users can utilize DVC directly from the command line to handle more complex retraining workflows or to integrate custom scripts.
    • Retraining can be initiated by running dvc repro, which reproduces the pipeline defined in dvc.yaml. The command checks for changes in the dataset or in the pipeline stages and reruns only the stages affected by them.
    • As with the UI, updates to models and metrics are tracked via MLflow, ensuring that any improvements are documented and that the best model can be identified and used.
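Both retraining paths end in a DVC pipeline run. A backend could trigger it along these lines. This is a sketch under assumptions: the helper names are hypothetical, the stage name `train` is illustrative and not taken from this repository's dvc.yaml, and the real server may invoke DVC differently.

```python
import subprocess


def build_repro_command(stage=None, force=False):
    """Build the argument list for `dvc repro`.

    With no stage, DVC reproduces the whole pipeline and skips unchanged stages.
    """
    cmd = ["dvc", "repro"]
    if force:
        cmd.append("--force")  # rerun stages even if nothing changed
    if stage:
        cmd.append(stage)
    return cmd


def run_retraining(stage=None, force=False, cwd="."):
    """Run the pipeline and return (succeeded, combined output)."""
    result = subprocess.run(
        build_repro_command(stage, force),
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout + result.stderr


# Only print the command here; calling run_retraining() requires DVC to be installed.
print(" ".join(build_repro_command()))
```

Passing the command as a list instead of a shell string avoids quoting problems if a stage name ever comes from user input.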

Real-Time Use

Once the best model is identified and available locally, the Flask backend uses it to interpret live webcam frames in real time. The frontend displays the recognized ASL gestures, providing immediate feedback to the user.
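As an illustration of the last step, the backend has to reduce the detections for a frame to the single letter shown to the user. The sketch below assumes detections arrive as dictionaries with `label` and `confidence` keys and applies a 0.5 confidence cutoff. The data shape, the cutoff, and the function name `top_letter` are assumptions for this example, not details from the project.

```python
def top_letter(detections, min_confidence=0.5):
    """Return the label of the most confident detection, or None if none pass the cutoff."""
    candidates = [d for d in detections if d["confidence"] >= min_confidence]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d["confidence"])["label"]


# Hypothetical detections for one webcam frame.
frame_detections = [
    {"label": "A", "confidence": 0.62},
    {"label": "S", "confidence": 0.87},
    {"label": "E", "confidence": 0.30},
]

print(top_letter(frame_detections))
```

Returning None when nothing clears the cutoff lets the frontend show "no gesture detected" instead of a low-confidence guess.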

Together, these steps let the application manage data and models dynamically, support user-driven retraining, and serve the best-performing model for real-time ASL gesture recognition.
