This project leverages a webcam to recognize alphabet letters from hand gestures using a machine learning model. The application integrates computer vision and deep learning technologies to interpret American Sign Language (ASL) gestures captured in real-time through a webcam.
The dataset used in this project comes from Roboflow, specifically the American Sign Language Letters dataset. It includes images labeled with the ASL signs for the letters of the alphabet.
The model is based on YOLOv8s, known for its efficiency and accuracy in real-time object detection tasks. The training pipeline is managed with DVC (Data Version Control), allowing for efficient tracking of dataset changes and model retraining based on specific user inputs.
The backend server is built using Flask, handling the integration with the machine learning model and the webcam. The frontend is developed using Vue.js 3, providing a responsive interface for real-time webcam access and display of the recognition results.
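As a rough sketch of how the Flask backend can expose the model to the Vue.js frontend, the endpoint below accepts a webcam frame and returns a recognized letter. The route name, port, and `recognize_letter` helper are illustrative assumptions, not the project's actual code; in the real server the placeholder would run YOLOv8 inference on the decoded frame.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def recognize_letter(frame_bytes):
    # Placeholder for the YOLOv8 inference step: the real application
    # decodes the webcam frame and runs the trained detector on it.
    return {"letter": "A", "confidence": 0.0}

@app.route("/predict", methods=["POST"])
def predict():
    # The frontend posts a single webcam frame as the request body.
    frame = request.get_data()
    return jsonify(recognize_letter(frame))

if __name__ == "__main__":
    app.run(port=5000)
```

The frontend can then poll this endpoint with captured frames and render the returned letter next to the live video feed.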
Clone the repository and navigate to the project directory:
git clone <repository-url>
cd <repository-dir>
pip install -r requirements.txt
python server.py
When the application starts or when the model is required for gesture recognition, it follows a specific flow to ensure the best performing model is used:
Finding the Best Model: The backend server queries all the model runs tracked by MLflow. It compares their performance based on the 'loss' metric, looking for the model with the lowest loss as this indicates the best performance.
Model Download: If the best model (i.e., the one with the least loss) is not already present in the local directory, it is downloaded from the remote storage. This ensures that the application always uses the most accurate and up-to-date model available without manual intervention.
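The selection step above can be sketched as a small pure function. In the actual backend this logic would run over runs returned by MLflow's tracking API (e.g. `mlflow.search_runs`); the dictionary shape and run IDs below are illustrative assumptions.

```python
def pick_best_run(runs):
    """Return the run with the lowest 'loss' metric, mirroring the
    comparison the backend performs over MLflow-tracked runs.
    Runs without a logged 'loss' are ignored."""
    scored = [r for r in runs if "loss" in r.get("metrics", {})]
    if not scored:
        return None
    return min(scored, key=lambda r: r["metrics"]["loss"])

# Example with hypothetical run records:
runs = [
    {"run_id": "a1", "metrics": {"loss": 0.42}},
    {"run_id": "b2", "metrics": {"loss": 0.17}},
    {"run_id": "c3", "metrics": {}},  # run without a logged loss
]
best = pick_best_run(runs)
print(best["run_id"])  # b2
```

If the winning run's model artifact is not already on disk, the server would then download it from remote storage before loading it for inference.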
The application supports two methods for model retraining — through the UI and using DVC commands:
UI-Driven Retraining:
Command Line Retraining via DVC:
Running `dvc repro` re-executes the pipeline defined in `dvc.yaml`. The command checks for changes in the dataset or in the pipeline stages and reruns only the parts that are out of date.

Once the best model is identified and available locally, the Flask backend uses it to interpret live webcam frames in real time. The frontend displays the recognized ASL letters, giving the user immediate feedback.
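For reference, a `dvc.yaml` for a pipeline like this one might look as follows. The stage name, script, and paths are assumptions for illustration, not the project's actual files:

```yaml
stages:
  train:
    cmd: python train.py          # hypothetical training entry point
    deps:
      - train.py
      - data/asl_letters          # the versioned Roboflow dataset
    outs:
      - models/best.pt            # trained YOLOv8 weights
```

With a definition like this, `dvc repro` only reruns the `train` stage when the dataset or the training script has changed.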
In summary, the application manages data and models dynamically, supports user-driven retraining, and ensures that the most accurate available model is deployed for real-time ASL gesture recognition.