Application Documentation

This documentation covers the overall architecture, file structure, and usage instructions for the application. This system is designed to handle machine learning workflows including training, testing, versioning, and production deployment.

Project Structure

  • data/ - Contains training and testing data used by the models. Managed with DVC.
  • models/ - Stores all trained models and the best performing model along with its metrics.
  • constants.py - Contains MLflow credentials for tracking experiments.
  • data.dvc - Stores the hash of data tracked by DVC using dvc add data.
  • delete.py - Script to delete all experiments recorded in MLflow.
  • downloader.py - Contains functions to save the best model from MLflow runs to best.pt and its metrics.
  • dvc.lock - Contains the hash of DVC steps used by dvc repro to ensure consistency across runs.
  • model.py - Defines the machine learning model and includes training procedures.
  • params.yaml - Configuration file for model training parameters.
  • server.py - Flask server for retraining models and deploying them in production.
  • utils.py - Utility functions used across various modules of the application.
  • requirements.txt - Specifies Python dependencies necessary for the project.
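As an illustration of what downloader.py is described as doing (picking the best run before saving it to best.pt), here is a minimal sketch of the selection step only. The `RunSummary` type, the `pick_best_run` helper, and the metric name `mAP50` are hypothetical stand-ins; the real script reads its runs from MLflow and may rank them differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RunSummary:
    """Hypothetical, simplified view of one tracked training run."""
    run_id: str
    metrics: Dict[str, float]


def pick_best_run(
    runs: List[RunSummary], metric: str, higher_is_better: bool = True
) -> Optional[RunSummary]:
    """Return the run with the best value of `metric`, skipping runs that lack it."""
    candidates = [run for run in runs if metric in run.metrics]
    if not candidates:
        return None
    chooser = max if higher_is_better else min
    return chooser(candidates, key=lambda run: run.metrics[metric])


runs = [
    RunSummary("run-a", {"mAP50": 0.71}),
    RunSummary("run-b", {"mAP50": 0.78}),
    RunSummary("run-c", {}),  # e.g. a run that stopped before logging metrics
]
best = pick_best_run(runs, metric="mAP50")
print(best.run_id)  # run-b
```

Skipping runs without the metric keeps a failed or interrupted run from being selected by accident.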

Environment Setup

  1. Create a Virtual Environment:

    python -m venv venv
    source venv/bin/activate  # On Unix/macOS
    venv\Scripts\activate  # On Windows
    
  2. Install Dependencies:

    pip install -r requirements.txt
    
  3. Reproduce Pipeline: Use DVC to manage and reproduce workflows:

    dvc repro
    

Server Endpoints

  • /retrain (POST): Initiates retraining of the model with the provided parameters. Accepts a JSON object with training configuration:
    {
      "batch": "<batch_size>",
      "epochs": "<number_of_epochs>",
      "optimizer": "<optimizer_type>",
      "lr0": "<initial_learning_rate>"
    }
    

The endpoint rewrites params.yaml with the provided values.
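The update step could look roughly like the sketch below. It assumes that params.yaml is a flat mapping of scalar values and that the four fields have the types shown; neither is confirmed by this document, and `apply_retrain_payload` and `render_flat_yaml` are hypothetical helpers, not functions from server.py.

```python
from typing import Any, Dict

# Assumed type for each field named in the /retrain payload.
RETRAIN_FIELDS = {"batch": int, "epochs": int, "optimizer": str, "lr0": float}


def apply_retrain_payload(current: Dict[str, Any], payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `current` with the supplied fields coerced and overridden."""
    unknown = set(payload) - set(RETRAIN_FIELDS)
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    updated = dict(current)
    for name, value in payload.items():
        try:
            updated[name] = RETRAIN_FIELDS[name](value)
        except (TypeError, ValueError) as exc:
            raise ValueError(f"invalid value for {name!r}: {value!r}") from exc
    return updated


def render_flat_yaml(params: Dict[str, Any]) -> str:
    """Render a flat mapping as `key: value` lines (simple scalars only, no quoting)."""
    return "".join(f"{key}: {value}\n" for key, value in params.items())


current = {"batch": 8, "epochs": 50, "optimizer": "SGD", "lr0": 0.01}
updated = apply_retrain_payload(current, {"epochs": "100", "lr0": "0.001"})
print(render_flat_yaml(updated))
```

Rejecting unknown fields and coercing types before writing keeps a malformed request from leaving params.yaml in a state that dvc repro cannot use. A real implementation would more likely use a YAML library than the hand-rolled renderer shown here.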

Real-Time Data Handling

  • frame_from_client: Handles real-time video frames sent by a web client. The client sends images in base64 format which are then converted back to an image format, processed, and fed into the model to generate predictions.
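The decoding step described above can be sketched with the standard library alone. The helper name `decode_frame` is hypothetical, and the handling of a `data:<mime>;base64,` prefix is an assumption about what the browser sends; the actual handler may receive bare base64.

```python
import base64
import binascii


def decode_frame(payload: str) -> bytes:
    """Decode a base64 frame, accepting an optional `data:<mime>;base64,` prefix."""
    if payload.startswith("data:"):
        header, sep, payload = payload.partition(",")
        if not sep or not header.endswith(";base64"):
            raise ValueError("unsupported data URL")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("frame is not valid base64") from exc


raw = b"\x89PNG\r\n\x1a\n"  # PNG signature standing in for real image bytes
encoded = "data:image/png;base64," + base64.b64encode(raw).decode("ascii")
decoded = decode_frame(encoded)
print(decoded == raw)  # True
```

The resulting bytes would then be handed to whatever image library the server uses before inference.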

Additional Notes

  • DVC Integration: Ensure that DVC is properly set up and configured to handle large data and model files.
  • MLflow Usage: Use MLflow for experiment tracking. Ensure constants.py is properly set up with your MLflow credentials.
  • Security Considerations: Secure the Flask server and ensure that data transmitted over the network is encrypted.
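One way to keep MLflow credentials out of version control is to have constants.py read them from the environment. The sketch below assumes that approach; the document does not say how constants.py is actually written, and `load_mlflow_settings` is a hypothetical helper. The three variable names are the ones MLflow itself reads for its tracking URI and HTTP basic authentication.

```python
import os
from typing import Dict, Mapping

REQUIRED = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def load_mlflow_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the MLflow settings from `env`, failing early if any is missing or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing MLflow settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}


# Example with an explicit mapping so the sketch runs without real credentials.
settings = load_mlflow_settings({
    "MLFLOW_TRACKING_URI": "https://example.invalid/mlflow",
    "MLFLOW_TRACKING_USERNAME": "user",
    "MLFLOW_TRACKING_PASSWORD": "secret",
})
print(sorted(settings))
```

Failing at import time with a clear message is usually easier to debug than an authentication error in the middle of a training run.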

Web Application Interface

The web application provides a user-friendly interface to interact with the machine learning model directly. It includes several buttons that facilitate different operations such as analyzing video frames and managing model training.

Interface Buttons

Analyze Frame

  • Functionality: Captures the current screen frame and sends it to the server in base64 format via a WebSocket connection.
  • Server Interaction: The server receives the frame, processes it through the model, and returns the detection results, including labels, confidence scores, and bounding box coordinates.
  • Client-Side Processing: Upon receiving the data, the web app draws rectangles on the video frame based on the bounding boxes to visually represent the detected objects.
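To make the shape of the exchange concrete, here is a minimal sketch of a detection message and of the box scaling needed when the displayed video is a different size from the frame that was sent. The field names, the `x1, y1, x2, y2` box layout, and both helper functions are assumptions for illustration; the actual message format is not specified in this document.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # x1, y1, x2, y2 in pixels of the sent frame


@dataclass
class Detection:
    label: str
    confidence: float
    box: Box


def to_message(detections: List[Detection]) -> str:
    """Serialize detections to the JSON text sent back over the socket."""
    return json.dumps({"detections": [asdict(d) for d in detections]})


def scale_box(box: Box, frame_size: Tuple[int, int], canvas_size: Tuple[int, int]) -> Box:
    """Map a box from frame pixels to canvas pixels (sizes are width, height)."""
    sx = canvas_size[0] / frame_size[0]
    sy = canvas_size[1] / frame_size[1]
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


message = to_message([Detection("person", 0.92, (10, 20, 110, 220))])
first = json.loads(message)["detections"][0]
scaled = scale_box(tuple(first["box"]), frame_size=(640, 480), canvas_size=(1280, 960))
print(first["label"], scaled)
```

In the web app itself the drawing happens in the browser; the Python version of `scale_box` only shows the arithmetic involved.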

Switch to Draw

  • Functionality: Enables a client-side mode in which users can interact with the video frame directly, for example by drawing on or annotating it. (The exact behavior of this mode still needs to be documented.)
  • Server Interaction: This button triggers client-side behavior only. It does not contact the server unless it is part of a larger feature not described here.

Retrain

  • Functionality: Sends a POST request to the /retrain endpoint with the current training configuration.
  • Server Interaction: The server uses the provided parameters to start the retraining pipeline, which covers data preprocessing, model training, and logging the new model's performance metrics to MLflow.
  • Feedback: The web interface reports when training has started and how it is progressing.
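The request the button issues can be reproduced from a script, which is handy for testing the endpoint without the web app. This sketch only builds the request; the base URL `http://localhost:5000`, the helper name `build_retrain_request`, and the numeric types of the fields are assumptions.

```python
import json
import urllib.request


def build_retrain_request(
    base_url: str, batch: int, epochs: int, optimizer: str, lr0: float
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /retrain endpoint."""
    body = json.dumps(
        {"batch": batch, "epochs": epochs, "optimizer": optimizer, "lr0": lr0}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/retrain",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_retrain_request(
    "http://localhost:5000", batch=16, epochs=100, optimizer="SGD", lr0=0.01
)
print(request.get_method(), request.full_url)
# Passing `request` to urllib.request.urlopen would send it; that call is
# left out so the sketch runs without a server.
```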
