Data Science Cookie Cutter for Prefect

Why Should You Use This Template?

This template is the result of years spent refining how to structure a data science project so that it is reproducible and maintainable.

This template allows you to:

:white_check_mark: Create a readable structure for your project

:white_check_mark: Automatically run tests when committing your code

:white_check_mark: Enforce type hints at runtime

:white_check_mark: Check issues in your code before committing

:white_check_mark: Efficiently manage the dependencies in your project

:white_check_mark: Create short and readable commands for repeatable tasks

:white_check_mark: Rerun only modified components of a pipeline

:white_check_mark: Automatically document your code

:white_check_mark: Observe and automate your code
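To make "enforce type hints at runtime" concrete, here is a minimal standard-library sketch of the idea. It is not the template's own mechanism: the decorator `enforce_types` and the function `n_test_rows` are hypothetical names invented for this illustration, and the check only covers plain class annotations such as `int` or `float`.

```python
import functools
import inspect
from typing import get_origin, get_type_hints


def enforce_types(func):
    """Raise TypeError when an argument does not match a plain class annotation."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only check plain classes; parameterized generics such as list[int] are skipped.
            is_plain_class = isinstance(expected, type) and get_origin(expected) is None
            if is_plain_class and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


# Hypothetical function used only to demonstrate the decorator.
@enforce_types
def n_test_rows(n_rows: int, test_fraction: float) -> int:
    return int(n_rows * test_fraction)
```

Calling `n_test_rows(100, 0.25)` succeeds, while `n_test_rows("100", 0.25)` raises a `TypeError` before the function body runs. A real project would more likely rely on a dedicated validation library than on a hand-written decorator like this one.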

Tools used in this project

Project structure

.
├── data            
│   ├── final                       # data after training the model
│   ├── processed                   # data after processing
│   └── raw                         # raw data
├── docs                            # documentation for your project
├── .flake8                         # configuration for flake8 - a Python linting tool
├── .gitignore                      # files that should not be committed to Git
├── Makefile                        # store useful commands to set up the environment
├── models                          # store models
├── notebooks                       # store notebooks
├── .pre-commit-config.yaml         # configurations for pre-commit
├── pyproject.toml                  # dependencies for poetry
├── README.md                       # describe your project
├── src                             # store source code
│   ├── __init__.py                 # make src a Python package
│   ├── config.py                   # store configs 
│   ├── process.py                  # process data before training model
│   ├── run_notebook.py             # run notebook
│   └── train_model.py              # train model
└── tests                           # store tests
    ├── __init__.py                 # make tests a Python package
    ├── test_process.py             # test functions for process.py
    └── test_train_model.py         # test functions for train_model.py
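As a rough illustration of how `src/process.py` and `tests/test_process.py` in the tree above might pair up, the sketch below puts a processing function next to its test. The function `drop_missing_rows` and the sample rows are hypothetical and are not taken from the template itself.

```python
# Hypothetical contents of src/process.py
def drop_missing_rows(rows: list[dict]) -> list[dict]:
    """Keep only the rows in which no value is None."""
    return [row for row in rows if all(value is not None for value in row.values())]


# Hypothetical contents of tests/test_process.py
# (in the real layout this file would import drop_missing_rows from src.process)
def test_drop_missing_rows():
    rows = [
        {"age": 31, "income": 52000},
        {"age": None, "income": 48000},
    ]
    assert drop_missing_rows(rows) == [{"age": 31, "income": 52000}]
```

Keeping one test file per source file, as the tree does, makes it easy to find the tests that cover a given module.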

How to use this project

Install Cookiecutter:

pip install cookiecutter

Create a project based on the template:

cookiecutter https://github.com/khuyentran1401/data-science-template --checkout prefect-poetry
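If you want to script the step above, for example from a setup script, the same command can be assembled and run from Python. This is only a sketch: `build_command` and `create_project` are hypothetical helper names, and it assumes the `cookiecutter` executable is already installed and on your `PATH`.

```python
import subprocess

TEMPLATE_URL = "https://github.com/khuyentran1401/data-science-template"
BRANCH = "prefect-poetry"


def build_command(template: str = TEMPLATE_URL, checkout: str = BRANCH) -> list[str]:
    """Return the cookiecutter command shown above as an argument list."""
    return ["cookiecutter", template, "--checkout", checkout]


def create_project() -> None:
    """Run cookiecutter; it prompts interactively for the project settings."""
    subprocess.run(build_command(), check=True)
```

Calling `create_project()` behaves like typing the command in a terminal, and `check=True` raises an error if cookiecutter exits with a non-zero status.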
