
README.md


rxitect

Tools used in this project

Project structure

.
├── config                      
│   ├── main.yaml                   # Main configuration file
│   ├── model                       # Configurations for training model
│   │   ├── model1.yaml             # First variation of parameters to train model
│   │   └── model2.yaml             # Second variation of parameters to train model
│   └── process                     # Configurations for processing data
│       ├── process1.yaml           # First variation of parameters to process data
│       └── process2.yaml           # Second variation of parameters to process data
├── data            
│   ├── final                       # data after training the model
│   ├── processed                   # data after processing
│   ├── raw                         # raw data
│   └── raw.dvc                     # DVC file of data/raw
├── docs                            # documentation for your project
├── dvc.yaml                        # DVC pipeline
├── .flake8                         # configuration for flake8, a Python linter
├── .gitignore                      # files and folders that Git should ignore
├── Makefile                        # store useful commands to set up the environment
├── models                          # store models
├── notebooks                       # store notebooks
├── .pre-commit-config.yaml         # configurations for pre-commit
├── pyproject.toml                  # project dependencies, managed by Poetry
├── README.md                       # describe your project
├── src                             # store source code
│   ├── __init__.py                 # make src a Python package
│   ├── process.py                  # process data before training model
│   └── train_model.py              # train model
└── tests                           # store tests
    ├── __init__.py                 # make tests a Python package
    ├── test_process.py             # test functions for process.py
    └── test_train_model.py         # test functions for train_model.py
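
As a quick sanity check of the layout above, the following is a minimal sketch that reports which of the expected top-level paths are missing from a checkout. The `EXPECTED_PATHS` tuple and the `missing_paths` helper are hypothetical names chosen for this illustration; they are not part of the project.

```python
from pathlib import Path
from typing import Iterable, List

# A subset of the paths shown in the tree above (illustrative, not exhaustive).
EXPECTED_PATHS = (
    "config/main.yaml",
    "data",
    "dvc.yaml",
    "pyproject.toml",
    "src",
    "tests",
)


def missing_paths(root: Path, expected: Iterable[str] = EXPECTED_PATHS) -> List[str]:
    """Return the expected relative paths that do not exist under root."""
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Check the current working directory, assumed to be the repository root.
    for rel in missing_paths(Path.cwd()):
        print(f"missing: {rel}")
```

Running the script from the repository root prints one line per missing path and nothing if the layout is complete.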

Set up the environment

  1. Install Poetry
  2. Set up the environment:
make activate
make setup

Install new packages

To install new PyPI packages, run:

poetry add <package-name>

Run the entire pipeline

To run the entire pipeline, type:

dvc repro
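
If you prefer to trigger the pipeline from Python (for example, from a notebook or a scheduled job), the sketch below wraps the `dvc repro` command with `subprocess`. The helper names `build_repro_command` and `run_pipeline` are hypothetical, and the sketch assumes the `dvc` executable is on your `PATH`.

```python
import shutil
import subprocess
from typing import List, Optional


def build_repro_command(stage: Optional[str] = None) -> List[str]:
    """Return the argument list for `dvc repro`, optionally limited to one stage."""
    command = ["dvc", "repro"]
    if stage is not None:
        # A stage name is assumed to match an entry under `stages` in dvc.yaml.
        command.append(stage)
    return command


def run_pipeline(stage: Optional[str] = None) -> int:
    """Run the pipeline and return DVC's exit code."""
    if shutil.which("dvc") is None:
        raise RuntimeError("dvc executable not found on PATH")
    # check=False so the caller can decide how to handle a failed run.
    return subprocess.run(build_repro_command(stage), check=False).returncode


if __name__ == "__main__":
    # Only show the command here; call run_pipeline() to actually execute it.
    print(" ".join(build_repro_command()))
```

Passing a stage name to `run_pipeline` reproduces that stage and its dependencies rather than the whole pipeline.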

Version your data

Read this article on how to use DVC to version your data.

Start by setting up remote storage, which is where DVC keeps the actual data files. Supported remotes include DagsHub, Google Drive, Amazon S3, Azure Blob Storage, Google Cloud Storage, Aliyun OSS, SSH, HDFS, and HTTP. The command below registers a remote named "remote" and makes it the default:

dvc remote add -d remote <REMOTE-URL>

Commit the config file:

git commit .dvc/config -m "Configure remote storage"

Push the data to remote storage:

dvc push 

Add and push all changes to Git:

git add .
git commit -m 'commit-message'
git push origin <branch>
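
The order of these steps matters: the remote must be configured before `dvc push`, and the data should be pushed before the Git commit that references it is shared. As a rough sketch, the whole sequence can be assembled in Python as a list of commands. The function name `versioning_commands` and the example URL, branch, and commit message are placeholders for illustration, not values from this project.

```python
import shlex
from typing import List


def versioning_commands(remote_url: str, branch: str, message: str) -> List[List[str]]:
    """Return the DVC and Git commands from this section, in execution order."""
    return [
        # Register a default DVC remote named "remote".
        ["dvc", "remote", "add", "-d", "remote", remote_url],
        # Record the remote configuration in Git.
        ["git", "commit", ".dvc/config", "-m", "Configure remote storage"],
        # Upload the DVC-tracked data to the remote.
        ["dvc", "push"],
        # Commit and push everything else (code, .dvc files, dvc.lock).
        ["git", "add", "."],
        ["git", "commit", "-m", message],
        ["git", "push", "origin", branch],
    ]


if __name__ == "__main__":
    # Placeholder arguments; substitute your own remote URL and branch.
    for cmd in versioning_commands("s3://example-bucket/data", "main", "Update data"):
        print(shlex.join(cmd))
```

The script only prints the commands; to execute them, pass each list to `subprocess.run(cmd, check=True)` so the sequence stops at the first failure.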

Auto-generate API documentation

To auto-generate API documentation for your project, run:

make docs

About

A deep reinforcement learning-based drug molecule generator focused on generation of molecules using SELFIES to exploit the guarantee of valid molecular structures.
