Type: dataset
Data Domain: nlp
Integration: dvc, git, github
hlib 34e81dc3d2
Merge branch 'master' of https://github.com/giganticode/datasets
4 years ago
7703a7a308
add google drive remote
4 years ago
0e666e309a
add stage for computing the stats for devanbu small corpus
4 years ago
71e95fd455
improvements to pre-processing stage: track also the resulting vocab; use a separate venv to run codeprep; extract codeprep version with yq
4 years ago
47542df864
add stage for extracting devanbu small corpus
4 years ago
8667f6c94a
Initial commit
4 years ago