DVC and Data Management Crash Course

DVC is an open-source data versioning tool created specifically with data scientists and machine learning teams in mind. Essentially, DVC is to data as Git is to code.
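To make the analogy concrete, here is a minimal sketch of the general idea behind data versioning: the large file is stored in a content-addressed cache, and only a small pointer file recording its hash needs to be committed to Git. This is an illustration under simplifying assumptions, not DVC's actual implementation; the function names `track` and `restore` and the `.pointer.json` file format are hypothetical and chosen only for this example.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into a hash-keyed cache and write a small pointer file.

    Hypothetical helper: the pointer is what would be committed to Git
    in place of the (potentially large) data file.
    """
    digest = hashlib.md5(data_file.read_bytes()).hexdigest()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / digest
    if not cached.exists():  # identical content is stored only once
        shutil.copyfile(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".pointer.json")
    pointer.write_text(json.dumps({"md5": digest, "path": data_file.name}))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file next to its pointer from the cached copy."""
    meta = json.loads(pointer.read_text())
    target = pointer.with_name(meta["path"])
    shutil.copyfile(cache_dir / meta["md5"], target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "data.csv"
        data.write_bytes(b"a,b\n1,2\n")

        pointer = track(data, root / "cache")
        data.unlink()  # simulate a fresh checkout that has only the pointer
        restored = restore(pointer, root / "cache")
        print(restored.read_bytes() == b"a,b\n1,2\n")
```

In this sketch the cache plays the role of a storage remote: anyone who has the pointer file and access to the cache can recover the exact version of the data that the pointer describes.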

DagsHub's integration with DVC enables you to connect your remote DVC storage to DagsHub and view DVC-versioned files alongside your code files. We provide a built-in, zero-configuration DVC remote with every DagsHub repo.

Additionally, we launched our Data Streaming & Upload capabilities, a component of the DagsHub client and API libraries that enables users to stream DVC-versioned data from, and upload it to, any DagsHub project. More importantly, DagsHub's data access tools don't require any changes to your project or dataset.

In this workshop, you will learn:

  • What is DVC, and how does it work?
  • Basic DVC commands
  • How to stream & upload DVC-versioned data using DagsHub's data access tools

Materials