split.py 502 B

import pandas as pd
from dvc.api import params_show

from preprocess import change_dtype, split

if __name__ == "__main__":
    # Read the DVC parameters once and reuse them for paths and the split ratio.
    params = params_show()
    PATHS = params['PATHS']

    df = pd.read_csv(PATHS['preprocessed_data'], delimiter=',')
    change_dtype(df)  # return value unused, as in the original call
    df.info()  # info() prints its summary itself and returns None

    # split() is expected to return positional indices, since they are used with iloc.
    idx_train, idx_test = split(df, test_size=params['split']['ratio'])
    train_df = df.iloc[idx_train]
    test_df = df.iloc[idx_test]

    train_df.to_csv(PATHS['train'], index=False)
    test_df.to_csv(PATHS['test'], index=False)