get_data.py 567 B

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import numpy as np
import os

seed = 42

# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=100000, random_state=seed)

# Make a train/test split (train_test_split defaults to 75% train, 25% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=seed)

# Save it (np.savetxt writes space-delimited values by default, despite the .csv names)
if not os.path.isdir("data"):
    os.mkdir("data")
np.savetxt("data/train_features.csv", X_train)
np.savetxt("data/test_features.csv", X_test)
np.savetxt("data/train_labels.csv", y_train)
np.savetxt("data/test_labels.csv", y_test)
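
The files written by this script can be read back with `np.loadtxt`, which splits on whitespace by default and so matches the default `np.savetxt` output. The following is a minimal sketch, not part of the original file: `load_split` is a hypothetical helper name, and the cast of labels to `int` is an assumption based on `np.savetxt` writing every value in floating-point format.

```python
import os

import numpy as np


def load_split(data_dir="data"):
    """Load the four arrays saved by get_data.py from data_dir (hypothetical helper)."""
    # ndmin=2 keeps features two-dimensional even if a file holds a single row.
    X_train = np.loadtxt(os.path.join(data_dir, "train_features.csv"), ndmin=2)
    X_test = np.loadtxt(os.path.join(data_dir, "test_features.csv"), ndmin=2)
    # Labels come back as floats because np.savetxt formats them that way;
    # cast to int so they can be used as class indices.
    y_train = np.loadtxt(os.path.join(data_dir, "train_labels.csv"), ndmin=1).astype(int)
    y_test = np.loadtxt(os.path.join(data_dir, "test_labels.csv"), ndmin=1).astype(int)
    return X_train, X_test, y_train, y_test
```

Calling `load_split()` after running the script should return the same train and test arrays that were saved, with labels as integers.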