Alexander Levin 853f70f71e
Update README.md
3 years ago

Source Code Classification

nl2ml_notebook_parser.py - script that parses Kaggle notebooks and converts them to JSON, CSV, or pandas format.

bert_distances.ipynb - notebook with experiments on whether distances between BERT embeddings are meaningful when the input tokens are tokenized source code chunks.

bert_classifier.ipynb - notebook with the preprocessing and training pipeline for the BERT classifier.

regex.ipynb - notebook that creates labels for code chunks using regular expressions.

logreg_classifier.ipynb.ipynb - notebook that trains a logistic regression classifier on the regex labels using TF-IDF features.
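
To illustrate the parsing step listed above, here is a minimal sketch of pulling code chunks out of a notebook and writing them as CSV. It is not the code from nl2ml_notebook_parser.py: the function names `extract_code_chunks` and `chunks_to_csv`, the record fields, and the example notebook are all assumptions made for illustration. It relies only on the standard `.ipynb` JSON layout, where each entry in `cells` has a `cell_type` and a `source`.

```python
import csv
import io
import json
from typing import Dict, List


def extract_code_chunks(notebook_json: str) -> List[Dict]:
    """Return one record per non-empty code cell of an .ipynb document."""
    notebook = json.loads(notebook_json)
    chunks = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append({"cell_index": index, "code": source})
    return chunks


def chunks_to_csv(chunks: List[Dict]) -> str:
    """Serialize the chunk records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["cell_index", "code"])
    writer.writeheader()
    writer.writerows(chunks)
    return buffer.getvalue()


# Tiny hand-written notebook, used only for illustration.
EXAMPLE_NOTEBOOK = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Load data"]},
        {"cell_type": "code",
         "source": ["import pandas as pd\n", "df = pd.read_csv('train.csv')"]},
        {"cell_type": "code", "source": "df.head()"},
    ],
    "nbformat": 4,
    "nbformat_minor": 5,
})

chunks = extract_code_chunks(EXAMPLE_NOTEBOOK)
csv_text = chunks_to_csv(chunks)
```

The same list of records can be dumped with `json.dumps(chunks)` or passed to `pandas.DataFrame(chunks)` if pandas is installed.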

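The last two notebooks listed above (regex labels, then TF-IDF with logistic regression) can be sketched roughly as follows. This assumes scikit-learn is installed. The label names, regex patterns, and sample chunks are hypothetical, and the actual patterns and hyperparameters in regex.ipynb and the logreg notebook may differ.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical label patterns, invented for this sketch.
LABEL_PATTERNS = {
    "data_loading": re.compile(r"\bread_(csv|json|excel)\s*\("),
    "model_training": re.compile(r"\.fit\s*\("),
    "visualization": re.compile(r"\bplt\.\w+\s*\(|\.plot\s*\("),
}


def regex_label(code):
    """Return the first label whose pattern matches the chunk, else None."""
    for label, pattern in LABEL_PATTERNS.items():
        if pattern.search(code):
            return label
    return None


# Made-up code chunks standing in for parsed notebook cells.
sample_chunks = [
    "df = pd.read_csv('train.csv')",
    "test = pd.read_json('test.json')",
    "model.fit(X_train, y_train)",
    "clf = RandomForestClassifier().fit(X, y)",
    "plt.hist(df['age'])",
    "df['price'].plot()",
    "x = 1 + 1",  # matches no pattern, so it stays unlabeled
]

# Keep only the chunks that received a regex label.
labeled = [(code, regex_label(code)) for code in sample_chunks]
labeled = [(code, label) for code, label in labeled if label is not None]
texts, labels = zip(*labeled)

# TF-IDF over identifier-like tokens, followed by logistic regression.
classifier = make_pipeline(
    TfidfVectorizer(token_pattern=r"[A-Za-z_]\w*", lowercase=False),
    LogisticRegression(max_iter=1000),
)
classifier.fit(list(texts), list(labels))

prediction = classifier.predict(["data = pd.read_csv('other.csv')"])[0]
```

Since the classifier is trained on labels that the regexes produced, it is mainly useful for chunks the patterns miss. It should be evaluated on held-out, manually checked data.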
