
README.md

language: [en]
tags: [student performance tabular_classification multiclass_classification UCI]
pretty_name: Diamond
size_categories: [10K<n<100K]
task_categories: [tabular-classification]
configs: [encoding cut cut_binary]
license: cc

Diamonds

The Diamonds dataset from Kaggle. It collects properties of cut diamonds, with the goal of determining the cut quality.

Configurations and tasks

| Configuration | Task | Description |
|---|---|---|
| encoding | | Encoding dictionary showing original values of encoded features. |
| cut | Multiclass classification | Predict the cut quality of the diamond. |
| cut_binary | Binary classification | Is the cut quality at least very good? |
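
The relationship between the `cut` and `cut_binary` tasks can be sketched in a few lines. This is only an illustration, not the dataset's own preprocessing: it assumes the usual quality ordering of the diamonds data (Fair, Good, Very Good, Premium, Ideal), and the names `CUT_ORDER`, `THRESHOLD` and `cut_to_binary` are hypothetical. The integer codes actually stored in the `cut` column are not listed here; the `encoding` configuration is the place to look them up.

```python
# Usual quality ordering of the diamonds data, from worst to best.
# Assumption: this sketch works on the original cut names, not on the
# integer codes stored in the dataset (see the `encoding` configuration).
CUT_ORDER = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
THRESHOLD = "Very Good"


def cut_to_binary(cut_name: str) -> int:
    """Return 1 if the cut quality is at least THRESHOLD, else 0."""
    return int(CUT_ORDER.index(cut_name) >= CUT_ORDER.index(THRESHOLD))


labels = {name: cut_to_binary(name) for name in CUT_ORDER}
print(labels)
# {'Fair': 0, 'Good': 0, 'Very Good': 1, 'Premium': 1, 'Ideal': 1}
```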

Usage

```python
from datasets import load_dataset

dataset = load_dataset("mstz/diamonds", "cut")["train"]
```
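
The same call works for the other configurations by changing the second argument. A small wrapper, sketched below, checks the configuration name before touching the network. The helper `load_diamonds` and the constant `KNOWN_CONFIGS` are hypothetical names introduced for this example, and it assumes that every configuration exposes a `train` split the way `cut` does above.

```python
# Configuration names taken from the table in "Configurations and tasks".
KNOWN_CONFIGS = ("encoding", "cut", "cut_binary")


def load_diamonds(config: str = "cut", split: str = "train"):
    """Load one configuration of mstz/diamonds after validating its name."""
    if config not in KNOWN_CONFIGS:
        raise ValueError(
            f"unknown configuration {config!r}; expected one of {KNOWN_CONFIGS}"
        )
    # Imported lazily so that a typo in the name fails before any download.
    from datasets import load_dataset

    # Assumption: each configuration has a "train" split, as `cut` does.
    return load_dataset("mstz/diamonds", config)[split]
```

For example, `load_diamonds("cut_binary")` would fetch the binary classification task, provided the `datasets` package is installed and the Hub is reachable.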

Features

| Feature | Type |
|---|---|
| carat | float32 |
| color | string |
| clarity | float32 |
| depth | float32 |
| table | float32 |
| price | float32 |
| observation_point_on_axis_x | float32 |
| observation_point_on_axis_y | float32 |
| observation_point_on_axis_z | float32 |
| cut | int8 |
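
When feeding the `cut` configuration to a model, the `cut` column is the target and the remaining columns are inputs. The sketch below shows one way to separate them for a single row. The names `FEATURE_TYPES`, `TARGET` and `split_row` are hypothetical, the dataset's dtypes are mapped loosely onto Python types (float32 to `float`, string to `str`, int8 to `int`), and the example row holds made-up placeholder values, not a real record.

```python
# Input columns from the table above, mapped loosely onto Python types.
FEATURE_TYPES = {
    "carat": float,
    "color": str,
    "clarity": float,
    "depth": float,
    "table": float,
    "price": float,
    "observation_point_on_axis_x": float,
    "observation_point_on_axis_y": float,
    "observation_point_on_axis_z": float,
}
TARGET = "cut"


def split_row(row: dict):
    """Split one row into (features, target), casting each value."""
    missing = (set(FEATURE_TYPES) | {TARGET}) - set(row)
    if missing:
        raise KeyError(f"row is missing columns: {sorted(missing)}")
    features = {name: cast(row[name]) for name, cast in FEATURE_TYPES.items()}
    return features, int(row[TARGET])


# Placeholder values for illustration only, not taken from the dataset.
example_row = {name: 1.0 for name in FEATURE_TYPES}
example_row["color"] = "E"
example_row[TARGET] = 2

features, target = split_row(example_row)
print(len(features), target)  # 9 2
```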