Dataset Profiler

CSV input
Dataset summary
  • Rows: 7 (data: 6)
  • Columns: 7
  • Missing (overall): 2.4%
  • Duplicate rows: 0
  • Key duplicates: 0 (set keys in Options)
  • UTF-8 BOM: none
Column profiles
Column     Type      Non-null  Missing  Unique  Highlights
id         integer   6         0        6       min 1, p50 3.50, max 6, outliers 0
name       text      6         0        6       len 4~8 (avg 5.3)
age        integer   6         0        5       min 27, p50 29.50, max 31, outliers 0
city       text      6         0        4       len 6~13 (avg 8.2)
sales      float     5         1        5       min 75, p50 120.50, max 200, outliers 0
joined_at  datetime  6         0        5       sample 2023-01-03
active     boolean   6         0        2       sample true
JSON report
Options
  • Parsing
  • Report
  • Keys: provide comma-separated key columns to check for duplicate keys.

About Dataset Profiler

Dataset Profiler scans your CSV and builds a structured overview: dataset-level stats, per-column type inference, completeness, uniqueness, distributions, and data-quality warnings. Use it to quickly understand data shape and decide the next cleaning or analysis steps.
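
The dataset-level stats described above can be approximated with a few lines of standard-library Python. This is a minimal sketch, not the tool's actual implementation: the function name `dataset_overview`, the output keys, and the rule that a cell is "missing" when it is empty after trimming are all assumptions made for illustration.

```python
import csv
import io


def dataset_overview(raw: bytes) -> dict:
    """Dataset-level stats for CSV bytes (illustrative sketch only)."""
    # A UTF-8 byte order mark is the three bytes EF BB BF at the start.
    has_bom = raw.startswith(b"\xef\xbb\xbf")
    text = raw.decode("utf-8-sig")  # "utf-8-sig" strips the BOM if present

    # Skip completely blank lines; the first remaining row is the header.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        raise ValueError("no header row found")
    header, data = rows[0], rows[1:]

    # Assumption: every data row has the same width as the header,
    # and a cell is missing when it is empty after trimming whitespace.
    cells = len(data) * len(header)
    missing = sum(1 for row in data for cell in row if cell.strip() == "")

    # Exact duplicates: rows whose full tuple of values was already seen.
    duplicate_rows = len(data) - len({tuple(row) for row in data})

    return {
        "rows": len(rows),
        "data_rows": len(data),
        "columns": len(header),
        "missing_ratio": missing / cells if cells else 0.0,
        "duplicate_rows": duplicate_rows,
        "utf8_bom": has_bom,
    }


# Hypothetical sample input, not the data behind the report above.
sample = b"\xef\xbb\xbfid,name,sales\n1,Ann,10.5\n2,Bob,\n"
overview = dataset_overview(sample)
print(overview)
```

Counting the header separately from the data rows mirrors the "7 (data: 6)" style of the summary; whether the tool counts rows the same way is not confirmed by this page.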

What you’ll see

  • Dataset: row and column counts, overall missing ratio, duplicate rows, and whether a UTF-8 BOM is present.
  • Columns: inferred type (integer/float/boolean/datetime/categorical/text), non-null count, missing count, and unique count.
  • Numeric: min/max/mean/std and the p05/p25/p50/p75/p95 percentiles, with an IQR-based outlier count.
  • Categorical: Top-N values with frequencies, plus a high-cardinality warning.
  • Text: length distribution and simple pattern hints (URL-like or email-like values).
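
The numeric summary in the list above can be sketched as follows. The page does not say which percentile method or outlier fence the tool uses, so this sketch assumes linear-interpolation percentiles and the common 1.5 × IQR fences; `percentile` and `numeric_profile` are hypothetical helper names.

```python
import statistics


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("no values")
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def numeric_profile(values):
    """min/max/mean/std, selected percentiles, and an IQR-based outlier count."""
    vals = sorted(values)
    q = {f"p{p:02d}": percentile(vals, p) for p in (5, 25, 50, 75, 95)}

    # Assumption: outliers are values outside [p25 - 1.5*IQR, p75 + 1.5*IQR].
    iqr = q["p75"] - q["p25"]
    low, high = q["p25"] - 1.5 * iqr, q["p75"] + 1.5 * iqr

    return {
        "min": vals[0],
        "max": vals[-1],
        "mean": statistics.fmean(vals),
        # Sample standard deviation; undefined for a single value, so use 0.0.
        "std": statistics.stdev(vals) if len(vals) > 1 else 0.0,
        **q,
        "outliers": sum(1 for v in vals if v < low or v > high),
    }


profile = numeric_profile([1, 2, 3, 4, 5, 6])
print(profile["p50"], profile["outliers"])
```

For the values 1 through 6 this yields a p50 of 3.5 and no outliers, which agrees with the id row in the column profiles. That agreement is consistent with linear interpolation but does not confirm the tool's method.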

Tips

  • Set key columns to detect duplicate keys before joins or merges.
  • Tune the rule that decides when a column is treated as categorical (a threshold on the unique-value ratio or on the absolute unique count) to suit your dataset size.
  • Use the JSON report for downstream automation or to pipe into other tools.
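
The first two tips can be illustrated with a short sketch. Both helpers are hypothetical: `duplicate_keys` shows one way to find repeated key tuples before a join, and `looks_categorical` shows a ratio-or-count rule with made-up default thresholds. Neither reflects the tool's real option names or defaults.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return each key tuple that occurs more than once, with its count."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def looks_categorical(values, rule="ratio", threshold=0.5):
    """Decide whether a column should be treated as categorical.

    rule="ratio": unique / non-null count must be <= threshold.
    rule="count": number of unique values must be <= threshold.
    The rule names and thresholds are assumptions for illustration.
    """
    non_null = [v for v in values if v not in (None, "")]
    if not non_null:
        return False
    unique = len(set(non_null))
    if rule == "ratio":
        return unique / len(non_null) <= threshold
    if rule == "count":
        return unique <= threshold
    raise ValueError(f"unknown rule: {rule!r}")


# Hypothetical rows, not the data behind the report above.
rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Oslo"},
]
dupes = duplicate_keys(rows, ["id"])
print(dupes)
print(looks_categorical([r["city"] for r in rows], rule="count", threshold=2))
```

A ratio rule scales with the number of rows, while a count rule stays fixed, so on very small files a count rule will classify almost every column as categorical. That is one reason to adjust the rule to the dataset size.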