Traceix
IPFS Datasets

AI Security Lab

A guided lab to peer review AI training datasets. You’ll pick a dataset release (TUUID), fetch 2–5 samples, compare “what the AI saw”, and export a peer review report you can share or archive.

What am I looking at?

Each sample record includes normal metadata + a field named decrypted_training_data. That training data is a structured set of features (numbers/flags/fields) used by the model.

This lab helps you verify: (1) schema consistency, (2) sane numeric ranges, (3) label plausibility, (4) duplicates/outliers, and (5) provenance fields.
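The five checks above can be sketched as a small validator. This is a minimal sketch, assuming each record is a dict with a `decrypted_training_data` mapping of feature name to number; the required provenance field names and thresholds here are illustrative assumptions, not the lab's actual schema.

```python
# Sketch of checks (1), (2), and (5); label plausibility (3) and
# duplicate detection (4) need cross-sample context, so they live in
# the comparison step. Field names and thresholds are assumptions.

REQUIRED_META = ("license", "upload_timestamp", "model_version")

def review_sample(record, reference_keys):
    """Return a list of findings for one sample record."""
    findings = []
    feats = record.get("decrypted_training_data", {})
    # (1) schema consistency: same feature keys as a reference sample
    missing = reference_keys - set(feats)
    if missing:
        findings.append(f"missing feature keys: {sorted(missing)}")
    # (2) sane numeric ranges: entropy-like fields should sit in 0..8
    for key, value in feats.items():
        if "entropy" in key and not 0 <= value <= 8:
            findings.append(f"{key}={value} outside 0-8")
    # (5) provenance fields present
    for field in REQUIRED_META:
        if field not in record:
            findings.append(f"no {field}")
    return findings
```

A clean record returns an empty list; anything else is a candidate note for your report.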

Progress
Tip: Use the big buttons. If something is locked, it will tell you exactly what to do next.
Peer Review Steps
Do these in order. The big Next button stays disabled until you're ready.
Current step: 1. Pick a dataset to begin.
Step 1 — Pick a dataset release (TUUID)
TUUIDs load automatically. Choose one and we’ll load its hash list.
Quick start
1) Pick a dataset in the dropdown • 2) Click Load dataset • 3) Click Next
What is a TUUID?

A TUUID is a dataset release ID. Each release contains a list of sample hashes. You pick a release, then you pick a few sample hashes from it to review.

Your goal is to sanity-check what’s inside: consistent schema, plausible values, and label/provenance quality.
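The release-to-hash-list structure described above can be sketched as a tiny data shape. The field names below are illustrative assumptions about the record layout, not the lab's actual API.

```python
# Hypothetical shape of a dataset release: a TUUID plus its sample hashes.
release = {
    "tuuid": "example-tuuid",  # placeholder release ID
    "hashes": [
        "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b",
    ],
}

def pick_samples(release, k=2):
    """Pick the first k hashes from a release, clamped to the lab's 2-5 range."""
    k = max(2, min(k, 5))
    return release["hashes"][:k]
```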

Step 2 — Pick 2–5 sample hashes
Use Random 2 if you don’t know what to pick.
Hashes in this dataset
Selected: 0 / 5
Minimum is 2.
How to choose samples

Easy mode: click Random 2 and move on.

Better: pick one “normal-looking” and one “odd-looking” sample after you fetch them and inspect the values.

Finding: if multiple hashes map to the same CID later, that’s worth noting in the report.

Step 3 — Fetch sample records
For each hash: resolve CID → fetch dataset record. Retries on 429/5xx.
Pipeline per hash: ipfs/find → ipfs/search.
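The retry behavior described here (retries on 429/5xx) can be sketched with a pure retry predicate plus exponential backoff. This is a sketch, not the lab's implementation; the status-code policy is the only part taken from the text above.

```python
import time
import urllib.request
import urllib.error

def should_retry(status):
    """429 (rate limited) and 5xx (server error) are transient; retry those."""
    return status == 429 or 500 <= status < 600

def fetch_with_retry(url, max_attempts=4, base_delay=1.0):
    """GET a URL, retrying transient failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if should_retry(err.code):
                time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
                continue
            raise  # 4xx other than 429: not transient, surface immediately
    raise RuntimeError(f"gave up after {max_attempts} attempts: {url}")
```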
What should I watch for?

After fetch, you’ll review:

  • Missing fields / inconsistent schema
  • Outliers (entropy, counts, sizes)
  • Verdict vs capabilities mismatch
  • Duplicates (same CID across hashes)
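The last item, duplicates, is easy to check mechanically once records are fetched. A minimal sketch, assuming fetched records are keyed by SHA-256 and each carries a `cid` field (that field name is this sketch's assumption):

```python
from collections import defaultdict

def find_duplicate_cids(records):
    """Group sample hashes by CID; any CID shared by >1 hash is a duplicate."""
    by_cid = defaultdict(list)
    for sha, record in records.items():
        by_cid[record.get("cid")].append(sha)
    return {cid: shas for cid, shas in by_cid.items() if len(shas) > 1}
```

Any non-empty result is exactly the "multiple hashes map to the same CID" finding worth noting in your report.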
Quests
Tiny tasks that teach peer review.
Score: 0
Fetched samples
Add notes per SHA. Then choose left/right and click Compare.
SHA-256 • CID • Verdict • Model • Capabilities • Notes
Comparison (with explanations)
Tabs show different “views” of the same training features. Use the explanations to learn what each view means.
What “Vector + keys” means
The model can’t read “files” directly. It uses a list of numeric features (a vector). Here you can see which keys were used and what values went into the vector.
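The key-to-vector step described above can be sketched in a few lines. Note how a missing key silently becomes 0.0, which is why the schema-consistency check matters; the default value is this sketch's assumption.

```python
def vectorize(features, key_order):
    """Turn a {key: number} mapping into a fixed-order numeric vector.

    Missing keys default to 0.0 -- a silent substitution that can bias
    a model, which is exactly why you check key consistency first.
    """
    return [float(features.get(k, 0.0)) for k in key_order]
```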
Left
Verdict
File type
Feature preview
Select and compare two samples.
Capabilities
Right
Verdict
File type
Feature preview
Select and compare two samples.
Capabilities
Key differences
Shows numeric fields with the largest differences. Use this to spot outliers fast.
Showing
Feature • Left • Right • Δ • Why it matters
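The "largest differences first" ranking in this view can be sketched as follows; this is a sketch of the ranking idea, not the lab's code.

```python
def key_differences(left, right, top=5):
    """Rank features shared by both samples by absolute delta, largest first."""
    shared = set(left) & set(right)
    deltas = [(k, left[k] - right[k]) for k in shared]
    deltas.sort(key=lambda kv: abs(kv[1]), reverse=True)
    return deltas[:top]
```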
What “Core metrics” means
These are common structural fields (sizes, counts). Big anomalies often indicate packing, corruption, or unusual builds.
Left snapshot
Compare two samples to render.
Right snapshot
Compare two samples to render.
Side-by-side bars
Bars are scaled per metric to max(left, right), so each row shows relative size, not a shared absolute scale.
Compare two samples to render.
What “Entropy” means
Shannon entropy (0 to 8 bits per byte) tends to rise with packing, compression, or encryption. Spikes in individual sections or resources are common review targets.
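For reference, the 0-to-8 scale comes from Shannon entropy over byte values: 0 for constant data, 8 for uniformly random bytes. A minimal sketch:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0 = constant, 8 = uniform random."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

Packed or encrypted regions push this toward 8, which is why per-section entropy spikes are worth a note.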
Entropy-focused bars
Compare two samples to render.
What “Imports/exports” means
Very low imports can indicate packing/stub loaders. Very high counts can indicate heavy linkage or unusual compilation.
Imports/exports + structure
Compare two samples to render.
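The low/high import heuristic above can be sketched as a simple flag function. The thresholds here are illustrative assumptions, not the lab's cutoffs.

```python
def import_count_flag(n_imports, low=3, high=500):
    """Heuristic flag for import counts; thresholds are illustrative only."""
    if n_imports <= low:
        return "suspiciously low (possible packing/stub loader)"
    if n_imports >= high:
        return "unusually high (heavy linkage or unusual compilation)"
    return "typical"
```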
What “Capabilities” means
Capabilities are extracted behaviors/traits. If the model verdict and capabilities strongly disagree, that’s a strong peer-review note.
Overlap summary
Intersection vs unique capabilities.
Compare two samples to render.
Unique capabilities
What appears on one side only.
Compare two samples to render.
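The overlap summary and unique-capabilities views above are plain set operations. A sketch, assuming capabilities arrive as lists of strings:

```python
def capability_overlap(left_caps, right_caps):
    """Intersection plus each side's unique capabilities, as sorted lists."""
    left, right = set(left_caps), set(right_caps)
    return {
        "shared": sorted(left & right),
        "left_only": sorted(left - right),
        "right_only": sorted(right - left),
    }
```

A verdict that disagrees with a large `shared`/unique capability set is the mismatch the lab asks you to flag.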
Notes reminder
Add notes in the samples table above (per SHA). Notes export into your report.
Step 5 unlocks after you run a comparison. You still move forward only when you click Next.
Step 5 — Export + build your own tooling
Export your report and copy starter code (Python + JS) for your own parser.
Python: fetch + vectorize
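A hedged starter sketch for this section. The base URL and the `ipfs/find` route are placeholders (substitute your deployment's endpoints), and the `decrypted_training_data` field follows the record shape described earlier in this lab.

```python
# Starter sketch: fetch sample records and vectorize their features.
# BASE_URL and the route are placeholders, not real endpoints.
import json
import urllib.request

BASE_URL = "https://example.invalid/api"  # placeholder endpoint

def fetch_record(sha256):
    """Fetch one sample record by SHA-256 (hypothetical route)."""
    url = f"{BASE_URL}/ipfs/find/{sha256}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def to_vector(record, key_order):
    """Project a record's training features into a fixed-order vector."""
    feats = record.get("decrypted_training_data", {})
    return [float(feats.get(k, 0.0)) for k in key_order]
```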
JavaScript: fetch + pacing
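A hedged starter sketch for this section, using the global `fetch` available in browsers and Node 18+. "Pacing" here means a fixed delay between requests so gateway rate limits (429) are less likely to trigger; the base URL and route are placeholders.

```javascript
// Starter sketch: fetch sample records with pacing between requests.
// BASE_URL and the route are placeholders, not real endpoints.
const BASE_URL = "https://example.invalid/api";

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchPaced(hashes, delayMs = 500) {
  const records = [];
  for (const sha of hashes) {
    const resp = await fetch(`${BASE_URL}/ipfs/find/${sha}`); // hypothetical route
    if (!resp.ok) throw new Error(`HTTP ${resp.status} for ${sha}`);
    records.push(await resp.json());
    await sleep(delayMs); // pacing: wait before the next request
  }
  return records;
}
```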
Peer review checklist
  • Feature keys consistent across samples? Missing keys can bias a model.
  • Numeric ranges sane? (entropy typically 0–8, counts not absurdly high)
  • Verdict plausible vs capabilities?
  • Duplicates/near-duplicates (same CID / repeated hash)?
  • Provenance present (license, upload timestamp, model version, tx where relevant)?
Exports a JSON summary of your selection, fetched records, notes, and compare settings.
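The export described here can be sketched as a single assembly function. The field names below are illustrative assumptions about the report layout, not the lab's exact export schema.

```python
import json
from datetime import datetime, timezone

def build_report(tuuid, selected_hashes, records, notes, compare_settings):
    """Assemble a shareable peer-review export (field names illustrative)."""
    return {
        "tuuid": tuuid,
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "selection": selected_hashes,
        "records": records,
        "notes": notes,        # per-SHA notes from the samples table
        "compare": compare_settings,
    }

# json.dumps(build_report(...), indent=2) yields the archivable JSON.
```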