Output & Quality

One fold, many views.

A single pass of the engine produces four things at once: the structure, an explainable quality report, a value-data layer that single-structure predictors never give you, and an interactive HTML report. None of them changes the physics: they are all read-only views of the same simulation.

Read-only by design · Blind, no leakage · Audit-ready

The principle

Every number you can act on was earned blind.

One simulation, many lenses

Nothing here is a second guess.

The report, the value layer and the HTML are not separate models run after the fold — they are different lenses on the very same physical trajectory. Read them, slice them, export them; the structure is byte-for-byte unchanged.

The leakage wall

The answer never leaks in.

Almost every output is blind — it never touches a known answer. Native-derived metrics (Cα-RMSD, TM-score, contact F1) exist only when you supply a reference structure, and they are used purely for scoring. They feed nothing back into the fold.

The bundle

A. What lands in your output folder.

Every file shares the stem of your --out path — point it at run/1CRN.pdb and you get 1CRN.json, 1CRN.html, 1CRN.per_residue.csv and the rest alongside it. No naming to manage, no flags to enable.
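The shared-stem rule can be sketched in a few lines of Python. This is an illustration only: the helper name `bundle_paths` and the suffix list are assumptions assembled from the file names mentioned on this page, not part of the engine, and the exact set of files on disk depends on how you ran.

```python
from pathlib import Path

# Suffixes named on this page (assumed list; ensemble and roll-up files
# only appear in best-of / batch runs and are left out here).
BUNDLE_SUFFIXES = [".pdb", ".json", ".md", ".html",
                   ".per_residue.csv", ".trajectory.pdb", ".pml"]


def bundle_paths(out: str) -> dict[str, Path]:
    """Map each suffix to the sibling file that shares the stem of `out`."""
    out_path = Path(out)
    stem = out_path.stem  # "1CRN" for "run/1CRN.pdb"
    return {suffix: out_path.with_name(stem + suffix)
            for suffix in BUNDLE_SUFFIXES}


paths = bundle_paths("run/1CRN.pdb")
for suffix, path in paths.items():
    print(suffix, "->", path)
```

A helper like this is handy in downstream scripts that need to locate the report or the per-residue track from nothing more than the --out value.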

<stem>.pdb

The structure

The refined all-atom model. Its B-factor column carries the per-residue confidence.

always

.json · .md

The report

Every metric, the verdict, the physics validity and the full explainability ladder — machine and human readable.

always

.html

Interactive report

3D viewer, per-residue maps and contact-map charts in one self-contained page. No server needed.

always

.per_residue.csv

Per-residue track

One row per amino acid: secondary structure, exposure, burial, flexibility, confidence.

always

.hbonds · .contact_map

Network & topology

The backbone H-bond network and the predicted Cα–Cα contact map, as CSV.

always

.trajectory.pdb

The folding path

A multi-model PDB capturing the folding pathway as an ordered series of snapshots.

always

.pml

PyMOL script

Worst residues and conflict root-causes highlighted, ready to open and inspect.

always

.ensemble.pdb

The ensemble

Multiple valid conformations, superposed — the spread the model considers plausible.

best-of ≥2

aggregate.csv · .json

The roll-up

Per-seed (or per-structure) summary with the winner flagged — and each seed's full bundle under seeds/.

best-of / batch
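The structure card above notes that the B-factor column carries per-residue confidence. A minimal sketch of reading it back, assuming standard fixed-column PDB ATOM records (B-factor in columns 61-66). The helper names, the demo coordinates and B-factor values, and the 50.0 cut-off are all illustrative assumptions, not engine output.

```python
def ca_bfactors(pdb_text: str) -> dict[int, float]:
    """Return {residue number: B-factor} from the C-alpha ATOM records."""
    values = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            res_seq = int(line[22:26])            # columns 23-26
            values[res_seq] = float(line[60:66])  # columns 61-66
    return values


def atom_line(serial, name, res_name, chain, res_seq, x, y, z, b):
    """Format one fixed-column ATOM record (demo values only)."""
    return ("ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
            .format(serial, name, res_name, chain, res_seq, x, y, z, 1.0, b))


# Hypothetical three-atom excerpt; the B-factor values are made up.
demo = "\n".join([
    atom_line(1, " N", "THR", "A", 1, 17.047, 14.099, 3.625, 87.5),
    atom_line(2, " CA", "THR", "A", 1, 16.967, 12.784, 4.338, 87.5),
    atom_line(3, " CA", "CYS", "A", 2, 13.856, 11.469, 6.066, 42.0),
])

confidence = ca_bfactors(demo)
low = [res for res, value in confidence.items() if value < 50.0]  # arbitrary cut-off
print(confidence, low)
```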

The quality report

B. Not just a structure: a verdict.

The core report carries three kinds of signal: the native-derived accuracy metrics (scoring only), the blind physics-validity check that every run must pass, and an explainability ladder that tells you not just that something is off, but where and why.

Accuracy · native-derived, scoring only

How close to the truth.

ca_rmsd, tm_score, contact_f1, bin_accuracy, max_residue_error and the heavy-atom RMSDs — present only when you supply a reference. An allatom_rmsd_informative flag tells you honestly whether the side-chain RMSD is a real measurement or a reference copy.

Physics validity · always, blind

Is it even possible.

A verdict (PASS / PASS-WITH-WARNINGS) backed by hard physics: the ramachandran_forbidden_fraction, severe and near steric clashes (lj_severe / lj_near), rotamer outliers, and which disulfides actually locked. When the engine cannot find the right fold, it returns a wrong-but-physical one, never an impossible one. A failure_classification block then names the reason in plain terms (e.g. backbone target missed).
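As an illustration of how a pipeline might act on the physics-validity block, here is a small gate in Python. The field names ramachandran_forbidden_fraction, lj_severe and lj_near come from this page; the flat dictionary layout, the "verdict" key, the thresholds and the function name `physics_gate` are assumptions, so check them against an actual report before relying on them.

```python
def physics_gate(report: dict, max_forbidden: float = 0.0) -> list[str]:
    """Return reasons to hold a structure back; an empty list means it passes."""
    reasons = []
    # "verdict" as a top-level key is an assumption about the JSON layout.
    if report.get("verdict") not in ("PASS", "PASS-WITH-WARNINGS"):
        reasons.append(f"verdict is {report.get('verdict')!r}")
    if report.get("ramachandran_forbidden_fraction", 0.0) > max_forbidden:
        reasons.append("forbidden backbone angles present")
    if report.get("lj_severe", 0) > 0:
        reasons.append("severe steric clashes present")
    return reasons


# Hypothetical report excerpts, not real engine output.
clean = {"verdict": "PASS-WITH-WARNINGS",
         "ramachandran_forbidden_fraction": 0.0,
         "lj_severe": 0, "lj_near": 2}
clashing = dict(clean, lj_severe=1)

print(physics_gate(clean))
print(physics_gate(clashing))
```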

The explainability ladder

From where to why, blindly.

Five rungs on the same data — each one a step deeper into the cause, none of them peeking at a known answer.

Worst pairs & residues · localization

The first question is simply where. The report ranks the worst-fitting distance pairs and residues, pointing you straight at the regions that disagree with the evidence.

Residue conflicts · causal attribution

A blind triangle-inequality test finds restraints that are mutually incompatible — which pull the structure in, which push it out, and by how many ångström. It blames the inputs, not the fold.
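The idea behind a triangle-inequality test can be shown with a toy example. If two restraints pull residue pairs (i, j) and (j, k) within given upper bounds, then i and k can be at most the sum of those bounds apart; a lower bound on (i, k) above that sum cannot be satisfied by any structure. The sketch below is a generic version of that check under this assumption, not the engine's implementation, and the bounds are invented.

```python
from itertools import permutations

# pair of residue indices -> (lower bound, upper bound) in angstroms
Bounds = dict[frozenset, tuple[float, float]]


def triangle_conflicts(bounds: Bounds) -> list[tuple[int, int, int, float]]:
    """Find triples where lower(i,k) exceeds upper(i,j) + upper(j,k)."""
    conflicts = []
    residues = sorted({r for pair in bounds for r in pair})
    for i, j, k in permutations(residues, 3):
        if i > k:
            continue  # visit each (i, k) pair through j only once
        far = frozenset((i, k))
        leg_a, leg_b = frozenset((i, j)), frozenset((j, k))
        if far in bounds and leg_a in bounds and leg_b in bounds:
            excess = bounds[far][0] - (bounds[leg_a][1] + bounds[leg_b][1])
            if excess > 0:
                conflicts.append((i, j, k, excess))
    return conflicts


# Toy restraints: 1-5 and 5-9 pull in, 1-9 pushes out by too much.
toy: Bounds = {
    frozenset((1, 5)): (0.0, 6.0),
    frozenset((5, 9)): (0.0, 6.0),
    frozenset((1, 9)): (15.0, 20.0),
}
print(triangle_conflicts(toy))
```

Because the check only compares restraints with each other, it needs no coordinates and no reference structure, which is what makes this kind of test blind.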

Conflict summary · root cause

A greedy set-cover collapses the conflicts to two or three root causes, with a signature that says whether the data is provably self-contradictory or the tension is merely structural.
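For readers unfamiliar with greedy set cover, a minimal sketch: repeatedly pick the restraint that takes part in the most still-unexplained conflicts until every conflict is covered. The restraint and conflict labels are hypothetical, and the engine's actual scoring and tie-breaking are not described on this page.

```python
def greedy_cover(covers: dict[str, set[str]]) -> list[str]:
    """Pick restraints greedily until every conflict is explained.

    `covers` maps a restraint label to the conflicts it participates in.
    """
    uncovered = set().union(*covers.values()) if covers else set()
    chosen = []
    while uncovered:
        # sorted() makes ties deterministic: the first label alphabetically wins
        best = max(sorted(covers), key=lambda r: len(covers[r] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


# Hypothetical conflict graph: four conflicts, four suspect restraints.
toy_covers = {
    "r1": {"c1", "c2", "c3"},
    "r2": {"c3", "c4"},
    "r3": {"c4"},
    "r4": {"c2"},
}
print(greedy_cover(toy_covers))
```

Greedy selection is not guaranteed to find the smallest possible cover, but it is fast and usually yields a short list, which is what a root-cause summary needs.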

Keep set · closed loop

A blind hand-off (keep_set) of the restraints worth trusting — the seed for a second, cleaner refinement round, derived without ever seeing the answer.

Held-out evaluation · non-gameable

Optional, and the only score that cannot be gamed: recall against independent evidence the fold never saw. It catches a structure that satisfies its own inputs perfectly yet is still wrong.
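Recall against held-out evidence can be sketched as the fraction of withheld contacts that the final structure happens to satisfy. The 8 Å Cα–Cα cut-off, the function name and the toy coordinates below are assumptions chosen for illustration; the engine's own definition may differ.

```python
import math


def held_out_recall(ca_coords: dict[int, tuple[float, float, float]],
                    held_out_pairs: list[tuple[int, int]],
                    cutoff: float = 8.0) -> float:
    """Fraction of held-out residue pairs within `cutoff` angstroms."""
    if not held_out_pairs:
        raise ValueError("no held-out pairs to evaluate")
    satisfied = sum(
        1 for i, j in held_out_pairs
        if math.dist(ca_coords[i], ca_coords[j]) <= cutoff
    )
    return satisfied / len(held_out_pairs)


# Toy C-alpha positions along a line; pair (1, 4) is far apart.
coords = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0),
          3: (7.6, 0.0, 0.0), 4: (20.0, 0.0, 0.0)}
withheld = [(1, 2), (1, 3), (1, 4)]
recall = held_out_recall(coords, withheld)
print(round(recall, 3))
```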

The value layer

C. What a single PDB can't tell you.

From the same fold the engine extracts a layer of scientific signal that single-structure predictors simply do not produce — secondary structure, networks, dynamics, and AlphaFold-style confidence analogs. All of it read-only and blind.

Structure & chemistry — every run
secondary_structure	DSSP-like assignment — the helix / strand / coil string, residue by residue.
hbond_network	The backbone H-bonds holding the fold together — which residues keep it stable.
solvent_accessibility	Surface vs. core — per-residue exposure and burial state, for active sites and mutations.
contact_map	The topology — long-range tertiary Cα–Cα contacts that define the fold.
energy_decomposition	Where the energy goes — what costs energy in the structure, and where.
disulfides_formed	Which Cys–Cys pairs locked into a covalent bond.
rotamers	Side-chain conformations — the χ1 / χ2 angles the engine settled on.
trajectory_frames	The folding pathway — ordered snapshots of the chain assembling itself.
Confidence & dynamics — needs an ensemble
ensemble (best-of ≥2)	Multiple valid structures, superposed: the conformational spread.
flexibility (best-of ≥2)	RMSF / B-factor: which regions are rigid and which flex, from cross-seed variance.
confidence (best-of ≥3)	A pLDDT analog: per-residue [0–100] confidence telling you where to trust the model. Written into the structure's B-factor.
pae (best-of ≥3)	An AlphaFold-PAE analog: an N×N map of how surely residue j is placed relative to i. Low blocks reveal rigid domains; high values flag uncertain interfaces.
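The flexibility row above rests on cross-seed variance. A minimal sketch of per-residue RMSF, assuming the ensemble models are already superposed and each is a list of Cα coordinates in the same residue order. This is the textbook definition (root-mean-square deviation from the mean position), not necessarily the engine's exact formula.

```python
import math

Coord = tuple[float, float, float]


def rmsf(models: list[list[Coord]]) -> list[float]:
    """Per-residue root-mean-square fluctuation across superposed models."""
    n_models = len(models)
    n_residues = len(models[0])
    result = []
    for i in range(n_residues):
        # mean position of residue i over all models
        mean = [sum(m[i][axis] for m in models) / n_models for axis in range(3)]
        # mean squared distance from that mean position
        msd = sum(
            sum((m[i][axis] - mean[axis]) ** 2 for axis in range(3))
            for m in models
        ) / n_models
        result.append(math.sqrt(msd))
    return result


# Two toy models: residue 0 is identical, residue 1 moves by 2 angstroms.
toy_ensemble = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
print(rmsf(toy_ensemble))
```

With a single model every deviation is zero, which is why these signals need an ensemble to mean anything.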

The confidence is calibrated, not decorative

Per-residue confidence is built from the ensemble's own spread (cross-seed RMSF), so it tracks real uncertainty rather than a restraint-fit proxy. With --best-of 6 it is calibrated: the per-residue score is mapped to a p_err_lt_2A column, P(Cα error < 2 Å), validated on held-out data. In lighter ensembles it falls back to a descriptive guide, and the report always tells you which mode you are in.
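What "calibrated" means can be checked with a reliability table: group residues by predicted probability and compare the mean prediction in each group with the fraction that really had a Cα error below 2 Å. The sketch below is a generic check on invented numbers, assuming you have a reference structure for the residues in question; it is not the engine's validation procedure.

```python
def reliability_table(probs: list[float], hits: list[bool], n_bins: int = 5):
    """Return (bin index, count, mean predicted p, observed frequency) rows."""
    bins = [[] for _ in range(n_bins)]
    for p, hit in zip(probs, hits):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, hit))
    rows = []
    for idx, members in enumerate(bins):
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            freq = sum(1 for _, hit in members if hit) / len(members)
            rows.append((idx, len(members), mean_p, freq))
    return rows


# Invented values: p_err_lt_2A predictions and whether the error was < 2 A.
predicted = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1]
observed = [True, True, True, False, False, False]
table = reliability_table(predicted, observed)
for row in table:
    print(row)
```

For a well-calibrated column, the mean prediction and the observed frequency in each row stay close to each other.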

The per-residue track

D. An ID card for every amino acid.

All the per-residue signals collapse into one tidy CSV — one row per residue, ready for a spreadsheet, a notebook, or a downstream pipeline.

1CRN.per_residue.csv

index,residue,burial_state,exposure,ss,rmsf
1,T,buried,0.128,C,0.007
2,C,buried,0.156,E,0.103
3,C,buried,0.113,G,0.025
4,P,intermediate,0.402,G,0.026
5,S,intermediate,0.526,E,0.036
…

ss H/E/G/C · exposure 0→1 (buried→exposed) · rmsf flexibility in Å. With --best-of ≥3 two more columns appear: confidence [0–100] and p_err_lt_2A — the calibrated P(Cα error < 2 Å).
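Because the track is plain CSV, the standard library is enough to work with it. The sketch below parses the five sample rows shown above from an in-memory string; in practice you would open the .per_residue.csv file instead. The variable names are illustrative.

```python
import csv
import io

# The five sample rows from the excerpt above.
SAMPLE = """index,residue,burial_state,exposure,ss,rmsf
1,T,buried,0.128,C,0.007
2,C,buried,0.156,E,0.103
3,C,buried,0.113,G,0.025
4,P,intermediate,0.402,G,0.026
5,S,intermediate,0.526,E,0.036
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Residues flagged as buried, and the most flexible residue by RMSF.
buried = [int(r["index"]) for r in rows if r["burial_state"] == "buried"]
most_flexible = max(rows, key=lambda r: float(r["rmsf"]))

print("buried:", buried)
print("most flexible:", most_flexible["index"], most_flexible["residue"])
```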

The interactive report

E. Open one file. See everything.

The HTML report is a single file that needs no server: the structure and ensemble are embedded inline. The 3D engine itself loads from a CDN, so the viewer needs network access. Double-click it and explore.

3D viewer · switchable colouring

Spectrum — N → C terminus, the default

Secondary structure — helix / strand / coil

Errors & conflicts — worst residues in red

Confidence — blue (high) → red (low); added with --best-of ≥3

Around the viewer sit the rest of the story: headline cards (Cα-RMSD, TM-score, ensemble size, helices / sheets — plus a confidence card once you run an ensemble), a per-residue map with secondary-structure (and, with an ensemble, confidence) strips, a predicted contact-map heatmap, and collapsible sections for every metric in the report.

It is the whole package in a form a collaborator can open without installing anything — the data room and the lab notebook in one file.

Honest by construction

F. You get exactly what you ran for.

There are no output flags to forget. What you receive is a clean function of how you ran — and the things that need an ensemble cannot be faked from a single fold, so we simply don't pretend to produce them.

What you ran	What you get
any --run	Structure + JSON/MD report + full value layer (per-residue, H-bonds, contact map, trajectory) + interactive HTML + PyMOL script
--best-of ≥2	+ ensemble + flexibility (RMSF)
--best-of ≥3	+ per-residue confidence + PAE map
--best-of 6	confidence becomes calibrated: P(Cα error < 2 Å)
--held-out-input	+ the blind out-of-sample evaluation block

Why some signals aren't always there: ensemble, flexibility, confidence and PAE are derived from the spread across multiple seeds. In a single run there is nothing to measure variance over — so they are absent by physics, not by a withheld flag.
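The table above reads as a pure function of the run settings. A sketch of that mapping in Python, with hypothetical names: `expected_outputs` and the output labels are illustrative, and treating values above 6 like 6 is an assumption, since the page only names --best-of 6.

```python
def expected_outputs(best_of: int = 1, held_out: bool = False) -> list[str]:
    """List the output groups the table above promises for a given run."""
    outputs = ["structure", "report", "value layer",
               "interactive HTML", "PyMOL script"]
    if best_of >= 2:
        outputs += ["ensemble", "flexibility"]
    if best_of >= 3:
        outputs += ["confidence", "pae"]
    if best_of >= 6:  # assumption: the page only states the case best_of == 6
        outputs.append("calibrated confidence")
    if held_out:
        outputs.append("held-out evaluation")
    return outputs


for n in (1, 2, 3, 6):
    print(n, expected_outputs(best_of=n))
```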

See it for real

A real result, nothing hidden.

Reading about the bundle only goes so far. Everything on this page — the structure, the quality report, the value-data layer, the per-residue track and the self-contained interactive HTML — ships together on every run. You don't have to take our word for it: the complete output bundles for our entire benchmark, across all targets, conditions and evidence ladders, are published in the open on GitHub. Open any one and you'll find exactly the artifacts described above.

