
Beyond Repo Scanning: How AIRI Expanded the Risk Vocabulary in STEM BIO-AI 1.7.x
How STEM BIO-AI uses the MIT AI Risk Repository as a governed local risk-vocabulary layer without replacing deterministic repository scanning
Series
STEM-AI: Sovereign Trust Evaluator for Medical AI Artifacts · Part 6 of 7

This is the second half of the same 1.7.x transition.
In the previous post, I wrote about calibration governance: how STEM BIO-AI keeps score authority from drifting when users simulate policy posture.
That was about how the system decides.
This post is about a different layer:
how the system speaks about risk.
A local repository scanner can become trapped inside its own vocabulary.
It can detect dependency issues, weak provenance language, shallow validation, reproducibility gaps, and risky exception handling.
But if every finding stays only inside the scanner's internal language, the report may remain too narrow.
That is the problem AIRI helped address in STEM BIO-AI 1.7.x.
In this context, AIRI is used as a local risk-vocabulary layer built from the MIT AI Risk Repository ecosystem.
The point is not to replace deterministic repository scanning with an external risk database.
The point is to give local findings a broader risk vocabulary without turning that vocabulary into a truth claim.
Basic AIRI Context

The MIT AI Risk Repository is a public AI risk resource from the MIT AI Risk Initiative.
It helps organize fragmented AI risk language across research, policy, and industry sources.
The repository includes three main parts:
- an AI Risk Database
- a Causal Taxonomy of AI Risks
- a Domain Taxonomy of AI Risks
According to the MIT AI Risk Repository site, the database collects 1,700+ risks from 74 existing AI risk frameworks and classifications. The public domain taxonomy organizes risks into 7 domains and 24 subdomains.
Some of those domain taxonomy nodes include:
2. Privacy & Security
2.1 Compromise of privacy by obtaining, leaking or correctly inferring sensitive information
2.2 AI system security vulnerabilities and attacks
6.5 Governance failure
7. AI System Safety, Failures, & Limitations
7.3 Lack of capability or robustness
7.4 Lack of transparency or interpretability
That makes AIRI useful as a vocabulary source.
But vocabulary is not truth.
A local scanner should not say:
this repository caused this risk.
It should say something more careful:
this local finding belongs to a broader class of AI risk language.
That distinction is the design boundary.
What Problem AIRI Was Meant to Solve

STEM BIO-AI began as a deterministic evidence-surface scanner for bio and medical AI repositories.
That core remains.
The scanner looks at observable repository surfaces:
- README and docs
- code structure
- CI configuration
- dependency manifests
- changelogs
- reproducibility signals
- clinical-adjacent boundary language
But once STEM BIO-AI started producing richer audit outputs, a new question appeared:
How should the system talk about the broader risk territory around a detected finding?
For example:
- a fail-open exception path may have implications beyond code quality
- weak provenance language may connect to reproducibility and trust concerns
- shallow validation around sensitive inputs may point toward a wider harm surface than the repository alone makes obvious
Without a broader vocabulary, those findings remain local and narrow.
AIRI helps widen the vocabulary without making the scanner less deterministic.
A Short Note on Detector Families
In this article, a detector family means a bounded local analysis surface inside STEM BIO-AI.
It does not mean an AI model judging the repository.
Examples include:
- code integrity detectors such as hardcoded credential or fail-open exception checks
- AST contract detectors such as shallow validator checks
- bio diagnostics such as SMILES parser-guard or silent mock fallback checks
- provenance and reproducibility evidence surfaces
A detector family produces a local finding.
The AIRI layer does not replace that finding.
It gives the finding a broader vocabulary anchor.
AIRI Does Not Replace the Scan

This boundary matters.
The AIRI layer does not:
- validate that a real-world incident happened
- prove that a repository causes a given harm
- turn a detector hit into a clinical danger claim
- replace due diligence or domain review
- override the deterministic score
Instead, it gives the system a structured way to say:
- what broader risk territory a finding may relate to
- which risk vocabulary exists around that class of concern
- where known coverage gaps remain
That is why AIRI is a risk-vocabulary layer, not a truth layer.
If a report says something like:
this scan's findings are covered by AIRI risk entry 7.3
that should not be read as:
this repository has been shown to lack capability or robustness
or:
this repository is unsafe
A better interpretation is:
within the detector scope currently mapped into the curated AIRI runtime layer, this scan triggered findings that connect to these AIRI risk entries.
That is narrower.
It is also more useful.
From External Repository to Local Governance Layer

The AIRI story in STEM BIO-AI changed during 1.7.x.
The initial direction was simple: use AIRI to provide broader risk labels around local findings.
That was useful, but not enough.
If an audit system relies on an external risk source, it needs governance around that source.
So STEM BIO-AI separates AIRI into three local layers:
| Local layer | Purpose |
| --- | --- |
| airi_registry_full.v1.json | normalized full local registry derived from the upstream AIRI snapshot |
| airi_runtime_bundle.v1.json | curated runtime subset used by deterministic scans |
| airi_detector_mapping.v1.json | detector-to-risk mapping registry plus known-gap records |
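The three-layer split can be sketched as a small loader. The file names follow the table above; the `version` field and the loader's shape are illustrative assumptions about the artifact schema, not the project's actual layout.

```python
import json
from pathlib import Path

# File names follow the three-layer table above; the "version" field is
# an illustrative assumption, not the real STEM BIO-AI schema.
AIRI_ARTIFACTS = {
    "registry": "airi_registry_full.v1.json",
    "bundle": "airi_runtime_bundle.v1.json",
    "mapping": "airi_detector_mapping.v1.json",
}

def validate_airi_layer(data: dict, layer: str) -> dict:
    """Reject an AIRI artifact that does not declare an explicit version,
    so any change to a layer requires a visible version bump."""
    if layer not in AIRI_ARTIFACTS:
        raise KeyError(f"unknown AIRI layer: {layer}")
    if "version" not in data:
        raise ValueError(f"{AIRI_ARTIFACTS[layer]}: missing version field")
    return data

def load_airi_layer(base_dir: str, layer: str) -> dict:
    """Load one versioned AIRI artifact from disk and validate it."""
    path = Path(base_dir) / AIRI_ARTIFACTS[layer]
    return validate_airi_layer(json.loads(path.read_text(encoding="utf-8")), layer)
```

Keeping validation separate from file I/O makes the version-bump rule testable without touching disk, which matches the deterministic-output goal.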
This separation prevents a common mistake:
confusing the full upstream AIRI universe with the smaller curated runtime bundle used by the scanner.
The scanner uses the curated runtime bundle, not the entire upstream AIRI universe.
That keeps runtime outputs deterministic, reviewable, and tied to a known local snapshot.
What “Governed” Means Here
In the current 1.7.5 state of the 1.7.x line, governed does not mean that every mapping has gone through an external review board.
It means something narrower and more concrete:
- AIRI data is stored as versioned local artifacts
- runtime scan output uses a curated bundle, not the entire upstream universe
- detector mappings are separated from the full registry
- known gaps are recorded as part of the mapping layer
- artifact metadata surfaces AIRI registry, bundle, mapping, snapshot, and license information
- changes to registry, runtime bundle, or mapping versions require explicit version bumps
That is the current governance level.
It is not final.
But it is stronger than attaching a risk dataset as an unversioned appendix.
The Curation Logic

This is the part that matters most.
AIRI is broad. STEM BIO-AI is narrow.
STEM BIO-AI does not need every AIRI entry active at runtime. It needs the subset that can be responsibly connected to deterministic repository evidence.
So the runtime bundle is curated by exclusion as much as inclusion.
A risk vocabulary node should stay outside the runtime bundle when:
- No local evidence surface exists. The scanner has no repository-level signal that can responsibly connect to that risk.
- The mapping would require causal inference. The scanner would have to imply that harm occurred, that users were affected, or that the repository caused a risk.
- The risk is too broad for repository-local evidence. Broad societal, geopolitical, or macroeconomic risks may be important in AIRI, but they should not become runtime scan outputs unless a local detector surface can support the mapping.
- The mapping would confuse vocabulary with score authority. If a risk label might be read as changing the formal score or certifying danger, it should remain outside the runtime layer until the reporting semantics are clear.
So the runtime bundle is not a summary of all AI risk.
It is the subset of risk vocabulary that the scanner can use responsibly.
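The exclusion criteria above can be expressed as a small admission gate. The field names (`evidence_surfaces`, `requires_causal_inference`, `scope`, `score_semantics_unclear`) are hypothetical, chosen only to mirror the four criteria; they are not the project's actual schema.

```python
def admit_to_runtime_bundle(risk: dict) -> bool:
    """Return True only when a risk node can responsibly enter the
    curated runtime bundle. Field names are illustrative assumptions
    mirroring the four exclusion criteria above."""
    if not risk.get("evidence_surfaces"):       # no local evidence surface
        return False
    if risk.get("requires_causal_inference"):   # would imply harm occurred
        return False
    if risk.get("scope") == "societal":         # too broad for repo-local evidence
        return False
    if risk.get("score_semantics_unclear"):     # could be read as score authority
        return False
    return True
```

Curation by exclusion means the default answer is no: a node earns its way into the runtime bundle only when every gate passes.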
Example: Detector Hit to AIRI Domain Vocabulary

A concrete example helps.
Suppose STEM BIO-AI detects a shallow validator around sensitive or clinical-adjacent inputs.
The local finding might be something like CC3_shallow_validator: a validation function that appears shallower than its name implies.
At the repository level, this is a code-contract finding.
It says:
- the function appears to validate input
- the validation is shallow
- the implementation may not enforce the boundary implied by its name
The AIRI layer should not turn that into:
this repository caused privacy harm.
That would be too strong.
A safer mapping uses AIRI as vocabulary:
| Local detector surface | Local meaning | AIRI vocabulary anchor |
| --- | --- | --- |
| CC3_shallow_validator | validation function appears shallower than its name implies | 7.3 Lack of capability or robustness; possibly 2.1 Compromise of privacy... if sensitive information handling is in scope |
| fail-open exception path | code path may silently continue after failure | 7.3 Lack of capability or robustness |
| hardcoded credential signal | repository surface suggests exposed secret-like pattern | 2.2 AI system security vulnerabilities and attacks |
| weak provenance surface | repository gives weak evidence about data/source traceability | 7.4 Lack of transparency or interpretability; possibly 6.5 Governance failure |
| silent mock fallback | production-like path may fall back to simulated behavior | 7.3 Lack of capability or robustness; 7.4 Lack of transparency or interpretability |
The mapping does not prove harm.
It tells the reviewer which broader AIRI vocabulary may be relevant to the local finding.
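In code, this kind of mapping can be as plain as a lookup table. The anchors below are taken from the mapping table above; the dict shape and detector identifiers are an illustrative sketch, not the contents of airi_detector_mapping.v1.json.

```python
# Anchors follow the mapping table above; this dict is an illustrative
# sketch, not the actual airi_detector_mapping.v1.json contents.
DETECTOR_TO_AIRI = {
    "CC3_shallow_validator": ["7.3", "2.1"],
    "fail_open_exception": ["7.3"],
    "hardcoded_credential": ["2.2"],
    "weak_provenance_surface": ["7.4", "6.5"],
    "silent_mock_fallback": ["7.3", "7.4"],
}

def vocabulary_anchors(detector_id: str) -> list:
    """Return AIRI vocabulary anchors for a local finding.
    An empty list is a recorded gap, not a claim of safety or harm."""
    return DETECTOR_TO_AIRI.get(detector_id, [])
```

Note what the function does not do: it never changes a score, and an unmapped detector returns an empty list rather than inventing a risk claim.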
That is the difference between:
this detector proves a risk occurred
and:
this detector finding belongs near this risk-language area.
The second claim is weaker.
It is also the correct claim.
Why Local Provenance Matters

AIRI is external.
That means STEM BIO-AI needs to answer governance questions explicitly:
- which upstream snapshot is being used?
- which subset is active at runtime?
- which risks are included in the curated bundle?
- which risks are known gaps?
- which detector maps to which AIRI entry?
- what version of the mapping is active?
This is why the AIRI work matters.
It is not just adding labels.
It is turning risk vocabulary into a governed local data layer.
In the current governance note, the upstream source is recorded as:
- upstream source: https://airisk.mit.edu/
- upstream artifact: The AI Risk Repository V4_03
- upstream license: MIT
- local snapshot date: 2026-04-23
That provenance is not cosmetic.
It allows an audit artifact to say which risk vocabulary it was using when the scan was produced.
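Stamping that provenance into the artifact itself can be sketched in a few lines. The values mirror the governance note above; the `stamp_artifact` helper and the `airi_provenance` key are hypothetical names for illustration.

```python
# Values mirror the governance note above; the helper and the
# "airi_provenance" key are hypothetical names for illustration.
AIRI_PROVENANCE = {
    "upstream_source": "https://airisk.mit.edu/",
    "upstream_artifact": "The AI Risk Repository V4_03",
    "upstream_license": "MIT",
    "local_snapshot_date": "2026-04-23",
}

def stamp_artifact(scan_result: dict) -> dict:
    """Attach the active AIRI provenance to a scan artifact, so the
    artifact records which vocabulary snapshot produced it."""
    stamped = dict(scan_result)  # copy; never mutate the caller's result
    stamped["airi_provenance"] = dict(AIRI_PROVENANCE)
    return stamped
```

Because the stamp travels with the artifact, two reports produced under different snapshots disagree visibly instead of silently.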
What Is Implemented in the Current 1.7.5 State of 1.7.x
The current AIRI layer is implemented, but bounded.
Implemented surfaces include:
- AIRI-backed coverage surfaces in scan outputs
- local curated runtime bundle
- local registry and mapping schemas
- detector-to-AIRI mapping layer
- known-gap reporting
- provenance and bundle/source labeling
In current scan results, airi_risk_coverage is the main artifact surface for this layer.
The public result contract includes AIRI fields such as:
- airi_registry_version
- airi_bundle_version
- airi_mapping_version
- airi_bundle_scope
- airi_upstream_snapshot_date
- airi_upstream_license
- total_risks_in_registry
- total_risks_in_bundle
- total_risks_in_detector_scope
- detectors_triggered
- covered_risks
- covered_count
- coverage_rate
- known_gaps_in_bundle
- known_gaps_outside_bundle
These fields matter because they let a reviewer distinguish three things that are easy to confuse: the upstream AIRI source, the local runtime bundle, and the detector mapping actually used by the scan.
The important part is not only that these fields exist.
The important part is that AIRI usage becomes auditable from the artifact itself.
If two scans use different AIRI snapshots or mappings, that difference should not be hidden.
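One plausible reading of these contract fields is that coverage_rate relates covered risks to the detector-scope risk set. The arithmetic below is an assumption about the field semantics, not the documented formula.

```python
def airi_coverage_summary(triggered, detector_to_airi, detector_scope_risks):
    """Summarize which in-scope AIRI risks the triggered detectors touch.
    This is audit-surface bookkeeping, not a safety percentage.
    Field semantics are an assumption, not the documented contract."""
    covered = set()
    for detector in triggered:
        covered.update(detector_to_airi.get(detector, []))
    covered &= detector_scope_risks  # only count risks inside detector scope
    rate = len(covered) / len(detector_scope_risks) if detector_scope_risks else 0.0
    return {
        "detectors_triggered": sorted(triggered),
        "covered_risks": sorted(covered),
        "covered_count": len(covered),
        "coverage_rate": round(rate, 4),
    }
```

Note the denominator: scoping the rate to detector-mapped risks (rather than the full registry or bundle) is exactly what keeps the number an audit surface instead of an implied "percent of all AI risk".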
Coverage Is Not a Safety Percentage
AIRI coverage in STEM BIO-AI is an audit-surface concept, not a safety percentage.
It does not mean:
- the repository is safe
- the repository is unsafe
- the scanner covers all AI risk
- the covered percentage is a compliance score
It means:
a local deterministic finding has been mapped to a known risk-vocabulary entry inside the curated AIRI runtime layer.
That is useful because it gives reviewers a wider frame.
But it does not turn local evidence into a global safety claim.
This is the same discipline used elsewhere in STEM BIO-AI:
- scoring is not clinical validation
- advisory interpretation is not scoring authority
- reproducibility evidence is not automatic score authority
- AIRI coverage is not a safety percentage
Each layer has a role.
Each layer has a boundary.
What Changed in 1.7.x
The 1.7.x AIRI story is not simply “we added AIRI.”
The actual change was a move from loose risk labeling toward governed local vocabulary.
1.7.0
AIRI V4 integration appeared in scan outputs.
The scanner began producing an airi_risk_coverage section that maps triggered detector findings to AIRI risk IDs, coverage rate, and known gaps.
The same release also introduced Layer 2 AST contract detectors such as CC1, CC2, and CC3, which expanded the local detector surface available for risk-vocabulary mapping.
1.7.1
AIRI became a governed local data layer.
The architecture separated:
- full local registry
- curated runtime bundle
- detector mapping registry
This release also replaced hardcoded AIRI detector mappings and known-gap lists with packaged local registry files.
Runtime outputs began surfacing registry version, bundle version, mapping version, upstream snapshot date, license, and attribution note, and known gaps were split into known_gaps_in_bundle and known_gaps_outside_bundle.
1.7.2
No major AIRI architecture change.
The important governance point was regression stability: a same-target self-scan comparison verified no drift in airi_risk_coverage alongside score, tier, code contract, detector summary, and evidence ledger count.
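A regression check of that kind can be sketched as a field-by-field comparison of two same-target artifacts. The field list follows the wording of the release note; the flat artifact shape is an assumed simplification.

```python
# Field list follows the release note's wording; the flat artifact
# shape is an assumed simplification, not the real result contract.
STABILITY_FIELDS = [
    "score", "tier", "airi_risk_coverage",
    "code_contract", "detector_summary", "evidence_ledger_count",
]

def drift_fields(scan_a: dict, scan_b: dict) -> list:
    """Return the stability fields that differ between two same-target
    scans. An empty list means no drift across the checked surfaces."""
    return [f for f in STABILITY_FIELDS if scan_a.get(f) != scan_b.get(f)]
```

A check like this turns "the output is deterministic" from a claim into a testable property: rerun the scan on the same target and assert the drift list is empty.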
No major AIRI architecture change.
The release focused on runtime cleanup, stale demo wording, layout stabilization, and output routing.
1.7.4
AIRI presentation became clearer across demo and report outputs.
The release surfaced AIRI summary material more clearly across the Hugging Face overview card and markdown/explain report sections.
1.7.5
No new AIRI data architecture change.
But artifact-level governance improved more broadly through additive evidence-ledger quality fields and audit-freshness metadata.
That matters because AIRI is most useful when it lives inside a report surface that already carries freshness, evidence quality, and provenance signals.
The important change across the line is this:
AIRI moved from attached dataset toward versioned local risk-vocabulary layer.
What This Still Does Not Do

The AIRI layer still does not:
- verify real incidents
- prove causality
- certify repository safety
- replace domain review
- turn AIRI categories into deterministic truth claims
- collapse the full upstream AIRI universe into the runtime scanner
Those are not missing features.
They are the boundaries that keep the layer useful.
Where This Could Go
The next useful direction is not to overload the scanner with external systems.
It is to improve:
- registry provenance
- bundle governance
- mapping confidence
- known-gap clarity
- artifact-visible mapping metadata
- disciplined links to incident-oriented resources
The broader MIT AIRI ecosystem also includes related incident-oriented resources such as the AI Incident Tracker.
That ecosystem is relevant context, but it is not the same thing as current runtime integration in STEM BIO-AI.
A future version may choose to reference incident-oriented resources more explicitly, but deterministic scans should not ingest them casually or blur them with repository-local findings.
A future version should be able to say not only:
this detector maps to this AIRI risk vocabulary area.
But also:
this mapping has this confidence level, this review status, this local evidence family, and this known limitation.
That is the next governance step.
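What such an extended mapping record might look like can be sketched as a small data structure. Every field name here is an assumption about a possible future schema, not an implemented one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MappingRecord:
    """Hypothetical future shape of one detector-to-AIRI mapping entry.
    All field names are assumptions about a possible schema."""
    detector_id: str       # local detector surface that triggers the mapping
    airi_risk_id: str      # AIRI vocabulary anchor, e.g. "7.3"
    confidence: str        # e.g. "low" | "medium" | "high"
    review_status: str     # e.g. "unreviewed" | "maintainer-reviewed"
    evidence_family: str   # local evidence family backing the mapping
    known_limitation: str  # explicit caveat carried into reports
```

Making the record frozen fits the governance posture: a mapping entry changes by publishing a new versioned entry, not by mutating an old one in place.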
Final Thought

That is the role of AIRI in this release line.
Not truth replacement.
Not safety certification.
Not incident proof.
A governed vocabulary bridge.
Local evidence first.
External vocabulary second.
Explicit provenance always.
References and Acknowledgment
- MIT AI Risk Repository: https://airisk.mit.edu/
- MIT AI Incident Tracker: https://airisk.mit.edu/ai-incident-tracker
- STEM BIO-AI repository: https://github.com/flamehaven01/STEM-BIO-AI
This AIRI-related direction in STEM BIO-AI was informed by broader public AI risk work, including the MIT AI Risk Repository ecosystem.
The framing around AIRI as a broader risk-vocabulary layer, rather than a repository-local truth layer, was also strengthened by public commentary and ecosystem work from people in this space, including Peter Slattery, PhD.
These references informed the vocabulary and governance direction described here. They do not imply endorsement of STEM BIO-AI or responsibility for its implementation choices.
Next Step
If your AI system works in demos but still feels fragile, start here.
Flamehaven reviews where AI systems overclaim, drift quietly, or remain operationally fragile under real conditions. Start with a direct technical conversation or review how the work is structured before you reach out.
Direct founder contact · Response within 1-2 business days