
Beyond Repo Scanning: How AIRI Expanded the Risk Vocabulary in STEM BIO-AI 1.7.x
How STEM BIO-AI uses the MIT AI Risk Repository as a governed local risk-vocabulary layer without replacing deterministic repository scanning
Series
STEM-AI: Sovereign Trust Evaluator for Medical AI Artifacts · Part 6 of 7

This is the second half of the same 1.7.x transition.
In the previous post, I wrote about calibration governance: how STEM BIO-AI keeps score authority from drifting when users simulate policy posture.
That was about how the system decides.
This post is about a different layer:
how the system speaks about risk.
A local repository scanner can become trapped inside its own vocabulary.
It can detect dependency issues, weak provenance language, shallow validation, reproducibility gaps, and risky exception handling.
But if every finding stays only inside the scanner's internal language, the report may remain too narrow.
That is the problem AIRI helped address in STEM BIO-AI 1.7.x.
In this context, AIRI is used as a local risk-vocabulary layer built from the MIT AI Risk Repository ecosystem.
The point is not to replace deterministic repository scanning with an external risk database.
The point is to give local findings a broader risk vocabulary without turning that vocabulary into a truth claim.
Basic AIRI Context

The MIT AI Risk Repository is a public AI risk resource from the MIT AI Risk Initiative.
It helps organize fragmented AI risk language across research, policy, and industry sources.
The repository includes three main parts:
- an AI Risk Database
- a Causal Taxonomy of AI Risks
- a Domain Taxonomy of AI Risks
According to the MIT AI Risk Repository site, the database collects 1,700+ risks from 74 existing AI risk frameworks and classifications. The public domain taxonomy organizes risks into 7 domains and 24 subdomains.
Some of those domain taxonomy nodes include:
2. Privacy & Security
2.1 Compromise of privacy by obtaining, leaking or correctly inferring sensitive information
2.2 AI system security vulnerabilities and attacks
6.5 Governance failure
7. AI System Safety, Failures, & Limitations
7.3 Lack of capability or robustness
7.4 Lack of transparency or interpretability
That makes AIRI useful as a vocabulary source.
But vocabulary is not truth.
A local scanner should not say:
this repository caused this risk.
It should say something more careful:
this local finding belongs to a broader class of AI risk language.
That distinction is the design boundary.
What Problem AIRI Was Meant to Solve

STEM BIO-AI began as a deterministic evidence-surface scanner for bio and medical AI repositories.
That core remains.
The scanner looks at observable repository surfaces:
- README and docs
- code structure
- CI configuration
- dependency manifests
- changelogs
- reproducibility signals
- clinical-adjacent boundary language
But once STEM BIO-AI started producing richer audit outputs, a new question appeared:
How should the system talk about the broader risk territory around a detected finding?
For example:
- a fail-open exception path may have implications beyond code quality
- weak provenance language may connect to reproducibility and trust concerns
- shallow validation around sensitive inputs may point toward a wider harm surface than the repository alone makes obvious
Without a broader vocabulary, those findings remain local and narrow.
AIRI helps widen the vocabulary without making the scanner less deterministic.
A Short Note on Detector Families
In this article, a detector family means a bounded local analysis surface inside STEM BIO-AI.
It does not mean an AI model judging the repository.
Examples include:
- code integrity detectors such as hardcoded credential or fail-open exception checks
- AST contract detectors such as shallow validator checks
- bio diagnostics such as SMILES parser-guard or silent mock fallback checks
- provenance and reproducibility evidence surfaces
A detector family produces a local finding.
The AIRI layer does not replace that finding.
It gives the finding a broader vocabulary anchor.
AIRI Does Not Replace the Scan

This boundary matters.
The AIRI layer does not:
- validate that a real-world incident happened
- prove that a repository causes a given harm
- turn a detector hit into a clinical danger claim
- replace due diligence or domain review
- override the deterministic score
Instead, it gives the system a structured way to say:
- what broader risk territory a finding may relate to
- which risk vocabulary exists around that class of concern
- where known coverage gaps remain
That is why AIRI is a risk-vocabulary layer, not a truth layer.
If a report says something like:
this scan's findings are covered by AIRI risk entry 7.3
that should not be read as:
this repository has been shown to lack capability or robustness
or:
this repository is unsafe
A better interpretation is:
within the detector scope currently mapped into the curated AIRI runtime layer, this scan triggered findings that connect to these AIRI risk entries.
That is narrower.
It is also more useful.
From External Repository to Local Governance Layer

The AIRI story in STEM BIO-AI changed during 1.7.x.
The initial direction was simple: use AIRI to provide broader risk labels around local findings.
That was useful, but not enough.
If an audit system relies on an external risk source, it needs governance around that source.
So STEM BIO-AI separates AIRI into three local layers:
| Local layer | Purpose |
| --- | --- |
| airi_registry_full.v1.json | normalized full local registry derived from the upstream AIRI snapshot |
| airi_runtime_bundle.v1.json | curated runtime subset used by deterministic scans |
| airi_detector_mapping.v1.json | detector-to-risk mapping registry plus known-gap records |
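The three-layer split can be sketched as a small loader. The file names follow the table above; the `version` field and the loader's shape are illustrative assumptions about the artifact schema, not the project's actual layout.

```python
import json
from pathlib import Path

# File names follow the three-layer table above; the "version" field is
# an illustrative assumption, not the real STEM BIO-AI schema.
AIRI_ARTIFACTS = {
    "registry": "airi_registry_full.v1.json",
    "bundle": "airi_runtime_bundle.v1.json",
    "mapping": "airi_detector_mapping.v1.json",
}

def validate_airi_layer(data: dict, layer: str) -> dict:
    """Reject an AIRI artifact that does not declare an explicit version,
    so any change to a layer requires a visible version bump."""
    if layer not in AIRI_ARTIFACTS:
        raise KeyError(f"unknown AIRI layer: {layer}")
    if "version" not in data:
        raise ValueError(f"{AIRI_ARTIFACTS[layer]}: missing version field")
    return data

def load_airi_layer(base_dir: str, layer: str) -> dict:
    """Load one versioned AIRI artifact from disk and validate it."""
    path = Path(base_dir) / AIRI_ARTIFACTS[layer]
    return validate_airi_layer(json.loads(path.read_text(encoding="utf-8")), layer)
```

Keeping validation separate from file I/O makes the version-bump rule testable without touching disk, which matches the deterministic-output goal.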
This separation prevents a common mistake:
confusing the full upstream AIRI universe with the smaller curated runtime bundle used by the scanner.
The scanner uses the curated runtime bundle, not the entire upstream AIRI universe.
That keeps runtime outputs deterministic, reviewable, and tied to a known local snapshot.
What “Governed” Means Here
In the current 1.7.5 state of the 1.7.x line, governed does not mean that every mapping has gone through an external review board.
It means something narrower and more concrete:
- AIRI data is stored as versioned local artifacts
- runtime scan output uses a curated bundle, not the entire upstream universe
- detector mappings are separated from the full registry
- known gaps are recorded as part of the mapping layer
- artifact metadata surfaces AIRI registry, bundle, mapping, snapshot, and license information
- changes to registry, runtime bundle, or mapping versions require explicit version bumps
That is the current governance level.
It is not final.
But it is stronger than attaching a risk dataset as an unversioned appendix.
The Curation Logic

This is the part that matters most.
AIRI is broad. STEM BIO-AI is narrow.
STEM BIO-AI does not need every AIRI entry active at runtime. It needs the subset that can be responsibly connected to deterministic repository evidence.
So the runtime bundle is curated by exclusion as much as inclusion.
A risk vocabulary node should stay outside the runtime bundle when:
- No local evidence surface exists. The scanner has no repository-level signal that can responsibly connect to that risk.
- The mapping would require causal inference. The scanner would have to imply that harm occurred, that users were affected, or that the repository caused a risk.
- The risk is too broad for repository-local evidence. Broad societal, geopolitical, or macroeconomic risks may be important in AIRI, but they should not become runtime scan outputs unless a local detector surface can support the mapping.
- The mapping would confuse vocabulary with score authority. If a risk label might be read as changing the formal score or certifying danger, it should remain outside the runtime layer until the reporting semantics are clear.
So the runtime bundle is not a summary of all AI risk.
It is the subset of risk vocabulary that the scanner can use responsibly.
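The exclusion criteria above can be expressed as a small admission gate. The field names (`evidence_surfaces`, `requires_causal_inference`, `scope`, `score_semantics_unclear`) are hypothetical, chosen only to mirror the four criteria; they are not the project's actual schema.

```python
def admit_to_runtime_bundle(risk: dict) -> bool:
    """Return True only when a risk node can responsibly enter the
    curated runtime bundle. Field names are illustrative assumptions
    mirroring the four exclusion criteria above."""
    if not risk.get("evidence_surfaces"):       # no local evidence surface
        return False
    if risk.get("requires_causal_inference"):   # would imply harm occurred
        return False
    if risk.get("scope") == "societal":         # too broad for repo-local evidence
        return False
    if risk.get("score_semantics_unclear"):     # could be read as score authority
        return False
    return True
```

Curation by exclusion means the default answer is no: a node earns its way into the runtime bundle only when every gate passes.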
Example: Detector Hit to AIRI Domain Vocabulary

A concrete example helps.
Suppose STEM BIO-AI detects a shallow validator around sensitive or clinical-adjacent inputs.
The local finding might be something like CC3_shallow_validator: a validation function that appears shallower than its name implies.
At the repository level, this is a code-contract finding.
It says:
- the function appears to validate input
- the validation is shallow
- the implementation may not enforce the boundary implied by its name
The AIRI layer should not turn that into:
this repository caused privacy harm.
That would be too strong.
A safer mapping uses AIRI as vocabulary:
| Local detector surface | Local meaning | AIRI vocabulary anchor |
| --- | --- | --- |
| CC3_shallow_validator | validation function appears shallower than its name implies | 7.3 Lack of capability or robustness; possibly 2.1 Compromise of privacy... if sensitive information handling is in scope |
| fail-open exception path | code path may silently continue after failure | 7.3 Lack of capability or robustness |
| hardcoded credential signal | repository surface suggests exposed secret-like pattern | 2.2 AI system security vulnerabilities and attacks |
| weak provenance surface | repository gives weak evidence about data/source traceability | 7.4 Lack of transparency or interpretability; possibly 6.5 Governance failure |
| silent mock fallback | production-like path may fall back to simulated behavior | 7.3 Lack of capability or robustness; 7.4 Lack of transparency or interpretability |
The mapping does not prove harm.
It tells the reviewer which broader AIRI vocabulary may be relevant to the local finding.
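In code, this kind of mapping can be as plain as a lookup table. The anchors below are taken from the mapping table above; the dict shape and detector identifiers are an illustrative sketch, not the contents of airi_detector_mapping.v1.json.

```python
# Anchors follow the mapping table above; this dict is an illustrative
# sketch, not the actual airi_detector_mapping.v1.json contents.
DETECTOR_TO_AIRI = {
    "CC3_shallow_validator": ["7.3", "2.1"],
    "fail_open_exception": ["7.3"],
    "hardcoded_credential": ["2.2"],
    "weak_provenance_surface": ["7.4", "6.5"],
    "silent_mock_fallback": ["7.3", "7.4"],
}

def vocabulary_anchors(detector_id: str) -> list:
    """Return AIRI vocabulary anchors for a local finding.
    An empty list is a recorded gap, not a claim of safety or harm."""
    return DETECTOR_TO_AIRI.get(detector_id, [])
```

Note what the function does not do: it never changes a score, and an unmapped detector returns an empty list rather than inventing a risk claim.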
That is the difference between:
this detector proves a risk occurred
and:
this detector finding belongs near this risk-language area.
The second claim is weaker.
It is also the correct claim.
Why Local Provenance Matters

AIRI is external.
That means STEM BIO-AI needs to answer governance questions explicitly:
- which upstream snapshot is being used?
- which subset is active at runtime?
- which risks are included in the curated bundle?
- which risks are known gaps?
- which detector maps to which AIRI entry?
- what version of the mapping is active?
This is why the AIRI work matters.
It is not just adding labels.
It is turning risk vocabulary into a governed local data layer.
In the current governance note, the upstream source is recorded as:
- upstream source: https://airisk.mit.edu/
- upstream artifact: The AI Risk Repository V4_03
- upstream license: MIT
- local snapshot date: 2026-04-23
That provenance is not cosmetic.
It allows an audit artifact to say which risk vocabulary it was using when the scan was produced.
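Stamping that provenance into the artifact itself can be sketched in a few lines. The values mirror the governance note above; the `stamp_artifact` helper and the `airi_provenance` key are hypothetical names for illustration.

```python
# Values mirror the governance note above; the helper and the
# "airi_provenance" key are hypothetical names for illustration.
AIRI_PROVENANCE = {
    "upstream_source": "https://airisk.mit.edu/",
    "upstream_artifact": "The AI Risk Repository V4_03",
    "upstream_license": "MIT",
    "local_snapshot_date": "2026-04-23",
}

def stamp_artifact(scan_result: dict) -> dict:
    """Attach the active AIRI provenance to a scan artifact, so the
    artifact records which vocabulary snapshot produced it."""
    stamped = dict(scan_result)  # copy; never mutate the caller's result
    stamped["airi_provenance"] = dict(AIRI_PROVENANCE)
    return stamped
```

Because the stamp travels with the artifact, two reports produced under different snapshots disagree visibly instead of silently.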
What Is Implemented in the Current 1.7.5 State of 1.7.x
The current AIRI layer is implemented, but bounded.
Implemented surfaces include:
- AIRI-backed coverage surfaces in scan outputs
- local curated runtime bundle
- local registry and mapping schemas
- detector-to-AIRI mapping layer
- known-gap reporting
- provenance and bundle/source labeling
In current scan results, airi_risk_coverage is the main artifact surface for this layer.
The public result contract includes AIRI fields such as:
- airi_registry_version
- airi_bundle_version
- airi_mapping_version
- airi_bundle_scope
- airi_upstream_snapshot_date
- airi_upstream_license
- total_risks_in_registry
- total_risks_in_bundle
- total_risks_in_detector_scope
- detectors_triggered
- covered_risks
- covered_count
- coverage_rate
- known_gaps_in_bundle
- known_gaps_outside_bundle
These fields matter because they let a reviewer distinguish three things that are easy to confuse: the upstream AIRI source, the local runtime bundle, and the detector mapping actually used by the scan.
The important part is not only that these fields exist.
The important part is that AIRI usage becomes auditable from the artifact itself.
If two scans use different AIRI snapshots or mappings, that difference should not be hidden.
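One plausible reading of these contract fields is that coverage_rate relates covered risks to the detector-scope risk set. The arithmetic below is an assumption about the field semantics, not the documented formula.

```python
def airi_coverage_summary(triggered, detector_to_airi, detector_scope_risks):
    """Summarize which in-scope AIRI risks the triggered detectors touch.
    This is audit-surface bookkeeping, not a safety percentage.
    Field semantics are an assumption, not the documented contract."""
    covered = set()
    for detector in triggered:
        covered.update(detector_to_airi.get(detector, []))
    covered &= detector_scope_risks  # only count risks inside detector scope
    rate = len(covered) / len(detector_scope_risks) if detector_scope_risks else 0.0
    return {
        "detectors_triggered": sorted(triggered),
        "covered_risks": sorted(covered),
        "covered_count": len(covered),
        "coverage_rate": round(rate, 4),
    }
```

Note the denominator: scoping the rate to detector-mapped risks (rather than the full registry or bundle) is exactly what keeps the number an audit surface instead of an implied "percent of all AI risk".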
Coverage Is Not a Safety Percentage
AIRI coverage in STEM BIO-AI is an audit-surface concept, not a safety percentage.
It does not mean:
- the repository is safe
- the repository is unsafe
- the scanner covers all AI risk
- the covered percentage is a compliance score
It means:
a local deterministic finding has been mapped to a known risk-vocabulary entry inside the curated AIRI runtime layer.
That is useful because it gives reviewers a wider frame.
But it does not turn local evidence into a global safety claim.
This is the same discipline used elsewhere in STEM BIO-AI:
- scoring is not clinical validation
- advisory interpretation is not scoring authority
- reproducibility evidence is not automatic score authority
- AIRI coverage is not a safety percentage
Each layer has a role.
Each layer has a boundary.
What Changed in 1.7.x
The 1.7.x AIRI story is not simply “we added AIRI.”
The actual change was a move from loose risk labeling toward governed local vocabulary.
1.7.0
AIRI V4 integration appeared in scan outputs.
The scanner began producing an airi_risk_coverage section that maps triggered detector findings to AIRI risk IDs, coverage rate, and known gaps.
The same release also introduced Layer 2 AST contract detectors such as CC1, CC2, and CC3, which expanded the local detector surface available for risk-vocabulary mapping.
1.7.1
AIRI became a governed local data layer.
The architecture separated:
- full local registry
- curated runtime bundle
- detector mapping registry
This release also replaced hardcoded AIRI detector mappings and known-gap lists with packaged local registry files.
Runtime outputs began surfacing registry version, bundle version, mapping version, upstream snapshot date, license, and attribution note, and known gaps were split into known_gaps_in_bundle and known_gaps_outside_bundle.
1.7.2
No major AIRI architecture change.
The important governance point was regression stability: a same-target self-scan comparison verified no drift in airi_risk_coverage alongside score, tier, code contract, detector summary, and evidence ledger count.
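A regression check of that kind can be sketched as a field-by-field comparison of two same-target artifacts. The field list follows the wording of the release note; the flat artifact shape is an assumed simplification.

```python
# Field list follows the release note's wording; the flat artifact
# shape is an assumed simplification, not the real result contract.
STABILITY_FIELDS = [
    "score", "tier", "airi_risk_coverage",
    "code_contract", "detector_summary", "evidence_ledger_count",
]

def drift_fields(scan_a: dict, scan_b: dict) -> list:
    """Return the stability fields that differ between two same-target
    scans. An empty list means no drift across the checked surfaces."""
    return [f for f in STABILITY_FIELDS if scan_a.get(f) != scan_b.get(f)]
```

A check like this turns "the output is deterministic" from a claim into a testable property: rerun the scan on the same target and assert the drift list is empty.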
No major AIRI architecture change.
The release focused on runtime cleanup, stale demo wording, layout stabilization, and output routing.
1.7.4
AIRI presentation became clearer across demo and report outputs.
The release surfaced AIRI summary material more clearly across the Hugging Face overview card and markdown/explain report sections.
1.7.5
No new AIRI data architecture change.
But artifact-level governance improved more broadly through additive evidence-ledger quality fields and audit-freshness metadata.
That matters because AIRI is most useful when it lives inside a report surface that already carries freshness, evidence quality, and provenance signals.
The important change across the line is this:
AIRI moved from attached dataset toward versioned local risk-vocabulary layer.
What This Still Does Not Do

The AIRI layer still does not:
- verify real incidents
- prove causality
- certify repository safety
- replace domain review
- turn AIRI categories into deterministic truth claims
- collapse the full upstream AIRI universe into the runtime scanner
Those are not missing features.
They are the boundaries that keep the layer useful.
Where This Could Go
The next useful direction is not to overload the scanner with external systems.
It is to improve:
- registry provenance
- bundle governance
- mapping confidence
- known-gap clarity
- artifact-visible mapping metadata
- disciplined links to incident-oriented resources
The broader MIT AIRI ecosystem also includes related incident-oriented resources such as the AI Incident Tracker.
That ecosystem is relevant context, but it is not the same thing as current runtime integration in STEM BIO-AI.
A future version may choose to reference incident-oriented resources more explicitly, but deterministic scans should not ingest them casually or blur them with repository-local findings.
A future version should be able to say not only:
this detector maps to this AIRI risk vocabulary area.
But also:
this mapping has this confidence level, this review status, this local evidence family, and this known limitation.
That is the next governance step.
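What such an extended mapping record might look like can be sketched as a small data structure. Every field name here is an assumption about a possible future schema, not an implemented one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MappingRecord:
    """Hypothetical future shape of one detector-to-AIRI mapping entry.
    All field names are assumptions about a possible schema."""
    detector_id: str       # local detector surface that triggers the mapping
    airi_risk_id: str      # AIRI vocabulary anchor, e.g. "7.3"
    confidence: str        # e.g. "low" | "medium" | "high"
    review_status: str     # e.g. "unreviewed" | "maintainer-reviewed"
    evidence_family: str   # local evidence family backing the mapping
    known_limitation: str  # explicit caveat carried into reports
```

Making the record frozen fits the governance posture: a mapping entry changes by publishing a new versioned entry, not by mutating an old one in place.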
Final Thought

That is the role of AIRI in this release line.
Not truth replacement.
Not safety certification.
Not incident proof.
A governed vocabulary bridge.
Local evidence first.
External vocabulary second.
Explicit provenance always.
References and Acknowledgment
- MIT AI Risk Repository: https://airisk.mit.edu/
- MIT AI Incident Tracker: https://airisk.mit.edu/ai-incident-tracker
- STEM BIO-AI repository: https://github.com/flamehaven01/STEM-BIO-AI
This AIRI-related direction in STEM BIO-AI was informed by broader public AI risk work, including the MIT AI Risk Repository ecosystem.
The framing around AIRI as a broader risk-vocabulary layer, rather than a repository-local truth layer, was also strengthened by public commentary and ecosystem work from people in this space, including Peter Slattery, PhD.
These references informed the vocabulary and governance direction described here. They do not imply endorsement of STEM BIO-AI or responsibility for its implementation choices.
Next Step
If your AI system works in demos but still feels fragile, start here.
Flamehaven reviews where AI systems overclaim, drift quietly, or remain operationally fragile under real conditions. Start with a direct technical conversation or review how the work is structured before you reach out.
Direct founder contact · Response within 1-2 business days