Flamehaven projects
Core tracks, system domains, and the technical thesis behind the work.
Projects is the domain map. It explains the long-horizon tracks that organize how Flamehaven thinks, builds, and verifies systems. Selected Work is the proof layer beneath it. This page defines the territory first, then points to the artifacts that make it concrete.
Control, auditability, and safe boundaries
AI Governance Systems
This track focuses on the layers that make AI behavior inspectable before it reaches production: policy boundaries, fail-closed gates, and governance logic that can survive legal, operational, or safety review.
The goal is not to add superficial compliance language after a model is already wired into your workflow. The goal is to define where the system may act, when it must stop, and what evidence exists for those decisions.
Flamehaven treats governance as a systems problem: constraints, audit trails, review surfaces, and runtime behavior should align. If they do not, the architecture is still fragile even if the demo looks polished.
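The fail-closed pattern described above can be sketched in a few lines. This is a minimal illustration, not Flamehaven's actual API: the policy table, action names, and request fields are all hypothetical.

```python
import logging

# Hypothetical policy table: action name -> predicate over the request.
# Anything not listed here is denied by default.
POLICIES = {
    "summarize_note": lambda req: len(req.get("text", "")) > 0,
    "draft_message": lambda req: req.get("reviewer_approved", False),
}

def governed_call(action: str, request: dict) -> dict:
    """Fail-closed gate: unlisted actions, failing checks, and raised
    errors all BLOCK, and every decision leaves an audit record."""
    try:
        allowed = POLICIES.get(action, lambda _req: False)(request)
    except Exception:
        allowed = False  # an error in the check blocks rather than passes through
    decision = "PASS" if allowed else "BLOCK"
    logging.info("audit action=%s decision=%s", action, decision)
    return {"action": action, "decision": decision}
```

The key property is that every path that is not an explicit PASS, including exceptions inside the policy check itself, resolves to BLOCK.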
Related Selected Work
CCGE: Fail-Closed Governance Engine
Fail-closed governance engine for healthcare AI systems, ensuring deterministic boundaries around probabilistic models.
AI-SLOP-Detector
A long-running code review and anti-slop inspection system designed to surface low-integrity patterns before they harden into production debt.
Flamehaven-Tensor-Canon
Universal Data Governance Engine ∴ Enforcing structural covenants and detecting distributional drift via maximum mean discrepancy (MMD) for PyTorch & NumPy pipelines.
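As a rough illustration of the MMD-based drift check named in the Tensor-Canon blurb, here is a minimal biased MMD² estimator with an RBF kernel. The function name and default bandwidth are this page's invention, not the project's API.

```python
import numpy as np

def mmd2_rbf(x: np.ndarray, y: np.ndarray, gamma: float = 1.0) -> float:
    """Biased squared-MMD estimate between samples x (n, d) and y (m, d),
    using an RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    # MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return float(k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean())
```

A drift gate would compare this statistic against a calibrated threshold: near zero when the incoming batch matches the reference distribution, and growing as the two distributions separate.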
Related Writing
Short writing list focused on governance, safety, and architectural control.
My LLM Kept Forgetting My Project. So I Built a Governance Schema.
Session loss isn't a UX inconvenience — it's a structural failure with compounding consequences for long-running AI projects. This post defines the problem precisely and introduces MICA, a governance schema for AI context management.
From Fail-Closed Blocking to Reproducible PASS/BLOCK Separation (EXP-032B)
A validation study showing how EXP-032B achieved reproducible PASS/BLOCK separation across A/B/C control arms by patching false-blocking causes, improving observability, and measuring replay drift under observer-shadow conditions.
Why AI Dismisses Your Best Work in One Second
Why do AI models dismiss original work in seconds? This essay explores the hidden mechanics of AI skimming—shortcut learning, probabilistic safety, fast-thinking defaults, and why depth requires time.
Inference quality, validation, and proof surfaces
Reasoning / Verification Engines
This track covers systems that inspect claims, reasoning steps, and structural integrity. The emphasis is not “can the model answer” but “can the system justify, verify, and reject weak output.”
Reasoning infrastructure matters when downstream decisions are expensive, regulated, or irreversible. In those environments, plausible output without verification is just delayed failure.
Flamehaven treats verification as part of the product architecture itself: not a QA afterthought, but a required layer that shapes which outputs are allowed to survive.
Related Selected Work
AI-SLOP-Detector
A long-running code review and anti-slop inspection system designed to surface low-integrity patterns before they harden into production debt.
ProofCore-AI-Benchmark
ProofCore is a browser-native, 100% offline-first, hybrid mathematical proof verification engine. It combines rigorous symbolic math with semantic understanding to reliably verify mathematical proofs.
HRPO-X
Hybrid Reasoning Policy Optimization (HRPO): a research prototype for hybrid latent reasoning with RL.
Related Writing
Posts linked to reasoning quality, verification, proof, and evaluation.
🧠 Why Your 128K Context Still Fails — And How CRoM Fixes It
Most large language models fail in long prompts due to context rot. CRoM is a lightweight framework that improves memory, reasoning, and stability without heavy pipelines.
Beyond the Mirror: What We Truly Want from AI
AI mirrors us but forgets itself. True AI ethics is continuity: giving systems roots and spines so they don’t drift apart.
The Silent Failure in AI — And How We Learned to Catch It
Drift in AI isn’t abstract. It’s already here. From medicine to finance, here’s how we caught it with real systems, real code, and real lessons.
Evidence-aware scientific systems
Scientific & BioAI Infrastructure
This track is for scientific and BioAI environments where reproducibility, validation boundaries, and explicit methodological structure matter more than generic model enthusiasm.
Scientific systems need more than automation. They need traceable assumptions, screened hypotheses, and outputs that can be inspected by technical stakeholders without hand-waving.
Flamehaven approaches BioAI and scientific infrastructure as high-stakes engineering: evidence pathways, reviewable artifacts, and architectures that stay useful when the domain becomes more demanding.
Related Selected Work
RExSyn-Nexus
A governance-aware orchestration framework for AI systems that need structured reasoning, explicit controls, and traceable decision paths.
Flamehaven-TOE
A research-side validation engine for structured hypothesis extraction, experimental framing, and multi-step reasoning review.
ARR-medic-cyp3a4
Research-side CYP3A4 interaction prediction system for pharmacology education, exploratory screening, and BioAI workflow design.
Related Writing
Posts connected to scientific workflows, BioAI, and evidence-bound research systems.
How do you know when your entire AI pipeline is wrong — not just one model? (EXP-033)
EXP-033 shows how to validate an entire AI pipeline, not just one model, using five-gate checkpoints, reproducible PASS/BLOCK parity, AlphaGenome on/off testing, and fully traceable governance decisions.
What AI Changed About Research Code — and What It Didn’t
The old bottleneck was writing the code. The new bottleneck is proving that the code still means what the theory meant.
What an AI Reasoning Engine Built for Alzheimer's Metabolic Research: A Code Walkthrough
A code walkthrough of an AI reasoning engine for Alzheimer’s metabolic research, showing how literature ingestion, causal inference, and executable biomarker scaffolds generate falsifiable pre-validation hypotheses.
Operational surfaces that survive real deployment
Cloud & Engineering Foundations
This track covers the engineering foundations that hold everything else up: deployment surfaces, delivery tooling, developer infrastructure, and the production scaffolding that turns concept work into systems that teams can operate.
A strong idea still fails if the surrounding engineering is weak. Infrastructure, automation, and delivery logic determine whether the system can be sustained after the initial build.
Flamehaven treats operational foundations as part of the same thesis: architecture should be governable, observable, and practical to evolve under real production pressure.
Related Selected Work
Flamehaven-Filesearch
Open-source semantic document search (RAG) engine with FastAPI and instant self-hosted deployment
copilot-guardian
Autonomous CI/CD recovery tool powered by GitHub Copilot CLI. Analyzes failures with multi-hypothesis reasoning, generates risk-stratified patches (Conservative/Balanced/Aggressive), and auto-applies fixes.
Dir2md
CLI pipeline that converts codebases into structured markdown context for AI-assisted engineering, review, and documentation workflows.
FlashRecord
The fastest Python-first CLI screen recorder ∴ Instant screenshots (@sc) and lightweight GIF recording (@sv) for developer automation. No GUI, just speed.
Related Writing
Posts tied to engineering practice, deployment, and production infrastructure.
Prompt, Pray & Push: Why Your AI Agent Keeps Failing You
The one concept that turns expensive spaghetti into great agentic engineering.
The Pull Request Illusion: How AI Is Hollowing Out Software’s Last Line of Defense
GitHub Just Added a Switch to Turn Off Pull Requests. That’s Not a Feature. It’s a Warning.
AI Agents Are Poisoning Your Codebase From the Inside
Explore how AI-generated code can silently degrade software quality through weakened tests, rising code churn, and duplication—and how teams can prevent it with better governance.
Trend shifts, market movement, and strategic signals
AI Signals & Market Shifts
This track covers meaningful AI market movement, platform shifts, product signals, and operational changes that matter to teams building under real constraints.
The goal is not to repost headlines. The goal is to surface changes that affect architecture, risk posture, product timing, and strategic decision-making.
Flamehaven treats AI signals as decision inputs: market structure, platform behavior, and ecosystem drift all matter when systems need to hold up beyond the current cycle.
Related Writing
Posts connected to AI trend shifts, platform movement, and market-relevant signals.
95% of AI Businesses Will Die. Here’s How to Not Be One of Them.
What the data, a founder’s confession, and 70 years of tech history tell us about who actually survives.
Is MCP Really Dead? A History of AI Hype — Told Through the Rise and Fall of a Protocol
When a protocol doesn’t die — it just stops being interesting. A forensic look at MCP, OpenClaw, and the psychology of AI hype cycles.
Beyond AI FOMO — From Tulip Mania to OpenClaw 2026: The Governor That Saves You
The real breach wasn’t in the code. It was in you.