Skip to main content

Physical Harnessing: Why OTP Beats Policy

Every AI safety mechanism in 2026 has one thing in common: it lives inside the AI’s execution environment. YAML policies. Prompt instructions. Configuration files. Sandbox rules. The AI can read them. The AI runs inside them. And what AI can read, AI can — in theory — circumvent. NIIA’s OTP is different. It lives outside the AI’s world entirely.

The Problem with Software Harnessing

YAML policy (NVIDIA OpenShell):
  File on disk → AI's sandbox reads it → enforces rules
  Attack: modify the file, escape the sandbox, exploit the policy parser

Prompt rules (Claude Computer Use):
  System prompt → "always ask permission before accessing new apps"
  Attack: prompt injection overrides system prompt

Config file (Codex sandbox):
  JSON config → defines writable paths, network rules
  Attack: the AI generates a command that the policy doesn't anticipate

All three:
  The constraint and the constrained run in the SAME environment.
  Sophisticated enough AI can find the gap.
This isn’t theoretical. 80% of organizations in 2026 report risky agent behaviors including unauthorized system access. The governance-containment gap is the defining security challenge of the year.

Physical vs Software Constraints

Software constraint:
  Rule lives in AI's environment → AI can potentially interact with it
  Strength: as strong as the implementation
  Weakness: same-environment attacks (injection, escape, manipulation)

Physical constraint:
  Rule lives OUTSIDE AI's environment → AI cannot interact with it at all
  Strength: absolute (within the constraint's domain)
  Weakness: requires human action (slower)
A locked door is harder to bypass than a “please don’t enter” sign, no matter how sophisticated the reader is.

How NIIA OTP Works

1. AI (or human) requests elevated access:
   $ niia control unlock --scope full --duration 1h

2. NIIA server sends OTP to human's email:
   [email] Your OTP: 847291 (expires in 5 minutes)

3. Human enters OTP in terminal:
   $ Enter OTP: 847291

4. Access granted for specified scope and duration:
   [unlocked] scope=full, expires=2026-04-01T12:00:00

5. After duration expires:
   [locked] All control commands blocked again.
The OTP travels through a channel the AI cannot access:
  • The human’s email account
  • On the human’s device
  • Protected by the human’s authentication
No amount of prompt injection, jailbreaking, or policy manipulation can extract an OTP from a human’s email inbox.

Attack Surface Comparison

Attack VectorYAML PolicyPrompt RulesConfig FileNIIA OTP
Prompt injection⚠️ indirect❌ bypassed⚠️ indirect✅ immune
Policy file modification❌ bypassedN/A❌ bypassed✅ immune
Environment variable manipulation⚠️ possible⚠️ possible⚠️ possible✅ immune
AI self-approval⚠️ possible❌ bypassed⚠️ possible✅ impossible
AI delegating approval to another AI⚠️ possible⚠️ possible⚠️ possible✅ impossible
Social engineering the humanN/AN/AN/A⚠️ possible
The only attack vector against OTP is social engineering — tricking the human. That’s not an AI safety problem. That’s a human security problem. And it exists for every authentication system, not just AI harnessing.

Two-Layer Architecture

NIIA separates capability from control into two independent binaries:
kernel-cli (raw capability):
  Can press any button. Type any text. Run any script.
  No restrictions. Full OS access.
  = The engine.

niia observe/control (harness):
  OTP-gated. Time-limited. Scope-limited. Audit-logged.
  Same capabilities, but locked behind physical key.
  = The steering wheel + brakes + seatbelt.
Other tools:
  ┌─────────────────────────────┐
  │  Capability + Constraint    │  ← same binary, same environment
  │  (AI can see both)          │
  └─────────────────────────────┘

NIIA:
  ┌─────────────────┐  ┌──────────────────┐
  │  kernel-cli     │  │  niia control    │
  │  (capability)   │  │  (harness)       │
  │  raw, free      │  │  OTP-gated       │
  └─────────────────┘  └──────────────────┘
          ↑                      ↑
    AI uses this          Human unlocks this
    (after unlock)        (from email/device)
The AI never touches the harness. The AI uses the capability. The human controls the harness. The harness gates the capability.

Why This Matters More Over Time

2024: AI is helpful but limited.
      Software constraints are sufficient.
      AI can't bypass a well-written YAML policy.

2025: AI is capable and autonomous.
      Software constraints mostly work.
      Edge cases emerge. Some policies are circumvented.

2026: AI agents control desktops, write code, deploy to production.
      Software constraints are necessary but not sufficient.
      80% of orgs report risky agent behavior.

2027+: AI is deeply integrated into infrastructure.
       Software constraints become the weak link.
       Physical constraints become the standard.
The smarter AI gets, the better it gets at finding gaps in software constraints. Physical constraints don’t have gaps — they have locks.

In connector.json

{
  "agents": [
    {
      "model": "claude",
      "capabilities": {
        "pty": true,
        "kernel": {
          "observe": true,
          "control": {
            "allowed": ["ax-press", "key", "type"],
            "requires_otp": true,
            "scope": "deploy-workflow",
            "max_duration_ms": 3600000
          }
        }
      }
    }
  ]
}
The connector.json declares what the agent CAN do. The OTP determines WHETHER it can do it right now. The declaration is software. The gate is physical.

Summary

Every harness in 2026:     AI reads the rules → rules constrain AI
NIIA OTP:                  Human holds the key → key unlocks capability

Rules can be circumvented.
Keys cannot be forged.

This is the difference between
  "the AI is told not to" and
  "the AI physically cannot."