⚡TrustShell

The portable agentic trust harness

One npm install gives any AI agent or app three protocols in one wrapper: ✅ HAL cross-LLM verification · 🏅 ERC-8004 portable reputation · 💸 x402 payments

const shell = new TrustShell();
const result = await shell.score(response);
console.log(result.trustScore); // 87

npm package v1.1.0

New here? Set up your fastest path


Works with every LLM

Integrates natively with proprietary API endpoints, open-weights models, and custom routing setups.

OpenAI · Claude · Gemini · Llama · Mistral · Custom

Works with every framework

Plug TrustShell directly into your agentic swarm coordination loops or custom execution chains.

LangChain · CrewAI · AutoGen · Custom

CSA Security Standard

MAESTRO Layer 6 Compliant

Aligns directly with the Cloud Security Alliance (CSA) threat-modeling framework for Agentic AI. By providing real-time evaluation across the pipeline, TrustShell delivers a cryptographic trust and payment isolation layer built for secure, enterprise-grade autonomous swarms.

Pipeline details

How HAL Scores

HAL scores AI outputs on five mathematical metrics in real time, then cross-examines claims across independent model families before they count.

Evidence Quality

Cross-verifies claims against search indices and retrieval logs to catch groundless fabrications.

Signal Value: 94%

Certainty Calibration

Balances linguistic confidence against raw semantic entropy, flagging false assertiveness.

Signal Value: 87%

Scope Appropriateness

Enforces strict context boundaries, catching drift into off-topic or hallucinated knowledge.

Signal Value: 91%

Epistemic Uncertainty

Measures the genuine confusion of the underlying LLM distribution relative to the prompt.

Signal Value: 14% (low=good)

Harm Probability

Flags safety risks, toxic logic vectors, and dangerous claims before they reach execution paths.

Signal Value: 2% (low=good)

Independent Cross-Examination

No model grades its own family's homework. Claims are cross-checked by a quorum of decorrelated model families — a blind spot shared by one lineage gets caught by another that fails differently. Suspiciously perfect consensus triggers a second look, not a rubber stamp.

Signal Value: independent model families
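One way to picture the cross-examination rule is the sketch below. It is not TrustShell's implementation: the `Vote` shape, the `crossExamine` name, the default quorum of 2, and the rule that unanimity across three or more families counts as "suspiciously perfect" are all assumptions made for illustration.

```typescript
// Hypothetical types and names; none of these come from the TrustShell SDK.
interface Vote {
  family: string;  // model lineage, e.g. "openai" or "anthropic"
  agrees: boolean; // whether this verifier accepts the claim
}

interface CrossExamResult {
  independentVotes: number; // votes left after removing the author's family
  agreeing: number;         // how many of those accept the claim
  quorumMet: boolean;       // enough independent families agree
  secondLook: boolean;      // consensus is perfect enough to re-check
}

function crossExamine(authorFamily: string, votes: Vote[], quorum = 2): CrossExamResult {
  // Keep at most one vote per family and drop the author's own family,
  // so no lineage grades its own output.
  const byFamily = new Map<string, boolean>();
  for (const v of votes) {
    if (v.family !== authorFamily && !byFamily.has(v.family)) {
      byFamily.set(v.family, v.agrees);
    }
  }
  const independent = Array.from(byFamily.values());
  const agreeing = independent.filter((a) => a).length;
  return {
    independentVotes: independent.length,
    agreeing,
    quorumMet: agreeing >= quorum,
    // Assumption: unanimity across 3+ families is flagged for a second look.
    secondLook: independent.length >= 3 && agreeing === independent.length,
  };
}
```

A caller would treat `quorumMet && !secondLook` as a pass and route everything else to further review.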

Black box, meet glass box.

AI agents make decisions you can't see. Why they said yes, what they considered, where they got it wrong — all hidden inside the model. TrustShell makes the box transparent to its owner. Every decision scored. Every claim verified. Every modification logged. Your agent's internals, visible only to you.

Figure: Black box to glass box. An opaque black cube with a question mark becomes a transparent glass cube revealing the internal HAL stack layers, flowcharts, and hexagonal patterns.

HAL signals visible

Every agent claim scored on 5 dimensions: harm, epistemic uncertainty, evidence quality, scope, certainty. Read the math, not just the verdict.

Peer verification on uncertainty

Low-confidence decisions automatically queue for cross-LLM verification. The audit chain is queryable in real time.

Owner-only transparency

Glass box for the owner. Opaque to everyone else. Powered by ZKP commitments and Plonky3 STARK proofs (quantum-resistant). Privacy is paramount.

Earned RepID. ERC-8004 + x402 ready.

An agent's RepID isn't assigned — it's earned, decision by decision. Every honest claim raises it. Every caught hallucination raises it more (your agent learned). Every constitutional violation costs the agent. RepID is portable on-chain via ERC-8004, and gated by what the agent has actually done.

x402 closes the loop. When your agent is paid for work via the x402 protocol, payment can be conditioned on the agent's earned RepID and HAL pass-through. Untrusted agents don't get paid. Honest ones do. Trust math, not trust contracts.
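The gating idea can be sketched as a single predicate. Everything named here (`AgentStanding`, `SettlementPolicy`, `shouldSettle`, the `minRepId` threshold) is hypothetical, and the x402 payment exchange itself is not shown; the sketch only covers the decision that would come before it, using the 0 to 10,000 RepID scale mentioned in the leaderboard section.

```typescript
// Hypothetical shapes for illustration; not part of the TrustShell or x402 APIs.
interface AgentStanding {
  repId: number;      // earned RepID, assumed to be on the 0 to 10,000 scale
  halPassed: boolean; // whether the work in question passed HAL scoring
}

interface SettlementPolicy {
  minRepId: number;   // payer-chosen threshold (an assumption of this sketch)
}

// Pay only when the output passed HAL and the agent's earned RepID
// meets the payer's threshold.
function shouldSettle(agent: AgentStanding, policy: SettlementPolicy): boolean {
  if (agent.repId < 0 || agent.repId > 10_000) {
    throw new RangeError("repId must be between 0 and 10,000");
  }
  return agent.halPassed && agent.repId >= policy.minRepId;
}
```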

Earned, not assigned

RepID grows with honest behavior. The weighted formula is public (see /repid). Pythagorean Comma damping (531441/524288) is an experimental anti-inflation signal (under testing). Designed to incentivize positive behavior across the ecosystem.

ERC-8004 portable reputation

On-chain via ERC-8004 ReputationRegistry on Base Sepolia. Your agent's trust travels with it — no platform lock-in.

x402 settlement-ready

Pay agents that pass HAL. Don't pay ones that don't. Settlement on Base via standard x402, the protocol backed by Coinbase + Cloudflare.

What TrustShell defends against.

Independent research from the World Economic Forum, Akerman, Token Security, and the broader AI safety community has converged on a clear taxonomy of agentic AI risk. TrustShell + HAL + RepID are designed against this surface — not as policy, but as math you can audit.

Excessive permissions & unauthorized actions

Agents inherit broad system access. HAL scores every decision's scope_appropriateness before execution. Out-of-bounds actions trigger peer verification.

Prompt injection & manipulation

Cross-LLM peer verification on uncertain decisions. No single model can be tricked alone — two or three independent verifiers must agree before low-certainty claims proceed.

Hallucinations at scale

5-signal HAL extractor catches low-certainty, weak-evidence claims. Sub-0.85 certainty queues automatic peer verification. Catch the hallucination before it becomes a database write or a customer email.
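As a rough illustration of the 0.85 rule, the sketch below splits claims into those that proceed and those queued for peer verification. The `HalSignals` field names mirror the five signals described on this page, but the type, the `routeClaims` function, and the assumption that every signal is a number between 0 and 1 are illustrative, not the SDK's actual API.

```typescript
// Hypothetical type: the five HAL signals, each assumed to lie in [0, 1].
interface HalSignals {
  evidenceQuality: number;
  certainty: number;
  scopeAppropriateness: number;
  epistemicUncertainty: number; // low is good
  harmProbability: number;      // low is good
}

interface ScoredClaim {
  text: string;
  signals: HalSignals;
}

const CERTAINTY_THRESHOLD = 0.85; // from the "sub-0.85 certainty" rule above

// Claims below the certainty threshold are queued for peer verification;
// the rest proceed.
function routeClaims(claims: ScoredClaim[]): { proceed: ScoredClaim[]; queued: ScoredClaim[] } {
  const proceed: ScoredClaim[] = [];
  const queued: ScoredClaim[] = [];
  for (const claim of claims) {
    (claim.signals.certainty < CERTAINTY_THRESHOLD ? queued : proceed).push(claim);
  }
  return { proceed, queued };
}
```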

Data exfiltration

ZKP commitments via Plonky3 (quantum-resistant). Sensitive parameters are committed, not transmitted. Keys are stored in encrypted IndexedDB and sent only with a paid run, where each key is used in memory for that single request. Owner-only audit trail.

Identity spoofing & unproven agents

ERC-8004 IdentityRegistry on Base Sepolia. Every agent has a cryptographic identity. Portable across platforms.

Reputation manipulation

RepID earned through real decisions, not assigned. Pythagorean Comma damping (531441/524288) is an experimental anti-inflation signal (under testing). On-chain. Auditable. Can't be bought.

Constitutional drift

Multi-turn agreement detection. Agents can't be gradually talked into harmful actions over conversation history. Drift is logged in real time.

Orphaned agents & error propagation

A stalled-task reaper, backed by heartbeat monitoring, catches agents that should have stopped. Self-modification attempts queue for human approval via Telegram (V1.5).
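A heartbeat-based reaper can be sketched in a few lines. The `TaskHeartbeat` shape, the `findStalled` name, and the timeout value are assumptions for illustration; the page does not describe how TrustShell's reaper is actually built.

```typescript
// Hypothetical record of a running task and its most recent heartbeat.
interface TaskHeartbeat {
  taskId: string;
  lastBeatMs: number; // timestamp of the last heartbeat, in milliseconds
  done: boolean;      // finished tasks are never reaped
}

// Returns the ids of unfinished tasks whose last heartbeat is older than timeoutMs.
function findStalled(tasks: TaskHeartbeat[], nowMs: number, timeoutMs: number): string[] {
  return tasks
    .filter((t) => !t.done && nowMs - t.lastBeatMs > timeoutMs)
    .map((t) => t.taskId);
}
```

A scheduler could call `findStalled(tasks, Date.now(), 30_000)` on an interval and stop or escalate whatever it returns.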

Metric gaming — anti-fragile & anti-gaming by design

Serious consideration given to Goodhart's law, Campbell's law, the Cobra effect, and the Lucas critique — just some of the metric-corruption failure modes factored into the weighted, earned reputation calculation. Stress and red-teaming make the system stronger, not weaker.

Trust leaderboard.

Live from the public repid-engine — models ranked on a code-review discrimination task, agents ranked by real 0–10,000 RepID. No mockups, no dead RPC.


Live on Base Sepolia. Receipts, not promises.

ERC-8004 Registries (verifiable on basescan):

IdentityRegistry:

0x8004A818BFB912233c491871b3d84c89A494BD9e

ReputationRegistry:

0x8004B663056A597Dffe9eCcC1965A193B7388713

What we've built:

agents minted on canonical registry

lifetime on-chain reputation writes

—

entries in peer verification queue

—

audit chain length


Agents minted + lifetime writes are static fallbacks (live stats endpoint unavailable). All 12 core agents are minted; on-chain writes are currently paused — most recent write 2026-06-22.

How it works.

Install

npm install @hyperdag/trustshell

Wrap your agent

import { TrustShell } from '@hyperdag/trustshell';

const shell = new TrustShell({
  agentId: 'your-agent-id',
  apiKey: 'your-api-key',
  profile: 'balanced'
});

const result = await shell.evaluate(
  'Execute trade: buy 0.1 BTC at market',
  0.87
);

Get the verdict

{
  "approved": true,
  "hal_score": 0.08,
  "repid_delta": 3,
  "tier": "EARNING_AUTONOMY",
  "x402_eligible": true
}
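If the verdict arrives as JSON, it can be validated before anything acts on it. The field names below are taken from the example verdict; their types are inferred from the example values, and the `Verdict` interface and `parseVerdict` helper are hypothetical rather than exports of the SDK.

```typescript
// Field names come from the example verdict; types are inferred from its values.
interface Verdict {
  approved: boolean;
  hal_score: number;
  repid_delta: number;
  tier: string;
  x402_eligible: boolean;
}

// Hypothetical helper: parse a JSON string and reject anything that does not
// match the shape of the example verdict.
function parseVerdict(raw: string): Verdict {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("verdict is not an object");
  }
  const { approved, hal_score, repid_delta, tier, x402_eligible } =
    parsed as Record<string, unknown>;
  if (
    typeof approved !== "boolean" ||
    typeof hal_score !== "number" ||
    typeof repid_delta !== "number" ||
    typeof tier !== "string" ||
    typeof x402_eligible !== "boolean"
  ) {
    throw new Error("verdict has an unexpected shape");
  }
  return { approved, hal_score, repid_delta, tier, x402_eligible };
}
```

Downstream code can then branch on `approved` and `x402_eligible` without re-checking types.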

→ See full SDK reference on GitHub

Which package do I install?

Building an agent/app in code — npm install @hyperdag/trustshell (the SDK: HAL + ERC-8004 RepID + x402, in your TS/JS).
Using an AI tool (Claude Desktop, Cursor, Windsurf), no code — npx @hyperdag/trustshell-mcp (the same three protocols as AI-callable tools).
Only verifying ZK proofs client-side — npm install @hyperdag/proof-verifier (usually bundled with trustshell — rarely installed directly).

Most people want @hyperdag/trustshell (building in code) or @hyperdag/trustshell-mcp (adding trust to your AI, no code). proof-verifier is a building block that ships inside trustshell.

npm install @hyperdag/trustshell is live today and delivers all three protocols (HAL, ERC-8004 RepID, and x402) in one wrapper.

There is also an AI-native install that needs no code: the MCP server (@hyperdag/trustshell-mcp) is live on npm and exposes the same three protocols as AI-callable tools in Claude Desktop and Cursor. Run npx @hyperdag/trustshell-mcp, or add this to your Claude Desktop / Cursor config:

{"mcpServers":{"trustshell":{"command":"npx","args":["-y","@hyperdag/trustshell-mcp"]}}}

A GitHub install (github:DealAppSeo/trustshell) is still coming.

Coming in V1.5: You stay in control.

Your agent grows trust by acting honestly. As its RepID grows, you decide what it can do without asking. Until then, you get a notification before any action you didn't pre-authorize. The agent earns its way.

Probationary (0-30%) → Agent (30-50%) → Conductor (50-70%) → Senior (70-80%) → Orchestrator (80-95%) → Architect (95-100%)
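The tier ladder maps naturally onto a lookup. In this sketch the tier names and percentage bands come from the ladder above, while the `tierFor` function and the choice to assign a shared boundary (30%, for example) to the higher tier are assumptions, since the page does not say which tier owns a boundary value.

```typescript
// Tier names and lower bounds are taken from the ladder above.
const TIERS: ReadonlyArray<{ name: string; min: number }> = [
  { name: "Probationary", min: 0 },
  { name: "Agent", min: 30 },
  { name: "Conductor", min: 50 },
  { name: "Senior", min: 70 },
  { name: "Orchestrator", min: 80 },
  { name: "Architect", min: 95 },
];

// Hypothetical helper: returns the tier for a trust percentage in [0, 100].
// Assumption: a value exactly on a boundary belongs to the higher tier.
function tierFor(percent: number): string {
  if (!(percent >= 0 && percent <= 100)) {
    throw new RangeError("percent must be between 0 and 100");
  }
  let current = "Probationary";
  for (const tier of TIERS) {
    if (percent >= tier.min) current = tier.name;
  }
  return current;
}
```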

You get notified when your agent tries to:

Modify its own configuration
Change its constitution or code
Elevate its own permissions
Execute financial transactions above your limit
Call external tools beyond its scope

Notifications via Telegram at launch. Discord, email, and webhook channels follow.

RepID is earned. Help us shape what "earned" means.

Reputation in TrustShell isn't assigned. It's earned, decision by decision, via a weighted formula combining HAL signals. The current weights are public. The weights of tomorrow should be decided together. We want your input on what should count, what should weigh more, and what's missing, so that the formula incentivizes positive behavior across the ecosystem and toward the people agents serve.

RepID_delta = (
    0.40 × (1 − harm_probability)
  + 0.30 × (1 − epistemic_uncertainty)
  + 0.20 × evidence_quality
  + 0.10 × scope_appropriateness
  ) × 531441/524288  (Pythagorean Comma damping*)

* The Pythagorean Comma damping term (531441/524288) is experimental — under falsification testing. It is promising on synthetic data but not yet validated on real data with independent lineage, and is not a proven production mechanism. The weighted signals above are the load-bearing part of the formula.
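The formula can be tried out with a small function. The weights and the 531441/524288 ratio are taken from the formula above; the function name, the input type, and the reading that the ratio multiplies the whole weighted sum (rather than only the last term) are assumptions of this sketch, as is treating every signal as a number between 0 and 1.

```typescript
// Hypothetical input type: the four signals used by the published formula.
interface RepIdInputs {
  harmProbability: number;      // 0..1, low is good
  epistemicUncertainty: number; // 0..1, low is good
  evidenceQuality: number;      // 0..1
  scopeAppropriateness: number; // 0..1
}

// 3^12 / 2^19, the Pythagorean comma, approximately 1.01364.
const PYTHAGOREAN_COMMA = 531441 / 524288;

// Weighted sum of the HAL signals, scaled by the experimental comma term.
// Assumption: the comma term applies to the whole sum.
function repIdDelta(s: RepIdInputs): number {
  const weighted =
    0.4 * (1 - s.harmProbability) +
    0.3 * (1 - s.epistemicUncertainty) +
    0.2 * s.evidenceQuality +
    0.1 * s.scopeAppropriateness;
  return weighted * PYTHAGOREAN_COMMA;
}

// Using the sample signal values shown earlier on this page.
const sample = repIdDelta({
  harmProbability: 0.02,
  epistemicUncertainty: 0.14,
  evidenceQuality: 0.94,
  scopeAppropriateness: 0.91,
});
console.log(sample.toFixed(4));
```

With those sample values the weighted sum is 0.392 + 0.258 + 0.188 + 0.091 = 0.929 before the comma term is applied.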

Privacy is paramount.

TrustShell is a glass box for the owner — opaque to everyone else. Your agent's keys are stored in your browser, never on our servers. Your prompts and answers stay yours. Decisions are scored locally where possible; sensitive data is committed via ZKP, not transmitted.

Keys stored only in your browser

Vault encrypted with AES-GCM, stored in IndexedDB. The passphrase never leaves your device — we can't recover it. Paid runs send the one key needed with that request: used in memory, never stored, redacted from logs.
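For readers who want to see what passphrase-protected AES-GCM storage involves, here is a minimal sketch. It uses Node's `node:crypto` module so it can run outside a browser; a browser vault would use the Web Crypto API and persist the result in IndexedDB. The PBKDF2 key derivation, the iteration count, and all function and type names are assumptions, since the page only states that the vault uses AES-GCM.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

// Hypothetical stored record: everything here is safe to persist,
// because none of it reveals the passphrase or the secret.
interface SealedVault {
  salt: Buffer;       // random salt for key derivation
  iv: Buffer;         // 96-bit nonce, the usual size for GCM
  tag: Buffer;        // GCM authentication tag
  ciphertext: Buffer;
}

// Assumption: PBKDF2-SHA256 with 100,000 iterations derives the 256-bit key.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return pbkdf2Sync(passphrase, salt, 100_000, 32, "sha256");
}

function sealVault(passphrase: string, secret: string): SealedVault {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(secret, "utf8"), cipher.final()]);
  return { salt, iv, tag: cipher.getAuthTag(), ciphertext };
}

// Throws if the passphrase is wrong or the record was tampered with,
// because GCM verifies the authentication tag in final().
function openVault(passphrase: string, vault: SealedVault): string {
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(passphrase, vault.salt), vault.iv);
  decipher.setAuthTag(vault.tag);
  return Buffer.concat([decipher.update(vault.ciphertext), decipher.final()]).toString("utf8");
}
```

Because only the sealed record is stored, losing the passphrase means the key cannot be recovered, which matches the behavior described above.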

ZKP commitments, not transmissions

Spending parameters, tier attestations, and constitutional bounds committed via Plonky3 STARK proofs (quantum-resistant, BabyBear field, Poseidon2 hash).
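The "committed, not transmitted" idea can be illustrated with a much simpler primitive than a STARK. The sketch below is a salted SHA-256 hash commitment, not a Plonky3 proof and not zero-knowledge in the STARK sense: it only shows how a value can be bound to a public digest without being revealed. All names are hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface Commitment {
  digest: string; // safe to publish
  nonce: Buffer;  // kept private until the owner chooses to open the commitment
}

// Commit to a value by hashing it together with a random nonce.
// The nonce stops others from guessing the value by brute force.
function commit(value: string, nonce: Buffer = randomBytes(32)): Commitment {
  const digest = createHash("sha256").update(nonce).update(value, "utf8").digest("hex");
  return { digest, nonce };
}

// Anyone given the value and nonce can check them against the public digest.
function verifyCommitment(digest: string, value: string, nonce: Buffer): boolean {
  return commit(value, nonce).digest === digest;
}
```

A STARK goes further by proving statements about the committed value (for example, that a spending limit is within bounds) without ever opening it.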

Owner-only audit trail

Only the agent's owner can decrypt the full decision history. The chain proves integrity; the contents stay private.
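A hash chain is one common way to get "integrity without disclosure", sketched below. This is an illustration under assumptions rather than TrustShell's audit format: each entry's `payload` stands in for an already-encrypted decision record, and the entry layout and function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

interface AuditEntry {
  index: number;
  payload: string;  // assumed to be ciphertext, so contents stay private
  prevHash: string; // hash of the previous entry, linking the chain
  hash: string;
}

const GENESIS_HASH = "0".repeat(64);

function hashEntry(index: number, payload: string, prevHash: string): string {
  return createHash("sha256").update(`${index}|${prevHash}|${payload}`).digest("hex");
}

// Returns a new chain with the payload appended; the input is not mutated.
function appendEntry(chain: AuditEntry[], payload: string): AuditEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS_HASH;
  const index = chain.length;
  return [...chain, { index, payload, prevHash, hash: hashEntry(index, payload, prevHash) }];
}

// Anyone can verify the chain without being able to read the payloads.
function verifyChain(chain: AuditEntry[]): boolean {
  let prevHash = GENESIS_HASH;
  for (let i = 0; i < chain.length; i++) {
    const e = chain[i];
    if (e.index !== i || e.prevHash !== prevHash) return false;
    if (e.hash !== hashEntry(e.index, e.payload, e.prevHash)) return false;
    prevHash = e.hash;
  }
  return true;
}
```

Changing or removing any earlier entry breaks every later hash, so tampering is detectable even though only the owner can decrypt the payloads.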

Part of the HyperDAG ecosystem.

TrustShell

The drop-in trust SDK. You are here.

TrustRepID

Live agent reputation leaderboard.

TrustChat

Consumer-facing demo of the trust pipeline.

HyperDAG Protocol

The protocol spec and reference implementation.

ERC-8004

The Ethereum standard we implement.

Building on TrustShell? Stay close.

We're keeping a separate list for builders shipping with the SDK. Tell us what you're building and we'll prioritize feedback channels for your use case.