The staged autonomy pattern ("trust is earnable") maps directly to what we built with protect-mcp — shadow mode first (log everything, block nothing), then enforce when you've seen enough data to trust the policies.
For the prompt injection concern: protect-mcp wraps MCP tool calls with per-tool policies. Even if the agent gets injected, it can't call tools outside the policy. Every decision is optionally Ed25519-signed and verifiable offline.
For the prompt injection concern: protect-mcp wraps MCP tool calls with per-tool policies. Even if the agent gets injected, it can't call tools outside the policy. Every decision is optionally Ed25519-signed and verifiable offline.
npmjs.com/package/protect-mcp