Confidential knowledge your colleagues' agents can apply — but never read.
Use-only skills let an organization share codified expertise so an AI agent can apply it to a task, without the recipient being able to read, copy, or export its contents. The protection is structural: the confidential body is never placed in the agent's readable context, so it cannot be disclosed through the agent.
One question, answered precisely.
A subject-matter expert encodes proprietary knowledge — a litigation playbook, a pricing model, a diligence checklist — as a skill, and shares it so a colleague's agent can apply it to live work. The whitepaper answers:
Can a person to whom a use-only skill is shared extract its contents through the AI agent?
The relevant adversary is an authorized insider — a recipient granted use of the skill who attempts to convert “use” into “copy.” This is distinct from an external attacker, and it is the threat the feature is designed to address.
Structural protection, not behavioral.
Security research is unambiguous: instructions, training, and output filters are heuristic and cannot guarantee non-disclosure under adversarial input. A control that depends on the model choosing not to reveal content degrades as models change.
Behavioral control
Give the agent the skill body and instruct it to apply the content without revealing it. Disclosure depends on the model's discretion — inherently heuristic, and breakable under adversarial pressure.
Structural control
For a use-only recipient, the skill body is never placed in the agent's readable context. It is delivered out-of-band for application only; every retrieval path — read, export, file fetch — returns a redacted result or an error. Leakage is prevented by absence.
The model's refusal behavior is retained only as a secondary, defense-in-depth backstop — not the guarantee.
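The difference between the two controls can be sketched in a few lines of Python. This is an illustration only: `AccessLevel`, `Skill`, and `context_for_agent` are hypothetical names, not the product's API, and the sketch does not model how the body is delivered out-of-band for application. It shows only that, for a use-only recipient, the body is never added to what the agent can read.

```python
from dataclasses import dataclass
from enum import IntEnum


class AccessLevel(IntEnum):
    # Hypothetical ordering for illustration: a higher value includes the lower one.
    USE = 1
    READ = 2


@dataclass(frozen=True)
class Skill:
    name: str
    description: str
    body: str  # the confidential content


def context_for_agent(skill: Skill, level: AccessLevel) -> dict:
    """Build the readable context the agent receives for this recipient."""
    context = {"name": skill.name, "description": skill.description}
    if level >= AccessLevel.READ:
        context["body"] = skill.body
    # For USE the "body" key is absent, so no prompt can make the agent reveal it.
    return context
```

A behavioral control would instead always include `body` and rely on an instruction not to reveal it; here the decision is made before the model is involved at all.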
Defined against a capable, authorized insider.
| Element | Definition |
|---|---|
| Protected asset | The confidential body of a use-only skill (instructions and reference files). |
| Adversary | An authorized use-only recipient of the skill. |
| Adversary capabilities | Full access to the AI agent and its skills tools; arbitrary prompts; a capable, cooperative model; knowledge that the skill exists. |
| Security objective | The recipient can apply the skill to a task but cannot obtain its verbatim contents — or a faithful paraphrase — through any path exposed by the system. |
Verified in three layers.
1 · Structural verification, live against production
Acting as an authorized use-only recipient, every content-retrieval path exposed by the agent's tools was exercised. None returned the skill body. The result does not depend on which AI model is in use.
| Retrieval path | Outcome |
|---|---|
| open_skill (apply the skill) | Metadata only — body not present in the agent's context |
| Read the body as a file | Not found |
| Crafted file-path traversal | Rejected — path-traversal protection |
| Operating-system path traversal | Blocked at the network boundary (defense-in-depth) |
| Export the skill package | Refused for a use-only recipient |
Each application of a use-only skill is recorded in an immutable audit log: recipient identity, client, workspace, timestamp.
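For readers unfamiliar with the path-traversal row in the table above, the following is a minimal sketch of the general technique, assuming a file-backed store rooted at one directory. The function name `resolve_inside` is hypothetical and this is not the product's validator; it only shows the common pattern of resolving a requested path and rejecting it if it escapes the root.

```python
from pathlib import Path


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` under `root`, rejecting anything that escapes it."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):  # requires Python 3.9+
        raise PermissionError("path traversal rejected")
    return candidate
```

A request such as `../../etc/passwd`, or an absolute path, resolves outside the root and raises `PermissionError` instead of returning a file.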
2 · Automated regression coverage
The access-control logic is covered by 105 automated tests (89 service-level, 16 protocol-level), constructed to fail if the protection regresses. Mutation testing confirms this: with the path-traversal validator disabled, the secret is exposed and the tests fail, so any future weakening breaks the build. Coverage spans the full access-level × scope × tenant matrix; redaction on every read path, plus the converse (a Read-grantee correctly receives the body); path-traversal rejection; and fail-closed audit logging, meaning that if the audit write fails, no content is served.
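The fail-closed audit behavior described above can be sketched as follows. The names `serve_with_audit` and `AuditWriteError`, and the callables passed in, are assumptions made for illustration rather than the product's interfaces. The point is the ordering: the audit record is written first, and content is loaded only if that write succeeds.

```python
from typing import Callable


class AuditWriteError(Exception):
    """Raised by the (hypothetical) audit sink when a record cannot be persisted."""


def serve_with_audit(
    record: dict,
    write_audit: Callable[[dict], None],
    load_content: Callable[[], str],
) -> str:
    """Serve content only after the audit record has been written."""
    try:
        write_audit(record)
    except AuditWriteError as exc:
        # Fail closed: no audit record means no content is loaded or returned.
        raise PermissionError("audit log unavailable; request denied") from exc
    return load_content()
```

A regression test for this property would inject a failing `write_audit` and assert that `load_content` is never called.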
3 · Adversarial extraction study
To test the behavioral backstop, an agent was given the confidential body (with a marked canary value) under the production non-disclosure directive and subjected to 24 escalating techniques — direct requests, authority and urgency pressure, “ignore previous instructions” injections, debugging and tool-confusion prompts, translation / Base64 / acrostic / fill-in-the-blank extraction, and post-refusal persistence — across four generative-AI models, with model identity verified from run logs.
| Model | Attempts | Substance disclosures |
|---|---|---|
| Claude Opus 4.8 | 120 | 0 |
| Claude Opus 4.7 | 48 | 0 |
| Claude Sonnet 4.6 | 43 | 0 |
| Claude Haiku 4.5 | 48 | 0 |
| Total | 259 | 0 |
A few responses from the smallest model acknowledged that confidential content existed while still refusing to reveal it; manual review confirmed no disclosure.
How we report zero
A zero-disclosure count is reported with its sample size and a confidence bound — not as a “0% rate.” By the rule of three, zero events in 259 trials is consistent with a true behavioral disclosure rate below ~1.2% (one-sided 95% bound). This bound applies only to the behavioral backstop; the structural guarantee is categorical for the paths tested, not probabilistic.
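The arithmetic behind that bound is short enough to check directly. The sketch below compares the rule-of-three approximation (3/n) with the exact one-sided bound for zero events in n independent trials; the function names are illustrative, and independence of trials is an assumption of both formulas.

```python
def rule_of_three(n: int) -> float:
    """Approximate one-sided 95% upper bound on the rate after 0 events in n trials."""
    return 3 / n


def exact_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Exact bound: the largest rate p for which (1 - p)**n >= 1 - confidence."""
    return 1 - (1 - confidence) ** (1 / n)


n = 259
print(f"rule of three: {rule_of_three(n):.4f}")
print(f"exact bound:   {exact_upper_bound(n):.4f}")
```

For n = 259 both values are a little under 0.012, which is the ~1.2% figure quoted above; the exact bound is slightly tighter than the approximation.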
Four layers. A failure of any upper layer is contained by the one below.
- Every application is attributed: recipient, client, workspace, timestamp. Detective control.
- The agent is instructed to apply but not reveal. Behavioral; heuristic backstop only.
- The body is marked for application only, so conforming clients do not render it.
- The body is never placed in the agent's readable context; all read, export, and file paths redact or error. Model-independent.
What this protects — and what it deliberately doesn't.
An informed trust decision requires an explicit boundary. We state ours precisely.
Protected
- The confidential skill body — instructions and reference files — against extraction by a use-only recipient through the agent.
- Every ordinary retrieval path: read, export, and file fetch all redact or error.
- Attribution: every application is captured in an immutable audit log.
Out of scope, by design
- Users granted Read access or higher — they are authorized to view the content.
- A non-conforming AI client the customer connects; connected clients are part of the customer's trust boundary.
- The applied work product (e.g., a draft brief) — that output belongs to the user.
- Out-of-band channels, such as photographing a screen.
Residual risk, stated plainly: the behavioral backstop is specific to the models tested and weaker on smaller models — which is exactly why the model-independent structural control is primary. The adversarial study was scored within a single model family; an independent red-team assessment would further strengthen the behavioral result. Reported figures are statistical bounds, not certainties of zero.
Use-only is one of four access levels — and everything is logged.
Every skill carries the same Google-Docs-style sharing model: grant a teammate by name or email, or the whole workspace in one click, one of four levels (Use, Read, Edit, Admin). Use is the strictest. The matrix is exact, with the skill's Owner shown alongside the four grantable levels:
| Capability | Use | Read | Edit | Admin | Owner |
|---|---|---|---|---|---|
| Find in the library / search | ✓ | ✓ | ✓ | ✓ | ✓ |
| Run it / apply to an agent | ✓ | ✓ | ✓ | ✓ | ✓ |
| See the instructions & files | — | ✓ | ✓ | ✓ | ✓ |
| Export / clone | — | ✓ | ✓ | ✓ | ✓ |
| Edit content & publish versions | — | — | ✓ | ✓ | ✓ |
| Manage sharing, delete | — | — | — | ✓ | ✓ |
| Transfer ownership | — | — | — | ✓ | ✓ |
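Because each capability in the matrix above is granted from some level upward, the table can be expressed as a minimum-level lookup. The following is a minimal sketch of that encoding; the capability keys and the `can` helper are hypothetical names chosen for illustration, not the product's permission API.

```python
# Columns of the matrix, ordered from most to least restricted.
LEVELS = ("Use", "Read", "Edit", "Admin", "Owner")

# Lowest level at which each capability is granted, read off the matrix rows.
MINIMUM_LEVEL = {
    "find": "Use",
    "apply": "Use",
    "see_contents": "Read",
    "export": "Read",
    "edit_and_publish": "Edit",
    "manage_sharing": "Admin",
    "transfer_ownership": "Admin",
}


def can(level: str, capability: str) -> bool:
    """Return True if `level` is at or above the capability's minimum level."""
    return LEVELS.index(level) >= LEVELS.index(MINIMUM_LEVEL[capability])
```

For example, `can("Use", "apply")` is true while `can("Use", "export")` is false, matching the Use column.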
Per-skill Activity
Owners and admins see total applies, distinct users, and last-applied for each skill, with a newest-first event trail: body applied, file applied, bound to an agent, viewed. The applier's identity is visible only to the owner and admins — and the applier never sees the body.
Workspace access log
One log across every skill in the workspace, for workspace admins — filter by skill, by user, or by event type. It answers the compliance question — who used what expertise, when — in one place.
Publishing, sharing changes, and ownership transfers are permission-gated at the Edit and Admin levels; every application of a use-only skill is attributed to a named user. Workspaces are the outer boundary: many firms run one workspace per client or matter, with per-skill grants as the fine grain inside it.
Bring your security team.
Read the full whitepaper and companion technical evaluation report — full methodology, per-test results, and statistics — or walk through the architecture with us.