# Dynobox Documentation Corpus

This file concatenates the public Dynobox documentation into one plain-text corpus for agent ingestion.

---

# Page: Dynobox Docs

Canonical URL: https://docs.dynobox.xyz/
Markdown URL: https://docs.dynobox.xyz/README.md
Source: https://github.com/dynobox/dynobox/blob/main/docs/README.md
Topics: overview, agent testing, harnesses, assertions, config formats

# Dynobox Docs

Dynobox is a local test runner for agent and skill workflows. You describe a task, choose one or more local agent harnesses, and assert on observable behavior such as tool calls, shell commands, files in the sandbox, transcripts, HTTP requests, and final messages.

Dynobox is useful when you want repeatable checks for agent behavior before shipping a prompt, skill, or workflow change.

## Start Here

- [Getting Started](./getting-started.md): install the CLI, scaffold a dyno, and run your first scenario.
- [Config Authoring](./config-authoring.md): write JavaScript, TypeScript, or YAML dynos with the `@dynobox/sdk` helpers.
- [CLI Reference](./cli.md): commands, flags, output modes, JSON reports, and exit behavior.
- [CI Integration](./ci.md): run Dynobox in GitHub Actions and publish JSON reports as build artifacts.

## Agent Resources

The docs site publishes agent-oriented entry points for retrieval and indexing:

- [`llms.txt`](https://docs.dynobox.xyz/llms.txt): concise docs map with canonical HTML, markdown, source, package, and command references.
- [`llms-full.txt`](https://docs.dynobox.xyz/llms-full.txt): the full public docs corpus as one plain-text file.
- [`docs-index.json`](https://docs.dynobox.xyz/docs-index.json): machine-readable page metadata, topics, headings, and canonical URLs.
- Raw markdown pages such as [`getting-started.md`](https://docs.dynobox.xyz/getting-started.md) for direct ingestion without HTML parsing.

## What Dynobox Tests

Dynobox runs each scenario in an isolated temporary work directory.
Setup commands create the fixture, the selected harness performs the task, and assertions evaluate what happened.

You can assert:

- Tool calls, including expected and prohibited shell commands.
- Skill instruction loading with `skill.invoked(...)`.
- Ordered tool-call sequences.
- Files present inside the scenario work directory.
- Harness transcript and final-message text.
- HTTP requests made by local child-process tools that honor proxy environment variables.

## Supported Harnesses

Dynobox currently runs local scenarios through:

- Claude Code via the `claude` executable.
- OpenAI Codex via the `codex` executable.

Each harness must already be installed, authenticated, and available on `PATH`.

## Supported Config Formats

Dynobox discovers `*.dyno.{mjs,js,ts,mts,yaml,yml}` files recursively when you run a directory. Explicit file paths can use non-`*.dyno.*` names, such as `dynobox.config.ts`, as long as they are loadable Dynobox configs.

Supported authoring formats:

- TypeScript or JavaScript with `defineDyno(...)` from `@dynobox/sdk`.
- YAML with the same `type`-discriminated assertion objects that SDK helpers return.

CommonJS config files (`.cjs` and `.cts`) are not supported because the SDK is ESM-only.

## Current Limits

Dynobox is under active development and is currently focused on local execution. These areas are not complete yet:

- HTTP capture for harness-native web tools and binaries that ignore proxy/CA environment variables.
- Hosted or remote runner execution.
- Rich multi-iteration controls from authored configs.

---

# Page: Getting Started

Canonical URL: https://docs.dynobox.xyz/getting-started/
Markdown URL: https://docs.dynobox.xyz/getting-started.md
Source: https://github.com/dynobox/dynobox/blob/main/docs/getting-started.md
Topics: install, init, run, harnesses, debug

# Getting Started

This guide gets you from an empty project to one passing Dynobox run.

Dynobox tests live in `*.dyno.*` files.
A dyno describes a prompt, optional setup commands, one or more harnesses, and assertions about what the harness did while completing the task.

## Prerequisites

- Node.js 22 or newer.
- At least one supported local harness:
  - `claude` for Claude Code.
  - `codex` for OpenAI Codex.

The selected harness must be installed, authenticated, and available on `PATH`.

## Install

Install the CLI:

```bash
npm install -g dynobox
```

Check that it is available:

```bash
dynobox --help
```

## Create Your First Dyno

Use `dynobox init` to scaffold a starter scenario:

```bash
dynobox init
```

This writes `dynobox/example.dyno.mjs`. Run it with:

```bash
dynobox run
```

By default, `dynobox run` discovers every `*.dyno.{mjs,js,ts,mts,yaml,yml}` file under the current directory.

## Choose A Harness

Each dyno can declare its own harness list. You can also override harnesses at runtime:

```bash
dynobox run --harness claude-code
dynobox run --harness codex
dynobox run --harness claude-code,codex
```

If neither the config nor the CLI selects a harness, Dynobox defaults to `claude-code`.

## Author A Minimal Dyno

The example below asks the harness to inspect `package.json` and checks that it used a shell command, did not edit files, and mentioned the test script in the final answer.

```ts
import {artifact, defineDyno, finalMessage, tool} from '@dynobox/sdk';

export default defineDyno({
  name: 'package-script-check',
  harnesses: ['claude-code'],
  scenarios: [
    {
      name: 'detects test script',
      setup: [
        `cat > package.json <<'JSON'
{
  "name": "fixture",
  "scripts": {"test": "vitest run"}
}
JSON`,
      ],
      prompt:
        'Inspect package.json and tell me whether this project has a test script.',
      assertions: [
        tool.called('shell', {includes: 'package.json'}),
        tool.notCalled('edit_file'),
        artifact.contains('package.json', 'vitest run'),
        finalMessage.contains('test'),
      ],
    },
  ],
});
```

The same dyno can be authored in YAML:

```yaml
name: package-script-check
harnesses:
  - claude-code
scenarios:
  - name: detects test script
    prompt: >-
      Inspect package.json and tell me whether this project has a test script.
    setup:
      - |
        cat > package.json <<'JSON'
        {
          "name": "fixture",
          "scripts": {"test": "vitest run"}
        }
        JSON
    assertions:
      - label: reads package.json
        type: tool.called
        tool: shell
        command:
          includes: package.json
      - type: tool.notCalled
        tool: edit_file
      - type: artifact.contains
        path: package.json
        text: vitest run
      - type: finalMessage.contains
        text: test
```

See [Config Authoring](./config-authoring.md) for the full assertion reference.

## Run A Specific Target

`dynobox run [path]` accepts:

- No argument: discover dynos recursively under the current directory.
- Directory path: discover dynos recursively under that directory.
- File path: run one loadable Dynobox config file.

Examples:

```bash
dynobox run
dynobox run examples/local-observability
dynobox run my-skill.dyno.yaml
dynobox run dynobox.config.ts
```

Directory discovery skips hidden entries, `node_modules`, `dist`, `build`, `coverage`, `.git`, `.dynobox`, `.next`, and `.cache`.

Explicit file paths do not need to match the `*.dyno.*` naming pattern, but they still need to be loadable JavaScript, TypeScript, or YAML Dynobox configs. `.cjs` and `.cts` configs are not supported.
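
To make the directory discovery rules above concrete, here is a minimal sketch of a predicate that decides whether a path relative to the target directory would be picked up. The function name `isDiscoverable` is hypothetical, and the sketch assumes that "hidden entries" means any path segment starting with a dot; Dynobox's real implementation may differ.

```ts
// Hypothetical sketch of the documented discovery rules; not Dynobox's actual code.
const DYNO_FILE = /\.dyno\.(mjs|js|ts|mts|yaml|yml)$/;

// Non-hidden directory names that the docs list as skipped.
const SKIPPED_DIRS = new Set(['node_modules', 'dist', 'build', 'coverage']);

// Expects a path relative to the discovery root, such as "dynobox/example.dyno.mjs".
function isDiscoverable(relativePath: string): boolean {
  const segments = relativePath
    .split(/[\\/]/)
    .filter((segment) => segment !== '' && segment !== '.');
  const fileName = segments[segments.length - 1];
  if (fileName === undefined) return false;

  // Assumption: any segment starting with "." counts as hidden, which also
  // covers .git, .dynobox, .next, and .cache.
  const skipped = segments.some(
    (segment) => segment.startsWith('.') || SKIPPED_DIRS.has(segment),
  );
  return !skipped && DYNO_FILE.test(fileName);
}
```

Under these assumptions, `dynobox.config.ts` is not discovered from a directory target, which is consistent with the note that such files must be passed as explicit paths.
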

## Debug A Run

Use these flags while developing scenarios:

```bash
dynobox run --verbose
dynobox run --debug
dynobox run --reporter json
```

`--debug` includes each job's temporary work directory and writes debug logs when data is available:

- `dynobox-transcript.log`
- `dynobox-chat-history.jsonl`
- `dynobox-tool-events.json`
- `dynobox-stderr.log`

Dynobox uses each harness's normal permission behavior by default. For trusted local evals that intentionally need full access, configure `permissionMode: 'dangerous'` in the dyno or pass:

```bash
dynobox run --permission-mode dangerous
```

## Next Steps

- Write more scenarios with [Config Authoring](./config-authoring.md).
- Add Dynobox to automation with [CI Integration](./ci.md).
- Check exact flags and output fields in the [CLI Reference](./cli.md).

---

# Page: Config Authoring

Canonical URL: https://docs.dynobox.xyz/config-authoring/
Markdown URL: https://docs.dynobox.xyz/config-authoring.md
Source: https://github.com/dynobox/dynobox/blob/main/docs/config-authoring.md
Topics: @dynobox/sdk, defineDyno, YAML, assertions, HTTP capture, skills

# Config Authoring

Dynobox configs describe what to run and what to assert. A config can be authored as JavaScript, TypeScript, or YAML.

Directory discovery loads files named `*.dyno.{mjs,js,ts,mts,yaml,yml}`. Explicit file paths can use other names, such as `dynobox.config.ts`, as long as the file is a loadable Dynobox config. CommonJS config files (`.cjs` and `.cts`) are not supported because `@dynobox/sdk` is ESM-only.

## Minimal Config

```ts
import {defineDyno, tool} from '@dynobox/sdk';

export default defineDyno({
  name: 'local-observability',
  harnesses: ['claude-code'],
  scenarios: [
    {
      name: 'inspect package scripts',
      setup: [
        `cat > package.json <<'JSON'
{"scripts":{"test":"vitest run"}}
JSON`,
      ],
      prompt:
        'Use a shell command that reads package.json and tell me whether a test script exists.',
      assertions: [
        tool.called('shell'),
        tool.called('shell', {includes: 'package.json'}),
      ],
    },
  ],
});
```

## Config Shape

```ts
type DynoboxConfig = {
  name?: string;
  version?: string;
  harnesses?: HarnessRunConfig[];
  setup?: string[];
  endpoints?: Record;
  scenarios: ScenarioInput[];
};
```

Top-level `setup` commands and `endpoints` are merged into each scenario. Top-level `harnesses` apply when a scenario does not define its own harnesses. Scenario harnesses replace the top-level harness list.

```ts
type ScenarioInput = {
  id?: string;
  name: string;
  prompt: string;
  harnesses?: HarnessRunConfig[];
  setup?: string[];
  endpoints?: Record;
  assertions?: Assertion[];
};
```

Each scenario runs in a fresh temporary work directory. Setup commands run in that directory before the harness prompt, and artifact assertions read files from that directory after the harness exits.

Scenario `id` is optional. When provided, it is used for stable compiled scenario IDs, job IDs, and `dynobox run --scenario` filters. Without an `id`, Dynobox derives one from the scenario name.

## Harnesses

Supported harness IDs:

- `claude-code`
- `codex`

Use strings when the default model and permission behavior are fine:

```ts
harnesses: ['claude-code', 'codex'];
```

Use objects to set a model or permission mode:

```ts
harnesses: [
  {id: 'claude-code', model: 'sonnet'},
  {id: 'codex', model: 'gpt-5.1', permissionMode: 'dangerous'},
];
```

Permission modes:

- `default`: use the harness's normal permission and sandbox behavior.
- `dangerous`: opt into harness-specific full-access or permission-bypass flags for trusted local evals.

Dangerous mode maps to:

- `claude-code`: `--permission-mode bypassPermissions`
- `codex`: `--sandbox danger-full-access -c approval_policy="never"`

The CLI can override authored harnesses with `--harness` and authored permission modes with `--permission-mode`.

## Assertions

Assertions are evaluated against observed harness behavior after each scenario runs.

### Tool Calls

Use `tool.called` and `tool.notCalled` to assert tool usage.

```ts
tool.called('shell');
tool.notCalled('web_fetch');
tool.called('shell', {includes: 'package.json'});
tool.notCalled('shell', {matches: 'rm\\s+-rf'});
```

Supported tool kinds:

- `shell`
- `read_file`
- `write_file`
- `edit_file`
- `search_files`
- `web_fetch`
- `web_search`
- `mcp`
- `task`
- `unknown`

Shell tool assertions can include exactly one command matcher:

- `{equals: 'pnpm test'}`
- `{includes: 'package.json'}`
- `{startsWith: 'pnpm'}`
- `{matches: 'pnpm\\s+test'}`

`matches` is a JavaScript regular expression string. Command matchers are only valid on `shell` tool assertions.

### Ordered Sequences

Use `sequence.inOrder` when order matters.

```ts
sequence.inOrder([
  tool.called('shell', {includes: 'package.json'}),
  tool.called('shell', {includes: 'pnpm test'}),
]);
```

For shell commands, ordered matching can match multiple steps against one compound command when the command text appears in order.

### Skills

Use `skill.invoked` to assert that the harness accessed a named skill's `SKILL.md` instruction file.

```ts
skill.invoked('commit');
```

This passes when observed tool events reference `.agents/skills//SKILL.md` or `.claude/skills//SKILL.md`, including reads, searches, or shell commands that access the file.

### Artifacts

Artifact assertions read files inside the scenario work directory.

```ts
artifact.exists('README.md');
artifact.contains('package.json', 'vitest run');
```

Artifact paths must be relative and must stay inside the work directory.
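
The artifact path rule can be illustrated with a small containment check. This is a minimal sketch using Node's `node:path` module; `resolveArtifactPath` is a hypothetical helper, not part of `@dynobox/sdk`, and the real validation in Dynobox may be stricter or differently worded.

```ts
import * as path from 'node:path';

// Hypothetical helper (not a Dynobox API): resolves an artifact path against
// a work directory and rejects absolute paths or paths that escape it.
function resolveArtifactPath(workDir: string, artifactPath: string): string {
  if (path.isAbsolute(artifactPath)) {
    throw new Error(`Artifact path must be relative: ${artifactPath}`);
  }

  const root = path.resolve(workDir);
  const resolved = path.resolve(root, artifactPath);

  // A relative path from the root that starts with ".." points outside it.
  const relative = path.relative(root, resolved);
  const escapes =
    relative === '..' ||
    relative.startsWith(`..${path.sep}`) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Artifact path escapes the work directory: ${artifactPath}`);
  }
  return resolved;
}
```

With this check, `README.md` and `a/../b.txt` resolve inside the work directory, while `../secret` and `/etc/passwd` are rejected.
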

### Transcript And Final Message

Use transcript assertions to inspect the full harness transcript. Use final-message assertions to inspect the final assistant response extracted from the harness output.

```ts
transcript.contains('package.json');
finalMessage.contains('test script');
```

Final-message extraction depends on the harness output format. If a harness does not provide a final message, the assertion fails with a clear message.

## HTTP Assertions

Declare endpoints with `http.endpoint(...)` and assert whether matching requests were observed.

```ts
endpoints: {
  npmPrettier: http.endpoint({
    method: 'GET',
    url: 'https://registry.npmjs.org/prettier',
  }),
},
assertions: [http.called('npmPrettier', {status: 200})];
```

Endpoint keys become part of stable IR ids, so they may only contain letters, numbers, underscores, and hyphens.

Endpoint specs also accept `headers`, `body`, and `response` fields. The current local runner preserves those fields in the compiled IR, but HTTP assertions match observed requests by endpoint URL/method and optional response status. It does not use those fields to mock or shape requests yet.

When a scenario includes HTTP assertions, Dynobox starts a per-job local proxy and sets proxy environment variables on the harness child process:

- `HTTP_PROXY`
- `HTTPS_PROXY`
- `http_proxy`
- `https_proxy`

Dynobox also sets common CA variables to a generated CA at `~/.dynobox/ca.pem`:

- `NODE_EXTRA_CA_CERTS`
- `SSL_CERT_FILE`
- `REQUESTS_CA_BUNDLE`
- `CURL_CA_BUNDLE`

HTTP capture covers local child-process traffic that honors those proxy and CA environment variables. Harness-native web tools and binaries with their own trust stores may bypass capture.

## Path Helpers

The `dyno` helper is useful when config files need stable paths relative to the config module.

```ts
import {dyno} from '@dynobox/sdk';

const here = dyno.here(import.meta.url);

setup: [`cp ${here.q('./fixtures/input.txt')} input.txt`];
```

Available helpers:

- `dyno.fsPath(url)`
- `dyno.fromUrl(baseUrl, path)`
- `dyno.shellQuote(value)` or `dyno.q(value)`
- `dyno.here(import.meta.url).path(path)`
- `dyno.here(import.meta.url).q(path)`

## Reusable Scenarios

Use `defineScenario` when you want to author or export a scenario independently, then include it in a dyno.

```ts
import {defineDyno, defineScenario, tool} from '@dynobox/sdk';

const checksPackageJson = defineScenario({
  name: 'checks package json',
  prompt: 'Read package.json and summarize the scripts.',
  assertions: [tool.called('shell', {includes: 'package.json'})],
});

export default defineDyno({
  scenarios: [checksPackageJson],
});
```

## YAML Configs

YAML dynos use the same top-level shape as JavaScript and TypeScript configs. The difference is that helper calls are written as plain objects using the same authoring assertion shape that SDK helpers return.

```yaml
name: package-script-check
harnesses:
  - claude-code
scenarios:
  - name: detects test script
    prompt: >-
      Inspect package.json and tell me whether this project has a test script.
    setup:
      - |
        cat > package.json <<'JSON'
        {"scripts":{"test":"vitest run"}}
        JSON
    assertions:
      - label: reads package.json
        type: tool.called
        tool: shell
        command:
          includes: package.json
      - type: tool.notCalled
        tool: edit_file
      - type: artifact.contains
        path: package.json
        text: vitest run
      - type: finalMessage.contains
        text: test
```

YAML configs flow through the same schema and IR compiler as JavaScript and TypeScript configs.

## Authoring Assertion Contract

All assertion objects accept optional `id` and `label` fields. `id` stabilizes compiled assertion IDs and JSON report references. `label` appears in CLI and JSON output.
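
Authoring objects for `shell` tool assertions carry a `command` matcher with exactly one of `equals`, `includes`, `startsWith`, or `matches`. The sketch below shows one way that rule could be checked and applied to an observed command. The `CommandMatcher` type and `matchesCommand` function are illustrative names, not SDK exports, and the sketch assumes `matches` is an unanchored regular expression without flags.

```ts
// Illustrative types only; the real SDK types may differ.
type CommandMatcher = {
  equals?: string;
  includes?: string;
  startsWith?: string;
  matches?: string;
};

const MATCHER_KEYS = ['equals', 'includes', 'startsWith', 'matches'] as const;
type MatcherKey = (typeof MATCHER_KEYS)[number];

// Hypothetical helper: enforces the one-matcher rule, then tests the command.
function matchesCommand(matcher: CommandMatcher, command: string): boolean {
  const entries: Array<[MatcherKey, string]> = [];
  for (const key of MATCHER_KEYS) {
    const value = matcher[key];
    if (value !== undefined) entries.push([key, value]);
  }
  if (entries.length !== 1) {
    throw new Error(
      `Expected exactly one command matcher, got ${entries.length}`,
    );
  }

  const [key, value] = entries[0]!;
  switch (key) {
    case 'equals':
      return command === value;
    case 'includes':
      return command.includes(value);
    case 'startsWith':
      return command.startsWith(value);
    case 'matches':
      // Assumption: the string is compiled as-is, with no flags or anchoring.
      return new RegExp(value).test(command);
  }
}
```

For example, `matchesCommand({includes: 'package.json'}, 'cat package.json')` is true, while passing both `equals` and `includes` throws.
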

| TypeScript helper                                      | Authoring object                                                   |
| ------------------------------------------------------ | ------------------------------------------------------------------ |
| `tool.called('shell')`                                 | `{type: tool.called, tool: shell}`                                 |
| `tool.called('shell', {includes: 'x'})`                | `{type: tool.called, tool: shell, command: {includes: x}}`         |
| `tool.notCalled('edit_file')`                          | `{type: tool.notCalled, tool: edit_file}`                          |
| `artifact.exists('README.md')`                         | `{type: artifact.exists, path: README.md}`                         |
| `artifact.contains('pkg.json', 'foo')`                 | `{type: artifact.contains, path: pkg.json, text: foo}`             |
| `transcript.contains('done')`                          | `{type: transcript.contains, text: done}`                          |
| `finalMessage.contains('ok')`                          | `{type: finalMessage.contains, text: ok}`                          |
| `skill.invoked('commit')`                              | `{type: skill.invoked, skill: commit}`                             |
| `sequence.inOrder([tool.called('shell', {...}), ...])` | `{type: sequence.inOrder, steps: [{type: tool.called, ...}, ...]}` |
| `http.called('npmPrettier', {status: 200})`            | `{type: http.called, endpoint: npmPrettier, status: 200}`          |
| `http.notCalled('leftPad')`                            | `{type: http.notCalled, endpoint: leftPad}`                        |

Command matcher shapes accept exactly one of `equals`, `includes`, `startsWith`, or `matches`, and are only valid on `shell` tool assertions.

Older YAML objects that used `kind`, `toolKind`, or `matcher` are not accepted. Use `type`, `tool`, and `command` instead.

When YAML parsing fails, the CLI emits a `line:column` pointer into the file so syntax errors are easy to locate.

---

# Page: CLI Reference

Canonical URL: https://docs.dynobox.xyz/cli/
Markdown URL: https://docs.dynobox.xyz/cli.md
Source: https://github.com/dynobox/dynobox/blob/main/docs/cli.md
Topics: CLI, dynobox init, dynobox run, JSON reporter, exit codes

# CLI Reference

The public CLI package is `dynobox`:

```bash
npm install -g dynobox
```

## Commands

### `dynobox init`

Create a starter dyno under `./dynobox/`.

```bash
dynobox init
dynobox init --yaml
dynobox init --harness codex
dynobox init --force
```

`dynobox init` writes `dynobox/example.dyno.mjs` by default. With `--yaml`, it writes `dynobox/example.dyno.yaml`. Existing starter files are not overwritten unless `--force` is passed.

`--harness` accepts the same harness IDs as `dynobox run`; invalid harness IDs fail before writing a starter file.

### `dynobox run [path]`

Discover and run dyno files.

```bash
dynobox run
dynobox run examples
dynobox run my-skill.dyno.yaml
dynobox run dynobox.config.ts
```

Path behavior:

- No path: discover under the current working directory.
- Directory path: discover recursively under that directory.
- File path: run that one loadable Dynobox config file.

Directory discovery matches `**/*.dyno.{mjs,js,ts,mts,yaml,yml}`. It skips hidden entries, `node_modules`, `dist`, `build`, `coverage`, `.git`, `.dynobox`, `.next`, and `.cache`.

Explicit file paths do not need to match the `*.dyno.*` naming pattern. YAML files are parsed as YAML, and JavaScript or TypeScript files such as `.mjs`, `.js`, `.ts`, and `.mts` are imported through the CLI loader. `.cjs` and `.cts` configs are not supported because `@dynobox/sdk` is ESM-only.

A load error in one discovered file does not stop other files from running. Each bad file prints a `config:` error block on stderr, and the process exits non-zero if any file failed to load or any job failed.

## Run Options

```text
--harness            Override config harnesses; repeat or comma-separate for multiple harnesses.
--permission-mode    Override harness permission mode: default or dangerous.
--scenario           Run only scenarios whose name or id matches; repeat or comma-separate for multiple patterns.
--quiet              Print compact CI-friendly output.
--verbose            Expand scenario details even when passing.
--debug              Include debug paths and artifacts.
--reporter           Output reporter format: text or json.
```

Harness IDs are `claude-code` and `codex`.

Examples:

```bash
dynobox run --harness claude-code
dynobox run --harness codex
dynobox run --harness claude-code,codex
dynobox run --harness codex --permission-mode dangerous
dynobox run --scenario "release*"
dynobox run --scenario "lint*,deploy package"
dynobox run --reporter json
```

Scenario filters match the compiled scenario name or id. Patterns support `*` for any number of characters and `?` for one character. If no scenarios match, the run exits with code `1`.

## Output Modes

Default output prints the run header, job status, assertion details for failed or expanded jobs, and a final summary. Passing jobs collapse to one line.

`--quiet` prints compact CI-friendly progress and failure information.

`--verbose` expands scenario details even when jobs pass.

`--debug` includes temporary work-directory paths and writes debug logs inside each job's work directory when data is available. Debug logs can include:

- `dynobox-transcript.log`
- `dynobox-chat-history.jsonl`
- `dynobox-tool-events.json`
- `dynobox-stderr.log`

`--reporter json` emits newline-delimited JSON on stdout instead of text. Dynobox writes one job object per completed job, then one summary object. The JSON reporter always uses static output so stdout remains machine-readable.

When stdout is an interactive terminal and live output is enabled, Dynobox streams phase progress and harness tool events as they happen. In non-interactive output, quiet mode, or incompatible terminals, it renders static output after jobs complete.

## JSON Reporter

Every JSON reporter object includes `"schema": "dynobox.report.v1"` and a `type` field.
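
A consumer can use those two fields to validate each line before reading anything else. The following is a minimal sketch, assuming the `type` field takes the values `job` and `summary` for the two record kinds; `parseReportLine` and `ReportRecord` are hypothetical names, and the record types here only model the envelope, not the full field set.

```ts
// Assumed, partial view of the report envelope; other fields stay untyped.
type ReportRecord =
  | {schema: 'dynobox.report.v1'; type: 'job'; [field: string]: unknown}
  | {schema: 'dynobox.report.v1'; type: 'summary'; [field: string]: unknown};

// Hypothetical helper: parses one NDJSON line and checks schema and type.
function parseReportLine(line: string): ReportRecord {
  const value: unknown = JSON.parse(line);
  if (typeof value !== 'object' || value === null) {
    throw new Error('Report line is not a JSON object');
  }

  const record = value as Record<string, unknown>;
  if (record['schema'] !== 'dynobox.report.v1') {
    throw new Error(`Unexpected schema: ${String(record['schema'])}`);
  }
  if (record['type'] !== 'job' && record['type'] !== 'summary') {
    throw new Error(`Unexpected record type: ${String(record['type'])}`);
  }
  return record as ReportRecord;
}
```

Rejecting unknown schema versions up front keeps a later report format change from being silently misread by older tooling.
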

Job records include:

- `jobId`
- `scenario.id` and `scenario.name`
- `harness.id`, with `model` and `permissionMode` when configured
- `iteration`, using a 1-based number
- `status` and `passed`
- `timing`
- `diagnostics`
- `warnings`
- `artifacts`
- `debugLogPaths` when `--debug` produced logs
- `setup.commands`
- `harnessOutput.exitCode` and `harnessOutput.durationMs` when the harness ran
- `observations.toolEventCount` and `observations.httpEventCount`
- `assertions`, with `assertionId`, optional `label`, `kind`, `passed`, and `message`

The summary record includes:

- `status`
- `totals.jobs`, `totals.passed`, `totals.failed`, `totals.configErrors`, `totals.warnings`, and `totals.durationMs`
- `plan.scenarios`, `plan.harnesses`, and `plan.iterations`
- `failedJobs`
- `warningJobs`

Example:

```bash
dynobox run --reporter json examples/local-observability
```

In CI, redirect stdout to an artifact file:

```bash
dynobox run --reporter json dynobox > dynobox-report.ndjson
```

## Exit Codes

Dynobox exits with `0` when all loaded jobs pass.

Dynobox exits with `1` for:

- No subcommand supplied.
- Config load, parse, validation, or flag errors.
- No dynos found for a directory target.
- At least one completed job failed.

## Harness Requirements

The CLI registers both real harnesses by default:

- `claude-code` invokes Claude Code with stream JSON output and hook events.
- `codex` invokes Codex with JSON output, no color, and the git-repo check skipped.

Make sure the selected harness executable is installed, authenticated, and available on `PATH`.

Dynobox uses each harness's normal permission behavior by default. Use `--permission-mode dangerous` only for trusted local evals that intentionally need full access or non-interactive approval bypasses.

Dangerous mode maps to harness-specific flags:

- `claude-code`: adds `--permission-mode bypassPermissions`.
- `codex`: adds `--sandbox danger-full-access -c approval_policy="never"`.

Permission warnings are advisory.
They explain when a harness blocked a tool action, but they do not change job status, assertion results, or exit codes.

## Development Checkout

See [CONTRIBUTING.md](../CONTRIBUTING.md) for local checkout workflows.

---

# Page: CI Integration

Canonical URL: https://docs.dynobox.xyz/ci/
Markdown URL: https://docs.dynobox.xyz/ci.md
Source: https://github.com/dynobox/dynobox/blob/main/docs/ci.md
Topics: CI, GitHub Actions, JSON reports, artifacts

# CI Integration

Dynobox runs in CI like any other command-line test step. A successful run exits with `0`; config, flag, discovery, load, or job failures exit with `1`.

Use text output when humans will read the log:

```bash
dynobox run dynobox --quiet --harness claude-code
```

Use JSON output when a later CI step should consume the results:

```bash
dynobox run dynobox --reporter json --harness claude-code > dynobox-report.ndjson
```

`--reporter json` writes newline-delimited JSON to stdout. Each completed job produces one `"type": "job"` record, followed by one `"type": "summary"` record. Every record includes `"schema": "dynobox.report.v1"`.

## Recommended Pattern

1. Install Node.js 22 or newer.
2. Install `dynobox`.
3. Install the harness executable for the job.
4. Run `dynobox run` once per harness, usually through a CI matrix.
5. Upload the JSON report as a build artifact.
6. Summarize the final JSON `summary` record in the job output.

For targeted CI jobs, combine the JSON reporter with scenario filters:

```bash
dynobox run dynobox --reporter json --scenario "release*" > dynobox-report.ndjson
```

Scenario filters match the compiled scenario name or id. Repeat the flag or use comma-separated values to select multiple patterns:

```bash
dynobox run dynobox --scenario "release*,publish package"
```

## GitHub Actions

A reference workflow lives at [`examples/.github/workflows/example-eval.yml`](../examples/.github/workflows/example-eval.yml).
It runs a matrix over `claude-code` and `codex`, writes one NDJSON report per harness, uploads each report, and appends a compact summary to the GitHub Actions step summary.

Copy the workflow into your repository's `.github/workflows/` directory and adjust:

- `DYNOBOX_TARGET` for the directory or file containing your dynos.
- Harness install commands for your pinned versions.
- Secrets for the selected harnesses.

The example assumes:

- `ANTHROPIC_API_KEY` is available for `claude-code`.
- `OPENAI_API_KEY` is available for `codex`.

## Read JSON Reports

The JSON reporter is line-oriented. Read the file one line at a time and parse each line as a separate JSON object.

```js
import {readFileSync} from 'node:fs';

const records = readFileSync('dynobox-report.ndjson', 'utf8')
  .trim()
  .split('\n')
  .filter(Boolean)
  .map((line) => JSON.parse(line));

const summary = records.find((record) => record.type === 'summary');

console.log(summary.totals);
```

Useful job fields include `jobId`, `scenario`, `harness`, `status`, `passed`, `warnings`, `observations`, and `assertions`. Useful summary fields include `status`, `totals`, `plan`, `failedJobs`, and `warningJobs`.

Permission warnings are advisory. They explain when a harness blocked a tool action, but they do not change job status or exit codes. Use `--permission-mode dangerous` only for trusted evals that intentionally need full local access.

Config and discovery failures can happen before any job runs. In those cases, Dynobox writes the config error to stderr and exits `1`; there may be no JSON summary record to parse.

## Artifact Naming

When a CI matrix runs multiple harnesses, write one report per harness:

```bash
dynobox run dynobox --reporter json --harness "$HARNESS" > "dynobox-${HARNESS}.ndjson"
```

This keeps reports easy to compare and avoids interleaving records from different CI jobs.
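
With one report per harness, a follow-up step can compare the summaries side by side. The sketch below is one possible approach, not something shipped with Dynobox: `readTotals` and `compareReports` are hypothetical helpers, only the `totals.jobs`, `totals.passed`, and `totals.failed` fields are read, and a report without a summary record (for example after a config error) is reported as such instead of failing.

```ts
// Partial, assumed view of the summary totals described in the CLI Reference.
type Totals = {jobs: number; passed: number; failed: number};

// Returns the totals from the summary record in one NDJSON report, or
// undefined when the report contains no summary record.
function readTotals(ndjson: string): Totals | undefined {
  for (const line of ndjson.split('\n')) {
    if (line.trim() === '') continue;
    const record = JSON.parse(line) as {type?: string; totals?: Totals};
    if (record.type === 'summary' && record.totals) return record.totals;
  }
  return undefined;
}

// Takes report contents keyed by harness id and returns one line per harness.
function compareReports(reports: Record<string, string>): string[] {
  return Object.entries(reports).map(([harness, ndjson]) => {
    const totals = readTotals(ndjson);
    return totals
      ? `${harness}: ${totals.passed}/${totals.jobs} passed, ${totals.failed} failed`
      : `${harness}: no summary record`;
  });
}
```

In a workflow, the report contents would typically come from reading each `dynobox-${HARNESS}.ndjson` artifact, and the returned lines could be appended to the CI job summary.
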