From bae8538695701fa254863d4363f7f4d7d4ff6bdc Mon Sep 17 00:00:00 2001
From: gavrielc
Date: Mon, 2 Mar 2026 13:28:28 +0200
Subject: [PATCH] Fix/shadow env in container (#646)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* fix: shadow .env file in container to prevent agents from reading secrets

The main agent's container mounts the project root read-only, which inadvertently exposed the .env file containing API keys. Mount /dev/null over /workspace/project/.env to shadow it — secrets are already passed via stdin and never need to be read from disk inside the container.

Co-Authored-By: Claude Opus 4.6

* fix: adapt .env shadowing and runtime for Apple Container

Apple Container (VirtioFS) only supports directory mounts, not file mounts. The previous /dev/null host-side mount over .env crashes with VZErrorDomain "A directory sharing device configuration is invalid".

- Dockerfile: entrypoint now shadows .env via mount --bind inside the container, then drops privileges via setpriv to the host UID/GID
- container-runner: main containers skip --user and pass RUN_UID/RUN_GID env vars so entrypoint starts as root for mount --bind
- container-runtime: switch to Apple Container CLI (container), fix cleanupOrphans to use container list --format json
- Skill: add Dockerfile and container-runner.ts to convert-to-apple-container skill (v1.1.0)

Co-Authored-By: Claude Opus 4.6

* fix: revert src to Docker runtime, keep Apple Container in skill only

The source files should remain Docker-compatible. The Apple Container adaptations live in the convert-to-apple-container skill and are applied on demand.

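As an illustration of the container-runner item in the second fix above, here is a minimal sketch of how the user-identity arguments might be chosen for main and non-main containers. `userIdentityArgs` is a hypothetical helper name used only for this example (in the skill's `container-runner.ts` the equivalent branch sits inline in `buildContainerArgs`), and the uid/gid values in the usage lines are made up.

```typescript
// Hypothetical helper: mirrors the user-identity branch of buildContainerArgs
// in the Apple Container variant. The name and the standalone form are
// assumptions made for illustration only.
function userIdentityArgs(
  isMain: boolean,
  hostUid: number | undefined,
  hostGid: number | undefined,
): string[] {
  const args: string[] = [];

  // Nothing to do when the host runs as root, already matches the image's
  // node user (uid 1000), or has no uid at all (e.g. native Windows).
  if (hostUid == null || hostUid === 0 || hostUid === 1000) return args;

  if (isMain) {
    // Main containers start as root so the entrypoint can run mount --bind;
    // the entrypoint later drops to this uid/gid via setpriv.
    args.push('-e', `RUN_UID=${hostUid}`, '-e', `RUN_GID=${hostGid}`);
  } else {
    // Non-main containers never need root, so they keep the --user flag.
    args.push('--user', `${hostUid}:${hostGid}`);
  }
  args.push('-e', 'HOME=/home/node');
  return args;
}

// Example host identity (uid 501, gid 20) chosen arbitrarily.
console.log(userIdentityArgs(true, 501, 20).join(' '));
// prints "-e RUN_UID=501 -e RUN_GID=20 -e HOME=/home/node"
console.log(userIdentityArgs(false, 501, 20).join(' '));
// prints "--user 501:20 -e HOME=/home/node"
```

Passing the target identity as environment variables rather than `--user` is what lets the entrypoint hold root just long enough to shadow `.env` before dropping privileges.
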
Co-Authored-By: Claude Opus 4.6

---------

Co-authored-by: Claude Opus 4.6
---
 .../convert-to-apple-container/SKILL.md      |  12 +-
 .../convert-to-apple-container/manifest.yaml |   4 +-
 .../modify/container/Dockerfile              |  68 ++
 .../modify/container/Dockerfile.intent.md    |  31 +
 .../modify/src/container-runner.ts           | 694 ++++++++++++++++++
 .../modify/src/container-runner.ts.intent.md |  33 +
 src/container-runner.ts                      |  11 +
 7 files changed, 850 insertions(+), 3 deletions(-)
 create mode 100644 .claude/skills/convert-to-apple-container/modify/container/Dockerfile
 create mode 100644 .claude/skills/convert-to-apple-container/modify/container/Dockerfile.intent.md
 create mode 100644 .claude/skills/convert-to-apple-container/modify/src/container-runner.ts
 create mode 100644 .claude/skills/convert-to-apple-container/modify/src/container-runner.ts.intent.md

diff --git a/.claude/skills/convert-to-apple-container/SKILL.md b/.claude/skills/convert-to-apple-container/SKILL.md
index 8bfaebb..802ffd6 100644
--- a/.claude/skills/convert-to-apple-container/SKILL.md
+++ b/.claude/skills/convert-to-apple-container/SKILL.md
@@ -13,11 +13,13 @@ This skill switches NanoClaw's container runtime from Docker to Apple Container
 - Startup check: `docker info` → `container system status` (with auto-start)
 - Orphan detection: `docker ps --filter` → `container ls --format json`
 - Build script default: `docker` → `container`
+- Dockerfile entrypoint: `.env` shadowing via `mount --bind` inside the container (Apple Container only supports directory mounts, not file mounts like Docker's `/dev/null` overlay)
+- Container runner: main-group containers start as root for `mount --bind`, then drop privileges via `setpriv`
 
 **What stays the same:**
-- Dockerfile (shared by both runtimes)
-- Container runner code (`src/container-runner.ts`)
 - Mount security/allowlist validation
+- All exported interfaces and IPC protocol
+- Non-main container behavior (still uses `--user` flag)
 - All other functionality
 
 ## Prerequisites
@@ -72,11 +74,15 @@ npx tsx scripts/apply-skill.ts .claude/skills/convert-to-apple-container
 This deterministically:
 - Replaces `src/container-runtime.ts` with the Apple Container implementation
 - Replaces `src/container-runtime.test.ts` with Apple Container-specific tests
+- Updates `src/container-runner.ts` with .env shadow mount fix and privilege dropping
+- Updates `container/Dockerfile` with entrypoint that shadows .env via `mount --bind`
 - Updates `container/build.sh` to default to `container` runtime
 - Records the application in `.nanoclaw/state.yaml`
 
 If the apply reports merge conflicts, read the intent files:
 - `modify/src/container-runtime.ts.intent.md` — what changed and invariants
+- `modify/src/container-runner.ts.intent.md` — .env shadow and privilege drop changes
+- `modify/container/Dockerfile.intent.md` — entrypoint changes for .env shadowing
 - `modify/container/build.sh.intent.md` — what changed for build script
 
 ### Validate code changes
@@ -172,4 +178,6 @@ Check directory permissions on the host. The container runs as uid 1000.
 |------|----------------|
 | `src/container-runtime.ts` | Full replacement — Docker → Apple Container API |
 | `src/container-runtime.test.ts` | Full replacement — tests for Apple Container behavior |
+| `src/container-runner.ts` | .env shadow mount removed, main containers start as root with privilege drop |
+| `container/Dockerfile` | Entrypoint: `mount --bind` for .env shadowing, `setpriv` privilege drop |
 | `container/build.sh` | Default runtime: `docker` → `container` |
diff --git a/.claude/skills/convert-to-apple-container/manifest.yaml b/.claude/skills/convert-to-apple-container/manifest.yaml
index d9f65b6..90b0156 100644
--- a/.claude/skills/convert-to-apple-container/manifest.yaml
+++ b/.claude/skills/convert-to-apple-container/manifest.yaml
@@ -1,12 +1,14 @@
 skill: convert-to-apple-container
-version: 1.0.0
+version: 1.1.0
 description: "Switch container runtime from Docker to Apple Container (macOS)"
 core_version: 0.1.0
 adds: []
 modifies:
   - src/container-runtime.ts
   - src/container-runtime.test.ts
+  - src/container-runner.ts
   - container/build.sh
+  - container/Dockerfile
 structured: {}
 conflicts: []
 depends: []
diff --git a/.claude/skills/convert-to-apple-container/modify/container/Dockerfile b/.claude/skills/convert-to-apple-container/modify/container/Dockerfile
new file mode 100644
index 0000000..65763df
--- /dev/null
+++ b/.claude/skills/convert-to-apple-container/modify/container/Dockerfile
@@ -0,0 +1,68 @@
+# NanoClaw Agent Container
+# Runs Claude Agent SDK in isolated Linux VM with browser automation
+
+FROM node:22-slim
+
+# Install system dependencies for Chromium
+RUN apt-get update && apt-get install -y \
+    chromium \
+    fonts-liberation \
+    fonts-noto-color-emoji \
+    libgbm1 \
+    libnss3 \
+    libatk-bridge2.0-0 \
+    libgtk-3-0 \
+    libx11-xcb1 \
+    libxcomposite1 \
+    libxdamage1 \
+    libxrandr2 \
+    libasound2 \
+    libpangocairo-1.0-0 \
+    libcups2 \
+    libdrm2 \
+    libxshmfence1 \
+    curl \
+    git \
+    && rm -rf /var/lib/apt/lists/*
+
+# Set Chromium path for agent-browser
+ENV AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
+ENV PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH=/usr/bin/chromium
+
+# Install agent-browser and claude-code globally
+RUN npm install -g agent-browser @anthropic-ai/claude-code
+
+# Create app directory
+WORKDIR /app
+
+# Copy package files first for better caching
+COPY agent-runner/package*.json ./
+
+# Install dependencies
+RUN npm install
+
+# Copy source code
+COPY agent-runner/ ./
+
+# Build TypeScript
+RUN npm run build
+
+# Create workspace directories
+RUN mkdir -p /workspace/group /workspace/global /workspace/extra /workspace/ipc/messages /workspace/ipc/tasks /workspace/ipc/input
+
+# Create entrypoint script
+# Secrets are passed via stdin JSON — temp file is deleted immediately after Node reads it
+# Follow-up messages arrive via IPC files in /workspace/ipc/input/
+# Apple Container only supports directory mounts (VirtioFS), so .env cannot be
+# shadowed with a host-side /dev/null file mount. Instead the entrypoint starts
+# as root, uses mount --bind to shadow .env, then drops to the host user via setpriv.
+RUN printf '#!/bin/bash\nset -e\n\n# Shadow .env so the agent cannot read host secrets (requires root)\nif [ "$(id -u)" = "0" ] && [ -f /workspace/project/.env ]; then\n mount --bind /dev/null /workspace/project/.env\nfi\n\n# Compile agent-runner\ncd /app && npx tsc --outDir /tmp/dist 2>&1 >&2\nln -s /app/node_modules /tmp/dist/node_modules\nchmod -R a-w /tmp/dist\n\n# Capture stdin (secrets JSON) to temp file\ncat > /tmp/input.json\n\n# Drop privileges if running as root (main-group containers)\nif [ "$(id -u)" = "0" ] && [ -n "$RUN_UID" ]; then\n chown "$RUN_UID:$RUN_GID" /tmp/input.json /tmp/dist\n exec setpriv --reuid="$RUN_UID" --regid="$RUN_GID" --clear-groups -- node /tmp/dist/index.js < /tmp/input.json\nfi\n\nexec node /tmp/dist/index.js < /tmp/input.json\n' > /app/entrypoint.sh && chmod +x /app/entrypoint.sh
+
+# Set ownership to node user (non-root) for writable directories
+RUN chown -R node:node /workspace && chmod 777 /home/node
+
+# Set working directory to group workspace
+WORKDIR /workspace/group
+
+# Entry point reads JSON from stdin, outputs JSON to stdout
+ENTRYPOINT ["/app/entrypoint.sh"]
diff --git a/.claude/skills/convert-to-apple-container/modify/container/Dockerfile.intent.md b/.claude/skills/convert-to-apple-container/modify/container/Dockerfile.intent.md
new file mode 100644
index 0000000..6fd2e8a
--- /dev/null
+++ b/.claude/skills/convert-to-apple-container/modify/container/Dockerfile.intent.md
@@ -0,0 +1,31 @@
+# Intent: container/Dockerfile modifications
+
+## What changed
+Updated the entrypoint script to shadow `.env` inside the container and drop privileges at runtime, replacing the Docker-style host-side file mount approach.
+
+## Why
+Apple Container (VirtioFS) only supports directory mounts, not file mounts. The Docker approach of mounting `/dev/null` over `.env` from the host causes `VZErrorDomain Code=2 "A directory sharing device configuration is invalid"`.
The fix moves the shadowing into the entrypoint using `mount --bind` (which works inside the Linux VM). + +## Key sections + +### Entrypoint script +- Added: `mount --bind /dev/null /workspace/project/.env` when running as root and `.env` exists +- Added: Privilege drop via `setpriv --reuid=$RUN_UID --regid=$RUN_GID --clear-groups` for main-group containers +- Added: `chown` of `/tmp/input.json` and `/tmp/dist` to target user before dropping privileges +- Removed: `USER node` directive — main containers start as root to perform the bind mount, then drop privileges in the entrypoint. Non-main containers still get `--user` from the host. + +### Dual-path execution +- Root path (main containers): shadow .env → compile → capture stdin → chown → setpriv drop → exec node +- Non-root path (other containers): compile → capture stdin → exec node + +## Invariants +- The entrypoint still reads JSON from stdin and runs the agent-runner +- The compiled output goes to `/tmp/dist` (read-only after build) +- `node_modules` is symlinked, not copied +- Non-main containers are unaffected (they arrive as non-root via `--user`) + +## Must-keep +- The `set -e` at the top +- The stdin capture to `/tmp/input.json` (required because setpriv can't forward stdin piping) +- The `chmod -R a-w /tmp/dist` (prevents agent from modifying its own runner) +- The `chown -R node:node /workspace` in the build step diff --git a/.claude/skills/convert-to-apple-container/modify/src/container-runner.ts b/.claude/skills/convert-to-apple-container/modify/src/container-runner.ts new file mode 100644 index 0000000..21d7ab9 --- /dev/null +++ b/.claude/skills/convert-to-apple-container/modify/src/container-runner.ts @@ -0,0 +1,694 @@ +/** + * Container Runner for NanoClaw + * Spawns agent execution in containers and handles IPC + */ +import { ChildProcess, exec, spawn } from 'child_process'; +import fs from 'fs'; +import path from 'path'; + +import { + CONTAINER_IMAGE, + CONTAINER_MAX_OUTPUT_SIZE, + 
CONTAINER_TIMEOUT, + DATA_DIR, + GROUPS_DIR, + IDLE_TIMEOUT, + TIMEZONE, +} from './config.js'; +import { readEnvFile } from './env.js'; +import { resolveGroupFolderPath, resolveGroupIpcPath } from './group-folder.js'; +import { logger } from './logger.js'; +import { + CONTAINER_RUNTIME_BIN, + readonlyMountArgs, + stopContainer, +} from './container-runtime.js'; +import { validateAdditionalMounts } from './mount-security.js'; +import { RegisteredGroup } from './types.js'; + +// Sentinel markers for robust output parsing (must match agent-runner) +const OUTPUT_START_MARKER = '---NANOCLAW_OUTPUT_START---'; +const OUTPUT_END_MARKER = '---NANOCLAW_OUTPUT_END---'; + +export interface ContainerInput { + prompt: string; + sessionId?: string; + groupFolder: string; + chatJid: string; + isMain: boolean; + isScheduledTask?: boolean; + assistantName?: string; + secrets?: Record<string, string>; +} + +export interface ContainerOutput { + status: 'success' | 'error'; + result: string | null; + newSessionId?: string; + error?: string; +} + +interface VolumeMount { + hostPath: string; + containerPath: string; + readonly: boolean; +} + +function buildVolumeMounts( + group: RegisteredGroup, + isMain: boolean, +): VolumeMount[] { + const mounts: VolumeMount[] = []; + const projectRoot = process.cwd(); + const groupDir = resolveGroupFolderPath(group.folder); + + if (isMain) { + // Main gets the project root read-only. Writable paths the agent needs + // (group folder, IPC, .claude/) are mounted separately below. + // Read-only prevents the agent from modifying host application code + // (src/, dist/, package.json, etc.) which would bypass the sandbox + // entirely on next restart. 
+ mounts.push({ + hostPath: projectRoot, + containerPath: '/workspace/project', + readonly: true, + }); + + // Main also gets its group folder as the working directory + mounts.push({ + hostPath: groupDir, + containerPath: '/workspace/group', + readonly: false, + }); + } else { + // Other groups only get their own folder + mounts.push({ + hostPath: groupDir, + containerPath: '/workspace/group', + readonly: false, + }); + + // Global memory directory (read-only for non-main) + // Only directory mounts are supported, not file mounts + const globalDir = path.join(GROUPS_DIR, 'global'); + if (fs.existsSync(globalDir)) { + mounts.push({ + hostPath: globalDir, + containerPath: '/workspace/global', + readonly: true, + }); + } + } + + // Per-group Claude sessions directory (isolated from other groups) + // Each group gets their own .claude/ to prevent cross-group session access + const groupSessionsDir = path.join( + DATA_DIR, + 'sessions', + group.folder, + '.claude', + ); + fs.mkdirSync(groupSessionsDir, { recursive: true }); + const settingsFile = path.join(groupSessionsDir, 'settings.json'); + if (!fs.existsSync(settingsFile)) { + fs.writeFileSync( + settingsFile, + JSON.stringify( + { + env: { + // Enable agent swarms (subagent orchestration) + // https://code.claude.com/docs/en/agent-teams#orchestrate-teams-of-claude-code-sessions + CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS: '1', + // Load CLAUDE.md from additional mounted directories + // https://code.claude.com/docs/en/memory#load-memory-from-additional-directories + CLAUDE_CODE_ADDITIONAL_DIRECTORIES_CLAUDE_MD: '1', + // Enable Claude's memory feature (persists user preferences between sessions) + // https://code.claude.com/docs/en/memory#manage-auto-memory + CLAUDE_CODE_DISABLE_AUTO_MEMORY: '0', + }, + }, + null, + 2, + ) + '\n', + ); + } + + // Sync skills from container/skills/ into each group's .claude/skills/ + const skillsSrc = path.join(process.cwd(), 'container', 'skills'); + const skillsDst = 
path.join(groupSessionsDir, 'skills'); + if (fs.existsSync(skillsSrc)) { + for (const skillDir of fs.readdirSync(skillsSrc)) { + const srcDir = path.join(skillsSrc, skillDir); + if (!fs.statSync(srcDir).isDirectory()) continue; + const dstDir = path.join(skillsDst, skillDir); + fs.cpSync(srcDir, dstDir, { recursive: true }); + } + } + mounts.push({ + hostPath: groupSessionsDir, + containerPath: '/home/node/.claude', + readonly: false, + }); + + // Per-group IPC namespace: each group gets its own IPC directory + // This prevents cross-group privilege escalation via IPC + const groupIpcDir = resolveGroupIpcPath(group.folder); + fs.mkdirSync(path.join(groupIpcDir, 'messages'), { recursive: true }); + fs.mkdirSync(path.join(groupIpcDir, 'tasks'), { recursive: true }); + fs.mkdirSync(path.join(groupIpcDir, 'input'), { recursive: true }); + mounts.push({ + hostPath: groupIpcDir, + containerPath: '/workspace/ipc', + readonly: false, + }); + + // Copy agent-runner source into a per-group writable location so agents + // can customize it (add tools, change behavior) without affecting other + // groups. Recompiled on container startup via entrypoint.sh. 
+ const agentRunnerSrc = path.join( + projectRoot, + 'container', + 'agent-runner', + 'src', + ); + const groupAgentRunnerDir = path.join( + DATA_DIR, + 'sessions', + group.folder, + 'agent-runner-src', + ); + if (!fs.existsSync(groupAgentRunnerDir) && fs.existsSync(agentRunnerSrc)) { + fs.cpSync(agentRunnerSrc, groupAgentRunnerDir, { recursive: true }); + } + mounts.push({ + hostPath: groupAgentRunnerDir, + containerPath: '/app/src', + readonly: false, + }); + + // Additional mounts validated against external allowlist (tamper-proof from containers) + if (group.containerConfig?.additionalMounts) { + const validatedMounts = validateAdditionalMounts( + group.containerConfig.additionalMounts, + group.name, + isMain, + ); + mounts.push(...validatedMounts); + } + + return mounts; +} + +/** + * Read allowed secrets from .env for passing to the container via stdin. + * Secrets are never written to disk or mounted as files. + */ +function readSecrets(): Record<string, string> { + return readEnvFile(['CLAUDE_CODE_OAUTH_TOKEN', 'ANTHROPIC_API_KEY']); +} + +function buildContainerArgs( + mounts: VolumeMount[], + containerName: string, + isMain: boolean, +): string[] { + const args: string[] = ['run', '-i', '--rm', '--name', containerName]; + + // Pass host timezone so container's local time matches the user's + args.push('-e', `TZ=${TIMEZONE}`); + + // Run as host user so bind-mounted files are accessible. + // Skip when running as root (uid 0), as the container's node user (uid 1000), + // or when getuid is unavailable (native Windows without WSL). + const hostUid = process.getuid?.(); + const hostGid = process.getgid?.(); + if (hostUid != null && hostUid !== 0 && hostUid !== 1000) { + if (isMain) { + // Main containers start as root so the entrypoint can mount --bind + // to shadow .env. Privileges are dropped via setpriv in entrypoint.sh. 
+ args.push('-e', `RUN_UID=${hostUid}`); + args.push('-e', `RUN_GID=${hostGid}`); + } else { + args.push('--user', `${hostUid}:${hostGid}`); + } + args.push('-e', 'HOME=/home/node'); + } + + for (const mount of mounts) { + if (mount.readonly) { + args.push(...readonlyMountArgs(mount.hostPath, mount.containerPath)); + } else { + args.push('-v', `${mount.hostPath}:${mount.containerPath}`); + } + } + + args.push(CONTAINER_IMAGE); + + return args; +} + +export async function runContainerAgent( + group: RegisteredGroup, + input: ContainerInput, + onProcess: (proc: ChildProcess, containerName: string) => void, + onOutput?: (output: ContainerOutput) => Promise<void>, +): Promise<ContainerOutput> { + const startTime = Date.now(); + + const groupDir = resolveGroupFolderPath(group.folder); + fs.mkdirSync(groupDir, { recursive: true }); + + const mounts = buildVolumeMounts(group, input.isMain); + const safeName = group.folder.replace(/[^a-zA-Z0-9-]/g, '-'); + const containerName = `nanoclaw-${safeName}-${Date.now()}`; + const containerArgs = buildContainerArgs(mounts, containerName, input.isMain); + + logger.debug( + { + group: group.name, + containerName, + mounts: mounts.map( + (m) => + `${m.hostPath} -> ${m.containerPath}${m.readonly ? 
' (ro)' : ''}`, + ), + containerArgs: containerArgs.join(' '), + }, + 'Container mount configuration', + ); + + logger.info( + { + group: group.name, + containerName, + mountCount: mounts.length, + isMain: input.isMain, + }, + 'Spawning container agent', + ); + + const logsDir = path.join(groupDir, 'logs'); + fs.mkdirSync(logsDir, { recursive: true }); + + return new Promise((resolve) => { + const container = spawn(CONTAINER_RUNTIME_BIN, containerArgs, { + stdio: ['pipe', 'pipe', 'pipe'], + }); + + onProcess(container, containerName); + + let stdout = ''; + let stderr = ''; + let stdoutTruncated = false; + let stderrTruncated = false; + + // Pass secrets via stdin (never written to disk or mounted as files) + input.secrets = readSecrets(); + container.stdin.write(JSON.stringify(input)); + container.stdin.end(); + // Remove secrets from input so they don't appear in logs + delete input.secrets; + + // Streaming output: parse OUTPUT_START/END marker pairs as they arrive + let parseBuffer = ''; + let newSessionId: string | undefined; + let outputChain = Promise.resolve(); + + container.stdout.on('data', (data) => { + const chunk = data.toString(); + + // Always accumulate for logging + if (!stdoutTruncated) { + const remaining = CONTAINER_MAX_OUTPUT_SIZE - stdout.length; + if (chunk.length > remaining) { + stdout += chunk.slice(0, remaining); + stdoutTruncated = true; + logger.warn( + { group: group.name, size: stdout.length }, + 'Container stdout truncated due to size limit', + ); + } else { + stdout += chunk; + } + } + + // Stream-parse for output markers + if (onOutput) { + parseBuffer += chunk; + let startIdx: number; + while ((startIdx = parseBuffer.indexOf(OUTPUT_START_MARKER)) !== -1) { + const endIdx = parseBuffer.indexOf(OUTPUT_END_MARKER, startIdx); + if (endIdx === -1) break; // Incomplete pair, wait for more data + + const jsonStr = parseBuffer + .slice(startIdx + OUTPUT_START_MARKER.length, endIdx) + .trim(); + parseBuffer = parseBuffer.slice(endIdx + 
OUTPUT_END_MARKER.length); + + try { + const parsed: ContainerOutput = JSON.parse(jsonStr); + if (parsed.newSessionId) { + newSessionId = parsed.newSessionId; + } + hadStreamingOutput = true; + // Activity detected — reset the hard timeout + resetTimeout(); + // Call onOutput for all markers (including null results) + // so idle timers start even for "silent" query completions. + outputChain = outputChain.then(() => onOutput(parsed)); + } catch (err) { + logger.warn( + { group: group.name, error: err }, + 'Failed to parse streamed output chunk', + ); + } + } + } + }); + + container.stderr.on('data', (data) => { + const chunk = data.toString(); + const lines = chunk.trim().split('\n'); + for (const line of lines) { + if (line) logger.debug({ container: group.folder }, line); + } + // Don't reset timeout on stderr — SDK writes debug logs continuously. + // Timeout only resets on actual output (OUTPUT_MARKER in stdout). + if (stderrTruncated) return; + const remaining = CONTAINER_MAX_OUTPUT_SIZE - stderr.length; + if (chunk.length > remaining) { + stderr += chunk.slice(0, remaining); + stderrTruncated = true; + logger.warn( + { group: group.name, size: stderr.length }, + 'Container stderr truncated due to size limit', + ); + } else { + stderr += chunk; + } + }); + + let timedOut = false; + let hadStreamingOutput = false; + const configTimeout = group.containerConfig?.timeout || CONTAINER_TIMEOUT; + // Grace period: hard timeout must be at least IDLE_TIMEOUT + 30s so the + // graceful _close sentinel has time to trigger before the hard kill fires. 
+ const timeoutMs = Math.max(configTimeout, IDLE_TIMEOUT + 30_000); + + const killOnTimeout = () => { + timedOut = true; + logger.error( + { group: group.name, containerName }, + 'Container timeout, stopping gracefully', + ); + exec(stopContainer(containerName), { timeout: 15000 }, (err) => { + if (err) { + logger.warn( + { group: group.name, containerName, err }, + 'Graceful stop failed, force killing', + ); + container.kill('SIGKILL'); + } + }); + }; + + let timeout = setTimeout(killOnTimeout, timeoutMs); + + // Reset the timeout whenever there's activity (streaming output) + const resetTimeout = () => { + clearTimeout(timeout); + timeout = setTimeout(killOnTimeout, timeoutMs); + }; + + container.on('close', (code) => { + clearTimeout(timeout); + const duration = Date.now() - startTime; + + if (timedOut) { + const ts = new Date().toISOString().replace(/[:.]/g, '-'); + const timeoutLog = path.join(logsDir, `container-${ts}.log`); + fs.writeFileSync( + timeoutLog, + [ + `=== Container Run Log (TIMEOUT) ===`, + `Timestamp: ${new Date().toISOString()}`, + `Group: ${group.name}`, + `Container: ${containerName}`, + `Duration: ${duration}ms`, + `Exit Code: ${code}`, + `Had Streaming Output: ${hadStreamingOutput}`, + ].join('\n'), + ); + + // Timeout after output = idle cleanup, not failure. + // The agent already sent its response; this is just the + // container being reaped after the idle period expired. 
+ if (hadStreamingOutput) { + logger.info( + { group: group.name, containerName, duration, code }, + 'Container timed out after output (idle cleanup)', + ); + outputChain.then(() => { + resolve({ + status: 'success', + result: null, + newSessionId, + }); + }); + return; + } + + logger.error( + { group: group.name, containerName, duration, code }, + 'Container timed out with no output', + ); + + resolve({ + status: 'error', + result: null, + error: `Container timed out after ${configTimeout}ms`, + }); + return; + } + + const timestamp = new Date().toISOString().replace(/[:.]/g, '-'); + const logFile = path.join(logsDir, `container-${timestamp}.log`); + const isVerbose = + process.env.LOG_LEVEL === 'debug' || process.env.LOG_LEVEL === 'trace'; + + const logLines = [ + `=== Container Run Log ===`, + `Timestamp: ${new Date().toISOString()}`, + `Group: ${group.name}`, + `IsMain: ${input.isMain}`, + `Duration: ${duration}ms`, + `Exit Code: ${code}`, + `Stdout Truncated: ${stdoutTruncated}`, + `Stderr Truncated: ${stderrTruncated}`, + ``, + ]; + + const isError = code !== 0; + + if (isVerbose || isError) { + logLines.push( + `=== Input ===`, + JSON.stringify(input, null, 2), + ``, + `=== Container Args ===`, + containerArgs.join(' '), + ``, + `=== Mounts ===`, + mounts + .map( + (m) => + `${m.hostPath} -> ${m.containerPath}${m.readonly ? ' (ro)' : ''}`, + ) + .join('\n'), + ``, + `=== Stderr${stderrTruncated ? ' (TRUNCATED)' : ''} ===`, + stderr, + ``, + `=== Stdout${stdoutTruncated ? ' (TRUNCATED)' : ''} ===`, + stdout, + ); + } else { + logLines.push( + `=== Input Summary ===`, + `Prompt length: ${input.prompt.length} chars`, + `Session ID: ${input.sessionId || 'new'}`, + ``, + `=== Mounts ===`, + mounts + .map((m) => `${m.containerPath}${m.readonly ? 
' (ro)' : ''}`) + .join('\n'), + ``, + ); + } + + fs.writeFileSync(logFile, logLines.join('\n')); + logger.debug({ logFile, verbose: isVerbose }, 'Container log written'); + + if (code !== 0) { + logger.error( + { + group: group.name, + code, + duration, + stderr, + stdout, + logFile, + }, + 'Container exited with error', + ); + + resolve({ + status: 'error', + result: null, + error: `Container exited with code ${code}: ${stderr.slice(-200)}`, + }); + return; + } + + // Streaming mode: wait for output chain to settle, return completion marker + if (onOutput) { + outputChain.then(() => { + logger.info( + { group: group.name, duration, newSessionId }, + 'Container completed (streaming mode)', + ); + resolve({ + status: 'success', + result: null, + newSessionId, + }); + }); + return; + } + + // Legacy mode: parse the last output marker pair from accumulated stdout + try { + // Extract JSON between sentinel markers for robust parsing + const startIdx = stdout.indexOf(OUTPUT_START_MARKER); + const endIdx = stdout.indexOf(OUTPUT_END_MARKER); + + let jsonLine: string; + if (startIdx !== -1 && endIdx !== -1 && endIdx > startIdx) { + jsonLine = stdout + .slice(startIdx + OUTPUT_START_MARKER.length, endIdx) + .trim(); + } else { + // Fallback: last non-empty line (backwards compatibility) + const lines = stdout.trim().split('\n'); + jsonLine = lines[lines.length - 1]; + } + + const output: ContainerOutput = JSON.parse(jsonLine); + + logger.info( + { + group: group.name, + duration, + status: output.status, + hasResult: !!output.result, + }, + 'Container completed', + ); + + resolve(output); + } catch (err) { + logger.error( + { + group: group.name, + stdout, + stderr, + error: err, + }, + 'Failed to parse container output', + ); + + resolve({ + status: 'error', + result: null, + error: `Failed to parse container output: ${err instanceof Error ? 
err.message : String(err)}`, + }); + } + }); + + container.on('error', (err) => { + clearTimeout(timeout); + logger.error( + { group: group.name, containerName, error: err }, + 'Container spawn error', + ); + resolve({ + status: 'error', + result: null, + error: `Container spawn error: ${err.message}`, + }); + }); + }); +} + +export function writeTasksSnapshot( + groupFolder: string, + isMain: boolean, + tasks: Array<{ + id: string; + groupFolder: string; + prompt: string; + schedule_type: string; + schedule_value: string; + status: string; + next_run: string | null; + }>, +): void { + // Write filtered tasks to the group's IPC directory + const groupIpcDir = resolveGroupIpcPath(groupFolder); + fs.mkdirSync(groupIpcDir, { recursive: true }); + + // Main sees all tasks, others only see their own + const filteredTasks = isMain + ? tasks + : tasks.filter((t) => t.groupFolder === groupFolder); + + const tasksFile = path.join(groupIpcDir, 'current_tasks.json'); + fs.writeFileSync(tasksFile, JSON.stringify(filteredTasks, null, 2)); +} + +export interface AvailableGroup { + jid: string; + name: string; + lastActivity: string; + isRegistered: boolean; +} + +/** + * Write available groups snapshot for the container to read. + * Only main group can see all available groups (for activation). + * Non-main groups only see their own registration status. + */ +export function writeGroupsSnapshot( + groupFolder: string, + isMain: boolean, + groups: AvailableGroup[], + registeredJids: Set<string>, +): void { + const groupIpcDir = resolveGroupIpcPath(groupFolder); + fs.mkdirSync(groupIpcDir, { recursive: true }); + + // Main sees all groups; others see nothing (they can't activate groups) + const visibleGroups = isMain ? 
groups : []; + + const groupsFile = path.join(groupIpcDir, 'available_groups.json'); + fs.writeFileSync( + groupsFile, + JSON.stringify( + { + groups: visibleGroups, + lastSync: new Date().toISOString(), + }, + null, + 2, + ), + ); +} diff --git a/.claude/skills/convert-to-apple-container/modify/src/container-runner.ts.intent.md b/.claude/skills/convert-to-apple-container/modify/src/container-runner.ts.intent.md new file mode 100644 index 0000000..869843f --- /dev/null +++ b/.claude/skills/convert-to-apple-container/modify/src/container-runner.ts.intent.md @@ -0,0 +1,33 @@ +# Intent: src/container-runner.ts modifications + +## What changed +Updated `buildContainerArgs` to support Apple Container's .env shadowing mechanism. The function now accepts an `isMain` parameter and uses it to decide how container user identity is configured. + +## Why +Apple Container (VirtioFS) only supports directory mounts, not file mounts. The previous approach of mounting `/dev/null` over `.env` from the host causes a `VZErrorDomain` crash. Instead, main-group containers now start as root so the entrypoint can `mount --bind /dev/null` over `.env` inside the Linux VM, then drop to the host user via `setpriv`. 
+ +## Key sections + +### buildContainerArgs (signature change) +- Added: `isMain: boolean` parameter +- Main containers: passes `RUN_UID`/`RUN_GID` env vars instead of `--user`, so the container starts as root +- Non-main containers: unchanged, still uses `--user` flag + +### buildVolumeMounts +- Removed: the `/dev/null` → `/workspace/project/.env` shadow mount (was in the committed `37228a9` fix) +- The .env shadowing is now handled inside the container entrypoint instead + +### runContainerAgent (call site) +- Changed: `buildContainerArgs(mounts, containerName)` → `buildContainerArgs(mounts, containerName, input.isMain)` + +## Invariants +- All exported interfaces unchanged: `ContainerInput`, `ContainerOutput`, `runContainerAgent`, `writeTasksSnapshot`, `writeGroupsSnapshot`, `AvailableGroup` +- Non-main containers behave identically (still get `--user` flag) +- Mount list for non-main containers is unchanged +- Secrets still passed via stdin, never mounted as files +- Output parsing (streaming + legacy) unchanged + +## Must-keep +- The `isMain` parameter on `buildContainerArgs` (consumed by `runContainerAgent`) +- The `RUN_UID`/`RUN_GID` env vars for main containers (consumed by entrypoint.sh) +- The `--user` flag for non-main containers (file permission compatibility) diff --git a/src/container-runner.ts b/src/container-runner.ts index b754690..3683940 100644 --- a/src/container-runner.ts +++ b/src/container-runner.ts @@ -74,6 +74,17 @@ function buildVolumeMounts( readonly: true, }); + // Shadow .env so the agent cannot read secrets from the mounted project root. + // Secrets are passed via stdin instead (see readSecrets()). + const envFile = path.join(projectRoot, '.env'); + if (fs.existsSync(envFile)) { + mounts.push({ + hostPath: '/dev/null', + containerPath: '/workspace/project/.env', + readonly: true, + }); + } + // Main also gets its group folder as the working directory mounts.push({ hostPath: groupDir,