nanoclaw

Author	SHA1	Message	Date
gavrielc	495b7df5fc	merge: resolve conflict with origin/main Keep ASSISTANT_NAME import, drop removed GROUPS_DIR import. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 00:03:20 +02:00
gavrielc	77f7423172	fix: pass host timezone to container and reject UTC-suffixed timestamps (#371) Containers had no TZ set, so any time-aware code inside ran in UTC while the host interpreted bare timestamps as local time. Now TIMEZONE from config.ts is passed via -e TZ= to the container args. Also rejects Z-suffixed or offset-suffixed timestamps in the container's schedule_task validation, since bare timestamps are expected to be local time and silently accepting UTC suffixes would cause an offset mismatch. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 23:23:34 +02:00
Dan Shapiro	107aff850c	fix: pass assistantName to container agent instead of hardcoding 'Andy' The container agent-runner had 'Andy' hardcoded as the sender name in archived conversation transcripts. This ignored the configurable ASSISTANT_NAME setting, so users who changed their assistant's name (via .env or config) would still see 'Andy' in transcripts. - Add assistantName field to ContainerInput interface (both host and container copies) - Pass ASSISTANT_NAME from config through to container in index.ts and task-scheduler.ts - Thread assistantName through createPreCompactHook and formatTranscriptMarkdown in the agent-runner - Use 'AssistantNameMissing' as fallback instead of 'Andy' so a missing name is visible rather than silently wrong Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 12:22:07 -08:00
Lawyered	02d8528684	fix: pause malformed scheduled tasks	2026-02-22 21:01:53 +02:00
Lawyered	c6391cceb1	fix: block group folder path escapes	2026-02-22 21:01:53 +02:00
gavrielc	5fb10645cd	fix: mount project root read-only to prevent container escape (#392) The main group's project root was mounted read-write, allowing the container agent to modify host application code (e.g. dist/container-runner.js) to inject arbitrary mounts on next restart — a full sandbox escape. Fix: mount the project root read-only. Writable paths the agent needs (group folder, IPC, .claude/) are already mounted separately. The agent-runner source is now copied into a per-group writable location so agents can still customize container-side behavior without affecting host code or other groups. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 20:57:57 +02:00
gavrielc	92d14405c5	refactor: move setup scripts out of src/ to reduce build token count Setup scripts are standalone CLI tools run via tsx with no runtime imports from the main app. Moving them out of src/ excludes them from the tsc build output and reduces the compiled bundle size. - git mv src/setup/ setup/ - Fix imports to use ../src/logger.js and ../src/config.js - Update package.json, vitest.config.ts, SKILL.md references - Fix platform tests to be cross-platform (macOS + Linux) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:43:22 +02:00
Daniel M	8fc1c23925	Migrate setup from bash scripts to cross-platform Node.js modules (#382) * refactor: migrate setup from bash scripts to cross-platform Node.js modules Replace 9 bash scripts + qr-auth.html with a two-phase setup system: a bash bootstrap (setup.sh) for Node.js/npm verification, and TypeScript modules (src/setup/) for everything else. Resolves cross-platform issues: sed -i replaced with fs operations, sqlite3 CLI replaced with better-sqlite3, browser opening made cross-platform, service management supports launchd/systemd/WSL nohup fallback, SQL injection prevented with parameterized queries. Add Linux systemctl equivalents alongside macOS launchctl commands in 8 skill files and CLAUDE.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: setup migration issues — pairing code, systemd fallback, nohup escaping - Emit WhatsApp pairing code immediately when received, before polling for auth completion. Previously the code was only shown in the final status block after auth succeeded — a catch-22 since the user needs the code to authenticate. (whatsapp-auth.ts) - Add systemd user session pre-check before attempting to write the user-level service unit. Falls back to nohup wrapper when user-level systemd is unavailable (e.g. su session without login/D-Bus). (service.ts) - Rewrite nohup wrapper template using array join instead of template literal to fix shell variable escaping (\\$ → $). (service.ts) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: detect stale docker group and kill orphaned processes on Linux systemd * fix: remove redundant shell option from execSync to fix TS2769 execSync already runs in a shell by default; the explicit `shell: true` caused a type error with @types/node which expects string, not boolean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: hide QR browser auth option on headless Linux Emit IS_HEADLESS from environment step and condition SKILL.md to only show pairing code + QR terminal when no display server is available (headless Linux without WSL). WSL is excluded from the headless gate because browser opening works via Windows interop. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:25:11 +02:00
gavrielc	5f58941db2	fix: add .catch() handlers to fire-and-forget async calls (#221) (#355) Several async calls in the message loop and group queue are fire-and-forget without .catch() handlers. When WhatsApp disconnects or containers fail unexpectedly, these produce unhandled rejections that can crash the process. Add explicit .catch() at each call site so errors are logged with full context (groupJid, taskId) instead of crashing: - channel.setTyping() in message loop (adapted for channel abstraction) - startMessageLoop() in main() - runForGroup() and runTask() in group-queue (5 call sites) Closes #221 Co-authored-by: Naveen Jain <1779929+naveenspark@users.noreply.github.com> Co-authored-by: Skip Potter <skip.potter.va@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 23:22:51 +02:00
gavrielc	cb294405a5	fix: update voice note test to match empty-content skip behavior The test expected voice notes (audioMessage with no caption) to be delivered with empty content, but `6f177ad` added a guard that skips messages with no text content. Update assertion accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 23:13:43 +02:00
vaibhav	6e22abb1ba	fix: replace hardcoded /Users/user fallback with os.homedir() The HOME_DIR fallback '/Users/user' is macOS-specific and incorrect. Use os.homedir() from Node's os module which works cross-platform and returns the actual home directory from /etc/passwd on Linux. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-21 22:40:57 +02:00
Gavriel Cohen	3d8c0d1c0d	test: add coverage for isTaskContainer and idleWaiting reset Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 22:19:24 +02:00
Gavriel Cohen	c6b69e87a9	fix: correctly trigger idle preemption in streaming input mode The original notifyIdle condition (!result.result) never fired in streaming input mode because every result has non-null text content. This caused due tasks to wait up to 30 minutes for the idle timer. - Call notifyIdle for ALL successful results (not just null ones) - Add isTaskContainer flag so user messages queue instead of being forwarded to task containers (which blocked notifyIdle from the message container's onOutput path) - Reset idleWaiting in sendMessage so containers aren't preempted while actively working on a new incoming message - Replace 30-min IDLE_TIMEOUT with 10s close timer for task containers since they are single-turn and should exit promptly after their result Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 22:19:24 +02:00
gavrielc	93bb94ff55	fix: only preempt idle containers when scheduled tasks enqueue Containers that finish work but stay alive in waitForIpcMessage() block queued scheduled tasks. Previous approaches killed active containers mid-work. This fix tracks idle state via the session-update marker (status: success, result: null) and only preempts when the container is idle-waiting, not actively working. Closes #293 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-21 22:19:24 +02:00
Peyton-Spencer	6f177adafe	fix: skip empty WhatsApp protocol messages WhatsApp protocol messages (encryption key distribution, read receipts, ephemeral settings, senderKeyDistributionMessage) have a message envelope but no text content. These were being stored in messages.db and processed by the agent, causing: 1. Agent responds to empty messages, wasting API tokens 2. Container stays alive indefinitely (idle timer resets) 3. Scheduled tasks blocked (queue slot occupied) This fix skips messages with empty content before calling onMessage, preventing protocol messages from being stored and processed. Fixes #250 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-21 17:27:07 +02:00
Stefan Gasser	d336b32460	fix: copy skill subdirectories recursively (#175) copyFileSync crashes with EISDIR when a skill contains subdirectories like scripts/. Skills support nested folders (scripts/, examples/, templates/) per the Claude Code spec. Use fs.cpSync to handle the complete skill structure.	2026-02-21 17:23:19 +02:00
Gio Lodi	94ba537310	Decouple formatting test from `@Andy` (#329) * Fix trigger pattern tests to use config name Tests hardcoded "Andy" but the pattern is built from `ASSISTANT_NAME` which comes from `.env`. --- Generated with the help of Claude Code, https://claude.ai/code Co-Authored-By: Claude Code Opus 4.6 <noreply@anthropic.com> * Restore usage comment in trim test --- Generated with the help of Claude Code, https://claude.ai/code Co-Authored-By: Claude Code Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Code Opus 4.6 <noreply@anthropic.com>	2026-02-21 17:18:08 +02:00
gavrielc	607623aa59	feat: convert container runtime from Apple Container to Docker (#323) Swap container-runtime.ts to the Docker variant: - CONTAINER_RUNTIME_BIN: 'container' → 'docker' - readonlyMountArgs: --mount bind,readonly → -v host:container:ro - ensureContainerRuntimeRunning: container system status → docker info - cleanupOrphans: Apple Container JSON format → docker ps --filter - build.sh default: container → docker Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 13:20:13 +02:00
gavrielc	c6e1bfecc6	refactor: extract runtime-specific code into src/container-runtime.ts (#321) Move all container-runtime-specific logic (binary name, mount args, stop command, startup check, orphan cleanup) into a single file so swapping runtimes only requires replacing this one file. Neutralize "Apple Container" references in comments and docs that would become incorrect after a runtime swap. References that list both runtimes as options are left unchanged. No behavior change — Apple Container remains the default runtime. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 13:13:55 +02:00
gavrielc	51788de3b9	Skills engine v0.1 + multi-channel infrastructure (#307) * refactor: multi-channel infrastructure with explicit channel/is_group tracking - Add channels[] array and findChannel() routing in index.ts, replacing hardcoded whatsapp.* calls with channel-agnostic callbacks - Add channel TEXT and is_group INTEGER columns to chats table with COALESCE upsert to protect existing values from null overwrites - is_group defaults to 0 (safe: unknown chats excluded from groups) - WhatsApp passes explicit channel='whatsapp' and isGroup to onChatMetadata - getAvailableGroups filters on is_group instead of JID pattern matching - findChannel logs warnings instead of silently dropping unroutable JIDs - Migration backfills channel/is_group from JID patterns for existing DBs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: skills engine v0.1 — deterministic skill packages with rerere resolution Three-way merge engine for applying skill packages on top of a core codebase. Skills declare which files they add/modify, and the engine uses git merge-file for conflict detection with git rerere for automatic resolution of previously-seen conflicts. Key components: - apply: three-way merge with backup/rollback safety net - replay: clean-slate replay for uninstall and rebase - update: core version updates with deletion detection - rebase: bake applied skills into base (one-way) - manifest: validation with path traversal protection - resolution-cache: pre-computed rerere resolutions - structured: npm deps, env vars, docker-compose merging - CI: per-skill test matrix with conflict detection 151 unit tests covering merge, rerere, backup, replay, uninstall, update, rebase, structured ops, and edge cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add Discord and Telegram skill packages Skill packages for adding Discord and Telegram channels to NanoClaw.
Each package includes: - Channel implementation (add/src/channels/) - Three-way merge targets for index.ts, config.ts, routing.test.ts - Intent docs explaining merge invariants - Standalone integration tests - manifest.yaml with dependency/conflict declarations Applied via: npx tsx scripts/apply-skill.ts .claude/skills/add-discord These are inert until applied — no runtime impact. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * remove unused docs (skills-system-status, implementation-guide) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 01:55:00 +02:00
Koshkoshinsk	802805d2ec	Fix/WA reconnect, container perms, assist name in env (#297) * fix: WA 515 stream error reconnect exiting early before key sync Pass isReconnect flag on 515 reconnect so the registered-creds check doesn't bail out before the handshake completes (caused "logging in..." hang after successful pairing). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: container permission errors on Docker with non-default uid Make /home/node world-writable in the Dockerfile so the SDK can write .claude.json. Add --user flag matching host uid/gid in container-runner so bind-mounted files are accessible. Skip when running as root (uid 0), as the container's node user (uid 1000), or on native Windows. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: write ASSISTANT_NAME to .env during setup When a custom assistant name is chosen, persist it to .env so config.ts picks it up at runtime. Uses temp file for cross-platform sed compatibility (macOS/Linux/WSL). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 10:29:55 +02:00
gavrielc	88140ec1bb	feat: add setup skill with scripted steps (#258) Replace inline SKILL.md instructions with executable shell scripts for each setup phase (environment check, deps, container, auth, groups, channels, mounts, service, verify). Scripts emit structured status blocks for reliable parsing. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 00:23:49 +02:00
gavrielc	9261a25531	feat: add is_bot_message column and support dedicated phone numbers (#235) * feat: add is_bot_message column and support dedicated phone numbers Replace fragile content-prefix bot detection with an explicit is_bot_message database column. The old prefix check (content NOT LIKE 'Andy:%') is kept as a backstop for pre-migration messages. - Add is_bot_message column with automatic backfill migration - Add ASSISTANT_HAS_OWN_NUMBER env var to skip name prefix when the assistant has its own WhatsApp number - Move prefix logic into WhatsApp channel (no longer a router concern) - Remove prefixAssistantName from Channel interface - Load .env via dotenv so launchd-managed processes pick up config - WhatsApp bot detection: fromMe for own number, prefix match for shared Based on #160 and #173. Co-Authored-By: Stefan Gasser <stefan@stefangasser.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: extract shared .env parser and remove dotenv dependency Extract .env parsing into src/env.ts, used by both config.ts and container-runner.ts. Reads only requested keys without loading secrets into process.env, avoiding leaking API keys to child processes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Stefan Gasser <stefan@stefangasser.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 15:31:57 +02:00
Gavriel Cohen	6f2e10f0c3	fix: typing indicator now shows on every message, not just the first Two issues fixed: - Use 'paused' instead of 'available' to stop typing. Baileys' sendPresenceUpdate('available') sends a global <presence> stanza and ignores the JID, so chatstate never left 'composing' and WhatsApp suppressed duplicate composing notifications per XEP-0085. - Add setTyping call when piping messages to an already-running container. Previously only the first message (which spawns a new container) triggered the typing indicator. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 14:28:08 +02:00
gavrielc	5c68deef76	fix: repair WhatsApp channel tests (missing Browsers mock and async flush) Added missing Browsers mock to the Baileys vi.mock and made triggerMessages async to flush microtasks before assertions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 23:08:34 +02:00
gavrielc	ae474fd344	fix: use available instead of paused when stopping typing indicator Sending 'paused' after the first response caused WhatsApp to stop relaying subsequent 'composing' presence updates. Using 'available' keeps the bot in a state where typing indicators work consistently. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 23:05:36 +02:00
gavrielc	658f6b02d3	fix: send available presence on connect so typing indicators work consistently Without announcing 'available' after connecting, WhatsApp stops relaying composing/paused presence updates after the first message. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:53:03 +02:00
Cole	1a07869329	security: sanitize env vars from agent Bash subprocesses (#171) Use a PreToolUse SDK hook to prepend `unset ANTHROPIC_API_KEY CLAUDE_CODE_OAUTH_TOKEN` to every Bash command Kit runs, preventing secret leakage via env/printenv/echo/$PROC. Secrets are now passed via stdin JSON instead of mounted env files, closing all known exfiltration vectors. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:33:39 +02:00
Gavriel Cohen	b5a6757211	fix: pass requiresTrigger through IPC and auto-discover additional directories - IPC register_group handler now passes requiresTrigger field to registerGroup(), fixing groups silently defaulting to trigger-required mode - Agent runner scans /workspace/extra/* and passes them as additionalDirectories to the SDK query, so CLAUDE.md files in mounted dirs are loaded automatically Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 12:18:16 +02:00
Gavriel Cohen	acdc6454db	fix: WhatsApp auth improvements and LID translation for DMs - Add pairing code auth with 515 reconnect handling (Baileys stream error after pairing is now retried instead of failing) - Use Browsers.macOS('Chrome') identifier for WhatsApp compatibility - Fix LID-to-phone translation for DMs using signalRepository.getPNForLID - Strip device suffix (:0) from resolved phone JIDs - Update setup skill with three auth options (browser QR, pairing code, terminal QR), DM channel type, and LID troubleshooting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 22:49:04 +02:00
Tom Granot	6863c0bf6b	test: add comprehensive WhatsApp connector tests (#182) 38 tests covering connection lifecycle, authentication, reconnection, message handling (text, image, video, voice, extended text), LID↔JID translation, outgoing message queue, group metadata sync, JID ownership, and typing indicators. Based on deep-dive audit of Baileys v7 internals. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 17:43:28 +02:00
gavrielc	8eb80d4ed0	fix: prevent infinite message replay on container timeout (#164) Container timeout and idle timeout both fire at 30min, racing the graceful shutdown. The hard kill returns error status, rolling back the message cursor even though output was already sent — causing duplicate messages indefinitely. - Grace period: hard timeout is now IDLE_TIMEOUT + 30s minimum - Timeout after output resolves as success (idle cleanup, not failure) - Don't roll back cursor if output was already sent to user - Remove src/telegram.ts and config vars (added to PR #156 by mistake) - Add typecheck step to CI workflow - Add container-runner timeout behavior tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 17:25:42 +02:00
gavrielc	2b56fecfdc	Refactor index (#156) * feat: add Telegram channel with agent swarm support Add Telegram as a messaging channel that can run alongside WhatsApp or standalone (TELEGRAM_ONLY mode). Includes bot pool support for agent swarms where each subagent appears as a different bot identity in the group. - Add grammy dependency for Telegram Bot API - Route messages through tg: JID prefix convention - Add storeMessageDirect for non-Baileys channels - Add sender field to IPC send_message for swarm identity - Support TELEGRAM_BOT_TOKEN, TELEGRAM_ONLY, TELEGRAM_BOT_POOL config Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add index.ts refactor plan Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: extract channel abstraction, IPC, and router from index.ts Break the 1088-line monolith into focused modules: - src/channels/whatsapp.ts: WhatsAppChannel class implementing Channel interface - src/ipc.ts: IPC watcher and task processing with dependency injection - src/router.ts: message formatting, outbound routing, channel lookup - src/types.ts: Channel interface, OnInboundMessage, OnChatMetadata types Also adds regression test suite (98 tests), updates all documentation and skill files to reflect the new architecture. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: add test workflow for PRs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: remove accidentally committed pool-bot assets Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): remove grammy from base dependencies Grammy is installed by the /add-telegram skill, not a base dependency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 00:36:37 +02:00
Gavriel Cohen	b3f5814f48	feat: move to Claude's native memory management Enable CLAUDE_CODE_DISABLE_AUTO_MEMORY=0 and CLAUDE_CODE_ADDITIONAL_DIRECTORIES_CLAUDE_MD=1 in container env so agents use Claude Code's built-in persistent memory instead of manually editing CLAUDE.md. Remove instructions that told agents to write context into CLAUDE.md files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 09:28:25 +02:00
gavrielc	6f02ee530b	Adds Agent Swarms * feat: streaming container mode, IPC messaging, agent teams support Major architectural shift from single-shot container runs to long-lived streaming containers with IPC-based message injection. - Agent runner: query loop with AsyncIterable prompt to keep stdin open for agent teams (fixes isSingleUserTurn premature shutdown) - New standalone stdio MCP server (ipc-mcp-stdio.ts) inheritable by subagents, with send_message and schedule_task tools - Streaming output: parse OUTPUT_START/END markers in real-time, send results to WhatsApp as they arrive - IPC file-based messaging: host writes to ipc/{group}/input/, agent polls for follow-up messages without respawning containers - Per-group settings.json with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 - SDK bumped to 0.2.34 for TeamCreate tool support - Container idle timeout (30min) with _close sentinel for shutdown - Orphaned container cleanup on startup - alwaysRespond flag for groups that skip trigger pattern check - Uncaught exception/rejection handlers with timestamps in logger - Combined SDK documentation into single deep dive reference Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: remove unused ipc-mcp.ts (replaced by ipc-mcp-stdio.ts) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: clarify agent communication model in docs and tool descriptions - CLAUDE.md (main + global): split communication instructions into "responding to messages" vs "scheduled tasks" sections - send_message tool: note that scheduled task output is not sent to user - Remove structured output (outputFormat) — not needed with current flow - Regular output is sent to WhatsApp; scheduled task output is only logged Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: ignore dynamic group data while preserving base structure Only track groups/main/CLAUDE.md and groups/global/CLAUDE.md. 
All other group directories and files are ignored to prevent tracking user-specific session data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve critical bugs in streaming container mode Bug 1 (scheduled task hang): Task scheduler now passes onOutput callback with idle timer that writes _close sentinel after IDLE_TIMEOUT, so containers exit cleanly instead of blocking queue slots for 30 minutes. Scheduled tasks stay alive for interactive follow-up via IPC. Bug 2 (timeout disabled): Remove resetTimeout() from stderr handler. SDK writes debug logs continuously, resetting the timer on every line. Timeout now only resets on actual output markers in stdout. Bug 3 (trigger bypass): Piped messages in startMessageLoop now check trigger pattern for non-main groups. Non-trigger messages accumulate in DB and are pulled as context via getMessagesSince when a trigger arrives. Bug 7 (non-atomic IPC writes): GroupQueue.sendMessage uses temp file + rename for atomic writes, matching ipc-mcp-stdio.ts pattern. Also: flip isVerbose back to false (debug leftover), add isScheduledTask to host-side ContainerInput interface. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: idle timer not starting + scheduled task groupFolder missing Two bugs that prevented the scheduled task idle timeout fix from working: 1. onOutput was only called when parsed.result !== null, but session update markers have result: null. The idle timer never started for "silent" query completions, leaving containers parked at waitForIpcMessage until hard timeout. 2. Scheduler's onProcess callback didn't pass groupFolder to queue.registerProcess, so closeStdin no-oped (groupFolder was null). The _close sentinel was never written even when the idle timer fired. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: duplicate messages and timestamp rollback in piping path Two bugs introduced by the trigger context accumulation change: 1. 
processGroupMessages didn't advance lastAgentTimestamp until after the container finished. The piping path's getMessagesSince(lastAgentTimestamp) re-fetched messages already sent as the initial prompt, causing duplicates. 2. processGroupMessages overwrote lastAgentTimestamp with the original batch timestamp on completion, rolling back any advancement made by the piping path while the container was running. Fix: advance lastAgentTimestamp immediately after building the prompt, before starting the container. This matches the piping path behavior and eliminates both the overlap and the rollback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: container idles 30 extra minutes after _close during query When _close was detected during pollIpcDuringQuery, it was consumed (deleted) and stream.end() was called. But after runQuery returned, main() still emitted a session-update marker (resetting the host's idle timer) and called waitForIpcMessage (which polled forever since _close was already gone). The container had to wait for a second _close. Fix: runQuery now returns closedDuringQuery. When true, main() skips the session-update marker and waitForIpcMessage, exiting immediately. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resume branching, internal tags, and output forwarding - Fix resume branching: pass resumeSessionAt with last assistant UUID to anchor each query loop resume to the correct conversation tree position. Prevents agent responses landing on invisible branches when agent teams subagents create parallel JSONL entries. - Add <internal> tag stripping: agent can wrap internal reasoning in <internal> tags which are logged but not sent to WhatsApp. Prevents duplicate messages and internal monologue reaching users. - Forward scheduled task output: scheduled tasks now send result text to WhatsApp (with <internal> stripping), matching regular message behavior. No more special-case instructions.
- Update Communication guidance in CLAUDE.md: simplified to "your output is sent to the user or group" with soft guidance on <internal> tags and send_message usage. - Add messaging behavior docs to schedule_task tool: prompts the scheduling agent to include guidance on whether the task should always/conditionally/never message the user. - Mount security: containerPath now optional, defaults to basename of hostPath. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: cursor rollback on error, flush guard, verbose logging - Roll back lastAgentTimestamp on container error so retries can re-process the messages instead of silently losing them. - Add guard flag to flushOutgoingQueue to prevent duplicate sends from concurrent flushes during rapid WA reconnects. - Revert isVerbose from hardcoded false back to env-based check (LOG_LEVEL=debug|trace). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: orphan container cleanup was silently failing The startup cleanup used `container ls --format {{.Names}}` which is Docker Go-template syntax. Apple Container only supports `--format json` or `--format table`. The command errored with exit code 64, but the catch block silently swallowed it — orphan containers were never cleaned up on restart. Fixed to use `--format json` and parse `configuration.id` from the JSON output. Also filters by `status: running` and logs a warning on failure instead of silently catching. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add Discord badge and community section Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: idle timer reset on null results and flush queue message loss - Only reset idle timer on actual results (non-null), not session-update markers. Prevents containers staying alive 30 extra minutes after the agent finishes work. - flushOutgoingQueue now uses shift() instead of splice(0) so unattempted messages stay in the queue if an unexpected error bails the loop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add Agent Swarms to README Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: update Telegram skill for current architecture Rewrite integration instructions to match the per-group queue/SQLite architecture: remove onMessage callback pattern (store to DB, let message loop pick up), fix startSchedulerLoop signature, add TELEGRAM_ONLY service startup, SQLite registration, data/env/env sync, @mention-to-trigger translation, and BotFather group privacy docs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Telegram skill message chunking, media placeholders, chat discovery - Split long messages at Telegram's 4096 char limit to prevent silent send failures - Store placeholder text for non-text messages (photos, voice, stickers, etc.) so the agent knows media was sent - Update getAvailableGroups filter to include tg: chats so the agent can discover and register Telegram chats via IPC - Fix removal step numbering Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: update REQUIREMENTS.md and SPEC.md for SQLite architecture - Replace all registered_groups.json / sessions.json / router_state.json references with SQLite equivalents - Fix CONTAINER_TIMEOUT default (300000 → 1800000) - Add missing config exports (IDLE_TIMEOUT, MAX_CONCURRENT_CONTAINERS) - Update folder structure: add missing src files (logger, group-queue, mount-security), remove non-existent utils.ts, list all skills - Fix agent-runner entry (ipc-mcp.ts → ipc-mcp-stdio.ts) - Update startup sequence to reflect per-group queue architecture - Fix env mounting description (data/env/env, not extracted vars) - Update troubleshooting to use sqlite3 commands Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: fix README architecture description, revert SPEC.md env error - README: update architecture blurb to mention per-group queue, add group-queue.ts to key files, update file descriptions - SPEC.md: 
restore correct credential filtering description (only auth vars are extracted from .env, not the full file) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 02:50:43 +02:00
gavrielc	f26468c9b0	fix: setup skill reliability, requiresTrigger option, agent-browser visibility Setup skill fixes: - Run QR auth in foreground with long timeout, not background - Replace fragile message-based registration with DB group sync lookup - Personal chats: ask for phone number instead of querying empty DB - Consolidate trigger word + security model + channel selection into one step - Remove `timeout` shell command (unavailable on macOS), use Bash tool timeout - Query 40 groups, display 10 at a time, support name lookup requiresTrigger support: - Add requiresTrigger field to RegisteredGroup type and DB schema - Skip trigger check when requiresTrigger is false (for solo/personal chats) - Main group still always processes all messages (unchanged) Agent-browser visibility: - Append global CLAUDE.md to non-main agent system prompts via SDK - Add browser tool docs to global and main CLAUDE.md - Update skill description to be broader (not just "web testing") - Reference agent-browser.md in root CLAUDE.md key files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 01:39:31 +02:00
Gavriel Cohen	675ed30ba0	fix: improve container error logging to include full stdout/stderr Always log detailed input/output/stderr on error (not just in verbose mode), and stop truncating stderr/stdout in structured log fields. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 00:17:46 +02:00
gavrielc	44f0b3d99c	fix: improve agent output schema, tool descriptions, and shutdown robustness - Rename status→outputType, responded/silent→message/log for clarity - Remove scheduled task special-casing: userMessage now sent for all contexts - Update schema, tool, and CLAUDE.md descriptions to be clear and non-contradictory about communication mechanisms - Use full tool name mcp__nanoclaw__send_message in docs - Change schedule_task target_group to accept JID instead of folder name - Only show target_group_jid parameter to main group agents - Add defense-in-depth sanitization and error callback to exec() in shutdown - Use "user or group" consistently (supports both 1:1 and group chats) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 20:22:45 +02:00
gavrielc	ae177156ec	feat: per-group queue, SQLite state, graceful shutdown (#111) * fix: wire up queue processMessagesFn before recovery to prevent silent message loss recoverPendingMessages() was called after startMessageLoop(), which meant: 1. Recovery could race with the message loop's first iteration 2. processMessagesFn was set inside startMessageLoop, so recovery enqueues would fire runForGroup with processMessagesFn still null, silently skipping message processing Move setProcessMessagesFn and recoverPendingMessages before startMessageLoop so the queue is fully wired before any messages are enqueued. https://claude.ai/code/session_01PCY8zNjDa2N29jvBAV5vfL * feat: structured agent output to fix infinite retry on silent responses (#113) Use Agent SDK's outputFormat with json_schema to get typed responses from the agent. The agent now returns { status: 'responded' | 'silent', userMessage?, internalLog? } instead of a plain string. This fixes a critical bug where a null/empty agent response caused infinite 5-second retry loops by conflating "nothing to say" with "error". - Agent runner: add AGENT_RESPONSE_SCHEMA and parse structured_output - Host: advance lastAgentTimestamp on both responded AND silent status - GroupQueue: add exponential backoff (5s-80s) with max 5 retries for actual errors, replacing unbounded fixed-interval retries https://claude.ai/code/session_014SLc8MxP9BYhEhDCLox9U8 Co-authored-by: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-02-06 18:54:26 +02:00
gavrielc	03df69e9b5	fix: address review feedback for per-group queue reliability - Fix startup recovery running before WhatsApp connects, which could permanently lose agent responses by advancing lastAgentTimestamp before sock is initialized - Add 5s retry on container failure so messages aren't silently dropped until a new message arrives for the group - Use `container stop` in shutdown instead of raw SIGTERM to CLI wrapper, ensuring proper container cleanup - Replace unnecessary dynamic imports with static imports in processTaskIpc - Guard JSON.parse of DB-stored last_agent_timestamp against corruption - Validate MAX_CONCURRENT_CONTAINERS (default 5, min 1, NaN-safe) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 16:45:00 +02:00
gavrielc	eac9a6acfd	feat: per-group queue, SQLite state, graceful shutdown Add per-group container locking with global concurrency limit to prevent concurrent containers for the same group (#89) and cap total containers. Fix message batching bug where lastAgentTimestamp advanced to trigger message instead of latest in batch, causing redundant re-processing. Move router state, sessions, and registered groups from JSON files to SQLite with automatic one-time migration. Add SIGTERM/SIGINT handlers with graceful shutdown (SIGTERM -> grace period -> SIGKILL). Add startup recovery for messages missed during crash. Remove dead code: utils.ts, Session type, isScheduledTask flag, ContainerConfig.env, getTaskRunLogs, GroupQueue.isActive. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 07:38:07 +02:00
gavrielc	db216a459e	fix: proper container lifecycle management to prevent stopped container accumulation - Name containers (nanoclaw-{group}-{timestamp}) for trackability - Replace SIGKILL timeout with graceful `container stop` so --rm fires - Add startup sweep to clean up stopped nanoclaw containers from previous runs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 07:10:26 +02:00
Gavriel Cohen	3a4d340f80	Fix duplicate responses caused by reconnect-stacking loops WhatsApp reconnections called startMessageLoop/startSchedulerLoop/ startIpcWatcher and setInterval again without stopping the previous instances, creating parallel loops that processed the same messages. Add guard flags so each loop starts only once per process lifetime. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 00:18:55 +02:00
Ejae-dev	117980175e	refactor: deduplicate logger into shared module (#39) three files created identical pino logger instances with the same config. extract into src/logger.ts and import from each consumer. net -9 lines, no behavior change. Co-authored-by: ejae <ejae_dev@ejaes-Mac-mini.home> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 00:40:58 +02:00
yingchao	392ba6262c	fix: translate WhatsApp LID JIDs to phone JIDs for self-chat messages (#62) WhatsApp recently changed to send self-chat messages using LID (Linked ID) format (e.g., xxxxxx@lid) instead of phone number format (e.g., xxxxxx@s.whatsapp.net). This caused messages to yourself to be silently dropped because they didn't match any registered group. ## How to reproduce 1. Send a message to yourself on WhatsApp with the trigger 2. Message is received by Baileys but remoteJid is in LID format 3. LID JID doesn't match registered group JID (phone format) 4. Message is not stored and no response is sent ## The fix - Build a LID-to-phone mapping from sock.user on connection open - Translate incoming LID JIDs to phone JIDs before storing/processing messages - This allows self-chat messages to correctly match the registered main channel The mapping is populated from sock.user.id (phone) and sock.user.lid (LID) which Baileys provides after successful authentication.	2026-02-04 00:33:50 +02:00
gavrielc	21c66df2b1	Add prettier Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 17:14:17 +02:00
Gavriel	4711ec435a	Add register_group IPC command for dynamic group registration Main agent can now register new groups via MCP tool without restart. Host updates both in-memory state and JSON file, creates group folders. Authorization enforced at both agent and host level. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 00:08:40 +02:00
gavrielc	05a29d562f	Security improvements: per-group session isolation, remove built-in Gmail - Isolate Claude sessions per-group (data/sessions/{group}/.claude/) to prevent cross-group access to conversation history - Remove Gmail MCP from built-in (now available via /add-gmail skill) - Add SECURITY.md documenting the security model - Move docs to docs/ folder (SPEC.md, REQUIREMENTS.md, SECURITY.md) - Update documentation to reflect changes Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 00:07:59 +02:00
gavrielc	d000f33928	Add container output size limiting to prevent memory issues (#18) * Fix potential memory DoS via unbounded container output Add CONTAINER_MAX_OUTPUT_SIZE (default 10MB) to limit accumulated stdout/stderr from container processes. Without this limit, a malicious or buggy container could emit huge output leading to host memory exhaustion. Changes: - Add configurable CONTAINER_MAX_OUTPUT_SIZE in config.ts - Implement size-limited output buffering in runContainerAgent - Log warnings when truncation occurs - Include truncation status in container logs https://claude.ai/code/session_01TjVDwwaGwbcFDdmrFF2y8B * Update package-lock.json https://claude.ai/code/session_01TjVDwwaGwbcFDdmrFF2y8B --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-02-01 23:09:50 +02:00
gavrielc	33ef0c68d3	Fix message cursor to only advance on successful processing (#17) Previously, lastTimestamp was unconditionally advanced after each message, even if processMessage failed. This caused transient errors to permanently drop messages since they would never be retried. Now the cursor only advances after successful processing, implementing at-least-once delivery semantics. On failure, the loop breaks and the failed message will be retried on the next poll iteration. https://claude.ai/code/session_01SEQDWxXeZHe7t1bb5cw2CA Co-authored-by: Claude <noreply@anthropic.com>	2026-02-01 23:05:37 +02:00
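
The cursor fix described in 33ef0c68d3 can be illustrated with a small sketch. This is not the project's code: the `Message` shape, the `pollOnce` helper, and the synchronous boolean-returning handler are assumptions made for illustration (the real `processMessage` is likely asynchronous and reads from a store). It shows only the idea stated in the commit message: advance the cursor after a successful handle, and break on the first failure so the failed message is retried on the next poll.

```typescript
// Hypothetical message shape, for illustration only.
interface Message {
  timestamp: string; // same-format ISO strings compare correctly as text
  text: string;
}

// One poll iteration: handle messages newer than `cursor`, oldest first,
// and return the new cursor. The cursor moves only past messages that were
// handled successfully; the first failure stops the loop.
function pollOnce(
  messages: Message[],
  cursor: string,
  processMessage: (m: Message) => boolean, // assumed: true means success
): string {
  const pending = messages
    .filter((m) => m.timestamp > cursor)
    .sort((a, b) =>
      a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0,
    );
  let next = cursor;
  for (const m of pending) {
    if (!processMessage(m)) break; // leave cursor before the failed message
    next = m.timestamp;
  }
  return next;
}

// Usage: message "b" fails once (a simulated transient error), then succeeds.
const inbox: Message[] = [
  { timestamp: "2026-02-01T10:00:00", text: "a" },
  { timestamp: "2026-02-01T10:00:01", text: "b" },
  { timestamp: "2026-02-01T10:00:02", text: "c" },
];
const handled: string[] = [];
let failOnce = true;
const handler = (m: Message): boolean => {
  if (m.text === "b" && failOnce) {
    failOnce = false;
    return false;
  }
  handled.push(m.text);
  return true;
};

let cursor = "";
cursor = pollOnce(inbox, cursor, handler); // handles "a", stops at "b"
const afterFirstPoll = cursor;
cursor = pollOnce(inbox, cursor, handler); // retries "b", then handles "c"
```

Because the cursor stays before the failed message, a handler that partly ran before failing may see that message again on retry. That is the at-least-once trade-off the commit message names, so handlers in such a design should tolerate duplicates.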

94 Commits