nanoclaw/.claude/skills/add-pdf-reader/SKILL.md
glifocat 0b260ece57 feat(skills): add pdf-reader skill (#772)
Thanks @glifocat! Clean skill package — good docs, solid tests, nice intent files. Pushed a small fix for path traversal on the PDF filename before merging.
2026-03-06 18:47:12 +02:00


name: add-pdf-reader
description: Add PDF reading to NanoClaw agents. Extracts text from PDFs via pdftotext CLI. Handles WhatsApp attachments, URLs, and local files.

Add PDF Reader

Adds PDF reading capability to all container agents using poppler-utils (pdftotext/pdfinfo). PDFs sent as WhatsApp attachments are auto-downloaded to the group workspace.

Phase 1: Pre-flight

Check if already applied

Read .nanoclaw/state.yaml. If add-pdf-reader is in applied_skills, skip to Phase 3 (Verify).
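
A minimal sketch of this check, assuming state.yaml mentions each applied skill by its literal name (the file's exact schema is not shown in this document, so the plain-text match is an assumption). The helper name skill_applied is hypothetical.

```shell
# Hypothetical helper: succeeds if the skill name appears literally in the state file.
# Assumption: state.yaml lists applied skills by name; the real schema may differ.
skill_applied() {
  state_file="$1"
  skill="$2"
  [ -f "$state_file" ] && grep -qF -- "$skill" "$state_file"
}

if skill_applied .nanoclaw/state.yaml add-pdf-reader; then
  echo "add-pdf-reader already applied: skip to Phase 3"
else
  echo "add-pdf-reader not applied: continue with Phase 2"
fi
```

The match is deliberately loose: a skill whose name merely contains add-pdf-reader would also match, so treat the result as a hint and confirm by reading the file.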

Phase 2: Apply Code Changes

Initialize skills system (if needed)

If .nanoclaw/ directory doesn't exist:

npx tsx scripts/apply-skill.ts --init
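
The directory check and the init command can be combined into one guarded step. This is a sketch only; needs_init is a hypothetical helper, and it assumes the command is run from the repository root.

```shell
# Hypothetical helper: succeeds when the skills state directory is missing.
# Defaults to .nanoclaw in the current directory.
needs_init() {
  [ ! -d "${1:-.nanoclaw}" ]
}

# Usage, from the repository root:
#   if needs_init; then npx tsx scripts/apply-skill.ts --init; fi
```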

Apply the skill

npx tsx scripts/apply-skill.ts .claude/skills/add-pdf-reader

This deterministically:

  • Adds container/skills/pdf-reader/SKILL.md (agent-facing documentation)
  • Adds container/skills/pdf-reader/pdf-reader (CLI script)
  • Three-way merges the poppler-utils install and a COPY instruction into container/Dockerfile
  • Three-way merges PDF attachment download into src/channels/whatsapp.ts
  • Three-way merges PDF tests into src/channels/whatsapp.test.ts
  • Records application in .nanoclaw/state.yaml

If merge conflicts occur, read the intent files:

  • modify/container/Dockerfile.intent.md
  • modify/src/channels/whatsapp.ts.intent.md
  • modify/src/channels/whatsapp.test.ts.intent.md
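
Before opening the intent files, it can help to see which merged files still contain unresolved conflicts. The sketch below assumes the three-way merge leaves standard Git-style conflict markers (`<<<<<<<`, `>>>>>>>`), which this document does not confirm. The helper name has_conflict_markers is hypothetical.

```shell
# Hypothetical helper: succeeds if the file contains Git-style conflict markers.
# Assumption: apply-skill.ts leaves such markers on a failed merge.
has_conflict_markers() {
  grep -qE '^(<<<<<<<|>>>>>>>)' "$1"
}

# Check the three files this skill merges into.
for f in container/Dockerfile src/channels/whatsapp.ts src/channels/whatsapp.test.ts; do
  if [ -f "$f" ] && has_conflict_markers "$f"; then
    echo "unresolved conflict: $f"
  fi
done
```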

Validate

npm test
npm run build

Rebuild container

./container/build.sh

Restart service

launchctl kickstart -k gui/$(id -u)/com.nanoclaw  # macOS
# Linux: systemctl --user restart nanoclaw
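
If the same steps are scripted for both platforms, the command can be chosen from the output of `uname -s`. This is a sketch; restart_command is a hypothetical helper that only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper: prints the restart command for a given `uname -s` value.
restart_command() {
  case "$1" in
    Darwin) echo "launchctl kickstart -k gui/$(id -u)/com.nanoclaw" ;;
    Linux)  echo "systemctl --user restart nanoclaw" ;;
    *)      echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}

# Print the command for the current machine.
restart_command "$(uname -s)" || echo "restart the service manually"
```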

Phase 3: Verify

Test PDF extraction

Send a PDF file in any registered WhatsApp chat. The agent should:

  1. Download the PDF to attachments/
  2. Respond acknowledging the PDF
  3. Be able to extract text when asked

Test URL fetching

Ask the agent to read a PDF from a URL. It should use pdf-reader fetch <url>.

Check logs if needed

tail -f logs/nanoclaw.log | grep -i pdf

Look for:

  • Downloaded PDF attachment — successful download
  • Failed to download PDF attachment — media download issue
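
For a quick tally instead of a live tail, the two messages above can be counted in the log file. This is a sketch; pdf_log_summary is a hypothetical helper, and it assumes the messages appear in the log exactly as written above.

```shell
# Hypothetical helper: counts successful and failed PDF downloads in a log file.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the count usable.
pdf_log_summary() {
  ok=$(grep -cF "Downloaded PDF attachment" "$1" || true)
  failed=$(grep -cF "Failed to download PDF attachment" "$1" || true)
  echo "downloaded=$ok failed=$failed"
}

# Usage: pdf_log_summary logs/nanoclaw.log
```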

Troubleshooting

Agent says pdf-reader command not found

The container needs rebuilding. Run ./container/build.sh and restart the service.

PDF text extraction is empty

The PDF may be scanned (image-based). pdftotext only handles text-based PDFs. Consider using the agent-browser to open the PDF visually instead.
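
One way to confirm that a PDF has no text layer is to check whether pdftotext emits anything other than whitespace. This is a sketch; text_is_empty is a hypothetical helper and scan.pdf is a placeholder file name.

```shell
# Hypothetical helper: succeeds when stdin contains only whitespace (or nothing).
# Form feeds, which pdftotext emits between pages, count as whitespace here.
text_is_empty() {
  [ -z "$(tr -d '[:space:]')" ]
}

# Usage inside the container (scan.pdf is a placeholder):
#   if pdftotext scan.pdf - | text_is_empty; then echo "no text layer: likely scanned"; fi
```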

WhatsApp PDF not detected

Verify that the message contains a documentMessage with mimetype application/pdf. Some file-sharing apps send PDFs as generic files without the correct mimetype.
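
To tell a mislabeled PDF from a file that is not a PDF at all, the original file can be checked on the sending side: PDF files normally begin with the bytes `%PDF-`. This is a sketch; looks_like_pdf is a hypothetical helper, and it does not inspect the WhatsApp message itself.

```shell
# Hypothetical helper: succeeds if the file starts with the PDF header "%PDF-".
looks_like_pdf() {
  [ "$(head -c 5 "$1")" = "%PDF-" ]
}

# Usage (report.pdf is a placeholder):
#   looks_like_pdf report.pdf && echo "valid PDF header"
```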