nanoclaw/.claude/skills/add-voice-transcription/modify/src/channels/whatsapp.ts.intent.md
gavrielc a4072162b7 feat: add voice transcription as nanorepo skill (#326)
Add voice transcription skill package at
.claude/skills/add-voice-transcription/ so it can be applied via the
skills engine. Skill adds src/transcription.ts (OpenAI Whisper), modifies
whatsapp.ts to detect/transcribe voice notes, and includes intent files,
3 test cases, and 8 skill validation tests.

Also fixes skills engine runNpmInstall() to use --legacy-peer-deps,
needed for any skill adding deps with Zod v3 peer requirements.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 14:18:54 +02:00

Intent: src/channels/whatsapp.ts modifications

What changed

Added voice message transcription support. When a WhatsApp voice note (PTT audio) arrives, it is downloaded and transcribed via OpenAI Whisper before being stored as message content.

Key sections

Imports (top of file)

  • Added: isVoiceMessage, transcribeAudioMessage from ../transcription.js

messages.upsert handler (inside connectInternal)

  • Added: let finalContent = content, a mutable copy of the extracted content that a voice transcript can override
  • Added: isVoiceMessage(msg) check after content extraction
  • Added: try/catch block calling transcribeAudioMessage(msg, this.sock)
    • Success: finalContent = '[Voice: <transcript>]'
    • Null result: finalContent = '[Voice Message - transcription unavailable]'
    • Error: finalContent = '[Voice Message - transcription failed]'
  • Changed: this.opts.onMessage() call uses finalContent instead of content
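
The handler changes listed above can be sketched as a standalone function. This is a minimal illustration, not the actual whatsapp.ts code: the IncomingMessage shape, the resolveFinalContent name, and the injected isVoiceMessage/transcribe parameters are hypothetical stand-ins for the Baileys message type and for the helpers imported from ../transcription.js (the real transcribeAudioMessage call also receives this.sock).

```typescript
// Hypothetical minimal message shape; the real handler works with Baileys messages.
interface IncomingMessage {
  isVoice: boolean;
}

// Stand-in for transcribeAudioMessage: resolves to a transcript,
// or null when transcription is unavailable.
type Transcriber = (msg: IncomingMessage) => Promise<string | null>;

// Mirrors the three outcomes described in the bullet list above.
async function resolveFinalContent(
  msg: IncomingMessage,
  content: string,
  isVoiceMessage: (m: IncomingMessage) => boolean,
  transcribe: Transcriber,
): Promise<string> {
  // Start from the extracted text so non-voice messages pass through unchanged.
  let finalContent = content;

  if (isVoiceMessage(msg)) {
    try {
      const transcript = await transcribe(msg);
      finalContent =
        transcript !== null
          ? `[Voice: ${transcript}]`
          : "[Voice Message - transcription unavailable]";
    } catch {
      // Any error during download or transcription maps to a fixed placeholder.
      finalContent = "[Voice Message - transcription failed]";
    }
  }

  return finalContent;
}
```

In this sketch the caller would pass the returned value to this.opts.onMessage() in place of content, so a transcription failure never blocks delivery of the message itself.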

Invariants (must-keep)

  • All existing message handling (conversation, extendedTextMessage, imageMessage, videoMessage) unchanged
  • Connection lifecycle (connect, reconnect, disconnect) unchanged
  • LID translation logic unchanged
  • Outgoing message queue unchanged
  • Group metadata sync unchanged
  • sendMessage prefix logic unchanged
  • setTyping, ownsJid, isConnected unchanged