mail2couch/CLAUDE.md
Ole-Morten Duesund ea6235b674 feat: implement real IMAP message parsing with native CouchDB attachments
- Replace placeholder message generation with actual IMAP message fetching using go-message library
- Add per-account CouchDB databases for better organization and isolation
- Implement native CouchDB attachment storage with proper revision management
- Add command line argument parsing with --max-messages flag for controlling message processing limits
- Support both sync and archive modes with proper document synchronization
- Add comprehensive test environment with Podman containers (GreenMail IMAP server + CouchDB)
- Implement full MIME multipart parsing for proper body and attachment extraction
- Add TLS and plain IMAP connection support based on port configuration
- Update configuration system to support sync vs archive modes
- Create test scripts and sample data for development and testing

Key technical improvements:
- Real email envelope and header processing with go-imap v2 API
- MIME Content-Type and Content-Disposition parsing for attachment detection
- CouchDB document ID generation using mailbox_uid format for uniqueness
- Duplicate detection and prevention to avoid re-storing existing messages
- Proper error handling and connection management for IMAP operations

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-01 17:04:10 +02:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

mail2couch is a utility for backing up mail from various sources (primarily IMAP) to CouchDB. The project supports two implementations:

  • Go implementation: Located in /go/ directory (currently the active implementation)
  • Rust implementation: Planned but not yet implemented

Development Commands

Go Implementation (Primary)

# Build the application
cd go && go build -o mail2couch .

# Run the application with automatic config discovery
cd go && ./mail2couch

# Run with specific config file
cd go && ./mail2couch -config /path/to/config.json

# Run with message limit (useful for large mailboxes)
cd go && ./mail2couch -max-messages 100

# Run with both config and message limit
cd go && ./mail2couch -config /path/to/config.json -max-messages 50

# Run linting/static analysis
cd go && go vet ./...

# Run tests (currently no tests exist)
cd go && go test ./...

# Check dependencies
cd go && go mod tidy
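
The two flags used in the commands above could be parsed with Go's standard flag package roughly as follows. This is a minimal sketch, not the project's actual code: the options type, the parseArgs helper, and the treatment of 0 as "no limit" are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the two command line settings shown above (hypothetical type).
type options struct {
	configPath  string // value of -config; empty means "use automatic discovery"
	maxMessages int    // value of -max-messages; 0 means "no limit" (assumption)
}

// parseArgs parses args (without the program name) into options.
func parseArgs(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("mail2couch", flag.ContinueOnError)
	fs.StringVar(&opts.configPath, "config", "", "path to config.json")
	fs.IntVar(&opts.maxMessages, "max-messages", 0, "maximum messages to process (0 = no limit)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if opts.maxMessages < 0 {
		return options{}, fmt.Errorf("max-messages must not be negative, got %d", opts.maxMessages)
	}
	return opts, nil
}

func main() {
	opts, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("config=%q max-messages=%d\n", opts.configPath, opts.maxMessages)
}
```

The flag package accepts both the single-dash and double-dash spellings, so -max-messages 100 and --max-messages 100 behave the same.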

Architecture

Core Components

  1. Configuration (config/): JSON-based configuration system

    • Supports multiple mail sources with filtering options
    • CouchDB connection settings
    • Each source can have folder and message filters
  2. Mail Handling (mail/): IMAP client implementation

    • Uses github.com/emersion/go-imap/v2 for IMAP operations
    • Supports TLS connections
    • Currently only lists mailboxes (backup functionality not yet implemented)
  3. CouchDB Integration (couch/): Database operations

    • Uses github.com/go-kivik/kivik/v4 as CouchDB driver
    • Handles database creation and document management
    • Defines MailDocument structure for email storage

Configuration Structure

The application uses config.json for configuration with the following structure:

  • couchDb: Database connection settings (URL, credentials, database name). Note: the database field is now ignored because each mail source gets its own database
  • mailSources: Array of mail sources with individual settings:
    • Protocol support (currently only IMAP)
    • Connection details (host, port, credentials)
    • mode: Either "sync" or "archive" (defaults to "archive" if not specified)
      • sync: 1-to-1 mirror - CouchDB documents match exactly what is in the mail account (documents may be removed from CouchDB)
      • archive: CouchDB keeps every message ever seen, even after it is deleted from the mail account (documents are never removed)
    • Filtering options for folders and messages
    • Enable/disable per source
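
As an illustration of the structure described above, the sketch below maps it onto Go structs and applies the documented "archive" default. Only the couchDb, mailSources, and mode keys come from this document; every other JSON key (url, user, password, database, name, enabled, protocol, host, port), the type names, and the parseConfig helper are assumptions, and the real config package may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CouchDb holds connection settings. JSON keys other than the parent
// "couchDb" key are assumed names.
type CouchDb struct {
	URL      string `json:"url"`
	User     string `json:"user"`
	Password string `json:"password"`
	Database string `json:"database"` // ignored: each mail source gets its own database
}

// MailSource describes one account. Only "mode" is a documented key.
type MailSource struct {
	Name     string `json:"name"`
	Enabled  bool   `json:"enabled"`
	Protocol string `json:"protocol"`
	Host     string `json:"host"`
	Port     int    `json:"port"`
	Mode     string `json:"mode"` // "sync" or "archive"
}

type Config struct {
	CouchDb     CouchDb      `json:"couchDb"`
	MailSources []MailSource `json:"mailSources"`
}

// parseConfig decodes the JSON and fills in the default mode.
func parseConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	for i := range cfg.MailSources {
		src := &cfg.MailSources[i]
		switch src.Mode {
		case "":
			src.Mode = "archive" // documented default
		case "sync", "archive":
			// valid as given
		default:
			return nil, fmt.Errorf("mail source %q: unknown mode %q", src.Name, src.Mode)
		}
	}
	return &cfg, nil
}

func main() {
	raw := []byte(`{
	  "couchDb": {"url": "http://localhost:5984", "user": "admin", "password": "secret"},
	  "mailSources": [
	    {"name": "Personal", "enabled": true, "protocol": "imap", "host": "imap.example.com", "port": 993, "mode": "sync"},
	    {"name": "Work", "enabled": true, "protocol": "imap", "host": "mail.example.com", "port": 143}
	  ]
	}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, src := range cfg.MailSources {
		fmt.Printf("%s: mode=%s\n", src.Name, src.Mode)
	}
}
```

Rejecting unknown mode values up front is a design choice of this sketch; because sync mode may remove documents, failing early on a typo seems safer than silently falling back to a default.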

Configuration File Discovery

The application automatically searches for configuration files in the following order:

  1. Path specified by -config command line flag
  2. ./config.json (current working directory)
  3. ./config/config.json (config subdirectory)
  4. ~/.config/mail2couch/config.json (user XDG config directory)
  5. ~/.mail2couch.json (user home directory)

This design ensures the same config.json format will work for both Go and Rust implementations.
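
The search order above can be sketched as a list of candidate paths that is walked until the first existing file is found. The function names candidatePaths and findConfig are hypothetical, and it is an assumption of this sketch that a missing -config path falls through to the remaining locations instead of causing an error.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// candidatePaths returns the config locations in the documented search order.
// flagPath is the value of -config (may be empty); homeDir is the user's home.
func candidatePaths(flagPath, homeDir string) []string {
	var paths []string
	if flagPath != "" {
		paths = append(paths, flagPath)
	}
	paths = append(paths,
		"config.json",
		filepath.Join("config", "config.json"),
	)
	if homeDir != "" {
		paths = append(paths,
			filepath.Join(homeDir, ".config", "mail2couch", "config.json"),
			filepath.Join(homeDir, ".mail2couch.json"),
		)
	}
	return paths
}

// findConfig returns the first candidate that exists and is not a directory.
func findConfig(paths []string) (string, error) {
	for _, p := range paths {
		info, err := os.Stat(p)
		if err == nil && !info.IsDir() {
			return p, nil
		}
	}
	return "", errors.New("no configuration file found")
}

func main() {
	home, _ := os.UserHomeDir() // empty on failure; home-based locations are then skipped
	path, err := findConfig(candidatePaths("", home))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("using config:", path)
}
```

Separating the path list from the file system check keeps the ordering easy to unit test without touching the disk.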

Current Implementation Status

  • Configuration loading with automatic file discovery
  • Command line flag support for config file path
  • Per-account CouchDB database creation and management
  • IMAP connection and mailbox listing
  • Build error fixes
  • Email message retrieval framework (with placeholder data)
  • Email storage to CouchDB framework with native attachments
  • Folder filtering logic
  • Date filtering support
  • Duplicate detection and prevention
  • Sync vs Archive mode implementation
  • CouchDB attachment storage for email attachments
  • Real IMAP message parsing (currently uses placeholder data)
  • Full message body and attachment handling
  • Incremental sync functionality
  • Rust implementation

Key Dependencies

  • github.com/emersion/go-imap/v2: IMAP client library
  • github.com/go-kivik/kivik/v4: CouchDB client library

Development Notes

  • The main entry point is main.go which orchestrates the configuration loading, CouchDB setup, and mail source processing
  • Each mail source gets its own CouchDB database, named by the GenerateAccountDBName() function
  • Each mail source is processed sequentially with proper error handling
  • The application currently uses placeholder message data for testing the storage pipeline
  • Message filtering by folder (include/exclude) and date (since) is implemented
  • Duplicate detection prevents re-storing existing messages
  • Sync vs Archive mode determines whether to remove documents from CouchDB when they're no longer in the mail account
  • Email attachments are stored as native CouchDB attachments linked to the email document
  • No tests are currently implemented
  • The application uses automatic config file discovery as documented above
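
To make the interaction between duplicate detection and the two modes concrete, the sketch below plans one pass over a mailbox. The mailbox_uid document ID format is taken from the commit message; the docID and reconcile functions and the plan type are hypothetical, and the actual code may organize this differently.

```go
package main

import (
	"fmt"
	"sort"
)

// docID builds a document ID in the mailbox_uid form.
func docID(mailbox string, uid uint32) string {
	return fmt.Sprintf("%s_%d", mailbox, uid)
}

// plan lists the document IDs to store and to remove in one pass.
type plan struct {
	toStore  []string
	toRemove []string
}

// reconcile compares the IDs present on the mail server with those already
// in CouchDB. IDs already stored are skipped (duplicate detection). Only
// "sync" mode schedules removals; "archive" mode never removes documents.
func reconcile(mode string, onServer, inCouch []string) plan {
	server := make(map[string]bool, len(onServer))
	for _, id := range onServer {
		server[id] = true
	}
	stored := make(map[string]bool, len(inCouch))
	for _, id := range inCouch {
		stored[id] = true
	}

	var p plan
	for id := range server {
		if !stored[id] {
			p.toStore = append(p.toStore, id)
		}
	}
	if mode == "sync" {
		for id := range stored {
			if !server[id] {
				p.toRemove = append(p.toRemove, id)
			}
		}
	}
	// Map iteration order is random, so sort for stable results.
	sort.Strings(p.toStore)
	sort.Strings(p.toRemove)
	return p
}

func main() {
	onServer := []string{docID("INBOX", 1), docID("INBOX", 2)}
	inCouch := []string{docID("INBOX", 2), docID("INBOX", 3)}
	for _, mode := range []string{"sync", "archive"} {
		p := reconcile(mode, onServer, inCouch)
		fmt.Printf("%s: store=%v remove=%v\n", mode, p.toStore, p.toRemove)
	}
}
```

In sync mode the server list passed to such a function would need to be complete for the mailbox; a list truncated by a message limit or a filter would otherwise schedule removals for messages that still exist.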

Next Steps

To complete the implementation, the following items need to be addressed:

  1. Real IMAP Message Parsing: Replace placeholder message generation with actual IMAP message fetching and parsing using the correct go-imap/v2 API
  2. Message Body Extraction: Implement proper text/plain and text/html body extraction from multipart messages
  3. Keyword Filtering: Add support for filtering messages by keywords in:
    • Subject line (subjectKeywords)
    • Sender addresses (senderKeywords)
    • Recipient addresses (recipientKeywords)
  4. Attachment Handling: Add support for email attachments (optional)
  5. Error Recovery: Add retry logic for network failures and partial sync recovery
  6. Performance: Add batch operations for better CouchDB insertion performance
  7. Testing: Add unit tests for all major components
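
For item 3, one possible shape for keyword filtering is sketched below. The subjectKeywords, senderKeywords, and recipientKeywords names come from the list above; the matching semantics (case-insensitive substring match, any keyword within a list suffices, an empty list imposes no constraint, all non-empty lists must match) and the MessageFilter and Envelope types are assumptions, not the project's decided behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// MessageFilter holds the three keyword lists (hypothetical type).
type MessageFilter struct {
	SubjectKeywords   []string `json:"subjectKeywords"`
	SenderKeywords    []string `json:"senderKeywords"`
	RecipientKeywords []string `json:"recipientKeywords"`
}

// Envelope is a simplified stand-in for the parsed message headers.
type Envelope struct {
	Subject    string
	Sender     string
	Recipients []string
}

// containsAny reports whether any value contains any keyword, ignoring case.
// An empty keyword list means "no constraint" and always matches.
func containsAny(values, keywords []string) bool {
	if len(keywords) == 0 {
		return true
	}
	for _, v := range values {
		lv := strings.ToLower(v)
		for _, k := range keywords {
			if k != "" && strings.Contains(lv, strings.ToLower(k)) {
				return true
			}
		}
	}
	return false
}

// Matches reports whether the envelope satisfies every configured list.
func (f MessageFilter) Matches(e Envelope) bool {
	return containsAny([]string{e.Subject}, f.SubjectKeywords) &&
		containsAny([]string{e.Sender}, f.SenderKeywords) &&
		containsAny(e.Recipients, f.RecipientKeywords)
}

func main() {
	f := MessageFilter{
		SubjectKeywords: []string{"invoice", "receipt"},
		SenderKeywords:  []string{"@example.com"},
	}
	e := Envelope{
		Subject:    "Your Invoice for July",
		Sender:     "billing@example.com",
		Recipients: []string{"me@example.org"},
	}
	fmt.Println(f.Matches(e))
}
```

Whether the three lists should combine with AND (as here) or OR is a design decision that the configuration documentation would need to state explicitly.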

Development Guidelines

Code Quality and Standards

  • All code must pass linting and be tool-formatted; exceptions are allowed only if properly documented