mail2couch/CLAUDE.md
Ole-Morten Duesund e280aa0aaa refactor: remove webmail interface, focus on core mail storage functionality
- Remove obsolete CouchDB design documents (webmail.json, dashboard.json)
- Clean up webmail-related code from couch/couch.go (WebmailViews, CreateWebmailViews, etc.)
- Update documentation to focus on core mail-to-CouchDB storage functionality
- Add Future Plans section describing planned webmail viewer as separate component
- Apply go fmt formatting and ensure code quality standards
- Update test documentation to show raw CouchDB API access patterns
- Remove compiled binary from repository

This refactor simplifies the codebase to focus on its core purpose: efficiently
backing up emails from IMAP to CouchDB. The webmail interface will be developed
as a separate, optional component to maintain clean separation of concerns.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-02 14:57:51 +02:00

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
mail2couch is a utility for backing up mail from various sources (primarily IMAP) to CouchDB. The project is designed around two implementations:
- **Go implementation**: Located in `/go/` directory (currently the active implementation)
- **Rust implementation**: Planned but not yet implemented
## Development Commands
### Go Implementation (Primary)
```bash
# Build the application
cd go && go build -o mail2couch .
# Run the application with automatic config discovery
cd go && ./mail2couch
# Run with specific config file
cd go && ./mail2couch -config /path/to/config.json
# Run with message limit (useful for large mailboxes)
cd go && ./mail2couch -max-messages 100
# Run with both config and message limit
cd go && ./mail2couch -config /path/to/config.json -max-messages 50
# Run linting/static analysis
cd go && go vet ./...
# Run integration tests with Podman containers
cd test && ./run-tests.sh
# Run specialized tests
cd test && ./test-wildcard-patterns.sh
cd test && ./test-incremental-sync.sh
# Run unit tests (none currently implemented)
cd go && go test ./...
# Tidy module dependencies (adds missing, removes unused)
cd go && go mod tidy
```
## Architecture
### Core Components
1. **Configuration (`config/`)**: JSON-based configuration system
   - Supports multiple mail sources with filtering options
   - CouchDB connection settings
   - Each source can have folder and message filters
2. **Mail Handling (`mail/`)**: IMAP client implementation
   - Uses `github.com/emersion/go-imap/v2` for IMAP operations
   - Supports TLS connections
   - Fetches and processes email messages from IMAP mailboxes
3. **CouchDB Integration (`couch/`)**: Database operations
   - Uses `github.com/go-kivik/kivik/v4` as CouchDB driver
   - Handles database creation and document management
   - Defines `MailDocument` structure for email storage
### Configuration Structure
The application uses `config.json` for configuration with the following structure:
- `couchDb`: Database connection settings (URL, credentials)
- `mailSources`: Array of mail sources with individual settings:
  - Protocol support (currently only IMAP)
  - Connection details (host, port, credentials)
  - `mode`: Either "sync" or "archive" (defaults to "archive" if not specified)
    - **sync**: 1-to-1 relationship - CouchDB documents match exactly what's in the mail account (may remove documents from CouchDB)
    - **archive**: Archive mode - CouchDB keeps all messages ever seen, even if deleted from mail account (never removes documents)
  - Filtering options for folders and messages with wildcard support
  - Enable/disable per source
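
As a rough illustration of the `mode` default described above, the sketch below decodes a config and fills in "archive" when `mode` is missing. Only the `couchDb`, `mailSources`, and `mode` keys come from this document; the `MailSource` fields `name`, `host`, `port`, and `enabled`, and the `ParseConfig` helper, are hypothetical and may not match the real `config/` package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MailSource is a hypothetical shape for one mailSources entry.
// Only "mode" is named in this document; the other keys are assumptions.
type MailSource struct {
	Name    string `json:"name"`
	Host    string `json:"host"`
	Port    int    `json:"port"`
	Enabled bool   `json:"enabled"`
	Mode    string `json:"mode"`
}

// Config mirrors the two top-level keys described above.
// couchDb is kept raw because its inner keys are not specified here.
type Config struct {
	CouchDb     json.RawMessage `json:"couchDb"`
	MailSources []MailSource    `json:"mailSources"`
}

// ParseConfig decodes the JSON, applies the documented "archive" default,
// and rejects any mode other than "sync" or "archive".
func ParseConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	for i := range cfg.MailSources {
		src := &cfg.MailSources[i]
		switch src.Mode {
		case "":
			src.Mode = "archive"
		case "sync", "archive":
			// valid as written
		default:
			return nil, fmt.Errorf("mail source %d: invalid mode %q", i, src.Mode)
		}
	}
	return &cfg, nil
}

func main() {
	raw := []byte(`{"couchDb": {}, "mailSources": [{"name": "work", "host": "imap.example.com", "port": 993, "enabled": true}]}`)
	cfg, err := ParseConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.MailSources[0].Mode)
}
```

Rejecting unknown modes at load time is a design choice of this sketch, made because a typo that silently selects "sync" could delete documents.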
### Configuration File Discovery
The application automatically searches for configuration files in the following order:
1. Path specified by `-config` command line flag
2. `./config.json` (current working directory)
3. `./config/config.json` (config subdirectory)
4. `~/.config/mail2couch/config.json` (user XDG config directory)
5. `~/.mail2couch.json` (user home directory)
This design ensures the same `config.json` format will work for both Go and Rust implementations.
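
The search order above can be sketched roughly as follows. The helper names `candidatePaths`, `findConfig`, and `fileExists` are hypothetical, and the fall-through behaviour when the `-config` path does not exist is an assumption; the real implementation may report an error instead.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidatePaths lists config locations in the documented search order.
// flagPath is the value of -config ("" when the flag was not given).
func candidatePaths(flagPath, home string) []string {
	var paths []string
	if flagPath != "" {
		paths = append(paths, flagPath)
	}
	paths = append(paths,
		"config.json",
		filepath.Join("config", "config.json"),
	)
	if home != "" {
		paths = append(paths,
			filepath.Join(home, ".config", "mail2couch", "config.json"),
			filepath.Join(home, ".mail2couch.json"),
		)
	}
	return paths
}

// findConfig returns the first candidate for which exists reports true.
// The exists check is injected so the ordering can be tested without files.
func findConfig(paths []string, exists func(string) bool) (string, error) {
	for _, p := range paths {
		if exists(p) {
			return p, nil
		}
	}
	return "", fmt.Errorf("no config file found in %d locations", len(paths))
}

// fileExists reports whether path names an existing non-directory.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}

func main() {
	home, _ := os.UserHomeDir() // an empty home simply skips the per-user paths
	path, err := findConfig(candidatePaths("", home), fileExists)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("using", path)
}
```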
### Current Implementation Status
- ✅ Configuration loading with automatic file discovery
- ✅ Command line flag support for config file path
- ✅ Per-account CouchDB database creation and management
- ✅ IMAP connection and mailbox listing
- ✅ Build error fixes
- ✅ Real IMAP message retrieval and parsing
- ✅ Email storage to CouchDB framework with native attachments
- ✅ Folder filtering logic with wildcard support (`*`, `?`, `[abc]` patterns)
- ✅ Date filtering support
- ✅ Keyword filtering support (subject, sender, recipient keywords)
- ✅ Duplicate detection and prevention
- ✅ Sync vs Archive mode implementation
- ✅ CouchDB attachment storage for email attachments
- ✅ Full message body and attachment handling with MIME multipart support
- ✅ Command line argument support (`-max-messages` flag)
- ✅ Incremental sync functionality with IMAP SEARCH and sync metadata tracking
- ❌ Rust implementation
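
The wildcard syntax listed above (`*`, `?`, `[abc]`) is the same syntax accepted by Go's standard `path.Match`, so folder filtering can be sketched as below. This is an illustration only: whether the project uses `path.Match`, and whether it has separate include and exclude lists, are assumptions, and `matchesAny` and `includeFolder` are hypothetical names.

```go
package main

import (
	"fmt"
	"path"
)

// matchesAny reports whether folder matches at least one glob pattern.
func matchesAny(folder string, patterns []string) (bool, error) {
	for _, p := range patterns {
		ok, err := path.Match(p, folder)
		if err != nil {
			return false, fmt.Errorf("bad pattern %q: %w", p, err)
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

// includeFolder applies hypothetical include/exclude semantics:
// an empty include list admits every folder, and exclude always wins.
func includeFolder(folder string, include, exclude []string) (bool, error) {
	if len(include) > 0 {
		ok, err := matchesAny(folder, include)
		if err != nil || !ok {
			return false, err
		}
	}
	excluded, err := matchesAny(folder, exclude)
	if err != nil {
		return false, err
	}
	return !excluded, nil
}

func main() {
	include := []string{"INBOX", "Work202?", "S[ae]nt"}
	exclude := []string{"Spam"}
	for _, f := range []string{"INBOX", "Work2023", "Work2024", "Spam", "Sent"} {
		ok, err := includeFolder(f, include, exclude)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-10s %v\n", f, ok)
	}
}
```

One caveat with this approach: in `path.Match`, `*` does not match `/`, so a pattern such as `Work*` would not match a nested folder like `Work/Projects` on servers that use `/` as the hierarchy delimiter.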
### Key Dependencies
- `github.com/emersion/go-imap/v2`: IMAP client library
- `github.com/go-kivik/kivik/v4`: CouchDB client library
### Incremental Sync Implementation
The application implements intelligent incremental synchronization to avoid re-processing messages:
- **Sync Metadata Storage**: Each mailbox sync operation stores metadata including last sync timestamp and highest UID processed
- **IMAP SEARCH Integration**: Uses IMAP SEARCH with SINCE criteria for efficient server-side filtering of new messages
- **Per-Mailbox Tracking**: Sync state is tracked independently for each mailbox in each account
- **Fallback Behavior**: Gracefully falls back to fetching recent messages if IMAP SEARCH fails
- **First Sync Handling**: Initial sync can use config `since` date or perform full sync
Sync metadata documents are stored in CouchDB with ID format: `sync_metadata_{mailbox}` and include:
- `lastSyncTime`: When this mailbox was last successfully synced
- `lastMessageUID`: Highest UID processed in the last sync
- `messageCount`: Number of messages processed in the last sync
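
A minimal sketch of such a metadata document is shown below, assuming the three fields are stored under exactly the JSON names listed above. The Go type and the helpers `syncMetadataID`, `imapSinceDate`, and `newMessageUIDs` are hypothetical. Because IMAP `SINCE` compares dates only and ignores the time of day, a search can return messages that were already processed earlier on the same day; filtering the results by the stored highest UID is one way to drop them, though this document does not say whether the project does so.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// SyncMetadata is a hypothetical Go shape for the document described above.
type SyncMetadata struct {
	ID             string    `json:"_id"`
	LastSyncTime   time.Time `json:"lastSyncTime"`
	LastMessageUID uint32    `json:"lastMessageUID"`
	MessageCount   int       `json:"messageCount"`
}

// syncMetadataID builds the documented ID format sync_metadata_{mailbox}.
func syncMetadataID(mailbox string) string {
	return "sync_metadata_" + mailbox
}

// imapSinceDate formats a time as an IMAP date (day-month-year),
// which is the form the SINCE search key expects.
func imapSinceDate(t time.Time) string {
	return t.Format("2-Jan-2006")
}

// newMessageUIDs keeps only UIDs above the highest one already processed.
func newMessageUIDs(uids []uint32, lastUID uint32) []uint32 {
	var fresh []uint32
	for _, uid := range uids {
		if uid > lastUID {
			fresh = append(fresh, uid)
		}
	}
	return fresh
}

func main() {
	meta := SyncMetadata{
		ID:             syncMetadataID("INBOX"),
		LastSyncTime:   time.Date(2025, time.August, 2, 12, 0, 0, 0, time.UTC),
		LastMessageUID: 120,
		MessageCount:   20,
	}
	out, err := json.MarshalIndent(meta, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println("SEARCH SINCE", imapSinceDate(meta.LastSyncTime))
	fmt.Println(newMessageUIDs([]uint32{118, 120, 121, 125}, meta.LastMessageUID))
}
```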
### Development Notes
- The main entry point is `main.go` which orchestrates the configuration loading, CouchDB setup, and mail source processing
- Each mail source gets its own CouchDB database, named by the `GenerateAccountDBName()` function with an `m2c_` prefix
- Each mail source is processed sequentially with proper error handling
- The application parses real IMAP messages with the `go-message` library for full email processing
- Message filtering by folder (wildcard patterns), date (since), and keywords is implemented
- Duplicate detection prevents re-storing existing messages
- Sync vs Archive mode determines whether to remove documents from CouchDB when they're no longer in the mail account
- Email attachments are stored as native CouchDB attachments linked to the email document
- Comprehensive test environment with Podman containers and automated test scripts
- The application uses automatic config file discovery as documented above
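
The difference between the two modes noted above comes down to whether stored documents that are missing from the mail account get removed. The sketch below illustrates that decision on plain string IDs. The function name `docsToRemove` and the use of simple ID lists are assumptions made for illustration, not the project's actual data model.

```go
package main

import (
	"fmt"
	"sort"
)

// docsToRemove returns the stored document IDs that should be deleted.
// In "sync" mode these are IDs no longer present on the mail server;
// in any other mode (including "archive") nothing is ever removed.
func docsToRemove(mode string, stored, onServer []string) []string {
	if mode != "sync" {
		return nil
	}
	present := make(map[string]struct{}, len(onServer))
	for _, id := range onServer {
		present[id] = struct{}{}
	}
	var stale []string
	for _, id := range stored {
		if _, ok := present[id]; !ok {
			stale = append(stale, id)
		}
	}
	sort.Strings(stale) // stable order for logging and tests
	return stale
}

func main() {
	stored := []string{"INBOX_1", "INBOX_2", "INBOX_3"}
	onServer := []string{"INBOX_2"}
	fmt.Println("sync:   ", docsToRemove("sync", stored, onServer))
	fmt.Println("archive:", docsToRemove("archive", stored, onServer))
}
```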
### Next Steps
The following enhancements could further improve the implementation:
1. **Error Recovery**: Add retry logic for network failures and partial sync recovery
2. **Performance Optimization**: Add batch operations for better CouchDB insertion performance
3. **Unit Testing**: Add comprehensive unit tests for all major components
4. **Advanced Filtering**: Add support for more complex filter expressions and regex patterns
5. **Monitoring**: Add metrics and logging for production deployment
6. **Configuration Validation**: Enhanced validation for configuration files
7. **Multi-threading**: Parallel processing of multiple mailboxes or accounts
## Development Guidelines
### Code Quality and Standards
- All code must pass linting and be tool-formatted; exceptions are allowed only when properly documented