# mail2couch

A powerful email backup utility that synchronizes mail from IMAP accounts to CouchDB databases with intelligent incremental sync, comprehensive filtering, and native attachment support.

## Features

### Core Functionality

- **IMAP Email Backup**: Connect to any IMAP server (Gmail, Outlook, self-hosted)
- **CouchDB Storage**: Store emails as JSON documents with native CouchDB attachments
- **Incremental Sync**: Efficiently sync only new messages using IMAP SEARCH with timestamp tracking
- **Per-Account Databases**: Each mail source gets its own CouchDB database for better organization
- **Duplicate Prevention**: Automatic detection and prevention of duplicate message storage

### Sync Modes

- **Archive Mode**: Preserve all messages ever seen, even if deleted from mail server (default)
- **Sync Mode**: Maintain 1-to-1 relationship with mail server (removes deleted messages from CouchDB)

### Advanced Filtering

- **Wildcard Folder Patterns**: Use `*`, `?`, `[abc]` patterns for flexible folder selection
- **Keyword Filtering**: Filter messages by keywords in subjects, senders, or recipients
- **Date Filtering**: Process only messages since a specific date
- **Include/Exclude Logic**: Combine multiple filter types for precise control

### Message Processing

- **Full MIME Support**: Parse multipart messages, HTML/plain text, and embedded content
- **Native Attachments**: Store email attachments as CouchDB native attachments with compression
- **Complete Headers**: Preserve all email headers and metadata
- **UTF-8 Support**: Handle international characters and special content

### HTML Webmail Interface

- **Beautiful Web Interface**: Modern, responsive HTML presentations for viewing archived emails
- **Gmail-like Design**: Professional, mobile-friendly interface with clean typography
- **Message Lists**: Dynamic HTML lists with sorting, filtering, and folder organization
- **Individual Messages**: Rich HTML display with proper formatting, URL linking, and
  collapsible headers
- **Attachment Support**: Direct download links with file type and size information
- **Search Integration**: Full-text subject search with keyword highlighting
- **Folder Analytics**: Message count summaries and folder-based navigation
- **Mobile Responsive**: Optimized for desktop, tablet, and mobile viewing

### Operational Features

- **Automatic Config Discovery**: Finds configuration files in standard locations
- **Command Line Control**: GNU-style options with `--max-messages`/`-m` and `--config`/`-c` flags
- **Comprehensive Logging**: Detailed output for monitoring and troubleshooting
- **Error Resilience**: Graceful handling of network issues and server problems

## Quick Start

### Installation

1. **Install dependencies**:

   ```bash
   # Go 1.21+ required
   go version
   ```

2. **Clone and build**:

   ```bash
   git clone <repository-url>
   cd mail2couch/go
   go build -o mail2couch .
   ```

### Basic Usage

1. **Create configuration file** (`config.json`):

   ```json
   {
     "couchDb": {
       "url": "http://localhost:5984",
       "user": "admin",
       "password": "password"
     },
     "mailSources": [
       {
         "name": "Personal Gmail",
         "enabled": true,
         "protocol": "imap",
         "host": "imap.gmail.com",
         "port": 993,
         "user": "your-email@gmail.com",
         "password": "your-app-password",
         "mode": "archive",
         "folderFilter": {
           "include": ["*"],
           "exclude": ["[Gmail]/Trash", "[Gmail]/Spam"]
         }
       }
     ]
   }
   ```

2. **Run mail2couch**:

   ```bash
   ./mail2couch
   ```

The application will:

- Create a CouchDB database named `m2c_personal_gmail`
- Sync all folders except Trash and Spam
- Store messages with native attachments
- Track sync state for efficient incremental updates

## Configuration

### Configuration File Discovery

mail2couch automatically searches for configuration files in this order:

1. Path specified by `--config`/`-c` flag
2. `./config.json` (current directory)
3. `./config/config.json` (config subdirectory)
4. `~/.config/mail2couch/config.json` (user config directory)
5.
   `~/.mail2couch.json` (user home directory)

### Command Line Options

```bash
./mail2couch [options]

Options:
  -c, --config FILE       Path to configuration file
  -m, --max-messages N    Limit messages processed per mailbox per run (0 = unlimited)
  -h, --help              Show help message
```

### Folder Pattern Examples

| Pattern | Description | Matches |
|---------|-------------|---------|
| `"*"` | All folders | `INBOX`, `Sent`, `Work/Projects`, etc. |
| `"INBOX"` | Exact match | `INBOX` only |
| `"Work*"` | Prefix match | `Work`, `Work/Projects`, `WorkStuff` |
| `"*/Archive"` | Suffix match | `Personal/Archive`, `Work/Archive` |
| `"Work/*"` | Subfolder match | `Work/Projects`, `Work/Clients` |

### Keyword Filtering Examples

```json
{
  "messageFilter": {
    "subjectKeywords": ["urgent", "meeting", "invoice"],
    "senderKeywords": ["@company.com", "noreply@"],
    "recipientKeywords": ["team@", "support@"]
  }
}
```

## Advanced Configuration Examples

See the [example configurations](#example-configurations) section below for detailed configuration scenarios.

## Testing

A comprehensive test environment is included with Podman containers:

```bash
cd test

# Quick automated testing (recommended)
./run-tests.sh                  # Complete integration test with automatic cleanup

# Specialized feature testing
./test-wildcard-patterns.sh     # Test folder pattern matching
./test-incremental-sync.sh      # Test incremental synchronization

# Manual testing environment
./start-test-env.sh             # Start persistent test environment
# ... manual testing with various configurations ...
./stop-test-env.sh              # Clean up when done
```

## Architecture

### Database Structure

- **Per-Account Databases**: Each mail source creates its own CouchDB database with `m2c_` prefix
- **Message Documents**: Each email becomes a CouchDB document with metadata
- **Native Attachments**: Email attachments stored as CouchDB attachments (compressed)
- **Sync Metadata**: Tracks incremental sync state per mailbox
- **HTML Webmail Views**: CouchDB design documents with show/list functions for web interface

### Document Structure

```json
{
  "_id": "INBOX_12345",
  "sourceUid": "12345",
  "mailbox": "INBOX",
  "from": ["sender@example.com"],
  "to": ["recipient@example.com"],
  "subject": "Sample Email",
  "date": "2024-01-15T10:30:00Z",
  "body": "Email content...",
  "headers": {"Content-Type": ["text/plain"]},
  "storedAt": "2024-01-15T10:35:00Z",
  "docType": "mail",
  "hasAttachments": true,
  "_attachments": {
    "document.pdf": {
      "content_type": "application/pdf",
      "length": 54321
    }
  }
}
```

### Accessing Stored Emails

Once mail2couch has synced your emails, you can access them through CouchDB's REST API:

#### Raw Data Access

```bash
# List all databases
http://localhost:5984/_all_dbs

# View database info
http://localhost:5984/{database}

# List all documents in database
http://localhost:5984/{database}/_all_docs

# Get individual message
http://localhost:5984/{database}/{message_id}

# Download a single attachment from a message
http://localhost:5984/{database}/{message_id}/{attachment_name}
```

## Example Configurations

### Simple Configuration

Basic setup for a single Gmail account:

```json
{
  "couchDb": {
    "url": "http://localhost:5984",
    "user": "admin",
    "password": "password"
  },
  "mailSources": [
    {
      "name": "Personal Gmail",
      "enabled": true,
      "protocol": "imap",
      "host": "imap.gmail.com",
      "port": 993,
      "user": "your-email@gmail.com",
      "password": "your-app-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX", "Sent"],
        "exclude": []
      },
      "messageFilter": {
        "since": "2024-01-01"
      }
    }
  ]
}
```

### Advanced Multi-Account Configuration

Complex setup with multiple accounts, filtering, and different sync modes:

```json
{
  "couchDb": {
    "url": "https://your-couchdb.example.com:5984",
    "user": "backup_user",
    "password": "secure_password"
  },
  "mailSources": [
    {
      "name": "Work Email",
      "enabled": true,
      "protocol": "imap",
      "host": "outlook.office365.com",
      "port": 993,
      "user": "you@company.com",
      "password": "app-password",
      "mode": "sync",
      "folderFilter": {
        "include": ["*"],
        "exclude": ["Deleted Items", "Junk Email", "Drafts"]
      },
      "messageFilter": {
        "since": "2023-01-01",
        "subjectKeywords": ["project", "meeting", "urgent"],
        "senderKeywords": ["@company.com", "@client.com"]
      }
    },
    {
      "name": "Personal Gmail",
      "enabled": true,
      "protocol": "imap",
      "host": "imap.gmail.com",
      "port": 993,
      "user": "personal@gmail.com",
      "password": "gmail-app-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX", "Important", "Work/*", "Personal/*"],
        "exclude": ["[Gmail]/Trash", "[Gmail]/Spam", "*Temp*"]
      },
      "messageFilter": {
        "recipientKeywords": ["family@", "personal@"]
      }
    },
    {
      "name": "Self-Hosted Mail",
      "enabled": true,
      "protocol": "imap",
      "host": "mail.yourdomain.com",
      "port": 143,
      "user": "admin@yourdomain.com",
      "password": "mail-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX", "Archive/*", "Projects/*"],
        "exclude": ["*/Drafts", "Trash"]
      },
      "messageFilter": {
        "since": "2023-06-01",
        "subjectKeywords": ["invoice", "receipt", "statement"]
      }
    },
    {
      "name": "Legacy Account",
      "enabled": false,
      "protocol": "imap",
      "host": "legacy.mailserver.com",
      "port": 993,
      "user": "old@account.com",
      "password": "legacy-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX"],
        "exclude": []
      },
      "messageFilter": {}
    }
  ]
}
```

### Configuration Options Reference

#### CouchDB Configuration

- `url`: CouchDB server URL with protocol and port
- `user`: CouchDB username with database access
- `password`: CouchDB password

#### Mail Source Configuration

- `name`: Descriptive name (used for
  database naming)
- `enabled`: Boolean to enable/disable this source
- `protocol`: Only `"imap"` currently supported
- `host`: IMAP server hostname
- `port`: IMAP port (993 for TLS, 143 for plain, 3143 for testing)
- `user`: Email account username
- `password`: Email account password (use app passwords for Gmail/Outlook)
- `mode`: `"sync"` (mirror server) or `"archive"` (preserve all messages)

#### Folder Filter Configuration

- `include`: Array of folder patterns to process (empty = all folders)
- `exclude`: Array of folder patterns to skip

#### Message Filter Configuration

- `since`: Date string (YYYY-MM-DD) to process messages from
- `subjectKeywords`: Array of keywords that must appear in subject line
- `senderKeywords`: Array of keywords that must appear in sender addresses
- `recipientKeywords`: Array of keywords that must appear in recipient addresses

## Production Deployment

### Security Considerations

- Use app passwords instead of account passwords
- Store configuration files with restricted permissions (600)
- Use HTTPS for CouchDB connections in production
- Consider encrypting sensitive configuration data

### Monitoring and Maintenance

- Review sync metadata documents for sync health
- Monitor CouchDB database sizes and compaction
- Set up log rotation for application output
- Schedule regular backups of CouchDB databases

### Performance Tuning

- Use `--max-messages`/`-m` to limit processing load
- Run during off-peak hours for large initial syncs
- Monitor IMAP server rate limits and connection limits
- Consider running multiple instances for different accounts

## Troubleshooting

### Common Issues

**Connection Errors**:

- Verify IMAP server settings and credentials
- Check firewall and network connectivity
- Ensure correct ports (993 for TLS, 143 for plain)

**Authentication Failures**:

- Use app passwords for Gmail, Outlook, and other providers
- Enable "Less Secure Apps" if required by provider
- Verify account permissions and 2FA settings

**Sync
Issues**:

- Check CouchDB connectivity and permissions
- Review sync metadata documents for error states
- Verify folder names and patterns match server structure

**Performance Problems**:

- Use date filtering (`since`) for large mailboxes
- Implement `--max-messages`/`-m` limits for initial syncs
- Monitor server-side rate limiting

For detailed troubleshooting, see the [test environment documentation](test/README.md).

## Future Plans

### CouchDB-Hosted Webmail Viewer

We plan to develop a comprehensive webmail interface for viewing the archived emails directly through CouchDB. This will include:

- **📧 Modern Web Interface**: A responsive, Gmail-style webmail viewer built on CouchDB design documents
- **🔍 Advanced Search**: Full-text search across subjects, senders, and message content
- **📁 Folder Organization**: Browse messages by mailbox with visual indicators and statistics
- **📎 Attachment Viewer**: Direct download and preview of email attachments
- **📱 Mobile Support**: Optimized interface for tablets and smartphones
- **🎨 Customizable Themes**: Multiple UI themes and layout options
- **⚡ Real-time Updates**: Live synchronization as new emails are archived
- **🔐 Authentication**: Secure access controls and user management
- **📊 Analytics Dashboard**: Email statistics and storage insights

This webmail viewer will be implemented as:

- **CouchDB Design Documents**: Views, shows, and list functions for data access
- **Self-contained HTML/CSS/JS**: No external dependencies or servers required
- **RESTful Architecture**: Clean API endpoints for integration with other tools
- **Progressive Enhancement**: Works with JavaScript disabled for basic functionality

The webmail interface will be a separate component that can be optionally installed alongside the core mail2couch storage functionality, maintaining the clean separation between data archival and presentation layers.

## Contributing

This project welcomes contributions!
Please see [CLAUDE.md](CLAUDE.md) for development setup and architecture details.

## License

[License information to be added]
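
## Appendix: Naming and URL Sketch

The Quick Start says a source named `Personal Gmail` ends up in a database called `m2c_personal_gmail`, and the sample document uses the `_id` `INBOX_12345` for UID `12345` in `INBOX`. The Go sketch below shows one way those names and the matching REST URL could be derived. It is an illustration, not the project's actual code: the helper names `databaseName`, `documentID`, and `documentURL` are hypothetical, and the sanitization rule (lowercase, runs of other characters collapsed to one underscore) is an assumption inferred from that single example.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// databaseName is a hypothetical helper. Assumed rule: lowercase the source
// name, collapse each run of non-alphanumeric characters into a single
// underscore, drop leading/trailing runs, and add the "m2c_" prefix.
func databaseName(sourceName string) string {
	var b strings.Builder
	b.WriteString("m2c_")
	wrote := false      // true once at least one alphanumeric rune was written
	pendingSep := false // true when a separator should precede the next rune
	for _, r := range strings.ToLower(sourceName) {
		isAlnum := (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9')
		if !isAlnum {
			pendingSep = wrote
			continue
		}
		if pendingSep {
			b.WriteByte('_')
			pendingSep = false
		}
		b.WriteRune(r)
		wrote = true
	}
	return b.String()
}

// documentID is a hypothetical helper. Assumed format, taken from the sample
// document: "<mailbox>_<IMAP UID>".
func documentID(mailbox string, uid uint32) string {
	return fmt.Sprintf("%s_%d", mailbox, uid)
}

// documentURL builds the REST URL for one document. IDs may contain "/"
// (for example a mailbox like "Work/Projects"), so each path segment is
// percent-encoded with url.PathEscape.
func documentURL(baseURL, database, docID string) string {
	return strings.TrimRight(baseURL, "/") + "/" +
		url.PathEscape(database) + "/" + url.PathEscape(docID)
}

func main() {
	db := databaseName("Personal Gmail")
	id := documentID("INBOX", 12345)
	fmt.Println(db)
	fmt.Println(id)
	fmt.Println(documentURL("http://localhost:5984", db, id))
}
```

Running the sketch prints `m2c_personal_gmail`, `INBOX_12345`, and `http://localhost:5984/m2c_personal_gmail/INBOX_12345`. Escaping each path segment matters once a mailbox name contains a slash, since an unescaped `/` in a document ID would be read by CouchDB as a document/attachment separator.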