mail2couch
A powerful email backup utility that synchronizes mail from IMAP accounts to CouchDB databases with intelligent incremental sync, comprehensive filtering, and native attachment support.
Features
Core Functionality
- IMAP Email Backup: Connect to any IMAP server (Gmail, Outlook, self-hosted)
- CouchDB Storage: Store emails as JSON documents with native CouchDB attachments
- Incremental Sync: Efficiently sync only new messages using IMAP SEARCH with timestamp tracking
- Per-Account Databases: Each mail source gets its own CouchDB database for better organization
- Duplicate Prevention: Automatic detection and prevention of duplicate message storage
Sync Modes
- Archive Mode: Preserve all messages ever seen, even if deleted from mail server (default)
- Sync Mode: Maintain 1-to-1 relationship with mail server (removes deleted messages from CouchDB)
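The difference between the two modes comes down to what happens to stored messages that no longer exist on the server. A minimal sketch, assuming messages are tracked by IMAP UID per mailbox (the function `staleUIDs` is illustrative, not the actual implementation):

```go
package main

import (
	"fmt"
	"sort"
)

// staleUIDs returns the UIDs that are stored in CouchDB but are no longer
// present on the mail server. In sync mode these would be deleted; in
// archive mode they would simply be kept. Hypothetical helper.
func staleUIDs(stored, onServer []uint32) []uint32 {
	present := make(map[uint32]struct{}, len(onServer))
	for _, uid := range onServer {
		present[uid] = struct{}{}
	}
	var stale []uint32
	for _, uid := range stored {
		if _, ok := present[uid]; !ok {
			stale = append(stale, uid)
		}
	}
	sort.Slice(stale, func(i, j int) bool { return stale[i] < stale[j] })
	return stale
}

func main() {
	stored := []uint32{101, 102, 103, 104}
	onServer := []uint32{101, 103, 105}
	fmt.Println(staleUIDs(stored, onServer)) // [102 104]
}
```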
Advanced Filtering
- Wildcard Folder Patterns: Use `*`, `?`, and `[abc]` patterns for flexible folder selection
- Keyword Filtering: Filter messages by keywords in subjects, senders, or recipients
- Date Filtering: Process only messages since a specific date
- Include/Exclude Logic: Combine multiple filter types for precise control
Message Processing
- Full MIME Support: Parse multipart messages, HTML/plain text, and embedded content
- Native Attachments: Store email attachments as CouchDB native attachments with compression
- Complete Headers: Preserve all email headers and metadata
- UTF-8 Support: Handle international characters and special content
Operational Features
- Automatic Config Discovery: Finds configuration files in standard locations
- Command Line Control: Override settings with the `--max-messages` and `--config` flags
- Comprehensive Logging: Detailed output for monitoring and troubleshooting
- Error Resilience: Graceful handling of network issues and server problems
Quick Start
Installation
- Install dependencies:

      # Go 1.21+ required
      go version

- Clone and build:

      git clone <repository-url>
      cd mail2couch/go
      go build -o mail2couch .
Basic Usage
- Create configuration file (`config.json`):

      {
        "couchDb": {
          "url": "http://localhost:5984",
          "user": "admin",
          "password": "password"
        },
        "mailSources": [
          {
            "name": "Personal Gmail",
            "enabled": true,
            "protocol": "imap",
            "host": "imap.gmail.com",
            "port": 993,
            "user": "your-email@gmail.com",
            "password": "your-app-password",
            "mode": "archive",
            "folderFilter": {
              "include": ["*"],
              "exclude": ["[Gmail]/Trash", "[Gmail]/Spam"]
            }
          }
        ]
      }

- Run mail2couch:

      ./mail2couch
The application will:
- Create a CouchDB database named `m2c_personal_gmail`
- Sync all folders except Trash and Spam
- Store messages with native attachments
- Track sync state for efficient incremental updates
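The mapping from the source name "Personal Gmail" to the database name `m2c_personal_gmail` can be sketched as follows. This is an assumption about the naming rule (lowercase, non-alphanumeric characters replaced by `_`, then the `m2c_` prefix), and `accountDBName` is a hypothetical helper; the real function may handle edge cases differently.

```go
package main

import (
	"fmt"
	"strings"
)

// accountDBName derives a CouchDB database name from a mail source name.
// CouchDB database names must start with a lowercase letter and may only
// contain lowercase letters, digits, and a few symbols, so everything else
// is mapped to '_'. The "m2c_" prefix guarantees a leading letter.
// Assumed rule, for illustration only.
func accountDBName(sourceName string) string {
	var b strings.Builder
	b.WriteString("m2c_")
	for _, r := range strings.ToLower(sourceName) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		} else {
			b.WriteRune('_')
		}
	}
	return b.String()
}

func main() {
	fmt.Println(accountDBName("Personal Gmail"))   // m2c_personal_gmail
	fmt.Println(accountDBName("Self-Hosted Mail")) // m2c_self_hosted_mail
}
```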
Configuration
Configuration File Discovery
mail2couch automatically searches for configuration files in this order:
- Path specified by the `--config` flag
- `./config.json` (current directory)
- `./config/config.json` (config subdirectory)
- `~/.config/mail2couch/config.json` (user config directory)
- `~/.mail2couch.json` (user home directory)
Command Line Options
./mail2couch [options]
Options:
--config PATH Specify configuration file path
--max-messages N Limit messages processed per mailbox per run (0 = unlimited)
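For readers curious how these two options map onto Go's standard `flag` package, here is a minimal sketch. The `options` struct and `parseOptions` function are illustrative names, not the tool's actual code; the rejection of negative limits is an added assumption.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the two documented command line settings.
type options struct {
	configPath  string
	maxMessages int
}

// parseOptions parses --config and --max-messages from args.
// Go's flag package accepts both -name and --name spellings.
func parseOptions(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("mail2couch", flag.ContinueOnError)
	fs.StringVar(&opts.configPath, "config", "", "configuration file path")
	fs.IntVar(&opts.maxMessages, "max-messages", 0,
		"limit messages processed per mailbox per run (0 = unlimited)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if opts.maxMessages < 0 {
		// Assumption: negative limits are rejected rather than ignored.
		return options{}, fmt.Errorf("--max-messages must be >= 0, got %d", opts.maxMessages)
	}
	return opts, nil
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("config=%q max-messages=%d\n", opts.configPath, opts.maxMessages)
}
```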
Folder Pattern Examples
| Pattern | Description | Matches |
|---|---|---|
| `"*"` | All folders | INBOX, Sent, Work/Projects, etc. |
| `"INBOX"` | Exact match | INBOX only |
| `"Work*"` | Prefix match | Work, Work/Projects, WorkStuff |
| `"*/Archive"` | Suffix match | Personal/Archive, Work/Archive |
| `"Work/*"` | Subfolder match | Work/Projects, Work/Clients |
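The table implies that `*` can match across the `/` hierarchy separator (for example, `Work*` matching `Work/Projects`), which differs from Go's `path.Match`. The sketch below reproduces the table's behaviour by translating a pattern into a regular expression. It is an assumption about the semantics, not the matcher mail2couch actually uses, and `globToRegexp` and `matchFolder` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp converts a folder pattern into an anchored regular expression.
// Assumed semantics: '*' matches any run of characters (including '/'),
// '?' matches one character, and '[abc]' matches one character from the set.
func globToRegexp(pattern string) (*regexp.Regexp, error) {
	var b strings.Builder
	b.WriteString("^")
	runes := []rune(pattern)
	for i := 0; i < len(runes); i++ {
		switch c := runes[i]; c {
		case '*':
			b.WriteString(".*")
		case '?':
			b.WriteString(".")
		case '[':
			end := -1
			for j := i + 1; j < len(runes); j++ {
				if runes[j] == ']' {
					end = j
					break
				}
			}
			if end <= i+1 {
				// No closing bracket or an empty set: treat '[' literally.
				b.WriteString(regexp.QuoteMeta(string(c)))
				continue
			}
			b.WriteString("[" + regexp.QuoteMeta(string(runes[i+1:end])) + "]")
			i = end
		default:
			b.WriteString(regexp.QuoteMeta(string(c)))
		}
	}
	b.WriteString("$")
	return regexp.Compile(b.String())
}

// matchFolder reports whether folder matches pattern; invalid patterns never match.
func matchFolder(pattern, folder string) bool {
	re, err := globToRegexp(pattern)
	return err == nil && re.MatchString(folder)
}

func main() {
	for _, tc := range [][2]string{
		{"*", "Work/Projects"},
		{"Work*", "WorkStuff"},
		{"*/Archive", "Personal/Archive"},
		{"Work/*", "Work"},
	} {
		fmt.Printf("%-12q %-20q %v\n", tc[0], tc[1], matchFolder(tc[0], tc[1]))
	}
}
```

One consequence of treating `[abc]` as a character set is that a literal folder name such as `[Gmail]/Trash` would be read as "one of G, m, a, i, l" followed by `/Trash` under this interpretation. How mail2couch resolves that case is not specified here, so it is worth verifying against your server's folder list.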
Keyword Filtering Examples
{
  "messageFilter": {
    "subjectKeywords": ["urgent", "meeting", "invoice"],
    "senderKeywords": ["@company.com", "noreply@"],
    "recipientKeywords": ["team@", "support@"]
  }
}
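How the keyword lists combine is not spelled out above. The following sketch shows one plausible reading: matching is case-insensitive, a message passes a list if it contains at least one of its keywords, an empty list passes everything, and all three lists must pass. These rules and the names `keywordFilter` and `containsAny` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// keywordFilter mirrors the three keyword lists of the messageFilter config.
type keywordFilter struct {
	SubjectKeywords   []string
	SenderKeywords    []string
	RecipientKeywords []string
}

// containsAny reports whether text contains at least one keyword,
// ignoring case. An empty keyword list matches everything.
func containsAny(text string, keywords []string) bool {
	if len(keywords) == 0 {
		return true
	}
	lower := strings.ToLower(text)
	for _, k := range keywords {
		if strings.Contains(lower, strings.ToLower(k)) {
			return true
		}
	}
	return false
}

// matches applies all three lists; every non-empty list must be satisfied.
func (f keywordFilter) matches(subject, sender string, recipients []string) bool {
	return containsAny(subject, f.SubjectKeywords) &&
		containsAny(sender, f.SenderKeywords) &&
		containsAny(strings.Join(recipients, " "), f.RecipientKeywords)
}

func main() {
	f := keywordFilter{
		SubjectKeywords: []string{"urgent", "meeting", "invoice"},
		SenderKeywords:  []string{"@company.com", "noreply@"},
	}
	fmt.Println(f.matches("Urgent: Q3 invoice", "billing@company.com", []string{"me@example.com"})) // true
	fmt.Println(f.matches("Lunch?", "friend@example.org", []string{"me@example.com"}))              // false
}
```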
Advanced Configuration Examples
See the example configurations section below for detailed configuration scenarios.
Testing
A comprehensive test environment is included with Podman containers:
cd test
# Quick automated testing (recommended)
./run-tests.sh # Complete integration test with automatic cleanup
# Specialized feature testing
./test-wildcard-patterns.sh # Test folder pattern matching
./test-incremental-sync.sh # Test incremental synchronization
# Manual testing environment
./start-test-env.sh # Start persistent test environment
# ... manual testing with various configurations ...
./stop-test-env.sh # Clean up when done
Architecture
Database Structure
- Per-Account Databases: Each mail source creates its own CouchDB database with the `m2c_` prefix
- Message Documents: Each email becomes a CouchDB document with metadata
- Native Attachments: Email attachments stored as CouchDB attachments (compressed)
- Sync Metadata: Tracks incremental sync state per mailbox
Document Structure
{
  "_id": "INBOX_12345",
  "sourceUid": "12345",
  "mailbox": "INBOX",
  "from": ["sender@example.com"],
  "to": ["recipient@example.com"],
  "subject": "Sample Email",
  "date": "2024-01-15T10:30:00Z",
  "body": "Email content...",
  "headers": {"Content-Type": ["text/plain"]},
  "storedAt": "2024-01-15T10:35:00Z",
  "docType": "mail",
  "hasAttachments": true,
  "_attachments": {
    "document.pdf": {
      "content_type": "application/pdf",
      "length": 54321
    }
  }
}
Example Configurations
Simple Configuration
Basic setup for a single Gmail account:
{
  "couchDb": {
    "url": "http://localhost:5984",
    "user": "admin",
    "password": "password"
  },
  "mailSources": [
    {
      "name": "Personal Gmail",
      "enabled": true,
      "protocol": "imap",
      "host": "imap.gmail.com",
      "port": 993,
      "user": "your-email@gmail.com",
      "password": "your-app-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX", "Sent"],
        "exclude": []
      },
      "messageFilter": {
        "since": "2024-01-01"
      }
    }
  ]
}
Advanced Multi-Account Configuration
Complex setup with multiple accounts, filtering, and different sync modes:
{
  "couchDb": {
    "url": "https://your-couchdb.example.com:5984",
    "user": "backup_user",
    "password": "secure_password"
  },
  "mailSources": [
    {
      "name": "Work Email",
      "enabled": true,
      "protocol": "imap",
      "host": "outlook.office365.com",
      "port": 993,
      "user": "you@company.com",
      "password": "app-password",
      "mode": "sync",
      "folderFilter": {
        "include": ["*"],
        "exclude": ["Deleted Items", "Junk Email", "Drafts"]
      },
      "messageFilter": {
        "since": "2023-01-01",
        "subjectKeywords": ["project", "meeting", "urgent"],
        "senderKeywords": ["@company.com", "@client.com"]
      }
    },
    {
      "name": "Personal Gmail",
      "enabled": true,
      "protocol": "imap",
      "host": "imap.gmail.com",
      "port": 993,
      "user": "personal@gmail.com",
      "password": "gmail-app-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX", "Important", "Work/*", "Personal/*"],
        "exclude": ["[Gmail]/Trash", "[Gmail]/Spam", "*Temp*"]
      },
      "messageFilter": {
        "recipientKeywords": ["family@", "personal@"]
      }
    },
    {
      "name": "Self-Hosted Mail",
      "enabled": true,
      "protocol": "imap",
      "host": "mail.yourdomain.com",
      "port": 143,
      "user": "admin@yourdomain.com",
      "password": "mail-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX", "Archive/*", "Projects/*"],
        "exclude": ["*/Drafts", "Trash"]
      },
      "messageFilter": {
        "since": "2023-06-01",
        "subjectKeywords": ["invoice", "receipt", "statement"]
      }
    },
    {
      "name": "Legacy Account",
      "enabled": false,
      "protocol": "imap",
      "host": "legacy.mailserver.com",
      "port": 993,
      "user": "old@account.com",
      "password": "legacy-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX"],
        "exclude": []
      },
      "messageFilter": {}
    }
  ]
}
Configuration Options Reference
CouchDB Configuration
- `url`: CouchDB server URL with protocol and port
- `user`: CouchDB username with database access
- `password`: CouchDB password
Mail Source Configuration
- `name`: Descriptive name (used for database naming)
- `enabled`: Boolean to enable/disable this source
- `protocol`: Only `"imap"` currently supported
- `host`: IMAP server hostname
- `port`: IMAP port (993 for TLS, 143 for plain, 3143 for testing)
- `user`: Email account username
- `password`: Email account password (use app passwords for Gmail/Outlook)
- `mode`: `"sync"` (mirror server) or `"archive"` (preserve all messages)
Folder Filter Configuration
- `include`: Array of folder patterns to process (empty = all folders)
- `exclude`: Array of folder patterns to skip
Message Filter Configuration
- `since`: Date string (YYYY-MM-DD) to process messages from
- `subjectKeywords`: Array of keywords that must appear in subject line
- `senderKeywords`: Array of keywords that must appear in sender addresses
- `recipientKeywords`: Array of keywords that must appear in recipient addresses
Production Deployment
Security Considerations
- Use app passwords instead of account passwords
- Store configuration files with restricted permissions (600)
- Use HTTPS for CouchDB connections in production
- Consider encrypting sensitive configuration data
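The permissions recommendation can be checked programmatically on Unix-like systems. The sketch below is not a mail2couch feature; `tooPermissive` and `checkConfigPermissions` are hypothetical helpers that flag any configuration file readable or writable by group or others.

```go
package main

import (
	"fmt"
	"os"
)

// tooPermissive reports whether group or others have any access bits set.
// Mode 0600 (and stricter, such as 0400) passes this check.
func tooPermissive(mode os.FileMode) bool {
	return mode.Perm()&0o077 != 0
}

// checkConfigPermissions returns an error if the file at path is
// accessible to anyone other than its owner.
func checkConfigPermissions(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	if tooPermissive(info.Mode()) {
		return fmt.Errorf("%s has mode %04o; expected 0600 or stricter",
			path, uint32(info.Mode().Perm()))
	}
	return nil
}

func main() {
	// Demonstrate the check on a temporary file with two different modes.
	f, err := os.CreateTemp("", "config-*.json")
	if err != nil {
		fmt.Println("cannot create temp file:", err)
		return
	}
	defer os.Remove(f.Name())
	f.Close()

	for _, mode := range []os.FileMode{0o644, 0o600} {
		if err := os.Chmod(f.Name(), mode); err != nil {
			fmt.Println("chmod failed:", err)
			return
		}
		fmt.Printf("%04o -> %v\n", uint32(mode), checkConfigPermissions(f.Name()))
	}
}
```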
Monitoring and Maintenance
- Review sync metadata documents for sync health
- Monitor CouchDB database sizes and compaction
- Set up log rotation for application output
- Schedule regular backups of CouchDB databases
Performance Tuning
- Use `--max-messages` to limit processing load
- Run during off-peak hours for large initial syncs
- Monitor IMAP server rate limits and connection limits
- Consider running multiple instances for different accounts
Troubleshooting
Common Issues
Connection Errors:
- Verify IMAP server settings and credentials
- Check firewall and network connectivity
- Ensure correct ports (993 for TLS, 143 for plain)
Authentication Failures:
- Use app passwords for Gmail, Outlook, and other providers
- Enable "Less Secure Apps" if required by provider
- Verify account permissions and 2FA settings
Sync Issues:
- Check CouchDB connectivity and permissions
- Review sync metadata documents for error states
- Verify folder names and patterns match server structure
Performance Problems:
- Use date filtering (`since`) for large mailboxes
- Implement `--max-messages` limits for initial syncs
- Monitor server-side rate limiting
For detailed troubleshooting, see the test environment documentation.
Contributing
This project welcomes contributions! Please see CLAUDE.md for development setup and architecture details.
License
[License information to be added]