feat: add comprehensive README documentation and clean up configuration

## Documentation Enhancements
- Create comprehensive README with installation, configuration, and usage examples
- Add simple, advanced, and provider-specific configuration examples
- Document all features: incremental sync, wildcard patterns, keyword filtering, attachment support
- Include production deployment guidance and troubleshooting section
- Add architecture documentation with database structure and document format examples

## Configuration Cleanup
- Remove unnecessary `database` field from CouchDB configuration
- Add `m2c_` prefix to all CouchDB database names for better namespace isolation
- Update GenerateAccountDBName() to consistently prefix databases with `m2c_`
- Clean up all configuration examples to remove deprecated database field

## Test Environment Simplification
- Simplify test script structure to eliminate confusion and redundancy
- Remove redundant populate-test-messages.sh wrapper script
- Update run-tests.sh to be comprehensive automated test with cleanup
- Maintain clear separation: automated tests vs manual testing environment
- Update all test scripts to expect m2c-prefixed database names

## Configuration Examples Added
- config-simple.json: Basic single Gmail account setup
- config-advanced.json: Multi-account with complex filtering and different providers
- config-providers.json: Real-world configurations for Gmail, Outlook, Yahoo, iCloud

## Benefits
- Clear documentation for users from beginner to advanced
- Namespace isolation prevents database conflicts in shared CouchDB instances
- Simplified test workflow eliminates user confusion about which scripts to use
- Comprehensive examples cover common email provider configurations

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

parent 357cd06264
commit c2ad55eaaf
17 changed files with 1139 additions and 111 deletions

388 README.md
@@ -1,5 +1,389 @@
# mail2couch

A utility to back up mail from various sources to CouchDB

A powerful email backup utility that synchronizes mail from IMAP accounts to CouchDB databases with intelligent incremental sync, comprehensive filtering, and native attachment support.

At least two implementations will be available, one in Rust and one in Go.

## Features

### Core Functionality
- **IMAP Email Backup**: Connect to any IMAP server (Gmail, Outlook, self-hosted)
- **CouchDB Storage**: Store emails as JSON documents with native CouchDB attachments
- **Incremental Sync**: Efficiently sync only new messages using IMAP SEARCH with timestamp tracking
- **Per-Account Databases**: Each mail source gets its own CouchDB database for better organization
- **Duplicate Prevention**: Automatic detection and prevention of duplicate message storage

### Sync Modes
- **Archive Mode**: Preserve all messages ever seen, even if deleted from mail server (default)
- **Sync Mode**: Maintain 1-to-1 relationship with mail server (removes deleted messages from CouchDB)
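
The difference between the two modes can be sketched as a reconciliation step. This is a minimal illustration under stated assumptions, not the project's implementation: the helper name `staleDocIDs` and the `INBOX_<uid>` identifiers are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// staleDocIDs returns the stored document IDs that should be removed from
// CouchDB. In "archive" mode nothing is ever removed; in "sync" mode every
// stored ID that the mail server no longer reports is removed.
// Hypothetical helper, for illustration only.
func staleDocIDs(mode string, stored, onServer []string) []string {
	if mode != "sync" {
		return nil
	}
	present := make(map[string]bool, len(onServer))
	for _, id := range onServer {
		present[id] = true
	}
	var stale []string
	for _, id := range stored {
		if !present[id] {
			stale = append(stale, id)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	stored := []string{"INBOX_1", "INBOX_2", "INBOX_3"}
	onServer := []string{"INBOX_1", "INBOX_3"}
	fmt.Println("archive:", staleDocIDs("archive", stored, onServer))
	fmt.Println("sync:   ", staleDocIDs("sync", stored, onServer))
}
```

In archive mode the result is always empty, so a message deleted on the server stays in CouchDB; in sync mode `INBOX_2` would be scheduled for deletion.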

### Advanced Filtering
- **Wildcard Folder Patterns**: Use `*`, `?`, `[abc]` patterns for flexible folder selection
- **Keyword Filtering**: Filter messages by keywords in subjects, senders, or recipients
- **Date Filtering**: Process only messages since a specific date
- **Include/Exclude Logic**: Combine multiple filter types for precise control

### Message Processing
- **Full MIME Support**: Parse multipart messages, HTML/plain text, and embedded content
- **Native Attachments**: Store email attachments as CouchDB native attachments with compression
- **Complete Headers**: Preserve all email headers and metadata
- **UTF-8 Support**: Handle international characters and special content

### Operational Features
- **Automatic Config Discovery**: Finds configuration files in standard locations
- **Command Line Control**: Override settings with `--max-messages` and `--config` flags
- **Comprehensive Logging**: Detailed output for monitoring and troubleshooting
- **Error Resilience**: Graceful handling of network issues and server problems

## Quick Start

### Installation

1. **Install dependencies**:
   ```bash
   # Go 1.21+ required
   go version
   ```

2. **Clone and build**:
   ```bash
   git clone <repository-url>
   cd mail2couch/go
   go build -o mail2couch .
   ```

### Basic Usage

1. **Create configuration file** (`config.json`):
   ```json
   {
     "couchDb": {
       "url": "http://localhost:5984",
       "user": "admin",
       "password": "password"
     },
     "mailSources": [
       {
         "name": "Personal Gmail",
         "enabled": true,
         "protocol": "imap",
         "host": "imap.gmail.com",
         "port": 993,
         "user": "your-email@gmail.com",
         "password": "your-app-password",
         "mode": "archive",
         "folderFilter": {
           "include": ["*"],
           "exclude": ["[Gmail]/Trash", "[Gmail]/Spam"]
         }
       }
     ]
   }
   ```

2. **Run mail2couch**:
   ```bash
   ./mail2couch
   ```

The application will:
- Create a CouchDB database named `m2c_personal_gmail`
- Sync all folders except Trash and Spam
- Store messages with native attachments
- Track sync state for efficient incremental updates
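
The mapping from the source name `Personal Gmail` to the database `m2c_personal_gmail` can be sketched as follows. The function `accountDBName` and its exact sanitizing rules are assumptions made for this example; the project's own `GenerateAccountDBName()` may differ in detail. CouchDB database names must start with a lowercase letter and may contain only lowercase letters, digits, and a few symbols such as `_`, which the `m2c_` prefix and the lowercasing satisfy.

```go
package main

import (
	"fmt"
	"strings"
)

// accountDBName derives a CouchDB database name from a mail source name.
// Assumed rules (not confirmed by the project): lowercase the name, collapse
// every run of characters outside [a-z0-9] into a single "_", trim leading
// and trailing "_", and add the "m2c_" prefix.
func accountDBName(sourceName string) string {
	var b strings.Builder
	lastUnderscore := false
	for _, r := range strings.ToLower(sourceName) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
			lastUnderscore = false
		} else if !lastUnderscore {
			b.WriteByte('_')
			lastUnderscore = true
		}
	}
	return "m2c_" + strings.Trim(b.String(), "_")
}

func main() {
	fmt.Println(accountDBName("Personal Gmail")) // m2c_personal_gmail
}
```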

## Configuration

### Configuration File Discovery

mail2couch automatically searches for configuration files in this order:
1. Path specified by `--config` flag
2. `./config.json` (current directory)
3. `./config/config.json` (config subdirectory)
4. `~/.config/mail2couch/config.json` (user config directory)
5. `~/.mail2couch.json` (user home directory)

### Command Line Options

```bash
./mail2couch [options]

Options:
  --config PATH       Specify configuration file path
  --max-messages N    Limit messages processed per mailbox per run (0 = unlimited)
```
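
For reference, the two options map naturally onto Go's standard `flag` package, which accepts both `-config` and `--config` spellings. This is a sketch, not the project's actual argument handling: the `options` type, the `parseOptions` helper, and the rejection of negative values are assumptions.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the two documented command line settings.
type options struct {
	configPath  string
	maxMessages int
}

// parseOptions parses args (without the program name) into options.
func parseOptions(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("mail2couch", flag.ContinueOnError)
	fs.StringVar(&opts.configPath, "config", "", "Specify configuration file path")
	fs.IntVar(&opts.maxMessages, "max-messages", 0,
		"Limit messages processed per mailbox per run (0 = unlimited)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	// Assumption: negative limits are treated as a usage error.
	if opts.maxMessages < 0 {
		return options{}, fmt.Errorf("--max-messages must be >= 0, got %d", opts.maxMessages)
	}
	return opts, nil
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("config=%q max-messages=%d\n", opts.configPath, opts.maxMessages)
}
```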

### Folder Pattern Examples

| Pattern | Description | Matches |
|---------|-------------|---------|
| `"*"` | All folders | `INBOX`, `Sent`, `Work/Projects`, etc. |
| `"INBOX"` | Exact match | `INBOX` only |
| `"Work*"` | Prefix match | `Work`, `Work/Projects`, `WorkStuff` |
| `"*/Archive"` | Suffix match | `Personal/Archive`, `Work/Archive` |
| `"Work/*"` | Subfolder match | `Work/Projects`, `Work/Clients` |
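
A matcher consistent with the table can be sketched by translating each pattern into a regular expression. Note that Go's `path.Match` would not reproduce the table, because its `*` stops at `/` while `Work*` is documented to match `Work/Projects`. The helpers below (`globToRegexp`, `matchFolder`, `shouldProcess`) are hypothetical; the exact-match shortcut for names such as `[Gmail]/Trash` and the rule that exclusions win over inclusions are assumptions inferred from the quick-start example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp translates a folder pattern into an anchored regular
// expression. "*" and "?" also match "/"; "[abc]" is a character class.
func globToRegexp(pattern string) (*regexp.Regexp, error) {
	runes := []rune(pattern)
	var b strings.Builder
	b.WriteString("^")
	for i := 0; i < len(runes); i++ {
		switch r := runes[i]; r {
		case '*':
			b.WriteString(".*")
		case '?':
			b.WriteString(".")
		case '[':
			end := -1
			for j := i + 1; j < len(runes); j++ {
				if runes[j] == ']' {
					end = j
					break
				}
			}
			if end <= i+1 { // unterminated or empty class: treat "[" literally
				b.WriteString(`\[`)
				continue
			}
			b.WriteString("[" + regexp.QuoteMeta(string(runes[i+1:end])) + "]")
			i = end
		default:
			b.WriteString(regexp.QuoteMeta(string(r)))
		}
	}
	b.WriteString("$")
	return regexp.Compile(b.String())
}

// matchFolder reports whether folder matches pattern. A literal comparison
// comes first so that names like "[Gmail]/Trash" match themselves.
func matchFolder(pattern, folder string) bool {
	if pattern == folder {
		return true
	}
	re, err := globToRegexp(pattern)
	return err == nil && re.MatchString(folder)
}

// shouldProcess applies include/exclude lists: exclusions win, and an empty
// include list selects every folder.
func shouldProcess(folder string, include, exclude []string) bool {
	for _, p := range exclude {
		if matchFolder(p, folder) {
			return false
		}
	}
	if len(include) == 0 {
		return true
	}
	for _, p := range include {
		if matchFolder(p, folder) {
			return true
		}
	}
	return false
}

func main() {
	include := []string{"*"}
	exclude := []string{"[Gmail]/Trash", "[Gmail]/Spam"}
	for _, f := range []string{"INBOX", "Work/Projects", "[Gmail]/Spam"} {
		fmt.Printf("%-15s process=%v\n", f, shouldProcess(f, include, exclude))
	}
}
```

One side effect of this design: because `[Gmail]` is also a valid character class, the pattern `[Gmail]/Trash` would additionally match a folder literally named `G/Trash`.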

### Keyword Filtering Examples

```json
{
  "messageFilter": {
    "subjectKeywords": ["urgent", "meeting", "invoice"],
    "senderKeywords": ["@company.com", "noreply@"],
    "recipientKeywords": ["team@", "support@"]
  }
}
```
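
How such a filter might be evaluated is sketched below. The README does not specify the matching semantics, so this example makes three assumptions and labels them in the code: matching is a case-insensitive substring test, a message passes a list when any one keyword matches, and every non-empty list must pass. The type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// messageFilter mirrors the three keyword lists from the JSON example.
type messageFilter struct {
	SubjectKeywords   []string
	SenderKeywords    []string
	RecipientKeywords []string
}

// message carries only the fields the keyword filter looks at.
type message struct {
	Subject string
	From    []string
	To      []string
}

// containsAny reports whether any keyword occurs in any value, ignoring
// case (assumption). An empty keyword list means "no restriction".
func containsAny(values, keywords []string) bool {
	if len(keywords) == 0 {
		return true
	}
	for _, v := range values {
		lv := strings.ToLower(v)
		for _, k := range keywords {
			if strings.Contains(lv, strings.ToLower(k)) {
				return true
			}
		}
	}
	return false
}

// accepts requires every non-empty keyword list to match (assumption).
func (f messageFilter) accepts(m message) bool {
	return containsAny([]string{m.Subject}, f.SubjectKeywords) &&
		containsAny(m.From, f.SenderKeywords) &&
		containsAny(m.To, f.RecipientKeywords)
}

func main() {
	f := messageFilter{
		SubjectKeywords: []string{"urgent", "meeting", "invoice"},
		SenderKeywords:  []string{"@company.com", "noreply@"},
	}
	m := message{Subject: "URGENT: budget review", From: []string{"boss@company.com"}}
	fmt.Println(f.accepts(m))
}
```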

## Advanced Configuration Examples

See the [example configurations](#example-configurations) section below for detailed configuration scenarios.

## Testing

A comprehensive test environment is included with Podman containers:

```bash
cd test

# Quick automated testing (recommended)
./run-tests.sh                 # Complete integration test with automatic cleanup

# Specialized feature testing
./test-wildcard-patterns.sh    # Test folder pattern matching
./test-incremental-sync.sh     # Test incremental synchronization

# Manual testing environment
./start-test-env.sh            # Start persistent test environment
# ... manual testing with various configurations ...
./stop-test-env.sh             # Clean up when done
```

## Architecture

### Database Structure
- **Per-Account Databases**: Each mail source creates its own CouchDB database with `m2c_` prefix
- **Message Documents**: Each email becomes a CouchDB document with metadata
- **Native Attachments**: Email attachments stored as CouchDB attachments (compressed)
- **Sync Metadata**: Tracks incremental sync state per mailbox

### Document Structure
```json
{
  "_id": "INBOX_12345",
  "sourceUid": "12345",
  "mailbox": "INBOX",
  "from": ["sender@example.com"],
  "to": ["recipient@example.com"],
  "subject": "Sample Email",
  "date": "2024-01-15T10:30:00Z",
  "body": "Email content...",
  "headers": {"Content-Type": ["text/plain"]},
  "storedAt": "2024-01-15T10:35:00Z",
  "docType": "mail",
  "hasAttachments": true,
  "_attachments": {
    "document.pdf": {
      "content_type": "application/pdf",
      "length": 54321
    }
  }
}
```

## Example Configurations

### Simple Configuration
Basic setup for a single Gmail account:

```json
{
  "couchDb": {
    "url": "http://localhost:5984",
    "user": "admin",
    "password": "password"
  },
  "mailSources": [
    {
      "name": "Personal Gmail",
      "enabled": true,
      "protocol": "imap",
      "host": "imap.gmail.com",
      "port": 993,
      "user": "your-email@gmail.com",
      "password": "your-app-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX", "Sent"],
        "exclude": []
      },
      "messageFilter": {
        "since": "2024-01-01"
      }
    }
  ]
}
```

### Advanced Multi-Account Configuration
Complex setup with multiple accounts, filtering, and different sync modes:

```json
{
  "couchDb": {
    "url": "https://your-couchdb.example.com:5984",
    "user": "backup_user",
    "password": "secure_password"
  },
  "mailSources": [
    {
      "name": "Work Email",
      "enabled": true,
      "protocol": "imap",
      "host": "outlook.office365.com",
      "port": 993,
      "user": "you@company.com",
      "password": "app-password",
      "mode": "sync",
      "folderFilter": {
        "include": ["*"],
        "exclude": ["Deleted Items", "Junk Email", "Drafts"]
      },
      "messageFilter": {
        "since": "2023-01-01",
        "subjectKeywords": ["project", "meeting", "urgent"],
        "senderKeywords": ["@company.com", "@client.com"]
      }
    },
    {
      "name": "Personal Gmail",
      "enabled": true,
      "protocol": "imap",
      "host": "imap.gmail.com",
      "port": 993,
      "user": "personal@gmail.com",
      "password": "gmail-app-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX", "Important", "Work/*", "Personal/*"],
        "exclude": ["[Gmail]/Trash", "[Gmail]/Spam", "*Temp*"]
      },
      "messageFilter": {
        "recipientKeywords": ["family@", "personal@"]
      }
    },
    {
      "name": "Self-Hosted Mail",
      "enabled": true,
      "protocol": "imap",
      "host": "mail.yourdomain.com",
      "port": 143,
      "user": "admin@yourdomain.com",
      "password": "mail-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX", "Archive/*", "Projects/*"],
        "exclude": ["*/Drafts", "Trash"]
      },
      "messageFilter": {
        "since": "2023-06-01",
        "subjectKeywords": ["invoice", "receipt", "statement"]
      }
    },
    {
      "name": "Legacy Account",
      "enabled": false,
      "protocol": "imap",
      "host": "legacy.mailserver.com",
      "port": 993,
      "user": "old@account.com",
      "password": "legacy-password",
      "mode": "archive",
      "folderFilter": {
        "include": ["INBOX"],
        "exclude": []
      },
      "messageFilter": {}
    }
  ]
}
```

### Configuration Options Reference

#### CouchDB Configuration
- `url`: CouchDB server URL with protocol and port
- `user`: CouchDB username with database access
- `password`: CouchDB password

#### Mail Source Configuration
- `name`: Descriptive name (used for database naming)
- `enabled`: Boolean to enable/disable this source
- `protocol`: Only `"imap"` currently supported
- `host`: IMAP server hostname
- `port`: IMAP port (993 for TLS, 143 for plain, 3143 for testing)
- `user`: Email account username
- `password`: Email account password (use app passwords for Gmail/Outlook)
- `mode`: `"sync"` (mirror server) or `"archive"` (preserve all messages)

#### Folder Filter Configuration
- `include`: Array of folder patterns to process (empty = all folders)
- `exclude`: Array of folder patterns to skip

#### Message Filter Configuration
- `since`: Date string (YYYY-MM-DD) to process messages from
- `subjectKeywords`: Array of keywords that must appear in subject line
- `senderKeywords`: Array of keywords that must appear in sender addresses
- `recipientKeywords`: Array of keywords that must appear in recipient addresses
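
The options above can be mirrored in Go types and decoded with `encoding/json`. The sketch below is an illustration, not the project's actual configuration code: all type names and `parseConfig` are hypothetical, and the validation choices (rejecting unknown fields such as a leftover `database` key, defaulting `mode` to `"archive"`, checking the `since` format) are assumptions based on the documented behavior.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"time"
)

type couchDBConfig struct {
	URL      string `json:"url"`
	User     string `json:"user"`
	Password string `json:"password"`
}

type folderFilter struct {
	Include []string `json:"include"`
	Exclude []string `json:"exclude"`
}

type messageFilter struct {
	Since             string   `json:"since"`
	SubjectKeywords   []string `json:"subjectKeywords"`
	SenderKeywords    []string `json:"senderKeywords"`
	RecipientKeywords []string `json:"recipientKeywords"`
}

type mailSource struct {
	Name          string        `json:"name"`
	Enabled       bool          `json:"enabled"`
	Protocol      string        `json:"protocol"`
	Host          string        `json:"host"`
	Port          int           `json:"port"`
	User          string        `json:"user"`
	Password      string        `json:"password"`
	Mode          string        `json:"mode"`
	FolderFilter  folderFilter  `json:"folderFilter"`
	MessageFilter messageFilter `json:"messageFilter"`
}

type config struct {
	CouchDB     couchDBConfig `json:"couchDb"`
	MailSources []mailSource  `json:"mailSources"`
}

// parseConfig decodes and validates a configuration document.
func parseConfig(data []byte) (config, error) {
	var cfg config
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields() // assumption: surface typos and removed fields
	if err := dec.Decode(&cfg); err != nil {
		return config{}, err
	}
	for i := range cfg.MailSources {
		src := &cfg.MailSources[i]
		if src.Mode == "" {
			src.Mode = "archive" // archive is the documented default
		}
		if src.Mode != "archive" && src.Mode != "sync" {
			return config{}, fmt.Errorf("source %q: unknown mode %q", src.Name, src.Mode)
		}
		if src.Protocol != "imap" {
			return config{}, fmt.Errorf("source %q: unsupported protocol %q", src.Name, src.Protocol)
		}
		if s := src.MessageFilter.Since; s != "" {
			if _, err := time.Parse("2006-01-02", s); err != nil {
				return config{}, fmt.Errorf("source %q: since must be YYYY-MM-DD: %v", src.Name, err)
			}
		}
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{
	  "couchDb": {"url": "http://localhost:5984", "user": "admin", "password": "password"},
	  "mailSources": [{"name": "Personal Gmail", "enabled": true, "protocol": "imap",
	    "host": "imap.gmail.com", "port": 993, "user": "u", "password": "p",
	    "messageFilter": {"since": "2024-01-01"}}]
	}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d source(s), first mode=%s\n", len(cfg.MailSources), cfg.MailSources[0].Mode)
}
```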

## Production Deployment

### Security Considerations
- Use app passwords instead of account passwords
- Store configuration files with restricted permissions (600)
- Use HTTPS for CouchDB connections in production
- Consider encrypting sensitive configuration data

### Monitoring and Maintenance
- Review sync metadata documents for sync health
- Monitor CouchDB database sizes and compaction
- Set up log rotation for application output
- Schedule regular backups of CouchDB databases

### Performance Tuning
- Use `--max-messages` to limit processing load
- Run during off-peak hours for large initial syncs
- Monitor IMAP server rate limits and connection limits
- Consider running multiple instances for different accounts

## Troubleshooting

### Common Issues

**Connection Errors**:
- Verify IMAP server settings and credentials
- Check firewall and network connectivity
- Ensure correct ports (993 for TLS, 143 for plain)

**Authentication Failures**:
- Use app passwords for Gmail, Outlook, and other providers
- Enable "Less Secure Apps" if required by provider
- Verify account permissions and 2FA settings

**Sync Issues**:
- Check CouchDB connectivity and permissions
- Review sync metadata documents for error states
- Verify folder names and patterns match server structure

**Performance Problems**:
- Use date filtering (`since`) for large mailboxes
- Implement `--max-messages` limits for initial syncs
- Monitor server-side rate limiting

For detailed troubleshooting, see the [test environment documentation](test/README.md).

## Contributing

This project welcomes contributions! Please see [CLAUDE.md](CLAUDE.md) for development setup and architecture details.

## License

[License information to be added]