## Wildcard Folder Selection
- Add support for wildcard patterns (`*`, `?`, `[abc]`) using filepath.Match
- Implement special case: `"*"` selects ALL available folders
- Support for complex include/exclude pattern combinations
- Maintain backwards compatibility with exact string matching
- Enable subfolder pattern matching (e.g., `Work/*`, `*/Drafts`)
## Keyword Filtering
- Add SubjectKeywords, SenderKeywords, RecipientKeywords to MessageFilter config
- Implement case-insensitive keyword matching across message fields
- Support multiple keywords per filter type with inclusive OR logic
- Add ShouldProcessMessage method for message-level filtering
## Enhanced Test Environment
- Create comprehensive wildcard pattern test scenarios
- Add 12 test folders covering various pattern types: Work/*, Important/*, Archive/*, exact matches
- Implement dedicated wildcard test script (test-wildcard-patterns.sh)
- Update test configurations to demonstrate real-world wildcard usage patterns
- Enhance test data generation with folder-specific messages for validation
## Documentation
- Create FOLDER_PATTERNS.md with comprehensive wildcard examples and use cases
- Update CLAUDE.md to reflect all implemented features and current status
- Enhance test README with detailed wildcard pattern explanations
- Provide configuration examples for common email organization scenarios
## Message Origin Tracking
- Verify all messages in CouchDB are properly tagged with their origin folder in the `mailbox` field
- Maintain per-account database isolation for better organization
- Document ID format: `{folder}_{uid}` ensures uniqueness across folders
Key patterns supported:
- `["*"]` - All folders (with excludes)
- `["Work*", "Important*"]` - Prefix matching
- `["Work/*", "Archive/*"]` - Subfolder patterns
- `["INBOX", "Sent"]` - Exact matches
- Complex include/exclude combinations
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
142 lines
No EOL
6.1 KiB
Markdown
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

mail2couch is a utility for backing up mail from various sources (primarily IMAP) to CouchDB. The project supports two implementations:
- **Go implementation**: Located in `/go/` directory (currently the active implementation)
- **Rust implementation**: Planned but not yet implemented

## Development Commands

### Go Implementation (Primary)

```bash
# Build the application
cd go && go build -o mail2couch .

# Run the application with automatic config discovery
cd go && ./mail2couch

# Run with specific config file
cd go && ./mail2couch -config /path/to/config.json

# Run with message limit (useful for large mailboxes)
cd go && ./mail2couch -max-messages 100

# Run with both config and message limit
cd go && ./mail2couch -config /path/to/config.json -max-messages 50

# Run linting/static analysis
cd go && go vet ./...

# Run tests (currently no tests exist)
cd go && go test ./...

# Check dependencies
cd go && go mod tidy
```

## Architecture

### Core Components

1. **Configuration (`config/`)**: JSON-based configuration system
   - Supports multiple mail sources with filtering options
   - CouchDB connection settings
   - Each source can have folder and message filters

2. **Mail Handling (`mail/`)**: IMAP client implementation
   - Uses `github.com/emersion/go-imap/v2` for IMAP operations
   - Supports TLS connections
   - Currently only lists mailboxes (backup functionality not yet implemented)

3. **CouchDB Integration (`couch/`)**: Database operations
   - Uses `github.com/go-kivik/kivik/v4` as CouchDB driver
   - Handles database creation and document management
   - Defines `MailDocument` structure for email storage

### Configuration Structure

The application uses `config.json` for configuration with the following structure:
- `couchDb`: Database connection settings (URL, credentials, database name). Note: the database name is now ignored because each mail source gets its own database.
- `mailSources`: Array of mail sources with individual settings:
  - Protocol support (currently only IMAP)
  - Connection details (host, port, credentials)
  - `mode`: Either "sync" or "archive" (defaults to "archive" if not specified)
    - **sync**: 1-to-1 relationship. CouchDB documents match exactly what is in the mail account (may remove documents from CouchDB)
    - **archive**: CouchDB keeps all messages ever seen, even if they are deleted from the mail account (never removes documents)
  - Filtering options for folders and messages with wildcard support
  - Enable/disable per source

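As a rough illustration of the `mode` default described above, the following Go sketch decodes a trimmed-down config and fills in "archive" when a source omits `mode`. Only `mailSources` and `mode` are key names taken from this document; the `name` and `enabled` keys, the type names, and the choice to reject unknown modes are assumptions made for the example and may not match the code in `config/`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MailSource mirrors one entry of `mailSources`. Only `mode` is a key name
// taken from the documentation; `name` and `enabled` are hypothetical.
type MailSource struct {
	Name    string `json:"name"`    // hypothetical key
	Enabled bool   `json:"enabled"` // hypothetical key
	Mode    string `json:"mode"`    // "sync" or "archive"
}

// Config holds the top-level `mailSources` array; `couchDb` is omitted here.
type Config struct {
	MailSources []MailSource `json:"mailSources"`
}

// parseConfig decodes the JSON and applies the documented default:
// a source without a mode is treated as "archive".
// Rejecting any other value is an assumption of this sketch.
func parseConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	for i := range cfg.MailSources {
		src := &cfg.MailSources[i]
		switch src.Mode {
		case "":
			src.Mode = "archive"
		case "sync", "archive":
			// valid as written
		default:
			return nil, fmt.Errorf("mail source %q: unknown mode %q", src.Name, src.Mode)
		}
	}
	return &cfg, nil
}

func main() {
	raw := []byte(`{"mailSources": [
		{"name": "personal", "enabled": true},
		{"name": "work", "enabled": true, "mode": "sync"}
	]}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	for _, src := range cfg.MailSources {
		fmt.Printf("%s uses mode %s\n", src.Name, src.Mode)
	}
}
```

Applying the default once at load time means the rest of the program never has to handle an empty `mode`.
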
### Configuration File Discovery

The application automatically searches for configuration files in the following order:
1. Path specified by `-config` command line flag
2. `./config.json` (current working directory)
3. `./config/config.json` (config subdirectory)
4. `~/.config/mail2couch/config.json` (user XDG config directory)
5. `~/.mail2couch.json` (user home directory)

This design ensures the same `config.json` format will work for both Go and Rust implementations.

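The search order above can be sketched in Go roughly as follows. The function names `candidatePaths` and `firstExisting` are hypothetical, and treating the `-config` path as just the first candidate (rather than failing when it is missing) is an assumption, not a statement about how mail2couch behaves.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidatePaths lists config locations in the documented search order.
// flagPath is the value of the -config flag, or "" when it was not given.
func candidatePaths(flagPath, home string) []string {
	var paths []string
	if flagPath != "" {
		paths = append(paths, flagPath)
	}
	return append(paths,
		"./config.json",
		"./config/config.json",
		filepath.Join(home, ".config", "mail2couch", "config.json"),
		filepath.Join(home, ".mail2couch.json"),
	)
}

// firstExisting returns the first path that exists and is not a directory.
func firstExisting(paths []string) (string, bool) {
	for _, p := range paths {
		if info, err := os.Stat(p); err == nil && !info.IsDir() {
			return p, true
		}
	}
	return "", false
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		os.Exit(1)
	}
	if path, ok := firstExisting(candidatePaths("", home)); ok {
		fmt.Println("using config:", path)
	} else {
		fmt.Println("no config file found")
	}
}
```

Keeping the candidate list separate from the file-system check makes the ordering easy to test without touching the disk.
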
### Current Implementation Status

- ✅ Configuration loading with automatic file discovery
- ✅ Command line flag support for config file path
- ✅ Per-account CouchDB database creation and management
- ✅ IMAP connection and mailbox listing
- ✅ Build error fixes
- ✅ Email message retrieval framework (with placeholder data)
- ✅ Email storage to CouchDB framework with native attachments
- ✅ Folder filtering logic with wildcard support (`*`, `?`, `[abc]` patterns)
- ✅ Date filtering support
- ✅ Keyword filtering support (subject, sender, recipient keywords)
- ✅ Duplicate detection and prevention
- ✅ Sync vs Archive mode implementation
- ✅ CouchDB attachment storage for email attachments
- ✅ Real IMAP message parsing with go-message library
- ✅ Full message body and attachment handling with MIME multipart support
- ✅ Command line argument support (--max-messages flag)
- ✅ Per-account CouchDB databases for better organization
- ❌ Incremental sync functionality
- ❌ Rust implementation

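To make the wildcard folder filtering listed above more concrete, here is a minimal Go sketch built on `filepath.Match` from the standard library. The function names are hypothetical. Three behaviours are assumptions of this sketch rather than documented facts: excludes take precedence over includes, an empty include list selects every folder, and malformed patterns are treated as non-matching.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchesAny reports whether folder matches at least one pattern.
func matchesAny(folder string, patterns []string) bool {
	for _, p := range patterns {
		// A lone "*" selects every folder. It needs a special case because
		// filepath.Match does not let "*" cross a path separator.
		if p == "*" {
			return true
		}
		// Exact comparison first, so literal names that contain glob
		// characters (for example "[Gmail]/Sent Mail") still match.
		if p == folder {
			return true
		}
		// Assumption: a malformed pattern (non-nil error) never matches.
		if ok, err := filepath.Match(p, folder); err == nil && ok {
			return true
		}
	}
	return false
}

// shouldProcessFolder applies excludes first, then includes.
// Assumption: an empty include list means "all folders".
func shouldProcessFolder(folder string, include, exclude []string) bool {
	if matchesAny(folder, exclude) {
		return false
	}
	return len(include) == 0 || matchesAny(folder, include)
}

func main() {
	include := []string{"Work/*", "Archive*", "INBOX"}
	exclude := []string{"*/Drafts"}
	for _, folder := range []string{"INBOX", "Work/Projects", "Work/Drafts", "Archive2023", "Spam"} {
		fmt.Printf("%-15s %v\n", folder, shouldProcessFolder(folder, include, exclude))
	}
}
```

Because `*` stops at a separator, a prefix pattern such as `Work*` and a subfolder pattern such as `Work/*` select different sets of folders, which is why both forms are worth listing in a configuration.
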
### Key Dependencies

- `github.com/emersion/go-imap/v2`: IMAP client library
- `github.com/go-kivik/kivik/v4`: CouchDB client library

### Development Notes

- The main entry point is `main.go` which orchestrates the configuration loading, CouchDB setup, and mail source processing
- Each mail source gets its own CouchDB database named using `GenerateAccountDBName()` function
- Each mail source is processed sequentially with proper error handling
- The application currently uses placeholder message data for testing the storage pipeline
- Message filtering by folder (include/exclude) and date (since) is implemented
- Duplicate detection prevents re-storing existing messages
- Sync vs Archive mode determines whether to remove documents from CouchDB when they're no longer in the mail account
- Email attachments are stored as native CouchDB attachments linked to the email document
- No tests are currently implemented
- The application uses automatic config file discovery as documented above

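The difference between sync and archive mode noted above comes down to whether stale documents are removed. The Go sketch below shows one way to compute the removal set, using the `{folder}_{uid}` document ID format. The helper names `docID` and `docsToRemove` are hypothetical, and the sketch works on plain string IDs instead of real CouchDB or IMAP calls.

```go
package main

import (
	"fmt"
	"sort"
)

// docID builds a document ID in the `{folder}_{uid}` format, which keeps
// IDs unique even when two folders reuse the same UID.
func docID(folder string, uid uint32) string {
	return fmt.Sprintf("%s_%d", folder, uid)
}

// docsToRemove returns the stored IDs that are no longer present in the
// mail account. In any mode other than "sync" nothing is ever removed.
func docsToRemove(mode string, stored []string, onServer map[string]bool) []string {
	if mode != "sync" {
		return nil
	}
	var stale []string
	for _, id := range stored {
		if !onServer[id] {
			stale = append(stale, id)
		}
	}
	sort.Strings(stale) // deterministic order for logging and tests
	return stale
}

func main() {
	onServer := map[string]bool{docID("INBOX", 1): true}
	stored := []string{docID("INBOX", 1), docID("INBOX", 2), docID("Sent", 7)}

	fmt.Println("sync removes:   ", docsToRemove("sync", stored, onServer))
	fmt.Println("archive removes:", docsToRemove("archive", stored, onServer))
}
```

The same ID lookup can serve duplicate detection: a message whose ID is already stored does not need to be written again.
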
### Next Steps

To complete the implementation, the following items need to be addressed:

1. **Real IMAP Message Parsing**: Replace placeholder message generation with actual IMAP message fetching and parsing using the correct go-imap/v2 API
2. **Message Body Extraction**: Implement proper text/plain and text/html body extraction from multipart messages
3. **Keyword Filtering**: Add support for filtering messages by keywords in:
   - Subject line (`subjectKeywords`)
   - Sender addresses (`senderKeywords`)
   - Recipient addresses (`recipientKeywords`)
4. **Attachment Handling**: Add support for email attachments (optional)
5. **Error Recovery**: Add retry logic for network failures and partial sync recovery
6. **Performance**: Add batch operations for better CouchDB insertion performance
7. **Testing**: Add unit tests for all major components

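One possible shape for the keyword filtering in item 3 is sketched below in Go. The JSON keys come from the list above; the `Message` type and the method signature are hypothetical. Matching is case-insensitive substring matching, with OR logic inside each keyword list. Requiring every non-empty list to match (AND across subject, sender, and recipient) is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// MessageFilter carries the three keyword lists from the configuration.
type MessageFilter struct {
	SubjectKeywords   []string `json:"subjectKeywords"`
	SenderKeywords    []string `json:"senderKeywords"`
	RecipientKeywords []string `json:"recipientKeywords"`
}

// Message is a hypothetical, minimal view of an email's headers.
type Message struct {
	Subject string
	From    string
	To      []string
}

// containsAny reports whether text contains at least one keyword,
// ignoring case. Empty keywords are skipped.
func containsAny(text string, keywords []string) bool {
	lower := strings.ToLower(text)
	for _, k := range keywords {
		if k != "" && strings.Contains(lower, strings.ToLower(k)) {
			return true
		}
	}
	return false
}

// ShouldProcessMessage returns true when every non-empty keyword list
// has at least one match. A filter with no keywords accepts everything.
func (f MessageFilter) ShouldProcessMessage(m Message) bool {
	if len(f.SubjectKeywords) > 0 && !containsAny(m.Subject, f.SubjectKeywords) {
		return false
	}
	if len(f.SenderKeywords) > 0 && !containsAny(m.From, f.SenderKeywords) {
		return false
	}
	// Recipients are joined with newlines so a keyword cannot match
	// across the boundary between two addresses.
	if len(f.RecipientKeywords) > 0 && !containsAny(strings.Join(m.To, "\n"), f.RecipientKeywords) {
		return false
	}
	return true
}

func main() {
	filter := MessageFilter{
		SubjectKeywords: []string{"invoice", "receipt"},
		SenderKeywords:  []string{"@example.com"},
	}
	msg := Message{
		Subject: "Your INVOICE for March",
		From:    "Billing@Example.com",
		To:      []string{"me@home.test"},
	}
	fmt.Println(filter.ShouldProcessMessage(msg))
}
```
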
## Development Guidelines

### Code Quality and Standards
- All code must pass linting and tool formatting cleanly; exceptions are allowed only if they are properly documented