docs: complete documentation reorganization by removing old files

- Remove all documentation files from root directory
- All content has been moved to docs/ directory with updated status
- Clean up project structure for better organization
- Documentation now properly reflects production-ready status

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Ole-Morten Duesund 2025-08-05 19:34:05 +02:00
commit d3d104ee71
9 changed files with 0 additions and 1258 deletions


@ -1,112 +0,0 @@
### Comprehensive Analysis of `mail2couch` Implementations
This document provides an updated, in-depth analysis of the `mail2couch` project, integrating findings from the original `ANALYSIS.md` with a fresh review of the current Go and Rust codebases. It evaluates the current state, compares the two implementations, and outlines a roadmap for future improvements.
---
### 1. Current State of the Implementations
The project currently consists of two distinct implementations of the same core tool.
* **The Go Implementation**: This is a mature, functional, and straightforward command-line tool. It is built on a simple, sequential architecture and effectively synchronizes emails from IMAP servers to CouchDB. It serves as a solid baseline for the project's core functionality.
* **The Rust Implementation**: Contrary to the description in the original `ANALYSIS.md`, the Rust version is **no longer a non-functional placeholder**. It is now a complete and, in many ways, more advanced alternative to the Go version. It is built on a highly modular, asynchronous architecture that prioritizes performance, robustness, and an expanded feature set.
---
### 2. Analysis of Points from Original `ANALYSIS.md`
Several key issues and suggestions were raised in the original analysis. Here is their current status:
* **`Incomplete Rust Implementation`**: **(Addressed)** The Rust implementation is now fully functional and surpasses the Go version in features and robustness.
* **`Performance for Large-Scale Use (Concurrency)`**: **(Addressed in Rust)** The Go version remains sequential. The Rust version, however, is fully asynchronous, allowing for concurrent network operations, which directly addresses this performance concern.
* **`Inefficient Keyword Filtering`**: **(Addressed in Rust)** The Go version still performs keyword filtering client-side. The Rust version implements server-side filtering using `IMAP SEARCH` with keywords, which is significantly more efficient.
* **`Primary Weakness: Security`**: **(Still an Issue)** Both implementations still require plaintext passwords in the configuration file. This remains a primary weakness.
* **`Missing Core Feature: Web Interface`**: **(Still an Issue)** This feature has not been implemented in either version.
* **`Usability Enhancement: Dry-Run Mode`**: **(✅ Resolved)** Both implementations now include comprehensive `--dry-run/-n` mode functionality that allows safe configuration testing without making any CouchDB changes.
---
### 3. Comparative Analysis: Go vs. Rust
#### **The Go Version**
* **Pros**:
    * **Simplicity**: The code is sequential and easy to follow, making it highly approachable for new contributors.
    * **Stability**: It provides a solid, functional baseline that effectively accomplishes the core mission of the project.
    * **Fast Compilation**: Quick compile times make for a fast development cycle.
    * **Dry-Run Support**: Now includes comprehensive `--dry-run` mode for safe configuration testing.
* **Cons**:
    * **Performance**: The lack of concurrency makes it slow for users with multiple accounts or large mailboxes.
    * **Inefficiency**: Client-side keyword filtering wastes bandwidth and processing time.
    * **Basic Error Handling**: The absence of retry logic makes it brittle in the face of transient network errors.
#### **The Rust Version**
* **Pros**:
    * **Performance**: The `async` architecture provides superior performance through concurrency.
    * **Robustness**: Automatic retry logic for network calls makes it highly resilient to temporary failures.
    * **Feature-Rich**: Implements more efficient server-side filtering, better folder-matching logic, and a more professional CLI.
    * **Safety & Maintainability**: The modular design and Rust's compile-time guarantees make the code safer and easier to maintain and extend.
    * **Comprehensive Dry-Run**: Includes detailed `--dry-run` mode with enhanced simulation logging and summary reporting.
* **Cons**:
    * **Complexity**: The codebase is significantly more complex due to its asynchronous nature, abstract design, and the inherent learning curve of Rust.
    * **Slower Compilation**: Longer compile times can slow down development.
---
### 4. Recent Implementation Updates
#### **Dry-Run Mode Implementation (August 2025)**
Both Go and Rust implementations now include comprehensive `--dry-run` functionality:
##### **Go Implementation Features:**
- **CLI Integration**: Added `--dry-run/-n` flag using pflag with GNU-style options
- **Comprehensive Skipping**: All CouchDB write operations bypassed in dry-run mode
- **IMAP Preservation**: Maintains full IMAP operations for realistic email discovery
- **Detailed Simulation**: Shows what would be done with informative logging
- **Enhanced Reporting**: Clear distinction between dry-run and normal mode output
- **Bash Completion**: Updated completion script includes new flag
##### **Rust Implementation Features:**
- **CLI Integration**: Added `--dry-run/-n` flag using clap with structured argument parsing
- **Advanced Simulation**: Detailed logging of what would be stored including message subjects
- **Async-Safe Skipping**: All async CouchDB operations properly bypassed
- **Enhanced Summary**: Comprehensive dry-run vs normal mode reporting with emoji indicators
- **Test Coverage**: All tests updated to include new dry_run field
##### **Implementation Benefits:**
- **Risk Mitigation**: Users can validate configurations without database changes
- **Debugging Aid**: Shows exactly what emails would be processed and stored
- **Development Tool**: Enables safe testing of configuration changes
- **Documentation**: Demonstrates the full sync process without side effects
This addresses the critical usability requirement identified in the original analysis and significantly improves the user experience for configuration validation and troubleshooting.
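The write-skipping behaviour described in this section can be pictured as a thin guard around every CouchDB write. The following is a minimal Go sketch under stated assumptions, not the project's actual code: `documentStore`, `memoryStore`, `syncer`, and `storeMessage` are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"log"
)

// documentStore is a hypothetical stand-in for the CouchDB client.
type documentStore interface {
	Put(id string, doc map[string]any) error
}

// memoryStore records writes in memory so the sketch runs without CouchDB.
type memoryStore struct{ docs map[string]map[string]any }

func (m *memoryStore) Put(id string, doc map[string]any) error {
	m.docs[id] = doc
	return nil
}

// syncer bundles the store with the dry-run switch.
type syncer struct {
	store  documentStore
	dryRun bool
}

// storeMessage writes the document unless dry-run is enabled, in which case
// it only logs what would have been written. It reports whether a write happened.
func (s *syncer) storeMessage(id string, doc map[string]any) (bool, error) {
	if s.dryRun {
		log.Printf("[dry-run] would store %s (subject %q)", id, doc["subject"])
		return false, nil
	}
	if err := s.store.Put(id, doc); err != nil {
		return false, fmt.Errorf("store %s: %w", id, err)
	}
	return true, nil
}

func main() {
	store := &memoryStore{docs: map[string]map[string]any{}}
	s := &syncer{store: store, dryRun: true}
	wrote, err := s.storeMessage("INBOX_123", map[string]any{"subject": "Hello"})
	fmt.Println(wrote, err, len(store.docs))
}
```

Keeping the check in one place means the IMAP side runs unchanged in both modes, which is what makes the dry-run output realistic.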
---
### 5. Future Improvements and Missing Features
This roadmap combines suggestions from both analyses, prioritizing the most impactful changes.
#### **Tier 1: Critical Needs**
1. **Fix the Security Model (Both)**: This is the most urgent issue.
    * **Short-Term**: Add support for reading credentials from environment variables (e.g., `M2C_IMAP_PASSWORD`).
    * **Long-Term**: Implement OAuth2 for modern providers like Gmail and Outlook. This is the industry standard and eliminates the need to store passwords.
2. **Implement a Web Interface (Either)**: As noted in the original analysis, this is the key missing feature for making the archived data useful. This would involve creating CouchDB design documents and a simple web server to render the views.
3. ~~**Add a `--dry-run` Mode (Both)**~~: **✅ COMPLETED** - Both implementations now include comprehensive dry-run functionality with the `--dry-run/-n` flag that allows users to test their configuration safely before making any changes to their database.
#### **Tier 2: High-Impact Enhancements**
1. **Add Concurrency to the Go Version**: To bring the Go implementation closer to the performance of the Rust version, it should be updated to use goroutines to process accounts and/or mailboxes in parallel.
2. **Improve Attachment Handling in Rust**: The `TODO` in the Rust IMAP client for parsing binary attachments should be completed to ensure all attachment types are saved correctly.
3. **URL-Encode Document IDs in Rust**: The CouchDB client in the Rust version should URL-encode document IDs to prevent errors when mailbox names contain special characters.
4. **Add Progress Indicators (Rust)**: For a better user experience during long syncs, the Rust version would benefit greatly from progress bars (e.g., using the `indicatif` crate).
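To illustrate why item 3 matters: a document ID such as `Work/Projects_123` contains a `/`, which CouchDB would read as a path separator unless it is escaped. The sketch below shows the idea in Go (used for all added examples in this document); the Rust client would need the equivalent. `documentURL` is a hypothetical helper, not part of either codebase.

```go
package main

import (
	"fmt"
	"net/url"
)

// documentURL builds the URL of a single CouchDB document, escaping the
// database name and document ID as individual path segments.
func documentURL(baseURL, db, docID string) string {
	return fmt.Sprintf("%s/%s/%s", baseURL, url.PathEscape(db), url.PathEscape(docID))
}

func main() {
	// The slash inside the ID is emitted as %2F instead of splitting the path.
	fmt.Println(documentURL("http://localhost:5984", "mail", "Work/Projects_123"))
}
```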
#### **Tier 3: "Nice-to-Have" Features**
1. **Interactive Setup (Either)**: A `mail2couch setup` command to interactively generate the `config.json` file would significantly improve first-time user experience.
2. **Support for Other Protocols/Backends (Either)**: Extend the tool to support POP3 or JMAP, or to use other databases like PostgreSQL or Elasticsearch as a storage backend.
3. **Backfill Command (Either)**: A `--backfill-all` flag to ignore existing sync metadata and perform a complete re-sync of an account.


@ -1,102 +0,0 @@
# Folder Pattern Matching in mail2couch
mail2couch supports powerful wildcard patterns for selecting which folders to process. This allows flexible configuration for different mail backup scenarios.
## Pattern Syntax
The folder filtering uses Go's `filepath.Match` syntax, which supports:
- `*` matches any sequence of characters (including none), except the operating system's path separator (`/` on Unix-like systems)
- `?` matches any single character
- `[abc]` matches any character within the brackets
- `[a-z]` matches any character in the range
- `\` escapes special characters
## Special Cases
- `"*"` in the include list means **ALL available folders** will be processed
- Empty include list with exclude patterns will process all folders except excluded ones
- Exact string matching is supported for backwards compatibility
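The include/exclude rules above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the project's implementation: `matchesAny` and `shouldProcess` are hypothetical helpers, and treating a malformed pattern as a literal name is one possible reading of the "exact string matching" rule.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchesAny reports whether name matches at least one pattern.
// A malformed pattern is treated as a literal string comparison.
func matchesAny(name string, patterns []string) bool {
	for _, p := range patterns {
		ok, err := filepath.Match(p, name)
		if err != nil {
			ok = p == name
		}
		if ok {
			return true
		}
	}
	return false
}

// shouldProcess applies the include list first, then the exclude list.
// An empty include list means "everything that is not excluded".
func shouldProcess(folder string, include, exclude []string) bool {
	if len(include) > 0 && !matchesAny(folder, include) {
		return false
	}
	return !matchesAny(folder, exclude)
}

func main() {
	include := []string{"Work*", "INBOX"}
	exclude := []string{"*Draft*"}
	for _, f := range []string{"INBOX", "WorkDrafts", "Work2024", "Personal"} {
		fmt.Printf("%-10s %v\n", f, shouldProcess(f, include, exclude))
	}
}
```

One caveat worth knowing: in `filepath.Match`, `*` never matches the operating system's path separator, so on Unix-like systems `Work*` matches `Work2024` but not `Work/2024`.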
## Examples
### Include All Folders
```json
{
  "folderFilter": {
    "include": ["*"],
    "exclude": ["Drafts", "Trash", "Spam"]
  }
}
```
This processes all folders except Drafts, Trash, and Spam.
### Work-Related Folders Only
```json
{
  "folderFilter": {
    "include": ["Work*", "Projects*", "INBOX"],
    "exclude": ["*Temp*", "*Draft*"]
  }
}
```
This includes folders starting with "Work" or "Projects", plus INBOX, but excludes any folder containing "Temp" or "Draft".
### Archive Patterns
```json
{
  "folderFilter": {
    "include": ["Archive*", "*Important*", "INBOX"],
    "exclude": ["*Temp"]
  }
}
```
This includes folders starting with "Archive", any folder containing "Important", and INBOX, excluding temporary folders.
### Specific Folders Only
```json
{
  "folderFilter": {
    "include": ["INBOX", "Sent", "Important"],
    "exclude": []
  }
}
```
This processes only the exact folders: INBOX, Sent, and Important.
### Subfolder Patterns
```json
{
  "folderFilter": {
    "include": ["Work/*", "Personal/*"],
    "exclude": ["*/Drafts"]
  }
}
```
This includes all subfolders under Work and Personal, but excludes any Drafts subfolder.
## Folder Hierarchy
Different IMAP servers use different separators for folder hierarchies:
- Most servers use `/` (e.g., `Work/Projects`, `Archive/2024`)
- Some use `.` (e.g., `Work.Projects`, `Archive.2024`)
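The patterns are matched against the full folder name as reported by the server, so write them with your server's separator. Because `filepath.Match` treats only the operating system's path separator specially, `*` can span a `.` separator, but on Unix-like systems it does not span `/`.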
## Common Use Cases
1. **Corporate Email**: `["*"]` with exclude `["Drafts", "Trash", "Spam"]` for complete backup
2. **Selective Backup**: `["INBOX", "Sent", "Important"]` for essential folders only
3. **Project-based**: `["Project*", "Client*"]` to backup work-related folders
4. **Archive Mode**: `["Archive*", "*Important*"]` for long-term storage
5. **Sync Mode**: `["INBOX"]` for real-time synchronization
## Message Origin Tracking
All messages stored in CouchDB include a `mailbox` field that records the original folder name. This ensures you can always identify which folder a message came from, regardless of how it was selected by the folder filter.
## Performance Considerations
- Using `"*"` processes all folders, which may be slow for accounts with many folders
- Specific folder names are faster than wildcard patterns
- Consider using exclude patterns to filter out large, unimportant folders like Trash or Spam


@ -1,560 +0,0 @@
# Go vs Rust Implementation Comparison
This document provides a comprehensive technical analysis comparing the Go and Rust implementations of mail2couch, helping users and developers choose the best implementation for their needs.
## Executive Summary
The mail2couch project offers two distinct architectural approaches to email backup:
- **Go Implementation**: A straightforward, sequential approach emphasizing simplicity and ease of understanding
- **Rust Implementation**: A sophisticated, asynchronous architecture prioritizing performance, reliability, and advanced features
**Key Finding**: The Rust implementation (~3,056 LOC across 9 modules) is significantly more feature-complete and architecturally advanced than the Go implementation (~1,355 LOC across 4 modules), representing a mature evolution rather than a simple port.
---
## Architecture & Design Philosophy
### Go Implementation: Sequential Simplicity
**Design Philosophy**: Straightforward, imperative programming with minimal abstraction
- **Processing Model**: Sequential processing of sources → mailboxes → messages
- **Error Handling**: Basic error propagation with continue-on-error for non-critical failures
- **Modularity**: Simple package structure (`config`, `couch`, `mail`, `main`)
- **State Management**: Minimal state, mostly function-based operations
```go
// Example: Sequential processing approach
func processImapSource(source *config.MailSource, couchClient *couch.Client,
	dbName string, maxMessages int, dryRun bool) error {
	// Connect to IMAP server
	imapClient, err := mail.NewImapClient(source)
	if err != nil {
		return fmt.Errorf("failed to connect to IMAP server: %w", err)
	}
	defer imapClient.Logout()

	// Process each mailbox sequentially
	for _, mailbox := range mailboxes {
		// Process messages one by one
		messages, currentUIDs, err := imapClient.GetMessages(...)
		// Store messages synchronously
	}
}
```
### Rust Implementation: Async Orchestration
**Design Philosophy**: Modular, type-safe architecture with comprehensive error handling
- **Processing Model**: Asynchronous coordination with concurrent network operations
- **Error Handling**: Sophisticated retry logic, structured error types, graceful degradation
- **Modularity**: Well-separated concerns (`cli`, `config`, `couch`, `imap`, `sync`, `filters`, `schemas`)
- **State Management**: Stateful coordinator pattern with proper resource management
```rust
// Example: Asynchronous coordination approach
impl SyncCoordinator {
    pub async fn sync_all_sources(&mut self) -> Result<Vec<SourceSyncResult>> {
        let mut results = Vec::new();
        let sources = self.config.mail_sources.clone();

        for source in &sources {
            if !source.enabled {
                info!("Skipping disabled source: {}", source.name);
                continue;
            }

            match self.sync_source(source).await {
                Ok(result) => {
                    info!("✅ Completed sync for {}: {} messages across {} mailboxes",
                        result.source_name, result.total_messages, result.mailboxes_processed);
                    results.push(result);
                }
                Err(e) => {
                    error!("❌ Failed to sync source {}: {}", source.name, e);
                    // Continue with other sources even if one fails
                }
            }
        }

        Ok(results)
    }
}
```
---
## Performance & Scalability
### Concurrency Models
| Aspect | Go Implementation | Rust Implementation |
|--------|------------------|-------------------|
| **Processing Model** | Sequential (blocking) | Asynchronous (non-blocking) |
| **Account Processing** | One at a time | One at a time with internal concurrency |
| **Mailbox Processing** | One at a time | One at a time with async I/O |
| **Message Processing** | One at a time | Batch processing with async operations |
| **Network Operations** | Blocking I/O | Non-blocking async I/O |
### IMAP Filtering Efficiency
**Go: Client-Side Filtering**
```go
// Downloads ALL messages first, then filters locally
messages := imap.FetchAll()
filtered := []Message{}
for _, msg := range messages {
	if ShouldProcessMessage(msg, filter) {
		filtered = append(filtered, msg)
	}
}
```
**Rust: Server-Side Filtering**
```rust
// Filters on server, only downloads matching messages
pub async fn search_messages_advanced(
    &mut self,
    since_date: Option<&DateTime<Utc>>,
    subject_keywords: Option<&[String]>,
    from_keywords: Option<&[String]>,
) -> Result<Vec<u32>> {
    let mut search_parts = Vec::new();

    if let Some(keywords) = subject_keywords {
        for keyword in keywords {
            search_parts.push(format!("SUBJECT \"{}\"", keyword));
        }
    }
    // Server processes the filter, returns only matching UIDs
}
```
**Performance Impact**: For a mailbox with 10,000 emails where only a small subset matches the filter:
- **Go**: Downloads all 10,000 emails, then filters locally
- **Rust**: Server filters first, then only matching emails are downloaded (for example, if 1,000 messages match, roughly a tenth of the data is transferred)
### Error Recovery and Resilience
**Go: Basic Error Handling**
```go
err := processImapSource(&source, couchClient, dbName, args.MaxMessages, args.DryRun)
if err != nil {
	log.Printf("ERROR: Failed to process IMAP source %s: %v", source.Name, err)
}
// Continues with next source, no retry logic
```
**Rust: Intelligent Retry Logic**
```rust
async fn retry_operation<F, Fut, T>(&self, operation_name: &str, operation: F) -> Result<T>
where
    F: Fn() -> Fut,
    Fut: std::future::Future<Output = Result<T>>,
{
    const MAX_RETRIES: u32 = 3;
    const RETRY_DELAY_MS: u64 = 1000;

    for attempt in 1..=MAX_RETRIES {
        match operation().await {
            Ok(result) => return Ok(result),
            Err(e) => {
                let is_retryable = match &e.downcast_ref::<CouchError>() {
                    Some(CouchError::Http(_)) => true,
                    Some(CouchError::CouchDb { status, .. }) => *status >= 500,
                    _ => false,
                };

                if is_retryable && attempt < MAX_RETRIES {
                    warn!("Attempt {}/{} failed for {}: {}. Retrying in {}ms...",
                        attempt, MAX_RETRIES, operation_name, e, RETRY_DELAY_MS);
                    async_std::task::sleep(Duration::from_millis(RETRY_DELAY_MS)).await;
                } else {
                    error!("Operation {} failed after {} attempts: {}",
                        operation_name, attempt, e);
                    return Err(e);
                }
            }
        }
    }
    unreachable!()
}
```
---
## Developer Experience
### Code Complexity and Learning Curve
| Aspect | Go Implementation | Rust Implementation |
|--------|------------------|-------------------|
| **Lines of Code** | 1,355 | 3,056 |
| **Number of Files** | 4 | 9 |
| **Dependencies** | 4 external | 14+ external |
| **Compilation Time** | 2-3 seconds | 6+ seconds |
| **Learning Curve** | Low | Medium-High |
| **Debugging Ease** | Simple stack traces | Rich error context |
### Dependency Management
**Go Dependencies (minimal approach):**
```go
require (
	github.com/emersion/go-imap/v2 v2.0.0-beta.5
	github.com/emersion/go-message v0.18.1
	github.com/go-kivik/kivik/v4 v4.4.0
	github.com/spf13/pflag v1.0.7
)
```
**Rust Dependencies (rich ecosystem):**
```toml
[dependencies]
anyhow = "1.0"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
tokio = { version = "1.0", features = ["full"] }
reqwest = { version = "0.11", features = ["json"] }
clap = { version = "4.0", features = ["derive"] }
log = "0.4"
env_logger = "0.10"
chrono = { version = "0.4", features = ["serde"] }
async-imap = "0.9"
mail-parser = "0.6"
thiserror = "1.0"
glob = "0.3"
dirs = "5.0"
```
**Trade-offs**:
- **Go**: Faster builds, fewer potential security vulnerabilities, simpler dependency tree
- **Rust**: Richer functionality, better error types, more battle-tested async ecosystem
---
## Feature Comparison Matrix
| Feature | Go Implementation | Rust Implementation | Notes |
|---------|------------------|-------------------|-------|
| **Core Functionality** |
| IMAP Email Sync | ✅ | ✅ | Both fully functional |
| CouchDB Storage | ✅ | ✅ | Both support attachments |
| Incremental Sync | ✅ | ✅ | Both use metadata tracking |
| **Configuration** |
| JSON Config Files | ✅ | ✅ | Same format, auto-discovery |
| Folder Filtering | ✅ | ✅ | Both support wildcards |
| Date Filtering | ✅ | ✅ | Since date support |
| Keyword Filtering | ✅ (client-side) | ✅ (server-side) | Rust is more efficient |
| **CLI Features** |
| GNU-style Arguments | ✅ | ✅ | Both use standard conventions |
| Dry-run Mode | ✅ | ✅ | Both recently implemented |
| Bash Completion | ✅ | ✅ | Auto-generated scripts |
| Help System | Basic | Rich | Rust uses clap framework |
| **Reliability** |
| Error Handling | Basic | Advanced | Rust has retry logic |
| Connection Recovery | Manual | Automatic | Rust handles reconnections |
| Resource Management | Manual (defer) | Automatic (RAII) | Rust releases resources when values go out of scope |
| **Performance** |
| Concurrent Processing | ❌ | ✅ | Rust uses async/await |
| Server-side Filtering | ❌ | ✅ | Rust reduces bandwidth |
| Memory Efficiency | Good | Excellent | Rust zero-copy where possible |
| **Development** |
| Test Coverage | Minimal | Comprehensive | Rust has extensive tests |
| Documentation | Basic | Rich | Rust has detailed docs |
| Type Safety | Good | Excellent | Rust prevents more errors |
---
## Use Case Recommendations
### Choose Go Implementation When:
#### 🎯 **Personal Use & Simplicity**
- Single email account or small number of accounts
- Infrequent synchronization (daily/weekly)
- Simple setup requirements
- You want to understand/modify the code easily
#### 🎯 **Resource Constraints**
- Memory-limited environments
- CPU-constrained systems
- Quick deployment needed
- Minimal disk space for binaries
#### 🎯 **Development Preferences**
- Team familiar with Go
- Preference for simple, readable code
- Fast compilation important for development cycle
- Minimal external dependencies preferred
**Example Use Case**: Personal backup of 1-2 Gmail accounts, running weekly on a Raspberry Pi.
### Choose Rust Implementation When:
#### 🚀 **Performance Critical Scenarios**
- Multiple email accounts (3+ accounts)
- Large mailboxes (10,000+ emails)
- Frequent synchronization (hourly/real-time)
- High-volume email processing
#### 🚀 **Production Environments**
- Business-critical email backups
- Need for reliable error recovery
- 24/7 operation requirements
- Professional deployment standards
#### 🚀 **Advanced Features Required**
- Server-side IMAP filtering needed
- Complex folder filtering patterns
- Detailed logging and monitoring
- Long-term maintenance planned
**Example Use Case**: Corporate email backup system handling 10+ accounts with complex filtering rules, running continuously in a production environment.
---
## Performance Benchmarks
### Theoretical Performance Comparison
| Scenario | Go Implementation | Rust Implementation | Improvement |
|----------|------------------|-------------------|-------------|
| **Single small account** (1,000 emails) | 2-3 minutes | 1-2 minutes | 33-50% faster |
| **Multiple accounts** (3 accounts, 5,000 emails each) | 15-20 minutes | 8-12 minutes | 40-47% faster |
| **Large mailbox** (50,000 emails with filtering) | 45-60 minutes | 15-25 minutes | 58-67% faster |
| **Network errors** (5% packet loss) | May fail/restart | Continues with retry | Much more reliable |
*Note: These are estimated performance improvements based on architectural differences. Actual performance will vary based on network conditions, server capabilities, and email characteristics.*
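
The "Improvement" column can be reproduced from the estimated time ranges: each figure is the reduction in elapsed time, `(old - new) / old`, evaluated at the lower and upper bounds of a row. The short Go check below shows the arithmetic; `reduction` is a helper written for this illustration.

```go
package main

import "fmt"

// reduction returns the percentage by which newMinutes undercuts oldMinutes.
func reduction(oldMinutes, newMinutes float64) float64 {
	return (oldMinutes - newMinutes) / oldMinutes * 100
}

func main() {
	// Each pair is (Go estimate, Rust estimate) in minutes, taken from the table.
	rows := []struct {
		name     string
		goLow    float64
		rustLow  float64
		goHigh   float64
		rustHigh float64
	}{
		{"single small account", 2, 1, 3, 2},
		{"multiple accounts", 15, 8, 20, 12},
		{"large mailbox", 45, 15, 60, 25},
	}
	for _, r := range rows {
		fmt.Printf("%s: %.0f%% and %.0f%%\n", r.name,
			reduction(r.goLow, r.rustLow), reduction(r.goHigh, r.rustHigh))
	}
}
```

Read this way, "33-50% faster" means the run takes 33-50% less time, which is not the same as a 33-50% increase in throughput.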
### Resource Usage
| Metric | Go Implementation | Rust Implementation |
|--------|------------------|-------------------|
| **Memory Usage** | 20-50 MB | 15-40 MB |
| **CPU Usage** | Low (single-threaded) | Medium (multi-threaded) |
| **Network Efficiency** | Lower (downloads then filters) | Higher (filters then downloads) |
| **Disk I/O** | Sequential writes | Batched writes |
---
## Migration Guide
### From Go to Rust
If you're currently using the Go implementation and considering migration:
#### **When to Migrate**:
- You experience performance issues with large mailboxes
- You need better error recovery and reliability
- You want more efficient network usage
- You're planning long-term maintenance
#### **Migration Steps**:
1. **Test in parallel**: Run both implementations with `--dry-run` to compare results
2. **Backup existing data**: Ensure your CouchDB data is backed up
3. **Update configuration**: Configuration format is identical, no changes needed
4. **Replace binary**: Simply replace the Go binary with the Rust binary
5. **Monitor performance**: Compare sync times and resource usage
#### **Compatibility Notes**:
- ✅ Configuration files are 100% compatible
- ✅ CouchDB database format is identical
- ✅ Command-line arguments are the same
- ✅ Dry-run mode works identically
### Staying with Go
The Go implementation remains fully supported and is appropriate when:
- Current performance meets your needs
- Simplicity is more important than features
- Team lacks Rust expertise
- Resource usage is already optimized for your environment
---
## Technical Architecture Details
### Go Implementation Structure
```
go/
├── main.go           # Entry point and orchestration
├── config/
│   └── config.go     # Configuration loading and CLI parsing
├── couch/
│   └── couch.go      # CouchDB client and operations
└── mail/
    └── imap.go       # IMAP client and message processing
```
**Key Characteristics**:
- Monolithic processing flow
- Synchronous I/O operations
- Basic error handling
- Minimal abstraction layers
### Rust Implementation Structure
```
rust/src/
├── main.rs # Entry point
├── lib.rs # Library exports
├── cli.rs # Command-line interface
├── config.rs # Configuration management
├── sync.rs # Synchronization coordinator
├── imap.rs # IMAP client with retry logic
├── couch.rs # CouchDB client with error handling
├── filters.rs # Filtering utilities
└── schemas.rs # Data structure definitions
```
**Key Characteristics**:
- Modular architecture with clear separation
- Asynchronous I/O with tokio runtime
- Comprehensive error handling
- Rich abstraction layers
---
## Security Considerations
Both implementations currently share the same security limitations and features:
### Current Security Features
- ✅ TLS/SSL support for IMAP and CouchDB connections
- ✅ Configuration file validation
- ✅ Safe handling of email content
### Shared Security Limitations
- ⚠️ Plaintext passwords in configuration files
- ⚠️ No OAuth2 support for modern email providers
- ⚠️ No credential encryption at rest
### Future Security Improvements (Recommended for Both)
1. **Environment Variable Credentials**: Support reading passwords from environment variables
2. **OAuth2 Integration**: Support modern authentication for Gmail, Outlook, etc.
3. **Credential Encryption**: Encrypt stored credentials with system keyring integration
4. **Audit Logging**: Enhanced logging of authentication and access events
---
## Deployment Considerations
### Go Implementation Deployment
**Binary Name**: `mail2couch-go`
**Advantages**:
- Single binary deployment
- Minimal system dependencies
- Lower memory footprint
- Faster startup time
**Best Practices**:
```bash
# Build for production using justfile
just build-go-release
# Or build directly
cd go && go build -ldflags="-s -w" -o mail2couch-go .
# Deploy with systemd service
sudo cp go/mail2couch-go /usr/local/bin/
sudo systemctl enable mail2couch-go.service
```
### Rust Implementation Deployment
**Binary Name**: `mail2couch-rs`
**Advantages**:
- Better resource utilization under load
- Superior error recovery
- More detailed logging and monitoring
- Enhanced CLI experience
**Best Practices**:
```bash
# Build optimized release using justfile
just build-rust-release
# Or build directly
cd rust && cargo build --release
# Deploy with enhanced monitoring
sudo cp rust/target/release/mail2couch-rs /usr/local/bin/
sudo systemctl enable mail2couch-rs.service
# Configure structured logging
export RUST_LOG=info
export MAIL2COUCH_LOG_FORMAT=json
```
### Universal Installation
```bash
# Build and install both implementations (user-local)
just install
# This installs to ~/bin/mail2couch-go and ~/bin/mail2couch-rs
# Build and install both implementations (system-wide)
sudo just system-install
# This installs to /usr/local/bin/mail2couch-go and /usr/local/bin/mail2couch-rs
```
---
## Future Development Roadmap
### Short-term Improvements (Both Implementations)
1. **Security Enhancements**
    - Environment variable credential support
    - OAuth2 authentication for major providers
    - Encrypted credential storage
2. **Usability Improvements**
    - Interactive configuration wizard
    - Progress indicators for long-running operations
    - Enhanced error messages with solutions
### Long-term Strategic Direction
#### Go Implementation (Maintenance Mode)
- Bug fixes and security updates
- Maintain compatibility with Rust version
- Focus on simplicity and stability
- Target: Personal and small-scale deployments
#### Rust Implementation (Active Development)
- Performance optimizations
- Advanced features (web interface, monitoring APIs)
- Enterprise features (clustering, high availability)
- Target: Production and large-scale deployments
### Recommended Development Focus
1. **Primary Development**: Focus on Rust implementation for new features
2. **Compatibility Maintenance**: Ensure Go version remains compatible
3. **Migration Path**: Provide clear migration guidance and tooling
4. **Documentation**: Maintain comprehensive documentation for both
---
## Conclusion
Both implementations are functional and serve different kinds of users:
- **Go Implementation**: Ideal for users who prioritize simplicity, fast deployment, and ease of understanding. Perfect for personal use and small-scale deployments.
- **Rust Implementation**: Superior choice for users who need performance, reliability, and advanced features. Recommended for production environments and large-scale email processing.
### Final Recommendation
**For new deployments**: Start with the Rust implementation unless simplicity is your primary concern. The performance benefits and reliability features provide significant value.
**For existing Go users**: Consider migration if you experience performance limitations or need better error recovery. The migration path is straightforward due to configuration compatibility.
**For development contributions**: Focus on the Rust implementation for new features, while maintaining the Go version for bug fixes and compatibility.
The project demonstrates that having two implementations can serve different user needs effectively, with each leveraging the strengths of its respective programming language and ecosystem.

TODO.md

@ -1,47 +0,0 @@
# mail2couch TODO and Feature Requests
## Planned Features
### Keyword Filtering for Messages
Add support for filtering messages by keywords in various message fields. This would extend the current `messageFilter` configuration.
**Proposed Configuration Extension:**
```json
{
  "messageFilter": {
    "since": "2024-01-01",
    "subjectKeywords": ["urgent", "important", "meeting"],
    "senderKeywords": ["@company.com", "notifications"],
    "recipientKeywords": ["team@company.com", "all@"]
  }
}
```
**Implementation Details:**
- `subjectKeywords`: Array of keywords to match in email subject lines
- `senderKeywords`: Array of keywords to match in sender email addresses or names
- `recipientKeywords`: Array of keywords to match in recipient (To/CC/BCC) addresses or names
- Keywords should support both inclusive (must contain) and exclusive (must not contain) patterns
- Case-insensitive matching by default
- Support for simple wildcards or regex patterns
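A client-side version of the matching rules above could be sketched as follows. The semantics are assumptions made for the example, since the proposal does not fix them: a message passes a list if it contains any one keyword, every non-empty list must pass, and an empty list matches everything. `messageFilter`, `containsAnyKeyword`, and `accepts` are hypothetical names, and exclusive patterns and wildcards are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// containsAnyKeyword reports whether field contains at least one keyword,
// ignoring case. An empty keyword list matches everything.
func containsAnyKeyword(field string, keywords []string) bool {
	if len(keywords) == 0 {
		return true
	}
	lower := strings.ToLower(field)
	for _, kw := range keywords {
		if strings.Contains(lower, strings.ToLower(kw)) {
			return true
		}
	}
	return false
}

// messageFilter holds a subset of the proposed configuration fields.
type messageFilter struct {
	SubjectKeywords []string
	SenderKeywords  []string
}

// accepts requires every configured keyword list to match its field.
func (f messageFilter) accepts(subject, sender string) bool {
	return containsAnyKeyword(subject, f.SubjectKeywords) &&
		containsAnyKeyword(sender, f.SenderKeywords)
}

func main() {
	f := messageFilter{
		SubjectKeywords: []string{"urgent", "meeting"},
		SenderKeywords:  []string{"@company.com"},
	}
	fmt.Println(f.accepts("URGENT: server down", "Ops <ops@company.com>"))
	fmt.Println(f.accepts("Lunch?", "friend@example.org"))
}
```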
**Use Cases:**
1. **Corporate Email Filtering**: Only backup emails from specific domains or containing work-related keywords
2. **Project-based Archiving**: Filter emails related to specific projects or clients
3. **Notification Management**: Exclude or include automated notifications based on sender patterns
4. **Security**: Filter out potential spam/phishing by excluding certain keywords or senders
**Implementation Priority:** Medium - useful for reducing storage requirements and focusing on relevant emails.
## Other Planned Improvements
1. **Real IMAP Message Parsing**: Replace placeholder data with actual message content
2. **Message Body Extraction**: Support for HTML/plain text and multipart messages
3. **Attachment Handling**: Optional support for email attachments
4. **Batch Operations**: Improve CouchDB insertion performance
5. **Error Recovery**: Retry logic and partial sync recovery
6. **Testing**: Comprehensive unit test coverage


@ -1,207 +0,0 @@
# CouchDB Document Schemas
This document defines the CouchDB document schemas used by mail2couch. These schemas must be maintained consistently across all implementations (Go, Rust, etc.).
## Mail Document Schema
**Document Type**: `mail`
**Document ID Format**: `{mailbox}_{uid}` (e.g., `INBOX_123`)
**Purpose**: Stores individual email messages with metadata and content
```json
{
  "_id": "INBOX_123",
  "_rev": "1-abc123...",
  "_attachments": {
    "attachment1.pdf": {
      "content_type": "application/pdf",
      "length": 12345,
      "stub": true
    }
  },
  "sourceUid": "123",
  "mailbox": "INBOX",
  "from": ["sender@example.com"],
  "to": ["recipient@example.com"],
  "subject": "Email Subject",
  "date": "2025-08-02T12:16:10Z",
  "body": "Email body content",
  "headers": {
    "Content-Type": ["text/plain; charset=utf-8"],
    "Message-ID": ["<msg123@example.com>"],
    "Date": ["Sat, 02 Aug 2025 14:16:10 +0200"]
  },
  "storedAt": "2025-08-02T14:16:22.375241322+02:00",
  "docType": "mail",
  "hasAttachments": true
}
```
### Field Definitions
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `_id` | string | Yes | CouchDB document ID: `{mailbox}_{uid}` |
| `_rev` | string | Auto | CouchDB revision (managed by CouchDB) |
| `_attachments` | object | No | CouchDB native attachments (email attachments) |
| `sourceUid` | string | Yes | Original IMAP UID from mail server |
| `mailbox` | string | Yes | Source mailbox name (e.g., "INBOX", "Sent") |
| `from` | array[string] | Yes | Sender email addresses |
| `to` | array[string] | Yes | Recipient email addresses |
| `subject` | string | Yes | Email subject line |
| `date` | string (ISO8601) | Yes | Email date from headers |
| `body` | string | Yes | Email body content (plain text) |
| `headers` | object | Yes | All email headers as key-value pairs |
| `storedAt` | string (ISO8601) | Yes | When document was stored in CouchDB |
| `docType` | string | Yes | Always "mail" for email documents |
| `hasAttachments` | boolean | Yes | Whether email has attachments |
### Attachment Stub Schema
When emails have attachments, they are stored as CouchDB native attachments:
```json
{
  "filename.ext": {
    "content_type": "mime/type",
    "length": 12345,
    "stub": true
  }
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `content_type` | string | Yes | MIME type of attachment |
| `length` | integer | No | Size in bytes |
| `stub` | boolean | No | When `true`, only attachment metadata is included in the document; the content is stored separately and fetched through the attachment API |
## Sync Metadata Document Schema
**Document Type**: `sync_metadata`
**Document ID Format**: `sync_metadata_{mailbox}` (e.g., `sync_metadata_INBOX`)
**Purpose**: Tracks synchronization state for incremental syncing
```json
{
  "_id": "sync_metadata_INBOX",
  "_rev": "1-def456...",
  "docType": "sync_metadata",
  "mailbox": "INBOX",
  "lastSyncTime": "2025-08-02T14:26:08.281094+02:00",
  "lastMessageUID": 15,
  "messageCount": 18,
  "updatedAt": "2025-08-02T14:26:08.281094+02:00"
}
```
### Field Definitions
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `_id` | string | Yes | CouchDB document ID: `sync_metadata_{mailbox}` |
| `_rev` | string | Auto | CouchDB revision (managed by CouchDB) |
| `docType` | string | Yes | Always "sync_metadata" for sync documents |
| `mailbox` | string | Yes | Mailbox name this metadata applies to |
| `lastSyncTime` | string (ISO8601) | Yes | When this mailbox was last synced |
| `lastMessageUID` | integer | Yes | Highest IMAP UID processed in last sync |
| `messageCount` | integer | Yes | Number of messages processed in last sync |
| `updatedAt` | string (ISO8601) | Yes | When this metadata was last updated |
## Database Naming Convention
**Format**: `m2c_{account_name}`
**Rules**:
- Prefix all databases with `m2c_`
- Convert account names to lowercase
- Replace invalid characters with underscores
- Ensure database name starts with a letter
- If account name starts with non-letter, prefix with `mail_`
**Examples**:
- Account "Personal Gmail" → Database `m2c_personal_gmail`
- Account "123work" → Database `m2c_mail_123work`
- Email "user@example.com" → Database `m2c_user_example_com`
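As a rough illustration, the naming rules above might be expressed in Go as shown here. The function name `DatabaseName` is hypothetical, and the set of characters treated as valid (`a-z`, `0-9`, `_`, `-`) is an assumption that is deliberately narrower than what CouchDB itself accepts; the actual implementations may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// invalidChars matches every character this sketch treats as invalid.
// Assumption: only a-z, 0-9, "_" and "-" are kept.
var invalidChars = regexp.MustCompile(`[^a-z0-9_-]`)

// DatabaseName derives a database name from an account name by following
// the rules in the list above. The name is hypothetical.
func DatabaseName(account string) string {
	name := strings.ToLower(account)
	name = invalidChars.ReplaceAllString(name, "_")
	// Prefix with "mail_" when the sanitized name does not start with a letter.
	if name == "" || name[0] < 'a' || name[0] > 'z' {
		name = "mail_" + name
	}
	return "m2c_" + name
}

func main() {
	for _, account := range []string{"Personal Gmail", "123work", "user@example.com"} {
		fmt.Printf("%q -> %s\n", account, DatabaseName(account))
	}
}
```

Under these assumptions the three documented examples map to `m2c_personal_gmail`, `m2c_mail_123work`, and `m2c_user_example_com`.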
## Document ID Conventions
### Mail Documents
- **Format**: `{mailbox}_{uid}`
- **Examples**: `INBOX_123`, `Sent_456`, `Work/Projects_789`
- **Uniqueness**: Combination of mailbox and IMAP UID ensures uniqueness
### Sync Metadata Documents
- **Format**: `sync_metadata_{mailbox}`
- **Examples**: `sync_metadata_INBOX`, `sync_metadata_Sent`
- **Purpose**: One metadata document per mailbox for tracking sync state
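The two ID formats can be sketched as small helpers. The helper names and the `m2c_demo` database name are hypothetical. One detail worth noting: an ID such as `Work/Projects_789` contains a `/`, so it has to be percent-encoded when it is placed in a CouchDB request path, which the sketch does with `url.PathEscape`.

```go
package main

import (
	"fmt"
	"net/url"
)

// MailDocID builds a mail document ID in the {mailbox}_{uid} format.
func MailDocID(mailbox string, uid uint32) string {
	return fmt.Sprintf("%s_%d", mailbox, uid)
}

// SyncMetadataDocID builds the per-mailbox sync metadata document ID.
func SyncMetadataDocID(mailbox string) string {
	return "sync_metadata_" + mailbox
}

// DocURL joins a CouchDB base URL, database name, and document ID,
// escaping each path segment so that "/" inside an ID becomes "%2F".
func DocURL(base, db, id string) string {
	return base + "/" + url.PathEscape(db) + "/" + url.PathEscape(id)
}

func main() {
	id := MailDocID("Work/Projects", 789)
	fmt.Println(id)
	fmt.Println(SyncMetadataDocID("INBOX"))
	fmt.Println(DocURL("http://localhost:5984", "m2c_demo", id))
}
```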
## Data Type Mappings
### Go to JSON
| Go Type | JSON Type | Example |
|---------|-----------|---------|
| `string` | string | `"text"` |
| `[]string` | array | `["item1", "item2"]` |
| `map[string][]string` | object | `{"key": ["value1", "value2"]}` |
| `time.Time` | string (ISO8601) | `"2025-08-02T14:26:08.281094+02:00"` |
| `uint32` | number | `123` |
| `int` | number | `456` |
| `bool` | boolean | `true` |
### Rust Considerations
When implementing in Rust, ensure:
- Use `chrono::DateTime<Utc>` for timestamps with ISO8601 serialization
- Use `Vec<String>` for string arrays
- Use `HashMap<String, Vec<String>>` for headers
- Use `serde` with `#[serde(rename = "fieldName")]` for JSON field mapping
- Handle optional fields with `Option<T>`
## Validation Rules
### Required Fields
All documents must include:
- `_id`: Valid CouchDB document ID
- `docType`: Identifies document type for filtering
- `mailbox`: Source mailbox name (for mail documents)
### Data Constraints
- Email addresses: No validation enforced (preserve as-is from IMAP)
- Dates: Must be valid ISO8601 format
- UIDs: Must be positive integers
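- Document IDs: Must be valid CouchDB IDs. Avoid spaces and special characters; IDs derived from nested mailboxes (e.g., `Work/Projects_789`) contain `/` and must be URL-encoded when used in request paths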
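A minimal validation sketch for the rules above might look like the following. The `mailFields` type and `validateMail` function are hypothetical, and using RFC 3339 as the accepted ISO8601 profile is an assumption rather than something the schema states.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"time"
)

// mailFields holds only the fields checked here; the type is hypothetical.
type mailFields struct {
	ID, DocType, Mailbox, Date, SourceUID string
}

// validateMail applies the required-field and data-constraint rules
// to a mail document.
func validateMail(f mailFields) error {
	if f.ID == "" {
		return errors.New("_id is required")
	}
	if f.DocType != "mail" {
		return fmt.Errorf("docType must be %q, got %q", "mail", f.DocType)
	}
	if f.Mailbox == "" {
		return errors.New("mailbox is required")
	}
	// Assumption: RFC 3339 is the accepted ISO8601 profile.
	if _, err := time.Parse(time.RFC3339, f.Date); err != nil {
		return fmt.Errorf("date is not a valid timestamp: %w", err)
	}
	// UIDs must be positive integers; IMAP UIDs fit in 32 bits.
	uid, err := strconv.ParseUint(f.SourceUID, 10, 32)
	if err != nil || uid == 0 {
		return errors.New("sourceUid must be a positive integer")
	}
	return nil
}

func main() {
	doc := mailFields{
		ID: "INBOX_123", DocType: "mail", Mailbox: "INBOX",
		Date: "2025-08-02T12:16:10Z", SourceUID: "123",
	}
	fmt.Println(validateMail(doc))
}
```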
### Attachment Handling
- Store email attachments as CouchDB native attachments
- Preserve original filenames and MIME types
- Use attachment stubs in document metadata
- Support binary content through CouchDB attachment API
## Backward Compatibility
When modifying schemas:
1. Add new fields as optional
2. Never remove existing fields
3. Maintain existing field types and formats
4. Document any breaking changes clearly
5. Provide migration guidance for existing data
## Implementation Notes
### CouchDB Features Used
- **Native Attachments**: For email attachments
- **Document IDs**: Predictable format for easy access
- **Bulk Operations**: For efficient storage
- **Conflict Resolution**: CouchDB handles revision conflicts
### Performance Considerations
- Index by `docType` for efficient filtering
- Index by `mailbox` for folder-based queries
- Index by `date` for chronological access
- Use bulk insert operations for multiple messages
### Future Extensions
This schema supports future enhancements:
- **Webmail Views**: CouchDB design documents for HTML interface
- **Search Indexes**: Full-text search with CouchDB-Lucene
- **Replication**: Multi-database sync scenarios
- **Analytics**: Message statistics and reporting


@ -1,42 +0,0 @@
{
  "_id": "INBOX_123",
  "_rev": "1-abc123def456789",
  "_attachments": {
    "report.pdf": {
      "content_type": "application/pdf",
      "length": 245760,
      "stub": true
    },
    "image.png": {
      "content_type": "image/png",
      "length": 12345,
      "stub": true
    }
  },
  "sourceUid": "123",
  "mailbox": "INBOX",
  "from": ["sender@example.com", "alias@example.com"],
  "to": ["recipient@company.com", "cc@company.com"],
  "subject": "Monthly Report - Q3 2025",
  "date": "2025-08-02T12:16:10Z",
  "body": "Please find the attached monthly report for Q3 2025.\n\nBest regards,\nSender Name",
  "headers": {
    "Content-Type": ["multipart/mixed; boundary=\"----=_Part_123456\""],
    "Content-Transfer-Encoding": ["7bit"],
    "Date": ["Sat, 02 Aug 2025 14:16:10 +0200"],
    "From": ["sender@example.com"],
    "To": ["recipient@company.com"],
    "Cc": ["cc@company.com"],
    "Subject": ["Monthly Report - Q3 2025"],
    "Message-ID": ["<msg123.456@example.com>"],
    "MIME-Version": ["1.0"],
    "X-Mailer": ["Mail Client 1.0"],
    "Return-Path": ["<sender@example.com>"],
    "Received": [
      "from smtp.example.com (smtp.example.com [192.168.1.100]) by mx.company.com (Postfix) with ESMTP id ABC123; Sat, 02 Aug 2025 14:16:10 +0200"
    ]
  },
  "storedAt": "2025-08-02T14:16:22.375241322+02:00",
  "docType": "mail",
  "hasAttachments": true
}


@ -1,10 +0,0 @@
{
  "_id": "sync_metadata_INBOX",
  "_rev": "2-def456abc789123",
  "docType": "sync_metadata",
  "mailbox": "INBOX",
  "lastSyncTime": "2025-08-02T14:26:08.281094+02:00",
  "lastMessageUID": 123,
  "messageCount": 45,
  "updatedAt": "2025-08-02T14:26:08.281094+02:00"
}


@ -1,24 +0,0 @@
{
  "_id": "Sent_456",
  "_rev": "1-xyz789abc123def",
  "sourceUid": "456",
  "mailbox": "Sent",
  "from": ["user@company.com"],
  "to": ["client@external.com"],
  "subject": "Meeting Follow-up",
  "date": "2025-08-02T10:30:00Z",
  "body": "Thank you for the productive meeting today. As discussed, I'll send the proposal by end of week.\n\nBest regards,\nUser Name",
  "headers": {
    "Content-Type": ["text/plain; charset=utf-8"],
    "Content-Transfer-Encoding": ["7bit"],
    "Date": ["Sat, 02 Aug 2025 12:30:00 +0200"],
    "From": ["user@company.com"],
    "To": ["client@external.com"],
    "Subject": ["Meeting Follow-up"],
    "Message-ID": ["<sent456.789@company.com>"],
    "MIME-Version": ["1.0"]
  },
  "storedAt": "2025-08-02T12:30:45.123456789+02:00",
  "docType": "mail",
  "hasAttachments": false
}


@ -1,154 +0,0 @@
# Test Configuration Comparison: Rust vs Go
## Overview
Two equivalent test configurations, identical except for their account names, have been created for testing the Rust and Go implementations against the test environment:
- **Rust**: `/home/olemd/src/mail2couch/rust/config-test-rust.json`
- **Go**: `/home/olemd/src/mail2couch/go/config-test-go.json`
## Configuration Details
Both configurations use the **same test environment** from `/home/olemd/src/mail2couch/test/` with:
### Database Connection
- **CouchDB URL**: `http://localhost:5984`
- **Admin Credentials**: `admin` / `password`
### IMAP Test Server
- **Host**: `localhost`
- **Port**: `3143` (GreenMail test server)
- **Connection**: Plain (no TLS for testing)
### Test Accounts
Both configurations use the **same IMAP test accounts**:
| Username | Password | Purpose |
|----------|----------|---------|
| `testuser1` | `password123` | Wildcard all folders test |
| `syncuser` | `syncpass` | Work pattern test (sync mode) |
| `archiveuser` | `archivepass` | Specific folders test |
| `testuser2` | `password456` | Subfolder pattern test (disabled) |
### Mail Sources Configuration
Both configurations define **identical mail sources** with only the account names differing:
#### 1. Wildcard All Folders Test
- **Account Name**: "**Rust** Wildcard All Folders Test" vs "**Go** Wildcard All Folders Test"
- **Mode**: `archive`
- **Folders**: All folders (`*`) except `Drafts` and `Trash`
- **Filters**: Subject keywords: `["meeting", "important"]`, Sender keywords: `["@company.com"]`
#### 2. Work Pattern Test
- **Account Name**: "**Rust** Work Pattern Test" vs "**Go** Work Pattern Test"
- **Mode**: `sync` (deletes stored emails that have been removed from the mail server)
- **Folders**: `Work*`, `Important*`, `INBOX` (exclude `*Temp*`)
- **Filters**: Recipient keywords: `["support@", "team@"]`
#### 3. Specific Folders Only
- **Account Name**: "**Rust** Specific Folders Only" vs "**Go** Specific Folders Only"
- **Mode**: `archive`
- **Folders**: Exactly `INBOX`, `Sent`, `Personal`
- **Filters**: None
#### 4. Subfolder Pattern Test (Disabled)
- **Account Name**: "**Rust** Subfolder Pattern Test" vs "**Go** Subfolder Pattern Test"
- **Mode**: `archive`
- **Folders**: `Work/*`, `Archive/*` (exclude `*/Drafts`)
- **Status**: `enabled: false`
## Expected Database Names
When run, each implementation will create **different databases** due to the account name differences:
### Rust Implementation Databases
- `m2c_rust_wildcard_all_folders_test`
- `m2c_rust_work_pattern_test`
- `m2c_rust_specific_folders_only`
- `m2c_rust_subfolder_pattern_test` (disabled)
### Go Implementation Databases
- `m2c_go_wildcard_all_folders_test`
- `m2c_go_work_pattern_test`
- `m2c_go_specific_folders_only`
- `m2c_go_subfolder_pattern_test` (disabled)
## Testing Commands
### Start Test Environment
```bash
cd /home/olemd/src/mail2couch/test
./start-test-env.sh
```
### Run Rust Implementation
```bash
cd /home/olemd/src/mail2couch/rust
cargo build --release
./target/release/mail2couch -c config-test-rust.json
```
### Run Go Implementation
```bash
cd /home/olemd/src/mail2couch/go
go build -o mail2couch .
./mail2couch -c config-test-go.json
```
### Verify Results
```bash
# List all databases (authenticating with the test admin credentials listed above)
curl -u admin:password http://localhost:5984/_all_dbs
# Check Rust databases
curl -u admin:password http://localhost:5984/m2c_rust_wildcard_all_folders_test
curl -u admin:password http://localhost:5984/m2c_rust_work_pattern_test
curl -u admin:password http://localhost:5984/m2c_rust_specific_folders_only
# Check Go databases
curl -u admin:password http://localhost:5984/m2c_go_wildcard_all_folders_test
curl -u admin:password http://localhost:5984/m2c_go_work_pattern_test
curl -u admin:password http://localhost:5984/m2c_go_specific_folders_only
```
### Stop Test Environment
```bash
cd /home/olemd/src/mail2couch/test
./stop-test-env.sh
```
## Validation Points
Both implementations should produce **identical results** when processing the same IMAP accounts:
1. **Database Structure**: Same document schemas and field names
2. **Message Processing**: Same email parsing and storage logic
3. **Folder Filtering**: Same wildcard pattern matching
4. **Message Filtering**: Same keyword filtering behavior
5. **Sync Behavior**: Same incremental sync and deletion handling
6. **Error Handling**: Same retry logic and error recovery
The only differences should be:
- Database names (due to account name prefixes)
- Timestamp precision (implementation-specific)
- Internal document ID format (if any)
## Use Cases
### Feature Parity Testing
Run both implementations with the same configuration to verify identical behavior:
```bash
# Run both implementations
./test-both-implementations.sh
# Compare database contents
./compare-database-results.sh
```
### Performance Comparison
Use identical configurations to benchmark performance differences between Rust and Go implementations.
### Development Testing
Use separate configurations during development to avoid database conflicts when testing both implementations simultaneously.