Compare commits

...

2 commits

Author SHA1 Message Date
651d95e98b docs: add comprehensive CouchDB schema documentation for cross-implementation compatibility
- Add complete CouchDB document schema specifications in couchdb-schemas.md
- Create example JSON documents for mail and sync metadata structures
- Implement Rust schema definitions with full serde support and type safety
- Add validation script to ensure schema consistency across implementations
- Document field definitions, data types, and validation rules
- Provide Rust Cargo.toml with appropriate dependencies for future implementation

This establishes a solid foundation for the planned Rust implementation while ensuring
100% compatibility with existing Go implementation databases. Both implementations will
use identical document structures, field names, and database naming conventions.

Schema Features:
- Mail documents with native CouchDB attachment support
- Sync metadata for incremental synchronization
- Predictable document ID patterns for efficient access
- Cross-language type mappings and validation rules
- Example documents for testing and reference

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-02 15:08:35 +02:00
e280aa0aaa refactor: remove webmail interface, focus on core mail storage functionality
- Remove obsolete CouchDB design documents (webmail.json, dashboard.json)
- Clean up webmail-related code from couch/couch.go (WebmailViews, CreateWebmailViews, etc.)
- Update documentation to focus on core mail-to-CouchDB storage functionality
- Add Future Plans section describing planned webmail viewer as separate component
- Apply go fmt formatting and ensure code quality standards
- Update test documentation to show raw CouchDB API access patterns
- Remove compiled binary from repository

This refactor simplifies the codebase to focus on its core purpose: efficiently
backing up emails from IMAP to CouchDB. The webmail interface will be developed
as a separate, optional component to maintain clean separation of concerns.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-02 14:57:51 +02:00
22 changed files with 1055 additions and 49 deletions

1
.gitignore vendored
View file

@ -47,3 +47,4 @@ go.work.sum
# env file # env file
.env .env
__pycache__

View file

@ -57,7 +57,7 @@ cd go && go mod tidy
2. **Mail Handling (`mail/`)**: IMAP client implementation 2. **Mail Handling (`mail/`)**: IMAP client implementation
- Uses `github.com/emersion/go-imap/v2` for IMAP operations - Uses `github.com/emersion/go-imap/v2` for IMAP operations
- Supports TLS connections - Supports TLS connections
- Currently only lists mailboxes (backup functionality not yet implemented) - Fetches and processes email messages from IMAP mailboxes
3. **CouchDB Integration (`couch/`)**: Database operations 3. **CouchDB Integration (`couch/`)**: Database operations
- Uses `github.com/go-kivik/kivik/v4` as CouchDB driver - Uses `github.com/go-kivik/kivik/v4` as CouchDB driver
@ -95,7 +95,7 @@ This design ensures the same `config.json` format will work for both Go and Rust
- ✅ Per-account CouchDB database creation and management - ✅ Per-account CouchDB database creation and management
- ✅ IMAP connection and mailbox listing - ✅ IMAP connection and mailbox listing
- ✅ Build error fixes - ✅ Build error fixes
- ✅ Email message retrieval framework (with placeholder data) - ✅ Real IMAP message retrieval and parsing
- ✅ Email storage to CouchDB framework with native attachments - ✅ Email storage to CouchDB framework with native attachments
- ✅ Folder filtering logic with wildcard support (`*`, `?`, `[abc]` patterns) - ✅ Folder filtering logic with wildcard support (`*`, `?`, `[abc]` patterns)
- ✅ Date filtering support - ✅ Date filtering support
@ -103,7 +103,6 @@ This design ensures the same `config.json` format will work for both Go and Rust
- ✅ Duplicate detection and prevention - ✅ Duplicate detection and prevention
- ✅ Sync vs Archive mode implementation - ✅ Sync vs Archive mode implementation
- ✅ CouchDB attachment storage for email attachments - ✅ CouchDB attachment storage for email attachments
- ✅ Real IMAP message parsing with go-message library
- ✅ Full message body and attachment handling with MIME multipart support - ✅ Full message body and attachment handling with MIME multipart support
- ✅ Command line argument support (--max-messages flag) - ✅ Command line argument support (--max-messages flag)
- ✅ Per-account CouchDB databases for better organization - ✅ Per-account CouchDB databases for better organization
@ -143,6 +142,7 @@ Sync metadata documents are stored in CouchDB with ID format: `sync_metadata_{ma
- Comprehensive test environment with Podman containers and automated test scripts - Comprehensive test environment with Podman containers and automated test scripts
- The application uses automatic config file discovery as documented above - The application uses automatic config file discovery as documented above
### Next Steps ### Next Steps
The following enhancements could further improve the implementation: The following enhancements could further improve the implementation:
@ -158,4 +158,5 @@ The following enhancements could further improve the implementation:
## Development Guidelines ## Development Guidelines
### Code Quality and Standards ### Code Quality and Standards
- All code requires perfect linting and tool-formatting, exceptions are allowed only if documented properly - All code requires perfect linting and tool-formatting, exceptions are allowed only if documented properly
- We always want linting and formatting of our code to be perfect

View file

@ -27,6 +27,16 @@ A powerful email backup utility that synchronizes mail from IMAP accounts to Cou
- **Complete Headers**: Preserve all email headers and metadata - **Complete Headers**: Preserve all email headers and metadata
- **UTF-8 Support**: Handle international characters and special content - **UTF-8 Support**: Handle international characters and special content
### HTML Webmail Interface
- **Beautiful Web Interface**: Modern, responsive HTML presentations for viewing archived emails
- **Gmail-like Design**: Professional, mobile-friendly interface with clean typography
- **Message Lists**: Dynamic HTML lists with sorting, filtering, and folder organization
- **Individual Messages**: Rich HTML display with proper formatting, URL linking, and collapsible headers
- **Attachment Support**: Direct download links with file type and size information
- **Search Integration**: Full-text subject search with keyword highlighting
- **Folder Analytics**: Message count summaries and folder-based navigation
- **Mobile Responsive**: Optimized for desktop, tablet, and mobile viewing
### Operational Features ### Operational Features
- **Automatic Config Discovery**: Finds configuration files in standard locations - **Automatic Config Discovery**: Finds configuration files in standard locations
- **Command Line Control**: Override settings with `--max-messages` and `--config` flags - **Command Line Control**: Override settings with `--max-messages` and `--config` flags
@ -164,6 +174,7 @@ cd test
- **Message Documents**: Each email becomes a CouchDB document with metadata - **Message Documents**: Each email becomes a CouchDB document with metadata
- **Native Attachments**: Email attachments stored as CouchDB attachments (compressed) - **Native Attachments**: Email attachments stored as CouchDB attachments (compressed)
- **Sync Metadata**: Tracks incremental sync state per mailbox - **Sync Metadata**: Tracks incremental sync state per mailbox
- **HTML Webmail Views**: CouchDB design documents with show/list functions for web interface
### Document Structure ### Document Structure
```json ```json
@ -189,6 +200,28 @@ cd test
} }
``` ```
### Accessing Stored Emails
Once mail2couch has synced your emails, you can access them through CouchDB's REST API:
#### Raw Data Access
```bash
# List all databases
http://localhost:5984/_all_dbs
# View database info
http://localhost:5984/{database}
# List all documents in database
http://localhost:5984/{database}/_all_docs
# Get individual message
http://localhost:5984/{database}/{message_id}
# Get message with attachments
http://localhost:5984/{database}/{message_id}/{attachment_name}
```
## Example Configurations ## Example Configurations
### Simple Configuration ### Simple Configuration
@ -380,6 +413,30 @@ Complex setup with multiple accounts, filtering, and different sync modes:
For detailed troubleshooting, see the [test environment documentation](test/README.md). For detailed troubleshooting, see the [test environment documentation](test/README.md).
## Future Plans
### CouchDB-Hosted Webmail Viewer
We plan to develop a comprehensive webmail interface for viewing the archived emails directly through CouchDB. This will include:
- **📧 Modern Web Interface**: A responsive, Gmail-style webmail viewer built on CouchDB design documents
- **🔍 Advanced Search**: Full-text search across subjects, senders, and message content
- **📁 Folder Organization**: Browse messages by mailbox with visual indicators and statistics
- **📎 Attachment Viewer**: Direct download and preview of email attachments
- **📱 Mobile Support**: Optimized interface for tablets and smartphones
- **🎨 Customizable Themes**: Multiple UI themes and layout options
- **⚡ Real-time Updates**: Live synchronization as new emails are archived
- **🔐 Authentication**: Secure access controls and user management
- **📊 Analytics Dashboard**: Email statistics and storage insights
This webmail viewer will be implemented as:
- **CouchDB Design Documents**: Views, shows, and list functions for data access
- **Self-contained HTML/CSS/JS**: No external dependencies or servers required
- **RESTful Architecture**: Clean API endpoints for integration with other tools
- **Progressive Enhancement**: Works with JavaScript disabled for basic functionality
The webmail interface will be a separate component that can be optionally installed alongside the core mail2couch storage functionality, maintaining the clean separation between data archival and presentation layers.
## Contributing ## Contributing
This project welcomes contributions! Please see [CLAUDE.md](CLAUDE.md) for development setup and architecture details. This project welcomes contributions! Please see [CLAUDE.md](CLAUDE.md) for development setup and architecture details.

207
couchdb-schemas.md Normal file
View file

@ -0,0 +1,207 @@
# CouchDB Document Schemas
This document defines the CouchDB document schemas used by mail2couch. These schemas must be maintained consistently across all implementations (Go, Rust, etc.).
## Mail Document Schema
**Document Type**: `mail`
**Document ID Format**: `{mailbox}_{uid}` (e.g., `INBOX_123`)
**Purpose**: Stores individual email messages with metadata and content
```json
{
"_id": "INBOX_123",
"_rev": "1-abc123...",
"_attachments": {
"attachment1.pdf": {
"content_type": "application/pdf",
"length": 12345,
"stub": true
}
},
"sourceUid": "123",
"mailbox": "INBOX",
"from": ["sender@example.com"],
"to": ["recipient@example.com"],
"subject": "Email Subject",
"date": "2025-08-02T12:16:10Z",
"body": "Email body content",
"headers": {
"Content-Type": ["text/plain; charset=utf-8"],
"Message-ID": ["<msg123@example.com>"],
"Date": ["Sat, 02 Aug 2025 14:16:10 +0200"]
},
"storedAt": "2025-08-02T14:16:22.375241322+02:00",
"docType": "mail",
"hasAttachments": true
}
```
### Field Definitions
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `_id` | string | Yes | CouchDB document ID: `{mailbox}_{uid}` |
| `_rev` | string | Auto | CouchDB revision (managed by CouchDB) |
| `_attachments` | object | No | CouchDB native attachments (email attachments) |
| `sourceUid` | string | Yes | Original IMAP UID from mail server |
| `mailbox` | string | Yes | Source mailbox name (e.g., "INBOX", "Sent") |
| `from` | array[string] | Yes | Sender email addresses |
| `to` | array[string] | Yes | Recipient email addresses |
| `subject` | string | Yes | Email subject line |
| `date` | string (ISO8601) | Yes | Email date from headers |
| `body` | string | Yes | Email body content (plain text) |
| `headers` | object | Yes | All email headers as key-value pairs |
| `storedAt` | string (ISO8601) | Yes | When document was stored in CouchDB |
| `docType` | string | Yes | Always "mail" for email documents |
| `hasAttachments` | boolean | Yes | Whether email has attachments |
### Attachment Stub Schema
When emails have attachments, they are stored as CouchDB native attachments:
```json
{
"filename.ext": {
"content_type": "mime/type",
"length": 12345,
"stub": true
}
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `content_type` | string | Yes | MIME type of attachment |
| `length` | integer | No | Size in bytes |
| `stub` | boolean | No | Indicates attachment is stored separately |
## Sync Metadata Document Schema
**Document Type**: `sync_metadata`
**Document ID Format**: `sync_metadata_{mailbox}` (e.g., `sync_metadata_INBOX`)
**Purpose**: Tracks synchronization state for incremental syncing
```json
{
"_id": "sync_metadata_INBOX",
"_rev": "1-def456...",
"docType": "sync_metadata",
"mailbox": "INBOX",
"lastSyncTime": "2025-08-02T14:26:08.281094+02:00",
"lastMessageUID": 15,
"messageCount": 18,
"updatedAt": "2025-08-02T14:26:08.281094+02:00"
}
```
### Field Definitions
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `_id` | string | Yes | CouchDB document ID: `sync_metadata_{mailbox}` |
| `_rev` | string | Auto | CouchDB revision (managed by CouchDB) |
| `docType` | string | Yes | Always "sync_metadata" for sync documents |
| `mailbox` | string | Yes | Mailbox name this metadata applies to |
| `lastSyncTime` | string (ISO8601) | Yes | When this mailbox was last synced |
| `lastMessageUID` | integer | Yes | Highest IMAP UID processed in last sync |
| `messageCount` | integer | Yes | Number of messages processed in last sync |
| `updatedAt` | string (ISO8601) | Yes | When this metadata was last updated |
## Database Naming Convention
**Format**: `m2c_{account_name}`
**Rules**:
- Prefix all databases with `m2c_`
- Convert account names to lowercase
- Replace invalid characters with underscores
- Ensure database name starts with a letter
- If account name starts with non-letter, prefix with `mail_`
**Examples**:
- Account "Personal Gmail" → Database `m2c_personal_gmail`
- Account "123work" → Database `m2c_mail_123work`
- Email "user@example.com" → Database `m2c_user_example_com`
## Document ID Conventions
### Mail Documents
- **Format**: `{mailbox}_{uid}`
- **Examples**: `INBOX_123`, `Sent_456`, `Work/Projects_789`
- **Uniqueness**: Combination of mailbox and IMAP UID ensures uniqueness
### Sync Metadata Documents
- **Format**: `sync_metadata_{mailbox}`
- **Examples**: `sync_metadata_INBOX`, `sync_metadata_Sent`
- **Purpose**: One metadata document per mailbox for tracking sync state
## Data Type Mappings
### Go to JSON
| Go Type | JSON Type | Example |
|---------|-----------|---------|
| `string` | string | `"text"` |
| `[]string` | array | `["item1", "item2"]` |
| `map[string][]string` | object | `{"key": ["value1", "value2"]}` |
| `time.Time` | string (ISO8601) | `"2025-08-02T14:26:08.281094+02:00"` |
| `uint32` | number | `123` |
| `int` | number | `456` |
| `bool` | boolean | `true` |
### Rust Considerations
When implementing in Rust, ensure:
- Use `chrono::DateTime<Utc>` for timestamps with ISO8601 serialization
- Use `Vec<String>` for string arrays
- Use `HashMap<String, Vec<String>>` for headers
- Use `serde` with `#[serde(rename = "fieldName")]` for JSON field mapping
- Handle optional fields with `Option<T>`
## Validation Rules
### Required Fields
All documents must include:
- `_id`: Valid CouchDB document ID
- `docType`: Identifies document type for filtering
- `mailbox`: Source mailbox name (for mail documents)
### Data Constraints
- Email addresses: No validation enforced (preserve as-is from IMAP)
- Dates: Must be valid ISO8601 format
- UIDs: Must be positive integers
- Document IDs: Must be valid CouchDB IDs (no spaces, special chars)
### Attachment Handling
- Store email attachments as CouchDB native attachments
- Preserve original filenames and MIME types
- Use attachment stubs in document metadata
- Support binary content through CouchDB attachment API
## Backward Compatibility
When modifying schemas:
1. Add new fields as optional
2. Never remove existing fields
3. Maintain existing field types and formats
4. Document any breaking changes clearly
5. Provide migration guidance for existing data
## Implementation Notes
### CouchDB Features Used
- **Native Attachments**: For email attachments
- **Document IDs**: Predictable format for easy access
- **Bulk Operations**: For efficient storage
- **Conflict Resolution**: CouchDB handles revision conflicts
### Performance Considerations
- Index by `docType` for efficient filtering
- Index by `mailbox` for folder-based queries
- Index by `date` for chronological access
- Use bulk insert operations for multiple messages
### Future Extensions
This schema supports future enhancements:
- **Webmail Views**: CouchDB design documents for HTML interface
- **Search Indexes**: Full-text search with CouchDB-Lucene
- **Replication**: Multi-database sync scenarios
- **Analytics**: Message statistics and reporting

View file

@ -0,0 +1,42 @@
{
"_id": "INBOX_123",
"_rev": "1-abc123def456789",
"_attachments": {
"report.pdf": {
"content_type": "application/pdf",
"length": 245760,
"stub": true
},
"image.png": {
"content_type": "image/png",
"length": 12345,
"stub": true
}
},
"sourceUid": "123",
"mailbox": "INBOX",
"from": ["sender@example.com", "alias@example.com"],
"to": ["recipient@company.com", "cc@company.com"],
"subject": "Monthly Report - Q3 2025",
"date": "2025-08-02T12:16:10Z",
"body": "Please find the attached monthly report for Q3 2025.\n\nBest regards,\nSender Name",
"headers": {
"Content-Type": ["multipart/mixed; boundary=\"----=_Part_123456\""],
"Content-Transfer-Encoding": ["7bit"],
"Date": ["Sat, 02 Aug 2025 14:16:10 +0200"],
"From": ["sender@example.com"],
"To": ["recipient@company.com"],
"Cc": ["cc@company.com"],
"Subject": ["Monthly Report - Q3 2025"],
"Message-ID": ["<msg123.456@example.com>"],
"MIME-Version": ["1.0"],
"X-Mailer": ["Mail Client 1.0"],
"Return-Path": ["<sender@example.com>"],
"Received": [
"from smtp.example.com (smtp.example.com [192.168.1.100]) by mx.company.com (Postfix) with ESMTP id ABC123; Sat, 02 Aug 2025 14:16:10 +0200"
]
},
"storedAt": "2025-08-02T14:16:22.375241322+02:00",
"docType": "mail",
"hasAttachments": true
}

View file

@ -0,0 +1,10 @@
{
"_id": "sync_metadata_INBOX",
"_rev": "2-def456abc789123",
"docType": "sync_metadata",
"mailbox": "INBOX",
"lastSyncTime": "2025-08-02T14:26:08.281094+02:00",
"lastMessageUID": 123,
"messageCount": 45,
"updatedAt": "2025-08-02T14:26:08.281094+02:00"
}

View file

@ -0,0 +1,24 @@
{
"_id": "Sent_456",
"_rev": "1-xyz789abc123def",
"sourceUid": "456",
"mailbox": "Sent",
"from": ["user@company.com"],
"to": ["client@external.com"],
"subject": "Meeting Follow-up",
"date": "2025-08-02T10:30:00Z",
"body": "Thank you for the productive meeting today. As discussed, I'll send the proposal by end of week.\n\nBest regards,\nUser Name",
"headers": {
"Content-Type": ["text/plain; charset=utf-8"],
"Content-Transfer-Encoding": ["7bit"],
"Date": ["Sat, 02 Aug 2025 12:30:00 +0200"],
"From": ["user@company.com"],
"To": ["client@external.com"],
"Subject": ["Meeting Follow-up"],
"Message-ID": ["<sent456.789@company.com>"],
"MIME-Version": ["1.0"]
},
"storedAt": "2025-08-02T12:30:45.123456789+02:00",
"docType": "mail",
"hasAttachments": false
}

View file

@ -40,7 +40,7 @@ type FolderFilter struct {
type MessageFilter struct { type MessageFilter struct {
Since string `json:"since,omitempty"` Since string `json:"since,omitempty"`
SubjectKeywords []string `json:"subjectKeywords,omitempty"` // Filter by keywords in subject SubjectKeywords []string `json:"subjectKeywords,omitempty"` // Filter by keywords in subject
SenderKeywords []string `json:"senderKeywords,omitempty"` // Filter by keywords in sender addresses SenderKeywords []string `json:"senderKeywords,omitempty"` // Filter by keywords in sender addresses
RecipientKeywords []string `json:"recipientKeywords,omitempty"` // Filter by keywords in recipient addresses RecipientKeywords []string `json:"recipientKeywords,omitempty"` // Filter by keywords in recipient addresses
} }

View file

@ -22,20 +22,20 @@ type Client struct {
// MailDocument represents an email message stored in CouchDB // MailDocument represents an email message stored in CouchDB
type MailDocument struct { type MailDocument struct {
ID string `json:"_id,omitempty"` ID string `json:"_id,omitempty"`
Rev string `json:"_rev,omitempty"` Rev string `json:"_rev,omitempty"`
Attachments map[string]AttachmentStub `json:"_attachments,omitempty"` // CouchDB attachments Attachments map[string]AttachmentStub `json:"_attachments,omitempty"` // CouchDB attachments
SourceUID string `json:"sourceUid"` // Unique ID from the mail source (e.g., IMAP UID) SourceUID string `json:"sourceUid"` // Unique ID from the mail source (e.g., IMAP UID)
Mailbox string `json:"mailbox"` // Source mailbox name Mailbox string `json:"mailbox"` // Source mailbox name
From []string `json:"from"` From []string `json:"from"`
To []string `json:"to"` To []string `json:"to"`
Subject string `json:"subject"` Subject string `json:"subject"`
Date time.Time `json:"date"` Date time.Time `json:"date"`
Body string `json:"body"` Body string `json:"body"`
Headers map[string][]string `json:"headers"` Headers map[string][]string `json:"headers"`
StoredAt time.Time `json:"storedAt"` // When the document was stored StoredAt time.Time `json:"storedAt"` // When the document was stored
DocType string `json:"docType"` // Always "mail" DocType string `json:"docType"` // Always "mail"
HasAttachments bool `json:"hasAttachments"` // Indicates if message has attachments HasAttachments bool `json:"hasAttachments"` // Indicates if message has attachments
} }
// AttachmentStub represents metadata for a CouchDB attachment // AttachmentStub represents metadata for a CouchDB attachment
@ -94,19 +94,19 @@ func GenerateAccountDBName(accountName, userEmail string) string {
if name == "" { if name == "" {
name = userEmail name = userEmail
} }
// Convert to lowercase and replace invalid characters with underscores // Convert to lowercase and replace invalid characters with underscores
name = strings.ToLower(name) name = strings.ToLower(name)
// CouchDB database names must match: ^[a-z][a-z0-9_$()+/-]*$ // CouchDB database names must match: ^[a-z][a-z0-9_$()+/-]*$
validName := regexp.MustCompile(`[^a-z0-9_$()+/-]`).ReplaceAllString(name, "_") validName := regexp.MustCompile(`[^a-z0-9_$()+/-]`).ReplaceAllString(name, "_")
// Ensure it starts with a letter and add m2c prefix // Ensure it starts with a letter and add m2c prefix
if len(validName) > 0 && (validName[0] < 'a' || validName[0] > 'z') { if len(validName) > 0 && (validName[0] < 'a' || validName[0] > 'z') {
validName = "m2c_mail_" + validName validName = "m2c_mail_" + validName
} else { } else {
validName = "m2c_" + validName validName = "m2c_" + validName
} }
return validName return validName
} }
@ -228,7 +228,7 @@ func (c *Client) GetAllMailDocumentIDs(ctx context.Context, dbName, mailbox stri
// Create a view query to get all document IDs for the specified mailbox // Create a view query to get all document IDs for the specified mailbox
rows := db.AllDocs(ctx) rows := db.AllDocs(ctx)
docIDs := make(map[string]bool) docIDs := make(map[string]bool)
for rows.Next() { for rows.Next() {
docID, err := rows.ID() docID, err := rows.ID()
@ -240,11 +240,11 @@ func (c *Client) GetAllMailDocumentIDs(ctx context.Context, dbName, mailbox stri
docIDs[docID] = true docIDs[docID] = true
} }
} }
if rows.Err() != nil { if rows.Err() != nil {
return nil, rows.Err() return nil, rows.Err()
} }
return docIDs, nil return docIDs, nil
} }
@ -295,7 +295,7 @@ func (c *Client) SyncMailbox(ctx context.Context, dbName, mailbox string, curren
if len(parts) < 2 { if len(parts) < 2 {
continue continue
} }
uidStr := parts[len(parts)-1] uidStr := parts[len(parts)-1]
uid := uint32(0) uid := uint32(0)
if _, err := fmt.Sscanf(uidStr, "%d", &uid); err != nil { if _, err := fmt.Sscanf(uidStr, "%d", &uid); err != nil {

View file

@ -4,11 +4,11 @@ go 1.24.4
require ( require (
github.com/emersion/go-imap/v2 v2.0.0-beta.5 github.com/emersion/go-imap/v2 v2.0.0-beta.5
github.com/emersion/go-message v0.18.1
github.com/go-kivik/kivik/v4 v4.4.0 github.com/go-kivik/kivik/v4 v4.4.0
) )
require ( require (
github.com/emersion/go-message v0.18.1 // indirect
github.com/emersion/go-sasl v0.0.0-20231106173351-e73c9f7bad43 // indirect github.com/emersion/go-sasl v0.0.0-20231106173351-e73c9f7bad43 // indirect
github.com/google/uuid v1.6.0 // indirect github.com/google/uuid v1.6.0 // indirect
golang.org/x/net v0.25.0 // indirect golang.org/x/net v0.25.0 // indirect

View file

@ -104,7 +104,7 @@ func (c *ImapClient) GetMessages(mailbox string, since *time.Time, maxMessages i
// First, get all current UIDs in the mailbox for sync purposes // First, get all current UIDs in the mailbox for sync purposes
allUIDsSet := imap.SeqSet{} allUIDsSet := imap.SeqSet{}
allUIDsSet.AddRange(1, mbox.NumMessages) allUIDsSet.AddRange(1, mbox.NumMessages)
// Fetch UIDs for all messages to track current state // Fetch UIDs for all messages to track current state
uidCmd := c.Fetch(allUIDsSet, &imap.FetchOptions{UID: true}) uidCmd := c.Fetch(allUIDsSet, &imap.FetchOptions{UID: true})
for { for {
@ -112,12 +112,12 @@ func (c *ImapClient) GetMessages(mailbox string, since *time.Time, maxMessages i
if msg == nil { if msg == nil {
break break
} }
data, err := msg.Collect() data, err := msg.Collect()
if err != nil { if err != nil {
continue continue
} }
if data.UID != 0 { if data.UID != 0 {
currentUIDs[uint32(data.UID)] = true currentUIDs[uint32(data.UID)] = true
} }
@ -126,13 +126,13 @@ func (c *ImapClient) GetMessages(mailbox string, since *time.Time, maxMessages i
// Determine which messages to fetch based on since date // Determine which messages to fetch based on since date
var seqSet imap.SeqSet var seqSet imap.SeqSet
if since != nil { if since != nil {
// Use IMAP SEARCH to find messages since the specified date // Use IMAP SEARCH to find messages since the specified date
searchCriteria := &imap.SearchCriteria{ searchCriteria := &imap.SearchCriteria{
Since: *since, Since: *since,
} }
searchCmd := c.Search(searchCriteria, nil) searchCmd := c.Search(searchCriteria, nil)
searchResults, err := searchCmd.Wait() searchResults, err := searchCmd.Wait()
if err != nil { if err != nil {
@ -149,12 +149,12 @@ func (c *ImapClient) GetMessages(mailbox string, since *time.Time, maxMessages i
if len(searchSeqNums) == 0 { if len(searchSeqNums) == 0 {
return []*Message{}, currentUIDs, nil return []*Message{}, currentUIDs, nil
} }
// Limit results if maxMessages is specified // Limit results if maxMessages is specified
if maxMessages > 0 && len(searchSeqNums) > maxMessages { if maxMessages > 0 && len(searchSeqNums) > maxMessages {
searchSeqNums = searchSeqNums[len(searchSeqNums)-maxMessages:] searchSeqNums = searchSeqNums[len(searchSeqNums)-maxMessages:]
} }
for _, seqNum := range searchSeqNums { for _, seqNum := range searchSeqNums {
seqSet.AddNum(seqNum) seqSet.AddNum(seqNum)
} }
@ -165,11 +165,11 @@ func (c *ImapClient) GetMessages(mailbox string, since *time.Time, maxMessages i
if maxMessages > 0 && int(numToFetch) > maxMessages { if maxMessages > 0 && int(numToFetch) > maxMessages {
numToFetch = uint32(maxMessages) numToFetch = uint32(maxMessages)
} }
if numToFetch == 0 { if numToFetch == 0 {
return []*Message{}, currentUIDs, nil return []*Message{}, currentUIDs, nil
} }
// Fetch the most recent messages // Fetch the most recent messages
seqSet.AddRange(mbox.NumMessages-numToFetch+1, mbox.NumMessages) seqSet.AddRange(mbox.NumMessages-numToFetch+1, mbox.NumMessages)
} }
@ -177,12 +177,12 @@ func (c *ImapClient) GetMessages(mailbox string, since *time.Time, maxMessages i
// Fetch message data - get envelope and full message body // Fetch message data - get envelope and full message body
options := &imap.FetchOptions{ options := &imap.FetchOptions{
Envelope: true, Envelope: true,
UID: true, UID: true,
BodySection: []*imap.FetchItemBodySection{ BodySection: []*imap.FetchItemBodySection{
{}, // Empty section gets the entire message {}, // Empty section gets the entire message
}, },
} }
fetchCmd := c.Fetch(seqSet, options) fetchCmd := c.Fetch(seqSet, options)
for { for {
@ -196,12 +196,12 @@ func (c *ImapClient) GetMessages(mailbox string, since *time.Time, maxMessages i
log.Printf("Failed to parse message: %v", err) log.Printf("Failed to parse message: %v", err)
continue continue
} }
// Apply message-level keyword filtering // Apply message-level keyword filtering
if messageFilter != nil && !c.ShouldProcessMessage(parsedMsg, messageFilter) { if messageFilter != nil && !c.ShouldProcessMessage(parsedMsg, messageFilter) {
continue // Skip this message due to keyword filter continue // Skip this message due to keyword filter
} }
messages = append(messages, parsedMsg) messages = append(messages, parsedMsg)
} }
@ -231,7 +231,7 @@ func (c *ImapClient) parseMessage(fetchMsg *imapclient.FetchMessageData) (*Messa
env := buffer.Envelope env := buffer.Envelope
msg.Subject = env.Subject msg.Subject = env.Subject
msg.Date = env.Date msg.Date = env.Date
// Parse From addresses // Parse From addresses
for _, addr := range env.From { for _, addr := range env.From {
if addr.Mailbox != "" { if addr.Mailbox != "" {
@ -242,7 +242,7 @@ func (c *ImapClient) parseMessage(fetchMsg *imapclient.FetchMessageData) (*Messa
msg.From = append(msg.From, fullAddr) msg.From = append(msg.From, fullAddr)
} }
} }
// Parse To addresses // Parse To addresses
for _, addr := range env.To { for _, addr := range env.To {
if addr.Mailbox != "" { if addr.Mailbox != "" {
@ -264,7 +264,7 @@ func (c *ImapClient) parseMessage(fetchMsg *imapclient.FetchMessageData) (*Messa
if len(buffer.BodySection) > 0 { if len(buffer.BodySection) > 0 {
bodyBuffer := buffer.BodySection[0] bodyBuffer := buffer.BodySection[0]
reader := bytes.NewReader(bodyBuffer.Bytes) reader := bytes.NewReader(bodyBuffer.Bytes)
// Parse the message using go-message // Parse the message using go-message
entity, err := message.Read(reader) entity, err := message.Read(reader)
if err != nil { if err != nil {
@ -338,7 +338,7 @@ func (c *ImapClient) parseMessagePart(entity *message.Entity, msg *Message) erro
disposition, dispositionParams, _ := entity.Header.ContentDisposition() disposition, dispositionParams, _ := entity.Header.ContentDisposition()
// Determine if this is an attachment // Determine if this is an attachment
isAttachment := disposition == "attachment" || isAttachment := disposition == "attachment" ||
(disposition == "inline" && dispositionParams["filename"] != "") || (disposition == "inline" && dispositionParams["filename"] != "") ||
params["name"] != "" params["name"] != ""

Binary file not shown.

View file

@ -13,7 +13,7 @@ import (
func main() { func main() {
args := config.ParseCommandLine() args := config.ParseCommandLine()
cfg, err := config.LoadConfigWithDiscovery(args) cfg, err := config.LoadConfigWithDiscovery(args)
if err != nil { if err != nil {
log.Fatalf("Failed to load configuration: %v", err) log.Fatalf("Failed to load configuration: %v", err)
@ -33,12 +33,12 @@ func main() {
// Generate per-account database name // Generate per-account database name
dbName := couch.GenerateAccountDBName(source.Name, source.User) dbName := couch.GenerateAccountDBName(source.Name, source.User)
// Ensure the account-specific database exists // Ensure the account-specific database exists
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
err = couchClient.EnsureDB(ctx, dbName) err = couchClient.EnsureDB(ctx, dbName)
cancel() cancel()
if err != nil { if err != nil {
log.Printf("Could not ensure CouchDB database '%s' exists (is it running?): %v", dbName, err) log.Printf("Could not ensure CouchDB database '%s' exists (is it running?): %v", dbName, err)
continue continue
@ -111,7 +111,7 @@ func processImapSource(source *config.MailSource, couchClient *couch.Client, dbN
if syncMetadata != nil { if syncMetadata != nil {
// Use last sync time for incremental sync // Use last sync time for incremental sync
sinceDate = &syncMetadata.LastSyncTime sinceDate = &syncMetadata.LastSyncTime
fmt.Printf(" Incremental sync since: %s (last synced %d messages)\n", fmt.Printf(" Incremental sync since: %s (last synced %d messages)\n",
sinceDate.Format("2006-01-02 15:04:05"), syncMetadata.MessageCount) sinceDate.Format("2006-01-02 15:04:05"), syncMetadata.MessageCount)
} else { } else {
// First sync - use config since date if available // First sync - use config since date if available

52
rust/Cargo.toml Normal file
View file

@ -0,0 +1,52 @@
[package]
name = "mail2couch"
version = "0.1.0"
edition = "2021"
description = "A powerful email backup utility that synchronizes mail from IMAP accounts to CouchDB"
license = "MIT"
repository = "https://github.com/yourusername/mail2couch"
keywords = ["email", "backup", "imap", "couchdb", "sync"]
categories = ["email", "database"]
[dependencies]
# Serialization
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
# Date/time handling
chrono = { version = "0.4", features = ["serde"] }
# HTTP client for CouchDB
reqwest = { version = "0.11", features = ["json"] }
# Async runtime
tokio = { version = "1.0", features = ["full"] }
# Error handling
thiserror = "1.0"
anyhow = "1.0"
# Configuration
config = "0.13"
# IMAP client (when implementing IMAP functionality)
# async-imap = "0.9" # Commented out for now due to compatibility issues
# Logging
log = "0.4"
env_logger = "0.10"
# CLI argument parsing
clap = { version = "4.0", features = ["derive"] }
[dev-dependencies]
# Testing utilities
tokio-test = "0.4"
[lib]
name = "mail2couch"
path = "src/lib.rs"
[[bin]]
name = "mail2couch"
path = "src/main.rs"

111
rust/README.md Normal file
View file

@ -0,0 +1,111 @@
# Mail2Couch Rust Implementation
This directory contains the Rust implementation of mail2couch, which will provide the same functionality as the Go implementation while maintaining full compatibility with the CouchDB document schemas.
## Current Status
🚧 **Work in Progress** - The Rust implementation is planned for future development.
Currently available:
- ✅ **CouchDB Schema Definitions**: Complete Rust structs that match the Go implementation
- ✅ **Serialization Support**: Full serde integration for JSON handling
- ✅ **Type Safety**: Strongly typed structures for all CouchDB documents
- ✅ **Compatibility Tests**: Validated against example documents
- ✅ **Database Naming**: Same database naming logic as Go implementation
## Schema Compatibility
The Rust implementation uses the same CouchDB document schemas as the Go implementation:
### Mail Documents
```rust
use mail2couch::{MailDocument, generate_database_name};
let mut doc = MailDocument::new(
"123".to_string(), // IMAP UID
"INBOX".to_string(), // Mailbox
vec!["sender@example.com".to_string()], // From
vec!["recipient@example.com".to_string()], // To
"Subject".to_string(), // Subject
Utc::now(), // Date
"Body content".to_string(), // Body
HashMap::new(), // Headers
false, // Has attachments
);
doc.set_id(); // Sets ID to "INBOX_123"
```
### Sync Metadata
```rust
use mail2couch::SyncMetadata;
let metadata = SyncMetadata::new(
"INBOX".to_string(), // Mailbox
Utc::now(), // Last sync time
456, // Last message UID
100, // Message count
);
// ID automatically set to "sync_metadata_INBOX"
```
### Database Naming
```rust
use mail2couch::generate_database_name;
let db_name = generate_database_name("Personal Gmail", "");
// Returns: "m2c_personal_gmail"
let db_name = generate_database_name("", "user@example.com");
// Returns: "m2c_user_example_com"
```
## Dependencies
The Rust implementation uses these key dependencies:
- **serde**: JSON serialization/deserialization
- **chrono**: Date/time handling with ISO8601 support
- **reqwest**: HTTP client for CouchDB API
- **tokio**: Async runtime
- **anyhow/thiserror**: Error handling
## Testing
Run the schema compatibility tests:
```bash
cargo test
```
All tests validate that the Rust structures produce JSON compatible with the Go implementation and documented schemas.
## Future Implementation
The planned Rust implementation will include:
- **IMAP Client**: Connect to mail servers and retrieve messages
- **CouchDB Integration**: Store documents using native Rust CouchDB client
- **Configuration**: Same JSON config format as Go implementation
- **CLI Interface**: Compatible command-line interface
- **Performance**: Leveraging Rust's performance characteristics
- **Memory Safety**: Rust's ownership model for reliable operation
## Schema Documentation
See the following files for complete schema documentation:
- [`../couchdb-schemas.md`](../couchdb-schemas.md): Complete schema specification
- [`../examples/`](../examples/): JSON example documents
- [`src/schemas.rs`](src/schemas.rs): Rust type definitions
## Cross-Implementation Compatibility
Both Go and Rust implementations:
- Use identical CouchDB document schemas
- Generate the same database names
- Store documents with the same field names and types
- Support incremental sync with compatible metadata
- Handle attachments using CouchDB native attachment storage
This ensures that databases created by either implementation can be used interchangeably.

20
rust/src/lib.rs Normal file
View file

@ -0,0 +1,20 @@
//! # mail2couch
//!
//! A powerful email backup utility that synchronizes mail from IMAP accounts to CouchDB.
//!
//! This library provides the core functionality for:
//! - Connecting to IMAP servers
//! - Retrieving email messages and attachments
//! - Storing emails in CouchDB with proper document structures
//! - Incremental synchronization to avoid re-processing messages
//! - Filtering by folders, dates, and keywords
//!
//! ## Document Schemas
//!
//! The library uses well-defined CouchDB document schemas that are compatible
//! with the Go implementation. See the `schemas` module for details.
pub mod schemas;
// Re-export main types for convenience
pub use schemas::{MailDocument, SyncMetadata, AttachmentStub, generate_database_name};

7
rust/src/main.rs Normal file
View file

@ -0,0 +1,7 @@
// Placeholder main.rs for Rust implementation
// This will be implemented in the future
fn main() {
println!("mail2couch Rust implementation - Coming Soon!");
println!("See the Go implementation in ../go/ for current functionality.");
}

266
rust/src/schemas.rs Normal file
View file

@ -0,0 +1,266 @@
// CouchDB document schemas for mail2couch
// This file defines the Rust structures that correspond to the CouchDB document schemas
// defined in couchdb-schemas.md
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
/// Represents an email message stored in CouchDB
/// Document ID format: {mailbox}_{uid} (e.g., "INBOX_123")
/// Document type: "mail"
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct MailDocument {
/// CouchDB document ID
#[serde(rename = "_id")]
#[serde(skip_serializing_if = "Option::is_none")]
pub id: Option<String>,
/// CouchDB revision (managed by CouchDB)
#[serde(rename = "_rev")]
#[serde(skip_serializing_if = "Option::is_none")]
pub rev: Option<String>,
/// CouchDB native attachments for email attachments
#[serde(rename = "_attachments")]
#[serde(skip_serializing_if = "Option::is_none")]
pub attachments: Option<HashMap<String, AttachmentStub>>,
/// Original IMAP UID from mail server
#[serde(rename = "sourceUid")]
pub source_uid: String,
/// Source mailbox name (e.g., "INBOX", "Sent")
pub mailbox: String,
/// Sender email addresses
pub from: Vec<String>,
/// Recipient email addresses
pub to: Vec<String>,
/// Email subject line
pub subject: String,
/// Email date from headers (ISO8601 format)
pub date: DateTime<Utc>,
/// Email body content (plain text)
pub body: String,
/// All email headers as key-value pairs
pub headers: HashMap<String, Vec<String>>,
/// When document was stored in CouchDB (ISO8601 format)
#[serde(rename = "storedAt")]
pub stored_at: DateTime<Utc>,
/// Document type identifier (always "mail")
#[serde(rename = "docType")]
pub doc_type: String,
/// Whether email has attachments
#[serde(rename = "hasAttachments")]
pub has_attachments: bool,
}
/// Metadata for CouchDB native attachments
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AttachmentStub {
/// MIME type of attachment
#[serde(rename = "content_type")]
pub content_type: String,
/// Size in bytes (optional)
#[serde(skip_serializing_if = "Option::is_none")]
pub length: Option<u64>,
/// Indicates attachment is stored separately (optional)
#[serde(skip_serializing_if = "Option::is_none")]
pub stub: Option<bool>,
}
/// Sync state information for incremental syncing
/// Document ID format: sync_metadata_{mailbox} (e.g., "sync_metadata_INBOX")
/// Document type: "sync_metadata"
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SyncMetadata {
/// CouchDB document ID
#[serde(rename = "_id")]
#[serde(skip_serializing_if = "Option::is_none")]
pub id: Option<String>,
/// CouchDB revision (managed by CouchDB)
#[serde(rename = "_rev")]
#[serde(skip_serializing_if = "Option::is_none")]
pub rev: Option<String>,
/// Document type identifier (always "sync_metadata")
#[serde(rename = "docType")]
pub doc_type: String,
/// Mailbox name this metadata applies to
pub mailbox: String,
/// When this mailbox was last synced (ISO8601 format)
#[serde(rename = "lastSyncTime")]
pub last_sync_time: DateTime<Utc>,
/// Highest IMAP UID processed in last sync
#[serde(rename = "lastMessageUID")]
pub last_message_uid: u32,
/// Number of messages processed in last sync
#[serde(rename = "messageCount")]
pub message_count: u32,
/// When this metadata was last updated (ISO8601 format)
#[serde(rename = "updatedAt")]
pub updated_at: DateTime<Utc>,
}
impl MailDocument {
/// Create a new MailDocument with required fields
pub fn new(
source_uid: String,
mailbox: String,
from: Vec<String>,
to: Vec<String>,
subject: String,
date: DateTime<Utc>,
body: String,
headers: HashMap<String, Vec<String>>,
has_attachments: bool,
) -> Self {
let now = Utc::now();
Self {
id: None, // Will be set when storing to CouchDB
rev: None, // Managed by CouchDB
attachments: None,
source_uid,
mailbox,
from,
to,
subject,
date,
body,
headers,
stored_at: now,
doc_type: "mail".to_string(),
has_attachments,
}
}
/// Generate document ID based on mailbox and UID
pub fn generate_id(&self) -> String {
format!("{}_{}", self.mailbox, self.source_uid)
}
/// Set the document ID
pub fn set_id(&mut self) {
self.id = Some(self.generate_id());
}
}
impl SyncMetadata {
/// Create new sync metadata for a mailbox
pub fn new(
mailbox: String,
last_sync_time: DateTime<Utc>,
last_message_uid: u32,
message_count: u32,
) -> Self {
let now = Utc::now();
Self {
id: Some(format!("sync_metadata_{}", mailbox)),
rev: None, // Managed by CouchDB
doc_type: "sync_metadata".to_string(),
mailbox,
last_sync_time,
last_message_uid,
message_count,
updated_at: now,
}
}
}
/// Generate CouchDB database name from account information
/// Format: m2c_{account_name}
/// Rules: lowercase, replace invalid chars with underscores, ensure starts with letter
pub fn generate_database_name(account_name: &str, user_email: &str) -> String {
let name = if account_name.is_empty() {
user_email
} else {
account_name
};
// Convert to lowercase and replace invalid characters
let mut valid_name = name
.to_lowercase()
.chars()
.map(|c| {
if c.is_ascii_alphanumeric() || c == '_' || c == '$' || c == '(' || c == ')' || c == '+' || c == '-' || c == '/' {
c
} else {
'_'
}
})
.collect::<String>();
// Ensure starts with a letter
if valid_name.is_empty() || !valid_name.chars().next().unwrap().is_ascii_lowercase() {
valid_name = format!("m2c_mail_{}", valid_name);
} else {
valid_name = format!("m2c_{}", valid_name);
}
valid_name
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_generate_database_name() {
assert_eq!(generate_database_name("Personal Gmail", ""), "m2c_personal_gmail");
assert_eq!(generate_database_name("", "user@example.com"), "m2c_user_example_com");
assert_eq!(generate_database_name("123work", ""), "m2c_mail_123work");
}
#[test]
fn test_mail_document_id_generation() {
let mut doc = MailDocument::new(
"123".to_string(),
"INBOX".to_string(),
vec!["sender@example.com".to_string()],
vec!["recipient@example.com".to_string()],
"Test Subject".to_string(),
Utc::now(),
"Test body".to_string(),
HashMap::new(),
false,
);
assert_eq!(doc.generate_id(), "INBOX_123");
doc.set_id();
assert_eq!(doc.id, Some("INBOX_123".to_string()));
}
#[test]
fn test_sync_metadata_creation() {
let metadata = SyncMetadata::new(
"INBOX".to_string(),
Utc::now(),
456,
100,
);
assert_eq!(metadata.id, Some("sync_metadata_INBOX".to_string()));
assert_eq!(metadata.doc_type, "sync_metadata");
assert_eq!(metadata.mailbox, "INBOX");
assert_eq!(metadata.last_message_uid, 456);
assert_eq!(metadata.message_count, 100);
}
}

169
scripts/validate-schemas.py Executable file
View file

@ -0,0 +1,169 @@
#!/usr/bin/env python3
"""
Schema Validation Script for mail2couch
This script validates that the CouchDB document schemas are consistent
between the Go implementation and the documented JSON examples.
"""
import json
import sys
from pathlib import Path
def load_json_file(file_path):
"""Load and parse a JSON file."""
try:
with open(file_path, 'r') as f:
return json.load(f)
except FileNotFoundError:
print(f"ERROR: File not found: {file_path}")
return None
except json.JSONDecodeError as e:
print(f"ERROR: Invalid JSON in {file_path}: {e}")
return None
def validate_mail_document(doc, filename):
"""Validate a mail document structure."""
required_fields = [
'_id', 'sourceUid', 'mailbox', 'from', 'to', 'subject',
'date', 'body', 'headers', 'storedAt', 'docType', 'hasAttachments'
]
errors = []
# Check required fields
for field in required_fields:
if field not in doc:
errors.append(f"Missing required field: {field}")
# Check field types
if 'docType' in doc and doc['docType'] != 'mail':
errors.append(f"Invalid docType: expected 'mail', got '{doc['docType']}'")
if 'from' in doc and not isinstance(doc['from'], list):
errors.append("Field 'from' must be an array")
if 'to' in doc and not isinstance(doc['to'], list):
errors.append("Field 'to' must be an array")
if 'headers' in doc and not isinstance(doc['headers'], dict):
errors.append("Field 'headers' must be an object")
if 'hasAttachments' in doc and not isinstance(doc['hasAttachments'], bool):
errors.append("Field 'hasAttachments' must be a boolean")
# Check _id format
if '_id' in doc:
doc_id = doc['_id']
if '_' not in doc_id:
errors.append(f"Document ID '{doc_id}' should follow format 'mailbox_uid'")
# Validate attachments if present
if '_attachments' in doc:
if not isinstance(doc['_attachments'], dict):
errors.append("Field '_attachments' must be an object")
else:
for filename, stub in doc['_attachments'].items():
if 'content_type' not in stub:
errors.append(f"Attachment '{filename}' missing content_type")
if errors:
print(f"ERRORS in {filename}:")
for error in errors:
print(f" - {error}")
return False
else:
print(f"{filename}: Valid mail document")
return True
def validate_sync_metadata(doc, filename):
"""Validate a sync metadata document structure."""
required_fields = [
'_id', 'docType', 'mailbox', 'lastSyncTime',
'lastMessageUID', 'messageCount', 'updatedAt'
]
errors = []
# Check required fields
for field in required_fields:
if field not in doc:
errors.append(f"Missing required field: {field}")
# Check field types
if 'docType' in doc and doc['docType'] != 'sync_metadata':
errors.append(f"Invalid docType: expected 'sync_metadata', got '{doc['docType']}'")
if 'lastMessageUID' in doc and not isinstance(doc['lastMessageUID'], int):
errors.append("Field 'lastMessageUID' must be an integer")
if 'messageCount' in doc and not isinstance(doc['messageCount'], int):
errors.append("Field 'messageCount' must be an integer")
# Check _id format
if '_id' in doc:
doc_id = doc['_id']
if not doc_id.startswith('sync_metadata_'):
errors.append(f"Document ID '{doc_id}' should start with 'sync_metadata_'")
if errors:
print(f"ERRORS in {filename}:")
for error in errors:
print(f" - {error}")
return False
else:
print(f"{filename}: Valid sync metadata document")
return True
def main():
"""Main validation function."""
script_dir = Path(__file__).parent
project_root = script_dir.parent
examples_dir = project_root / "examples"
print("Validating CouchDB document schemas...")
print("=" * 50)
all_valid = True
# Validate mail documents
mail_files = [
"sample-mail-document.json",
"simple-mail-document.json"
]
for filename in mail_files:
file_path = examples_dir / filename
doc = load_json_file(file_path)
if doc is None:
all_valid = False
continue
if not validate_mail_document(doc, filename):
all_valid = False
# Validate sync metadata
sync_files = [
"sample-sync-metadata.json"
]
for filename in sync_files:
file_path = examples_dir / filename
doc = load_json_file(file_path)
if doc is None:
all_valid = False
continue
if not validate_sync_metadata(doc, filename):
all_valid = False
print("=" * 50)
if all_valid:
print("✓ All schemas are valid!")
sys.exit(0)
else:
print("✗ Schema validation failed!")
sys.exit(1)
if __name__ == "__main__":
main()

View file

@ -8,6 +8,7 @@ The test environment provides:
- **CouchDB**: Database for storing email messages - **CouchDB**: Database for storing email messages
- **GreenMail IMAP Server**: Java-based mail server designed for testing with pre-populated test accounts and messages - **GreenMail IMAP Server**: Java-based mail server designed for testing with pre-populated test accounts and messages
- **Test Configuration**: Ready-to-use config for testing both sync and archive modes - **Test Configuration**: Ready-to-use config for testing both sync and archive modes
- **HTML Webmail Interface**: Beautiful, responsive web interface for viewing archived emails
## Quick Start ## Quick Start
@ -114,6 +115,20 @@ Each account contains:
- **SMTP Port**: 3025 - **SMTP Port**: 3025
- **Server**: GreenMail (Java-based test server) - **Server**: GreenMail (Java-based test server)
### Accessing Test Data
After running mail2couch, you can access the stored emails via CouchDB's REST API:
**📋 Database Access:**
- All databases: http://localhost:5984/_all_dbs
- Specific database: http://localhost:5984/m2c_specific_folders_only
- All documents: http://localhost:5984/m2c_specific_folders_only/_all_docs
- Individual message: http://localhost:5984/m2c_specific_folders_only/INBOX_12
**🔍 Raw Data Examples:**
- Database info: http://localhost:5984/m2c_specific_folders_only
- Document content: http://localhost:5984/m2c_specific_folders_only/INBOX_1
- Email attachments: http://localhost:5984/m2c_specific_folders_only/INBOX_1/{attachment_name}
## Database Structure ## Database Structure
mail2couch will create separate databases for each mail source (with `m2c_` prefix): mail2couch will create separate databases for each mail source (with `m2c_` prefix):
@ -126,6 +141,7 @@ Each database contains documents with:
- `mailbox` field indicating the origin folder - `mailbox` field indicating the origin folder
- Native CouchDB attachments for email attachments - Native CouchDB attachments for email attachments
- Full message headers and body content - Full message headers and body content
- JSON documents accessible via CouchDB REST API
## Testing Sync vs Archive Modes ## Testing Sync vs Archive Modes

View file

@ -63,5 +63,18 @@ echo ""
echo "To run mail2couch:" echo "To run mail2couch:"
echo " cd ../go && ./mail2couch -config ../test/config-test.json" echo " cd ../go && ./mail2couch -config ../test/config-test.json"
echo "" echo ""
echo "📧 MAIL2COUCH DATABASE ACCESS:"
echo "After running mail2couch, you can access the stored emails via CouchDB:"
echo ""
echo "📋 Database Access (examples after sync):"
echo " - All databases: http://localhost:5984/_all_dbs"
echo " - Specific database: http://localhost:5984/m2c_specific_folders_only"
echo " - All documents: http://localhost:5984/m2c_specific_folders_only/_all_docs"
echo " - Individual message: http://localhost:5984/m2c_specific_folders_only/INBOX_12"
echo ""
echo "🔍 Raw Data Access:"
echo " - Database info: http://localhost:5984/m2c_specific_folders_only"
echo " - Document with content: http://localhost:5984/m2c_specific_folders_only/INBOX_12"
echo ""
echo "To stop the environment:" echo "To stop the environment:"
echo " ./stop-test-env.sh" echo " ./stop-test-env.sh"

View file

@ -81,7 +81,17 @@ add_new_messages() {
import imaplib import imaplib
import time import time
from test.populate_greenmail import create_simple_message import sys
import os
# Add the test directory to Python path to enable imports
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
import importlib.util
spec = importlib.util.spec_from_file_location("populate_greenmail", "populate-greenmail.py")
populate_greenmail = importlib.util.module_from_spec(spec)
spec.loader.exec_module(populate_greenmail)
create_simple_message = populate_greenmail.create_simple_message
def add_new_messages(): def add_new_messages():
"""Add new messages to test incremental sync""" """Add new messages to test incremental sync"""