# CouchDB Document Schemas

This document defines the CouchDB document schemas used by mail2couch. These schemas must be maintained consistently across all implementations (Go, Rust, etc.).

## Mail Document Schema

**Document Type**: `mail`
**Document ID Format**: `{mailbox}_{uid}` (e.g., `INBOX_123`)
**Purpose**: Stores individual email messages with metadata and content

```json
{
  "_id": "INBOX_123",
  "_rev": "1-abc123...",
  "_attachments": {
    "attachment1.pdf": {
      "content_type": "application/pdf",
      "length": 12345,
      "stub": true
    }
  },
  "sourceUid": "123",
  "mailbox": "INBOX",
  "from": ["sender@example.com"],
  "to": ["recipient@example.com"],
  "subject": "Email Subject",
  "date": "2025-08-02T12:16:10Z",
  "body": "Email body content",
  "headers": {
    "Content-Type": ["text/plain; charset=utf-8"],
    "Message-ID": [""],
    "Date": ["Sat, 02 Aug 2025 14:16:10 +0200"]
  },
  "storedAt": "2025-08-02T14:16:22.375241322+02:00",
  "docType": "mail",
  "hasAttachments": true
}
```

### Field Definitions

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `_id` | string | Yes | CouchDB document ID: `{mailbox}_{uid}` |
| `_rev` | string | Auto | CouchDB revision (managed by CouchDB) |
| `_attachments` | object | No | CouchDB native attachments (email attachments) |
| `sourceUid` | string | Yes | Original IMAP UID from mail server |
| `mailbox` | string | Yes | Source mailbox name (e.g., "INBOX", "Sent") |
| `from` | array[string] | Yes | Sender email addresses |
| `to` | array[string] | Yes | Recipient email addresses |
| `subject` | string | Yes | Email subject line |
| `date` | string (ISO8601) | Yes | Email date from headers |
| `body` | string | Yes | Email body content (plain text) |
| `headers` | object | Yes | All email headers as key-value pairs |
| `storedAt` | string (ISO8601) | Yes | When document was stored in CouchDB |
| `docType` | string | Yes | Always "mail" for email documents |
| `hasAttachments` | boolean | Yes | Whether email has attachments |

### Attachment Stub Schema

When emails have attachments, they are stored as CouchDB native attachments:

```json
{
  "filename.ext": {
    "content_type": "mime/type",
    "length": 12345,
    "stub": true
  }
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `content_type` | string | Yes | MIME type of attachment |
| `length` | integer | No | Size in bytes |
| `stub` | boolean | No | Indicates attachment is stored separately |

## Sync Metadata Document Schema

**Document Type**: `sync_metadata`
**Document ID Format**: `sync_metadata_{mailbox}` (e.g., `sync_metadata_INBOX`)
**Purpose**: Tracks synchronization state for incremental syncing

```json
{
  "_id": "sync_metadata_INBOX",
  "_rev": "1-def456...",
  "docType": "sync_metadata",
  "mailbox": "INBOX",
  "lastSyncTime": "2025-08-02T14:26:08.281094+02:00",
  "lastMessageUID": 15,
  "messageCount": 18,
  "updatedAt": "2025-08-02T14:26:08.281094+02:00"
}
```

### Field Definitions

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `_id` | string | Yes | CouchDB document ID: `sync_metadata_{mailbox}` |
| `_rev` | string | Auto | CouchDB revision (managed by CouchDB) |
| `docType` | string | Yes | Always "sync_metadata" for sync documents |
| `mailbox` | string | Yes | Mailbox name this metadata applies to |
| `lastSyncTime` | string (ISO8601) | Yes | When this mailbox was last synced |
| `lastMessageUID` | integer | Yes | Highest IMAP UID processed in last sync |
| `messageCount` | integer | Yes | Number of messages processed in last sync |
| `updatedAt` | string (ISO8601) | Yes | When this metadata was last updated |

## Database Naming Convention

**Format**: `m2c_{account_name}`

**Rules**:
- Prefix all databases with `m2c_`
- Convert account names to lowercase
- Replace invalid characters with underscores
- Ensure database name starts with a letter
- If account name starts with non-letter, prefix with `mail_`

**Examples**:
- Account "Personal Gmail" → Database `m2c_personal_gmail`
- Account "123work" → Database `m2c_mail_123work`
- Email "user@example.com" → Database `m2c_user_example_com`

## Document ID Conventions

### Mail Documents

- **Format**: `{mailbox}_{uid}`
- **Examples**: `INBOX_123`, `Sent_456`, `Work/Projects_789`
- **Uniqueness**: Combination of mailbox and IMAP UID ensures uniqueness

### Sync Metadata Documents

- **Format**: `sync_metadata_{mailbox}`
- **Examples**: `sync_metadata_INBOX`, `sync_metadata_Sent`
- **Purpose**: One metadata document per mailbox for tracking sync state

## Data Type Mappings

### Go to JSON

| Go Type | JSON Type | Example |
|---------|-----------|---------|
| `string` | string | `"text"` |
| `[]string` | array | `["item1", "item2"]` |
| `map[string][]string` | object | `{"key": ["value1", "value2"]}` |
| `time.Time` | string (ISO8601) | `"2025-08-02T14:26:08.281094+02:00"` |
| `uint32` | number | `123` |
| `int` | number | `456` |
| `bool` | boolean | `true` |

### Rust Considerations

When implementing in Rust, ensure:
- Use `chrono::DateTime` for timestamps with ISO8601 serialization
- Use `Vec<String>` for string arrays
- Use `HashMap<String, Vec<String>>` for headers
- Use `serde` with `#[serde(rename = "fieldName")]` for JSON field mapping
- Handle optional fields with `Option<T>`

## Validation Rules

### Required Fields

All documents must include:
- `_id`: Valid CouchDB document ID
- `docType`: Identifies document type for filtering
- `mailbox`: Source mailbox name (for mail documents)

### Data Constraints

- Email addresses: No validation enforced (preserve as-is from IMAP)
- Dates: Must be valid ISO8601 format
- UIDs: Must be positive integers
- Document IDs: Must be valid CouchDB IDs (no spaces, special chars)

### Attachment Handling

- Store email attachments as CouchDB native attachments
- Preserve original filenames and MIME types
- Use attachment stubs in document metadata
- Support binary content through CouchDB attachment API

## Backward Compatibility

When modifying schemas:
1. Add new fields as optional
2. Never remove existing fields
3. Maintain existing field types and formats
4. Document any breaking changes clearly
5. Provide migration guidance for existing data

## Implementation Notes

### CouchDB Features Used

- **Native Attachments**: For email attachments
- **Document IDs**: Predictable format for easy access
- **Bulk Operations**: For efficient storage
- **Conflict Resolution**: CouchDB handles revision conflicts

### Performance Considerations

- Index by `docType` for efficient filtering
- Index by `mailbox` for folder-based queries
- Index by `date` for chronological access
- Use bulk insert operations for multiple messages

### Future Extensions

This schema supports future enhancements:
- **Webmail Views**: CouchDB design documents for HTML interface
- **Search Indexes**: Full-text search with CouchDB-Lucene
- **Replication**: Multi-database sync scenarios
- **Analytics**: Message statistics and reporting
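
The database naming rules and document ID formats defined earlier in this document can be illustrated with a short Go sketch. This is not taken from the mail2couch source: the function names `databaseName`, `mailDocID`, and `syncMetadataDocID` are hypothetical, and the set of characters treated as valid (lowercase ASCII letters, digits, and underscore) is an assumption, since the naming rules do not enumerate which characters count as invalid.

```go
package main

import (
	"fmt"
	"strings"
)

// databaseName is a hypothetical helper that applies the naming rules:
// lowercase the account name, replace invalid characters with underscores,
// prefix with "mail_" if the result does not start with a letter, and
// finally prefix with "m2c_".
// Assumption: only a-z, 0-9 and '_' are kept; everything else becomes '_'.
func databaseName(account string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(account) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '_' {
			b.WriteRune(r)
		} else {
			b.WriteByte('_')
		}
	}
	name := b.String()
	// An empty name or one starting with a non-letter gets the "mail_" prefix.
	if name == "" || name[0] < 'a' || name[0] > 'z' {
		name = "mail_" + name
	}
	return "m2c_" + name
}

// mailDocID builds a mail document ID in the {mailbox}_{uid} format.
func mailDocID(mailbox string, uid uint32) string {
	return fmt.Sprintf("%s_%d", mailbox, uid)
}

// syncMetadataDocID builds a sync metadata document ID in the
// sync_metadata_{mailbox} format.
func syncMetadataDocID(mailbox string) string {
	return "sync_metadata_" + mailbox
}

func main() {
	fmt.Println(databaseName("Personal Gmail"))   // m2c_personal_gmail
	fmt.Println(databaseName("123work"))          // m2c_mail_123work
	fmt.Println(databaseName("user@example.com")) // m2c_user_example_com
	fmt.Println(mailDocID("INBOX", 123))          // INBOX_123
	fmt.Println(syncMetadataDocID("INBOX"))       // sync_metadata_INBOX
}
```

The sketch reproduces the three database name examples and the ID examples given in this document. An implementation that wants to keep other characters permitted by CouchDB in database names would only need to widen the character check in `databaseName`.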