mail2couch/FOLDER_PATTERNS.md
Ole-Morten Duesund 357cd06264 feat: implement comprehensive wildcard folder selection and keyword filtering
2025-08-01 17:24:02 +02:00


Folder Pattern Matching in mail2couch

mail2couch supports powerful wildcard patterns for selecting which folders to process. This allows flexible configuration for different mail backup scenarios.

Pattern Syntax

The folder filtering uses Go's filepath.Match syntax, which supports:

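  • * matches any sequence of characters (including none), except the operating system's path separator (/ on Linux and macOS)
  • ? matches any single character other than the path separator
  • [abc] matches any character within the brackets
  • [a-z] matches any character in the range
  • \ escapes special characters (not available on Windows, where \ is the path separator)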
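
The syntax elements above can be tried directly against filepath.Match. The following is a minimal sketch, not code from mail2couch: the matches helper is a hypothetical name, and treating a malformed pattern as "no match" is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matches reports whether name matches pattern. A malformed pattern makes
// filepath.Match return filepath.ErrBadPattern; this sketch treats that
// case as a non-match (an assumption, not mail2couch's documented behavior).
func matches(pattern, name string) bool {
	ok, err := filepath.Match(pattern, name)
	return err == nil && ok
}

func main() {
	fmt.Println(matches("Work*", "Work"))              // * may match nothing
	fmt.Println(matches("Work*", "Workshop"))          // * matches "shop"
	fmt.Println(matches("Archive202?", "Archive2024")) // ? matches one character
	fmt.Println(matches("[IS]*", "Sent"))              // first character is I or S
	fmt.Println(matches("List[a-c]", "Listb"))         // character range
	fmt.Println(matches("[abc", "a"))                  // unterminated bracket: bad pattern
}
```

The first five calls report a match; the last one does not, because the pattern is malformed.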

Special Cases

  • "*" in the include list means ALL available folders will be processed
  • An empty include list combined with exclude patterns processes all folders except the excluded ones
  • Exact string matching is supported for backwards compatibility
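
One way these rules could fit together is sketched below. This is an illustration under stated assumptions, not the actual mail2couch implementation: the FolderFilter type, the ShouldProcess method, and the matchAny helper are hypothetical names, and the sketch assumes that exclude patterns take precedence over include patterns and that an empty include list always means "everything".

```go
package main

import (
	"fmt"
	"path/filepath"
)

// FolderFilter mirrors the "folderFilter" config block (hypothetical type).
type FolderFilter struct {
	Include []string
	Exclude []string
}

// matchAny reports whether folder equals, or wildcard-matches, any pattern.
func matchAny(patterns []string, folder string) bool {
	for _, p := range patterns {
		if p == folder { // exact string match, kept for backwards compatibility
			return true
		}
		if ok, err := filepath.Match(p, folder); err == nil && ok {
			return true
		}
	}
	return false
}

// ShouldProcess applies the special cases in order (assumed precedence):
// excludes win, an empty include list means all folders, and a literal "*"
// selects every folder regardless of hierarchy separators.
func (f FolderFilter) ShouldProcess(folder string) bool {
	if matchAny(f.Exclude, folder) {
		return false
	}
	if len(f.Include) == 0 {
		return true
	}
	for _, p := range f.Include {
		if p == "*" {
			return true
		}
	}
	return matchAny(f.Include, folder)
}

func main() {
	f := FolderFilter{
		Include: []string{"*"},
		Exclude: []string{"Drafts", "Trash", "Spam"},
	}
	for _, name := range []string{"INBOX", "Work/Projects", "Trash"} {
		fmt.Printf("%-14s %v\n", name, f.ShouldProcess(name))
	}
}
```

Handling "*" as an explicit special case matters because filepath.Match alone would not let * match a folder name that contains the path separator.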

Examples

Include All Folders

{
  "folderFilter": {
    "include": ["*"],
    "exclude": ["Drafts", "Trash", "Spam"]
  }
}

This processes all folders except Drafts, Trash, and Spam.

Work and Project Folders

{
  "folderFilter": {
    "include": ["Work*", "Projects*", "INBOX"],
    "exclude": ["*Temp*", "*Draft*"]
  }
}

This includes folders starting with "Work" or "Projects", plus INBOX, but excludes any folder containing "Temp" or "Draft".

Archive Patterns

{
  "folderFilter": {
    "include": ["Archive*", "*Important*", "INBOX"],
    "exclude": ["*Temp"]
  }
}

This includes folders starting with "Archive", any folder containing "Important", and INBOX, excluding folders whose names end in "Temp".

Specific Folders Only

{
  "folderFilter": {
    "include": ["INBOX", "Sent", "Important"],
    "exclude": []
  }
}

This processes only the exact folders: INBOX, Sent, and Important.

Subfolder Patterns

{
  "folderFilter": {
    "include": ["Work/*", "Personal/*"],
    "exclude": ["*/Drafts"]
  }
}

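This includes the direct subfolders of Work and Personal (e.g., Work/Projects), but excludes any Drafts subfolder one level down (e.g., Personal/Drafts). Because filepath.Match does not let * cross the / separator on Linux and macOS, the parent folder Work itself and deeper folders such as Work/Projects/2024 are not matched by Work/* and need their own patterns (e.g., Work and Work/*/*).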
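
To see how far a subfolder pattern reaches, the sketch below runs the two patterns from this example against a few folder names. It assumes the patterns are passed to filepath.Match unchanged and that the program runs on Linux or macOS, where / is the path separator. The matchFolder helper and the folder names are made up for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchFolder wraps filepath.Match and treats a malformed pattern as a
// non-match (hypothetical helper).
func matchFolder(pattern, folder string) bool {
	ok, err := filepath.Match(pattern, folder)
	return err == nil && ok
}

func main() {
	folders := []string{
		"Work",               // the parent folder itself
		"Work/Projects",      // one level below Work
		"Work/Projects/2024", // two levels below Work
		"Personal/Drafts",    // a Drafts subfolder
	}
	for _, pattern := range []string{"Work/*", "*/Drafts"} {
		for _, folder := range folders {
			fmt.Printf("%-10s %-20s %v\n", pattern, folder, matchFolder(pattern, folder))
		}
	}
}
```

On Linux and macOS, Work/* matches only Work/Projects, and */Drafts matches only Personal/Drafts.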

Folder Hierarchy

Different IMAP servers use different separators for folder hierarchies:

  • Most servers use / (e.g., Work/Projects, Archive/2024)
  • Some use . (e.g., Work.Projects, Archive.2024)

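Patterns are matched against the folder name as the IMAP server reports it, so write them with your server's separator. Note that filepath.Match treats only the operating system's path separator specially: on Linux and macOS, * stops at / but matches across ., so Work* matches Work.Projects but not Work/Projects.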
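
The difference between the two separators can be checked with a short sketch. It assumes folder names reach filepath.Match exactly as the server reports them, and matchName is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchName wraps filepath.Match and treats a malformed pattern as a
// non-match (hypothetical helper).
func matchName(pattern, folder string) bool {
	ok, err := filepath.Match(pattern, folder)
	return err == nil && ok
}

func main() {
	// "." is an ordinary character to filepath.Match, so * runs across it.
	fmt.Println(matchName("Work*", "Work.Projects"))
	fmt.Println(matchName("Work.*", "Work.Projects.2024"))

	// "/" is the path separator on Linux and macOS, so * stops there;
	// the result of this call depends on the host operating system.
	fmt.Println(matchName("Work*", "Work/Projects"))
}
```

On a server that uses . as its separator, a pattern such as Work.* therefore selects every level below Work, while the equivalent Work/* on a / server selects only the first level.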

Common Use Cases

  1. Corporate Email: ["*"] with exclude ["Drafts", "Trash", "Spam"] for complete backup
  2. Selective Backup: ["INBOX", "Sent", "Important"] for essential folders only
  3. Project-based: ["Project*", "Client*"] to back up work-related folders
  4. Archive Mode: ["Archive*", "*Important*"] for long-term storage
  5. Sync Mode: ["INBOX"] for real-time synchronization

Message Origin Tracking

All messages stored in CouchDB include a mailbox field that records the original folder name. This ensures you can always identify which folder a message came from, regardless of how it was selected by the folder filter.

Performance Considerations

  • Using "*" processes all folders, which may be slow for accounts with many folders
  • Specific folder names are faster than wildcard patterns
  • Consider using exclude patterns to filter out large, unimportant folders like Trash or Spam