# Folder Pattern Matching in mail2couch
mail2couch supports wildcard patterns for selecting which folders to process, so the same include/exclude configuration can cover different mail backup scenarios.
## Pattern Syntax
The folder filtering uses Go's `filepath.Match` syntax, which supports:
- `*` matches any sequence of non-separator characters (including none); on Unix-like systems it does not match across `/`
- `?` matches any single character
- `[abc]` matches any character within the brackets
- `[a-z]` matches any character in the range
- `\` escapes the special character that follows it (in `filepath.Match`, escaping is not available on Windows)
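
The syntax rules above can be tried directly against Go's standard library. The following is a minimal sketch, not mail2couch's actual code: the helper name `matches` and the folder names are made up for illustration, and the last case assumes a Unix-like system where `/` is the path separator.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matches reports whether name matches pattern. A malformed pattern
// (filepath.ErrBadPattern) is treated as "no match" in this sketch.
func matches(pattern, name string) bool {
	ok, err := filepath.Match(pattern, name)
	return err == nil && ok
}

func main() {
	cases := []struct{ pattern, name string }{
		{"Work*", "Workshop"},                // * matches the rest of the name
		{"Work*", "Work"},                    // * may also match nothing
		{"Sen?", "Sent"},                     // ? matches exactly one character
		{"[AI]*", "INBOX"},                   // first character must be A or I
		{"Archive-202[0-4]", "Archive-2023"}, // character range
		{"Work*", "Work/Projects"},           // no match on Unix-like systems: * stops at /
	}
	for _, c := range cases {
		fmt.Printf("%-20q %-16q %v\n", c.pattern, c.name, matches(c.pattern, c.name))
	}
}
```

Treating a bad pattern as a non-match is only one possible choice; a real tool might prefer to report the configuration error instead.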
## Special Cases
- `"*"` in the include list means **ALL available folders** will be processed
- An empty include list combined with exclude patterns processes all folders except the excluded ones
- Exact string matching is supported for backwards compatibility
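
One way the special cases could fit together is sketched below. This is an illustration under stated assumptions, not the project's implementation: the type `FolderFilter` and the functions `matchAny` and `selectsFolder` are hypothetical names, excludes are assumed to take precedence over includes, and a filter with both lists empty is assumed to select everything.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// FolderFilter mirrors the "folderFilter" JSON object shown in this document.
// The Go type and field names are assumptions made for this sketch.
type FolderFilter struct {
	Include []string `json:"include"`
	Exclude []string `json:"exclude"`
}

// matchAny reports whether name equals, or wildcard-matches, any pattern.
func matchAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if p == name { // exact string match, kept for backwards compatibility
			return true
		}
		if ok, err := filepath.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// selectsFolder decides whether a folder should be processed.
func selectsFolder(f FolderFilter, name string) bool {
	if matchAny(f.Exclude, name) {
		return false // assumption: excludes always win
	}
	if len(f.Include) == 0 {
		return true // empty include list: everything not excluded
	}
	for _, p := range f.Include {
		if p == "*" {
			return true // special case: "*" selects ALL folders, nested ones too
		}
	}
	return matchAny(f.Include, name)
}

func main() {
	f := FolderFilter{
		Include: []string{"*"},
		Exclude: []string{"Drafts", "Trash", "Spam"},
	}
	for _, name := range []string{"INBOX", "Work/Projects", "Trash"} {
		fmt.Println(name, selectsFolder(f, name))
	}
}
```

The explicit check for `"*"` matters because plain `filepath.Match("*", "Work/Projects")` is false on Unix-like systems; without the special case, nested folders would be skipped.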
## Examples
### Include All Folders
```json
{
"folderFilter": {
"include": ["*"],
"exclude": ["Drafts", "Trash", "Spam"]
}
}
```
This processes all folders except Drafts, Trash, and Spam.
### Work-Related Folders Only
```json
{
"folderFilter": {
"include": ["Work*", "Projects*", "INBOX"],
"exclude": ["*Temp*", "*Draft*"]
}
}
```
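This includes folders whose names start with "Work" or "Projects", plus INBOX, but excludes any folder whose name contains "Temp" or "Draft". Because `*` in `filepath.Match` does not match `/` on Unix-like systems, nested folders such as `Work/Projects` need a pattern like `Work/*`.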
### Archive Patterns
```json
{
"folderFilter": {
"include": ["Archive*", "*Important*", "INBOX"],
"exclude": ["*Temp"]
}
}
```
This includes folders whose names start with "Archive", any folder whose name contains "Important", and INBOX, excluding folders whose names end with "Temp".
### Specific Folders Only
```json
{
"folderFilter": {
"include": ["INBOX", "Sent", "Important"],
"exclude": []
}
}
```
This processes only the exact folders: INBOX, Sent, and Important.
### Subfolder Patterns
```json
{
"folderFilter": {
"include": ["Work/*", "Personal/*"],
"exclude": ["*/Drafts"]
}
}
```
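This includes the direct subfolders of Work and Personal (for example `Work/Projects`), but excludes any subfolder named Drafts. The parent folders `Work` and `Personal` themselves do not match `Work/*` or `Personal/*`; add them to the include list explicitly if they should be processed.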
## Folder Hierarchy
Different IMAP servers use different separators for folder hierarchies:
- Most servers use `/` (e.g., `Work/Projects`, `Archive/2024`)
- Some use `.` (e.g., `Work.Projects`, `Archive.2024`)
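Patterns are matched against folder names as the server reports them, so write them with your server's separator (for example `Work.*` on a server that uses `.`). Only `/` is treated as a separator by `filepath.Match` on Unix-like systems, so on `.`-separated servers `*` can span several hierarchy levels.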
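
The effect of the separator on matching can be checked with a short sketch. It assumes a Unix-like system, where `filepath.Match` treats `/` as the separator; `matchFolder` and the folder names are hypothetical and used only for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchFolder wraps filepath.Match and treats a malformed pattern as "no match".
func matchFolder(pattern, name string) bool {
	ok, err := filepath.Match(pattern, name)
	return err == nil && ok
}

func main() {
	cases := []struct{ pattern, name string }{
		{"Work/*", "Work/Projects"},      // direct subfolder
		{"Work/*", "Work/Projects/2024"}, // two levels down: * does not cross /
		{"Work/*", "Work"},               // the parent folder itself
		{"*/Drafts", "Work/Drafts"},      // Drafts one level below any folder
		{"*/Drafts", "Drafts"},           // top-level Drafts has no parent segment
		{"Work.*", "Work.Projects"},      // "." separator: an ordinary character
		{"Work.*", "Work.Projects.2024"}, // so * spans further levels here
	}
	for _, c := range cases {
		fmt.Printf("%-12q %-22q %v\n", c.pattern, c.name, matchFolder(c.pattern, c.name))
	}
}
```

On a `/`-separated server, a top-level folder and its subfolders therefore need separate patterns, such as `"Work"` and `"Work/*"`.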
## Common Use Cases
1. **Corporate Email**: `["*"]` with exclude `["Drafts", "Trash", "Spam"]` for complete backup
2. **Selective Backup**: `["INBOX", "Sent", "Important"]` for essential folders only
3. **Project-based**: `["Project*", "Client*"]` to back up work-related folders
4. **Archive Mode**: `["Archive*", "*Important*"]` for long-term storage
5. **Sync Mode**: `["INBOX"]` for real-time synchronization
## Message Origin Tracking
All messages stored in CouchDB include a `mailbox` field that records the original folder name. This ensures you can always identify which folder a message came from, regardless of how it was selected by the folder filter.
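
As a rough illustration of how the `mailbox` field can be used after a backup, the sketch below counts documents per origin folder. It is an assumption-laden example: `storedMessage` and `countByMailbox` are hypothetical names, the sample documents and their IDs are made up, and only `_id` (CouchDB's standard document ID) and `mailbox` are decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// storedMessage decodes only the fields this sketch needs. Real documents
// written by mail2couch carry more fields, which are ignored here.
type storedMessage struct {
	ID      string `json:"_id"`
	Mailbox string `json:"mailbox"`
}

// countByMailbox decodes a JSON array of documents and counts them per
// origin folder, using the "mailbox" field.
func countByMailbox(raw []byte) (map[string]int, error) {
	var docs []storedMessage
	if err := json.Unmarshal(raw, &docs); err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	for _, d := range docs {
		counts[d.Mailbox]++
	}
	return counts, nil
}

func main() {
	// Made-up sample data for illustration only.
	raw := []byte(`[
		{"_id": "example-1", "mailbox": "INBOX"},
		{"_id": "example-2", "mailbox": "Work/Projects"},
		{"_id": "example-3", "mailbox": "INBOX"}
	]`)
	counts, err := countByMailbox(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(counts)
}
```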
## Performance Considerations
- Using `"*"` processes all folders, which may be slow for accounts with many folders
- Listing specific folder names limits processing to those folders, which is usually faster than a broad wildcard that selects many folders
- Consider using exclude patterns to filter out large, unimportant folders like Trash or Spam