mail2couch/FOLDER_PATTERNS.md
Ole-Morten Duesund 357cd06264 feat: implement comprehensive wildcard folder selection and keyword filtering
## Wildcard Folder Selection
- Add support for wildcard patterns (`*`, `?`, `[abc]`) using filepath.Match
- Implement special case: `"*"` selects ALL available folders
- Support for complex include/exclude pattern combinations
- Maintain backwards compatibility with exact string matching
- Enable subfolder pattern matching (e.g., `Work/*`, `*/Drafts`)

## Keyword Filtering
- Add SubjectKeywords, SenderKeywords, RecipientKeywords to MessageFilter config
- Implement case-insensitive keyword matching across message fields
- Support multiple keywords per filter type with inclusive OR logic
- Add ShouldProcessMessage method for message-level filtering

## Enhanced Test Environment
- Create comprehensive wildcard pattern test scenarios
- Add 12 test folders covering various pattern types: Work/*, Important/*, Archive/*, exact matches
- Implement dedicated wildcard test script (test-wildcard-patterns.sh)
- Update test configurations to demonstrate real-world wildcard usage patterns
- Enhance test data generation with folder-specific messages for validation

## Documentation
- Create FOLDER_PATTERNS.md with comprehensive wildcard examples and use cases
- Update CLAUDE.md to reflect all implemented features and current status
- Enhance test README with detailed wildcard pattern explanations
- Provide configuration examples for common email organization scenarios

## Message Origin Tracking
- Verify all messages in CouchDB properly tagged with origin folder in `mailbox` field
- Maintain per-account database isolation for better organization
- Document ID format: `{folder}_{uid}` ensures uniqueness across folders

Key patterns supported:
- `["*"]` - All folders (with excludes)
- `["Work*", "Important*"]` - Prefix matching
- `["Work/*", "Archive/*"]` - Subfolder patterns
- `["INBOX", "Sent"]` - Exact matches
- Complex include/exclude combinations

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-01 17:24:02 +02:00


# Folder Pattern Matching in mail2couch
mail2couch supports powerful wildcard patterns for selecting which folders to process. This allows flexible configuration for different mail backup scenarios.
## Pattern Syntax
The folder filtering uses Go's `filepath.Match` syntax, which supports:
- `*` matches any sequence of non-separator characters (including none); on Unix-like systems it does not match across `/`
- `?` matches any single character
- `[abc]` matches any character within the brackets
- `[a-z]` matches any character in the range
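- `\` escapes special characters (except on Windows, where `filepath.Match` treats `\` as the path separator and escaping is disabled)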
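
The syntax rules above can be checked directly against `filepath.Match`. The following is a minimal sketch rather than mail2couch's own code; the `matches` helper and the folder names are illustrative assumptions, and the results in the comments assume a Unix-like system where `/` is the path separator.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matches is a hypothetical helper: it reports whether name matches pattern
// and treats a malformed pattern (filepath.ErrBadPattern) as a non-match.
func matches(pattern, name string) bool {
	ok, err := filepath.Match(pattern, name)
	return err == nil && ok
}

func main() {
	cases := []struct{ pattern, name string }{
		{"Work*", "Workshop"},            // true: * matches "shop"
		{"Work*", "Work/Projects"},       // false: * does not cross "/"
		{"Work/*", "Work/Projects"},      // true: one level below Work
		{"Archive/202?", "Archive/2024"}, // true: ? matches "4"
		{"[IS]*", "Sent"},                // true: first character is I or S
		{"[a-z]*", "INBOX"},              // false: "I" is outside a-z
	}
	for _, c := range cases {
		fmt.Printf("%q vs %q: %v\n", c.pattern, c.name, matches(c.pattern, c.name))
	}
}
```

Matching is case-sensitive, so `[a-z]*` does not select `INBOX`.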
## Special Cases
- `"*"` in the include list means **ALL available folders** will be processed
- An empty include list combined with exclude patterns processes all folders except the excluded ones
- Exact string matching is supported for backwards compatibility
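
One way these special cases could fit together is sketched below. This is not the actual mail2couch implementation: the `FolderFilter` type, the `ShouldProcess` method, and the `matchesAny` helper are assumed names, and the sketch assumes that exclude patterns always take precedence and that an empty include list selects every folder.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// FolderFilter mirrors the "folderFilter" JSON object. The Go type and
// method names are assumptions made for this sketch.
type FolderFilter struct {
	Include []string `json:"include"`
	Exclude []string `json:"exclude"`
}

// matchesAny reports whether folder equals any pattern exactly or matches
// it as a filepath.Match wildcard. Malformed patterns never match.
func matchesAny(patterns []string, folder string) bool {
	for _, p := range patterns {
		if p == folder {
			return true
		}
		if ok, err := filepath.Match(p, folder); err == nil && ok {
			return true
		}
	}
	return false
}

// ShouldProcess applies the excludes first, then the special cases
// (empty include list, literal "*"), then the ordinary include patterns.
func (f FolderFilter) ShouldProcess(folder string) bool {
	if matchesAny(f.Exclude, folder) {
		return false
	}
	if len(f.Include) == 0 {
		return true // assumption: no include list means "everything"
	}
	for _, p := range f.Include {
		if p == "*" {
			return true // special case: all folders, at any depth
		}
	}
	return matchesAny(f.Include, folder)
}

func main() {
	f := FolderFilter{
		Include: []string{"*"},
		Exclude: []string{"Drafts", "Trash", "Spam"},
	}
	for _, folder := range []string{"INBOX", "Work/Projects", "Trash"} {
		fmt.Printf("%s: %v\n", folder, f.ShouldProcess(folder))
	}
}
```

Treating `"*"` as a literal special case matters because a plain `filepath.Match("*", "Work/Projects")` would return false on Unix-like systems.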
## Examples
### Include All Folders
```json
{
  "folderFilter": {
    "include": ["*"],
    "exclude": ["Drafts", "Trash", "Spam"]
  }
}
```
This processes all folders except Drafts, Trash, and Spam.
### Work-Related Folders Only
```json
{
  "folderFilter": {
    "include": ["Work*", "Projects*", "INBOX"],
    "exclude": ["*Temp*", "*Draft*"]
  }
}
```
This includes folders starting with "Work" or "Projects", plus INBOX, but excludes any folder containing "Temp" or "Draft".
### Archive Patterns
```json
{
  "folderFilter": {
    "include": ["Archive*", "*Important*", "INBOX"],
    "exclude": ["*Temp"]
  }
}
```
This includes folders starting with "Archive", any folder containing "Important", and INBOX, but excludes folders whose names end in "Temp".
### Specific Folders Only
```json
{
  "folderFilter": {
    "include": ["INBOX", "Sent", "Important"],
    "exclude": []
  }
}
```
This processes only the exact folders: INBOX, Sent, and Important.
### Subfolder Patterns
```json
{
  "folderFilter": {
    "include": ["Work/*", "Personal/*"],
    "exclude": ["*/Drafts"]
  }
}
```
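This includes the folders directly under Work and Personal, but excludes any Drafts subfolder. Because `*` in `filepath.Match` does not cross `/` on Unix-like systems, deeper levels such as `Work/Projects/2024` need their own pattern (for example `Work/*/*`).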
## Folder Hierarchy
Different IMAP servers use different separators for folder hierarchies:
- Most servers use `/` (e.g., `Work/Projects`, `Archive/2024`)
- Some use `.` (e.g., `Work.Projects`, `Archive.2024`)
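
Patterns are matched against folder names as your IMAP server reports them, so write each pattern with the separator your server uses (for example `Work.*` rather than `Work/*` on a server that uses `.`).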
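
The separator also affects how far `*` reaches. The sketch below assumes folder names are passed to `filepath.Match` unchanged on a Unix-like system; `matchFolder` is a hypothetical helper, not a mail2couch function.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchFolder is a hypothetical wrapper around filepath.Match that treats
// a malformed pattern as a non-match.
func matchFolder(pattern, folder string) bool {
	ok, err := filepath.Match(pattern, folder)
	return err == nil && ok
}

func main() {
	// "/" is the OS path separator here, so * stops at each level.
	fmt.Println(matchFolder("Work/*", "Work/Projects"))      // true
	fmt.Println(matchFolder("Work/*", "Work/Projects/2024")) // false

	// "." is an ordinary character, so * runs through every level.
	fmt.Println(matchFolder("Work.*", "Work.Projects"))      // true
	fmt.Println(matchFolder("Work.*", "Work.Projects.2024")) // true

	// A pattern written with the wrong separator matches nothing.
	fmt.Println(matchFolder("Work/*", "Work.Projects")) // false
}
```

On a server that uses `.`, a pattern such as `Work.*` therefore selects the whole subtree, while `Work/*` on a `/` server selects one level only.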
## Common Use Cases
1. **Corporate Email**: `["*"]` with exclude `["Drafts", "Trash", "Spam"]` for complete backup
2. **Selective Backup**: `["INBOX", "Sent", "Important"]` for essential folders only
3. **Project-based**: `["Project*", "Client*"]` to back up work-related folders
4. **Archive Mode**: `["Archive*", "*Important*"]` for long-term storage
5. **Sync Mode**: `["INBOX"]` for real-time synchronization
## Message Origin Tracking
All messages stored in CouchDB include a `mailbox` field that records the original folder name. This ensures you can always identify which folder a message came from, regardless of how it was selected by the folder filter.
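
As a rough illustration, a stored document might be built as in the sketch below. Only the `mailbox` field and the `{folder}_{uid}` document ID format come from the project notes; the `MailDocument` type, the `subject` field, and the helper functions are assumptions made for this example and may not match the real schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// MailDocument is a hypothetical, trimmed-down view of a stored message.
// "_id" is CouchDB's document ID field; "subject" is illustrative only.
type MailDocument struct {
	ID      string `json:"_id"`
	Mailbox string `json:"mailbox"`
	Subject string `json:"subject"`
}

// documentID builds an ID in the {folder}_{uid} format, so the same UID
// in two different folders yields two different documents.
func documentID(folder string, uid uint32) string {
	return fmt.Sprintf("%s_%d", folder, uid)
}

// newMailDocument records the origin folder in the mailbox field.
func newMailDocument(folder string, uid uint32, subject string) MailDocument {
	return MailDocument{
		ID:      documentID(folder, uid),
		Mailbox: folder,
		Subject: subject,
	}
}

func main() {
	doc := newMailDocument("Work/Projects", 42, "Status report")
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```

Because the folder name is part of the ID, the origin of a message can be read from either the `_id` or the `mailbox` field.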
## Performance Considerations
- Using `"*"` processes all folders, which may be slow for accounts with many folders
- Specific folder names are faster than wildcard patterns
- Consider using exclude patterns to filter out large, unimportant folders like Trash or Spam