feat: implement comprehensive wildcard folder selection and keyword filtering
## Wildcard Folder Selection
- Add support for wildcard patterns (`*`, `?`, `[abc]`) using filepath.Match
- Implement special case: `"*"` selects ALL available folders
- Support for complex include/exclude pattern combinations
- Maintain backwards compatibility with exact string matching
- Enable subfolder pattern matching (e.g., `Work/*`, `*/Drafts`)
## Keyword Filtering
- Add SubjectKeywords, SenderKeywords, RecipientKeywords to MessageFilter config
- Implement case-insensitive keyword matching across message fields
- Support multiple keywords per filter type with inclusive OR logic
- Add ShouldProcessMessage method for message-level filtering
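
A minimal sketch of how the keyword filtering described above could work. The field names come from this commit; the `ShouldProcessMessage` signature, the helper `containsAnyKeyword`, the rule that an empty keyword list imposes no restriction, and the AND combination across the three filter types are all assumptions made for illustration, not the actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// MessageFilter mirrors the three keyword lists named in this commit.
type MessageFilter struct {
	SubjectKeywords   []string
	SenderKeywords    []string
	RecipientKeywords []string
}

// containsAnyKeyword (hypothetical helper) reports whether text contains at
// least one keyword, ignoring case. An empty list is assumed to mean
// "no restriction" and therefore returns true.
func containsAnyKeyword(text string, keywords []string) bool {
	if len(keywords) == 0 {
		return true
	}
	lower := strings.ToLower(text)
	for _, kw := range keywords {
		if strings.Contains(lower, strings.ToLower(kw)) {
			return true
		}
	}
	return false
}

// ShouldProcessMessage uses a guessed signature: keywords are OR-ed within
// one filter type, and the three filter types are AND-ed (an assumption).
func (f MessageFilter) ShouldProcessMessage(subject, sender string, recipients []string) bool {
	return containsAnyKeyword(subject, f.SubjectKeywords) &&
		containsAnyKeyword(sender, f.SenderKeywords) &&
		containsAnyKeyword(strings.Join(recipients, " "), f.RecipientKeywords)
}

func main() {
	f := MessageFilter{SubjectKeywords: []string{"invoice", "receipt"}}
	fmt.Println(f.ShouldProcessMessage("Your INVOICE for March", "billing@example.com", nil))
	fmt.Println(f.ShouldProcessMessage("Lunch on Friday?", "friend@example.com", nil))
}
```
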
## Enhanced Test Environment
- Create comprehensive wildcard pattern test scenarios
- Add 12 test folders covering various pattern types: Work/*, Important/*, Archive/*, exact matches
- Implement dedicated wildcard test script (test-wildcard-patterns.sh)
- Update test configurations to demonstrate real-world wildcard usage patterns
- Enhance test data generation with folder-specific messages for validation
## Documentation
- Create FOLDER_PATTERNS.md with comprehensive wildcard examples and use cases
- Update CLAUDE.md to reflect all implemented features and current status
- Enhance test README with detailed wildcard pattern explanations
- Provide configuration examples for common email organization scenarios
## Message Origin Tracking
- Verify all messages in CouchDB properly tagged with origin folder in `mailbox` field
- Maintain per-account database isolation for better organization
- Document ID format: `{folder}_{uid}` ensures uniqueness across folders
Key patterns supported:
- `["*"]` - All folders (with excludes)
- `["Work*", "Important*"]` - Prefix matching
- `["Work/*", "Archive/*"]` - Subfolder patterns
- `["INBOX", "Sent"]` - Exact matches
- Complex include/exclude combinations
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
parent ea6235b674
commit 357cd06264
10 changed files with 602 additions and 84 deletions
FOLDER_PATTERNS.md (normal file, 102 lines)
@ -0,0 +1,102 @@
# Folder Pattern Matching in mail2couch

mail2couch supports powerful wildcard patterns for selecting which folders to process. This allows flexible configuration for different mail backup scenarios.

## Pattern Syntax

The folder filtering uses Go's `filepath.Match` syntax, which supports:

- `*` matches any sequence of non-separator characters (including none); the separator is the operating system's path separator, `/` on Unix-like systems
- `?` matches any single non-separator character
- `[abc]` matches any character within the brackets
- `[a-z]` matches any character in the range
- `\` escapes special characters (not on Windows, where `filepath.Match` treats `\` as a path separator)

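The syntax above can be tried directly against `filepath.Match`. The following is a small sketch; the helper `matchFolder` and the folder names are made up for illustration and are not part of mail2couch.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchFolder (hypothetical helper) wraps filepath.Match and treats a
// malformed pattern as "no match" instead of returning an error.
func matchFolder(pattern, folder string) bool {
	ok, err := filepath.Match(pattern, folder)
	return err == nil && ok
}

func main() {
	cases := []struct{ pattern, folder string }{
		{"Work*", "Workshop"},            // * matches the trailing "shop"
		{"Work*", "Work"},                // * may also match nothing
		{"Archive-202?", "Archive-2024"}, // ? matches exactly one character
		{"[IS]*", "Sent"},                // class: first character is I or S
		{"List-[a-c]", "List-d"},         // range: d lies outside a-c
		{"Work*", "Work/Projects"},       // * does not cross "/" on Unix-like systems
	}
	for _, c := range cases {
		fmt.Printf("%-16q %-18q %v\n", c.pattern, c.folder, matchFolder(c.pattern, c.folder))
	}
}
```
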
## Special Cases

- `"*"` in the include list means **ALL available folders** will be processed
- An empty include list combined with exclude patterns processes all folders except the excluded ones
- Exact string matching is supported for backwards compatibility

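The special cases above can be sketched as a single selection function. This is only an illustration under stated assumptions: the names `shouldProcessFolder` and `matchesAny` are hypothetical, and giving exclude patterns precedence over include patterns is an assumption inferred from the examples, not confirmed behavior.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchesAny (hypothetical helper) reports whether name equals a pattern
// exactly or matches it via filepath.Match. Malformed patterns never match.
func matchesAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if p == name {
			return true // exact string match, kept for backwards compatibility
		}
		if ok, err := filepath.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// shouldProcessFolder (hypothetical) applies the rules in this order:
// excludes win (assumption), an empty include list selects everything,
// a literal "*" selects everything, otherwise the include patterns decide.
func shouldProcessFolder(name string, include, exclude []string) bool {
	if matchesAny(exclude, name) {
		return false
	}
	if len(include) == 0 {
		return true
	}
	for _, p := range include {
		if p == "*" {
			return true // special case: all folders, including nested ones
		}
	}
	return matchesAny(include, name)
}

func main() {
	include := []string{"*"}
	exclude := []string{"Drafts", "Trash"}
	for _, f := range []string{"INBOX", "Sent", "Work/Projects", "Drafts", "Trash"} {
		fmt.Printf("%-14s %v\n", f, shouldProcessFolder(f, include, exclude))
	}
}
```

Handling `"*"` before calling `filepath.Match` matters in this sketch because a bare `*` would otherwise not match folder names that contain the path separator.
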
## Examples

### Include All Folders

```json
{
  "folderFilter": {
    "include": ["*"],
    "exclude": ["Drafts", "Trash", "Spam"]
  }
}
```

This processes all folders except Drafts, Trash, and Spam.

### Work-Related Folders Only

```json
{
  "folderFilter": {
    "include": ["Work*", "Projects*", "INBOX"],
    "exclude": ["*Temp*", "*Draft*"]
  }
}
```

This includes folders starting with "Work" or "Projects", plus INBOX, but excludes any folder containing "Temp" or "Draft".

### Archive Patterns

```json
{
  "folderFilter": {
    "include": ["Archive*", "*Important*", "INBOX"],
    "exclude": ["*Temp"]
  }
}
```

This includes folders starting with "Archive", any folder containing "Important", and INBOX, excluding folders whose names end in "Temp".

### Specific Folders Only

```json
{
  "folderFilter": {
    "include": ["INBOX", "Sent", "Important"],
    "exclude": []
  }
}
```

This processes only the exact folders: INBOX, Sent, and Important.

### Subfolder Patterns

```json
{
  "folderFilter": {
    "include": ["Work/*", "Personal/*"],
    "exclude": ["*/Drafts"]
  }
}
```

This includes the direct subfolders of Work and Personal, but excludes any Drafts folder one level below a top-level folder. Because `*` in `filepath.Match` does not cross `/`, deeper levels need their own patterns (e.g., `Work/*/*`).

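To see how far a subfolder pattern reaches, the sketch below runs two of the patterns from this example against a made-up folder list. The helper `matchingFolders` and the folder names are hypothetical, and the results assume a Unix-like system where `/` is the path separator.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchingFolders (hypothetical helper) returns the folders that match the
// pattern according to filepath.Match, skipping malformed patterns.
func matchingFolders(pattern string, folders []string) []string {
	var out []string
	for _, f := range folders {
		if ok, err := filepath.Match(pattern, f); err == nil && ok {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	folders := []string{
		"Work",
		"Work/Projects",
		"Work/Projects/2024",
		"Personal/Drafts",
		"Work/Clients/Drafts",
	}
	// "Work/*" reaches exactly one level below Work.
	fmt.Println(matchingFolders("Work/*", folders))
	// "*/Drafts" only sees Drafts folders directly under a top-level folder.
	fmt.Println(matchingFolders("*/Drafts", folders))
}
```
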
## Folder Hierarchy

Different IMAP servers use different separators for folder hierarchies:

- Many servers use `/` (e.g., `Work/Projects`, `Archive/2024`)
- Some use `.` (e.g., `Work.Projects`, `Archive.2024`)

The patterns work with whatever separator your IMAP server uses, so write them with that separator. Note that `filepath.Match` treats only the operating system's path separator specially: on Unix-like systems `*` stops at `/` but not at `.`.

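The effect of the hierarchy separator on `*` can be checked with a short sketch. The helper `matches` and the folder names are hypothetical; the `/` result assumes a Unix-like system.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matches (hypothetical helper) reports a filepath.Match result and treats
// malformed patterns as "no match".
func matches(pattern, folder string) bool {
	ok, err := filepath.Match(pattern, folder)
	return err == nil && ok
}

func main() {
	// With "." as the hierarchy separator, * spans the whole remaining name.
	fmt.Println(matches("Work*", "Work.Projects"))
	fmt.Println(matches("Work.*", "Work.Projects.2024"))
	// With "/" as the hierarchy separator, * stops at the separator
	// (on Unix-like systems), so the same prefix pattern behaves differently.
	fmt.Println(matches("Work*", "Work/Projects"))
}
```

On a server that uses `.`, a prefix pattern such as `Work*` therefore also selects every nested folder below Work, while on a server that uses `/` it selects only top-level folders.
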
## Common Use Cases

1. **Corporate Email**: `["*"]` with exclude `["Drafts", "Trash", "Spam"]` for complete backup
2. **Selective Backup**: `["INBOX", "Sent", "Important"]` for essential folders only
3. **Project-based**: `["Project*", "Client*"]` to back up work-related folders
4. **Archive Mode**: `["Archive*", "*Important*"]` for long-term storage
5. **Sync Mode**: `["INBOX"]` for real-time synchronization

## Message Origin Tracking

All messages stored in CouchDB include a `mailbox` field that records the original folder name. This ensures you can always identify which folder a message came from, regardless of how it was selected by the folder filter.

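As a rough illustration of origin tracking, the sketch below builds a document whose ID follows the `{folder}_{uid}` format mentioned in the commit message and whose `mailbox` field holds the folder name. The `MailDocument` struct, the `docID` helper, the `subject` field, and the sample values are hypothetical; the real document schema may contain different fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MailDocument is a hypothetical, trimmed-down document shape. Only the
// "mailbox" field and the ID format are taken from the source text.
type MailDocument struct {
	ID      string `json:"_id"`
	Mailbox string `json:"mailbox"`
	Subject string `json:"subject"`
}

// docID (hypothetical helper) builds an ID in the {folder}_{uid} format, so
// the same UID in two different folders yields two distinct documents.
func docID(folder string, uid uint32) string {
	return fmt.Sprintf("%s_%d", folder, uid)
}

func main() {
	doc := MailDocument{
		ID:      docID("Work/Projects", 42),
		Mailbox: "Work/Projects",
		Subject: "Status update",
	}
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```
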
## Performance Considerations

- Using `"*"` processes all folders, which may be slow for accounts with many folders
- Specific folder names are faster than wildcard patterns
- Consider using exclude patterns to filter out large, unimportant folders like Trash or Spam