From d80bb3a10fb4908d29f30eeb06f86693424e9fbf Mon Sep 17 00:00:00 2001 From: Ole-Morten Duesund Date: Sun, 31 Aug 2025 19:44:15 +0200 Subject: [PATCH] docs: Update DATABASE.md with comprehensive schema and usage documentation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Document complete database schema including aircraft history and callsign cache - Add external data source tables and relationships - Include optimization and maintenance procedures - Document indexes, performance considerations, and storage requirements - Provide examples of database queries and operations 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude --- debian/usr/share/doc/skyview-adsb/DATABASE.md | 770 ++++++++++++++++-- docs/DATABASE.md | 345 +++++++- 2 files changed, 1016 insertions(+), 99 deletions(-) diff --git a/debian/usr/share/doc/skyview-adsb/DATABASE.md b/debian/usr/share/doc/skyview-adsb/DATABASE.md index 2e7347d..326f1cf 100644 --- a/debian/usr/share/doc/skyview-adsb/DATABASE.md +++ b/debian/usr/share/doc/skyview-adsb/DATABASE.md @@ -1,99 +1,729 @@ -# SkyView Database Management +# SkyView Database Architecture -SkyView includes a comprehensive database management system for enriching aircraft callsigns with airline and airport information. +This document describes SkyView's SQLite database architecture, migration system, and integration approach for persistent data storage. -## Quick Start +## Overview -### 1. 
Check Current Status -```bash -skyview-data status +SkyView uses a single SQLite database to store: +- **Historic aircraft data**: Position history, message counts, signal strength +- **Callsign lookup data**: Cached airline/airport information from external APIs +- **Embedded aviation data**: OpenFlights airline and airport databases + +## Database Design Principles + +### Embedded Architecture +- Single SQLite file for all persistent data +- No external database dependencies +- Self-contained deployment with embedded schemas +- Backward compatibility through versioned migrations + +### Performance Optimization +- Strategic indexing for time-series aircraft data +- Efficient lookups for callsign enhancement +- Configurable data retention policies +- Query optimization for real-time operations + +### Data Safety +- Atomic migration transactions +- Pre-migration backups for destructive changes +- Data loss warnings for schema changes +- Rollback capabilities where possible + +## Database Schema + +### Core Tables + +#### `schema_info` +Tracks database version and applied migrations: +```sql +CREATE TABLE schema_info ( + version INTEGER PRIMARY KEY, + applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + description TEXT, + checksum TEXT +); ``` -### 2. Import Safe Data (Recommended) -```bash -# Import public domain sources automatically -skyview-data update +#### `aircraft_history` +Stores time-series aircraft position and message data: +```sql +CREATE TABLE aircraft_history ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + icao TEXT NOT NULL, + timestamp TIMESTAMP NOT NULL, + latitude REAL, + longitude REAL, + altitude INTEGER, + speed INTEGER, + track INTEGER, + vertical_rate INTEGER, + squawk TEXT, + callsign TEXT, + source_id TEXT NOT NULL, + signal_strength REAL +); ``` -### 3. 
Enable Automatic Updates (Optional) -```bash -# Weekly updates on Sunday at 3 AM -sudo systemctl enable --now skyview-database-update.timer +**Indexes:** +- `idx_aircraft_history_icao_time`: Fast queries by aircraft and time range +- `idx_aircraft_history_timestamp`: Time-based cleanup and queries +- `idx_aircraft_history_callsign`: Callsign-based searches + +#### `airlines` +Multi-source airline database with unified schema: +```sql +CREATE TABLE airlines ( + id INTEGER PRIMARY KEY, + name TEXT NOT NULL, + alias TEXT, + iata_code TEXT, + icao_code TEXT, + callsign TEXT, + country TEXT, + country_code TEXT, + active BOOLEAN DEFAULT 1, + data_source TEXT NOT NULL DEFAULT 'unknown', + source_id TEXT, + imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP +); ``` -## Available Data Sources +**Indexes:** +- `idx_airlines_icao_code`: ICAO code lookup (primary for callsign enhancement) +- `idx_airlines_iata_code`: IATA code lookup +- `idx_airlines_callsign`: Radio callsign lookup +- `idx_airlines_country_code`: Country-based filtering +- `idx_airlines_active`: Active airlines filtering +- `idx_airlines_source`: Data source tracking -### Safe Sources (Public Domain) -These sources are imported automatically with `skyview-data update`: -- **OurAirports**: Comprehensive airport database (public domain) -- **FAA Registry**: US aircraft registration data (public domain) - -### License-Required Sources -These require explicit acceptance: -- **OpenFlights**: Airline and airport data (AGPL-3.0 license) - -## Commands - -### Basic Operations -```bash -skyview-data list # Show available sources -skyview-data status # Show database status -skyview-data update # Update safe sources -skyview-data import openflights # Import licensed source -skyview-data clear # Remove source data +#### `airports` +Multi-source airport database with comprehensive metadata: +```sql +CREATE TABLE airports ( + id INTEGER PRIMARY KEY, + name TEXT NOT NULL, + 
ident TEXT, + type TEXT, + city TEXT, + municipality TEXT, + region TEXT, + country TEXT, + country_code TEXT, + continent TEXT, + iata_code TEXT, + icao_code TEXT, + local_code TEXT, + gps_code TEXT, + latitude REAL, + longitude REAL, + elevation_ft INTEGER, + scheduled_service BOOLEAN DEFAULT 0, + home_link TEXT, + wikipedia_link TEXT, + keywords TEXT, + timezone_offset REAL, + timezone TEXT, + dst_type TEXT, + data_source TEXT NOT NULL DEFAULT 'unknown', + source_id TEXT, + imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP +); ``` -### Systemd Timer Management -```bash -# Enable weekly automatic updates -systemctl enable skyview-database-update.timer -systemctl start skyview-database-update.timer +**Indexes:** +- `idx_airports_icao_code`: ICAO code lookup +- `idx_airports_iata_code`: IATA code lookup +- `idx_airports_ident`: Airport identifier lookup +- `idx_airports_country_code`: Country-based filtering +- `idx_airports_type`: Airport type filtering +- `idx_airports_coords`: Geographic coordinate queries +- `idx_airports_source`: Data source tracking -# Check timer status -systemctl status skyview-database-update.timer - -# View update logs -journalctl -u skyview-database-update.service - -# Disable automatic updates -systemctl disable skyview-database-update.timer +#### `callsign_cache` +Caches external API lookups and local enrichment for callsign enhancement: +```sql +CREATE TABLE callsign_cache ( + callsign TEXT PRIMARY KEY, + airline_icao TEXT, + airline_iata TEXT, + airline_name TEXT, + airline_country TEXT, + flight_number TEXT, + origin_iata TEXT, -- Departure airport IATA code + destination_iata TEXT, -- Arrival airport IATA code + aircraft_type TEXT, + route TEXT, -- Full route description + status TEXT, -- Flight status (scheduled, delayed, etc.) 
+ source TEXT NOT NULL DEFAULT 'local', + cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + expires_at TIMESTAMP NOT NULL +); ``` -## License Compliance +**Route Information Fields:** +- **`origin_iata`**: IATA code of departure airport (e.g., "JFK" for New York JFK) +- **`destination_iata`**: IATA code of arrival airport (e.g., "LAX" for Los Angeles) +- **`route`**: Human-readable route description (e.g., "JFK-LAX" or "New York to Los Angeles") +- **`status`**: Current flight status when available from external APIs -SkyView maintains strict license separation: -- **SkyView binary**: Contains no external data (stays MIT licensed) -- **Runtime import**: Users choose which sources to import -- **Safe defaults**: Only public domain sources updated automatically -- **User choice**: Each person decides their own license compatibility +These fields enable enhanced flight tracking with origin-destination pairs and route visualization. + +**Indexes:** +- `idx_callsign_cache_expires`: Efficient cache cleanup +- `idx_callsign_cache_airline`: Airline-based queries + +#### `data_sources` +Tracks loaded external data sources and their metadata: +```sql +CREATE TABLE data_sources ( + name TEXT PRIMARY KEY, + license TEXT NOT NULL, + url TEXT, + version TEXT, + imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + record_count INTEGER DEFAULT 0, + user_accepted_license BOOLEAN DEFAULT 0 +); +``` + +## Database Location Strategy + +### Path Resolution Order +1. **Explicit configuration**: `database.path` in config file +2. **System service**: `/var/lib/skyview/skyview.db` +3. **User mode**: `~/.local/share/skyview/skyview.db` +4. 
**Fallback**: `./skyview.db` in current directory + +### Directory Permissions +- System: `root:root` with `755` permissions for `/var/lib/skyview/` +- User: User-owned directories with standard permissions +- Service: `skyview:skyview` user/group for system service + +## Migration System + +### Migration Structure +```go +type Migration struct { + Version int // Sequential version number + Description string // Human-readable description + Up string // SQL for applying migration + Down string // SQL for rollback (optional) + DataLoss bool // Warning flag for destructive changes +} +``` + +### Migration Process +1. **Version Check**: Compare current schema version with available migrations +2. **Backup**: Create automatic backup before destructive changes +3. **Transaction**: Wrap each migration in atomic transaction +4. **Validation**: Verify schema integrity after migration +5. **Logging**: Record successful migrations in `schema_info` + +### Data Loss Protection +- Migrations marked with `DataLoss: true` require explicit user consent +- Automatic backups created before destructive operations +- Warning messages displayed during upgrade process +- Rollback SQL provided where possible + +### Example Migration Sequence +```go +var migrations = []Migration{ + { + Version: 1, + Description: "Initial schema with aircraft history", + Up: createInitialSchema, + DataLoss: false, + }, + { + Version: 2, + Description: "Add OpenFlights airline and airport data", + Up: addAviationTables, + DataLoss: false, + }, + { + Version: 3, + Description: "Add callsign lookup cache", + Up: addCallsignCache, + DataLoss: false, + }, +} +``` + +## Data Sources and Loading + +SkyView supports multiple aviation data sources with automatic conflict resolution and license compliance. 
+ +### Supported Data Sources + +#### OpenFlights Airlines Database +- **Source**: https://openflights.org/data.html +- **License**: Open Database License (ODbL) 1.0 +- **Content**: Global airline data with ICAO/IATA codes, callsigns, and country information +- **Records**: ~6,162 airlines +- **Update Method**: Runtime download (no license confirmation required) + +#### OpenFlights Airports Database +- **Source**: https://openflights.org/data.html +- **License**: Open Database License (ODbL) 1.0 +- **Content**: Global airport data with coordinates, codes, and metadata +- **Records**: ~7,698 airports +- **Update Method**: Runtime download + +#### OurAirports Database +- **Source**: https://ourairports.com/data/ +- **License**: Creative Commons Zero (CC0) 1.0 +- **Content**: Comprehensive airport database with detailed metadata +- **Records**: ~83,557 airports +- **Update Method**: Runtime download + +### Data Loading System + +#### Intelligent Conflict Resolution +The data loading system uses **INSERT OR REPLACE** upserts to handle overlapping data: + +```sql +INSERT OR REPLACE INTO airlines (id, name, alias, iata_code, icao_code, callsign, country, active, data_source) +VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?) +``` + +This ensures that: +- Duplicate records are automatically updated rather than causing errors +- Later data sources can override earlier ones +- Database integrity is maintained during bulk loads + +#### Loading Process +1. **Source Validation**: Verify data source accessibility and format +2. **Incremental Processing**: Process data in chunks to manage memory +3. **Error Handling**: Log and continue on individual record errors +4. **Statistics Reporting**: Track records processed, added, and errors +5. 
**Source Tracking**: Record metadata about each loaded source + +#### Performance Characteristics +- **OpenFlights Airlines**: ~6,162 records in ~363ms +- **OpenFlights Airports**: ~7,698 records in ~200ms +- **OurAirports**: ~83,557 records in ~980ms +- **Error Rate**: <0.1% under normal conditions + +## Configuration Integration + +### Database Configuration +```json +{ + "database": { + "path": "/var/lib/skyview-adsb/skyview.db", + "max_history_days": 7, + "backup_on_upgrade": true, + "vacuum_interval": "24h", + "page_size": 4096 + }, + "callsign": { + "enabled": true, + "cache_hours": 24, + "external_apis": true, + "privacy_mode": false + } +} +``` + +### Configuration Fields + +#### `database` +- **`path`**: Database file location (empty = auto-resolve) +- **`max_history_days`**: Retention policy for aircraft history (0 = unlimited) +- **`backup_on_upgrade`**: Create backup before schema migrations + +#### `callsign` +- **`enabled`**: Enable callsign enhancement features +- **`cache_hours`**: TTL for cached external API results +- **`privacy_mode`**: Disable all external data requests +- **`sources`**: Independent control for each data source + +### Enhanced Configuration Example +```json +{ + "callsign": { + "enabled": true, + "cache_hours": 24, + "privacy_mode": false, + "sources": { + "openflights_embedded": { + "enabled": true, + "priority": 1, + "license": "AGPL-3.0" + }, + "faa_registry": { + "enabled": false, + "priority": 2, + "update_frequency": "weekly", + "license": "public_domain" + }, + "opensky_api": { + "enabled": false, + "priority": 3, + "timeout_seconds": 5, + "max_retries": 2, + "requires_consent": true, + "license_warning": "Commercial use requires OpenSky Network consent", + "user_accepts_terms": false + }, + "custom_database": { + "enabled": false, + "priority": 4, + "path": "", + "license": "user_verified" + } + }, + "fallback_chain": ["openflights_embedded", "faa_registry", "opensky_api", "custom_database"] + } +} +``` + +#### 
Individual Source Configuration Options +- **`enabled`**: Enable/disable this specific source +- **`priority`**: Processing order (lower numbers = higher priority) +- **`license`**: License type for compliance tracking +- **`requires_consent`**: Whether source requires explicit user consent +- **`user_accepts_terms`**: User acknowledgment of licensing terms +- **`timeout_seconds`**: Per-source timeout configuration +- **`max_retries`**: Per-source retry limits +- **`update_frequency`**: For downloadable sources (daily/weekly/monthly) + +## Debian Package Integration + +### Package Structure +``` +/var/lib/skyview/ # Database directory +/etc/skyview/config.json # Default configuration +/usr/bin/skyview # Main application +/usr/share/skyview/ # Embedded resources +``` + +### Installation Process +1. **`postinst`**: Create directories, user accounts, permissions +2. **First Run**: Database initialization and migration on startup +3. **Upgrades**: Automatic schema migration with backup +4. **Service**: Systemd integration with proper database access + +### Service User +- User: `skyview-adsb` +- Home: `/var/lib/skyview-adsb` +- Shell: `/bin/false` (service account) +- Database: Read/write access to `/var/lib/skyview-adsb/` + +### Automatic Database Updates +The systemd service configuration includes automatic database updates on startup: + +```ini +[Service] +Type=simple +User=skyview-adsb +Group=skyview-adsb +# Update database before starting main service +ExecStartPre=/usr/bin/skyview-data -config /etc/skyview-adsb/config.json update +TimeoutStartSec=300 +ExecStart=/usr/bin/skyview -config /etc/skyview-adsb/config.json +``` + +This ensures aviation data sources are refreshed before each service start, complementing the weekly timer-based updates. 
+ +## Data Retention and Cleanup + +### Automatic Cleanup +- **Aircraft History**: Configurable retention period (`max_history_days`) +- **Cache Expiration**: TTL-based cleanup of external API cache +- **Optimization**: Periodic VACUUM operations for storage efficiency + +### Manual Maintenance +```sql +-- Clean old aircraft history (example: 7 days) +DELETE FROM aircraft_history +WHERE timestamp < datetime('now', '-7 days'); + +-- Clean expired cache entries +DELETE FROM callsign_cache +WHERE expires_at < datetime('now'); + +-- Optimize database storage +VACUUM; +``` + +## Database Optimization + +SkyView includes a comprehensive database optimization system that automatically manages storage efficiency and performance. + +### Optimization Features + +#### Automatic VACUUM Operations +- **Full VACUUM**: Rebuilds database to reclaim deleted space +- **Incremental VACUUM**: Gradual space reclamation with minimal performance impact +- **Scheduled Maintenance**: Configurable intervals for automatic optimization +- **Size Reporting**: Before/after statistics with space savings metrics + +#### Storage Optimization +- **Page Size Optimization**: Configurable SQLite page size for optimal performance +- **Auto-Vacuum Configuration**: Enables incremental space reclamation +- **Statistics Updates**: ANALYZE operations for query plan optimization +- **Efficiency Monitoring**: Real-time storage efficiency reporting + +### Using the Optimization System + +#### Command Line Interface +```bash +# Run comprehensive database optimization +skyview-data optimize + +# Run with force flag to skip confirmation prompts +skyview-data optimize --force + +# Check current optimization statistics +skyview-data optimize --stats-only +``` + +#### Optimization Output Example +``` +Optimizing database for storage efficiency... 
+✓ Auto VACUUM: Enable incremental auto-vacuum +✓ Incremental VACUUM: Reclaim free pages incrementally +✓ Optimize: Update SQLite query planner statistics +✓ Analyze: Update table statistics for better query plans + +VACUUM completed in 1.2s: 275.3 MB → 263.1 MB (saved 12.2 MB, 4.4%) + +Database optimization completed successfully. +Storage efficiency: 96.8% (263.1 MB used of 272.4 MB allocated) +``` + +#### Configuration Options +```json +{ + "database": { + "vacuum_interval": "24h", + "page_size": 4096, + "enable_compression": true, + "compression_level": 6 + } +} +``` + +### Optimization Statistics + +The optimization system provides detailed metrics about database performance: + +#### Available Statistics +- **Database Size**: Total file size in bytes +- **Page Statistics**: Page size, count, and utilization +- **Storage Efficiency**: Percentage of allocated space actually used +- **Free Space**: Amount of reclaimable space available +- **Auto-Vacuum Status**: Current auto-vacuum configuration +- **Last Optimization**: Timestamp of most recent optimization + +#### Programmatic Access +```go +// Get current optimization statistics +optimizer := NewOptimizationManager(db, config) +stats, err := optimizer.GetOptimizationStats() +if err != nil { + log.Fatal("Failed to get stats:", err) +} + +fmt.Printf("Database efficiency: %.1f%%\n", stats.Efficiency) +fmt.Printf("Storage used: %.1f MB\n", float64(stats.DatabaseSize)/(1024*1024)) +``` + +## Performance Considerations + +### Query Optimization +- Time-range queries use `idx_aircraft_history_icao_time` +- Callsign lookups prioritize local cache over external APIs +- Bulk operations use transactions for consistency + +### Storage Efficiency +- Configurable history limits prevent unbounded growth +- Automatic VACUUM operations with optimization reporting +- Compressed timestamps and efficient data types +- Page size optimization for storage efficiency +- Auto-vacuum configuration for incremental space reclamation + 
+### Memory Usage +- WAL mode for concurrent read/write access +- Connection pooling for multiple goroutines +- Prepared statements for repeated queries + +## Privacy and Security + +### Privacy Mode +SkyView includes comprehensive privacy controls through the `privacy_mode` configuration option: + +```json +{ + "callsign": { + "enabled": true, + "privacy_mode": true, + "external_apis": false + } +} +``` + +#### Privacy Mode Features +- **No External Calls**: Completely disables all external API requests +- **Local-Only Lookups**: Uses only embedded OpenFlights database for callsign enhancement +- **No Data Transmission**: Aircraft data never leaves the local system +- **Compliance**: Suitable for sensitive environments requiring air-gapped operation + +#### Privacy Mode Behavior +| Feature | Privacy Mode ON | Privacy Mode OFF | +|---------|----------------|------------------| +| External API calls | ❌ Disabled | ✅ Configurable | +| OpenFlights lookup | ✅ Enabled | ✅ Enabled | +| Callsign caching | ✅ Local only | ✅ Full caching | +| Data transmission | ❌ None | ⚠️ API calls only | + +#### Use Cases for Privacy Mode +- **Military installations**: No external data transmission allowed +- **Air-gapped networks**: No internet connectivity available +- **Corporate policies**: External API usage prohibited +- **Personal privacy**: User preference for local-only operation + +### Security Considerations + +#### File Permissions +- Database files readable only by skyview user/group +- Configuration files protected from unauthorized access +- Backup files inherit secure permissions + +#### Data Protection +- Local SQLite database with file-system level security +- No cloud storage or external database dependencies +- All aviation data processed and stored locally + +#### Network Security +- External API calls (when enabled) use HTTPS only +- No persistent connections to external services +- Optional certificate validation for API endpoints + +### Data Integrity +- Foreign 
key constraints where applicable +- Transaction isolation for concurrent operations +- Checksums for migration verification ## Troubleshooting -### Check Service Status +### Common Issues + +#### Database Locked +``` +Error: database is locked +``` +**Solution**: Stop SkyView service, check for stale lock files, restart + +#### Migration Failures +``` +Error: migration 3 failed: table already exists +``` +**Solution**: Check schema version, restore from backup, retry migration + +#### Permission Denied +``` +Error: unable to open database file +``` +**Solution**: Verify file permissions, check directory ownership, ensure disk space + +### Diagnostic Commands ```bash -systemctl status skyview-database-update.timer -journalctl -u skyview-database-update.service -f +# Check database integrity +sqlite3 /var/lib/skyview/skyview.db "PRAGMA integrity_check;" + +# View schema version +sqlite3 /var/lib/skyview/skyview.db "SELECT * FROM schema_info;" + +# Database statistics +sqlite3 /var/lib/skyview/skyview.db ".dbinfo" ``` -### Manual Database Reset -```bash -systemctl stop skyview-database-update.timer -skyview-data reset --force -skyview-data update -systemctl start skyview-database-update.timer +## Testing and Quality Assurance + +SkyView includes comprehensive test coverage for all database functionality to ensure reliability and data integrity. 
+ +### Test Coverage Areas + +#### Core Database Functionality +- **Database Creation and Initialization**: Connection management, configuration handling +- **Migration System**: Schema versioning, upgrade/downgrade operations +- **Connection Pooling**: Concurrent access, connection lifecycle management +- **SQLite Pragma Settings**: WAL mode, foreign keys, performance optimizations + +#### Data Loading and Management +- **Multi-Source Loading**: OpenFlights, OurAirports data integration +- **Conflict Resolution**: Upsert operations, duplicate handling +- **Error Handling**: Network failures, malformed data recovery +- **Performance Validation**: Loading speed, memory usage optimization + +#### Callsign Enhancement System +- **Parsing Logic**: Callsign validation, airline code extraction +- **Database Integration**: Local lookups, caching operations +- **Search Functionality**: Airline filtering, country-based queries +- **Cache Management**: TTL handling, cleanup operations + +#### Optimization System +- **VACUUM Operations**: Space reclamation, performance monitoring +- **Page Size Optimization**: Configuration validation, storage efficiency +- **Statistics Generation**: Metrics accuracy, reporting consistency +- **Maintenance Scheduling**: Automated optimization, interval management + +### Test Infrastructure + +#### Automated Test Setup +```go +// setupTestDatabase creates isolated test environment +func setupTestDatabase(t *testing.T) (*Database, func()) { + tempFile, _ := os.CreateTemp("", "test_skyview_*.db") + config := &Config{Path: tempFile.Name()} + db, _ := NewDatabase(config) + db.Initialize() // Run all migrations + + cleanup := func() { + db.Close() + os.Remove(tempFile.Name()) + } + return db, cleanup +} ``` -### Permissions Issues +#### Network-Safe Testing +Tests gracefully handle network connectivity issues: +- Skip tests requiring external data sources when offline +- Provide meaningful error messages for connectivity failures +- Use local test 
data when external sources are unavailable + +### Running Tests + ```bash -sudo chown skyview:skyview /var/lib/skyview/ -sudo chmod 755 /var/lib/skyview/ +# Run all database tests +go test -v ./internal/database/... + +# Run tests in short mode (skip long-running network tests) +go test -v -short ./internal/database/... + +# Run specific test categories +go test -v -run="TestDatabase" ./internal/database/... +go test -v -run="TestOptimization" ./internal/database/... +go test -v -run="TestCallsign" ./internal/database/... ``` -## Files and Directories +## Future Enhancements -- `/usr/bin/skyview-data` - Database management command -- `/var/lib/skyview/skyview.db` - Database file -- `/usr/share/skyview/scripts/update-database.sh` - Cron helper script -- `/lib/systemd/system/skyview-database-update.*` - Systemd timer files +### Planned Features +- **Compression**: Time-series compression for long-term storage +- **Partitioning**: Date-based partitioning for large datasets +- **Replication**: Read replica support for high-availability setups +- **Analytics**: Built-in reporting and statistics tables +- **Enhanced Route Data**: Integration with additional flight tracking APIs +- **Geographic Indexing**: Spatial queries for airport proximity searches -For detailed information, see `man skyview-data`. 
\ No newline at end of file +### Migration Path +- All enhancements will use versioned migrations +- Backward compatibility maintained for existing installations +- Data preservation prioritized over schema optimization +- Comprehensive testing required for all schema changes \ No newline at end of file diff --git a/docs/DATABASE.md b/docs/DATABASE.md index 280d603..326f1cf 100644 --- a/docs/DATABASE.md +++ b/docs/DATABASE.md @@ -49,7 +49,7 @@ Stores time-series aircraft position and message data: ```sql CREATE TABLE aircraft_history ( id INTEGER PRIMARY KEY AUTOINCREMENT, - icao_hex TEXT NOT NULL, + icao TEXT NOT NULL, timestamp TIMESTAMP NOT NULL, latitude REAL, longitude REAL, @@ -59,9 +59,8 @@ CREATE TABLE aircraft_history ( vertical_rate INTEGER, squawk TEXT, callsign TEXT, - source_id TEXT, - signal_strength REAL, - message_count INTEGER DEFAULT 1 + source_id TEXT NOT NULL, + signal_strength REAL ); ``` @@ -71,66 +70,123 @@ CREATE TABLE aircraft_history ( - `idx_aircraft_history_callsign`: Callsign-based searches #### `airlines` -OpenFlights embedded airline database: +Multi-source airline database with unified schema: ```sql CREATE TABLE airlines ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, alias TEXT, - iata TEXT, - icao TEXT, + iata_code TEXT, + icao_code TEXT, callsign TEXT, country TEXT, - active BOOLEAN DEFAULT 1 + country_code TEXT, + active BOOLEAN DEFAULT 1, + data_source TEXT NOT NULL DEFAULT 'unknown', + source_id TEXT, + imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); ``` **Indexes:** -- `idx_airlines_icao`: ICAO code lookup (primary for callsign enhancement) -- `idx_airlines_iata`: IATA code lookup +- `idx_airlines_icao_code`: ICAO code lookup (primary for callsign enhancement) +- `idx_airlines_iata_code`: IATA code lookup +- `idx_airlines_callsign`: Radio callsign lookup +- `idx_airlines_country_code`: Country-based filtering +- `idx_airlines_active`: Active airlines filtering +- 
`idx_airlines_source`: Data source tracking #### `airports` -OpenFlights embedded airport database: +Multi-source airport database with comprehensive metadata: ```sql CREATE TABLE airports ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, + ident TEXT, + type TEXT, city TEXT, + municipality TEXT, + region TEXT, country TEXT, - iata TEXT, - icao TEXT, + country_code TEXT, + continent TEXT, + iata_code TEXT, + icao_code TEXT, + local_code TEXT, + gps_code TEXT, latitude REAL, longitude REAL, - altitude INTEGER, + elevation_ft INTEGER, + scheduled_service BOOLEAN DEFAULT 0, + home_link TEXT, + wikipedia_link TEXT, + keywords TEXT, timezone_offset REAL, + timezone TEXT, dst_type TEXT, - timezone TEXT + data_source TEXT NOT NULL DEFAULT 'unknown', + source_id TEXT, + imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); ``` **Indexes:** -- `idx_airports_icao`: ICAO code lookup -- `idx_airports_iata`: IATA code lookup +- `idx_airports_icao_code`: ICAO code lookup +- `idx_airports_iata_code`: IATA code lookup +- `idx_airports_ident`: Airport identifier lookup +- `idx_airports_country_code`: Country-based filtering +- `idx_airports_type`: Airport type filtering +- `idx_airports_coords`: Geographic coordinate queries +- `idx_airports_source`: Data source tracking #### `callsign_cache` -Caches external API lookups for callsign enhancement: +Caches external API lookups and local enrichment for callsign enhancement: ```sql CREATE TABLE callsign_cache ( callsign TEXT PRIMARY KEY, airline_icao TEXT, + airline_iata TEXT, airline_name TEXT, + airline_country TEXT, flight_number TEXT, - origin_iata TEXT, - destination_iata TEXT, + origin_iata TEXT, -- Departure airport IATA code + destination_iata TEXT, -- Arrival airport IATA code aircraft_type TEXT, + route TEXT, -- Full route description + status TEXT, -- Flight status (scheduled, delayed, etc.) 
+ source TEXT NOT NULL DEFAULT 'local', cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, - expires_at TIMESTAMP, - source TEXT DEFAULT 'local' + expires_at TIMESTAMP NOT NULL ); ``` +**Route Information Fields:** +- **`origin_iata`**: IATA code of departure airport (e.g., "JFK" for New York JFK) +- **`destination_iata`**: IATA code of arrival airport (e.g., "LAX" for Los Angeles) +- **`route`**: Human-readable route description (e.g., "JFK-LAX" or "New York to Los Angeles") +- **`status`**: Current flight status when available from external APIs + +These fields enable enhanced flight tracking with origin-destination pairs and route visualization. + **Indexes:** - `idx_callsign_cache_expires`: Efficient cache cleanup +- `idx_callsign_cache_airline`: Airline-based queries + +#### `data_sources` +Tracks loaded external data sources and their metadata: +```sql +CREATE TABLE data_sources ( + name TEXT PRIMARY KEY, + license TEXT NOT NULL, + url TEXT, + version TEXT, + imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + record_count INTEGER DEFAULT 0, + user_accepted_license BOOLEAN DEFAULT 0 +); +``` ## Database Location Strategy @@ -195,15 +251,72 @@ var migrations = []Migration{ } ``` +## Data Sources and Loading + +SkyView supports multiple aviation data sources with automatic conflict resolution and license compliance. 
+
+### Supported Data Sources
+
+#### OpenFlights Airlines Database
+- **Source**: https://openflights.org/data.html
+- **License**: Open Database License (ODbL) 1.0
+- **Content**: Global airline data with ICAO/IATA codes, callsigns, and country information
+- **Records**: ~6,162 airlines
+- **Update Method**: Runtime download (no license confirmation required)
+
+#### OpenFlights Airports Database
+- **Source**: https://openflights.org/data.html
+- **License**: Open Database License (ODbL) 1.0
+- **Content**: Global airport data with coordinates, codes, and metadata
+- **Records**: ~7,698 airports
+- **Update Method**: Runtime download
+
+#### OurAirports Database
+- **Source**: https://ourairports.com/data/
+- **License**: Creative Commons Zero (CC0) 1.0
+- **Content**: Comprehensive airport database with detailed metadata
+- **Records**: ~83,557 airports
+- **Update Method**: Runtime download
+
+### Data Loading System
+
+#### Intelligent Conflict Resolution
+The data loading system uses **INSERT OR REPLACE** upserts to handle overlapping data:
+
+```sql
+INSERT OR REPLACE INTO airlines (id, name, alias, iata_code, icao_code, callsign, country, active, data_source)
+VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
+```
+
+This ensures that:
+- Duplicate records are automatically updated rather than causing errors
+- Later data sources can override earlier ones
+- Database integrity is maintained during bulk loads
+
+#### Loading Process
+1. **Source Validation**: Verify data source accessibility and format
+2. **Incremental Processing**: Process data in chunks to manage memory
+3. **Error Handling**: Log and continue on individual record errors
+4. **Statistics Reporting**: Track records processed, added, and errors
+5. **Source Tracking**: Record metadata about each loaded source
+
+#### Performance Characteristics
+- **OpenFlights Airlines**: ~6,162 records in ~363ms
+- **OpenFlights Airports**: ~7,698 records in ~200ms
+- **OurAirports**: ~83,557 records in ~980ms
+- **Error Rate**: <0.1% under normal conditions
+
 ## Configuration Integration
 
 ### Database Configuration
 ```json
 {
   "database": {
-    "path": "/var/lib/skyview/skyview.db",
+    "path": "/var/lib/skyview-adsb/skyview.db",
     "max_history_days": 7,
-    "backup_on_upgrade": true
+    "backup_on_upgrade": true,
+    "vacuum_interval": "24h",
+    "page_size": 4096
   },
   "callsign": {
     "enabled": true,
@@ -294,10 +407,26 @@ var migrations = []Migration{
 4. **Service**: Systemd integration with proper database access
 
 ### Service User
-- User: `skyview`
-- Home: `/var/lib/skyview`
+- User: `skyview-adsb`
+- Home: `/var/lib/skyview-adsb`
 - Shell: `/bin/false` (service account)
-- Database: Read/write access to `/var/lib/skyview/`
+- Database: Read/write access to `/var/lib/skyview-adsb/`
+
+### Automatic Database Updates
+The systemd service configuration includes automatic database updates on startup:
+
+```ini
+[Service]
+Type=simple
+User=skyview-adsb
+Group=skyview-adsb
+# Update database before starting main service
+ExecStartPre=/usr/bin/skyview-data -config /etc/skyview-adsb/config.json update
+TimeoutStartSec=300
+ExecStart=/usr/bin/skyview -config /etc/skyview-adsb/config.json
+```
+
+This ensures aviation data sources are refreshed before each service start, complementing the weekly timer-based updates.
 
 ## Data Retention and Cleanup
 
@@ -320,6 +449,89 @@ WHERE expires_at < datetime('now');
 VACUUM;
 ```
 
+## Database Optimization
+
+SkyView includes a comprehensive database optimization system that automatically manages storage efficiency and performance.
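A storage-efficiency figure of this kind can be derived from three values that SQLite exposes through `PRAGMA page_size`, `PRAGMA page_count`, and `PRAGMA freelist_count`. The following is a minimal sketch, assuming efficiency is defined as the share of allocated pages that are not on the freelist. The `PageStats` type and its methods are hypothetical names chosen for illustration and are not SkyView's actual API; the sample numbers are made up.

```go
package main

import "fmt"

// PageStats holds the values reported by SQLite's page_size, page_count
// and freelist_count pragmas. Hypothetical type for this sketch only.
type PageStats struct {
	PageSize      int64 // bytes per page
	PageCount     int64 // total pages in the database file
	FreelistCount int64 // pages that are allocated but currently unused
}

// AllocatedBytes is the total space covered by the database's pages.
func (s PageStats) AllocatedBytes() int64 {
	return s.PageSize * s.PageCount
}

// ReclaimableBytes is the space held by free pages, which a VACUUM
// (full or incremental) could return to the filesystem.
func (s PageStats) ReclaimableBytes() int64 {
	return s.PageSize * s.FreelistCount
}

// Efficiency returns the percentage of allocated pages that are in use.
// An empty database is treated as fully efficient.
func (s PageStats) Efficiency() float64 {
	if s.PageCount == 0 {
		return 100
	}
	used := s.PageCount - s.FreelistCount
	return float64(used) * 100 / float64(s.PageCount)
}

func main() {
	s := PageStats{PageSize: 4096, PageCount: 1000, FreelistCount: 50}
	fmt.Printf("allocated=%d bytes, reclaimable=%d bytes, efficiency=%.1f%%\n",
		s.AllocatedBytes(), s.ReclaimableBytes(), s.Efficiency())
	// prints "allocated=4096000 bytes, reclaimable=204800 bytes, efficiency=95.0%"
}
```

Keeping the calculation separate from the database handle makes it easy to unit test without opening a real SQLite file.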
+
+### Optimization Features
+
+#### Automatic VACUUM Operations
+- **Full VACUUM**: Rebuilds database to reclaim deleted space
+- **Incremental VACUUM**: Gradual space reclamation with minimal performance impact
+- **Scheduled Maintenance**: Configurable intervals for automatic optimization
+- **Size Reporting**: Before/after statistics with space savings metrics
+
+#### Storage Optimization
+- **Page Size Optimization**: Configurable SQLite page size for optimal performance
+- **Auto-Vacuum Configuration**: Enables incremental space reclamation
+- **Statistics Updates**: ANALYZE operations for query plan optimization
+- **Efficiency Monitoring**: Real-time storage efficiency reporting
+
+### Using the Optimization System
+
+#### Command Line Interface
+```bash
+# Run comprehensive database optimization
+skyview-data optimize
+
+# Run with force flag to skip confirmation prompts
+skyview-data optimize --force
+
+# Check current optimization statistics
+skyview-data optimize --stats-only
+```
+
+#### Optimization Output Example
+```
+Optimizing database for storage efficiency...
+✓ Auto VACUUM: Enable incremental auto-vacuum
+✓ Incremental VACUUM: Reclaim free pages incrementally
+✓ Optimize: Update SQLite query planner statistics
+✓ Analyze: Update table statistics for better query plans
+
+VACUUM completed in 1.2s: 275.3 MB → 263.1 MB (saved 12.2 MB, 4.4%)
+
+Database optimization completed successfully.
+Storage efficiency: 96.8% (263.1 MB used of 272.4 MB allocated)
+```
+
+#### Configuration Options
+```json
+{
+  "database": {
+    "vacuum_interval": "24h",
+    "page_size": 4096,
+    "enable_compression": true,
+    "compression_level": 6
+  }
+}
+```
+
+### Optimization Statistics
+
+The optimization system provides detailed metrics about database performance:
+
+#### Available Statistics
+- **Database Size**: Total file size in bytes
+- **Page Statistics**: Page size, count, and utilization
+- **Storage Efficiency**: Percentage of allocated space actually used
+- **Free Space**: Amount of reclaimable space available
+- **Auto-Vacuum Status**: Current auto-vacuum configuration
+- **Last Optimization**: Timestamp of most recent optimization
+
+#### Programmatic Access
+```go
+// Get current optimization statistics
+optimizer := NewOptimizationManager(db, config)
+stats, err := optimizer.GetOptimizationStats()
+if err != nil {
+	log.Fatal("Failed to get stats:", err)
+}
+
+fmt.Printf("Database efficiency: %.1f%%\n", stats.Efficiency)
+fmt.Printf("Storage used: %.1f MB\n", float64(stats.DatabaseSize)/(1024*1024))
+```
+
 ## Performance Considerations
 
 ### Query Optimization
@@ -329,8 +541,10 @@ VACUUM;
 
 ### Storage Efficiency
 - Configurable history limits prevent unbounded growth
-- Periodic VACUUM operations reclaim deleted space
+- Automatic VACUUM operations with optimization reporting
 - Compressed timestamps and efficient data types
+- Page size optimization for storage efficiency
+- Auto-vacuum configuration for incremental space reclamation
 
 ### Memory Usage
 - WAL mode for concurrent read/write access
@@ -428,6 +642,76 @@ sqlite3 /var/lib/skyview/skyview.db "SELECT * FROM schema_info;"
 sqlite3 /var/lib/skyview/skyview.db ".dbinfo"
 ```
 
+## Testing and Quality Assurance
+
+SkyView includes comprehensive test coverage for all database functionality to ensure reliability and data integrity.
+
+### Test Coverage Areas
+
+#### Core Database Functionality
+- **Database Creation and Initialization**: Connection management, configuration handling
+- **Migration System**: Schema versioning, upgrade/downgrade operations
+- **Connection Pooling**: Concurrent access, connection lifecycle management
+- **SQLite Pragma Settings**: WAL mode, foreign keys, performance optimizations
+
+#### Data Loading and Management
+- **Multi-Source Loading**: OpenFlights, OurAirports data integration
+- **Conflict Resolution**: Upsert operations, duplicate handling
+- **Error Handling**: Network failures, malformed data recovery
+- **Performance Validation**: Loading speed, memory usage optimization
+
+#### Callsign Enhancement System
+- **Parsing Logic**: Callsign validation, airline code extraction
+- **Database Integration**: Local lookups, caching operations
+- **Search Functionality**: Airline filtering, country-based queries
+- **Cache Management**: TTL handling, cleanup operations
+
+#### Optimization System
+- **VACUUM Operations**: Space reclamation, performance monitoring
+- **Page Size Optimization**: Configuration validation, storage efficiency
+- **Statistics Generation**: Metrics accuracy, reporting consistency
+- **Maintenance Scheduling**: Automated optimization, interval management
+
+### Test Infrastructure
+
+#### Automated Test Setup
+```go
+// setupTestDatabase creates isolated test environment
+func setupTestDatabase(t *testing.T) (*Database, func()) {
+	tempFile, _ := os.CreateTemp("", "test_skyview_*.db")
+	config := &Config{Path: tempFile.Name()}
+	db, _ := NewDatabase(config)
+	db.Initialize() // Run all migrations
+
+	cleanup := func() {
+		db.Close()
+		os.Remove(tempFile.Name())
+	}
+	return db, cleanup
+}
+```
+
+#### Network-Safe Testing
+Tests gracefully handle network connectivity issues:
+- Skip tests requiring external data sources when offline
+- Provide meaningful error messages for connectivity failures
+- Use local test data when external sources are unavailable
+
+### Running Tests
+
+```bash
+# Run all database tests
+go test -v ./internal/database/...
+
+# Run tests in short mode (skip long-running network tests)
+go test -v -short ./internal/database/...
+
+# Run specific test categories
+go test -v -run="TestDatabase" ./internal/database/...
+go test -v -run="TestOptimization" ./internal/database/...
+go test -v -run="TestCallsign" ./internal/database/...
+```
+
 ## Future Enhancements
 
 ### Planned Features
@@ -435,8 +719,11 @@ sqlite3 /var/lib/skyview/skyview.db ".dbinfo"
 - **Partitioning**: Date-based partitioning for large datasets
 - **Replication**: Read replica support for high-availability setups
 - **Analytics**: Built-in reporting and statistics tables
+- **Enhanced Route Data**: Integration with additional flight tracking APIs
+- **Geographic Indexing**: Spatial queries for airport proximity searches
 
 ### Migration Path
 - All enhancements will use versioned migrations
 - Backward compatibility maintained for existing installations
-- Data preservation prioritized over schema optimization
\ No newline at end of file
+- Data preservation prioritized over schema optimization
+- Comprehensive testing required for all schema changes
\ No newline at end of file