# SkyView Database Architecture

This document describes SkyView's SQLite database architecture, migration system, and integration approach for persistent data storage.

## Overview

SkyView uses a single SQLite database to store:
- **Historic aircraft data**: Position history, message counts, signal strength
- **Callsign lookup data**: Cached airline/airport information from external APIs
- **Embedded aviation data**: OpenFlights airline and airport databases

## Database Design Principles

### Embedded Architecture
- Single SQLite file for all persistent data
- No external database dependencies
- Self-contained deployment with embedded schemas
- Backward compatibility through versioned migrations

### Performance Optimization
- Strategic indexing for time-series aircraft data
- Efficient lookups for callsign enhancement
- Configurable data retention policies
- Query optimization for real-time operations

### Data Safety
- Atomic migration transactions
- Pre-migration backups for destructive changes
- Data loss warnings for schema changes
- Rollback capabilities where possible

## Database Schema

### Core Tables

#### `schema_info`

Tracks database version and applied migrations:

```sql
CREATE TABLE schema_info (
    version INTEGER PRIMARY KEY,
    applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    description TEXT,
    checksum TEXT
);
```

#### `aircraft_history`

Stores time-series aircraft position and message data:

```sql
CREATE TABLE aircraft_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    icao TEXT NOT NULL,
    timestamp TIMESTAMP NOT NULL,
    latitude REAL,
    longitude REAL,
    altitude INTEGER,
    speed INTEGER,
    track INTEGER,
    vertical_rate INTEGER,
    squawk TEXT,
    callsign TEXT,
    source_id TEXT NOT NULL,
    signal_strength REAL
);
```

**Indexes:**
- `idx_aircraft_history_icao_time`: Fast queries by aircraft and time range
- `idx_aircraft_history_timestamp`: Time-based cleanup and queries
- `idx_aircraft_history_callsign`: Callsign-based searches

#### `airlines`

Multi-source airline database with
unified schema:

```sql
CREATE TABLE airlines (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    alias TEXT,
    iata_code TEXT,
    icao_code TEXT,
    callsign TEXT,
    country TEXT,
    country_code TEXT,
    active BOOLEAN DEFAULT 1,
    data_source TEXT NOT NULL DEFAULT 'unknown',
    source_id TEXT,
    imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

**Indexes:**
- `idx_airlines_icao_code`: ICAO code lookup (primary for callsign enhancement)
- `idx_airlines_iata_code`: IATA code lookup
- `idx_airlines_callsign`: Radio callsign lookup
- `idx_airlines_country_code`: Country-based filtering
- `idx_airlines_active`: Active airlines filtering
- `idx_airlines_source`: Data source tracking

#### `airports`

Multi-source airport database with comprehensive metadata:

```sql
CREATE TABLE airports (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    ident TEXT,
    type TEXT,
    city TEXT,
    municipality TEXT,
    region TEXT,
    country TEXT,
    country_code TEXT,
    continent TEXT,
    iata_code TEXT,
    icao_code TEXT,
    local_code TEXT,
    gps_code TEXT,
    latitude REAL,
    longitude REAL,
    elevation_ft INTEGER,
    scheduled_service BOOLEAN DEFAULT 0,
    home_link TEXT,
    wikipedia_link TEXT,
    keywords TEXT,
    timezone_offset REAL,
    timezone TEXT,
    dst_type TEXT,
    data_source TEXT NOT NULL DEFAULT 'unknown',
    source_id TEXT,
    imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

**Indexes:**
- `idx_airports_icao_code`: ICAO code lookup
- `idx_airports_iata_code`: IATA code lookup
- `idx_airports_ident`: Airport identifier lookup
- `idx_airports_country_code`: Country-based filtering
- `idx_airports_type`: Airport type filtering
- `idx_airports_coords`: Geographic coordinate queries
- `idx_airports_source`: Data source tracking

#### `callsign_cache`

Caches external API lookups and local enrichment for callsign enhancement:

```sql
CREATE TABLE callsign_cache (
    callsign TEXT PRIMARY KEY,
    airline_icao TEXT,
    airline_iata TEXT,
    airline_name TEXT,
    airline_country TEXT,
    flight_number TEXT,
    origin_iata TEXT,        -- Departure airport IATA code
    destination_iata TEXT,   -- Arrival airport IATA code
    aircraft_type TEXT,
    route TEXT,              -- Full route description
    status TEXT,             -- Flight status (scheduled, delayed, etc.)
    source TEXT NOT NULL DEFAULT 'local',
    cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL
);
```

**Route Information Fields:**
- **`origin_iata`**: IATA code of departure airport (e.g., "JFK" for New York JFK)
- **`destination_iata`**: IATA code of arrival airport (e.g., "LAX" for Los Angeles)
- **`route`**: Human-readable route description (e.g., "JFK-LAX" or "New York to Los Angeles")
- **`status`**: Current flight status when available from external APIs

These fields enable enhanced flight tracking with origin-destination pairs and route visualization.

**Indexes:**
- `idx_callsign_cache_expires`: Efficient cache cleanup
- `idx_callsign_cache_airline`: Airline-based queries

#### `data_sources`

Tracks loaded external data sources and their metadata:

```sql
CREATE TABLE data_sources (
    name TEXT PRIMARY KEY,
    license TEXT NOT NULL,
    url TEXT,
    version TEXT,
    imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    record_count INTEGER DEFAULT 0,
    user_accepted_license BOOLEAN DEFAULT 0
);
```

## Database Location Strategy

### Path Resolution Order

1. **Explicit configuration**: `database.path` in config file
2. **System service**: `/var/lib/skyview/skyview.db`
3. **User mode**: `~/.local/share/skyview/skyview.db`
4. **Fallback**: `./skyview.db` in current directory

### Directory Permissions
- System: `root:root` with `755` permissions for `/var/lib/skyview/`
- User: User-owned directories with standard permissions
- Service: `skyview:skyview` user/group for system service

## Migration System

### Migration Structure

```go
type Migration struct {
    Version     int    // Sequential version number
    Description string // Human-readable description
    Up          string // SQL for applying migration
    Down        string // SQL for rollback (optional)
    DataLoss    bool   // Warning flag for destructive changes
}
```

### Migration Process

1. **Version Check**: Compare current schema version with available migrations
2. **Backup**: Create automatic backup before destructive changes
3. **Transaction**: Wrap each migration in atomic transaction
4. **Validation**: Verify schema integrity after migration
5. **Logging**: Record successful migrations in `schema_info`

### Data Loss Protection
- Migrations marked with `DataLoss: true` require explicit user consent
- Automatic backups created before destructive operations
- Warning messages displayed during upgrade process
- Rollback SQL provided where possible

### Example Migration Sequence

```go
var migrations = []Migration{
    {
        Version:     1,
        Description: "Initial schema with aircraft history",
        Up:          createInitialSchema,
        DataLoss:    false,
    },
    {
        Version:     2,
        Description: "Add OpenFlights airline and airport data",
        Up:          addAviationTables,
        DataLoss:    false,
    },
    {
        Version:     3,
        Description: "Add callsign lookup cache",
        Up:          addCallsignCache,
        DataLoss:    false,
    },
}
```

## Data Sources and Loading

SkyView supports multiple aviation data sources with automatic conflict resolution and license compliance.

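
As a rough illustration of the conflict resolution mentioned here, the following Go sketch models upsert behaviour in memory. It assumes conflicts are detected on the primary key `id`, as with SQLite's `INSERT OR REPLACE`; the `Airline` struct, the `upsertAll` helper, and the sample records are hypothetical and are not taken from SkyView's code.

```go
package main

import "fmt"

// Airline is a trimmed-down, hypothetical stand-in for a row in the airlines table.
type Airline struct {
	ID         int
	Name       string
	ICAOCode   string
	DataSource string
}

// upsertAll mimics INSERT OR REPLACE keyed on the primary key: a record whose
// ID already exists replaces the stored one, otherwise it is added.
// It returns how many records were added and how many were replaced.
func upsertAll(table map[int]Airline, records []Airline) (added, replaced int) {
	for _, r := range records {
		if _, exists := table[r.ID]; exists {
			replaced++
		} else {
			added++
		}
		table[r.ID] = r
	}
	return added, replaced
}

func main() {
	table := make(map[int]Airline)

	// Two made-up sources that overlap on ID 2.
	first := []Airline{
		{ID: 1, Name: "Example Air", ICAOCode: "EXA", DataSource: "source_a"},
		{ID: 2, Name: "Sample Jet", ICAOCode: "SMP", DataSource: "source_a"},
	}
	second := []Airline{
		{ID: 2, Name: "Sample Jet Ltd", ICAOCode: "SMP", DataSource: "source_b"},
	}

	a, r := upsertAll(table, first)
	fmt.Printf("first load: added=%d replaced=%d\n", a, r)
	a, r = upsertAll(table, second)
	fmt.Printf("second load: added=%d replaced=%d\n", a, r)
	fmt.Printf("row 2 now comes from %s\n", table[2].DataSource)
}
```

Loading `second` after `first` replaces row 2, so the source loaded later wins. A real loader would presumably run the equivalent SQL inside a transaction rather than against a map.
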
### Supported Data Sources

#### OpenFlights Airlines Database
- **Source**: https://openflights.org/data.html
- **License**: Open Database License (ODbL) 1.0
- **Content**: Global airline data with ICAO/IATA codes, callsigns, and country information
- **Records**: ~6,162 airlines
- **Update Method**: Runtime download (no license confirmation required)

#### OpenFlights Airports Database
- **Source**: https://openflights.org/data.html
- **License**: Open Database License (ODbL) 1.0
- **Content**: Global airport data with coordinates, codes, and metadata
- **Records**: ~7,698 airports
- **Update Method**: Runtime download

#### OurAirports Database
- **Source**: https://ourairports.com/data/
- **License**: Creative Commons Zero (CC0) 1.0
- **Content**: Comprehensive airport database with detailed metadata
- **Records**: ~83,557 airports
- **Update Method**: Runtime download

### Data Loading System

#### Intelligent Conflict Resolution

The data loading system uses **INSERT OR REPLACE** upserts to handle overlapping data:

```sql
INSERT OR REPLACE INTO airlines (id, name, alias, iata_code, icao_code, callsign, country, active, data_source)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
```

This ensures that:
- Duplicate records are automatically updated rather than causing errors
- Later data sources can override earlier ones
- Database integrity is maintained during bulk loads

#### Loading Process

1. **Source Validation**: Verify data source accessibility and format
2. **Incremental Processing**: Process data in chunks to manage memory
3. **Error Handling**: Log and continue on individual record errors
4. **Statistics Reporting**: Track records processed, added, and errors
5. **Source Tracking**: Record metadata about each loaded source

#### Performance Characteristics
- **OpenFlights Airlines**: ~6,162 records in ~363ms
- **OpenFlights Airports**: ~7,698 records in ~200ms
- **OurAirports**: ~83,557 records in ~980ms
- **Error Rate**: <0.1% under normal conditions

## Configuration Integration

### Database Configuration

```json
{
  "database": {
    "path": "/var/lib/skyview-adsb/skyview.db",
    "max_history_days": 7,
    "backup_on_upgrade": true,
    "vacuum_interval": "24h",
    "page_size": 4096
  },
  "callsign": {
    "enabled": true,
    "cache_hours": 24,
    "external_apis": true,
    "privacy_mode": false
  }
}
```

### Configuration Fields

#### `database`
- **`path`**: Database file location (empty = auto-resolve)
- **`max_history_days`**: Retention policy for aircraft history (0 = unlimited)
- **`backup_on_upgrade`**: Create backup before schema migrations

#### `callsign`
- **`enabled`**: Enable callsign enhancement features
- **`cache_hours`**: TTL for cached external API results
- **`privacy_mode`**: Disable all external data requests
- **`sources`**: Independent control for each data source

### Enhanced Configuration Example

```json
{
  "callsign": {
    "enabled": true,
    "cache_hours": 24,
    "privacy_mode": false,
    "sources": {
      "openflights_embedded": {
        "enabled": true,
        "priority": 1,
        "license": "AGPL-3.0"
      },
      "faa_registry": {
        "enabled": false,
        "priority": 2,
        "update_frequency": "weekly",
        "license": "public_domain"
      },
      "opensky_api": {
        "enabled": false,
        "priority": 3,
        "timeout_seconds": 5,
        "max_retries": 2,
        "requires_consent": true,
        "license_warning": "Commercial use requires OpenSky Network consent",
        "user_accepts_terms": false
      },
      "custom_database": {
        "enabled": false,
        "priority": 4,
        "path": "",
        "license": "user_verified"
      }
    },
    "fallback_chain": ["openflights_embedded", "faa_registry", "opensky_api", "custom_database"]
  }
}
```

#### Individual Source Configuration Options
- **`enabled`**: Enable/disable this specific source
- **`priority`**: Processing order (lower numbers =
higher priority)
- **`license`**: License type for compliance tracking
- **`requires_consent`**: Whether source requires explicit user consent
- **`user_accepts_terms`**: User acknowledgment of licensing terms
- **`timeout_seconds`**: Per-source timeout configuration
- **`max_retries`**: Per-source retry limits
- **`update_frequency`**: For downloadable sources (daily/weekly/monthly)

## Debian Package Integration

### Package Structure

```
/var/lib/skyview/              # Database directory
/etc/skyview/config.json       # Default configuration
/usr/bin/skyview               # Main application
/usr/share/skyview/            # Embedded resources
```

### Installation Process

1. **`postinst`**: Create directories, user accounts, permissions
2. **First Run**: Database initialization and migration on startup
3. **Upgrades**: Automatic schema migration with backup
4. **Service**: Systemd integration with proper database access

### Service User
- User: `skyview-adsb`
- Home: `/var/lib/skyview-adsb`
- Shell: `/bin/false` (service account)
- Database: Read/write access to `/var/lib/skyview-adsb/`

### Automatic Database Updates

The systemd service configuration includes automatic database updates on startup:

```ini
[Service]
Type=simple
User=skyview-adsb
Group=skyview-adsb
# Update database before starting main service
ExecStartPre=/usr/bin/skyview-data -config /etc/skyview-adsb/config.json update
TimeoutStartSec=300
ExecStart=/usr/bin/skyview -config /etc/skyview-adsb/config.json
```

This ensures aviation data sources are refreshed before each service start, complementing the weekly timer-based updates.

## Data Retention and Cleanup

### Automatic Cleanup
- **Aircraft History**: Configurable retention period (`max_history_days`)
- **Cache Expiration**: TTL-based cleanup of external API cache
- **Optimization**: Periodic VACUUM operations for storage efficiency

### Manual Maintenance

```sql
-- Clean old aircraft history (example: 7 days)
DELETE FROM aircraft_history WHERE timestamp < datetime('now', '-7 days');

-- Clean expired cache entries
DELETE FROM callsign_cache WHERE expires_at < datetime('now');

-- Optimize database storage
VACUUM;
```

## Database Optimization

SkyView includes a comprehensive database optimization system that automatically manages storage efficiency and performance.

### Optimization Features

#### Automatic VACUUM Operations
- **Full VACUUM**: Rebuilds database to reclaim deleted space
- **Incremental VACUUM**: Gradual space reclamation with minimal performance impact
- **Scheduled Maintenance**: Configurable intervals for automatic optimization
- **Size Reporting**: Before/after statistics with space savings metrics

#### Storage Optimization
- **Page Size Optimization**: Configurable SQLite page size for optimal performance
- **Auto-Vacuum Configuration**: Enables incremental space reclamation
- **Statistics Updates**: ANALYZE operations for query plan optimization
- **Efficiency Monitoring**: Real-time storage efficiency reporting

### Using the Optimization System

#### Command Line Interface

```bash
# Run comprehensive database optimization
skyview-data optimize

# Run with force flag to skip confirmation prompts
skyview-data optimize --force

# Check current optimization statistics
skyview-data optimize --stats-only
```

#### Optimization Output Example

```
Optimizing database for storage efficiency...
✓ Auto VACUUM: Enable incremental auto-vacuum
✓ Incremental VACUUM: Reclaim free pages incrementally
✓ Optimize: Update SQLite query planner statistics
✓ Analyze: Update table statistics for better query plans

VACUUM completed in 1.2s: 275.3 MB → 263.1 MB (saved 12.2 MB, 4.4%)

Database optimization completed successfully.
Storage efficiency: 96.8% (263.1 MB used of 272.4 MB allocated)
```

#### Configuration Options

```json
{
  "database": {
    "vacuum_interval": "24h",
    "page_size": 4096,
    "enable_compression": true,
    "compression_level": 6
  }
}
```

### Optimization Statistics

The optimization system provides detailed metrics about database performance:

#### Available Statistics
- **Database Size**: Total file size in bytes
- **Page Statistics**: Page size, count, and utilization
- **Storage Efficiency**: Percentage of allocated space actually used
- **Free Space**: Amount of reclaimable space available
- **Auto-Vacuum Status**: Current auto-vacuum configuration
- **Last Optimization**: Timestamp of most recent optimization

#### Programmatic Access

```go
// Get current optimization statistics
optimizer := NewOptimizationManager(db, config)
stats, err := optimizer.GetOptimizationStats()
if err != nil {
    log.Fatal("Failed to get stats:", err)
}

fmt.Printf("Database efficiency: %.1f%%\n", stats.Efficiency)
fmt.Printf("Storage used: %.1f MB\n", float64(stats.DatabaseSize)/(1024*1024))
```

## Performance Considerations

### Query Optimization
- Time-range queries use `idx_aircraft_history_icao_time`
- Callsign lookups prioritize local cache over external APIs
- Bulk operations use transactions for consistency

### Storage Efficiency
- Configurable history limits prevent unbounded growth
- Automatic VACUUM operations with optimization reporting
- Compressed timestamps and efficient data types
- Page size optimization for storage efficiency
- Auto-vacuum configuration for incremental space reclamation

### Memory Usage
- WAL mode for concurrent read/write access
- Connection pooling
for multiple goroutines
- Prepared statements for repeated queries

## Privacy and Security

### Privacy Mode

SkyView includes comprehensive privacy controls through the `privacy_mode` configuration option:

```json
{
  "callsign": {
    "enabled": true,
    "privacy_mode": true,
    "external_apis": false
  }
}
```

#### Privacy Mode Features
- **No External Calls**: Completely disables all external API requests
- **Local-Only Lookups**: Uses only embedded OpenFlights database for callsign enhancement
- **No Data Transmission**: Aircraft data never leaves the local system
- **Compliance**: Suitable for sensitive environments requiring air-gapped operation

#### Privacy Mode Behavior

| Feature | Privacy Mode ON | Privacy Mode OFF |
|---------|----------------|------------------|
| External API calls | ❌ Disabled | ✅ Configurable |
| OpenFlights lookup | ✅ Enabled | ✅ Enabled |
| Callsign caching | ✅ Local only | ✅ Full caching |
| Data transmission | ❌ None | ⚠️ API calls only |

#### Use Cases for Privacy Mode
- **Military installations**: No external data transmission allowed
- **Air-gapped networks**: No internet connectivity available
- **Corporate policies**: External API usage prohibited
- **Personal privacy**: User preference for local-only operation

### Security Considerations

#### File Permissions
- Database files readable only by skyview user/group
- Configuration files protected from unauthorized access
- Backup files inherit secure permissions

#### Data Protection
- Local SQLite database with file-system level security
- No cloud storage or external database dependencies
- All aviation data processed and stored locally

#### Network Security
- External API calls (when enabled) use HTTPS only
- No persistent connections to external services
- Optional certificate validation for API endpoints

### Data Integrity
- Foreign key constraints where applicable
- Transaction isolation for concurrent operations
- Checksums for migration verification

## Troubleshooting

### Common Issues

#### Database Locked

```
Error: database is locked
```

**Solution**: Stop the SkyView service, check for stale lock files, then restart it.

#### Migration Failures

```
Error: migration 3 failed: table already exists
```

**Solution**: Check the schema version, restore from backup, then retry the migration.

#### Permission Denied

```
Error: unable to open database file
```

**Solution**: Verify file permissions, check directory ownership, and ensure there is free disk space.

### Diagnostic Commands

```bash
# Check database integrity
sqlite3 /var/lib/skyview/skyview.db "PRAGMA integrity_check;"

# View schema version
sqlite3 /var/lib/skyview/skyview.db "SELECT * FROM schema_info;"

# Database statistics
sqlite3 /var/lib/skyview/skyview.db ".dbinfo"
```

## Testing and Quality Assurance

SkyView includes comprehensive test coverage for all database functionality to ensure reliability and data integrity.

### Test Coverage Areas

#### Core Database Functionality
- **Database Creation and Initialization**: Connection management, configuration handling
- **Migration System**: Schema versioning, upgrade/downgrade operations
- **Connection Pooling**: Concurrent access, connection lifecycle management
- **SQLite Pragma Settings**: WAL mode, foreign keys, performance optimizations

#### Data Loading and Management
- **Multi-Source Loading**: OpenFlights, OurAirports data integration
- **Conflict Resolution**: Upsert operations, duplicate handling
- **Error Handling**: Network failures, malformed data recovery
- **Performance Validation**: Loading speed, memory usage optimization

#### Callsign Enhancement System
- **Parsing Logic**: Callsign validation, airline code extraction
- **Database Integration**: Local lookups, caching operations
- **Search Functionality**: Airline filtering, country-based queries
- **Cache Management**: TTL handling, cleanup operations

#### Optimization System
- **VACUUM Operations**: Space reclamation, performance monitoring
- **Page Size Optimization**: Configuration validation, storage efficiency
- **Statistics Generation**: Metrics accuracy, reporting consistency
- **Maintenance Scheduling**: Automated optimization, interval management

### Test Infrastructure

#### Automated Test Setup

```go
import (
    "os"
    "testing"
)

// setupTestDatabase creates an isolated test database and returns it with a cleanup function.
func setupTestDatabase(t *testing.T) (*Database, func()) {
    tempFile, err := os.CreateTemp("", "test_skyview_*.db")
    if err != nil {
        t.Fatalf("create temp file: %v", err)
    }
    tempFile.Close() // the database reopens the file by path
    config := &Config{Path: tempFile.Name()}
    db, err := NewDatabase(config)
    if err != nil {
        t.Fatalf("open database: %v", err)
    }
    db.Initialize() // Run all migrations
    cleanup := func() {
        db.Close()
        os.Remove(tempFile.Name())
    }
    return db, cleanup
}
```

#### Network-Safe Testing

Tests gracefully handle network connectivity issues:
- Skip tests requiring external data sources when offline
- Provide meaningful error messages for connectivity failures
- Use local test data when external sources are unavailable

### Running Tests

```bash
# Run all database tests
go test -v ./internal/database/...

# Run tests in short mode (skip long-running network tests)
go test -v -short ./internal/database/...

# Run specific test categories
go test -v -run="TestDatabase" ./internal/database/...
go test -v -run="TestOptimization" ./internal/database/...
go test -v -run="TestCallsign" ./internal/database/...
```

## Future Enhancements

### Planned Features
- **Compression**: Time-series compression for long-term storage
- **Partitioning**: Date-based partitioning for large datasets
- **Replication**: Read replica support for high-availability setups
- **Analytics**: Built-in reporting and statistics tables
- **Enhanced Route Data**: Integration with additional flight tracking APIs
- **Geographic Indexing**: Spatial queries for airport proximity searches

### Migration Path
- All enhancements will use versioned migrations
- Backward compatibility maintained for existing installations
- Data preservation prioritized over schema optimization
- Comprehensive testing required for all schema changes
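
Since every enhancement is meant to ship as a versioned migration, a minimal sketch of how pending migrations might be selected is given here. It reuses the `Migration` struct from the Migration System section; the `pendingMigrations` helper, the version 4 entry, and the way the consent flag is derived are illustrative assumptions rather than SkyView's actual implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// Migration mirrors the struct shown in the Migration System section.
type Migration struct {
	Version     int
	Description string
	Up          string
	Down        string
	DataLoss    bool
}

// pendingMigrations (hypothetical helper) returns the migrations newer than
// the current schema version, sorted by version, and reports whether any of
// them is flagged as destructive and would therefore need consent and a backup.
func pendingMigrations(all []Migration, current int) (pending []Migration, needsConsent bool) {
	for _, m := range all {
		if m.Version > current {
			pending = append(pending, m)
			needsConsent = needsConsent || m.DataLoss
		}
	}
	sort.Slice(pending, func(i, j int) bool {
		return pending[i].Version < pending[j].Version
	})
	return pending, needsConsent
}

func main() {
	all := []Migration{
		{Version: 1, Description: "Initial schema with aircraft history"},
		{Version: 2, Description: "Add OpenFlights airline and airport data"},
		{Version: 3, Description: "Add callsign lookup cache"},
		// Made-up future migration, only here to exercise the DataLoss flag.
		{Version: 4, Description: "Hypothetical destructive change", DataLoss: true},
	}

	// Pretend schema_info reports version 2 as the latest applied migration.
	pending, needsConsent := pendingMigrations(all, 2)
	for _, m := range pending {
		fmt.Printf("would apply v%d: %s\n", m.Version, m.Description)
	}
	fmt.Println("needs consent and backup:", needsConsent)
}
```

Each pending migration would then be applied in its own transaction and recorded in `schema_info`, as the Migration Process steps describe.
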