docs: Update DATABASE.md with comprehensive schema and usage documentation

- Document complete database schema including aircraft history and callsign cache
- Add external data source tables and relationships
- Include optimization and maintenance procedures
- Document indexes, performance considerations, and storage requirements
- Provide examples of database queries and operations

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Ole-Morten Duesund 2025-08-31 19:44:15 +02:00
commit d80bb3a10f
2 changed files with 1010 additions and 93 deletions

@ -1,99 +1,729 @@
# SkyView Database Management # SkyView Database Architecture
SkyView includes a comprehensive database management system for enriching aircraft callsigns with airline and airport information. This document describes SkyView's SQLite database architecture, migration system, and integration approach for persistent data storage.
## Quick Start ## Overview
### 1. Check Current Status SkyView uses a single SQLite database to store:
```bash - **Historic aircraft data**: Position history, message counts, signal strength
skyview-data status - **Callsign lookup data**: Cached airline/airport information from external APIs
- **Embedded aviation data**: OpenFlights airline and airport databases
## Database Design Principles
### Embedded Architecture
- Single SQLite file for all persistent data
- No external database dependencies
- Self-contained deployment with embedded schemas
- Backward compatibility through versioned migrations
### Performance Optimization
- Strategic indexing for time-series aircraft data
- Efficient lookups for callsign enhancement
- Configurable data retention policies
- Query optimization for real-time operations
### Data Safety
- Atomic migration transactions
- Pre-migration backups for destructive changes
- Data loss warnings for schema changes
- Rollback capabilities where possible
## Database Schema
### Core Tables
#### `schema_info`
Tracks database version and applied migrations:
```sql
CREATE TABLE schema_info (
version INTEGER PRIMARY KEY,
applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
description TEXT,
checksum TEXT
);
``` ```
### 2. Import Safe Data (Recommended) #### `aircraft_history`
```bash Stores time-series aircraft position and message data:
# Import public domain sources automatically ```sql
skyview-data update CREATE TABLE aircraft_history (
id INTEGER PRIMARY KEY AUTOINCREMENT,
icao TEXT NOT NULL,
timestamp TIMESTAMP NOT NULL,
latitude REAL,
longitude REAL,
altitude INTEGER,
speed INTEGER,
track INTEGER,
vertical_rate INTEGER,
squawk TEXT,
callsign TEXT,
source_id TEXT NOT NULL,
signal_strength REAL
);
``` ```
### 3. Enable Automatic Updates (Optional) **Indexes:**
```bash - `idx_aircraft_history_icao_time`: Fast queries by aircraft and time range
# Weekly updates on Sunday at 3 AM - `idx_aircraft_history_timestamp`: Time-based cleanup and queries
sudo systemctl enable --now skyview-database-update.timer - `idx_aircraft_history_callsign`: Callsign-based searches
#### `airlines`
Multi-source airline database with unified schema:
```sql
CREATE TABLE airlines (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
alias TEXT,
iata_code TEXT,
icao_code TEXT,
callsign TEXT,
country TEXT,
country_code TEXT,
active BOOLEAN DEFAULT 1,
data_source TEXT NOT NULL DEFAULT 'unknown',
source_id TEXT,
imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
``` ```
## Available Data Sources **Indexes:**
- `idx_airlines_icao_code`: ICAO code lookup (primary for callsign enhancement)
- `idx_airlines_iata_code`: IATA code lookup
- `idx_airlines_callsign`: Radio callsign lookup
- `idx_airlines_country_code`: Country-based filtering
- `idx_airlines_active`: Active airlines filtering
- `idx_airlines_source`: Data source tracking
### Safe Sources (Public Domain) #### `airports`
These sources are imported automatically with `skyview-data update`: Multi-source airport database with comprehensive metadata:
- **OurAirports**: Comprehensive airport database (public domain) ```sql
- **FAA Registry**: US aircraft registration data (public domain) CREATE TABLE airports (
id INTEGER PRIMARY KEY,
### License-Required Sources name TEXT NOT NULL,
These require explicit acceptance: ident TEXT,
- **OpenFlights**: Airline and airport data (AGPL-3.0 license) type TEXT,
city TEXT,
## Commands municipality TEXT,
region TEXT,
### Basic Operations country TEXT,
```bash country_code TEXT,
skyview-data list # Show available sources continent TEXT,
skyview-data status # Show database status iata_code TEXT,
skyview-data update # Update safe sources icao_code TEXT,
skyview-data import openflights # Import licensed source local_code TEXT,
skyview-data clear <source> # Remove source data gps_code TEXT,
latitude REAL,
longitude REAL,
elevation_ft INTEGER,
scheduled_service BOOLEAN DEFAULT 0,
home_link TEXT,
wikipedia_link TEXT,
keywords TEXT,
timezone_offset REAL,
timezone TEXT,
dst_type TEXT,
data_source TEXT NOT NULL DEFAULT 'unknown',
source_id TEXT,
imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
``` ```
### Systemd Timer Management **Indexes:**
```bash - `idx_airports_icao_code`: ICAO code lookup
# Enable weekly automatic updates - `idx_airports_iata_code`: IATA code lookup
systemctl enable skyview-database-update.timer - `idx_airports_ident`: Airport identifier lookup
systemctl start skyview-database-update.timer - `idx_airports_country_code`: Country-based filtering
- `idx_airports_type`: Airport type filtering
- `idx_airports_coords`: Geographic coordinate queries
- `idx_airports_source`: Data source tracking
# Check timer status #### `callsign_cache`
systemctl status skyview-database-update.timer Caches external API lookups and local enrichment for callsign enhancement:
```sql
# View update logs CREATE TABLE callsign_cache (
journalctl -u skyview-database-update.service callsign TEXT PRIMARY KEY,
airline_icao TEXT,
# Disable automatic updates airline_iata TEXT,
systemctl disable skyview-database-update.timer airline_name TEXT,
airline_country TEXT,
flight_number TEXT,
origin_iata TEXT, -- Departure airport IATA code
destination_iata TEXT, -- Arrival airport IATA code
aircraft_type TEXT,
route TEXT, -- Full route description
status TEXT, -- Flight status (scheduled, delayed, etc.)
source TEXT NOT NULL DEFAULT 'local',
cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
expires_at TIMESTAMP NOT NULL
);
``` ```
## License Compliance **Route Information Fields:**
- **`origin_iata`**: IATA code of departure airport (e.g., "JFK" for New York JFK)
- **`destination_iata`**: IATA code of arrival airport (e.g., "LAX" for Los Angeles)
- **`route`**: Human-readable route description (e.g., "JFK-LAX" or "New York to Los Angeles")
- **`status`**: Current flight status when available from external APIs
SkyView maintains strict license separation: These fields enable enhanced flight tracking with origin-destination pairs and route visualization.
- **SkyView binary**: Contains no external data (stays MIT licensed)
- **Runtime import**: Users choose which sources to import **Indexes:**
- **Safe defaults**: Only public domain sources updated automatically - `idx_callsign_cache_expires`: Efficient cache cleanup
- **User choice**: Each person decides their own license compatibility - `idx_callsign_cache_airline`: Airline-based queries
#### `data_sources`
Tracks loaded external data sources and their metadata:
```sql
CREATE TABLE data_sources (
name TEXT PRIMARY KEY,
license TEXT NOT NULL,
url TEXT,
version TEXT,
imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
record_count INTEGER DEFAULT 0,
user_accepted_license BOOLEAN DEFAULT 0
);
```
## Database Location Strategy
### Path Resolution Order
1. **Explicit configuration**: `database.path` in config file
2. **System service**: `/var/lib/skyview/skyview.db`
3. **User mode**: `~/.local/share/skyview/skyview.db`
4. **Fallback**: `./skyview.db` in current directory
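
The resolution order above can be sketched as a small pure function. This is a minimal illustration, not SkyView's actual code: `resolveDBPath` and the `dirUsable` callback are hypothetical names, and the callback stands in for whatever existence or permission check the real implementation performs.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveDBPath follows the documented resolution order.
// dirUsable is a hypothetical callback standing in for whatever
// existence/permission check the real code performs.
func resolveDBPath(configured, homeDir string, dirUsable func(string) bool) string {
	if configured != "" {
		return configured // 1. explicit database.path
	}
	if dirUsable("/var/lib/skyview") {
		return "/var/lib/skyview/skyview.db" // 2. system service
	}
	if homeDir != "" {
		userDir := filepath.Join(homeDir, ".local", "share", "skyview")
		if dirUsable(userDir) {
			return filepath.Join(userDir, "skyview.db") // 3. user mode
		}
	}
	return "./skyview.db" // 4. fallback in the current directory
}

func main() {
	onlyUser := func(dir string) bool { return dir == "/home/pilot/.local/share/skyview" }
	fmt.Println(resolveDBPath("", "/home/pilot", onlyUser))
}
```

Passing the directory check in as a function keeps the ordering logic testable without touching the filesystem.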
### Directory Permissions
- System: `root:root` with `755` permissions for `/var/lib/skyview/`
- User: User-owned directories with standard permissions
- Service: `skyview:skyview` user/group for system service
## Migration System
### Migration Structure
```go
type Migration struct {
Version int // Sequential version number
Description string // Human-readable description
Up string // SQL for applying migration
Down string // SQL for rollback (optional)
DataLoss bool // Warning flag for destructive changes
}
```
### Migration Process
1. **Version Check**: Compare current schema version with available migrations
2. **Backup**: Create automatic backup before destructive changes
3. **Transaction**: Wrap each migration in atomic transaction
4. **Validation**: Verify schema integrity after migration
5. **Logging**: Record successful migrations in `schema_info`
### Data Loss Protection
- Migrations marked with `DataLoss: true` require explicit user consent
- Automatic backups created before destructive operations
- Warning messages displayed during upgrade process
- Rollback SQL provided where possible
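
The version check and the data-loss gate described above could look roughly like this. It is a sketch under assumptions: the `Migration` struct repeats the documented fields (without `Down`), while `pendingMigrations` and `requiresConsent` are hypothetical helper names, not functions confirmed by the source.

```go
package main

import (
	"fmt"
	"sort"
)

// Migration repeats the documented struct (Down omitted for brevity).
type Migration struct {
	Version     int
	Description string
	Up          string
	DataLoss    bool
}

// pendingMigrations returns the migrations newer than the current schema
// version, sorted so they can be applied in ascending order.
func pendingMigrations(current int, all []Migration) []Migration {
	var pending []Migration
	for _, m := range all {
		if m.Version > current {
			pending = append(pending, m)
		}
	}
	sort.Slice(pending, func(i, j int) bool { return pending[i].Version < pending[j].Version })
	return pending
}

// requiresConsent reports whether any pending migration is flagged as destructive,
// in which case a backup and explicit user consent would be needed first.
func requiresConsent(pending []Migration) bool {
	for _, m := range pending {
		if m.DataLoss {
			return true
		}
	}
	return false
}

func main() {
	all := []Migration{
		{Version: 3, Description: "Add callsign lookup cache"},
		{Version: 1, Description: "Initial schema with aircraft history"},
		{Version: 2, Description: "Add OpenFlights airline and airport data"},
	}
	pending := pendingMigrations(1, all)
	for _, m := range pending {
		fmt.Printf("apply v%d: %s\n", m.Version, m.Description)
	}
	fmt.Println("consent needed:", requiresConsent(pending))
}
```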
### Example Migration Sequence
```go
var migrations = []Migration{
{
Version: 1,
Description: "Initial schema with aircraft history",
Up: createInitialSchema,
DataLoss: false,
},
{
Version: 2,
Description: "Add OpenFlights airline and airport data",
Up: addAviationTables,
DataLoss: false,
},
{
Version: 3,
Description: "Add callsign lookup cache",
Up: addCallsignCache,
DataLoss: false,
},
}
```
## Data Sources and Loading
SkyView supports multiple aviation data sources with automatic conflict resolution and license compliance.
### Supported Data Sources
#### OpenFlights Airlines Database
- **Source**: https://openflights.org/data.html
- **License**: Open Database License (ODbL) 1.0
- **Content**: Global airline data with ICAO/IATA codes, callsigns, and country information
- **Records**: ~6,162 airlines
- **Update Method**: Runtime download (no license confirmation required)
#### OpenFlights Airports Database
- **Source**: https://openflights.org/data.html
- **License**: Open Database License (ODbL) 1.0
- **Content**: Global airport data with coordinates, codes, and metadata
- **Records**: ~7,698 airports
- **Update Method**: Runtime download
#### OurAirports Database
- **Source**: https://ourairports.com/data/
- **License**: Creative Commons Zero (CC0) 1.0
- **Content**: Comprehensive airport database with detailed metadata
- **Records**: ~83,557 airports
- **Update Method**: Runtime download
### Data Loading System
#### Intelligent Conflict Resolution
The data loading system uses **INSERT OR REPLACE** upserts to handle overlapping data:
```sql
INSERT OR REPLACE INTO airlines (id, name, alias, iata_code, icao_code, callsign, country, active, data_source)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
```
This ensures that:
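- Rows whose primary key (or another unique constraint) already exists are replaced rather than causing errors; SQLite deletes the old row first, so columns omitted from the statement revert to their defaults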
- Later data sources can override earlier ones
- Database integrity is maintained during bulk loads
#### Loading Process
1. **Source Validation**: Verify data source accessibility and format
2. **Incremental Processing**: Process data in chunks to manage memory
3. **Error Handling**: Log and continue on individual record errors
4. **Statistics Reporting**: Track records processed, added, and errors
5. **Source Tracking**: Record metadata about each loaded source
#### Performance Characteristics
- **OpenFlights Airlines**: ~6,162 records in ~363ms
- **OpenFlights Airports**: ~7,698 records in ~200ms
- **OurAirports**: ~83,557 records in ~980ms
- **Error Rate**: <0.1% under normal conditions
## Configuration Integration
### Database Configuration
```json
{
"database": {
"path": "/var/lib/skyview-adsb/skyview.db",
"max_history_days": 7,
"backup_on_upgrade": true,
"vacuum_interval": "24h",
"page_size": 4096
},
"callsign": {
"enabled": true,
"cache_hours": 24,
"external_apis": true,
"privacy_mode": false
}
}
```
### Configuration Fields
#### `database`
- **`path`**: Database file location (empty = auto-resolve)
- **`max_history_days`**: Retention policy for aircraft history (0 = unlimited)
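- **`backup_on_upgrade`**: Create backup before schema migrations
- **`vacuum_interval`**: Interval between scheduled optimization runs (for example `24h`)
- **`page_size`**: SQLite page size in bytes (for example `4096`)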
#### `callsign`
- **`enabled`**: Enable callsign enhancement features
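- **`cache_hours`**: TTL for cached external API results
- **`external_apis`**: Allow lookups against external APIs (has no effect when `privacy_mode` is enabled)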
- **`privacy_mode`**: Disable all external data requests
- **`sources`**: Independent control for each data source
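
For illustration, the `database` block could be decoded with `encoding/json` as follows. The Go type and field names are assumptions. The sketch also assumes `vacuum_interval` uses Go duration syntax, which the `"24h"` example suggests but the source does not state.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// DatabaseConfig mirrors the documented "database" keys; the Go names are assumptions.
type DatabaseConfig struct {
	Path            string `json:"path"`
	MaxHistoryDays  int    `json:"max_history_days"`
	BackupOnUpgrade bool   `json:"backup_on_upgrade"`
	VacuumInterval  string `json:"vacuum_interval"`
	PageSize        int    `json:"page_size"`
}

// Config holds only the part of the configuration file used in this sketch.
type Config struct {
	Database DatabaseConfig `json:"database"`
}

// parseConfig decodes the JSON and, if set, parses the vacuum interval.
func parseConfig(data []byte) (Config, time.Duration, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return cfg, 0, err
	}
	if cfg.Database.VacuumInterval == "" {
		return cfg, 0, nil // no scheduled optimization configured
	}
	interval, err := time.ParseDuration(cfg.Database.VacuumInterval)
	return cfg, interval, err
}

func main() {
	raw := []byte(`{"database": {"path": "", "max_history_days": 7, "vacuum_interval": "24h", "page_size": 4096}}`)
	cfg, interval, err := parseConfig(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("retention=%d days, vacuum every %s, auto-resolve path=%t\n",
		cfg.Database.MaxHistoryDays, interval, cfg.Database.Path == "")
}
```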
### Enhanced Configuration Example
```json
{
"callsign": {
"enabled": true,
"cache_hours": 24,
"privacy_mode": false,
"sources": {
"openflights_embedded": {
"enabled": true,
"priority": 1,
"license": "AGPL-3.0"
},
"faa_registry": {
"enabled": false,
"priority": 2,
"update_frequency": "weekly",
"license": "public_domain"
},
"opensky_api": {
"enabled": false,
"priority": 3,
"timeout_seconds": 5,
"max_retries": 2,
"requires_consent": true,
"license_warning": "Commercial use requires OpenSky Network consent",
"user_accepts_terms": false
},
"custom_database": {
"enabled": false,
"priority": 4,
"path": "",
"license": "user_verified"
}
},
"fallback_chain": ["openflights_embedded", "faa_registry", "opensky_api", "custom_database"]
}
}
```
#### Individual Source Configuration Options
- **`enabled`**: Enable/disable this specific source
- **`priority`**: Processing order (lower numbers = higher priority)
- **`license`**: License type for compliance tracking
- **`requires_consent`**: Whether source requires explicit user consent
- **`user_accepts_terms`**: User acknowledgment of licensing terms
- **`timeout_seconds`**: Per-source timeout configuration
- **`max_retries`**: Per-source retry limits
- **`update_frequency`**: For downloadable sources (daily/weekly/monthly)
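
How these options combine into a lookup order can be sketched as below. This assumes that a source is skipped when it is disabled, or when it requires consent that the user has not given, and that remaining sources are tried in ascending `priority`. `SourceConfig` and `activeSources` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// SourceConfig holds a subset of the documented per-source options.
type SourceConfig struct {
	Enabled          bool
	Priority         int
	RequiresConsent  bool
	UserAcceptsTerms bool
}

// activeSources returns the names of usable sources, lowest priority number first.
func activeSources(sources map[string]SourceConfig) []string {
	names := make([]string, 0, len(sources))
	for name, s := range sources {
		if !s.Enabled {
			continue
		}
		if s.RequiresConsent && !s.UserAcceptsTerms {
			continue // consent required but not given
		}
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		a, b := sources[names[i]], sources[names[j]]
		if a.Priority != b.Priority {
			return a.Priority < b.Priority
		}
		return names[i] < names[j] // deterministic tie-break despite random map order
	})
	return names
}

func main() {
	sources := map[string]SourceConfig{
		"openflights_embedded": {Enabled: true, Priority: 1},
		"faa_registry":         {Enabled: true, Priority: 2},
		"opensky_api":          {Enabled: true, Priority: 3, RequiresConsent: true},
		"custom_database":      {Enabled: false, Priority: 4},
	}
	fmt.Println(activeSources(sources))
}
```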
## Debian Package Integration
### Package Structure
```
/var/lib/skyview/ # Database directory
/etc/skyview/config.json # Default configuration
/usr/bin/skyview # Main application
/usr/share/skyview/ # Embedded resources
```
### Installation Process
1. **`postinst`**: Create directories, user accounts, permissions
2. **First Run**: Database initialization and migration on startup
3. **Upgrades**: Automatic schema migration with backup
4. **Service**: Systemd integration with proper database access
### Service User
- User: `skyview-adsb`
- Home: `/var/lib/skyview-adsb`
- Shell: `/bin/false` (service account)
- Database: Read/write access to `/var/lib/skyview-adsb/`
### Automatic Database Updates
The systemd service configuration includes automatic database updates on startup:
```ini
[Service]
Type=simple
User=skyview-adsb
Group=skyview-adsb
# Update database before starting main service
ExecStartPre=/usr/bin/skyview-data -config /etc/skyview-adsb/config.json update
TimeoutStartSec=300
ExecStart=/usr/bin/skyview -config /etc/skyview-adsb/config.json
```
This ensures aviation data sources are refreshed before each service start, complementing the weekly timer-based updates.
## Data Retention and Cleanup
### Automatic Cleanup
- **Aircraft History**: Configurable retention period (`max_history_days`)
- **Cache Expiration**: TTL-based cleanup of external API cache
- **Optimization**: Periodic VACUUM operations for storage efficiency
### Manual Maintenance
```sql
-- Clean old aircraft history (example: 7 days)
DELETE FROM aircraft_history
WHERE timestamp < datetime('now', '-7 days');
-- Clean expired cache entries
DELETE FROM callsign_cache
WHERE expires_at < datetime('now');
-- Optimize database storage
VACUUM;
```
## Database Optimization
SkyView includes a comprehensive database optimization system that automatically manages storage efficiency and performance.
### Optimization Features
#### Automatic VACUUM Operations
- **Full VACUUM**: Rebuilds database to reclaim deleted space
- **Incremental VACUUM**: Gradual space reclamation with minimal performance impact
- **Scheduled Maintenance**: Configurable intervals for automatic optimization
- **Size Reporting**: Before/after statistics with space savings metrics
#### Storage Optimization
- **Page Size Optimization**: Configurable SQLite page size for optimal performance
- **Auto-Vacuum Configuration**: Enables incremental space reclamation
- **Statistics Updates**: ANALYZE operations for query plan optimization
- **Efficiency Monitoring**: Real-time storage efficiency reporting
### Using the Optimization System
#### Command Line Interface
```bash
# Run comprehensive database optimization
skyview-data optimize
# Run with force flag to skip confirmation prompts
skyview-data optimize --force
# Check current optimization statistics
skyview-data optimize --stats-only
```
#### Optimization Output Example
```
Optimizing database for storage efficiency...
✓ Auto VACUUM: Enable incremental auto-vacuum
✓ Incremental VACUUM: Reclaim free pages incrementally
✓ Optimize: Update SQLite query planner statistics
✓ Analyze: Update table statistics for better query plans
VACUUM completed in 1.2s: 275.3 MB → 263.1 MB (saved 12.2 MB, 4.4%)
Database optimization completed successfully.
Storage efficiency: 96.6% (263.1 MB used of 272.4 MB allocated)
```
#### Configuration Options
```json
{
"database": {
"vacuum_interval": "24h",
"page_size": 4096,
"enable_compression": true,
"compression_level": 6
}
}
```
### Optimization Statistics
The optimization system provides detailed metrics about database performance:
#### Available Statistics
- **Database Size**: Total file size in bytes
- **Page Statistics**: Page size, count, and utilization
- **Storage Efficiency**: Percentage of allocated space actually used
- **Free Space**: Amount of reclaimable space available
- **Auto-Vacuum Status**: Current auto-vacuum configuration
- **Last Optimization**: Timestamp of most recent optimization
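
One plausible way to derive the efficiency and free-space figures is from SQLite's `page_size`, `page_count`, and `freelist_count` pragmas. The sketch below assumes efficiency is defined as the share of allocated pages that are in use. `PageStats` and its methods are hypothetical names, and the source does not confirm that SkyView computes the value this way.

```go
package main

import "fmt"

// PageStats holds values obtainable from PRAGMA page_size, page_count
// and freelist_count.
type PageStats struct {
	PageSize, PageCount, FreePages int64
}

// Efficiency returns the percentage of allocated pages that are in use.
// An empty database is reported as 100% (nothing is wasted).
func (s PageStats) Efficiency() float64 {
	if s.PageCount == 0 {
		return 100
	}
	return float64(s.PageCount-s.FreePages) * 100 / float64(s.PageCount)
}

// ReclaimableBytes is the space held by free pages.
func (s PageStats) ReclaimableBytes() int64 {
	return s.FreePages * s.PageSize
}

func main() {
	s := PageStats{PageSize: 4096, PageCount: 1000, FreePages: 250}
	fmt.Printf("efficiency %.1f%%, reclaimable %d bytes\n", s.Efficiency(), s.ReclaimableBytes())
}
```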
#### Programmatic Access
```go
// Get current optimization statistics
optimizer := NewOptimizationManager(db, config)
stats, err := optimizer.GetOptimizationStats()
if err != nil {
	log.Fatalf("Failed to get stats: %v", err)
}
fmt.Printf("Database efficiency: %.1f%%\n", stats.Efficiency)
fmt.Printf("Storage used: %.1f MB\n", float64(stats.DatabaseSize)/(1024*1024))
```
## Performance Considerations
### Query Optimization
- Time-range queries use `idx_aircraft_history_icao_time`
- Callsign lookups prioritize local cache over external APIs
- Bulk operations use transactions for consistency
### Storage Efficiency
- Configurable history limits prevent unbounded growth
- Automatic VACUUM operations with optimization reporting
- Compressed timestamps and efficient data types
- Page size optimization for storage efficiency
- Auto-vacuum configuration for incremental space reclamation
### Memory Usage
- WAL mode for concurrent read/write access
- Connection pooling for multiple goroutines
- Prepared statements for repeated queries
## Privacy and Security
### Privacy Mode
SkyView includes comprehensive privacy controls through the `privacy_mode` configuration option:
```json
{
"callsign": {
"enabled": true,
"privacy_mode": true,
"external_apis": false
}
}
```
#### Privacy Mode Features
- **No External Calls**: Completely disables all external API requests
- **Local-Only Lookups**: Uses only embedded OpenFlights database for callsign enhancement
- **No Data Transmission**: Aircraft data never leaves the local system
- **Compliance**: Suitable for sensitive environments requiring air-gapped operation
#### Privacy Mode Behavior
| Feature | Privacy Mode ON | Privacy Mode OFF |
|---------|----------------|------------------|
| External API calls | ❌ Disabled | ✅ Configurable |
| OpenFlights lookup | ✅ Enabled | ✅ Enabled |
| Callsign caching | ✅ Local only | ✅ Full caching |
| Data transmission | ❌ None | ⚠️ API calls only |
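
The table above reduces to a single predicate. In this sketch, `CallsignConfig` and `externalLookupAllowed` are hypothetical names for the documented `enabled`, `privacy_mode`, and `external_apis` flags.

```go
package main

import "fmt"

// CallsignConfig holds the documented callsign flags; the Go names are assumptions.
type CallsignConfig struct {
	Enabled      bool
	PrivacyMode  bool
	ExternalAPIs bool
}

// externalLookupAllowed reports whether an external API request may be made.
// Privacy mode always wins over the external_apis setting.
func externalLookupAllowed(c CallsignConfig) bool {
	return c.Enabled && !c.PrivacyMode && c.ExternalAPIs
}

func main() {
	private := CallsignConfig{Enabled: true, PrivacyMode: true, ExternalAPIs: true}
	open := CallsignConfig{Enabled: true, PrivacyMode: false, ExternalAPIs: true}
	fmt.Println(externalLookupAllowed(private), externalLookupAllowed(open))
}
```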
#### Use Cases for Privacy Mode
- **Military installations**: No external data transmission allowed
- **Air-gapped networks**: No internet connectivity available
- **Corporate policies**: External API usage prohibited
- **Personal privacy**: User preference for local-only operation
### Security Considerations
#### File Permissions
- Database files readable only by skyview user/group
- Configuration files protected from unauthorized access
- Backup files inherit secure permissions
#### Data Protection
- Local SQLite database with file-system level security
- No cloud storage or external database dependencies
- All aviation data processed and stored locally
#### Network Security
- External API calls (when enabled) use HTTPS only
- No persistent connections to external services
- Optional certificate validation for API endpoints
### Data Integrity
- Foreign key constraints where applicable
- Transaction isolation for concurrent operations
- Checksums for migration verification
## Troubleshooting ## Troubleshooting
### Check Service Status ### Common Issues
#### Database Locked
```
Error: database is locked
```
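**Solution**: Stop the SkyView service, confirm that no other process still has the database open (for example a concurrent `skyview-data` run or an interactive `sqlite3` session), then restart the service. Do not delete `-wal` or `-journal` files by hand; SQLite uses them for recovery.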
#### Migration Failures
```
Error: migration 3 failed: table already exists
```
**Solution**: Check schema version, restore from backup, retry migration
#### Permission Denied
```
Error: unable to open database file
```
**Solution**: Verify file permissions, check directory ownership, ensure disk space
### Diagnostic Commands
```bash ```bash
systemctl status skyview-database-update.timer # Check database integrity
journalctl -u skyview-database-update.service -f sqlite3 /var/lib/skyview/skyview.db "PRAGMA integrity_check;"
# View schema version
sqlite3 /var/lib/skyview/skyview.db "SELECT * FROM schema_info;"
# Database statistics
sqlite3 /var/lib/skyview/skyview.db ".dbinfo"
``` ```
### Manual Database Reset ## Testing and Quality Assurance
```bash
systemctl stop skyview-database-update.timer SkyView includes comprehensive test coverage for all database functionality to ensure reliability and data integrity.
skyview-data reset --force
skyview-data update ### Test Coverage Areas
systemctl start skyview-database-update.timer
#### Core Database Functionality
- **Database Creation and Initialization**: Connection management, configuration handling
- **Migration System**: Schema versioning, upgrade/downgrade operations
- **Connection Pooling**: Concurrent access, connection lifecycle management
- **SQLite Pragma Settings**: WAL mode, foreign keys, performance optimizations
#### Data Loading and Management
- **Multi-Source Loading**: OpenFlights, OurAirports data integration
- **Conflict Resolution**: Upsert operations, duplicate handling
- **Error Handling**: Network failures, malformed data recovery
- **Performance Validation**: Loading speed, memory usage optimization
#### Callsign Enhancement System
- **Parsing Logic**: Callsign validation, airline code extraction
- **Database Integration**: Local lookups, caching operations
- **Search Functionality**: Airline filtering, country-based queries
- **Cache Management**: TTL handling, cleanup operations
#### Optimization System
- **VACUUM Operations**: Space reclamation, performance monitoring
- **Page Size Optimization**: Configuration validation, storage efficiency
- **Statistics Generation**: Metrics accuracy, reporting consistency
- **Maintenance Scheduling**: Automated optimization, interval management
### Test Infrastructure
#### Automated Test Setup
```go
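// setupTestDatabase creates an isolated test database backed by a temporary file.
func setupTestDatabase(t *testing.T) (*Database, func()) {
	t.Helper()
	tempFile, err := os.CreateTemp("", "test_skyview_*.db")
	if err != nil {
		t.Fatalf("create temp file: %v", err)
	}
	tempFile.Close() // only the path is needed; the database opens the file itself
	config := &Config{Path: tempFile.Name()}
	db, err := NewDatabase(config)
	if err != nil {
		os.Remove(tempFile.Name())
		t.Fatalf("open database: %v", err)
	}
	db.Initialize() // Run all migrations
	cleanup := func() {
		db.Close()
		os.Remove(tempFile.Name())
	}
	return db, cleanup
}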
``` ```
### Permissions Issues #### Network-Safe Testing
Tests gracefully handle network connectivity issues:
- Skip tests requiring external data sources when offline
- Provide meaningful error messages for connectivity failures
- Use local test data when external sources are unavailable
### Running Tests
```bash ```bash
sudo chown skyview:skyview /var/lib/skyview/ # Run all database tests
sudo chmod 755 /var/lib/skyview/ go test -v ./internal/database/...
# Run tests in short mode (skip long-running network tests)
go test -v -short ./internal/database/...
# Run specific test categories
go test -v -run="TestDatabase" ./internal/database/...
go test -v -run="TestOptimization" ./internal/database/...
go test -v -run="TestCallsign" ./internal/database/...
``` ```
## Files and Directories ## Future Enhancements
- `/usr/bin/skyview-data` - Database management command ### Planned Features
- `/var/lib/skyview/skyview.db` - Database file - **Compression**: Time-series compression for long-term storage
- `/usr/share/skyview/scripts/update-database.sh` - Cron helper script - **Partitioning**: Date-based partitioning for large datasets
- `/lib/systemd/system/skyview-database-update.*` - Systemd timer files - **Replication**: Read replica support for high-availability setups
- **Analytics**: Built-in reporting and statistics tables
- **Enhanced Route Data**: Integration with additional flight tracking APIs
- **Geographic Indexing**: Spatial queries for airport proximity searches
For detailed information, see `man skyview-data`. ### Migration Path
- All enhancements will use versioned migrations
- Backward compatibility maintained for existing installations
- Data preservation prioritized over schema optimization
- Comprehensive testing required for all schema changes

@ -49,7 +49,7 @@ Stores time-series aircraft position and message data:
```sql ```sql
CREATE TABLE aircraft_history ( CREATE TABLE aircraft_history (
id INTEGER PRIMARY KEY AUTOINCREMENT, id INTEGER PRIMARY KEY AUTOINCREMENT,
icao_hex TEXT NOT NULL, icao TEXT NOT NULL,
timestamp TIMESTAMP NOT NULL, timestamp TIMESTAMP NOT NULL,
latitude REAL, latitude REAL,
longitude REAL, longitude REAL,
@ -59,9 +59,8 @@ CREATE TABLE aircraft_history (
vertical_rate INTEGER, vertical_rate INTEGER,
squawk TEXT, squawk TEXT,
callsign TEXT, callsign TEXT,
source_id TEXT, source_id TEXT NOT NULL,
signal_strength REAL, signal_strength REAL
message_count INTEGER DEFAULT 1
); );
``` ```
@ -71,66 +70,123 @@ CREATE TABLE aircraft_history (
- `idx_aircraft_history_callsign`: Callsign-based searches - `idx_aircraft_history_callsign`: Callsign-based searches
#### `airlines` #### `airlines`
OpenFlights embedded airline database: Multi-source airline database with unified schema:
```sql ```sql
CREATE TABLE airlines ( CREATE TABLE airlines (
id INTEGER PRIMARY KEY, id INTEGER PRIMARY KEY,
name TEXT NOT NULL, name TEXT NOT NULL,
alias TEXT, alias TEXT,
iata TEXT, iata_code TEXT,
icao TEXT, icao_code TEXT,
callsign TEXT, callsign TEXT,
country TEXT, country TEXT,
active BOOLEAN DEFAULT 1 country_code TEXT,
active BOOLEAN DEFAULT 1,
data_source TEXT NOT NULL DEFAULT 'unknown',
source_id TEXT,
imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
); );
``` ```
**Indexes:** **Indexes:**
- `idx_airlines_icao`: ICAO code lookup (primary for callsign enhancement) - `idx_airlines_icao_code`: ICAO code lookup (primary for callsign enhancement)
- `idx_airlines_iata`: IATA code lookup - `idx_airlines_iata_code`: IATA code lookup
- `idx_airlines_callsign`: Radio callsign lookup
- `idx_airlines_country_code`: Country-based filtering
- `idx_airlines_active`: Active airlines filtering
- `idx_airlines_source`: Data source tracking
#### `airports` #### `airports`
OpenFlights embedded airport database: Multi-source airport database with comprehensive metadata:
```sql ```sql
CREATE TABLE airports ( CREATE TABLE airports (
id INTEGER PRIMARY KEY, id INTEGER PRIMARY KEY,
name TEXT NOT NULL, name TEXT NOT NULL,
ident TEXT,
type TEXT,
city TEXT, city TEXT,
municipality TEXT,
region TEXT,
country TEXT, country TEXT,
iata TEXT, country_code TEXT,
icao TEXT, continent TEXT,
iata_code TEXT,
icao_code TEXT,
local_code TEXT,
gps_code TEXT,
latitude REAL, latitude REAL,
longitude REAL, longitude REAL,
altitude INTEGER, elevation_ft INTEGER,
scheduled_service BOOLEAN DEFAULT 0,
home_link TEXT,
wikipedia_link TEXT,
keywords TEXT,
timezone_offset REAL, timezone_offset REAL,
timezone TEXT,
dst_type TEXT, dst_type TEXT,
timezone TEXT data_source TEXT NOT NULL DEFAULT 'unknown',
source_id TEXT,
imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
); );
``` ```
**Indexes:** **Indexes:**
- `idx_airports_icao`: ICAO code lookup - `idx_airports_icao_code`: ICAO code lookup
- `idx_airports_iata`: IATA code lookup - `idx_airports_iata_code`: IATA code lookup
- `idx_airports_ident`: Airport identifier lookup
- `idx_airports_country_code`: Country-based filtering
- `idx_airports_type`: Airport type filtering
- `idx_airports_coords`: Geographic coordinate queries
- `idx_airports_source`: Data source tracking
#### `callsign_cache` #### `callsign_cache`
Caches external API lookups for callsign enhancement: Caches external API lookups and local enrichment for callsign enhancement:
```sql ```sql
CREATE TABLE callsign_cache ( CREATE TABLE callsign_cache (
callsign TEXT PRIMARY KEY, callsign TEXT PRIMARY KEY,
airline_icao TEXT, airline_icao TEXT,
airline_iata TEXT,
airline_name TEXT, airline_name TEXT,
airline_country TEXT,
flight_number TEXT, flight_number TEXT,
origin_iata TEXT, origin_iata TEXT, -- Departure airport IATA code
destination_iata TEXT, destination_iata TEXT, -- Arrival airport IATA code
aircraft_type TEXT, aircraft_type TEXT,
route TEXT, -- Full route description
status TEXT, -- Flight status (scheduled, delayed, etc.)
source TEXT NOT NULL DEFAULT 'local',
cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
expires_at TIMESTAMP, expires_at TIMESTAMP NOT NULL
source TEXT DEFAULT 'local'
); );
``` ```
**Route Information Fields:**
- **`origin_iata`**: IATA code of departure airport (e.g., "JFK" for New York JFK)
- **`destination_iata`**: IATA code of arrival airport (e.g., "LAX" for Los Angeles)
- **`route`**: Human-readable route description (e.g., "JFK-LAX" or "New York to Los Angeles")
- **`status`**: Current flight status when available from external APIs
These fields enable enhanced flight tracking with origin-destination pairs and route visualization.
**Indexes:** **Indexes:**
- `idx_callsign_cache_expires`: Efficient cache cleanup - `idx_callsign_cache_expires`: Efficient cache cleanup
- `idx_callsign_cache_airline`: Airline-based queries
#### `data_sources`
Tracks loaded external data sources and their metadata:
```sql
CREATE TABLE data_sources (
name TEXT PRIMARY KEY,
license TEXT NOT NULL,
url TEXT,
version TEXT,
imported_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
record_count INTEGER DEFAULT 0,
user_accepted_license BOOLEAN DEFAULT 0
);
```
## Database Location Strategy ## Database Location Strategy
@ -195,15 +251,72 @@ var migrations = []Migration{
} }
``` ```
## Data Sources and Loading
SkyView supports multiple aviation data sources with automatic conflict resolution and license compliance.
### Supported Data Sources
#### OpenFlights Airlines Database
- **Source**: https://openflights.org/data.html
- **License**: Open Database License (ODbL) 1.0
- **Content**: Global airline data with ICAO/IATA codes, callsigns, and country information
- **Records**: ~6,162 airlines
- **Update Method**: Runtime download (no license confirmation required)
#### OpenFlights Airports Database
- **Source**: https://openflights.org/data.html
- **License**: Open Database License (ODbL) 1.0
- **Content**: Global airport data with coordinates, codes, and metadata
- **Records**: ~7,698 airports
- **Update Method**: Runtime download
#### OurAirports Database
- **Source**: https://ourairports.com/data/
- **License**: Creative Commons Zero (CC0) 1.0
- **Content**: Comprehensive airport database with detailed metadata
- **Records**: ~83,557 airports
- **Update Method**: Runtime download
### Data Loading System
#### Intelligent Conflict Resolution
The data loading system uses **INSERT OR REPLACE** upserts to handle overlapping data:
```sql
INSERT OR REPLACE INTO airlines (id, name, alias, iata_code, icao_code, callsign, country, active, data_source)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
```
This ensures that:
- Duplicate records replace the existing row rather than causing constraint errors
- Later data sources can override earlier ones
- Database integrity is maintained during bulk loads
#### Loading Process
1. **Source Validation**: Verify data source accessibility and format
2. **Incremental Processing**: Process data in chunks to manage memory
3. **Error Handling**: Log and continue on individual record errors
4. **Statistics Reporting**: Track records processed, added, and errors
5. **Source Tracking**: Record metadata about each loaded source
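The loading steps above can be sketched in Go roughly as follows. This is a minimal illustration, not SkyView's actual loader: `LoadStats`, `loadCSV`, `runSample`, and the `store` callback are hypothetical names, and the sample rows are made up. It reads CSV records, hands them to the store in fixed-size batches, counts malformed records without aborting, and returns the statistics.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strings"
)

// LoadStats is a hypothetical summary of one import run.
type LoadStats struct {
	Processed int // records read from the source
	Added     int // records handed to the store successfully
	Errors    int // malformed records that were skipped
}

// loadCSV reads records from r, passes them to store in batches of
// batchSize, and keeps going when an individual record is malformed.
func loadCSV(r io.Reader, wantFields, batchSize int, store func([][]string) error) (LoadStats, error) {
	var stats LoadStats
	reader := csv.NewReader(r)
	reader.FieldsPerRecord = -1 // check the field count ourselves

	batch := make([][]string, 0, batchSize)
	flush := func() error {
		if len(batch) == 0 {
			return nil
		}
		if err := store(batch); err != nil {
			return err
		}
		stats.Added += len(batch)
		batch = batch[:0]
		return nil
	}

	for {
		rec, err := reader.Read()
		if errors.Is(err, io.EOF) {
			break
		}
		var parseErr *csv.ParseError
		if err != nil && !errors.As(err, &parseErr) {
			return stats, err // I/O failure: abort the whole load
		}
		stats.Processed++
		if err != nil || len(rec) != wantFields {
			stats.Errors++ // count the bad record and continue
			continue
		}
		batch = append(batch, rec)
		if len(batch) >= batchSize {
			if err := flush(); err != nil {
				return stats, err
			}
		}
	}
	err := flush() // store the final partial batch
	return stats, err
}

// runSample loads four made-up rows, one of which has too few fields.
func runSample() (LoadStats, error) {
	const data = "1,Alpha Air,AAA\n2,Broken\n3,Charlie Air,CCC\n4,Delta Air,DDD\n"
	// A real store would run the INSERT OR REPLACE statement for each row.
	store := func(batch [][]string) error { return nil }
	return loadCSV(strings.NewReader(data), 3, 2, store)
}

func main() {
	stats, err := runSample()
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Printf("processed=%d added=%d errors=%d\n", stats.Processed, stats.Added, stats.Errors)
}
```

Batching keeps memory use bounded regardless of source size, and it gives a natural place to wrap each batch in a single transaction.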
#### Performance Characteristics
- **OpenFlights Airlines**: ~6,162 records in ~363ms
- **OpenFlights Airports**: ~7,698 records in ~200ms
- **OurAirports**: ~83,557 records in ~980ms
- **Error Rate**: <0.1% under normal conditions
## Configuration Integration
### Database Configuration
```json ```json
{ {
"database": { "database": {
"path": "/var/lib/skyview/skyview.db", "path": "/var/lib/skyview-adsb/skyview.db",
"max_history_days": 7, "max_history_days": 7,
"backup_on_upgrade": true "backup_on_upgrade": true,
"vacuum_interval": "24h",
"page_size": 4096
}, },
"callsign": { "callsign": {
"enabled": true, "enabled": true,
@ -294,10 +407,26 @@ var migrations = []Migration{
4. **Service**: Systemd integration with proper database access
### Service User
- User: `skyview` - User: `skyview-adsb`
- Home: `/var/lib/skyview` - Home: `/var/lib/skyview-adsb`
- Shell: `/bin/false` (service account)
- Database: Read/write access to `/var/lib/skyview/` - Database: Read/write access to `/var/lib/skyview-adsb/`
### Automatic Database Updates
The systemd service configuration includes automatic database updates on startup:
```ini
[Service]
Type=simple
User=skyview-adsb
Group=skyview-adsb
# Update database before starting main service
ExecStartPre=/usr/bin/skyview-data -config /etc/skyview-adsb/config.json update
TimeoutStartSec=300
ExecStart=/usr/bin/skyview -config /etc/skyview-adsb/config.json
```
This ensures aviation data sources are refreshed before each service start, complementing the weekly timer-based updates.
## Data Retention and Cleanup
@ -320,6 +449,89 @@ WHERE expires_at < datetime('now');
VACUUM; VACUUM;
``` ```
## Database Optimization
SkyView includes a database optimization system that reclaims unused space and refreshes query planner statistics, either on a configurable schedule or on demand.
### Optimization Features
#### Automatic VACUUM Operations
- **Full VACUUM**: Rebuilds database to reclaim deleted space
- **Incremental VACUUM**: Gradual space reclamation with minimal performance impact
- **Scheduled Maintenance**: Configurable intervals for automatic optimization
- **Size Reporting**: Before/after statistics with space savings metrics
#### Storage Optimization
- **Page Size Optimization**: Configurable SQLite page size for optimal performance
- **Auto-Vacuum Configuration**: Enables incremental space reclamation
- **Statistics Updates**: ANALYZE operations for query plan optimization
- **Efficiency Monitoring**: Real-time storage efficiency reporting
### Using the Optimization System
#### Command Line Interface
```bash
# Run comprehensive database optimization
skyview-data optimize
# Run with force flag to skip confirmation prompts
skyview-data optimize --force
# Check current optimization statistics
skyview-data optimize --stats-only
```
#### Optimization Output Example
```
Optimizing database for storage efficiency...
✓ Auto VACUUM: Enable incremental auto-vacuum
✓ Incremental VACUUM: Reclaim free pages incrementally
✓ Optimize: Update SQLite query planner statistics
✓ Analyze: Update table statistics for better query plans
VACUUM completed in 1.2s: 275.3 MB → 263.1 MB (saved 12.2 MB, 4.4%)
Database optimization completed successfully.
Storage efficiency: 96.6% (263.1 MB used of 272.4 MB allocated)
```
#### Configuration Options
```json
{
"database": {
"vacuum_interval": "24h",
"page_size": 4096,
"enable_compression": true,
"compression_level": 6
}
}
```
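As a hedged sketch of how two of these options might be validated, the Go snippet below decodes the `database` object and checks the values. It assumes `vacuum_interval` uses Go duration syntax (as `"24h"` suggests); `dbSettings`, `configFile`, and `parseDBSettings` are illustrative names, not SkyView's API. The page size rule comes from SQLite itself: a page size must be a power of two between 512 and 65536.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// dbSettings mirrors two keys of the "database" object (illustrative only).
type dbSettings struct {
	VacuumInterval string `json:"vacuum_interval"`
	PageSize       int    `json:"page_size"`
}

type configFile struct {
	Database dbSettings `json:"database"`
}

// parseDBSettings decodes the config and validates both values.
// Keys it does not know about are ignored by encoding/json.
func parseDBSettings(raw []byte) (time.Duration, int, error) {
	var cfg configFile
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return 0, 0, fmt.Errorf("decode config: %w", err)
	}

	// Assumption: the interval is written in Go duration syntax.
	interval, err := time.ParseDuration(cfg.Database.VacuumInterval)
	if err != nil {
		return 0, 0, fmt.Errorf("vacuum_interval: %w", err)
	}
	if interval <= 0 {
		return 0, 0, fmt.Errorf("vacuum_interval must be positive, got %s", interval)
	}

	// SQLite accepts only powers of two from 512 to 65536.
	ps := cfg.Database.PageSize
	if ps < 512 || ps > 65536 || ps&(ps-1) != 0 {
		return 0, 0, fmt.Errorf("page_size %d: must be a power of two between 512 and 65536", ps)
	}
	return interval, ps, nil
}

func main() {
	raw := []byte(`{"database": {"vacuum_interval": "24h", "page_size": 4096}}`)
	interval, pageSize, err := parseDBSettings(raw)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("vacuum every", interval, "with page size", pageSize)
}
```

Rejecting a bad page size at startup is cheaper than discovering it later, because SQLite only applies a new page size to an existing database during a full `VACUUM`.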
### Optimization Statistics
The optimization system reports the following storage metrics for the database:
#### Available Statistics
- **Database Size**: Total file size in bytes
- **Page Statistics**: Page size, count, and utilization
- **Storage Efficiency**: Percentage of allocated space actually used
- **Free Space**: Amount of reclaimable space available
- **Auto-Vacuum Status**: Current auto-vacuum configuration
- **Last Optimization**: Timestamp of most recent optimization
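One plausible way to derive the size, free space, and efficiency figures is from three values SQLite exposes through `PRAGMA page_size`, `PRAGMA page_count`, and `PRAGMA freelist_count`. The sketch below assumes that definition of efficiency (pages in use divided by pages allocated); SkyView's own formula may differ, and `pageStats` is a hypothetical type.

```go
package main

import "fmt"

// pageStats holds the results of PRAGMA page_size, page_count,
// and freelist_count (hypothetical container type).
type pageStats struct {
	PageSize      int64 // bytes per page
	PageCount     int64 // pages allocated in the database file
	FreelistCount int64 // allocated pages that currently hold no data
}

// DatabaseSize returns the allocated size in bytes.
func (s pageStats) DatabaseSize() int64 { return s.PageSize * s.PageCount }

// FreeBytes returns the space a VACUUM could reclaim.
func (s pageStats) FreeBytes() int64 { return s.PageSize * s.FreelistCount }

// Efficiency returns the percentage of allocated pages that hold data.
func (s pageStats) Efficiency() float64 {
	if s.PageCount == 0 {
		return 100 // an empty database wastes nothing
	}
	return 100 * float64(s.PageCount-s.FreelistCount) / float64(s.PageCount)
}

func main() {
	stats := pageStats{PageSize: 4096, PageCount: 1000, FreelistCount: 25}
	fmt.Printf("size=%d bytes free=%d bytes efficiency=%.1f%%\n",
		stats.DatabaseSize(), stats.FreeBytes(), stats.Efficiency())
}
```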
#### Programmatic Access
```go
// Get current optimization statistics
optimizer := NewOptimizationManager(db, config)
stats, err := optimizer.GetOptimizationStats()
if err != nil {
log.Fatalf("Failed to get stats: %v", err)
}
fmt.Printf("Database efficiency: %.1f%%\n", stats.Efficiency)
fmt.Printf("Storage used: %.1f MB\n", float64(stats.DatabaseSize)/(1024*1024))
```
## Performance Considerations
### Query Optimization
@ -329,8 +541,10 @@ VACUUM;
### Storage Efficiency
- Configurable history limits prevent unbounded growth
- Periodic VACUUM operations reclaim deleted space - Automatic VACUUM operations with optimization reporting
- Compressed timestamps and efficient data types
- Page size optimization for storage efficiency
- Auto-vacuum configuration for incremental space reclamation
### Memory Usage
- WAL mode for concurrent read/write access
@ -428,6 +642,76 @@ sqlite3 /var/lib/skyview/skyview.db "SELECT * FROM schema_info;"
sqlite3 /var/lib/skyview/skyview.db ".dbinfo" sqlite3 /var/lib/skyview/skyview.db ".dbinfo"
``` ```
## Testing and Quality Assurance
SkyView's test suite covers the database functionality listed below to guard against regressions and data loss.
### Test Coverage Areas
#### Core Database Functionality
- **Database Creation and Initialization**: Connection management, configuration handling
- **Migration System**: Schema versioning, upgrade/downgrade operations
- **Connection Pooling**: Concurrent access, connection lifecycle management
- **SQLite Pragma Settings**: WAL mode, foreign keys, performance optimizations
#### Data Loading and Management
- **Multi-Source Loading**: OpenFlights, OurAirports data integration
- **Conflict Resolution**: Upsert operations, duplicate handling
- **Error Handling**: Network failures, malformed data recovery
- **Performance Validation**: Loading speed, memory usage optimization
#### Callsign Enhancement System
- **Parsing Logic**: Callsign validation, airline code extraction
- **Database Integration**: Local lookups, caching operations
- **Search Functionality**: Airline filtering, country-based queries
- **Cache Management**: TTL handling, cleanup operations
#### Optimization System
- **VACUUM Operations**: Space reclamation, performance monitoring
- **Page Size Optimization**: Configuration validation, storage efficiency
- **Statistics Generation**: Metrics accuracy, reporting consistency
- **Maintenance Scheduling**: Automated optimization, interval management
### Test Infrastructure
#### Automated Test Setup
```go
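// setupTestDatabase creates an isolated test database backed by a temp file
// and returns it together with a cleanup function.
func setupTestDatabase(t *testing.T) (*Database, func()) {
t.Helper()
tempFile, err := os.CreateTemp("", "test_skyview_*.db")
if err != nil {
t.Fatalf("create temp database file: %v", err)
}
tempFile.Close() // the database reopens the file by path
config := &Config{Path: tempFile.Name()}
db, err := NewDatabase(config)
if err != nil {
os.Remove(tempFile.Name())
t.Fatalf("open test database: %v", err)
}
db.Initialize() // Run all migrations
cleanup := func() {
db.Close()
os.Remove(tempFile.Name())
}
return db, cleanup
}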
#### Network-Safe Testing
Tests gracefully handle network connectivity issues:
- Skip tests requiring external data sources when offline
- Provide meaningful error messages for connectivity failures
- Use local test data when external sources are unavailable
### Running Tests
```bash
# Run all database tests
go test -v ./internal/database/...
# Run tests in short mode (skip long-running network tests)
go test -v -short ./internal/database/...
# Run specific test categories
go test -v -run="TestDatabase" ./internal/database/...
go test -v -run="TestOptimization" ./internal/database/...
go test -v -run="TestCallsign" ./internal/database/...
```
## Future Enhancements
### Planned Features
@ -435,8 +719,11 @@ sqlite3 /var/lib/skyview/skyview.db ".dbinfo"
- **Partitioning**: Date-based partitioning for large datasets
- **Replication**: Read replica support for high-availability setups
- **Analytics**: Built-in reporting and statistics tables
- **Enhanced Route Data**: Integration with additional flight tracking APIs
- **Geographic Indexing**: Spatial queries for airport proximity searches
### Migration Path
- All enhancements will use versioned migrations
- Backward compatibility maintained for existing installations
- Data preservation prioritized over schema optimization
- Comprehensive testing required for all schema changes