Database Schema: Cyber Intelligence
This document provides a detailed overview of the relational database structure used in the Cyber Sentinel project. The database, named cyber_intelligence, manages network observables, threat intelligence reports, and AI-generated verdicts.
1. Entity Relationship Diagram
erDiagram
dic_indicator_types ||--o{ threat_indicators : "defines type"
ai_analysis_results ||--o{ threat_indicators : "provides verdict"
dns_queries ||--o{ threat_indicators : "triggers analysis"
threat_indicators ||--o{ threat_indicator_details : "has metadata"
dic_source_providers ||--o{ threat_indicator_details : "is source of"
dic_threat_levels ||--o{ ai_analysis_results : "categorizes"
network_events ||--|| dns_queries : "logs access" 2. Core Tables
The following tables form the backbone of the Cyber Sentinel data model.
2.1 threat_indicators
The primary registry for all analyzed observables. It bridges the gap between a captured DNS query and its security evaluation.
| Column | Type | Constraint | Description |
|---|---|---|---|
id | INT | PK, AUTO | Unique internal ID for the indicator |
dns_query_id | INT | FK, NOT NULL | Reference to the original intercepted DNS query |
type_id | INT | FK, NOT NULL | Reference to dic_indicator_types (IP, FQDN, HASH) |
analysis_result_id | INT | FK, NOT NULL | Reference to ai_analysis_results |
last_scan | DATETIME | DEFAULT NOW | Timestamp of the most recent enrichment |
2.2 ai_analysis_results
Stores the cognitive synthesis generated by the AI agent (Google Gemini).
| Column | Type | Constraint | Description |
|---|---|---|---|
id | INT | PK, AUTO | Unique identifier for the verdict |
threat_score | INT | FK, NOT NULL | Risk score (1–10) aligned with dic_threat_levels |
threat_label | VARCHAR(50) | — | Short classification label |
verdict_summary_en | TEXT | — | Technical classification (EN) |
analysis_pl | TEXT | — | Human-readable explanation (PL) |
2.3 threat_indicator_details
Maps relational records to raw data stored in MongoDB.
| Column | Type | Constraint | Description |
|---|---|---|---|
id | INT | PK, AUTO | Unique identifier |
indicator_id | INT | FK, CASCADE | Reference to threat_indicators |
source_id | INT | FK, NOT NULL | Reference to dic_source_providers |
mongo_ref_id | VARCHAR(50) | INDEXED | ObjectID of raw JSON in MongoDB |
2.4 dns_queries
Stores raw DNS requests captured by the monitoring layer. Populated by dns_log_processor.py.
| Column | Type | Constraint | Description |
|---|---|---|---|
id | INT | PK, AUTO | Unique identifier |
timestamp | DATETIME | NOT NULL, IDX | Time of the DNS request |
query_type | VARCHAR(10) | NOT NULL | DNS query type (A, AAAA, etc.) |
domain | VARCHAR(255) | NOT NULL, IDX | Requested FQDN |
source_ip | VARCHAR(45) | NOT NULL | Client IP address |
response_ip | VARCHAR(45) | — | Resolved IP from DNS response |
2.5 network_events
Universal table for network events from IDS/IPS, Scapy, and packet sniffers. Tracks access events and links DNS queries with clients.
| Column | Type | Constraint | Description |
|---|---|---|---|
id | INT | PK, AUTO | Unique identifier |
timestamp | DATETIME | NOT NULL | Time of the event (default: now) |
dns_query_id | INT | FK, NULL | Link to dns_queries (optional context) |
threat_indicator_id | INT | FK, NULL | Link to existing threat report (optional) |
source_ip | VARCHAR(45) | NOT NULL | Client IP address |
dest_ip | VARCHAR(45) | NOT NULL | Destination IP address |
dest_port | INT | — | Destination port |
protocol | VARCHAR(10) | — | Network protocol (TCP, UDP, ICMP) |
application_protocol | VARCHAR(20) | — | App-layer protocol (HTTP, TLS, DNS, FTP) |
request_url | TEXT | — | Full URL or URI path clicked/requested |
user_agent | TEXT | — | Browser/client identification string |
event_type | VARCHAR(50) | — | alert, flow, http_request, scapy_intercept |
severity | INT | — | Normalized severity (1–5) |
signature_name | VARCHAR(255) | — | IDS rule triggered (Suricata rule name) |
action_taken | VARCHAR(20) | — | allowed, blocked, logged |
3. Dictionary Tables
These tables define controlled vocabularies used across the system.
3.1 dic_indicator_types
Defines the type of observable being analyzed.
id | name |
|---|---|
| 1 | FQDN |
| 2 | IP |
| 3 | HASH |
3.2 dic_source_providers
Lists the CTI (Cyber Threat Intelligence) providers used for enrichment.
id | name |
|---|---|
| 1 | VirusTotal |
| 2 | Abuse_ThreatFox |
| 3 | Abuse_URLhaus |
| 4 | urlscan.io |
3.3 dic_threat_levels
Defines the scoring policy. Scores above 5 are considered malicious.
| Score | Description | Malicious |
|---|---|---|
| 1 | Safe / Clean | ✅ No |
| 2 | Low Risk | ✅ No |
| 3 | Informational / CDN | ✅ No |
| 4 | Unverified / New | ✅ No |
| 5 | Suspicious — review needed | ✅ No |
| 6 | Likely Malicious — investigation required | ⚠️ Yes |
| 7 | Malicious — known threat | ⚠️ Yes |
| 8 | High Risk — immediate attention | ⚠️ Yes |
| 9 | Confirmed Malware — to be blocked | ⚠️ Yes |
| 10 | Critical Threat — active attack | ⚠️ Yes |
4. Analytical Views
SQL Views simplify orchestration (n8n) and visualization (Grafana). All views are prefixed with v_.
4.1 v_pending_analysis
Queue of new DNS observables that have not yet been enriched by CTI sources.
- Source tables:
dns_queries,threat_indicators - Filter: records with no matching
threat_indicatorsentry and a valid IPv4response_ip
4.2 v_latest_threat_reports
Returns the most recent scan result per DNS query, joining verdict data from ai_analysis_results.
- Source tables:
threat_indicators,ai_analysis_results - Key fields:
threat_score,verdict_summary_en,analysis_pl
4.3 v_grafana_malicious_stats
Provides global statistics on total scans vs. malicious detections, including a malicious percentage.
- Based on:
v_latest_threat_reports - Key fields:
total_scans,total_malicious_scans,malicious_percentage,last_scan
4.4 v_grafana_daily_trends
Time-series aggregation of daily scan volume and threat activity for Grafana dashboards.
- Based on:
v_latest_threat_reports - Key fields:
scan_date,time_sec(UNIX),total_scans_per_day,total_positives_per_day,max_threat_score_that_day
4.5 v_grafana_dns_hourly_traffic
Aggregates DNS query volume per hour for traffic intensity charts.
- Source table:
dns_queries - Key fields:
hour_group,time_sec(UNIX),total_queries
4.6 v_grafana_threat_explorer
Full forensic view combining network events, DNS data, AI verdicts, and CTI provider names. Filtered to malicious events only (threat_score > 5).
- Source tables:
network_events,dns_queries,threat_indicators,ai_analysis_results,dic_threat_levels,threat_indicator_details,dic_source_providers - Key fields:
timestamp,fqdn,source_ip,request_url,providers,threat_label,threat_score