Skip to content

Database Schema: Cyber Intelligence

This document provides a detailed overview of the relational database structure used in the Cyber Sentinel project. The database, named cyber_intelligence, manages network observables, threat intelligence reports, and AI-generated verdicts.


1. Entity Relationship Diagram

erDiagram
    dic_indicator_types ||--o{ threat_indicators : "defines type"
    ai_analysis_results ||--o{ threat_indicators : "provides verdict"
    dns_queries ||--o{ threat_indicators : "triggers analysis"
    threat_indicators ||--o{ threat_indicator_details : "has metadata"
    dic_source_providers ||--o{ threat_indicator_details : "is source of"
    dic_threat_levels ||--o{ ai_analysis_results : "categorizes"
    network_events ||--|| dns_queries : "logs access"

2. Core Tables

The following tables form the backbone of the Cyber Sentinel data model.

2.1 threat_indicators

The primary registry for all analyzed observables. It bridges the gap between a captured DNS query and its security evaluation.

Column Type Constraint Description
id INT PK, AUTO Unique internal ID for the indicator
dns_query_id INT FK, NOT NULL Reference to the original intercepted DNS query
type_id INT FK, NOT NULL Reference to dic_indicator_types (IP, FQDN, HASH)
analysis_result_id INT FK, NOT NULL Reference to ai_analysis_results
last_scan DATETIME DEFAULT NOW Timestamp of the most recent enrichment

2.2 ai_analysis_results

Stores the cognitive synthesis generated by the AI agent (Google Gemini).

Column Type Constraint Description
id INT PK, AUTO Unique identifier for the verdict
threat_score INT FK, NOT NULL Risk score (1–10) aligned with dic_threat_levels
threat_label VARCHAR(50) Short classification label
verdict_summary_en TEXT Technical classification (EN)
analysis_pl TEXT Human-readable explanation (PL)

2.3 threat_indicator_details

Maps relational records to raw data stored in MongoDB.

Column Type Constraint Description
id INT PK, AUTO Unique identifier
indicator_id INT FK, CASCADE Reference to threat_indicators
source_id INT FK, NOT NULL Reference to dic_source_providers
mongo_ref_id VARCHAR(50) INDEXED ObjectID of raw JSON in MongoDB

2.4 dns_queries

Stores raw DNS requests captured by the monitoring layer. Populated by dns_log_processor.py.

Column Type Constraint Description
id INT PK, AUTO Unique identifier
timestamp DATETIME NOT NULL, IDX Time of the DNS request
query_type VARCHAR(10) NOT NULL DNS query type (A, AAAA, etc.)
domain VARCHAR(255) NOT NULL, IDX Requested FQDN
source_ip VARCHAR(45) NOT NULL Client IP address
response_ip VARCHAR(45) Resolved IP from DNS response

2.5 network_events

Universal table for network events from IDS/IPS, Scapy, and packet sniffers. Tracks access events and links DNS queries with clients.

Column Type Constraint Description
id INT PK, AUTO Unique identifier
timestamp DATETIME NOT NULL Time of the event (default: now)
dns_query_id INT FK, NULL Link to dns_queries (optional context)
threat_indicator_id INT FK, NULL Link to existing threat report (optional)
source_ip VARCHAR(45) NOT NULL Client IP address
dest_ip VARCHAR(45) NOT NULL Destination IP address
dest_port INT Destination port
protocol VARCHAR(10) Network protocol (TCP, UDP, ICMP)
application_protocol VARCHAR(20) App-layer protocol (HTTP, TLS, DNS, FTP)
request_url TEXT Full URL or URI path clicked/requested
user_agent TEXT Browser/client identification string
event_type VARCHAR(50) alert, flow, http_request, scapy_intercept
severity INT Normalized severity (1–5)
signature_name VARCHAR(255) IDS rule triggered (Suricata rule name)
action_taken VARCHAR(20) allowed, blocked, logged

3. Dictionary Tables

These tables define controlled vocabularies used across the system.

3.1 dic_indicator_types

Defines the type of observable being analyzed.

id name
1 FQDN
2 IP
3 HASH

3.2 dic_source_providers

Lists the CTI (Cyber Threat Intelligence) providers used for enrichment.

id name
1 VirusTotal
2 Abuse_ThreatFox
3 Abuse_URLhaus
4 urlscan.io

3.3 dic_threat_levels

Defines the scoring policy. Scores above 5 are considered malicious.

Score Description Malicious
1 Safe / Clean ✅ No
2 Low Risk ✅ No
3 Informational / CDN ✅ No
4 Unverified / New ✅ No
5 Suspicious — review needed ✅ No
6 Likely Malicious — investigation required ⚠️ Yes
7 Malicious — known threat ⚠️ Yes
8 High Risk — immediate attention ⚠️ Yes
9 Confirmed Malware — to be blocked ⚠️ Yes
10 Critical Threat — active attack ⚠️ Yes

4. Analytical Views

SQL Views simplify orchestration (n8n) and visualization (Grafana). All views are prefixed with v_.

4.1 v_pending_analysis

Queue of new DNS observables that have not yet been enriched by CTI sources.

  • Source tables: dns_queries, threat_indicators
  • Filter: records with no matching threat_indicators entry and a valid IPv4 response_ip

4.2 v_latest_threat_reports

Returns the most recent scan result per DNS query, joining verdict data from ai_analysis_results.

  • Source tables: threat_indicators, ai_analysis_results
  • Key fields: threat_score, verdict_summary_en, analysis_pl

4.3 v_grafana_malicious_stats

Provides global statistics on total scans vs. malicious detections, including a malicious percentage.

  • Based on: v_latest_threat_reports
  • Key fields: total_scans, total_malicious_scans, malicious_percentage, last_scan

Time-series aggregation of daily scan volume and threat activity for Grafana dashboards.

  • Based on: v_latest_threat_reports
  • Key fields: scan_date, time_sec (UNIX), total_scans_per_day, total_positives_per_day, max_threat_score_that_day

4.5 v_grafana_dns_hourly_traffic

Aggregates DNS query volume per hour for traffic intensity charts.

  • Source table: dns_queries
  • Key fields: hour_group, time_sec (UNIX), total_queries

4.6 v_grafana_threat_explorer

Full forensic view combining network events, DNS data, AI verdicts, and CTI provider names. Filtered to malicious events only (threat_score > 5).

  • Source tables: network_events, dns_queries, threat_indicators, ai_analysis_results, dic_threat_levels, threat_indicator_details, dic_source_providers
  • Key fields: timestamp, fqdn, source_ip, request_url, providers, threat_label, threat_score