Designing and Building an Open Source SOC
A full-stack Security Operations Center built entirely with open-source tools, deployed on virtualized infrastructure. This project provides enterprise-grade threat detection, incident response, and threat intelligence capabilities for monitoring the college network environment.
// Institutional Value Created
By building this SOC in-house with open-source tools, Polk State achieved enterprise-grade security capabilities while avoiding the substantial costs of outsourced alternatives.
// Industry Comparison (200 Endpoints)
For context: LSU's statewide SOC serving 25 institutions costs $7.5M annually (~$300K/institution); Cleveland State received $451K in state capital funding for IT security infrastructure; and Houston Community College's SOC/Cyber Range RFP was estimated at $150K–$300K.
The SOC stack consists of seven specialized VMs, each serving a distinct security function.
Data flows between components via authenticated APIs and agent protocols.
01 Endpoint Intrusion Detection
Example: PowerShell-based attack (MITRE T1059.001)
02 Network Threat Detection
Example: Malware Command & Control communication
Wazuh — The Nervous System
Central log collection, correlation, file integrity monitoring, and alert generation.
Ingestion
- Agents: Endpoints send logs via TCP/1514 to Manager
- NIDS: Zeek/Suricata JSON logs via local Wazuh Agent
Processing
- Parses logs with decoders and evaluates them against rules
- Maps rule matches to MITRE ATT&CK techniques
- Triggers alerts based on rule matches
Storage
- Ships data via HTTPS/9200 to Wazuh Indexer
- Indexer provides long-term storage and search
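As a rough illustration of the processing stage, the sketch below reads alerts in the line-delimited JSON shape the Wazuh Manager writes and keeps only those at or above a severity level. The sample alerts, the rule IDs, the agent names, the threshold of 10, and the function name `high_severity` are all assumptions made for illustration, not values taken from this deployment.

```python
import json

# Hypothetical sample: one JSON alert per line, in the general shape of Wazuh alerts.
SAMPLE_ALERTS = """\
{"rule": {"id": "100001", "level": 12, "description": "Suspicious PowerShell command", "mitre": {"id": ["T1059.001"]}}, "agent": {"name": "lab-pc-01"}}
{"rule": {"id": "100002", "level": 3, "description": "Login session opened", "mitre": {"id": []}}, "agent": {"name": "lab-pc-02"}}
"""


def high_severity(lines, min_level=10):
    """Yield (agent name, rule id, MITRE technique ids) for alerts at or above min_level."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        alert = json.loads(line)
        rule = alert.get("rule", {})
        if rule.get("level", 0) >= min_level:
            yield (
                alert.get("agent", {}).get("name", "unknown"),
                rule.get("id"),
                rule.get("mitre", {}).get("id", []),
            )


hits = list(high_severity(SAMPLE_ALERTS.splitlines()))
for agent, rule_id, techniques in hits:
    print(agent, rule_id, techniques)
```

Only the level-12 alert passes the filter; the level-3 login event is dropped.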
TheHive — The Brain
Central hub for alert management, case creation, and response coordination.
Backend
- Cassandra: Stores case data, tasks, and logs
- Elasticsearch: Indexing with X-Pack security and thehive_role
Wazuh Integration
- Custom custom-w2thive.py script on Wazuh Manager
- Converts high-severity alerts to JSON, POSTs to TheHive API
- Analysts promote alerts to cases for investigation
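The alert hand-off can be pictured with a short sketch. This is not the code of custom-w2thive.py; it is a minimal stand-in that maps a Wazuh alert to a TheHive alert payload and builds, without sending, the POST request. The level-to-severity mapping, the sample alert, the host name `thehive.example`, and both function names are assumptions.

```python
import json
import urllib.request


def wazuh_to_thehive(alert):
    """Map a Wazuh alert dict to a TheHive alert payload dict."""
    rule = alert.get("rule", {})
    level = int(rule.get("level", 0))
    # Assumed mapping from Wazuh level (0-15) to TheHive severity (1-4).
    severity = 4 if level >= 13 else 3 if level >= 10 else 2 if level >= 7 else 1
    observables = []
    src_ip = alert.get("data", {}).get("srcip")
    if src_ip:
        observables.append({"dataType": "ip", "data": src_ip})
    tags = ["wazuh", f"rule={rule.get('id')}"] + rule.get("mitre", {}).get("id", [])
    return {
        "type": "wazuh_alert",
        "source": "wazuh",
        "sourceRef": str(alert.get("id", "")),
        "title": rule.get("description", "Wazuh alert"),
        "description": json.dumps(alert, indent=2),
        "severity": severity,
        "tags": tags,
        "observables": observables,
    }


def build_request(payload, base_url, api_key):
    """Build (but do not send) the POST request for TheHive's alert endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/alert",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical alert and credentials, for illustration only.
sample_alert = {
    "id": "1700000000.1",
    "rule": {"id": "100001", "level": 12, "description": "Suspicious PowerShell",
             "mitre": {"id": ["T1059.001"]}},
    "data": {"srcip": "10.0.0.5"},
}
payload = wazuh_to_thehive(sample_alert)
request = build_request(payload, "http://thehive.example:9000", "HYPOTHETICAL_KEY")
print(request.get_method(), request.full_url, payload["severity"])
```

Passing `request` to `urllib.request.urlopen` would send it; that call is left out so the sketch runs without a live TheHive instance.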
Cluster Config
- Configured as "Cluster of One" using Akka/Pekko
- Prevents service shutdown from missing cluster peers
Cortex — The Muscle
Performs active analysis on observables extracted from cases.
Interaction
- TheHive connects via API on port 9001
- Receives observables (IPs, hashes, domains)
Analyzers
- Python scripts in Docker containers
- Queries: VirusTotal, AbuseIPDB, MISP, etc.
- Returns enrichment reports to TheHive
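To make the observable hand-off concrete, the sketch below guesses an observable's data type and builds the JSON body that would accompany a request to run an analyzer on it. The `data`/`dataType`/`tlp` body shape reflects the Cortex job submission API as generally documented and should be verified against the deployed version; the classification rules and both function names are simplified assumptions.

```python
import ipaddress
import re

HASH_RE = re.compile(r"[A-Fa-f0-9]{32}|[A-Fa-f0-9]{40}|[A-Fa-f0-9]{64}")
DOMAIN_RE = re.compile(r"(?=.{1,253}$)([A-Za-z0-9-]{1,63}\.)+[A-Za-z]{2,63}")


def guess_data_type(value):
    """Classify an observable as ip, hash, domain, or other (simplified rules)."""
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    if HASH_RE.fullmatch(value):  # MD5, SHA-1, or SHA-256 length in hex
        return "hash"
    if DOMAIN_RE.fullmatch(value):
        return "domain"
    return "other"


def cortex_job(value, tlp=2):
    """Build the JSON body for submitting one observable to an analyzer."""
    return {"data": value, "dataType": guess_data_type(value), "tlp": tlp}


for observable in ["203.0.113.7", "d41d8cd98f00b204e9800998ecf8427e", "evil.example.com"]:
    print(cortex_job(observable))
```

In practice TheHive already knows each observable's type, so the guessing step mainly matters for ad hoc submissions.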
MISP — The Memory
Stores Indicators of Compromise and correlates against threat feeds.
TheHive Integration
- Pulls data via API (Port 443) to enrich cases
- SSL: ssl.loose.acceptAnyCertificate = true for self-signed certs
- Bi-directional: Import events, export case intel
Cortex Integration
- MISP Analyzer queries local database
- "Have we seen this hash/IP before?"
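The "have we seen this before?" lookup can be sketched as a single attribute search against MISP's REST API. The example builds the request without sending it, shows a TLS context that skips certificate checks (suitable only for a lab with self-signed certificates, mirroring the loose SSL setting above), and interprets a response. The host name, the auth key placeholder, and the function names are assumptions.

```python
import json
import ssl
import urllib.request


def misp_lookup_request(value, base_url, auth_key):
    """Build (but do not send) an attribute search for one hash, IP, or domain."""
    body = {"returnFormat": "json", "value": value}
    return urllib.request.Request(
        url=f"{base_url}/attributes/restSearch",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": auth_key,
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def lab_ssl_context():
    """TLS context that skips verification. For self-signed lab certificates only."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def seen_before(response_json):
    """Return True if the search response lists at least one matching attribute."""
    return len(response_json.get("response", {}).get("Attribute", [])) > 0


lookup = misp_lookup_request("203.0.113.7", "https://misp.example", "HYPOTHETICAL_AUTHKEY")
print(lookup.get_method(), lookup.full_url)
# Sending would look like: urllib.request.urlopen(lookup, context=lab_ssl_context())
```

Trusting the MISP certificate explicitly is the safer long-term option; skipping verification is a convenience for an isolated lab network.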
NIDS — The Eyes
Monitors raw network traffic via mirror port for signatures and anomalies.
Software Stack
- Suricata: Signature-based IDS, outputs eve.json
- Zeek: Protocol analysis, configured for JSON output (not default TSV)
Integration
- Wazuh Agent reads local JSON logs
- Ships to Manager for correlation
- Enables network-level visibility in SIEM
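For a sense of what the agent is reading, the sketch below filters Suricata EVE records down to alert events and pulls out the fields an analyst usually wants first. The sample records, addresses, and signature text are invented for illustration, and `suricata_alerts` is a hypothetical helper, not part of Wazuh or Suricata.

```python
import json

# Hypothetical eve.json lines; real files hold one JSON event per line.
SAMPLE_EVE = [
    '{"timestamp": "2024-01-01T12:00:00.000000+0000", "event_type": "alert", '
    '"src_ip": "10.0.0.5", "dest_ip": "203.0.113.7", '
    '"alert": {"signature": "Hypothetical C2 beacon", "severity": 1}}',
    '{"timestamp": "2024-01-01T12:00:01.000000+0000", "event_type": "dns", '
    '"src_ip": "10.0.0.5", "dest_ip": "10.0.0.1"}',
    '{"timestamp": "2024-01-01T12:00:02',  # truncated line, as if still being written
]


def suricata_alerts(lines):
    """Yield a compact summary dict for each EVE record with event_type 'alert'."""
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        if event.get("event_type") != "alert":
            continue
        alert = event.get("alert", {})
        yield {
            "time": event.get("timestamp"),
            "src": event.get("src_ip"),
            "dst": event.get("dest_ip"),
            "signature": alert.get("signature"),
            "severity": alert.get("severity"),
        }


alerts = list(suricata_alerts(SAMPLE_EVE))
for entry in alerts:
    print(entry)
```

Zeek's JSON logs can be handled the same way, one object per line, although the field names differ from Suricata's.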
VLAN tags are stripped by the Proxmox bridge (vmbr0) before traffic reaches the VM. The guest OS sees a standard interface (ens18), so no VLAN configuration is needed inside the guest.
TheHive 5 in this deployment uses Elasticsearch 8.x with X-Pack security enabled. A thehive_role with the proper index permissions must be created for it.
Zeek writes TSV logs by default. Apply a local policy, @load policy/tuning/json-logs.zeek, to get JSON output for Wazuh compatibility.
Ubuntu Server's default LVM layout allocates only about half of the virtual disk to the root volume. Run lvextend and resize2fs to use the full virtual disk.
Generate an authKey in MISP and configure it in TheHive's application.conf to bridge the two systems.
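TheHive runs on the JVM with an Elasticsearch backend, and both are memory-hungry. A JVM reserves its initial heap (-Xms) at startup and, in practice, rarely hands memory back to the host once the heap has grown. Tune the -Xms/-Xmx flags carefully so that performance stays adequate without exhausting host RAM.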
Ubuntu 22.04 LTS ("Jammy Jellyfish") provides a stable, secure, and modern foundation for running TheHive and Cortex, particularly in containerized environments. It ships updated dependencies such as OpenSSL 3.0 and receives standard security maintenance until 2027, extendable to 2032 through Ubuntu Pro (ESM), which suits long-lived security infrastructure.
- Optimal Docker Environment: TheHive and Cortex are frequently deployed together using Docker Compose for 24/7 SOCs. Jammy's robust container support makes it ideal for high-load, single-server setups.
- Modern Security Requirements: Provides the necessary libraries and security hardening, including OpenSSL 3.0 and the 5.15 Linux kernel, which matter for running recent Java versions (TheHive) and for secure network communication (Cortex analyzers).
- Stability & Performance: LTS nature ensures the OS remains stable for years, essential for security infrastructure that cannot afford downtime.
- Efficient Resource Handling: Optimized for enterprise-class deployments "from data center to edge," helping manage the resource-intensive nature of TheHive (with its Elasticsearch backend) and Cortex running together.
// Roadmap: AI-Powered SOC Assistant
The next phase of this project is building an AI chatbot that can interact with all SOC components through Model Context Protocol (MCP) servers. This enables natural language queries and actions across the entire security stack.
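One way to picture the planned layer is as a registry of named tools, each mapped to a single REST call on one component. The sketch below does not use an MCP SDK and sends nothing; it only resolves a tool name to the call it would make. The tool names, the host names, and the `Tool` and `plan_call` helpers are hypothetical, and the endpoint paths reflect each project's public REST API as generally documented, so they should be checked against the deployed versions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    component: str    # which SOC component serves the call
    port: int         # API port used in this stack
    method: str       # HTTP method
    path: str         # REST path (verify against the deployed version)
    description: str  # what the assistant would advertise to the model


# Hypothetical tool names, one example per component.
TOOLS = {
    "wazuh_agent_status": Tool("wazuh", 55000, "GET", "/agents", "Get agent status"),
    "thehive_create_case": Tool("thehive", 9000, "POST", "/api/v1/case", "Create a case"),
    "cortex_list_analyzers": Tool("cortex", 9001, "GET", "/api/analyzer", "List available analyzers"),
    "misp_search_iocs": Tool("misp", 443, "POST", "/attributes/restSearch", "Query IoC database"),
}


def plan_call(tool_name, hosts):
    """Resolve a tool name to (HTTP method, host:port/path) without sending anything."""
    tool = TOOLS[tool_name]
    host = hosts[tool.component]
    return tool.method, f"{host}:{tool.port}{tool.path}"


# Hypothetical host names for the four API-facing VMs.
HOSTS = {
    "wazuh": "wazuh.example",
    "thehive": "thehive.example",
    "cortex": "cortex.example",
    "misp": "misp.example",
}

for name in TOOLS:
    print(name, plan_call(name, HOSTS))
```

Keeping the registry declarative makes it straightforward to review exactly which actions the assistant is allowed to take before any of them are wired to live credentials.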
Wazuh API (55000)
- Query alerts by severity/timeframe
- Search indexed events
- Get agent status
- Retrieve rule information
TheHive API (9000)
- Create/update cases
- Add observables
- Manage tasks
- Search case history
Cortex API (9001)
- Run analyzers on observables
- Get analysis reports
- List available analyzers
MISP API (443)
- Search threat intel
- Query IoC database
- Get event details
- Correlate observables
// Example Analyst Workflows
- ✓ Proxmox VE hypervisor configured
- ✓ Wazuh stack (Manager, Indexer, Dashboard) deployed
- ✓ TheHive 5 + Cortex integrated
- ✓ MISP threat intelligence platform online
- ✓ NIDS (Suricata + Zeek) monitoring network
- ◯ Integrating additional threat intelligence feeds
- ◯ Building custom detection rules
- ◯ Developing response playbooks
- ◯ Building AI chatbot interface
- ◯ Developing wazuh-mcp server
- ◯ Developing thehive-mcp server
- ◯ Developing cortex-mcp server
- ◯ Developing misp-mcp server