blog.johlem.net

Building an Air-Gapped Proxmox Lab for OSCP Prep and Detection Engineering

Disclaimer: this is a personal lab on my own hardware. Nothing in this post is drawn from any client engagement, and none of the detections or offensive techniques below have ever been deployed outside the box on my desk.

I wanted one environment where I could throw real offensive tooling at a real blue-team stack, fail loudly, and leave no evidence outside a single physical box. This is how that came together.

Why air-gapped

Three reasons, in descending order of importance:

  1. Regulatory reality. I live inside a CSSF-supervised entity. C2 traffic leaving my home network is an incident even if the intent is innocent. It is far easier to explain “I physically unplugged the uplink” than it is to explain “the EDR flagged my home lab as a staging box”.
  2. Detection purity. When I'm evaluating a Suricata rule against a specific TTP, I need to know with certainty that the only traffic hitting the sensor is lab traffic. Background internet noise ruins honest measurement.
  3. Mental hygiene. Hard physical boundary between work-related tooling and curiosity-driven experimentation. The red-team VLAN has no DNS resolver that can see a real root server. That is on purpose.

Hardware

The entire thing runs on one MINISFORUM MS-A2.

At roughly €1,200 all-in, it outperforms a rack of 2018-era enterprise gear and idles around 35 W. A single box is a deliberate constraint: it forces me to keep the lab small enough that I can tear it down and rebuild it from Ansible in a weekend.

Peripherals:

Proxmox install

Single-node PVE 8.x on the ZFS mirror. Key tweaks after the default install:

# Disable the enterprise repo, enable no-subscription
sed -i 's/^/#/' /etc/apt/sources.list.d/pve-enterprise.list
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" \
  > /etc/apt/sources.list.d/pve-no-subscription.list

# Let KVM nest — needed for running Windows with a functional EDR inside a VM
echo "options kvm-amd nested=1" > /etc/modprobe.d/kvm-amd.conf

# Zero swappiness on the host; VMs should swap in their own disks if at all
echo "vm.swappiness=0" >> /etc/sysctl.conf

# Suricata and Zeek live on the host, not in a VM — they need to see tap
# traffic from the virtual switch directly
apt install -y suricata zeek
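
For the host sensor to actually see tap traffic, Suricata's capture has to be bound to the virtual switch. A minimal sketch of the relevant suricata.yaml stanza; the bridge name vmbr1 is an assumption on my part, not something stated above:

```yaml
# /etc/suricata/suricata.yaml (fragment): bind af-packet capture to the
# lab bridge so the host-side sensor sees inter-VLAN traffic.
# "vmbr1" is an assumed bridge name.
af-packet:
  - interface: vmbr1
    cluster-id: 99
    cluster-type: cluster_flow
    defrag: yes
```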

I run no cluster. A single-node “cluster of one” keeps the moving parts down. The day I need HA for this lab is the day I've lost the plot.

Network: 10 VLANs

The core of the design. Every VM lands on exactly one VLAN. Inter-VLAN traffic is explicit, logged, and matched by at least one Suricata rule.
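
On the Proxmox side, "every VM lands on exactly one VLAN" maps cleanly onto a single VLAN-aware bridge, with each virtual NIC carrying one tag. A sketch of the /etc/network/interfaces stanza; the physical NIC name and the VID range are illustrative, not from this build:

```
# /etc/network/interfaces (fragment): one VLAN-aware bridge carries all
# lab segments; each VM's virtual NIC gets exactly one VLAN tag.
# "enp2s0" and the 10-100 VID range are assumptions.
auto vmbr1
iface vmbr1 inet manual
    bridge-ports enp2s0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 10-100
```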

Rule of thumb: if a VM needs to reach the real internet, it does not belong in this lab. Every OS image is patched offline and seeded from a separate, internet-connected machine that never plugs into the lab switch.
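
To make the "matched by at least one Suricata rule" requirement concrete, here is a hedged sketch of the kind of rule involved: alert on SMB crossing from the red-team segment into the corp VLANs. The subnets, SID, and message are invented for illustration and would need to match the real addressing plan:

```
# Illustrative only: subnets and sid are assumptions, not the lab's real plan.
alert tcp 10.0.20.0/24 any -> [10.0.40.0/24,10.0.50.0/24] 445 \
  (msg:"LAB inter-VLAN SMB from red-team segment"; \
   classtype:policy-violation; sid:1000001; rev:1;)
```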

Blue team stack

Running on VLAN 70:

An example of a teaching-grade Sigma rule I use to check the pipeline end-to-end. It fires on any handle open to LSASS from a non-system process, which is deliberately noisy:

title: LSASS Handle Access From Non-System Process
status: experimental
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10
    TargetImage|endswith: '\lsass.exe'
  filter_system:
    SourceImage|contains:
      - '\System32\csrss.exe'
      - '\System32\services.exe'
      - '\System32\wininit.exe'
  condition: selection and not filter_system
level: high

Noisy rules like this are useful for plumbing tests. Once a new log path is flowing, you silence them or replace them with something more targeted.
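
As a sketch of what "more targeted" can look like for the same event source: keying on the access masks most dump tools actually request. The GrantedAccess values below are the commonly abused read masks; this is illustrative and the exact values would need tuning against a local baseline, not a production rule:

```yaml
# Tighter variant (sketch, assumed thresholds): same Sysmon Event ID 10,
# but only on the access masks typical of credential dumpers.
detection:
  selection:
    EventID: 10
    TargetImage|endswith: '\lsass.exe'
    GrantedAccess|contains:
      - '0x1010'   # PROCESS_VM_READ | PROCESS_QUERY_LIMITED_INFORMATION
      - '0x1410'   # PROCESS_VM_READ | PROCESS_QUERY_INFORMATION
  condition: selection
```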

Red team stack

Running on VLAN 20:

The detection engineering loop

This is the whole reason the lab exists:

  1. Pick a technique. Example: T1003.001 — LSASS memory dumping.
  2. Run it from VLAN 20 against a host in VLAN 40 or 50.
  3. Watch Wazuh + Graylog. Did anything fire? How quickly?
  4. If nothing fired: write the Sigma rule, deploy it.
  5. Vary the method: Mimikatz → nanodump → ProcDump → comsvcs.dll → direct syscall invocation. Does the rule survive?
  6. If the rule stops firing: understand why, then improve it.
  7. Commit rule + notes to a private Git repo. Tag by MITRE technique.
  8. Tear down. Snapshot revert. Start over.
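
Steps 2 and 8 are mostly Proxmox plumbing, handled with qm on the host. A sketch of the snapshot discipline, assuming VM 401 is the VLAN-40 target; the VMID, snapshot name, and description are invented for illustration:

```shell
# On the PVE host: freeze a clean baseline before the technique runs,
# then revert once the logs are collected.
# VMID 401 and all names here are assumptions.
qm snapshot 401 clean-baseline --description "patched, Sysmon + Wazuh enrolled"
# ... run the TTP from VLAN 20, collect the Wazuh/Graylog evidence ...
qm rollback 401 clean-baseline
qm start 401   # a disk-only snapshot restores the VM stopped
```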

One technique per evening is a comfortable pace. Over six months that produces a few dozen rules I trust, and roughly the same number I had to walk away from because they couldn't survive basic variation.

Rough cost

All-in, it comes to less than the cumulative cost of a decent cert path over the same period, and it's considerably more useful once it's running.

What's next

Three things on the list:

  1. A second, identical MS-A2 for multi-site NIS2 resilience scenarios. One physical box is a single point of failure by design today.
  2. Publishing stable Sigma rules to a public repo as they harden. Noisy teaching rules stay private.
  3. A separate writeup of the Windows 11 + BitLocker + Sysmon baseline I use for every new corp-VLAN image. It deserves its own post.

If you're building something similar: the boring work — patching golden images, rotating AD passwords, keeping firewall rules coherent — takes more time than the flashy detection engineering does. Budget accordingly. The lab is a forcing function for habits, not a substitute for them.