Your SIEM Is Only as Good as Its Worst-Onboarded Log Source

May 29, 2026

siem
log-management
detection-engineering
data-quality
soc

Detection content gets the glory. Engineers write rules, tune correlations, build kill-chain logic, and present coverage maps. Almost nobody wants to talk about log-source onboarding — parsing, normalisation, field mapping, the unglamorous plumbing that gets data into the SIEM in a usable shape. Which is unfortunate, because most detection failures are data-quality failures wearing a detection costume. A perfect detection on a badly-onboarded log source is a perfect detection that does not fire.

This is about the layer underneath the detections, where reliability is actually won or lost.

The invisible dependency

Every detection has a silent dependency on the data underneath it: the right fields, parsed correctly, populated consistently, arriving reliably. When that dependency holds, the detection works and nobody thinks about the plumbing. When it breaks — a field stops parsing after a vendor update, a log source goes silent, a timestamp is misinterpreted — the detection fails silently, and the failure looks like a detection problem when it is really a data problem.

This is the insidious part. A detection that does not fire produces nothing — no alert, no error, no signal that it is broken. You cannot tell the difference between “the threat did not occur” and “the detection could not see it because the data underneath was malformed.” Both look identical: silence. And silence is exactly what you do not want to misread in a SOC.

So the reliability of your detection layer is capped by the reliability of your worst-onboarded log source — because that is where the silent failures hide.

What good onboarding actually requires

Onboarding a log source properly is more than pointing it at the SIEM and confirming events arrive. The disciplines:

Parsing correctness. The raw log must be broken into fields correctly, and stay correct across vendor format changes. A log source whose format drifts after an update — a new field order, a changed delimiter, an added wrapper — will silently mis-parse, populating your detection’s fields with garbage or nothing. Parsing is not set-and-forget; it is a dependency that decays when the source changes.

Normalisation consistency. Different sources describe the same concept differently — a source IP, a username, an action — and detection logic that spans sources needs them normalised to a common schema. Inconsistent normalisation means a cross-source detection silently misses events from the source that named the field differently. This is where multi-source correlation quietly loses coverage.
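A minimal sketch of the idea, assuming two hypothetical sources (`firewall_a`, `proxy_b`) and an illustrative common schema of `src_ip`, `user`, and `action`. The source names, field names, and mapping table are invented for illustration and are not any particular SIEM's schema. The point is that a normaliser should report the schema fields it could not fill rather than drop them quietly.

```python
# Hypothetical per-source field mappings onto an illustrative common schema.
FIELD_MAPS = {
    "firewall_a": {"srcaddr": "src_ip", "usr": "user", "act": "action"},
    "proxy_b": {"client_ip": "src_ip", "username": "user", "verdict": "action"},
}
REQUIRED = ("src_ip", "user", "action")


def normalise(source: str, event: dict) -> tuple[dict, list[str]]:
    """Rename source-specific fields to the common schema.

    Returns the normalised event plus the schema fields left unfilled,
    so a naming mismatch is reported instead of silently ignored.
    """
    mapping = FIELD_MAPS[source]
    out = {mapping[k]: v for k, v in event.items() if k in mapping}
    missing = [f for f in REQUIRED if f not in out]
    return out, missing


fw, fw_missing = normalise(
    "firewall_a", {"srcaddr": "10.0.0.5", "usr": "alice", "act": "deny"}
)
# Here the proxy has (hypothetically) started sending "user_name"
# instead of "username", so the mapping no longer matches.
px, px_missing = normalise(
    "proxy_b", {"client_ip": "10.0.0.5", "user_name": "alice", "verdict": "block"}
)
```

In this sketch `px_missing` contains `user`, which is the signal a cross-source detection keyed on `user` would otherwise never receive.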

Field completeness. A detection needs specific fields populated. A source that delivers events but leaves the detection’s key field empty is worse than no source, because it creates the appearance of coverage without the substance. “We’re ingesting that source” is not “that source feeds our detections usefully.”
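One way to make completeness measurable is a per-field fill rate over a sample of events. The sketch below assumes events are already parsed into dictionaries. Treating `"-"` as an empty value is an assumption, as are the 95% threshold and the field names.

```python
# Values treated as "not populated"; the "-" placeholder is an assumption.
EMPTY_VALUES = (None, "", "-")


def fill_rates(events: list[dict], key_fields: list[str]) -> dict[str, float]:
    """Fraction of events in which each key field is present and non-empty."""
    total = len(events)
    if total == 0:
        return {f: 0.0 for f in key_fields}
    return {
        f: sum(1 for e in events if e.get(f) not in EMPTY_VALUES) / total
        for f in key_fields
    }


sample = [
    {"src_ip": "10.0.0.5", "user": "alice"},
    {"src_ip": "10.0.0.6", "user": ""},
    {"src_ip": "10.0.0.7"},
    {"src_ip": "10.0.0.8", "user": "-"},
]
rates = fill_rates(sample, ["src_ip", "user"])

# Fields below an illustrative 95% threshold deserve a closer look.
under_populated = [f for f, r in rates.items() if r < 0.95]
```

In this sample every event arrives and `src_ip` is always filled, yet `user` is populated in only one event out of four. An ingestion count alone would report this source as healthy.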

Timestamp integrity. Time is the axis all correlation runs on. A source whose timestamps are mis-parsed, assigned the wrong time zone, or taken from ingestion time instead of event time will scramble any time-based correlation it participates in. Behavioural detection, which is fundamentally about sequence over time, is destroyed by bad timestamps, and the destruction is invisible until you look.
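A small sketch of two timestamp checks, assuming the source writes naive local timestamps at a known fixed UTC offset. Real sources with daylight-saving changes need a proper time-zone database, so the fixed offset is a deliberate simplification. The function names and the 15-minute lag tolerance are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def to_utc(raw: str, utc_offset_hours: int) -> datetime:
    """Parse a naive 'YYYY-MM-DD HH:MM:SS' event time and convert it to UTC.

    Assumes the source logs at a fixed UTC offset (no DST handling).
    """
    naive = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    local = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)


def lag_suspicious(
    event_time: datetime,
    ingest_time: datetime,
    max_lag: timedelta = timedelta(minutes=15),
) -> bool:
    """Flag events that appear to come from the future or arrive implausibly late."""
    lag = ingest_time - event_time
    return lag < timedelta(0) or lag > max_lag


ingested = datetime(2026, 5, 29, 12, 0, 30, tzinfo=timezone.utc)

# Correct interpretation: the source logs at UTC+2.
correct = to_utc("2026-05-29 14:00:00", utc_offset_hours=2)
# Misinterpretation: the same string read as if it were already UTC.
wrong = to_utc("2026-05-29 14:00:00", utc_offset_hours=0)

correct_flagged = lag_suspicious(correct, ingested)
wrong_flagged = lag_suspicious(wrong, ingested)
```

Comparing event time against ingestion time is a cheap sanity check. An event that appears to arrive before it happened usually points to a zone or parsing error, not to time travel.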

Reliability of delivery. A source that silently stops sending is a coverage hole that announces itself only when you go looking. Onboarding therefore includes monitoring what you onboarded: knowing when a source goes quiet, because a silent source and a quiet environment look the same.

The data-quality-as-detection-quality principle

The unifying idea: detection quality is bounded above by data quality, and data quality is mostly an onboarding-and-maintenance discipline. You cannot detect what your data does not faithfully represent. A kill-chain detection that depends on three sources is only as reliable as the least-reliable of the three — and if one of them mis-parses after a vendor update, the whole detection degrades silently.

This reframes where SOC reliability work should go. The instinct is to invest in more and better detections. But a SOC with excellent detections on shaky data sources is less reliable than one with decent detections on rock-solid data, because the second one fires when it should and the first one fails in ways nobody sees. Data quality is the foundation; detection content is the building. You cannot out-detect a bad foundation.

Treating onboarding as ongoing, not one-time

The most common onboarding mistake is treating it as a project that completes. A source is onboarded, it works, the ticket closes, and it is never looked at again — until a vendor update breaks the parsing and a detection silently dies. The fixes:

Monitor source health continuously. Know the expected volume and cadence of each source, and alert when a source deviates — goes silent, changes volume dramatically, or starts producing parse errors. A source going quiet should page someone, because it is a coverage hole.
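The simplest form of this is a silence check: compare each source's last-seen time with the longest gap you would normally expect from it. The sketch below assumes you can obtain a last-seen timestamp per source from your SIEM. The source names and gap thresholds are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical expected maximum gap between events, per source.
EXPECTED_MAX_GAP = {
    "domain_controller": timedelta(minutes=5),
    "vpn_gateway": timedelta(hours=1),
}


def silent_sources(last_seen: dict[str, datetime], now: datetime) -> list[str]:
    """Return sources whose last event is older than their expected maximum gap.

    A source with no recorded event at all is also reported.
    """
    silent = []
    for source, max_gap in EXPECTED_MAX_GAP.items():
        seen = last_seen.get(source)
        if seen is None or now - seen > max_gap:
            silent.append(source)
    return silent


now = datetime(2026, 5, 29, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "domain_controller": now - timedelta(minutes=42),
    "vpn_gateway": now - timedelta(minutes=10),
}
quiet = silent_sources(last_seen, now)
```

Per-source thresholds matter here. Ten minutes of silence is normal for a low-volume gateway and alarming for a domain controller.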

Validate parsing after changes. Vendor format changes are the primary cause of silent parsing failure. Where you can, validate that key fields are still populated correctly after updates — the same way detection-as-code validates detections, validate that the data the detections depend on is still arriving in the expected shape.
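One way to do this is a set of "golden" samples: raw lines paired with the fields they are expected to yield, re-checked after every source or parser change. The log format, the regular expression, and the sample lines below are invented for illustration. In practice the parser under test would be whatever your pipeline actually uses.

```python
import re

# Hypothetical parser for an illustrative space-delimited key=value format.
PATTERN = re.compile(
    r"src=(?P<src_ip>\S+) user=(?P<user>\S+) action=(?P<action>\S+)"
)


def parse(line: str) -> dict:
    """Return the extracted fields, or an empty dict if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else {}


EXPECTED = {"src_ip": "10.0.0.5", "user": "alice", "action": "deny"}

# Golden samples: a raw line paired with the fields it should produce.
GOLDEN = [("src=10.0.0.5 user=alice action=deny", EXPECTED)]

# The same event after a hypothetical vendor update changed the delimiter.
DRIFTED = [("src=10.0.0.5;user=alice;action=deny", EXPECTED)]


def parsing_regressions(samples: list[tuple[str, dict]]) -> list[str]:
    """Return the raw lines whose parsed fields differ from expectations."""
    return [raw for raw, expected in samples if parse(raw) != expected]


before_update = parsing_regressions(GOLDEN)
after_update = parsing_regressions(DRIFTED)
```

Run against fresh samples captured after an update, a check like this turns a silent mis-parse into a failing test.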

Map sources to the detections that depend on them. When a source breaks, you need to know immediately which detections just went blind. A dependency map from sources to detections turns a vague “a log source broke” into a specific “these five detections are now non-functional” — which is the difference between a silent gap and a managed incident.
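The map itself can be very small. The sketch below assumes each detection declares the sources it reads from. The detection and source names are hypothetical.

```python
# Hypothetical detections and the log sources each one depends on.
DETECTION_SOURCES = {
    "lateral_movement_chain": {"domain_controller", "edr", "vpn_gateway"},
    "impossible_travel": {"vpn_gateway", "idp"},
    "malware_beacon": {"proxy", "edr"},
}


def blinded_by(broken_source: str) -> list[str]:
    """Return the detections that depend on the given source, sorted by name."""
    return sorted(
        detection
        for detection, sources in DETECTION_SOURCES.items()
        if broken_source in sources
    )


affected = blinded_by("vpn_gateway")
```

Keeping this declaration next to the detection content, for example as metadata on each rule, is one way to stop the map drifting out of date.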

Treat onboarding quality as a tracked metric. Coverage maps usually show “do we have a detection for X.” They should also show “is the data underneath that detection healthy” — because a detection on a broken source is a red cell wearing a green costume.
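As a rough sketch, a coverage view can fold source health into each detection's status, so that a rule on a broken source no longer shows as plain green. The status strings, detection names, and source names are illustrative assumptions.

```python
def coverage_status(
    detection_sources: dict[str, set[str]], unhealthy: set[str]
) -> dict[str, str]:
    """Mark a detection healthy only if every source it depends on is healthy."""
    status = {}
    for detection, sources in detection_sources.items():
        broken = sources & unhealthy
        if broken:
            status[detection] = "degraded: " + ", ".join(sorted(broken))
        else:
            status[detection] = "healthy"
    return status


# Hypothetical inputs: what each detection reads, and which sources are failing.
detections = {
    "impossible_travel": {"vpn_gateway", "idp"},
    "malware_beacon": {"proxy", "edr"},
}
report = coverage_status(detections, unhealthy={"vpn_gateway"})
```

The set of unhealthy sources would come from whatever volume, parsing, and completeness checks you already run.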

The takeaway

The unglamorous truth of SIEM operations: your detection layer is only as reliable as your worst-onboarded log source, because that is where the silent failures live, and silent failures in detection are the most dangerous kind — indistinguishable from “nothing happened.” Parsing, normalisation, field completeness, timestamp integrity, and delivery reliability are not preamble to detection engineering; they are its foundation, and they decay when sources change.

The reframe to carry: most detection failures are data-quality failures in disguise, so invest in the plumbing — and monitor it as continuously as you monitor for threats. A decent detection on rock-solid data beats an excellent detection on a source that silently broke last Tuesday. The boring layer is where reliability actually lives.

An independent piece by johlem.net — IT security, Luxembourg. SIEM operations and detection engineering for regulated finance.