Splunk 101: Hands-On Introduction to SIEM, Log Ingestion, and Basic Threat Hunting

Splunk, SIEM, Log Analysis: Hands-on walkthrough of log ingestion, SPL querying, and basic security investigation using Splunk.

Keywords: Splunk, SIEM, Log Analysis, SOC, Threat Hunting, TryHackMe

Security monitoring is not just about collecting logs — it’s about turning raw machine data into actionable security insights.

In this hands-on walkthrough, I explored the fundamentals of Splunk, one of the most widely used SIEM platforms, through a practical lab exercise involving log ingestion and basic investigation queries. Instead of only covering theory, this walkthrough focuses on what we actually did, why we did it, and how Splunk helps security analysts investigate events efficiently.

Lab Link: https://tryhackme.com/room/splunk101

What is Splunk?

Splunk is a SIEM (Security Information and Event Management) platform that helps security teams:

Collect logs from multiple sources
Normalize and index machine data
Search events efficiently
Correlate security events
Visualize trends and anomalies
Accelerate incident detection and investigation

Think of it as a centralized visibility platform for your infrastructure.

Without SIEM tools, reviewing thousands of logs manually would be operationally painful.

Core Splunk Architecture

Splunk mainly operates using three core components.

1. Forwarder

The Forwarder acts as the log collector.

It is a lightweight agent installed on monitored systems that gathers logs and forwards them to Splunk.

Typical data sources include:

Windows Event Logs
Syslogs
Web server logs
Firewall logs
Database logs
Endpoint telemetry

Its lightweight nature ensures minimal performance impact on the host.
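
To make the forwarder's role concrete, here is a minimal Python sketch of the idea: read raw lines and attach collection metadata before sending them on. It is not Splunk's forwarder or its wire protocol; the function name `forward_events`, the envelope layout, and the sample lines are all assumptions made for illustration.

```python
import json
import socket
from typing import Iterable, Iterator, Optional


def forward_events(
    lines: Iterable[str],
    source: str,
    sourcetype: str,
    host: Optional[str] = None,
) -> Iterator[str]:
    """Wrap each raw log line in a JSON envelope carrying collection metadata."""
    host = host or socket.gethostname()
    for line in lines:
        text = line.rstrip("\r\n")
        if not text:
            continue  # blank lines carry no event
        yield json.dumps(
            {"host": host, "source": source, "sourcetype": sourcetype, "event": text}
        )


# Hypothetical raw lines standing in for a monitored log file.
raw_lines = ["user login ok\n", "\n", "user logout\n"]
envelopes = list(
    forward_events(raw_lines, source="/var/log/demo.log", sourcetype="syslog", host="demo-host")
)
for envelope in envelopes:
    print(envelope)
```

The forwarder does very little work per line, which is why it can stay lightweight: parsing and indexing are left to the next component.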

Answer from lab: Forwarder

2. Indexer

The Indexer is where the real processing happens.

Once data reaches the indexer:

Raw logs are parsed
Data is normalized
Fields are extracted
Events are indexed for fast searchability

Without indexing, searching through millions of events would be painfully slow.
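
The benefit of indexing can be sketched with a toy inverted index in Python. This is a simplification for intuition only and does not reflect how Splunk stores data internally; `build_index` and the sample events are hypothetical.

```python
import json
from collections import defaultdict


def build_index(raw_events):
    """Parse JSON events and map each (field, value) pair to matching event positions."""
    events = []
    inverted = defaultdict(set)
    for raw in raw_events:
        event = json.loads(raw)  # parsing step
        position = len(events)
        events.append(event)
        for field, value in event.items():  # field extraction step
            inverted[(field, str(value))].add(position)
    return events, inverted


# Hypothetical raw events, not taken from the lab dataset.
raw_events = [
    '{"UserName": "alice", "Source_Country": "France"}',
    '{"UserName": "bob", "Source_Country": "Canada"}',
    '{"UserName": "alice", "Source_Country": "Canada"}',
]
events, inverted = build_index(raw_events)

# A lookup now touches only the matching positions instead of scanning every event.
print(sorted(inverted[("UserName", "alice")]))
```

The work is done once at ingestion time, so each later search becomes a lookup rather than a full scan.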

3. Search Head

The Search Head is the analyst’s workspace.

This is where:

Queries are written
Investigations are performed
Dashboards are built
Visualizations are generated

Splunk uses SPL (Search Processing Language) for querying indexed data.

Connecting to the Lab

The lab environment provides a live Splunk instance.

After launching the machine, the dashboard becomes accessible via the provided IP.

At this point, the goal is not hunting threats yet — it’s simply getting familiar with the platform.

Navigating the Splunk Interface

When opening Splunk for the first time, several interface sections appear.

Top Navigation Bar

This provides administrative and operational controls:

Messages
Settings
Activity
Help
Search
App Switching

This is where analysts usually monitor running jobs and manage platform settings.

Apps Panel

Splunk is modular.

Different apps provide specialized functionality.

The default app is:

Search & Reporting

This is the primary workspace for analysts.

Explore Section

This section gives quick access to:

Add Data
Install Apps
Documentation

This becomes especially useful during data onboarding.

Adding Data into Splunk

A SIEM without data is just an expensive dashboard.

Splunk supports ingestion from:

Event logs
Syslogs
Web logs
Network telemetry
Custom files
API-based sources

For this lab, we worked with VPN log data.

Collecting Data from Files

Inside Add Data, Splunk provides multiple ingestion methods.

The option for ongoing collection from files and ports is:

Monitor

Why?

Because this mode continuously watches files or ports and ingests incoming data automatically. For a one-off file such as the dataset in this lab, the Upload option is used instead.
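
Continuous monitoring can be pictured as remembering how far into a file you have already read and picking up only what was appended since. The Python sketch below illustrates that idea under simple assumptions (a single file, complete lines, no rotation); `read_new_lines` is a hypothetical helper, not part of Splunk.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return the lines appended after byte `offset`, plus the new offset to remember."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    return data.decode("utf-8").splitlines(), offset + len(data)


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "demo.log")

    with open(log_path, "w", encoding="utf-8") as log_file:
        log_file.write("first\n")
    first_batch, offset = read_new_lines(log_path, 0)

    # Simulate the monitored application appending more events.
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write("second\nthird\n")
    second_batch, offset = read_new_lines(log_path, offset)

print(first_batch)
print(second_batch)
```

Each call returns only the new lines, so earlier events are not ingested twice.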

Practical Log Ingestion

We uploaded the provided VPN log dataset.

Splunk ingestion follows a structured workflow:

Step 1 — Select Source

Choose the log file.

This tells Splunk what raw data needs processing.

Step 2 — Select Source Type

This defines how Splunk interprets the incoming data.

Examples:

Syslog
JSON
CSV
Windows logs

Choosing the correct type matters because field extraction depends on it.
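
A small Python sketch shows why the declared type matters: the same kind of record yields fields only when it is parsed with the matching rules. The type labels `"json"` and `"csv"` and the function `parse_events` are illustrative assumptions, not Splunk's source type definitions.

```python
import csv
import io
import json


def parse_events(raw_text, sourcetype):
    """Turn raw text into field dictionaries according to the declared source type."""
    if sourcetype == "json":
        return [json.loads(line) for line in raw_text.splitlines() if line.strip()]
    if sourcetype == "csv":
        return [dict(row) for row in csv.DictReader(io.StringIO(raw_text))]
    # Unknown or mismatched type: keep the raw line, no fields extracted.
    return [{"_raw": line} for line in raw_text.splitlines() if line.strip()]


# Hypothetical records, not taken from the lab dataset.
json_text = '{"UserName": "alice", "Source_Country": "Canada"}\n'
csv_text = "UserName,Source_Country\nalice,Canada\n"

print(parse_events(json_text, "json"))    # fields extracted
print(parse_events(csv_text, "csv"))      # fields extracted
print(parse_events(json_text, "syslog"))  # wrong type: only the raw line survives
```

With the wrong type the event is still stored, but a filter such as UserName="alice" has nothing to match against.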

Step 3 — Input Settings

Here we configured:

Index Name → vpn_logs
Host metadata

The index acts as a named, searchable repository for the ingested events.

Step 4 — Review

Always validate configurations before ingestion.

An incorrect source type can break field extraction later.

Step 5 — Done

Splunk processes the file and indexes the events.

Now the investigation phase begins.

Investigation Queries

Now comes the fun part.

Instead of manually opening raw logs, we query structured event data.

1. Total Number of Events

Objective:

Determine how many events exist in the uploaded dataset.

Query

source="VPN_logs.json" host="ip-10-10-40-195" sourcetype="_json"

Why this works

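This tells Splunk:

Return every event that came from the source file VPN_logs.json, on host ip-10-10-40-195, with source type _json.

The query identifies the uploaded data by its source, host, and sourcetype metadata rather than by index name. Since no further field filters are applied, Splunk returns every event ingested from that file.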

Result

2862 events
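
The logic of this base search can be mimicked in Python: space-separated field=value terms act as an implicit AND, and the event count is the number of matches. The `search` helper and the three sample events are hypothetical and far smaller than the lab dataset.

```python
def search(events, **terms):
    """Return events matching every field=value term (an implicit AND)."""
    return [
        event
        for event in events
        if all(event.get(field) == value for field, value in terms.items())
    ]


# Hypothetical events; only the metadata values mirror the lab query.
events = [
    {"source": "VPN_logs.json", "host": "ip-10-10-40-195", "sourcetype": "_json", "UserName": "alice"},
    {"source": "VPN_logs.json", "host": "ip-10-10-40-195", "sourcetype": "_json", "UserName": "bob"},
    {"source": "other.log", "host": "ip-10-10-40-195", "sourcetype": "syslog", "UserName": "alice"},
]

matches = search(events, source="VPN_logs.json", host="ip-10-10-40-195", sourcetype="_json")
print(len(matches))
```

Adding another keyword argument, for example `UserName="alice"`, narrows the result in the same way an extra term narrows the Splunk search.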

2. Events Generated by User “Maleena”

Objective:

Identify all VPN activity associated with a specific user.

Query

source="VPN_logs.json" host="ip-10-10-40-195" sourcetype="_json" UserName="Maleena"

Why this works

This query:

Starts from the same base search (source, host, and sourcetype)
Filters only events where UserName equals Maleena

This is useful during user-centric investigations.

Examples:

Insider threat reviews
Suspicious account monitoring
Login activity auditing

Result

60 events
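
One detail worth knowing when filtering on a user: in Splunk's search command, field names are case-sensitive while field values are matched case-insensitively. The Python sketch below imitates that behaviour; `field_equals` and the sample events are hypothetical.

```python
def field_equals(event, field, value):
    """Match with an exact field name and a case-insensitive value."""
    if field not in event:  # the field name must match exactly
        return False
    return str(event[field]).lower() == str(value).lower()


# Hypothetical events, not taken from the lab dataset.
events = [
    {"UserName": "Maleena", "Source_Country": "Canada"},
    {"UserName": "Smith", "Source_Country": "France"},
    {"UserName": "maleena", "Source_Country": "France"},
]

by_value = [e for e in events if field_equals(e, "UserName", "MALEENA")]
by_wrong_name = [e for e in events if field_equals(e, "username", "Maleena")]
print(len(by_value), len(by_wrong_name))
```

A mistyped field name silently returns nothing, so an empty result is a prompt to check the spelling of the field before concluding the user was inactive.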

3. Identify Username Behind an IP Address

Objective:

Map an IP address back to a user identity.

Query

source="VPN_logs.json" host="ip-10-10-40-195" sourcetype="_json" Source_ip="107.14.182.38"

Why this works

This filters events originating from the specified source IP. The UserName field of the matching events then shows which account used that address.

This is common during incident response when an IP is flagged externally and analysts need attribution.

Result

Smith
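
The attribution step (filter by IP, then read off the user) can be sketched in Python as collecting the distinct UserName values for one Source_ip. The helper `users_for_ip` and the sample rows are illustrative assumptions; only the field names, the IP, and the user come from the lab.

```python
def users_for_ip(events, ip):
    """Collect the distinct UserName values seen with a given Source_ip."""
    return sorted(
        {e["UserName"] for e in events if e.get("Source_ip") == ip and "UserName" in e}
    )


# Hypothetical rows for illustration; 192.0.2.10 is a documentation-range address.
events = [
    {"UserName": "Smith", "Source_ip": "107.14.182.38"},
    {"UserName": "Smith", "Source_ip": "107.14.182.38"},
    {"UserName": "alice", "Source_ip": "192.0.2.10"},
]

print(users_for_ip(events, "107.14.182.38"))
```

Returning a list rather than a single name is deliberate: more than one account behind the same IP is itself something an analyst would want to see.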

4. Events from All Countries Except France

Objective:

Exclude one geography from the search.

Query

source="VPN_logs.json" host="ip-10-10-40-195" sourcetype="_json" NOT Source_Country="France"

Why this works

The NOT operator removes all events matching the specified condition.

This is useful when:

Filtering known benign traffic
Narrowing investigations
Removing noise

Result

2814 events
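
A subtlety of exclusion in SPL is that NOT Source_Country="France" also keeps events that have no Source_Country field at all, whereas Source_Country!="France" keeps only events where the field exists with a different value. The Python sketch below contrasts the two; both helper names and the sample events are hypothetical.

```python
def exclude_with_not(events, field, value):
    """Mimic NOT field=value: keep events that lack the field or hold another value."""
    return [e for e in events if e.get(field) != value]


def exclude_with_inequality(events, field, value):
    """Mimic field!=value: keep only events that have the field with another value."""
    return [e for e in events if field in e and e[field] != value]


# Hypothetical events, not taken from the lab dataset.
events = [
    {"UserName": "alice", "Source_Country": "France"},
    {"UserName": "bob", "Source_Country": "Canada"},
    {"UserName": "carol"},  # no Source_Country field at all
]

print(len(exclude_with_not(events, "Source_Country", "France")))
print(len(exclude_with_inequality(events, "Source_Country", "France")))
```

Because NOT removes exactly the events that match the condition, the lab's numbers imply 2862 - 2814 = 48 events with Source_Country set to France.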

5. Activity from a Specific IP

Objective:

Check all events tied to a suspicious IP.

Query

source="VPN_logs.json" host="ip-10-10-40-195" sourcetype="_json" Source_ip="107.3.206.58"

Why this works

This isolates activity from one source IP address.

This helps in:

IOC investigations
VPN misuse analysis
Threat actor tracking

Result

14 events

Key Splunk Concepts Learned

This exercise reinforces several foundational SIEM concepts:

Log Ingestion

Security visibility starts with proper onboarding.

No data = no detection.

Indexing

Raw logs become searchable events after parsing and indexing.

SPL Querying

Analysts use SPL to rapidly hunt across large datasets.

Filtering and Investigation

Instead of reading logs manually, we ask focused questions:

Who logged in?
From where?
How often?
Any anomalies?
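
These questions map onto simple aggregations. The Python sketch below counts events per user and lists the countries each user connected from, roughly what `stats count by UserName` would give in SPL. The helper `count_by` and the sample events are hypothetical.

```python
from collections import Counter


def count_by(events, field):
    """Count events per value of `field`, skipping events that lack it."""
    return Counter(e[field] for e in events if field in e)


# Hypothetical events, not taken from the lab dataset.
events = [
    {"UserName": "alice", "Source_Country": "Canada"},
    {"UserName": "alice", "Source_Country": "Canada"},
    {"UserName": "bob", "Source_Country": "France"},
    {"UserName": "alice", "Source_Country": "Brazil"},
]

logins_per_user = count_by(events, "UserName")  # who, and how often
countries_per_user = {  # from where
    user: sorted({e["Source_Country"] for e in events if e["UserName"] == user})
    for user in logins_per_user
}

print(logins_per_user.most_common(1))
print(countries_per_user["alice"])
```

A user appearing from several countries in a short window is the kind of pattern that would prompt a closer look.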

Why This Matters in Real SOC Work

Even though this is a beginner lab, the workflow mirrors real-world analyst activity.

A SOC analyst frequently:

Investigates suspicious IPs
Reviews user login activity
Filters noise
Hunts indicators of compromise
Correlates events across datasets

Splunk dramatically reduces investigation time compared to manual log review.

Final Thoughts

Splunk remains one of the most practical SIEM platforms for defenders because it combines:

scalability
fast search
flexible parsing
rich dashboards
investigation speed

This lab was beginner-friendly, but it introduces the exact mindset needed for operational security monitoring.

Start with simple searches.

Then move toward:

correlation rules
alerting
dashboards
anomaly detection
incident investigations

That’s where Splunk becomes truly powerful.

Splunk 101: Hands-On Introduction to SIEM, Log Ingestion, and Basic Threat Hunting was originally published in System Weakness on Medium.