Kubernetes Logging

Persistence, Aggregation & Modern Solutions (Loki, EFKB Stack)

Lesson 1: Core Principles of Kubernetes Logging

Logging in Kubernetes is fundamentally different from traditional server environments due to the ephemeral nature of containers. Understanding these core principles is essential for building a robust logging system.

The Container Logging Model

Standard Output Requirement: Applications in containerized environments must write logs to standard output (stdout) and standard error (stderr). This is a fundamental requirement for container-based logging.
# Traditional logging (WRONG for containers)
# Application writes to /var/log/app.log
echo "Log message" >> /var/log/app.log

# Container logging (CORRECT)
# Application writes to stdout/stderr
echo "Log message"        # Goes to stdout
echo "Error message" >&2  # Goes to stderr

# The Docker daemon captures these streams
# and handles them with configured log drivers
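The same principle applies inside application code: log structured lines to stdout/stderr and let the runtime handle the rest. A minimal Python sketch (field names and the severity-to-stderr split are illustrative choices, not a standard):

```python
import json
import sys
from datetime import datetime, timezone

def log(level, message, **fields):
    """Emit one structured log line: errors to stderr, everything else to stdout."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
        **fields,
    }
    stream = sys.stderr if level in ("ERROR", "CRITICAL") else sys.stdout
    print(json.dumps(entry), file=stream)

log("INFO", "service started", port=8080)
log("ERROR", "database connection failed", retries=3)
```

Because each line is a self-contained JSON object, the node-level collector can parse it without any custom regex.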

Docker Log Drivers

Log Driver Role: Docker's daemon aggregates the stdout and stderr streams from containers using configured log drivers. Common drivers include json-file (default) and journald.
Log Driver   Storage               Best For
json-file    JSON files on disk    Default, simple deployments
journald     systemd journal       systemd-based systems
syslog       Syslog daemon         Integration with existing syslog infrastructure
none         Disabled              When using external log collectors
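As a sketch, the default json-file driver's rotation behavior can be configured cluster-wide in Docker's daemon.json (the size and file-count values below are illustrative):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

Without rotation limits, a chatty container can fill the node's disk with its JSON log file.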

The Five Critical Principles

1. Persistence

Problem: Containers are ephemeral. When a container restarts, its logs are lost unless persisted externally.

Solution: Logs must be collected and saved outside the container runtime to survive restarts and terminations.

kubectl logs Limitation: The built-in kubectl logs --previous command only shows logs from the immediately previous container instance. It cannot trace a chain of multiple restarts, making it insufficient for production debugging.
# kubectl logs limitations

# View current container logs
kubectl logs my-pod

# View previous container logs (only one restart back)
kubectl logs my-pod --previous

# Problem: If pod has restarted 5 times, you can only see the last restart
# Restarts 1, 2, 3, 4 are LOST forever without external log persistence

# Solution: Use a centralized logging system that captures logs
# from all containers before they restart

2. Aggregation and Centralization

Problem: In a Kubernetes cluster, logs are scattered across multiple nodes, pods, and containers.

Solution: Collect logs from all sources into a central location:

  • Application logs from all pods
  • System logs from worker nodes (OS-level)
  • Control plane component logs (API server, scheduler, controller manager)
  • kubelet and container runtime logs

3. Metadata Enrichment

Problem: Raw container logs don't include context about which pod, namespace, or replica generated them.

Solution: Add crucial metadata to each log entry:

  • Pod Name: my-app-deployment-abc123
  • Namespace: production
  • Container Name: nginx
  • Node Name: worker-node-1
  • Labels: app=my-app, version=v2.1, env=prod
Without Metadata: If you have 10 replicas of the same application, their logs are completely indistinguishable. You cannot debug which specific replica had an issue, making troubleshooting impossible.
# Log WITHOUT metadata (useless in production)
2025-01-15 10:30:45 ERROR: Database connection failed

# Log WITH metadata (actionable)
{
  "timestamp": "2025-01-15T10:30:45Z",
  "level": "ERROR",
  "message": "Database connection failed",
  "kubernetes": {
    "namespace": "production",
    "pod_name": "my-app-deployment-7f9c4b8d-xk2pq",
    "container_name": "app",
    "node_name": "worker-node-2",
    "labels": {
      "app": "my-app",
      "version": "v2.1",
      "environment": "production"
    }
  }
}

# Now you can:
# - Identify the exact pod that failed
# - Correlate with metrics from that specific pod
# - Check if other pods on the same node are affected
# - Filter by version to see if it's a deployment issue

4. Parsing and Structuring

Problem: Many applications output unstructured text logs that are difficult to search and analyze.

Solution: Parse raw text into structured formats (typically JSON) to enable powerful queries and filtering.

# Unstructured log (hard to query)
"2025-01-15 10:30:45 [ERROR] user=john action=login ip=192.168.1.100 result=failed"

# Structured log (easily queryable)
{
  "timestamp": "2025-01-15T10:30:45Z",
  "level": "ERROR",
  "user": "john",
  "action": "login",
  "ip": "192.168.1.100",
  "result": "failed"
}

# Now you can easily query:
# - All failed logins: level=ERROR AND action=login AND result=failed
# - All actions by user: user=john
# - All traffic from IP: ip=192.168.1.100
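The transformation above is what a parser stage does mechanically. A minimal Python sketch of parsing that exact line format (the pattern is illustrative and specific to this example, not a general log grammar):

```python
import re

LINE = '2025-01-15 10:30:45 [ERROR] user=john action=login ip=192.168.1.100 result=failed'

# Matches "<date> <time> [<level>] key=value key=value ..."
PATTERN = re.compile(r'^(?P<date>\S+) (?P<time>\S+) \[(?P<level>\w+)\] (?P<rest>.*)$')

def parse(line):
    """Parse one log line into a structured dict; key=value pairs become fields."""
    m = PATTERN.match(line)
    if m is None:
        return {"message": line}  # fall back to raw text if the line doesn't match
    entry = {
        "timestamp": f"{m.group('date')}T{m.group('time')}Z",
        "level": m.group('level'),
    }
    for pair in m.group('rest').split():
        key, _, value = pair.partition('=')
        entry[key] = value
    return entry

print(parse(LINE))
```

In practice this work is delegated to the log collector's parser plugins rather than written by hand, but the logic is the same: regex match, then field extraction.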

5. Filtering and Optimization

Problem: Collecting every single log message can overwhelm your logging system and generate massive costs.

Solution: Implement strict filtering to collect only relevant logs (WARNING, ERROR, CRITICAL levels) and drop verbose DEBUG/INFO messages in production.

Cost Warning: Without proper filtering, logging infrastructure costs can exceed the cost of running your actual application. Filter logs before they reach storage to prevent resource exhaustion.
# Example filtering strategy

# Development environment: Collect everything
log_level: DEBUG

# Staging environment: Drop DEBUG
log_level: INFO

# Production environment: Only warnings and errors
log_level: WARNING

# Example filter configuration (Fluent Bit)
[FILTER]
    Name   grep
    Match  *
    Regex  level (WARNING|ERROR|CRITICAL)

# This drops all DEBUG and INFO logs before sending to storage
# Reduces log volume by 80-90% in typical applications
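The grep filter above is just a severity threshold. The same logic as a Python sketch (the numeric level values mirror common logging conventions and are illustrative):

```python
# Severity-based filter: keep only entries at or above the threshold
LEVELS = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}

def keep(entry, threshold="WARNING"):
    """Return True if the log entry's level meets the severity threshold."""
    return LEVELS.get(entry.get("level"), 0) >= LEVELS[threshold]

logs = [
    {"level": "DEBUG", "message": "cache hit"},
    {"level": "INFO", "message": "request served"},
    {"level": "ERROR", "message": "db connection failed"},
]

kept = [e for e in logs if keep(e)]
print(kept)  # only the ERROR entry survives
```

Applying this before logs leave the node is what keeps storage and network costs proportional to the logs you actually need.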

Lesson 2: Logging Architectures & Challenges

Understanding the different approaches to collecting logs in Kubernetes and the challenges each presents.

Three Logging Patterns

Node-Level Logging Agent

Approach: Run a logging agent on each node as a DaemonSet

Pros:

  • Most common and recommended pattern
  • One agent per node (low overhead)
  • Automatic for all pods on node
  • No app modifications needed

Cons:

  • Only works with stdout/stderr
  • Agent must be privileged

Sidecar Container Pattern

Approach: Add a logging container to each pod

Pros:

  • Can handle multiple log streams
  • App-specific log processing
  • Works with file-based logs

Cons:

  • High resource overhead (many agents)
  • More complex configuration
  • Logs written to disk first

Direct Application Logging

Approach: Application sends logs directly to logging backend

Pros:

  • No intermediate agent needed
  • Full control over log format
  • Can add custom application context

Cons:

  • Requires application code changes
  • Tight coupling to logging backend
  • No logs if backend is unavailable
  • kubectl logs doesn't work

Recommended: Node-Level DaemonSet

Best Practice: Use a node-level logging agent deployed as a DaemonSet. This provides the best balance of simplicity, efficiency, and automatic coverage for all pods.
# Example: Fluent Bit DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:2.0
        volumeMounts:
        # Read container logs
        - name: varlog
          mountPath: /var/log
        # Read pod metadata
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        # Configuration
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config

The Ephemeral Nature Challenge

Core Challenge: Kubernetes pods and containers are designed to be ephemeral - they start, stop, restart, and terminate frequently. This makes logging significantly more complex than traditional server environments.

What Happens During a Pod Lifecycle

Pod Starts

Container begins writing logs to stdout/stderr → the container runtime writes them to files exposed under /var/log/containers/

Application Runs

Logs accumulate on disk → Logging agent reads and forwards to central storage

Container Crashes (OOM, panic, etc.)

Container terminates → Log files remain temporarily on disk

Pod Restarts

New container starts → Old log files eventually deleted by kubelet → Without external logging, restart history is LOST

Kubelet Log Retention: By default, kubelet keeps logs from terminated containers for a limited time and with size limits:
  • containerLogMaxSize: Default 10Mi per container log file
  • containerLogMaxFiles: Default 5 rotated files
  • Maximum ~50Mi of logs per container before rotation/deletion
Once these limits are exceeded, older logs are permanently lost without external collection.

Log Volume Challenges

Scale Problem: In a production Kubernetes cluster, log volume can quickly become overwhelming:
# Example calculation for a medium cluster:

# Cluster size:
# - 50 nodes
# - 500 pods (average 10 pods per node)
# - Each pod generates 100 lines/second

# Log volume:
500 pods × 100 lines/sec = 50,000 lines/second
                         = 3,000,000 lines/minute
                         = 180,000,000 lines/hour
                         = 4,320,000,000 lines/day (4.3 billion!)

# At ~500 bytes per log line:
4.32 billion × 500 bytes = 2.16 TB/day of raw logs

# With metadata enrichment (+30% size):
2.16 TB × 1.3 = 2.8 TB/day

# Storage cost (at $0.10/GB/month):
2.8 TB/day × 30 days × $0.10/GB = $8,400/month JUST for storage

# This is why filtering is critical!
Resource Consumption Warning: Without proper filtering and optimization, your logging infrastructure can easily consume more resources (CPU, memory, storage, network) and cost more money than your actual application workloads!
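The back-of-the-envelope estimate above is easy to reproduce and adapt to your own cluster; all inputs below are the illustrative numbers from the example:

```python
# Reproduce the log-volume cost estimate (all inputs are illustrative)
pods = 500
lines_per_sec = 100
bytes_per_line = 500
metadata_overhead = 1.3       # +30% after metadata enrichment
price_per_gb_month = 0.10     # assumed storage price

lines_per_day = pods * lines_per_sec * 86_400
raw_tb_per_day = lines_per_day * bytes_per_line / 1e12
enriched_tb_per_day = raw_tb_per_day * metadata_overhead
monthly_cost = enriched_tb_per_day * 30 * 1000 * price_per_gb_month  # TB -> GB

print(f"{lines_per_day:,} lines/day")          # 4,320,000,000 lines/day
print(f"{raw_tb_per_day:.2f} TB/day raw")      # 2.16 TB/day raw
print(f"~${monthly_cost:,.0f}/month storage")
```

Plug in your own pod count and line rate; the conclusion rarely changes: unfiltered log volume grows linearly with replicas and dominates costs quickly.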

Multi-Source Log Collection

Complete Kubernetes Logging Sources

Application Logs:

  • Container stdout/stderr (primary source)
  • Application-specific log files (if using sidecar)

System Logs:

  • kubelet logs (systemd journal or /var/log/)
  • Container runtime logs (Docker/containerd)
  • Operating system logs (syslog, kernel)

Control Plane Logs:

  • kube-apiserver (API requests, authentication, authorization)
  • kube-scheduler (scheduling decisions)
  • kube-controller-manager (reconciliation loops)
  • etcd (cluster state database)

Audit Logs:

  • Kubernetes audit events (who did what, when)
  • Security and compliance tracking
Best Practice: Start by collecting application logs (stdout/stderr) from all pods. Add system and control plane logs once your basic logging infrastructure is stable. Audit logs can be added last for compliance requirements.

Lesson 3: Loki - Modern Logging for Kubernetes

Grafana Loki is a modern, lightweight logging system designed specifically for cloud-native environments. It offers a simpler, more cost-effective alternative to traditional logging stacks.

What is Loki?

Loki: An open-source log aggregation system from Grafana Labs, often described as "Prometheus for logs." It's designed to be efficient, cost-effective, and integrate seamlessly with existing Prometheus and Grafana deployments.
Key Philosophy: Unlike traditional logging systems that index the full text of every log line, Loki only indexes metadata (labels). This makes it significantly more efficient and less resource-intensive.

Why "Prometheus for Logs"?

Prometheus (Metrics)

  • Indexes metrics by labels
  • Stores time-series data in TSDB
  • Uses PromQL for queries
  • Pull-based scraping model
  • Lightweight and efficient
  • Short-term retention (weeks)

Loki (Logs)

  • Indexes logs by labels (not content!)
  • Stores log data in TSDB
  • Uses LogQL for queries
  • Push-based ingestion model
  • Lightweight and efficient
  • Short-medium retention (weeks)

Loki Architecture

Three-Component Design

1. Promtail (Log Collector Agent):

  • Runs as DaemonSet on each node
  • Discovers log files automatically
  • Adds Kubernetes metadata as labels
  • Pushes logs to Loki server

2. Loki (Storage & Query Engine):

  • Receives logs from Promtail
  • Indexes only labels (not log content)
  • Stores log data in chunks (compressed)
  • Serves queries via HTTP API

3. Grafana (Visualization):

  • Queries Loki using LogQL
  • Displays logs alongside metrics
  • Provides correlation between logs and metrics
  • Same dashboard for everything

Installing Loki Stack

# Install Loki stack using Helm
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

# Create namespace
kubectl create namespace logging

# Install Loki stack (includes Loki, Promtail, and Grafana)
helm install loki grafana/loki-stack \
  --namespace logging \
  --set grafana.enabled=true \
  --set prometheus.enabled=true \
  --set prometheus.alertmanager.persistentVolume.enabled=false \
  --set prometheus.server.persistentVolume.enabled=false \
  --set loki.persistence.enabled=true \
  --set loki.persistence.size=10Gi

# Get Grafana password
kubectl get secret -n logging loki-grafana \
  -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

# Port forward to Grafana
kubectl port-forward -n logging svc/loki-grafana 3000:80

# Open http://localhost:3000
# Username: admin

LogQL Query Language

LogQL: Loki's query language, inspired by PromQL. It allows you to filter, parse, and aggregate log data using label selectors and pipeline operators.
# Basic LogQL queries

# Select all logs from a specific namespace
{namespace="production"}

# Filter by multiple labels
{namespace="production", app="my-app"}

# Filter by pod name pattern (regex)
{namespace="production", pod=~"my-app-.*"}

# Filter log content (search for text)
{namespace="production"} |= "error"

# Exclude log content
{namespace="production"} != "debug"

# Case-insensitive search
{namespace="production"} |~ "(?i)error"

# Parse JSON logs
{namespace="production"} | json

# Extract specific field from JSON
{namespace="production"} | json | line_format "{{.message}}"

# Count log lines per second
rate({namespace="production"}[5m])

# Count errors per minute
sum(rate({namespace="production"} |= "ERROR" [1m]))

# Top 10 pods by log volume
topk(10, sum by (pod) (rate({namespace="production"}[5m])))

Labels: The Key to Efficiency

Loki's Secret Sauce: By indexing only labels (not log content), Loki achieves massive efficiency gains:
  • 10-50x less storage than Elasticsearch
  • Significantly lower memory and CPU usage
  • Faster query performance for label-based searches
  • Lower operational costs
# Loki automatically adds Kubernetes labels from Promtail:
{
  "job": "kubernetes-pods",
  "namespace": "production",
  "pod": "my-app-7f9c4b8d-xk2pq",
  "container": "app",
  "node_name": "worker-node-2",
  "app": "my-app",
  "version": "v2.1"
}

# These labels are indexed for fast filtering
# The actual log content is stored compressed, not indexed

Loki vs. Elasticsearch

Feature            Loki                                             Elasticsearch
Indexing           Labels only (metadata)                           Full-text indexing of all log content
Storage            10-50x less (TSDB, compressed chunks)            High (inverted indices for all text)
Resource Usage     Low (single binary, minimal memory)              High (Java, large heap, multiple nodes)
Query Speed        Fast for label queries, slower for text search   Fast for full-text search
Setup Complexity   Simple (like Prometheus)                         Complex (cluster, sharding, replicas)
Integration        Native Grafana (with metrics)                    Kibana (separate tool)
Best For           Kubernetes, cloud-native, cost-sensitive         Complex text search, compliance, large teams

Limitations of Loki

TSDB Retention Limitation: Loki stores log data in a Time Series Database (TSDB), which is not ideal for very long-term retention (2+ weeks). TSDB performance degrades with age of data.
Workarounds for Long-Term Storage:
  • Use Loki for recent logs (1-2 weeks)
  • Export older logs to object storage (S3, GCS)
  • Use Elasticsearch or similar for compliance/audit logs requiring long retention
  • Configure Loki with object storage backend (S3, GCS, Azure Blob) for better retention

Grafana Integration Benefits

Unified Observability: The biggest advantage of Loki is seamless integration with Grafana, allowing you to view logs alongside metrics in the same dashboard.
# Example: Correlating logs and metrics in Grafana
# Scenario: High error rate alert fired

# Panel 1: Error rate metric (Prometheus)
sum(rate(http_requests_total{status=~"5.."}[5m]))

# Panel 2: Error logs (Loki) - same time range
{namespace="production", app="my-app"} |= "ERROR"

# You can:
# 1. See the error rate spike in the metric graph
# 2. Immediately view the actual error messages below
# 3. Click on a log line to see full context
# 4. All in the same dashboard, same time range
# 5. No context switching between tools!

Lesson 4: EFKB Stack (Fluent Bit, Elasticsearch, Kibana)

The EFKB stack represents the evolution of traditional logging systems, offering powerful full-text search capabilities and extensive customization for complex logging requirements.

Evolution: ELK → EFK → EFKB

ELK Stack (Original)

Components: Elasticsearch, Logstash, Kibana

Problem: Logstash is Java-based and extremely resource-intensive (high CPU and memory usage), making it expensive to run at scale.

EFK Stack (First Evolution)

Components: Elasticsearch, Fluentd, Kibana

Improvement: Replaced Logstash with Fluentd (Ruby-based), which is significantly lighter and more efficient.

Remaining Issue: Fluentd still had non-trivial resource overhead for very large deployments.

EFKB Stack (Modern)

Components: Elasticsearch, Fluent Bit, Kibana (+ Optional Backend)

Breakthrough: Fluent Bit (written in C) is extremely lightweight and high-performance, making it ideal for Kubernetes environments where it runs on every node.

Why Fluent Bit?

Performance Champion: Fluent Bit is significantly more efficient than its predecessors:
  • Memory: ~450KB footprint (vs. 40-60MB for Fluentd, 200MB+ for Logstash)
  • CPU: Minimal CPU usage due to C implementation
  • Speed: Can process millions of logs per second per node
  • Written in C: Direct system calls, no interpreter overhead
Log Collector   Language       Memory Footprint   Best Use Case
Logstash        Java (JRuby)   200MB - 1GB+       Complex transformations, legacy systems
Fluentd         Ruby (CRuby)   40-60MB            Moderate scale, plugin ecosystem
Fluent Bit      C              ~450KB             High scale, Kubernetes, edge devices

Fluent Bit Architecture

Modular Pipeline Design

Fluent Bit processes logs through a series of modular stages:

1. Input (Collection)

Collects logs from various sources:

  • tail: Read from log files (like tail -f)
  • systemd: Read from systemd journal
  • tcp/udp: Listen on network ports
  • kubernetes: Automatically discover and read pod logs
  • docker: Read from Docker container logs

2. Parser (Structuring)

Converts raw text into structured data:

  • json: Parse JSON-formatted logs
  • regex: Custom parsing with regular expressions
  • apache/nginx: Parse web server logs
  • docker: Parse Docker JSON logs
  • syslog: Parse syslog format
# Example: Parsing Nginx logs with Fluent Bit

# Raw log line:
192.168.1.100 - - [15/Jan/2025:10:30:45 +0000] "GET /api/users HTTP/1.1" 200 1234

# Fluent Bit parser configuration (named capture groups become fields):
[PARSER]
    Name        nginx
    Format      regex
    Regex       ^(?<remote>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+) (?<path>[^\"]*) \S*" (?<code>[^ ]*) (?<size>[^ ]*)$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z

3. Filter (Processing & Enrichment)

Modifies, enriches, or drops logs:

  • kubernetes: Add pod name, namespace, labels
  • grep: Filter logs based on content (keep/drop)
  • modify: Add, remove, or rename fields
  • nest: Restructure nested JSON
  • throttle: Rate-limit log output
# Example filter configuration

# Add Kubernetes metadata
[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc:443
    Merge_Log           On
    K8S-Logging.Parser  On

# Drop debug logs
[FILTER]
    Name     grep
    Match    *
    Exclude  level DEBUG

# Add custom field
[FILTER]
    Name   modify
    Match  *
    Add    environment production
    Add    cluster us-east-1

4. Buffer (Reliability)

Handles temporary storage for reliability:

  • Memory buffer: Fast, but lost on crash
  • Filesystem buffer: Persistent, survives restarts
  • Backpressure handling: Slows input when backend is unavailable
Buffer Configuration Trade-off:
  • Memory buffer: Faster but logs lost if Fluent Bit crashes
  • Filesystem buffer: Survives crashes but slower and uses disk I/O
For production, use filesystem buffer to prevent log loss.
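A filesystem buffer is enabled through Fluent Bit's storage options. A sketch of the relevant settings (the storage path and memory limit are illustrative values):

```ini
[SERVICE]
    storage.path              /var/log/flb-storage/
    storage.sync              normal
    storage.backlog.mem_limit 5M

[INPUT]
    Name          tail
    Path          /var/log/containers/*.log
    storage.type  filesystem
```

With storage.type set to filesystem on an input, chunks that cannot be delivered are persisted to disk and replayed after a crash or restart instead of being lost.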

5. Routing/Output (Delivery)

Sends processed logs to destination(s):

  • elasticsearch: Send to Elasticsearch cluster
  • kafka: Send to Apache Kafka
  • http: POST to HTTP endpoint
  • s3: Store in AWS S3
  • stdout: Print to console (debugging)
  • loki: Send to Grafana Loki
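Because each [OUTPUT] section has its own Match rule, the same log stream can be routed to several backends at once. A sketch (hostnames are illustrative, assuming in-cluster Services named elasticsearch and loki):

```ini
# Route all logs to two backends simultaneously
[OUTPUT]
    Name   es
    Match  *
    Host   elasticsearch.logging.svc
    Port   9200

[OUTPUT]
    Name   loki
    Match  *
    Host   loki.logging.svc
    Port   3100
    Labels job=fluent-bit
```

This is the mechanism behind hybrid setups: recent logs queried in Loki, the same stream archived in Elasticsearch for long-term retention.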

Complete Fluent Bit Configuration Example

# Fluent Bit ConfigMap for Kubernetes
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush        5
        Daemon       Off
        Log_Level    info

    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Parser            docker
        Tag               kube.*
        Refresh_Interval  5
        Mem_Buf_Limit     5MB

    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Merge_Log           On
        K8S-Logging.Parser  On

    [FILTER]
        Name   grep
        Match  *
        Regex  level (WARNING|ERROR|CRITICAL)

    [OUTPUT]
        Name             es
        Match            *
        Host             elasticsearch.logging.svc
        Port             9200
        Index            kubernetes-logs
        Type             _doc
        Logstash_Format  On
        Retry_Limit      5

Elasticsearch & Kibana

Elasticsearch: A distributed search and analytics engine that stores and indexes logs. Provides powerful full-text search and aggregation capabilities.
Kibana: The web interface for Elasticsearch. Allows you to search logs, create visualizations, and build dashboards.
Kibana Features:
  • Discover: Search and filter logs with advanced queries
  • Visualize: Create charts, graphs, and metrics
  • Dashboard: Combine visualizations into comprehensive dashboards
  • Dev Tools: Run Elasticsearch queries directly
  • Alerting: Set up alerts based on log patterns

When to Use EFKB vs. Loki

Use Case                             Recommended   Reason
Small-medium Kubernetes clusters     Loki          Lower resource usage, simpler setup, Grafana integration
Full-text search requirements        EFKB          Elasticsearch excels at complex text queries
Long-term retention (months/years)   EFKB          Elasticsearch handles long retention better than TSDB
Compliance & audit logs              EFKB          Better for retention, immutability, and compliance features
Cost-sensitive environments          Loki          10-50x less storage and compute costs
Already using Prometheus/Grafana     Loki          Unified observability in single tool
Multiple data sources beyond logs    EFKB          Elasticsearch handles diverse data types
Hybrid Approach: Many organizations use both:
  • Loki: For recent application logs (1-2 weeks) and debugging
  • EFKB: For long-term retention, compliance, and audit logs
Fluent Bit can send logs to multiple destinations simultaneously!

Final Quiz

Test your knowledge of Kubernetes logging!

Question 1: Where must containerized applications write their logs?

a) To /var/log/ directory inside the container
b) To standard output (stdout) and standard error (stderr)
c) Directly to Elasticsearch
d) To a mounted volume shared with the host

Question 2: Why is external log persistence critical in Kubernetes?

a) kubectl logs automatically backs up all logs
b) Containers are ephemeral and logs are lost when containers restart without external collection
c) Kubernetes doesn't support internal logging
d) Docker doesn't capture stdout/stderr

Question 3: Why is metadata enrichment essential for Kubernetes logs?

a) It makes logs larger and easier to see
b) Without pod name, namespace, and labels, logs from multiple replicas are indistinguishable and debugging is impossible
c) Kubernetes requires metadata for compliance
d) Metadata is only needed for billing purposes

Question 4: Why is Loki described as "Prometheus for logs"?

a) It's written by the same people
b) It indexes logs by labels (like Prometheus metrics), uses TSDB storage, and has similar architecture with LogQL mirroring PromQL
c) It replaces Prometheus
d) It only works with Prometheus

Question 5: What is Fluent Bit's main advantage over Logstash and Fluentd?

a) It has more features
b) Written in C, it's significantly more lightweight and high-performance (~450KB vs 40-200MB+ footprint)
c) It only works with Kubernetes
d) It's easier to configure

Question 6: What does the Filter stage in Fluent Bit do?

a) Sends logs to Elasticsearch
b) Selectively drops unwanted logs, enriches with metadata, and modifies log fields before sending to storage
c) Parses JSON logs
d) Stores logs temporarily

Question 7: What is a key limitation of Loki's TSDB storage?

a) It cannot store JSON logs
b) TSDB is not ideal for very long-term retention (2+ weeks) due to performance degradation with age
c) It requires Elasticsearch backend
d) It only works with Grafana

Question 8: Why is strict log filtering critical in production?

a) To make logs more colorful
b) Without filtering (collecting only WARNING/ERROR), logging infrastructure costs and resource consumption can exceed the actual application
c) Filtering is only for compliance
d) Kubernetes requires filtering
Quiz Complete!
All correct answers are option 'b'. These logging principles are essential for operating production Kubernetes clusters. Remember: proper logging with persistence, aggregation, metadata, parsing, and filtering is critical for debugging and maintaining cloud-native applications!