How to Set Up an Elasticsearch, Fluentd, and Kibana (EFK) Logging Stack on Kubernetes

Setting up a centralized logging solution on Kubernetes is crucial for monitoring distributed applications and troubleshooting issues across multiple pods and services. The EFK stack (Elasticsearch, Fluentd, and Kibana) provides a powerful combination where Elasticsearch stores and indexes logs, Fluentd collects and forwards log data, and Kibana visualizes the information through interactive dashboards. This guide will walk you through deploying a complete EFK logging stack on Kubernetes, covering everything from basic setup to advanced configurations and common troubleshooting scenarios.

How the EFK Stack Works

The EFK architecture follows a simple but effective flow: Fluentd runs as a DaemonSet on each Kubernetes node, collecting logs from containers and system components, then forwards them to Elasticsearch for storage and indexing. Kibana connects to Elasticsearch to provide a web interface for searching, filtering, and visualizing log data.

Here’s what each component handles:

  • Fluentd: Acts as the log collector and forwarder, parsing various log formats and enriching them with Kubernetes metadata like pod names, namespaces, and labels
  • Elasticsearch: Stores logs in indexes, provides full-text search capabilities, and handles data retention policies
  • Kibana: Offers visualization tools, dashboard creation, and advanced search interfaces for log analysis

Compared to the ELK stack (which uses the heavier Logstash in place of Fluentd), EFK typically consumes less memory on the collection side and integrates with Kubernetes more readily out of the box; compared to Grafana Loki, it trades higher storage and memory requirements for full-text indexing of log content. Fluentd’s plugin ecosystem is particularly strong for Kubernetes environments.

Prerequisites and Cluster Requirements

Before diving into the setup, ensure your Kubernetes cluster meets these requirements:

  • Kubernetes version 1.16 or higher
  • At least 4GB RAM per node (Elasticsearch is memory-intensive)
  • Persistent volume support for Elasticsearch data
  • kubectl configured to access your cluster
  • Helm 3.x installed (optional but recommended)

Check your cluster resources:

kubectl get nodes
kubectl get storageclass
kubectl top nodes

Step-by-Step EFK Stack Deployment

Step 1: Create Namespace and RBAC

First, create a dedicated namespace for the logging stack:

kubectl create namespace logging

Create the necessary RBAC permissions for Fluentd:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: logging
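
If you saved the manifests above to a file (assumed here to be fluentd-rbac.yaml), apply them and confirm the ServiceAccount can actually read pod metadata:

kubectl apply -f fluentd-rbac.yaml
kubectl auth can-i list pods --all-namespaces --as=system:serviceaccount:logging:fluentd
kubectl auth can-i watch namespaces --as=system:serviceaccount:logging:fluentd

Both commands should print "yes"; if not, revisit the ClusterRoleBinding before continuing.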

Step 2: Deploy Elasticsearch

Create a StatefulSet for Elasticsearch with persistent storage:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: logging
spec:
  serviceName: elasticsearch
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
        resources:
          limits:
            cpu: 1000m
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 1Gi
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.type
          value: single-node
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 20Gi
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
  - port: 9200
    name: rest
  - port: 9300
    name: inter-node
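
Apply the manifest (saved here as elasticsearch.yaml, an assumed filename) and wait for the pod to become ready; the first rollout can take a minute or two while the persistent volume is provisioned:

kubectl apply -f elasticsearch.yaml
kubectl rollout status statefulset/elasticsearch -n logging --timeout=300s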

Step 3: Configure and Deploy Fluentd

Create a ConfigMap for Fluentd configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: logging
data:
  fluent.conf: |
    <source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag "kubernetes.*"
      exclude_path ["/var/log/containers/fluent*"]
      read_from_head true
      <parse>
        @type json
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>

    <source>
      @type tail
      @id in_tail_startupscript
      path /var/log/startupscript.log
      pos_file /var/log/fluentd-startupscript.log.pos
      tag startupscript
      <parse>
        @type syslog
      </parse>
    </source>

    <filter kubernetes.**>
      @type kubernetes_metadata
      @id filter_kube_metadata
      kubernetes_url "https://#{ENV['KUBERNETES_SERVICE_HOST']}:#{ENV['KUBERNETES_SERVICE_PORT_HTTPS']}"
      verify_ssl "#{ENV['KUBERNETES_VERIFY_SSL'] || true}"
      ca_file "#{ENV['KUBERNETES_CA_FILE']}"
      skip_labels false
      skip_container_metadata false
      skip_master_url false
      skip_namespace_metadata false
    </filter>

    <match **>
      @type elasticsearch
      @id out_es
      @log_level info
      include_tag_key true
      host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
      port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
      path "#{ENV['FLUENT_ELASTICSEARCH_PATH']}"
      scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
      ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
      ssl_version "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERSION'] || 'TLSv1_2'}"
      reload_connections false
      reconnect_on_error true
      reload_on_failure true
      log_es_400_reason false
      logstash_prefix "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX'] || 'logstash'}"
      logstash_dateformat "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_DATEFORMAT'] || '%Y.%m.%d'}"
      logstash_format true
      index_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_INDEX_NAME'] || 'logstash'}"
      type_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_TYPE_NAME'] || 'fluentd'}"
      <buffer>
        flush_thread_count "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_THREAD_COUNT'] || '8'}"
        flush_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_INTERVAL'] || '5s'}"
        chunk_limit_size "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_CHUNK_LIMIT_SIZE'] || '2M'}"
        queue_limit_length "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_QUEUE_LIMIT_LENGTH'] || '8'}"
        retry_max_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_RETRY_MAX_INTERVAL'] || '30'}"
        retry_forever true
      </buffer>
    </match>
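
Apply the ConfigMap (assumed saved as fluentd-configmap.yaml) before deploying the DaemonSet, and double-check that the rendered configuration looks right:

kubectl apply -f fluentd-configmap.yaml
kubectl describe configmap fluentd-config -n logging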

Deploy Fluentd as a DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        env:
        - name:  FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch.logging.svc.cluster.local"
        - name:  FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        - name: FLUENT_ELASTICSEARCH_SCHEME
          value: "http"
        - name: FLUENTD_SYSTEMD_CONF
          value: disable
        - name: FLUENT_ELASTICSEARCH_SSL_VERIFY
          value: "false"
        - name: FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX
          value: "fluentd"
        - name: KUBERNETES_VERIFY_SSL
          value: "false"
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluentd-config
          mountPath: /fluentd/etc
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluentd-config
        configMap:
          name: fluentd-config
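
Apply the DaemonSet (assumed saved as fluentd-daemonset.yaml), confirm that one Fluentd pod is scheduled on every node, then skim the logs for connection errors:

kubectl apply -f fluentd-daemonset.yaml
kubectl get pods -n logging -l app=fluentd -o wide
kubectl logs -n logging -l app=fluentd --tail=20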

Step 4: Deploy Kibana

Create Kibana deployment and service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.17.0
        resources:
          limits:
            cpu: 1000m
            memory: 1Gi
          requests:
            cpu: 100m
            memory: 512Mi
        env:
        - name: ELASTICSEARCH_HOSTS
          value: "http://elasticsearch:9200"
        - name: SERVER_NAME
          value: kibana
        - name: SERVER_BASEPATH
          value: ""
        ports:
        - containerPort: 5601
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  ports:
  - port: 5601
  selector:
    app: kibana
  type: LoadBalancer
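
Apply the manifests (assumed saved as kibana.yaml) and wait for the rollout. If your cluster has no external load balancer integration (common on bare metal), change the Service type to NodePort or rely on the port-forwarding shown in the next section:

kubectl apply -f kibana.yaml
kubectl rollout status deployment/kibana -n logging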

Verification and Initial Setup

After deploying all components, verify the stack is running:

kubectl get pods -n logging
kubectl get svc -n logging

Check Elasticsearch health (run the port-forward in a separate terminal, or background it with &):

kubectl port-forward -n logging svc/elasticsearch 9200:9200 &
curl http://localhost:9200/_cluster/health
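
A "status" of green or yellow is fine here; yellow is expected on a single-node cluster because replica shards have nowhere to be allocated. Once Fluentd has been running for a few minutes, confirm that daily indices are being created:

curl "http://localhost:9200/_cat/indices/fluentd-*?v"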

Access Kibana through port-forwarding:

kubectl port-forward -n logging svc/kibana 5601:5601

Open your browser to http://localhost:5601 and create an index pattern for “fluentd-*” (matching the FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX set in the DaemonSet), with @timestamp as the time field, to start viewing logs.
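
If you prefer to script this step, Kibana’s saved objects API can create the index pattern as well; this is a minimal sketch assuming Kibana security is not enabled and the port-forward above is still running (the object ID fluentd-pattern is an arbitrary choice):

curl -X POST "http://localhost:5601/api/saved_objects/index-pattern/fluentd-pattern" \
  -H "kbn-xsrf: true" -H "Content-Type: application/json" \
  -d '{"attributes":{"title":"fluentd-*","timeFieldName":"@timestamp"}}'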

Performance Optimization and Best Practices

Here are some optimization strategies based on cluster size and log volume:

Cluster Size          | Elasticsearch Replicas | Memory Allocation | Storage Requirements
Small (1-10 nodes)    | 1                      | 1-2GB             | 20-50GB
Medium (10-50 nodes)  | 2-3                    | 4-8GB             | 100-500GB
Large (50+ nodes)     | 3-5                    | 8-16GB            | 1TB+

Key optimization tips:

  • Configure log rotation and retention policies to manage storage costs
  • Use node affinity to spread Elasticsearch pods across different nodes
  • Tune Fluentd buffer settings based on log volume
  • Implement index lifecycle management (ILM) for automatic data management (an example policy follows this list)
  • Monitor resource usage and adjust requests/limits accordingly
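
As a starting point for the ILM item above, the policy below simply deletes indices seven days after creation, which pairs naturally with the daily fluentd-YYYY.MM.DD indices produced by logstash_format; the policy name and retention window are assumptions to adjust. To take effect it must still be attached to new indices, for example via an index template that sets index.lifecycle.name:

curl -X PUT "http://localhost:9200/_ilm/policy/fluentd-retention" \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "phases": {
        "delete": {
          "min_age": "7d",
          "actions": { "delete": {} }
        }
      }
    }
  }'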

Common Issues and Troubleshooting

Elasticsearch Issues

If Elasticsearch pods are stuck in the Pending state, the cause is usually an unbound persistent volume claim; check the storage layer:

kubectl describe pvc -n logging
kubectl get storageclass

For memory-related crashes (OOMKilled pods or heap errors in the logs), adjust the Java heap size:

- name: ES_JAVA_OPTS
  value: "-Xms1g -Xmx1g"

Fluentd Collection Problems

Check Fluentd logs if no data appears in Kibana:

kubectl logs -n logging -l app=fluentd

Common issues include:

  • Incorrect Elasticsearch connection settings
  • Missing RBAC permissions for reading pod metadata
  • Buffer overflow due to high log volume (see the buffer sketch after this list)
  • Parsing errors with custom log formats
  • On containerd or CRI-O clusters, container logs use the CRI format rather than Docker’s JSON format, so the tail source may need a cri parser instead of json
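
For buffer overflow in particular, switching the output to a file buffer with explicit limits is usually more resilient than the default memory buffer. This sketch would replace the <buffer> section inside the Elasticsearch <match> block; the path and size values are assumptions to tune for your log volume, and the buffer path must be writable inside the container:

<buffer>
  @type file
  path /var/log/fluentd-buffers/kubernetes.buffer
  chunk_limit_size 8M
  total_limit_size 512M
  flush_interval 5s
  overflow_action block
  retry_max_interval 30
  retry_forever true
</buffer>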

Kibana Connectivity Issues

Verify Kibana can reach Elasticsearch:

kubectl exec -n logging deployment/kibana -- curl http://elasticsearch:9200/_cluster/health

Advanced Configuration and Security

For production environments, consider these security enhancements:

  • Enable Elasticsearch security features (authentication and TLS)
  • Use Kubernetes Secrets for storing credentials instead of plain environment variables (an example follows this list)
  • Implement network policies to restrict inter-pod communication
  • Configure log filtering to exclude sensitive information
  • Set up proper backup and disaster recovery procedures
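
For the credentials item, a minimal sketch: store the Elasticsearch username and password in a Secret (the secret name and values here are placeholders), then surface them to Fluentd as FLUENT_ELASTICSEARCH_USER and FLUENT_ELASTICSEARCH_PASSWORD. With the custom ConfigMap above, you would also add user "#{ENV['FLUENT_ELASTICSEARCH_USER']}" and password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD']}" lines to the <match> section so the output plugin actually uses them:

kubectl create secret generic elastic-credentials -n logging \
  --from-literal=username=elastic \
  --from-literal=password='change-me'

Then reference the Secret in the Fluentd container’s env section:

- name: FLUENT_ELASTICSEARCH_USER
  valueFrom:
    secretKeyRef:
      name: elastic-credentials
      key: username
- name: FLUENT_ELASTICSEARCH_PASSWORD
  valueFrom:
    secretKeyRef:
      name: elastic-credentials
      key: password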

Example TLS configuration for Elasticsearch:

- name: xpack.security.enabled
  value: "true"
- name: xpack.security.transport.ssl.enabled
  value: "true"
- name: xpack.security.http.ssl.enabled
  value: "true"

Integration with External Tools

The EFK stack integrates well with other Kubernetes monitoring tools:

  • Prometheus: Use elasticsearch_exporter for metrics collection (an example deployment follows this list)
  • Grafana: Create dashboards combining logs and metrics
  • Jaeger: Correlate distributed traces with log events
  • AlertManager: Set up log-based alerting rules
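
For the Prometheus integration, the community elasticsearch_exporter can run as a small Deployment in the same namespace; this is a sketch, and the image tag and flag values are assumptions to verify against the exporter’s documentation:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: elasticsearch-exporter
  namespace: logging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch-exporter
  template:
    metadata:
      labels:
        app: elasticsearch-exporter
    spec:
      containers:
      - name: exporter
        image: quay.io/prometheuscommunity/elasticsearch-exporter:v1.5.0
        args:
        - --es.uri=http://elasticsearch:9200
        ports:
        - containerPort: 9114
          name: metrics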

You can also export logs to external systems like AWS CloudWatch, Google Cloud Logging, or Splunk using Fluentd’s extensive plugin ecosystem.

For additional configuration options and advanced features, check the official documentation for Elasticsearch, Fluentd, and Kibana. The EFK stack provides a robust foundation for centralized logging that scales with your Kubernetes infrastructure while offering powerful search and visualization capabilities for troubleshooting and monitoring your applications.


