How to Set Up a Ceph Cluster Within Kubernetes Using Rook

Setting up a Ceph cluster within Kubernetes might sound like one of those “hold my beer” moments in distributed storage, but with Rook as your orchestrator, it’s actually more manageable than you’d expect. Ceph gives you unified storage (block, object, and file) that scales horizontally, while Rook handles the heavy lifting of deploying and managing Ceph in your K8s environment. You’ll learn how to deploy a production-ready Ceph cluster using Rook, troubleshoot common gotchas, and understand when this setup makes sense versus alternatives like cloud-native storage solutions.

How Rook and Ceph Work Together

Rook is basically a Kubernetes operator that knows how to speak Ceph. Instead of manually configuring Ceph daemons, managing configuration files, and babysitting cluster health, Rook translates your desired storage state into Kubernetes resources. It deploys Ceph Monitor (MON), Manager (MGR), Object Storage Daemon (OSD), and Metadata Server (MDS) components as pods, handling everything from initial cluster bootstrap to ongoing maintenance tasks.

The architecture looks like this: Rook runs as a set of controllers watching for custom resources like CephCluster, CephBlockPool, and CephFilesystem. When you create these resources, Rook spins up the appropriate Ceph daemons and configures them according to your specifications. The beauty is that failed components get automatically recreated, scaling happens through simple kubectl commands, and your storage cluster becomes as declarative as the rest of your K8s infrastructure.
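
If you want to poke at this declarative surface yourself once the CRDs are installed (the installation steps follow below), plain kubectl shows the custom resource types the operator watches and the cluster object it reconciles:

kubectl api-resources --api-group=ceph.rook.io
kubectl -n rook-ceph get cephcluster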

Prerequisites and Initial Setup

Before diving in, make sure your Kubernetes cluster meets the requirements. You’ll need at least three nodes for a production setup (the Ceph monitors need an odd number of members to maintain quorum), with raw block devices or directories available for the OSDs. Each node should have at least 4GB of RAM and reasonable CPU headroom – Ceph isn’t lightweight.
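
A quick sanity check before you start (the device paths below are examples – substitute whatever you plan to hand to Ceph): confirm the node count and verify that candidate devices carry no filesystem, since Rook will skip anything that already has one.

# Confirm you have at least three schedulable nodes
kubectl get nodes

# On each storage node, candidate devices should show an empty FSTYPE
lsblk -f /dev/sdb /dev/sdc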

Start by cloning the Rook repository and applying the common resources:

git clone https://github.com/rook/rook.git
cd rook/deploy/examples
kubectl create -f crds.yaml
kubectl create -f common.yaml
kubectl create -f operator.yaml

Verify the operator is running:

kubectl -n rook-ceph get pods

You should see the rook-ceph-operator pod in Running state. This operator will handle all the Ceph lifecycle management.
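
If the operator pod is stuck or crash-looping instead, its logs are the first place to look:

kubectl -n rook-ceph logs deploy/rook-ceph-operator --tail=50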

Deploying Your First Ceph Cluster

The cluster configuration is where things get interesting. Here’s a basic cluster spec that assumes you have raw devices available:

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v17.2.6
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  mon:
    count: 3
    allowMultiplePerNode: false
  mgr:
    count: 2
  dashboard:
    enabled: true
    ssl: true
  crashCollector:
    disable: false
  storage:
    useAllNodes: true
    useAllDevices: true
    deviceFilter: "^sd[b-z]"
    config:
      osdsPerDevice: "1"

Apply this configuration:

kubectl create -f cluster.yaml

The initial deployment takes several minutes. Monitor progress with:

kubectl -n rook-ceph get pods -w

You’ll see MON pods come up first, followed by MGR, and finally OSD pods as Rook discovers and configures your storage devices.
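
Once the pods settle, the easiest way to check cluster health from inside Kubernetes is the Rook toolbox, which ships alongside the other example manifests you already cloned:

kubectl create -f toolbox.yaml
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status

Expect HEALTH_OK once all OSDs are up; HEALTH_WARN is normal while OSDs are still being created.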

Configuring Storage Classes and PVCs

Once your cluster is healthy, create storage classes for different use cases. Here’s a block storage setup:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
allowVolumeExpansion: true
reclaimPolicy: Delete

Test it with a simple PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: rook-ceph-block
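
Assuming you saved the pool/StorageClass manifest and the PVC above as storageclass.yaml and pvc.yaml (the file names are arbitrary), apply them and confirm the claim binds:

kubectl apply -f storageclass.yaml
kubectl apply -f pvc.yaml
kubectl get pvc test-claim

The claim should report Bound within a few seconds, which means the CSI driver successfully created an RBD image in the replicapool pool.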

Real-World Use Cases and Performance Considerations

In production environments, Rook-managed Ceph clusters shine in several scenarios. E-commerce platforms use them for persistent storage across multiple availability zones, where the self-healing nature means fewer 3 AM pages when disks fail. Media companies leverage the object storage capabilities for content distribution, while development teams appreciate having the same storage stack in staging and production.

Performance-wise, expect different characteristics compared to cloud block storage:

Metric                   Rook/Ceph (3 replicas)   AWS EBS gp3    Local SSD
Sequential Read (MB/s)   300-800                  250-1000       500-3000
Random IOPS (4K)         5000-15000               3000-16000     50000+
Durability               Triple replication       99.999%        Single point of failure
Cost per GB/month        $0.05-0.15               $0.08-0.20     $0.03

The sweet spot is usually when you have predictable workloads, need multi-zone redundancy, and want to avoid vendor lock-in. Don’t expect to beat dedicated NVMe performance, but you’ll get solid throughput with built-in replication.

Common Issues and Troubleshooting

The most frequent gotcha is OSD pods never appearing, or getting stuck in Pending. This usually means Rook can’t find suitable storage devices. Check what Rook discovered (note that the rook-discover DaemonSet only exists if the discovery daemon is enabled in the operator settings):

kubectl -n rook-ceph logs -l app=rook-discover
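
If discovery looks fine but OSDs still never appear, the per-node OSD prepare jobs usually explain why a device was rejected (existing filesystem, partition table, too small):

kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare
kubectl -n rook-ceph logs -l app=rook-ceph-osd-prepare --tail=100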

If you’re on an older Rook release that still supported directory-based OSDs and want to use directories instead of raw devices, make sure your cluster spec includes:

storage:
  useAllNodes: true
  useAllDevices: false
  directories:
  - path: /var/lib/rook/storage-dir

Recent Rook versions have removed directory-based OSDs entirely; there you need raw devices, raw partitions, or PVC-backed OSDs defined through storageClassDeviceSets instead.

MON quorum issues are another classic problem. If MONs can’t reach each other, check your network policies and make sure the required ports (6789, 3300) are accessible between nodes. The Ceph dashboard (accessible via port-forward) gives you cluster health at a glance:

kubectl -n rook-ceph port-forward service/rook-ceph-mgr-dashboard 8443:8443
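
The dashboard username is admin; the generated password is stored in a secret (the secret name below follows current Rook releases):

kubectl -n rook-ceph get secret rook-ceph-dashboard-password \
  -o jsonpath="{['data']['password']}" | base64 --decode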

For performance issues, monitor OSD utilization and consider adjusting the OSD placement groups. A good starting point is 100-200 PGs per OSD, but this depends heavily on your workload patterns.
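
From the toolbox deployed earlier, the PG autoscaler view and per-OSD utilization are reasonable first stops before changing pg_num by hand:

kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd pool autoscale-status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd df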

Comparison with Alternative Solutions

Rook/Ceph isn’t the only game in town for Kubernetes storage. Here’s how it stacks up:

Solution            Complexity   Features                      Best For
Rook/Ceph           High         Block, object, file storage   Multi-cloud, avoiding vendor lock-in
OpenEBS             Medium       Multiple storage engines      Flexibility, local storage optimization
Longhorn            Low          Block storage, snapshots      Simplicity, edge deployments
Cloud CSI drivers   Low          Provider-specific features    Cloud-native applications

Choose Rook/Ceph when you need proven enterprise storage features, plan to run across multiple clouds, or have specific requirements around data sovereignty. Skip it if you’re just getting started with Kubernetes or have simple storage needs that cloud providers handle well.

Best Practices and Production Considerations

Never run Ceph components on the same nodes as your application workloads in production. Use node selectors and taints to dedicate specific nodes for storage:

spec:
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: storage-node
              operator: In
              values:
              - "true"

Monitor disk usage religiously – Ceph gets unhappy when OSDs exceed 85% utilization. Set up alerting on cluster health using the metrics endpoint or integrate with Prometheus.
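
If you already run the Prometheus Operator, Rook can create the ServiceMonitor for you; here is a minimal sketch of the relevant CephCluster snippet (it assumes the Prometheus Operator CRDs are installed in the cluster):

spec:
  monitoring:
    enabled: true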

For disaster recovery, configure regular snapshots and test restore procedures. The CephBlockPool resource supports RBD mirroring to a peer cluster, including automated snapshot schedules on the mirroring configuration:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3
  mirroring:
    enabled: true
    mode: image
    snapshotSchedules:
    - interval: 24h

Security-wise, enable encryption at rest and consider running Ceph communication over encrypted channels in multi-tenant environments. The overhead is usually worth the peace of mind.
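
Here is a minimal sketch of both options in the CephCluster spec, assuming a recent Rook release (field names have shifted between versions, so check the documentation for yours):

spec:
  network:
    connections:
      encryption:
        enabled: true          # encrypt traffic between Ceph daemons and clients (msgr2)
  storage:
    config:
      encryptedDevice: "true"  # new OSDs are created on dm-crypt encrypted devices

Note that encryptedDevice only affects OSDs provisioned after the setting is applied; existing OSDs stay unencrypted until they are recreated.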

Finally, keep your Rook and Ceph versions current, but test upgrades in staging first. The upgrade process is largely automated, but complex distributed systems can surprise you in creative ways.
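
Ceph itself is upgraded by bumping the image in the CephCluster spec and letting the operator roll the daemons one by one; the target version below is only illustrative:

kubectl -n rook-ceph patch CephCluster rook-ceph --type merge \
  -p '{"spec":{"cephVersion":{"image":"quay.io/ceph/ceph:v17.2.7"}}}'
kubectl -n rook-ceph get pods -w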

For comprehensive documentation and advanced configuration options, check the official Rook documentation and Ceph documentation.


