
How to Set Up a Ceph Cluster Within Kubernetes Using Rook
Setting up a Ceph cluster within Kubernetes might sound like one of those “hold my beer” moments in distributed storage, but with Rook as your orchestrator, it’s actually more manageable than you’d expect. Ceph gives you unified storage (block, object, and file) that scales horizontally, while Rook handles the heavy lifting of deploying and managing Ceph in your K8s environment. You’ll learn how to deploy a production-ready Ceph cluster using Rook, troubleshoot common gotchas, and understand when this setup makes sense versus alternatives like cloud-native storage solutions.
How Rook and Ceph Work Together
Rook is basically a Kubernetes operator that knows how to speak Ceph. Instead of manually configuring Ceph daemons, managing configuration files, and babysitting cluster health, Rook translates your desired storage state into Kubernetes resources. It deploys Ceph Monitor (MON), Manager (MGR), Object Storage Daemon (OSD), and Metadata Server (MDS) components as pods, handling everything from initial cluster bootstrap to ongoing maintenance tasks.
The architecture looks like this: Rook runs as a set of controllers watching for custom resources like CephCluster, CephBlockPool, and CephFilesystem. When you create these resources, Rook spins up the appropriate Ceph daemons and configures them according to your specifications. The beauty is that failed components get automatically recreated, scaling happens through simple kubectl commands, and your storage cluster becomes as declarative as the rest of your K8s infrastructure.
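Once the operator is installed, you can see this control loop as plain Kubernetes objects: the Ceph CRDs are registered like any other API resource, and the cluster itself is something you can get and describe (these commands assume the default rook-ceph namespace used throughout this guide):
kubectl get crds | grep ceph.rook.io
kubectl -n rook-ceph get cephcluster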
Prerequisites and Initial Setup
Before diving in, make sure your Kubernetes cluster meets the requirements. You’ll need at least three nodes for a production setup (Ceph monitors need an odd count to maintain quorum), with raw block devices or unused partitions available for OSDs. Each node should have at least 4GB of RAM and decent CPU resources – Ceph isn’t lightweight.
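A quick way to sanity-check a node before handing it to Rook is to look at its block devices directly (run this on the node itself, not through kubectl); anything with no filesystem and no partitions is a candidate OSD device:
lsblk -f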
Start by cloning the Rook repository and applying the common resources:
git clone https://github.com/rook/rook.git
cd rook/deploy/examples
kubectl create -f crds.yaml
kubectl create -f common.yaml
kubectl create -f operator.yaml
Verify the operator is running:
kubectl -n rook-ceph get pods
You should see the rook-ceph-operator pod in Running state. This operator will handle all the Ceph lifecycle management.
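If you’re scripting the install, you can block until the operator reports ready instead of polling by hand (a convenience, not a requirement):
kubectl -n rook-ceph wait --for=condition=Ready pod -l app=rook-ceph-operator --timeout=300s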
Deploying Your First Ceph Cluster
The cluster configuration is where things get interesting. Here’s a basic cluster spec that assumes you have raw devices available:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v17.2.6
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  mon:
    count: 3
    allowMultiplePerNode: false
  mgr:
    count: 2
  dashboard:
    enabled: true
    ssl: true
  crashCollector:
    disable: false
  storage:
    useAllNodes: true
    useAllDevices: true
    deviceFilter: "^sd[b-z]"
    config:
      osdsPerDevice: "1"
Apply this configuration:
kubectl create -f cluster.yaml
The initial deployment takes several minutes. Monitor progress with:
kubectl -n rook-ceph get pods -w
You’ll see MON pods come up first, followed by MGR, and finally OSD pods as Rook discovers and configures your storage devices.
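Before moving on, confirm the cluster actually reports healthy. The Rook toolbox (its manifest sits in the same deploy/examples directory) gives you the standard ceph CLI, and the CephCluster resource carries a health summary of its own:
kubectl create -f toolbox.yaml
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
kubectl -n rook-ceph get cephcluster rook-ceph
You want HEALTH_OK (or at worst HEALTH_WARN while OSDs are still coming up) before creating pools.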
Configuring Storage Classes and PVCs
Once your cluster is healthy, create storage classes for different use cases. Here’s a block storage setup:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
allowVolumeExpansion: true
reclaimPolicy: Delete
Test it with a simple PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: rook-ceph-block
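To confirm the claim binds and the volume actually mounts, a throwaway pod is a reasonable smoke test (the pod name and busybox image here are arbitrary choices):
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "echo hello > /data/hello && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: test-claim
Within a few seconds, kubectl get pvc test-claim should show the claim as Bound.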
Real-World Use Cases and Performance Considerations
In production environments, Rook-managed Ceph clusters shine in several scenarios. E-commerce platforms use them for persistent storage across multiple availability zones, where the self-healing nature means fewer 3 AM pages when disks fail. Media companies leverage the object storage capabilities for content distribution, while development teams appreciate having the same storage stack in staging and production.
Performance-wise, expect different characteristics compared to cloud block storage:
Metric | Rook/Ceph (3 replicas) | AWS EBS gp3 | Local SSD |
---|---|---|---|
Sequential Read (MB/s) | 300-800 | 250-1000 | 500-3000 |
Random IOPS (4K) | 5000-15000 | 3000-16000 | 50000+ |
Durability | Triple replication | 99.8-99.9% (designed) | Single point of failure |
Cost per GB/month | $0.05-0.15 | $0.08-0.20 | $0.03 |
The sweet spot is usually when you have predictable workloads, need multi-zone redundancy, and want to avoid vendor lock-in. Don’t expect to beat dedicated NVMe performance, but you’ll get solid throughput with built-in replication.
Common Issues and Troubleshooting
The most frequent gotcha is OSD pods that never appear or sit in Pending. This usually means Rook can’t find suitable storage devices. Check what Rook discovered:
kubectl -n rook-ceph logs -l app=rook-discover
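Recent Rook releases ship with the discovery daemon disabled by default, so that selector may return nothing; in that case the OSD prepare job logs are the place to look, since they record why each device was accepted or skipped:
kubectl -n rook-ceph logs -l app=rook-ceph-osd-prepare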
If you were hoping to use directories instead of raw devices, note that directory-based OSDs were removed in Rook v1.3; current releases expect raw devices, raw partitions, or PVC-backed OSDs. On an older Rook version that still supports them, the cluster spec looks like:
storage:
  useAllNodes: true
  useAllDevices: false
  directories:
    - path: /var/lib/rook/storage-dir
MON quorum issues are another classic problem. If MONs can’t reach each other, check your network policies and make sure the required ports (6789, 3300) are accessible between nodes. The Ceph dashboard (accessible via port-forward) gives you cluster health at a glance:
kubectl -n rook-ceph port-forward service/rook-ceph-mgr-dashboard 8443:8443
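To actually log in, the admin password is generated by Rook and stored in a secret (the secret name below is the one the Rook docs use):
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode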
For performance issues, monitor OSD utilization and keep an eye on placement group (PG) counts. The traditional rule of thumb is roughly 100-200 PGs per OSD, though recent Ceph releases (including the Quincy image used here) enable the PG autoscaler by default, which handles this for most pools; the right numbers still depend heavily on your workload patterns.
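If you want to see what the autoscaler is doing, the toolbox from earlier gives you the standard Ceph view:
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd pool autoscale-status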
Comparison with Alternative Solutions
Rook/Ceph isn’t the only game in town for Kubernetes storage. Here’s how it stacks up:
Solution | Complexity | Features | Best For |
---|---|---|---|
Rook/Ceph | High | Block, Object, File storage | Multi-cloud, avoiding vendor lock-in |
OpenEBS | Medium | Multiple storage engines | Flexibility, local storage optimization |
Longhorn | Low | Block storage, snapshots | Simplicity, edge deployments |
Cloud CSI drivers | Low | Provider-specific features | Cloud-native applications |
Choose Rook/Ceph when you need proven enterprise storage features, plan to run across multiple clouds, or have specific requirements around data sovereignty. Skip it if you’re just getting started with Kubernetes or have simple storage needs that cloud providers handle well.
Best Practices and Production Considerations
Never run Ceph components on the same nodes as your application workloads in production. Use node selectors and taints to dedicate specific nodes for storage:
spec:
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: storage-node
                  operator: In
                  values:
                    - "true"
Monitor disk usage religiously – Ceph gets unhappy when OSDs exceed 85% utilization. Set up alerting on cluster health using the metrics endpoint or integrate with Prometheus.
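The Ceph manager exposes Prometheus metrics on port 9283, and the Rook examples directory includes monitoring manifests if you run the Prometheus Operator; without that, a quick port-forward lets you eyeball the endpoint (port number per the Rook docs, the curl is just a spot check):
kubectl -n rook-ceph port-forward service/rook-ceph-mgr 9283:9283
curl -s http://localhost:9283/metrics | head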
For disaster recovery, configure regular snapshots and test restore procedures. The CephBlockPool resource supports mirroring with automated snapshot schedules:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3
  mirroring:
    enabled: true
    mode: image
    snapshotSchedules:
      - interval: 24h # take a mirror snapshot daily
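For CSI-level snapshots of individual PVCs (separate from pool mirroring), the usual route is a VolumeSnapshotClass pointing at the RBD driver; this sketch assumes the external snapshotter CRDs are already installed in your cluster:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-rbdplugin-snapclass
driver: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  csi.storage.k8s.io/snapshotter-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph
deletionPolicy: Delete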
Security-wise, enable encryption at rest and consider running Ceph communication over encrypted channels in multi-tenant environments. The overhead is usually worth the peace of mind.
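As a rough sketch of what that looks like in the cluster spec (field names per recent Rook releases; encryption only applies to OSDs created after the setting is in place):
spec:
  storage:
    config:
      encryptedDevice: "true"   # LUKS-encrypt new OSDs at rest
  network:
    connections:
      encryption:
        enabled: true           # encrypt traffic between Ceph daemons (msgr2)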
Finally, keep your Rook and Ceph versions current, but test upgrades in staging first. The upgrade process is largely automated, but complex distributed systems can surprise you in creative ways.
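When a new Ceph point release lands, the upgrade is typically just pointing the CephCluster at the new image and letting Rook roll the daemons one by one; the tag below is only an example:
kubectl -n rook-ceph patch CephCluster rook-ceph --type merge \
  -p '{"spec":{"cephVersion":{"image":"quay.io/ceph/ceph:v17.2.7"}}}'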
For comprehensive documentation and advanced configuration options, check the official Rook documentation and Ceph documentation.
