Install OpenObserve Enterprise HA on Bare-Metal Kubernetes with Traefik + Ceph RGW (S3) + Ceph RBD

This post walks through a fresh installation of OpenObserve Enterprise in HA (cluster) mode on a bare-metal Kubernetes cluster using the official Helm chart. In HA mode, object storage is mandatory; local-only storage is not supported as the primary store.

Enterprise licensing note: Enterprise can run without a license key up to 50 GB/day, and the free-tier license covers up to 200 GB/day.

Compared to my previous standalone (local-mode) deployment (ZO_LOCAL_MODE=true + Ceph RBD PVC), this HA setup uses:

  • Ceph RGW (S3) for stream data (Parquet files in object storage).
  • PostgreSQL (CloudNativePG) for metadata (streams, dashboards, users, filelist/stats, etc.).
  • NATS (deployed by the chart) as the cluster coordinator/event store.

This post is based on the official OpenObserve Enterprise HA installation guide.

Lab context

Kubernetes (bare metal)

Nodes:

  • k8s-1.maksonlee.com – 192.168.0.99
  • k8s-2.maksonlee.com – 192.168.0.100
  • k8s-3.maksonlee.com – 192.168.0.101

Ingress:

  • Traefik (exposed via MetalLB)
  • MetalLB IP (Ingress LB): 192.168.0.98
  • DNS: openobserve.maksonlee.com → 192.168.0.98

Ceph storage

Ceph RGW (S3):

  • Endpoint: https://ceph.maksonlee.com:443
  • HAProxy terminates TLS and forwards to RGW

Ceph RBD (PVCs):

  • StorageClass: csi-rbd-sc (recommended default)

What you’ll do

  • Create a dedicated Ceph RGW user + bucket for OpenObserve (S3).
  • Install the CloudNativePG operator (required for chart-managed Postgres).
  • Download the official Helm chart values.yaml and edit it in place.
  • Configure:
    • Enterprise mode
    • S3 (Ceph RGW)
    • Traefik ingress
  • Install with Helm and verify.

Prerequisites

  • DNS

Add one record (LAN DNS or /etc/hosts on clients):

192.168.0.98 openobserve.maksonlee.com

  • Confirm the default StorageClass (Ceph RBD CSI)

kubectl get storageclass
kubectl get storageclass -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}'

  1. Prepare Ceph RGW (S3): user + bucket

OpenObserve HA performs S3 operations at startup. Use a dedicated bucket and ensure the keys you configure belong to the bucket owner.

  • Create a dedicated RGW user

On ceph.maksonlee.com:

sudo cephadm shell -- radosgw-admin user create \
  --uid="openobserve" \
  --display-name="OpenObserve (Enterprise HA)"

Record the generated access_key and secret_key.

  • Create a dedicated bucket (owned by the same user)

Create a bucket (example name: openobserve) using any S3 client (s3cmd / aws-cli) pointing to RGW.

Optional “owner sanity check” (fastest way to prevent 403 surprises):

sudo cephadm shell -- radosgw-admin bucket stats --bucket openobserve | egrep '"bucket"|owner'
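If you go the s3cmd route, a minimal client configuration might look like the sketch below. This is illustrative: it assumes the RGW endpoint from this lab and the keys generated above, and uses standard s3cmd config fields.

```ini
# ~/.s3cfg — minimal sketch for the Ceph RGW endpoint in this lab
[default]
access_key = PASTE_RGW_ACCESS_KEY
secret_key = PASTE_RGW_SECRET_KEY
host_base = ceph.maksonlee.com:443
host_bucket = ceph.maksonlee.com:443
use_https = True
```

With that in place, s3cmd mb s3://openobserve creates the bucket as the openobserve user, so the bucket owner matches the keys you will configure in values.yaml.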

  2. Install the CloudNativePG operator

The OpenObserve Enterprise HA installation guide uses CloudNativePG when you’re not using a managed Postgres service.

kubectl apply --server-side -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/main/releases/cnpg-1.24.0.yaml

Verify:

kubectl rollout status deployment -n cnpg-system cnpg-controller-manager
kubectl get pods -n cnpg-system

  3. Download the OpenObserve Helm values.yaml

wget https://raw.githubusercontent.com/openobserve/openobserve-helm-chart/main/charts/openobserve/values.yaml

  4. Edit values.yaml (Enterprise HA + Ceph S3 + Ceph RBD + Traefik)

Open values.yaml and apply the changes below.

  • Set root user + Ceph RGW keys (auth)

auth:
  ZO_ROOT_USER_EMAIL: "root@maksonlee.com"
  ZO_ROOT_USER_PASSWORD: "CHANGE_ME_STRONG"
  ZO_ROOT_USER_TOKEN: ""

  ZO_S3_ACCESS_KEY: "PASTE_RGW_ACCESS_KEY"
  ZO_S3_SECRET_KEY: "PASTE_RGW_SECRET_KEY"

  • Configure Ceph RGW S3 (config)

config:
  ZO_LOCAL_MODE: "false"

  ZO_S3_PROVIDER: "s3"
  ZO_S3_SERVER_URL: "https://ceph.maksonlee.com:443"
  ZO_S3_REGION_NAME: "us-east-1"
  ZO_S3_BUCKET_NAME: "openobserve"

  # Path-style access for S3-compatible endpoints
  ZO_S3_FEATURE_FORCE_HOSTED_STYLE: "false"

  # Optional: helps with some TLS proxies / S3-compatible endpoints
  ZO_S3_FEATURE_HTTP1_ONLY: "true"

  • Configure Traefik ingress

ingress:
  enabled: true
  className: "traefik"
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    traefik.ingress.kubernetes.io/router.tls: "true"
  hosts:
    - host: openobserve.maksonlee.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls: []

  • Optional: reduce “warm-up” delay for schema/stats convergence

config:
  ZO_S3_SYNC_TO_CACHE_INTERVAL: "120"
  ZO_COMPACT_SYNC_TO_DB_INTERVAL: "300"
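As an aside, the difference between the two S3 addressing styles behind ZO_S3_FEATURE_FORCE_HOSTED_STYLE can be shown with a few lines of stdlib Python. This is purely illustrative, not OpenObserve code; the s3_url helper is made up for this sketch.

```python
# Illustration: path-style vs virtual-hosted-style S3 URLs for the same object.
# Ceph RGW behind a single hostname/TLS cert typically wants path-style,
# which is why ZO_S3_FEATURE_FORCE_HOSTED_STYLE stays "false" above.
def s3_url(endpoint: str, bucket: str, key: str, hosted_style: bool) -> str:
    if hosted_style:
        scheme, host = endpoint.split("://", 1)
        # bucket becomes part of the hostname -> needs wildcard DNS + cert
        return f"{scheme}://{bucket}.{host}/{key}"
    # bucket stays in the path -> a single hostname and cert are enough
    return f"{endpoint}/{bucket}/{key}"

print(s3_url("https://ceph.maksonlee.com:443", "openobserve", "files/a.parquet", False))
print(s3_url("https://ceph.maksonlee.com:443", "openobserve", "files/a.parquet", True))
```

With hosted style, the bucket name moves into the hostname, which the single-hostname HAProxy + RGW setup here cannot serve without wildcard DNS and a wildcard certificate.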

  5. Install OpenObserve Enterprise HA (Helm)

helm repo add openobserve https://charts.openobserve.ai
helm repo update

kubectl create ns openobserve

helm -n openobserve install o2 openobserve/openobserve -f values.yaml

  6. Verify

Pods, PVCs, CNPG cluster

kubectl -n openobserve get pods -o wide
kubectl -n openobserve get pvc
kubectl -n openobserve get clusters.postgresql.cnpg.io

Ingress

kubectl -n openobserve get ingress
kubectl -n openobserve describe ingress

UI

Open:

  • https://openobserve.maksonlee.com/web/

Login:

  • Email: root@maksonlee.com
  • Password: CHANGE_ME_STRONG

Known behavior (fresh HA install): transient Search stream not found

What you might see

During the first minutes (sometimes longer), some UI/API paths may temporarily return:

  • Error: Search stream not found

even while other screens show ingested points.

Why it happens (practical explanation)

In HA mode, stream data is written to object storage (S3/Ceph RGW) while metadata lives in Postgres. Some query paths depend on schema/filelist/stats metadata being present and refreshed. Immediately after a fresh install (or when a new stream first appears), metadata and caches converge gradually, so a specific request path may briefly fail with “stream not found” until that metadata is visible to the component serving the request.

Practical mitigations

  • Wait for warm-up after first install.
  • Reduce metadata refresh intervals if your environment can afford it (ZO_S3_SYNC_TO_CACHE_INTERVAL, ZO_COMPACT_SYNC_TO_DB_INTERVAL).
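If early automated queries (dashboards, smoke tests) must tolerate this window, a generic retry-with-backoff wrapper is one option. This is an illustrative stdlib-only sketch; the retry helper is not part of OpenObserve:

```python
import time

def retry(fn, attempts=5, base_delay=1.0, retry_on=(RuntimeError,)):
    """Call fn(), retrying with exponential backoff on the given exceptions."""
    for i in range(attempts):
        try:
            return fn()
        except retry_on:
            if i == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** i))  # 1s, 2s, 4s, ... by default
```

Wrap your first searches in this so a transient “Search stream not found” response is retried instead of treated as fatal.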
