Install Grafana Loki on Kubernetes with Ceph RGW S3 Storage

This post installs Grafana Loki (logs backend) on Kubernetes, using Ceph RGW (S3-compatible) for long-term object storage.

This is the “install Loki separately” follow-up to my Mimir-only post (metrics are already covered there). We reuse the same Ceph RGW endpoint and the same shared Kubernetes Secret (ceph-s3-credentials) created in that post.



Lab context

  • Kubernetes (bare metal)
  • Ingress controller: Traefik
  • Namespace: observability
  • Ceph RGW (S3)
  • Endpoint (HAProxy + Let’s Encrypt): https://ceph.maksonlee.com:443

What you’ll get

  • Loki deployed via Helm (grafana/loki) in SimpleScalable mode
  • Loki stores data in Ceph RGW S3 buckets:
    • lgtm-loki-chunks
    • lgtm-loki-ruler
    • lgtm-loki-admin
  • A Loki gateway service for Grafana datasource + log ingestion (in-cluster)

Note: Loki does not create buckets for you. Create buckets first.
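
For reference, these are the in-cluster gateway URLs this install ends up exposing, assuming the release name lgtm-loki and the observability namespace used throughout this post (adjust if your names differ; the gateway Service listens on port 80 by default):

  • Grafana Loki datasource URL: http://lgtm-loki-gateway.observability.svc.cluster.local
  • Log push endpoint (Promtail / Grafana Alloy): http://lgtm-loki-gateway.observability.svc.cluster.local/loki/api/v1/push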


Why SimpleScalable (supported) and what scale this post targets

This post uses Loki’s SimpleScalable deployment mode (read / write / backend behind a gateway). SimpleScalable is a supported mode in the current Loki Helm chart and is a practical middle ground between single-binary Loki and full microservices mode: you can scale read and write independently later without running the entire microservices topology from day one.

That said, this tutorial is intentionally homelab-sized for a 3-node cluster:

  • read/write/backend are 1 replica each
  • replication_factor: 1 (no redundancy / HA)
  • caches are disabled to avoid extra components and large default memory usage
  • lokiCanary and Helm tests are disabled for simplicity

This keeps the install lightweight and easy to verify. For production/HA, increase replicas, set replication_factor >= 2–3, re-enable caches with tuned memory, add anti-affinity/PDBs, and enable canary/testing.
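
As a rough sketch of that production-leaning direction (not a tuned config), the Helm values below bump the pieces this post scales down. The keys match the ones used later in loki-values.yaml, but the replica counts and cache settings are assumptions you should adapt to your cluster:

read:
  replicas: 3
write:
  replicas: 3
backend:
  replicas: 3
loki:
  commonConfig:
    replication_factor: 3
chunksCache:
  enabled: true
resultsCache:
  enabled: true
lokiCanary:
  enabled: true
test:
  enabled: true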


Prerequisites from the Mimir post

From the Mimir post, you should already have:

  • Namespace observability
  • s3cmd installed and configured for Ceph RGW (path-style) on a client machine
  • The shared Secret: observability/ceph-s3-credentials (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY)

Quick checks:

kubectl get ns observability
kubectl -n observability get secret ceph-s3-credentials

If these are missing, complete the “Prepare Ceph S3 (create buckets + credentials Secret)” section in the Mimir post first.


  1. Create the namespace (skip if it already exists)
kubectl create namespace observability --dry-run=client -o yaml | kubectl apply -f -

  2. Add Helm repos
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

  3. Prepare Ceph S3 (create Loki buckets)

Because you already configured s3cmd in the Mimir post, all you need to do here is create Loki’s buckets.

  • Bucket naming warning (important)

When deploying Loki with S3 storage, do not keep the chart’s default bucket names (chunks, ruler, admin). Use unique names instead (like the lgtm-* buckets below) to avoid collisions with other deployments sharing the same object store.

  • Create buckets (Loki)
s3cmd mb s3://lgtm-loki-chunks
s3cmd mb s3://lgtm-loki-ruler
s3cmd mb s3://lgtm-loki-admin

Verify:

s3cmd ls | grep lgtm-loki

  4. Reuse the shared S3 credentials Secret (skip if already created)

This Secret was created in the Mimir post and should already exist:

kubectl -n observability get secret ceph-s3-credentials

If you need to create it (or you rotated keys), apply it like this:

kubectl -n observability create secret generic ceph-s3-credentials \
  --from-literal=AWS_ACCESS_KEY_ID='REPLACE_ME' \
  --from-literal=AWS_SECRET_ACCESS_KEY='REPLACE_ME' \
  --dry-run=client -o yaml | kubectl apply -f -

Verify (shows keys, not values):

kubectl -n observability describe secret ceph-s3-credentials
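
If you want to confirm the actual values (for example after rotating keys), decode them with standard kubectl and base64:

kubectl -n observability get secret ceph-s3-credentials -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d; echo
kubectl -n observability get secret ceph-s3-credentials -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d; echo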

  5. Install Loki

Why “path-style” matters for Ceph RGW behind one hostname/cert

If your RGW is exposed as a single endpoint like ceph.maksonlee.com:443, virtual-host style buckets (<bucket>.ceph.maksonlee.com) often fail without wildcard DNS + wildcard TLS.

So we force path-style access for S3-compatible storage:

  • Set loki.storage.type: s3
  • Force path-style at chart level: loki.storage.s3.s3ForcePathStyle: true
  • Force path-style in Loki runtime config: loki.storage_config.aws.s3forcepathstyle: true
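
To make the difference concrete, the two URL styles against this endpoint look like this. The curl check at the end is only a rough sanity test: an unauthenticated request should typically come back with an S3 error status (403 / AccessDenied) rather than failing on DNS or TLS, which confirms path-style routing works through the single hostname:

# Path-style (works behind one hostname + one certificate):
#   https://ceph.maksonlee.com/lgtm-loki-chunks/<object>
# Virtual-host style (needs wildcard DNS + a wildcard certificate):
#   https://lgtm-loki-chunks.ceph.maksonlee.com/<object>
curl -s -o /dev/null -w '%{http_code}\n' https://ceph.maksonlee.com/lgtm-loki-chunks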

Why you may hit “Helm test requires the Loki Canary to be enabled”

The chart’s Helm tests are enabled by default and expect the Loki Canary to be running. If you disable the canary but leave the tests enabled, helm install/upgrade can fail with the error above.

For homelab simplicity, this post disables both canary and tests.

Create loki-values.yaml

# Loki in SimpleScalable mode (read/write/backend + gateway)
deploymentMode: SimpleScalable

# We use external Ceph RGW (S3), not MinIO
minio:
  enabled: false

# Inject shared S3 creds Secret into all Loki pods
global:
  extraEnvFrom:
    - secretRef:
        name: ceph-s3-credentials
  # Required so Loki expands ${AWS_ACCESS_KEY_ID} / ${AWS_SECRET_ACCESS_KEY} in config
  extraArgs:
    - "-config.expand-env=true"

# Homelab: disable caches (avoid memcached + higher memory usage)
chunksCache:
  enabled: false
resultsCache:
  enabled: false

# Homelab: disable canary + helm tests
lokiCanary:
  enabled: false
test:
  enabled: false

loki:
  auth_enabled: false

  commonConfig:
    replication_factor: 1

  # TSDB schema
  schemaConfig:
    configs:
      - from: "2024-01-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h

  limits_config:
    allow_structured_metadata: true
    volume_enabled: true

  # Loki config: required for TSDB + S3-compatible backends
  storage_config:
    aws:
      region: us-east-1
      bucketnames: lgtm-loki-chunks
      s3forcepathstyle: true

  # Chart-level storage config (also used for ruler/admin buckets)
  storage:
    type: s3
    bucketNames:
      chunks: lgtm-loki-chunks
      ruler: lgtm-loki-ruler
      admin: lgtm-loki-admin
    s3:
      endpoint: ceph.maksonlee.com:443
      region: us-east-1
      accessKeyId: ${AWS_ACCESS_KEY_ID}
      secretAccessKey: ${AWS_SECRET_ACCESS_KEY}
      signatureVersion: v4
      s3ForcePathStyle: true
      insecure: false

# Homelab scale-down
backend:
  replicas: 1
read:
  replicas: 1
write:
  replicas: 1

Install

helm -n observability upgrade --install lgtm-loki grafana/loki -f loki-values.yaml

Optional: pin a chart version for reproducibility:

helm search repo grafana/loki --versions | head
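
Then install with the version pinned (the placeholder below is just where the value goes, not a recommendation):

helm -n observability upgrade --install lgtm-loki grafana/loki \
  --version <chart-version> -f loki-values.yaml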

  6. Verify the deployment

Pods and Services

kubectl -n observability get pods | grep loki
kubectl -n observability get svc  | grep loki

You should see (names vary by chart version, but typically):

  • lgtm-loki-backend-0
  • lgtm-loki-read-...
  • lgtm-loki-write-...
  • lgtm-loki-gateway
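
To confirm the gateway actually answers Loki API calls, port-forward it and hit the labels endpoint. This assumes the gateway Service exposes port 80 (the chart default); on a fresh install the label list will be empty or very small:

kubectl -n observability port-forward svc/lgtm-loki-gateway 3100:80
# In another terminal:
curl -s http://localhost:3100/loki/api/v1/labels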

If anything is Pending / CrashLoopBackOff

Pending (scheduler / resources):

kubectl -n observability describe pod <pending-pod>

CrashLoopBackOff (config/storage/auth issues):

kubectl -n observability logs <pod-name> --previous --tail=200
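
Storage and credential problems tend to be obvious in those logs; a quick (rough) filter, adjust the patterns as needed:

kubectl -n observability logs <pod-name> --previous --tail=200 | grep -iE 'denied|nosuchbucket|s3|certificate'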
