Redis/Celery Queue Backlog

Symptoms

  • Background tasks (OTA, alarm, notification) are delayed or never run
  • OTA updates remain pending
  • Alarm notifications arrive minutes or even hours late
  • Redis memory usage grows rapidly
  • The Celery Flower/Horizon dashboard shows tasks piling up in the queues

Possible Causes

  1. Celery workers have crashed, or there are too few workers
  2. Stuck tasks are blocking workers and preventing new tasks from being processed
  3. The Redis memory limit has been exceeded, so new tasks cannot be enqueued
  4. Task burst: a single event generated too many subtasks
  5. Backend error: an uncaught exception inside a task has caused a retry loop
  6. Network problem: the connection to Redis is unstable

Diagnostic Steps

1. Check Queue Sizes

# Check the sizes of the main queues
docker exec zeus-redis redis-cli LLEN celery
docker exec zeus-redis redis-cli LLEN celery:default
docker exec zeus-redis redis-cli LLEN celery:alarm_processing
docker exec zeus-redis redis-cli LLEN celery:notification_sending
docker exec zeus-redis redis-cli LLEN celery:ota_processing
docker exec zeus-redis redis-cli LLEN celery:measurement_processing

# List all celery queues (KEYS blocks Redis while it scans; fine for small keyspaces)
docker exec zeus-redis redis-cli KEYS "celery:*" | head -30

# Total number of pending tasks
# The pattern 'celery*' also matches the plain 'celery' queue; non-list keys are skipped by the type check
docker exec zeus-redis redis-cli EVAL "
local total = 0
local keys = redis.call('keys', 'celery*')
for _, key in ipairs(keys) do
  local t = redis.call('type', key)['ok']
  if t == 'list' then
    total = total + redis.call('llen', key)
  end
end
return total
" 0

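The per-queue LLEN checks can also be scripted. The following is a minimal sketch, not part of the original runbook: it assumes a client object that exposes `llen(name)` (as a redis-py client does), and the function names and the threshold are hypothetical.

```python
from typing import Dict, Iterable, List

# Queue names copied from the LLEN commands in this step.
QUEUES = [
    "celery",
    "celery:default",
    "celery:alarm_processing",
    "celery:notification_sending",
    "celery:ota_processing",
    "celery:measurement_processing",
]


def collect_queue_depths(client, queues: Iterable[str]) -> Dict[str, int]:
    """Return {queue_name: length} using the client's llen() method."""
    return {name: int(client.llen(name)) for name in queues}


def queues_over_threshold(depths: Dict[str, int], threshold: int) -> List[str]:
    """Return the queues whose length exceeds threshold, deepest first."""
    over = [name for name, depth in depths.items() if depth > threshold]
    return sorted(over, key=lambda name: depths[name], reverse=True)
```

With redis-py installed, `collect_queue_depths(redis.Redis(host=..., port=6379), QUEUES)` would be one way to feed it; the connection details depend on the deployment.
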
2. Check Celery Worker Status

# List the running workers
docker exec zeus-backend celery -A app.core.celery.app inspect ping

# Show active tasks
docker exec zeus-backend celery -A app.core.celery.app inspect active

# Show reserved (prefetched) tasks
docker exec zeus-backend celery -A app.core.celery.app inspect reserved

# Worker statistics
docker exec zeus-backend celery -A app.core.celery.app inspect stats
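
When several workers are expected, it can help to compare the ping replies against a list of expected worker names. This is a small sketch under assumptions: the input has the shape returned by Celery's `app.control.inspect().ping()` (a dict keyed by worker name, or `None` when nothing replies), and the worker names shown in the test data are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def find_missing_workers(
    expected: Iterable[str],
    ping_reply: Optional[Dict[str, dict]],
) -> List[str]:
    """Return the expected worker names that did not answer the ping."""
    replied = set(ping_reply or {})  # None means no worker replied at all
    return sorted(name for name in expected if name not in replied)
```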

3. Check Redis Memory Status

# General Redis memory information
docker exec zeus-redis redis-cli INFO memory

# Detailed memory analysis
docker exec zeus-redis redis-cli MEMORY STATS | head -30

# Memory usage summary
docker exec zeus-redis redis-cli INFO memory | grep -E "used_memory_human|maxmemory_human|used_memory_peak_human|mem_fragmentation_ratio"

# Find the largest keys
docker exec zeus-redis redis-cli --bigkeys
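
To judge how close Redis is to its limit, the `INFO memory` output can be reduced to a single ratio. The sketch below parses the plain `key:value` text that `redis-cli INFO` prints; the function names are illustrative, and a `maxmemory` of 0 is treated as "no limit configured".

```python
from typing import Dict, Optional


def parse_info(text: str) -> Dict[str, str]:
    """Parse redis-cli INFO output ("key:value" lines, "# Section" headers)."""
    info: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        info[key] = value
    return info


def memory_usage_ratio(info: Dict[str, str]) -> Optional[float]:
    """Return used_memory / maxmemory, or None when no limit is set."""
    maxmemory = int(info.get("maxmemory", "0"))
    if maxmemory == 0:
        return None
    return int(info["used_memory"]) / maxmemory
```

A ratio close to 1.0 would point at cause 3 in the list of possible causes.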

4. Check the Flower/Horizon Dashboard

# Is Flower reachable?
curl -s -o /dev/null -w "%{http_code}" http://localhost:5555/api/workers

# Queue statistics through Flower
curl -s http://localhost:5555/api/queues/length | python3 -m json.tool

# Active tasks through Flower (URL quoted so the shell does not interpret "?")
curl -s "http://localhost:5555/api/tasks?state=STARTED" | python3 -m json.tool

5. Detect Stuck Tasks

# Find long-running tasks
docker exec zeus-backend celery -A app.core.celery.app inspect active \
| python3 -c "
import sys
text = sys.stdin.read()
# Simple text analysis: print the lines that carry a task start time
for line in text.split('\n'):
    if 'time_start' in line:
        print(line.strip())
"

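Printing the `time_start` lines still leaves the age calculation to the reader. A possible refinement, sketched here under assumptions, works on the dict returned by Celery's `app.control.inspect().active()` (worker name mapped to a list of task dicts). It assumes `time_start` is a Unix epoch timestamp, which holds for recent Celery versions but should be verified for the version in use; the threshold is hypothetical.

```python
import time
from typing import Dict, List, Optional, Tuple


def find_long_running(
    active: Optional[Dict[str, List[dict]]],
    threshold_seconds: float,
    now: Optional[float] = None,
) -> List[Tuple[str, str, str, float]]:
    """Return (worker, task_name, task_id, age_seconds) for old tasks, oldest first."""
    now = time.time() if now is None else now
    stuck = []
    for worker, tasks in (active or {}).items():
        for task in tasks:
            started = task.get("time_start")
            if started is None:
                continue  # task has not reported a start time yet
            age = now - float(started)
            if age >= threshold_seconds:
                stuck.append((worker, task.get("name", "?"), task.get("id", "?"), age))
    return sorted(stuck, key=lambda row: row[3], reverse=True)
```

The task ids it returns are the values to pass to the revoke command in the resolution steps.
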
# Search the worker logs for recurring errors
docker logs --tail 1000 zeus-celery-worker 2>&1 \
| grep -E "ERROR|Retry|MaxRetriesExceeded" | tail -20

# Check for tasks caught in a retry loop
docker logs --tail 1000 zeus-celery-worker 2>&1 \
| grep "Retry in" | sort | uniq -c | sort -rn | head -10
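
Because every log line usually starts with a different timestamp, `uniq -c` on whole lines tends to count each retry once. Grouping by task name gives a clearer picture. The sketch below assumes Celery's default retry log wording (`Task <name>[<id>] retry: Retry in ...`); if the deployment uses a custom log format, the regular expression needs adjusting.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Matches e.g. "Task app.tasks.alarm.check_alarms[<uuid>] retry: Retry in 60s: ..."
RETRY_RE = re.compile(r"Task (?P<name>[\w.]+)\[[^\]]+\] retry: Retry in")


def count_retries_by_task(log_lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Count retry log lines per task name, most frequent first."""
    counts: Counter = Counter()
    for line in log_lines:
        match = RETRY_RE.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts.most_common()
```

It can be fed from `docker logs --tail 1000 zeus-celery-worker 2>&1` by reading `sys.stdin`.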

6. Check Redis Connection Status

# Redis connection information
docker exec zeus-redis redis-cli INFO clients | grep -E "connected_clients|blocked_clients|maxclients"

# Check the Redis slow log
docker exec zeus-redis redis-cli SLOWLOG GET 10

# Inspect the Redis container logs
docker logs --tail 100 zeus-redis 2>&1 | grep -i "error\|warning\|oom"

Resolution Steps

Scale Up Celery Workers

# Start additional celery workers (temporarily)
docker compose up -d --scale celery-worker=3

# Or raise the concurrency of the existing worker
# In docker-compose.yml:
# CELERY_WORKER_CONCURRENCY=8 (default: 4)
# Recreate the container so the new value takes effect ("restart" does not re-read docker-compose.yml)
docker compose up -d celery-worker

# Verify that the workers have started
docker exec zeus-backend celery -A app.core.celery.app inspect ping
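
How many workers to start can be estimated with simple arithmetic: the backlog drains at the processing capacity minus the arrival rate. The sketch below is a rough model, not a measurement; all parameter names are hypothetical, and the per-slot throughput has to come from the deployment's own observations.

```python
import math
from typing import Optional


def estimate_drain_seconds(
    backlog: int,
    workers: int,
    concurrency: int,
    tasks_per_second_per_slot: float,
    arrival_rate: float,
) -> Optional[float]:
    """Seconds until the backlog is empty, or None if it never drains."""
    capacity = workers * concurrency * tasks_per_second_per_slot
    net_rate = capacity - arrival_rate
    if net_rate <= 0:
        return None  # tasks arrive at least as fast as they are processed
    return backlog / net_rate


def workers_needed(
    backlog: int,
    target_seconds: float,
    concurrency: int,
    tasks_per_second_per_slot: float,
    arrival_rate: float,
) -> int:
    """Smallest worker count that drains the backlog within target_seconds."""
    required_rate = arrival_rate + backlog / target_seconds
    return max(1, math.ceil(required_rate / (concurrency * tasks_per_second_per_slot)))
```

For example, with a backlog of 12,000 tasks, 4 slots per worker at 2 tasks per second each, and 4 new tasks arriving per second, three workers would clear the queue in about ten minutes under this model.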

Cancel Stuck Tasks

# Revoke a specific task
docker exec zeus-backend celery -A app.core.celery.app control revoke {task_id} --terminate

# Shut down all workers (warm shutdown: workers finish their running tasks, then exit; use with care)
docker exec zeus-backend celery -A app.core.celery.app control shutdown

# Restart the worker (tasks running at that moment are abandoned)
docker compose restart celery-worker

# Restart celery beat as well
docker compose restart celery-beat

Redis Memory Management

# Check the Redis memory limit
docker exec zeus-redis redis-cli CONFIG GET maxmemory

# Check the eviction policy
docker exec zeus-redis redis-cli CONFIG GET maxmemory-policy

# Check the key count and how many keys have an expiry set
docker exec zeus-redis redis-cli DBSIZE
docker exec zeus-redis redis-cli INFO keyspace

# Count the stored task results
docker exec zeus-redis redis-cli KEYS "celery-task-meta-*" | wc -l
# Set a 24-hour expiry on task results that have no TTL (they would otherwise never expire)
docker exec zeus-redis redis-cli EVAL "
local keys = redis.call('keys', 'celery-task-meta-*')
local updated = 0
for _, key in ipairs(keys) do
  local ttl = redis.call('ttl', key)
  if ttl == -1 then
    redis.call('expire', key, 86400)
    updated = updated + 1
  end
end
return updated
" 0

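On a large keyspace, `KEYS` inside a Lua script blocks Redis for the whole scan. An incremental alternative is sketched below; it assumes a client with `scan_iter`, `ttl` and `expire` methods (as redis-py provides), and the function name and batch size are illustrative.

```python
def expire_untracked_results(
    client,
    ttl_seconds: int = 86400,
    pattern: str = "celery-task-meta-*",
) -> int:
    """Give every matching key without a TTL an expiry; return how many were updated."""
    updated = 0
    # SCAN walks the keyspace in small batches instead of blocking like KEYS.
    for key in client.scan_iter(match=pattern, count=500):
        if client.ttl(key) == -1:  # -1 means the key exists but never expires
            client.expire(key, ttl_seconds)
            updated += 1
    return updated
```

Like the Lua script, this only schedules expiry; memory is freed once the TTL elapses.
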
Queue Purge (Last Resort)

# WARNING: This operation deletes ALL tasks in the queue!
# Use it only when the backlog cannot be recovered any other way.

# Purge a specific queue (-f skips the confirmation prompt, which docker exec cannot answer without a TTY)
docker exec zeus-backend celery -A app.core.celery.app purge -f -Q celery:default

# Purge all queues
docker exec zeus-backend celery -A app.core.celery.app purge -f

# Delete directly in Redis (if celery purge does not work)
docker exec zeus-redis redis-cli DEL celery
docker exec zeus-redis redis-cli DEL celery:default
docker exec zeus-redis redis-cli DEL celery:alarm_processing
docker exec zeus-redis redis-cli DEL celery:notification_sending
docker exec zeus-redis redis-cli DEL celery:ota_processing

# Restart the workers after the cleanup
docker compose restart celery-worker celery-beat

Prevent Task Bursts

# Add a rate limit (for specific task types)
docker exec zeus-backend celery -A app.core.celery.app control rate_limit \
app.tasks.alarm.check_alarms 100/m

# Set the task rate limits in the backend configuration
# CELERY_TASK_RATE_LIMIT_ALARM=50/m
# CELERY_TASK_RATE_LIMIT_NOTIFICATION=30/m
docker compose restart backend celery-worker
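
Before choosing a rate limit, it helps to know which task dominates the backlog. The sketch below counts task names in raw queue entries. It assumes the messages are JSON envelopes carrying the task name under `headers.task`, as Celery's default message protocol (version 2) does on the Redis transport; other serializers or protocol versions would need different parsing.

```python
import json
from collections import Counter
from typing import Iterable, List, Tuple, Union


def count_tasks_by_name(raw_messages: Iterable[Union[bytes, str]]) -> List[Tuple[str, int]]:
    """Count queued messages per task name, most frequent first."""
    counts: Counter = Counter()
    for raw in raw_messages:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8", errors="replace")
        try:
            message = json.loads(raw)
        except ValueError:
            counts["<unparseable>"] += 1
            continue
        headers = message.get("headers") if isinstance(message, dict) else None
        counts[(headers or {}).get("task", "<unknown>")] += 1
    return counts.most_common()
```

A sample of a queue can be read with a redis-py call such as `client.lrange("celery:alarm_processing", 0, 999)`, which does not remove anything from the queue.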

Escalation

Escalate in the following situations:

  • Redis OOM (Out of Memory): the Redis container keeps crashing and a restart does not fix it
  • The queue size exceeds 100,000 and a purge is risky (important tasks could be lost)
  • Alarm notifications are delayed by more than 30 minutes: this may create a safety risk
  • Worker crash loop: the celery worker keeps crashing and restarting (root cause analysis required)
  • Redis data corruption: the persistence files are corrupted
  • Scaling need: if the queue cannot be kept empty with the current resources, notify the infrastructure team
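
The measurable criteria in this list can be checked automatically. The sketch below encodes the two numeric thresholds stated above (100,000 queued tasks, 30 minutes of alarm delay) plus two boolean signals; the `QueueHealth` type and its field names are hypothetical, and the remaining criteria still need human judgment.

```python
from dataclasses import dataclass
from typing import List

MAX_QUEUE_SIZE = 100_000       # from the escalation list
MAX_ALARM_DELAY_MINUTES = 30   # from the escalation list


@dataclass
class QueueHealth:
    total_queue_size: int
    alarm_delay_minutes: float
    redis_oom: bool = False
    worker_crash_loop: bool = False


def escalation_reasons(health: QueueHealth) -> List[str]:
    """Return the escalation criteria that currently apply (empty list: none)."""
    reasons = []
    if health.redis_oom:
        reasons.append("Redis OOM")
    if health.total_queue_size > MAX_QUEUE_SIZE:
        reasons.append("queue size above 100,000")
    if health.alarm_delay_minutes > MAX_ALARM_DELAY_MINUTES:
        reasons.append("alarm notifications delayed more than 30 minutes")
    if health.worker_crash_loop:
        reasons.append("worker crash loop")
    return reasons
```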