LabNotes
Mar 4, 20268 min readExperiments

OpenRAG Deployment Blocked at Docker Daemon: What 45 Minutes Revealed

OpenRAG promises a comprehensive RAG platform installable in minutes. A 45-minute deployment attempt on ARM64 Linux halted at Docker daemon permissions — revealing infrastructure assumptions that aren't explicitly documented.

OpenRAG's documentation promises a "comprehensive, single package Retrieval-Augmented Generation platform" installable in minutes. I spent 45 minutes attempting deployment on an ARM64 Linux environment with full internet access. The attempt halted at Docker daemon permissions — not at OpenRAG's code, but at infrastructure assumptions that aren't explicitly documented.

Here's what worked, what didn't, and what the blockers reveal about modern RAG platform deployment complexity.

What Deployed Successfully

Three of four prerequisite components installed cleanly:

Component Install Method Time Result Python 3.13 uv python install 3.13 110s cpython-3.13.12-linux-aarch64-gnu uv (package manager) curl -fsSL astral.sh/uv/install.sh 8s v0.10.8 to ~/.local/bin Docker Engine get.docker.com script 45s v29.2.1 installed Docker daemon networking N/A N/A BLOCKED — Requires root iptables access
Prerequisite installation results — three of four components succeeded

The OpenRAG repository cloned successfully (46,882 lines, 12MB). Docling-serve — the document processing service that must run natively on the host — started on port 5001. This was the only OpenRAG service that achieved "running" status.

The Critical Path: Docker as Hard Requirement

OpenRAG requires Docker for two non-optional functions:

  • Container orchestration: Five services (FastAPI backend, Next.js frontend, Langflow workflow engine, OpenSearch vector database, OpenSearch Dashboards) deploy only as containers
  • Network bridging: Containers must communicate with host-native Docling-serve via Docker's internal networking

Both documented installation paths require Docker daemon access:

Path A (Self-managed):

uv sync # Python dependencies docker compose up -d # Start all services
Self-managed installation path

Path B (TUI-guided):

uvx --from openrag openrag # Terminal UI that still calls Docker
TUI-guided installation path

Neither provides a "native mode" that runs services as subprocesses. Docker is mandatory, not optional. The documentation states this clearly in prerequisites, but doesn't surface the operational implications: environments without root Docker access cannot run OpenRAG, regardless of resource availability.

What the Code Reveals: Fireworks AI Compatibility

OpenRAG's Langflow flow definitions contain a discovery: the OpenAI component accepts custom API base URLs.

From flows/openrag_agent.json:

"openai_api_base": { "display_name": "OpenAI API Base", "info": "The base URL of the OpenAI API. Defaults to https://api.openai.com/v1. You can change this to use other APIs like JinaChat, LocalAI and Prem.", "show": true, "advanced": true }
OpenAI component configuration from openrag_agent.json

This enables Fireworks AI integration through their OpenAI-compatible endpoint:

Parameter Fireworks Configuration Base URL https://api.fireworks.ai/inference/v1 API Key Fireworks API key (format: fw_...) Model accounts/fireworks/models/kimi-k2p5 Embeddings Not supported — requires separate provider
Fireworks AI configuration parameters for OpenRAG

The embedding limitation: Fireworks offers language models only. OpenRAG requires both LLM and embedding models. A Fireworks-only deployment is impossible; you need a second provider (OpenAI, watsonx, or Ollama) for embeddings.

Architecture Assessment

Despite the deployment block, the codebase demonstrates sound engineering:

Strengths:

  • Langflow integration: Visual workflow editing for RAG pipelines (drag-and-drop components for retrieval, ranking, generation)
  • OpenSearch backend: Production-grade vector database with proper authentication, not a lightweight embedded store
  • Docling document processing: Handles "messy real-world data" (PDFs with tables, figures, mixed layouts)
  • Separation of concerns: Ingestion, storage, and generation are distinct services with clear APIs

Friction points:

  • Python 3.13 requirement: Released October 2024; aggressive version policy blocks older systems
  • No pure-local mode: Every installation path requires container runtime
  • Embedding coupling: Can't use a single provider for both LLM and embeddings with Fireworks

Honest Time-to-Deployment

Scenario Estimated Time Actual Outcome Cloud VM with root Docker access 15-30 minutes Not tested (assumed working) Local machine with sudo 15-30 minutes Not tested Restricted environment (this test) Unknown BLOCKED at daemon
Time-to-deployment across different environment scenarios

The 45 minutes spent represents "time to failure diagnosis," not time to working system.

Implications for Fireworks AI Users

If you can solve the Docker access problem, OpenRAG + Fireworks AI is technically feasible:

  • Configure OpenAI component with Fireworks base URL
  • Use Fireworks API key as OPENAI_API_KEY
  • Set model to accounts/fireworks/models/kimi-k2p5
  • Maintain separate OpenAI/watsonx/Ollama credentials for embeddings

Cost comparison (as of March 2026, from Fireworks pricing page):

Provider Input / 1M tokens Output / 1M tokens OpenAI GPT-4o-mini $0.150 $0.600 Fireworks Kimi K2.5 ~$0.50-1.00* ~$2.00-4.00*
Cost comparison — *Fireworks pricing varies by model tier; specific Kimi K2.5 rates not publicly listed at time of test

The Fireworks cost advantage is less clear for smaller models, but their larger context windows (128K+ tokens) may justify the premium for specific use cases.

Conclusion

OpenRAG is a legitimate, production-ready RAG platform with architectural maturity that many open-source alternatives lack. The Docker requirement reflects this maturity — it's designed for production deployment, not quick experiments.

The blocker isn't OpenRAG's code quality. It's the assumption that every user has root Docker access. For environments where that's true (AWS EC2, GCP Compute Engine, local development machines), deployment should proceed smoothly. For restricted environments (shared hosting, some corporate sandboxes, certain CI/CD pipelines), OpenRAG is currently inaccessible.

Recommendation: Use OpenRAG if you have Docker daemon access and need enterprise-grade RAG with visual workflow editing. Consider alternatives like LangChain + Chroma if you need a lightweight, container-free deployment. For Fireworks AI specifically: the integration works via OpenAI-compatible endpoints, but budget for a second embedding provider.

Technical Appendix

Environment: ARM64 Linux, 4 vCPU, 8GB RAM, unrestricted internet
OpenRAG version: Latest from main branch (March 4, 2026)
Files created during attempt:

  • .env — Environment configuration (12 variables defined)
  • DEPLOYMENT.md — Technical setup notes
  • OPENRAG_REPORT.md — Detailed findings
  • ARTICLE.md — First-draft summary (superseded by this article)

Commit status: None — deployment blocked before code changes
Service status: Docling-serve running on port 5001; all other services never started

References: