LabNotes

OpenRAG Deployment Blocked at Docker Daemon: What 45 Minutes Revealed

OpenRAG's documentation promises a "comprehensive, single package Retrieval-Augmented Generation platform" installable in minutes. I spent 45 minutes attempting deployment on an ARM64 Linux environment with full internet access. The attempt halted at Docker daemon permissions—not at OpenRAG's code, but at infrastructure assumptions that aren't explicitly documented.

Here's what worked, what didn't, and what the blockers reveal about modern RAG platform deployment complexity.

What Deployed Successfully

Three of four prerequisite components installed cleanly:

ComponentInstall MethodTimeResult
Python 3.13uv python install 3.13110s✅ cpython-3.13.12-linux-aarch64-gnu
uv (package manager)curl -fsSL astral.sh/uv/install.sh8s✅ v0.10.8 to ~/.local/bin
Docker Engineget.docker.com script45s✅ v29.2.1 installed
Docker daemon networkingN/AN/A❌ Requires root iptables access

The OpenRAG repository cloned successfully (46,882 lines, 12MB). Docling-serve—the document processing service that must run natively on the host—started on port 5001. This was the only OpenRAG service that achieved "running" status.

The Critical Path: Docker as Hard Requirement

OpenRAG requires Docker for two non-optional functions:

  1. Container orchestration: Five services (FastAPI backend, Next.js frontend, Langflow workflow engine, OpenSearch vector database, OpenSearch Dashboards) deploy only as containers
  2. Network bridging: Containers must communicate with host-native Docling-serve via Docker's internal networking

Both documented installation paths require Docker daemon access:

Path A (Self-managed):

uv sync                                    # Python dependencies
docker compose up -d                       # Start containers

Path B (TUI-guided):

uvx --from openrag openrag                 # Terminal UI that still calls Docker

Neither provides a "pure local" mode that runs services as subprocesses. Docker is mandatory, not optional. The documentation states this clearly in prerequisites, but doesn't surface the operational implications: environments without root Docker access cannot run OpenRAG, regardless of resource availability.

What the Code Reveals: Fireworks AI Compatibility

OpenRAG's Langflow flow definitions contain a discovery: the OpenAI component accepts custom API base URLs.

From flows/openrag_agent.json:

"openai_api_base": {
  "display_name": "OpenAI API Base",
  "info": "The base URL of the OpenAI API. Defaults to https://api.openai.com/v1. 
           You can change this to use other APIs like JinaChat, LocalAI and Prem.",
  "show": true,
  "advanced": true
}

This enables Fireworks AI integration through their OpenAI-compatible endpoint:

ParameterFireworks Configuration
Base URLhttps://api.fireworks.ai/inference/v1
API KeyFireworks API key (format: fw_...)
Modelaccounts/fireworks/models/kimi-k2p5
Embeddings⚠️ Not supported—requires separate provider

The embedding limitation: Fireworks offers language models only. OpenRAG requires both LLM and embedding models. A Fireworks-only deployment is impossible; you need a second provider (OpenAI, watsonx, or Ollama) for embeddings.

Architecture Assessment

Despite the deployment block, the codebase demonstrates sound engineering:

Strengths:

  • Langflow integration: Visual workflow editing for RAG pipelines (drag-and-drop components for retrieval, ranking, generation)
  • OpenSearch backend: Production-grade vector database with proper authentication, not a lightweight embedded store
  • Docling document processing: Handles "messy real-world data" (PDFs with tables, figures, mixed layouts)
  • Separation of concerns: Ingestion, storage, and generation are distinct services with clear APIs

Friction points:

  • Python 3.13 requirement: Released October 2024; aggressive version policy blocks older systems
  • No pure-local mode: Every installation path requires container runtime
  • Embedding coupling: Can't use a single provider for both LLM and embeddings with Fireworks

Honest Time-to-Deployment

ScenarioEstimated TimeActual Outcome
Cloud VM with root Docker access15-30 minutesNot tested (assumed working)
Local machine with sudo15-30 minutesNot tested
Restricted environment (this test)UnknownBlocked at daemon

The 45 minutes spent represents "time to failure diagnosis," not time to working system.

Implications for Fireworks AI Users

If you can solve the Docker access problem, OpenRAG + Fireworks AI is technically feasible:

  1. Configure OpenAI component with Fireworks base URL
  2. Use Fireworks API key as OPENAI_API_KEY
  3. Set model to accounts/fireworks/models/kimi-k2p5
  4. Maintain separate OpenAI/watsonx/Ollama credentials for embeddings

Cost comparison (as of March 2026, from Fireworks pricing page):

ProviderInput / 1M tokensOutput / 1M tokens
OpenAI GPT-4o-mini$0.150$0.600
Fireworks Kimi K2.5~$0.50-1.00*~$2.00-4.00*

*Fireworks pricing varies by model tier; specific Kimi K2.5 rates not publicly listed at time of test.

The Fireworks cost advantage is less clear for smaller models, but their larger context windows (128K+ tokens) may justify the premium for specific use cases.

Conclusion

OpenRAG is a legitimate, production-ready RAG platform with architectural maturity that many open-source alternatives lack. The Docker requirement reflects this maturity—it's designed for production deployment, not quick experiments.

The blocker isn't OpenRAG's code quality. It's the assumption that every user has root Docker access. For environments where that's true (AWS EC2, GCP Compute Engine, local development machines), deployment should proceed smoothly. For restricted environments (shared hosting, some corporate sandboxes, certain CI/CD pipelines), OpenRAG is currently inaccessible.

Recommendation: Use OpenRAG if you have Docker daemon access and need enterprise-grade RAG with visual workflow editing. Consider alternatives like LangChain + Chroma if you need a lightweight, container-free deployment. For Fireworks AI specifically: the integration works via OpenAI-compatible endpoints, but budget for a second embedding provider.


Technical Appendix
Environment: ARM64 Linux, 4 vCPU, 8GB RAM, unrestricted internet
OpenRAG version: Latest from main branch (March 4, 2026)
Files created: .env, DEPLOYMENT.md, OPENRAG_REPORT.md, ARTICLE.md
Commit status: None—deployment blocked before code changes
Service status: Docling-serve running on port 5001; all other services never started

References:
OpenRAG: https://openr.ag
Documentation: https://docs.openr.ag
Source: https://github.com/langflow-ai/openrag
Fireworks API docs: https://docs.fireworks.ai