Self-Hosted — User Guide

Audience: Organizations that deploy Dagen inside their own VPC or data center from an image (e.g. AWS AMI) or equivalent appliance, rather than routing production warehouse data through a shared SaaS control plane.


Deployment pattern

Self-hosted Dagen is the full product running on your infrastructure: agents, pipelines, ingestion runtimes, workflows, Knowledge Base, Git integrations, and LLM calls (to providers you allow) all run under your change control.

Boundary summary

Stays customer-side                                      | May still reach vendor (contract-specific)
---------------------------------------------------------|-----------------------------------------------------------------
Pipeline execution data, warehouse traffic, repo clones  | Software updates, license/token checks, container registry pulls
Workspace metadata in your DB                            | Customer-initiated remote support sessions
Secrets in your vaults                                   |

Networking and TLS

  • Serve the control plane UI + API over HTTPS (port 443 typical).
  • Choose a hostname that matches your certificate (example pattern: self-hosted.dagen.ai).
  • Expose OIDC discovery for workload identity:
    • /.well-known/openid-configuration
    • /openid/v1/jwks
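  • Use these endpoints for workload ↔ IdP federation. If the relying party cannot route to the issuer, consider offline trust, where the verifier is given the signing keys directly instead of fetching them (e.g. Vault JWT/OIDC).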
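
The two discovery paths listed above can be sanity-checked before federation is wired up. The following is a minimal sketch, not a Dagen API: the helper names are hypothetical, the hostname is the example pattern from this guide, and it assumes the discovery document carries the standard OIDC `issuer` and `jwks_uri` fields and that both point at the same HTTPS host as the control plane. Fetching the document over the network is left out so the check stays self-contained.

```python
from urllib.parse import urlsplit

# Paths taken from this guide.
DISCOVERY_PATH = "/.well-known/openid-configuration"
JWKS_PATH = "/openid/v1/jwks"


def discovery_urls(hostname: str) -> dict:
    """Build the two HTTPS URLs a relying party must be able to reach."""
    base = f"https://{hostname}"
    return {"discovery": base + DISCOVERY_PATH, "jwks": base + JWKS_PATH}


def check_discovery(doc: dict, hostname: str) -> list:
    """Return a list of problems found in an already-parsed discovery document.

    Assumption: issuer and jwks_uri should both be HTTPS URLs on `hostname`.
    """
    problems = []
    for field in ("issuer", "jwks_uri"):
        value = doc.get(field)
        if value is None:
            problems.append(f"missing field: {field}")
            continue
        parts = urlsplit(value)
        if parts.scheme != "https":
            problems.append(f"{field} is not HTTPS: {value}")
        if parts.hostname != hostname:
            problems.append(f"{field} host {parts.hostname!r} != {hostname!r}")
    return problems


if __name__ == "__main__":
    host = "self-hosted.dagen.ai"  # example pattern, replace with your hostname
    urls = discovery_urls(host)
    sample = {"issuer": f"https://{host}", "jwks_uri": urls["jwks"]}
    print(urls)
    print(check_discovery(sample, host))
```

An empty list from `check_discovery` means the document is consistent with the hostname you serve. Any entries point at a certificate, proxy, or issuer mismatch worth resolving before a relying party tries to validate tokens.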

Cloud IAM (example: AWS)

An instance profile or role often needs access to:

  1. Metering / entitlement APIs (per your commercial agreement).
  2. Object storage (e.g. S3) for durable state or artifacts.

Tune least-privilege policies with your security team.
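
As a starting point for that review, the object storage item can be expressed as a policy scoped to a single bucket. This is a sketch under stated assumptions: the bucket name, prefix, and the exact set of S3 actions are illustrative and not taken from Dagen's requirements, and metering or entitlement permissions are omitted because they depend on your commercial agreement.

```python
import json


def s3_state_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document limited to one bucket and optional key prefix.

    Assumption: the deployment needs list, read, write, and delete on its own
    state objects. Trim the action list if your deployment needs less.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    object_arn = f"{bucket_arn}/{prefix}*"

    list_stmt = {
        "Sid": "ListStateBucket",
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": [bucket_arn],
    }
    if prefix:
        # Restrict listing to keys under the prefix.
        list_stmt["Condition"] = {"StringLike": {"s3:prefix": [f"{prefix}*"]}}

    return {
        "Version": "2012-10-17",
        "Statement": [
            list_stmt,
            {
                "Sid": "ReadWriteStateObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [object_arn],
            },
        ],
    }


if __name__ == "__main__":
    # "example-dagen-state" and "artifacts/" are hypothetical names.
    print(json.dumps(s3_state_policy("example-dagen-state", "artifacts/"), indent=2))
```

Generating the document from a function keeps the bucket and prefix in one place, which makes it easier to review the scope with your security team.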


Egress to LLM providers

Model calls go to whatever endpoints you configure in Model Settings (OpenAI, Anthropic, Google Vertex, open-source gateways, etc.). Your network must allowlist those HTTPS endpoints.
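
One way to keep the allowlist in step with Model Settings is to compare the two lists before a change goes out. The sketch below is illustrative only: the function names are hypothetical, the endpoint URLs are examples (the internal gateway host is made up), and it assumes your egress control matches on exact hostnames.

```python
from urllib.parse import urlsplit


def endpoint_host(url: str) -> str:
    """Extract the hostname from a configured endpoint, rejecting non-HTTPS URLs."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"not an HTTPS endpoint: {url}")
    return parts.hostname.lower()


def missing_from_allowlist(endpoints, allowlist):
    """Return the sorted hostnames that are configured but not yet allowlisted."""
    allowed = {host.lower() for host in allowlist}
    configured = {endpoint_host(url) for url in endpoints}
    return sorted(configured - allowed)


if __name__ == "__main__":
    # Example values, not read from a real Dagen configuration.
    configured_endpoints = [
        "https://api.openai.com/v1",
        "https://api.anthropic.com",
        "https://llm-gateway.internal.example/v1",
    ]
    egress_allowlist = ["api.openai.com", "api.anthropic.com"]
    print(missing_from_allowlist(configured_endpoints, egress_allowlist))
```

Any hostname the check returns is an endpoint that model calls would fail to reach until the firewall or proxy rule is added.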


Operational parity with cloud

You should expect the same product capabilities as cloud-hosted Dagen, subject to your configuration.


Security reporting

Many deployments surface SBOM and CVE reports via a gateway path such as /security-reports/ for auditors.


User identity bridges

Users may still authenticate with OAuth providers (GitHub, Google) or CLI flows (gcloud, device code), subject to your IdP policies. Configure these per your enterprise standard.


Compliance framing

Self-hosting supports data residency and sovereignty goals because core data paths remain in your account. Legal obligations (GDPR, NIS2, EU AI Act, etc.) still depend on how you operate the deployment. This document is not legal advice.


Related