
Redesigning a 3-Host Homelab: From Promtail Removal to devstack Splitting and Config Consolidation

A record of migrating from Promtail to Vector, splitting devstack by host, and consolidating Go config into defaults.go across a 3-host homelab (storage / desktop / compute), aligning dev and production topologies.

These articles use AI-generated summaries of Obsidian notes originally kept as technical memos.

English translations are produced with AI assistance.

Conclusion

Across the three homelab hosts (storage / desktop / compute), log collection was migrated to Vector, devstack compose files were split, and Go config was consolidated—all as a single coordinated effort. The final host layout is as follows:

desktop.home.arpa          compute.home.arpa          storage.home.arpa
─────────────────          ─────────────────          ─────────────────
agent-gateway :8080        vLLM :8000                 Vector
NATS :4222                 llama.cpp :8001            Loki :3100
reranker :8081             PostgreSQL :5432           Prometheus :9090
LM Studio :1234            Dagster :3300              MinIO :9000
                           MLflow :5050

desktop is the routing layer—“receive requests and dispatch to the right backend.” compute is the computation layer—“inference, data processing, experiment tracking.” storage is the persistence layer—“log aggregation, metrics collection, object storage.” Each host has a clear responsibility, and the devstack compose files were split to match this three-layer structure.

Background

The homelab runs a three-host setup centered on agent-gateway. storage.home.arpa is the 24/7 observability + object store layer (Prometheus, Loki, MinIO, various exporters). desktop.home.arpa runs macOS and serves as the gateway + messaging layer with agent-gateway and NATS JetStream. compute.home.arpa handles GPU inference (vLLM, llama.cpp) and the data platform (PostgreSQL, Dagster, MLflow), started on demand.

Behind this setup is a Nikkei 225 real-time prediction use case. The prediction pipeline requires stable parallel event execution and memory-safe transform processing—Vector’s transform layer fits those requirements. Running locally avoids the latency, data leakage, and resource instability concerns of cloud deployment, maintaining full control over resources and machines. Since all components are containerized, the same setup can be reproduced given the hardware.

Three problems had accumulated:

Log collection depended on Promtail—Promtail is a Loki-specific tail agent that cannot handle multiple outputs like NATS publishing or Prometheus metrics generation
devstack was a single compose—one podman-compose.yml at the root bundled all services, diverging from production’s distributed host layout
Config was scattered across .envrc—hostnames and ports were fixed, yet managed via environment variables, creating a breeding ground for bugs when defaults and .envrc values drifted apart

Phase 1: Promtail → Vector Migration

Motivation

Vector is a general-purpose data pipeline that can simultaneously route input from a single source to multiple outputs: forwarding to Loki, publishing to NATS, and generating Prometheus metrics. With plans to eventually ingest infrastructure logs into the Dagster data pipeline via NATS, migrating from Promtail made sense.

Production Vector Setup

Vector was already deployed on storage.home.arpa as a rootful systemd quadlet. Memory usage was a lightweight 19.3 MB (peak 25.7 MB), and it was already added as a Prometheus scrape target.

The running config has five sources:

journald—collecting only the current boot session with current_boot_only
file_security—alternatives.log, apport.log
file_apt—apt/dpkg logs
syslog_udp—UDP:1514, receiving syslog from MikroTik routers and others
internal_metrics—Vector’s own metrics

The transform layer uses route to split journald into kernel and systemd streams, applying job/host labels to each. Kernel logs match on ._TRANSPORT == "kernel", and systemd logs additionally extract _SYSTEMD_UNIT and __UID__. Syslog is parsed to extract appname/severity/facility as labels.

Sinks run two paths: all transform outputs go to Loki with JSON encoding, and internal_metrics are exposed via a Prometheus exporter (:9598). The only externally exposed ports are 9598 for Prometheus scraping and 4222 for NATS client connections.

Removing devstack Vector

There was a second Vector in devstack. It subscribed to telemetry.> subjects from NATS JetStream, parsed JSON, and routed by domain—storage handled log collection while devstack consumed telemetry, forming two separate pipelines.

Three options were considered for coordinating them:

Option A: storage → NATS publish, flowing telemetry.infra.* to desktop
Option B: devstack → Loki sink, aggregating NATS telemetry into Loki as well
Option C: flow both directions so Loki and NATS both have complete data

Implementation initially proceeded with Option C, but changed course on the judgment that “devstack is for development, so this should be delegated to the production side.” The devstack Vector was removed entirely, and the storage side was returned to a simple config (journald + file + syslog → Loki + Prometheus exporter).

A key design decision: NATS publishing is handled by agent-gateway’s Go code. Gateway goroutines publish to telemetry.* and pipeline.* subjects during request processing, accumulating in JetStream streams. Vector is a log collection specialist, separate from agent-gateway’s event pipeline. The conclusion was that cross-referencing via CorrelationID alone is sufficient.

Phase 2: Splitting devstack by Host

Motivation

The root podman-compose.yml housed NATS, PostgreSQL, Dagster (3 containers), MLflow, Reranker, minio-init, and more—8 services total. In production, services are distributed across storage / desktop / compute, but devstack crammed everything into one compose. The problem of “it works in devstack but all the connection targets are different in production” was becoming apparent.

Split Result

The compose was split into two files matching the production host layout:

devstack/desktop/ — NATS + nats-init + Reranker
devstack/compute/ — PostgreSQL + Dagster (x3) + MLflow + minio-init (for development)

The desktop side is lightweight: NATS container (JetStream enabled, with health check), nats-init (creating PIPELINE / TELEMETRY streams), and Reranker only.

The compute side has cross-host references. Dagster’s dagster-user-code and dagster-daemon connect to desktop’s NATS with NATS_URL: nats://desktop.home.arpa:4222. This was originally nats://nats:4222 within the compose, but since NATS moved to a separate host’s compose, it now uses the hostname. Cross-compose dependency control isn’t possible, so startup order is managed operationally.

Dagster / MLflow Placement Decision

Initially there was consideration of “placing Dagster UI on desktop and splitting only user-code and daemon to compute.” Dagster’s architecture allows separating webserver from user-code/daemon, referencing remote gRPC endpoints via workspace.yaml.

However, MLflow’s mlflow server command bundles UI and tracking server as one unit—they cannot be separated. Since Dagster assets sometimes reference MLflow experiment links, having them on the same host is operationally easier. Accepting the practical constraint that “UI and computation cannot be separated,” the decision was finalized to place all of Dagster and MLflow on compute. From desktop, accessing http://compute.home.arpa:3300 (Dagster UI) and :5050 (MLflow UI) via browser is sufficient.

minio-init Handling

minio-init was deleted once but then restored. Although MinIO is running on storage.home.arpa, the agw-mlflow / agw-iceberg buckets had not yet been created. It was kept as an initialization job to reliably create buckets on first setup, with a service_completed_successfully condition on MLflow’s depends_on to prevent startup without buckets.

Phase 3: Removing .envrc and Consolidating into defaults.go

Motivation

agent-gateway’s config was managed via direnv’s .envrc. About 25 lines of environment variables were defined—COMPUTE_HOST, STORAGE_HOST, VLLM_BASE_URL, NATS_URL, POSTGRES_DSN, etc.—injected from the shell at go run time.

But in a local infrastructure, hostnames and ports are fixed, and secrets are fixed development values, so there is no need to switch them via environment variables. Worse, once .envrc goes stale, a mismatch between the default value in config.go and the value in .envrc becomes a source of bugs.

Design

All default values were consolidated as constants in internal/config/defaults.go. The three host names, port numbers for each service, DB connection details, and log setting defaults were all moved into Go const blocks. The config.go Load() function was unified around four helpers—envOr / boolEnvOr / intEnvOr / optionalBoolEnv—building AppConfig in a single concise return statement.

The core pattern is service discovery rooted in the three hosts:

COMPUTE_HOST → vLLM, llama.cpp, PostgreSQL, Dagster, MLflow
STORAGE_HOST → Loki, MinIO, Prometheus
DESKTOP_HOST → NATS, Reranker, LM Studio

If the hostnames are correct, there is no need to specify individual URLs via environment variables. Override capability is preserved, so temporarily changing a port during development is as simple as VLLM_BASE_URL=http://localhost:8000 go run ./cmd/server.

Two mistakes surfaced during implementation. First, the ports for llama.cpp and Reranker were swapped: llama.cpp was set to 8081 and Reranker to 8001, whereas the correct assignment is llama.cpp:8001 (compute side) and Reranker:8081 (desktop side). Second, DagsterBaseURL’s default host was initially set to desktop; since Phase 2 placed all of Dagster on compute, it was corrected to compute. Comments in defaults.go explicitly note which host each port belongs to.

Overall Event Flow

Two data paths exist:

Path A: agent-gateway goroutine → NATS publish → Dagster sensor. A non-real-time path used for request lineage tracking and data synthesis
Path B: storage Vector → Loki. A real-time infrastructure log aggregation path

The two are designed to be cross-referenced via CorrelationID. Vector doesn’t need to publish to NATS because Path A is already self-contained on the agent-gateway side.

The devstack compose split means development now tests with a network topology close to production. The compute-side Dagster’s NATS_URL points to nats://desktop.home.arpa:4222, matching the production layout. To run everything on localhost, simply override with environment variables.
