Articles

Personal tech articles and research notes from work and hobby projects.

Infrastructure (19 articles)

Server hardware, network topology, container orchestration, monitoring, and GPU environment documentation.

Key topics: AMD EPYC 9175F, MikroTik RouterOS, Podman/Quadlet, Ubuntu Server, Prometheus/Grafana, 10GbE Networking, PostgreSQL, LLM Stack Deployment

Latest articles:

Homelab Infrastructure Redesign -- PostgreSQL Storage/Compute Separation and Devstack Overhaul
2026-04-09
Migrating PostgreSQL from an on-demand GPU box to a 24/7 Mac Mini in a 3-node homelab. Covers the pgvector integration decision, devstack macOS compatibility, and backup symlink design.

Dagster + NATS JetStream Event Pipeline: Implementation Deep Dive
2026-03-14
Implementation details of the event consumption side, where Dagster sensors use pull subscriptions on NATS JetStream and materialize assets from fire-and-forget events published by agent-gateway. Covers the …

Integrating MLflow into devstack — Separating Dagster and Experiment Tracking Responsibilities
2026-03-14
Implementation record of adding MLflow Tracking Server and MinIO to the agent-gateway devstack, connecting Dagster's orchestration layer with the ML experiment tracking layer via correlation_id. …

Browse all articles →

LLM Research (38 articles)

Large language model benchmarks, CPU/GPU inference validation, quantization testing, and optimization research.

Key topics: DeepSeek V3.2, Qwen3, Kimi K2.5, GLM-4.7, Llama 4, Hermes, MiniMax, EPYC 9175F inference optimization, GGUF quantization

Latest articles:

Running GLM-5.2 (744B-A40B) GGUFs Locally: Did MTP Help? Notes From a Few Quant and Expert-Placement Tests
2026-06-30
Notes from running two GLM-5.2 (744B-A40B MoE) GGUF quants (1.630bpw / 2.244bpw) on a dual RTX PRO 6000 Blackwell Max-Q (96GB x2) + 768GB RAM homelab. The quant author reports MTP raising TG …

Running DeepSeek-V4-Flash on Two DwarfStar4 Nodes for Orchestration
2026-06-04
A record of loading DeepSeek-V4-Flash IQ2XXS on two RTX PRO 6000 Blackwell Max-Q GPUs, one DwarfStar4 node per GPU, and evaluating it as both orchestrator and worker in a multi-agent coding system. …

Step-3.7-Flash-NVFP4 as a Local Orchestrator: Multi-Agent System Development
2026-06-02
A local-only multi-agent benchmark using Step-3.7-Flash-NVFP4 as the orchestrator and the familiar framework to generate a Django business system. The run reached a working state in about one hour and …

Browse all articles →

Software Tools (10 articles)

Development tools, IDE configurations, MCP integrations, code analysis utilities, and web project implementations.

Key topics: VS Code Server, Zed, Serena MCP, ctree, Dagster, Django, Lightdash, shelpa

Latest articles:

All Rust, All Handmade -- 9 MCP Tools Powering the Homelab
2026-04-09
A tour of the 9 custom MCP tools powering the homelab LLM agent stack. All Rust, all stdio JSON-RPC 2.0. Covers the design philosophy and purpose of each tool.

voracle Dev Log vol.2 -- Deploying the Research Pipeline and Overhauling the ONNX Inference Engine
2026-04-09
A week of stabilizing voracle's research pipeline: fixing a multibyte character panic in Rust, migrating the ONNX inference engine to Qwen3, and redesigning the vault structure.

From shelpa to filesystem — Complete Redesign of a Rust MCP Filesystem Server
2026-03-30
A record of renaming the local LLM agent MCP server shelpa to filesystem, removing the pipeline execution engine, and redesigning it into a full MCP filesystem server with undo/redo, trash, and file …

Browse all articles →

Architecture (10 articles)

System architecture designs, distributed pipeline patterns, and migration records.

Key topics: Rust, NATS, Dagster, OpenAI Proxy, SSE Streaming, Go Migration

Latest articles:

familiar - Building a Multi-Agent Development Platform That Runs Only on Local LLMs
2026-05-16
A record of the origin and early design of familiar, a local-LLM multi-agent development platform that autonomously plans, implements, tests, and reviews on a home server without relying on cloud …

Evaluating llm-jp-4-32b-a3b-base-NVFP4 for Translation and Pivoting Away from a Resident Translator Role
2026-04-14
A single-GPU vLLM 0.18.0 record for llm-jp-4-32b-a3b-base-NVFP4, and why I switched from SFT/DPO+LoRA-first assumptions to on-demand translation batches in Dagster.

Validating the familiar Harness: Field Observations of a Cloud-Agent Orchestrator with Qwen3-Coder-Next 80B / GLM-5.1
2026-04-14
A field validation record of the familiar orchestrator / naughty / grandpa setup using a Claude orchestrator, Qwen3-Coder-Next 80B IQ4_KSS, and GLM-5.1 smol-IQ4_K. Covers the root cause behind a …

Browse all articles →