Articles
Technical articles and research notes by ksh3, covering infrastructure, LLMs, software tools, workflows, and more.
Personal tech articles and research notes from work and hobby projects.
Infrastructure (19 articles)
Server hardware, network topology, container orchestration, monitoring, and GPU environment documentation.
Key topics: AMD EPYC 9175F, MikroTik RouterOS, Podman/Quadlet, Ubuntu Server, Prometheus/Grafana, 10GbE Networking, PostgreSQL, LLM Stack Deployment
Latest articles:
2026-04-09
Migrating PostgreSQL from an on-demand GPU box to a 24/7 Mac Mini in a 3-node homelab. Covers the pgvector integration decision, devstack macOS compatibility, and backup symlink design.
2026-03-14
Implementation details of the event-consumption side, where Dagster sensors pull-subscribe to NATS JetStream and materialize assets from fire-and-forget events published by agent-gateway. Covers the …
2026-03-14
Implementation record of adding MLflow Tracking Server and MinIO to the agent-gateway devstack, connecting Dagster's orchestration layer with the ML experiment tracking layer via correlation_id. …
LLM Research (37 articles)
Large language model benchmarks, CPU/GPU inference validation, quantization testing, and optimization research.
Key topics: DeepSeek V3.2, Qwen3, Kimi K2.5, GLM-4.7, Llama 4, Hermes, MiniMax, EPYC 9175F inference optimization, GGUF quantization
Latest articles:
2026-06-04
A record of loading DeepSeek-V4-Flash IQ2XXS on two RTX PRO 6000 Blackwell Max-Q GPUs, one DwarfStar4 node per GPU, and evaluating it as both orchestrator and worker in a multi-agent coding system. …
2026-06-02
A local-only multi-agent benchmark using Step-3.7-Flash-NVFP4 as the orchestrator and the familiar framework to generate a Django business system. The run reached a working state in about one hour and …
2026-05-21
A record of running Gemma 4 31B IT on vLLM 0.21.0 and SGLang gemma4-mtp, comparing NVFP4/FP8 block quantization, FP8/BF16 KV cache, and Gemma 4 MTP speculative decoding.
Software Tools (10 articles)
Development tools, IDE configurations, MCP integrations, code analysis utilities, and web project implementations.
Key topics: VS Code Server, Zed, Serena MCP, ctree, Dagster, Django, Lightdash, shelpa
Latest articles:
2026-04-09
A tour of the 9 custom MCP tools powering the homelab LLM agent stack. All Rust, all stdio JSON-RPC 2.0. Design philosophy and purpose for each tool.
2026-04-09
A week of stabilizing voracle's research pipeline: fixing a multibyte character panic in Rust, migrating the ONNX inference engine to Qwen3, and redesigning the vault structure.
2026-03-30
A record of renaming the local LLM agent MCP server shelpa to filesystem, removing the pipeline execution engine, and redesigning it into a full MCP filesystem server with undo/redo, trash, and file …
Architecture (10 articles)
System architecture designs, distributed pipeline patterns, and migration records.
Key topics: Rust, NATS, Dagster, OpenAI Proxy, SSE Streaming, Go Migration
Latest articles:
2026-05-16
A record of the origin and early design of familiar, a local-LLM multi-agent development platform that autonomously plans, implements, tests, and reviews on a home server without relying on cloud …
2026-04-14
A single-GPU vLLM 0.18.0 record for llm-jp-4-32b-a3b-base-NVFP4, and why I switched from SFT/DPO+LoRA-first assumptions to on-demand translation batches in Dagster.
2026-04-14
A field validation record of the familiar orchestrator / naughty / grandpa setup using a Claude orchestrator, Qwen3-Coder-Next 80B IQ4_KSS, and GLM-5.1 smol-IQ4_K. Covers the root cause behind a …
