A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.