Work with me
Build LLM systems and agents that hold up in production.
I take a small number of advisory engagements each quarter. Senior-IC depth on the technical questions, not deck-driven advisory.
Who I am
Hamidreza Saghir, Principal Applied Scientist at Microsoft. Previously Senior MLE Lead at X (Twitter), Applied Scientist at Amazon, ML Researcher at Borealis AI. PhD, U. Toronto. Publications at ACL, InterSpeech, WWW, IEEE TASLP.
I open-sourced Looplet, an iterator-first agent framework that exposes the loop instead of hiding it behind agent.run(task). I write opinion essays on agent design, evals, and the parts of LLM systems that don't show up in tutorials.
What I work on
- Agent architectures. Loop design, tool dispatch, recovery, observability. When to use a framework vs. own the loop. The territory I explore in Looplet and write about in "The loop is the product" and "Your coding agent is under-specified".
- LLM evaluation systems. The pipeline most teams underbuild. Designing eval suites that actually predict production quality, judge-model scaffolding, regression tracking, calibration.
- Post-training stacks. SFT data curation, DPO vs. RLHF decisions, knowledge retention vs. instruction following. Why fine-tuning often degrades the base model and what to do about it.
- RAG and retrieval design. Two-tower vs. cross-encoder tradeoffs, chunking, hybrid retrieval, re-ranking. When the right answer is "smaller model + better retrieval," not more compute.
- Inference cost models. Reading actual serving traces, prefill vs. decode tradeoffs, KV-cache strategy, quantization. Identifying the 10x speedups hiding in your stack.
Engagement shapes
- Technical advisory (typical). 2–4 hour deep-dive sessions, with async follow-up over a defined window. For teams with a specific decision in front of them: "should we fine-tune or use RAG", "is our eval set actually measuring what we think", "why does our agent fail at scale".
- Architecture review. Read the code, the eval, the recent incidents. Produce a written assessment with prioritized actions.
- Hands-on prototype. For very specific problems where a prototype unblocks a decision. Limited availability.
Who I work with
- Engineering teams at Series B to growth-stage companies shipping LLM products.
- Established companies adding LLM features to existing products.
- Founders/CTOs who need a senior technical sounding board for specific decisions.
Who I don't work with
- Pre-product teams looking for general "AI strategy" advice.
- Anyone wanting full-time or fractional CTO arrangements.
- Projects that compete directly with Microsoft.
How to reach me
Email saghir.hr [at] gmail [dot] com with:
- One paragraph on what you're building.
- The specific decision or problem you're stuck on.
- Your team size and current stack (model, infra, eval setup if any).
- Timeline and rough engagement shape you have in mind.
I reply within a few business days to say whether it's a good fit and how we'd start. The first call is 30 minutes, no charge, no slide deck.