Work with me

Build LLM systems and agents that hold up in production.

I take a small number of advisory engagements each quarter. Senior-IC depth on the technical questions, not deck-driven advisory.

Who I am

Hamidreza Saghir, Principal Applied Scientist at Microsoft. Previously Senior MLE Lead at X (Twitter), Applied Scientist at Amazon, ML Researcher at Borealis AI. PhD, U. Toronto. Publications at ACL, Interspeech, WWW, IEEE TASLP.

I open-sourced Looplet, an iterator-first agent framework that exposes the loop instead of hiding it behind agent.run(task). I write opinion essays on agent design, evals, and the parts of LLM systems that don't show up in tutorials.

What I work on

Agent architectures. Loop design, tool dispatch, recovery, observability. When to use a framework vs. own the loop. The territory I explore in Looplet and write about in "The loop is the product" and "Your coding agent is under-specified".
LLM evaluation systems. The pipeline most teams underbuild. Designing eval suites that actually predict production quality, judge-model scaffolding, regression tracking, calibration.
Post-training stacks. SFT data curation, DPO vs. RLHF decisions, knowledge retention vs. instruction following. Why fine-tuning often degrades the base model and what to do about it.
RAG and retrieval design. Two-tower vs. cross-encoder tradeoffs, chunking, hybrid retrieval, re-ranking. When the right answer is "smaller model + better retrieval," not more compute.
Inference cost models. Reading actual serving traces, prefill vs. decode tradeoffs, KV-cache strategy, quantization. Identifying the 10x speedups hiding in your stack.

Engagement shapes

Technical advisory (typical). 2–4 hour deep-dive sessions, with async follow-up over a defined window. For teams with a specific decision in front of them: "should we fine-tune or RAG", "is our eval set actually measuring what we think", "why does our agent fail at scale".
Architecture review. Read the code, the eval, the recent incidents. Produce a written assessment with prioritized actions.
Hands-on prototype. For very specific problems where a prototype unblocks a decision. Limited availability.

Who I work with

Engineering teams at Series B to growth-stage companies shipping LLM products.
Established companies adding LLM features to existing products.
Founders/CTOs who need a senior technical sounding board for specific decisions.

Who I don't work with

Pre-product teams looking for general "AI strategy" advice.
Anyone wanting full-time or fractional CTO arrangements.
Projects that compete directly with Microsoft.

How to reach me

Email saghir.hr [at] gmail [dot] com with:

One paragraph on what you're building.
The specific decision or problem you're stuck on.
Your team size and current stack (model, infra, eval setup if any).
Timeline and rough engagement shape you have in mind.

I reply within a few business days with whether it's a good fit and how we'd start. First call is 30 minutes, no charge, no slide deck.