Writing
Field notes, series, and essays. Written first for me, then for you.
Building a production conversational assistant
A technical series on routing, tools, compliance, observability, and everything between the LLM and the database.
QueryBuilder: turning a Pydantic object into a safe FT.SEARCH query
Building FT.SEARCH queries manually is where you discover RediSearch silently interprets '&' as AND: no error, no exception.
dspy.Refine: runtime self-correction without recompiling the model
DSPy outside offline mode: generate, evaluate against a reward function, and retry on failure, all before critique_node steps in.
Observability in a LangGraph graph: what Langfuse sees that the log doesn't
Logs cover what happened inside each node. They don't answer 'did the fallback rate climb in the last 30 minutes?'. For that, Langfuse.
Three routers, three different problems: DSPy, custom Semantic Router, and Aurélio AI
Before building the custom one I evaluated an open-source library that almost made it into the project. This is the comparison I wish I'd read before making those decisions.
DSPy in practice: what changes when the router is already an LLM but isn't yet compilable
The problem DSPy solves isn't the absence of AI in routing. It's the absence of a contract on that AI's output.
Regulatory guardrails in investment assistants: CVM, ANBIMA, and the LGPD paradox
The gap between the LLM generating a response and that response reaching the customer is where a regulatory violation can happen: without intent, without malice, and with no possibility of reversing it.
Agent memory: episodic, semantic, and procedural
Confusing the three types of memory is where banking LLM projects fail structurally. Cognitive psychology already had the right taxonomy; it just had to be translated to infrastructure.
DSPy, the framework that treats prompts as compilable code, not as strings
Instead of writing prompts, you program declarative modules and let the framework compile optimized prompts, based on data, metrics, and the model you're using.
Fat vs Slim vs Hybrid in Redis Stack: the model that changed how I think about retrieval for LLMs
When volume grows and the LLM starts getting lost in the context, the modeling decision is as important as the database choice. Fat, Slim, or Hybrid: which one stuck?
Have you used Redis for more than simple caching?
A cache miss became a slow API call, p99 climbed, LLM cost climbed. That's where I discovered Redis Stack as a deterministic retrieval and analytics layer for LLM applications.