Engineering brief

Agentic RAG: The Power of Error-Returning Tools

Dave Ebbelaar

The Brief

Dave Ebbelaar's tutorial on building agentic RAG from scratch highlights a crucial production pattern: design tools to return errors instead of raising exceptions. This lets the agent self-correct rather than crash, making workflows more resilient. The trade-off? Agentic RAG adds latency and token cost, so it's best when reliability outweighs speed.

Decision relevance

Read this for workflow impact, implementation trade-offs, and the claims that need technical scrutiny before they reach team planning.

Summary

Dave Ebbelaar walks through building an agentic retrieval-augmented generation (RAG) system from scratch in Python, positioning it as an upgrade path from traditional semantic RAG for teams working with private data. The core premise is that an agentic loop—where an LLM iteratively calls search, list, and read tools to explore a knowledge base—can self-correct and outperform a single-shot linear retrieval approach when latency and cost are not the primary constraints.
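The loop described above can be sketched in a few lines. This is a minimal illustration, not the tutorial's code: `run_agent`, `scripted_model`, and the tool names are hypothetical, and the `decide` callable stands in for a real LLM call so the example runs without an API key.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> function taking one string argument.
Tools = dict[str, Callable[[str], str]]
History = list[tuple[str, str]]


def run_agent(
    question: str,
    decide: Callable[[History], tuple[str, str]],
    tools: Tools,
    max_requests: int = 5,
) -> str:
    """Loop until the model answers or the request budget runs out."""
    history: History = [("user", question)]
    for _ in range(max_requests):
        action, arg = decide(history)  # stand-in for one LLM round trip
        if action == "answer":
            return arg
        tool = tools.get(action)
        # An unknown tool name is fed back as text instead of raising.
        result = tool(arg) if tool else f"Error: unknown tool '{action}'"
        history.append((action, result))
    return "Stopped: request limit reached before an answer was found."


def scripted_model(history: History) -> tuple[str, str]:
    """Scripted stand-in for the LLM: search, then read, then answer."""
    if len(history) == 1:
        return "search", "deploy"
    if len(history) == 2:
        return "read", "runbook.md"
    return "answer", f"Based on {history[-1][1]!r}"


demo_tools: Tools = {
    "search": lambda query: "runbook.md",
    "read": lambda path: "Deploy with make release.",
}
print(run_agent("How do we deploy?", scripted_model, demo_tools))
```

Each pass through the loop is one model round trip, which is where the extra latency and token cost come from; the `max_requests` cap bounds both.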

The tutorial begins with basic file system primitives (glob for listing, regex-based grep for searching, path-safe file reading) to demonstrate the mechanics that coding agents like Cursor or Claude Code use to navigate codebases. These are then wired into an agent using Pydantic AI, showing how the prompt steers the tool calls and how debugging reveals the model’s actual search parameters.
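As a rough sketch of what those three primitives might look like using only the Python standard library: the function names, the Markdown-only default pattern, and the output format are assumptions made for illustration and are not taken from the video.

```python
import re
from pathlib import Path


def list_files(root: Path, pattern: str = "**/*.md") -> list[str]:
    """List files under root that match a glob pattern, as relative paths."""
    return sorted(
        p.relative_to(root).as_posix() for p in root.glob(pattern) if p.is_file()
    )


def grep(root: Path, regex: str, pattern: str = "**/*.md") -> list[str]:
    """Return 'path:line_number:text' for every line that matches regex."""
    compiled = re.compile(regex, re.IGNORECASE)
    hits = []
    for rel in list_files(root, pattern):
        text = (root / rel).read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if compiled.search(line):
                hits.append(f"{rel}:{number}:{line.strip()}")
    return hits


def read_file(root: Path, rel_path: str) -> str:
    """Read a file, refusing any path that resolves outside root."""
    base = root.resolve()
    target = (base / rel_path).resolve()
    if not target.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes the knowledge base: {rel_path}")
    return target.read_text(encoding="utf-8")
```

This naive version raises on a bad path, which is fine for a script but would abort an agent run.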

For the engineering leader, the most useful segment is the production section. Here, Ebbelaar swaps the naive Python grep for the Rust-based ripgrep (matching the choice made by modern agent harnesses), adds explicit request limits and a maximum number of lines per read to prevent runaway loops, and, crucially, adopts the pattern of returning human-readable error strings instead of raising exceptions. When the agent hits a wall, it receives formatted feedback and can self-correct rather than crashing the entire orchestration, which makes this an essential pattern for building resilient tools. The use of structured output (Pydantic models for answers and citations) is highlighted as a necessary interface for downstream consumers.
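A hedged sketch of the error-returning pattern follows. The names `read_tool` and `search_tool`, the 200-line cap, the 10-second timeout, and the wording of the messages are all assumptions for illustration; only the ripgrep flags and exit codes reflect the real CLI.

```python
import shutil
import subprocess
from pathlib import Path

MAX_READ_LINES = 200  # assumed cap; the source does not state a number


def read_tool(root: Path, rel_path: str, max_lines: int = MAX_READ_LINES) -> str:
    """Read a file for the agent; every failure comes back as a string."""
    base = root.resolve()
    target = (base / rel_path).resolve()
    if not target.is_relative_to(base):  # requires Python 3.9+
        return f"Error: '{rel_path}' is outside the knowledge base. Use a relative path."
    if not target.is_file():
        return f"Error: '{rel_path}' does not exist. List the files to see valid paths."
    try:
        lines = target.read_text(encoding="utf-8").splitlines()
    except (OSError, UnicodeDecodeError) as exc:
        return f"Error: could not read '{rel_path}': {exc}"
    if len(lines) > max_lines:
        omitted = len(lines) - max_lines
        lines = lines[:max_lines] + [f"[truncated: {omitted} more lines]"]
    return "\n".join(lines)


def search_tool(root: Path, regex: str) -> str:
    """Search with ripgrep if it is installed; report problems as text."""
    if shutil.which("rg") is None:
        return "Error: ripgrep ('rg') is not installed on this host."
    try:
        proc = subprocess.run(
            ["rg", "--line-number", "--ignore-case", "--regexp", regex, "."],
            cwd=root, capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"Error: search failed: {exc}"
    if proc.returncode == 1:  # ripgrep exits with 1 when nothing matched
        return f"No matches for '{regex}'. Try a broader pattern."
    if proc.returncode != 0:  # 2 signals an error such as an invalid regex
        return f"Error: search could not run: {proc.stderr.strip()}"
    return proc.stdout
```

Because both tools always return a string, a bad path or an invalid pattern becomes one more message in the conversation that the model can react to, rather than an exception that ends the run.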

The discussion of trade-offs is honest: semantic RAG is labeled 'not dead' for ultra-low-latency, cost-sensitive paths, while agentic RAG adds a noticeable time and token tax (the demo takes 10-15 seconds with multiple round trips). The instructor leans on a hypothetical engineering wiki use case, but the abstractions are thin enough to adapt. The main caveat is the heavy reliance on local filesystem semantics, which will require re-tooling for database-backed or blob-storage-centric deployments.

Why It Matters

Shows how to reinforce AI reliability in private-knowledge systems by designing tools that let agents fail gracefully and self-correct.

Editorial analysis

Key claims

  • Agentic RAG's secret isn't just the loop—it’s engineering the tools to return errors instead of crashing.

Practical use cases

  • Use this as input for tooling evaluation, workflow planning, and technical due diligence.

Risks / caveats

  • The implementation details are specific to the Pydantic AI framework, but the pattern works with any orchestration tool or the raw model API.

Who should care

  • Engineering managers, tech leads, and CTOs evaluating AI or developer tooling decisions.

Bottom Line

Agentic RAG's secret isn't just the loop—it’s engineering the tools to return errors instead of crashing.

Video and thumbnails remain the property of their respective creators. tldw.news provides editorial analysis, commentary, and discovery links to original content.
