If You Want to Take Your AI Agent from Mediocre to World-Class, Fix Its RAG Documentation

Introduction

Google Cloud released a report that found 88% of organizations deploying agentic AI experience a positive ROI, with 74% seeing that return within the first year. Despite these optimistic metrics, fewer than 1 in 5 adults trust an AI system’s output because they’re skeptical of its accuracy. This disconnect shows that agentic AI’s performance needs closer attention. An agent’s ability to produce an accurate response depends heavily on the quality of its RAG documentation layer. This article illustrates how a proper RAG strategy can help your organization win at AI agent deployment.


What is RAG? (And how is it distinct from an LLM?)

First, it’s important to distinguish between an average AI answer bot and a RAG-enabled AI agent. RAG stands for “Retrieval-Augmented Generation”, a technique in which the agent retrieves relevant passages from a carefully prepared knowledge base layer and passes them to the model, so that it can respond as accurately as possible.

An average AI built without a RAG strategy will lean heavily on the Large Language Model (LLM) contained within its system to answer queries. As a result, the AI guesses or hallucinates to fill the gaps in its knowledge. Likewise, an agent with a poorly crafted RAG layer will struggle to surface the correct answer when queried. As demonstrated by the Moffatt v. Air Canada civil tribunal ruling, such errors can be quite costly.

An AI Agent without a RAG layer is like a well-trained librarian with no book collection to refer to.

In a RAG-supported AI system, the LLM’s role is to handle inference: the operation that formulates the complete response returned to a user. Meanwhile, the knowledge base or RAG documentation layer holds specially prepared data, which the system retrieves and hands to the LLM to interpret when formulating replies.

Illustration: imagine an AI system as a brick-and-mortar library, and the LLM as the librarian. If the shelves of the library are empty, the LLM will do its best to answer questions based solely on its training. Once enhanced with a structured RAG documentation layer (the book collection), the LLM (librarian) can now fetch correct data, while simultaneously adding helpful personalized touches to responses that meet the needs of each unique user.
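The division of labor described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the sample knowledge base text is invented, the word-overlap scoring stands in for the embedding search a real system would likely use, and the function names (retrieve, build_prompt) are hypothetical. The LLM call itself is left out; the sketch stops at the prompt that would be handed to the model.

```python
import re

# Invented sample chunks standing in for a prepared knowledge base.
KNOWLEDGE_BASE = [
    "Refund requests must be submitted within 30 days of purchase.",
    "Support hours are Monday to Friday, 9am to 5pm Eastern.",
    "Warranty claims require the original receipt.",
]

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, chunks, top_k=1):
    """Rank chunks by how many words they share with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(chunk)), chunk) for chunk in chunks]
    scored = [pair for pair in scored if pair[0] > 0]  # drop chunks with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]

def build_prompt(query, chunks):
    """Assemble the text handed to the LLM: retrieved sources, then the question."""
    context = retrieve(query, chunks)
    if not context:
        # Empty shelves: the model has nothing to ground its answer in.
        return f"Question: {query}\nNo supporting documents found; say so rather than guess."
    sources = "\n".join(f"- {c}" for c in context)
    return f"Answer using only these sources:\n{sources}\nQuestion: {query}"

print(build_prompt("When are refund requests due?", KNOWLEDGE_BASE))
```

With a stocked knowledge base the prompt carries the refund policy; with an empty list, the same question reaches the model with no sources at all, which is the "librarian with empty shelves" case.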

How some organizations get RAG wrong (and how to get it right)

Here is a typical workflow followed by organizations lacking a proper RAG strategy or dedicated Knowledge Base Architect on staff:

  • The Engineering department is tasked with building the infrastructure that will run the LLM.
  • The Product department designs the user interface (and sometimes the user experience too).
  • Based on a minimum viable product, Leadership approves the budget for full-scale implementation.
  • Finally, the organization’s proprietary content (PDFs, articles, wikis, policy documents, procedure manuals, etc.) is stuffed into the AI agent’s knowledge base, with minimal-to-no special vetting or prep.

At this point, the average company may feel they’ve completed the work required to launch their RAG-based AI into their operations. In reality, it’s more likely that they’ve unwittingly introduced friction into their systems that will hurt their mission and their organization’s bottom line.

The garbage in, garbage out dilemma

A common misstep in RAG layer construction is assuming that more data added to a knowledge base results in a smarter agent. In reality, building a robust AI relies on quality more than quantity.

An AI agent can only retrieve what it’s been fed. When queried, it will attempt to retrieve ingested data regardless of the quality of that data, because its role is not to make a judgment call. That responsibility belongs to the Knowledge Base Architect, a role most organizations don’t have on their teams. Without such a specialist, the following problems can surface in an AI system:

  • Outdated source material.
    When superseded procedures, old regulatory guidelines, and deprecated policy versions sit alongside current ones, with no distinction between them, the agent has no way to determine which version is authoritative. It may then present outdated information as current, a problem that can create real legal liability.

  • Contradictory documentation and duplicated data.
    Organizations that have grown through mergers, rapid scaling, or multiple product iterations often have the same procedures documented several different ways across varying departments and systems. This is particularly true in legacy enterprises. An agent cannot resolve that contradiction on its own; it will surface whichever version scores highest in retrieval regardless of accuracy.

  • Irrelevant filler content.
    Meeting notes, draft documents, internal brainstorms, and off-topic files bloat the knowledge base and increase retrieval noise. Every irrelevant document in the pipeline is a competitor for the right answer.

  • Unstructured formatting.
    Documents that were designed for human reading (dense paragraphs, inconsistent headers, tables embedded in PDFs, etc.) don’t translate cleanly into machine-readable chunks. Without proper reformatting, the agent will be prone to retrieving data that’s littered with visual artifacts.

RAG documentation preparation is the keystone of an excellent AI Agent

Preparing documents or “chunks” for AI ingestion is a combination of careful research, curation, and formatting. The first step, deciding what to include and exclude, starts with identifying the authoritative material that belongs in the knowledge base:

  • Version-controlled policy and procedure documents
  • Regulatory frameworks and compliance standards relevant to a subject, industry, or domain
  • Technical specifications, maintenance protocols, and operational standards
  • Verified research, methodology documentation, and authoritative internal reports

Of equal importance is knowing what to keep out of a knowledge database:

  • Superseded versions of any document still present in active systems
  • Draft, unreviewed, or informal internal communications
  • Duplicate content covering the same topic with inconsistent detail
  • Any document whose accuracy cannot be verified or whose author cannot be identified
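The exclusion rules above lend themselves to a first-pass automated screen. The sketch below is one possible shape for it, under stated assumptions: the Doc fields, the status labels, and the curate function are all hypothetical, and the duplicate check only catches copies that match after whitespace and case are normalized. Duplicates that cover the same topic with inconsistent detail still need a human reviewer.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class Doc:
    doc_id: str
    title: str
    version: int
    status: str            # assumed labels: "approved", "draft", ...
    author: Optional[str]  # None when no author can be identified
    text: str

def fingerprint(text):
    """Hash of the case- and whitespace-normalized text, used to spot exact duplicates."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def curate(docs):
    """Split docs into (kept, rejected), where rejected maps doc_id to reasons."""
    latest = {}
    for d in docs:
        latest[d.title] = max(latest.get(d.title, d.version), d.version)

    kept, rejected, seen = [], {}, {}
    for d in docs:
        reasons = []
        if d.version < latest[d.title]:
            reasons.append("superseded")
        if d.status != "approved":
            reasons.append("draft or unreviewed")
        if not d.author:
            reasons.append("no identifiable author")
        fp = fingerprint(d.text)
        if fp in seen:
            reasons.append(f"duplicate of {seen[fp]}")
        elif not reasons:
            seen[fp] = d.doc_id  # only clean documents count as the original copy
        if reasons:
            rejected[d.doc_id] = reasons
        else:
            kept.append(d)
    return kept, rejected

docs = [
    Doc("a", "Refund Policy", 1, "approved", "Legal", "Refunds within 14 days."),
    Doc("b", "Refund Policy", 2, "approved", "Legal", "Refunds within 30 days."),
    Doc("c", "Refund FAQ", 1, "approved", "Support", "refunds  within 30 DAYS."),
    Doc("d", "Pricing ideas", 1, "draft", None, "Maybe raise prices?"),
]
kept, rejected = curate(docs)
for doc_id, reasons in rejected.items():
    print(doc_id, reasons)
```

A screen like this narrows the pile; it does not replace the judgment call about which source is authoritative.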

After curation, the formatting of approved content into a RAG layer is the next critical step. Documents need to be cleaned of unnecessary visual artifacts and “chunked” at logical boundaries so retrieved pieces carry complete meaning.
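As a rough illustration of cleaning and chunking, the sketch below strips one kind of visual artifact (page-number lines) and then packs whole paragraphs into chunks, so that no chunk ends mid-paragraph. The page-marker pattern, the 400-character default, and the function names are assumptions chosen for the example; real documents usually need more cleaning rules and a size limit tuned to the retrieval system in use.

```python
import re

# Assumed artifact pattern: lines such as "Page 3" or "Page 3 of 10".
PAGE_MARKER = re.compile(r"^\s*page \d+( of \d+)?\s*$", re.IGNORECASE)

def clean(text):
    """Drop page-number lines and collapse runs of spaces or tabs."""
    lines = [ln for ln in text.splitlines() if not PAGE_MARKER.match(ln)]
    return "\n".join(re.sub(r"[ \t]+", " ", ln).strip() for ln in lines)

def chunk(text, max_chars=400):
    """Split at blank lines, then pack whole paragraphs into chunks.

    A paragraph is never split; one longer than max_chars becomes its own chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", clean(text)) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)   # close the chunk at the paragraph boundary
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

sample = (
    "Refund policy\n\nRefunds are issued within 30 days.\n\n"
    "Page 1 of 2\n\n"
    "Shipping\n\nOrders ship in 2 business days."
)
for piece in chunk(sample, max_chars=50):
    print(repr(piece))
```

Packing by paragraph keeps each heading with the text it introduces in this sample, which is the point of chunking at logical boundaries rather than at a fixed character count.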

Every chunk also needs metadata appended (document type, date, source, version status, regulatory jurisdiction, etc.) so the retrieval system can find the exact information that’s being requested. Equally important, a source hierarchy needs to be established so that when similar data could reasonably be surfaced, the information with the greatest authority outranks less authoritative data.
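One way to picture metadata and source hierarchy working together is the sketch below. The field names mirror the metadata listed above, but the Chunk class, the AUTHORITY ranking, and the select function are illustrative assumptions; each organization would define its own hierarchy. In practice a filter like this would typically run over the candidates returned by a similarity search.

```python
from dataclasses import dataclass
from datetime import date

# Assumed hierarchy: a lower number outranks a higher one.
AUTHORITY = {"regulation": 0, "policy": 1, "procedure": 2, "faq": 3}

@dataclass(frozen=True)
class Chunk:
    text: str
    doc_type: str
    source: str
    effective: date
    status: str         # "current" or "superseded"
    jurisdiction: str

def select(chunks, jurisdiction):
    """Keep current chunks for the jurisdiction, most authoritative and newest first."""
    eligible = [
        c for c in chunks
        if c.status == "current" and c.jurisdiction == jurisdiction
    ]
    return sorted(
        eligible,
        key=lambda c: (
            AUTHORITY.get(c.doc_type, len(AUTHORITY)),  # unknown types rank last
            -c.effective.toordinal(),                   # newer first within a type
        ),
    )

candidates = [
    Chunk("Refunds take about a month.", "faq", "help-center", date(2024, 5, 1), "current", "US"),
    Chunk("Refunds within 30 days.", "policy", "legal", date(2024, 1, 1), "current", "US"),
    Chunk("Refunds within 14 days.", "policy", "legal", date(2022, 1, 1), "superseded", "US"),
    Chunk("Refunds within 14 days of delivery.", "regulation", "eu-directive", date(2023, 1, 1), "current", "EU"),
]
for c in select(candidates, "US"):
    print(c.doc_type, "|", c.text)
```

Here the superseded policy and the out-of-jurisdiction regulation are filtered out, and the current policy outranks the FAQ even though the FAQ is newer.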

This is methodical and highly critical work that takes significant time and care to do right. As a relatively new and evolving need, it sits in a gap between being “too strategic for a junior writer” and “too tedious for a senior engineer.” It’s also uncommon for a team to have a dedicated specialist just for managing a RAG documentation workflow.

Work with a Knowledge Base Architect for best results

Building a robust AI agent is often driven by a need to improve efficiency and increase revenue. While these outcomes are obvious targets, a well-built AI also helps mitigate legal risks in highly regulated industries like healthcare, and markedly improves customer satisfaction and retention metrics across industries.

Once established, a RAG-enabled AI is relatively easy to update and will support the most accurate LLM inferences based on real-time, organization-specific data. In comparison, a basic AI agent with no RAG layer can only infer from the static data its LLM was last trained on.

Considering what can be achieved by a properly tuned AI, it’s worth investing in an expert who can own the RAG documentation process and ensure it’s completed to the highest possible standard.

A Knowledge Base Architect specializes in structuring your AI’s RAG documentation, ensuring the agent is accurate and reliable.

A Knowledge Base Architect specializing in RAG documentation will focus on structuring the knowledge layer that makes your agent reliable, leaving your other team members free to conduct their normal work. With the right hire, this clean division of labor results in maximized efficiency, fewer errors, and better AI agents. The logical next question may be, should you add this specialist to your team as a salaried employee? Or are you better off securing the services of a contractor?

The World Economic Forum’s Future of Jobs Report 2025 found that 85% of major employers plan to upskill their workforce for AI-related roles, while 70% plan to actively hire new talent with those skills. Knowledge management is rapidly becoming one of those roles. Metrics like this signal that a RAG-related role is worth considering, either as an internal team member or as an external contractor.

Data from job boards and tech recruitment services reveal that the annual salary for a knowledge architect role averages around $124k. If you have someone on your team with the foundational experience to be trained into the role, the cost to upskill them will be roughly $500 to $1,400, depending on the size of your organization.

Where time is crucial and budgets are limited, hiring a contractor either as a consultant or a temporary member of a flex team allows your organization to instantly fill a resource gap. You can expect a contract knowledge architect to charge between $100 and $200 per hour, commensurate with their level of experience.
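To compare the two options, it can help to work out how many contract hours cost the same as a year of salary. The short calculation below uses only the figures quoted above; it assumes the $124k figure is base salary and ignores benefits, recruiting, and overhead, all of which would push the break-even point higher. The function names are illustrative.

```python
SALARY = 124_000                 # average annual salary cited above, in USD
RATE_LOW, RATE_HIGH = 100, 200   # contractor hourly range cited above, in USD

def contract_cost(hours, rate):
    """Total cost of a contract engagement."""
    return hours * rate

def break_even_hours(salary, rate):
    """Contract hours per year that cost the same as the annual salary."""
    return salary / rate

print(break_even_hours(SALARY, RATE_LOW))   # hours at $100 per hour
print(break_even_hours(SALARY, RATE_HIGH))  # hours at $200 per hour
print(contract_cost(300, RATE_LOW), contract_cost(300, RATE_HIGH))
```

On these numbers, a contractor costs less than the average salary below roughly 620 hours a year at the top rate and 1,240 hours at the bottom rate, and a 300-hour engagement would run between $30,000 and $60,000.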

Knowledge is power…

If your organization is developing an AI agent or planning to do so, the following question is worth putting in front of your team before the next sprint cycle:

Do we have someone responsible for the quality of what our AI agent knows? Not the model or the pipeline, but the knowledge itself?

If the answer is no, or if the answer is “the engineers are handling it,” now you know which invisible layer is most likely to fail you, and how to get on top of it.

About the Author

Sequoia is the founder of Funnel Amp, an AI Knowledge Base Architecture and RAG Documentation practice serving the Green Energy and Sustainable Finance sector. Guided by Human-Centered AI Principles, Funnel Amp helps clients deploy AI agents responsibly to support mission-critical decisions that impact people and systems. To start your next AI project, visit funnelamp.com.


References

  1. Google Cloud — ROI of AI: How Agents Help Business
    https://cloud.google.com/transform/roi-of-ai-how-agents-help-business
  2. YouGov — Most Americans Use AI But Still Don’t Trust It
    https://yougov.com/en-us/articles/53701-most-americans-use-ai-but-still-dont-trust-it
  3. Moffatt v. Air Canada — Civil tribunal ruling establishing organizational liability for AI chatbot errors (UBC Allard School of Law Review)
    https://commons.allard.ubc.ca/cgi/viewcontent.cgi?article=1376&context=ubclawreview
  4. PMC/NIH — Legal risks of AI in highly regulated industries
    https://pmc.ncbi.nlm.nih.gov/articles/PMC12540348/
  5. PwC/Salesforce — Customer Experience Survey, customer satisfaction and retention metrics
    https://www.pwc.com/us/en/technology/alliances/library/salesforce-customer-experience-survey.html
  6. World Economic Forum — Future of Jobs Report 2025: Upskilling and hiring trends for AI-related roles (January 2025)
    https://www.weforum.org/publications/the-future-of-jobs-report-2025/digest/
  7. 6figr — Knowledge Architect Salary, United States
    https://6figr.com/us/salary/knowledge-architect–united-states–5-25–tly
Funnel Amp - Knowledge Architecture for Responsible AI