Add a Knowledge Base
Upload documents and connect them to your agent.
This guide covers how to upload documents to Society AI, connect them to your agent as a knowledge base, and let your agent search and cite those documents when answering questions. By the end, your agent will be able to answer questions grounded in your own data.
Overview
A knowledge base (KB) lets your agent answer questions using your own documents rather than only its training data. When a user asks a question, the agent searches the KB for relevant passages, includes them as context, and cites the source files in its response.
Society AI supports two types of KB sources:
| Source Type | Description | Use Case |
|---|---|---|
| Agent documents | Files uploaded directly to an agent | Product docs, FAQs, research papers |
| Workspace connections | Spaces or projects in your organization | Shared team knowledge, project documentation |
Both types are searched together at query time, and results are merged and ranked by relevance.
How It Works
```text
User sends a question
        |
        v
Agent receives task
        |
        v
search_knowledge_base(query)
        |
        v
KB Service resolves agent's configured sources
        |
        v
Fan-out search across all sources
        |
        v
Merge, deduplicate, rank by score
        |
        v
Return top-K chunks with source info
        |
        v
Agent uses chunks as context for its response
```
The KB search is a tool the agent calls when it needs to find information. The agent's system prompt instructs it to cite sources using the format [Source: filename].
Step 1 -- Upload Documents
Via the Dashboard
The simplest way to add documents is through the Society AI dashboard:
- Go to societyai.com
- Navigate to your agent's settings
- Open the Knowledge Base section
- Upload files (PDF, TXT, DOCX, and other common formats)
Files are indexed automatically after upload. Indexing typically takes a few seconds to a few minutes depending on file size.
Via the API
You can also upload documents programmatically through the platform API. Files are stored and indexed in Google File Search, and the KB service handles retrieval.
Step 2 -- Connect KB to Your Agent
Config Agents (Agent Builder)
For agents built with the Agent Builder UI, add KB sources to the agent configuration. There are two source types:
Agent documents -- Search files uploaded directly to this agent:
```json
{
  "kb_sources": [
    {
      "type": "agent",
      "agent_id": "my-agent"
    }
  ]
}
```
Workspace connections -- Search documents from a space or project:
```json
{
  "workspace": [
    {
      "type": "space",
      "space_id": "your-space-uuid",
      "name": "Engineering Docs"
    },
    {
      "type": "project",
      "space_id": "your-space-uuid",
      "project_id": "your-project-uuid",
      "name": "Q1 Research"
    }
  ]
}
```
You can configure both kb_sources and workspace together. The agent searches all configured sources and merges the results.
When either kb_sources or workspace is configured, the platform automatically registers a search_knowledge_base tool on the agent. No code changes are needed -- the agent receives the tool and can call it during conversations.
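The registration rule above can be read as a simple check on the agent configuration. The sketch below is illustrative only: `needs_kb_tool` is a hypothetical helper, not a platform function, and it assumes the configuration is available as a plain dictionary shaped like the JSON examples.

```python
from typing import Any


def needs_kb_tool(agent_config: dict[str, Any]) -> bool:
    """Return True when at least one KB source or workspace connection is set.

    Hypothetical helper: mirrors the documented rule that either key being
    configured is enough for search_knowledge_base to be registered.
    """
    return bool(agent_config.get("kb_sources")) or bool(agent_config.get("workspace"))


only_agent_docs = {"kb_sources": [{"type": "agent", "agent_id": "my-agent"}]}
nothing_configured = {"kb_sources": [], "workspace": []}

print(needs_kb_tool(only_agent_docs))
print(needs_kb_tool(nothing_configured))
```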
Self-Hosted Agents (Python SDK)
For self-hosted agents using the Python SDK, knowledge base integration works differently. Your agent needs to implement its own retrieval logic, since it runs outside the platform infrastructure.
A common approach is to use a retrieval library (like LlamaIndex, LangChain, or ChromaDB) to index and search your documents locally:
```python
# Assumption: the `llm` client used below is the OpenAI SDK's async client.
from openai import AsyncOpenAI
from society_ai import SocietyAgent, TaskContext

agent = SocietyAgent(
    name="kb-agent",
    description="Agent with local knowledge base",
)

# Assumption: reads OPENAI_API_KEY from the environment.
llm = AsyncOpenAI()


# Example: use a local vector store for retrieval
# (Replace with your actual retrieval implementation)
async def search_documents(query: str, top_k: int = 5):
    """Search your local document index."""
    # Your retrieval logic here -- ChromaDB, FAISS, LlamaIndex, etc.
    # `your_vector_store` is a placeholder; each result needs
    # `.source_file` and `.text` attributes for the skill below.
    results = await your_vector_store.query(query, top_k=top_k)
    return results


@agent.skill(name="ask", description="Answer questions from my documentation")
async def ask(message: str, context: TaskContext) -> str:
    # 1. Search knowledge base
    chunks = await search_documents(message)

    # 2. Build context from relevant chunks
    context_text = "\n\n".join(
        f"[Source: {c.source_file}]\n{c.text}" for c in chunks
    )

    # 3. Use LLM with document context
    response = await llm.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer the question using the provided context. "
                    "Cite sources as [Source: filename]."
                ),
            },
            {"role": "user", "content": f"Context:\n{context_text}\n\nQuestion: {message}"},
        ],
    )
    return response.choices[0].message.content


agent.run()
```
Step 3 -- How the Agent Uses KB
When the KB is connected, the agent gains a search_knowledge_base tool. Here is what happens when a user asks a question:
- The agent's LLM decides whether to call search_knowledge_base based on the question
- The tool sends a query to the KB service endpoint
- The service resolves the agent's configured sources from the database
- It fans out searches across all configured sources in parallel
- Results are merged, deduplicated, and sorted by relevance score
- The top-K chunks are returned to the agent
- The agent uses the chunks as context and cites them in the response
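The fan-out, merge, deduplicate, and rank steps can be sketched as follows. This is not the KB service's implementation: `Chunk`, `SourceSearch`, `search_all`, and the two stub sources are hypothetical names, and deduplicating on the (source file, text) pair is an assumption, since the guide does not say how duplicates are detected.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class Chunk:
    text: str
    score: float
    source_file: str


# Hypothetical type: one async search function per configured source.
SourceSearch = Callable[[str], Awaitable[list[Chunk]]]


async def search_all(query: str, sources: list[SourceSearch], top_k: int = 5) -> list[Chunk]:
    # Fan out: query every configured source concurrently.
    per_source = await asyncio.gather(*(search(query) for search in sources))

    # Merge and deduplicate, keeping the best-scoring copy of each passage.
    # Assumption: two chunks are duplicates when file name and text match.
    best: dict[tuple[str, str], Chunk] = {}
    for results in per_source:
        for chunk in results:
            key = (chunk.source_file, chunk.text)
            if key not in best or chunk.score > best[key].score:
                best[key] = chunk

    # Rank by relevance score, highest first, and keep the top-K.
    ranked = sorted(best.values(), key=lambda c: c.score, reverse=True)
    return ranked[:top_k]


# Stub sources standing in for agent documents and a workspace connection.
async def agent_docs(query: str) -> list[Chunk]:
    return [Chunk("Blue-green deploys", 0.9, "deployment-guide.pdf")]


async def space_docs(query: str) -> list[Chunk]:
    return [
        Chunk("Blue-green deploys", 0.7, "deployment-guide.pdf"),
        Chunk("Rollback steps", 0.8, "runbook.pdf"),
    ]


results = asyncio.run(search_all("how do we deploy?", [agent_docs, space_docs]))
for chunk in results:
    print(chunk.source_file, chunk.score)
```

Running the sources concurrently means total latency tracks the slowest source rather than the sum of all of them.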
Search Results Format
Each chunk returned by the KB search includes:
| Field | Type | Description |
|---|---|---|
| text | str | The relevant text passage |
| score | float | Relevance score (0-1, higher is better) |
| source_file | str | Name of the source document |
| page_number | int | Page number (if available) |
| metadata | dict | Additional metadata |
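For self-hosted code that handles results of this shape, the fields map naturally onto a small dataclass. The sketch below is an assumption about how you might model a chunk locally, not a type exported by the SDK: `KBChunk`, `format_context`, and the `min_score` filter are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class KBChunk:
    text: str                          # the relevant text passage
    score: float                       # relevance score, 0-1, higher is better
    source_file: str                   # name of the source document
    page_number: Optional[int] = None  # only present when available
    metadata: dict[str, Any] = field(default_factory=dict)


def format_context(chunks: list[KBChunk], min_score: float = 0.0) -> str:
    """Render chunks as LLM context, labelling each passage with its source.

    `min_score` is a hypothetical filter for dropping weak matches.
    """
    parts = [
        f"[Source: {chunk.source_file}]\n{chunk.text}"
        for chunk in chunks
        if chunk.score >= min_score
    ]
    return "\n\n".join(parts)


chunks = [
    KBChunk("Releases use blue-green deployments.", 0.92, "deployment-guide.pdf", page_number=3),
    KBChunk("Unrelated passage.", 0.15, "misc-notes.txt"),
]
print(format_context(chunks, min_score=0.5))
```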
Citation Behavior
The agent's system prompt instructs it to cite sources when using KB information:
```text
Cite sources when using knowledge base information: [Source: filename]
```
This means responses grounded in KB data include citations like:
```text
The deployment process uses blue-green deployments for zero-downtime releases. [Source: deployment-guide.pdf]
```
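Because citations follow a fixed [Source: filename] pattern, client code can pull them out of a response with a regular expression, for example to render source links. This is a sketch under that assumption: `extract_citations` is a hypothetical helper, and it assumes filenames never contain a closing bracket.

```python
import re

# Matches "[Source: <filename>]" and captures the filename.
CITATION_PATTERN = re.compile(r"\[Source:\s*([^\]]+)\]")


def extract_citations(response: str) -> list[str]:
    """Return cited filenames in order of first appearance, without repeats."""
    seen: list[str] = []
    for name in CITATION_PATTERN.findall(response):
        name = name.strip()
        if name not in seen:
            seen.append(name)
    return seen


answer = (
    "The deployment process uses blue-green deployments for zero-downtime "
    "releases. [Source: deployment-guide.pdf]"
)
print(extract_citations(answer))
```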
Multiple Source Types
You can mix agent documents and workspace connections. The agent searches all of them:
```json
{
  "kb_sources": [
    {
      "type": "agent",
      "agent_id": "my-agent"
    }
  ],
  "workspace": [
    {
      "type": "space",
      "space_id": "engineering-space-uuid",
      "name": "Engineering"
    },
    {
      "type": "project",
      "space_id": "engineering-space-uuid",
      "project_id": "api-project-uuid",
      "name": "API Documentation"
    }
  ]
}
```
In this configuration, the agent searches:
- Files uploaded directly to my-agent
- All documents in the "Engineering" space
- Documents in the "API Documentation" project
Results from all sources are merged, deduplicated, and ranked by relevance score before being returned to the agent.
Updating the Knowledge Base
Adding Documents
Upload new files through the dashboard or API. They are indexed automatically and become searchable within minutes. No agent restart or redeployment is needed -- the KB service resolves sources from the database at query time.
Removing Documents
Delete documents through the dashboard. The document is removed from the search index and will no longer appear in results.
Workspace Changes
Changes to workspace connections (adding or removing spaces/projects) take effect immediately. The KB service reads the agent's configuration from the database on every search request.
Scoping Rules
KB search results are scoped based on how the source is configured:
| Source Config | What Gets Searched |
|---|---|
| Agent documents (kb_sources.type: agent) | Files uploaded to that specific agent |
| Space (workspace.type: space) | All documents in the space |
| Project (workspace.type: project) | Documents in the specific project within the space |
This scoping ensures agents only access documents they are configured to use.
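To audit what a given configuration exposes, the scoping table can be applied mechanically to the agent config. The sketch below assumes the configuration is a dictionary shaped like the JSON examples in this guide. `describe_scopes` is a hypothetical helper, not part of the platform.

```python
from typing import Any


def describe_scopes(config: dict[str, Any]) -> list[str]:
    """List, in plain words, what each configured source makes searchable."""
    scopes: list[str] = []
    for source in config.get("kb_sources", []):
        if source.get("type") == "agent":
            scopes.append(f"files uploaded to agent {source['agent_id']}")
    for conn in config.get("workspace", []):
        if conn.get("type") == "space":
            scopes.append(f"all documents in space {conn['space_id']}")
        elif conn.get("type") == "project":
            scopes.append(
                f"documents in project {conn['project_id']} of space {conn['space_id']}"
            )
    return scopes


config = {
    "kb_sources": [{"type": "agent", "agent_id": "my-agent"}],
    "workspace": [
        {"type": "space", "space_id": "engineering-space-uuid", "name": "Engineering"},
        {
            "type": "project",
            "space_id": "engineering-space-uuid",
            "project_id": "api-project-uuid",
            "name": "API Documentation",
        },
    ],
}
for line in describe_scopes(config):
    print(line)
```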
Best Practices
Document Preparation
- Use descriptive filenames -- The filename appears in citations, so api-authentication-guide.pdf is better than doc1.pdf
- Structure content clearly -- Headings, sections, and paragraphs help the retrieval system find relevant passages
- Keep documents focused -- A 10-page guide on one topic retrieves better than a 200-page omnibus document
Query Optimization
- The search_knowledge_base tool uses natural language queries, not keyword search
- The default top_k is 5 results, configurable up to 20
- For better results, the agent should formulate specific questions rather than broad topics
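If you pass top_k from your own code, it may help to normalise it to the documented range first. The sketch below assumes out-of-range values should be clamped; the guide does not say whether the platform clamps or rejects them, and `resolve_top_k` is a hypothetical helper.

```python
from typing import Optional

DEFAULT_TOP_K = 5   # documented default
MAX_TOP_K = 20      # documented upper limit


def resolve_top_k(requested: Optional[int] = None) -> int:
    """Return a top_k within 1..MAX_TOP_K, falling back to the default."""
    if requested is None:
        return DEFAULT_TOP_K
    # Assumption: clamp rather than raise on out-of-range values.
    return max(1, min(requested, MAX_TOP_K))


print(resolve_top_k())
print(resolve_top_k(50))
```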
When to Use KB vs. Web Search
| Scenario | Best Tool |
|---|---|
| Questions about your own products, processes, or internal docs | Knowledge Base |
| Current events, public data, general knowledge | Web Search |
| Combining internal context with external data | Both (KB + web search) |
Next Steps
- Knowledge Base Reference -- Detailed KB configuration for config agents
- Workspaces -- Understanding organizations, spaces, and projects
- Build a Research Agent -- Build an agent that combines KB with LLM reasoning
- MCP Tools -- Add web search and other tools alongside KB