Society AI Documentation

This guide covers how to upload documents to Society AI, connect them to your agent as a knowledge base, and let your agent search and cite those documents when answering questions. By the end, your agent will be able to answer questions grounded in your own data.

Overview

A knowledge base (KB) lets your agent answer questions using your own documents rather than only its training data. When a user asks a question, the agent searches the KB for relevant passages, includes them as context, and cites the source files in its response.

Society AI supports two types of KB sources:

Source Type	Description	Use Case
Agent documents	Files uploaded directly to an agent	Product docs, FAQs, research papers
Workspace connections	Spaces or projects in your organization	Shared team knowledge, project documentation

Both types are searched together at query time, and results are merged and ranked by relevance.

How It Works

User sends a question
        |
        v
Agent receives task
        |
        v
search_knowledge_base(query)
        |
        v
KB Service resolves agent's configured sources
        |
        v
Fan-out search across all sources
        |
        v
Merge, deduplicate, rank by score
        |
        v
Return top-K chunks with source info
        |
        v
Agent uses chunks as context for its response

The KB search is a tool the agent calls when it needs to find information. The agent's system prompt instructs it to cite sources using the format [Source: filename].

Step 1 -- Upload Documents

Via the Dashboard

The simplest way to add documents is through the Society AI dashboard:

1. Go to societyai.com
2. Navigate to your agent's settings
3. Open the Knowledge Base section
4. Upload files (PDF, TXT, DOCX, and other common formats)

Files are indexed automatically after upload. Indexing typically takes a few seconds to a few minutes depending on file size.

Via the API

You can also upload documents programmatically through the platform API. Files are stored and indexed in Google File Search, and the KB service handles retrieval.

Step 2 -- Connect KB to Your Agent

Config Agents (Agent Builder)

For agents built with the Agent Builder UI, add KB sources to the agent configuration. There are two source types:

Agent documents -- Search files uploaded directly to this agent:

{
  "kb_sources": [
    {
      "type": "agent",
      "agent_id": "my-agent"
    }
  ]
}

Workspace connections -- Search documents from a space or project:

{
  "workspace": [
    {
      "type": "space",
      "space_id": "your-space-uuid",
      "name": "Engineering Docs"
    },
    {
      "type": "project",
      "space_id": "your-space-uuid",
      "project_id": "your-project-uuid",
      "name": "Q1 Research"
    }
  ]
}

You can configure both kb_sources and workspace together. The agent searches all configured sources and merges the results.

When either kb_sources or workspace is configured, the platform automatically registers a search_knowledge_base tool on the agent. No code changes are needed -- the agent receives the tool and can call it during conversations.
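A malformed entry is easy to miss in hand-written JSON, so a quick local check before saving the configuration can help. The following is a minimal sketch, not part of the platform: `validate_kb_config` is a hypothetical helper, and the required keys are inferred only from the examples above rather than from an official schema.

```python
import json

# Required keys per entry type, inferred from the examples in this guide.
# This is an assumption, not an official schema.
REQUIRED_KEYS = {
    "kb_sources": {"agent": {"agent_id"}},
    "workspace": {"space": {"space_id"}, "project": {"space_id", "project_id"}},
}


def validate_kb_config(config: dict) -> list[str]:
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    for section, types in REQUIRED_KEYS.items():
        for index, entry in enumerate(config.get(section, [])):
            entry_type = entry.get("type")
            if entry_type not in types:
                problems.append(f"{section}[{index}]: unknown type {entry_type!r}")
                continue
            for key in sorted(types[entry_type] - set(entry)):
                problems.append(f"{section}[{index}]: missing {key!r}")
    return problems


config = json.loads("""
{
  "kb_sources": [{"type": "agent", "agent_id": "my-agent"}],
  "workspace": [{"type": "project", "space_id": "your-space-uuid", "name": "Q1 Research"}]
}
""")
print(validate_kb_config(config))
```

Here the project entry lacks a `project_id`, so the helper reports it instead of letting the mistake surface later as an empty search result.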

Self-Hosted Agents (Python SDK)

For self-hosted agents using the Python SDK, knowledge base integration works differently. Your agent needs to implement its own retrieval logic, since it runs outside the platform infrastructure.

A common approach is to use a retrieval library (like LlamaIndex, LangChain, or ChromaDB) to index and search your documents locally:

from openai import AsyncOpenAI
from society_ai import SocietyAgent, TaskContext

agent = SocietyAgent(
    name="kb-agent",
    description="Agent with local knowledge base",
)

# Async LLM client used in step 3 (assumed here to be the OpenAI SDK);
# it reads OPENAI_API_KEY from the environment.
llm = AsyncOpenAI()

# Example: use a local vector store for retrieval.
# `your_vector_store` is a placeholder: replace it with your own index
# (ChromaDB, FAISS, LlamaIndex, etc.) exposing an async `query` method.
async def search_documents(query: str, top_k: int = 5):
    """Search your local document index and return the top_k matching chunks."""
    results = await your_vector_store.query(query, top_k=top_k)
    return results

@agent.skill(name="ask", description="Answer questions from my documentation")
async def ask(message: str, context: TaskContext) -> str:
    # 1. Search knowledge base
    chunks = await search_documents(message)

    # 2. Build context from relevant chunks
    context_text = "\n\n".join(
        f"[Source: {c.source_file}]\n{c.text}" for c in chunks
    )

    # 3. Use LLM with document context
    response = await llm.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer the question using the provided context. "
                    "Cite sources as [Source: filename]."
                ),
            },
            {"role": "user", "content": f"Context:\n{context_text}\n\nQuestion: {message}"},
        ],
    )
    # content can be None (for example on a refusal), so fall back to an empty string
    return response.choices[0].message.content or ""

agent.run()
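The example above leaves `your_vector_store` undefined. For local experiments, a tiny in-memory stand-in is enough to exercise the skill end to end. The sketch below is an assumption-laden toy: `Chunk` and `InMemoryStore` are hypothetical names, and scoring uses bag-of-words cosine similarity rather than real embeddings, so it should be swapped for a proper vector store before production use.

```python
import asyncio
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source_file: str
    score: float = 0.0


def _tokens(text: str) -> Counter:
    """Lowercased word counts; a crude substitute for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Toy store: bag-of-words cosine similarity over (source_file, text) pairs."""

    def __init__(self, documents: list[tuple[str, str]]):
        self._docs = [(name, text, _tokens(text)) for name, text in documents]

    async def query(self, query: str, top_k: int = 5) -> list[Chunk]:
        q = _tokens(query)
        scored = [Chunk(text, name, _cosine(q, vec)) for name, text, vec in self._docs]
        scored = [c for c in scored if c.score > 0]  # drop chunks with no overlap
        return sorted(scored, key=lambda c: c.score, reverse=True)[:top_k]


your_vector_store = InMemoryStore([
    ("deployment-guide.pdf", "Releases use blue-green deployments for zero downtime."),
    ("api-authentication-guide.pdf", "API requests are authenticated with bearer tokens."),
])

results = asyncio.run(your_vector_store.query("What are blue-green deployments?", top_k=1))
print(results[0].source_file)
```

The returned objects expose `text` and `source_file`, which is all the `ask` skill reads when it builds its context string.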

Step 3 -- How the Agent Uses KB

When the KB is connected, the agent gains a search_knowledge_base tool. Here is what happens when a user asks a question:

1. The agent's LLM decides whether to call search_knowledge_base based on the question
2. The tool sends a query to the KB service endpoint
3. The service resolves the agent's configured sources from the database
4. It fans out searches across all configured sources in parallel
5. Results are merged, deduplicated, and sorted by relevance score
6. The top-K chunks are returned to the agent
7. The agent uses the chunks as context and cites them in the response
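The fan-out, merge, deduplicate, and rank steps can be illustrated with a short sketch. This is not the KB service's implementation: `search_source` returns canned data, the source labels are made up, and deduplicating on the pair (source file, text) is an assumption, since the service's actual criterion is not documented here.

```python
import asyncio


async def search_source(source: str, query: str) -> list[dict]:
    """Stand-in for a per-source search; returns canned chunks for illustration."""
    canned = {
        "agent:my-agent": [
            {"text": "Use blue-green deployments.", "score": 0.91, "source_file": "deployment-guide.pdf"},
        ],
        "space:engineering": [
            {"text": "Use blue-green deployments.", "score": 0.84, "source_file": "deployment-guide.pdf"},
            {"text": "Rollbacks take under a minute.", "score": 0.62, "source_file": "runbook.md"},
        ],
    }
    return canned.get(source, [])


async def search_all(sources: list[str], query: str, top_k: int = 5) -> list[dict]:
    # Fan out: query every source concurrently.
    per_source = await asyncio.gather(*(search_source(s, query) for s in sources))

    # Merge and deduplicate on (source_file, text), keeping the highest score.
    best: dict[tuple[str, str], dict] = {}
    for chunk in (c for chunks in per_source for c in chunks):
        key = (chunk["source_file"], chunk["text"])
        if key not in best or chunk["score"] > best[key]["score"]:
            best[key] = chunk

    # Rank by score, descending, and keep the top-K.
    return sorted(best.values(), key=lambda c: c["score"], reverse=True)[:top_k]


top = asyncio.run(search_all(["agent:my-agent", "space:engineering"], "deployments"))
for chunk in top:
    print(f'{chunk["score"]:.2f}  {chunk["source_file"]}')
```

The same passage found in two sources appears once in the output, carrying the higher of its two scores.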

Search Results Format

Each chunk returned by the KB search includes:

Field	Type	Description
`text`	`str`	The relevant text passage
`score`	`float`	Relevance score (0-1, higher is better)
`source_file`	`str`	Name of the source document
`page_number`	`int`	Page number (if available)
`metadata`	`dict`	Additional metadata
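If you process search results in your own code, a typed container makes the optional fields explicit. The sketch below mirrors the table above; the `KBChunk` name and the assumption that results arrive as dictionaries keyed by these field names are illustrative, not a documented wire format.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class KBChunk:
    text: str                          # the relevant text passage
    score: float                       # relevance score, 0-1, higher is better
    source_file: str                   # name of the source document
    page_number: Optional[int] = None  # only present for paginated sources
    metadata: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "KBChunk":
        """Build a chunk from a raw result, tolerating missing optional fields."""
        return cls(
            text=raw["text"],
            score=float(raw["score"]),
            source_file=raw["source_file"],
            page_number=raw.get("page_number"),
            metadata=raw.get("metadata") or {},
        )


chunk = KBChunk.from_dict(
    {"text": "Blue-green deployments...", "score": 0.87, "source_file": "deployment-guide.pdf"}
)
print(chunk.page_number)
```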

Citation Behavior

The agent's system prompt instructs it to cite sources when using KB information:

Cite sources when using knowledge base information: [Source: filename]

This means responses grounded in KB data include citations like:

The deployment process uses blue-green deployments for zero-downtime releases. [Source: deployment-guide.pdf]
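Because the citation format is fixed, cited filenames can be pulled out of a response with a regular expression, for example to render source links or to check that an answer is grounded. This is a minimal sketch; `extract_citations` is a hypothetical helper, and the second sentence of the sample answer is invented for the demonstration.

```python
import re

# Matches "[Source: filename]" and captures the filename without surrounding spaces.
CITATION_PATTERN = re.compile(r"\[Source:\s*([^\]]+?)\s*\]")


def extract_citations(response: str) -> list[str]:
    """Return cited filenames in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for name in CITATION_PATTERN.findall(response):
        seen.setdefault(name, None)
    return list(seen)


answer = (
    "The deployment process uses blue-green deployments for zero-downtime releases. "
    "[Source: deployment-guide.pdf] Rollbacks reuse the same pipeline. "
    "[Source: deployment-guide.pdf]"
)
print(extract_citations(answer))
```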

Multiple Source Types

You can mix agent documents and workspace connections. The agent searches all of them:

{
  "kb_sources": [
    {
      "type": "agent",
      "agent_id": "my-agent"
    }
  ],
  "workspace": [
    {
      "type": "space",
      "space_id": "engineering-space-uuid",
      "name": "Engineering"
    },
    {
      "type": "project",
      "space_id": "engineering-space-uuid",
      "project_id": "api-project-uuid",
      "name": "API Documentation"
    }
  ]
}

In this configuration, the agent searches:

Files uploaded directly to my-agent
All documents in the "Engineering" space
Documents in the "API Documentation" project

Results from all sources are merged, deduplicated, and ranked by relevance score before being returned to the agent.

Source Config	What Gets Searched
Agent documents (`kb_sources.type: agent`)	Files uploaded to that specific agent
Space (`workspace.type: space`)	All documents in the space
Project (`workspace.type: project`)	Documents in the specific project within the space

This scoping ensures agents only access documents they are configured to use.
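When reviewing a configuration, it can help to spell out what it grants access to. The sketch below walks a configuration and lists each scope in plain words, following the table above. `describe_scope` is a hypothetical helper for local review only; the platform performs its own scope resolution.

```python
def describe_scope(config: dict) -> list[str]:
    """List, in plain words, what an agent with this configuration searches."""
    scope = []
    for source in config.get("kb_sources", []):
        if source.get("type") == "agent":
            scope.append(f"files uploaded to agent {source['agent_id']}")
    for entry in config.get("workspace", []):
        if entry.get("type") == "space":
            scope.append(f"all documents in space {entry['space_id']}")
        elif entry.get("type") == "project":
            scope.append(
                f"documents in project {entry['project_id']} of space {entry['space_id']}"
            )
    return scope


config = {
    "kb_sources": [{"type": "agent", "agent_id": "my-agent"}],
    "workspace": [
        {"type": "space", "space_id": "engineering-space-uuid", "name": "Engineering"},
    ],
}
for line in describe_scope(config):
    print(line)
```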

Best Practices

Document Preparation

Use descriptive filenames -- The filename appears in citations, so api-authentication-guide.pdf is better than doc1.pdf
Structure content clearly -- Headings, sections, and paragraphs help the retrieval system find relevant passages
Keep documents focused -- A 10-page guide on one topic retrieves better than a 200-page omnibus document

Query Optimization

The search_knowledge_base tool uses natural language queries, not keyword search
The default top_k is 5 results, configurable up to 20
For better results, the agent should formulate specific questions rather than broad topics
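A caller that exposes top_k to users may want to normalize the value before issuing a search. The sketch below applies the default of 5 and the ceiling of 20 stated above. Clamping out-of-range values (rather than rejecting them) and the lower bound of 1 are assumptions, since this guide does not say how the platform treats them.

```python
from typing import Optional

DEFAULT_TOP_K = 5   # default stated in this guide
MAX_TOP_K = 20      # upper limit stated in this guide


def resolve_top_k(requested: Optional[int] = None) -> int:
    """Return a usable top_k: the default when unset, otherwise clamped to 1..20."""
    if requested is None:
        return DEFAULT_TOP_K
    # Clamping is an assumption; the platform may reject such values instead.
    return max(1, min(requested, MAX_TOP_K))


print(resolve_top_k(), resolve_top_k(50))
```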

When to Use KB vs. Web Search

Scenario	Best Tool
Questions about your own products, processes, or internal docs	Knowledge Base
Current events, public data, general knowledge	Web Search
Combining internal context with external data	Both (KB + web search)

Next Steps

Knowledge Base Reference -- Detailed KB configuration for config agents
Workspaces -- Understanding organizations, spaces, and projects
Build a Research Agent -- Build an agent that combines KB with LLM reasoning
MCP Tools -- Add web search and other tools alongside KB

Add a Knowledge Base