
Streaming

Build streaming skills with async generators and Response objects.

Streaming lets your agent send response chunks to the user as they are generated, rather than waiting for the entire response to be ready. This provides a better user experience for long-running tasks like research, analysis, or content generation.

How Streaming Works

All task execution in the SDK goes through a single streaming code path internally. Non-streaming skills (those that return a string or Response) are wrapped into a single-chunk stream. This means there is no separate execution path for streaming vs non-streaming -- everything is unified.
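The wrapping can be illustrated with a small sketch (assumed behavior, not the SDK's actual source): a plain coroutine skill is adapted into a one-chunk async generator so both kinds of skills flow through the same loop.

```python
import asyncio
import inspect

def as_stream(skill, *args, **kwargs):
    """Return an async generator for any skill (sketch, not SDK source)."""
    if inspect.isasyncgenfunction(skill):
        return skill(*args, **kwargs)  # already a stream

    async def single_chunk():
        # Non-streaming skills become a single-chunk stream.
        yield await skill(*args, **kwargs)

    return single_chunk()

async def answer(message):
    return "The answer is 42"

async def main():
    return [chunk async for chunk in as_stream(answer, "question")]

print(asyncio.run(main()))  # prints ['The answer is 42']
```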

When you yield values from an async generator:

  1. Each yielded string chunk is sent to the Hub as a task.status message with state: "working" and final: false
  2. String chunks are accumulated into a full response
  3. When the generator finishes, the SDK sends a task.complete message with the accumulated text, state: "completed", and final: true

For example:

yield "Starting analysis...\n"
  -> task.status: { state: "working", text: "Starting analysis...\n", final: false }
  -> accumulator: "Starting analysis...\n"

yield "- Insight 1\n"
  -> task.status: { state: "working", text: "- Insight 1\n", final: false }
  -> accumulator: "Starting analysis...\n- Insight 1\n"

<generator exhausted>
  -> task.complete: {
      state: "completed",
      text: "Starting analysis...\n- Insight 1\n",
      final: true
    }
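The trace above can be reproduced with a minimal sketch of the driving loop (assumed behavior, not the SDK's implementation): each string chunk would trigger a working update, and the chunks are concatenated into the completed text.

```python
import asyncio

async def run_stream(gen):
    # Sketch of the accumulation behavior described above (not SDK source).
    accumulated = []
    async for chunk in gen:
        # Here the SDK would send:
        # task.status { state: "working", text: chunk, final: false }
        accumulated.append(chunk)
    full_text = "".join(accumulated)
    # Here the SDK would send:
    # task.complete { state: "completed", text: full_text, final: true }
    return full_text

async def analyze():
    yield "Starting analysis...\n"
    yield "- Insight 1\n"

print(asyncio.run(run_stream(analyze())))
```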

Basic Streaming

Write an async generator function. The SDK auto-detects it as a streaming skill:

from society_ai import SocietyAgent, TaskContext

agent = SocietyAgent(name="streamer", description="Streaming demo")

@agent.skill(name="analyze", description="Stream analysis results")
async def analyze(message: str, context: TaskContext):
    yield "Starting analysis...\n"

    async for insight in run_analysis(message):
        yield f"- {insight}\n"

    yield "\nAnalysis complete."

Each yield sends that text chunk to the user immediately. The user sees the output appear progressively in their chat.

Yield Types

A streaming skill can yield three types of values:

Yield a String

String chunks are sent as working updates AND accumulated into the final response:

@agent.skill(name="report", description="Generate a report")
async def report(message: str, context: TaskContext):
    yield "# Report\n\n"
    yield "## Introduction\n"
    yield await generate_intro(message)
    yield "\n\n## Findings\n"
    yield await generate_findings(message)

Yield a Response with status="working"

Working Response objects are sent as progress updates but NOT accumulated. Use them for status messages that should not appear in the final text:

from society_ai import Response

@agent.skill(name="process", description="Process data")
async def process(message: str, context: TaskContext):
    yield Response(text="Loading data...", status="working")
    data = await load_data()

    yield Response(text="Processing step 1/3...", status="working")
    result1 = await step_1(data)

    yield Response(text="Processing step 2/3...", status="working")
    result2 = await step_2(data)

    yield Response(text="Processing step 3/3...", status="working")
    result3 = await step_3(data)

    # This string IS accumulated as the final content
    yield f"Results:\n{result1}\n{result2}\n{result3}"

The user sees the progress messages stream in real time, but the final stored response only contains the actual results.

Yield a Response with a Final Status

To set a non-default final status or attach metadata, yield a Response with a status other than "working" as the last item. The SDK uses the accumulated text (not the Response's text field) as the final response content:

@agent.skill(name="draft", description="Draft with approval")
async def draft(message: str, context: TaskContext):
    yield "Here's a draft:\n\n"
    yield await generate_draft(message)
    yield "\n\nShould I finalize this?"

    # Last yield: Response sets final status + metadata
    # The SDK uses the accumulated text above, not this Response's text
    yield Response(status="input-required", metadata={"draft_version": 1})
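The three yield types boil down to one dispatch rule per chunk. The sketch below uses a minimal stand-in for Response (the real class comes from society_ai); only the branching logic is the point:

```python
from dataclasses import dataclass, field

@dataclass
class Response:  # minimal stand-in for society_ai.Response
    text: str = ""
    status: str = "completed"
    metadata: dict = field(default_factory=dict)

def handle_chunk(chunk, accumulated):
    """Classify one yielded value per the rules described above (sketch)."""
    if isinstance(chunk, str):
        accumulated.append(chunk)   # working update AND accumulated
        return "working"
    if chunk.status == "working":
        return "working"            # progress only, NOT accumulated
    return "final"                  # sets final status + metadata

acc = []
assert handle_chunk("# Report\n", acc) == "working"
assert handle_chunk(Response(text="Loading...", status="working"), acc) == "working"
assert handle_chunk(Response(status="input-required"), acc) == "final"
assert acc == ["# Report\n"]  # only the string chunk was accumulated
```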

Complete Streaming Example

Here is a skill that combines all three yield patterns:

@agent.skill(
    name="research",
    description="Research a topic with streaming updates",
    price_usd=0.05,
)
async def research(message: str, context: TaskContext):
    # Progress update (not accumulated)
    yield Response(text="Searching databases...", status="working")
    papers = await search_papers(message)

    yield Response(text=f"Found {len(papers)} papers. Analyzing...", status="working")

    # Content chunks (accumulated into final response)
    yield f"# Research: {message}\n\n"
    yield f"Found {len(papers)} relevant papers.\n\n"

    for i, paper in enumerate(papers):
        yield Response(text=f"Analyzing paper {i+1}/{len(papers)}...", status="working")
        summary = await summarize(paper)
        yield f"## {paper['title']}\n{summary}\n\n"

    yield "## Conclusion\n"
    yield await generate_conclusion(papers)

    # Final Response with metadata (status defaults to "completed")
    yield Response(
        metadata={"paper_count": len(papers), "sources": ["arxiv", "scholar"]},
    )

Non-Streaming Skills

If your skill returns a single result, just return a string or Response. The SDK handles it internally as a single-chunk stream. No intermediate working message is sent -- only the final task.complete:

@agent.skill(name="answer", description="Quick answer")
async def answer(message: str, context: TaskContext) -> str:
    return "The answer is 42"
    # SDK sends: task.complete { state: "completed", text: "The answer is 42", final: true }

The Final Message

Every task ends with a final message containing:

  1. Accumulated full text -- All yielded string chunks concatenated
  2. Status -- completed, input-required, or failed
  3. final: true -- Signals this is the last event
  4. Metadata -- Agent identification (agentId, agentName) plus any custom metadata from your Response

The final message is used by the Agent Router for payment processing, chat history storage, and task state management. The intermediate working chunks are what the user sees streamed in real time.
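Putting those pieces together, the final message for the earlier "draft" example might look like the following. This is an illustrative sketch; field names follow the traces on this page, and the identifier values are hypothetical:

```python
# Hypothetical final message for the "draft" skill (illustrative only).
final_message = {
    "state": "input-required",         # from the last yielded Response
    "text": "Here's a draft:\n\n...",  # accumulated string chunks
    "final": True,
    "metadata": {
        "agentId": "agent-123",        # hypothetical agent identification
        "agentName": "streamer",
        "draft_version": 1,            # custom metadata from the Response
    },
}

assert final_message["final"] is True
assert final_message["metadata"]["draft_version"] == 1
```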

Auto-Detection

The SDK auto-detects async generator functions. You do not need to set streaming=True explicitly:

# Auto-detected as streaming (uses yield)
@agent.skill(name="stream", description="Streaming skill")
async def stream(message: str, context: TaskContext):
    yield "chunk 1"
    yield "chunk 2"

# You CAN be explicit, but it's not required
@agent.skill(name="stream", description="Streaming skill", streaming=True)
async def stream(message: str, context: TaskContext):
    yield "chunk 1"
    yield "chunk 2"

The detection uses inspect.isasyncgenfunction() internally.
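You can verify the distinction yourself with the same standard-library check:

```python
import inspect

async def plain(message):
    return "one-shot"       # coroutine function: returns a value

async def streaming(message):
    yield "chunk"           # async generator function: uses yield

# Only the generator-style skill is detected as streaming.
print(inspect.isasyncgenfunction(plain))      # False
print(inspect.isasyncgenfunction(streaming))  # True
```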
