POST /stream
curl --request POST \
  --url https://api.starlight-search.com/stream \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "task": "What is the weather in Berlin?",
  "model": "gpt-4.1-nano",
  "base_url": "https://api.openai.com/v1/chat/completions",
  "tools": [
    "DuckDuckGo",
    "VisitWebsite",
    "GoogleSearchTool",
    "ExaSearchTool"
  ]
}
'
"data: {\"type\":\"token\",\"content\":\"The\"}\n\n"

Overview

The /stream endpoint provides real-time streaming of agent execution using Server-Sent Events (SSE). It accepts the same parameters as the /run endpoint but returns events as they occur, allowing you to monitor the agent’s progress, see token-by-token output, and receive step updates in real time.

Request Parameters

Required Parameters

  • task (string): The task description that the agent should complete
  • model (string): The AI model identifier (e.g., “gpt-4.1-nano”)
  • base_url (string): The base URL for your chat completions API endpoint

Optional Parameters

  • tools (array): List of available tool identifiers (e.g., [“ExaSearchTool”, “VisitWebsite”, “DuckDuckGo”, “GoogleSearchTool”])
  • max_steps (integer): Maximum number of steps the agent can take
  • agent_type (string): Agent type - “function-calling” (default), “code-agent”, or “mcp”
  • history (array): Optional conversation history as an array of Message objects
  • max_results (integer): Maximum number of results for search tools (default: 5 for ExaSearchTool)
  • system_prompt (string): Custom system prompt to override the default
  • planning_interval (integer): Interval for planning steps (for function-calling and mcp agents)
  • mcp_servers (array): Array of MCP server configurations for MCP agent type

Response Format

The endpoint returns a Server-Sent Events (SSE) stream with the following event types:

Event Types

Token Event

Streams individual tokens from the LLM as they’re generated. Each token is a small piece of text (typically a word or part of a word):
{
  "type": "token",
  "content": "The"
}
{
  "type": "token",
  "content": " weather"
}
{
  "type": "token",
  "content": " in"
}
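Since each token event carries only a fragment of text, the client is responsible for concatenating them. A minimal sketch, using the three token events shown above:

```python
# The token events shown above, as parsed dicts.
events = [
    {"type": "token", "content": "The"},
    {"type": "token", "content": " weather"},
    {"type": "token", "content": " in"},
]

# Concatenate the content of token events to rebuild the response text.
answer = "".join(e["content"] for e in events if e["type"] == "token")
print(answer)  # → The weather in
```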

Step Event

Emitted when the agent completes a step, including tool calls:
{
  "type": "step",
  "step": {
    "step": 1,
    "tool_calls": [
      {
        "id": "call_123",
        "type": "function",
        "function": {
          "name": "ExaSearchTool",
          "arguments": "{\"query\": \"weather Berlin\"}"
        }
      }
    ]
  }
}
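One detail worth noting when handling step events: as shown above, the `arguments` field is a JSON-encoded string rather than an object, so it needs a second parse. A sketch using the step event from this example:

```python
import json

# The step event shown above, as a parsed dict.
event = {
    "type": "step",
    "step": {
        "step": 1,
        "tool_calls": [
            {
                "id": "call_123",
                "type": "function",
                "function": {
                    "name": "ExaSearchTool",
                    "arguments": "{\"query\": \"weather Berlin\"}",
                },
            }
        ],
    },
}

# "arguments" is a JSON-encoded string, so parse it before use.
for call in event["step"]["tool_calls"]:
    args = json.loads(call["function"]["arguments"])
    print(call["function"]["name"], args)  # → ExaSearchTool {'query': 'weather Berlin'}
```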

Error Event

Emitted when an error occurs during execution:
{
  "type": "error",
  "message": "Error description"
}

Done Event

Emitted when the agent completes the task:
{
  "type": "done",
  "token_usage": {
    "prompt_tokens": 150,
    "completion_tokens": 200,
    "total_tokens": 350
  },
  "final_answer": "The final answer from the agent"
}

Examples

Basic Streaming Request

curl --location 'https://api.starlight-search.com/stream' \
--header 'Authorization: Bearer <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
    "task": "What is the weather in Berlin?",
    "model": "gpt-4.1-nano",
    "base_url": "https://api.openai.com/v1/chat/completions",
    "tools": ["ExaSearchTool", "VisitWebsite"]
}'

Code Agent with Streaming

curl --location 'https://api.starlight-search.com/stream' \
--header 'Authorization: Bearer <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
    "task": "Search for the top 5 AI companies and get their current stock prices",
    "model": "gpt-4.1-mini",
    "base_url": "https://api.openai.com/v1/chat/completions",
    "tools": ["ExaSearchTool", "VisitWebsite"],
    "agent_type": "code-agent",
    "max_steps": 5
}'

JavaScript/TypeScript Example

const response = await fetch('https://api.starlight-search.com/stream', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer <your-api-key>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    task: 'What is the weather in Berlin?',
    model: 'gpt-4.1-nano',
    base_url: 'https://api.openai.com/v1/chat/completions',
    tools: ['ExaSearchTool', 'VisitWebsite']
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // Buffer partial data: an SSE event may be split across network chunks
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep the trailing incomplete line for the next chunk

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const data = JSON.parse(line.slice(6));

      switch (data.type) {
        case 'token':
          // Accumulate tokens to build the full response
          process.stdout.write(data.content);
          break;
        case 'step':
          console.log('Step:', data.step);
          break;
        case 'error':
          console.error('Error:', data.message);
          break;
        case 'done':
          console.log('Final answer:', data.final_answer);
          console.log('Token usage:', data.token_usage);
          break;
      }
    }
  }
}

Python Example

import requests
import json

url = 'https://api.starlight-search.com/stream'
headers = {
    'Authorization': 'Bearer <your-api-key>',
    'Content-Type': 'application/json',
}
data = {
    'task': 'What is the weather in Berlin?',
    'model': 'gpt-4.1-nano',
    'base_url': 'https://api.openai.com/v1/chat/completions',
    'tools': ['ExaSearchTool', 'VisitWebsite']
}

response = requests.post(url, headers=headers, json=data, stream=True)

for line in response.iter_lines():
    if line:
        line_str = line.decode('utf-8')
        if line_str.startswith('data: '):
            event_data = json.loads(line_str[6:])
            
            if event_data['type'] == 'token':
                print(event_data['content'], end='', flush=True)
            elif event_data['type'] == 'step':
                print(f"\nStep: {event_data['step']}")
            elif event_data['type'] == 'error':
                print(f"\nError: {event_data['message']}")
            elif event_data['type'] == 'done':
                print(f"\n\nFinal answer: {event_data['final_answer']}")
                print(f"Token usage: {event_data['token_usage']}")

Agent Types

Function Calling Agent (Default)

The function calling agent streams tokens and step updates as it executes tools. Ideal for:
  • Multi-step tasks requiring sequential actions
  • External data retrieval
  • Real-time monitoring of tool interactions

Code Agent

The code agent streams execution progress including code generation and execution results. Ideal for:
  • Complex multi-step workflows
  • Data processing with loops and conditionals
  • Tasks requiring self-correction and debugging
Set agent_type: "code-agent" to use this agent type.

MCP Agent

The MCP (Model Context Protocol) agent streams interactions with MCP servers. Requires mcp_servers configuration. Ideal for:
  • Integration with MCP-compatible tools and services
  • Custom tool implementations via MCP servers
Set agent_type: "mcp" to use this agent type.
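As a sketch, an MCP-agent request body can be assembled like any other, with `agent_type` set to `"mcp"` and an `mcp_servers` entry. The Linear server configuration here is taken from the mcp_servers example in the body reference; the auth token is a placeholder:

```python
import json

# Build an MCP-agent request payload. The mcp-remote server config below
# mirrors the mcp_servers example from the body reference; the task and
# token values are placeholders.
payload = {
    "task": "List my open Linear issues",
    "model": "gpt-4.1-mini",
    "base_url": "https://api.openai.com/v1/chat/completions",
    "agent_type": "mcp",
    "mcp_servers": [
        {
            "command": "npx",
            "args": [
                "-y",
                "mcp-remote",
                "https://mcp.linear.app/sse",
                "--header",
                "Authorization: Bearer ${AUTH_TOKEN}",
            ],
            "env": {"AUTH_TOKEN": "YOUR-AUTH_TOKEN"},
        }
    ],
}
print(json.dumps(payload, indent=2))
```

POST this payload to /stream with your Authorization header, exactly as in the curl examples above.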

Available Tools

  • DuckDuckGo: Web search using DuckDuckGo
  • VisitWebsite: Visit and extract content from websites
  • GoogleSearchTool: Web search using Google
  • ExaSearchTool: Semantic search using Exa (supports the max_results parameter; default: 5)
  • PythonInterpreter: Execute Python code (code-agent only)

Notes

  • The stream uses Server-Sent Events (SSE) format with text/event-stream content type
  • Each event is prefixed with data: and followed by \n\n
  • Token events contain individual tokens (typically a word or part of a word), not full sentences. You’ll receive many token events that need to be accumulated to form the complete response
  • Step events are emitted after each agent step completes
  • The stream ends with a done event containing the final answer and token usage
  • The connection is kept alive via the Connection: keep-alive header
  • Caching is disabled via the Cache-Control: no-cache header
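Because a single SSE event can arrive split across network chunks, a robust client should buffer incomplete data rather than parsing each chunk independently. A minimal parser sketch that handles this:

```python
import json

def iter_sse_events(chunks):
    """Yield parsed JSON events from an iterable of text chunks.

    Buffers partial data so events split across network chunks are still
    parsed correctly: each event is a 'data: ' line terminated by a
    blank line (\n\n), per the SSE framing described above.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n\n" in buffer:
            raw, buffer = buffer.split("\n\n", 1)
            for line in raw.splitlines():
                if line.startswith("data: "):
                    yield json.loads(line[len("data: "):])

# Simulated stream where one event is split across two chunks:
chunks = ['data: {"type":"token","con', 'tent":"The"}\n\ndata: {"type":"done"}\n\n']
print(list(iter_sse_events(chunks)))
# → [{'type': 'token', 'content': 'The'}, {'type': 'done'}]
```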

Authorizations

Authorization
string
header
required

Bearer token authentication. Get your API key from https://lumo.starlight-search.com/dashboard

Body

application/json

Task configuration and parameters

task
string
required

The task description that the agent should complete. This can be a question, instruction, or complex multi-step task.

Example:

"What is the weather in Berlin?"

model
string
required

The AI model identifier to use for the task. This should match the model format expected by your base_url provider (e.g., OpenAI, Anthropic, etc.).

Example:

"gpt-4.1-nano"

base_url
string<uri>
required

The base URL for the chat completions API endpoint. This should point to your model provider's API.

Example:

"https://api.openai.com/v1/chat/completions"

tools
string[]

Array of tool identifiers available to the agent. The agent will automatically select and use these tools as needed to complete the task.

Example:
[
  "DuckDuckGo",
  "VisitWebsite",
  "GoogleSearchTool",
  "ExaSearchTool",
  "PythonInterpreter"
]

max_steps
integer

Maximum number of steps the agent can take to complete the task. If not specified, the agent will continue until the task is completed or reaches a timeout.

Required range: x >= 1
Example:

3

agent_type
enum<string>
default:function-calling

Type of agent to use. 'function-calling' uses traditional function selection and execution. 'code-agent' uses executable Python code as actions, providing greater flexibility and efficiency for complex tasks. 'mcp' uses Model Context Protocol remote servers to connect to self-hosted toolchains.

Available options:
function-calling, code-agent, mcp

mcp_servers
object[]

Array of MCP remote server configurations. Required when agent_type is 'mcp'. Each server configuration specifies how to launch and connect to an MCP server.

Example:
[
  {
    "command": "npx",
    "args": [
      "-y",
      "mcp-remote",
      "https://mcp.linear.app/sse",
      "--header",
      "Authorization: Bearer ${AUTH_TOKEN}"
    ],
    "env": { "AUTH_TOKEN": "YOUR-AUTH_TOKEN" }
  }
]

messages
object[]

Optional conversation history or system messages. If not provided, the task will be treated as a standalone request.

history
object[]

Optional conversation history as an array of Message objects. Alias for 'messages'.

max_results
integer

Maximum number of results for search tools (default: 5 for ExaSearchTool).

Required range: x >= 1
Example:

5

system_prompt
string

Custom system prompt to override the default system prompt for the agent.

Example:

"You are a helpful assistant specialized in weather information."

planning_interval
integer

Interval for planning steps (for function-calling and mcp agents). The agent will pause for planning after this many steps.

Required range: x >= 1
Example:

3

Response

Task execution stream (Server-Sent Events)

Server-Sent Events stream. Each event is prefixed with 'data: ' and followed by '\n\n'. Event types include: 'token' (token-by-token content), 'step' (step completion with tool calls), 'error' (error messages), and 'done' (final answer and token usage).