Overview

A skill is a project-level guide distilled from your reviewed traces. While memories are per-task reflections retrieved by semantic similarity, a skill is a single, consolidated document that captures the proven strategies and pitfalls for an entire project. Skills are built using hierarchical consolidation inspired by the Trace2Skill paper:
  1. Level 1 (automatic): each reviewed trace generates a per-task reflection (a memory)
  2. Level 2: the top passed and failed reflections are consolidated separately into “proven strategies” and “pitfalls to avoid”
  3. Level 3: both summaries are synthesized into a unified skill guide
The output follows the Anthropic skill standard with YAML frontmatter and structured markdown.
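The three levels above can be sketched in a few lines of Python. This is a minimal illustration of the consolidation flow only; the helper names (`consolidate`, `build_skill`) are hypothetical stand-ins, and in the real pipeline the merge and synthesis steps are LLM calls rather than string joins:

```python
def consolidate(reflections, label):
    """Level 2: merge per-task reflections into one themed summary.
    (In practice this is an LLM call, not a string join.)"""
    bullets = "\n".join(f"- {r}" for r in reflections)
    return f"{label}:\n{bullets}"

def build_skill(reflections):
    """reflections: list of (text, passed) pairs from reviewed traces."""
    # Level 1 has already happened: each reviewed trace produced a reflection.
    passed = [text for text, ok in reflections if ok]
    failed = [text for text, ok in reflections if not ok]
    # Level 2: consolidate passed and failed reflections separately.
    strategies = consolidate(passed, "Proven Strategies")
    pitfalls = consolidate(failed, "Pitfalls to Avoid")
    # Level 3: synthesize both summaries into a unified skill guide.
    return strategies + "\n\n" + pitfalls

skill = build_skill([
    ("check pagination tokens on the search API", True),
    ("trusted a summary instead of a primary source", False),
])
```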

Skills vs Memories

|            | Memories | Skills |
|------------|----------|--------|
| Scope      | Per-task: retrieved by similarity to the current task | Per-project: same guide for every task |
| Count      | Many per project (one per reviewed trace) | One per project |
| Retrieval  | Automatic via query_memories or augment_with_memories | Explicit via get_skill() |
| Creation   | Automatic on trace review | On-demand (dashboard button or API call) |
| Best for   | Task-specific context (“last time I tried this exact thing…”) | Project-wide patterns (“on this project, always check for pagination…”) |
Use memories when your tasks are diverse and need targeted context. Use skills when you want a consistent baseline of project knowledge injected into every run. You can also use both together.

Creating a skill

From the dashboard

  1. Navigate to the Memories tab for your project
  2. Once you have at least 5 reviewed memories, a Create Skill button appears
  3. Click Settings to configure how many passed/failed reflections to sample (default: 5 each)
  4. Click Create Skill
The skill appears with its frontmatter metadata rendered as fields. You can:
  • Refine Skill: regenerate with the latest memories (version increments)
  • Download .md: save as a markdown file to use in your agent’s prompt or as a Claude Code skill

From the API

# Create or refine the skill
curl -X POST "https://api.starlight-search.com/v1/projects/my-project/skill/create" \
  -H "Authorization: Bearer rf_live_..." \
  -H "Content-Type: application/json" \
  -d '{"n_passed": 5, "n_failed": 5}'

# Retrieve the current skill
curl "https://api.starlight-search.com/v1/projects/my-project/skill" \
  -H "Authorization: Bearer rf_live_..."

From the SDK

from reflect_sdk import ReflectClient

client = ReflectClient(
    api_key="rf_live_...",
    project_id="my-project",
)

# Retrieve the skill (returns None if not created yet)
skill = client.get_skill()
if skill:
    print(skill)

Using a skill

Inject into your agent’s prompt

The simplest approach: prepend the skill to your system prompt or task:
skill = client.get_skill()

system_prompt = "You are a helpful assistant."
if skill:
    system_prompt += f"\n\n<skill>\n{skill}\n</skill>"

response = my_llm(system_prompt=system_prompt, task=user_task)

Use with the trace context manager

When using skills, you typically skip per-task memory retrieval since the skill already provides project-wide context:
skill = client.get_skill()

with client.trace(task, limit=1) as ctx:
    if skill:
        prompt = task + "\n\n<skill>\n" + skill + "\n</skill>"
    else:
        prompt = ctx.augmented_task  # fallback to memories

    response = my_agent(prompt)
    ctx.set_output(
        trajectory=[...],
        result="pass",
    )

Download and use as a Claude Code skill

Click Download .md in the dashboard to save the skill file. Place it in your Claude Code skills directory:
# Project-scoped skill
mkdir -p .claude/skills/my-project-skill
cp my-project-skill.md .claude/skills/my-project-skill/SKILL.md

# Or personal skill (available across projects)
mkdir -p ~/.claude/skills/my-project-skill
cp my-project-skill.md ~/.claude/skills/my-project-skill/SKILL.md
Claude Code will automatically load the skill when it matches the task context.

Skill format

Skills follow the Anthropic skill standard with YAML frontmatter:
---
name: my-project
description: Project-specific skill for my-project distilled from 10 reviewed traces.
project: my-project
source_memories: 10
passed_sampled: 5
failed_sampled: 5
generated_at: 2025-01-15T10:30:00+00:00
---

## Proven Strategies
1. Always check for pagination tokens when querying the search API...
2. Use structured entity chains (Label -> Artist -> Album) and validate each link...

## Pitfalls to Avoid
1. Never piece together partial data from multiple sources when a single ranked list exists...
2. Verify ordinal positions against primary sources, not summaries...

## General Procedure
1. Identify the core question type and map it to the relevant strategy
2. Start with primary databases and structured lists
3. Cross-check each component against at least two sources
4. Verify the final answer against the original query's requirements

Frontmatter fields

| Field           | Description |
|-----------------|-------------|
| name            | Project identifier |
| description     | What the skill covers and how many traces it was built from |
| project         | Project ID |
| source_memories | Total number of reflections sampled |
| passed_sampled  | Number of successful reflections used |
| failed_sampled  | Number of failed reflections used |
| generated_at    | ISO 8601 timestamp of when the skill was generated |
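Since the frontmatter is simple key: value YAML, it can be extracted without dependencies. The sketch below assumes the file starts with a `---` fence as in the example above and that no value contains a nested structure; a real tool would use a full YAML parser such as PyYAML:

```python
SKILL_TEXT = """---
name: my-project
source_memories: 10
---

## Proven Strategies
1. Always check for pagination tokens when querying the search API
"""

def parse_skill(text):
    """Split a skill file into (frontmatter dict, markdown body)."""
    # Assumes the file opens with a '---' fence, per the example above.
    _, frontmatter, body = text.split("---\n", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(": ")
        meta[key] = value
    return meta, body.strip()

meta, body = parse_skill(SKILL_TEXT)
# meta["name"] == "my-project"; body begins with "## Proven Strategies"
```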

Configuration

Skill generation can be configured in config.toml:
[skill]
min_memories_for_skill = 5   # minimum reviewed memories before skill can be created
default_n_passed = 5          # default number of top passed reflections to sample
default_n_failed = 5          # default number of top failed reflections to sample
These defaults are used when the API request doesn’t specify n_passed or n_failed. The dashboard Settings panel lets you override these per-request.

Best practices

  • Wait for diversity. A skill built from 3 very similar traces will be narrow. Wait until you have at least 5-10 reviewed traces covering different task types within the project. The more diverse the traces, the more generalizable the skill.
  • Refine after new reviews. Skills are a snapshot. After reviewing more traces (especially failures that reveal new pitfalls), click Refine Skill to regenerate with the latest data. The version number increments so you can track changes.
  • Tune the sample size. The default of 5 passed + 5 failed works for most projects. For mature projects with 50+ traces, increase to 10-15 per category via the Settings panel to capture more patterns.
  • Combine skills with memories. Skills provide a broad baseline (“on this project, always do X”). Memories provide task-specific context (“last time I tried this exact query…”). For complex projects, inject the skill into the system prompt and use memory augmentation for the task:
skill = client.get_skill()

with client.trace(task) as ctx:
    system = f"You are a research assistant.\n\n<skill>\n{skill}\n</skill>"
    response = my_agent(system_prompt=system, task=ctx.augmented_task)
    ctx.set_output(trajectory=[...], result="pass")
The downloaded .md file works as a Claude Code skill, a system prompt snippet, or documentation for your team. The frontmatter is valid YAML and can be parsed by any tool that understands the Anthropic skill format.