## Query memories
| Parameter | Type | Default | Description |
|---|---|---|---|
| `task` | str | required | Task text to search against. The API embeds this text and retrieves similar memories. |
| `limit` | int | 10 | Maximum number of memories to return. |
| `lambda_` | float | 0.5 | Blend weight: 0.0 = pure similarity, 1.0 = pure Q-value. |
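To make the parameter semantics concrete, here is a minimal runnable sketch of the ranking the query performs. The embedding step is stubbed out with precomputed similarities; `query_memories`, `StoredMemory`, and the store layout are illustrative assumptions, not the library's actual API.

```python
from dataclasses import dataclass

@dataclass
class StoredMemory:
    # Stand-in for a stored memory; similarity is precomputed here
    # instead of derived from an embedding of `task`.
    id: str
    similarity: float  # pretend cosine similarity to the query text
    q_value: float     # learned track record in [0, 1]

def query_memories(store, limit=10, lambda_=0.5):
    """Rank memories by (1 - lambda_) * similarity + lambda_ * q_value."""
    scored = sorted(
        store,
        key=lambda m: (1 - lambda_) * m.similarity + lambda_ * m.q_value,
        reverse=True,
    )
    return scored[:limit]

store = [
    StoredMemory("a", similarity=0.9, q_value=0.1),  # relevant, poor track record
    StoredMemory("b", similarity=0.4, q_value=0.9),  # less relevant, reliable
]

# At the default lambda_ = 0.5 both terms count equally, so the
# reliable memory "b" outranks the merely similar "a".
top = query_memories(store, limit=1, lambda_=0.5)
```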
Each returned Memory object has the following fields:
| Field | Type | Description |
|---|---|---|
| `id` | str | Memory identifier |
| `task` | str | The past task this memory was generated from |
| `reflection` | str | LLM-generated reflection text |
| `q_value` | float | Learned Q-value (0–1; higher means a better track record) |
| `similarity` | float | Cosine similarity to the query |
| `score` | float | Combined score: (1 - λ) × similarity + λ × q_value |
| `success` | bool \| None | Whether the source trace passed review |
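A Python structure mirroring these fields might look like the following. The field names and types come from the table above; the class definition itself is an illustrative sketch, not the library's actual code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Memory:
    id: str
    task: str
    reflection: str
    q_value: float           # learned track record in [0, 1]
    similarity: float        # cosine similarity to the query
    score: float             # (1 - lambda_) * similarity + lambda_ * q_value
    success: Optional[bool]  # None when the outcome of the source trace is unknown

# With lambda_ = 0.5, similarity 0.7 and q_value 0.8 blend to a score of 0.75.
m = Memory(
    id="m1",
    task="sort a list of records by date",
    reflection="Used sorted() with a key function instead of mutating in place",
    q_value=0.8,
    similarity=0.7,
    score=0.75,
    success=True,
)
```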
## Augment with memories
`augment_with_memories` queries memories and formats them into a text block you can prepend or append to your LLM prompt.
The returned `augmented_task` string contains the original task followed by formatted memory blocks, grouped into three sections:
- Successful memories — reflections from traces that passed review
- Failed memories — reflections from traces that failed review
- Other relevant memories — reflections where success is unknown
If no memories are retrieved, `augmented_task` is the original task unchanged.
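The grouping above can be sketched as follows. The section titles follow the three groups just described, but the exact template (headers, separators, bullet style) is an assumption, not the library's verbatim output.

```python
def augment(task, memories):
    """Build an augmented task string from (reflection, success) pairs,
    where success is True, False, or None."""
    sections = {
        "Successful memories:": [r for r, s in memories if s is True],
        "Failed memories:": [r for r, s in memories if s is False],
        "Other relevant memories:": [r for r, s in memories if s is None],
    }
    parts = [task]
    for title, reflections in sections.items():
        if reflections:  # skip empty sections entirely
            parts.append(title)
            parts.extend(f"- {r}" for r in reflections)
    # With no memories at all, the original task comes back unchanged.
    return "\n\n".join(parts)

out = augment(
    "Parse the uploaded CSV",
    [("Check the delimiter before parsing", True),
     ("Forgot to skip the header row", False)],
)
```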
## Tuning retrieval

The `lambda_` parameter controls the balance between similarity and Q-value:
| Value | Behavior |
|---|---|
| 0.0 | Pure semantic similarity — retrieves the most textually relevant memories regardless of outcome |
| 0.5 | Equal weight (default) — balances relevance with track record |
| 1.0 | Pure Q-value — retrieves memories with the best success history |
Start with the default of 0.5. Increase `lambda_` if your agent keeps retrieving relevant-but-unhelpful memories; decrease it if the agent needs broader context from different past tasks.
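The two extremes of the table can be seen with a pair of toy memories; the memory names and numbers here are illustrative only.

```python
# Two candidate memories as (similarity, q_value) pairs: one textually
# close to the query but with a poor track record, one less similar but
# historically reliable.
memories = {
    "relevant_but_unhelpful": (0.9, 0.2),
    "less_relevant_but_proven": (0.3, 0.9),
}

def score(sim, q, lambda_):
    # Combined score from the table above: (1 - λ) × similarity + λ × q_value.
    return (1 - lambda_) * sim + lambda_ * q

def best(lambda_):
    return max(memories, key=lambda name: score(*memories[name], lambda_))

similarity_pick = best(0.0)  # pure semantic similarity
q_value_pick = best(1.0)     # pure Q-value
```

At `lambda_ = 0.0` the textually closer memory wins; at `lambda_ = 1.0` the ranking flips to the memory with the better success history.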