Pattern: Direct API

This example uses the Reflect SDK at its most explicit level - no decorators, no context managers. You call each step yourself, which makes it the best starting point for understanding how Reflect works.
augment_with_memories  →  LLM call  →  interactive review  →  create_trace_and_wait
Reviews don’t have to happen immediately. You can defer a trace and review it later in bulk via the dashboard or the API.

Prerequisites

export REFLECT_API_KEY=rf_live_...
export REFLECT_PROJECT_ID=your-project-id
export OPENAI_API_KEY=sk-...
Install dependencies:
pip install reflect-sdk openai

Full example

interactive_feedback_cli.py
"""Interactive CLI example: solve a task with memory augmentation, then review it.

This example shows the core Reflect SDK loop:
  1. Augment a task with past memories
  2. Solve it with an LLM
  3. Review the answer (pass / fail / defer)
  4. Store the trace so Reflect can learn from it

Prerequisites:
- Reflect API running locally (or set --base-url)
- REFLECT_API_KEY and REFLECT_PROJECT_ID set in the environment
- OPENAI_API_KEY set in the environment
"""

import os
import argparse

from openai import OpenAI
from reflect_sdk import ReflectClient

DEFAULT_TASK = "Who is Sonam Pankaj?"


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Interactive Reflect SDK demo.")
    parser.add_argument("--base-url", default="http://localhost:8000", help="Reflect API base URL.")
    parser.add_argument("--project-id", default=os.getenv("REFLECT_PROJECT_ID"), help="Reflect project id.")
    parser.add_argument("--reflect-api-key", default=os.getenv("REFLECT_API_KEY"), help="Reflect API key.")
    parser.add_argument("--model", default="gpt-5.4-mini", help="OpenAI model to use.")
    parser.add_argument("--task", default=DEFAULT_TASK, help="The task for the model to solve.")
    parser.add_argument("--limit", type=int, default=3, help="Max number of memories to retrieve.")
    args = parser.parse_args()
    if not args.project_id:
        raise RuntimeError("Set REFLECT_PROJECT_ID or pass --project-id.")
    if not args.reflect_api_key:
        raise RuntimeError("Set REFLECT_API_KEY or pass --reflect-api-key.")
    return args


def ask_for_review() -> tuple[str, str | None]:
    """Prompt the user to pass, fail, or defer the review.

    Deferring stores the trace without a review - useful when you want to
    review in bulk later via the dashboard or the API.
    """
    while True:
        choice = input("Was this answer correct? [y/n/d (defer)]: ").strip().lower()
        if choice in {"y", "yes"}:
            return "pass", None
        if choice in {"n", "no"}:
            feedback = input("What was wrong? (used as learning feedback): ").strip()
            return "fail", feedback or "The answer was incorrect."
        if choice in {"d", "defer"}:
            return "defer", None
        print("Please enter 'y', 'n', or 'd'.")


def main() -> None:
    args = parse_args()

    openai_api_key = os.getenv("OPENAI_API_KEY")
    if not openai_api_key:
        raise RuntimeError("OPENAI_API_KEY must be set.")

    # --- Step 1: Connect to Reflect ---
    reflect = ReflectClient(
        base_url=args.base_url,
        api_key=args.reflect_api_key,
        project_id=args.project_id,
    )

    # --- Step 2: Augment the task with relevant memories ---
    # Reflect retrieves past traces and injects their insights into the prompt.
    augmented = reflect.augment_with_memories(task=args.task, limit=args.limit)
    print(f"Task: {args.task}")
    print(f"Retrieved {len(augmented.memories)} relevant memories.\n")

    # --- Step 3: Solve with an LLM ---
    messages = [
        {
            "role": "system",
            "content": (
                "Solve the user's task. Use any relevant memories included in the prompt. "
                "Respond concisely."
            ),
        },
        {"role": "user", "content": augmented.augmented_task},
    ]

    openai = OpenAI(api_key=openai_api_key)
    response = openai.chat.completions.create(model=args.model, messages=messages)
    answer = (response.choices[0].message.content or "").strip()

    print("Model answer:")
    print(answer)
    print()

    # --- Step 4: Review the answer ---
    review_result, feedback = ask_for_review()

    # --- Step 5: Store the trace so Reflect can learn from it ---
    # When review_result is "pass" or "fail", Reflect immediately generates a
    # reflection and updates utility scores for memory ranking.
    # When the user deferred, we pass review_result=None: the trace is stored
    # without a review and can be reviewed later via the dashboard or the API.
    trajectory = messages + [{"role": "assistant", "content": answer}]
    trace = reflect.create_trace_and_wait(
        task=args.task,
        trajectory=trajectory,
        retrieved_memory_ids=[m.id for m in augmented.memories],
        model=args.model,
        review_result=None if review_result == "defer" else review_result,
        feedback_text=feedback,
    )

    print(f"\nResult:   {review_result}")
    if feedback:
        print(f"Feedback: {feedback}")
    print(f"Trace id: {trace.id}")
    print(f"Review:   {trace.review_status}")
    if trace.created_memory_id:
        print(f"Memory created: {trace.created_memory_id}")


if __name__ == "__main__":
    main()

Run it

python interactive_feedback_cli.py --task "Explain transformer attention in one sentence"

How it works

1. Connect to Reflect

ReflectClient authenticates with your API key and ties all traces to a project.
reflect = ReflectClient(
    base_url="http://localhost:8000",
    api_key=os.getenv("REFLECT_API_KEY"),
    project_id=os.getenv("REFLECT_PROJECT_ID"),
)
2. Augment the task with memories

Before calling the LLM, ask Reflect for relevant past experiences. It returns the original task plus a memory-augmented version you can pass directly to the model.
augmented = reflect.augment_with_memories(task=args.task, limit=3)
# augmented.augmented_task  - task text with memories injected
# augmented.memories        - list of Memory objects (ids needed later)
3. Solve with an LLM

Pass augmented.augmented_task as the user message so the model sees relevant context from past runs.
messages = [
    {"role": "system", "content": "Solve the user's task. Use any relevant memories."},
    {"role": "user", "content": augmented.augmented_task},
]
response = openai.chat.completions.create(model="gpt-5.4-mini", messages=messages)
4. Review the answer

You decide whether the answer was correct. Three options:
Input  Meaning
y      Pass - the answer was correct
n      Fail - the answer was wrong; provide feedback
d      Defer - store the trace now, review later
5. Store the trace

create_trace_and_wait submits the trace and blocks until Reflect has processed it. When review_result is "pass" or "fail", Reflect immediately generates a reflection and updates utility scores so better memories surface in future runs. When review_result is None (deferred), the trace is stored as-is and can be reviewed later from the dashboard.
trace = reflect.create_trace_and_wait(
    task=args.task,
    trajectory=trajectory,
    retrieved_memory_ids=[m.id for m in augmented.memories],
    model="gpt-5.4-mini",
    review_result="pass",   # or "fail", or None to defer
    feedback_text=feedback, # only meaningful on fail
)

Key concept: deferred reviews

Passing review_result=None stores the trace without triggering the learning loop. This is useful when:
  • You’re running a batch and want to review results in one go
  • A human reviewer needs to approve the answer asynchronously
  • You want to collect traces first and label them later
Deferred traces appear in the dashboard with a pending review status. You can review them there or via the API.
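The bulk path can be sketched as a small loop over pending traces. This is illustrative only: list_traces and review_trace are assumed method names (they do not appear in the example above - check your SDK version for the actual API), and grade is a toy heuristic standing in for your real review logic.

```python
def grade(answer: str, required_keywords: list[str]) -> str:
    """Toy pass/fail heuristic: pass iff every required keyword appears."""
    text = answer.lower()
    return "pass" if all(k.lower() in text for k in required_keywords) else "fail"


def bulk_review(reflect, required_keywords: list[str]) -> None:
    """Review every deferred trace in one sweep.

    Assumes hypothetical `list_traces` / `review_trace` methods on the
    Reflect client -- substitute the real calls from your SDK version.
    """
    for trace in reflect.list_traces(review_status="pending"):
        answer = trace.trajectory[-1]["content"]  # assistant's final message
        verdict = grade(answer, required_keywords)
        reflect.review_trace(
            trace.id,
            review_result=verdict,
            feedback_text=None if verdict == "pass" else "Missing required details.",
        )
```

In practice grade would be a human review queue or an LLM judge; the point is only that the verdict can arrive long after the trace was stored.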