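Define outcomes

Tell the agent what "done" looks like, and let it iterate until it gets there.

An outcome elevates a session from a conversation to a piece of work. You define what the end result should look like and how its quality is measured; the agent then works toward that target, evaluating its own work and iterating until the outcome is met.

When you define an outcome, the harness automatically provisions a grader that evaluates the artifact against the rubric. The grader uses a separate context window so that it is not influenced by the main agent's implementation choices.

The grader returns either a summary of which criteria passed or failed, or an explanation confirming that the artifact satisfies the rubric. This feedback is passed back to the agent for the next iteration.

All Managed Agents API requests require the managed-agents-2026-04-01 beta header. The SDK sets the beta header automatically.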

Create a rubric

A rubric is a markdown document that describes per-criterion scoring. A rubric is required.

Example rubric:

# DCF Model Rubric

## Revenue Projections
- Uses historical revenue data from the last 5 fiscal years
- Projects revenue for at least 5 years forward
- Growth rate assumptions are explicitly stated and reasonable

## Cost Structure
- COGS and operating expenses are modeled separately
- Margins are consistent with historical trends or deviations are justified

## Discount Rate
- WACC is calculated with stated assumptions for cost of equity and cost of debt
- Beta, risk-free rate, and equity risk premium are sourced or justified

## Terminal Value
- Uses either perpetuity growth or exit multiple method (stated which)
- Terminal growth rate does not exceed long-term GDP growth

## Output Quality
- All figures are in a single .xlsx file with clearly labeled sheets
- Key assumptions are on a separate "Assumptions" sheet
- Sensitivity analysis on WACC and terminal growth rate is included
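
Because the rubric is plain markdown, it can be sanity-checked before it is sent. The sketch below is a hypothetical helper (not part of the SDK) that groups the `- ` criteria under their `## ` headings, assuming the layout used in the example rubric. The sample text is a shortened copy of that rubric.

```python
def parse_rubric(markdown: str) -> dict[str, list[str]]:
    """Group '- ' criteria under their '## ' section headings."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections


RUBRIC_SAMPLE = """# DCF Model Rubric

## Discount Rate
- WACC is calculated with stated assumptions for cost of equity and cost of debt
- Beta, risk-free rate, and equity risk premium are sourced or justified

## Terminal Value
- Uses either perpetuity growth or exit multiple method (stated which)
- Terminal growth rate does not exceed long-term GDP growth
"""

sections = parse_rubric(RUBRIC_SAMPLE)
for name, criteria in sections.items():
    print(f"{name}: {len(criteria)} criteria")
```

A section that comes back with an empty list is usually a sign that the criteria were written as prose instead of bullets.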

Pass the rubric to user.define_outcome as inline text (see the next section), or upload it through the Files API to reuse it across sessions.

Uploading through the Files API requires both the managed-agents-2026-04-01 and files-api-2025-04-14 beta headers.

from pathlib import Path

rubric = client.beta.files.upload(file=Path("/tmp/rubric.md"))
print(f"Uploaded rubric: {rubric.id}")
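
For callers that bypass the SDK, both beta flags have to be sent explicitly. A minimal sketch, assuming they travel in the `anthropic-beta` request header as a comma-separated list (the header name is not stated on this page, and `beta_headers` is a hypothetical helper):

```python
MANAGED_AGENTS_BETA = "managed-agents-2026-04-01"
FILES_API_BETA = "files-api-2025-04-14"


def beta_headers(*flags: str) -> dict[str, str]:
    """Build the beta header, dropping duplicate flags but keeping their order."""
    unique = list(dict.fromkeys(flags))
    # Assumption: multiple beta flags are joined with commas in one header.
    return {"anthropic-beta": ",".join(unique)}


upload_headers = beta_headers(MANAGED_AGENTS_BETA, FILES_API_BETA)
print(upload_headers)
```

The resulting dictionary can be merged into the headers of whichever HTTP client is in use.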

Create a session with an outcome

After creating the session, send a user.define_outcome event. The agent starts working immediately; no additional user message event is needed.

# Create the session
session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
    title="Financial analysis on Costco",
)

# Define the outcome; the agent starts working as soon as it receives the event
client.beta.sessions.events.send(
    session_id=session.id,
    events=[
        {
            "type": "user.define_outcome",
            "description": "Build a DCF model for Costco in .xlsx",
            "rubric": {"type": "text", "content": RUBRIC},
            # or: "rubric": {"type": "file", "file_id": rubric.id},
            "max_iterations": 5,  # optional; default 3, max 20
        }
    ],
)
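
The event payload above can be assembled and checked locally before it is sent. The following is a sketch only: `define_outcome_event` is a hypothetical helper, and the lower bound of 1 for `max_iterations` is an assumption (the page documents only the default of 3 and the maximum of 20).

```python
MAX_ITERATIONS_LIMIT = 20  # documented maximum


def define_outcome_event(description, *, rubric_text=None, rubric_file_id=None,
                         max_iterations=None):
    """Build a user.define_outcome payload with exactly one rubric source."""
    if (rubric_text is None) == (rubric_file_id is None):
        raise ValueError("provide exactly one of rubric_text or rubric_file_id")
    if rubric_text is not None:
        rubric = {"type": "text", "content": rubric_text}
    else:
        rubric = {"type": "file", "file_id": rubric_file_id}
    event = {
        "type": "user.define_outcome",
        "description": description,
        "rubric": rubric,
    }
    if max_iterations is not None:
        # Assumption: at least one iteration; only the upper bound is documented.
        if not 1 <= max_iterations <= MAX_ITERATIONS_LIMIT:
            raise ValueError(
                f"max_iterations must be between 1 and {MAX_ITERATIONS_LIMIT}"
            )
        event["max_iterations"] = max_iterations
    # When max_iterations is omitted, the documented default of 3 applies.
    return event


event = define_outcome_event(
    "Build a DCF model for Costco in .xlsx",
    rubric_file_id="file_example",  # placeholder ID for illustration
    max_iterations=5,
)
print(event)
```

The returned dictionary has the same shape as the entry passed in `events=[...]` above.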

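Outcome events

Progress in an outcome-oriented session is surfaced on the event stream.

agent.* events (such as messages and tool use) show progress toward the outcome.
span.outcome_evaluation_* events occur only in outcome-oriented sessions and show the iteration count and the grader's feedback process.
You can also send user.message events to an outcome-oriented session to steer the agent's work in progress, but this is not required. The agent iterates on its own toward the outcome until it succeeds or runs out of iterations.
A user.interrupt event pauses work on the current outcome and marks span.outcome_evaluation_end.result as interrupted, so that you can start a new outcome.
After the final outcome evaluation, the session can continue as a conversational session or start a new outcome. The session retains the history of previous outcomes.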
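
When consuming the stream, it can help to bucket events by the prefixes described above. This is a minimal sketch on plain type strings; `event_family` is a hypothetical helper, and `agent.message` is used only as an illustrative member of the agent.* family.

```python
from collections import Counter


def event_family(event_type: str) -> str:
    """Map an event type to the family it belongs to, based on its prefix."""
    if event_type.startswith("span.outcome_evaluation_"):
        return "evaluation"
    if event_type.startswith("agent."):
        return "agent"
    if event_type.startswith("user."):
        return "user"
    return "other"


sample_types = [
    "user.define_outcome",
    "agent.message",  # illustrative agent.* event name
    "span.outcome_evaluation_start",
    "span.outcome_evaluation_ongoing",
    "span.outcome_evaluation_end",
]
counts = Counter(event_family(t) for t in sample_types)
print(dict(counts))
```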

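Define outcome user event

Only one outcome is supported at a time, but outcomes can be chained sequentially. To do so, send a new user.define_outcome event after the previous outcome's end event.

This is the event you send to start an outcome. On receipt, it is echoed back with a processed_at timestamp and an outcome_id.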

{
  "type": "user.define_outcome",
  "description": "Build a DCF model for Costco in .xlsx",
  "rubric": { "type": "file", "file_id": "file_01..." },
  "max_iterations": 5
}
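
Since only one outcome can run at a time, a client that chains outcomes needs to know whether the previous one has ended. The sketch below scans a list of event dictionaries for that purpose. It is an interpretation, not documented behavior: it treats any span.outcome_evaluation_end whose result is not needs_revision as the end of the outcome, and `active_outcome_id` is a hypothetical helper.

```python
def active_outcome_id(events):
    """Return the outcome_id still in progress, or None if a new one can be sent."""
    active = None
    for event in events:
        if event["type"] == "user.define_outcome":
            # The echoed event carries the outcome_id assigned on receipt.
            active = event.get("outcome_id")
        elif (
            event["type"] == "span.outcome_evaluation_end"
            and event.get("outcome_id") == active
            and event.get("result") != "needs_revision"
        ):
            active = None
    return active


log = [
    {"type": "user.define_outcome", "outcome_id": "outc_example_1"},
    {"type": "span.outcome_evaluation_end", "outcome_id": "outc_example_1",
     "result": "needs_revision"},
]
print(active_outcome_id(log))  # still running

log.append({"type": "span.outcome_evaluation_end",
            "outcome_id": "outc_example_1", "result": "satisfied"})
print(active_outcome_id(log))  # finished, so the next outcome can be defined
```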

Outcome evaluation start

Emitted when the grader begins evaluating one iteration. The iteration field is a zero-based revision counter: 0 is the first evaluation, 1 is the re-evaluation after the first revision, and so on.

{
  "type": "span.outcome_evaluation_start",
  "id": "sevt_01def...",
  "outcome_id": "outc_01a...",
  "iteration": 0,
  "processed_at": "2026-03-25T14:01:45Z"
}
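
The zero-based counter is easy to misread in logs. A small sketch that turns it into a label, with `describe_iteration` as a hypothetical helper:

```python
def describe_iteration(iteration: int) -> str:
    """Translate the zero-based iteration counter into a human-readable label."""
    if iteration < 0:
        raise ValueError("iteration is a zero-based counter and cannot be negative")
    if iteration == 0:
        return "first evaluation"
    return f"re-evaluation after revision {iteration}"


for i in range(3):
    print(i, describe_iteration(i))
```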

Outcome evaluation ongoing

A heartbeat emitted while the grader is running. The grader's internal reasoning is opaque: you can see that it is working, but not what it is thinking.

{
  "type": "span.outcome_evaluation_ongoing",
  "id": "sevt_01ghi...",
  "outcome_id": "outc_01a...",
  "processed_at": "2026-03-25T14:02:10Z"
}
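
Because the heartbeat carries only a timestamp, one use for it is a client-side staleness check. The sketch below compares processed_at against a caller-chosen timeout. The heartbeat interval is not documented here, so the 60-second value is an arbitrary assumption, and `grader_is_stale` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value: str) -> datetime:
    """Parse a UTC timestamp such as '2026-03-25T14:02:10Z'."""
    # Replace the trailing Z so older Python versions can parse it too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def grader_is_stale(last_event: dict, now: datetime, timeout: timedelta) -> bool:
    """Return True if the last evaluation event is older than the timeout."""
    return now - parse_ts(last_event["processed_at"]) > timeout


heartbeat = {
    "type": "span.outcome_evaluation_ongoing",
    "processed_at": "2026-03-25T14:02:10Z",
}
now = datetime(2026, 3, 25, 14, 2, 40, tzinfo=timezone.utc)
stale = grader_is_stale(heartbeat, now, timedelta(seconds=60))
print(stale)
```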

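Outcome evaluation end

Emitted after the grader finishes evaluating one iteration. The result field indicates what happens next.

Result	Next step
`satisfied`	The session transitions to `idle`.
`needs_revision`	The agent starts a new iteration cycle.
`max_iterations_reached`	No further evaluation cycles run. The agent may run one final revision before the session transitions to `idle`.
`failed`	The session transitions to `idle`. Returned when the rubric fundamentally does not match the task, for example when the description and the rubric contradict each other.
`interrupted`	Emitted only if `outcome_evaluation_start` had already fired before the interrupt.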

{
  "type": "span.outcome_evaluation_end",
  "id": "sevt_01jkl...",
  "outcome_evaluation_start_id": "sevt_01def...",
  "outcome_id": "outc_01a...",
  "result": "satisfied",
  "explanation": "All 12 criteria met: revenue projections use 5 years of historical data, WACC assumptions are stated, sensitivity table is included...",
  "iteration": 0,
  "usage": {
    "input_tokens": 2400,
    "output_tokens": 350,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 1800
  },
  "processed_at": "2026-03-25T14:03:00Z"
}
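
A client typically needs two things from these end events: whether another iteration is coming, and how many tokens the grader consumed. The following sketch works on plain dictionaries shaped like the event above; `is_final` and `total_usage` are hypothetical helpers, and the sample numbers for the first event are invented for illustration.

```python
KNOWN_RESULTS = {
    "satisfied",
    "needs_revision",
    "max_iterations_reached",
    "failed",
    "interrupted",
}


def is_final(result: str) -> bool:
    """Only needs_revision leads to another evaluation cycle."""
    if result not in KNOWN_RESULTS:
        raise ValueError(f"unknown result: {result}")
    return result != "needs_revision"


def total_usage(end_events):
    """Sum the usage counters across several evaluation end events."""
    totals = {}
    for event in end_events:
        for key, value in event.get("usage", {}).items():
            totals[key] = totals.get(key, 0) + value
    return totals


end_events = [
    {"result": "needs_revision", "iteration": 0,
     "usage": {"input_tokens": 2000, "output_tokens": 300}},
    {"result": "satisfied", "iteration": 1,
     "usage": {"input_tokens": 2400, "output_tokens": 350}},
]
print(is_final(end_events[-1]["result"]), total_usage(end_events))
```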

Check outcome status

You can listen for span.outcome_evaluation_end on the event stream, or poll GET /v1/sessions/:id and read outcome_evaluations[].result.

session = client.beta.sessions.retrieve(session.id)

for outcome in session.outcome_evaluations:
    print(f"{outcome.outcome_id}: {outcome.result}")
    # outc_01a...: satisfied
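
For the polling route, a bounded loop avoids waiting forever on a session that never settles. This is a sketch under stated assumptions: `wait_for_result` is a hypothetical helper, the fetch function is supplied by the caller, and the results list is assumed to be ordered oldest first (the ordering is not documented on this page).

```python
import time


def wait_for_result(fetch_results, *, interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll until the latest evaluation result is no longer needs_revision.

    fetch_results is a caller-supplied function returning the list of result
    strings read from outcome_evaluations[].result, assumed oldest first.
    """
    for _ in range(max_polls):
        results = fetch_results()
        if results and results[-1] != "needs_revision":
            return results[-1]
        sleep(interval)
    raise TimeoutError("no final outcome result within the polling budget")


# Simulated snapshots standing in for successive session retrievals.
snapshots = iter([[], ["needs_revision"], ["needs_revision", "satisfied"]])
final = wait_for_result(lambda: next(snapshots), sleep=lambda _: None)
print(final)
```

With the SDK calls shown above, the fetch function could be `lambda: [o.result for o in client.beta.sessions.retrieve(session.id).outcome_evaluations]`.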

Retrieve artifacts

The agent writes output files to /mnt/session/outputs/ inside the sandbox. Once the session is idle, fetch the files through the Files API scoped to the session.

# List files produced by this session
files = client.beta.files.list(scope_id=session.id)
for f in files:
    print(f.id, f.filename)

# Download a file
if files.data:
    content = client.beta.files.download(files.data[0].id)
    content.write_to_file("/tmp/output.txt")
