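A Comprehensive Guide to Running Autonomous AI Coding Loops: The Ralph Playbook Methodology

Conceived by Geoffrey Huntley and systematized by Clayton Farr, the Ralph Playbook presents a new paradigm for autonomous coding with AI. The core idea is simple: run an LLM such as Claude in a repeating loop, where every iteration starts from a clean context and performs exactly one task.

<Source: https://ghuntley.com/ralph/>

The origin of the name is worth noting. “Ralph Wiggum” is a character from The Simpsons, innocent but somewhat dim-witted, who proves unexpectedly effective at simple, repetitive tasks rather than at following complicated instructions. The methodology follows that same philosophy.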

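How Ralph Works

Core Philosophy: Three Pillars

The Ralph methodology rests on three core principles.

1. Context is everything

Of a 200K-token window, only about 176K tokens are actually usable, and the AI works at its “smartest” at 40-60% utilization. Ralph keeps tasks small, one per iteration, so that 100% of each iteration's context use stays within that smart zone.

Use the main agent only as a scheduler
Distribute the actual work to subagents to extend memory
Favor brevity and simplicity (prefer Markdown over JSON)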
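
To make the numbers concrete, here is a rough shell sketch of the budget arithmetic. It takes the 176K usable tokens and the 40-60% band from the text at face value; the function names, and the choice to treat everything at or below the 60% mark as the smart zone, are illustrative assumptions rather than part of the playbook.

```shell
#!/bin/sh
# Usable window size quoted in the text (about 176K of 200K tokens).
USABLE_TOKENS=176000

# Print the token count at a given percentage of the usable window.
tokens_at_percent() {
  echo $(( USABLE_TOKENS * $1 / 100 ))
}

# Classify a context size against the upper smart-zone bound (60%).
# Assumption: anything at or below that bound counts as "smart".
context_zone() {
  if [ "$1" -le "$(tokens_at_percent 60)" ]; then
    echo "smart"
  else
    echo "degraded"
  fi
}

echo "smart zone: $(tokens_at_percent 40)-$(tokens_at_percent 60) tokens"
context_zone 90000     # prints "smart"
context_zone 150000    # prints "degraded"
```

Under these assumptions the band works out to roughly 70K-106K tokens, which is why a single small task per iteration leaves comfortable headroom.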

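2. Steering through backpressure

Instead of telling the AI what to do, build an environment in which wrong results are rejected automatically.

Upstream control: existing code patterns and utilities point in the right direction
Downstream control: tests, typechecks, lints, and builds block incorrect work
Where needed, an LLM-as-judge verifies subjective quality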
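
The downstream side can be sketched as a small gate runner: every check must pass, and the first failure rejects the work. This is a minimal illustration, not the playbook's own script. The name `run_gates` and the stand-in commands are hypothetical; a real project would pass its own test, typecheck, lint, and build commands.

```shell
#!/bin/sh
# Run each gate command in order and stop at the first failure.
# Each argument is one complete command line, executed via sh -c.
run_gates() {
  for gate in "$@"; do
    if ! sh -c "$gate" >/dev/null 2>&1; then
      echo "REJECTED by: $gate"
      return 1
    fi
  done
  echo "ACCEPTED"
}

# Stand-in commands only; substitute the project's real checks.
run_gates "true" "test 1 -eq 1"
run_gates "true" "false" "true" || echo "work is not committed"
```

The point of the design is that the agent never has to be told what "wrong" means. A non-zero exit status from any gate is enough to send it back to fix the problem.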

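3. Leave it to Ralph

Trust the AI's ability to recognize problems, fix them, and improve on its own. The AI writes the plan and the AI sets the priorities. Humans observe and adjust from above the loop, not from inside it.

A sandboxed environment is essential. Ralph runs with the --dangerously-skip-permissions flag, so it must be executed in an isolated environment (Docker, Fly Sprites, etc.).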

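Three-Phase Workflow

Phase 1: Define requirements (LLM conversation)

A human and an LLM define the project through conversation.

Identify JTBD (Jobs to Be Done): determine the real goals users want to achieve
Break down into Topics of Concern: separate the distinct aspects that make up each JTBD
Write spec documents: create a specs/FILENAME.md file for each topic

Example:

JTBD: “Help designers create mood boards”
Topics: image collection, color extraction, layout, sharing
Specs: one Markdown file per topic

Topic Scope Test: to check whether a topic is scoped properly, ask whether it can be described in one sentence without “and.” If “and” is needed, split it into multiple topics.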
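
The test is easy to mechanize as a first-pass check. The sketch below is only a crude heuristic under stated assumptions: the function name `topic_scope_ok` and both sample sentences are made up for illustration, and it only looks for a standalone "and" surrounded by spaces.

```shell
#!/bin/sh
# Exit 0 when the one-sentence description contains no standalone "and"
# (case-insensitive, space-delimited); exit 1 when it does.
topic_scope_ok() {
  case " $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') " in
    *" and "*) return 1 ;;
    *) return 0 ;;
  esac
}

for topic in \
  "The color extraction system analyzes images to identify dominant colors" \
  "The user system handles authentication and profiles and billing"
do
  if topic_scope_ok "$topic"; then
    echo "OK:    $topic"
  else
    echo "SPLIT: $topic"
  fi
done
```

A description that fails the check is a prompt for a human to reconsider the split, not an automatic verdict. Some legitimate single topics will contain the word.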

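Phases 2 & 3: Running the Ralph loop

Both phases use the same loop mechanism, operating in two modes.

Planning Mode

Gap analysis comparing the specs against the existing code
Only creates or updates IMPLEMENTATION_PLAN.md
Does not implement anything

Building Mode

Read the plan and select the most important task
Implement → test → commit → update the plan
Handle only one task per iteration

Building mode in detail

Each loop iteration goes through the following steps.

Orient: study specs/* (understand the requirements)
Read the plan: study IMPLEMENTATION_PLAN.md
Select a task: pick the single most important task
Investigate: review the existing /src code (“don't assume not implemented”)
Implement: work on files with N subagents
Validate: build and test with 1 subagent (the backpressure mechanism)
Update the plan: mark the task complete and record findings
Update AGENTS.md: add any operational learnings
Commit
Loop ends → context is cleared → next iteration starts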

File Structure and Roles

Ralph runs on five core files.

project-root/
├── loop.sh                         # Ralph loop script
├── PROMPT_build.md                 # Build mode instructions
├── PROMPT_plan.md                  # Planning mode instructions
├── AGENTS.md                       # Operational guide (~60 lines)
├── IMPLEMENTATION_PLAN.md          # Prioritized task list
├── specs/                          # Requirement specs
│   ├── [topic-a].md
│   └── [topic-b].md
└── src/                            # Application source code

loop.sh: outer loop control

The most basic form:

while :; do cat PROMPT.md | claude ; done

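The improved version adds mode selection, an iteration limit, automatic pushing, and more.

Make the script executable before the first run:

chmod +x loop.sh

# Usage examples
./loop.sh              # Build mode, unlimited
./loop.sh 20           # Build mode, at most 20 iterations
./loop.sh plan         # Planning mode
./loop.sh plan 5       # Planning mode, at most 5 iterations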
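
For readers who want to see how such a script might hang together, here is a minimal sketch that handles the four invocations listed above. It is not the playbook's actual loop.sh. `AGENT_CMD`, `parse_args`, and `run_loop` are hypothetical names, and the default command assumes the `claude` CLI's `-p` print mode reading the prompt from standard input. It should be run only inside a sandbox.

```shell
#!/bin/sh
# AGENT_CMD is a hypothetical override so the loop can be dry-run with a
# stand-in command instead of the real CLI.
AGENT_CMD=${AGENT_CMD:-"claude -p --dangerously-skip-permissions"}

# Set MODE, PROMPT_FILE and MAX_ITERATIONS (0 = unlimited) from the arguments.
parse_args() {
  MODE=build
  PROMPT_FILE=PROMPT_build.md
  MAX_ITERATIONS=0
  if [ "${1:-}" = "plan" ]; then
    MODE=plan
    PROMPT_FILE=PROMPT_plan.md
    shift
  fi
  case "${1:-}" in
    "") ;;
    *[!0-9]*) echo "usage: ./loop.sh [plan] [max_iterations]" >&2; return 2 ;;
    *) MAX_ITERATIONS=$1 ;;
  esac
}

# Restart the agent with a fresh process until the iteration cap is reached.
run_loop() {
  parse_args "$@" || return
  [ -f "$PROMPT_FILE" ] || { echo "missing $PROMPT_FILE" >&2; return 1; }
  iteration=0
  while [ "$MAX_ITERATIONS" -eq 0 ] || [ "$iteration" -lt "$MAX_ITERATIONS" ]; do
    iteration=$((iteration + 1))
    echo "=== $MODE iteration $iteration ==="
    # Intentionally unquoted so the command string splits into words.
    $AGENT_CMD < "$PROMPT_FILE" || echo "agent exited non-zero; continuing" >&2
    # Auto-push after each iteration; a failed push does not stop the loop.
    git push >/dev/null 2>&1 || true
  done
}

# In a real loop.sh the last line would be:  run_loop "$@"
```

Because every iteration is a new process, nothing carries over between runs except what the agent wrote to disk and committed.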

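Key insight: the IMPLEMENTATION_PLAN.md file persists on disk and serves as shared state between iterations. Without any complex orchestration, a simple bash loop keeps restarting the agent, and each time the agent reads the plan file to decide what to do next.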
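
The effect of keeping state in a file can be illustrated without any agent at all. The sketch below assumes the plan is a priority-sorted Markdown bullet list, so the first bullet is the next task. `next_task` and the sample plan items are hypothetical, and in practice the agent reads the file itself rather than going through a helper like this.

```shell
#!/bin/sh
# Print the first bullet in the plan file, with the bullet marker stripped.
next_task() {
  awk '/^[-*] /{ sub(/^[-*] +/, ""); print; exit }' "${1:-IMPLEMENTATION_PLAN.md}"
}

plan=$(mktemp)
cat > "$plan" <<'EOF'
# Implementation Plan
- Add image upload endpoint
- Extract dominant colors
EOF

next_task "$plan"    # prints "Add image upload endpoint"

# Simulate an iteration finishing and removing its item from the plan.
grep -v 'image upload' "$plan" > "$plan.tmp" && mv "$plan.tmp" "$plan"

next_task "$plan"    # prints "Extract dominant colors"
rm -f "$plan"
```

Each call stands in for a fresh process with no memory of the previous one; the file alone determines what comes next.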

PROMPT_*.md: per-mode instructions

Detailed instructions that are loaded on every iteration.

PROMPT_plan.md example

0a. Study `specs/*` with up to 250 parallel Sonnet subagents to learn the application specifications.
0b. Study @IMPLEMENTATION_PLAN.md (if present) to understand the plan so far.
0c. Study `src/lib/*` with up to 250 parallel Sonnet subagents to understand shared utilities & components.
0d. For reference, the application source code is in `src/*`.

1. Study @IMPLEMENTATION_PLAN.md (if present; it may be incorrect) and use up to 500 Sonnet subagents to study existing source code in `src/*` and compare it against `specs/*`. Use an Opus subagent to analyze findings, prioritize tasks, and create/update @IMPLEMENTATION_PLAN.md as a bullet point list sorted in priority of items yet to be implemented. Ultrathink. Consider searching for TODO, minimal implementations, placeholders, skipped/flaky tests, and inconsistent patterns. Study @IMPLEMENTATION_PLAN.md to determine starting point for research and keep it up to date with items considered complete/incomplete using subagents.

IMPORTANT: Plan only. Do NOT implement anything. Do NOT assume functionality is missing; confirm with code search first. Treat `src/lib` as the project's standard library for shared utilities and components. Prefer consolidated, idiomatic implementations there over ad-hoc copies.

ULTIMATE GOAL: We want to achieve [project-specific goal]. Consider missing elements and plan accordingly. If an element is missing, search first to confirm it doesn't exist, then if needed author the specification at specs/FILENAME.md. If you create a new element then document the plan to implement it in @IMPLEMENTATION_PLAN.md using a subagent.

PROMPT_build.md example

0a. Study `specs/*` with up to 500 parallel Sonnet subagents to learn the application specifications.
0b. Study @IMPLEMENTATION_PLAN.md.
0c. For reference, the application source code is in `src/*`.

1. Your task is to implement functionality per the specifications using parallel subagents. Follow @IMPLEMENTATION_PLAN.md and choose the most important item to address. Before making changes, search the codebase (don't assume not implemented) using Sonnet subagents. You may use up to 500 parallel Sonnet subagents for searches/reads and only 1 Sonnet subagent for build/tests. Use Opus subagents when complex reasoning is needed (debugging, architectural decisions).
2. After implementing functionality or resolving problems, run the tests for that unit of code that was improved. If functionality is missing then it's your job to add it as per the application specifications. Ultrathink.
3. When you discover issues, immediately update @IMPLEMENTATION_PLAN.md with your findings using a subagent. When resolved, update and remove the item.
4. When the tests pass, update @IMPLEMENTATION_PLAN.md, then `git add -A` then `git commit` with a message describing the changes. After the commit, `git push`.

99999. Important: When authoring documentation, capture the why – tests and implementation importance.
999999. Important: Single sources of truth, no migrations/adapters. If tests unrelated to your work fail, resolve them as part of the increment.
9999999. As soon as there are no build or test errors create a git tag. If there are no git tags start at 0.0.0 and increment patch by 1 for example 0.0.1 if 0.0.0 does not exist.
99999999. You may add extra logging if required to debug issues.
999999999. Keep @IMPLEMENTATION_PLAN.md current with learnings using a subagent – future work depends on this to avoid duplicating efforts. Update especially after finishing your turn.
9999999999. When you learn something new about how to run the application, update @AGENTS.md using a subagent but keep it brief. For example if you run commands multiple times before learning the correct command then that file should be updated.
99999999999. For any bugs you notice, resolve them or document them in @IMPLEMENTATION_PLAN.md using a subagent even if it is unrelated to the current piece of work.
999999999999. Implement functionality completely. Placeholders and stubs waste efforts and time redoing the same work.
9999999999999. When @IMPLEMENTATION_PLAN.md becomes large periodically clean out the items that are completed from the file using a subagent.
99999999999999. If you find inconsistencies in the specs/* then use an Opus 4.5 subagent with 'ultrathink' requested to update the specs.
999999999999999. IMPORTANT: Keep @AGENTS.md operational only – status updates and progress notes belong in `IMPLEMENTATION_PLAN.md`. A bloated AGENTS.md pollutes every future loop's context.

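Structure:

Phase 0 (0a, 0b, 0c): orientation (study the specs, the source location, and the current plan)
Phase 1-4: main instructions (task, validation, commit)
999… numbers: guardrails and invariants (the higher the number, the more critical)

Key language patterns (Geoff's specific phrasing):

“study” (not “read” or “look at”)
“don’t assume not implemented” (the most important one; the Achilles' heel)
“Ultrathink” (think deeply)
“capture the why” (record the reasoning)
“keep it up to date” (keep things current)

AGENTS.md: operational guide

A concise guide to building and running the project (limited to about 60 lines).

AGENTS.md example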

## Build & Run

Succinct rules for how to BUILD the project:

## Validation

Run these after implementing to get immediate feedback:

- Tests: `[test command]`
- Typecheck: `[typecheck command]`
- Lint: `[lint command]`

## Operational Notes

Succinct learnings about how to RUN the project:

...

### Codebase Patterns

...

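AGENTS.md covers:

Build commands
How to run the tests
Typecheck and lint commands
Operational learnings

Important: status updates and progress notes belong in IMPLEMENTATION_PLAN.md, not here. A bloated AGENTS.md pollutes the context of every loop.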
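
One lightweight way to keep the file honest is a size check run between iterations. The sketch below is an assumption-laden illustration: the 60-line default comes from the guideline in the text, while `check_agents_size` and its message are hypothetical.

```shell
#!/bin/sh
# Report whether a file stays within a line budget.
# Arguments: file (default AGENTS.md), limit (default 60).
# Returns 1 when the file is over the limit.
check_agents_size() {
  file=${1:-AGENTS.md}
  limit=${2:-60}
  lines=$(wc -l < "$file" | tr -d ' ')
  if [ "$lines" -gt "$limit" ]; then
    echo "$file has $lines lines (limit $limit): move status notes to IMPLEMENTATION_PLAN.md"
    return 1
  fi
  echo "$file OK ($lines lines)"
}

# Typical use from the project root:
#   check_agents_size || echo "trim AGENTS.md before the next run"
```

The check only measures length. Whether the remaining lines are genuinely operational still takes a human glance.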

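Other files

IMPLEMENTATION_PLAN.md: the prioritized task list that Ralph creates and updates. If it goes wrong, it can be deleted and regenerated at any time (the plan is disposable).

specs/*: requirement documents, one per topic of concern. They are the source of truth for what should be built.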

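Characteristics of the Methodology

1. Context efficiency

In typical AI coding, once context usage climbs beyond roughly the 40-60% range, the model enters the “dumb zone” and its reasoning degrades. Ralph avoids this by starting each iteration with a clean context.

Each iteration loads only what it needs: the prompt, the plan, and the relevant code
Subagents protect the main context
Learning from code patterns removes the need for verbose prompts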

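2. Disposable plans

Plans can be wrong. Requirements change, and if the AI misunderstands something early on, errors accumulate.

The fix: regenerate. Switching to planning mode and rerunning the gap analysis is cheaper than fighting a bad plan.

When to regenerate:

Ralph is heading in the wrong direction
The plan is stale or no longer matches the current state
The plan is cluttered with completed items
The specs have changed significantly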
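
In practice, regeneration can be as small as deleting the file and running a few planning iterations. The helper below is a hypothetical convenience, not something the playbook prescribes; `discard_plan` is an invented name, and the `./loop.sh plan 5` call in the comment simply reuses the usage shown earlier.

```shell
#!/bin/sh
# Delete the current plan so the next planning run rebuilds it from the
# specs and the code. Safe to repeat: a missing plan is simply reported.
discard_plan() {
  plan=${1:-IMPLEMENTATION_PLAN.md}
  if [ -f "$plan" ]; then
    rm -f "$plan"
    echo "removed $plan"
  else
    echo "no plan to discard"
  fi
}

# From the project root, regeneration is then a short planning run, e.g.:
#   discard_plan && ./loop.sh plan 5
```

Nothing of value is lost if the plan was committed on earlier iterations, since the old version remains in the git history.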

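3. Layers of backpressure

An autonomous loop converges when wrong results are rejected. There are three layers:

Deterministic gates: tests, typechecks, lints, and build verification
Code patterns: existing utilities and components guide the right approach
LLM-as-judge: binary pass/fail on subjective criteria (tone, UX feel, aesthetics)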
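
The third layer can be wrapped so that it behaves like any other gate, returning only an exit status. The following is a sketch under assumptions, not an established interface. `JUDGE_CMD` and `llm_judge` are hypothetical names, and the default assumes the `claude` CLI in `-p` print mode reading the prompt from standard input. Any reply other than the single word PASS counts as a rejection.

```shell
#!/bin/sh
# JUDGE_CMD is a hypothetical hook: any command that reads a prompt on
# stdin and prints a verdict on stdout.
JUDGE_CMD=${JUDGE_CMD:-"claude -p"}

# Ask the judge whether an artifact meets a criterion.
# Arguments: criterion text, path to the artifact file.
# Returns 0 only when the reply, with whitespace removed, is exactly PASS.
llm_judge() {
  criterion=$1
  artifact=$2
  verdict=$(
    {
      echo "Criterion: $criterion"
      echo "Answer with exactly one word, PASS or FAIL."
      echo "--- artifact ---"
      cat "$artifact"
    } | $JUDGE_CMD | tr -d '[:space:]'
  )
  [ "$verdict" = "PASS" ]
}

# Typical use alongside the deterministic gates:
#   llm_judge "The error messages have a friendly tone" src/messages.txt
```

Failing closed on ambiguous replies keeps the gate binary, which is what lets it act as backpressure rather than as advice.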

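How to Apply It

1. Start with a small project

Start with a small project that has clear requirements. Write the specs, generate the plan, and build. Watch what goes wrong and add backpressure accordingly.

2. Tune reactively

Don't prescribe everything up front. Observe and adjust. When Ralph fails in a particular way, add a signal that will help next time.

Signals are not limited to prompt text:

Prompt guardrails
Operational learnings in AGENTS.md
Utilities in the codebase (add a pattern and Ralph will discover and follow it)

3. Observe and course-correct

Especially early on, sit and watch. What patterns emerge? Where does Ralph go wrong? What signals does it need? The starting prompt is not the final prompt; it evolves through observed failure patterns.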

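4. Apply advanced patterns

Once the basic loop works well, consider the following.

Work-scoped branches: generate a scoped plan up front for each feature branch. Scoping happens deterministically at plan creation rather than probabilistically through runtime filtering.

git checkout -b feature/user-auth
./loop.sh plan-work "user authentication system with OAuth and session management"
./loop.sh  # build from the scoped plan

Acceptance-driven backpressure: derive test requirements from acceptance criteria during planning. Completion cannot be claimed until the appropriate tests pass.

JTBD → story map → SLC release: restructure the specs as user-journey activities. Instead of building everything at once, identify Simple, Lovable, Complete releases.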

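What This Methodology Does Not Solve

It is also important to understand the practical limits.

Bad specs: garbage in, garbage out. The method assumes Phase 1 was done properly.

Architectural decisions: new abstractions still require human judgment. Ralph handles execution, not design.

Cost: every iteration consumes tokens. On a large codebase, 50 iterations can cost $50-100 or more. Set limits.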
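
A limit is easiest to reason about as a budget divided by a per-iteration cost. The figures above imply roughly $1-2 per iteration on a large codebase. The sketch below turns a budget into an iteration cap, with the function name and the $40 budget chosen purely for illustration.

```shell
#!/bin/sh
# Largest iteration cap that stays within a budget, in whole US dollars.
# Uses integer division, so any remainder is left unspent.
max_iterations_for_budget() {
  budget=$1
  cost_per_iteration=$2
  echo $(( budget / cost_per_iteration ))
}

# Pessimistic end of the range implied by the text:
# $100 / 50 iterations = $2 per iteration.
cap=$(max_iterations_for_budget 40 2)
echo "./loop.sh $cap"    # prints "./loop.sh 20"
```

Actual per-iteration cost varies with codebase size and subagent fan-out, so the estimate is worth recalibrating after the first few runs.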

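Closing Thoughts

The Ralph Playbook takes a fundamentally different approach to AI coding. Instead of telling the AI what to do and how to do it, you design the conditions under which good results emerge naturally.

The essentials are small tasks, clean context, and strong backpressure. Combined, these three let the AI work with surprising autonomy and effectiveness.

Getting started is simple. Pick a small project with clear requirements, write the specs, and run Ralph. Watch what goes wrong, add backpressure, and adjust. If the plan goes wrong? Regenerate it. Plans are disposable.

The beauty of this methodology lies in its simplicity. It takes no complex orchestration or elaborate prompt engineering, only the right structure and feedback loops. Humans stay above the loop rather than inside it, observing and adjusting, and let Ralph be Ralph.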

Published: January 21, 2026

Category: AI, Software

Author: choonzang

Tags: AI coding, Geoffrey Huntley, Ralph playbook, Ralph Wiggum, coding loop