Lifting drill into Superpowers' `evals/`: Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Move the standalone obra/drill skill-compliance benchmark into superpowers as a top-level evals/ directory, delete redundant bash tests under superpowers/tests/ after per-file subagent verification of drill scenario coverage, and update top-level docs so contributors land on the new structure.

Architecture: Single PR against dev on a new branch f/evals-lift. Drill source is copied verbatim with explicit rsync excludes to keep .git/, .venv/, etc. out of the new dir. A small helper in drill/cli.py defaults SUPERPOWERS_ROOT to the parent of the evals/ directory, so contributors don’t have to set the env var. Each bash-test deletion is gated by a subagent that compares the bash test’s assertions to its claimed drill scenario’s verify block. Historical references in plan docs and release notes are annotated, not rewritten.

Tech Stack: Python 3.11 + uv (drill’s existing toolchain, unchanged); rsync; bash; git.

Spec: docs/superpowers/specs/2026-05-06-lift-drill-into-evals-design.md — read this first.

Drill source location: /Users/jesse/Documents/GitHub/superpowers/drill/ (sibling to superpowers/).

Task 1: Branch off dev

Files: none (git operation only)

Step 1: Verify clean working tree

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git status --short

Expected: empty output (or only untracked .opencode/package-lock.json, which is fine).

Step 2: Fetch latest dev

git fetch origin dev:dev

Step 3: Create the branch

git checkout -b f/evals-lift dev

Expected: Switched to a new branch 'f/evals-lift'.

Step 4: Sanity check

git log --oneline -1

Expected: output begins with whatever commit origin/dev points to (currently b4363df docs: turned the dash in "- Jesse" into an escape sequence (#1474)).

Task 2: Capture drill SHA at copy time

Files: none (records the value for the lift commit message)

Step 1: Get the current drill HEAD SHA

cd /Users/jesse/Documents/GitHub/superpowers/drill
DRILL_SHA=$(git rev-parse HEAD)
echo "$DRILL_SHA"

Step 2: Verify drill has no uncommitted work

cd /Users/jesse/Documents/GitHub/superpowers/drill
git status --short

Expected: empty (no untracked or modified files). If the output is non-empty, stop and report: the drill working tree must be clean before the lift, otherwise the SHA pin is meaningless.

Step 3: Save the SHA in the shell env for the next task

echo "DRILL_SHA=$DRILL_SHA"  # write this down for use in Task 3

Task 3: rsync drill into evals/

Files:

Create: evals/ (entire directory tree from drill, minus excludes)

Step 1: Verify source and destination paths

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
test -d /Users/jesse/Documents/GitHub/superpowers/drill && echo "drill source: OK"
test ! -d evals && echo "evals/ does not yet exist: OK"

Expected: both echoes print.

Step 2: rsync drill to evals/ with explicit excludes

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
rsync -a \
  --exclude=.git \
  --exclude=.venv \
  --exclude=results \
  --exclude=.env \
  --exclude=__pycache__ \
  --exclude='*.egg-info' \
  --exclude=.private-journal \
  --exclude='*.pyc' \
  /Users/jesse/Documents/GitHub/superpowers/drill/ \
  evals/

Step 3: Verify excludes worked

find evals -name '.git' -type d
find evals -name '.venv' -type d
find evals -name 'results' -type d
find evals -name '.env'
find evals -name '__pycache__' -type d
find evals -name '*.egg-info' -type d
find evals -name '.private-journal'
find evals -name '*.pyc'

Expected: every command returns no output. If any returns a path, manually rm -rf it before continuing.
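
The same check can be folded into a single pass over evals/. The following is a minimal sketch, not part of drill: find_leaked_paths and EXCLUDED_PATTERNS are hypothetical names, and the pattern list is assumed to mirror the rsync --exclude flags from step 2.

```python
from fnmatch import fnmatch
from pathlib import Path

# Assumption: copied by hand from the rsync --exclude flags in step 2.
EXCLUDED_PATTERNS = [
    ".git", ".venv", "results", ".env", "__pycache__",
    "*.egg-info", ".private-journal", "*.pyc",
]


def find_leaked_paths(root: Path) -> list[Path]:
    """Return every file or directory under root whose name matches an exclude."""
    return sorted(
        path
        for path in root.rglob("*")  # rglob("*") also visits dotfiles
        if any(fnmatch(path.name, pattern) for pattern in EXCLUDED_PATTERNS)
    )


if __name__ == "__main__":
    for leaked in find_leaked_paths(Path("evals")):
        print(f"LEAKED: {leaked}")
```

An empty result corresponds to the "no output" expectation; any printed path is a candidate for manual removal.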

Step 4: Confirm the source SHA for the commit message

cd /Users/jesse/Documents/GitHub/superpowers/drill
DRILL_SHA=$(git rev-parse HEAD)
echo "$DRILL_SHA"

Expected: the SHA from Task 2 step 1.

Step 5: Stage everything

git add evals/
git status --short | head -20

Expected: output starts with A evals/... lines listing many added files. Many of these are in scenarios/, drill/, backends/, setup_helpers/, etc.

Step 6: Commit

: "${DRILL_SHA:?Set DRILL_SHA from Task 2 before committing}"
git commit -m "$(cat <<EOF
Lift drill into evals/ at $DRILL_SHA

rsync of obra/drill@$DRILL_SHA into superpowers/evals/, excluding
.git/, .venv/, results/, .env, __pycache__/, *.egg-info/,
.private-journal/, *.pyc.

The drill repo is unaffected by this commit; archival is a separate
manual step after this PR merges.

Source SHA recorded in this commit message for provenance.
EOF
)"

Task 4: Verify the copy with checksums

Files: none (verification only)

Step 1: List the drill files that should be in evals (everything except the excludes)

cd /Users/jesse/Documents/GitHub/superpowers/drill
find . \
  \( -name '.git' -prune \
  -o -name '.venv' -prune \
  -o -name 'results' -prune \
  -o -name '__pycache__' -prune \
  -o -name '*.egg-info' -prune \
  -o -name '.private-journal' -prune \
  -o -name '*.pyc' -prune \
  -o -name '.env' -prune \) \
  -o -type f -print | sort > /tmp/drill-files.txt
wc -l /tmp/drill-files.txt

Step 2: Get list of files in evals/

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
find evals -type f | sed 's|^evals/|./|' | sort > /tmp/evals-files.txt
wc -l /tmp/evals-files.txt

Step 3: Diff the two lists

The file lists should match exactly after excluded paths are removed.

diff /tmp/drill-files.txt /tmp/evals-files.txt

Expected: no output.
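
For readers who prefer one script over the find/sed/diff pipeline, here is a rough Python equivalent. It is a sketch under stated assumptions: list_files and diff_trees are hypothetical helpers, EXCLUDES is assumed to match the rsync exclude list, and pruning is done by name at every depth, like the find expression in step 1.

```python
import os
from fnmatch import fnmatch
from pathlib import Path

# Assumption: same patterns as the rsync --exclude flags in Task 3.
EXCLUDES = [
    ".git", ".venv", "results", ".env", "__pycache__",
    "*.egg-info", ".private-journal", "*.pyc",
]


def list_files(root: Path, excludes: list[str]) -> set[str]:
    """Collect POSIX-style relative file paths under root, pruning excluded names."""
    found: set[str] = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into pruned dirs.
        dirnames[:] = [
            d for d in dirnames if not any(fnmatch(d, p) for p in excludes)
        ]
        for name in filenames:
            if not any(fnmatch(name, p) for p in excludes):
                found.add((Path(dirpath) / name).relative_to(root).as_posix())
    return found


def diff_trees(drill_root: Path, evals_root: Path) -> tuple[list[str], list[str]]:
    """Return (missing from evals, unexpected in evals)."""
    expected = list_files(drill_root, EXCLUDES)
    actual = list_files(evals_root, [])  # nothing is pruned on the evals side
    return sorted(expected - actual), sorted(actual - expected)
```

Two empty lists are the analogue of diff printing nothing. A non-empty second list means something in evals/ has no source in drill, including any excluded path that leaked through.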

Step 4: Per-file checksum verification

cd /Users/jesse/Documents/GitHub/superpowers/drill
while read -r f; do
  sha1=$(shasum -a 256 "$f" | cut -d' ' -f1)
  sha2=$(shasum -a 256 "/Users/jesse/Documents/GitHub/superpowers/superpowers/evals/${f#./}" | cut -d' ' -f1)
  if [ "$sha1" != "$sha2" ]; then
    echo "MISMATCH: $f ($sha1 vs $sha2)"
  fi
done < /tmp/drill-files.txt | head -20

Expected: no output (every file’s checksum matches between drill and evals).
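
The shell loop pipes through head -20, so it shows at most twenty mismatches, and a file missing from evals/ surfaces only as a mismatch against an empty digest. A hedged Python sketch of the same comparison that reports missing files separately is shown below; sha256_of and compare_trees are illustrative names, not drill functions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large fixtures are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source: Path, copy: Path, relative_paths: list[str]) -> list[str]:
    """Return one message per file that is missing from copy or differs from source."""
    problems: list[str] = []
    for rel in relative_paths:
        src, dst = source / rel, copy / rel
        if not dst.is_file():
            problems.append(f"MISSING: {rel}")
        elif sha256_of(src) != sha256_of(dst):
            problems.append(f"MISMATCH: {rel}")
    return problems
```

Here relative_paths would be the entries of /tmp/drill-files.txt with the leading ./ stripped; an empty return value matches the expectation above.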

Step 5: Smoke check: install dependencies

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
uv sync

Expected: Installed N packages or similar. No errors.

Step 6: Smoke check: drill list

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
uv run drill list 2>&1 | head -5

Expected: starts with scenario names. (Will likely error or warn about missing SUPERPOWERS_ROOT; that’s fine, it is fixed in the next task.)

Step 7: Dispatch verification subagent

Dispatch a general-purpose subagent with this prompt:

You are verifying a verbatim copy of the drill repo at
/Users/jesse/Documents/GitHub/superpowers/drill into
/Users/jesse/Documents/GitHub/superpowers/superpowers/evals.

Verify:

1. The lift commit message records the SHA reported by:
  cd /Users/jesse/Documents/GitHub/superpowers/drill && git rev-parse HEAD

2. None of these excluded paths exist under evals/: .git/, .venv/,
results/, .env, __pycache__/, *.egg-info/, .private-journal/, *.pyc.

3. Every non-excluded file in drill has a SHA-256-identical
counterpart in evals/, and there are no extra files in evals/.

4. The pyproject.toml, uv.lock, scenarios/*.yaml, backends/*.yaml,
setup_helpers/*.py, drill/*.py, prompts/*.md, fixtures/, bin/, and
docs/ are all present.

Report each check with PASS/FAIL. If any FAIL, dump enough detail
that the parent can fix.

If the subagent reports any FAIL, fix the underlying issue (delete the leaked file, re-rsync, etc.) before continuing.

Task 5: Add `SUPERPOWERS_ROOT` default helper

Files:

Modify: evals/drill/cli.py:11-14

Step 1: Read the current cli.py header

sed -n '1,20p' /Users/jesse/Documents/GitHub/superpowers/superpowers/evals/drill/cli.py

Expected output:

"""Drill CLI: run, compare, list."""

from __future__ import annotations

import secrets
from pathlib import Path

import click
from dotenv import load_dotenv

PROJECT_ROOT: Path = Path(__file__).parent.parent

load_dotenv(PROJECT_ROOT / ".env")

Step 2: Write a failing test for the helper

Open evals/tests/test_cli.py and add these tests at the end:

def test_set_superpowers_root_default_when_unset(monkeypatch, tmp_path):
    """When SUPERPOWERS_ROOT is unset, helper sets it to PROJECT_ROOT.parent."""
    monkeypatch.delenv("SUPERPOWERS_ROOT", raising=False)
    from drill.cli import _set_superpowers_root_default, PROJECT_ROOT

    _set_superpowers_root_default()

    import os
    assert os.environ["SUPERPOWERS_ROOT"] == str(PROJECT_ROOT.parent)


def test_set_superpowers_root_default_respects_existing(monkeypatch):
    """When SUPERPOWERS_ROOT is already set, helper does not override."""
    monkeypatch.setenv("SUPERPOWERS_ROOT", "/custom/path")
    from drill.cli import _set_superpowers_root_default

    _set_superpowers_root_default()

    import os
    assert os.environ["SUPERPOWERS_ROOT"] == "/custom/path"

Step 3: Run the tests and watch them fail

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
uv run pytest tests/test_cli.py -k set_superpowers_root_default -v

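Expected: 2 tests fail with ImportError: cannot import name '_set_superpowers_root_default' from 'drill.cli' (the tests use a from-import, so the missing name surfaces as an ImportError rather than an AttributeError).

Step 4: Add the helper to cli.py

Edit /Users/jesse/Documents/GitHub/superpowers/superpowers/evals/drill/cli.py. Replace lines 1–14 with: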

"""Drill CLI: run, compare, list."""

from __future__ import annotations

import os
import secrets
from pathlib import Path

import click
from dotenv import load_dotenv

PROJECT_ROOT: Path = Path(__file__).parent.parent

load_dotenv(PROJECT_ROOT / ".env")


def _set_superpowers_root_default() -> None:
    """Default SUPERPOWERS_ROOT to the parent of evals/ if not already set.

    Drill historically required contributors to export SUPERPOWERS_ROOT
    pointing at the superpowers checkout. After lifting drill into
    superpowers/evals/, the parent of PROJECT_ROOT is always the
    superpowers root, so we can supply this default automatically.

    Existing SUPERPOWERS_ROOT environment values are respected as overrides.
    """
    os.environ.setdefault("SUPERPOWERS_ROOT", str(PROJECT_ROOT.parent))


_set_superpowers_root_default()

The bottom-of-module call to _set_superpowers_root_default() runs at import time, immediately after load_dotenv(). This ensures that engine.py and setup.py (which read os.environ["SUPERPOWERS_ROOT"] directly) and the YAML interpolation (which reads os.environ when the backend YAML is loaded) all see the value.
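
One property of setdefault is worth keeping in mind: it only checks whether the key exists, so a variable exported as an empty string still counts as set. The sketch below illustrates this on a plain dict standing in for os.environ; apply_root_default and the /repo path are placeholders invented for the illustration.

```python
def apply_root_default(env: dict[str, str], default_root: str) -> str:
    """Apply the same setdefault call the helper uses, on a plain mapping."""
    return env.setdefault("SUPERPOWERS_ROOT", default_root)


unset_env: dict[str, str] = {}
custom_env = {"SUPERPOWERS_ROOT": "/custom/path"}
empty_env = {"SUPERPOWERS_ROOT": ""}

print(apply_root_default(unset_env, "/repo"))        # /repo
print(apply_root_default(custom_env, "/repo"))       # /custom/path
print(repr(apply_root_default(empty_env, "/repo")))  # ''
```

If an empty value should also fall back to the default, the helper would need an explicit truthiness check instead of setdefault; the plan as written does not require that.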

Step 5: Run the tests and watch them pass

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
uv run pytest tests/test_cli.py -k set_superpowers_root_default -v

Expected: 2 tests pass.

Step 6: Commit

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git add evals/drill/cli.py evals/tests/test_cli.py
git commit -m "evals: default SUPERPOWERS_ROOT to parent of evals/ if unset

Adds _set_superpowers_root_default() to drill/cli.py, called at
module import after load_dotenv(). PROJECT_ROOT resolves to evals/
post-lift; its parent is the superpowers repo root, which is the
correct value for SUPERPOWERS_ROOT.

Existing env values are respected as overrides via os.environ.setdefault.

Tests:
- helper sets default when var is unset
- helper does not override when var is already set"

Task 6: Update backend YAMLs to reflect the new env contract

Files:

Modify: evals/backends/codex.yaml (drop SUPERPOWERS_ROOT from required_env)
Modify: evals/backends/gemini.yaml (drop SUPERPOWERS_ROOT from required_env)

The five claude*.yaml backend configs interpolate ${SUPERPOWERS_ROOT} into args for the --plugin-dir flag, so they keep SUPERPOWERS_ROOT in required_env: the interpolation needs it. The codex/gemini configs only listed it for the os.environ reads in engine.py/setup.py, which the helper now satisfies.
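
To make the distinction concrete, here is a small sketch of why an interpolating config depends on the variable. Drill's actual loader is not shown in this plan, so string.Template is used as a stand-in technique, and interpolate_args and missing_required are hypothetical helpers.

```python
from string import Template


def interpolate_args(args: list[str], env: dict[str, str]) -> list[str]:
    """Expand ${NAME} placeholders; raises KeyError if a name is absent from env."""
    return [Template(arg).substitute(env) for arg in args]


def missing_required(required_env: list[str], env: dict[str, str]) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in required_env if not env.get(name)]


# Assumed shape of a claude-style backend: the args reference the variable.
claude_args = ["--plugin-dir", "${SUPERPOWERS_ROOT}"]
print(interpolate_args(claude_args, {"SUPERPOWERS_ROOT": "/repo"}))
# ['--plugin-dir', '/repo']
```

With the default helper in place the variable is always present by the time a backend loads, so keeping it in required_env for the claude configs acts as a guard rather than an extra setup step.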

Step 1: Confirm current state

grep -A3 'required_env:' /Users/jesse/Documents/GitHub/superpowers/superpowers/evals/backends/codex.yaml
grep -A2 'required_env:' /Users/jesse/Documents/GitHub/superpowers/superpowers/evals/backends/gemini.yaml

Expected: outputs include - SUPERPOWERS_ROOT lines.

Step 2: Read codex.yaml fully

cat /Users/jesse/Documents/GitHub/superpowers/superpowers/evals/backends/codex.yaml

Step 3: Edit codex.yaml: drop the - SUPERPOWERS_ROOT line under required_env

Open evals/backends/codex.yaml and find:

required_env:
  - OPENAI_API_KEY
  - SUPERPOWERS_ROOT

Replace with:

required_env:
  - OPENAI_API_KEY

Step 4: Edit gemini.yaml: drop the - SUPERPOWERS_ROOT line under required_env

Open evals/backends/gemini.yaml and find:

required_env:
  - SUPERPOWERS_ROOT

Replace with:

required_env: []

(Empty list rather than dropping the field, so YAML schema validation doesn’t trip.)

Step 5: Run drill’s pytest suite to ensure nothing broke

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
uv run pytest -x 2>&1 | tail -20

Expected: all tests pass. If tests/test_backend.py complains about required_env membership for codex/gemini, see Task 7.

Step 6: Commit

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git add evals/backends/codex.yaml evals/backends/gemini.yaml
git commit -m "evals: drop SUPERPOWERS_ROOT from codex/gemini required_env

These backends only read SUPERPOWERS_ROOT via engine.py/setup.py's
os.environ access, which the new cli.py default helper supplies
automatically. claude*.yaml keep SUPERPOWERS_ROOT in required_env
because they interpolate \${SUPERPOWERS_ROOT} into --plugin-dir args."

Task 7: Update drill’s pytest suite for the new contract

Files:

Modify: evals/tests/test_backend.py (per-test updates if Task 6 step 5 surfaced failures)

Step 1: Run the test suite

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
uv run pytest tests/test_backend.py -v 2>&1 | tail -30

If all tests pass, skip the remaining steps (there is nothing to commit) and move on to Task 8. Otherwise:

Step 2: Read failing tests

For each failure, open the test in evals/tests/test_backend.py and read the assertion.

Step 3: Update assertions

For tests that assert SUPERPOWERS_ROOT membership in codex.yaml’s or gemini.yaml’s required_env: invert the assertion to confirm absence. Example:

# Before:
def test_codex_requires_superpowers_root():
    backend = load_backend("codex")
    assert "SUPERPOWERS_ROOT" in backend.required_env

# After:
def test_codex_does_not_require_superpowers_root():
    """codex.yaml dropped SUPERPOWERS_ROOT from required_env;
    the cli.py helper supplies the default."""
    backend = load_backend("codex")
    assert "SUPERPOWERS_ROOT" not in backend.required_env

Step 4: Re-run the test suite

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
uv run pytest -x 2>&1 | tail -10

Expected: all tests pass.

Step 5: Commit (only if step 1 had failures)

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git add evals/tests/test_backend.py
git commit -m "evals: update test_backend.py for relaxed required_env contract"

Task 8: Update evals/README.md and evals/CLAUDE.md

Files:

Modify: evals/README.md (drop SUPERPOWERS_ROOT setup step)
Modify: evals/CLAUDE.md (drop SUPERPOWERS_ROOT setup step)

Step 1: Edit evals/README.md

Find the section that looks like:

Required environment:
```bash
export SUPERPOWERS_ROOT=/path/to/superpowers
export ANTHROPIC_API_KEY=sk-...
```

Replace with:

Required environment:
```bash
export ANTHROPIC_API_KEY=sk-...
```

SUPERPOWERS_ROOT defaults to the parent of evals/ (the superpowers repo root) and only needs to be set if you’re running drill against a different superpowers checkout.

Step 2: Edit evals/CLAUDE.md

Find the section:

## Required env

SUPERPOWERS_ROOT=/path/to/superpowers ANTHROPIC_API_KEY=sk-…

Replace with:

## Required env

ANTHROPIC_API_KEY=sk-…

`SUPERPOWERS_ROOT` defaults to the parent of `evals/` (the superpowers repo root). Override only if running drill against a different superpowers checkout.

Step 3: Commit

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git add evals/README.md evals/CLAUDE.md
git commit -m "evals: drop SUPERPOWERS_ROOT setup step from README/CLAUDE

The cli.py helper now defaults the env var. Mention as override only."

Task 9: Validate from the new location

Files: none (validation only; no commit unless something needs fixing)

Step 1: Run drill’s full pytest suite

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
unset SUPERPOWERS_ROOT
uv run pytest 2>&1 | tail -5

Expected: all tests pass. The unset ensures we’re testing the helper, not an inherited env var.

Step 2: Run drill list

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
unset SUPERPOWERS_ROOT
uv run drill list 2>&1 | head -10

Expected: scenario list, no error about missing SUPERPOWERS_ROOT.

Step 3: Source the env file

set -a
source /Users/jesse/Documents/GitHub/prime-radiant-inc/sprout/.env
set +a
echo "ANTHROPIC_API_KEY set: ${ANTHROPIC_API_KEY:+yes}"

Expected: ANTHROPIC_API_KEY set: yes.

Step 4: Run a cheap drill scenario

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
unset SUPERPOWERS_ROOT
uv run drill run triggering-test-driven-development -b claude 2>&1 | tail -3

Expected: claude: 1 passed, 0 failed, 0 errors.

If the run fails, debug before continuing. The path-defaults change is the most likely culprit; check that the helper actually fired by temporarily adding a print(os.environ["SUPERPOWERS_ROOT"]) after the helper call.

Task 10: Bash test deletion phase: per-file with subagent gate

This task has many sub-steps because each candidate-deletion file gets its own subagent verification + commit. The candidate list comes from the spec’s coverage map. For each entry below:

1. Read the bash test file.
2. Read the candidate drill scenario YAML.
3. Dispatch a subagent with both contents and the comparison prompt.
4. The subagent reports a per-assertion match table.
5. If every bash assertion has a match: delete the bash test, commit.
6. If any assertion is unmatched: stop, escalate, do not delete.

Subagent prompt template (use for every deletion):

You are gating a bash test deletion. The bash test is allegedly
covered by a drill scenario; your job is to verify that claim.

BASH TEST: <paste full contents of bash test>

DRILL SCENARIO: <paste full contents of drill scenario YAML>

Output a markdown table with columns: BASH ASSERTION, DRILL CHECK,
STATUS. List EVERY assertion the bash test makes (every grep, every
[ ], every test command, every PASS/FAIL emit). For each, find a
matching drill check (in verify.assertions or verify.criteria) or
mark as UNMATCHED.

After the table, output "VERDICT: SAFE TO DELETE" if every bash
assertion has a match, otherwise "VERDICT: KEEP — N unmatched
assertions". Be conservative: if you are uncertain about a match,
mark as UNMATCHED.
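
Because every deletion hinges on the verdict line, it can help to read it mechanically rather than by eye. The following is a conservative sketch, assuming the subagent follows the template's output format literally; is_safe_to_delete is a hypothetical helper and not part of drill.

```python
import re

# Matches a verdict only when it starts a line, as the template requests.
VERDICT_RE = re.compile(r"^VERDICT:\s*(SAFE TO DELETE|KEEP)\b", re.MULTILINE)


def is_safe_to_delete(report: str) -> bool:
    """Return True only for exactly one SAFE verdict and no UNMATCHED rows."""
    verdicts = VERDICT_RE.findall(report)
    if verdicts != ["SAFE TO DELETE"]:
        # No verdict, a KEEP verdict, or several verdicts: do not delete.
        return False
    # A table row marked UNMATCHED contradicts a SAFE verdict; trust the row.
    return "UNMATCHED" not in report
```

Anything the parser cannot recognise, such as a verdict wrapped in bold markers, is treated as KEEP, which matches the "be conservative" instruction given to the subagent.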

Task 10a: Skill-triggering prompts (6 files)

Files:

Delete: tests/skill-triggering/prompts/dispatching-parallel-agents.txt
Delete: tests/skill-triggering/prompts/executing-plans.txt
Delete: tests/skill-triggering/prompts/requesting-code-review.txt
Delete: tests/skill-triggering/prompts/systematic-debugging.txt
Delete: tests/skill-triggering/prompts/test-driven-development.txt
Delete: tests/skill-triggering/prompts/writing-plans.txt
Keep: tests/skill-triggering/run-test.sh, run-all.sh

These prompt files are inputs to the bash runner; they don’t have their own assertions. The runner script does the assertion. Map each prompt to its drill scenario:

Prompt	Drill scenario
dispatching-parallel-agents.txt	triggering-dispatching-parallel-agents.yaml
executing-plans.txt	triggering-executing-plans.yaml
requesting-code-review.txt	triggering-requesting-code-review.yaml
systematic-debugging.txt	triggering-systematic-debugging.yaml
test-driven-development.txt	triggering-test-driven-development.yaml
writing-plans.txt	triggering-writing-plans.yaml

Step 1: For each prompt file, dispatch the subagent

For prompt tests/skill-triggering/prompts/<name>.txt and scenario evals/scenarios/triggering-<name>.yaml, run the subagent prompt template with both contents pasted in. The subagent’s job is to verify that the prompt content matches what the drill scenario’s turns[].intent describes.

If all 6 verify SAFE TO DELETE, proceed to step 2. If any verifies KEEP, that one stays and the rest may still proceed.

Step 2: Verify the runner is still useful for unrelated cases

ls /Users/jesse/Documents/GitHub/superpowers/superpowers/tests/skill-triggering/prompts/

If the prompts/ directory would be empty after the planned deletions, also delete tests/skill-triggering/run-test.sh and run-all.sh (they have nothing to run). Otherwise keep the runner.
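
Since the ls above runs before anything is deleted, "empty after the planned deletions" has to be worked out by subtraction. A minimal sketch of that check follows; remaining_prompts is a hypothetical helper, and verified_safe is assumed to hold only the file names whose subagent verdict was SAFE TO DELETE.

```python
from pathlib import Path


def remaining_prompts(prompts_dir: Path, verified_safe: set[str]) -> list[str]:
    """List prompt files that would survive once the verified ones are removed."""
    return sorted(
        entry.name
        for entry in prompts_dir.iterdir()
        if entry.name not in verified_safe
    )
```

The runner is orphaned only when this returns an empty list; a prompt that came back KEEP, or any prompt outside the six listed, keeps the runner alive.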

Step 3: Delete and commit

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git rm tests/skill-triggering/prompts/dispatching-parallel-agents.txt
git rm tests/skill-triggering/prompts/executing-plans.txt
git rm tests/skill-triggering/prompts/requesting-code-review.txt
git rm tests/skill-triggering/prompts/systematic-debugging.txt
git rm tests/skill-triggering/prompts/test-driven-development.txt
git rm tests/skill-triggering/prompts/writing-plans.txt
# If runner is now orphaned:
git rm tests/skill-triggering/run-test.sh tests/skill-triggering/run-all.sh
rmdir tests/skill-triggering/prompts/ 2>/dev/null || true
rmdir tests/skill-triggering/ 2>/dev/null || true
git commit -m "tests: remove skill-triggering bash prompts (covered by drill triggering-* scenarios)

Subagent verification confirmed each prompt's intent matches its
corresponding drill scenario's turns[].intent. Drill scenarios are
canonical; bash runner has no remaining prompts to drive."

Task 10b: explicit-skill-requests (selective deletion)

Files:

Inspect: 6 files in tests/explicit-skill-requests/
Delete: only those verified to be 100% covered by drill scenarios
Keep: the rest

Per the spec’s updated coverage map, most of these have no drill counterpart. The likely-deletable ones:

Bash test	Candidate drill scenario	Likely outcome
`run-test.sh`	n/a (runner)	KEEP
`run-all.sh`	n/a (runner)	KEEP
`run-claude-describes-sdd.sh`	`mid-conversation-skill-invocation.yaml`	likely DELETE; verify
`run-haiku-test.sh`	none (Haiku-specific)	KEEP
`run-multiturn-test.sh`, `run-extended-multiturn-test.sh`	none	KEEP
`prompts/please-use-brainstorming.txt`, `prompts/use-systematic-debugging.txt`	none	KEEP

Step 1: Read each .sh file and prompt to confirm

for f in /Users/jesse/Documents/GitHub/superpowers/superpowers/tests/explicit-skill-requests/*.sh /Users/jesse/Documents/GitHub/superpowers/superpowers/tests/explicit-skill-requests/prompts/*.txt; do
  echo "=== $f ==="
  head -30 "$f"
done

Step 2: Dispatch subagent for run-claude-describes-sdd.sh only

Use the subagent prompt template above with:

Bash test content: tests/explicit-skill-requests/run-claude-describes-sdd.sh
Drill scenario: evals/scenarios/mid-conversation-skill-invocation.yaml

Step 3: Act on subagent verdict

If SAFE TO DELETE:

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git rm tests/explicit-skill-requests/run-claude-describes-sdd.sh
git commit -m "tests: remove run-claude-describes-sdd.sh (covered by drill mid-conversation-skill-invocation)

Subagent verification: every assertion matches a drill check.
Other tests in tests/explicit-skill-requests/ are preserved
(run-haiku-test.sh, run-*-multiturn-test.sh, please-use-brainstorming
and use-systematic-debugging prompts have no drill coverage)."

If KEEP: skip the deletion and document the gap as a future drill-scenario authoring task.

Task 10c: subagent-driven-dev real-project tests

Files:

Inspect: tests/subagent-driven-dev/go-fractals/, tests/subagent-driven-dev/svelte-todo/
Candidate scenarios: evals/scenarios/sdd-go-fractals.yaml, evals/scenarios/sdd-svelte-todo.yaml

These are entire fixture directories with design.md, plan.md, scaffold.sh. Each fixture directory was lifted into drill as a fixture under evals/fixtures/.

Step 1: Confirm drill has fixture parity

ls /Users/jesse/Documents/GitHub/superpowers/superpowers/evals/fixtures/sdd-go-fractals/
ls /Users/jesse/Documents/GitHub/superpowers/superpowers/evals/fixtures/sdd-svelte-todo/

Expected: each contains design.md, plan.md, scaffold.sh (or equivalent) matching the source under tests/subagent-driven-dev/.

Step 2: Dispatch subagent for each pair

Subagent prompt: same template, with the bash “test” being the directory’s scaffold.sh and (if present) any *.sh runner, and the drill scenario being the corresponding sdd-*.yaml.

Step 3: Act on verdicts

For each that returns SAFE TO DELETE:

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git rm -r tests/subagent-driven-dev/go-fractals/   # or svelte-todo
git commit -m "tests: remove subagent-driven-dev/<fixture> (covered by drill sdd-<fixture>)

Subagent verification: drill scenario asserts test suite passes
post-execution. Fixture content lives at evals/fixtures/sdd-<fixture>/."

If both directories are removed and tests/subagent-driven-dev/ is left behind empty, remove it with rmdir (git does not track empty directories, so there is nothing left to git rm).

Task 10d: tests/claude-code/test-document-review-system.sh

Candidate scenario: evals/scenarios/spec-reviewer-catches-planted-flaws.yaml

Step 1: Dispatch subagent

Subagent prompt template with the bash test content and the drill scenario YAML.

Step 2: Act on verdict

If SAFE TO DELETE:

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git rm tests/claude-code/test-document-review-system.sh
git commit -m "tests: remove test-document-review-system.sh (covered by drill spec-reviewer-catches-planted-flaws)

Subagent verification: every assertion matches a drill check."

Task 10e: tests/claude-code/test-requesting-code-review.sh

Candidate scenario: evals/scenarios/code-review-catches-planted-bugs.yaml

Step 1: Dispatch subagent

Subagent prompt template with both contents.

Step 2: Act on verdict

If SAFE TO DELETE:

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git rm tests/claude-code/test-requesting-code-review.sh
git commit -m "tests: remove test-requesting-code-review.sh (covered by drill code-review-catches-planted-bugs)

Subagent verification: every assertion matches a drill check."

Task 10f: tests/claude-code/test-worktree-native-preference.sh

Candidate scenario: evals/scenarios/worktree-creation-under-pressure.yaml

Step 1: Dispatch subagent

Subagent prompt template with both contents.

Step 2: Act on verdict

If SAFE TO DELETE:

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git rm tests/claude-code/test-worktree-native-preference.sh
git commit -m "tests: remove test-worktree-native-preference.sh (covered by drill worktree-creation-under-pressure)

Subagent verification: every assertion matches a drill check."

Task 10g: tests/claude-code/test-subagent-driven-development-integration.sh

Candidate scenario: evals/scenarios/sdd-rejects-extra-features.yaml (partial)

The spec marks this as “almost certainly keep + extend drill scenario”. Don’t delete. Instead:

Step 1: Dispatch subagent for the comparison anyway

This documents the gap explicitly.

Step 2: Decide based on subagent output

Likely outcome: KEEP with documented gap. The bash test asserts: commit_count >= 3, npm test passes, runs analyze-token-usage.py. The drill scenario asserts forbidden-exports + reviewer-as-gate. These are mostly disjoint.

Step 3: Document the gap (if KEEP)

Add a comment at the top of tests/claude-code/test-subagent-driven-development-integration.sh:

# Drill coverage: sdd-rejects-extra-features.yaml covers the YAGNI
# enforcement (forbidden exports + reviewer-as-gate). This bash test
# additionally asserts: ≥3 task commits, npm test passes, token
# analysis runs. Keep until those assertions are added to drill or
# explicitly retired.

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git add tests/claude-code/test-subagent-driven-development-integration.sh
git commit -m "tests: annotate SDD integration test with drill coverage notes

Drill scenario sdd-rejects-extra-features covers the YAGNI subset.
This bash test adds: ≥3 commits, npm test, token analysis. Kept
until drill scenario covers those or they're retired."

Task 10h: tests/claude-code/test-subagent-driven-development.sh

This is a meta/describe-skill test (per spec). No drill scenario covers describe-skill behavior.

Step 1: Confirm by reading the file

cat /Users/jesse/Documents/GitHub/superpowers/superpowers/tests/claude-code/test-subagent-driven-development.sh

Expected: tests asking the agent to describe SDD skills, not exercise them.

Step 2: KEEP and annotate

Add at the top:

# No drill coverage: this test asks the agent to *describe* SDD
# (asserts that asked-about skills can be summarized correctly).
# Drill scenarios test behavior, not description. Kept.

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git add tests/claude-code/test-subagent-driven-development.sh
git commit -m "tests: annotate SDD describe-skill test with kept-by-design note

Tests agent's ability to *describe* the SDD skill — drill scenarios
test behavior, not description. No drill coverage; kept by design."

Task 11: Stale-reference scrub

Files:

Possibly modify: docs/testing.md, README.md, CLAUDE.md, lefthook.yml, .opencode/INSTALL.md, .codex-plugin/INSTALL.md, .github/*, scripts/*
Annotate (do not rewrite): RELEASE-NOTES.md, docs/superpowers/plans/*.md

Step 1: Build list of deleted-file paths

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git diff --name-only --diff-filter=D dev..HEAD | sort > /tmp/deleted-paths.txt
cat /tmp/deleted-paths.txt

Step 2: Search for active references

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
while read -r path; do
  echo "=== $path ==="
  grep -rlF -- "$path" \
    --include="*.md" \
    --include="*.yml" \
    --include="*.yaml" \
    --include="*.sh" \
    --include="*.json" \
    --exclude-dir=node_modules \
    --exclude-dir=.venv \
    --exclude-dir=evals \
    --exclude-dir=.git \
    .
done < /tmp/deleted-paths.txt

This finds every reference to a deleted file. Categorize each hit:

Hit location	Treatment
`docs/testing.md`	Update (actively documents the test)
`README.md` (Contributing section)	Update if it points at deleted tests
`CLAUDE.md`, `GEMINI.md`, `AGENTS.md`	Update if they reference deleted tests
`.github/workflows/*.yml`	Update (CI shouldn’t try to run deleted tests)
`scripts/*`	Update if they run deleted tests
`.opencode/INSTALL.md`, `.codex-plugin/INSTALL.md`	Update if they reference deleted tests
`lefthook.yml`	Update if hooks invoke deleted tests
`RELEASE-NOTES.md`	Annotate, don’t rewrite (dated artifact)
`docs/superpowers/plans/*.md`	Annotate, don’t rewrite (dated artifact)
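
The table reduces to one rule: dated artifacts are annotated, everything else is reviewed for an update. A small sketch of that triage is shown below; treatment_for and ANNOTATE_PATTERNS are hypothetical names, and the patterns are copied from the last two rows of the table.

```python
from fnmatch import fnmatch

# Dated artifacts: annotate in place, never rewrite.
ANNOTATE_PATTERNS = ["RELEASE-NOTES.md", "docs/superpowers/plans/*.md"]


def treatment_for(hit_path: str) -> str:
    """Classify one grep hit as 'annotate' or 'update'."""
    # grep run against "." prefixes every hit with "./"; strip it first.
    normalized = hit_path.removeprefix("./")
    if any(fnmatch(normalized, pattern) for pattern in ANNOTATE_PATTERNS):
        return "annotate"
    return "update"


for hit in ["./RELEASE-NOTES.md", "./docs/testing.md"]:
    print(hit, "->", treatment_for(hit))
```

An "update" result still calls for judgment per the table (remove the reference or point it at a drill scenario); the sketch only separates the two buckets.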

Step 3: Update active references

For each “Update” hit, edit the file to either:

Remove the reference if the deleted test was the only reason it was named.
Replace it with a pointer to the drill scenario (e.g., “see evals/scenarios/triggering-test-driven-development.yaml”).

Step 4: Annotate dated artifacts

For each RELEASE-NOTES.md or docs/superpowers/plans/*.md hit, add an inline annotation at the first hit per file:

> Note: this section references `tests/skill-triggering/run-all.sh` and
> related bash tests that were lifted into drill scenarios on 2026-05-06
> (see `evals/scenarios/triggering-*.yaml`). The references are
> preserved as dated artifacts of the work this doc describes.

Don’t modify the actual references; they’re historical.

Step 5: Dispatch subagent for second-pass scrub

Dispatch a general-purpose subagent:

Working directory: /Users/jesse/Documents/GitHub/superpowers/superpowers

These bash test paths were deleted on the current branch; some are
already addressed, but I want a second pair of eyes:

<paste contents of /tmp/deleted-paths.txt>

Search the entire superpowers tree (excluding evals/, node_modules/,
.venv/, .git/) for any remaining references to those paths. Report
every hit with file:line and one-sentence judgment of whether it
needs an update or is fine as-is. Do not modify files; just report.

Address every reported hit before continuing.

Step 6: Commit the active updates

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git add -u  # picks up edits to existing files
git commit -m "docs: update references to lifted-and-deleted bash tests

Active references in docs/testing.md, README.md, CI workflows, etc.
now point at drill scenarios. Historical references in RELEASE-NOTES.md
and docs/superpowers/plans/*.md are annotated as dated artifacts,
not rewritten."

Task 12: Top-level docs

Files:

Modify: docs/testing.md (split into “Plugin tests” + “Skill behavior evals”)
Modify: CLAUDE.md (add evals pointer)
Modify: README.md (add Contributing-section pointer)
Modify: .gitignore (add evals/results/, evals/.venv/, evals/.env)

Step 1: Split docs/testing.md

The file is currently Claude-Code-centric. Split into two top-level sections.

Open /Users/jesse/Documents/GitHub/superpowers/superpowers/docs/testing.md and replace the file content with this structure (preserve the existing Plugin-test details where applicable):

# Testing Superpowers

Superpowers has two distinct kinds of tests, each in its own directory:

- **`tests/`** — does the plugin's non-LLM code work? Bash + node + python integration tests for brainstorm-server JS, OpenCode plugin loading, codex-plugin sync, and analysis utilities.
- **`evals/`** — do agents behave correctly on real LLM sessions? Python harness driving real tmux sessions of Claude Code / Codex / Gemini CLI / Copilot CLI, with an LLM actor and verifier judging skill compliance.

## Plugin tests

Live in `tests/`. Currently:

- `tests/brainstorm-server/` — node test suite for the brainstorm server JS code.
- `tests/opencode/` — bash tests for OpenCode plugin loading, bootstrap caching, and tool registration.
- `tests/codex-plugin-sync/` — bash sync verification.
- `tests/claude-code/test-helpers.sh`, `analyze-token-usage.py` — utilities used by remaining bash tests.
- `tests/claude-code/test-subagent-driven-development.sh` — agent-can-describe-SDD test (no drill counterpart).
- `tests/claude-code/test-subagent-driven-development-integration.sh` — extended SDD integration with token analysis (drill covers the YAGNI subset).
- `tests/explicit-skill-requests/` — Haiku-specific, multi-turn, and skill-name-prompted tests not covered by drill.

Run plugin tests via the relevant directory's `run-*.sh` or `npm test`.

## Skill behavior evals

Live in `evals/`. Drill is the harness; scenarios live at `evals/scenarios/*.yaml`. See `evals/README.md` for setup. Quick start:

```bash
cd evals
uv sync
export ANTHROPIC_API_KEY=sk-...
uv run drill run triggering-test-driven-development -b claude
```

Drill scenarios are slow (3-30+ minutes each) and run real LLM sessions. They are not part of CI today; the natural follow-up is a tiered model (fast subset on PR, full sweep nightly + on-demand).

Step 2: Update CLAUDE.md

Read the current CLAUDE.md, find a spot near the project structure section, and add:

```markdown
## Eval harness

Skill-behavior evals live at `evals/` — see `evals/README.md`. Drill (the harness) drives real tmux sessions of Claude Code / Codex / Gemini CLI / Copilot CLI and judges skill compliance with an LLM verifier. Plugin-infrastructure tests still live at `tests/`.
```

Step 3: Update README.md

Find the Contributing section. Add a line:

- Skill-behavior tests use the eval harness at `evals/`. See `evals/README.md` for setup. Plugin-infrastructure tests live at `tests/` and run via the relevant `run-*.sh` or `npm test`.

Step 4: Update the top-level .gitignore

Open /Users/jesse/Documents/GitHub/superpowers/superpowers/.gitignore and add at the bottom:

# Eval harness — drill ships its own gitignore at evals/.gitignore;
# these are belt-and-suspenders entries for tools that don't recurse.
evals/results/
evals/.venv/
evals/.env
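
A quick way to confirm the three entries actually landed is to scan the file for them. This is a minimal Python sketch under stated assumptions: `missing_ignore_entries` and `REQUIRED_ENTRIES` are hypothetical names, and the check only looks for each entry as a literal line of its own. It does not evaluate gitignore pattern semantics, so a broader pattern that already covers these paths would still be reported as missing.

```python
# Entries this step adds to the top-level .gitignore.
REQUIRED_ENTRIES = ("evals/results/", "evals/.venv/", "evals/.env")


def missing_ignore_entries(gitignore_text, required=REQUIRED_ENTRIES):
    """Return the required entries that do not appear as their own non-comment line."""
    present = set()
    for raw_line in gitignore_text.splitlines():
        line = raw_line.strip()
        # Blank lines and comment lines never count as entries.
        if line and not line.startswith("#"):
            present.add(line)
    return [entry for entry in required if entry not in present]
```

Passing the contents of .gitignore to `missing_ignore_entries` should return an empty list once this step is done.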

Step 5: Commit

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git add docs/testing.md CLAUDE.md README.md .gitignore
git commit -m "docs: introduce evals/ as the canonical skill-behavior eval harness

- docs/testing.md split into Plugin tests + Skill behavior evals
- CLAUDE.md adds Eval harness section pointing at evals/
- README.md Contributing section mentions evals/ alongside tests/
- .gitignore adds evals/{results,.venv,.env} as belt-and-suspenders
  (evals/.gitignore covers these locally; root-level entries help
  tooling that does not recurse into nested ignore files)."

Task 13: Re-run smoke checks (regression gate)

Files: none (validation only)

Step 1: Run drill’s pytest

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
unset SUPERPOWERS_ROOT
uv run pytest 2>&1 | tail -5

Expected: all tests pass.

Step 2: Run a cheap drill scenario

set -a
source /Users/jesse/Documents/GitHub/prime-radiant-inc/sprout/.env
set +a
cd /Users/jesse/Documents/GitHub/superpowers/superpowers/evals
unset SUPERPOWERS_ROOT
uv run drill run triggering-test-driven-development -b claude 2>&1 | tail -3

Expected: claude: 1 passed, 0 failed, 0 errors. If the scenario fails, the docs, scrub, or deletion phase broke something; bisect over the recent commits.
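
To make this gate mechanical rather than eyeballed, the summary line can be parsed and checked. The sketch below is illustrative only: `parse_summary` and `is_clean` are hypothetical helpers, and the pattern assumes the summary line has exactly the shape shown above (`<backend>: N passed, N failed, N errors`). If drill's real output differs, the pattern needs adjusting.

```python
import re

# Assumed shape of the final summary line, taken from the expected output above.
SUMMARY_PATTERN = re.compile(
    r"^(?P<backend>[\w-]+): (?P<passed>\d+) passed, "
    r"(?P<failed>\d+) failed, (?P<errors>\d+) errors$"
)


def parse_summary(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    match = SUMMARY_PATTERN.match(line.strip())
    if match is None:
        return None
    return {
        "backend": match["backend"],
        "passed": int(match["passed"]),
        "failed": int(match["failed"]),
        "errors": int(match["errors"]),
    }


def is_clean(summary):
    """True only when at least one scenario passed and nothing failed or errored."""
    return (
        summary is not None
        and summary["passed"] > 0
        and summary["failed"] == 0
        and summary["errors"] == 0
    )
```

Treating an unparseable line as not clean is deliberate: a crash or truncated output should stop the plan just like a failed scenario.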

Step 3: Run the plugin tests that survived

cd /Users/jesse/Documents/GitHub/superpowers/superpowers/tests/brainstorm-server
node server.test.js 2>&1 | tail -3

Expected: Results: 25 passed, 0 failed.

Task 14: Final adversarial review

Files: none (review only; subagent dispatches)

Step 1: Build the diff for reviewers

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git log --oneline dev..HEAD
git diff dev..HEAD --stat

Capture both outputs to share with reviewers.

Step 2: Dispatch two parallel subagents

Use the Agent tool with two parallel calls. Send the same prompt to both, with adversarial framing:

Adversarial review competition: 5 points to whoever finds the most
legitimate issues. You're competing against a parallel reviewer
assigned the identical task.

**Branch:** f/evals-lift, in /Users/jesse/Documents/GitHub/superpowers/superpowers
**Base:** dev (currently b4363df)
**Spec:** docs/superpowers/specs/2026-05-06-lift-drill-into-evals-design.md

This branch lifts the obra/drill repo into superpowers/evals/ and
deletes redundant bash tests that drill scenarios cover. Two prior
adversarial reviews caught issues at the spec stage; this is the
post-implementation review.

Run: git log --oneline dev..HEAD; git diff dev..HEAD --stat

Look hard at:
1. Did the rsync-with-excludes actually exclude what it claimed?
   (find evals -name '.git' -type d should return nothing)
2. Does the lift commit message point at a real commit in obra/drill?
3. Does the SUPERPOWERS_ROOT helper actually default correctly when
   the env var is unset? (cd evals && unset SUPERPOWERS_ROOT && uv
   run drill list — does it work?)
4. For each deleted bash test, does the corresponding drill scenario
   actually verify what the bash test asserted? Spot-check by reading
   the scenario YAML.
5. Are there active references in docs/, .github/, scripts/,
   lefthook.yml that still point at deleted bash test paths?
6. Did the drill pytest suite get updated for the new env-var contract,
   and does it pass?
7. Did the smoke scenario actually get run after path changes?
8. Is the drill repo unchanged? (cd ../drill && git status)

Verify before claiming. If you assert "X is broken", check on disk
first. Confidently-wrong claims count negatively.

Report format: numbered list, each with severity (critical/important/
minor/nitpick) and one-sentence explanation with file:line. Lead with
most serious. Cap at ~600 words.

Step 3: Address findings

For each legitimate finding from either reviewer, fix it in a separate commit. Re-run the smoke checks (Task 13) after the fixes.

Step 4: Declare a winner

Per the cross-platform PR pattern, count legitimate findings (false positives count negatively). Acknowledge the winner in your reply summary.
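
The tally can be written down explicitly so the outcome is reproducible. This is a sketch under an assumed scoring rule: the plan only says false positives count negatively, so the one-point-per-finding and minus-one-per-false-positive weights here are an assumption, and `ReviewTally` and `pick_winner` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ReviewTally:
    reviewer: str
    legitimate: int        # findings confirmed on disk
    false_positives: int   # confidently-wrong claims

    @property
    def score(self):
        # Assumed weighting: +1 per legitimate finding, -1 per false positive.
        return self.legitimate - self.false_positives


def pick_winner(tallies):
    """Return the top-scoring reviewer's name, or None for a tie or an empty list."""
    if not tallies:
        return None
    ranked = sorted(tallies, key=lambda tally: tally.score, reverse=True)
    if len(ranked) > 1 and ranked[0].score == ranked[1].score:
        return None
    return ranked[0].reviewer
```

Returning `None` on a tie leaves the call to the human summarizing the review instead of picking arbitrarily.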

Task 15: Push and open PR

Files: none

Step 1: Push the branch

cd /Users/jesse/Documents/GitHub/superpowers/superpowers
git push -u origin f/evals-lift

Step 2: Open a PR against dev with a full description

gh pr create \
  --base dev \
  --head f/evals-lift \
  --reviewer arittr \
  --title "Lift drill into superpowers as evals/ harness" \
  --body "$(cat <<'EOF'
## What problem are you trying to solve?

Drill — the standalone Python skill-compliance benchmark at obra/drill — is already the de facto eval harness for superpowers. The PRI-1397 commit series lifted ~22 bash tests into drill scenarios, and the most recent superpowers commit (a2292c5) explicitly removed a redundant bash test with the message "replaced by drill behavioral coverage". Drill is a sibling repo today, requiring contributors to clone two checkouts and set SUPERPOWERS_ROOT manually. This PR completes the migration: drill becomes superpowers/evals/.

## What does this PR change?

- Lifts the obra/drill repo into superpowers as `evals/`, with explicit rsync excludes (.git, .venv, results, .env, __pycache__, *.egg-info, .private-journal). The lift commit records the source SHA.
- Adds a `_set_superpowers_root_default()` helper to drill/cli.py so SUPERPOWERS_ROOT defaults to the parent of evals/ — no manual env-var setup.
- Drops SUPERPOWERS_ROOT from required_env in codex.yaml/gemini.yaml (the helper supplies it). Claude*.yaml keep it because they interpolate ${SUPERPOWERS_ROOT} into --plugin-dir args.
- Deletes redundant bash tests under tests/skill-triggering/, tests/explicit-skill-requests/, tests/subagent-driven-dev/, and tests/claude-code/ — gated per-file by a subagent that compared each bash test's assertions to its drill scenario's verify block. Anything not 100% covered was kept.
- docs/testing.md split into Plugin tests + Skill behavior evals.
- README.md Contributing and CLAUDE.md gain pointers to evals/.

## Is this change appropriate for the core library?

Yes. Cross-runtime evaluation is core to superpowers, the migration to drill scenarios was already underway in this repo, and the eval harness needs to be discoverable in-tree to be findable.

## What alternatives did you consider?

- Vendored copy + sync script (drill repo continues independently). Rejected: divergence risk; single-source-of-truth wins.
- git subtree merge (preserves drill history in-tree). Rejected: superpowers' git history grows by 50+ commits, the merge commit is ugly, subtrees are operationally heavy.
- Keep drill as a sibling repo and just polish docs. Rejected: doesn't solve the discoverability problem.

## Does this PR contain multiple unrelated changes?

No — every change supports "drill is now evals/ inside superpowers". Multiple commits for atomicity (verbatim copy, env helper, YAML updates, docs) but one direction.

## Existing PRs

- [x] I have reviewed all open AND closed PRs for duplicates or prior art
- Related PRs: #1486 (obra/superpowers cross-platform PR — independent; no shared file changes besides README, which has no overlap)

## Environment tested

| Harness | Version | Model | Model ID |
|---------|---------|-------|----------|
| Claude Code | local install | Opus | claude-opus-4-7 (1M context) |

Drill's own pytest suite passes from the new location. `triggering-test-driven-development` drill scenario passes from `evals/` after the path-default changes. (Larger drill sweep deferred to release-cadence runs per the spec's deferred-CI policy.)

## Evaluation

- Initial prompt: see linked spec (`docs/superpowers/specs/2026-05-06-lift-drill-into-evals-design.md`).
- Drill's own pytest suite passes.
- One drill scenario re-run from the new location end-to-end (proves the SUPERPOWERS_ROOT default works).
- Per-deleted-file subagent verification recorded in each deletion commit's message.

## Rigor

- [x] If this is a skills change: this is not a skills change; it's a tooling/infrastructure migration. No behavior-shaping content modified.
- [x] Adversarial pressure-tested: two parallel reviewers on the spec; final adversarial pre-PR review on the implementation; spec already corrected for findings before implementation began.
- [x] Did not modify carefully-tuned content.

## Human review

- [x] A human has reviewed the COMPLETE proposed diff before submission

## Action items after merge

1. Archive obra/drill on GitHub (mark read-only, add README pointer to obra/superpowers/evals/).
2. The spec lists CI integration, scenario co-location with skills, and Python package rename as deferred work. Open issues for any of these you want tracked.
EOF
)"

Step 3: Confirm the PR opened

gh pr view --web

Expected: the browser opens to the new PR. Take a screenshot or note the URL for follow-up.
