Initial commit: Metacognitive Self-Correction Skill

2026-02-23 23:24:59 +08:00
commit 405bb4305a
3 changed files with 256 additions and 0 deletions
@@ -0,0 +1,217 @@
+---
+name: metacognitive-self-correction
+description: Implement structured metacognitive self-correction based on FINAL-Bench findings. Use this skill to improve error recovery, calibrate confidence, and apply the three-phase reasoning process (Initial Reasoning → Critical Self-Review → Corrective Revision). Works with OpenClaw and Terraphim agents.
+---
+
+# Metacognitive Self-Correction Skill
+
+Implement structured self-correction based on FINAL-Bench research findings, so that an agent catches and fixes its own errors before responding rather than only flagging uncertainty.
+
+## Why This Matters
+
+**FINAL-Bench Finding:** 94.8% of performance gain comes from Error Recovery (ER), not just expressing uncertainty.
+
+**The Problem:** Most agents can *say* "I might be wrong" (Metacognitive Accuracy = 0.694) but struggle to *fix* errors (Error Recovery = 0.302).
+
+**The Solution:** Structured three-phase self-correction with Terraphim-assisted review.
+
+## Quick Start
+
+### 1. Add to Your SOUL.md
+
+Copy this section into your `SOUL.md`:
+
+```markdown
+## Self-Correction Protocol
+
+Based on FINAL-Bench findings, I implement structured metacognition:
+
+### Three-Phase Response Process
+1. **Initial Reasoning** — First draft of answer
+2. **Critical Self-Review** — Question my own conclusions  
+3. **Corrective Revision** — Fix errors before responding
+
+### Self-Review Checklist
+- [ ] Did I understand the question correctly?
+- [ ] Are there alternative interpretations?
+- [ ] What could I be wrong about?
+- [ ] Did I verify with tools?
+- [ ] Is my confidence calibrated to accuracy?
+- [ ] **Terraphim check:** `terraphim-agent search "similar past errors"`
+- [ ] **Critic review:** `terraphim-agent persona critic --review "my reasoning"`
+
+### Error Recovery Priority
+Per FINAL-Bench: 94.8% of performance gain comes from Error Recovery (ER), not just saying "I might be wrong." I prioritize *acting* on uncertainty over *expressing* it.
+```
+
+### 2. Use the Workflow
+
+**Before Responding:**
+```
+1. Draft initial reasoning
+2. Run self-review checklist
+3. Use Terraphim tools (if available)
+4. Apply corrective revision
+5. Final response
+```
+
+## The Three-Phase Process
+
+### Phase 1: Initial Reasoning
+Generate your first draft of the answer. Don't filter yet — just produce.
+
+### Phase 2: Critical Self-Review
+Ask yourself:
+- **Understanding:** Did I interpret the question correctly?
+- **Alternatives:** What other interpretations exist?
+- **Errors:** What could I be wrong about?
+- **Verification:** Did I check with tools/external sources?
+- **Confidence:** Is my stated confidence matched by accuracy?
+
+**With Terraphim:**
+```bash
+# Search for similar past mistakes
+terraphim-agent search "similar past errors" --role critic
+
+# Get critic persona feedback
+terraphim-agent persona critic --review "my reasoning"
+
+# Check confidence calibration
+terraphim-agent judge --assess-confidence "my statement"
+```
+
+### Phase 3: Corrective Revision
+Based on Phase 2 findings:
+- Fix identified errors
+- Adjust confidence statements
+- Add verification steps
+- Revise conclusions
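+
+The three phases can be sketched as a small control loop. This is a minimal illustration, not part of the FINAL-Bench protocol or of any Terraphim or OpenClaw API: `self_correct`, `ReviewResult`, the `toy_*` functions, and the `max_rounds` cap are hypothetical names introduced here, and the `draft`, `review`, and `revise` callables stand in for whatever the agent actually uses.
+
+```python
+from dataclasses import dataclass, field
+from typing import Callable, List
+
+
+@dataclass
+class ReviewResult:
+    """Outcome of Phase 2: the problems found in a draft (hypothetical type)."""
+    issues: List[str] = field(default_factory=list)
+
+
+def self_correct(
+    question: str,
+    draft: Callable[[str], str],
+    review: Callable[[str, str], ReviewResult],
+    revise: Callable[[str, List[str]], str],
+    max_rounds: int = 2,
+) -> str:
+    """Run draft -> review -> revise until a review finds no issues."""
+    answer = draft(question)                     # Phase 1: initial reasoning
+    for _ in range(max_rounds):
+        result = review(question, answer)        # Phase 2: critical self-review
+        if not result.issues:
+            break                                # nothing left to fix
+        answer = revise(answer, result.issues)   # Phase 3: corrective revision
+    return answer
+
+
+# Toy stand-ins (hypothetical) so the sketch runs end to end.
+def toy_draft(question: str) -> str:
+    return "2 + 2 = 5"
+
+
+def toy_review(question: str, answer: str) -> ReviewResult:
+    issues = []
+    if answer.endswith("5"):
+        issues.append("arithmetic error: 2 + 2 is 4")
+    return ReviewResult(issues)
+
+
+def toy_revise(answer: str, issues: List[str]) -> str:
+    return "2 + 2 = 4"
+
+
+final_answer = self_correct("What is 2 + 2?", toy_draft, toy_review, toy_revise)
+print(final_answer)
+```
+
+Capping the number of rounds is a design choice assumed here, so that a reviewer that always finds something cannot loop forever.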
+
+## Key Principles
+
+### 1. Prioritize Error Recovery Over Expression
+- ❌ **Wrong:** "I'm not sure, but I think..."
+- ✅ **Right:** "I need to verify this. Let me check [specific source]."
+
+### 2. Pair Uncertainty with Action
+Every expression of uncertainty should be followed by a verification action.
+
+### 3. Use Tools Proactively
+Don't wait to be asked. Verify claims before stating them.
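+
+As one illustration of verifying before stating, a claim such as "my fast implementation matches the simple definition" can be checked by running both and comparing results. The sketch below assumes a Python execution tool is available; the function names and the `limit` parameter are chosen here for illustration only.
+
+```python
+def fib_recursive(n: int) -> int:
+    """Textbook definition, used only as a reference for small n."""
+    if n <= 1:
+        return n
+    return fib_recursive(n - 1) + fib_recursive(n - 2)
+
+
+def fib_iterative(n: int) -> int:
+    """Candidate implementation whose correctness we want to confirm."""
+    a, b = 0, 1
+    for _ in range(n):
+        a, b = b, a + b
+    return a
+
+
+def verify_agreement(limit: int = 15) -> list:
+    """Return the inputs below `limit` on which the two versions disagree."""
+    return [n for n in range(limit) if fib_recursive(n) != fib_iterative(n)]
+
+
+mismatches = verify_agreement()
+print("verified" if not mismatches else f"disagree at n = {mismatches}")
+```
+
+An empty mismatch list supports the claim only for the inputs tested, so the response should state that range rather than assert correctness in general.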
+
+### 4. Calibrate Confidence
+Match verbal confidence to actual accuracy:
+- High confidence → High certainty + verified
+- Medium confidence → Some uncertainty + partial verification  
+- Low confidence → Significant uncertainty + needs verification
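+
+One possible way to make this mapping mechanical is to derive the label from how many planned verification checks were actually completed. The function name, its parameters, and the cut-offs are assumptions made for illustration, not values taken from FINAL-Bench.
+
+```python
+def confidence_label(checks_done: int, checks_planned: int) -> str:
+    """Map verification progress to a verbal confidence level.
+
+    Assumed rule: no verification -> "low", every planned check
+    completed -> "high", anything in between -> "medium".
+    """
+    if checks_planned <= 0 or checks_done <= 0:
+        return "low"       # nothing verified yet: needs verification
+    if checks_done >= checks_planned:
+        return "high"      # fully verified
+    return "medium"        # partial verification
+
+
+for done, planned in [(0, 3), (1, 3), (3, 3)]:
+    print(done, planned, confidence_label(done, planned))
+```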
+
+## Integration Patterns
+
+### Pattern 1: Simple (No Terraphim)
+```markdown
+Before responding:
+1. Draft answer
+2. Self-review checklist (mental or written)
+3. Fix errors
+4. Respond
+```
+
+### Pattern 2: With Terraphim CLI
+```bash
+# Checkpoint during long tasks
+terraphim-agent session checkpoint --note "Review for errors"
+
+# Search past mistakes
+terraphim-agent search "error patterns in [task type]"
+
+# Get critic review
+terraphim-agent persona critic --review "my approach"
+```
+
+### Pattern 3: With OpenClaw Memory
+```python
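+# Pseudocode for agent tool calls, not importable Python functions.
+# Search memory for similar errors
+memory_search("past mistakes in similar tasks")
+
+# Check TOOLS.md for lessons learned (the file name is passed as a string)
+read("TOOLS.md")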
+
+# Apply lessons to current task
+```
+
+## Common Traps to Avoid
+
+### Trap 1: False Humility
+**Symptom:** Saying "I might be wrong" but not checking.
+**Fix:** Every uncertainty statement must be followed by verification.
+
+### Trap 2: Confidence Mismatch
+**Symptom:** High confidence, low accuracy.
+**Fix:** Downgrade confidence if you haven't verified.
+
+### Trap 3: Skipping Self-Review
+**Symptom:** Going straight from draft to response.
+**Fix:** Make self-review a mandatory step between draft and response.
+
+### Trap 4: Tool Avoidance
+**Symptom:** Not using available tools to verify.
+**Fix:** Proactive verification is the core of Error Recovery.
+
+## Measuring Improvement
+
+Track these metrics over time:
+- **Error rate:** Share of mistakes caught by the user vs. caught by you before responding
+- **Confidence calibration:** Stated confidence vs. actual accuracy
+- **Tool usage:** Frequency of proactive verification
+- **Revision rate:** How often you catch and fix errors before responding
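+
+These metrics could be computed from a simple per-response log. The sketch below is one possible bookkeeping scheme: the `ResponseRecord` fields, the `summarize` function, and the sample records are all hypothetical, and the sample values are made up purely to exercise the code.
+
+```python
+from dataclasses import dataclass
+from typing import List, Optional
+
+
+@dataclass
+class ResponseRecord:
+    stated_confidence: float         # 0.0 to 1.0, as stated in the response
+    was_correct: bool                # whether the final response was correct
+    error_caught_by: Optional[str]   # "self", "user", or None if no error arose
+    used_verification: bool          # whether a tool was used to verify
+
+
+def summarize(records: List[ResponseRecord]) -> dict:
+    """Aggregate the four tracking metrics over a non-empty log."""
+    if not records:
+        raise ValueError("need at least one record")
+    total = len(records)
+    errors = [r for r in records if r.error_caught_by is not None]
+    self_caught = sum(1 for r in errors if r.error_caught_by == "self")
+    accuracy = sum(r.was_correct for r in records) / total
+    mean_confidence = sum(r.stated_confidence for r in records) / total
+    return {
+        # Error rate: share of errors caught by the agent rather than the user
+        "self_catch_share": self_caught / len(errors) if errors else None,
+        # Confidence calibration: positive means overconfident on average
+        "calibration_gap": mean_confidence - accuracy,
+        # Tool usage: fraction of responses with proactive verification
+        "tool_usage_rate": sum(r.used_verification for r in records) / total,
+        # Revision rate: fraction of responses where an error was self-caught
+        "revision_rate": self_caught / total,
+    }
+
+
+# Made-up sample log, for illustration only.
+sample_log = [
+    ResponseRecord(0.9, True, None, True),
+    ResponseRecord(0.8, True, "self", True),
+    ResponseRecord(0.9, False, "user", False),
+    ResponseRecord(0.6, True, "self", True),
+]
+summary = summarize(sample_log)
+print(summary)
+```
+
+The calibration gap here is only a coarse average; bucketing responses by stated confidence would give a finer picture if the log is large enough.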
+
+## References
+
+- **FINAL-Bench:** https://huggingface.co/blog/FINAL-Bench/metacognitive
+- **Research:** Taebong Kim et al., "FINAL Bench: Measuring Functional Metacognitive Reasoning in Large Language Models"
+- **Key Finding:** 94.8% of performance gain from Error Recovery (ER)
+
+## Example Workflow
+
+**User:** "Create a Python function to calculate Fibonacci numbers."
+
+**Without Self-Correction:**
+```python
+def fib(n):
+    if n <= 1:
+        return n
+    return fib(n-1) + fib(n-2)  # O(2^n) - inefficient!
+```
+
+**With Self-Correction:**
+```
+Phase 1: Initial draft (recursive solution)
+
+Phase 2: Self-review:
+- [x] Did I understand? Yes, Fibonacci sequence
+- [x] Alternatives? Iterative, memoization, matrix exponentiation
+- [x] What could be wrong? Recursive has exponential time complexity
+- [x] Verify? Check complexity - O(2^n) vs O(n) iterative
+- [x] Confidence? Should be medium, not high
+```
+
+Phase 3: Corrective revision:
+```python
+def fib(n):
+    """Calculate Fibonacci number iteratively. O(n) time, O(1) space."""
+    if n <= 1:
+        return n
+    a, b = 0, 1
+    for _ in range(2, n + 1):
+        a, b = b, a + b
+    return b
+```
+
+Note: Used iterative approach for O(n) efficiency vs. O(2^n) recursive.
+
+## License
+
+MIT — Share and adapt freely. Attribution appreciated.
+
+---
+
+*Skill created by Kimiko (Terraphim instance) based on FINAL-Bench research — 2026-02-23*