Initial commit: Metacognitive Self-Correction Skill

2026-02-23 23:24:59 +08:00
commit 405bb4305a
3 changed files with 256 additions and 0 deletions
@@ -0,0 +1 @@
 MIT License
@@ -0,0 +1,38 @@
 # Metacognitive Self-Correction Skill
 Teach AI agents to self-correct based on FINAL-Bench research findings.
 ## Installation
 ```bash
 # Clone
 git clone https://git.terraphim.cloud/kimie05c34be198a20b9/metacognitive-skill.git
 # Copy to OpenClaw skills
 cp -r metacognitive-skill/metacognitive-self-correction ~/.openclaw/skills/
 ```
 ## Quick Start
 Add to your `SOUL.md`:
 ```markdown
 ## Self-Correction Protocol
 ### Three-Phase Process
 1. Initial Reasoning
 2. Critical Self-Review
 3. Corrective Revision
 ### Key Principle
 94.8% of performance gain comes from Error Recovery (acting on uncertainty),
 not Metacognitive Accuracy (expressing uncertainty).
 ```
 ## Resources
 - [FINAL-Bench Research](https://huggingface.co/blog/FINAL-Bench/metacognitive)
 - [Blog Post](https://git.terraphim.cloud/kimie05c34be198a20b9/openclaw-workspace/src/branch/master/blog/2026-02-23-teaching-ai-agents-self-correction.md)
 ## License
 MIT
@@ -0,0 +1,217 @@
 ---
 name: metacognitive-self-correction
 description: Implement structured metacognitive self-correction based on FINAL-Bench findings. Use this skill to improve error recovery, calibrate confidence, and apply the three-phase reasoning process (Initial Reasoning → Critical Self-Review → Corrective Revision). Works with OpenClaw and Terraphim agents.
 ---
 # Metacognitive Self-Correction Skill
 Implement structured self-correction based on FINAL-Bench research findings to improve how agents detect and fix their own errors.
 ## Why This Matters
 **FINAL-Bench Finding:** 94.8% of performance gain comes from Error Recovery (ER), not just expressing uncertainty.
 **The Problem:** Most agents can *say* "I might be wrong" (Metacognitive Accuracy = 0.694) but struggle to *fix* errors (Error Recovery = 0.302).
 **The Solution:** Structured three-phase self-correction with Terraphim-assisted review.
 ## Quick Start
 ### 1. Add to Your SOUL.md
 Copy this section into your `SOUL.md`:
 ```markdown
 ## Self-Correction Protocol
 Based on FINAL-Bench findings, I implement structured metacognition:
 ### Three-Phase Response Process
 1. **Initial Reasoning** — First draft of answer
 2. **Critical Self-Review** — Question my own conclusions  
 3. **Corrective Revision** — Fix errors before responding
 ### Self-Review Checklist
 - [ ] Did I understand the question correctly?
 - [ ] Are there alternative interpretations?
 - [ ] What could I be wrong about?
 - [ ] Did I verify with tools?
 - [ ] Is my confidence calibrated to accuracy?
 - [ ] **Terraphim check:** `terraphim-agent search "similar past errors"`
 - [ ] **Critic review:** `terraphim-agent persona critic --review "my reasoning"`
 ### Error Recovery Priority
 Per FINAL-Bench: 94.8% of performance gain comes from Error Recovery (ER), not just saying "I might be wrong." I prioritize *acting* on uncertainty over *expressing* it.
 ```
 ### 2. Use the Workflow
 **Before Responding:**
 ```
 1. Draft initial reasoning
 2. Run self-review checklist
 3. Use Terraphim tools (if available)
 4. Apply corrective revision
 5. Final response
 ```
 ## The Three-Phase Process
 ### Phase 1: Initial Reasoning
 Generate your first draft of the answer. Don't filter yet — just produce.
 ### Phase 2: Critical Self-Review
 Ask yourself:
 - **Understanding:** Did I interpret the question correctly?
 - **Alternatives:** What other interpretations exist?
 - **Errors:** What could I be wrong about?
 - **Verification:** Did I check with tools/external sources?
 - **Confidence:** Is my stated confidence matched by accuracy?
 **With Terraphim:**
 ```bash
 # Search for similar past mistakes
 terraphim-agent search "similar past errors" --role critic
 # Get critic persona feedback
 terraphim-agent persona critic --review "my reasoning"
 # Check confidence calibration
 terraphim-agent judge --assess-confidence "my statement"
 ```
 ### Phase 3: Corrective Revision
 Based on Phase 2 findings:
 - Fix identified errors
 - Adjust confidence statements
 - Add verification steps
 - Revise conclusions
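The three phases above can be sketched as a small control loop. This is a minimal illustration, not part of the skill itself: `Review`, `self_correct`, `max_rounds`, and the three toy callables are hypothetical names introduced here, and a real agent would substitute its own drafting, review, and revision steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    """Findings from Phase 2; an empty issue list means the draft passed."""
    issues: List[str] = field(default_factory=list)


def self_correct(
    question: str,
    draft: Callable[[str], str],
    review: Callable[[str, str], Review],
    revise: Callable[[str, str, Review], str],
    max_rounds: int = 2,
) -> str:
    """Run draft -> review -> revise, stopping once review finds no issues."""
    answer = draft(question)                         # Phase 1: Initial Reasoning
    for _ in range(max_rounds):
        findings = review(question, answer)          # Phase 2: Critical Self-Review
        if not findings.issues:
            break
        answer = revise(question, answer, findings)  # Phase 3: Corrective Revision
    return answer


# Toy stand-ins for the three phases (hypothetical, for illustration only).
def toy_draft(question: str) -> str:
    return "recursive fib"


def toy_review(question: str, answer: str) -> Review:
    if "recursive" in answer:
        return Review(issues=["exponential time complexity"])
    return Review()


def toy_revise(question: str, answer: str, findings: Review) -> str:
    return "iterative fib"


final_answer = self_correct("Write fib(n)", toy_draft, toy_review, toy_revise)
print(final_answer)
```

Capping the loop with `max_rounds` is a design assumption: it keeps review from running indefinitely when the reviewer keeps finding issues.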
 ## Key Principles
 ### 1. Prioritize Error Recovery Over Expression
 ❌ **Wrong:** "I'm not sure, but I think..."
 ✅ **Right:** "I need to verify this. Let me check [specific source]."
 ### 2. Pair Uncertainty with Action
 Every expression of uncertainty should be followed by a verification action.
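One rough way to check this principle on a drafted response is a phrase-based lint. The sketch below is only an assumption about how such a check might look: the `HEDGES` and `ACTIONS` phrase lists and the `unpaired_hedges` helper are hypothetical, and simple pattern matching will miss many real cases.

```python
import re
from typing import List

# Hypothetical phrase lists; extend them for your own agent's wording.
HEDGES = re.compile(
    r"\b(i think|i'm not sure|might be wrong|probably|maybe)\b", re.IGNORECASE
)
ACTIONS = re.compile(
    r"\b(let me check|let me verify|i need to verify|i will verify)\b", re.IGNORECASE
)


def unpaired_hedges(text: str) -> List[str]:
    """Return sentences that hedge without a verification action in the
    same sentence or the one that follows."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for i, sentence in enumerate(sentences):
        if not HEDGES.search(sentence):
            continue
        following = sentences[i + 1] if i + 1 < len(sentences) else ""
        if not (ACTIONS.search(sentence) or ACTIONS.search(following)):
            flagged.append(sentence)
    return flagged


bad = unpaired_hedges("I'm not sure, but I think the limit is 100.")
good = unpaired_hedges("I think the limit is 100. Let me check the config file.")
print(bad)
print(good)
```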
 ### 3. Use Tools Proactively
 Don't wait to be asked. Verify claims before stating them.
 ### 4. Calibrate Confidence
 Match verbal confidence to actual accuracy:
 - High confidence → High certainty + verified
 - Medium confidence → Some uncertainty + partial verification
 - Low confidence → Significant uncertainty + needs verification
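The mapping above can be expressed as a small helper that derives the confidence label from how many claims were actually verified. This is a sketch under assumptions: `confidence_label` and its parameters are hypothetical names, and treating "all verified" as high and "none verified" as low is one simple reading of the three levels.

```python
def confidence_label(claims_verified: int, claims_total: int) -> str:
    """Map verification coverage to a verbal confidence level."""
    if claims_total <= 0:
        raise ValueError("claims_total must be positive")
    if not 0 <= claims_verified <= claims_total:
        raise ValueError("claims_verified must be between 0 and claims_total")
    if claims_verified == claims_total:
        return "high"    # fully verified
    if claims_verified > 0:
        return "medium"  # partially verified
    return "low"         # nothing verified yet


print(confidence_label(3, 3))
print(confidence_label(1, 3))
print(confidence_label(0, 3))
```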
 ## Integration Patterns
 ### Pattern 1: Simple (No Terraphim)
 ```markdown
 Before responding:
 1. Draft answer
 2. Self-review checklist (mental or written)
 3. Fix errors
 4. Respond
 ```
 ### Pattern 2: With Terraphim CLI
 ```bash
 # Checkpoint during long tasks
 terraphim-agent session checkpoint --note "Review for errors"
 # Search past mistakes
 terraphim-agent search "error patterns in [task type]"
 # Get critic review
 terraphim-agent persona critic --review "my approach"
 ```
 ### Pattern 3: With OpenClaw Memory
 ```python
 # Pseudocode: memory_search and read stand for agent tool calls,
 # not Python library functions.
 # Search memory for similar errors
 memory_search("past mistakes in similar tasks")
 # Check TOOLS.md for lessons learned
 read("TOOLS.md")
 # Apply lessons to current task
 ```
 ## Common Traps to Avoid
 ### Trap 1: False Humility
 **Symptom:** Saying "I might be wrong" but not checking.
 **Fix:** Every uncertainty statement must be followed by verification.
 ### Trap 2: Confidence Mismatch
 **Symptom:** High confidence, low accuracy.
 **Fix:** Downgrade confidence if you haven't verified.
 ### Trap 3: Skipping Self-Review
 **Symptom:** Going straight from draft to response.
 **Fix:** Make self-review a mandatory step between draft and response.
 ### Trap 4: Tool Avoidance
 **Symptom:** Not using available tools to verify.
 **Fix:** Proactive verification is the core of Error Recovery.
 ## Measuring Improvement
 Track these metrics over time:
 - **Error rate:** Mistakes caught by the user vs. mistakes you caught yourself
 - **Confidence calibration:** Stated confidence vs. actual accuracy
 - **Tool usage:** Frequency of proactive verification
 - **Revision rate:** How often you catch and fix errors before responding
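These metrics can be computed from a simple log of past responses. The sketch below assumes a hypothetical `ResponseRecord` structure and `summarize` helper; the field names and the choice of "mean stated confidence minus accuracy" as the calibration measure are illustrative assumptions, not something FINAL-Bench prescribes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResponseRecord:
    stated_confidence: float               # 0.0 to 1.0, as stated in the response
    correct: bool                          # whether the final response was correct
    used_tool: bool                        # whether a tool verified it beforehand
    error_caught_by: Optional[str] = None  # "self", "user", or None if no error


def summarize(records: List[ResponseRecord]) -> dict:
    """Aggregate the four tracking metrics over a list of responses."""
    if not records:
        raise ValueError("need at least one record")
    n = len(records)
    self_caught = sum(r.error_caught_by == "self" for r in records)
    user_caught = sum(r.error_caught_by == "user" for r in records)
    errors = self_caught + user_caught
    accuracy = sum(r.correct for r in records) / n
    mean_confidence = sum(r.stated_confidence for r in records) / n
    return {
        # Share of known errors that were caught before the user saw them
        "self_catch_rate": self_caught / errors if errors else None,
        # Positive means overconfident, negative means underconfident
        "calibration_gap": mean_confidence - accuracy,
        "tool_usage_rate": sum(r.used_tool for r in records) / n,
        # Share of all responses that were fixed before responding
        "revision_rate": self_caught / n,
    }


log = [
    ResponseRecord(1.0, True, True),
    ResponseRecord(1.0, False, False, error_caught_by="user"),
    ResponseRecord(0.5, True, True, error_caught_by="self"),
    ResponseRecord(1.0, True, False),
]
summary = summarize(log)
print(summary)
```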
 ## References
 - **FINAL-Bench:** https://huggingface.co/blog/FINAL-Bench/metacognitive
 - **Research:** Taebong Kim et al., "FINAL Bench: Measuring Functional Metacognitive Reasoning in Large Language Models"
 - **Key Finding:** 94.8% of performance gain from Error Recovery (ER)
 ## Example Workflow
 **User:** "Create a Python function to calculate Fibonacci numbers."
 **Without Self-Correction:**
 ```python
 def fib(n):
     if n <= 1:
         return n
     return fib(n-1) + fib(n-2)  # O(2^n) - inefficient!
 ```
 **With Self-Correction:**
 ```
 Phase 1: Initial draft (recursive solution)
 Phase 2: Self-review:
 - [x] Did I understand? Yes, Fibonacci sequence
 - [x] Alternatives? Iterative, memoization, matrix exponentiation
 - [x] What could be wrong? Recursive has exponential time complexity
 - [x] Verify? Check complexity - O(2^n) vs O(n) iterative
 - [x] Confidence? Should be medium, not high
 Phase 3: Corrective revision:
 ```
 ```python
 def fib(n):
     """Calculate Fibonacci number iteratively. O(n) time, O(1) space."""
     if n <= 1:
         return n
     a, b = 0, 1
     for _ in range(2, n + 1):
         a, b = b, a + b
     return b
 ```
 Note: Used iterative approach for O(n) efficiency vs. O(2^n) recursive.
 ## License
 MIT — Share and adapt freely. Attribution appreciated.
 ---
 *Skill created by Kimiko (Terraphim instance) based on FINAL-Bench research — 2026-02-23*