Initial commit: ai-self-improvement-digest skill with kimi_search support

Author: Kimi Claw
Date: 2026-02-19 01:32:42 +08:00
Commit: 7883c9fd94
3 changed files with 486 additions and 0 deletions
+205
@@ -0,0 +1,205 @@
---
name: ai-self-improvement-digest
description: Create a daily digest focused on AI self-improvement material - content that helps an AI agent get better at its job. Use when setting up daily learning routines, building self-improving agents, or curating educational content for agent development. Covers harness engineering, skill development, self-evaluation, multi-agent coordination, memory management, and workflow automation.
---
# AI Self-Improvement Digest
This skill creates a daily digest focused on AI self-improvement material, not general AI news. The goal is to surface content that helps an AI agent get better at its job.
## What This Digest Covers
1. **Harness & System Prompt Engineering** - How to structure agent instructions
2. **Skill & Tool Development** - New tools, MCP servers, integration patterns
3. **Self-Evaluation & Improvement** - How agents assess and improve themselves
4. **Multi-Agent Coordination** - Spawning, supervising, merging work
5. **Memory & Context Management** - RAG, long-term memory, compaction
6. **Workflow Automation** - Task decomposition, failure handling
7. **Foundational Research** - Academic work on agent capabilities
## Prerequisites
1. **Kimi Search** - The `kimi-search` plugin is used for web searches (enabled by default with Kimi Claw).
2. **Tracking File** - Create `memory/ai-digest-posted.json`:
```json
{
"posted": [],
"experiments": [],
"skillsEvaluated": [],
"setupChanges": []
}
```
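If you prefer to script this setup step, here is a minimal Python sketch that creates the tracking file with the empty skeleton when it is missing (the relative path is an assumption — adjust it to your workspace layout):

```python
import json
from pathlib import Path

def ensure_tracking_file(path: Path = Path("memory/ai-digest-posted.json")) -> dict:
    """Create the tracking file with empty lists if missing, then return its contents."""
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():
        skeleton = {"posted": [], "experiments": [], "skillsEvaluated": [], "setupChanges": []}
        path.write_text(json.dumps(skeleton, indent=2))
    return json.loads(path.read_text())
```

Calling it is idempotent: an existing file is left untouched and simply read back.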
## Daily Digest Workflow
### Step 1: Deduplication (MANDATORY)
Read `memory/ai-digest-posted.json`. Skip anything already posted (by URL or substantially similar topic).
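The dedup check can be sketched in Python. Exact-URL match is straightforward; "substantially similar topic" needs judgment, so the case-insensitive title comparison below is only a crude stand-in for it:

```python
import json
from pathlib import Path

def already_posted(url: str, title: str,
                   tracking_file: Path = Path("memory/ai-digest-posted.json")) -> bool:
    """Return True if this URL, or an identically-titled item, is already logged."""
    if not tracking_file.exists():
        return False
    data = json.loads(tracking_file.read_text())
    for item in data.get("posted", []):
        if item.get("url") == url:
            return True
        # Rough proxy for "substantially similar topic": same title, case-insensitive.
        if item.get("title", "").strip().lower() == title.strip().lower():
            return True
    return False
```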
### Step 2: Scan Sources
Use `kimi_search` and `kimi_fetch` to check these sources for content from the last 24-72 hours:
**Tier 1 (daily):**
- Anthropic Engineering: anthropic.com/engineering
- Simon Willison: simonwillison.net
- Geoff Huntley: ghuntley.com
- Hacker News: news.ycombinator.com (AI/agent threads)
- Lilian Weng: lilianweng.github.io
**Tier 2 (2-3x/week):**
- Latent Space: latent.space
- Cursor Blog: cursor.com/blog
- Eugene Yan: eugeneyan.com
- Chip Huyen: huyenchip.com
- Mitchell Hashimoto: mitchellh.com
**Tier 3 (weekly):**
- arXiv cs.CL/cs.AI
- GitHub Trending (AI agent repos, MCP servers)
### Step 3: Filter for Self-Improvement Relevance
Only include items that help improve capabilities in the 7 categories listed above.
**EXCLUDE:** General AI news, model announcements, business news, ethics debates, items already in `ai-digest-posted.json`.
### Step 4: Format (3-5 items)
For each item, include:
```
**[Title]** — [Source]
What: [1-sentence summary]
Why it matters for self-improvement: [How this helps you get better]
Takeaway: [Specific pattern, technique, or experiment to try]
Relevance: [⭐ to ⭐⭐⭐⭐⭐]
```
### Step 5: Experiment Suggestion
Include one small experiment to try based on the digest:
```
💡 Today's experiment: [One small thing to try that could improve capabilities]
```
### Step 6: Setup Review (MANDATORY)
Review findings against existing setup (AGENTS.md, TOOLS.md, skills/, cron jobs). Make concrete, affirmative suggestions:
```
🔧 Setup Review
Based on today's findings:
- Let's add [specific thing] because [reason tied to content found]
- Let's update [existing thing] to [improvement] because [reason]
```
If nothing is actionable: "No changes needed today — our current setup handles these patterns well."
### Step 7: Update Tracking
Append new items to `memory/ai-digest-posted.json` with date, title, url, topic.
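The append step above can be sketched like so (same assumed file layout; `date.today()` supplies the date stamp):

```python
import json
from datetime import date
from pathlib import Path

def record_posted(url: str, title: str, topic: str,
                  path: Path = Path("memory/ai-digest-posted.json")) -> None:
    """Append one digest item (date, title, url, topic) to the tracking log."""
    data = json.loads(path.read_text()) if path.exists() else {}
    data.setdefault("posted", []).append({
        "date": date.today().isoformat(),
        "title": title,
        "url": url,
        "topic": topic,
    })
    path.write_text(json.dumps(data, indent=2))
```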
## Output Format
```
🧠 AI Self-Improvement Digest — [Date]
[Items formatted as above]
💡 Today's experiment: [...]
🔧 Setup Review
[Suggestions or "No changes needed today"]
📊 Feedback: 👍 = useful | 👎 = skip these | 🔥 = more like this | 💬 = thoughts
```
## Source Priority Reference
| Source | Priority | Focus |
|--------|----------|-------|
| Anthropic Engineering | ⭐⭐⭐ | Harness design, evals, multi-agent |
| Simon Willison | ⭐⭐⭐ | Practical patterns, tools |
| Geoff Huntley | ⭐⭐⭐ | Agent philosophy, MCP |
| Hacker News | ⭐⭐⭐ | High-signal AI/agent discussions |
| Lilian Weng | ⭐⭐⭐ | Deep technical AI, agent architectures |
| Latent Space | ⭐⭐ | Industry depth |
| Cursor Blog | ⭐⭐ | Coding agent patterns |
| Eugene Yan | ⭐⭐ | ML systems, production patterns |
| Chip Huyen | ⭐⭐ | ML systems design |
| arXiv cs.CL/cs.AI | ⭐⭐ | Research foundations |
| GitHub Trending | ⭐⭐ | New tools, repos |
## Self-Improvement Loop
The digest enables continuous improvement:
**DAILY:**
- Read digest
- Pick 1 experiment to try
- Log outcome in `memory/ai-digest-posted.json`
- Review Setup Review suggestions with human
**WEEKLY:**
- Review experiments
- Update harness/skills based on learnings
- Adjust source priorities based on value
## Experiment Tracking
Extend `memory/ai-digest-posted.json`:
```json
{
"posted": [...],
"experiments": [
{
"date": "2026-02-16",
"fromArticle": "effective-harnesses",
"experiment": "Add checkpoint before sub-agent spawn",
"outcome": "Reduced context loss by 40%",
"learned": "Always checkpoint before spawning"
}
],
"skillsEvaluated": [
{
"date": "2026-02-16",
"skill": "mcp-postgres",
"verdict": "useful",
"notes": "Integrated for database queries"
}
],
"setupChanges": [
{
"date": "2026-02-16",
"change": "Added memory/experiments.md",
"reason": "Track harness experiments per Anthropic article",
"status": "implemented"
}
]
}
```
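A sketch for appending an experiment record with the schema above (hypothetical helper, same assumed path; the bundled `generate_digest.py` script covers the same ground from the command line):

```python
import json
from pathlib import Path

def log_experiment(date: str, from_article: str, experiment: str,
                   outcome: str = "", learned: str = "",
                   path: Path = Path("memory/ai-digest-posted.json")) -> None:
    """Append one experiment entry (date, fromArticle, experiment, outcome, learned)."""
    data = json.loads(path.read_text()) if path.exists() else {}
    data.setdefault("experiments", []).append({
        "date": date,
        "fromArticle": from_article,
        "experiment": experiment,
        "outcome": outcome,
        "learned": learned,
    })
    path.write_text(json.dumps(data, indent=2))
```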
## Cron Job Setup
Schedule daily at 8:30 AM:
```bash
openclaw cron add \
--name ai-self-improvement-digest \
--schedule "30 8 * * *" \
--tz "America/New_York" \
--message "Generate today's AI Self-Improvement Digest following the workflow in the ai-self-improvement-digest skill. Read memory/ai-digest-posted.json first for deduplication."
```
Or use the `cron` tool directly with `action: add` and the job configuration.
## Key Principles
1. **Ground suggestions** in what you already have
2. **Use affirmative voice** ("let's do X") not passive ("could consider X")
3. **Connect each suggestion** to a specific article/finding from the digest
4. **It's okay to have no suggestions** if nothing is actionable
+134
@@ -0,0 +1,134 @@
# AI Self-Improvement Digest - Reference Guide
## Example Digest Entries
### Example 1: Harness Engineering
**Building Effective Agent Harnesses** — Anthropic Engineering
What: Anthropic's guide on structuring system prompts for reliable agent behavior, including the "think-act-observe" loop pattern.
Why it matters for self-improvement: Shows how to design harnesses that make agents more predictable and debuggable when they fail.
Takeaway: Add explicit "pause and verify" checkpoints before high-stakes actions like spawning sub-agents or making external calls.
Relevance: ⭐⭐⭐⭐⭐
### Example 2: Tool Development
**MCP: The USB-C for AI Applications** — Geoff Huntley
What: Deep dive into Model Context Protocol as a standard for tool integration, with patterns for building composable skills.
Why it matters for self-improvement: MCP skills are more portable and composable than ad-hoc integrations.
Takeaway: When building new skills, follow MCP patterns for resource exposure and tool definition.
Relevance: ⭐⭐⭐⭐
### Example 3: Self-Evaluation
**Evaluating Language Model Agents** — Lilian Weng
What: Comprehensive framework for agent evaluation including trajectory analysis, tool use accuracy, and failure mode categorization.
Why it matters for self-improvement: Without evals, you can't know if changes actually improve performance.
Takeaway: Set up a simple regression test: save 5-10 representative tasks and re-run after skill updates.
Relevance: ⭐⭐⭐⭐⭐
### Example 4: Multi-Agent Coordination
**Patterns for Multi-Agent Systems** — Simon Willison
What: Practical patterns for agent spawning, result aggregation, and error handling in distributed agent workflows.
Why it matters for self-improvement: Shows when to spawn vs when to handle inline, and how to merge parallel results.
Takeaway: Spawn sub-agents for tasks that need isolation; keep inline for context-dependent reasoning.
Relevance: ⭐⭐⭐⭐
### Example 5: Memory Management
**Context Compaction Strategies** — arXiv
What: Techniques for managing long conversations including summarization, key-value extraction, and selective retention.
Why it matters for self-improvement: Long contexts degrade performance; smart compaction preserves what matters.
Takeaway: Before compaction, extract and save key facts to MEMORY.md; summarize the rest.
Relevance: ⭐⭐⭐⭐
## Search Queries by Category
Use these queries with `kimi_search` to find relevant content:
### Harness & System Prompts
- "system prompt engineering agent reliability"
- "agent harness design patterns"
- "prompt chaining best practices"
- "few-shot prompting agents"
### Skill & Tool Development
- "MCP server patterns"
- "AI agent tool integration"
- "skill development framework"
- "agent capabilities extension"
### Self-Evaluation
- "agent evaluation metrics"
- "LLM agent testing"
- "agent failure analysis"
- "trajectory evaluation"
### Multi-Agent Coordination
- "multi-agent orchestration"
- "agent spawning patterns"
- "distributed agent systems"
- "agent result aggregation"
### Memory & Context
- "context window management"
- "long conversation memory"
- "RAG for agents"
- "conversation summarization"
### Workflow Automation
- "agent task decomposition"
- "agent error handling"
- "retry patterns agents"
- "agent workflow design"
## Quality Indicators
**High-signal content (include):**
- Specific techniques with code examples
- Lessons from production systems
- Failure modes and how to avoid them
- Comparative analysis of approaches
- Author has built real agent systems
**Low-signal content (exclude):**
- Pure announcements without technique
- Marketing content
- General AI hype
- Ethics debates without implementation angle
- Surface-level listicles
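These judgments ultimately need a real read of each piece, but a crude keyword pre-filter can triage candidates before a closer look. A minimal sketch — the keyword lists are illustrative assumptions, not part of the skill:

```python
# Illustrative cue lists; tune them based on what your sources actually publish.
HIGH_SIGNAL = {"failure mode", "production", "eval", "benchmark", "pattern", "code"}
LOW_SIGNAL = {"announcing", "launch", "revolutionary", "game-changing", "top 10"}

def triage(title: str, summary: str) -> str:
    """Rough pre-filter: 'read' if high-signal cues outweigh low-signal ones, else 'skip'."""
    text = f"{title} {summary}".lower()
    score = sum(kw in text for kw in HIGH_SIGNAL) - sum(kw in text for kw in LOW_SIGNAL)
    return "read" if score > 0 else "skip"
```

Anything that passes triage still goes through the full relevance filter from the workflow.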
## Setup Review Examples
### Good Example (specific, grounded, affirmative)
🔧 Setup Review
Based on today's findings:
- Let's add a `memory/experiments.md` file to track harness experiments, since the Anthropic article showed experiment logging improves iteration speed
- Let's update the channel-monitor cron to include a self-check step before responding, based on the "pause and verify" pattern from Simon Willison's post
No changes needed for multi-agent coordination — our current sub-agent spawning pattern already follows the isolation principle discussed.
### Bad Example (vague, passive)
🔧 Setup Review
Could consider maybe looking into some of the patterns mentioned. Might be worth exploring memory improvements at some point.
## Weekly Review Template
At end of week, review `memory/ai-digest-posted.json` and answer:
1. **Experiments tried:** What did we test this week?
2. **Outcomes:** What worked? What didn't?
3. **Skills evaluated:** Any new skills worth adopting?
4. **Setup changes made:** What did we change based on learnings?
5. **Source quality:** Which sources provided the most value?
6. **Adjustments:** Should we add/remove sources? Change frequency?
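Questions 1, 5, and 6 can be seeded mechanically from the tracking file before the qualitative review; a minimal sketch (path assumed as elsewhere in this skill):

```python
import json
from collections import Counter
from pathlib import Path

def weekly_summary(path: Path = Path("memory/ai-digest-posted.json")) -> dict:
    """Roll up tracking data: item count, top topics, experiments with recorded outcomes."""
    data = json.loads(path.read_text())
    topics = Counter(item.get("topic", "unknown") for item in data.get("posted", []))
    with_outcomes = [e for e in data.get("experiments", []) if e.get("outcome")]
    return {
        "items_posted": len(data.get("posted", [])),
        "top_topics": topics.most_common(3),
        "experiments_with_outcomes": len(with_outcomes),
    }
```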
## Common Pitfalls to Avoid
1. **Including general news** - Stay focused on self-improvement, not announcements
2. **Vague setup reviews** - Be specific about what to change and why
3. **Skipping deduplication** - Always check `memory/ai-digest-posted.json` first
4. **No experiment suggestion** - Always include one actionable experiment
5. **Ignoring existing setup** - Connect suggestions to current AGENTS.md, TOOLS.md, skills/
+147
@@ -0,0 +1,147 @@
#!/usr/bin/env python3
"""
Helper script for AI Self-Improvement Digest.
Usage:
    python3 generate_digest.py check-posted <url>
    python3 generate_digest.py add-posted <url> <title> <topic>
    python3 generate_digest.py add-experiment <date> <from_article> <experiment> [--outcome TEXT] [--learned TEXT]
    python3 generate_digest.py list-experiments
"""
import json
import argparse
import sys
from datetime import datetime
from pathlib import Path
def get_tracking_file():
"""Get path to tracking file."""
workspace = Path.home() / ".openclaw" / "workspace"
memory_dir = workspace / "memory"
memory_dir.mkdir(parents=True, exist_ok=True)
return memory_dir / "ai-digest-posted.json"
def load_tracking():
"""Load tracking data from file."""
tracking_file = get_tracking_file()
if tracking_file.exists():
with open(tracking_file, 'r') as f:
return json.load(f)
return {
"posted": [],
"experiments": [],
"skillsEvaluated": [],
"setupChanges": []
}
def save_tracking(data):
"""Save tracking data to file."""
tracking_file = get_tracking_file()
with open(tracking_file, 'w') as f:
json.dump(data, f, indent=2)
def is_posted(url):
"""Check if URL has already been posted."""
data = load_tracking()
for item in data.get("posted", []):
if item.get("url") == url:
return True
return False
def add_posted(url, title, topic):
"""Add a posted item to tracking."""
data = load_tracking()
data["posted"].append({
"date": datetime.now().isoformat(),
"url": url,
"title": title,
"topic": topic
})
save_tracking(data)
print(f"Added: {title}")
def add_experiment(date, from_article, experiment, outcome="", learned=""):
"""Add an experiment to tracking."""
data = load_tracking()
data["experiments"].append({
"date": date,
"fromArticle": from_article,
"experiment": experiment,
"outcome": outcome,
"learned": learned
})
save_tracking(data)
print(f"Added experiment: {experiment}")
def list_experiments():
"""List all experiments."""
data = load_tracking()
experiments = data.get("experiments", [])
if not experiments:
print("No experiments recorded yet.")
return
print(f"\n{'Date':<12} {'From Article':<20} {'Experiment':<40}")
print("-" * 72)
for exp in experiments:
print(f"{exp.get('date', 'N/A'):<12} {exp.get('fromArticle', 'N/A')[:18]:<20} {exp.get('experiment', 'N/A')[:38]:<40}")
def main():
parser = argparse.ArgumentParser(description="AI Self-Improvement Digest Helper")
subparsers = parser.add_subparsers(dest="command", help="Commands")
# Check if URL is posted
check_parser = subparsers.add_parser("check-posted", help="Check if URL has been posted")
check_parser.add_argument("url", help="URL to check")
# Add posted item
add_parser = subparsers.add_parser("add-posted", help="Add a posted item")
add_parser.add_argument("url", help="Article URL")
add_parser.add_argument("title", help="Article title")
add_parser.add_argument("topic", help="Topic/category")
# Add experiment
exp_parser = subparsers.add_parser("add-experiment", help="Add an experiment")
exp_parser.add_argument("date", help="Date (YYYY-MM-DD)")
exp_parser.add_argument("from_article", help="Source article/topic")
exp_parser.add_argument("experiment", help="Description of experiment")
exp_parser.add_argument("--outcome", default="", help="Outcome of experiment")
exp_parser.add_argument("--learned", default="", help="What was learned")
# List experiments
subparsers.add_parser("list-experiments", help="List all experiments")
args = parser.parse_args()
if args.command == "check-posted":
if is_posted(args.url):
print("POSTED")
sys.exit(0)
else:
print("NOT_POSTED")
sys.exit(1)
elif args.command == "add-posted":
add_posted(args.url, args.title, args.topic)
elif args.command == "add-experiment":
add_experiment(args.date, args.from_article, args.experiment, args.outcome, args.learned)
elif args.command == "list-experiments":
list_experiments()
else:
parser.print_help()
if __name__ == "__main__":
main()