
Secrets, Memory, and the Art of Not Leaking Everything

An AI agent remembers everything you say. It also remembers the API keys you paste into its config. And if you are not careful, it will share both with people who should not have them.

This post covers two kinds of "secrets" in AI agent systems: credentials (API keys, tokens) and conversational memory (what users say). Both need protection, but in different ways.

Part 1: Credentials. Where do the tokens live?

The problem: plaintext everywhere

The framework I use stores credentials at:

~/.openclaw/agents/<id>/agent/auth-profiles.json

This is a plain JSON file. No encryption. No vault. API keys, OAuth tokens, bot tokens: all in plaintext.

Anyone with filesystem access can steal the credentials. According to the MITRE ATLAS threat model (T-ACCESS-003), this is a HIGH risk: malware, unauthorized device access, or an exposed backup can each lead to a credential leak.

The fix: SecretRef, a reference instead of a stored value

The framework supports SecretRef: instead of storing the secret value in the config, you store a reference to the place where the secret actually lives:

{
  "channels": {
    "telegram": {
      "botToken": {
        "source": "env",
        "provider": "default",
        "id": "TELEGRAM_BOT_TOKEN"
      }
    }
  }
}

The config file no longer contains the token. The token lives in an env var, and that env var can come from Docker secrets, a vault, or an encrypted .env.
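To make the indirection concrete, here is a minimal Python sketch of what resolving an `env` reference could look like. It is not the framework's implementation: `resolve_env_ref` is a hypothetical helper, and the only thing taken from the config above is the shape of the reference (`source`, `provider`, `id`).

```python
import os
from typing import Mapping


def resolve_env_ref(ref: Mapping[str, str],
                    environ: Mapping[str, str] = os.environ) -> str:
    """Resolve a SecretRef-shaped dict whose source is "env" (hypothetical helper)."""
    if ref.get("source") != "env":
        raise ValueError(f"unsupported source: {ref.get('source')!r}")
    name = ref["id"]
    value = environ.get(name)
    if not value:
        # Fail loudly rather than continue with an empty token.
        raise KeyError(f"environment variable {name} is not set")
    return value


if __name__ == "__main__":
    ref = {"source": "env", "provider": "default", "id": "TELEGRAM_BOT_TOKEN"}
    fake_env = {"TELEGRAM_BOT_TOKEN": "placeholder-token"}  # not a real token
    token = resolve_env_ref(ref, fake_env)
    # Print only the length, never the secret itself.
    print(f"resolved a token of {len(token)} characters")
```

The point of the design is that the config can be committed, backed up, or shared without carrying the secret with it.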

Three source types

Source	How it works	Best for
`env`	Reads from an environment variable	Simple, Docker-friendly
`file`	Reads from a JSON file, addressed by JSON pointer	Team-shared secrets file
`exec`	Runs a binary (1Password CLI, HashiCorp Vault)	Production, enterprise
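For the `file` source, "addressed by JSON pointer" means a path-like string such as `/telegram/botToken` picks one value out of the JSON document. The sketch below walks a pointer following the RFC 6901 rules. The secrets file layout and the function name are assumptions for illustration, not the framework's own code.

```python
import json
from typing import Any


def resolve_json_pointer(document: Any, pointer: str) -> Any:
    """Return the value an RFC 6901 JSON pointer refers to inside `document`."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a JSON pointer must be empty or start with '/'")
    node = document
    for raw in pointer[1:].split("/"):
        # Unescape in this order: "~1" becomes "/", then "~0" becomes "~".
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            if not token.isdigit():
                raise ValueError(f"invalid array index: {token!r}")
            node = node[int(token)]
        else:
            node = node[token]
    return node


if __name__ == "__main__":
    # Hypothetical shared secrets file content.
    secrets = json.loads('{"telegram": {"botToken": "placeholder-token"}}')
    value = resolve_json_pointer(secrets, "/telegram/botToken")
    print(f"found a value of {len(value)} characters")
```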

An example with 1Password:

{
  "secrets": {
    "providers": {
      "op": {
        "source": "exec",
        "command": "/opt/homebrew/bin/op",
        "args": ["read", "op://Personal/TelegramBot/password"]
      }
    }
  },
  "channels": {
    "telegram": {
      "botToken": { "source": "exec", "provider": "op", "id": "value" }
    }
  }
}

Runtime model

SecretRef is not a lazy lookup. The framework resolves all secrets at startup into an in-memory snapshot. If any secret cannot be resolved, the gateway refuses to start. Fail fast, not fail silent.

On a config reload, if a new secret fails to resolve, the framework keeps the old snapshot (last-known-good). The gateway never runs with missing secrets.
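The eager-resolve and last-known-good behavior described above can be sketched in a few lines of Python. This is an illustration of the pattern under stated assumptions, not the framework's code: `SecretSnapshot` and the `resolve` callable are hypothetical names.

```python
from typing import Callable, Dict, Iterable


class SecretSnapshot:
    """Eagerly resolved secrets with last-known-good reloads (illustrative only)."""

    def __init__(self, names: Iterable[str], resolve: Callable[[str], str]):
        self._resolve = resolve
        # Any failure here propagates, so the caller can refuse to start.
        self._values: Dict[str, str] = self._resolve_all(names)

    def _resolve_all(self, names: Iterable[str]) -> Dict[str, str]:
        return {name: self._resolve(name) for name in names}

    def reload(self, names: Iterable[str]) -> bool:
        """Swap in a fresh snapshot only if every secret resolves."""
        try:
            fresh = self._resolve_all(names)
        except Exception:
            return False  # keep serving the previous snapshot
        self._values = fresh
        return True

    def get(self, name: str) -> str:
        return self._values[name]


if __name__ == "__main__":
    store = {"BOT_TOKEN": "v1"}
    snapshot = SecretSnapshot(["BOT_TOKEN"], store.__getitem__)
    # Reloading with a secret that cannot be resolved leaves the snapshot intact.
    print(snapshot.reload(["BOT_TOKEN", "MISSING"]), snapshot.get("BOT_TOKEN"))
```

Building the new snapshot completely before swapping it in is what keeps a half-resolved config from ever going live.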

Migration in practice

# Step 1: audit current credentials
openclaw secrets audit --check

# Step 2: interactive migration
openclaw secrets configure

# Step 3: verify
openclaw secrets audit --check  # Should return CLEAN

Part 2: Memory. What does the AI remember, and who can read it?

Architecture: Markdown is the source of truth

In the framework I use, memory is just plain Markdown files on disk:

~/.openclaw/workspace/
├── MEMORY.md              # Long-term curated memory
├── memory/
│   ├── 2026-03-18.md      # Daily log (today)
│   └── 2026-03-17.md      # Daily log (yesterday)

The agent only "remembers" what has been written to a file. There is no hidden vector database and no "background learning". Want the agent to remember something? Write it to a file. Want it to forget? Delete the file.

Simple. Reviewable. Git-backable.

Privacy boundary: MEMORY.md vs daily logs

File	Loaded when	Groups?
`MEMORY.md`	Every session start	NO: main/private session only
`memory/YYYY-MM-DD.md`	On demand (via the memory_search tool)	Depends on scope config

This is privacy protection by design:

MEMORY.md holds curated facts (preferences, important decisions) → only the owner sees it
Daily logs may contain sensitive info → the memory_search scope denies groups by default

Config: restrict memory search so that group sessions are denied:

{
  "agents": {
    "defaults": {
      "memorySearch": {
        "scope": {
          "default": "deny",
          "rules": [
            { "action": "allow", "match": { "chatType": "direct" } }
          ]
        }
      }
    }
  }
}
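One way to read the scope config above is "first matching rule wins, otherwise fall back to the default". The Python sketch below evaluates a scope that way. The first-match semantics and the `scope_allows` helper are assumptions made for illustration; the source does not spell out how the framework orders or combines rules.

```python
from typing import Any, Mapping


def scope_allows(scope: Mapping[str, Any], session: Mapping[str, str]) -> bool:
    """Return True if memory search is allowed for this session (assumed semantics)."""
    for rule in scope.get("rules", []):
        match = rule.get("match", {})
        # A rule applies when every key in its match block equals the session's value.
        if all(session.get(key) == value for key, value in match.items()):
            return rule["action"] == "allow"
    return scope.get("default", "deny") == "allow"


if __name__ == "__main__":
    scope = {
        "default": "deny",
        "rules": [{"action": "allow", "match": {"chatType": "direct"}}],
    }
    print(scope_allows(scope, {"chatType": "direct"}))
    print(scope_allows(scope, {"chatType": "group"}))
```

With a deny default, a session type that nobody thought about ends up locked out rather than silently allowed.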

Semantic search: Hybrid BM25 + Vector

The framework supports hybrid search, combining keyword (BM25) and semantic (vector embedding) retrieval:

BM25: finds exact IDs, code symbols, env vars
Vector: finds "equivalent meaning" (paraphrases)

There is also MMR re-ranking (drops duplicates) and temporal decay (favors recent memory):

7 days ago: ~84% of the relevance score
30 days: 50%
90 days: 12.5%

Old memories naturally "fade", much like human memory.
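The 30-day and 90-day figures are exactly what an exponential decay with a 30-day half-life produces, so that is the curve assumed in the Python sketch below. The half-life value and the function name are inferences from those numbers, not something the source states. Under this curve the 7-day multiplier comes out at roughly 85%, within a point of the figure quoted above.

```python
HALF_LIFE_DAYS = 30.0  # assumption inferred from "30 days: 50%"


def decay_multiplier(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Fraction of the original relevance score kept after `age_days`."""
    return 0.5 ** (age_days / half_life_days)


def decayed_score(score: float, age_days: float) -> float:
    """Apply temporal decay to a raw relevance score."""
    return score * decay_multiplier(age_days)


if __name__ == "__main__":
    for age in (7, 30, 90):
        print(f"{age:>3} days: {decay_multiplier(age):.1%}")
```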

Part 3: Bootstrap files. The hidden token burn

A problem few people know about

The framework injects workspace files into the context on every turn:

File	Injected?	Default max
AGENTS.md	Every turn	20,000 chars (~5,000 tokens)
SOUL.md	Every turn	20,000 chars
USER.md	Every turn	20,000 chars
TOOLS.md	Every turn	20,000 chars
IDENTITY.md	Every turn	20,000 chars

Worst case: 5 files × 20,000 chars = 100,000 chars ≈ 25,000 tokens on every API call, just for bootstrap files.

Add tool schemas (~8,000 tokens), the skills list (~500 tokens), and the system prompt text (~2,000 tokens), and you are at 35,000+ tokens of overhead before the conversation even starts.
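The arithmetic above is easy to replay with your own numbers. This Python sketch uses the 4-characters-per-token ratio implied by "20,000 chars ≈ 5,000 tokens"; that ratio is a rough heuristic, and real tokenizers will vary, especially for non-English text. The function names are made up for illustration.

```python
CHARS_PER_TOKEN = 4  # rough heuristic implied by 20,000 chars ~ 5,000 tokens


def bootstrap_tokens(file_count: int, max_chars_per_file: int,
                     chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Worst-case tokens spent on bootstrap files per API call."""
    return file_count * max_chars_per_file // chars_per_token


def total_overhead(bootstrap: int, tool_schemas: int = 8000,
                   skills_list: int = 500, system_prompt: int = 2000) -> int:
    """Bootstrap tokens plus the other fixed per-call costs listed above."""
    return bootstrap + tool_schemas + skills_list + system_prompt


if __name__ == "__main__":
    worst_case = bootstrap_tokens(file_count=5, max_chars_per_file=20_000)
    print(worst_case, total_overhead(worst_case))
```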

Fix

{
  "agents": {
    "defaults": {
      "bootstrapMaxChars": 15000,
      "bootstrapTotalMaxChars": 50000
    }
  }
}

More importantly: keep the files short. TOOLS.md is usually the largest, so trim it aggressively. SOUL.md does not need five pages; 500 chars is enough for a persona.

Inspect with /context list in chat to see the raw vs injected size of each file.
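For a quick offline estimate outside of chat, a small Python script can measure the same files against the two caps. How the framework combines the per-file and total limits is not described in the source, so the order used here (cap each file first, then spend a shared total budget in listing order) is an assumption, as are the function names.

```python
from pathlib import Path
from typing import Dict, List, Mapping, Tuple

BOOTSTRAP_FILES = ["AGENTS.md", "SOUL.md", "USER.md", "TOOLS.md", "IDENTITY.md"]


def measure(workspace: Path) -> Dict[str, int]:
    """Raw character count of each bootstrap file (0 if the file is missing)."""
    sizes = {}
    for name in BOOTSTRAP_FILES:
        path = workspace / name
        sizes[name] = len(path.read_text(encoding="utf-8")) if path.is_file() else 0
    return sizes


def estimate_injected(sizes: Mapping[str, int], per_file_max: int = 15_000,
                      total_max: int = 50_000) -> List[Tuple[str, int, int]]:
    """Return (name, raw, injected) rows under the assumed capping order."""
    rows = []
    remaining = total_max
    for name, raw in sizes.items():
        injected = min(raw, per_file_max, remaining)
        remaining -= injected
        rows.append((name, raw, injected))
    return rows


if __name__ == "__main__":
    workspace = Path.home() / ".openclaw" / "workspace"
    for name, raw, injected in estimate_injected(measure(workspace)):
        print(f"{name:<12} raw={raw:>7} injected={injected:>7}")
```

Treat the output as a sanity check only; /context list remains the authoritative view of what is actually injected.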

Part 4: Compaction. Forgetting on purpose

Auto-compaction = lossy summarization

When the context window fills up, the framework compacts automatically: it summarizes older conversation into a condensed version. The mechanism is necessary, but it is lossy.

Compaction loses nuance. Specific numbers, detailed decisions, and exact wording can all be summarized into something generic. If you say "the Q2 budget is 450 million VND" in turn 5, after compaction the agent may only remember "discussed the budget".

Pre-compaction memory flush

The framework has a silent memory flush: before compacting, it triggers one hidden turn that reminds the agent to write its notes to disk:

{
  "agents": {
    "defaults": {
      "compaction": {
        "memoryFlush": {
          "enabled": true,
          "prompt": "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store."
        }
      }
    }
  }
}

Gotcha: this turn costs tokens and triggers tool calls (file writes). If the workspace is read-only (sandbox), the flush is skipped.
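The ordering matters: notes go to disk first, and only then is the old conversation summarized. The Python sketch below shows that control flow in a simplified form. It counts messages instead of tokens, and `compact_with_flush`, `summarize`, and `flush` are hypothetical stand-ins rather than framework APIs.

```python
from typing import Callable, List


def compact_with_flush(messages: List[str], max_messages: int, keep_recent: int,
                       summarize: Callable[[List[str]], str],
                       flush: Callable[[List[str]], None],
                       workspace_writable: bool) -> List[str]:
    """Flush notes (if possible), then replace older messages with a summary."""
    if len(messages) <= max_messages:
        return messages  # nothing to compact yet
    if workspace_writable:
        # Stand-in for the hidden turn that asks the agent to write notes to disk.
        flush(messages)
    cut = len(messages) - keep_recent
    old, recent = messages[:cut], messages[cut:]
    return [summarize(old)] + recent


if __name__ == "__main__":
    history = [f"turn {i}" for i in range(1, 7)]
    saved: List[List[str]] = []
    compacted = compact_with_flush(
        history, max_messages=4, keep_recent=2,
        summarize=lambda old: f"summary of {len(old)} turns",
        flush=saved.append, workspace_writable=True,
    )
    print(compacted, len(saved))
```

In the read-only case the summary is still produced, but nothing is saved beforehand, which is exactly the situation the gotcha above warns about.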

Lessons on compaction

Important info → memory file (do not rely on in-context memory)
Enable memory flush for long-running sessions
Keep the working context reasonable (around 40K tokens, not 200K) → less to compact, and each compaction is cheaper
Run /compact manually when the session feels stale; do not wait for auto-compaction

Secrets & Memory checklist

Credentials

Run openclaw secrets audit --check
Migrate bot tokens to SecretRef (source: "env")
Migrate API keys to SecretRef
.env file permissions = 600
No plaintext in config files
Gateway auth token via SecretRef
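The file permission item is easy to verify in a script. This Python sketch checks that a file grants nothing to group or others, which is slightly looser than "exactly 600" (a read-only 400 also passes). The function names are illustrative, and the check only makes sense on POSIX systems.

```python
import os
import stat


def mode_is_owner_only(mode: int) -> bool:
    """True if the permission bits give no access to group or others."""
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def file_is_owner_only(path: str) -> bool:
    """Check the permission bits of an existing file such as .env."""
    return mode_is_owner_only(stat.S_IMODE(os.stat(path).st_mode))


if __name__ == "__main__":
    env_path = ".env"  # adjust to wherever your .env lives
    if os.path.exists(env_path):
        status = "ok" if file_is_owner_only(env_path) else "too permissive, run chmod 600"
        print(f"{env_path}: {status}")
```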

Memory Privacy

memorySearch.scope denies groups (the default is already correct)
Keep MEMORY.md short (< 5,000 chars)
Daily logs contain no credentials
DM sessions isolated (dmScope: "per-channel-peer")

Token Budget

bootstrapMaxChars: 15000
bootstrapTotalMaxChars: 50000
TOOLS.md trimmed (biggest offender)
/context list verified — total overhead < 25K tokens
Compaction memory flush enabled

Conclusion

AI agent systems carry an interesting paradox: they need to remember everything to be useful, but they need to forget the right things to be safe.

Credentials should not live in config files: use references. Memory should not be accessible to every user: use scopes. Bootstrap files should not cost 25K tokens: trim ruthlessly. Compaction will lose information: flush what matters to disk first.

No single config key solves all of this. But knowing exactly what lives where, who can access it, and how much it costs is the first step.
