Codegen Pipeline — Auto-Generate Agent Tools from API Specifications

English title: Code Generation Pipeline: Adapter + Command + Test

Introduction

In 2022, the Shopify engineering team made a decision: generate the entire Shopify CLI from the Admin API GraphQL schema. 400+ mutations and 200+ queries, all generated automatically. The result: shopify app generate lets a developer scaffold a full-stack Shopify app with a single command.

What makes this feasible is not GraphQL or any Shopify-specific tool. It is a basic pipeline: schema → template engine → generated code.

This article builds that pipeline from scratch, starting from the metadata JSON (part 4) and using Jinja2 templates.

1. Pipeline structure

metadata/api-metadata.json
        │
        ▼
pipeline/generator.py
        │
        ├── templates/adapter.py.j2  → adapters/{resource}_adapter.py
        ├── templates/command.py.j2  → cli/{resource}s.py
        └── templates/test.py.j2     → tests/unit/test_{resource}.py

The pipeline reads the metadata and, for each resource (the set of methods that share a resource name), renders 3 templates and writes 3 files. 439 methods → 32 resources → 96 files.
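The mapping from methods to output files can be sketched in a few lines. This is only an illustration of the diagram above: plan_outputs and the sample methods are hypothetical names, and the real pipeline renders templates rather than just listing paths.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_outputs(methods: list[dict], output_dir: str = "generated") -> dict[str, list[str]]:
    """Map each resource to the three files the pipeline would write for it."""
    by_resource: dict[str, list[dict]] = defaultdict(list)
    for method in methods:
        by_resource[method["resource"]].append(method)

    root = PurePosixPath(output_dir)
    return {
        resource: [
            str(root / "adapters" / f"{resource}_adapter.py"),
            str(root / "cli" / f"{resource}s.py"),
            str(root / "tests" / "unit" / f"test_{resource}.py"),
        ]
        for resource in by_resource
    }


# Hypothetical sample: three methods spread over two resources.
sample_methods = [
    {"resource": "user", "action": "list"},
    {"resource": "user", "action": "get"},
    {"resource": "order", "action": "create"},
]
plan = plan_outputs(sample_methods)
file_count = sum(len(paths) for paths in plan.values())
print(file_count)  # 2 resources x 3 templates = 6
```

The file count depends only on the number of resources, not on the number of methods, which is why 439 methods collapse into fewer than a hundred files.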

2. Adapter template

The Jinja2 template for the adapter class:

{# templates/adapter.py.j2 #}
"""
Auto-generated adapter for {{ resource }} resource.
DO NOT EDIT MANUALLY — regenerate with: make codegen
Generated: {{ generated_at }}
"""
import httpx
from typing import Iterator
from dataclasses import dataclass
from .base import BaseAdapter, AdapterError
 
 
@dataclass
class {{ resource | title }}:
    {# Assumes definitions maps "User" -> {field_name: {"python_type": ...}} #}
    {% for name, prop in definitions[resource | title] | default({}) | items %}
    {{ name }}: {{ prop.python_type }}
    {% else %}
    id: int
    {% endfor %}
 
 
class {{ resource | title }}Adapter(BaseAdapter):
    """Adapter for {{ resource }} operations."""
 
    {% for method in methods %}
    def {{ method.action }}_{{ method.resource }}(
        self,
        {% for param in method.parameters %}
        {{ param.name }}: {{ param | python_type }}{% if not param.required %} = {{ param | default_value }}{% endif %},
        {% endfor %}
    ) -> {{ method | return_type }}:
        """{{ method.summary }}"""
        {% if method.action == 'list' and method.paginated %}
        page = 1
        while True:
            params = {
                {% for param in method.parameters if param.in == 'query' %}
                "{{ param.name }}": {{ param.name }},
                {% endfor %}
                "_page": page, "_per_page": self.PAGE_SIZE,
            }
            params = {k: v for k, v in params.items() if v is not None}
            data = self._request("{{ method.http_method }}", "{{ method.path }}", params=params)
            items = data.get("items", [])
            for item in items:
                yield {{ resource | title }}(**item)
            if not data.get("has_next_page") or len(items) < self.PAGE_SIZE:
                break
            page += 1
        {% elif method.action == 'get' %}
        {# The path keeps its {placeholder}; the f-string fills it from the method argument of the same name. #}
        data = self._request("{{ method.http_method }}", f"{{ method.path }}")
        return {{ resource | title }}(**data)
        {% elif method.action == 'create' %}
        body = {
            {% for param in method.parameters if param.in == 'body' %}
            "{{ param.name }}": {{ param.name }},
            {% endfor %}
        }
        body = {k: v for k, v in body.items() if v is not None}
        data = self._request("{{ method.http_method }}", "{{ method.path }}", json=body)
        return {{ resource | title }}(**data)
        {% elif method.action == 'delete' %}
        self._request("{{ method.http_method }}", f"{{ method.path }}")
        {% else %}
        data = self._request("{{ method.http_method }}", "{{ method.path }}")
        return data
        {% endif %}
 
    {% endfor %}

This template is dense because it handles several action types in one file. In production, split it into one sub-template per action.
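One way to do that split, sketched under assumptions: each action gets its own small template and the main template pulls it in with Jinja2's include tag, falling back to a default body when no action-specific file exists. The actions/*.j2 names are hypothetical, and DictLoader is used only to keep the example self-contained; the real pipeline would keep using FileSystemLoader.

```python
import ast

from jinja2 import DictLoader, Environment, StrictUndefined

# Hypothetical layout: one template per action plus a fallback.
TEMPLATES = {
    "adapter.py.j2": (
        "class {{ resource | title }}Adapter:\n"
        "{% for method in methods %}\n"
        "    def {{ method.action }}_{{ method.resource }}(self):\n"
        # include tries the action-specific template first, then the default.
        "{% include ['actions/' ~ method.action ~ '.j2', 'actions/default.j2'] %}\n"
        "\n"
        "{% endfor %}\n"
    ),
    "actions/get.j2": (
        '        return self._request("GET", "{{ method.path }}")\n'
    ),
    "actions/default.j2": (
        '        return self._request("{{ method.http_method }}", "{{ method.path }}")\n'
    ),
}

env = Environment(
    loader=DictLoader(TEMPLATES),
    undefined=StrictUndefined,
    trim_blocks=True,
)

methods = [
    {"resource": "user", "action": "get", "http_method": "GET", "path": "/users/{id}"},
    {"resource": "user", "action": "delete", "http_method": "DELETE", "path": "/users/{id}"},
]
rendered = env.get_template("adapter.py.j2").render(resource="user", methods=methods)

ast.parse(rendered)  # raises SyntaxError if the combined output is not valid Python
print(rendered)
```

The loop variable method is visible inside the included template, so each sub-template stays a few lines long and can be reviewed on its own.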

3. Generator code

The generator reads the metadata, prepares the context, and renders the templates:

# pipeline/generator.py
import json
import re
from pathlib import Path
from datetime import datetime, timezone
from collections import defaultdict
from jinja2 import Environment, FileSystemLoader, StrictUndefined
 
 
# Custom Jinja2 filters
def python_type(param: dict) -> str:
    """JSON type → Python type annotation."""
    type_map = {
        "string": "str",
        "integer": "int",
        "number": "float",
        "boolean": "bool",
        "array": "list",
        "object": "dict",
    }
    py_type = type_map.get(param.get("type", "string"), "str")
    if not param.get("required", True):
        return f"{py_type} | None"
    return py_type
 
 
def default_value(param: dict) -> str:
    """Render the default value of an optional param as Python source."""
    if "default" in param:
        # repr() quotes and escapes strings and renders True/False/None as valid literals.
        return repr(param["default"])
    return "None"
 
 
def return_type(method: dict) -> str:
    """Infer the return type annotation from the method action."""
    resource = method["resource"].title()
    action = method["action"]
    if action == "list":
        return f"Iterator[{resource}]"
    if action == "delete":
        return "None"
    return resource
 
 
def setup_jinja_env(templates_dir: str) -> Environment:
    env = Environment(
        loader=FileSystemLoader(templates_dir),
        undefined=StrictUndefined,
        trim_blocks=True,
        lstrip_blocks=True,
    )
    env.filters["python_type"] = python_type
    env.filters["default_value"] = default_value
    env.filters["return_type"] = return_type
    return env
 
 
def group_by_resource(methods: list[dict]) -> dict[str, list[dict]]:
    """Group methods by resource name."""
    grouped = defaultdict(list)
    for method in methods:
        grouped[method["resource"]].append(method)
    return dict(grouped)
 
 
def render_adapter(env: Environment, resource: str, methods: list[dict], output_dir: Path):
    """Generate the adapter file for one resource."""
    template = env.get_template("adapter.py.j2")
    context = {
        "resource": resource,
        "methods": methods,
        "generated_at": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC"),
    }
    content = template.render(**context)
    output_file = output_dir / "adapters" / f"{resource}_adapter.py"
    output_file.parent.mkdir(parents=True, exist_ok=True)
    output_file.write_text(content)
    return output_file
 
 
def render_command(env: Environment, resource: str, methods: list[dict], output_dir: Path):
    """Generate the CLI command file for one resource."""
    template = env.get_template("command.py.j2")
    context = {
        "resource": resource,
        "methods": methods,
        "generated_at": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC"),
    }
    content = template.render(**context)
    output_file = output_dir / "cli" / f"{resource}s.py"
    output_file.parent.mkdir(parents=True, exist_ok=True)
    output_file.write_text(content)
    return output_file
 
 
def render_test(env: Environment, resource: str, methods: list[dict], output_dir: Path):
    """Generate the test file for one resource."""
    template = env.get_template("test.py.j2")
    context = {
        "resource": resource,
        "methods": methods,
        "generated_at": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC"),
    }
    content = template.render(**context)
    output_file = output_dir / "tests" / "unit" / f"test_{resource}.py"
    output_file.parent.mkdir(parents=True, exist_ok=True)
    output_file.write_text(content)
    return output_file
 
 
def run_pipeline(metadata_path: str, output_dir: str, templates_dir: str):
    """Main pipeline runner."""
    metadata = json.loads(Path(metadata_path).read_text())
    methods = metadata["methods"]
    output = Path(output_dir)
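    env = setup_jinja_env(templates_dir)
    # adapter.py.j2 reads `definitions`; under StrictUndefined it must always be defined.
    env.globals["definitions"] = metadata.get("definitions", {})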
 
    # Group by resource
    resources = group_by_resource(methods)
    print(f"📦 {len(methods)} methods across {len(resources)} resources")
 
    stats = {"adapters": 0, "commands": 0, "tests": 0, "errors": 0}
 
    for resource, resource_methods in resources.items():
        try:
            render_adapter(env, resource, resource_methods, output)
            stats["adapters"] += 1
 
            render_command(env, resource, resource_methods, output)
            stats["commands"] += 1
 
            render_test(env, resource, resource_methods, output)
            stats["tests"] += 1
 
        except Exception as e:
            print(f"❌ Error generating {resource}: {e}")
            stats["errors"] += 1
 
    print(f"\n✓ Generated:")
    print(f"  {stats['adapters']} adapter files")
    print(f"  {stats['commands']} command files")
    print(f"  {stats['tests']} test files")
    if stats["errors"]:
        print(f"  ⚠️  {stats['errors']} errors")
 
 
if __name__ == "__main__":
    import sys
    run_pipeline(
        metadata_path=sys.argv[1],
        output_dir=sys.argv[2],
        templates_dir=sys.argv[3] if len(sys.argv) > 3 else "templates",
    )

Run it:

python pipeline/generator.py \
    metadata/api-metadata.json \
    generated/ \
    pipeline/templates/
 
# Output:
# 📦 439 methods across 32 resources
# ✓ Generated:
#   32 adapter files
#   32 command files
#   32 test files
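To see what the custom filters contribute, it helps to render just a parameter list. The sketch below uses condensed copies of python_type and default_value (this default_value uses repr() so that string defaults come out quoted) and a hypothetical parameter set; it is an illustration, not the template the pipeline ships.

```python
from jinja2 import Environment, StrictUndefined

TYPE_MAP = {
    "string": "str", "integer": "int", "number": "float",
    "boolean": "bool", "array": "list", "object": "dict",
}


def python_type(param: dict) -> str:
    """JSON type -> Python annotation; optional params become 'T | None'."""
    py_type = TYPE_MAP.get(param.get("type", "string"), "str")
    return py_type if param.get("required", True) else f"{py_type} | None"


def default_value(param: dict) -> str:
    """Default value rendered as Python source."""
    return repr(param["default"]) if "default" in param else "None"


env = Environment(undefined=StrictUndefined)
env.filters["python_type"] = python_type
env.filters["default_value"] = default_value

SIGNATURE = (
    "{% for p in parameters %}"
    "{{ p.name }}: {{ p | python_type }}"
    "{% if not p.required %} = {{ p | default_value }}{% endif %}"
    "{% if not loop.last %}, {% endif %}"
    "{% endfor %}"
)

# Hypothetical parameters: one required, two optional.
params = [
    {"name": "org_id", "type": "integer", "required": True},
    {"name": "status", "type": "string", "required": False, "default": "active"},
    {"name": "limit", "type": "integer", "required": False},
]
signature = env.from_string(SIGNATURE).render(parameters=params)
print(signature)
# org_id: int, status: str | None = 'active', limit: int | None = None
```

Note that Python requires parameters without defaults to come before parameters with defaults, so the metadata (or the template) has to list required params first.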

4. Test template: why you test the generator, not each file

The test template matters as much as the adapter template. But there is one thing to understand before looking at the code:

If the template is wrong, every generated test is wrong at the same time.

Testing generated code differs from testing hand-written code. You test the generator's behavior, not each individual output:

# tests/test_generator.py: tests for the generator itself
import json
from pathlib import Path

import pytest

from pipeline.generator import run_pipeline


# Smallest metadata document that still exercises all three templates.
MINIMAL_METADATA = {
    "methods": [
        {
            "id": "user.list",
            "resource": "user",
            "action": "list",
            "cli_command": "list-users",
            "http_method": "GET",
            "path": "/users",
            "summary": "List users",
            "description": "",
            "parameters": [],
            "paginated": True,
            "idempotent": True,
            "requires_auth": True,
            "tags": ["users"],
        }
    ],
    "definitions": {},
    "meta": {"api_version": "v1", "base_url": "", "auth_type": "bearer",
             "extracted_at": "2026-01-01T00:00:00Z", "total_methods": 1},
}


@pytest.fixture
def generated(tmp_path: Path) -> Path:
    """Run the pipeline on MINIMAL_METADATA and return the output directory."""
    metadata_path = tmp_path / "metadata.json"
    metadata_path.write_text(json.dumps(MINIMAL_METADATA))
    # The templates path is relative, so run pytest from the repository root.
    run_pipeline(str(metadata_path), str(tmp_path), "pipeline/templates")
    return tmp_path


def test_pipeline_generates_correct_files(generated: Path):
    """The generator must write one adapter, one command, and one test file."""
    assert (generated / "adapters" / "user_adapter.py").exists()
    assert (generated / "cli" / "users.py").exists()
    assert (generated / "tests" / "unit" / "test_user.py").exists()


def test_adapter_contains_list_method(generated: Path):
    """The generated adapter must have a list_user method with pagination logic."""
    content = (generated / "adapters" / "user_adapter.py").read_text()
    assert "def list_user" in content
    assert "Iterator" in content
    assert "has_next_page" in content


def test_command_contains_json_flag(generated: Path):
    """The generated command MUST have a --json flag (agent-ready requirement)."""
    content = (generated / "cli" / "users.py").read_text()
    assert "--json" in content
    assert "json_output" in content

That is the difference: you test what the generator produces, not each generated file. When the template is correct, every generated file is correct. When the template is wrong, you fix it once and every file is fixed on the next regeneration.
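A cheap generator-level check that complements the string assertions is to parse the rendered output. The sketch below is an assumption about how such a check could look, not part of the original test suite: TEMPLATE is a reduced stand-in for adapter.py.j2, and defined_functions is a hypothetical helper built on the standard ast module.

```python
import ast

from jinja2 import Environment, StrictUndefined

# Reduced stand-in for the adapter template (hypothetical).
TEMPLATE = (
    "class {{ resource | title }}Adapter:\n"
    "{% for method in methods %}\n"
    "    def {{ method.action }}_{{ method.resource }}(self):\n"
    '        """{{ method.summary }}"""\n'
    "\n"
    "{% endfor %}\n"
)


def defined_functions(source: str) -> set[str]:
    """Parse generated source and return the names of the functions it defines."""
    tree = ast.parse(source)  # raises SyntaxError if the template emitted invalid Python
    return {node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)}


env = Environment(undefined=StrictUndefined, trim_blocks=True)
methods = [
    {"resource": "user", "action": "list", "summary": "List users"},
    {"resource": "user", "action": "get", "summary": "Get one user"},
]
source = env.from_string(TEMPLATE).render(resource="user", methods=methods)
names = defined_functions(source)
print(sorted(names))  # ['get_user', 'list_user']
```

A substring assertion such as "def list_user" in content still passes when the surrounding indentation is broken; parsing catches that class of template bug without importing or executing the generated module.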

5. Makefile — codegen workflow

# Makefile
METADATA := metadata/api-metadata.json
GENERATED := generated/
TEMPLATES := pipeline/templates/
 
.PHONY: extract normalize codegen test test-generator test-generated clean all
 
extract:
	python tools/extract_openapi.py openapi.yaml $(METADATA)
 
normalize:
	python tools/normalize_metadata.py $(METADATA) $(METADATA)
 
codegen: extract normalize
	python pipeline/generator.py $(METADATA) $(GENERATED) $(TEMPLATES)
	@echo "✓ Codegen complete"
 
test-generator:
	pytest tests/test_generator.py -v
 
test-generated:
	pytest $(GENERATED)/tests/ -v
 
test: test-generator test-generated
 
clean:
	rm -rf $(GENERATED)
 
# Full pipeline
all: codegen test

# One command: extract → normalize → generate → test
make all

6. Edge cases: pagination, idempotency, --force

The generator has to handle a few special cases:

Pagination: covered in part 3. The template checks method.paginated and generates an iterator method.
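To make the runtime behavior of that generated iterator concrete, here is a minimal sketch with the HTTP layer replaced by canned pages. FakeUserAdapter and its data are hypothetical; only the loop inside list_user mirrors the list branch of the adapter template.

```python
from typing import Iterator, Optional


class FakeUserAdapter:
    """Stand-in for a generated adapter; _request serves canned pages, not HTTP."""

    PAGE_SIZE = 2
    _ROWS = [{"id": i} for i in range(1, 6)]  # five users -> pages of 2, 2, 1

    def __init__(self) -> None:
        self.calls: list = []  # records the query params of every request

    def _request(self, http_method: str, path: str, params: dict) -> dict:
        self.calls.append(params)
        start = (params["_page"] - 1) * params["_per_page"]
        items = self._ROWS[start:start + params["_per_page"]]
        return {"items": items, "has_next_page": start + len(items) < len(self._ROWS)}

    def list_user(self, status: Optional[str] = None) -> Iterator[dict]:
        # Same loop shape as the list branch of the adapter template.
        page = 1
        while True:
            params = {"status": status, "_page": page, "_per_page": self.PAGE_SIZE}
            params = {k: v for k, v in params.items() if v is not None}
            data = self._request("GET", "/users", params=params)
            items = data.get("items", [])
            yield from items
            if not data.get("has_next_page") or len(items) < self.PAGE_SIZE:
                break
            page += 1


adapter = FakeUserAdapter()
ids = [user["id"] for user in adapter.list_user()]
print(ids, len(adapter.calls))  # [1, 2, 3, 4, 5] 3
```

Because list_user is a generator, no request is sent until the caller starts iterating, and a caller that stops early never fetches the remaining pages.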

Idempotency: the generator emits an ensure_* variant for create operations:

{# If a create method has idempotent: true, generate an ensure_ variant #}
{% if method.action == 'create' and method.idempotent %}
def ensure_{{ method.resource }}(
    self,
    {% for param in method.parameters %}
    {{ param.name }}: {{ param | python_type }}{% if not param.required %} = {{ param | default_value }}{% endif %},
    {% endfor %}
) -> tuple[{{ resource | title }}, str]:
    """{{ method.summary }} — idempotent variant. Returns (object, action)."""
    # Try to find an existing object first
    try:
        existing = next(self.list_{{ method.resource }}(
            {# Use unique identifier params #}
        ), None)
        if existing:
            return existing, "already_exists"
    except AdapterError as e:
        if e.code != "not_found":
            raise
    # Create new
    obj = self.create_{{ method.resource }}({{ method.parameters | param_call_args }})
    return obj, "created"
{% endif %}
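The ensure_ template uses a param_call_args filter that setup_jinja_env does not register. A minimal sketch of what it could look like, assuming it simply forwards every parameter by keyword (the article does not show the real implementation):

```python
from jinja2 import Environment, StrictUndefined


def param_call_args(parameters: list[dict]) -> str:
    """Render parameters as keyword arguments, e.g. 'name=name, email=email'."""
    # Hypothetical filter: forwards each parameter under its own name.
    return ", ".join(f"{p['name']}={p['name']}" for p in parameters)


env = Environment(undefined=StrictUndefined)
env.filters["param_call_args"] = param_call_args

method = {
    "resource": "user",
    "parameters": [{"name": "name"}, {"name": "email"}],
}
line = env.from_string(
    "obj = self.create_{{ method.resource }}({{ method.parameters | param_call_args }})"
).render(method=method)
print(line)  # obj = self.create_user(name=name, email=email)
```

Passing by keyword rather than by position keeps the generated call correct even if the parameter order in the metadata changes between regenerations.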

--force flag, for destructive operations (delete):

{% if method.action == 'delete' %}
@app.command()
def {{ method.cli_command | underscore }}(
    {% for param in method.parameters %}
    {{ param.name }}: {{ param | typer_type }} = typer.{{ param | typer_arg }},
    {% endfor %}
    force: bool = typer.Option(False, "--force", help="Skip confirmation (required)"),
    json_output: bool = typer.Option(False, "--json"),
):
    """{{ method.summary }} — DESTRUCTIVE. Requires --force."""
    if not force:
        msg = "Use --force to confirm deletion"
        print(json.dumps({"error": "missing_force", "message": msg}), file=sys.stderr)
        raise SystemExit(2)
    # proceed with deletion
{% endif %}
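The delete template relies on three more filters (underscore, typer_type, typer_arg) that are not among the ones registered in setup_jinja_env. The versions below are guesses at their intent, offered as a sketch: required params become positional Typer arguments, optional ones become --flags. The generated file would also need to import typer and Optional.

```python
from jinja2 import Environment, StrictUndefined

TYPE_MAP = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}


def underscore(command: str) -> str:
    """'delete-user' -> 'delete_user', a valid Python function name."""
    return command.replace("-", "_")


def typer_type(param: dict) -> str:
    """Annotation for a CLI parameter; optional params are wrapped in Optional[...]."""
    py_type = TYPE_MAP.get(param.get("type", "string"), "str")
    return py_type if param.get("required", True) else f"Optional[{py_type}]"


def typer_arg(param: dict) -> str:
    """Required params become positional arguments, optional ones become --flags."""
    if param.get("required", True):
        return "Argument(...)"
    flag = "--" + param["name"].replace("_", "-")
    return f'Option(None, "{flag}")'


env = Environment(undefined=StrictUndefined, trim_blocks=True)
env.filters.update(underscore=underscore, typer_type=typer_type, typer_arg=typer_arg)

# Reduced stand-in for the signature part of the delete template.
TEMPLATE = (
    "def {{ method.cli_command | underscore }}(\n"
    "{% for param in method.parameters %}\n"
    "    {{ param.name }}: {{ param | typer_type }} = typer.{{ param | typer_arg }},\n"
    "{% endfor %}\n"
    '    force: bool = typer.Option(False, "--force"),\n'
    "): ...\n"
)

method = {
    "cli_command": "delete-user",
    "parameters": [
        {"name": "user_id", "type": "integer", "required": True},
        {"name": "dry_run", "type": "boolean", "required": False},
    ],
}
signature = env.from_string(TEMPLATE).render(method=method)
print(signature)
```

Whatever the real filters look like, they must be registered on the environment before rendering, because Jinja2 rejects a template that references an unknown filter.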

7. Application in AI-centric engineering

Once you have 439 generated commands, an agent cannot keep them all in context. This is where the Skills doc (part 8) becomes important: the generator produces one more output, SKILL.md, which summarizes the 439 commands in about 200 tokens.

codegen: extract normalize
	python pipeline/generator.py $(METADATA) $(GENERATED) $(TEMPLATES)
	python pipeline/generate_skills.py $(METADATA) SKILL.md   # ← added

The pipeline does not only generate code for developers; it generates the interface for agents.
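The content of generate_skills.py belongs to part 8, so the following is only a guess at the core idea: collapse the command list to one line per resource so the summary grows with the number of resources rather than the number of methods. build_skill_doc, the mycli name, and the output layout are all hypothetical.

```python
from collections import defaultdict


def build_skill_doc(methods: list[dict], cli_name: str = "mycli") -> str:
    """Summarize commands as one line per resource instead of one line per command."""
    actions: dict[str, list[str]] = defaultdict(list)
    for method in methods:
        actions[method["resource"]].append(method["action"])

    lines = [
        f"# {cli_name} commands",
        "",
        "All commands accept --json; delete commands require --force.",
        "",
    ]
    for resource in sorted(actions):
        lines.append(f"- {resource}: {', '.join(sorted(actions[resource]))}")
    return "\n".join(lines) + "\n"


# Hypothetical sample metadata.
sample = [
    {"resource": "user", "action": "list"},
    {"resource": "user", "action": "get"},
    {"resource": "order", "action": "create"},
]
doc = build_skill_doc(sample)
print(doc)
```

Sorting resources and actions keeps the file stable between runs, so a diff of SKILL.md shows only real API changes.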

A number to make this concrete: in a real procurement CLI project (439 methods, 395 tests), the generated code totals about 16,000 lines. Generation takes under 3 seconds. Verification (running the tests) takes about 45 seconds. Written by hand at 100 lines per hour, that is 160 hours.

That is the ROI of codegen. It is not about laziness; it is about correctness at scale.

Next article

Part 6, Testing Generated Code: Strategies That Scale, covers four strategies for testing generated code: snapshot testing (to detect unintended changes in template output), contract testing (generated code matches the API spec), mock adapter testing, and integration testing. It also explains in more depth why you test the generator rather than each generated file.