type: evals_registry
entry_prefix: EVAL
schema:
  id: EVAL-XXX
  date: YYYY-MM-DD
  output: string (what was produced)
  method: string (how it was evaluated - manual read, test, benchmark, user feedback)
  anomalies: list of strings (what was wrong, missing, surprising)
  action: [keep | correct | deprecate]
rules:

| ID | Date | Output | Action |
|---|---|---|---|