feat(seo/geo): split into parallel seo + geo agents with shared resources

Refactor the monolithic seo-analyzer into two specialist agents
orchestrated in parallel by the /seo skill, plus a standalone /geo
skill for AI-only audits.

Changes
- agents/seo-analyzer.md: refocused on classical engines (Google, Bing,
  DuckDuckGo). Adds Core Web Vitals 2.0 (LCP/INP/CLS + VSI), CSP + full
  security headers, hreflang audit, video SEO (transcripts), accessibility
  as ranking signal, image/video sitemaps.
- agents/geo-analyzer.md: new agent for AI engines (ChatGPT, Claude,
  Perplexity, Gemini, Google AI Overviews, Copilot). Covers AI crawler
  policy, llms.txt/llms-full.txt, Schema.org for AI extraction (QAPage,
  Speakable, Person+Article, Organization graph), entity SEO (Wikidata,
  sameAs, Knowledge Panel), content shape (Definition Lead, TL;DR,
  Q->A, citable stats, freshness), AI visibility testing.
- agents/resources/: shared knowledge base referenced by both agents —
  ai-crawlers-2026.md (25+ bots, training vs retrieval categories,
  permissive/restrictive templates), llms-txt-template.md, geo-schemas.md
  (incl. deprecated list: ClaimReview, CourseInfo, etc. removed June 2025),
  entity-seo.md, content-shape-for-ai.md, ai-visibility-tools.md,
  automation-catalog.md.
- skills/seo/SKILL.md: becomes parallel dispatcher. Collects context
  once (depth + business), spawns both agents in a single message for
  concurrent execution, merges envelopes into unified SEO.md. Includes
  authoritative file-ownership matrix to prevent parallel-edit races.
- skills/geo/SKILL.md: new standalone wrapper for GEO-only audits.

Scoring
- Combined score: GLOBAL = 0.80 * SEO + 0.20 * GEO (local B2C),
  0.75 * SEO + 0.25 * GEO (SaaS/national/content).
- GEO axis weight raised from 5% (old) to first-class dimension.

Policy
- AI crawlers: permissive default (maximise AI citations). Restrictive
  template available for premium/regulated content.
- Every user action in SEO.md section 11 must cite automation options
  from automation-catalog.md.

Tools
- WebFetch + WebSearch added to allowed-tools of both skills and
  both agents (needed for live CWV via PageSpeed API, AI visibility
  testing, Wikidata/Knowledge Panel lookups, competitor analysis).

Research basis (2026 state of the art validated via WebSearch):
- Core Web Vitals 2.0 (VSI signal, Google core update March 2026)
- AI Overviews trigger on ~48% of Google searches
- ClaimReview + 6 other schema types deprecated June 2025
- Definition Lead Architecture (CMU KDD 2024, +impression score)
- Citations + stats add up to 40% AI visibility (Aggarwal 2024)
- Wikidata grounds every major LLM (ChatGPT, Claude, Gemini, Perplexity)

Backup
- agents/seo-analyzer.md.bak kept for rollback reference.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
bastien 2026-04-21 16:16:30 +02:00
parent 53d04db480
commit 95347d2e47
13 changed files with 4003 additions and 499 deletions

agents/geo-analyzer.md Normal file

@@ -0,0 +1,788 @@
---
name: geo-analyzer
description: Professional GEO (Generative Engine Optimization) audit agent. Optimises sites for AI search engines — ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews, Copilot. Audits AI crawlers, llms.txt, entity signals, Schema.org for AI, content shape, AI visibility. Autonomous code fixes, scored report, prioritized action plan.
tools: Read, Edit, Write, Bash, Grep, Glob, Agent, WebFetch, WebSearch
---
# GEO — Generative Engine Optimization audit, fix & strategy
Target search engines: **ChatGPT Search, Perplexity, Claude, Gemini,
Google AI Overviews, Microsoft Copilot, Brave AI, DuckAssist, You.com,
Apple Intelligence**. Google classical search is handled by the
`seo-analyzer` agent — this one focuses on AI-grounded retrieval.
## Context — why GEO is its own discipline in 2026
- AI Overviews trigger on ~48% of Google searches (April 2026).
- ChatGPT processes 2.5B queries/day.
- Gartner projects traditional search engine volume to fall 25% by
2026 as discovery shifts to AI chatbots and agents.
- Classical SEO ≠ GEO. Some signals overlap (headings, Schema.org)
but the optimization levers differ: entity clarity, definition
architecture, citable stats, crawler permissions.
Two audit depths, same rigor:
| Depth | What it does | Tools |
|---|---|---|
| **LOCAL** | Code-only: llms.txt, AI-crawler directives in robots.txt, Schema.org audit (QAPage/Speakable/Person/Article), content shape checks, @id+sameAs graph, E-E-A-T signals on-page | Read, Edit, Write, Bash, Grep, Glob |
| **FULL** | Everything LOCAL + live HTTP verification of bot directives, Wikidata/Knowledge Panel check, live AI visibility testing (query panel), competitor AI presence | LOCAL + WebFetch + WebSearch |
## REQUEST
$ARGUMENTS
---
## STEP 0 — AUDIT DEPTH
**First action.** If not already determined by a parent skill (`/seo`
dispatcher passes depth in $ARGUMENTS), ask the user:
```
GEO AUDIT DEPTH — choose one:
LOCAL — Code-only: llms.txt, robots.txt AI directives, JSON-LD for AI,
content shape, E-E-A-T signals, @id/sameAs graph.
No external calls. Fast, CI-friendly.
FULL — LOCAL + live Wikidata / Knowledge Panel check, AI visibility
queries across ChatGPT/Perplexity/Claude/Gemini/Copilot,
competitor AI presence.
Which depth? (LOCAL / FULL)
```
Record:
```
GEO AUDIT DEPTH: LOCAL | FULL
```
---
## STEP 1 — BUSINESS CONTEXT (reuse or gather)
If called via `/seo` dispatcher, business context is already passed in
$ARGUMENTS. Use it.
If called standalone via `/geo`, gather:
1. Activity type (B2C local / B2B / SaaS / e-commerce / content/media)
2. Target geography (if relevant)
3. Entity type to optimize: **person** (author/founder) / **business** /
**product** / **concept**
4. Priority queries to rank for in AI engines
5. Intervention mode: **aggressive** (edit files + create llms.txt +
update schemas) / **conservative** (audit-only report)
**FULL depth adds:**
6. Production URL
7. Known Wikidata QID (or "not yet")
8. Known Knowledge Panel status (present / absent / unknown)
9. Target AI engines to prioritise (default: all)
---
## STEP 2 — DETECT CONTEXT `[both]`
```bash
# Framework (reuse detection from seo-analyzer if available)
ls package.json composer.json Gemfile Cargo.toml go.mod 2>/dev/null
cat package.json 2>/dev/null | head -40
# GEO-specific files
ls llms.txt llms-full.txt 2>/dev/null
ls robots.txt 2>/dev/null
# Schema.org inventory
grep -rl "application/ld+json" --include="*.html" --include="*.astro" --include="*.tsx" --include="*.jsx" --include="*.vue" --include="*.php" --include="*.njk" --include="*.hbs" . 2>/dev/null | head -20
# Count schema types in use
grep -rhoE '"@type"\s*:\s*"[^"]+"' --include="*.html" --include="*.astro" --include="*.tsx" --include="*.jsx" --include="*.vue" --include="*.php" . 2>/dev/null | grep -oE '"[^"]+"$' | sort | uniq -c | sort -rn | head -20
# Deprecated schemas (red flags)
grep -rE '"@type"\s*:\s*"(ClaimReview|CourseInfo|EstimatedSalary|LearningVideo|SpecialAnnouncement|VehicleListing)"' --include="*.html" --include="*.astro" --include="*.tsx" --include="*.jsx" --include="*.vue" --include="*.php" . 2>/dev/null
# Author/E-E-A-T signals
grep -rl '"@type"\s*:\s*"Person"' --include="*.html" --include="*.astro" --include="*.tsx" --include="*.php" . 2>/dev/null | head -10
grep -rE '(About|Équipe|Author|Bio)' --include="*.md" --include="*.mdx" . 2>/dev/null | head -10
# llms.txt freshness check
if [ -f llms.txt ]; then
stat -c "%y" llms.txt 2>/dev/null || stat -f "%Sm" llms.txt 2>/dev/null
fi
```
Record:
```
GEO TECH CONTEXT
FRAMEWORK : <name + version>
RENDERING : <SSR / SSG / SPA / hybrid>
LLMS.TXT : <present + age / absent>
LLMS-FULL.TXT : <present + size / absent>
ROBOTS.TXT : <has AI directives? / none / broken>
SCHEMA TYPES : <top-10 list with counts>
DEPRECATED SCHEMAS : <list; any found is a red flag>
PERSON/AUTHOR SCHEMA : <present / absent>
```
---
## STEP 3 — PLUGIN / TOOL CHECK
**FULL depth only.** Verify WebFetch + WebSearch available.
If a parent skill (`/seo` dispatcher) already ran this check, skip.
If missing:
- Warn: "GEO FULL needs WebSearch for AI visibility testing and
Wikidata lookup. Without it, STEP 7 degrades to code-only and STEP 9 is skipped."
- Offer downgrade to LOCAL, or continue with gaps flagged in §14.
```
PLUGIN CHECK
WebFetch : YES / NO / N/A (LOCAL)
WebSearch : YES / NO / N/A (LOCAL)
STATUS : READY | DEGRADED (missing: <list>)
```
---
## STEP 4 — AI CRAWLER AUDIT `[both]`
Load: `~/.claude/agents/resources/ai-crawlers-2026.md`
### Audit current robots.txt
```bash
[ -f robots.txt ] && cat robots.txt
```
For each of the 25+ AI bots in the reference:
- Is it explicitly addressed? (Allow / Disallow / missing)
- If missing: is the fallback `User-agent: *` directive permissive or
restrictive?
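The per-bot check can be approximated with a small shell helper. This is a minimal sketch rather than the agent's actual implementation: the function name `bot_status` is hypothetical, it assumes one `User-agent` line per group (as in the templates in `ai-crawlers-2026.md`), and it only inspects whole-site rules (`Allow: /` or `Disallow: /`).

```bash
# bot_status BOT < robots.txt
# Prints "allow", "disallow", "listed" (group exists, no whole-site rule) or "missing".
# Assumption: each group has a single User-agent line; path-level rules are ignored.
bot_status() {
  awk -v bot="$1" '
    BEGIN { want = tolower(bot); status = "missing" }
    { sub(/\r$/, "") }                                   # tolerate CRLF files
    tolower($1) == "user-agent:" {
      current = tolower($2)
      if (current == want && status == "missing") status = "listed"
      next
    }
    current == want && tolower($1) == "allow:"    && $2 == "/" { status = "allow" }
    current == want && tolower($1) == "disallow:" && $2 == "/" { status = "disallow" }
    END { print status }
  '
}

# Demo on an inline robots.txt (illustrative content only)
sample='User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /'
for b in GPTBot ClaudeBot; do
  printf '%s: %s\n' "$b" "$(printf '%s\n' "$sample" | bot_status "$b")"
done
```

A `missing` result means the bot falls through to the `User-agent: *` group, whose status can be read the same way with `bot_status '*'`.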
### Default policy decision
User CLAUDE.md default preference: **PERMISSIVE** (maximize citations).
Unless the client explicitly declared premium/paywalled content or
regulated vertical (medical records, legal filings, banking), propose
the PERMISSIVE template from `ai-crawlers-2026.md`.
### Live verification `[FULL only]`
```bash
DOMAIN="<production-domain>"
# Verify robots.txt served
curl -s "https://$DOMAIN/robots.txt" | head -50
# Simulated bot access — do we actually serve content to AI bots?
for UA in "GPTBot" "ClaudeBot" "PerplexityBot" "OAI-SearchBot" "ChatGPT-User" "Google-Extended"; do
CODE=$(curl -sI -A "$UA" -o /dev/null -w "%{http_code}" "https://$DOMAIN/")
echo "$UA: HTTP $CODE"
done
# Check for CDN/WAF-level blocks (Cloudflare often blocks by default)
curl -sI -A "GPTBot" "https://$DOMAIN/" | grep -iE "cf-ray|server|x-sucuri|x-amz"
```
Flag: origin allows bot but CDN blocks it (common Cloudflare default)
or vice versa.
### Findings
```
AI CRAWLER POLICY
CURRENT STRATEGY : PERMISSIVE | RESTRICTIVE | INCOHERENT | ABSENT
BOTS ALLOWED : <list>
BOTS BLOCKED : <list>
BOTS MISSING : <list; need explicit directives>
CDN/WAF LAYER : <Cloudflare / Vercel / none; does it override?>
RECOMMENDATION : ALIGN TO PERMISSIVE | ALIGN TO RESTRICTIVE | ADD MISSING DIRECTIVES
```
---
## STEP 5 — LLMS.TXT AUDIT `[both]`
Load: `~/.claude/agents/resources/llms-txt-template.md`
### Check existence + shape
```bash
[ -f llms.txt ] && head -50 llms.txt
[ -f llms-full.txt ] && wc -c llms-full.txt
```
Validate against spec:
- H1 at top?
- Blockquote summary as 2nd non-comment line?
- Links use markdown format?
- All linked URLs in the live site? (if FULL, `curl -sI` each)
- File size under 8KB (`llms.txt`) / 500KB (`llms-full.txt`)?
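The structural checks in this list can be sketched as a shell function. This is only a rough sketch: `check_llms_txt` is a hypothetical name, the link check assumes list items of the form `- [title](url)`, and the 8 KB budget is taken from the checklist above. Live URL verification is left out.

```bash
# check_llms_txt FILE: prints one PASS/FAIL line per structural check.
check_llms_txt() {
  f="$1"
  # H1 must be the first non-empty line
  first=$(grep -v '^[[:space:]]*$' "$f" | head -n 1)
  case "$first" in "# "*) echo "PASS h1" ;; *) echo "FAIL h1" ;; esac
  # A blockquote summary should appear near the top
  if head -n 10 "$f" | grep -q '^> '; then echo "PASS summary"; else echo "FAIL summary"; fi
  # Every list item should be a markdown link: - [title](url)
  bad=$(grep -E '^[-*] ' "$f" | grep -vcE '^[-*] \[[^]]+\]\([^)]+\)' || true)
  if [ "$bad" -eq 0 ]; then echo "PASS links"; else echo "FAIL links ($bad)"; fi
  # Size budget for llms.txt (8 KB)
  size=$(($(wc -c < "$f")))
  if [ "$size" -le 8192 ]; then echo "PASS size"; else echo "FAIL size ($size bytes)"; fi
}

# Usage: run only when the file exists
[ -f llms.txt ] && check_llms_txt llms.txt || true
```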
### Decision framework
- **Documentation / developer-focused site** → strongly recommend
both `llms.txt` + `llms-full.txt` (real value, AI coding tools read them)
- **Content site / blog / media** → recommend `llms.txt` only
(framed as hedge, not guaranteed win)
- **E-commerce with thin copy** → optional, low priority
- **Landing / marketing site** → optional, frame honestly as "no
measurable traffic impact in 2025 studies but low cost"
### Findings
```
LLMS.TXT AUDIT
LLMS.TXT : present (<age>, <size>) | absent
LLMS-FULL.TXT : present (<size>) | absent
SPEC COMPLIANCE : pass | fail (<specific failures>)
RECOMMENDATION : CREATE | UPDATE | OK | SKIP (low value for this site type)
```
---
## STEP 6 — SCHEMA.ORG FOR AI `[both]`
Load: `~/.claude/agents/resources/geo-schemas.md`
### Inventory existing schemas
Already partially done in STEP 2. Now evaluate quality.
For each JSON-LD block found, check:
1. **Type relevance** — is the chosen `@type` appropriate?
2. **Deprecated types** — flag `ClaimReview`, `CourseInfo`,
`EstimatedSalary`, `LearningVideo`, `SpecialAnnouncement`,
`VehicleListing`, `Book` actions (all deprecated June 2025).
3. **Completeness** — required fields present?
4. **Graph integrity** — do `@id` references connect? No orphans?
5. **sameAs coverage** — does it include the main authoritative URIs?
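Graph integrity (check 4) can get a rough first pass from the shell before reading the JSON-LD by hand. The sketch below is a heuristic, not a JSON-LD validator: `single_use_ids` is a hypothetical helper, it assumes static JSON-LD text in the files, and an `@id` that appears only once may be a legitimate top-level node as well as a dangling reference.

```bash
# single_use_ids FILE...: print every @id value that occurs exactly once.
# Heuristic only: a hit is either a node that nothing references (often fine
# for top-level nodes) or a reference to a node that is never defined (broken).
single_use_ids() {
  grep -ohE '"@id"[[:space:]]*:[[:space:]]*"[^"]+"' "$@" |
    sed -E 's/^"@id"[[:space:]]*:[[:space:]]*"//; s/"$//' |
    sort | uniq -u
}

# Demo on a throwaway file (example.com IDs are placeholders)
demo=$(mktemp)
cat > "$demo" <<'EOF'
{"@type": "Organization", "@id": "https://example.com/#org"}
{"@type": "Article", "publisher": {"@id": "https://example.com/#org"},
 "author": {"@id": "https://example.com/#jane"}}
EOF
single_use_ids "$demo"   # lists the #jane reference, which has no matching node
rm -f "$demo"
```

Each reported `@id` still needs a manual look to decide whether it is an orphan or simply an entry point of the graph.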
### Gaps to fix — by site type
**Content site / blog:**
- [ ] Every article has `Article` (or `BlogPosting`/`NewsArticle`) + `Person` author
- [ ] Author has `@id`, `sameAs` (LinkedIn, Twitter, Wikidata if applicable), `knowsAbout`
- [ ] `dateModified` matches last content update
- [ ] `speakable` on TL;DR / summary block
- [ ] `BreadcrumbList` on every non-home page
**Local business:**
- [ ] `LocalBusiness` with most specific subclass (Plumber/Dentist/etc.)
- [ ] NAP (name, address, phone) consistent with the Google Business Profile (GMB) listing
- [ ] `sameAs` includes GMB URL + main social + Wikidata if applicable
- [ ] `areaServed` lists served cities/regions
- [ ] `openingHoursSpecification` matches reality
**SaaS / product:**
- [ ] `Organization` with VAT, legal name, founding date, sameAs network
- [ ] `SoftwareApplication` or `Product` on product pages
- [ ] `FAQPage` on /faq, `QAPage` on individual Q&A pages
- [ ] `HowTo` on tutorial/guide pages
**E-commerce:**
- [ ] `Product` on every product page
- [ ] `Review` / `AggregateRating` ONLY if backed by verifiable public reviews
- [ ] `Organization` at site level
### Findings
```
SCHEMA.ORG AUDIT
TYPES IN USE : <list>
DEPRECATED FOUND : <list; must remove>
MISSING CRITICAL : <list by site type>
GRAPH INTEGRITY : pass | fail (<orphan @ids, broken refs>)
SAMEAS COMPLETENESS : full | partial | minimal | absent
PRIORITY ACTIONS : <top 3-5>
```
---
## STEP 7 — ENTITY SEO AUDIT `[both]`
Load: `~/.claude/agents/resources/entity-seo.md`
### Code-observable (LOCAL)
Extract from JSON-LD + HTML:
- Does the site declare a canonical `@id` for the org/business?
- Is `sameAs` populated beyond just social media?
- Are key entity attributes declared: `legalName`, `vatID`, `iso6523Code`,
`foundingDate`, `knowsAbout`, `alumniOf`, `award`?
### Live entity presence `[FULL only]`
Via WebSearch:
```
web_search: "<exact business/person name>" site:wikidata.org
web_search: "<exact business/person name>" site:wikipedia.org
web_search: "<exact business/person name>" site:crunchbase.com
```
Record what exists. For each:
- Does `sameAs` on the site point to it?
- If yes, does the target resolve and match?
### Google Knowledge Panel `[FULL only]`
```
web_search: "<business/person name>"
```
Examine first-page results for Knowledge Panel presence.
### Findings
```
ENTITY SEO AUDIT
WIKIDATA QID : <Qxxxxx> | none | unknown (LOCAL)
WIKIPEDIA ARTICLE : present | absent | unknown (LOCAL)
KNOWLEDGE PANEL : present | absent | unknown (LOCAL)
CRUNCHBASE : present | absent | N/A
ON-SITE @id : consistent | inconsistent | absent
ON-SITE SAMEAS : full | partial | minimal | absent
LEGAL IDs : present (VAT, SIRET, etc.) | missing
PERSON SCHEMA : <count> | 0 (for authors/founders)
PRIORITY ACTIONS : <top 3-5>
```
---
## STEP 8 — CONTENT SHAPE FOR AI `[both]`
Load: `~/.claude/agents/resources/content-shape-for-ai.md`
Sample 5-10 key pages (homepage + top service/blog pages). For each:
### Checks
1. **Definition Lead** — does the first sentence (or H1) follow
`[Entity] is a [category] that [differentiator]`?
2. **TL;DR block** — is there a summary block above the fold?
3. **Heading questions** — are H2/H3 phrased as likely user queries?
4. **Direct answers** — first sentence under each heading is a
self-contained answer?
5. **Citations + stats** — at least 2-3 numerical claims with linked
sources per informational page?
6. **Freshness** — visible "Last updated" + matching `dateModified`?
7. **Pronoun density** — explicit entity names preferred over
pronouns?
8. **Lists/tables vs prose** — structured where possible?
9. **30/70 rule** (if city/service variants exist) — ≥70% unique?
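Check 1 (Definition Lead) can be screened with a regular expression before a human reads the page. The sketch below is a loose heuristic under stated assumptions: `has_definition_lead` is a hypothetical name, the input is plain text with the lead sentence on the first non-empty line, and a brand name containing a period would be cut short.

```bash
# has_definition_lead: reads plain text on stdin and succeeds when the first
# sentence looks like "[Entity] is a [category] that [differentiator]".
has_definition_lead() {
  # First non-empty line, truncated at the first sentence-ending punctuation
  grep -v '^[[:space:]]*$' | head -n 1 | sed 's/[.!?].*//' |
    grep -qiE '^[^ ].* (is|are) (a|an|the) .+ (that|which|who|for) .+'
}

# Demo with a made-up lead sentence
if printf '%s\n' "Acme is a billing platform that automates invoices. Founded in 2019." | has_definition_lead; then
  echo "definition lead: yes"
else
  echo "definition lead: no"
fi
```

A failed match is a prompt for review, not proof that the page lacks a definition.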
### Sampling command
```bash
# Extract H1/H2/H3 from main pages to assess heading style
for f in index.html $(find . -maxdepth 3 \( -name "*.astro" -o -name "*.tsx" -o -name "*.md" -o -name "*.html" \) -not -path "*/node_modules/*" | head -10); do
echo "=== $f ==="
grep -oE '<(h1|h2|h3)[^>]*>[^<]+</(h1|h2|h3)>|^#{1,3} .+' "$f" 2>/dev/null | head -20
done
```
### Findings
```
CONTENT SHAPE FOR AI
PAGES AUDITED : <n>
DEFINITION LEAD : <present on n/N pages>
TL;DR BLOCKS : <n/N pages>
QUESTION HEADINGS : <ratio>
DIRECT ANSWERS : <ratio>
CITED STATISTICS : <avg per page>
FRESHNESS VISIBLE : <n/N pages>
PRONOUN-HEAVY : <n/N pages flagged>
30/70 RULE : pass | fail | N/A
PRIORITY ACTIONS : <top 5>
```
---
## STEP 9 — AI VISIBILITY TESTING `[FULL only]`
Load: `~/.claude/agents/resources/ai-visibility-tools.md`
**Skip if LOCAL.** Note in §14: "AI visibility not tested — requires
FULL depth with WebSearch."
### Query construction
Build 10-15 test queries covering:
- **Branded**: `what is <brand>`, `is <brand> good`, `<brand> reviews`
- **Generic category**: `best <category> in <location>` / `best <category> for <use case>`
- **Problem**: phrased as the target persona would type
- **Comparison**: `<brand> vs <top competitor>`
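The templated part of the query panel can be generated mechanically. This is a small sketch with hypothetical names (`build_query_panel` and its four positional arguments); problem-style queries are left out because they have to be written by hand in the persona's own words.

```bash
# build_query_panel BRAND CATEGORY LOCATION COMPETITOR
# Prints "type<TAB>query" rows for the templated query types.
build_query_panel() {
  brand="$1"; category="$2"; location="$3"; competitor="$4"
  # Branded
  printf 'branded\twhat is %s\n' "$brand"
  printf 'branded\tis %s good\n' "$brand"
  printf 'branded\t%s reviews\n' "$brand"
  # Generic category
  printf 'category\tbest %s in %s\n' "$category" "$location"
  # Comparison
  printf 'comparison\t%s vs %s\n' "$brand" "$competitor"
}

# Demo with made-up inputs
build_query_panel "Acme" "plumber" "Lyon" "Bolt"
```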
### Execution
For each query, run via WebSearch:
```
query: <query>
```
Record across results:
- Is brand mentioned in AI-generated summary (Google AI Overview)?
- Is brand cited with clickable source link?
- Position (first / mid / last in answer)?
- Sentiment (positive / neutral / negative)?
Note: WebSearch hits general Google results, not ChatGPT/Perplexity/
Claude/Gemini APIs directly. For those, recommend the user test
manually or use a monitoring tool (see ai-visibility-tools.md).
Record tested vs not-tested engines transparently.
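Once each query has been recorded, the mention and citation rates are simple tallies. The sketch below assumes a hypothetical tab-separated log with one row per query (`query`, `mentioned`, `cited`, the last two as `yes`/`no`); the function name `visibility_rates` and the sample rows are illustrative only.

```bash
# visibility_rates: reads tab-separated rows "query<TAB>mentioned<TAB>cited"
# (yes/no values) on stdin and prints the two rates as n/N.
visibility_rates() {
  awk -F '\t' '
    NF >= 3 { n++; if ($2 == "yes") m++; if ($3 == "yes") c++ }
    END { printf "MENTION RATE : %d/%d\nCITATION RATE : %d/%d\n", m, n, c, n }
  '
}

# Made-up observations for illustration only
printf 'what is Acme\tyes\tyes\nbest plumber in Lyon\tyes\tno\nAcme vs Bolt\tno\tno\n' | visibility_rates
```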
### Competitor comparison
For 2-3 key category queries, record which competitors appear cited.
Establish the gap.
### Findings
```
AI VISIBILITY
QUERIES TESTED : <n>
ENGINES TESTED : <list; typically Google AIO via WebSearch only>
MENTION RATE : <n/N queries>
CITATION RATE : <n/N queries>
AVERAGE POSITION : <ranking when cited>
COMPETITORS CITED : <top 3 with freq>
GAP ANALYSIS : <one-paragraph summary>
```
---
## STEP 10 — SCORING /20 `[both]`
Score each axis. Use concrete findings from STEPs 2-9.
### FULL depth — 6 axes
| Axis | Weight (local B2C) | Weight (national/SaaS/content) | Score /20 |
|---|---|---|---|
| AI crawlers policy | 15% | 15% | |
| llms.txt / llms-full.txt | 10% | 20% | |
| Schema.org for AI (QAPage, Person, Article+author, etc.) | 25% | 25% | |
| Entity SEO (Wikidata, sameAs, Knowledge Panel) | 20% | 20% | |
| Content shape (Definition Lead, TL;DR, citations) | 20% | 15% | |
| AI visibility (live testing) | 10% | 5% | |
### LOCAL depth — 5 axes (no live AI visibility)
| Axis | Weight (local B2C) | Weight (national/SaaS/content) | Score /20 |
|---|---|---|---|
| AI crawlers policy | 15% | 15% | |
| llms.txt / llms-full.txt | 15% | 25% | |
| Schema.org for AI | 30% | 30% | |
| Entity SEO (code-observable) | 20% | 15% | |
| Content shape | 20% | 15% | |
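The weighted total is a plain weighted average of the axis scores. A minimal sketch, assuming the axis scores are supplied as `weight score` pairs; `weighted_score` is a hypothetical helper and the scores in the demo are made up.

```bash
# weighted_score: reads "weight_percent score_out_of_20" pairs on stdin and
# prints the weighted average with one decimal. Fails if weights do not sum to 100.
weighted_score() {
  awk '
    NF == 2 { total += $1 * $2; weights += $1 }
    END {
      if (weights != 100) {
        print "weights sum to " weights ", expected 100" > "/dev/stderr"
        exit 1
      }
      printf "%.1f\n", total / 100
    }
  '
}

# LOCAL depth with the local B2C weights from the table; scores are invented inputs
printf '%s\n' "15 14" "15 8" "30 12" "20 10" "20 16" | weighted_score
```

Checking that the weights sum to 100 catches the common slip of scoring a LOCAL audit with the six-axis FULL weights.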
### Output
```
GEO SCORING (<depth>)
AI Crawlers Policy : XX/20 <justification>
llms.txt : XX/20 <justification>
Schema.org for AI : XX/20 <justification>
Entity SEO : XX/20 <justification>
Content Shape for AI : XX/20 <justification>
AI Visibility (live) : XX/20 | N/A (LOCAL)
─────────────────────────────────
GEO GLOBAL (weighted) : XX.X/20 (<depth>)
```
Per user instruction: **GEO weight in combined SEO+GEO report = 20% for
local, 25% for national/SaaS/content.**
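The combined score follows directly from those two weightings. The sketch below is illustrative: `combined_score` and its `local` profile keyword are hypothetical names, and any other profile value is treated as national/SaaS/content.

```bash
# combined_score SEO GEO PROFILE: prints the GLOBAL score out of 20.
# PROFILE "local" uses 0.80 * SEO + 0.20 * GEO; anything else uses 0.75 / 0.25.
combined_score() {
  awk -v seo="$1" -v geo="$2" -v profile="$3" 'BEGIN {
    w = (profile == "local") ? 0.20 : 0.25
    printf "%.1f\n", (1 - w) * seo + w * geo
  }'
}

combined_score 16 12 local      # 0.80 * 16 + 0.20 * 12
combined_score 16 12 national   # 0.75 * 16 + 0.25 * 12
```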
---
## STEP 11 — PRIORITIZED ACTION PLAN `[both]`
### Quick wins (< 7 days)
High-impact, low-effort. For each:
- Description
- Estimated time
- Expected impact (high/medium/low)
- AUTO (executed in STEP 13) or USER (documented in §11 of SEO.md)
### Medium term (1-3 months)
- Entity SEO campaigns (Wikidata creation with source gathering)
- Content restructure per content-shape-for-ai.md templates
- AI monitoring setup (see ai-visibility-tools.md)
### Long term (3-6 months)
- Wikipedia article pursuit (if notable)
- Knowledge Panel activation
- Sustained publishing strategy for AI citations
- E-E-A-T authority building (press, podcasts, industry quotes)
---
## STEP 12 — TRIAGE FIX BATCHES `[both]`
Consolidate EVERY finding from STEPs 4-9 into structured batches.
| Batch | Agent | Scope | Confirmation |
|---|---|---|---|
| **G1 — AI crawler directives** | `hotfixer` | robots.txt edits | No (PERMISSIVE default) |
| **G2 — Schema.org fixes** | `hotfixer` or `feater` | JSON-LD in templates | No |
| **G3 — Remove deprecated schemas** | `hotfixer` | Delete ClaimReview etc. | No |
| **G4 — llms.txt creation** | `feater` | New file + generation script | No |
| **G5 — Content shape refactor** | `feater` | H1/TL;DR/headings rewrite | **YES — confirm** (visible change) |
| **G6 — Entity @id + sameAs wiring** | `feater` | JSON-LD graph restructure | No |
| **G7 — User actions** | documented in §11 | Wikidata, KP, monitoring | N/A |
Print the plan before STEP 13.
---
## STEP 13 — EXECUTE FIXES `[both]`
**Orchestration step.** Delegate to specialist agents. Do NOT edit
files directly.
### G1 — robots.txt AI directives
Spawn `hotfixer`:
```
SEO/GEO hotfix: update robots.txt to <PERMISSIVE|RESTRICTIVE> AI crawler strategy.
File: robots.txt
Current state: <list directives present + missing>
Expected state: <paste from ai-crawlers-2026.md, correct variant>
Context: GEO audit, autonomous scope. No confirmation needed.
```
### G2 — Schema.org fixes (parallel if independent files)
Spawn `hotfixer` per file OR `feater` if cross-file graph restructure.
Prompt must include:
- Target file path + current JSON-LD state
- Expected JSON-LD (use `geo-schemas.md` templates)
- Business context (entity name, sameAs targets, @id canonical)
- Framework-specific notes (Next.js metadata export, Astro component props, etc.)
### G3 — Remove deprecated schemas
Fast `hotfixer` pass. One per file or one consolidated.
### G4 — llms.txt creation
Spawn `feater`:
```
GEO feature: generate llms.txt (and llms-full.txt if documentation site).
Files to create: /llms.txt + endpoint/generator to rebuild on deploy.
Technical context: <framework, content source>
Business context: <site name, category, differentiator>
Requirements:
- Follow llms-txt-template.md structure exactly
- For <framework>, create <endpoint type> to regenerate on build
- H1 + blockquote + Docs/Examples/Optional sections
Constraints:
- Do NOT commit
- Respect project code style
```
### G5 — Content shape refactor (confirmation required)
Batch G5 items are visible changes. Present full list to user:
```
CONTENT SHAPE CHANGES — approval needed:
G5.1 Homepage H1 — change from "<current>" to Definition Lead "<new>"
G5.2 /services page — add TL;DR block
G5.3 Blog template — move summary above fold
...
Approve all / select / skip?
```
For approved: spawn `feater` with detailed spec.
Unapproved → document in §9 (medium term) of SEO.md.
### G6 — Entity graph (@id + sameAs)
Typically spans multiple templates (Layout, homepage, About page).
Single `feater` call with full restructure spec.
### G7 — User actions
Document in SEO.md §11. No execution. Every entry MUST include
"Automatisation possible avec: ..." per `automation-catalog.md`.
### Verification
After all sub-agents complete:
1. **Validate JSON-LD**:
```bash
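# Parse every static JSON-LD block with python3; templated JSON-LD needs a build first
grep -l "application/ld+json" <modified-files> | while read -r f; do
python3 - "$f" <<'PY' || echo "INVALID JSON-LD: $f"
import json, re, sys; src = open(sys.argv[1], encoding="utf-8").read()
for block in re.findall(r'<script[^>]*application/ld\+json[^>]*>(.*?)</script>', src, re.S | re.I):
    json.loads(block)
PY
done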
```
2. **Validate robots.txt**:
```bash
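# Flag rules that appear before any User-agent line, and user-agents declared in more than one place
[ -f robots.txt ] && awk '{l=tolower($0)} l~/^user-agent:/{ua=tolower($2); if(seen[ua]++)print "duplicate User-agent "$2" at line "NR} l~/^(allow|disallow):/{if(ua=="")print "orphan rule at line "NR}' robots.txt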
```
3. **llms.txt shape**:
```bash
[ -f llms.txt ] && head -1 llms.txt | grep -q "^# " && sed -n '2,10p' llms.txt | grep -q "^> " && echo "llms.txt header OK"
```
4. **Build/lint if available**: `npm run build`, `npm run lint`.
Revert any sub-agent change that breaks build.
---
## STEP 14 — OUTPUT `[both]`
**If called via `/seo` dispatcher**: emit a structured result block
the dispatcher can merge into the unified SEO.md. Use this envelope:
```
========================================
GEO AGENT RESULT (depth: <LOCAL|FULL>)
========================================
## SECTION FOR SEO.md §7 — Optimisation GEO / IA
<Markdown content for the consolidated SEO.md §7, covering:
7.1 AI crawlers policy (decision + applied)
7.2 llms.txt / llms-full.txt (status + action)
7.3 Schema.org for AI (inventory + fixes applied)
7.4 Entity SEO (Wikidata, @id, sameAs, KP)
7.5 Content shape (Definition Lead, TL;DR, citations, freshness)
7.6 AI visibility testing (FULL only)
>
## ENTRIES FOR SEO.md §0 (legal/compliance alerts for GEO):
<Any GEO-specific compliance issues, e.g. schemas implying claims
without evidence = DGCCRF risk.>
## ENTRIES FOR SEO.md §8 (quick wins):
<AUTO items already applied + USER items with automation catalog refs>
## ENTRIES FOR SEO.md §9 (medium term):
<Wikidata creation, content shape refactor, AI monitoring setup>
## ENTRIES FOR SEO.md §10 (long term):
<Wikipedia pursuit, Knowledge Panel, sustained AI citation strategy>
## ENTRIES FOR SEO.md §11 (user actions):
<Each entry MUST include "Automatisation possible avec:" per
automation-catalog.md>
## ENTRIES FOR SEO.md §15 (change log):
<Every file modified, what was changed, why, verification status>
## GEO SCORING:
<Axes scoring block from STEP 10>
========================================
```
**If called standalone via `/geo`**: write/update `GEO.md` at project
root (or merge into `SEO.md` if it already exists). Structure:
```markdown
# Audit GEO — <Project Name>
**Date** : <YYYY-MM-DD>
**Version** : v<N>
**Agent** : geo-analyzer
**URL** : <production URL>
**Depth** : LOCAL | FULL
**Score GEO** : XX.X / 20
---
## 0. Alertes
## 1. Notes par axe
## 2. AI crawlers
## 3. llms.txt
## 4. Schema.org pour IA
## 5. Entity SEO
## 6. Content shape pour extraction IA
## 7. Visibilité IA (tests)
## 8. Quick wins (< 7 jours)
## 9. Moyen terme (1-3 mois)
## 10. Long terme (3-6 mois)
## 11. Actions utilisateur (avec automatisation possible)
## 12. Outils recommandés (monitoring IA, entity SEO)
## 13. Annexe (non-audité / FULL requis)
## 14. Log des modifications
## Historique
```
---
## STEP 15 — CONSOLE REPORT `[standalone only]`
```
GEO AUDIT COMPLETE
URL : <url>
DEPTH : LOCAL | FULL
NOTE GEO : XX.X / 20
AI CRAWLERS : <PERMISSIVE | RESTRICTIVE | INCOHERENT>
LLMS.TXT : PRESENT | CREATED | SKIPPED
SCHEMA.ORG POUR IA : <rating>
ENTITY PRESENCE : <summary: Wikidata? KP?>
CHANGEMENTS APPLIQUES (N) : voir §14
ACTIONS UTILISATEUR (N) : voir §11 (toutes avec automatisation possible)
ALERTES MAJEURES : <list or "aucune">
PROCHAINE ETAPE : <highest-priority>
```
---
## RULES
### Orchestration
- **Analyze before fixing.** STEPs 0-12 are pure analysis. No file
modification until STEP 13.
- **Delegate.** Never edit JSON-LD / robots.txt / llms.txt directly
in STEP 13. Use `hotfixer`/`feater` with self-contained prompts.
- **Depth-aware.** LOCAL skips STEPs 3, 9. Same rigor elsewhere.
- **Standalone vs dispatched.** If dispatched via `/seo`, output the
structured envelope in STEP 14. Standalone (`/geo`), write GEO.md
and console report.
### Scope
- **Focus on GEO, not classical SEO.** Overlapping concerns (meta
title, sitemap, Core Web Vitals) belong to `seo-analyzer`. Do not
duplicate. Reference them in §13 as "see SEO section" if needed.
- **Respect PERMISSIVE/RESTRICTIVE choice.** Per user CLAUDE.md,
default is PERMISSIVE. Only switch if client explicitly flags
premium/regulated content.
- **Honest llms.txt framing.** Don't promise ranking wins. Frame as
low-cost hedge with real value for dev-focused content.
### Data integrity
- **No invented entity data.** Never write a fake Wikidata QID, fake
`sameAs` URLs, fake `knowsAbout`, fake press mentions. Unknown →
placeholder `[À COMPLÉTER]` or omit.
- **Remove deprecated schemas rather than keep broken ones.**
- **Cite sources.** When emitting stats in the report, link
`content-shape-for-ai.md` research citations.
### Process
- **Every user action lists automation options.** Mandatory from
`automation-catalog.md`. No exceptions.
- **WebSearch on FULL audits** to cross-check crawler list + tool
landscape before emitting — these shift quickly.
- **Verification after fix.** Build must pass. Invalid JSON-LD is
reverted immediately.
- **Transparency.** Every automated change is logged in the change log (GEO.md §14, SEO.md §15).


@@ -0,0 +1,30 @@
# SEO/GEO shared resources
Knowledge base shared by `seo-analyzer` and `geo-analyzer` agents.
Loaded on demand — keep each file focused and current.
| File | Owner agents | Topic |
|---|---|---|
| `ai-crawlers-2026.md` | seo + geo | User-agent strings, categories (training vs search), robots.txt strategy |
| `llms-txt-template.md` | geo | `/llms.txt` + `/llms-full.txt` structure, generation patterns |
| `geo-schemas.md` | geo | Schema.org types for AI extraction (QAPage, Speakable, Person, Article) + deprecated list |
| `entity-seo.md` | geo | Wikidata QID, sameAs network, Knowledge Graph wiring |
| `content-shape-for-ai.md` | geo | Definition Lead, TL;DR, Q→A, stats, citations — content patterns LLMs cite |
| `ai-visibility-tools.md` | geo | Monitoring tools (OtterlyAI, Peec, Trendos, ZipTie, HubSpot AEO, SE Ranking) |
| `automation-catalog.md` | seo + geo | For every user-action in SEO.md §11 — what tool can automate it |
## Update policy
These files capture state as of 2026-04. Crawler lists, Schema.org
deprecations, and tool landscape shift fast. Agents MUST cross-check
via WebSearch on each run when FULL depth is selected.
## Loading pattern
Agents reference resources like this:
```
Load: ~/.claude/agents/resources/ai-crawlers-2026.md
```
Do not inline these contents into agent prompts — read them at step time.
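Because resources are read at step time, a missing file only surfaces mid-audit. A quick pre-flight check can be sketched as follows; `missing_resources` is a hypothetical helper, and the agent file path in the usage comment is an assumption about where the agents are installed.

```bash
# missing_resources AGENT_FILE
# Lists every "Load: <path>" reference in an agent file whose target does not exist.
missing_resources() {
  agent_file="$1"
  grep -oE 'Load: `?[^` ]+' "$agent_file" | sed -E 's/^Load: `?//' | sort -u |
  while IFS= read -r ref; do
    # Expand a leading ~/ by hand; the shell does not expand it inside variables
    case "$ref" in
      "~/"*) path="$HOME/${ref#"~/"}" ;;
      *)     path="$ref" ;;
    esac
    [ -f "$path" ] || echo "MISSING $ref"
  done
}

# Usage (path is an assumption):
#   missing_resources "$HOME/.claude/agents/geo-analyzer.md"
```

No output means every referenced resource is present.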


@@ -0,0 +1,209 @@
# AI crawlers — 2026 reference
State as of 2026-04. Cross-check via WebSearch on FULL audits — new
bots and renames ship monthly.
## The two categories that matter
The blanket "block AI" strategy of 2024 is obsolete. Bots now split
into two roles, and treating them the same loses traffic.
### Training bots — scrape content to train future models
No direct user traffic. No citation back. Content vanishes into weights.
| User-agent | Company | Notes |
|---|---|---|
| `GPTBot` | OpenAI | Training for GPT models |
| `Google-Extended` | Google | Opt-out for Gemini training |
| `CCBot` | Common Crawl | Feeds many LLMs (open dataset) |
| `anthropic-ai` | Anthropic | Legacy training bot (being phased out) |
| `ClaudeBot` | Anthropic | Current training bot |
| `Bytespider` | ByteDance / TikTok | Aggressive scraper, frequent complaints |
| `Meta-ExternalAgent` | Meta | Training for Llama family |
| `Meta-ExternalFetcher` | Meta | Per-request fetch |
| `Applebot-Extended` | Apple | Opt-out for Apple Intelligence training |
| `Amazonbot` | Amazon | Alexa + internal LLMs |
| `cohere-ai` | Cohere | Training |
| `Diffbot` | Diffbot | Knowledge Graph construction |
| `omgilibot` | Webz.io | Data resale |
| `img2dataset` | Various | Image dataset builders |
| `Timpibot` | Timpi | Search-index + training hybrid |
### Search / retrieval bots — fetch content to cite in live answers
User asked a question → bot fetches → cites your URL → traffic returns.
| User-agent | Company | Notes |
|---|---|---|
| `OAI-SearchBot` | OpenAI | Powers ChatGPT Search |
| `ChatGPT-User` | OpenAI | On-demand fetch when user asks ChatGPT about a URL |
| `Claude-SearchBot` | Anthropic | Powers Claude web search |
| `Claude-User` | Anthropic | On-demand fetch inside Claude |
| `Claude-Web` | Anthropic | Legacy retrieval bot |
| `PerplexityBot` | Perplexity | Index builder |
| `Perplexity-User` | Perplexity | On-demand fetch |
| `GoogleOther` | Google | Various Google retrieval use cases |
| `FacebookBot` | Meta | Meta AI search |
| `DuckAssistBot` | DuckDuckGo | DuckAssist answers |
| `YouBot` | You.com | You.com retrieval |
| `MistralAI-User` | Mistral | On-demand fetch |
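The two tables can be turned into a lookup for scripts that need to treat the categories differently. A minimal sketch: `bot_category` is a hypothetical name, matching is exact and case-sensitive on the token as written in the tables, and `Meta-ExternalFetcher` is classed as training only because that is the table it is listed in.

```bash
# bot_category TOKEN: prints "training", "retrieval", or "unknown".
bot_category() {
  case "$1" in
    GPTBot|Google-Extended|CCBot|anthropic-ai|ClaudeBot|Bytespider|\
    Meta-ExternalAgent|Meta-ExternalFetcher|Applebot-Extended|Amazonbot|\
    cohere-ai|Diffbot|omgilibot|img2dataset|Timpibot)
      echo training ;;
    OAI-SearchBot|ChatGPT-User|Claude-SearchBot|Claude-User|Claude-Web|\
    PerplexityBot|Perplexity-User|GoogleOther|FacebookBot|DuckAssistBot|\
    YouBot|MistralAI-User)
      echo retrieval ;;
    *)
      echo unknown ;;
  esac
}

# Demo: the last token is invented to show the fallback
for ua in GPTBot OAI-SearchBot SomeNewBot; do
  printf '%s\t%s\n' "$ua" "$(bot_category "$ua")"
done
```

An `unknown` result is the cue to cross-check the bot via WebSearch before deciding on a directive.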
## Recommended default strategy — PERMISSIVE
Rationale: the user's stated goal is to maximise AI visibility. The
future-of-search brief favours being cited over being protected.
```
# robots.txt — PERMISSIVE default (allow everything, block problem bots)
# --- Training bots: allow (contributes to brand visibility long-term) ---
User-agent: GPTBot
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: Applebot-Extended
Allow: /
User-agent: Meta-ExternalAgent
Allow: /
User-agent: CCBot
Allow: /
# --- Search / retrieval bots: always allow (direct traffic) ---
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: Claude-SearchBot
Allow: /
User-agent: Claude-User
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Perplexity-User
Allow: /
# --- Block only known-abusive bots (aggressive scraping, no return value) ---
User-agent: Bytespider
Disallow: /
User-agent: omgilibot
Disallow: /
User-agent: img2dataset
Disallow: /
# --- Default: allow the rest ---
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
```
## Alternative — RESTRICTIVE (for premium content, paywalled, regulated)
```
# robots.txt — RESTRICTIVE (block training, allow retrieval)
# Block all training bots
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: Meta-ExternalAgent
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: cohere-ai
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: Timpibot
Disallow: /
# Allow search/retrieval (keeps citations flowing)
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: Claude-SearchBot
Allow: /
User-agent: Claude-User
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Perplexity-User
Allow: /
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
```
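Both templates share the same structure and differ only in how training bots are treated, so they can be generated from categorised bot lists. The following is a minimal sketch under stated assumptions: the three lists are short excerpts of the tables above, `build_robots_txt` is a hypothetical helper name, and blocking the abusive bots under both policies is a choice made for this sketch rather than something the templates require.

```python
# Excerpts of the bot tables; extend them from the full lists as needed.
TRAINING_BOTS = ["GPTBot", "Google-Extended", "ClaudeBot", "CCBot"]
RETRIEVAL_BOTS = ["OAI-SearchBot", "ChatGPT-User", "Claude-SearchBot", "Claude-User"]
ABUSIVE_BOTS = ["Bytespider", "omgilibot", "img2dataset"]


def build_robots_txt(policy, sitemap_url):
    """Return robots.txt text for the 'permissive' or 'restrictive' policy."""
    if policy not in ("permissive", "restrictive"):
        raise ValueError(f"unknown policy: {policy}")
    # Abusive bots are blocked in both policies (assumption of this sketch).
    blocked = set(ABUSIVE_BOTS)
    if policy == "restrictive":
        blocked.update(TRAINING_BOTS)
    lines = []
    for bot in TRAINING_BOTS + RETRIEVAL_BOTS + ABUSIVE_BOTS:
        rule = "Disallow: /" if bot in blocked else "Allow: /"
        lines += [f"User-agent: {bot}", rule, ""]
    # Default group: everything not named above stays allowed.
    lines += ["User-agent: *", "Allow: /", "", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


permissive = build_robots_txt("permissive", "https://example.com/sitemap.xml")
restrictive = build_robots_txt("restrictive", "https://example.com/sitemap.xml")
print(restrictive)
```

Keeping the bot categories in one place means a new user-agent only has to be classified once, and both templates pick it up.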
## Common mistakes
- **Only blocking `ClaudeBot`** — does not block `Claude-SearchBot` or `Claude-User`. Same for other families.
- **Using `GPTBot` to block ChatGPT Search** — wrong. `OAI-SearchBot` and `ChatGPT-User` are the search bots.
- **Blocking `CCBot`** — has knock-on effects across dozens of downstream LLMs that train on Common Crawl.
- **Using wildcards** (e.g. `User-agent: *AI*`) — robots.txt wildcards are not universally supported.
- **Relying on meta robots**: `<meta name="robots">` is less respected than robots.txt by AI crawlers. Use both.
## Verification
Simulated requests should return 200 for allowed bots and, where blocking is enforced at the origin or CDN/WAF, 403 for blocked ones:
```bash
DOMAIN="example.com"
for UA in "GPTBot" "ClaudeBot" "PerplexityBot" "OAI-SearchBot" "ChatGPT-User" "Google-Extended"; do
  # -I sends a HEAD request; -w prints only the HTTP status code
  CODE=$(curl -sI -A "$UA" -o /dev/null -w "%{http_code}" "https://$DOMAIN/")
  echo "$UA: $CODE"
done
```
This hits the page, not robots.txt directly — but if the origin respects
robots.txt via CDN/WAF rules, you'll see the difference.
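Since the curl loop only shows what the origin or CDN returns, it can help to also check what the robots.txt rules themselves say. A minimal sketch using Python's standard `urllib.robotparser`: the inline `ROBOTS_TXT` excerpt and the `robots_verdicts` helper are illustrative assumptions, and this parser's matching may differ from how a given crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Illustrative excerpt of a restrictive policy, not a full template.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: Claude-SearchBot
Allow: /
User-agent: *
Allow: /
"""


def robots_verdicts(robots_txt, user_agents, url="https://example.com/"):
    """Return {user_agent: bool} telling whether the rules allow fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {ua: parser.can_fetch(ua, url) for ua in user_agents}


verdicts = robots_verdicts(
    ROBOTS_TXT, ["GPTBot", "ClaudeBot", "OAI-SearchBot", "Claude-SearchBot"]
)
for ua, allowed in verdicts.items():
    print(f"{ua}: {'allowed' if allowed else 'blocked'}")
```

To audit a live site, the same function can be fed the text fetched from `https://<domain>/robots.txt` instead of the inline sample.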
## Sources to refresh this doc
- https://platform.openai.com/docs/bots
- https://darkvisitors.com/agents (community-maintained)
- https://github.com/ai-robots-txt/ai.robots.txt
- Anthropic docs: https://docs.anthropic.com/
- Cloudflare AI crawlers dashboard (if account available)

@ -0,0 +1,99 @@
# AI visibility monitoring tools — 2026
Tools that track whether your brand appears in AI-generated answers
across ChatGPT, Perplexity, Gemini, Copilot, Claude, and Google AI
Overviews.
Context: Google AI Overviews trigger on ~48% of searches; ChatGPT
processes 2.5B queries/day; Gartner projects commercial organic
search traffic will drop 25% by 2026. Monitoring is no longer optional.
## Commercial tools
| Tool | Platforms covered | Strong points | Weak points |
|---|---|---|---|
| **OtterlyAI** (otterly.ai) | ChatGPT, Perplexity, Gemini, AI Overviews, Copilot | Mature, 20k+ users, Gartner-recognised | Pricing mid-to-high |
| **Peec AI** (peec.ai) | ChatGPT, Perplexity, Gemini, AI Overviews | Good SaaS-brand focus, sentiment analysis | Narrower platform scope |
| **Profound** (tryprofound.com) | ChatGPT, Perplexity, Gemini, Copilot | Enterprise-grade, full-response capture | Enterprise pricing |
| **ZipTie** (ziptie.dev) | ChatGPT, Perplexity, AI Overviews | Competitive benchmarking, source attribution | Smaller team, newer |
| **HubSpot AEO** (hubspot.com/products/aeo) | ChatGPT, Gemini, Perplexity | Integrates with HubSpot ecosystem | Best if already HubSpot user |
| **Trendos** (trendos by Tesonet) | ChatGPT, Gemini, AI Search, Perplexity, DeepSeek | Added DeepSeek coverage, 2026 launch | Unproven longevity |
| **SE Ranking AI Tracker** (seranking.com) | ChatGPT, Perplexity, Gemini, AI Mode, AI Overviews | Bundled with classical SEO suite | Less specialised |
| **LLMrefs** (llmrefs.com) | ChatGPT, Perplexity, Gemini, Claude | GEO focus, research-backed | Newer, less tested |
## Free / manual methods (zero budget)
For clients/projects with no monitoring budget, a manual process works
at lower frequency. Recommended cadence: monthly for established
brands, weekly during optimization sprints.
### Query list construction
Build a list of 20-40 queries covering:
1. **Branded queries** — "what is [brand]", "is [brand] good", "[brand] reviews"
2. **Generic category queries** — "best [category] in [location]", "how to [problem]"
3. **Comparison queries** — "[brand] vs [competitor]", "alternatives to [brand]"
4. **Problem queries** — the actual questions the target persona asks
### Manual check workflow
For each query, run across:
- **ChatGPT** (web version with search enabled, chatgpt.com)
- **Perplexity** (perplexity.ai)
- **Google AI Overviews** (google.com — appears for ~48% of searches)
- **Claude** (claude.ai with web search)
- **Gemini** (gemini.google.com)
- **Copilot** (copilot.microsoft.com)
- **Brave Search AI** (search.brave.com)
- **DuckAssist** (duckduckgo.com)
Record for each:
- Mentioned? (yes/no)
- Cited with link? (yes/no + which page)
- Position in answer? (1st mention / buried / listed)
- Sentiment? (positive / neutral / negative / misleading)
### Spreadsheet template
| Date | Query | ChatGPT | Perplexity | Google AIO | Claude | Gemini | Copilot |
|---|---|---|---|---|---|---|---|
| 2026-04-21 | best plombier Évry | Mentioned, ranked 3, cited | Not mentioned | Top 3, no cite | — | — | — |
## KPIs to track
From GEO research and industry consensus (GenOptima, HubSpot 2026):
| Metric | Definition | Benchmark |
|---|---|---|
| **Mention Rate** | % of AI answers that mention brand name | Varies; track trend, not absolute |
| **Citation Rate** | % of AI answers with a clickable link to domain | Target 20%+ for established brands |
| **Position** | When cited, is brand 1st mention vs buried? | First mention = best |
| **Sentiment** | Tone of brand mention (positive/neutral/negative) | Track for negative drift |
| **Source Diversity** | Which of your pages get cited? | Aim for 5+ distinct pages/domain |
| **Competitor Share** | % of category queries where competitor cited vs brand | Track gap |
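The first two KPIs are plain ratios over the recorded answers. A small sketch of how they could be computed from the manual spreadsheet: the `Observation` record, the `kpi_rates` helper and the sample rows are made up for illustration and are not real measurements.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    query: str
    engine: str
    mentioned: bool  # brand name appears in the answer
    cited: bool      # answer contains a clickable link to the domain


def kpi_rates(observations):
    """Return (mention_rate, citation_rate) as percentages of all answers."""
    total = len(observations)
    if total == 0:
        return 0.0, 0.0
    mentions = sum(o.mentioned for o in observations)
    citations = sum(o.cited for o in observations)
    return 100 * mentions / total, 100 * citations / total


# Made-up sample rows, one per (query, engine) check.
sample = [
    Observation("best plombier Évry", "ChatGPT", True, True),
    Observation("best plombier Évry", "Perplexity", False, False),
    Observation("best plombier Évry", "Google AIO", True, False),
    Observation("plombier urgence Évry", "ChatGPT", True, False),
]
mention_rate, citation_rate = kpi_rates(sample)
print(f"Mention rate: {mention_rate:.0f}%  Citation rate: {citation_rate:.0f}%")
```

Computing the rates per engine or per query category is a matter of filtering the list before calling `kpi_rates`.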
## Integration into SEO.md
In `SEO.md §11 — Actions utilisateur requises`:
> ### Monitor AI visibility monthly
>
> **Automatisation possible avec:** OtterlyAI, Peec AI, ZipTie, HubSpot
> AEO, SE Ranking AI Tracker. Budget: 50-500 EUR/mois selon le tool.
>
> **Alternative manuelle gratuite:** template spreadsheet + 20 queries
> testées mensuellement sur ChatGPT, Perplexity, Google AI Overviews.
> Temps: ~1h/mois.
## Methodology caveats
- AI engines are **non-deterministic**. The same query can return
different answers on consecutive runs. Always take 3 samples and track the median.
- **Personalisation** affects results. Test in logged-out / private
mode for reproducibility.
- **Geographic bias** — ChatGPT's answers about local businesses vary
by IP. Test from the target market's geography.
- **Freshness lag** — content updates take days to weeks to propagate
into AI answers. Don't expect instant reflection of changes.
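For yes/no fields such as "Mentioned?", the median of three samples is simply the majority outcome. A minimal sketch of that aggregation, assuming each run is recorded as a short label; the `majority` helper name is hypothetical.

```python
from collections import Counter


def majority(samples):
    """Return the most frequent value among repeated runs of one query."""
    if not samples:
        raise ValueError("need at least one sample")
    return Counter(samples).most_common(1)[0][0]


# Three runs of the same query on the same engine.
runs = ["mentioned", "not mentioned", "mentioned"]
result = majority(runs)
print(result)
```

With an odd number of samples and two possible outcomes there is never a tie, which is one reason to sample three times rather than two.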

@ -0,0 +1,215 @@
# Automation catalog — for SEO.md §11 user actions
For every action that requires the human, this catalog lists tools
that can partially or fully automate it. Both agents cite this file
when emitting user actions into `SEO.md §11`.
**Format rule in SEO.md §11**: every entry MUST include:
```
- **<Action>** — <what to do>
**Automatisation possible avec:** <tool 1>, <tool 2>, <tool 3>
**Budget:** <free / XX EUR/mois / one-time XX EUR>
**Effort manuel:** <time estimate>
```
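A sketch of how an agent could render entries so the mandatory fields are never omitted. The `render_action` helper and the two-space indentation of the continuation lines are assumptions; the field labels are copied from the format rule.

```python
DASH = "\u2014"  # separator character used by the format rule


def render_action(action, what, tools, budget, effort):
    """Render one SEO.md section 11 entry with all mandatory fields."""
    if not tools:
        raise ValueError("every entry must list at least one automation option")
    return "\n".join([
        f"- **{action}** {DASH} {what}",
        f"  **Automatisation possible avec:** {', '.join(tools)}",
        f"  **Budget:** {budget}",
        f"  **Effort manuel:** {effort}",
    ])


entry = render_action(
    "Monitor AI visibility",
    "track brand mentions across AI engines monthly",
    ["OtterlyAI", "Peec AI", "ZipTie"],
    "50-500 EUR/mois",
    "~1h/mois",
)
print(entry)
```

Raising on an empty tool list makes the "MUST include" rule fail loudly instead of producing an incomplete entry.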
## Local SEO actions
### Google Business Profile — claim / create / optimize
- **Google Business Profile API** (free, requires Google Cloud project)
→ post updates, reply to reviews, sync hours automatically
- **Yext** (enterprise, 500-5000 EUR/mois) → syncs GMB across
200+ directories
- **BrightLocal** (30-80 USD/mois) → GMB management + rank tracking
- **Moz Local** (14-33 USD/mois/location) → listing management
- **Uberall** (enterprise) → multi-location listing sync
- **LocalFalcon** (30-60 USD/mois) → GMB rank visualisation
- **PlePer** (~25 EUR/mois) → GMB post scheduling
- Manual workflow: 30 min/week via https://business.google.com
### Review management — collect, reply, aggregate
- **Trustpilot / Google Reviews API** (via GBP API) → read/reply programmatically
- **Birdeye** (290+ USD/mois) → review aggregation + auto-reply
- **Podium** (enterprise) → SMS-based review requests
- **NiceJob** (90 USD/mois) → review request automation
- **Grade.us** (110 USD/mois) → multi-platform aggregation
- Manual: monitor GMB + reply within 48h (legal: L121-1 Code conso FR)
### Directory citations — PagesJaunes, Yelp, Mappy, Bing Places, Apple
- **Yext** → 200+ directories incl. French
- **BrightLocal / Moz Local** → coverage varies, check French support
- **Uberall** → strong in European markets
- **Rio SEO** (enterprise) → big brands
- Manual: one-time 4-8h to register on top 10 directories
## AI visibility actions
### Monitor brand in AI engines
See `ai-visibility-tools.md`. Summary:
- **OtterlyAI, Peec AI, Trendos, ZipTie, HubSpot AEO, SE Ranking** — commercial
- **Manual spreadsheet + 20 queries/mois** — free, ~1h/mois
### Submit to AI indexes directly
- **Bing Webmaster Tools** → submits to Bing + Copilot + ChatGPT Search (which uses Bing index)
- **IndexNow protocol** (indexnow.org) → proactive ping to Bing/Yandex
- **Google Search Console + URL Inspection** → request indexing (no ChatGPT index direct submit exists in 2026)
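An IndexNow ping is a single JSON POST. The sketch below builds the request with the standard library and leaves sending it as a separate step. The host, the key value and the assumption that the key file sits at the site root are placeholders; check the IndexNow documentation for the current endpoint and key-hosting rules before relying on it.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Build (but do not send) an IndexNow submission for a batch of URLs."""
    payload = {
        "host": host,
        "key": key,
        # Assumes the key file is served from the site root.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def submit(request):
    """Send the request and return the HTTP status (needs network access)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Hypothetical key and URL, for illustration only.
req = build_indexnow_request(
    "example.com", "hypothetical-key-123", ["https://example.com/new-page"]
)
print(req.get_method(), req.full_url)
```

Calling `submit(req)` from a deploy hook would ping the index whenever new pages ship.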
### Maintain llms.txt / llms-full.txt
- **llms-txt-action** (GitHub Action) → rebuild on deploy
- **Mintlify / Fern / ReadMe** → auto-generated for supported docs hosts
- **Custom cron + script** → pull from CMS, regenerate weekly
- Manual: monthly review if content changes rarely
## Entity / Knowledge Graph actions
### Create or optimize Wikidata entry
- **Kalicube** (commercial, custom pricing) → specialised Knowledge Panel + Wikidata
- **InLinks** (40-350 USD/mois) → entity optimization + Schema.org graph
- **WordLift** (30-300 USD/mois) → WordPress plugin with Wikidata linking
- **Entity.ai** → entity signal auditing
- Manual: https://www.wikidata.org, 2-4h initial + sources required
### Claim / optimize Google Knowledge Panel
- **Kalicube** — best specialisation
- **Manual via Google Search** — click "Claim this Knowledge Panel" (requires verification)
- Cannot be forced; appears when entity signals are strong enough
### LinkedIn / Crunchbase / industry directory entities
- **Yext** → includes Crunchbase sync in enterprise tiers
- **Manual** → LinkedIn Company page free, Crunchbase free profile claim
- **Brandify** (enterprise) → multi-directory entity management
## Content production actions
### Create city/service landing pages (30/70 rule)
- **Surfer SEO** (89-219 USD/mois) → content optimization with AI
- **Frase.io** (45-115 USD/mois) → SERP-driven briefs
- **Clearscope** (170+ USD/mois) → keyword + semantic briefs
- **Manual + AI writer** → use Claude/ChatGPT with explicit 30/70 instruction
Agent note: Batch D in `seo-analyzer` triage can handle the
CREATION of these pages if confirmation granted — city pages are
typically batch D (structural change, user approval needed).
### Produce blog content on schedule
- **Frase / Surfer / Clearscope** (see above)
- **MarketMuse** (enterprise) → content planning
- **Jasper AI / Copy.ai** → AI drafting (quality review mandatory)
- Human editor remains the bottleneck — AI drafts need domain expert review
### Refresh existing content quarterly
- **ContentKing** (now part of Conductor) → change detection
- **SEOClarity** (enterprise) → content decay tracking
- **Manual** — spreadsheet of top 50 pages + quarterly review cycle
## Technical SEO actions
### Generate sitemaps
- **Framework plugin** → `@astrojs/sitemap`, `next-sitemap`, `@nuxtjs/sitemap`, `rails-sitemap-generator`, etc.
- **Yoast / RankMath** (WordPress) → auto-generate
- **Screaming Frog** (200 GBP/an) → crawler-based generation
- Manual: only as last resort, hand-maintained sitemaps go stale fast
### Implement Schema.org at scale
- **Yoast / RankMath / SEOPress** (WordPress) → Article/Organization/LocalBusiness auto-graph
- **Schema App** (enterprise) → multi-CMS
- **Merkle Schema Markup Generator** (free) → one-off generation
- **Manual + `geo-schemas.md` templates** — for frameworks without plugins
### Optimize Core Web Vitals
- **PageSpeed Insights API** (free) → measure + monitor
- **WebPageTest** (free tier + paid) → detailed waterfalls
- **Cloudflare Speed** (free tier with Cloudflare) → CDN-level optimizations
- **Nitropack** (35-175 USD/mois) → WordPress speed automation
- **Vercel Speed Insights** (free for Vercel projects)
- Manual: Lighthouse + manual fixes guided by its recommendations
### Security headers (CSP, HSTS, X-Frame-Options, Referrer-Policy)
- **securityheaders.com** (free audit)
- **Cloudflare Page Rules** → header injection
- **Vercel `next.config.js` headers** → declarative
- **`.htaccess`** → Apache hosts
- Manual: one-time config, ~1-2h setup
## Social presence actions
### Create / maintain social profiles (Facebook, Instagram, LinkedIn, TikTok, YouTube)
- **Buffer** (6-120 USD/mois) → multi-platform scheduling
- **Hootsuite** (99-249 USD/mois) → full social suite
- **Later** (16-80 USD/mois) → visual content scheduling
- **Metricool** (18-50 USD/mois) → analytics + scheduling
- Manual: 30-60 min/semaine for basic maintenance
### Monitor brand mentions on social / forums / Reddit
- **Brand24** (99-299 USD/mois)
- **Mention** (41-149 USD/mois)
- **Google Alerts** (free, basic)
- **Reddit search + saved queries** — free, manual
- **BuzzSumo** (199+ USD/mois) → trend + mention discovery
## Legal compliance actions (FR)
### Install cookie consent management (CMP)
- **Axeptio** (free to 100 EUR/mois) → French-focused CMP
- **Cookiebot** (11-96 USD/mois) → international CMP, CNIL-compliant
- **OneTrust** (enterprise) → enterprise compliance
- **tarteaucitron.js** (free, open source) → CNIL-compliant, self-hosted
- **Didomi** (enterprise) → strong French legal context
### Generate legal pages (mentions légales, politique de confidentialité, CGV)
- **Legalstart / Captain Contrat** (one-time 50-200 EUR) → FR templates
- **Genius Legal** → template generators
- **Legalbuddy** → questionnaire-driven legal pages
- Agent fallback: Batch B in `seo-analyzer` creates templates with
`[À COMPLÉTER]` placeholders for SIREN, capital, etc.
## Reporting format in SEO.md §11
Example entry generated by the agents:
```markdown
### Créer / réclamer la fiche Google Business Profile
**Action:** Vérifier que la fiche GMB existe, est réclamée, et les
informations sont cohérentes avec le site (NAP).
**Lien direct:** https://business.google.com
**Automatisation possible avec:**
- Google Business Profile API (gratuit, technique)
- BrightLocal (30-80 USD/mois, gestion + rank tracking)
- Yext (500+ EUR/mois, multi-directories)
- LocalFalcon (30-60 USD/mois, rank visualisation)
**Effort manuel:** 30 min initial + 30 min/semaine maintenance
**Impact SEO local:** critique (base du SEO local)
```
## Maintenance of this catalog
Tool landscape shifts fast. Cross-check quarterly:
- Have tool URLs changed?
- Has pricing moved tier?
- Have new tools emerged (especially in AI visibility monitoring)?
- Are deprecated tools still listed?
Use WebSearch on FULL audits to validate before emitting in SEO.md.

@ -0,0 +1,250 @@
# Content shape for LLM extraction
How to write pages so AI engines quote, cite, and recommend them.
Based on peer-reviewed GEO research (CMU KDD 2024, Aggarwal et al.)
and tracked citation patterns across ChatGPT, Perplexity, Claude,
Gemini, Google AI Overviews (2025-2026).
## The six patterns that measurably increase AI citations
### 1. Definition Lead Architecture
Open the page (or first paragraph after each major heading) with:
> **[Entity] is a [category] that [differentiator].**
Research backing: CMU GEO framework (KDD 2024) — pages with explicit
definitional openings score significantly higher in LLM retrieval
impression scores.
**Good**: "Astro is a static site generator that ships zero JavaScript by default, producing HTML at build time that search engines and AI crawlers can index without running a browser."
**Bad**: "In today's fast-paced digital landscape, choosing the right framework can feel overwhelming. At Acme, we know how important it is to..."
### 2. TL;DR / Answer Box above the fold
Insert an explicit summary block at the top of long content. AI engines
preferentially quote from these blocks because the content is
pre-summarised.
```html
<aside class="tldr">
<strong>TL;DR</strong>
Migrating a Next.js project from the Pages Router to the App Router
requires rewriting route handlers, layouts, and data fetching.
Estimated effort: 2-5 days for a medium project.
</aside>
```
No specific CSS class is required, but mark the block semantically (e.g. `aria-label="summary"`,
or a Speakable schema whose `cssSelector` targets it).
### 3. Question-then-direct-answer structure
Each H2/H3 heading phrased as a likely user query. First sentence
after the heading: a single-sentence direct answer. Supporting detail
follows.
**Pattern**:
```
## How much does a Qualibat RGE certification cost in France?
A Qualibat RGE certification costs between 500 and 1500 EUR for the
initial audit, plus an annual fee of 200-400 EUR. The cost varies by
trade category and company size.
[Detailed breakdown follows...]
```
Why it works: LLMs grade passages by answer-density relative to the
query. A one-sentence self-contained answer has the highest density.
### 4. Citations and statistics (strongest measured lever)
Adding peer-cited statistics with clear sources increases AI visibility
**by up to 40%** (Aggarwal et al., 2024 "GEO: Generative Engine
Optimization").
Pattern: embed specific numbers with attribution.
**Good**: "According to the ADEME 2024 energy report, French households spent an average of 2,137 EUR on heating in 2023 — a 12% increase from 2021."
**Bad**: "Heating costs have increased a lot recently."
Source attribution matters: link the citation to the original source
(`<a href>`), ideally with `rel="cite"`. AI engines use link graphs
to validate factual claims.
### 5. Structured lists and comparison tables
LLMs quote list items and table rows more readily than prose of the
same content. Convert what you can:
**Before** (prose):
"The best frameworks for public sites are Astro for static content,
Next.js for dynamic server-rendered apps, and Nuxt for Vue-based
projects."
**After** (list):
"Best frameworks for public sites by use case:
- **Astro** — static content (blog, docs, portfolio)
- **Next.js** — dynamic SSR with React
- **Nuxt** — dynamic SSR with Vue"
Comparison tables are even stronger. Structure:
| Framework | Rendering | Best for | JS by default |
|---|---|---|---|
| Astro | SSG + islands | Public content | 0 KB |
| Next.js | SSG + SSR | Hybrid apps | Large |
### 6. Freshness signals
Pages not updated at least quarterly are **3x more likely to lose AI
citations** (LLMRefs 2026 study).
What to maintain:
- Visible "Last updated: YYYY-MM-DD" at the top of content pages
- `dateModified` in Article/BlogPosting JSON-LD (ISO 8601)
- HTTP header `Last-Modified` in sync with content change
- Changelog on evergreen reference pages
Do NOT fake dates — AI engines and Google increasingly validate
freshness against actual content diffs.
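The visible date and the JSON-LD `dateModified` can be checked against each other automatically. A minimal sketch, assuming the JSON-LD block and the visible date have already been extracted from the page. The `freshness_report` helper is hypothetical, and the 90-day threshold is an approximation of "quarterly".

```python
import json
from datetime import date, datetime


def freshness_report(json_ld, visible_date, today, max_age_days=90):
    """Compare the JSON-LD dateModified with the visible date and today's date."""
    data = json.loads(json_ld)
    modified = datetime.fromisoformat(data["dateModified"]).date()
    shown = date.fromisoformat(visible_date)
    age_days = (today - modified).days
    return {
        "dates_match": modified == shown,
        "age_days": age_days,
        "stale": age_days > max_age_days,
    }


article = '{"@type": "Article", "dateModified": "2026-04-21T14:30:00+02:00"}'
report = freshness_report(article, "2026-04-21", today=date(2026, 6, 1))
print(report)
```

A mismatch between the two dates is worth flagging even when the page is recent, since it suggests one of them is updated by hand.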
## Anti-patterns — what to avoid
### Pronoun-heavy writing
LLMs resolve pronouns by context window, which costs them confidence.
Prefer explicit entity names.
**Bad**: "It was founded in 2015. Its founders wanted to solve a problem. They saw that..."
**Good**: "Acme Corp was founded in 2015. Acme's founders, Jane Doe and John Smith, wanted to solve..."
### Marketing fluff before facts
AI engines typically truncate retrieval windows. Fluff at the top
wastes the budget. Put factual claims FIRST.
**Bad** (first 200 chars wasted): "In today's fast-moving digital landscape, businesses are constantly looking for ways to stay competitive..."
**Good** (first 200 chars dense): "Our API processes 50M requests/day at p99 latency of 47ms across 8 regions, with a 99.99% SLA. Pricing starts at 99 EUR/month for the 10K requests tier."
### Claims without sources
Any numerical or comparative claim without a linked source degrades
trust. AI engines can detect the pattern "number without citation" and
weight those passages lower.
### Cookie-cutter content across pages (especially city pages)
The 30/70 rule: when creating per-city or per-service variants,
at most 30% of the content should be templated. 70% must be
unique per page (local landmarks, specific testimonials, unique
stats, real photos).
Generic city pages get filtered out as "doorway pages" by both
classical search and AI engines.
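The templated share of a city page can be estimated by comparing it with a sibling page. The sketch below uses overlapping three-word shingles as a rough proxy. It is not how search or AI engines measure duplication, and the two sample sentences are invented.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams in text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def shared_ratio(page, other, size=3):
    """Share of page's shingles that also appear in other (0.0 to 1.0)."""
    own = shingles(page, size)
    if not own:
        return 0.0
    return len(own & shingles(other, size)) / len(own)


# Invented sample copy for two city variants of the same service page.
page_a = "Emergency plumber in Évry available 24/7 near the cathedral"
page_b = "Emergency plumber in Corbeil available 24/7 near the station"
ratio = shared_ratio(page_a, page_b)
print(f"Shared with sibling page: {ratio:.0%}")
if ratio > 0.30:
    print("Above the 30% templated budget: add more unique local content.")
```

Running this across every pair of city pages surfaces the variants that are closest to being doorway pages.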
## Page templates by type
### Service page (local business)
```
<h1>[Service] in [City] — [Business Name]</h1>
<div class="tldr">
<strong>En résumé :</strong> [Business] offers [service] in [city + surrounding].
[Key differentiator — price, response time, certifications]. Open [hours].
Call [phone] or request a quote online.
</div>
<h2>What is [service]?</h2>
<p>[Service] is a [category] that [differentiator]. In [city], demand
is driven by [local factor — housing stock, climate, regulations].</p>
<h2>How much does [service] cost in [city]?</h2>
<p>[Specific price range] for a typical [job type], based on [n]
projects completed in [year]. Factors affecting cost: [list].</p>
<h2>Why choose [Business] for [service]?</h2>
<ul>
<li>[Certification 1] — [what it means]</li>
<li>[Certification 2]</li>
<li>[N+ years] experience on [specific housing stock]</li>
</ul>
<h2>FAQ</h2>
[QAPage or FAQPage schema + visible Q&A]
```
### Blog post / guide
```
<h1>[Clear, question-style or noun-phrase headline]</h1>
<p class="byline">By [Author Name] — Updated [Date]</p>
<div class="tldr">
[3-5 sentence summary. Include the key number, the key conclusion,
and any nuance.]
</div>
<h2>[Question 1]</h2>
<p>[One-sentence answer.] [Supporting detail with cited statistics.]</p>
<h2>[Question 2]</h2>
...
<h2>Sources</h2>
<ul>
<li><a href="...">Source 1 — author, year</a></li>
<li><a href="...">Source 2 — author, year</a></li>
</ul>
```
### Homepage / landing
```
<h1>[Entity] is a [category] that [differentiator].</h1>
<!-- The H1 IS the Definition Lead. Yes, really. -->
<p class="hero-subtitle">
[Elaboration on the H1. Include one concrete stat or proof point.]
</p>
[Primary CTA]
<section>
<h2>What [Entity] does</h2>
<p>[Functional description, one paragraph.]</p>
</section>
<section>
<h2>Who uses [Entity]</h2>
<ul><li>[Use case 1]</li><li>[Use case 2]</li>...</ul>
</section>
<section>
<h2>How it works</h2>
<!-- HowTo schema + visible steps -->
</section>
<section>
<h2>Frequently asked</h2>
<!-- FAQPage schema + visible Q&A -->
</section>
```
## Self-audit — is this page AI-friendly?
- [ ] First sentence: `[Entity] is a [category] that [differentiator]` ?
- [ ] TL;DR or summary block above the fold ?
- [ ] Every H2/H3 phrased as a likely user question ?
- [ ] First sentence under each heading: direct answer ?
- [ ] At least 2-3 specific numerical claims with linked sources ?
- [ ] Visible "Last updated" date + matching `dateModified` in JSON-LD ?
- [ ] Lists or tables instead of dense prose where possible ?
- [ ] Entity names used explicitly, not pronouns ?
- [ ] If it's a city/service variant: ≥70% unique content ?

@ -0,0 +1,163 @@
# Entity SEO — Wikidata, Knowledge Graph, sameAs
Why this matters: every major AI engine (ChatGPT, Claude, Gemini,
Perplexity, Apple Intelligence) grounds factual claims against
Wikidata. A business without a clean entity footprint is effectively
invisible to AI grounding pipelines, regardless of on-site SEO.
## The entity identity stack
Think of your entity as having five layers, from strongest to weakest
identity signal:
1. **Wikidata QID** — globally unique, machine-readable identifier.
2. **Wikipedia article** — human-readable notability signal.
3. **Google Knowledge Panel** — surfaced directly in Google results.
4. **Authoritative third-party IDs** — Crunchbase, Bloomberg, SIRENE (FR), Companies House (UK), OpenCorporates.
5. **Social + directory profiles** — LinkedIn, Facebook, PagesJaunes, industry directories.
Each layer reinforces the ones below. Wikidata is the most leveraged
because it's structured, open, and explicitly consumed by LLMs.
## Audit checklist
### Does the entity have a Wikidata QID?
Search: https://www.wikidata.org/wiki/Special:Search — by name + city.
If found:
- Record QID (format `Q` + number, e.g. `Q12345678`)
- Verify: official website property (P856) points to the current domain
- Verify: VAT (P3608), SIRET (P3893), category (P31) are correct
If NOT found:
- For businesses meeting Wikidata notability: creation is possible
(requires verifiable third-party sources)
- For non-notable businesses: skip Wikidata, focus on other identity layers
- Flag in SEO.md §11 as user action (Wikidata requires human judgement
+ source citations)
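The QID lookup can be scripted against the Wikidata search API (`wbsearchentities`). A minimal sketch under stated assumptions: the helper names and the `User-Agent` string are hypothetical, and the candidates still need human review because similarly named entities are common.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def search_url(name, language="fr", limit=5):
    """Build a wbsearchentities URL for an entity name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urllib.parse.urlencode(params)}"


def find_candidates(name, language="fr"):
    """Return (QID, label, description) tuples; needs network access."""
    request = urllib.request.Request(
        search_url(name, language),
        headers={"User-Agent": "entity-audit-sketch/0.1"},  # hypothetical UA
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        results = json.load(response).get("search", [])
    return [(r["id"], r.get("label", ""), r.get("description", "")) for r in results]


url = search_url("Plomberie Dupont")
print(url)
```

`find_candidates("Plomberie Dupont")` would then return the candidate items to compare against the business's city and website.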
### Does the entity have a Wikipedia article?
- Search by exact business name. If found and matches: record URL.
- If not found: flag as a long-term goal (the notability bar is high).
### Is there a Google Knowledge Panel?
Search Google: exact business name. Look for the right-side panel.
- Present + claimed → verify info is correct
- Present + unclaimed → user action: claim via https://www.google.com/business/
- Absent → Knowledge Panels are generated automatically when entity
signals are strong enough (GMB + Wikidata + consistent citations)
### Is `sameAs` complete in on-site JSON-LD?
The `sameAs` property is how you declare "these external URLs represent
the same entity as this page". It's the single most impactful entity
signal after Wikidata.
Minimum recommended `sameAs` for a local business:
```json
"sameAs": [
"https://www.wikidata.org/wiki/Q123456789", // if exists
"https://www.linkedin.com/company/name",
"https://www.facebook.com/businessname",
"https://www.instagram.com/businessname",
"https://www.pagesjaunes.fr/pros/12345", // FR
"https://fr.wikipedia.org/wiki/Nom_Entreprise" // if exists
]
```
For a SaaS / international brand, add:
```json
"https://www.crunchbase.com/organization/name",
"https://github.com/organization",
"https://www.g2.com/products/name",
"https://www.producthunt.com/products/name"
```
For a Person (author, founder):
```json
"sameAs": [
"https://www.wikidata.org/wiki/Q987654321",
"https://www.linkedin.com/in/name",
"https://twitter.com/name",
"https://github.com/name",
"https://scholar.google.com/citations?user=XYZ", // academics
"https://orcid.org/0000-0000-0000-0000" // academics
]
```
### Is `@id` used consistently?
Across all JSON-LD blocks on the site, the same entity MUST use the
same `@id`. Pattern: `https://example.com/#org` for the organization,
`https://example.com/about#author-{slug}` for people.
Split across multiple pages? Use `@id` with fragment identifiers to
tie them back to one canonical entity node.
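One way to audit `@id` consistency is to collect every node from the site's JSON-LD blocks and flag any `@id` declared with more than one name. A sketch, assuming the blocks have already been extracted as strings. The helper names and sample data are illustrative.

```python
import json
from collections import defaultdict


def walk(node):
    """Yield every dict nested anywhere inside a parsed JSON-LD value."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def conflicting_ids(json_ld_blocks):
    """Map each @id to its declared names when more than one name is found."""
    names = defaultdict(set)
    for block in json_ld_blocks:
        for node in walk(json.loads(block)):
            if "@id" in node and "name" in node:
                names[node["@id"]].add(node["name"])
    return {node_id: found for node_id, found in names.items() if len(found) > 1}


# Two illustrative blocks, as if taken from two pages of the same site.
blocks = [
    '{"@type": "Organization", "@id": "https://example.com/#org", "name": "Dupont Plomberie"}',
    '{"@type": "Article", "publisher": {"@type": "Organization", '
    '"@id": "https://example.com/#org", "name": "Plomberie Dupont"}}',
]
conflicts = conflicting_ids(blocks)
print(conflicts)
```

The same walk can be extended to compare `@type` or `url` per `@id` if names alone are not enough.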
## The Wikidata playbook for businesses
Not every business qualifies for Wikidata. Criteria (simplified):
- Multiple independent third-party sources (press articles, books,
academic papers) covering the entity.
- Some form of public notability (not just "we exist").
If qualified, the creation workflow:
1. Create Wikidata account.
2. Use "Create a new item" → name, label, description.
3. Add statements with sources:
- `instance of (P31)` → `enterprise (Q6881511)` or more specific
- `country (P17)` → `France (Q142)`
- `headquarters location (P159)` → city QID
- `official website (P856)` → domain URL
- `inception (P571)` → founding date
- `industry (P452)` → industry QID
- `SIRET (P3893)` → SIRET number (FR)
- `VAT number (P3608)` → VAT ID
4. Each statement must cite a reference (URL of press article,
official registry, etc.).
5. Wait for community review. Items without sources get merged or deleted.
This is labor-intensive and failure-prone for non-notable entities.
Do NOT invent sources. Better to skip Wikidata than create a deletable item.
## Automation options (for SEO.md §11)
- **Kalicube** — paid service specialised in Knowledge Panel + Wikidata
optimization for businesses and executives.
- **Entity.ai** / **InLinks** — tools that help structure entity
signals on-site + track Knowledge Panel status.
- **WordLift** — WordPress plugin with Wikidata linking + Schema.org
graph generation.
- **Yext Knowledge Graph** — enterprise platform syncing entity data
across 200+ directories.
- **BrightLocal / Moz Local / Uberall** — focus on local citations
+ directory sync (not Wikidata-specific).
For Wikidata specifically: no full-automation tool is reliable because
it requires sourced statements. Human curation is the bottleneck.
## Common mistakes
- **Fake Wikidata entries** — flagged and deleted by community, damages
reputation.
- **`sameAs` pointing to dead profiles** — validate each URL resolves.
- **Inconsistent entity names across platforms** ("Dupont Plomberie"
vs "Plomberie Dupont" vs "DUPONT PLOMBERIE SAS") — pick one, apply
everywhere.
- **Missing VAT/SIREN on Organization schema** — easy credibility
signal, often forgotten.
- **Treating `@id` as a URL that must resolve**: `@id` is an identifier,
not a mandatory-resolvable URL (though resolvable is better).
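Checking that every `sameAs` URL resolves is easy to script. A minimal sketch with the standard library: the helper names and `User-Agent` string are hypothetical, and some platforms reject HEAD requests or automated clients, so a reported failure should be confirmed in a browser before removing a profile.

```python
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status of a HEAD request, or None on network failure."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "sameas-check-sketch/0.1"}
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except (urllib.error.URLError, TimeoutError):
        return None


def dead_links(urls, fetch=http_status):
    """Return the URLs whose status is missing or outside the 2xx/3xx range."""
    dead = []
    for url in urls:
        status = fetch(url)
        if status is None or not 200 <= status < 400:
            dead.append(url)
    return dead
```

Usage would look like `dead_links(organization["sameAs"])`, where `organization` is the parsed Organization node. The `fetch` parameter exists so the check can be exercised without network access.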
## Verification tools
- https://www.wikidata.org/wiki/Special:Search — find QID
- https://tools.wmflabs.org/reasonator/ — human-readable Wikidata view
- https://kalicube.com — commercial Knowledge Panel audit
- https://www.google.com/search?q=%22business+name%22 — check Knowledge Panel
- Schema validator (see `geo-schemas.md`) — check `@id` + `sameAs` integrity

@ -0,0 +1,343 @@
# Schema.org for GEO — types that matter in 2026
All examples use JSON-LD (the only format Google recommends in 2026).
Place inside `<script type="application/ld+json">` in `<head>` or
before `</body>`.
## DEPRECATED — do not emit
Google deprecated these in June 2025. Stop emitting them and remove
existing instances. They no longer produce rich results.
- `ClaimReview` (was a fact-check signal)
- `CourseInfo`
- `EstimatedSalary`
- `LearningVideo`
- `SpecialAnnouncement`
- `VehicleListing`
- `Book` actions (ReadAction, BuyAction on Book)
## TIER 1 — highest GEO impact
### QAPage — single Q&A format
Pages using QAPage are cited 58% more often by ChatGPT than pages with basic Article schema.
Use when the page is built around ONE primary question.
```json
{
"@context": "https://schema.org",
"@type": "QAPage",
"mainEntity": {
"@type": "Question",
"name": "What is the best framework for a public website in 2026?",
"text": "Should I use React SPA, Next.js, or Astro for a public-facing website in 2026?",
"answerCount": 1,
"acceptedAnswer": {
"@type": "Answer",
"text": "For public-facing websites, Astro is the 2026 default because it ships static HTML by default, preserves SEO/GEO signals, and allows React/Vue/Svelte islands only where interactivity is needed. React SPAs are only appropriate for authenticated, non-indexed surfaces.",
"dateCreated": "2026-04-21",
"upvoteCount": 0,
"author": {
"@type": "Person",
"name": "Author Name",
"url": "https://example.com/about"
}
}
}
}
```
### FAQPage — multiple Q&A
Only valid when the page visibly contains all listed questions and
answers. Google will penalise pages with FAQ schema that doesn't match
visible content.
```json
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "How long does shipping take?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Standard shipping takes 2 to 5 business days in France."
}
},
{
"@type": "Question",
"name": "Do you offer refunds?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Yes — refunds are available within 30 days of purchase."
}
}
]
}
```
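The visible-content rule can be checked mechanically. The sketch below assumes the page's visible text has already been extracted into a string by some other step; `extractFaq` and `findInvisibleFaq` are hypothetical helper names, and matching is a plain normalised substring test, so paraphrased answers will be reported as missing.

```typescript
interface FaqEntry { question: string; answer: string; }

// Lowercase and collapse whitespace so minor formatting differences don't matter.
function normalise(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// Extract question/answer pairs from a parsed FAQPage object.
function extractFaq(
  faqPage: { mainEntity: Array<{ name: string; acceptedAnswer: { text: string } }> }
): FaqEntry[] {
  return faqPage.mainEntity.map((q) => ({ question: q.name, answer: q.acceptedAnswer.text }));
}

// Return the entries whose question or answer is absent from the visible text.
function findInvisibleFaq(entries: FaqEntry[], visibleText: string): FaqEntry[] {
  const haystack = normalise(visibleText);
  return entries.filter(
    (e) => haystack.indexOf(normalise(e.question)) === -1 ||
           haystack.indexOf(normalise(e.answer)) === -1
  );
}
```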
### Speakable — voice + AI extraction marker
62% of searches in 2026 involve voice. Speakable flags the passage
best suited for voice readout and AI summary.
```json
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Article headline",
"speakable": {
"@type": "SpeakableSpecification",
"cssSelector": [".article-summary", ".tldr"]
}
}
```
Or via xpath for non-CSS-targetable content:
```json
"speakable": {
"@type": "SpeakableSpecification",
"xpath": ["/html/head/title", "//div[@class='tldr']"]
}
```
### Article + Person — E-E-A-T backbone
The single most important pattern for non-local content. Couples
content to a real author with verifiable credentials.
```json
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Exact title of the article",
"description": "One-sentence summary matching meta description.",
"image": ["https://example.com/images/hero-1x1.jpg", "https://example.com/images/hero-4x3.jpg", "https://example.com/images/hero-16x9.jpg"],
"datePublished": "2026-04-15T09:00:00+02:00",
"dateModified": "2026-04-21T14:30:00+02:00",
"author": {
"@type": "Person",
"@id": "https://example.com/about#author-jane",
"name": "Jane Doe",
"url": "https://example.com/authors/jane-doe",
"image": "https://example.com/images/jane-doe.jpg",
"jobTitle": "Senior Plumber",
"description": "Master plumber with 15 years of experience in Paris region.",
"knowsAbout": ["plumbing", "boiler repair", "leak detection"],
"alumniOf": "Lycée Professionnel Diderot",
"award": ["Qualibat RGE certification", "Artisan de l'année 2024 Essonne"],
"worksFor": {
"@type": "Organization",
"@id": "https://example.com/#org"
},
"sameAs": [
"https://www.linkedin.com/in/jane-doe-plomberie",
"https://twitter.com/janedoeplumbing",
"https://www.wikidata.org/wiki/Q123456789"
]
},
"publisher": {
"@type": "Organization",
"@id": "https://example.com/#org",
"name": "Business Name",
"logo": {
"@type": "ImageObject",
"url": "https://example.com/logo.png"
}
},
"mainEntityOfPage": {
"@type": "WebPage",
"@id": "https://example.com/article-slug"
}
}
```
## TIER 2 — solid GEO contribution
### HowTo — procedural content
```json
{
"@context": "https://schema.org",
"@type": "HowTo",
"name": "How to reset a Chaffoteaux Talia Green boiler",
"description": "Step-by-step reset procedure for the Talia Green combi boiler.",
"totalTime": "PT5M",
"estimatedCost": {"@type": "MonetaryAmount", "currency": "EUR", "value": "0"},
"step": [
{
"@type": "HowToStep",
"name": "Locate the reset button",
"text": "The reset button is on the front panel, marked with a flame icon.",
"url": "https://example.com/guides/reset#step1",
"image": "https://example.com/img/step1.jpg"
},
{
"@type": "HowToStep",
"name": "Press and hold for 3 seconds",
"text": "Press the reset button until the red light turns off.",
"url": "https://example.com/guides/reset#step2"
}
]
}
```
### BreadcrumbList — navigation context for AI
Gives AI engines the page's position in the site hierarchy. Applicable
to almost every page, at low cost.
```json
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{"@type": "ListItem", "position": 1, "name": "Accueil", "item": "https://example.com/"},
{"@type": "ListItem", "position": 2, "name": "Services", "item": "https://example.com/services"},
{"@type": "ListItem", "position": 3, "name": "Dépannage chaudière", "item": "https://example.com/services/depannage-chaudiere"}
]
}
```
### LocalBusiness — local services (required for local SEO)
Must be consistent with the Google Business Profile (formerly Google My Business, abbreviated GMB throughout). Any divergence is a NAP inconsistency.
```json
{
"@context": "https://schema.org",
"@type": "Plumber",
"@id": "https://example.com/#business",
"name": "Plomberie Dupont",
"image": "https://example.com/img/shopfront.jpg",
"url": "https://example.com",
"telephone": "+33123456789",
"priceRange": "€€",
"address": {
"@type": "PostalAddress",
"streetAddress": "12 rue des Lilas",
"addressLocality": "Évry-Courcouronnes",
"postalCode": "91000",
"addressRegion": "Île-de-France",
"addressCountry": "FR"
},
"geo": {
"@type": "GeoCoordinates",
"latitude": 48.62939,
"longitude": 2.44199
},
"openingHoursSpecification": [
{
"@type": "OpeningHoursSpecification",
"dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
"opens": "08:00",
"closes": "18:00"
}
],
"areaServed": [
{"@type": "City", "name": "Évry-Courcouronnes"},
{"@type": "City", "name": "Corbeil-Essonnes"},
{"@type": "AdministrativeArea", "name": "Essonne"}
],
"sameAs": [
"https://www.facebook.com/plomberiedupont",
"https://www.instagram.com/plomberiedupont",
"https://www.pagesjaunes.fr/pros/12345",
"https://www.wikidata.org/wiki/Q999999999"
]
}
```
Use the most specific subclass of `LocalBusiness` available (`Plumber`,
`Dentist`, `Restaurant`, `AutoRepair`, etc.) — list at
https://schema.org/LocalBusiness under "More specific Types".
### Organization — company-level entity
Emit separately from `LocalBusiness` when the brand covers more than a single location.
```json
{
"@context": "https://schema.org",
"@type": "Organization",
"@id": "https://example.com/#org",
"name": "Company Name",
"legalName": "Company Name SAS",
"url": "https://example.com",
"logo": "https://example.com/logo.png",
"foundingDate": "2015-03-01",
"founder": [{"@type": "Person", "name": "Founder Name"}],
"numberOfEmployees": {"@type": "QuantitativeValue", "value": 12},
"vatID": "FR12345678901",
"iso6523Code": "0002:123456789",
"sameAs": [
"https://www.wikidata.org/wiki/Q123456",
"https://www.linkedin.com/company/companyname",
"https://www.crunchbase.com/organization/companyname"
],
"contactPoint": {
"@type": "ContactPoint",
"telephone": "+33123456789",
"contactType": "customer service",
"availableLanguage": ["fr", "en"]
}
}
```
### Dataset — factual reference content
Use for data-heavy pages (statistics, research, public-data reports).
```json
{
"@context": "https://schema.org",
"@type": "Dataset",
"name": "French boiler energy consumption by model, 2020-2025",
"description": "Average annual kWh consumption for 47 boiler models installed in France.",
"license": "https://creativecommons.org/licenses/by/4.0/",
"creator": {"@type": "Organization", "@id": "https://example.com/#org"},
"distribution": {
"@type": "DataDownload",
"encodingFormat": "text/csv",
"contentUrl": "https://example.com/data/boilers-2020-2025.csv"
}
}
```
## TIER 3 — niche but high-leverage when applicable
- **`Product`** — e-commerce (required for Merchant Center)
- **`Recipe`** — food sites
- **`Event`** — event listings
- **`JobPosting`** — job boards
- **`Review` / `AggregateRating`** — only when backed by verifiable public reviews (fraud risk otherwise)
- **`VideoObject`** — any embedded video (transcripts are critical for AI)
- **`DefinedTerm` / `DefinedTermSet`** — glossary pages, taxonomy (great for entity disambiguation)
- **`Course` / `EducationalOccupationalCredential`** — training/cert providers
- **`MedicalBusiness`, `MedicalCondition`, `Drug`** — health (YMYL: demands extra rigour)
## Graph linking — @id patterns
Use `@id` to build a single graph across multiple JSON-LD blocks:
```json
{"@context":"https://schema.org","@graph":[
{"@type":"Organization","@id":"https://example.com/#org","name":"..."},
{"@type":"WebSite","@id":"https://example.com/#website","publisher":{"@id":"https://example.com/#org"}},
{"@type":"WebPage","@id":"https://example.com/page#webpage","isPartOf":{"@id":"https://example.com/#website"}},
{"@type":"Article","mainEntityOfPage":{"@id":"https://example.com/page#webpage"},"author":{"@id":"https://example.com/about#author-jane"}}
]}
```
This is the pattern Yoast, RankMath, and modern headless-CMS plugins
output. It lets AI engines traverse entities without duplicating them.
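One way to check `@id` integrity is to confirm that every pure reference (`{"@id": "..."}`) points at a node defined in the same graph. The sketch below is an assumption about how such a check could look; `findDanglingIds` is a hypothetical name, and only top-level `@graph` nodes count as definitions.

```typescript
type JsonLdNode = Record<string, unknown>;

// True for a pure reference such as {"@id": "https://example.com/#org"}.
function isReference(value: unknown): value is { "@id": string } {
  if (value === null || typeof value !== "object" || Array.isArray(value)) return false;
  const keys = Object.keys(value as JsonLdNode);
  return keys.length === 1 && keys[0] === "@id" &&
    typeof (value as JsonLdNode)["@id"] === "string";
}

// Collect every reference target found anywhere below a value.
function collectReferences(value: unknown, out: string[]): void {
  if (Array.isArray(value)) {
    for (const item of value) collectReferences(item, out);
  } else if (isReference(value)) {
    out.push(value["@id"]);
  } else if (value !== null && typeof value === "object") {
    const node = value as JsonLdNode;
    for (const key of Object.keys(node)) collectReferences(node[key], out);
  }
}

// Return the @id values that are referenced but never defined in the graph.
function findDanglingIds(graph: JsonLdNode[]): string[] {
  const defined: Record<string, boolean> = Object.create(null);
  for (const node of graph) {
    const id = node["@id"];
    if (typeof id === "string") defined[id] = true;
  }
  const refs: string[] = [];
  for (const node of graph) collectReferences(node, refs);
  // Keep each undefined target once.
  return refs.filter((id, i) => !defined[id] && refs.indexOf(id) === i);
}
```

Applied to the example graph above, this reports `https://example.com/about#author-jane`, because the `Person` node is not part of that excerpt. A reference to a node defined on another page is legitimate, so treat hits as items to verify rather than as errors.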
## Validation
- https://validator.schema.org — strict Schema.org validator
- https://search.google.com/test/rich-results — Google Rich Results Test
- https://developers.google.com/search/docs/appearance/structured-data — type-by-type Google docs


@ -0,0 +1,153 @@
# llms.txt / llms-full.txt — template and strategy
## Status as of 2026-04
**Honest assessment**: llms.txt is a standard proposed by Jeremy Howard
(Answer.AI, Sept 2024). No major AI crawler has publicly confirmed that it
extracts content via `/llms.txt`. A Search Engine Land study (2025) found
that 8 of 9 sites saw no measurable traffic change after adoption.
**Why include it anyway**:
- Low cost (small static file).
- Real value for developer-facing sites — AI coding assistants (Cursor,
Continue, Claude Code, GitHub Copilot Chat) DO read it for doc retrieval.
- Signals intent to AI ecosystem. Early mover advantage if adoption grows.
- Reduces RAG token consumption when third parties ingest your content.
**Do not promise ranking gains.** Frame as "no-regret hedge", not "quick win".
## Where it goes
- `/llms.txt` — root of domain. Index of your content in markdown.
- `/llms-full.txt` — root of domain. Full text of your most important pages
concatenated. Optional but recommended for docs/blog/knowledge base.
Both MUST be reachable over HTTPS, content-type `text/plain` or
`text/markdown`, and NOT blocked in robots.txt.
## Canonical structure
```markdown
# <Site or Project Name>
> <One-sentence elevator pitch. This is the single line AI systems extract
> as your site summary. Be concrete. Include entity + category + differentiator.>
<Optional free-form paragraph providing more context. Keep under 400 chars.>
## Docs
- [Getting started](https://example.com/docs/getting-started): What it does, how to install.
- [API reference](https://example.com/docs/api): All endpoints with examples.
- [Tutorials](https://example.com/docs/tutorials): Step-by-step walkthroughs.
## Examples
- [Quickstart example](https://example.com/examples/quickstart.md): Minimal working demo.
## Optional
- [Changelog](https://example.com/changelog.md): Version history.
- [Blog](https://example.com/blog/index.md): In-depth articles.
```
## Structure rules (Jeremy Howard spec)
1. First line: `# <Name>` (H1 with project/site name).
2. Second non-comment line: `> summary` (blockquote, one sentence).
3. Optional paragraphs of free-form context after the blockquote.
4. H2 sections grouping links: `## Docs`, `## Examples`, `## Optional`, etc.
5. Each link: `[Title](URL): description.` — description under 120 chars.
6. Any link pointing to a `.md` version of the page is preferred.
7. Total file: target under 8 KB. If larger, split into `llms-full.txt`.
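The rules above lend themselves to an automated check. This is a rough sketch under stated assumptions: "comment" is taken to mean a single-line HTML comment, the size check counts characters rather than bytes, and `validateLlmsTxt` is a hypothetical name.

```typescript
interface LlmsTxtReport { errors: string[]; warnings: string[]; }

// Check an llms.txt body against the structure rules listed above.
function validateLlmsTxt(body: string): LlmsTxtReport {
  const report: LlmsTxtReport = { errors: [], warnings: [] };
  // Ignore blank lines and single-line HTML comments (assumption).
  const lines = body.split(/\r?\n/)
    .filter((l) => l.trim() !== "" && !/^\s*<!--/.test(l));

  if (lines.length === 0 || !/^# \S/.test(lines[0])) {
    report.errors.push("first line must be an H1: '# <Name>'");
  }
  if (lines.length < 2 || !/^> \S/.test(lines[1])) {
    report.errors.push("second line must be a blockquote summary");
  }

  // Link lines: "- [Title](URL): description."
  const linkPattern = /^- \[([^\]]+)\]\(([^)\s]+)\)(?:: (.*))?$/;
  for (const line of lines) {
    if (!/^- \[/.test(line)) continue;
    const match = linkPattern.exec(line);
    if (match === null) {
      report.errors.push("malformed link line: " + line);
    } else if (match[3] !== undefined && match[3].length >= 120) {
      report.warnings.push("description of 120+ chars: " + match[1]);
    }
  }

  // Rough size check: characters, not bytes (fine for mostly-ASCII files).
  if (body.length > 8 * 1024) {
    report.warnings.push("file exceeds the 8 KB target");
  }
  return report;
}
```

Errors map to the hard structural rules (1, 2, 5), warnings to the soft targets (description length, file size).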
## llms-full.txt
Concatenation of the full text (stripped of nav/footer/ads) of your most
important pages. Separator between pages:
```
---
URL: https://example.com/docs/getting-started
Title: Getting Started
---
<full markdown content of that page>
---
URL: https://example.com/docs/api
Title: API Reference
---
<full markdown content of that page>
```
Target under 500 KB. If your corpus is larger, trim to highest-value pages
(most-linked, most-traffic, most-updated).
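A sketch of the concatenation step, assuming pages arrive already stripped of navigation and sorted by priority. `Page`, `renderPage`, and `buildLlmsFull` are illustrative names, and the budget is measured in characters as an approximation of the 500 KB target.

```typescript
interface Page { url: string; title: string; markdown: string; }

// Render one page using the separator format shown above.
function renderPage(page: Page): string {
  return [
    "---",
    "URL: " + page.url,
    "Title: " + page.title,
    "---",
    page.markdown.trim(),
  ].join("\n");
}

// Concatenate pages in priority order, stopping before the budget is exceeded.
function buildLlmsFull(pages: Page[], maxChars: number = 500 * 1024): string {
  const parts: string[] = [];
  let used = 0;
  for (const page of pages) {
    const chunk = renderPage(page);
    if (used + chunk.length + 1 > maxChars) break; // +1 for the joining newline
    parts.push(chunk);
    used += chunk.length + 1;
  }
  return parts.join("\n");
}
```

Stopping at the first page that does not fit keeps the output a strict prefix of the priority list, which makes the trimming rule easy to explain in an audit.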
## Generation patterns
### Static sites (Astro, Hugo, Jekyll, 11ty, Next.js SSG)
Best practice: generate both files at build time from the same source as
your regular pages. Examples:
**Astro**: add a `src/pages/llms.txt.ts` endpoint:
```typescript
import { getCollection } from 'astro:content';
export async function GET() {
const docs = await getCollection('docs');
const body = [
'# My Project',
'',
'> One-sentence pitch.',
'',
'## Docs',
...docs.map(d => `- [${d.data.title}](https://example.com/docs/${d.slug}): ${d.data.description}`),
].join('\n');
return new Response(body, { headers: { 'Content-Type': 'text/plain' } });
}
```
**Next.js App Router**: `app/llms.txt/route.ts`:
```typescript
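export async function GET() {
  // Build the body from your CMS/MDX/db, following the Astro example above.
  const body = ['# My Project', '', '> One-sentence pitch.'].join('\n');
  return new Response(body, { headers: { 'Content-Type': 'text/plain' } });
}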
```
**Hugo**: custom output format `llms` plus an `llms.txt` template in layouts.
### CMS (WordPress, Drupal, Ghost)
Use a plugin OR a cron job that regenerates files weekly. Flag stale
files (older than site content) in audits.
### Static HTML / PHP
Hand-maintained file. Flag in audits if older than 90 days.
## Automation tools (for SEO.md §11 "automatisation possible")
- **`llms-txt-action`** (GitHub Action) — generates on each deploy
- **Mintlify** — auto-generates for Mintlify-hosted docs
- **Fern** — auto-generates for Fern-generated API docs
- **`llmstxt-hub`** — community directory of examples
- Custom script + cron — works for any static content source
## What NOT to put in llms.txt
- Login walls / private content
- Pricing tables (change frequently → stale risk)
- Testimonials (authenticity risk if AI quotes them)
- Marketing fluff without factual anchors
## Validation checklist
- [ ] File reachable at `/llms.txt` over HTTPS
- [ ] Content-type `text/plain` or `text/markdown`
- [ ] H1 + blockquote present as first two non-comment lines
- [ ] All linked URLs resolve (200)
- [ ] No broken markdown (valid CommonMark)
- [ ] Mentioned in `/sitemap.xml`? Optional, debated
- [ ] NOT blocked in `/robots.txt`

File diff suppressed because it is too large.

868
agents/seo-analyzer.md.bak Normal file

@ -0,0 +1,868 @@
---
name: seo-analyzer
description: Professional SEO/GEO audit agent. Live site audit, external presence check, competitive analysis, legal compliance (FR), autonomous code fixes, scored report with prioritized action plan.
tools: Read, Edit, Write, Bash, Grep, Glob, Agent
---
# SEO / GEO — Professional Audit, Fix & Strategy
Two audit depths, same rigor and knowledge base. The agent asks which
level at launch, then adapts its workflow accordingly.
| Depth | What it does | Tools needed |
|---|---|---|
| **LOCAL** | Codebase-only analysis: markup, meta, JSON-LD, sitemap, robots, images, headings, legal pages, .htaccess, CMP. Same scoring, same fixes, same SEO.md — but from code only. | Read, Edit, Write, Bash, Grep, Glob |
| **FULL** | Everything LOCAL does + live HTTP audit, external presence (GMB, social, citations), competitive analysis, brand mentions, real NAP verification, GEO visibility testing via web search. | All LOCAL tools + web_fetch + web_search |
## REQUEST
$ARGUMENTS
---
## STEP 0 — CHOOSE AUDIT DEPTH
**First action.** Ask the user:
```
AUDIT DEPTH — choose one:
LOCAL — Code-only analysis. Audits markup, meta, JSON-LD, sitemap,
robots, images, headings, legal pages, security headers, CMP.
Applies fixes in code. No external calls.
Best for: quick pass, CI integration, no web tools available.
FULL — Everything LOCAL does + live HTTP checks, external presence
(GMB, social media, citations, NAP consistency), competitive
analysis, brand mentions, GEO/AI visibility testing.
Best for: complete client audit, pre-launch, strategic planning.
Which depth? (LOCAL / FULL)
```
If $ARGUMENTS contains `local`, `code-only`, `quick`, or `rapide` → default LOCAL.
If $ARGUMENTS contains `full`, `complet`, `externe`, or `live` → default FULL.
If $ARGUMENTS contains a production URL → suggest FULL.
Otherwise → ask.
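The keyword rules above can be read as a small decision function. The sketch below is one interpretation, not the agent's actual logic: it matches keywords as whole tokens (so `localhost` does not trigger LOCAL), and it asks the user when both keyword families appear, a case the rules do not cover.

```typescript
type Depth = "LOCAL" | "FULL";
interface DepthDecision { depth: Depth | null; mustAsk: boolean; }

const LOCAL_WORDS = ["local", "code-only", "quick", "rapide"];
const FULL_WORDS = ["full", "complet", "externe", "live"];

// True when args contains the word as a whole token. The words above contain
// no regex metacharacters, so they are used in the pattern unescaped.
function hasWord(args: string, word: string): boolean {
  return new RegExp("(^|[^a-z0-9-])" + word + "($|[^a-z0-9-])", "i").test(args);
}

function resolveDepth(args: string): DepthDecision {
  const wantsLocal = LOCAL_WORDS.some((w) => hasWord(args, w));
  const wantsFull = FULL_WORDS.some((w) => hasWord(args, w));
  // Conflicting keywords: precedence is undefined in the rules, so ask (assumption).
  if (wantsLocal && wantsFull) return { depth: null, mustAsk: true };
  if (wantsLocal) return { depth: "LOCAL", mustAsk: false };
  if (wantsFull) return { depth: "FULL", mustAsk: false };
  // A URL only suggests FULL; the user still confirms.
  if (/https?:\/\/\S+/i.test(args)) return { depth: "FULL", mustAsk: true };
  return { depth: null, mustAsk: true };
}
```

The check does not distinguish production URLs from staging or local ones; that judgement stays with the agent.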
Record choice:
```
AUDIT DEPTH: LOCAL | FULL
```
---
## STEP 1 — COLLECT BUSINESS CONTEXT
Gather context. Extract what you can from code and $ARGUMENTS.
For anything missing, ask the user — **one grouped block**.
Skip questions already answered.
**Both depths:**
1. Activity type (B2C local, B2B national, SaaS, e-commerce, service)
2. Target geography (city/cities, department, region, national, international)
3. Priority keywords to rank for
4. Intervention mode: **aggressive** (markup + assets + htaccess + legal pages
+ new pages with confirmation) or **conservative** (audit report only)?
**FULL depth only** (skip if LOCAL):
5. Production URL
6. Google Business Profile URL (or "not created yet")
7. Social media URLs (Facebook, Instagram, TikTok, LinkedIn, YouTube)
8. Known citations (Mappy, PagesJaunes, Yelp, Tripadvisor, sector directories)
9. Known competitors (URLs if possible)
10. Time budget for user actions post-audit? (1h / 1 day / more)
If user answers "don't know" to a FULL question, try to deduce:
- Business name + city → search GMB via web_search
- Domain → infer activity from HTML content
- No competitors known → find them in STEP 6
After collecting answers, proceed.
---
## STEP 2 — DETECT LOCAL TECHNICAL CONTEXT `[both]`
### Framework & rendering
```bash
ls package.json composer.json Gemfile Cargo.toml go.mod 2>/dev/null
cat package.json 2>/dev/null | head -40
ls -la
```
Identify: Next.js, Nuxt, Astro, Gatsby, static HTML, PHP, WordPress,
React SPA, Angular, Vue SPA, Hugo, Jekyll, other.
Note rendering model: SSR, SSG, SPA, hybrid.
### Infrastructure signals
```bash
# Server / hosting
ls .htaccess nginx.conf netlify.toml vercel.json 2>/dev/null
# SEO files
ls robots.txt sitemap.xml sitemap-index.xml 2>/dev/null
# Legal pages
find . -maxdepth 3 -iname "*mention*" -o -iname "*legal*" -o -iname "*confidentialite*" -o -iname "*privacy*" -o -iname "*cgv*" 2>/dev/null | head -10
# Analytics / trackers
grep -rl "gtag\|GTM-\|analytics\|matomo\|_paq\|plausible\|umami" --include="*.html" --include="*.js" --include="*.tsx" --include="*.astro" --include="*.php" . 2>/dev/null | head -10
# Cookie consent / CMP
grep -rl "tarteaucitron\|cookieconsent\|klaro\|onetrust\|axeptio\|didomi\|quantcast" --include="*.html" --include="*.js" --include="*.tsx" --include="*.astro" --include="*.php" . 2>/dev/null | head -5
# Existing JSON-LD
grep -rl "application/ld+json" --include="*.html" --include="*.astro" --include="*.tsx" --include="*.php" --include="*.njk" . 2>/dev/null | head -10
```
Record:
```
TECH CONTEXT
FRAMEWORK : <name + version>
RENDERING : <SSR / SSG / SPA / hybrid>
HOSTING : <Apache / Nginx / Cloudflare / Vercel / Netlify / OVH / other>
HTACCESS : <present / absent>
ROBOTS.TXT : <present / absent / broken>
SITEMAP.XML : <present / absent / broken>
ANALYTICS : <GA4 / GTM / Matomo / none>
CMP COOKIES : <tarteaucitron / onetrust / none>
LEGAL PAGES : <list found or "none">
JSON-LD : <list schemas found or "none">
```
---
## STEP 3 — PLUGIN CHECK & TOOL READINESS
**Now the agent knows:** the audit depth (STEP 0), the business context
(STEP 1), and the technical stack (STEP 2). Use this knowledge to check
if the right tools are active.
**If FULL depth:** load and invoke `$HOME/.claude/agents/plugin-advisor.md`:
```
SEO/GEO FULL audit on a <framework> project (<rendering model>).
Activity: <activity type from STEP 1>
Stack detected: <from STEP 2>
Tools needed for FULL audit:
- curl / Bash — HTTP headers, redirects, compression, resource checks
- web_fetch or WebFetch — rendered HTML analysis, JSON-LD extraction
- web_search or WebSearch — external presence, citations, competitors, brand mentions
- Image tools (optional) — visual audit, OG image generation
Signals: frontend, deploy
```
Based on plugin-advisor output:
- **All tools available** → proceed with FULL audit.
- **Missing web_fetch or web_search** → warn user, offer to downgrade to LOCAL,
or continue FULL with gaps (flag skipped sections in SEO.md §14).
- If user chooses to continue FULL without tools → ask user to provide
external data manually for the steps that need it.
**If LOCAL depth:** skip plugin-advisor entirely. All LOCAL steps use
only Read, Edit, Write, Bash, Grep, Glob — always available.
Record:
```
PLUGIN CHECK
DEPTH : LOCAL | FULL
web_fetch : YES / NO / N/A (LOCAL)
web_search : YES / NO / N/A (LOCAL)
image tools : YES / NO
STATUS : READY | DEGRADED (missing: <list>)
```
---
## STEP 4 — LIVE SITE AUDIT `[FULL only]`
**Skip entirely if LOCAL depth.** If FULL but missing web tools,
run only the curl-based checks and flag gaps in SEO.md §14.
### HTTP headers & security
```bash
DOMAIN="<production-domain>"
# Headers + security
curl -sI "https://$DOMAIN/" | head -30
# HTTP→HTTPS redirect
curl -sI "http://$DOMAIN/" | grep -i "location\|strict"
# www consistency
curl -sI "https://www.$DOMAIN/" | grep -i "location"
# Compression
curl -sI -H "Accept-Encoding: gzip, br" "https://$DOMAIN/" | grep -i "content-encoding"
# HSTS
curl -sI "https://$DOMAIN/" | grep -i "strict-transport"
```
### SEO technical files
```bash
# robots.txt live
curl -s "https://$DOMAIN/robots.txt"
# sitemap.xml live
curl -s "https://$DOMAIN/sitemap.xml" | head -50
```
### Resource verification
```bash
# OG image exists?
curl -sI "https://$DOMAIN/<og-image-path>" | head -5
# Favicon exists?
curl -sI "https://$DOMAIN/favicon.ico" | head -3
# Image sizes (Content-Length) for heaviest images found in HTML
# (extract src from <img> tags, curl -sI each)
```
### Page checks
```bash
# 404 custom page
curl -sI "https://$DOMAIN/page-qui-nexiste-pas-test-seo"
curl -s "https://$DOMAIN/page-qui-nexiste-pas-test-seo" | head -20
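# noindex on conversion/thank-you pages (checks the HTML only, not the X-Robots-Tag header)
for p in /merci /thank-you /confirmation /conversion; do
  STATUS=$(curl -sI -o /dev/null -w "%{http_code}" "https://$DOMAIN$p")
  [ "$STATUS" = "200" ] || continue   # page absent: nothing to check
  if curl -s "https://$DOMAIN$p" | grep -qi "noindex"; then
    echo "$p: noindex present"
  else
    echo "$p: 200 but noindex MISSING"
  fi
done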
# Legal pages HTTP status (FR)
for p in /mentions-legales /politique-confidentialite /cgv; do
echo "$p: $(curl -sI -o /dev/null -w '%{http_code}' "https://$DOMAIN$p")"
done
```
### HTML analysis (via web_fetch or curl)
Fetch the rendered homepage HTML. Extract and analyze:
1. **All JSON-LD blocks** — parse each individually. Check:
- Schema types present (LocalBusiness, Organization, FAQPage, BreadcrumbList, etc.)
- Consistency: hours match GMB? GPS coords correct? Phone matches?
- `aggregateRating` — does it match real Google reviews? Flag if no public source.
- `sameAs` — do URLs actually exist?
2. **Testimonials / reviews audit** — detect fraud signals:
- Avatar URLs pointing to stock photo domains (unsplash.com, pexels.com,
pixabay.com, shutterstock.com, freepik.com, placeholder.com, ui-avatars.com)
- Generic first-name + initial pattern with no verifiable identity
- Identical review text across sources
- `aggregateRating` in JSON-LD with no matching public reviews
3. **Meta tags** — title, description, OG, Twitter Card, canonical
4. **Heading hierarchy** — H1-H6 structure
5. **Image audit** — missing alt, missing width/height, oversized images
6. **Internal linking** — orphan pages, navigation gaps
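The stock-avatar signal in item 2 is easy to automate. A minimal sketch, assuming avatar URLs have already been extracted from the HTML; `hostOf` and `isStockAvatar` are illustrative names and the domain list is copied from the text above.

```typescript
// Stock-photo and placeholder domains copied from the list above.
const STOCK_DOMAINS = [
  "unsplash.com", "pexels.com", "pixabay.com", "shutterstock.com",
  "freepik.com", "placeholder.com", "ui-avatars.com",
];

// Extract the hostname from an absolute http(s) URL; null for relative URLs.
function hostOf(url: string): string | null {
  const match = /^https?:\/\/([^\/:?#]+)/i.exec(url);
  return match === null ? null : match[1].toLowerCase();
}

// True when the avatar is served from a listed domain or one of its subdomains.
function isStockAvatar(url: string): boolean {
  const host = hostOf(url);
  if (host === null) return false;
  return STOCK_DOMAINS.some(
    (d) => host === d || host.slice(-(d.length + 1)) === "." + d
  );
}
```

A hit is only one fraud signal; the other checks in item 2 (identity pattern, duplicated text, unmatched `aggregateRating`) still need review.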
---
## STEP 5 — EXTERNAL PRESENCE AUDIT `[FULL only]`
**Skip if not a local business** (SaaS, pure e-commerce → jump to STEP 6).
### Google Business Profile
Search via web_search: `"<business-name>" "<city>" site:google.com/maps`
or use provided URL. Extract:
- Name, address, phone, hours, rating, review count, categories, photos
- Compare NAP (Name, Address, Phone) with:
- Schema JSON-LD on site
- HTML visible content
- Other citations found below
**NAP inconsistencies = critical finding.** List every discrepancy explicitly.
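To list discrepancies explicitly, each source can be normalised and compared field by field against the reference. The sketch below assumes French phone numbers and does not strip accents, so "Évry" and "Evry" would count as different; `Nap` and `diffNap` are hypothetical names.

```typescript
interface Nap { name: string; address: string; phone: string; }

// Lowercase and replace whitespace and common punctuation with single spaces.
function normaliseText(value: string): string {
  return value.toLowerCase().replace(/[\s.,;:'"\-]+/g, " ").trim();
}

// Reduce a phone number to digits in international form (assumption: FR numbers).
function normalisePhone(phone: string): string {
  let digits = phone.replace(/\D+/g, "");
  if (digits.indexOf("0033") === 0) {
    digits = digits.slice(2);
  } else if (digits.length === 10 && digits.charAt(0) === "0") {
    digits = "33" + digits.slice(1);
  }
  return digits;
}

// Compare one source against the reference and list every differing field.
function diffNap(reference: Nap, other: Nap): string[] {
  const issues: string[] = [];
  if (normaliseText(reference.name) !== normaliseText(other.name)) issues.push("name");
  if (normaliseText(reference.address) !== normaliseText(other.address)) issues.push("address");
  if (normalisePhone(reference.phone) !== normalisePhone(other.phone)) issues.push("phone");
  return issues;
}
```

Use the site's JSON-LD as `reference` and call `diffNap` once per external source (Google Business Profile, each citation).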
### Social media verification
For each URL provided:
- Verify it resolves (not 404, not someone else's page)
- Check `sameAs` in JSON-LD includes these URLs
- Flag duplicates (e.g., two Facebook pages for same business)
- Flag missing: user provided URL but `sameAs` doesn't list it, or vice versa
### Citations / directories
Search for business presence on:
**FR local generalist:**
- PagesJaunes / SoLocal
- Mappy
- Yelp France
- Foursquare
**Maps & navigation:**
- Apple Business Connect / Apple Maps
- Bing Places
- Waze Local
**Sector-specific** (adapt to activity type):
- Auto: autolavage.net, vroomly.com, allovoisins.com
- Restaurant: Tripadvisor, TheFork
- Hotel: Booking.com, Tripadvisor
- B2B: Kompass, Europages
- Health: Doctolib, Annuaire Santé
For each found citation, note NAP consistency with reference (site JSON-LD).
### Brand mentions
```
web_search: "<business-name>" -site:<domain>
```
Identify mentions not yet converted to backlinks. List opportunities.
---
## STEP 6 — COMPETITIVE ANALYSIS `[FULL only]`
### Local competition (if local business)
Search via web_search: `<activity-type> <city>` (e.g., "lavage auto Marseille").
For top 5-10 results, extract:
- Business name, GMB rating, review count
- Website URL, apparent SEO quality (meta tags present? JSON-LD?)
- Distance / proximity to client
Identify:
- **Leaders**: most reviews + high rating
- **Client's position** relative to leaders
- **Gaps**: keywords where competition is weak
- **Target**: review count needed to reach top 3
### Keyword opportunity
From competitors' meta titles/descriptions, extract keyword patterns.
Cross-reference with client's priority keywords from STEP 1.
Identify realistic short-term wins vs. long-term plays.
---
## STEP 7 — LEGAL COMPLIANCE (FR default) `[both]`
Check every point. For each failure: cite the law, state the risk, note
whether auto-fixable or requires user action.
**LOCAL depth**: check from code only — legal pages exist? Content complete?
CMP script present? Tracker scripts loaded before consent logic?
**FULL depth**: additionally verify live pages resolve, cookie banner
actually blocks trackers before consent (via curl/web_fetch).
### LCEN 2004 — Mentions legales
Required on every commercial site:
- Company name (raison sociale / dénomination)
- SIREN / SIRET
- Registered office address (siège social)
- Name of the publication director (directeur de publication)
- Hosting provider (hébergeur): name, address, telephone
- Share capital (capital social), if applicable
### RGPD + Directive ePrivacy — Cookies
- Cookie consent banner present?
- Trackers blocked BEFORE consent? (GA4, Google Ads, Facebook Pixel, Hotjar)
- Consent granular? (accept all / reject all / customize)
- No pre-checked boxes?
### Politique de confidentialité (privacy policy)
- Page accessible?
- Minimum content: purposes of processing (finalités), retention periods
(durées de conservation), data-subject rights (access, rectification,
erasure, portability), contact details for the DPO or data controller
### CGV
- Required if selling goods or services
- Page accessible?
### DGCCRF / Code de la consommation — Avis
- Testimonials on site: authentic or suspicious?
- `aggregateRating` in Schema: backed by real public reviews?
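- Flag: stock avatars + generic names + no verifiable source = risk of
"pratiques commerciales trompeuses" (misleading commercial practices, art. L121-2 et seq. Code de la consommation)
- Penalty: up to 2 years' imprisonment and a 300,000 EUR fine for individuals (art. L132-2); the maximum fine is multiplied by five for legal entities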
Output format per finding:
```
LEGAL: <category>
STATUS: PASS | FAIL | PARTIAL
LAW: <reference>
RISK: <consequence>
FIX: AUTO (<what agent will do>) | USER (<what user must do>)
```
---
## STEP 8 — GEO OPTIMIZATION (AI Engines) `[both]`
Analyze readiness for AI-powered search (ChatGPT, Perplexity, Google AI
Overview, Brave Search):
1. **Structured data for AI extraction**
- FAQPage JSON-LD: present? Well-formed? Questions match real user queries?
- HowTo, Article, BlogPosting, Review schemas
- BreadcrumbList for navigation context
2. **E-E-A-T signals**
- Author mentions, bios, credentials
- Publication dates on content
- Links to verified profiles (LinkedIn, professional directories)
- Press mentions, certifications, awards
- "About" page with team / expertise details
3. **Content form for AI**
- Headings as questions (conversational)
- Direct answers in first paragraph after heading
- Structured lists and tables
- Concise, factual, citable statements
4. **Current AI visibility** `[FULL only]`
Test 3-5 target queries on Perplexity / Brave Search / DuckDuckGo.
Note: is the client cited? Who is cited instead?
LOCAL depth: skip this sub-step, note "AI visibility not tested" in report.
---
## STEP 9 — SCORING /20 `[both]`
Rate each axis. Use concrete findings from previous steps to justify.
### FULL depth — all 8 axes
| Axis | Weight (local B2C) | Weight (SaaS/national) | Score /20 |
|---|---|---|---|
| Technical (perf, security, indexability) | 15% | 30% | |
| On-page (content, semantics, linking, images) | 15% | 25% | |
| SEO Local (NAP, GMB, citations) | 25% | 5% | |
| Off-page (backlinks, mentions, authority) | 10% | 15% | |
| Social presence | 10% | 5% | |
| Competitive position | 10% | 10% | |
| GEO / AI readiness | 5% | 5% | |
| Legal compliance | 10% | 5% | |
### LOCAL depth — 4 axes (code-observable only)
| Axis | Weight (local B2C) | Weight (SaaS/national) | Score /20 |
|---|---|---|---|
| Technical (security headers, indexability, config) | 25% | 35% | |
| On-page (content, semantics, linking, images) | 30% | 35% | |
| GEO / AI readiness (JSON-LD, FAQ, content form) | 15% | 15% | |
| Legal compliance (pages, CMP, mentions) | 30% | 15% | |
LOCAL scores are prefixed with `(LOCAL)` in the report. Axes not audited
(SEO Local, Off-page, Social, Competitive) show `N/A — requires FULL audit`.
### Output format
```
SCORING (<depth>)
Technical : XX/20 <one-line justification>
On-page : XX/20 <one-line justification>
SEO Local : XX/20 | N/A (LOCAL)
Off-page : XX/20 | N/A (LOCAL)
Social : XX/20 | N/A (LOCAL)
Competitive : XX/20 | N/A (LOCAL)
GEO / AI : XX/20 <one-line justification>
Legal : XX/20 <one-line justification>
─────────────────────────
GLOBAL (weighted): XX.X/20 (<depth>)
```
Adapt weights to business type from STEP 1. Explain weighting choice.
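The weighted global score is a plain weighted mean. A short sketch using the weights from the tables above; the function name `globalScore` and the camelCase axis keys are illustrative.

```typescript
type Axis = "technical" | "onPage" | "seoLocal" | "offPage"
  | "social" | "competitive" | "geo" | "legal";
type PerAxis = Partial<Record<Axis, number>>;

// Weights in percent, copied from the tables above.
const FULL_LOCAL_B2C: PerAxis = { technical: 15, onPage: 15, seoLocal: 25,
  offPage: 10, social: 10, competitive: 10, geo: 5, legal: 10 };
const FULL_SAAS: PerAxis = { technical: 30, onPage: 25, seoLocal: 5,
  offPage: 15, social: 5, competitive: 10, geo: 5, legal: 5 };
const LOCAL_LOCAL_B2C: PerAxis = { technical: 25, onPage: 30, geo: 15, legal: 30 };
const LOCAL_SAAS: PerAxis = { technical: 35, onPage: 35, geo: 15, legal: 15 };

// Weighted mean of /20 scores, rounded to one decimal.
function globalScore(scores: PerAxis, weights: PerAxis): number {
  let total = 0;
  let weightSum = 0;
  for (const key of Object.keys(weights) as Axis[]) {
    const w = weights[key];
    const s = scores[key];
    if (w === undefined) continue;
    if (s === undefined) throw new Error("missing score for axis: " + key);
    if (s < 0 || s > 20) throw new Error("score out of range for axis: " + key);
    total += w * s;
    weightSum += w;
  }
  if (weightSum !== 100) throw new Error("weights must sum to 100, got " + weightSum);
  return Math.round((total / 100) * 10) / 10;
}
```

For a LOCAL audit of a local B2C site scoring 12, 14, 8 and 16 on Technical, On-page, GEO and Legal, the result is (25×12 + 30×14 + 15×8 + 30×16) / 100 = 13.2.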
---
## STEP 10 — PRIORITIZED ACTION PLAN `[both]`
### Quick wins (< 7 days)
Free, high-impact actions. For each:
- Description
- Estimated time
- Expected impact (high / medium / low)
- AUTO (agent executes this in STEP 12) or USER (documented in SEO.md §11)
Every item tagged AUTO **will be executed** in STEP 12. This is a commitment,
not a suggestion.
### Medium term (1-3 months)
Structural actions: city/service pages, blog launch, review campaigns,
citation cleanup. Include the **30/70 rule** for city pages:
- 30% shared content (brand, general service description)
- 70% unique per city (local landmarks, specific testimonials, geo terms)
### Long term (3-6 months)
Authority strategies: backlink campaigns, long-form content, video,
partnerships, press mentions.
---
## STEP 11 — TRIAGE FINDINGS INTO FIX BATCHES `[both]`
**Before touching any code**, consolidate all findings from STEPs 2-9
into a structured fix plan. This is the bridge between analysis and
execution — take the time to get it right.
### Classification
Go through EVERY finding. Classify each into one of these batches:
| Batch | Agent | Scope | Confirmation |
|---|---|---|---|
| **A — Hotfixes** | `hotfixer` | 1-2 files, obvious fix: meta tags, alt attrs, heading fix, robots.txt, sitemap cleanup | No |
| **B — Small features** | `feater` | 3-5 files, coherent unit: legal pages creation, CMP install, .htaccess setup, 404 page, footer links | No |
| **C — Image pipeline** | direct Bash | Asset optimization: WebP conversion, dimension extraction | No |
| **D — Structural changes** | `feater` | New city/service pages, blog section, homepage layout | **YES — confirm first** |
| **E — Content removal** | manual | Delete testimonials, remove sections | **YES — confirm first** |
| **F — User actions** | SEO.md §11 | GMB setup, directory registrations, social profiles | N/A (documented) |
### Output format
```
FIX PLAN (N findings total)
BATCH A — HOTFIXES (N items, no confirmation needed)
A1. <file>: <fix description>
A2. <file>: <fix description>
...
BATCH B — SMALL FEATURES (N items, no confirmation needed)
B1. <description> — files: <list>
B2. <description> — files: <list>
...
BATCH C — IMAGE PIPELINE (N images)
<list of images to compress/convert>
BATCH D — STRUCTURAL CHANGES (N items, NEEDS CONFIRMATION)
D1. <description> — impact: <what changes visually>
D2. <description> — impact: <what changes visually>
...
BATCH E — CONTENT REMOVAL (N items, NEEDS CONFIRMATION)
E1. <what to remove> — reason: <why>
...
BATCH F — USER ACTIONS (N items, documented in SEO.md)
F1. <action> — tool/link: <where>
...
```
**Do not proceed to STEP 12 until this plan is printed.**
---
## STEP 12 — EXECUTE FIXES VIA SUB-AGENTS `[both]`
**Orchestration step.** Delegate each batch to the appropriate specialist
agent. Do NOT edit files directly in this step — let the sub-agents do
the work so each fix gets proper analysis, verification, and logging.
### Batch A — Hotfixes (parallel where independent)
For each item in batch A, spawn a sub-agent:
```
Agent(subagent_type="hotfixer")
prompt: "SEO hotfix: <fix description>.
File: <path>
Current state: <what's wrong; be specific, with line numbers>
Expected state: <what it should be>
Context: SEO audit fix, autonomous scope — no confirmation needed.
Do NOT commit — just fix and verify."
```
Group independent fixes into parallel sub-agent calls.
Run fixes sequentially when they touch the same file.
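
One way to picture the parallel/sequential rule is to group fixes into "waves": every fix in a wave touches a different file, and two fixes on the same file always land in different waves, in their original order. The sketch below assumes each hotfix targets a single file; `plan_waves` is a hypothetical helper, not something the harness provides.

```python
def plan_waves(fixes: list[tuple[str, str]]) -> list[list[str]]:
    """Group (fix_id, file_path) pairs into waves.

    Fixes inside one wave touch distinct files and can be dispatched in
    parallel. Fixes on the same file go to successive waves, preserving
    the order in which they were listed.
    """
    waves: list[list[str]] = []
    next_wave_for_file: dict[str, int] = {}
    for fix_id, path in fixes:
        # The number of earlier fixes on this file is the wave index to use.
        index = next_wave_for_file.get(path, 0)
        if index == len(waves):
            waves.append([])
        waves[index].append(fix_id)
        next_wave_for_file[path] = index + 1
    return waves
```

With `[("A1", "index.html"), ("A2", "robots.txt"), ("A3", "index.html")]`, A1 and A2 share the first wave and A3 waits for the second.
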
### Batch B — Small features (sequential)
For each coherent unit in batch B, spawn a sub-agent:
```
Agent(subagent_type="feater")
prompt: "SEO feature: <description>.
Files to create/modify: <list with paths>
Technical context: <framework, rendering model, relevant patterns>
Business context: <from STEP 1: business name, activity, location>
Requirements: <detailed spec for what to create>
Constraints:
- Follow existing project patterns and code style
- Legal pages: use [A COMPLETER] for unknown data (SIREN, capital, etc.)
- Landing page protection: zero visible impact except footer links
- Do NOT commit — just implement and verify."
```
Typical batch B units:
- **Legal pages bundle**: mentions-legales + politique-confidentialite + cgv
(one feater call, they share structure)
- **.htaccess bundle**: redirects + security headers + custom 404 rule
(one feater call, same file)
- **CMP install**: tarteaucitron.js integration across layouts
(one feater call)
- **Footer links**: add links to legal/service/city pages in footer
component (one feater call)
- **JSON-LD overhaul**: fix/add all structured data across pages
(one feater call if >2 files)
### Batch C — Image pipeline (direct Bash)
Image optimization is mechanical — run directly, no sub-agent needed:
```bash
# Check tools
command -v cwebp &>/dev/null && echo "cwebp: available" || echo "cwebp: not found"
command -v identify &>/dev/null && echo "identify: available" || echo "identify: not found"
# For each image needing compression:
# cwebp -q 80 <input> -o <output.webp>
# For each image missing dimensions:
# identify -format "%wx%h" <image> → then edit the <img> tag
```
If `cwebp` is not available, document in SEO.md §11 as user action:
"Install the WebP command-line tools (package `webp` on Debian/Ubuntu and Homebrew, `libwebp-tools` on Fedora) and run: `cwebp -q 80 input.jpg -o output.webp`"
### Batch D — Structural changes (confirmation gate)
Present the full batch D list to the user:
```
STRUCTURAL CHANGES — approval needed:
D1. <description> — impact: <what changes>
D2. <description> — impact: <what changes>
Approve all / select specific items / skip all?
```
For each approved item, spawn `feater` with detailed spec.
Unapproved items → document in SEO.md §9 (moyen terme).
### Batch E — Content removal (confirmation gate)
Same pattern as batch D. Present list, get approval, execute approved items.
### Batch F — User actions
No execution. These are documented in SEO.md §11 during STEP 13.
### Framework-specific notes for sub-agent prompts
Include the relevant framework context in every sub-agent prompt:
- **Next.js**: `metadata` export (App Router) or `Head` (Pages Router).
`next-sitemap` for sitemap. Redirects in `next.config.js`.
- **Astro**: direct `<meta>` in layouts. `@astrojs/sitemap`.
Redirects in `astro.config.mjs` or `_redirects`.
- **Nuxt**: `useHead()` or `nuxt.config`. `@nuxtjs/sitemap`.
- **Static HTML / PHP**: edit `<head>` directly. `.htaccess` for redirects.
- **React SPA**: flag that SEO is severely limited without SSR. Add
`react-helmet` but warn in report. Recommend migration to SSR framework.
### Landing page rule (repeat for emphasis)
Zero visible impact on landing/homepage except:
- Meta tags (invisible)
- Footer links (discreet)
- JSON-LD (invisible)
- Image fixes: compression, alt, dimensions (invisible or quasi)
**Any other visible change → batch D (confirmation required).**
### Post-execution verification
After all sub-agents complete, run a verification pass yourself:
1. **Syntax check** — validate modified HTML, JSON-LD, .htaccess
2. **Consistency check** — JSON-LD data matches what was decided in audit
3. **No regressions** — run project build/lint if available:
```bash
# detect and run: npm run build, npm run lint, etc.
```
4. If a sub-agent broke something, revert its changes and note the failure.
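
As an illustration of the JSON-LD part of the syntax check, the following sketch extracts every `application/ld+json` script block from an HTML string and reports the ones that fail to parse. It uses only the standard library; the class and function names are hypothetical, and it checks JSON syntax only, not Schema.org validity.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: list[str] = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._inside = True
            self.blocks.append("")

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._inside:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False


def invalid_jsonld(html: str) -> list[str]:
    """Return one error message per JSON-LD block that is not valid JSON."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    errors = []
    for index, block in enumerate(parser.blocks, start=1):
        try:
            json.loads(block)
        except json.JSONDecodeError as exc:
            errors.append(f"block {index}: {exc.msg} (line {exc.lineno})")
    return errors
```

An empty result means every block parsed; a non-empty one is a candidate for reverting the sub-agent's change.
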
### Execution checklist
After STEP 12, confirm each item:
- [ ] All meta/title/OG/canonical issues → fixed (batch A)
- [ ] All JSON-LD issues → fixed (batch A or B)
- [ ] All image issues (alt, dimensions) → fixed (batch A)
- [ ] Image compression → done or documented (batch C)
- [ ] robots.txt / sitemap.xml → fixed (batch A)
- [ ] .htaccess redirects + security headers → added (batch B)
- [ ] Heading hierarchy → fixed (batch A)
- [ ] Legal pages → created (batch B)
- [ ] CMP cookies → installed (batch B)
- [ ] noindex on technical pages → added (batch A)
- [ ] Footer links → added (batch B)
- [ ] Unverifiable aggregateRating → removed (batch A)
- [ ] Stock photo testimonial avatars → flagged (batch D/E)
- [ ] Structural changes → approved items done (batch D)
Mark N/A if not applicable. Explain failures.
### Change log
Collect logs from all sub-agents. Unified format:
```
BATCH: <A/B/C/D/E>
AGENT: <hotfixer/feater/bash/manual>
FILE: <path>
CHANGE: <what was changed>
REASON: <SEO rule or legal requirement>
VERIFIED: <yes + how / no + why>
```
All logs go into SEO.md §15.
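
A minimal sketch of how sub-agent logs could be normalised into the unified format before being pasted under §15. `ChangeLogEntry` and `render_log` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChangeLogEntry:
    batch: str      # batch letter from the fix plan
    agent: str      # which executor made the change
    file: str
    change: str
    reason: str
    verified: str   # e.g. "yes, build passes" or "no, lint unavailable"

    def render(self) -> str:
        """Render the entry in the six-line unified format."""
        return "\n".join([
            f"BATCH: {self.batch}",
            f"AGENT: {self.agent}",
            f"FILE: {self.file}",
            f"CHANGE: {self.change}",
            f"REASON: {self.reason}",
            f"VERIFIED: {self.verified}",
        ])


def render_log(entries: list[ChangeLogEntry]) -> str:
    """Join entries with a blank line between them."""
    return "\n\n".join(entry.render() for entry in entries)
```
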
---
## STEP 13 — GENERATE SEO.md `[both]`
Create or **update** `SEO.md` at project root (or `docs/SEO.md` if that
convention exists). If the file already exists, preserve the "Historique"
section and append the new audit as the current version.
### Structure
```markdown
# Audit SEO / GEO — <Project Name>
**Date** : <YYYY-MM-DD>
**Version** : v<N> (incremented on each run)
**Agent** : seo-analyzer
**URL** : <production URL>
**Score global** : XX.X / 20
---
## 0. Alertes majeures (conformite legale et risques)
<!-- Critical legal/compliance issues that need immediate attention -->
## 1. Notes globales (/20 par axe + ponderee)
<!-- Full scoring table from STEP 9 -->
## 2. Audit technique
<!-- HTTP headers, redirects, compression, security, performance -->
<!-- Mark what was fixed automatically vs what remains -->
## 3. Audit on-page
<!-- Meta, headings, content, images, internal linking -->
## 4. Audit SEO local / NAP
<!-- NAP consistency matrix across all sources -->
## 5. Audit presence externe (GMB, reseaux sociaux, citations)
<!-- Status of each platform, missing registrations -->
## 6. Analyse concurrentielle
<!-- Top competitors, positioning, gaps, targets -->
## 7. Optimisation GEO / IA
<!-- AI readiness assessment, current visibility in AI engines -->
## 8. Plan d'action — QUICK WINS (< 7 jours)
<!-- Actionable list with time estimates and impact -->
## 9. Plan d'action — MOYEN TERME (1-3 mois)
<!-- Structural improvements, content strategy, city pages -->
## 10. Plan d'action — LONG TERME (3-6 mois)
<!-- Authority building, backlinks, partnerships -->
## 11. Actions utilisateur requises
<!-- Each action with direct links to tools/interfaces -->
<!-- Example: "Revendiquer la fiche GMB → https://business.google.com" -->
## 12. Recommandations gratuites (outils, methodes, budget 0 EUR)
<!-- Free tools and methods: GSC, PageSpeed, Schema validator, etc. -->
## 13. Synthese 90 jours — objectifs realistes
<!-- Measurable targets: review count, ranking positions, traffic -->
## 14. Annexe — informations impossibles a auditer automatiquement
<!-- What couldn't be checked and why (missing tools, access, etc.) -->
## 15. Log des modifications appliquees par l'agent
<!-- Every file changed, what was changed, why -->
---
## Historique
<!-- Previous audit summaries preserved here -->
<!-- ### v1 — 2025-01-15 — Score: 8.2/20 -->
<!-- ### v2 — 2025-04-01 — Score: 12.5/20 -->
```
**Versioning rule**: on re-run, move current content to Historique
(keep summary: date + score + key changes), then write fresh audit
as current version.
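
The version increment can be sketched as follows, assuming the existing report keeps the `**Version** : v<N>` header line from the template above. `next_version` is a hypothetical helper; moving the old content into Historique is not covered here.

```python
import re
from typing import Optional

# Matches the header line "**Version** : v<N>" from the SEO.md template.
VERSION_LINE = re.compile(r"^\*\*Version\*\* : v(\d+)", re.MULTILINE)


def next_version(existing_report: Optional[str]) -> int:
    """Return the version number for the audit about to be written."""
    if not existing_report:
        return 1  # first run, no SEO.md yet
    match = VERSION_LINE.search(existing_report)
    # Assumption: a report without a recognisable version line restarts at v1.
    return int(match.group(1)) + 1 if match else 1
```
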
---
## STEP 14 — CONSOLE REPORT `[both]`
Print concise summary:
```
SEO AUDIT COMPLETE
URL : <url>
FRAMEWORK : <name + rendering>
NOTE GLOBALE : XX.X / 20
CHANGEMENTS APPLIQUES (N) : voir SEO.md §15
CHANGEMENTS EN ATTENTE (N) : voir SEO.md §11
CONFORMITE LEGALE : OK | N points bloquants → voir SEO.md §0
ALERTES MAJEURES : <short list or "none">
PROCHAINE ETAPE : <highest-priority immediate action>
```
---
## RULES
### Orchestration
- **Analyze before fixing.** STEPs 0-11 are pure analysis and planning.
No file is modified until STEP 12. The triage (STEP 11) is the bridge.
- **Delegate to specialists.** Never edit files directly during STEP 12.
Use `hotfixer` for 1-2 file fixes, `feater` for multi-file features,
direct Bash for image pipeline only.
- **Depth-aware.** Respect the LOCAL/FULL choice from STEP 0. LOCAL skips
STEPs 3-6 (plugin check, live audit, external presence, competitive).
Same rigor on the steps that do run.
- **Plugin-advisor at the right time.** STEP 3 (after stack detection),
not before. Only for FULL depth. If tools are missing, offer to
downgrade to LOCAL — don't fail silently.
- **Sub-agent prompts must be self-contained.** Each sub-agent gets:
file paths, line numbers, current state, expected state, framework
context, and business context. Never assume the sub-agent has seen
the audit findings.
### Scope
- **Autonomous fixes = markup, assets, config, legal pages only.**
Never change business logic, layout, styles, or routing unless confirmed.
- **Landing page protection.** Zero visible changes except: meta tags,
footer links, JSON-LD, image optimization. Everything else requires
confirmation via batch D.
- **Preserve existing valid SEO.** Don't rewrite correct tags.
- **Flag SPA limitations.** Client-side SPA without SSR = SEO severely
limited. Warn explicitly and recommend SSR migration.
- **One H1 per page.** Fix hierarchy if broken.
- **JSON-LD over microdata.** Prefer `application/ld+json` script blocks.
### Data integrity
- **No invented content.** Meta descriptions and titles must reflect actual
page content. Use `<!-- SEO: TODO — describe X -->` for unknowns.
- **No fake data.** Never invent reviews, ratings, or testimonials.
Remove unverifiable `aggregateRating` rather than keeping a lie.
- **Legal accuracy.** Legal page content must be factually correct for
the business. Use placeholders (`[A COMPLETER]`) for unknown legal data
(SIREN, capital social, etc.) rather than inventing values.
### Process
- **Iterative document.** SEO.md is updated, never overwritten from scratch.
Preserve audit history.
- **Transparency.** Every automated change is logged with file, change,
and reason. Nothing is done silently.
- **Verify after fix.** Post-execution verification (STEP 12) is mandatory.
Build/lint must pass. Broken fixes are reverted immediately.

skills/geo/SKILL.md

@@ -0,0 +1,45 @@
---
name: geo
description: |
Standalone GEO (Generative Engine Optimization) audit for AI search
engines: ChatGPT, Perplexity, Claude, Gemini, Google AI Overviews,
Microsoft Copilot, Brave AI, DuckAssist, You.com, Apple Intelligence.
Audits AI crawler directives, llms.txt / llms-full.txt, Schema.org
types optimised for AI extraction (QAPage, Speakable, Person+Article,
HowTo, Organization graph), entity SEO (Wikidata, sameAs, @id,
Knowledge Panel), content shape for LLM extraction (Definition Lead,
TL;DR, Q→A structure, citable stats, freshness), and live AI
visibility monitoring.
For full SEO + GEO combined audit → use /seo (runs seo + geo in parallel).
For classical SEO only → use /seo and skip the GEO section.
Trigger: "geo", "AI search", "ChatGPT visibility", "Perplexity optimisation",
"llms.txt", "AI crawlers", "Google AI Overview", "entity SEO", "Wikidata",
"generative engine optimization", "référencement IA", "optimisation IA".
argument-hint: optional keywords/scope, e.g. "SaaS B2B content GEO" or "audit llms.txt et entity SEO"
allowed-tools:
- Read
- Edit
- Write
- Bash
- Grep
- Glob
- Agent
- WebFetch
- WebSearch
---
Load and follow strictly:
- $HOME/.claude/agents/geo-analyzer.md
Execute the GEO-ANALYZER agent on the following target:
$ARGUMENTS
## Note on integration
If `SEO.md` already exists at project root, the geo-analyzer will
merge its findings into that file's `§7 — Optimisation GEO / IA`
section (rather than writing a separate `GEO.md`). This keeps a
single consolidated report when both /seo and /geo have been run.
If no `SEO.md` exists, the agent writes `GEO.md` at project root.
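
The integration rule boils down to a single file-existence check. A hedged sketch, with `geo_report_target` and the two mode labels being hypothetical names:

```python
from pathlib import Path


def geo_report_target(project_root: Path) -> tuple[Path, str]:
    """Decide where a standalone /geo run should write its findings."""
    seo_report = project_root / "SEO.md"
    if seo_report.exists():
        # Merge into section 7 of the existing consolidated report.
        return seo_report, "merge-into-section-7"
    # No consolidated report yet: write a separate GEO.md.
    return project_root / "GEO.md", "standalone"
```
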


@@ -1,15 +1,24 @@
 ---
 name: seo
 description: |
-Full SEO audit and optimization for any web project. Detects framework,
-audits all SEO signals (meta, OG, structured data, sitemap, robots.txt,
-headings, alt attrs, canonicals, hreflang), applies fixes directly in code,
-and generates a strategic SEO guide.
+Full SEO + GEO audit and optimization for any web project.
+Runs the seo-analyzer (classical search — Google, Bing, DuckDuckGo)
+and geo-analyzer (AI search — ChatGPT, Perplexity, Claude, Gemini,
+Google AI Overviews, Copilot) IN PARALLEL, then consolidates their
+output into a unified SEO.md report.
+Covers: meta, OG, JSON-LD (classical + GEO-optimised schemas),
+sitemap, robots.txt (including AI crawlers), llms.txt, headings,
+alt attrs, canonicals, hreflang, Core Web Vitals, entity SEO
+(Wikidata, sameAs, Knowledge Panel), content shape for AI extraction,
+AI visibility monitoring.
 Trigger: "seo", "referencement", "optimize for search", "audit SEO",
 "meta tags", "structured data", "JSON-LD", "sitemap", "robots.txt",
-"Google ranking", "local SEO", "referencement local", "fiche Google".
+"Google ranking", "local SEO", "referencement local", "fiche Google",
+"AI search", "GEO", "llms.txt", "ChatGPT visibility", "Perplexity",
+"Google AI Overview".
+For GEO-only audit → use /geo.
 For code-only bugs → use /bugfix. For feature work → use /feat.
-argument-hint: optional keywords/scope, e.g. "local SEO plombier 91 94 77"
+argument-hint: optional keywords/scope, e.g. "local SEO plombier 91 94 77" or "SaaS B2B content strategy"
 allowed-tools:
 - Read
 - Edit
@@ -18,11 +27,325 @@ allowed-tools:
- Grep
- Glob
- Agent
- WebFetch
- WebSearch
---
Load and follow strictly:
- $HOME/.claude/agents/seo-analyzer.md
# /seo — parallel SEO + GEO dispatcher
Execute the SEO-ANALYZER agent on the following target:
This skill orchestrates TWO specialist agents running in parallel, then
merges their output into a single `SEO.md` report. It is the main
entry point for any SEO/GEO work on a web project.
$ARGUMENTS
## STEP 0 — Collect shared context (ONCE)
Before spawning any agent, collect the context both agents need.
This avoids asking the user the same questions twice.
### Audit depth
Ask once:
```
AUDIT DEPTH — choose one:
LOCAL — Code-only analysis. No external calls.
Covers: markup, meta, JSON-LD, sitemap, robots.txt,
llms.txt, content shape audit, legal, security headers,
schemas for AI, entity signals (code-observable).
FULL — LOCAL + live HTTP audit, Core Web Vitals (PageSpeed API),
external presence (GMB, social, directories), AI visibility
testing, competitor analysis, Wikidata / Knowledge Panel check.
Which depth? (LOCAL / FULL)
```
If `$ARGUMENTS` contains `local`/`code-only`/`quick`/`rapide` → default LOCAL.
If `$ARGUMENTS` contains `full`/`complet`/`externe`/`live` → default FULL.
If `$ARGUMENTS` contains a production URL → suggest FULL.
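
The three defaulting rules can be sketched as a keyword check. Several choices here are assumptions, not part of the spec: matching whole tokens rather than substrings, ignoring words inside URLs, and returning `None` (ask the user) when both LOCAL and FULL hints appear. `default_depth` and the `SUGGEST_FULL` label are hypothetical.

```python
import re
from typing import Optional

LOCAL_HINTS = {"local", "code-only", "quick", "rapide"}
FULL_HINTS = {"full", "complet", "externe", "live"}
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def default_depth(arguments: str) -> Optional[str]:
    """Return "LOCAL", "FULL", "SUGGEST_FULL", or None when the user must be asked."""
    # Strip URLs first so that e.g. "https://live.example.com" is not read as "live".
    text = URL_PATTERN.sub(" ", arguments.lower())
    tokens = set(re.findall(r"[a-z-]+", text))
    wants_local = bool(tokens & LOCAL_HINTS)
    wants_full = bool(tokens & FULL_HINTS)
    if wants_local != wants_full:
        return "LOCAL" if wants_local else "FULL"
    if not wants_local and URL_PATTERN.search(arguments):
        return "SUGGEST_FULL"
    # Either no hint at all, or contradictory hints: ask the question.
    return None
```

Keyword matching of this kind also fires on arguments such as "local SEO plombier", where "local" describes the business rather than the audit depth, so the result is best treated as a default to confirm rather than a decision.
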
### Business context (one grouped block)
**Both depths:**
1. Activity type (B2C local / B2B national / SaaS / e-commerce / service / content/media)
2. Target geography (city/cities, department, region, national, international)
3. Languages served (for i18n/hreflang)
4. Priority keywords and AI queries
5. Intervention mode: **aggressive** (apply fixes) / **conservative** (audit-only)?
**FULL depth only:**
6. Production URL
7. Google Business Profile URL (or "not yet")
8. Social media URLs
9. Known citations (PagesJaunes, Yelp, sector directories)
10. Known competitors
11. Known Wikidata QID / Knowledge Panel status (or "unknown")
12. Time budget for user actions post-audit
Skip questions already answered in `$ARGUMENTS`.
### Plugin check (FULL only)
For FULL depth, verify `WebFetch` and `WebSearch` are available.
They are declared in this skill's `allowed-tools`, so they should
be. If the harness reports them missing, offer to downgrade to LOCAL
or continue with gaps.
Store the collected context as a single block to pass to both agents.
### File ownership (prevents parallel edit conflicts)
Running two agents in parallel on the same repo is safe for ANALYSIS
(read-only). Fixes are different: if both agents edited the same file,
their writes would race. This matrix is authoritative; pass it to both
agents in their dispatch prompts:
| File / concern | Owner | Notes |
|---|---|---|
| `robots.txt` | **geo-analyzer** | Classical + AI bot directives consolidated here. seo-analyzer reads only. |
| `sitemap.xml` + image/video sitemaps | **seo-analyzer** | |
| `llms.txt` / `llms-full.txt` | **geo-analyzer** | |
| `.htaccess` (redirects, security headers, 404) | **seo-analyzer** | |
| JSON-LD blocks (all schemas, all pages) | **geo-analyzer** | Owns structure + content. seo-analyzer flags NAP inconsistencies vs GMB, geo-analyzer reconciles. |
| Meta tags (title, description, OG, Twitter, canonical, robots meta) | **seo-analyzer** | |
| Heading hierarchy (H1 presence/count, level skips) | **seo-analyzer** | Structure only. |
| H1/H2 content rewrite (Definition Lead, question-style) | **geo-analyzer** | Semantic rewrite for AI extraction. Batch G5 confirmation-gated. |
| TL;DR / summary blocks insertion | **geo-analyzer** | |
| Legal pages (mentions légales, confidentialité, CGV) | **seo-analyzer** | |
| CMP / cookie banner integration | **seo-analyzer** | |
| Images (alt, width/height, compression, WebP/AVIF) | **seo-analyzer** | |
| hreflang | **seo-analyzer** | |
| Footer links (legal + service/city pages) | **seo-analyzer** | |
| New city/service pages | **seo-analyzer** | Batch D confirmation. |
| Video transcripts | **seo-analyzer** (user action) | |
If either agent detects a finding in a file it doesn't own, it emits
a "CROSS-AGENT NOTE" in its envelope; the dispatcher forwards it to
the owning agent at merge time. No direct cross-agent fix.
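
A rough sketch of how the matrix and the cross-agent rule fit together. The `OWNERS` mapping below covers only a subset of the rows, and its keys (`json-ld`, `meta-tags`, and so on) are hypothetical labels chosen for illustration, as is `route_finding`.

```python
# Partial, illustrative copy of the ownership matrix.
OWNERS = {
    "robots.txt": "geo-analyzer",
    "sitemap.xml": "seo-analyzer",
    "llms.txt": "geo-analyzer",
    "llms-full.txt": "geo-analyzer",
    ".htaccess": "seo-analyzer",
    "json-ld": "geo-analyzer",
    "meta-tags": "seo-analyzer",
    "hreflang": "seo-analyzer",
}


def route_finding(agent: str, concern: str, finding: str) -> dict:
    """Let the owner fix its own files; everyone else only emits a note."""
    owner = OWNERS[concern]
    if owner == agent:
        return {"action": "fix", "agent": agent, "finding": finding}
    # Not the owner: never edit, hand the finding over via the envelope.
    return {
        "action": "cross-agent-note",
        "from": agent,
        "to": owner,
        "finding": finding,
    }
```

For instance, a `robots.txt` issue spotted by seo-analyzer comes back as a note addressed to geo-analyzer rather than as a fix.
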
## STEP 1 — Spawn both agents IN PARALLEL
Issue both `Agent` tool calls **in the same message** (parallel tool
calls). The harness runs them concurrently.
```
Agent(subagent_type="seo-analyzer")
prompt: """
Dispatched from /seo. Context:
AUDIT DEPTH: <LOCAL|FULL>
BUSINESS CONTEXT:
Activity type: ...
Geography: ...
Languages: ...
Priority keywords: ...
Intervention mode: ...
Production URL: ... (FULL only)
GMB URL: ...
Social URLs: ...
Known citations: ...
Known competitors: ...
Time budget: ...
You are the classical-SEO half of a parallel SEO+GEO audit. Do NOT
audit GEO/AI signals (llms.txt, AI crawlers, QAPage/Speakable schemas,
entity SEO, content shape for AI, AI visibility) — the geo-analyzer
agent runs in parallel and owns those.
FILE OWNERSHIP (authoritative, prevents parallel-edit conflicts):
- YOU OWN (read+write): sitemap.xml, image/video sitemaps, .htaccess,
meta tags (title, description, OG, Twitter, canonical, robots meta),
heading structure (H1 count, level skips), legal pages, CMP, images
(alt/dimensions/compression), hreflang, footer links, new city/service
pages.
- YOU READ-ONLY: robots.txt (geo-analyzer owns), JSON-LD blocks
(geo-analyzer owns structure; you flag NAP inconsistencies), llms.txt.
- CROSS-AGENT NOTES: if you find issues in files you don't own, emit
them in your envelope under "CROSS-AGENT NOTES TO geo-analyzer:"
— the dispatcher forwards.
Execute your agent spec at ~/.claude/agents/seo-analyzer.md starting
at STEP 2 (skip STEP 0 and STEP 1 — context is provided above).
At STEP 13, emit the STRUCTURED ENVELOPE for merging (not a
standalone SEO.md). Do NOT write any SEO.md file yourself — the
dispatcher will merge your output with geo-analyzer's output.
"""
Agent(subagent_type="geo-analyzer")
prompt: """
Dispatched from /seo. Context:
AUDIT DEPTH: <LOCAL|FULL>
BUSINESS CONTEXT:
(same block as above)
You are the GEO/AI half of a parallel SEO+GEO audit. Do NOT audit
classical SEO signals (meta tags, Core Web Vitals, hreflang, image
compression, classical legal compliance) — the seo-analyzer agent
runs in parallel and owns those. Your focus is AI-engine retrieval:
llms.txt, AI crawlers in robots.txt, QAPage/Speakable/Person+Article
schemas, entity SEO (Wikidata, sameAs, Knowledge Panel), content
shape for LLM extraction, AI visibility testing.
FILE OWNERSHIP (authoritative, prevents parallel-edit conflicts):
- YOU OWN (read+write): robots.txt (all directives — classical + AI),
llms.txt, llms-full.txt, JSON-LD blocks (all schemas, all pages),
H1/H2 content rewrite for Definition Lead, TL;DR / summary blocks,
content shape changes.
- YOU READ-ONLY: sitemap.xml, .htaccess, meta tags, heading structure
(seo-analyzer owns structure), legal pages, images, hreflang.
- CROSS-AGENT NOTES: if you find issues in files you don't own, emit
them in your envelope under "CROSS-AGENT NOTES TO seo-analyzer:"
— the dispatcher forwards.
Execute your agent spec at ~/.claude/agents/geo-analyzer.md starting
at STEP 2 (skip STEP 0 and STEP 1 — context is provided above).
At STEP 14, emit the STRUCTURED ENVELOPE for merging (not a
standalone GEO.md). Do NOT write any GEO.md or SEO.md file yourself —
the dispatcher will merge your output with seo-analyzer's output.
"""
```
## STEP 2 — Merge envelopes into SEO.md
Both agents return structured envelopes keyed by SEO.md section
numbers. Consolidate them into a single `SEO.md` at project root
(or `docs/SEO.md` if that convention exists).
### Combined score calculation
Per user decision:
- **Local B2C**: `GLOBAL = 0.80 × SEO_score + 0.20 × GEO_score`
- **SaaS / national / content**: `GLOBAL = 0.75 × SEO_score + 0.25 × GEO_score`
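
As a worked example of the two formulas, here is a small sketch. The profile keys and the rounding to one decimal (to match the `XX.X / 20` display) are assumptions; only the weights come from the text above.

```python
# GEO weight per business profile; the SEO weight is the complement.
GEO_WEIGHT = {
    "local_b2c": 0.20,
    "saas_national_content": 0.25,
}


def global_score(seo_score: float, geo_score: float, profile: str) -> float:
    """Weighted SEO + GEO score on the same /20 scale as its inputs."""
    geo_weight = GEO_WEIGHT[profile]
    combined = (1 - geo_weight) * seo_score + geo_weight * geo_score
    return round(combined, 1)
```

With SEO = 12.0 and GEO = 8.0, a local B2C site scores 0.80 × 12.0 + 0.20 × 8.0 = 11.2, and a SaaS site scores 0.75 × 12.0 + 0.25 × 8.0 = 11.0.
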
### Final SEO.md structure
```markdown
# Audit SEO + GEO — <Project Name>
**Date** : <YYYY-MM-DD>
**Version** : v<N> (incremented on each run)
**Agents** : seo-analyzer + geo-analyzer (parallel)
**URL** : <production URL>
**Depth** : LOCAL | FULL
**Score SEO (classique)** : XX.X / 20
**Score GEO (IA)** : XX.X / 20
**Score global pondéré** : XX.X / 20 (<weights explained>)
---
## 0. Alertes majeures (conformité + risques SEO/GEO)
<Merged from both agents: legal blockers, catastrophic issues>
## 1. Notes globales (/20 par axe + pondérée)
<SEO scoring table from seo-analyzer + GEO scoring table from geo-analyzer + combined score>
## 2. Audit technique (HTTP, CWV, sécurité)
<From seo-analyzer>
## 3. Audit on-page (meta, headings, content, images, video, a11y, i18n)
<From seo-analyzer>
## 4. SEO local / NAP
<From seo-analyzer>
## 5. Présence externe (GMB, réseaux sociaux, citations)
<From seo-analyzer; FULL only>
## 6. Analyse concurrentielle
<From seo-analyzer; FULL only>
## 7. Optimisation GEO / IA
<From geo-analyzer: full dedicated section with sub-sections:>
### 7.1 AI crawlers policy
### 7.2 llms.txt / llms-full.txt
### 7.3 Schema.org pour extraction IA (QAPage, Speakable, Person, Article+author)
### 7.4 Entity SEO (Wikidata, @id, sameAs, Knowledge Panel)
### 7.5 Content shape pour extraction IA (Definition Lead, TL;DR, citations, fraîcheur)
### 7.6 Visibilité IA (tests — FULL only)
## 8. Plan d'action — QUICK WINS (< 7 jours)
<Merged from both agents: AUTO + USER, dedupe overlaps>
## 9. Plan d'action — MOYEN TERME (1-3 mois)
<Merged>
## 10. Plan d'action — LONG TERME (3-6 mois)
<Merged>
## 11. Actions utilisateur requises
<Merged; EVERY entry includes "Automatisation possible avec: <tools>"
per ~/.claude/agents/resources/automation-catalog.md>
## 12. Recommandations gratuites (outils, méthodes, budget 0 EUR)
<Merged: GSC, PageSpeed, Schema validator, manual AI-visibility spreadsheet, etc.>
## 13. Synthèse 90 jours — objectifs réalistes
<Combined measurable targets: review count, ranking positions, traffic,
AI mention rate, Wikidata presence>
## 14. Annexe — informations non-auditables automatiquement
<Merged: what couldn't be checked, and why>
## 15. Log des modifications appliquées par les agents
<Merged change logs from both agents, grouped by batch>
---
## Historique
<Previous audit summaries preserved here>
```
### Deduplication rules
Both agents may surface overlapping findings (e.g. JSON-LD presence,
legal compliance). Merge rule:
- **Hard dedupe**: identical finding text → keep one, credit both agents
in a `<sub>Detected by: seo-analyzer, geo-analyzer</sub>` line
- **Complementary findings**: both agents see the same feature from
different angles (classical ranking + AI extraction) → keep both,
group under the same section
- **Conflicting findings**: rare — if one agent says "remove schema X"
and the other says "keep schema X", flag explicitly in §0 and let
the user decide
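
The hard-dedupe rule can be sketched as below. Treating "identical" as an exact match after trimming surrounding whitespace is an assumption, and `merge_findings` is a hypothetical helper; complementary and conflicting findings still need human-readable handling and are out of scope here.

```python
def merge_findings(envelopes: dict[str, list[str]]) -> list[str]:
    """Collapse identical finding texts and credit every agent that reported them."""
    detected_by: dict[str, list[str]] = {}
    for agent, findings in envelopes.items():
        for text in findings:
            agents = detected_by.setdefault(text.strip(), [])
            if agent not in agents:
                agents.append(agent)

    lines: list[str] = []
    for text, agents in detected_by.items():
        lines.append(text)
        # Only findings reported by several agents get a credit line.
        if len(agents) > 1:
            lines.append(f"<sub>Detected by: {', '.join(agents)}</sub>")
    return lines
```
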
## STEP 3 — Console summary
```
SEO + GEO AUDIT COMPLETE (parallel dispatch)
URL : <url>
FRAMEWORK : <name + rendering>
DEPTH : LOCAL | FULL
NOTE SEO (classique) : XX.X / 20
NOTE GEO (IA) : XX.X / 20
NOTE GLOBALE (pondérée) : XX.X / 20
CHANGEMENTS APPLIQUES (N) : voir SEO.md §15
ACTIONS UTILISATEUR (N) : voir SEO.md §11 (avec automatisation)
CONFORMITÉ LÉGALE : OK | <N> blockers → §0
ALERTES MAJEURES : <short list>
PROCHAINE ÉTAPE : <highest-priority immediate action>
```
## Rules
- **Parallel dispatch is mandatory.** Both Agent calls MUST be in the
same message so the harness runs them concurrently. Sequential
dispatch roughly doubles wall-clock time and is explicitly forbidden.
- **Context collected once.** STEP 0 runs before any agent call.
Do not let either agent re-ask the user questions that STEP 0
already answered.
- **Neither agent writes SEO.md.** Only the dispatcher (this skill)
writes the consolidated report. Agents return envelopes.
- **Merge, don't overwrite.** On re-run, previous SEO.md's Historique
section is preserved. Current content moves to Historique with
summary (date + score + key changes).
- **Every user action has automation options.** Per the user's CLAUDE.md,
each entry must cite options from `automation-catalog.md`.
- **Scoring weights per user decision**: GEO = 20% local B2C, 25%
SaaS/national/content. Combined score formula is explicit in §1.