Short answer
A manual prompt visibility audit is a repeatable process for checking how AI systems describe your brand, competitors, category, and source material. It uses a stable prompt set, clear logging, saved outputs, competitor tracking, citation notes, and confidence labels. The goal is to identify patterns over time, not to overinterpret one answer.
Outcome statement
By the end of this guide, you will have a stable prompt set, a logging structure, and a repeatable schedule for turning AI answer patterns into prioritized infrastructure tasks.
Overview
A one-off prompt check tells you how a model answered once. A repeatable audit tells you how AI systems describe you over time, which is the signal worth acting on.
This guide is the operational, spreadsheet-level version of a manual teardown. It is designed to run on a schedule so the outputs become comparable. Prompt outputs vary between sessions and systems, so a single run is directional at best. Repetition is what turns scattered observations into patterns.
Step-by-step workflow
Step 1: Define the goal of the audit
Decide what the audit is for before you write prompts. Common goals: check whether the brand appears for buyer-intent questions, track how accurately it is described, or watch a competitor's movement. The goal shapes the prompt set.
Step 2: Choose a stable prompt set
Write a fixed set of prompts and keep them the same across runs. Changing the prompts every time makes results impossible to compare. Treat the prompt set as a versioned asset, similar to a Controlled Prompt Set.
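One way to treat the prompt set as a versioned asset is to store it with a version string and a content fingerprint, then log that fingerprint with every run. The sketch below is a minimal illustration, not a prescribed format: the `PROMPT_SET` structure, the version label, and the `fingerprint` helper are all assumptions made for this example.

```python
import hashlib
import json

# Hypothetical prompt set. The version label and prompt texts are illustrative.
PROMPT_SET = {
    "version": "v1",
    "prompts": [
        "what does [brand] do",
        "best tools for [category]",
    ],
}


def fingerprint(prompt_set: dict) -> str:
    """Return a short hash that changes when the version or any prompt changes."""
    canonical = json.dumps(prompt_set, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
```

If two runs carry different fingerprints, their results were produced by different prompt sets and should not be compared directly.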
Step 3: Group prompts by intent
Organize prompts into intent categories so coverage is deliberate:
- Brand prompts: what does [brand] do, who is [brand] for
- Category prompts: best tools for [category], who helps with [problem]
- Problem prompts: how do I improve AI search visibility
- Comparison prompts: [brand] vs [competitor], best alternative to [competitor]
- Local or segment prompts: best [category] for [segment or location]
- Product/service prompts: does [brand] offer [capability]
- Next-step prompts: how do I get started with [category]
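The bracketed placeholders in the list above can be filled from a small template table so every run uses identical wording. This is a sketch under stated assumptions: only three of the intent groups are shown, and the names `PROMPT_TEMPLATES`, `fill`, `build_prompts`, and the example values are hypothetical.

```python
# Templates copied from the intent groups above; only three groups shown.
PROMPT_TEMPLATES = {
    "brand": ["what does [brand] do", "who is [brand] for"],
    "category": ["best tools for [category]"],
    "comparison": ["[brand] vs [competitor]", "best alternative to [competitor]"],
}


def fill(template: str, values: dict) -> str:
    """Replace each [key] placeholder with its value."""
    for key, value in values.items():
        template = template.replace(f"[{key}]", value)
    return template


def build_prompts(templates: dict, values: dict) -> list:
    """Return one {"intent", "prompt"} record per template, in table order."""
    return [
        {"intent": intent, "prompt": fill(template, values)}
        for intent, group in templates.items()
        for template in group
    ]


# Hypothetical values for illustration only.
EXAMPLE_VALUES = {"brand": "ExampleCo", "category": "invoicing", "competitor": "OtherCo"}
prompts = build_prompts(PROMPT_TEMPLATES, EXAMPLE_VALUES)
```

Keeping the intent label on each record makes it possible to check coverage per group later.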
Step 4: Choose which AI systems to check
Test more than one system, such as ChatGPT, Perplexity, and Gemini, plus Google AI Overviews where relevant. Record which system produced each answer, because different systems synthesize from different sources.
Step 5: Record brand mentions and competitor mentions
For each prompt and system, log whether your brand appeared, its mention position (the order in which it was named relative to other brands), and which competitors were named. Competitor patterns often reveal more than your own mention count.
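If answers are saved as text, a simple name search can pre-fill the mention fields before a human reviews them. The following is a rough sketch, assuming whole-word, case-insensitive matching is good enough. The function names and the sample answer are hypothetical, and position here means order of first appearance among the tracked names.

```python
import re


def find_mentions(answer: str, names: list) -> list:
    """Return (name, character offset) pairs, ordered by first appearance."""
    found = []
    for name in names:
        pattern = r"\b" + re.escape(name) + r"\b"
        match = re.search(pattern, answer, flags=re.IGNORECASE)
        if match:
            found.append((name, match.start()))
    return sorted(found, key=lambda item: item[1])


def log_mentions(answer: str, brand: str, competitors: list) -> dict:
    """Summarize brand and competitor mentions for one saved answer."""
    ordered = [name for name, _ in find_mentions(answer, [brand] + competitors)]
    return {
        "brand_mentioned": brand in ordered,
        "mention_position": ordered.index(brand) + 1 if brand in ordered else None,
        "competitors_mentioned": [name for name in ordered if name != brand],
    }


# Made-up answer text for illustration.
sample = "Popular options include OtherCo, ExampleCo, and ThirdCo."
result = log_mentions(sample, "ExampleCo", ["OtherCo", "ThirdCo", "FourthCo"])
```

String matching will miss misspellings and paraphrased references, so treat the output as a first pass rather than a final log entry.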
Step 6: Record citations and source URLs where available
When a system cites sources, capture the URLs. Over time, cited-source patterns show which pages and which third parties AI systems rely on to describe your category.
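When cited URLs appear inline in a saved answer, they can be pulled out and grouped by domain to show which sources recur. This is a minimal sketch: the regular expression is a deliberately simple assumption that will not handle every URL form, and `extract_urls` and `count_domains` are hypothetical helper names.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Simple pattern: stops at whitespace and common closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")


def extract_urls(answer: str) -> list:
    """Return URLs found in the text, with trailing punctuation stripped."""
    return [url.rstrip(".,;:") for url in URL_PATTERN.findall(answer)]


def count_domains(urls: list) -> Counter:
    """Count how often each domain is cited."""
    return Counter(urlparse(url).netloc.lower() for url in urls)


# Made-up answer text using reserved example domains.
text = "See https://example.com/guide. Also (https://docs.example.org/a) and https://example.com/b,"
urls = extract_urls(text)
domains = count_domains(urls)
```

Running the domain count across several audit cycles is one way to surface the cited-source patterns described above.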
Step 7: Score answer accuracy
Note whether the description was accurate, vague, outdated, or wrong. Track definition drift: the gap between how you define yourself and how AI systems describe you.
Step 8: Label findings by confidence level
Apply the Public Proof Package labels so a fast audit does not become overclaiming:
- Proof-Grade: directly verifiable, such as a page that exists or schema that is present.
- Directional: a real but non-causal signal, such as the brand missing across several prompts this run.
- Not Yet Measured: anything requiring more data, such as revenue or retention impact.
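The three labels can be encoded as a fixed set of values so the spreadsheet never accumulates ad hoc wording. In the sketch below the label names come from the list above, but the two-question decision rule in `label_finding` is an assumption made for illustration, not part of the Public Proof Package itself.

```python
from enum import Enum


class Confidence(Enum):
    PROOF_GRADE = "Proof-Grade"
    DIRECTIONAL = "Directional"
    NOT_YET_MEASURED = "Not Yet Measured"


def label_finding(directly_verifiable: bool, observed_in_audit: bool) -> Confidence:
    """Assumed rule: verifiable facts first, then observed signals, else unmeasured."""
    if directly_verifiable:
        return Confidence.PROOF_GRADE
    if observed_in_audit:
        return Confidence.DIRECTIONAL
    return Confidence.NOT_YET_MEASURED
```

For example, "schema is present on the pricing page" would be labeled Proof-Grade, while "the brand was missing across several prompts this run" would be Directional.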
Step 9: Repeat on a consistent schedule
Run the same prompt set on a fixed cadence, such as monthly. Consistent timing is what makes movement meaningful instead of anecdotal.
Step 10: Turn findings into infrastructure tasks
Map each recurring gap to an action: publish a missing answer, correct a definition, add schema, surface proof, or connect internal links. The prioritized list is the output that matters.
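A small lookup table can turn logged gaps into a ranked task list. This sketch assumes each finding has been tagged with a gap type during logging. The gap names, the `min_occurrences` threshold, and the `prioritize` helper are hypothetical, while the actions are the ones named in this step.

```python
from collections import Counter

# Gap names are illustrative; actions are taken from the step above.
ACTIONS = {
    "missing_answer": "publish a missing answer",
    "wrong_definition": "correct a definition",
    "no_schema": "add schema",
    "hidden_proof": "surface proof",
    "unlinked_page": "connect internal links",
}


def prioritize(observed_gaps: list, min_occurrences: int = 2) -> list:
    """Return (action, count) pairs for recurring gaps, most frequent first."""
    counts = Counter(observed_gaps)
    return [
        (ACTIONS[gap], count)
        for gap, count in counts.most_common()
        if count >= min_occurrences and gap in ACTIONS
    ]


# Made-up gap tags from several audit rows.
gaps = ["missing_answer", "no_schema", "missing_answer",
        "wrong_definition", "no_schema", "missing_answer"]
tasks = prioritize(gaps)
```

Gaps seen only once are filtered out here on the assumption that a single observation is not yet a pattern.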
Suggested spreadsheet columns
Log each result in one row:
- Date
- System tested
- Prompt
- Brand mentioned?
- Competitors mentioned
- Mention position
- Citation/source URL
- Accuracy
- Sentiment
- Recommendation strength
- Notes
- Confidence label
- Follow-up task
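The same columns work as a CSV header if the log is kept in a file rather than a spreadsheet application. This is a minimal sketch using Python's standard `csv` module. The `write_log` helper and the example row are assumptions for illustration, and every value in the row is made up.

```python
import csv
import io

COLUMNS = [
    "Date", "System tested", "Prompt", "Brand mentioned?",
    "Competitors mentioned", "Mention position", "Citation/source URL",
    "Accuracy", "Sentiment", "Recommendation strength", "Notes",
    "Confidence label", "Follow-up task",
]


def write_log(rows: list, handle) -> None:
    """Write audit rows as CSV; columns missing from a row are left blank."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Illustrative row only.
example_row = {
    "Date": "2025-01-06",
    "System tested": "Perplexity",
    "Prompt": "best tools for [category]",
    "Brand mentioned?": "no",
    "Competitors mentioned": "OtherCo; ThirdCo",
    "Confidence label": "Directional",
}

buffer = io.StringIO()  # replace with open("audit_log.csv", "w", newline="") for a file
write_log([example_row], buffer)
```

Fixing the column order in one place keeps exports from different runs aligned.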
Decision points and conditions
- If prompt outputs vary widely between runs: increase the number of prompts and runs before drawing conclusions. One answer is not final truth.
- If manual logging becomes unmanageable: EchoScan can monitor more systematically with recurring, controlled checks.
- If gaps trace back to missing pages: the fix is answer infrastructure, not more monitoring. EntityMesh builds infrastructure from those gaps.
- If you need a comparative metric: SOMV (Share of Model Voice) can be measured from prompt sets over time, but treat short-window movement as directional.
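For the first condition above, a per-prompt mention rate across runs gives a rough sense of whether results are stable enough to act on. The sketch below is illustrative only: the minimum run count and the 0.2 to 0.8 "mixed result" band are arbitrary assumptions, and `mention_rate` and `needs_more_runs` are hypothetical helper names.

```python
def mention_rate(runs: list) -> float:
    """Fraction of runs in which the brand was mentioned (runs are booleans)."""
    return sum(runs) / len(runs) if runs else 0.0


def needs_more_runs(runs: list, min_runs: int = 5,
                    low: float = 0.2, high: float = 0.8) -> bool:
    """Flag a prompt as inconclusive if there are too few runs or mixed results."""
    if len(runs) < min_runs:
        return True
    return low < mention_rate(runs) < high
```

A prompt mentioned in five of five runs would not be flagged, while one mentioned in three of five would be, which suggests logging more runs before drawing a conclusion.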
Common mistakes
- Mistake: Treating one prompt run as proof.
Safer approach: repeat the stable prompt set on a schedule and label findings by confidence.
- Mistake: Changing prompts every run.
Safer approach: keep a fixed, versioned prompt set so results are comparable.
- Mistake: Assuming AI systems produce consistent answers.
Safer approach: test multiple systems and expect variation between sessions.
FAQ
Q: How often should I run a manual prompt visibility audit?
A: A consistent cadence, such as monthly, is more useful than an occasional deep dive, because it makes movement comparable over time.
Q: Can a manual audit prove my AI visibility improved?
A: No. It shows directional patterns. Prompt outputs vary between sessions and systems, so movement across runs is a signal to investigate, not proof that visibility improved or that a specific change caused it.
Q: When should I move from manual audits to EchoScan?
A: When manual logging becomes hard to maintain or you need systematic, recurring monitoring across many prompts and systems.
Related reading
Companion blog posts on the Blue Ninja Systems blog cover this topic in more depth: "What is Share of Model Voice?", "Prompt tracking does not fix AI search visibility", and "How to run a manual AI visibility teardown in 30 minutes", the tactical companion to this operational guide.
Next step
Run the free EntityMesh scan at entitymesh.io if your manual audit shows weak visibility, inaccurate descriptions, or missing source material.