A content audit is a systematic inventory of every content item on your site, recording each item's URL, title, word count, last-updated date, traffic data, and current categorization. It's the natural precursor to card sorting: you can't sort what you haven't inventoried. An audit reveals the true scope of your content, surfaces what's outdated or redundant, and directly shapes the card set for your study.
At minimum, capture these fields for every content item:
| Field | Why It Matters |
|---|---|
| URL | Unique identifier and current location |
| Page title | Becomes a candidate card label |
| Last updated | Flags stale content |
| Monthly traffic | Separates high-value pages from dead weight |
| Content type | Article, FAQ, video, tool — affects how it should be categorized |
| Current section | Where it lives now (which your sort may change) |
| Word count | Identifies thin content that might need consolidation |
Optional but useful: content owner, reading level, SEO keyword target, and a qualitative rating (keep, revise, merge, delete). The qualitative rating is where the audit earns its keep — it forces decisions about content before you ask users to organize it.
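One way to keep these fields consistent across a spreadsheet export or script is to give each row a fixed shape. The sketch below is a minimal illustration, not a required schema: the class name `AuditRow`, the field names, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AuditRow:
    """One row of the audit. Field names are illustrative assumptions."""
    url: str                      # unique identifier and current location
    page_title: str               # candidate card label
    last_updated: date            # flags stale content
    monthly_traffic: int          # e.g. monthly pageviews from analytics
    content_type: str             # "article", "faq", "video", "tool"
    current_section: str          # where the page lives today
    word_count: int               # helps spot thin content
    rating: Optional[str] = None  # "keep", "revise", "merge", or "delete"


# Hypothetical sample row; the rating is filled in during manual review.
row = AuditRow(
    url="https://example.com/help/reset-password",
    page_title="Reset your password",
    last_updated=date(2023, 4, 2),
    monthly_traffic=1840,
    content_type="article",
    current_section="Account",
    word_count=412,
)
row.rating = "keep"
```

Leaving `rating` empty by default makes unreviewed rows easy to find later.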
Start with a crawl. Tools like Screaming Frog, Sitebulb, or even a simple wget can spider your site and produce a URL list in minutes. That gives you the skeleton.
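Dedicated crawlers usually export titles and word counts for you. If all you have is a URL list and the saved HTML, a rough sketch using Python's standard `html.parser` can fill those two columns. The `PageStats` class and the sample HTML are assumptions for illustration, and the word count is an approximation that simply skips `<script>` and `<style>` content.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Collects the <title> text and a rough count of visible words."""

    SKIP = {"script", "style"}  # content that is not visible prose

    def __init__(self):
        super().__init__()
        self.title = ""
        self.word_count = 0
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
        elif not self._skip_depth:
            self.word_count += len(data.split())


# Hypothetical page content standing in for a fetched URL.
html = """<html><head><title>Reset your password</title>
<style>p { color: red; }</style></head>
<body><h1>Reset your password</h1><p>Open Settings and choose Security.</p>
<script>var x = 1;</script></body></html>"""

stats = PageStats()
stats.feed(html)
print(stats.title, stats.word_count)
```

Navigation and footer text are counted too, so treat the number as a way to spot thin pages rather than an exact measure.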
The manual work follows. For each URL, open the page and assess: Is the content accurate? When was it last reviewed? Does it duplicate another page? Is the traffic meaningful or is it ghost content nobody visits?
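Judging accuracy and duplication is manual work, but the date and traffic columns can help you decide which pages to open first. The sketch below flags rows as possibly stale or ghost content. The thresholds (roughly 18 months, fewer than 10 visits a month), the function name, and the sample rows are assumptions to tune for your own site.

```python
from datetime import date


def review_flags(row, today, stale_after_days=540, ghost_below=10):
    """Return review hints for one audit row; thresholds are assumptions."""
    flags = []
    if (today - row["last_updated"]).days > stale_after_days:
        flags.append("stale")
    if row["monthly_traffic"] < ghost_below:
        flags.append("ghost")
    return flags


today = date(2024, 6, 1)  # fixed date so the example is repeatable

old_page = {"last_updated": date(2021, 1, 15), "monthly_traffic": 3}
fresh_page = {"last_updated": date(2024, 3, 1), "monthly_traffic": 900}

print(review_flags(old_page, today))    # ['stale', 'ghost']
print(review_flags(fresh_page, today))  # []
```

A flag is only a prompt to look at the page. The keep, revise, merge, or delete decision still comes from reading it.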
For a site with 200 pages, expect 2-3 days of focused work. The crawl takes 10 minutes. The remaining time is human judgment — reading pages, making keep/kill/merge decisions, and noting content gaps. This can't be fully automated because accuracy and relevance require context a crawler can't evaluate.
The audit directly determines three things about your card sort:
What cards to include. Every content item you mark "keep" or "revise" is a candidate card. Items marked "delete" don't need sorting. Items marked "merge" become a single card representing the combined content.
What cards to add. The audit reveals gaps. If you find zero content about a feature your product shipped 6 months ago, that gap should become a card in your sort — you need to know where users would look for it once it exists.
What to expect from results. Content you know is poorly placed today (buried 3 levels deep with declining traffic) will likely end up somewhere different in the sort. The audit lets you form hypotheses in advance: cards whose current placement is confusing are the ones most likely to show low agreement rates.
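The first two rules above (which cards to include, which to add) can be sketched as a small function. This is a minimal illustration under stated assumptions: the row keys `title`, `rating`, and `merge_into`, the function name, and the sample data are all hypothetical.

```python
def build_card_set(rows, gaps):
    """Turn rated audit rows plus identified gaps into a list of card labels."""
    cards = []
    seen_merge_groups = set()
    for row in rows:
        rating = row["rating"]
        if rating in ("keep", "revise"):
            cards.append(row["title"])
        elif rating == "merge":
            # All rows sharing a merge target become one card.
            group = row["merge_into"]
            if group not in seen_merge_groups:
                seen_merge_groups.add(group)
                cards.append(group)
        # Rows rated "delete" are dropped.
    cards.extend(gaps)  # content that does not exist yet still gets a card
    return cards


rows = [
    {"title": "Reset your password", "rating": "keep"},
    {"title": "Export reports (legacy)", "rating": "delete"},
    {"title": "Invite teammates", "rating": "merge", "merge_into": "Manage team members"},
    {"title": "Remove a teammate", "rating": "merge", "merge_into": "Manage team members"},
    {"title": "Webhooks overview", "rating": "revise"},
]
cards = build_card_set(rows, gaps=["API versioning"])
print(cards)
```

Page titles are only candidate labels, so expect to reword some cards before the study.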
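You're preparing a card sort for a B2B SaaS help center. The crawl returns 312 URLs. During the manual review, you discover 12 articles about features that no longer exist, 23 articles that duplicate one another, no content at all on API versioning, and a Troubleshooting section holding 89 articles.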
Those 12 articles about dead features get cut — no need to sort them. The 23 duplicates get merged into 11 consolidated topics. The missing API versioning content gets added as a new card. And the 89-article Troubleshooting section becomes a priority area for the sort, since it's clearly overstuffed and probably poorly organized.
Without the audit, you'd have included cards for nonexistent features, created duplicate cards for the same topic, missed the API gap entirely, and lacked context about which sections were bloated. The sort results would've been harder to interpret and act on.
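To see how the audit changes the size of the candidate card pool in this example, the arithmetic can be written out. It assumes every URL not mentioned above is rated "keep" or "revise", which the example does not state. The variable names are illustrative.

```python
crawled_urls = 312
dead_feature_articles = 12   # cut entirely
duplicate_articles = 23      # replaced by consolidated topics
consolidated_topics = 11     # one card per merged topic
gap_cards = 1                # the missing API versioning content

candidate_cards = (
    crawled_urls
    - dead_feature_articles
    - duplicate_articles
    + consolidated_topics
    + gap_cards
)
print(candidate_cards)  # 289
```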
Beyond individual pages, the audit exposes structural patterns. Maybe 60% of your content sits under one top-level category while three other categories have fewer than 10 items each. That imbalance suggests the dominant category needs subdivision — something your card sort can directly address.
Or you might find that your taxonomy has 8 top-level categories but users only visit 4 of them regularly. The low-traffic categories might be candidates for consolidation, and the card sort can test whether users agree.
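A quick count over the "current section" column is enough to spot this kind of imbalance. The sketch below reports sections holding a dominant share of the content and sections with only a handful of items. The 60% and 10-item thresholds mirror the figures above; the function name and the sample distribution are assumptions.

```python
from collections import Counter


def imbalance_report(sections, dominant_share=0.6, small_count=10):
    """Return (dominant sections, small sections) from a list of section names."""
    counts = Counter(sections)
    total = sum(counts.values())
    dominant = [name for name, n in counts.items() if n / total >= dominant_share]
    small = [name for name, n in counts.items() if n < small_count]
    return dominant, small


# Hypothetical distribution: one section name per audited page.
sections = (
    ["Troubleshooting"] * 65
    + ["Getting started"] * 15
    + ["Billing"] * 8
    + ["Account"] * 7
    + ["API"] * 5
)
dominant, small = imbalance_report(sections)
print(dominant)  # ['Troubleshooting']
print(small)     # ['Billing', 'Account', 'API']
```

Summing monthly traffic per section instead of counting pages would surface the low-traffic categories in the same way.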
The audit is also the right time to evaluate whether your current labels match the language your users actually use. If your analytics show that users search for "billing" but your nav says "subscription management," that's a labeling disconnect the card sort should test.
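If you can export site-search terms with their counts, a simple comparison against your navigation labels surfaces candidates like this. The sketch below lists frequent search terms that appear in no nav label. It uses plain substring matching, so synonyms and plurals are not handled; the function name, the threshold, and the sample data are assumptions.

```python
def unmatched_search_terms(search_counts, nav_labels, min_searches=50):
    """Frequent search terms that do not appear in any navigation label."""
    labels = [label.lower() for label in nav_labels]
    misses = []
    for term, count in search_counts.items():
        appears = any(term.lower() in label for label in labels)
        if count >= min_searches and not appears:
            misses.append((term, count))
    # Most-searched terms first.
    return sorted(misses, key=lambda item: item[1], reverse=True)


# Hypothetical search log summary and navigation labels.
search_counts = {"billing": 420, "invoice": 130, "api": 95, "sso": 12}
nav_labels = ["Subscription management", "API reference", "Account settings"]

mismatches = unmatched_search_terms(search_counts, nav_labels)
print(mismatches)  # [('billing', 420), ('invoice', 130)]
```

Terms on this list are worth checking against card wording, since cards written in your internal vocabulary can carry the same mismatch into the study.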
What is included in a content audit? A content audit catalogs every content item on your site with key metadata including URL, page title, word count, last updated date, traffic volume, content type, and current category or section. Some audits also track content owner, reading level, SEO performance, and whether the content is still accurate. The goal is a complete inventory that reveals the true scope and condition of your content.
How does a content audit relate to card sorting? A content audit is the natural precursor to card sorting. You can't sort what you haven't inventoried. The audit tells you what content exists, which items are outdated or redundant, and where gaps exist. This directly shapes the card set for your sort — you wouldn't include cards for content you plan to delete, and you might add cards for content gaps you discovered during the audit.
How long does a content audit take? For a site with under 100 pages, expect 1-2 days using a crawler tool plus manual review. Sites with 500-1000 pages typically require 3-5 days. Enterprise sites with thousands of pages can take 2-4 weeks. The crawl itself is fast — the time goes into manually reviewing content quality, accuracy, and relevance for each item, which can't be fully automated.