You have run your card sort. Participants have grouped your cards, maybe named their categories, and now you are staring at a spreadsheet of raw data. The temptation is to eyeball the most common groupings and call it done. Resist that. Card sort analysis done well turns ambiguous participant data into a defensible information architecture. Done poorly, it produces a structure that reflects your assumptions rather than your users' mental models.
This guide walks through the full analysis process, from cleaning data to presenting recommendations to stakeholders. If you need a refresher on setting up the study itself, see the complete guide to card sorting.
Step 1: Clean your data
Before you analyze anything, remove responses that will skew your results. Bad data does not average out — it muddies every pattern you are trying to find.
Flag these participants for removal:
- Speeders. If your card sort has 40 cards and someone finished in 90 seconds, they were not reading. Compare each participant's completion time to the median. Anyone finishing in less than one-third of the median time is suspect.
- Single-group sorters. Participants who dump all cards into one or two groups are telling you nothing about their mental model. They are telling you they did not want to do the exercise.
- Random sorters. Harder to spot, but look for participants whose groupings have zero overlap with any other participant. A single outlier with a genuinely different mental model is valuable. A participant whose sort looks like it was done by a random number generator is not.
- Duplicate responses. If you recruited from an open panel, check for identical IP addresses or suspiciously similar sort patterns submitted within minutes of each other.
Action: Review your participant list sorted by completion time. Remove the fastest 5–10 percent and manually inspect any sort that looks suspicious. For a 30-participant study, expect to remove 2–4 responses. If you are removing more than 20 percent, your recruitment or study design needs work.
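If your tool exports completion times, the speeder rule above is easy to automate. Here is a minimal Python sketch; the dictionary of participant IDs and times is a placeholder, so substitute whatever format your export actually uses:

```python
from statistics import median

# Placeholder export: participant ID -> completion time in seconds.
# Adjust to whatever format your card sorting tool exports.
completion_times = {
    "p01": 412, "p02": 388, "p03": 95,  "p04": 540,
    "p05": 460, "p06": 110, "p07": 505, "p08": 430,
}

med = median(completion_times.values())
threshold = med / 3  # the one-third-of-median rule from above

speeders = [pid for pid, t in completion_times.items() if t < threshold]
print(f"Median: {med}s, threshold: {threshold:.0f}s, flagged: {speeders}")
```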
Step 2: Look at the big picture first
Do not jump straight to statistical tools. Spend 15 minutes manually reviewing the raw groupings to build intuition about your data. You will interpret the quantitative outputs better if you already have a sense of the patterns.
What to look for:
- Obvious clusters. Are there cards that almost every participant grouped together? These are your anchors — the items where users have strong consensus. In a study for an e-commerce site, "shipping policy," "return policy," and "delivery times" will almost certainly cluster. That is not surprising, but it confirms the obvious, which is worth knowing.
- Obvious outliers. Are there cards that never land in the same group? These are your problem cards — items where users have no shared expectation about where they belong. Note them now. You will come back to them in Step 6.
- Category naming patterns. For open sorts, scan the category names participants created. Do not fixate on exact wording yet — look for themes. If 8 participants call a group "Account," 5 call it "My Profile," and 4 call it "Settings," that is one concept expressed three ways. The naming differences matter (you will use them to pick the best label), but the underlying agreement is the important finding.
Action: Write down 3–5 observations about obvious patterns and 2–3 cards that seem problematic. These notes will serve as hypotheses you can confirm or challenge with the quantitative analysis.
Step 3: Read the similarity matrix
A similarity matrix shows how often each pair of cards was placed in the same group, expressed as a percentage. If 20 out of 25 participants put "shipping policy" and "return policy" in the same group, those two cards have 80 percent similarity.
The matrix is a square grid with cards on both axes. Each cell shows the co-occurrence percentage for that pair. The diagonal is always 100 percent (every card is always in the same group as itself).
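If you are curious what your tool is doing under the hood, or you need to compute the matrix yourself, the math is simple co-occurrence counting. A minimal sketch in Python, assuming each participant's sort is exported as a list of groups (sets of card names); the sample data here is invented for illustration:

```python
from itertools import combinations

# Invented raw data: one entry per participant, each a list of
# groups, each group a set of card names.
sorts = [
    [{"shipping policy", "return policy"}, {"pricing plans", "billing history"}],
    [{"shipping policy", "return policy", "billing history"}, {"pricing plans"}],
    [{"shipping policy", "pricing plans"}, {"return policy", "billing history"}],
]

# All cards seen across all sorts.
cards = sorted(set().union(*(group for sort in sorts for group in sort)))

def same_group(sort, a, b):
    """True if cards a and b share a group in this participant's sort."""
    return any(a in group and b in group for group in sort)

# Co-occurrence percentage for every unordered pair of cards.
similarity = {
    (a, b): 100 * sum(same_group(s, a, b) for s in sorts) / len(sorts)
    for a, b in combinations(cards, 2)
}

for (a, b), pct in sorted(similarity.items(), key=lambda kv: -kv[1]):
    print(f"{a} / {b}: {pct:.0f}%")
```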
How to read it:
- Look for dark clusters. Most tools display the matrix as a heat map. Dark blocks of high-similarity cells indicate groups of cards that participants consistently sorted together. These blocks are your candidate categories.
- Look for pale rows and columns. A card with low similarity to everything else has no natural home. Participants did not agree on where it belongs — or they each put it somewhere different.
- Look for split affinities. A card might show 60 percent similarity with one cluster and 50 percent with another. That card has a foot in two camps. You will need to make a judgment call about where it goes, and you should note it as a potential cross-link or candidate for a different label.
What the numbers mean practically:
- Above 70 percent: Strong agreement. These cards belong together. Build your categories around these clusters.
- 50–70 percent: Moderate agreement. These cards are related but the grouping is not universal. Consider whether a better category label would strengthen the association.
- 30–50 percent: Weak agreement. Some participants see a connection, but many do not. Do not force these cards into the same category unless you have a strong reason.
- Below 30 percent: No meaningful agreement. These cards are not related in users' minds.
CardSort generates similarity matrices automatically from your study results — no manual calculation or spreadsheet work required.
Action: Identify the 3–6 strongest clusters in your matrix (blocks of cards with mutual similarity above 60 percent). These are the foundation of your information architecture. Write down each cluster and the cards it contains.
Step 4: Read the dendrogram
A dendrogram is a tree diagram that shows how cards cluster together hierarchically. It is built from the same data as the similarity matrix but visualized differently — instead of showing every pairwise relationship, it shows the order in which cards and groups merge as you relax your similarity threshold.
At the bottom (or left, depending on orientation), individual cards are listed. Lines connect cards that were most frequently sorted together, then those small clusters merge into larger clusters, and so on until everything connects into one tree at the top.
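Tools like CardSort draw the dendrogram for you, but if you are working from a raw similarity matrix, SciPy's hierarchical clustering can build one. A sketch under that assumption; clustering operates on distances, so similarity percentages are converted by subtracting from 100, and the example matrix is invented:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

cards = ["shipping policy", "return policy", "delivery times",
         "pricing plans", "billing history"]

# Invented similarity matrix (percent co-occurrence): symmetric,
# with 100 on the diagonal. Replace with your tool's export.
sim = np.array([
    [100,  80,  76,  12,  10],
    [ 80, 100,  72,  14,  12],
    [ 76,  72, 100,   8,  10],
    [ 12,  14,   8, 100,  64],
    [ 10,  12,  10,  64, 100],
])

# Clustering works on distances, not similarities.
dist = squareform(100 - sim)          # condensed distance vector
tree = linkage(dist, method="average")

dendrogram(tree, labels=cards, orientation="left")
plt.tight_layout()
plt.show()
```

Average linkage is one reasonable choice; complete linkage tends to produce tighter, more balanced clusters, so it is worth comparing both against your similarity matrix.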
How to read it:
- The height of a merge matters. When two branches join low on the tree (at a high similarity level), those items are strongly related. When branches join high on the tree (at a low similarity level), the connection is weak — participants only occasionally grouped those items together.
- Look for clear gaps. A good information architecture shows up as distinct branches that remain separate until late in the tree. If you see four branches that each stay coherent for a long stretch before merging with other branches, those are your four categories.
- Watch for "stragglers." Cards that only merge with a cluster at the very top of the tree are orphans. They do not naturally belong to any group.
Where to "cut" the tree:
This is the most important skill in dendrogram interpretation and the one most guides skip. You need to choose a horizontal line across the tree that produces a useful number of categories. Cut too low and you get 15 micro-categories. Cut too high and you get 3 mega-categories. Neither is useful for navigation.
Here is how to decide:
- Start with your constraints. If you are designing a main navigation, you probably want 5–8 top-level categories (based on well-established cognitive load principles — people can scan 5–8 options without effort). If you are organizing a settings page, you might want 3–5 sections.
- Find the cut line that produces that range. Slide your imaginary horizontal line up and down the dendrogram. At each position, count the number of distinct branches the line crosses. Find the level that gives you the number of categories you need (the sketch after this list shows one way to do this programmatically).
- Check the coherence of each resulting cluster. Do the cards in each group make sense together? If one cluster contains "pricing plans," "billing history," and "weather forecast," the cut is at the wrong level — or "weather forecast" is a problem card that landed in this branch by weak association.
- Prefer cuts at large gaps. If there is a big vertical jump between two merge levels, that gap represents a natural boundary. Cutting within the gap is usually better than cutting through a region where many merges happen close together.
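Continuing the SciPy sketch from Step 4's dendrogram code (reusing `tree` and `cards`), you can make the cut programmatically by asking for a target number of clusters rather than eyeballing a line:

```python
from scipy.cluster.hierarchy import fcluster

# Cut the tree at the level that yields at most n_categories distinct
# branches, instead of choosing the cut line visually.
n_categories = 2   # your constraint from "start with your constraints"
labels = fcluster(tree, t=n_categories, criterion="maxclust")

for cluster_id in sorted(set(labels)):
    members = [card for card, lab in zip(cards, labels) if lab == cluster_id]
    print(f"Category {cluster_id}: {members}")
```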
Common misinterpretations:
- Treating the dendrogram as definitive. The dendrogram is a visualization of statistical clustering, not a prescription. If the dendrogram suggests merging "account settings" and "billing" into one group but your product separates them for business reasons, that is fine — as long as you acknowledge the trade-off.
- Over-reading small differences. Two cards that merge at 72 percent similarity vs 68 percent similarity are effectively equivalent. Do not build your IA around 4-percentage-point differences.
- Ignoring the bottom. The most useful information in a dendrogram is often at the bottom — the earliest, strongest merges. These show you which cards are inseparable in users' minds.
Action: Choose a cut level that produces the right number of categories for your context. List each resulting category and its member cards. Compare this to the clusters you identified from the similarity matrix in Step 3 — they should largely agree. Where they differ, investigate why.
Step 5: Calculate agreement rates
Agreement rate measures how consistently participants placed a specific card into the same category. If 22 out of 25 participants put "shipping policy" into a group they labeled something like "Orders & Delivery," the agreement rate for that card is 88 percent.
For open sorts, you need to standardize category names first (group "Account," "My Profile," and "Settings" under a single label before calculating). This is a judgment call — be consistent and document your choices.
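A minimal sketch of both steps, using an invented standardization map and one card's placements; all labels here are hypothetical:

```python
# Invented standardization map: raw participant labels -> one canonical
# name. Build this by hand and document every merge you make.
canonical = {
    "Account": "Account", "My Profile": "Account", "Settings": "Account",
    "Orders & Delivery": "Orders", "Shipping": "Orders",
}

# One raw label per participant: where each person placed a single card.
placements = ["Account", "My Profile", "Settings", "Orders & Delivery",
              "Account", "My Profile", "Account", "Settings"]

standardized = [canonical[p] for p in placements]
top = max(set(standardized), key=standardized.count)
agreement = 100 * standardized.count(top) / len(standardized)
print(f"Most common category: {top}, agreement rate: {agreement:.0f}%")
```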
What the rates tell you:
- Above 80 percent: Near-universal agreement. This card has an obvious home. Use it as an anchor for that category.
- 60–80 percent: Good agreement. Most users agree, but a meaningful minority sorted differently. Worth investigating the minority view — they might represent a different user segment.
- 40–60 percent: Split opinion. This card could reasonably go in multiple places. You will need to use other evidence (task importance, business context, user interviews) to decide.
- Below 40 percent: No consensus. This is a problem card (see Step 6).
How many participants do you need for reliable agreement rates?
With 15 participants, a card that shows 80 percent agreement could genuinely be at 80 percent, or it could be anywhere from 60 to 95 percent due to small sample effects. With 30 participants, the uncertainty band tightens considerably. For most practical purposes, 20–30 participants give you agreement rates stable enough to act on. If you are making a high-stakes decision based on a single card's agreement rate, aim for the upper end of that range.
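You can check how wide that uncertainty band is with a standard confidence interval for a proportion. The Wilson score interval is a common choice for small samples; a quick sketch (the exact bounds will differ slightly from the rounded figures above):

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """95 percent Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    margin = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - margin, centre + margin

# A card at 80 percent observed agreement, at two sample sizes.
for n in (15, 30):
    lo, hi = wilson_interval(round(0.8 * n), n)
    print(f"n={n}: observed 80%, plausible range {lo:.0%} to {hi:.0%}")
```

With these numbers, the band at 15 participants spans roughly 55 to 93 percent, tightening to roughly 63 to 90 percent at 30.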
Action: List every card sorted by agreement rate, highest to lowest. The top of the list is your foundation — high-confidence placements. The bottom is where you need to focus your problem-solving energy.
Step 6: Deal with problem cards
Every card sort has them: cards that participants could not agree on. These are not failures of the study — they are the most valuable findings, because they reveal the exact points where your IA needs careful thought.
Types of problem cards:
Cards that split between two categories
"Transfer credits" might be sorted into "Admissions" by 45 percent of participants and "Academics" by 40 percent. Neither placement is wrong — the card genuinely relates to both concepts. What to do: Place the card in the category that best matches the user's task context (when are they looking for this?). Add a cross-link, alias, or secondary navigation path from the other category. In your final IA, flag this as a known ambiguity that should be addressed with redundant navigation.
Cards that land everywhere
"Resources" or "Tools" often gets placed in a different group by nearly every participant. This usually means the card label is too vague, not that users are confused. What to do: Examine what the card actually represents. If it is a catch-all label hiding multiple distinct items, break it into specific cards and re-test. If it is genuinely a standalone item, it may need its own top-level position or a more descriptive name.
Cards that no one groups with anything
A card that shows low similarity to every other card might be an outlier that does not fit your content model. What to do: Ask whether this content belongs in the scope at all. If it does, it might need to live in a utility area (like a footer or a "More" section) rather than the main navigation.
Cards that participants consistently rename
In open sorts, watch for cards that participants relabel before sorting. If 8 participants rename "Knowledge Base" to "Help" or "Support," they are telling you your terminology does not match their vocabulary. What to do: Use the participant-preferred label. This is free usability data.
Action: For each problem card, write a recommendation that includes where to place it, what alternative access paths to provide, and whether the label needs to change. These recommendations are often the most impactful part of your analysis.
Step 7: Turn analysis into recommendations
You now have clusters, agreement rates, and a list of problem cards. The final step is translating that into a proposed information architecture with clear reasoning behind every decision.
Structure your recommendations like this:
- Proposed categories. List each category with a recommended label and its member items. Note which items have strong agreement (above 70 percent) and which are judgment calls (below 60 percent).
- Category labels. For open sorts, recommend specific labels based on participant naming patterns. Choose the label that was most common, but also the most distinct — if "Account" and "Settings" were equally popular but your product already has an "Account" page that handles something different, go with "Settings" to avoid confusion.
- Cross-links and redundant paths. List every item that needs to appear in multiple places. This is not a sign of a messy IA — it is a sign of a realistic one. Users' mental models overlap, and your navigation should accommodate that.
- Open questions. Be honest about what the card sort did not resolve. If you have three cards with 45 percent agreement rates that you placed based on gut instinct, say so. These are candidates for follow-up research — ideally a tree test. See card sorting vs tree testing for how to sequence the two methods.
- Next steps. Recommend a tree test to validate the structure, or a closed card sort if the open sort left too many ambiguities.
Action: Build a proposed sitemap (even a simple indented list) based on your analysis. Annotate each placement with its evidence level: strong agreement, moderate agreement, or judgment call. This becomes the deliverable you present to stakeholders.
Presenting results to stakeholders
Most stakeholders do not need to see similarity matrices or dendrograms. They need to see recommendations and the confidence level behind each one.
What to include in a stakeholder presentation:
- The proposed structure. A visual sitemap or indented hierarchy showing the recommended IA.
- High-confidence findings. The 3–5 clearest results from the study. "Users unanimously group financial content together. Our current site splits it across three sections."
- Surprises. Findings that challenge assumptions. "We assumed 'Transfer Credits' belonged under Academics, but 45 percent of users looked for it under Admissions."
- Risk areas. Placements where agreement was weak and the recommendation is a judgment call. Frame these as "needs validation" rather than "we guessed."
- What you recommend doing next. Usually a tree test, sometimes a follow-up card sort with revised cards.
What to leave out: Raw data tables, detailed methodology, and statistical terminology. If a stakeholder wants to see the similarity matrix, have it ready in an appendix. Do not lead with it.
CardSort can auto-generate a shareable findings report from your study results, giving stakeholders an interactive view of the data without requiring them to log into the tool. See how to present UX research to stakeholders for broader advice on this.
Frequently asked questions
How long should card sort analysis take?
For a study with 25–30 participants and 40–50 cards, expect 3–5 hours for a thorough analysis using a tool that generates the similarity matrix and dendrogram for you. Without automated analysis, budget a full day — most of it on manually building the matrix in a spreadsheet. Open and closed sorts take roughly the same time to analyze, except that open sorts add 30–60 minutes for standardizing category names.
Should I analyze open and closed sorts differently?
The core process is the same, but open sorts require an extra step: standardizing the category names participants created before you can calculate agreement rates or build a meaningful dendrogram. With closed sorts, the categories are already defined, so you skip straight to measuring how consistently participants placed items within them. Closed sort analysis is faster but produces narrower insights — you learn about item placement but not about how users conceptualize the categories themselves.
What if participants created more (or fewer) categories than I expected?
This is data, not a problem. If you expected 6 categories but participants consistently created 10, your content may have more natural divisions than you assumed. If they created 3, you may be over-segmenting. Look at the dendrogram to see whether the participant-generated category count aligns with natural break points in the clustering. If it does, consider adjusting your target number of categories rather than forcing items into your original expectation.
Can I combine results from multiple card sort studies?
Only if the studies used the same cards. You can pool data from two identical studies run with different participant populations — this is actually useful for comparing how different user segments organize content. But if the card sets differ, the similarity calculations become meaningless. If you need to expand your card set, run a new study rather than trying to merge incompatible datasets.
Further reading
- The complete guide to card sorting — planning, running, and analyzing card sorts end to end
- Card sorting vs tree testing: when to use each — how to validate your card sort results with a follow-up tree test
- Dendrogram (UX Glossary) — how hierarchical clustering diagrams work
- Similarity Matrix (UX Glossary) — how co-occurrence data is calculated and displayed
- Cluster Analysis (UX Glossary) — the statistical methods behind card sort groupings