We rebuilt the analysis around categories. The OpenAI ↔ Google divergence below is one example of the vocabulary fragmentation we now measure across every category.

Case Study

Google rates Bing 98. OpenAI rates Bing 80.

The competitor is generous. The partner underrates. We query 11 AI models weekly. The data is clear.

The 18-point gap that shouldn't exist.

Google rates Bing

OpenAI rates Bing

18 pts

The gap

"Search Engine" category. Week 53 of 2025. 11 AI models queried.

The discovery

We query 11 AI models weekly, asking each to rate brands across categories. When we analyzed Microsoft Bing, we found something unexpected.

Google and Bing are direct competitors in search. You'd expect Google's AI to rate Bing lower—competitive bias seems natural.

But it's the opposite:

Google (Gemini) rates Bing highest at 98.

OpenAI rates Bing lowest at 80.

The competitor is generous. The partner underrates.

Model-by-model breakdown

ModelBing Score

Gemini (Google)98

Cohere98

AI2195

Anthropic (Claude)85

DeepSeek85

xAI (Grok)85

Mistral80

GPT-4o-mini (OpenAI)80

Scores represent category strength (0-100) for "Search Engine" category.

The pattern holds across categories

OpenAI rates Bing lower in every category where ChatGPT competes directly:

Voice Search

OpenAI's score for Bing

Academic Search

OpenAI's score for Bing

Local Search

OpenAI's score for Bing

Shopping Search

OpenAI's score for Bing

The timing

We began tracking Bing's AI model perception on June 30, 2025—two weeks after reports emerged of Microsoft-OpenAI partnership tensions.

June 16, 2025

WSJ breaks the story

OpenAI executives considered publicly accusing Microsoft of anticompetitive behavior.

June 17-24, 2025

Tensions escalate

The fight goes from "nasty to toxic." Both companies consider "nuclear options."

June 30, 2025

We begin baseline measurement

First CategoryRank collection for Bing across 7 AI models.

December 2025

Fresh data confirms pattern

Week 53 collection shows 18-point gap persists. OpenAI still rates Bing lowest.

Correlation, not causation.

We don't claim the partnership dynamics caused the rating gap. We observe that OpenAI's model rated its partner's product lower than any other model—including its partner's direct competitor—during a period of documented partnership strain.

Why this matters

AI companies have bias

OpenAI's models remember Microsoft's competitors better. Google's models favor Google products. There is no "neutral" AI recommendation.

Model choice matters

If you're a Microsoft partner, you get better treatment from Claude than ChatGPT. Your brand's AI visibility depends on which models your customers use.

This will get worse

As AI companies compete for market share, model training will become more politicized. Today it's Bing. Tomorrow it's your brand.

GPT serves 200 million users.
Gemini serves 150 million.
Claude serves 80 million.

When these models disagree about your brand, millions of people get different answers.

Shouldn't you know what they're saying?

See this for your brand

Get notified when we publish new model bias research.

No spam. Just research drops.

Which models disagree about your brand?

See your scores across all 11 models. Find the gaps.

Check Your Brand See Bing's Full Data

Methodology: CategoryRank queries 11 AI models weekly, asking each to rate brands across self-generated categories on a 0-100 scale. Data shown is from week 53 of 2025. Statistical analysis compares each model's brand-specific scores to its global average across 37,000+ brands. Full methodology →