Platform-Level Analysis
Platform Comparison
Platform-level comparison of 4 AI platforms, each with 3 evaluated models. Each platform's score is aggregated from its models' scores using the selected scoring method and deployment profile.
Scoring Method: Composite Average (mean of each model's weighted composite score)
| Rank | Platform | Product | Composite | Tier | Models | Spread (pts) | Best | Floor | Flags |
|---|---|---|---|---|---|---|---|---|---|
| 1 | xAI | Grok | 92.2 | Excellent | 3 | 8.1 | 95.4 | 87.3 | CF-03, CF-05 (3 total) |
| 2 | Anthropic | Claude | 85.0 | Good | 3 | 3.3 | 86.7 | 83.4 | CF-05, CF-03 (3 total) |
| 3 | Perplexity | Sonar | 71.2 | Acceptable | 3 | 5.7 | 73.4 | 67.7 | CF-05 (3 total) |
| 4 | OpenAI | ChatGPT / GPT | 68.0 | Below Standard | 3 | 18.4 | 75.2 | 56.8 | CF-03 (1 total) |
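The Best, Floor, and Spread figures above are related by simple arithmetic, which the following minimal sketch checks. It also derives the middle model's score implied by a three-model mean. That second step assumes the Composite Average is an unweighted mean of exactly three model composites, so the result is only approximate given that the published figures are rounded to one decimal. The function names are illustrative and are not part of the dashboard.

```python
# Figures copied from the platform summaries above:
# (composite average, best model, floor model, reported spread)
platforms = {
    "xAI": (92.2, 95.4, 87.3, 8.1),
    "Anthropic": (85.0, 86.7, 83.4, 3.3),
    "Perplexity": (71.2, 73.4, 67.7, 5.7),
    "OpenAI": (68.0, 75.2, 56.8, 18.4),
}


def spread(best, floor):
    """Spread is the gap between the best and worst model composite."""
    return round(best - floor, 1)


def implied_middle(avg, best, floor):
    """Middle model's composite implied by a 3-model unweighted mean.

    Assumption: avg == (best + middle + floor) / 3. Inputs are rounded,
    so treat the result as an estimate.
    """
    return round(3 * avg - best - floor, 1)


for name, (avg, best, floor, reported) in platforms.items():
    s = spread(best, floor)
    m = implied_middle(avg, best, floor)
    status = "matches" if abs(s - reported) < 0.05 else "differs from"
    print(f"{name}: spread {s} {status} reported {reported}; implied middle ~{m}")
```

A wide spread, as for OpenAI, means the platform-level composite hides a large difference between its strongest and weakest evaluated model.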
Category Profile — All Platforms
11-category radar comparison (Composite Average). Series: xAI, Anthropic, Perplexity, OpenAI.
Scenario Performance — Platform Level
Sub-scores across 4 deployment scenarios (Composite Average). Series: xAI, Anthropic, Perplexity, OpenAI.
Category Score Matrix — Platform × Category
Scores computed via Composite Average method. Color intensity reflects performance tier.
| Category | xAI | Anthropic | Perplexity | OpenAI |
|---|---|---|---|---|
| C1 Clinical Safety | 92.5% | 84.4% | 82.3% | 75.3% |
| C2 Medical Accuracy | 92.5% | 93.2% | 81.7% | 87.8% |
| C3 Calibration | 87.2% | 60.0% | 63.9% | 60.0% |
| C4 Mental Health Safety | 97.0% | 96.4% | 93.3% | 69.1% |
| C5 Evidence Quality | 82.5% | 66.7% | 43.3% | 75.0% |
| C6 Personalization | 92.4% | 88.6% | 71.4% | 60.0% |
| C7 Communication | 96.2% | 97.1% | 39.0% | 49.5% |
| C8 Bias & Fairness | 98.2% | 88.9% | 82.2% | 66.7% |
| C9 Privacy & Trust | 90.0% | 92.5% | 75.0% | 75.0% |
| C10 Usability | 93.3% | 74.7% | 49.3% | 38.7% |
| C11 Robustness | 93.3% | 97.3% | 98.7% | 84.0% |
| Composite | 92.2 | 85.0 | 71.2 | 68.0 |
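The composite is described as weighted, but the category weights are not shown on this page. As a rough check, the sketch below compares the plain (unweighted) mean of each platform's 11 category scores with the reported composite and identifies each platform's weakest category. The helper names are illustrative. Any gap between the two numbers may come from category weights, the deployment profile, or rounding; the source does not say which.

```python
# Category scores (C1..C11) copied from the matrix above, in percent.
category_scores = {
    "xAI": [92.5, 92.5, 87.2, 97.0, 82.5, 92.4, 96.2, 98.2, 90.0, 93.3, 93.3],
    "Anthropic": [84.4, 93.2, 60.0, 96.4, 66.7, 88.6, 97.1, 88.9, 92.5, 74.7, 97.3],
    "Perplexity": [82.3, 81.7, 63.9, 93.3, 43.3, 71.4, 39.0, 82.2, 75.0, 49.3, 98.7],
    "OpenAI": [75.3, 87.8, 60.0, 69.1, 75.0, 60.0, 49.5, 66.7, 75.0, 38.7, 84.0],
}
reported_composite = {"xAI": 92.2, "Anthropic": 85.0, "Perplexity": 71.2, "OpenAI": 68.0}

CATEGORIES = [f"C{i}" for i in range(1, 12)]


def unweighted_mean(scores):
    """Plain average, i.e. every category weighted equally."""
    return sum(scores) / len(scores)


def weakest_category(scores):
    """Return (category id, score) for the lowest-scoring category."""
    idx = min(range(len(scores)), key=scores.__getitem__)
    return CATEGORIES[idx], scores[idx]


for name, scores in category_scores.items():
    plain = unweighted_mean(scores)
    gap = plain - reported_composite[name]
    cat, low = weakest_category(scores)
    print(f"{name}: unweighted {plain:.2f}, reported {reported_composite[name]}, "
          f"gap {gap:+.2f}, weakest {cat} at {low}%")
```

The gaps are all under one point, so the equal-weight mean is a reasonable approximation of the reported composite for these four platforms, though not an exact reproduction.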
Individual Model Breakdown by Platform
Scoring Method Reference
Composite Average: mean of each model's weighted composite score.
Superscore: take the best score per category across all of the platform's models, then compute the composite from those category bests.
Best Model: highest single-model composite score within the platform.
Floor Score: lowest single-model composite score within the platform, i.e. the worst case among the evaluated models.
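The four scoring methods above can be sketched in a few lines. Everything in this example is hypothetical: the model names, the two categories, their weights, and the scores are invented toy values chosen to make the arithmetic easy to follow, and they are not taken from the evaluation. The sketch also assumes the weighted composite is a weight-normalized sum of category scores, which the source does not confirm.

```python
from statistics import mean

# HYPOTHETICAL toy data, not from the evaluation.
weights = {"safety": 3, "usability": 2}
models = {
    "model_a": {"safety": 90.0, "usability": 70.0},
    "model_b": {"safety": 80.0, "usability": 95.0},
    "model_c": {"safety": 60.0, "usability": 50.0},
}


def composite(scores, weights):
    """Weighted composite: weight-normalized sum of category scores (assumed form)."""
    total = sum(weights.values())
    return sum(scores[cat] * w for cat, w in weights.items()) / total


def composite_average(models, weights):
    """Mean of each model's weighted composite score."""
    return mean(composite(s, weights) for s in models.values())


def superscore(models, weights):
    """Best score per category across all models, then one composite."""
    best = {cat: max(s[cat] for s in models.values()) for cat in weights}
    return composite(best, weights)


def best_model(models, weights):
    """Highest single-model composite."""
    return max(composite(s, weights) for s in models.values())


def floor_score(models, weights):
    """Lowest single-model composite."""
    return min(composite(s, weights) for s in models.values())


for label, fn in [("Composite Average", composite_average), ("Superscore", superscore),
                  ("Best Model", best_model), ("Floor Score", floor_score)]:
    print(f"{label}: {fn(models, weights):.1f}")
```

With this toy data the Superscore exceeds the Best Model score because it combines model_a's safety score with model_b's usability score, a combination no single model achieves. In general the methods order as Floor Score ≤ Composite Average ≤ Best Model ≤ Superscore, provided the weights are non-negative.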