Page 62 - JSOM Spring 2026
gains. Sensitivity improved from 0.63 (SD 0.143) to 0.90 (SD 0.092), specificity from 0.70 (SD 0.145) to 0.86 (SD 0.096), and accuracy from 0.67 (SD 0.097) to 0.88 (SD 0.060). These improvements were statistically significant based on McNemar’s test (P<.001) (Figure 2).

Reader confidence in their clip-level interpretations improved markedly with the assistance of AI. Across conditions, there was a clear change in the distribution of confidence ratings when AI support was available. The proportion of low-confidence ratings (“not at all confident” and “slightly confident”) decreased from 38% without AI to 33% with AI. In contrast, high-confidence ratings (“confident” and “very confident”) nearly doubled, increasing from 20% to 37%. The proportion of responses indicating no confidence at all was cut in half, from 8.2% to 4.1%. The most frequently selected confidence level shifted from “moderately confident” in the unassisted condition (42%) to “confident” in the assisted condition (29%). These changes in confidence distribution were statistically significant according to the Stuart-Maxwell chi-squared test (P<.001) (Table 1).

Standalone AI Interpretation
The standalone AI system demonstrated excellent diagnostic performance relative to the expert consensus standard, achieving a sensitivity of 1.00, a specificity of 0.96, and an accuracy
FIGURE 1 AUROC curves of diagnostic performance with and without AI assistance.
Each colored line represents an individual corpsman’s performance across conditions. The black dashed line shows the group mean with 95% CIs.
FIGURE 2 Improvement in diagnostic performance with AI assistance.
Each colored dot-dashed line pair traces an individual corpsman’s score in the AI-unassisted (left) and AI-assisted (right) sessions; the black
dashed line marks the group mean and its 95% CI. At the group level, mean sensitivity, mean specificity, and mean accuracy increased with AI
assistance, and these gains reached statistical significance. The pink dotted line indicates an individual reader whose specificity and accuracy did
not improve.
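
For readers less familiar with the statistics reported above, the following is a minimal Python sketch of how sensitivity, specificity, and accuracy are derived from a confusion matrix, and how a paired McNemar comparison can be computed from discordant reads. All counts are hypothetical and are not study data. The function names are illustrative, and the exact binomial form of McNemar's test is an assumption, since the text does not state which variant was used.

```python
from math import comb


def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # true positives among all positives
    specificity = tn / (tn + fp)               # true negatives among all negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: reads correct without AI but incorrect with AI
    c: reads incorrect without AI but correct with AI
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) probability of a split at least as lopsided as observed.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(diagnostic_metrics(tp=9, fn=1, tn=8, fp=2))
    print(mcnemar_exact(b=0, c=10))
```

Because McNemar's test uses only the discordant pairs, it suits the paired design described here, in which the same readers interpret the same clips with and without AI assistance.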
60 | JSOM Volume 26, Edition 1 / Spring 2026

