We read with great interest the article by Mikelarena Erdozain et al on the use of point-of-care ultrasound for the detection of finger fractures in pediatric emergency departments.1 We commend the authors for addressing a clinically relevant issue and providing primary data in a field where evidence is limited.
Still, we thought it would be useful to contribute a few methodological considerations that might help ensure a more rigorous interpretation of the results.
First, the original article in Spanish uses the term precisión diagnóstica (“diagnostic precision”) to describe what would actually be diagnostic accuracy. In clinical epidemiology, precision refers to the reproducibility of measurements, whereas accuracy refers to how close the results are to the true value.2 Although this confusion is common in everyday language, it can lead to errors in studies of diagnostic tests.
Second, the authors present differences between groups (pediatricians versus residents, first versus second half of the study) as meaningful findings, even though the p values did not reach statistical significance. Treating results with p values greater than 0.05 as real differences increases the risk of drawing conclusions that the evidence does not support.
We believe that the analysis of diagnostic performance would benefit from the addition of likelihood ratios (LRs) and weights of evidence (WoE), expressed in decibans.3 Based on the published data, and using the combination of an orthopedic surgeon and plain radiography as the reference standard (Table 1), sonography shows a positive likelihood ratio (LR+) of nearly 9 and a negative likelihood ratio (LR−) of approximately 0.2, which translate to a high positive WoE and a moderate negative WoE.
Table 1. 2 × 2 contingency table and accuracy metrics for the diagnostic test.
| | | Gold standard: fracture | Gold standard: no fracture | Total |
|---|---|---|---|---|
| Ultrasound | Positive | 51 | 14 | 65 |
| Ultrasound | Negative | 13 | 146 | 159 |
| | Total | 64 | 160 | 224 |

| Percentile | Sensitivity | Specificity | LR+ | LR− | WoE+ | WoE− |
|---|---|---|---|---|---|---|
| P2.5 | 0.682 | 0.858 | 5.43 | 0.135 | 7.35 | −8.69 |
| P50 | 0.791 | 0.909 | 8.64 | 0.230 | 9.36 | −6.38 |
| P97.5 | 0.824 | 0.923 | 15.0 | 0.352 | 11.8 | −4.54 |
Gold standard: orthopedic surgeon + plain radiograph, study data.
WoE+ = 9.36 decibans: The presence of a fracture on ultrasound substantially increases the likelihood of a fracture being confirmed by an orthopedic specialist.
WoE− = −6.38 decibans: A negative ultrasound result moderately reduces the likelihood of a fracture, but it does not rule it out with certainty.
Abbreviations: dB, decibans; LR−, negative likelihood ratio; WoE−, negative weight of evidence; LR+, positive likelihood ratio; WoE+, positive weight of evidence.
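For readers who wish to check the arithmetic, the following Python sketch derives point estimates from the counts in the 2 × 2 table. It assumes the usual definitions LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity, and WoE = 10 · log10(LR). The variable names are ours, and these raw point estimates need not coincide exactly with the posterior medians (P50) reported in the table.

```python
import math

# Counts from the 2 x 2 table (ultrasound vs. reference standard).
tp, fp, fn, tn = 51, 14, 13, 146

sensitivity = tp / (tp + fn)  # true positives among all fractures
specificity = tn / (tn + fp)  # true negatives among all non-fractures

# Likelihood ratios for a positive and a negative ultrasound.
lr_pos = sensitivity / (1 - specificity)
lr_neg = (1 - sensitivity) / specificity

# Weights of evidence in decibans: 10 * log10(LR).
woe_pos = 10 * math.log10(lr_pos)
woe_neg = 10 * math.log10(lr_neg)

print(f"sensitivity = {sensitivity:.3f}, specificity = {specificity:.3f}")
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
print(f"WoE+ = {woe_pos:.2f} dB, WoE- = {woe_neg:.2f} dB")
```

Working on the logarithmic scale makes evidence additive: each deciban value can be added directly to the pre-test log-odds expressed in the same unit.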
These results indicate that a positive ultrasound provides evidence to confirm a fracture, while a negative result reduces the likelihood but does not rule it out sufficiently to alter the treatment plan if there is a strong clinical suspicion.
From a formal perspective of mathematical decision theory, based on the Pauker-Kassirer threshold approach to clinical decision-making, this suggests that ultrasound is useful in “confirmation” scenarios (treat), but less robust in “rule out” scenarios (do not treat), which is a relevant consideration in the proposed diagnostic algorithms (Fig. 1). In situations with intermediate pre-test probability, a negative ultrasound result may not be sufficient to modify management and may warrant further testing to reduce diagnostic uncertainty.
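The effect of each result on the probability of fracture can be sketched by updating the pre-test odds with the corresponding weight of evidence. The Python example below uses the median WoE values from Table 1. The pre-test probabilities are hypothetical values chosen only for illustration, and the function name is ours. No treatment or testing threshold is assumed, since these depend on the clinical context.

```python
def post_test_probability(pretest: float, woe_db: float) -> float:
    """Update a pre-test probability with a weight of evidence in decibans."""
    pretest_odds = pretest / (1 - pretest)
    # A weight of evidence of w decibans corresponds to LR = 10 ** (w / 10).
    posttest_odds = pretest_odds * 10 ** (woe_db / 10)
    return posttest_odds / (1 + posttest_odds)


# Median weights of evidence reported in Table 1.
WOE_POS, WOE_NEG = 9.36, -6.38

# Hypothetical pre-test probabilities, for illustration only.
for pretest in (0.10, 0.30, 0.50, 0.70):
    after_pos = post_test_probability(pretest, WOE_POS)
    after_neg = post_test_probability(pretest, WOE_NEG)
    print(f"pre-test {pretest:.2f}: positive US -> {after_pos:.2f}, "
          f"negative US -> {after_neg:.2f}")
```

With a pre-test probability of 0.50, for instance, a negative ultrasound still leaves a post-test probability of roughly 0.19, which may or may not fall below the threshold at which a clinician would stop testing.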
Figure 1. Diagnostic accuracy using a Bayesian model.
The Bayesian analysis shows that, although the precision of the two weights is very similar (WoE+ 95% CI, 7.35 to 11.8; WoE− 95% CI, −8.69 to −4.54), with a width of about 4 decibans (dB) between the extreme percentiles of each index, their accuracy is markedly different. Integrating both probability densities gives the following:
- Probability of being an excellent discriminator: P(WoE− < −10 dB) = 0.0019
- Probability of being an acceptable discriminator: P(WoE− < −5 dB) = 0.922
- Probability of being an acceptable confirmatory test: P(WoE+ > +5 dB) = 1
- Probability of being an excellent confirmatory test: P(WoE+ > +10 dB) = 0.289
Abbreviations: dB, decibans; P, probability; WoE−, negative weight of evidence; WoE+, positive weight of evidence.
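Tail probabilities of this kind can be approximated by simulation. The Python sketch below assumes independent uniform Beta(1, 1) priors on sensitivity and specificity and draws from the resulting Beta posteriors. The prior, the number of draws, and the seed are our assumptions, as the model behind the reported figures is not specified here, so the output should not be expected to reproduce them to the last digit.

```python
import math
import random

random.seed(0)  # fixed seed so the simulation is repeatable

# Counts from the 2 x 2 table (ultrasound vs. reference standard).
tp, fp, fn, tn = 51, 14, 13, 146
N_DRAWS = 50_000

woe_pos_draws, woe_neg_draws = [], []
for _ in range(N_DRAWS):
    # Beta posteriors under an assumed uniform Beta(1, 1) prior.
    sens = random.betavariate(tp + 1, fn + 1)
    spec = random.betavariate(tn + 1, fp + 1)
    woe_pos_draws.append(10 * math.log10(sens / (1 - spec)))
    woe_neg_draws.append(10 * math.log10((1 - sens) / spec))


def fraction(draws, condition):
    """Share of posterior draws that satisfy a condition."""
    return sum(1 for d in draws if condition(d)) / len(draws)


probs = {
    "P(WoE- < -10 dB)": fraction(woe_neg_draws, lambda w: w < -10),
    "P(WoE- < -5 dB)": fraction(woe_neg_draws, lambda w: w < -5),
    "P(WoE+ > +5 dB)": fraction(woe_pos_draws, lambda w: w > 5),
    "P(WoE+ > +10 dB)": fraction(woe_pos_draws, lambda w: w > 10),
}
for label, value in probs.items():
    print(f"{label} = {value:.4f}")
```

A different prior or a model that accounts for dependence between sensitivity and specificity would shift these probabilities somewhat, particularly in the tails.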
Finally, in addition to interobserver agreement analysis, which could not be performed in the study, we believe that future research could benefit from incorporating formal clinical decision-making analyses—based on mathematical decision theory—to help contextualize the results in clinical practice.
We thank the authors for their valuable contribution and trust that these reflections will enrich the scientific debate. In diagnostic medicine, what matters most beyond the numerical values is their interpretation in the real-world clinical setting, and the true challenge lies in deciding when to treat and when not to. A diagnostic test is useful not when it confirms what we already suspect, but when it allows us to cross the treatment threshold with confidence and thereby change our therapeutic approach.




