AI sees weird things in X-rays
Beer drinkers and bean eaters: Artificial intelligence in medical imaging can produce misleading results. A study shows how AI models derive absurd connections from knee X-rays. The researchers issue a clear warning.
“This knee, it’s a beer drinker”
When researchers at Dartmouth College fed AI models over 25,000 knee X-rays, they expected insights into orthopedic problems. Instead, the algorithms made surprising “predictions” about beer and refried bean consumption – with seemingly high accuracy.
But how could the models draw such absurd conclusions? The answer lies in so-called “shortcut learning”. The AI systems did not rely on medically relevant features but latched onto hidden patterns in the data. Contrary to the intended purpose, they learned to prioritize subtle differences in X-ray equipment or markers of the clinical site rather than analyzing actual medical information.
The study, published in Scientific Reports, reveals the pitfalls of AI in medical research. Dr. Peter Schilling, lead author, warns: “These models can detect patterns that humans cannot see, but not all patterns identified are meaningful or reliable.” Particularly striking: the researchers tried to eliminate the distortions, but the AI kept finding new ways to exploit irrelevant data.
Key features of shortcut learning:
- Leveraging dataset artifacts: The model learns to rely on superficial or random patterns in the training data.
- Lack of generalization: The learned “shortcuts” often only work for the specific training data set, but fail with new, unseen data.
- Apparent accuracy: The model can achieve high accuracies on test data without solving the actual problem.
- Difficult to recognize: It is often not obvious to people that the model uses shortcuts because the results initially seem plausible.
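To make the mechanism concrete, here is a minimal, hypothetical sketch in Python – not code or data from the study. A toy classifier reaches high accuracy because a site-specific artifact (a bright corner marker) is confounded with the label; once that correlation is broken, accuracy collapses to chance. All names and data are invented for illustration.

```python
# Toy demonstration of shortcut learning (hypothetical data, not the Dartmouth dataset).
# The "label" is correlated with a site-specific corner marker, not with the image
# content itself, so a simple classifier scores well by exploiting the artifact
# and collapses once that correlation is removed.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_images(n, artifact_matches_label=True):
    """Generate flat 16x16 'X-rays' of pure noise plus a corner marker."""
    X = rng.normal(size=(n, 16, 16))           # no real signal anywhere
    y = rng.integers(0, 2, size=n)             # arbitrary label, e.g. "beer drinker"
    marker = y if artifact_matches_label else rng.integers(0, 2, size=n)
    X[:, 0, 0] += 3.0 * marker                 # site marker confounded (or not) with label
    return X.reshape(n, -1), y

# Training data: the artifact is perfectly confounded with the label.
X_train, y_train = make_images(2000, artifact_matches_label=True)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Test 1: same confound present -> the shortcut still "works".
X_conf, y_conf = make_images(500, artifact_matches_label=True)
print("accuracy with confound:", accuracy_score(y_conf, clf.predict(X_conf)))

# Test 2: confound removed -> accuracy drops to chance, revealing the shortcut.
X_clean, y_clean = make_images(500, artifact_matches_label=False)
print("accuracy without confound:", accuracy_score(y_clean, clf.predict(X_clean)))
```

The apparent accuracy in the first test is exactly the trap described above: the model never looked at anything medically meaningful.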
Like an alien
Brandon Hill, co-author of the study, compares working with AI to “dealing with an extraterrestrial intelligence.”
“You could say the model is ‘cheating,’ but that humanizes the technology. It has simply found a way to solve the task given to it – although not in the way a human would. It has neither logic nor reason, at least not in the sense we normally understand it.”
– Brandon Hill, co-author of the study
The models solve tasks in unexpected ways, far removed from human logic. This finding presents researchers with new challenges: How can they ensure that AI-generated patterns are actually medically relevant? The study highlights the need for rigorous evaluation standards and urges caution when interpreting AI results in medical imaging.
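One generic safeguard used in the broader field – sketched here as an illustration, not as the study’s own protocol – is to evaluate with entire acquisition sites held out, so a model cannot score well by memorizing site-specific artifacts. The `X`, `y`, and `site` arrays below are random placeholders.

```python
# Leave-one-site-out evaluation sketch (hypothetical setup, placeholder data):
# every fold tests on scans from a site the model never saw during training.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 32))       # placeholder image features
y = rng.integers(0, 2, size=600)     # placeholder labels
site = rng.integers(0, 4, size=600)  # which of 4 clinical sites produced each scan

scores = cross_val_score(
    LogisticRegression(max_iter=1000),
    X, y,
    groups=site,
    cv=LeaveOneGroupOut(),           # each fold holds out one whole site
)
print("per-held-out-site accuracy:", scores)  # chance-level here, since X carries no signal
```

If accuracy holds up only when the training sites reappear in the test set, that is a strong hint the model is riding a shortcut rather than a medically relevant signal.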