The largest retrospective study of its kind into the use of artificial intelligence as a tool to assist radiologists with breast cancer screening has demonstrated that the technology can significantly improve screening accuracy – leading to more accurate diagnoses – without sacrificing mandatory high safety standards.
The study, published in The Lancet Digital Health and commissioned by deeptech firm Vara, evaluated the performance of an AI-based approach using mammograms from more than 100,000 women in Germany, including more than 4,400 with identified cancers. Radiologists and researchers found that when AI is implemented along the screening pathway in a way that complements, rather than replaces, human radiologists, both the sensitivity and the specificity of radiologists improved significantly. Different algorithm configurations were evaluated, with some demonstrating that over 70% of the workload could be automatically triaged by the AI.
The success of this approach, known as decision referral, suggests that the most effective way to incorporate AI technology into clinical practice is as a tool used by radiologists, helping them to make better decisions and better manage the administrative burden of their task.
Vara’s AI technology optimises performance of both normal triaging and cancer detection. Normal mammograms are triaged and their respective structured reports are automatically generated, while mammograms assessed by the reader receive additional post-hoc AI cancer detection support when needed.
This key feature of decision referral, the safety net, is triggered only when the AI disagrees with the reader, flagging potentially missed cancers. In direct contrast to traditional computer-aided detection (CAD) systems, which have struggled with increased false positives, Vara’s decision referral approach may prove to be the optimal use of AI to find low-prevalence cancers without biasing the reader negatively.
The complementary processes were shown to outperform both the average (unaided) radiologist and the use of AI in stand-alone mode – that is, when the AI was working independently. In some cases, the decision referral approach could improve radiologist sensitivity by up to 7.2 percentage points.
Existing approaches to adopting AI in screening have focused on using the technology to replace radiologists by allowing the technology to interpret mammograms on their behalf. These approaches, while presented as the most promising route for the technology to be implemented into clinical settings, have been repeatedly criticised.
In a recent article, published in the British Medical Journal, researchers from Warwick University concluded that existing solutions using the stand-alone approach were “not ready for routine use in clinical practice based on existing published evidence”. In particular, they cited concerns with study bias and the fact that stand-alone AI still underperforms against single radiologists in realistic settings.
At the same time, women surveyed in the Netherlands voiced concerns about the fully independent use of AI-based diagnostics in screening mammography without the involvement of a radiologist. Almost 78% said they did not support its use and that it is premature to leave the interpretation of screening mammograms entirely to independently operating AI algorithms.
Vara’s decision referral combines the strengths of both radiologists and AI by taking a case-by-case approach to assessing the mammograms. Where the AI is not confident enough, it refers the decision to radiologists. Because the AI handles the cases it is confident about, the approach also has the potential to reduce radiologists’ workload.
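As a rough illustration of the routing logic described above, the Python sketch below sends each exam down one of three paths. It is not Vara’s implementation: the function name `route_exam`, the 0-to-1 score scale and both thresholds are hypothetical, chosen only to make the idea concrete.

```python
from typing import Optional

# Hypothetical operating points, for illustration only. The article does not
# state the thresholds used in the study.
NORMAL_THRESHOLD = 0.05      # at or below: AI is confident the exam is normal
SAFETY_NET_THRESHOLD = 0.95  # at or above: AI is confident the exam is suspicious


def route_exam(ai_score: float, reader_recalled: Optional[bool] = None) -> str:
    """Return the path one exam takes under a decision-referral scheme.

    ai_score is an assumed malignancy score in [0, 1]; reader_recalled is the
    radiologist's own assessment, needed only for exams that are referred.
    """
    if ai_score <= NORMAL_THRESHOLD:
        # Confidently normal: triaged automatically, no reader needed.
        return "auto_normal"
    if reader_recalled is None:
        raise ValueError("referred exams need a radiologist assessment")
    if ai_score >= SAFETY_NET_THRESHOLD and not reader_recalled:
        # Safety net: the AI disagrees with the reader and flags the exam.
        return "safety_net_flag"
    # Everything else: the radiologist's decision stands.
    return "reader_decision"


print(route_exam(0.01))         # confidently normal exam
print(route_exam(0.99, False))  # reader saw nothing, AI is confident
print(route_exam(0.50, False))  # AI unsure, reader decides
```

In a scheme like this, moving the two thresholds trades workload reduction against how often the safety net fires, which is one way to read the article's reference to different algorithm configurations.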
Comparing screening approaches
The latest study was a collaboration between breast radiologists Dr. Lale Umutlu of the Essen University Hospital, Dr. Katja Pinker of Memorial Sloan Kettering Cancer Center, and researchers from Vara, led by Machine Learning Director, Dr. Christian Leibig.
A total of 104,518 mammograms taken between 2007 and 2020, including 4,463 screen-detected cancers, were used to evaluate the algorithm’s performance. They came from eight screening units participating in Germany’s national mammography screening program. Mammograms from six screening units were used as an internal test set, while data from the two remaining screening units were used as an external test set.
The authors compared the performance of two distinct screening approaches for AI: a stand-alone AI approach and the AI decision referral approach. The sensitivity and specificity of each approach were evaluated using the two test sets and compared with the real-world performance of the average unaided radiologist on the same mammograms.
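For readers less familiar with the two metrics, sensitivity is the share of actual cancers that are flagged, and specificity is the share of cancer-free exams that are correctly cleared. The short Python sketch below computes both from toy labels. The data and the names `unaided` and `aided` are invented for illustration and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = cancer.

    Assumes y_true contains at least one positive and one negative case.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # cancers flagged
    fn = sum(1 for t, p in pairs if t and not p)      # cancers missed
    tn = sum(1 for t, p in pairs if not t and not p)  # normals cleared
    fp = sum(1 for t, p in pairs if not t and p)      # normals flagged
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: 4 cancers and 6 normal exams (made up for this example).
y_true  = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
unaided = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]  # one missed cancer, one false alarm
aided   = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # hypothetical reading with no errors

sens_u, spec_u = sensitivity_specificity(y_true, unaided)
sens_a, spec_a = sensitivity_specificity(y_true, aided)

# Differences between two rates are reported in percentage points.
gain_pp = (sens_a - sens_u) * 100
print(f"unaided sensitivity: {sens_u:.2f}, specificity: {spec_u:.2f}")
print(f"sensitivity gain: {gain_pp:.1f} percentage points")
```

A gain quoted in percentage points is the simple difference between two rates, so a move from 75% to 100% sensitivity in this toy case is 25 percentage points.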
The unaided radiologists achieved higher sensitivity and specificity across both sets compared to the stand-alone AI.
However, with the decision referral approach, sensitivity and specificity surpassed those of the radiologists alone. In one example configuration, the approach improved radiologist sensitivity by 4.0 and 2.6 percentage points on the internal and external sets, respectively. At this configuration, the AI performed normal triaging and cancer detection independently and with high accuracy, so that over 60% of screening mammograms were triaged automatically.
With these results, Leibig et al. aim to shift the dialogue away from stand-alone AI practices and towards a collaborative approach that brings AI safely and effectively into clinical practice.
Professor Lale Umutlu said: “This study demonstrates and underlines that AI is not meant to replace human radiologists but can assist us to improve our diagnoses and in the long run – improve patient care.”
Professor Katja Pinker-Domenig said: “Just as in clinical practice, two (or more) readers are always better when it comes to reading mammograms. It’s both common and encouraged that radiologists seek counsel from their colleagues and the decision referral approach is simply an extension of this. We are encouraged to see that this study helps to show this method can improve results.”
Dr Christian Leibig said: “AI’s promise in clinical settings and beyond has always centred around improving efficiency and accounting for human error yet, as we’ve seen with various approaches taken to date, there is insufficient evidence that AI is able to achieve this on its own in realistic settings. The decision referral approach leverages both the strengths of the AI, and the strengths of the radiologist to make both better, even if either party is better on average. We’ve seen this in our work at Vara but it’s great to now have this approach validated by some of the world’s leading radiologists.”