The radical Blog
  • Top AI models fail spectacularly when faced with slightly altered medical questions

    Shocking (not):

    Artificial intelligence systems often perform impressively on standardized medical exams—but new research suggests these test scores may be misleading. A study published in JAMA Network Open indicates that large language models, or LLMs, might not actually “reason” through clinical questions. Instead, they seem to rely heavily on recognizing familiar answer patterns. When those patterns were slightly altered, the models’ performance dropped significantly—sometimes by more than half.

    Link to report

    → 12:59 PM, Aug 26
    Also on Bluesky
  • How GLP-1s Are Breaking Life Insurance

    Here’s an interesting one (talking about the implications of the implication – or, in other words, our Disruption Mapping tool):

    GLP-1s (Ozempic) have the potential to break the life insurance industry – and maybe not for the reasons you would expect.

    Life insurers can predict when you'll die with about 98% accuracy. […] Typically, underwriters – suspiciously sounds like undertakers – rely on a handful of key health metrics like HbA1c, cholesterol, blood pressure, and BMI to calculate your risk of dying earlier than expected (and thus costing them money). Those eagle-eyed readers among you have probably noticed something interesting already. Those same four metrics are exactly what GLP‑1s improve. Not just a little, but enough to entirely shift someone's risk profile within at least 6 months of using them.

    The insurer sees a 'mirage' of good health and approves them as low-risk. […] If we assume about 65% of people who start GLP-1 medications quit by the end of year one, that creates a big problem. When someone stops the medication, they'll usually regain the weight they lost, and in two years, most of those key health indicators bounce back to their starting point.

    Yep, it’s going to get messy.

    Link to article.

    → 2:33 PM, Jul 15
    Also on Bluesky
  • Medicine's Rapid Adoption of AI Has Researchers Concerned

    Looks like your doctor likes his AI not only in his LinkedIn feed but also in all sorts of medical devices and platforms. According to the FDA, more than 1,000 medical AI products have been cleared for use – with interesting (and potentially troubling) consequences.

    “Unlike most other FDA-regulated products, AI tools continue to evolve after approval as they are updated or retrained on new data.”

    It gets worse:

    “...medical algorithms often perform poorly when applied to populations that differ from the ones they were trained on.”

    and:

    “...many hospitals are buying AI tools ‘off the shelf’ and using them without local validation. That is a recipe for disaster.”

    Brave new world. Maybe ask your doc next time if his diagnosis was aided by AI…

    Link to article in Nature.

    → 4:24 PM, Jun 9
    Also on Bluesky
  • The Quest for A.I. ‘Scientific Superintelligence’

    If there is one thing you ought to be excited about when it comes to AI (other than AI-powered singing fish with the voice of Arnold Schwarzenegger), it is scientific discovery. Lila, a Cambridge, MA-based startup with $200M in initial funding, just came out of stealth and showed off its creation:

    "A.I. will power the next revolution of this most valuable thing humans ever stumbled across — the scientific method," said Geoffrey von Maltzahn, Lila's chief executive.

    Link to article.

    → 11:34 AM, Mar 12
    Also on Bluesky
  • Streetlight vs. Floodlight Effects Determine AI-Based Discovery

    A lot of excitement exists around the use of tailored AI models for things such as drug discovery and the development of new materials. It turns out that Ethan Mollick’s “Jagged Frontier” of AI use and application applies here too. As Matt Clancy points out in his deep dive “Prediction Technologies and Innovation”:

    We can imagine Kim (2023)’s technology is like a lonely streetlight, only illuminating protein structures that are near to others we already know, while Toner-Rodgers’ technology is a gigantic set of floodlights that illuminate a whole field.

    In summary, the streetlight effect leads to a concentration of research efforts on well-trodden paths, while the floodlight effect can promote exploration of more novel and diverse areas. Thus, the former leads to sustaining innovation (at best), while the latter can lead to breakthrough innovation.

    Link to article.

    → 12:51 PM, Jan 27
    Also on Bluesky
  • AI-Supported Breast Cancer Screening: 17.6% Higher Detection Rate

    Wondering what AI could actually be useful for (other than creating funny images and spellchecking this blog post)?

    In a large-scale study in Germany, researchers found that AI-assisted breast cancer screening yielded vastly better results than the non-AI control group:

    […] after taking into account factors such as age of the women and the radiologists involved, the researchers found this difference increased, with the rate 17.6% higher for the AI group at 6.70 per 1,000 women compared with 5.70 per 1,000 women for the standard group. In other words, one additional case of cancer was spotted per 1,000 women screened when AI was used.
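    A quick sanity check on those figures (the headline 17.6% presumably comes from unrounded detection rates; with the rounded per-1,000 numbers quoted above, the relative increase works out to roughly 17.5%):

    ```python
    # Detection rates reported in the study (cancers per 1,000 women screened)
    ai_rate = 6.70        # AI-assisted group
    standard_rate = 5.70  # standard double-reading group

    # Absolute difference: one additional cancer detected per 1,000 women
    absolute_gain = round(ai_rate - standard_rate, 2)

    # Relative increase of the AI-assisted group over the standard group
    relative_gain_pct = round((ai_rate / standard_rate - 1) * 100, 1)

    print(absolute_gain)       # 1.0 extra case per 1,000
    print(relative_gain_pct)   # ~17.5 (the reported 17.6% reflects unrounded rates)
    ```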

    Link to article and study.

    → 5:15 PM, Jan 7
    Also on Bluesky