The company’s LLM has passed 100+ healthcare certifications and exceeded the performance of GPT-4 and other commercial models on those same benchmarks. The company has also developed a novel benchmark measuring the bedside manner of large language models to help ensure the emotional well-being of patients.

Hippocratic AI launched out of stealth to announce the industry’s first safety-focused Large Language Model (LLM) designed specifically for healthcare, as well as a $50M seed round co-led by General Catalyst and Andreessen Horowitz.

Large language models (LLMs) and Foundation Models (FMs) like ChatGPT and GPT-4 have surprised the world with their abilities. While researchers have shown that these AI models can pass the USMLE (US Medical Licensing Exam), no company has built a commercial model specifically tuned for healthcare applications. Hippocratic AI is building the first LLM for Healthcare with an initial focus on non-diagnostic, patient-facing applications. This will allow the company to ensure patient safety while improving healthcare access and outcomes.


“The healthcare industry needs its own AI platform, one that is focused on empowering the workforce, reducing burnout, and improving patient safety and experiences with the healthcare system. We joined forces with the Hippocratic AI team, our health assurance ecosystem, and the a16z team to build this platform. Our goal is to fundamentally increase the supply and scalability of healthcare professionals. This is the key to achieving the health assurance vision: a more proactive, more affordable, and equitable system of care for all,” said Hemant Taneja, CEO and Managing Director at General Catalyst.

Hippocratic AI was founded by a group of physicians, hospital administrators, Medicare professionals, and artificial intelligence researchers from El Camino Health, Johns Hopkins, Washington University in St. Louis, Stanford, UPenn, Google, and Nvidia.

“After working with Munjal and team for years in his prior company, we know that his lived experience as a healthcare and tech operator gives him an edge in understanding what it takes to bring high-ROI products to market – especially at a time when existing industry players are in such dire need of better operating leverage and financial sustainability. We believe Hippocratic AI’s cross-disciplinary, safety-first approach is what the healthcare industry needs to be able to maintain trust in the power of responsible deployment of generative AI solutions,” said Julie Yoo, General Partner at Andreessen Horowitz.

To build a safer large language model, the company has focused on three areas: certification, RLHF via healthcare professionals, and bedside manner.

Certification

Passing the USMLE is not enough to ensure a model is ready for the wide variety of healthcare roles that exist in care and payor settings. Hippocratic AI therefore tested its model on 114 healthcare certifications and roles. The company aimed not just for a passing score but to outperform existing state-of-the-art language models such as GPT-4 and other commercially available models. It outperformed GPT-4 on 105 of the 114 tests and certifications, by 5% or more on 74 of them, and by 10% or more on 43 of them. Below are some sample results. Full results here: (www.HippocraticAI.com/benchmarks)
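For context, the counts quoted above can be restated as shares of the 114-test suite. The following is a minimal Python sketch that uses only the figures given in this announcement; the variable and function names are illustrative and do not come from the company.

```python
# Counts quoted in the announcement; all names here are illustrative.
TOTAL_TESTS = 114

outperformed = {
    "any margin": 105,
    "5% or more": 74,
    "10% or more": 43,
}


def share_of_suite(count: int, total: int = TOTAL_TESTS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share_of_suite(n) for label, n in outperformed.items()}

for label, pct in shares.items():
    print(f"Outperformed GPT-4 by {label}: {pct}% of tests")
```

By this arithmetic, the model outperformed GPT-4 on roughly 92% of the tests, by 5% or more on roughly 65%, and by 10% or more on roughly 38%.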

RLHF with Healthcare Professionals

Hippocratic AI has decided that the best people to determine whether an LLM is ready for deployment in the healthcare system are the experts who serve in those roles in today’s system. Large language models can be shaped with human feedback through a technique called Reinforcement Learning from Human Feedback (RLHF). Many believe this technique is what led to the remarkable performance of ChatGPT compared to that of prior versions of OpenAI’s language models.

In building Hippocratic AI, the company has engaged healthcare professionals to help guide and train the LLM by rating its responses.
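RLHF pipelines commonly turn human ratings into preference comparisons that a reward model can learn from. The announcement does not describe the company's rating scale or pipeline, so the following Python sketch is purely hypothetical: the 1-to-5 scale, the `RatedResponse` class, and the `preference_pairs` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from statistics import mean


@dataclass
class RatedResponse:
    """One model response plus hypothetical 1-5 scores from reviewers."""
    text: str
    ratings: list[int]

    @property
    def score(self) -> float:
        # Average the individual reviewer scores.
        return mean(self.ratings)


def preference_pairs(responses: list[RatedResponse]) -> list[tuple[str, str]]:
    """Build (chosen, rejected) text pairs from average ratings, skipping ties."""
    pairs = []
    for a, b in combinations(responses, 2):
        if a.score == b.score:
            continue  # a tie carries no preference signal
        chosen, rejected = (a, b) if a.score > b.score else (b, a)
        pairs.append((chosen.text, rejected.text))
    return pairs


# Illustrative data only; not taken from the company.
candidates = [
    RatedResponse("Response A", [5, 4, 5]),
    RatedResponse("Response B", [2, 3, 2]),
    RatedResponse("Response C", [4, 4, 4]),
]
pairs = preference_pairs(candidates)
print(pairs)
```

In a typical setup, pairs like these would be used to train a reward model, which in turn guides further tuning of the language model.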

“RLHF with healthcare professionals isn’t just a feature but is really our commitment to partner deeply with the industry,” said Munjal Shah, Co-Founder and CEO of Hippocratic AI. “We aren’t just saying these professions will help us evaluate our system. We are saying we won’t launch each unique role for the LLM unless the professionals who do that exact task today agree the system is ready and safe.”

Some of the roles and tasks the company is exploring include patient navigator, dietician, genetic counselor, enrollment specialist, medication reminders, and more.

Bedside Manner

“In healthcare settings, it isn’t just important to answer the patient accurately. It is equally important that it is done with great bedside manner. Many studies have shown that bedside manner impacts emotional well-being and quality of outcomes. This isn’t just true for doctors but also true for everyone interacting with patients: billing agents, schedulers, and more,” said Meenesh Bhimani MD, Co-Founder and Chief Medical Officer of Hippocratic AI.

To date there are no benchmarks for evaluating the bedside manner of a language model when interacting with patients. Hippocratic AI will be releasing the first of many bedside manner benchmarks for the entire community to use. Below are the initial results the company has achieved against these benchmarks.

Today marks the beginning of Hippocratic AI’s vision to use language models to massively increase healthcare access, reduce costs, and close the healthcare skills gap left behind by the global pandemic. Large language models are one of the best new ways to achieve this, but they have to be deployed safely and tuned for the healthcare industry.

About Hippocratic AI
Hippocratic AI’s mission is to develop the safest artificial Health General Intelligence (HGI). The company believes that safe HGI can dramatically improve healthcare accessibility and health outcomes in the world by bringing deep healthcare expertise to every human. No other technology has the potential to have this level of global impact on health. The company was founded by a group of physicians, hospital administrators, Medicare professionals, and artificial intelligence researchers from El Camino Health, Johns Hopkins, Washington University in St. Louis, Stanford, Google, and Nvidia. Hippocratic AI received $50M in seed financing from two of the pioneering healthcare investors in Silicon Valley: General Catalyst and Andreessen Horowitz. For more information on Hippocratic AI’s performance on 100+ Medical and Compliance Certifications, go to www.HippocraticAI.com

About General Catalyst
General Catalyst is a venture capital firm that invests in powerful, positive change that endures — for our entrepreneurs, our investors, our people, and society. We support founders with a long-term view who challenge the status quo, partnering with them from seed to growth stage and beyond to build companies that withstand the test of time. With offices in San Francisco, Palo Alto, New York City, London, and Boston, the firm has helped support the growth of businesses such as Airbnb, Deliveroo, Guild, Gusto, Hubspot, Illumio, Lemonade, Livongo, Oscar, Samsara, Snap, Stripe, and Warby Parker. For more: www.generalcatalyst.com.

About Andreessen Horowitz (a16z)
Founded in Silicon Valley in 2009 by Marc Andreessen and Ben Horowitz, Andreessen Horowitz (known as “a16z”) is a venture capital firm that backs bold entrepreneurs building the future through technology. We are stage agnostic: We invest in seed to venture to late-stage technology companies, across bio + healthcare, consumer, crypto, enterprise, fintech, games, and companies building toward American dynamism. a16z has $33.3B in assets under management across multiple funds.