From scripts to systems: where health chatbots add value
A blog by the Frontier Tech Hub
Health chatbots don’t succeed because they’re clever, but because they’re embedded in trusted systems with clear referrals to real people at the right time. After six years of pilots run across Peru, Nigeria and Kenya, here’s what we’ve learned: just-in-time support matters, but governance, integration and design for real-world use matter more.
Sixty years of chatbots
The first chatbot, ELIZA, was created in 1966 by Joseph Weizenbaum, a professor at MIT. It played the role of a psychotherapist, using basic pattern matching to mimic human conversation. Users knew they were interacting with a machine, yet many reported feeling genuinely understood and supported by it.
Sixty years later, chatbots can do far more than respond based on a script. Our portfolio of pilots testing their potential to improve health outcomes has grown with advances in natural language processing and generative AI over the past six years.
In 2019, a structured decision-tree model was used to provide discreet sexual and reproductive health (SRH) information and encourage contraceptive use among young Kenyans. Later, in 2023, AI was tested within a WhatsApp chatbot to improve vaccine uptake in Nigeria.
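A minimal sketch of how such a structured decision-tree chatbot can work: each node holds a message and a menu of keyed replies leading to the next node. The node names, messages and menu keys here are hypothetical illustrations, not the Kenya pilot's actual content.

```python
# Each node: a message to send, plus a menu mapping user replies to the next node.
TREE = {
    "start": {
        "message": "Hi! What would you like to know? 1) Methods 2) Side effects",
        "options": {"1": "methods", "2": "side_effects"},
    },
    "methods": {
        "message": "Options include pills, injectables and implants. Reply 1 to find a clinic.",
        "options": {"1": "referral"},
    },
    "side_effects": {
        "message": "Mild side effects are common and usually pass. Reply 1 to speak to a counsellor.",
        "options": {"1": "referral"},
    },
    "referral": {
        "message": "A trained counsellor will follow up with you shortly.",
        "options": {},
    },
}

def step(node: str, user_input: str) -> str:
    """Advance the conversation; unrecognised input re-asks the current node."""
    return TREE[node]["options"].get(user_input.strip(), node)
```

The appeal of this design is that every possible exchange is authored and reviewable in advance, which is why rule-based bots remain a common starting point before any generative AI is introduced.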
These experiments harnessed a powerful insight: even before AI arrived, a chatbot sits with people at the moment choices are made. That timeliness matters. Evidence from phone helpline data shows that longer reply times are associated with lower counselling impact. Behavioural science calls this ‘just-in-time support’: small, specific nudges delivered at the moment they’re most useful. The evidence for behavioural interventions in healthcare is strong, and chatbots have been shown to outperform traditional reminders.
But as our latest report, ‘How to Build Better Health Chatbots for the Global Majority,’ shares: “their success depends on much more than technological sophistication.” Chatbot innovation is not just a technical endeavour, but a systems-level challenge that demands coordination across institutions, disciplines and communities.
“AI should be a lifeline”
One of our portfolio pilots developed Kem, a chatbot created by mDoc and embedded in their omnihealth platform, CompleteHealth™, in Nigeria, to respond to the growing needs of their members in the wake of COVID-19. As they shared: “AI should be a lifeline, making living easier, especially within the context of health.”
Kem began as a purely rule-based chatbot, providing general advice during that period to users struggling with non-communicable diseases (NCDs) such as hypertension. With the advent of LLMs, Kem’s capacity expanded. NCDs are a growing financial pressure in Nigeria, particularly for low-income households: even before the pandemic, around 30% of households affected by NCDs were experiencing catastrophic health expenditure. Just-in-time support can reduce patient costs by answering questions that don’t require costly clinic visits.
When we asked the mDoc team what success looks like, the answer wasn’t about a model or algorithm. It was about people: self-efficacy for patients while reducing the burden on frontline healthcare workers. It’s the human capital perspective that can lead to healthier, happier lives.
And just as those speaking to ELIZA in 1966 felt space to open up to a machine, so did Kem’s users. Space and access to ask questions widened the scope beyond the original, rigid scripts, as did requests for multi-lingual capabilities. Kem has since evolved in response, harnessing the power of generative AI.
The human(s)-in-the-loop
Health chatbots are not a substitute for clinical care. They are a route to encouraging health-seeking behaviour and increasing access to information. But if not carefully implemented, chatbots can deepen inequities, spread misinformation, or erode trust in care systems.
Each pilot has demonstrated a fierce commitment to safety, inclusion, and trust. The mDoc team described their ‘human-in-the-loop’ setup: guardrails, manual review, and escalation to coaches for anything sensitive. Strong governance frameworks, ethical oversight, and clear hand-off points to human providers are essential, as is investing in regionally relevant LLMs, local language models, and offline alternatives. The World Health Organisation’s 2025 guidance is a solid compass on when and how to use LMMs in health.
User-led design is a critical pathway to creating a safe and inclusive tool, partly because it surfaces unintended consequences and blind spots, such as language limitations. Peruvian partners have requested support in Quechua and Aymara beyond the Spanish currently available. In Nigeria, mDoc’s team learnt from user testing that some users struggled to type, so speech-to-text and text-to-speech functions were added. Across our portfolio of chatbot interventions, similar adaptations have been made: shorter answers, pictures when text is difficult to absorb, and a clear referral to a real human being when needed.
The entry point shifts with existing behaviour and context too: our pilots have embedded chatbots where people already are, whether that’s WhatsApp, SMS, or IVR, keeping routes open for those using low-fidelity tech. In Sub-Saharan Africa, mobile internet continues to rise: penetration reached 27% in 2023, with 527 million mobile subscribers, which still leaves many people offline.
But there are more humans beyond the user, an entire system of them, that need to be in the loop. Partnerships across clinics and ministries are critical to ensure that a chatbot isn’t acting in a vacuum and can effectively triage users to the support they need. To scale effectively, chatbots must be embedded in existing workflows, supported by policy, and paired with training and capacity-building for providers.
mDoc has established strong, collaborative relationships across the Nigerian healthcare system. Still, other pilots in our portfolio found that health systems are often not structured to accommodate such early-stage innovation.
While policy and governance are slowly catching up with the rate of tech, it’s up to every team to bring systems thinking and user-led design expertise to their work, keeping humans in the loop and tackling the ever-widening digital divide.
What good looks like right now
Across our portfolio of health chatbots, four key commitments consistently emerge.
Scope narrowly, integrate deeply. Pick a job the system actually needs: self-care tips and navigation for a specific condition, triage to the right service, or myth-busting for a campaign. Embed it in real pathways, not a side channel. Set explicit escalation rules for human care.
Design for how people really communicate. Short, plain answers. Multiple languages. An offline channel for those who need it. Voice input and output when typing is a barrier. Images or short videos, when that’s clearer than text. These small choices widen access.
Make behaviour change a feature, not a hope. Build MASTER into the flow: Messenger, Attractive, Social, Timeliness, Ease, Regularity. And study the behavioural science insights from BIT’s work in health.
Run a safety and governance loop. Keep a human in the loop. Log risky prompts and responses. Review for accuracy, empathy and context. Document your model and content sources. Align with local rules and sector guidance, such as the WHO’s LMM guidance.
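As a rough illustration of that safety loop, a few lines of code can flag risky messages, log them for manual review, and escalate to a human. The risk patterns, function name and ‘escalate’ route here are hypothetical sketches under our assumptions, not mDoc’s implementation.

```python
import re

# Hypothetical risk patterns; a real deployment would use clinically
# reviewed criteria, not a handful of regexes.
RISK_PATTERNS = [r"chest pain", r"overdose", r"suicid"]

review_log: list[str] = []  # in practice, a persistent audit store

def triage(message: str) -> str:
    """Return 'escalate' for risky messages (after logging), else 'bot'."""
    lowered = message.lower()
    if any(re.search(p, lowered) for p in RISK_PATTERNS):
        review_log.append(message)  # keep a record for human review
        return "escalate"           # hand off to a human provider
    return "bot"
```

The point of the sketch is the shape of the loop, not the patterns: every risky exchange is both diverted to a person and recorded, so reviewers can audit accuracy, empathy and context over time.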
These are the critical, unglamorous things that will make the difference between a cool demo and something a nurse, a patient, or a new parent will trust and rely on throughout their broader healthcare journey.
That’s the frontier these teams are chasing: not just more intelligent bots, but stronger systems around them. The right support, at the right moment, contextualised and communicated inclusively, with a safe route to the next action elsewhere in the system.
If you’d like to dig in further…
📚 Lessons from the Frontier: Health Chatbots in LMICs report