Responsible AI on the Frontier

2 Jul

A blog by Dr Sam Stockley-Patel,
Research & Engagement Manager, Frontier Tech Hub

*Elise Racine & Digit / https://betterimagesofai.org / https://creativecommons.org/licenses/by/4.0/*

When it comes to implementing AI for development, the Frontier Tech Hub has a practical, evidence-informed lens on questions of ethics and responsibility. It's one that's been shaped by a decade of hands-on field-testing of diverse AI tools in high-uncertainty, early-stage, and operationally complex environments, in close collaboration with in-country partners and alongside local users.

The pilots in our portfolio are not longitudinal studies or deployments with dedicated ethics teams. They are small, fast, iterative tests, typically running for twelve to eighteen months on limited budgets, in environments where data is sparse, institutional capacity is stretched, and the value of testing something imperfect often outweighs the cost of not testing it at all.

Below is a set of cross-portfolio insights that we believe are instructive to others currently designing, testing, funding, or scaling AI in similarly nascent, challenging, and high-potential contexts. Some observations reflect responsible AI practice; others reflect challenges we've experienced and want to name.

1. Data quality is treated as a technical problem; it is often also an ethical one

Some pilots have used data that failed quality checks at scale, yet the outputs still fed into real decisions, because the alternative was no input at all. Others have used proxy measures where ground truth didn't exist.

Sometimes, imperfect data is the only data available, so the choice isn't between 'good' data and 'bad' data, but between deploying with acknowledged and considered limitations, or not deploying. Our teams have been transparent about these trade-offs.

What's been harder to surface is where ethical weight is attached to data limitations. For instance, a model validated against official data may miss the most marginalised people who aren’t in that data. These are technical problems that affect accuracy, but crucially they also manifest as equity problems and should be considered as such.

Deploying AI on imperfect data is often necessary, and in early-stage pilots, it's frequently the only option. It's important to name limitations clearly and assess downstream implications honestly. What's harder is recognising when a technical framing is also carrying ethical weight, and making sure both are visible.

2. The communities most affected can be the least visible in AI design

Genuine co-design with marginalised communities is often hard. It's typically expensive, slow, and in some contexts, raises its own ethical difficulties. Our goal of rapidly testing early-stage ideas in operationally tricky environments can mean that community involvement, even when it's acknowledged, becomes something to test later rather than sooner.

Several pilots have named this tension directly: a system that would involve disadvantaged groups without an established consent framework, or a design that linguistically excluded some users.

Others show what deliberate co-design looks like: technology built from the ground up alongside the community groups whose expertise it extends, or a partner organisation brought in specifically to lead community engagement as part of the design process.

The populations AI is designed to help are sometimes absent from the design process. This is rarely a simple oversight, but instead reflects a strategic decision to focus limited time and resource on other critical aspects of getting a pilot off the ground. Naming that absence and its implications is crucial, and community involvement should be treated as a design priority when conditions allow.

3. Scope reduction is a responsible AI decision

In some pilots, narrowing from a complex system to a simpler, single-purpose tool was framed as a technical pivot. In others, stepping back from more contested technologies in favour of more basic ones was framed as pragmatic.

In each case, the pilot team arrived at a less technically ambitious but more practically appropriate solution. And in each case, the decision was named as something other than what it also was: an act of responsible AI practice. By being less ambitious about what AI systems should accomplish, clearer paths to adoption and impact might be possible.

Deploying a less capable, more legible, more human-controlled system can result in better and therefore more responsible outcomes. Decisions around simplification, legibility, and appropriate scope can be buried in operational or feasibility narratives, but designing for real-world impact is also more ethical and responsible.

4. Intermediaries and interfaces carry unexamined risk

Multiple pilots route AI outputs through human intermediaries or user interfaces that can be under-resourced, under-scrutinised, and under-supported.

In some pilots, operators have captured sensitive personal data on personal devices with no formal protocol. In others, a handful of human validators have been expected to review AI outputs across very large volumes of records with no clear accountability framework.

In many ways, these are the predictable result of designing AI systems within resource and time constraints, where data governance frameworks and accountability structures for human-AI interfaces are genuinely difficult to build well at pilot scale. What they point to is a clearer set of choices that are better made early than late: descope the feature if it can't be governed properly; name the gap explicitly in documentation so that anyone funding or scaling the next phase knows what governance infrastructure is still missing; or make the case for the additional resource required to do it right.

The responsible AI risks at the human-AI interface level (data handling, accountability, power dynamics, and sustainability) are genuinely hard to address at pilot scale and can go underexamined during design phases, but they typically require specific attention and mitigation.

5. AI trained on unjust data risks encoding injustice as prediction

Several pilots have trained models on data reflecting past systemic failures, risking the encoding of those failures as predictions: a model could flag the poor or marginalised as "high risk" simply because they've historically not accessed services, or recommend the same thing to everyone because that's what the historical data shows.

Without training data that accurately represents the populations the AI will serve, pilots risk building systems that learn the shape of existing injustice before generating recommendations, decisions, or predictions. In early-stage pilots, training data is often imperfect, and the implications of that for specific groups need to be documented and accounted for.

6. Ethics approval is present, but uneven

Several pilots have undergone formal ethics processes, drawing on institutional ethics boards and consent frameworks. These are genuinely positive steps, though for pilots of this scale, ethics approval tends to be experienced as a compliance process rather than a substantive design one.

In some pilots, ethics approval focused on the research study design rather than the AI system itself. In others, teams produced dedicated ethics analysis, debating the case for and against using AI, and naming risks like de-skilling, accountability, and privacy as needing mitigation. The question remains what a minimum meaningful standard looks like for pilots of this kind and scale.

Ethics approval and responsible AI are not the same thing. Ethical engagement should be codified, and should include AI system design, not just human subjects research protocols, and it should treat transparency as a governance principle rather than a communications choice.

7. The context-transfer problem, and its human consequences

The portfolio has repeatedly tested whether tools developed in one context transfer to another with retraining: models trained to recognise one landscape haven't reliably detected similar features elsewhere, and AI trained on one population has underperformed on another. In at least one case, a missing local-language model was framed as a technical hurdle when it may have pointed to something deeper: a tool can't be assumed to work in a new context just because it learns the words.

Context-specificity is experienced practically across multiple AI pilots, but rarely named as a responsible AI principle. We should be wary of any implicit assumptions that AI tools can transfer between geographies even with retraining.

8. The AI that wasn't built is as revealing as the AI that was

Sometimes the most interesting responsible AI story in a pilot isn't what was deployed but what wasn't, and why. In some pilots, a more sensitive classification system was designed but abandoned for lack of underlying data, not because of ethical reconsideration. In others, a more invasive identification technology was proposed for use at scale but never implemented, due to a technical or access barrier rather than any ethical rethink.

In both cases, circumstance did the work that ethical design review might have done instead. They surface an important point worth remembering: "should we build this?" and "can we build this?" are different questions, and both deserve an explicit answer.

AI non-deployment is typically framed as a technical or resource failure rather than as an opportunity for ethical reconsideration. Build in the habit of asking "should we?" alongside "can we?" before circumstance answers on your behalf.

We want to reiterate something crucial: across our portfolio, every pilot team is making active choices about where to focus, and not every responsible AI consideration can be a priority. Where trade-offs happen, what matters is that they are made consciously, that any limitations are named, and that pilots grow and share an evidence base that improves future practice.

When you've faced these trade-offs in your own AI work, which consideration did you consciously set aside? Would you make the same call again? Reach out to Sam to share your thoughts and insights: sam@hellobrink.co

Next up, we will ask a practical question around which tools are genuinely useful for development practitioners.

We've started gathering the answers in one open space. If you've built something useful, or have a problem worth solving, there'll be a way in.

If you’d like to dig in further…

🔘 An AI pitch that became a much needed data infrastructure project

🔘 Cutting through the noise of superabundant AI content

🔘 What if we try Y? The best way to unwrap AI’s potential


Frontier Tech Hub

The Frontier Technologies Hub works with UK Foreign, Commonwealth and Development Office (FCDO) staff and global partners to understand the potential for innovative tech in the development context, and then test and scale their ideas.
