AI Trust and Safety Re-Imagination Programme: Building Frameworks of the Future
In early 2025, the United Nations Development Programme (UNDP) launched the AI Trust & Safety Re-imagination Programme, through which we heard from practitioners in 60 countries who are addressing emerging AI-related harms. The programme is designed as a networked ecosystem of Trust and Safety leaders, innovators, researchers, think tanks and civil society organizations across countries, drawing on the lived experiences of stakeholders and surfacing emerging insights to address AI-driven harms.
These insights from countries are critical in shaping the new paradigm for Trust and Safety models – shifting from universal, centralized approaches towards solutions tailored to local contexts.
Submissions by region
Needs and opportunities by region
- Africa: Submissions highlighted a wide spectrum of challenges and critical gaps – from inadequate policy enforcement to the unregulated deployment of AI models. Notably, the marginalization of low-resource languages emerged as a key barrier to advancing AI safety across the continent. There was also a strong emphasis on community-led solutions to address infrastructure gaps, with a focus on cybercrime, misinformation, child safety and youth reskilling.
- Asia-Pacific: Priorities included addressing harms affecting migrant groups, financial fraud and linguistic diversity. Submissions called for a balance between centralized governance and culturally specific adaptations. Common themes were misinformation, gender-based violence, child safety and identity fraud. These reflect growing concerns around AI and digital risks that disproportionately affect vulnerable populations.
- Latin America and the Caribbean: Governance gaps, financial fraud and deepfakes used for disinformation were identified as paramount. Submissions leaned towards combining centralized compliance tools with locally grounded approaches that empower civic leaders and civil society groups.
- Europe and Central Asia: Concerns centred on low-resource languages and algorithmic fairness in diverse contexts. While some favoured EU-inspired centralized frameworks, others stressed the need for localized approaches for marginalized groups.
- Arab States: Governance gaps and disinformation were identified as key concerns, and localized approaches were seen as essential.
- North America: Misinformation, data provenance, deepfakes and bias were identified as top issues.
Top emerging themes
Across regions, submissions reimagine AI Trust and Safety as ‘decentralized with shared foundations,’ aiming to strike a balance between global and local approaches. Shared foundations, such as global datasets and AI safety infrastructure, combined with community-driven implementation strategies, were proposed to counter the pitfalls of one-size-fits-all models while avoiding the fragmentation of purely local solutions.
The Global South led in the number of submissions, pointing to early warning signs of emerging AI harms and providing evidence of their widespread, cross-border amplification. Common themes include misinformation, deepfakes, bias, fraud, and child safety. These challenges are particularly significant in low-resource contexts, where limited digital literacy, weak regulations, and cultural or linguistic diversity amplify risks, because existing solutions are scaled and optimized primarily for presumed mono-cultural and mono-lingual populations.
- Disinformation / deepfakes: Solutions such as AI detection tools (e.g., WITNESS's TRIED benchmark, Karna’s audio deepfake detection) or monitoring platforms (e.g., Phoenix in Kenya).
- Bias and discrimination / gender-based violence: Frameworks for culturally adaptive AI or chatbots for migrant workers (e.g., PoBot by Migrasia in Hong Kong).
- Financial fraud / identity theft / authentication and cybersecurity: Tools such as fraud detection apps (e.g., Silverguard in Brazil) or personas for inclusive fintech (e.g., Assessed Intelligence).
- Child safety / exploitation: Reskilling programmes (e.g., Trust & Safety Africa Academy in Ghana).
- Safety impact of low-resource languages: Datasets and benchmarks addressing low-resource language inaccuracies that lead to misinformation (e.g., Distant Voices work by Ushahidi, Trustweave, Duco and Tattle).
Share of submissions by theme:
- Misinformation (26%): Spread of false information undermining public trust and democratic processes.
- Governance gaps (23%): Insufficient regulatory frameworks for emerging digital technologies.
- Impact of low-resource languages (14%): Digital exclusion affecting communities with limited language representation online.
- Child safety (9%): Online risks and inadequate protection measures for minors in digital spaces.
- Financial fraud (7%): Digital payment scams and cryptocurrency-related financial crimes.
- Bias and discrimination (6%): Algorithmic bias perpetuating social inequalities in digital systems.
- Cybercrimes (5%): Digital security threats including hacking, fraud, and data breaches.
- Deepfakes (5%): AI-generated synthetic media threatening authenticity and trust.
- Gender-based violence (3%): Online harassment and digital violence disproportionately affecting women.
- Identity fraud and authentication (2%): Digital identity theft and inadequate verification systems.
Across submissions, the underlying causes of AI-related harms were consistent: insufficient regulatory frameworks, limited digital literacy, and inadequate cultural or linguistic adaptation of AI systems. These gaps result in significant risks, including election interference, fraud, reinforcement of bias (e.g., gender-based violence), and the exploitation of vulnerable populations. The insights underscore the importance of context-specific, locally grounded Trust and Safety solutions.
Inaugural Cohort
The programme’s Re-imagination Lab features 17 game-changing initiatives. This cohort will collaborate to co-design user-centred localized strategies to address AI harms and strengthen Trust and Safety globally.
Check out their brilliant ideas: Assessed Intelligence, Change.org, DUCO, Fatma Elsafoury, GRIT, KARNA, Migrasia, Phoenix, S.A.F.E.AI, Silverguard, SimPPL, SOMOS Civic Lab, Tattle, Trust&Safety Africa Academy, Trust Weave, Ushahidi, Witness
Multi-National AI strategies – Scaling AI Trust & Safety globally
Trust and Safety for AI is now a core part of the Regulation & Ethics pillar in the UNDP Artificial Intelligence Landscape Assessment (AILA). This element of the AILA evaluates whether countries are equipped with tooling, safeguards, accountability measures, infrastructure and capabilities to anticipate and respond to risks. This ensures the assessment reflects not only countries’ technical readiness but also their capacities to support safe, inclusive, and trustworthy AI deployment and adoption.
Accessibility of AI Trust and Safety solutions
UNDP is committed to developing an AI Trust and Safety ecosystem that creates pathways to equitable access to tools and resources that promote assurance and protection. Global public goods such as open toolkits, datasets, training materials, and shared frameworks can bridge common needs and help countries close capacity gaps. Submissions show that AI risks cross borders. Multi-national strategies that can be adapted to diverse languages and cultural contexts are needed across regions, highlighting the value of global public goods that provide common foundations while supporting context-specific approaches. This will require shared and accessible frameworks to align on Trust and Safety interests and build institutional capacity to implement and sustain solutions.
The need for complementary, interoperable safety strategies
The landscape of AI Trust and Safety methods is fragmented. To effectively mitigate wide-scale harms, strategies need to operate across the stack. Frontier AI developers are embedding safeguards to reduce risks at the source; this baseline safety is necessary but insufficient for current local realities, because model safeguards cannot address all the ways harms manifest in local contexts. When AI is applied in sectors like health, education, or elections, safeguards must be adapted locally to ensure they are fair and appropriate. Governments, regulators, and civil society need oversight and accountability mechanisms to reinforce safe use, and local actors need to adapt tools to reflect local needs. Trust ultimately depends on whether citizens actually experience protections against harms such as misinformation, bias, or online violence. The goal of the UNDP AI Trust & Safety Re-imagination Lab is to engage local innovators and communities who bring the practical experience and methodologies needed to address region-specific risks, and to create new collaboration models with innovators globally. Real trust comes from combining global efforts with local governance, innovation, and community protections.
What is next for us in the Re-Imagination programme?
The next milestone is the launch of the Re-imagination Lab this fall with the 17 finalists. The Lab will be structured as a bootcamp to co-design multi-national, complementary strategies and ecosystem support into 2026. A report detailing the outcomes of the Re-imagination Lab, including key insights and co-designed solutions, will be published. We will also capture and share lessons from the cohort to inform countries and global policy discussions.