Human-Guided AI for Scaled and Trusted Risk Insights
Challenge to overcome
Banks, asset managers, and corporates face growing complexity in identifying and managing business conduct risks arising from biodiversity, climate, human rights, and corruption across their business relationships, portfolios, and supply chains. Risk assessments based solely on self-reported disclosures often obscure material issues and delay timely intervention. The core challenge is twofold:
- Finding the relevant sources worldwide, including local media, NGOs, government publications, and industry reports.
- Transforming vast unstructured content into consistent, comparable, and decision‑ready risk intelligence.
Without systematic, multilingual screening and classification, early warning signals remain undetected – affecting financial performance, compliance, reputation, and stakeholder trust.
AI application / use case
With the rise of powerful LLMs and AI agents, the sustainable finance industry now faces four overarching challenges – all of which the solution addresses:
- Hallucinations. Without guardrails, AI can invent facts – producing inconsistent risk assessments, misclassified companies, and costly errors. A wrong link between a pollution case and the wrong firm isn’t a small mistake – it’s a liability.
- Historical data. The internet forgets. Older risk signals disappear, making in-house historical data essential for understanding long-term patterns.
- Relevant sources. Generic name-based web searches often return noise, nothing at all, or the wrong entity – leading to unreliable coverage and dangerous gaps.
- Traceability. Sustainable finance requires decisions to be auditable. A single opaque AI model increases hallucination risk and makes it hard to trace outputs back to the underlying data, and that traceability is a critical requirement across lending, compliance, and fraud detection.
To address the challenges stated above, RepRisk employs a hybrid approach that combines advanced AI models with human intelligence. This integration enables more accurate identification and classification of business conduct risks across global sources. The combination of human and artificial intelligence performs the following core tasks:
1. Multilingual ingestion
A web-scale ingestion pipeline continuously collects structured and unstructured material from 150k+ sources, which are curated by language- and domain-specific analysts. Language-agnostic AI models identify relevant content even when keywords differ culturally or linguistically.
2. Risk signal extraction with transformer-based models
In-house trained and hosted transformer models detect mentions of potentially harmful business conduct and categorize them according to RepRisk’s framework of 28 Business Conduct Issues and 80 Topic Tags. Models identify and evaluate, for example:
- Companies and projects.
- Business conduct issue type (e.g., waste issues, forced labor, fraud) and severity.
This transforms raw text into semi-structured risk signals.
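As a rough illustration of what such a semi-structured risk signal could look like, the sketch below defines a hypothetical `RiskSignal` record. The field names, the 1 to 3 severity scale, and the example values are assumptions made for illustration only and do not describe RepRisk's actual schema.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field


@dataclass
class RiskSignal:
    """Hypothetical semi-structured record extracted from one source document."""

    source_url: str
    language: str
    companies: list[str]
    projects: list[str]
    issue: str      # e.g. "waste issues", "forced labor", "fraud"
    severity: int   # assumed scale: 1 (low) to 3 (high)
    topic_tags: list[str] = field(default_factory=list)
    validated_by_analyst: bool = False  # set to True after human review

    def __post_init__(self) -> None:
        # Reject values outside the assumed severity scale.
        if self.severity not in (1, 2, 3):
            raise ValueError("severity must be 1, 2, or 3")


# Placeholder values; no real company or article is referenced.
signal = RiskSignal(
    source_url="https://example.org/article",
    language="pt",
    companies=["Example Mining Co"],
    projects=["Example Dam"],
    issue="waste issues",
    severity=2,
)
record = asdict(signal)  # plain dict, ready for storage or review tooling
```

Keeping the analyst-validation flag on the record itself is one simple way to make the later human review step visible in the data.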
3. Refinement and fine-tuning of models with humans in the lead
Human experts review and validate model outputs to correct ambiguities, ensure contextual accuracy, and avoid false positives. Feedback loops inform continuous model fine-tuning. This hybrid setup strengthens innovation, relevance, accuracy, consistency and cross-sector comparability.
4. An agentic workflow for quality-assured outputs
The solution employs multi-step reasoning loops:
- An LLM interprets the query or document set.
- Models classify risks and check consistency.
- Human analysts perform validation.
- Human input is used to fine-tune models.
Figure 1: Combining human and artificial intelligence for trusted risk insights
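The loop described in task 4 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the production workflow: `classify` is a keyword rule standing in for the model step, `demo_analyst` is a scripted stand-in for a human reviewer, and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Draft:
    text: str
    issue: str


def classify(text: str) -> Draft:
    """Stand-in for the model step: a trivial keyword rule, not a real model."""
    issue = "waste issues" if "spill" in text.lower() else "unclassified"
    return Draft(text, issue)


def run_workflow(
    documents: List[str], analyst: Callable[[Draft], str]
) -> Tuple[List[Draft], List[Tuple[str, str]]]:
    """Classify each document, have the analyst confirm or correct the label,
    and collect the corrections as candidate fine-tuning examples."""
    validated: List[Draft] = []
    feedback: List[Tuple[str, str]] = []
    for text in documents:
        draft = classify(text)
        final_issue = analyst(draft)          # human validation step
        if final_issue != draft.issue:
            feedback.append((text, final_issue))  # correction feeds fine-tuning
        validated.append(Draft(text, final_issue))
    return validated, feedback


def demo_analyst(draft: Draft) -> str:
    """Scripted reviewer used only for this demonstration."""
    if "bribe" in draft.text.lower():
        return "corruption"
    return draft.issue


docs = ["Oil spill reported near the river", "Official accepted a bribe"]
validated, feedback = run_workflow(docs, demo_analyst)
```

In this sketch only the cases the analyst corrected end up in `feedback`, which mirrors the idea that human input is what drives further model fine-tuning.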
Use case key beneficiaries
☒ Relationship Managers
☒ Portfolio Managers
☒ Research teams, macroeconomists
☒ Control functions
☒ Support functions (HR, CFO, …)
Benefits of AI use case for financial services sector
The resulting dataset is characterized by:
- Relevant coverage of diverse and curated sources, including media, NGOs, and regulatory reports.
- High accuracy: validation through expert review to reduce false positives and contextual misinterpretations.
- Timeliness: Fast updates to capture emerging risks as they occur.
- Historical benchmarks: in-house hosted, longitudinal data for time series and trend analysis, backtesting, auditability, and traceability.
Banks use the solution to embed business conduct risk into credit processes and policies – supporting origination, KYC, lending, and reputational risk management. Daily updated data enables early risk detection, protects the balance sheet, and guides escalation from onboarding to sector policies – ensuring financing aligns with risk appetite and sustainable finance commitments.
Asset managers integrate the solution to monitor investee behaviour, surface risk signals, and incorporate business conduct risk into research, stewardship, and portfolio construction. Insights help identify outliers, prioritize engagements, meet regulatory expectations, and support a forward-looking investment strategy.
Supporting technology
The solution applies an orchestrated multi-model approach that integrates in-house trained and hosted transformer-based models, fine-tuned LLM capabilities, Retrieval-Augmented Generation (RAG), advanced prompt engineering, and AI agents. To ensure the quality of the models and data, a multi-layered rating system is employed – combining automated assessments with expert-curated evaluation and refinement.
The solution integrates five complementary technical functions:
Data ingestion and pre-processing
- Automated harvesting of structured and unstructured data: Internal pipelines collect model outputs, entity masters, and analyst-validated signals to form a unified, structured input for processing
- Multilingual normalization and deduplication: A transformer-based standardized workflow harmonizes metadata, deduplicates content, resolves entities, applies taxonomy tags, and aligns classifications before human QA
- Entity linking for companies, projects, and locations by leveraging models trained on manually labeled data
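To make the deduplication step more concrete, the sketch below shows a deliberately simplified variant: exact-match deduplication after Unicode and whitespace normalization. It swaps in plain hashing for the transformer-based workflow described above, so it only catches near-identical copies; the function names are hypothetical.

```python
import hashlib
import unicodedata
from typing import List


def normalize(text: str) -> str:
    """Apply Unicode NFKC normalization, case folding, and whitespace collapsing."""
    text = unicodedata.normalize("NFKC", text)
    return " ".join(text.casefold().split())


def deduplicate(docs: List[str]) -> List[str]:
    """Keep the first occurrence of each document, comparing normalized text."""
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


docs = [
    "Factory fined for  pollution",
    "factory fined for pollution",   # same content, different case and spacing
    "Mine workers strike",
]
unique = deduplicate(docs)
```

A semantic approach would additionally group paraphrased or translated versions of the same story, which this exact-match sketch does not attempt.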
Proprietary model orchestration
- Transformer-based models trained on risk-specific datasets
- Classifiers for severity scoring
- Embedding-based similarity matching for incident clustering
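Embedding-based incident clustering can be illustrated with a small greedy scheme: each document vector joins the first cluster whose seed vector it resembles closely enough, otherwise it starts a new cluster. The toy three-dimensional vectors, the 0.9 threshold, and the greedy strategy are all assumptions for this sketch; real embeddings come from a trained model and the clustering method in use may differ.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(embeddings: List[List[float]], threshold: float = 0.9) -> List[List[int]]:
    """Greedy clustering: compare each vector with the first member of every
    existing cluster and join the first one that meets the threshold."""
    clusters: List[List[int]] = []
    for i, vec in enumerate(embeddings):
        for members in clusters:
            if cosine(vec, embeddings[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])  # no similar cluster found, start a new one
    return clusters


# Toy vectors: the first two point in nearly the same direction.
embeddings = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.1], [0.0, 1.0, 0.0]]
clusters = cluster(embeddings)
```

Grouping reports this way lets several articles about the same incident be treated as one risk event rather than as separate signals.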
LLM integration for analytical workflows
Fine-tuned, connected LLMs interpret user queries, generate summaries, and prepare briefs. LLMs do not replace core risk classification – instead, they operate on pre-validated data that is reviewed by human experts to ensure factuality.
Retrieval-Augmented Generation (RAG)
RAG constrains generative outputs to reference only verified data – anchoring responses in retrieved evidence rather than in the model's probabilistic recall.
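The retrieval side of this idea can be sketched as follows. The example filters a small in-memory store down to analyst-validated records, ranks them by simple word overlap with the query, and builds a prompt that cites record ids. The record layout, the overlap scoring, and the prompt wording are assumptions for illustration; no LLM is called, and a production system would typically use embedding-based retrieval.

```python
from typing import Dict, List, Optional


def retrieve(query: str, records: List[Dict], k: int = 2) -> List[Dict]:
    """Return up to k validated records ranked by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = []
    for rec in records:
        if not rec["validated"]:
            continue  # unvalidated content never reaches the generator
        overlap = len(query_words & set(rec["text"].lower().split()))
        if overlap:
            scored.append((overlap, rec))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in scored[:k]]


def build_prompt(query: str, evidence: List[Dict]) -> Optional[str]:
    """Build a grounded prompt, or return None when there is no evidence."""
    if not evidence:
        return None  # better to decline than to generate without sources
    lines = [f"[{rec['id']}] {rec['text']}" for rec in evidence]
    return (
        "Answer using only the evidence below and cite record ids.\n"
        + "\n".join(lines)
        + f"\nQuestion: {query}"
    )


# Placeholder records; no real company is referenced.
records = [
    {"id": "r1", "text": "Example Mining Co fined for river pollution", "validated": True},
    {"id": "r2", "text": "Example Mining Co accused of river pollution", "validated": False},
    {"id": "r3", "text": "Sample Bank opens new branch", "validated": True},
]
query = "river pollution Example Mining"
evidence = retrieve(query, records)
prompt = build_prompt(query, evidence)
```

Returning `None` when nothing is retrieved reflects the design choice that the generator should not answer without traceable evidence.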
Human-supervised rating and validation
AI innovations allow human experts to concentrate where AI falls short – applying contextual and cultural understanding and adding the depth and nuance only human judgment can provide. This continuous interplay between human and artificial intelligence steadily drives further innovation, as analysts surface the remaining gaps – and engineers close them by improving and fine-tuning the models.