What is the due diligence matrix about?
Below is an interactive version of our SustainableAI due diligence matrix from our paper "Fostering Sustainable Finance with Artificial Intelligence".
The matrix is structured along the three main pillars of sustainable finance: Environmental (E), Social (S) and Governance (G) considerations. It maps these three dimensions to the main steps of a model lifecycle, allowing you to quickly identify key sustainability considerations at each stage of model implementation, for whichever ESG dimension you select.
What use is this matrix?
This SustainableAI due diligence matrix can be applied for internal purposes, when developing your own models, or can serve as a basis for enhanced due diligence on third parties (service providers, investee companies in portfolios, loan recipients for lenders, etc.). It does not claim to be exhaustive; rather, it is a source of inspiration when developing your own frameworks.
How to read and use it?
First, we mapped to each of the three E, S and G dimensions what we see as the key overarching risk (across the whole lifecycle) to be addressed when adopting and implementing AI-powered solutions:
Key Environmental risk: across its lifecycle, how energy intensive or energy efficient is the model, and can it drive negative environmental impact?
Key Social risk: is the social impact of the model’s usage, such as job displacement or bias reproduction/reinforcement, properly addressed?
Key Governance risk: is appropriate governance in place to ensure transparency, explainability and accountability for the model’s outputs?
We then suggest main considerations to help you Measure and Mitigate these risks, at each stage of the AI model lifecycle. The "Measure" sections help you assess the likelihood of the main risk to materialize at this stage of model lifecycle. The "Mitigate" sections support you in evaluating the robustness of the mitigating measures in place to prevent the risk from materializing.
Environment / Select data:
Measure:
- How energy intensive is the data acquisition (collection, storage, preprocessing) process: energy consumption in kWh, carbon footprint in kgCO2?
- What is the grid carbon intensity of the data center locations?
- Are there regular audits on energy consumption at the data curation stage?
- How is the environmental impact of third-party data sources measured?
- How is the energy consumption tracked during data collection and preprocessing?
- How frequently is energy usage audited during data acquisition?
Mitigate:
- Consider geographical location of the data centers and favor locations with low grid carbon intensity
- Favor data centers powered by renewable energy
- Select data centers certified for energy efficiency (e.g., LEED, ISO 50001)
- Remove redundant or irrelevant datasets during curation to reduce storage and computation needs (and in turn energy use)
- Establish and implement policies for reducing the carbon footprint in the data curation phase
- Implement lifecycle analysis during dataset creation
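The kWh and kgCO2 questions above come down to simple accounting once average power draw and grid carbon intensity are known. A minimal sketch, in which all figures (power draw, runtime, grid intensities) are illustrative assumptions, not real measurements:

```python
# Back-of-the-envelope estimate of energy use and carbon footprint for a
# data acquisition job. All numbers below are hypothetical.

def energy_kwh(avg_power_watts: float, duration_hours: float) -> float:
    """Energy consumed in kWh = average power (W) x time (h) / 1000."""
    return avg_power_watts * duration_hours / 1000.0

def carbon_kg(kwh: float, grid_intensity_g_per_kwh: float) -> float:
    """Carbon footprint in kgCO2 = energy (kWh) x intensity (gCO2/kWh) / 1000."""
    return kwh * grid_intensity_g_per_kwh / 1000.0

# Hypothetical example: a 400 W preprocessing server running for 24 hours,
# compared across two candidate regions with different grid intensities.
e = energy_kwh(400, 24)          # 9.6 kWh
high_carbon = carbon_kg(e, 380)  # fossil-heavy grid (~380 gCO2/kWh assumed)
low_carbon = carbon_kg(e, 25)    # hydro-heavy grid (~25 gCO2/kWh assumed)
```

The same arithmetic supports the "consider geographical location" mitigation: the job is identical, but the footprint differs by more than an order of magnitude between the two regions.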
Social / Select data:
Measure:
- What percentage of datasets undergo bias and fairness audits?
- How is data bias assessed during collection?
- How is diversity within training datasets quantified?
- How frequently are external reviews conducted for data ethics?
- How are stakeholders consulted to ensure fairness in data curation?
Mitigate:
- Apply adversarial and other debiasing techniques to datasets (e.g. reweighting, resampling)
- Verify the existence of mechanisms for reporting and correcting biased data
- Screen datasets for social bias and underrepresentation
- Anonymize sensitive data (e.g., demographic information)
- Establish and implement a policy for ensuring diverse, representative datasets
- Make sure that informed consent is obtained for data usage where relevant
- Use active learning to balance representation shifts dynamically
- Embed bias heatmaps and fairness reports in dataset documentation
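Several of the checks above (quantifying diversity, screening for underrepresentation and biased label rates) reduce to simple statistics over group membership. A sketch with made-up records, using the common "80% rule" threshold on the disparate-impact ratio:

```python
from collections import Counter

# Hypothetical labeled records; "group" is a demographic attribute.
records = [
    {"group": "A", "label": 1}, {"group": "A", "label": 0},
    {"group": "A", "label": 1}, {"group": "A", "label": 1},
    {"group": "B", "label": 0}, {"group": "B", "label": 1},
]

# Representation: share of each group in the dataset.
counts = Counter(r["group"] for r in records)
shares = {g: n / len(records) for g, n in counts.items()}

# Positive-label rate per group, and the ratio between the lowest and
# highest rate; the "80% rule" flags ratios below 0.8.
rates = {
    g: sum(r["label"] for r in records if r["group"] == g) / n
    for g, n in counts.items()
}
impact_ratio = min(rates.values()) / max(rates.values())
flagged = impact_ratio < 0.8  # screen before the data reaches training
```

Such summary statistics are also natural contents for the bias heatmaps and fairness reports embedded in dataset documentation.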
Governance / Select data:
Measure:
- Are data privacy and security risks identified during data collection?
- Are there documented data governance frameworks in place?
- How often are data governance policies reviewed?
- What percentage of datasets are subject to independent audits?
- Is there potential for intellectual property infringement in sourced data?
Mitigate:
- Ensure board-level oversight of data policies.
- Involve a data ethics board in data curation decisions.
- Enforce GDPR, CCPA, and other relevant regulations.
- Assign clear accountability across management levels for data breaches or misuse.
- Implement anomaly detection for early identification of fraud or compliance violations.
- Enforce stricter consent and anonymization practices for sensitive features
- Implement traceability principles for the full data lifecycle
- Mandate third-party audits of data governance frameworks
- Apply blockchain or similar immutable logs for dataset changes
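The immutable-log suggestion above does not require a full blockchain: a hash-chained, append-only log already makes any after-the-fact tampering detectable. A minimal sketch (entry contents are hypothetical):

```python
import hashlib
import json

def append_entry(log: list, change: dict) -> None:
    """Append a change record whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps({"prev": prev_hash, "change": change}, sort_keys=True)
    log.append({"change": change, "prev": prev_hash,
                "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify(log: list) -> bool:
    """Recompute every hash in order; any tampering breaks the chain."""
    prev = "0" * 64
    for entry in log:
        payload = json.dumps({"prev": prev, "change": entry["change"]},
                             sort_keys=True)
        if (entry["prev"] != prev or
                entry["hash"] != hashlib.sha256(payload.encode()).hexdigest()):
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, {"action": "add", "dataset": "loans_v1"})
append_entry(log, {"action": "anonymize", "field": "birth_date"})
assert verify(log)
log[0]["change"]["action"] = "delete"  # tampering with history...
assert not verify(log)                 # ...is detected
```

A blockchain adds distributed consensus on top of this idea; for a single organization's dataset changelog, a signed hash chain is often sufficient traceability.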
Environment / Select model:
Measure:
- How is the energy cost and wider environmental impact of different model architectures assessed?
- Are energy-intensive models (e.g., large language models) assessed for sustainability?
- Is hardware energy efficiency assessed during model selection?
- Is there a record of energy consumption (kWh) during model evaluation?
Mitigate:
- Select models based on energy efficiency benchmarks (e.g., CodeCarbon).
- Prioritize small, efficient models over large, energy-intensive models when the task requirements allow.
- Use measures such as model distillation or compression to reduce energy consumption.
- Choose cloud providers based on their environmental policies.
- Apply sparsity techniques to reduce computational needs.
- Analyse training versus inference energy trade-offs to optimize long-term efficiency.
- Select models with adaptive energy scaling features.
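As one concrete instance of the sparsity techniques mentioned above, magnitude pruning zeroes out the smallest weights so that sparse kernels can skip them at inference time, cutting computation and energy. A toy sketch on a made-up weight list:

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest |w|."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, dropped = [], 0
    for w in weights:
        if abs(w) <= threshold and dropped < k:
            pruned.append(0.0)  # small weight: prune it
            dropped += 1
        else:
            pruned.append(w)    # large weight: keep it
    return pruned

# Hypothetical weight vector; pruning 50% keeps only the largest weights.
w = [0.01, -0.5, 0.03, 0.8, -0.02, 0.4]
sparse_w = prune_by_magnitude(w, 0.5)  # [0.0, -0.5, 0.0, 0.8, 0.0, 0.4]
```

In practice pruning is followed by fine-tuning to recover accuracy, and the energy benefit depends on hardware support for sparse operations.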
Social / Select model:
Measure:
- How is potential bias assessed in different models during selection?
- Could the model disproportionately harm certain social groups?
- How do selected models perform across diverse demographic groups?
- How are models tested for disparate impacts across demographic groups?
- What benchmarks are used to evaluate fairness?
- Are external audits conducted on model fairness?
Mitigate:
- Apply adversarial debiasing techniques.
- Integrate fairness metrics and prioritize fairness-aware algorithms during model selection.
- Prefer interpretable models to enhance transparency.
- Select models based on their compatibility with explainable AI methodologies.
- Include model diversity tests simulating different user profiles
Governance / Select model:
Measure:
- Is the model selection influenced by vendor lock-in or conflicts of interest?
Mitigate:
- Ensure model selection processes are transparent and documented.
- Have AI governance committees oversee model selection.
- Follow governance frameworks, such as ISO 42001, during model selection.
- Ensure model selection includes human-centered design input
Environment / Train model:
Measure:
- What tools are used to measure the energy cost of training different models?
- How long does model training take, and how is this optimized?
- How energy intensive is the model training process: energy consumption in kWh, carbon footprint in kgCO2?
- How frequently is energy usage monitored during model training?
- How energy intensive are model inference operations: energy consumption in kWh, carbon footprint in kgCO2?
- Are metrics like FLOPs (Floating Point Operations) or GPU-hours tracked and logged?
- Are deployment sites assessed for environmental impact?
- What type of hardware (e.g., TPUs, GPUs, CPUs) is used for AI training and inference?
Mitigate:
- Deploy AI models on edge devices to reduce reliance on cloud computing.
- Implement dynamic scaling of AI resources to avoid unnecessary computational waste.
- Optimize AI model size and efficiency using techniques such as pruning, quantization, or knowledge distillation.
- Use adaptive computation techniques, such as early stopping or dynamic computation allocation.
- Schedule training sessions during periods of renewable energy availability.
- Make energy efficiency of supporting hardware a key selection criterion.
- Repurpose waste heat from servers (e.g., for heating buildings).
- Schedule AI workloads in a carbon-aware manner, based on grid carbon intensity.
- Choose data center locations based on sustainability criteria.
- Implement carbon-aware scheduling for training models during low-grid demand periods.
- Optimize training schedules for low-carbon grid times
- Move energy-intensive training to regions with lower grid carbon footprint
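Carbon-aware scheduling, recommended in several items above, amounts to picking the training window with the lowest forecast grid carbon intensity. A sketch with a hypothetical 24-hour forecast; a real deployment would query a grid-data provider instead of hard-coding values:

```python
def best_window(forecast, job_hours):
    """Return (start_hour, avg_intensity) of the greenest contiguous window."""
    best_start, best_avg = 0, float("inf")
    for start in range(len(forecast) - job_hours + 1):
        avg = sum(forecast[start:start + job_hours]) / job_hours
        if avg < best_avg:
            best_start, best_avg = start, avg
    return best_start, best_avg

# Hypothetical hourly forecast of grid carbon intensity (gCO2/kWh):
# solar output makes the midday hours the cleanest in this example.
forecast = [420, 410, 400, 390, 380, 370, 330, 280, 220, 170,
            120, 100, 90, 95, 110, 150, 210, 290, 350, 400,
            420, 430, 430, 425]
start, avg = best_window(forecast, 4)  # schedule a 4-hour training job
```

The same window search applies to the related items above on training during low-grid-demand periods; only the forecast source changes.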
Social / Train model:
Measure:
- Are fairness-aware performance metrics defined and measured?
- How has the potential impact of AI model rollout on employment within affected industries/departments been assessed?
- Has an impact study on AI-driven workforce transformation been conducted?
- How is the risk of AI-generated misinformation or deepfakes, or amplification of harmful content assessed?
Mitigate:
- Apply adversarial and other debiasing techniques during training.
- Implement privacy-enhancing technologies (e.g., differential privacy where sensitive data access is a concern, federated learning, homomorphic encryption).
- Establish a workforce transition or reskilling plan for employees affected by automation.
- Enable stakeholder review sessions during model training
- Train staff and users on model limitations and safe use
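As an illustration of the differential-privacy technique listed above, a count query can be protected by adding Laplace noise with scale 1/epsilon, since a count has sensitivity 1 (one record changes it by at most 1). The data and epsilon below are illustrative assumptions:

```python
import math
import random

def dp_count(values, predicate, epsilon, rng):
    """Differentially private count via the Laplace mechanism."""
    true_count = sum(1 for v in values if predicate(v))
    # Inverse-CDF sampling of Laplace(0, 1/epsilon); u lies in [-0.5, 0.5).
    # The degenerate u = -0.5 case (probability ~0) is ignored in this sketch.
    u = rng.random() - 0.5
    noise = -(1 / epsilon) * math.copysign(1, u) * math.log(1 - 2 * abs(u))
    return true_count + noise

rng = random.Random(0)  # fixed seed for a reproducible illustration
salaries = [48_000, 52_000, 61_000, 75_000, 90_000]
noisy = dp_count(salaries, lambda s: s > 60_000, epsilon=1.0, rng=rng)
# noisy is close to the true count (3) but masks any individual record
```

Smaller epsilon means more noise and stronger privacy; production systems would use a vetted library rather than this hand-rolled sampler.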
Governance / Train model:
Measure:
- Are training datasets vulnerable to tampering?
- How are AI ethics guidelines integrated into model training?
- What documentation exists for key decisions during training?
- What governance structures are in place to ensure AI models align with ethical and regulatory standards?
- Have stress tests been conducted to evaluate AI’s susceptibility to misuse?
- How does the AI system comply with human rights frameworks and ethical AI guidelines?
- Have legal risks related to AI liability been assessed?
Mitigate:
- Implement access controls for model training environments.
- Maintain contingency plans for model failures.
- Keep audit logs for all model development stages.
- Ensure AI decisions are explainable to end-users and stakeholders.
- Provide sufficient documentation on how the model works, including its limitations.
- Inform affected users when they are interacting with an AI-driven system.
- Establish mechanisms for users to challenge AI-driven decisions.
- Document the AI system’s reasoning so it can be reviewed by regulators and auditors.
- Define a clear accountability framework specifying who is responsible for AI-driven decisions.
- Conduct human review of AI-generated decisions in high-stakes scenarios.
- Assess models through the lens of explainable AI to identify and analyze existing captured biases.
Environment / Monitor model:
Measure:
- Are there lifecycle assessments (LCA) conducted for the data infrastructure?
- How often are AI models reviewed for efficiency, and what triggers model decommissioning or updates?
- Are environmental impact assessments considered as part of the AI development lifecycle?
- Are energy efficiency evaluations conducted for inference infrastructure?
- Is hardware utilization regularly analyzed to identify and eliminate underutilized resources?
- Are inference pipelines assessed for cumulative energy load?
Mitigate:
- Establish a formal policy for retiring inefficient AI models to free up computing resources.
- Transition inference operations to low-power edge computing to reduce reliance on energy-intensive cloud computing.
- Use model quantization and pruning techniques to lower inference power consumption without reducing accuracy.
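The quantization technique mentioned above maps float weights to low-precision integers, which cuts memory traffic and arithmetic energy at inference. A toy sketch of symmetric int8 quantization (the weights are made up, and a nonzero maximum weight is assumed):

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: scale so that max |w| maps to 127."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [v * scale for v in q]

w = [0.82, -0.41, 0.05, -0.99]
q, scale = quantize_int8(w)
approx = dequantize(q, scale)
# Rounding error per weight is bounded by scale / 2.
max_err = max(abs(a - b) for a, b in zip(w, approx))
```

Real deployments use per-channel scales and calibrate on representative data, but the accuracy/energy trade-off is the same idea.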
Social / Monitor model:
Measure:
- Are real-world data distributions regularly compared with training data to identify drift or emerging biases?
- Are models retrained periodically?
- How frequently is bias re-evaluated during data updates?
- Is there continuous monitoring for fairness drift post-deployment?
- How are feedback loops from users and communities integrated?
- How are updates or modifications to the AI system communicated to stakeholders and end-users?
- Are employees using AI-assisted tools monitored for unintended stress, workload increases, or ethical dilemmas?
- Do users report stress, overload, or decision fatigue using AI tools?
Mitigate:
- Maintain an ongoing process to monitor and correct biases post-deployment.
- Track potential disparities in AI service quality across different demographic groups.
- Conduct fairness audits periodically and implement corrective actions if disparities are detected.
- Establish a process to address comments and concerns from affected groups regarding potential bias in AI outcomes.
- Adapt workforce reskilling and training programs to AI’s evolving role in the organization.
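The drift comparison above can be made concrete with a standard statistic such as the Population Stability Index (PSI), here computed over hypothetical group shares; a PSI above roughly 0.2 is conventionally treated as significant drift worth investigating:

```python
import math

def psi(expected, actual):
    """Population Stability Index over matching category shares
    (assumes all shares are strictly positive)."""
    return sum((a - e) * math.log(a / e)
               for e, a in zip(expected, actual))

train_shares = [0.50, 0.30, 0.20]  # group shares in the training data
live_shares = [0.35, 0.30, 0.35]   # group shares observed in production
score = psi(train_shares, live_shares)
drifted = score > 0.2  # common rule of thumb: > 0.2 = significant drift
```

Running this check on a schedule, per demographic group and per input feature, operationalizes the "fairness drift" monitoring item above.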
Governance / Monitor model:
Measure:
- Are external audits conducted regularly?
- Is there a governance structure for continuous monitoring?
- Have new explainability challenges emerged since deployment, and how are they addressed?
- Are AI models regularly checked for potential privacy leaks (e.g., membership inference attacks, model inversion risks)?
Mitigate:
- Update documentation and model cards to reflect any changes in model behavior.
- Ensure updated models comply with evolving data privacy regulations (e.g., GDPR, CCPA).
- Maintain governance structures that ensure continuous accountability for AI decisions.
- Implement corrective actions when AI is found to amplify harmful or misleading content.
- Use reproducible pipelines with model versioning and rollback procedures
- Make audit trails, changelogs and decision rationales accessible, at least on request