Recommendations
Area for action: enhance access to essential AI infrastructures and tools
Recommendation 1
Governments, research funders and AI developers should improve access to essential AI infrastructures
Access to computing resources has been critical for major scientific breakthroughs, such as protein structure prediction with AlphaFold. Despite this, compute power and data infrastructures for AI research are not equally accessible or distributed across research communities (footnote 11).
Scientists from diverse disciplines require access to infrastructure to adopt more complex AI techniques, to process larger volumes and more varied types of data, and to ensure the quality of AI-based research.
Proposals to improve access have included institutions sponsoring access to supercomputing (footnote 12) and the establishment of regional hubs – akin to a CERN for AI (footnote 13). Wider access can extend the benefits of AI to a greater number of disciplines, improve the competitiveness of non-industry researchers, and contribute towards more rigorous science by enabling reproducibility at scale.
Expanding access to computing must also be informed by environmentally sustainable computational science (ESCS) best practices, including the measurement and reporting of environmental impacts (footnote 14).
Actions to enhance access to AI infrastructures and tools may include:
- Funders, industry partners, and research institutions with computing facilities actively sharing essential AI infrastructures such as high-performance computing power and data resources.
- Relevant stakeholders (eg government agencies, research institutions, industry, and international organisations) ensuring access to high-quality datasets and interoperable data infrastructures across sectors and regions. This could involve advancing access to sensitive data through privacy enhancing technologies and trusted research environments (footnote 15).
- Research funders supporting strategies to monitor and mitigate the environmental impact associated with increased computational demands and advancing the principle of energy proportionality in AI applications (footnote 16).
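The last action point can be made concrete with a simple calculation. The sketch below estimates the carbon footprint of a compute job as energy use multiplied by a grid carbon-intensity factor, in the spirit of the greener computing principles cited above (footnote 14). It is a minimal illustration only: the power draw, data-centre overhead (PUE), and carbon-intensity values are assumed placeholders, not authoritative figures.

```python
# Illustrative sketch: first-order carbon estimate for a compute job.
#   energy (kWh)    = runtime_h * device_power_kw * n_devices * PUE
#   carbon (kgCO2e) = energy * grid carbon intensity
# All default values below are assumptions; replace them with measured
# figures for your hardware, facility, and region.

def job_carbon_kgco2e(runtime_h: float,
                      device_power_kw: float,
                      n_devices: int,
                      pue: float = 1.5,                 # assumed facility overhead
                      grid_kgco2e_per_kwh: float = 0.2  # assumed grid intensity
                      ) -> float:
    """Return a first-order kgCO2e estimate for one compute job."""
    energy_kwh = runtime_h * device_power_kw * n_devices * pue
    return energy_kwh * grid_kgco2e_per_kwh

# Example: 48 hours on 8 accelerators drawing ~0.3 kW each (illustrative).
print(f"Estimated footprint: {job_carbon_kgco2e(48, 0.3, 8):.1f} kgCO2e")
```

Reporting such estimates alongside results, however approximate, makes the computational cost of an analysis visible and supports the principle of energy proportionality.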
Area for action: enhance access to essential AI infrastructures and tools
Recommendation 2
Funders and AI developers should prioritise accessibility and usability of AI tools developed for scientific research
Access to AI does not guarantee its meaningful and responsible use. Complex and high-performance AI tools and methods can be challenging for researchers from non-AI backgrounds to adopt and utilise effectively (footnote 17). Similarly, new skills are needed across the AI lifecycle, such as data scientists who understand the importance of metadata and data curation, or engineers who are familiar with GPU programming for image-based processing.
Taking steps to improve the usability of AI-based tools (eg software applications, libraries, APIs, or general AI systems) should therefore involve a combination of mechanisms that make AI understandable for non-AI experts and build their capacity to use AI responsibly. For example, training should ensure that every scientist is able to recognise when they require specialised data or programming expertise in their teams, or when the use of complex and opaque AI techniques could undermine the integrity and quality of results.
Improving usability can also enhance the role of non-AI scientists as co-designers (footnote 18) – as opposed to passive users – who can ensure AI tools meet the needs of the scientific community. Creating the conditions for co-design requires breaking down disciplinary silos between AI and domain experts through the development of shared languages, modes of working, and tools.
Actions to enhance the usability of AI tools may include:
- Research institutions and training centres establishing AI literacy curriculums across scientific fields to build researchers’ capacity to understand the opportunities, limitations, and adequacy of AI-based tools within their fields and research contexts.
- Research institutions and training centres establishing comprehensive data literacy curriculums tailored to the specific needs of AI applications in scientific research. This involves building capacity for data management, curation, and stewardship, as well as implementation of data principles such as FAIR (Findable, Accessible, Interoperable, and Reusable) and CARE (Collective benefit, Authority to control, Responsibility, and Ethics) (footnote 19).
- Research funders and AI developers investing in strategies that improve understanding and usability of AI for non-AI experts, with a focus on complex and opaque models (footnote 20). This can include further research on domain-specific explainable AI (XAI) or accessible AI tools that enhance access in resource-constrained research environments (footnote 21). A brief illustration of one such technique appears after this list.
- Research institutions, research funders, and scientific journals implementing mechanisms to facilitate knowledge translation across domains and meaningful collaboration across disciplines. This requires a combination of cross-discipline training, mentorship, publication outlets, and funding (eg through bodies such as UKRI’s Cross-Council Remit Agreement, which governs interdisciplinary research proposals) (footnote 22).
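As a hedged illustration of the XAI point above, the sketch below applies permutation importance, a standard model-agnostic explanation technique, using scikit-learn. The dataset and model are arbitrary stand-ins for a domain scientist’s own pipeline, not a recommendation of any particular tool.

```python
# Minimal sketch: model-agnostic explanation via permutation importance.
# The dataset and model are stand-ins; the point is that a non-AI expert
# can rank which input features drive a model's predictions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure the drop in test accuracy:
# features whose shuffling hurts most are the most influential.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```

Techniques of this kind do not fully open the ‘black box’, but they give non-AI experts a first check on whether a model relies on scientifically plausible features.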
Area for action: build trust in the integrity and quality of AI-based scientific outputs
Recommendation 3
Research funders and scientific communities should ensure that AI-based research adheres to open science principles and practices to facilitate AI’s benefits in science
A growing body of irreproducible AI and machine learning (ML)-based studies is raising concerns about the soundness of AI-based discoveries (footnote 23, footnote 24). However, scientists face challenges in improving the reproducibility of their AI-based work. These include insufficient documentation of methods, code, data, or computational environments (footnote 25); limited access to computing to validate complex ML models (footnote 26); and limited rewards for implementing open science practices (footnote 27). This poses risks not only to science but also to society, if the deployment of unreliable or untrustworthy AI-based outputs leads to harmful outcomes (footnote 28).
To address these challenges, AI in science can benefit from following open science principles and practices. For example, the UNESCO Recommendation on Open Science (footnote 29) offers relevant guidelines to improve scientific rigour, while noting that there is not a one-size-fits-all approach to practising openness across sectors and regions. This aligns well with the growing tendency towards adopting ‘gradual’ open models that pair the open release of models and data with detailed guidance and guardrails against credible risks (footnote 30).
Open science principles can also contribute towards more equitable access to the benefits of AI and to building the capacity of a broader range of experts to contribute to its applications for science. This includes underrepresented and under-resourced scholars, data owners, and non-scientist publics.
Further work is needed to understand the interactions between open science and AI for science, as well as how to minimise safety and security risks stemming from the open release of models and data.
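One of the documentation gaps described above, missing records of the computational environment, can be narrowed with lightweight tooling. The sketch below, a minimal illustration rather than a standard, fixes a random seed and writes a small machine-readable record of the Python environment that produced a result.

```python
# Minimal sketch: record the computational environment and seed that
# produced a result, so others can attempt to reproduce it.
import json
import platform
import random
from importlib.metadata import version, PackageNotFoundError

SEED = 42
random.seed(SEED)  # seed any other libraries in use (eg numpy, torch) too

def package_versions(names):
    """Return installed versions for the named packages."""
    out = {}
    for name in names:
        try:
            out[name] = version(name)
        except PackageNotFoundError:
            out[name] = "not installed"
    return out

manifest = {
    "seed": SEED,
    "python": platform.python_version(),
    "platform": platform.platform(),
    "packages": package_versions(["numpy", "scikit-learn", "pandas"]),
}

# Publish this file alongside the paper's code and data.
with open("reproducibility_manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)
```

Manifests of this kind complement, rather than replace, the community checklists and reporting guidelines discussed in the actions below.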
Actions to promote the adoption of open science in AI-based science may include:
- Research funders and research institutions incentivising the adoption of open science principles and practices to improve reproducibility of AI-based research. For example, by allocating funds to open science and AI training, requesting the use of reproducibility checklists (footnote 31) and data sharing protocols as part of grant applications, or by supporting the development of community and field-specific reproducibility standards (eg TRIPOD-AI (footnote 32)).
- Research institutions and journals rewarding and recognising open science practices in career progression opportunities. For example, by promoting the dissemination of negative results, accepting pre-registration and registered reports as outputs, or recognising the release of datasets and documentation as outputs relevant to career progression.
- Research funders, research institutions and industry actors incentivising international collaboration by investing in open science infrastructures, tools, and practices. For example, by investing in open repositories that enable the sharing of datasets, software versions, and workflows, or by supporting the development of context-aware documentation that enables the local adaptation of AI models across research environments (a minimal sketch follows this list). The latter may also contribute towards the inclusion of underrepresented research communities and scientists working in low-resource contexts.
- Relevant policy makers considering ways of deterring the development of closed ecosystems for AI in science by, for example, mandating the responsible release of benchmarks, training data, and methodologies used in research led by industry.
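The context-aware documentation mentioned in the list above could take the form of a small structured record shipped alongside a released model, in the spirit of model cards. The field names below are illustrative assumptions, not an established schema.

```python
# Illustrative sketch: minimal model-card-style record accompanying a
# released research model. Field names are illustrative, not a standard.
import json

model_card = {
    "model_name": "example-protein-classifier",  # hypothetical model
    "version": "1.0.0",
    "training_data": "public dataset X, snapshot 2024-01",  # placeholder
    "evaluation": {"benchmark": "held-out test split", "accuracy": None},
    "intended_use": "research on protein family classification",
    "out_of_scope": ["clinical decision-making",
                     "populations absent from the training data"],
    "known_limitations": ["performance untested outside the source lab"],
    "licence": "CC-BY-4.0",
    "contact": "corresponding author",
}

with open("MODEL_CARD.json", "w") as f:
    json.dump(model_card, f, indent=2)
```

Even a minimal record of intended use and known limitations helps researchers in other contexts judge whether, and how, a model can be responsibly adapted.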
Area for action: ensure safe and ethical use of AI in scientific research
Recommendation 4
Scientific communities should build the capacity to oversee AI systems used in science and ensure their ethical use for the public good
The application of AI across scientific domains requires careful consideration of potential risks and misuse cases. These can include the impact of data bias (footnote 33), data poisoning (footnote 34), the spread of scientific misinformation (footnote 35, footnote 36), and the malicious repurposing of AI models (footnote 37). In addition to this, the resource-intensive nature of AI (eg in terms of energy, data, and human labour) raises ethical questions regarding the extent to which AI used by scientists can inadvertently contribute to environmental and societal harms.
Ethical concerns are compounded by the uncertainty surrounding AI risks. As of late 2023, public debates regarding AI safety had not conclusively defined the role of scientists in monitoring and mitigating risks within their respective fields. Furthermore, varying levels of technical AI expertise among domain experts, and the lack of standardised methods for conducting ethics impact assessments, limit the ability of scientists to provide effective oversight (footnote 38). Other factors include the limited transparency of commercial models, the opaque nature of ML systems, and the risk that the misuse of open science practices could heighten safety and security concerns (footnote 39, footnote 40).
As AI is further integrated into science, AI assurance mechanisms (footnote 41) are needed to maintain public trust in AI and ensure responsible scientific advancement that benefits humanity. Collaboration between AI experts, domain experts and researchers from humanities and science, technology, engineering, the arts, and mathematics (STEAM) disciplines can improve scientists’ ability to oversee AI systems and anticipate harms (footnote 42).
Similarly, engaging with communities represented in, or absent from, AI training datasets can improve the current understanding of the possible risks and harms associated with AI-based research projects.
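The data bias concern raised above can be examined in a simple, disaggregated way: rather than reporting a single aggregate score, compare a model’s accuracy across the groups present in (or absent from) the evaluation data. The sketch below is a minimal illustration using made-up labels.

```python
# Minimal sketch: disaggregate model accuracy by subgroup to surface
# possible data bias. Inputs are made-up stand-ins for a real evaluation.
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return per-group accuracy; large gaps warrant investigation."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}

# Toy example (illustrative data only):
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(accuracy_by_group(y_true, y_pred, groups))  # eg {'A': 0.75, 'B': 0.5}
```

Gaps of this kind do not prove bias on their own, but they indicate where the interdisciplinary and participatory auditing described in the actions below should focus.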
Actions to support the ethical application of AI in science may include:
- Research funders and institutions investing in work that establishes and operationalises domain-specific taxonomies (footnote 43) of AI risks in science, particularly in sensitive fields (eg chemical and biological research).
- Research funders, research institutions, industry actors, and relevant scientific communities embracing widely available ethical frameworks for AI, such as the UNESCO Recommendation on the Ethics of Artificial Intelligence (footnote 44) or the OECD’s ethical guidelines for artificial intelligence (footnote 45), and implementing practices that balance open science with safeguards against potential risks.
- Funders, research institutions and training centres providing AI ethics training and building the capacity of scientists to conduct foresight activities (eg horizon scanning), pre-deployment testing (eg red teaming), or ethical impact assessments of AI models to identify relevant risks and guardrails associated with their field.
- Research funders, research institutions, and training centres supporting the development of interdisciplinary and participatory approaches to safety auditing, ensuring the involvement of AI and non-AI scientists, and affected communities in the evaluation of AI applications for scientific research.
Footnotes
11. Technopolis Group, Alan Turing Institute. 2022 Review of Digital Research Infrastructure Requirements for AI. See: https://www.turing.ac.uk/sites/default/files/2022-09/ukri-requirements-report_final_edits.pdf (accessed 6 February 2024).
12. UKRI. Transforming our world with AI. See: https://www.ukri.org/publications/transforming-our-world-with-ai/ (accessed 6 February 2024).
13. United Nations. 2023 Interim Report: Governing AI for Humanity. See: https://www.un.org/sites/un2.un.org/files/ai_advisory_body_interim_report.pdf (accessed 6 February 2024).
14. Lannelongue L et al. 2023 Greener principles for environmentally sustainable computational science. Nat Comput Sci. 3, 514–521. (https://doi.org/10.1038/s43588-023-00461-y).
15. The Royal Society. 2023 Privacy Enhancing Technologies. See: https://royalsociety.org/topics-policy/projects/privacy-enhancing-technologies/ (accessed 21 December 2023).
16. The Royal Society. 2020 Digital technology and the planet: Harnessing computing to achieve net zero. See: https://royalsociety.org/topics-policy/projects/digital-technology-and-the-planet/ (accessed 21 December 2023).
17. Cartwright H. 2023 Interpretability: Should – and can – we understand the reasoning of machine-learning systems? In: OECD (ed.) Artificial Intelligence in Science. OECD. (https://doi.org/10.1787/a8d820bd-en).
18. UKRI Trustworthy Autonomous Systems Hub. Developing machine learning models with codesign: how everyone can shape the future of AI. See: https://tas.ac.uk/developing-machine-learning-models-with-codesign-how-everyone-can-shape-the-future-of-ai/ (accessed 7 March 2023).
19. Global Indigenous Data Alliance. CARE Principles for Indigenous Data Governance. See: https://www.gida-global.org/care (accessed 21 December 2023).
20. Szymanski M, Verbert K, Vanden Abeele V. 2022 Designing and evaluating explainable AI for non-AI experts: challenges and opportunities. In: Proceedings of the 16th ACM Conference on Recommender Systems. (https://doi.org/10.1145/3523227.3547427).
21. Korot E et al. 2021 Code-free deep learning for multi-modality medical image classification. Nat Mach Intell. 3, 288–298. (https://doi.org/10.1038/s42256-021-00305-2).
22. UKRI. Get support for your project: If your research spans different disciplines. See: https://www.ukri.org/apply-for-funding/how-to-apply/preparing-to-make-a-funding-application/if-your-research-spans-different-disciplines/ (accessed 13 December 2023).
23. Haibe-Kains B et al. 2020 Transparency and reproducibility in artificial intelligence. Nature. 586, E14–E16. (https://doi.org/10.1038/s41586-020-2766-y).
24. Kapoor S, Narayanan A. 2023 Leakage and the reproducibility crisis in machine-learning-based science. Patterns. 4(9). (https://doi.org/10.1016/j.patter.2023.100804).
25. Pineau J et al. 2021 Improving reproducibility in machine learning research (a report from the NeurIPS 2019 reproducibility program). Journal of Machine Learning Research. 22(164).
26. Bommasani R et al. 2021 On the opportunities and risks of foundation models. See: https://crfm.stanford.edu/assets/report.pdf (accessed 21 March 2024).
27. UK Parliament. Reproducibility and Research Integrity – Report Summary. See: https://publications.parliament.uk/pa/cm5803/cmselect/cmsctech/101/summary.html (accessed 7 February 2024).
28. Sambasivan N et al. 2021 “Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI. In: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems.
29. UNESCO. 2021 Recommendation on Open Science. See: https://www.unesco.org/en/legal-affairs/recommendation-open-science (accessed 6 February 2024).
30. Solaiman I. 2023 The gradient of generative AI release: Methods and considerations. In: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency. 111–122. (https://doi.org/10.48550/arXiv.2302.04844).
31. McGill School of Computer Science. The Machine Learning Reproducibility Checklist v2.0. See: https://www.cs.mcgill.ca/~jpineau/ReproducibilityChecklist.pdf (accessed 21 December 2023).
32. Collins G et al. 2021 Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open. 11(7), e048008. (https://doi.org/10.1136/bmjopen-2020-048008).
33. Arora A, Barrett M, Lee E, Oborn E, Prince K. 2023 Risk and the future of AI: Algorithmic bias, data colonialism, and marginalization. Information and Organization. 33. (https://doi.org/10.1016/j.infoandorg.2023.100478).
34. Verde L, Marulli F, Marrone S. 2021 Exploring the impact of data poisoning attacks on machine learning model reliability. Procedia Computer Science. 192, 2624–2632. (https://doi.org/10.1016/j.procs.2021.09.032).
35. Truhn D, Reis-Filho JS, Kather JN. 2023 Large language models should be used as scientific reasoning engines, not knowledge databases. Nat Med. 29, 2983–2984. (https://doi.org/10.1038/s41591-023-02594-z).
36. The Royal Society. 2024 Red teaming large language models (LLMs) for resilience to scientific disinformation. See: https://royalsociety.org/news-resources/publications/2024/red-teaming-llms-for-resilience-to-scientific-disinformation/.
37. Kazim E, Koshiyama AS. 2021 A high-level overview of AI ethics. Patterns. 2. (https://doi.org/10.1016/j.patter.2021.100314).
38. Wang H et al. 2023 Scientific discovery in the age of artificial intelligence. Nature. 620, 47–60. (https://doi.org/10.1038/s41586-023-06221-2).
39. Solaiman I. 2023 The gradient of generative AI release: Methods and considerations. In: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency. 111–122. (https://doi.org/10.48550/arXiv.2302.04844).
40. Vincent J. 2023 OpenAI co-founder on company’s past approach to openly sharing research: ‘We were wrong’. The Verge. See: https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-launch-closed-research-ilya-sutskever-interview (accessed 21 December 2023).
41. Brennan J. 2023 AI assurance? Assessing and mitigating risks across the AI lifecycle. Ada Lovelace Institute. See: https://www.adalovelaceinstitute.org/report/risks-ai-systems/ (accessed 30 September 2023).
42. The Royal Society. 2023 Science in the metaverse: policy implications of immersive technology. See: https://royalsociety.org/news-resources/publications/2023/science-in-the-metaverse/.
43. Weidinger L et al. 2022 Taxonomy of risks posed by language models. In: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency. 214–229. (https://doi.org/10.1145/3531146.3533088).
44. UNESCO. 2022 Recommendation on the ethics of artificial intelligence. See: https://www.unesco.org/en/artificial-intelligence/recommendation-ethics (accessed 5 March 2024).
45. OECD. Ethical guidelines for artificial intelligence. See: https://oecd.ai/en/catalogue/tools/ethical-guidelines-for-artificial-intelligence (accessed 5 March 2024).