Professor Maria Liakata: How to make AI work for the public good
This comment, authored by Professor Maria Liakata, Professor of Natural Language Processing, explores the opportunities and challenges associated with the large-scale deployment of Artificial Intelligence (AI) across public services in the UK.
The government’s announcement of a significant expansion of AI in the public sector marks a bold step forward. AI is set to play a role in everything from identifying potholes to personalising teaching, with plans to increase its deployment 20-fold over the next five years. While this signals a welcome embrace of AI’s potential, it also underscores the critical need to balance innovation with responsibility.
As someone deeply engaged in research on evaluating and mitigating the limitations of AI systems (such as ChatGPT) based on large language models (LLMs), particularly in sensitive areas like healthcare and law that involve private data and vulnerable individuals, I consider it crucial to ensure these technologies are both effective and ethically sound. This is the aim of our RAi UK Keystone project on Addressing Sociotechnical Limitations of LLMs, which brings together 11 co-investigators with strong technical expertise as well as domain knowledge and expertise in responsible innovation.
Our findings show that evaluation of the effectiveness and appropriateness of generated outputs, across a variety of tasks and use cases, is unsystematic and in many cases inappropriate for assessing the quality of language generation. Our research has shown significant limitations of LLMs in capturing temporal relations as simple as temporal ordering: imagine an LLM-based court-case summarisation system getting the order of events wrong, or a system failing to capture the order of events in medical records or of event mentions in therapy sessions.

We have also revealed important shortcomings in reasoning, even with the latest models such as GPT-4o, which are unable to distinguish between different types of reasoning (e.g. deductive vs. abductive), while the addition of accompanying explanations, e.g. through Chain-of-Thought prompting, mostly hinders rather than helps assessment. This has serious implications: imagine, for example, prescribing a medication or other medical intervention without being able to trust the supporting evidence for the choice. Only recently, a lawyer unknowingly submitted fake AI-generated evidence in presenting a court case and is now facing the consequences.
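To make concrete what a more systematic check might look like, here is a minimal sketch of a temporal-ordering probe of the kind that exposes the failures described above. It is an illustration only, not our project's evaluation code: `query_model` is a hypothetical stand-in for whichever LLM API is under assessment, and the narrative, event labels, and `pairwise_order_accuracy` metric are assumptions made for the example.

```python
# Minimal sketch of a temporal-ordering probe for an LLM.
# Assumes a generic query_model(prompt) -> str callable; swap in a real API.

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call (returns a canned answer here)."""
    return "B, A, C"  # a deliberately wrong ordering, for illustration

def parse_order(answer: str) -> list[str]:
    """Split a comma-separated answer into a list of event labels."""
    return [tok.strip() for tok in answer.split(",")]

def pairwise_order_accuracy(gold: list[str], pred: list[str]) -> float:
    """Fraction of gold event pairs whose relative order the model preserves."""
    pos = {event: i for i, event in enumerate(pred)}
    pairs = [(a, b) for i, a in enumerate(gold) for b in gold[i + 1:]]
    correct = sum(1 for a, b in pairs
                  if a in pos and b in pos and pos[a] < pos[b])
    return correct / len(pairs)

# Events A, B and C occur in that order in the source narrative.
narrative = ("The claimant signed the contract (A). Two weeks later the "
             "goods were delivered (B). Payment was then refused (C).")
gold = ["A", "B", "C"]

prompt = ("Read the text and list the event labels in the order the events "
          "happened, comma-separated.\n\n" + narrative)
pred = parse_order(query_model(prompt))
print(f"pairwise order accuracy: {pairwise_order_accuracy(gold, pred):.2f}")
```

Run over many such narratives, a probe like this turns an anecdote ("the model muddled the timeline") into a measurable error rate, which is the kind of evidence systematic evaluation requires.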
While the opportunities and potential benefits to the public of employing AI in public services could indeed be groundbreaking, a great deal of effort needs to go into rigorous evaluation of the technology and of the implications of its use in different contexts. Moreover, such evaluation must be guided not only by output suitability but also by robust regulatory frameworks for accountability, transparency, and fairness. Academics and projects such as ours have a huge role to play here, as they are funded by, and designed to serve, the public interest rather than company profits. Only if we have sufficient evidence of the actual benefits to the public, and can convince the public of those benefits, can we ensure that AI is a force for good: empowering public services while safeguarding the public interest.