Research Projects of the Dialog Systems and Machine Learning Working Group
DYMO - Dynamic Dialogue Modelling
An ERC Starting Grant awarded by the European Research Council
The dialogue systems currently deployed both in academia and in industry are typically built for fixed domains, requiring a new data set every time the domain changes. End-to-end methods that attempt to solve this problem directly with open-domain chatbots are far from reaching competitive results on goal-oriented tasks. The aim of this project is to tackle the most substantial obstacles on the way to intelligent conversational systems that can expand dynamically. These obstacles span the problems of operating with dynamic knowledge, dynamic policies, rich user models and sophisticated measures of quality.
- Duration: 01.09.2019 - 31.01.2026
- Value: 1.5M Euro
- People: Prof Dr Milica Gasic, Dr Christian Geishauser, Dr Hsien-Chin Lin, Dr Benjamin Ruppik and Renato Vukovic
Trustworthy Integration of Large Language Models in Human-Computer Interactive Systems
Lamarr Fellow Network Ramp-up - a project awarded by the Ministry of Culture and Science of NRW
Large language models achieve impressive results across the entire spectrum of computational linguistics. Their success is based on an innovative combination of three mechanisms: a transformer network, prompt-based learning and reinforcement learning from human feedback. As impressive as the capabilities of these language models are, the problems they bring with them are just as diverse.
Large language models need to be trained on huge data sets. Accessing the models is costly and time-consuming. Their output includes misleading content with no apparent reference to the training data. There is no evidence of the reliability of that output. The models lack the ability to seamlessly incorporate external knowledge into their outputs.
Large language models are therefore not a replacement for conventional human-computer interfaces. However, they have the potential to take the quality and user-friendliness of such interfaces to a whole new level. The central question that we would like to address as part of this project is: How and where is it necessary and responsible to integrate large language models into an interactive human-computer dialogue system? The project therefore lies at the intersection of human-focused artificial intelligence, trustworthy artificial intelligence and linguistic data processing.
- Duration: 01.04.2024 – 31.03.2028
- Value: 599,222.14 Euro
- People: Prof. Dr. Milica Gasic, Dr. Carel van Niekerk and Dr. Hsien-Chin Lin
Curriculum learning in natural language generation
A project awarded by the German Research Foundation (DFG)
State-of-the-art natural language generation (NLG) models address the task of language generation as next-token prediction: given a sequence of tokens, e.g., words, what is the most probable next token? This simple problem formulation, taken together with the large amount of unlabelled data available for training, makes it possible to build big and powerful systems, which has led to today's AI boom. The results are impressive in terms of the naturalness and grammaticality of the generated output, leading to mainstream adoption in real-world applications today. However, the large amount of knowledge learned by neural language models comes at a cost: it manifests in the form of hallucinations, which are outputs that are untrue with respect to real-world knowledge or user intent. It is especially difficult to pinpoint where this problem originates, as NLG models are trained on huge data sets. Hallucination detection and mitigation are becoming increasingly important and urgent topics in the field of natural language processing (NLP) as language models are used and relied upon by millions of users worldwide.
The goal of this project is to better understand the origins of hallucinations and eventually reduce their magnitude and severity in the output of NLG models.
- Duration: 01.08.2025 – 31.07.2028
- Value: 361,852 Euro
- People: Prof Dr Milica Gasic, TBA
Towards Intelligent Dialogue Systems
A Sofja Kovalevskaja grant awarded by the Alexander von Humboldt Foundation
A human can read a book and instantly be able to talk about what he or she has read. Machines, on the other hand, are very good at storing huge amounts of information but not so good at sharing this information with humans in a natural and human-like way. The question we try to answer is how to enrich dialogue system knowledge using non-dialogue unstructured data, and which dialogue system architecture and features are needed to support that.
- Duration: 01.12.2018 - 31.01.2025
- Value: 1.65M Euro
- People: Prof Dr Milica Gasic, Dr Michael Heck, Dr Nurul Lubis, Dr Carel van Niekerk, Dr Benjamin Ruppik and Shutong Feng