Date of Award
Daniel Felix Ritchie School of Engineering and Computer Science, Electrical and Computer Engineering
Mohammad H. Mahoor
Deep neural networks, Artificial intelligence, Natural language processing (NLP), Dialogue management system (DMS)
The advent of deep neural networks has sparked a revolution in Artificial Intelligence (AI), notably with the creation of Transformer models like GPT-X and ChatGPT. These models have surpassed previous methods in various Natural Language Processing (NLP) tasks. As the NLP field evolves, there is a need to further understand and question the capabilities of these models. Text generation, a crucial part of NLP, is critical to research, yet our understanding of it remains limited.
This dissertation focuses on the challenging problem of controlling general behaviors of language models, such as sentiment, topical focus, and logical reasoning. Controlling these properties steers the generative process so that the generated text has greater emotional resonance, better logical consistency, and sustained topical relevance.
To accomplish this objective, I develop a rule-based Dialogue Management System (DMS), Program-R, which operates through the Ryan companionbot. Program-R incorporates controllable elements like sentiment and facial expression recognition. It uses these factors to choose the most fitting response from a set of predetermined dialogues, offering a unique interaction experience.
Next, I develop and assess EmpTransfo, a dialogue system built on deep learning principles that can simultaneously comprehend emotions and generate responses. The proposed methodology is based on training a conditional language model using emotions. I further enrich the model by feeding emotion embeddings into EmpTransfo, thereby setting an emotional context for language generation.
In the next method, I integrate topic modeling information with a pretrained language model. By viewing topic probabilities as the prior and language model probabilities as the likelihood, the resulting probability for topical language generation becomes the posterior. Experimental results show that this topical language generation approach outperforms existing baselines in open text generation.
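The prior, likelihood, and posterior relationship described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `topical_posterior`, the toy vocabulary, and all probability values are hypothetical, and the fallback to the unmodified language model distribution is an assumption made only to keep the example well defined.

```python
def topical_posterior(lm_probs, topic_probs):
    """Combine next-token probabilities with topic probabilities.

    lm_probs:    token -> language model probability (treated as the likelihood)
    topic_probs: token -> topic probability for that token (treated as the prior)
    Returns token -> normalized posterior probability.
    """
    # Posterior is proportional to prior times likelihood.
    unnormalized = {
        token: lm_probs[token] * topic_probs.get(token, 0.0)
        for token in lm_probs
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        # Assumption: if the topic gives no mass to any candidate token,
        # fall back to the language model distribution unchanged.
        return dict(lm_probs)
    return {token: value / total for token, value in unnormalized.items()}


# Hypothetical toy values for a "sports" topic.
lm_probs = {"goal": 0.2, "bank": 0.5, "match": 0.3}
topic_probs = {"goal": 0.6, "bank": 0.1, "match": 0.3}

posterior = topical_posterior(lm_probs, topic_probs)
print(max(lm_probs, key=lm_probs.get))    # language model alone prefers "bank"
print(max(posterior, key=posterior.get))  # topical posterior prefers "goal"
```

In this toy case the language model alone ranks "bank" highest, while the posterior shifts probability toward tokens that the topic favors.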
Finally, I expand the control over text generation to encompass abductive reasoning. Given incomplete observations from real-world situations, the aim is to modify the text generation process to reason about plausible hypotheses that could lead to the given observations. In this study, I propose a deep learning model that takes a novel approach: it combines temporal commonsense reasoning about each observation, drawn from pretrained models.
Copyright Statement / License for Reuse
All Rights Reserved.
Copyright is held by the author. User is responsible for all copyright compliance.
Received from ProQuest
Zandie, Rohola, "Controllable Language Generation Using Deep Learning" (2023). Electronic Theses and Dissertations. 2336.
Artificial intelligence, Computer science