Date of Award

6-15-2024

Document Type

Dissertation

Degree Name

Ph.D.

Organizational Unit

Daniels College of Business

First Advisor

Ryan Elmore

Second Advisor

Kellie Keeling

Third Advisor

Benjamin Williams

Keywords

Corpus, Large language model (LLM), Non-factoid question taxonomy, Retrieval augmented generation (RAG), Vector database

Abstract

Large Language Models have emerged to great fanfare in the Information Technology market. Business and Information Technology leaders are currently exploring ways to apply these models to assist their organizations in executing business processes and generating innovation. Software vendors, consultants, and academics promote various approaches to making Large Language Models work effectively for business. However, little academic literature is available today that quantifies the degree of improvement possible with these domain-specific approaches over the standard capabilities of generalized Large Language Models.

This study seeks to quantify the benefits of one such approach, Retrieval Augmented Generation. It uses a collection of questions across several topics with known reference answers; the context documents for these questions are assembled into a study corpus. Responses are generated by both a standard Large Language Model system and a Retrieval Augmented Generation system, and the quality of the generated responses is analyzed to determine the degree to which the Retrieval Augmented Generation responses differ from those produced by the standard Large Language Model.

Copyright Date

6-2024

Copyright Statement / License for Reuse

All Rights Reserved.

Publication Statement

Copyright is held by the author. User is responsible for all copyright compliance.

Rights Holder

Phillip D. Crippen

Provenance

Received from ProQuest

File Format

application/pdf

Language

English (eng)

Extent

76 pages

File Size

979 KB

Available for download on Wednesday, August 12, 2026
