"Real Time PII Scanning" by John David

Date of Award

Summer 8-24-2024

Document Type

Masters Thesis

Degree Name

M.S. in Computer Science

Organizational Unit

Daniel Felix Ritchie School of Engineering and Computer Science, Computer Science

First Advisor

Rinku Dewri

Second Advisor

Tianjie Deng

Third Advisor

Matt Rutherford

Copyright Statement / License for Reuse

All Rights Reserved

Keywords

Web application, Log data, Personally identifiable information (PII), Extraction, Real-time data analysis, Bulk Extractor, Amazon Web Services (AWS), Cloud platform

Abstract

The growing number of web applications and internet software solutions built on cloud frameworks has led to large volumes of system log messages being generated continuously. These messages may contain sensitive data, which creates an additional security risk for the systems and drives the need to analyze such large volumes of data in real time. Large commercial data monitoring systems can meet these analysis requirements, but they can be costly. We present a solution for analyzing web application log data that ingests the data, processes it, and visualizes the sensitive data found within it in real time. Our solution uses Bulk Extractor, an open source digital forensics utility for bulk data analysis and extraction of sensitive information. We focus specifically on Personally Identifiable Information (PII) artifacts present within log data as the targets for extraction. We call our solution the PII Scanner and present prototype implementations of the scanner on the Amazon Web Services cloud platform, along with results from tests performed on them to demonstrate their effectiveness and explore implementation options.

Copyright Date

8-2024

Publication Statement

Copyright is held by the author. User is responsible for all copyright compliance.

Rights Holder

John David

Provenance

Received from Author

File Format

application/pdf

Language

English (eng)

Extent

128 pgs

File Size

1.9 MB


