Date of Award

2020

Document Type

Masters Thesis

Degree Name

M. S.

Organizational Unit

Daniel Felix Ritchie School of Engineering and Computer Science, Computer Science

First Advisor

Rinku Dewri

Second Advisor

Scott Leutenegger

Third Advisor

Young Jin Lee

Keywords

Change detection, Natural language processing, Privacy policy

Abstract

Privacy policies notify Internet users about the privacy practices of websites, mobile apps, and other products and services. However, users rarely read them and struggle to understand their contents. In addition, the entities that provide these policies are sometimes unmotivated to make them comprehensible. Because these documents are complicated, it is even harder for users to understand and take note of changes of interest or concern when a policy is revised.

With recent developments in machine learning and natural language processing, tools that automatically annotate the sentences of policies have been developed. These annotations can help a user quickly identify and understand relevant parts of a policy. Similarly, a tool can be developed to identify changes between different versions of a policy that are informative for the user. For example, suppose that under a new policy a website will start sharing audio data as well; the proposed tool can make users aware of such important changes. This thesis presents a tool that takes two versions of a privacy policy as input, matches the sentences of one version to the sentences of the other based on semantic similarity, and informs the user of key relevant changes between two matched sentences. We discuss the supervised machine learning models explored to develop a method that annotates the sentences of privacy policies according to expert-identified categories for organizing and analyzing their contents. Different word-embedding and similarity techniques are explored and evaluated to develop a method that matches the sentences of one version of a policy to those of another. The sentence annotations are used to increase the efficiency of the matching process. Methods that detect changes between two matched sentences by analyzing sentence structure are then implemented. We combine the methods developed for annotating policies, matching sentences between two versions of a policy, and detecting changes between sentences to realize the proposed tool.

This research not only shows the potential of machine learning and natural language processing as important tools for privacy engineering but also introduces various techniques that can be applied to any natural language document.
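
The matching and change-detection steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes bag-of-words cosine similarity as a stand-in for the word-embedding similarity the thesis evaluates, a simple best-match search without the annotation-based filtering, and a word-level diff in place of the structural analysis of sentences. The function names, the `threshold` parameter, and the sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words count vectors.

    Assumption: stands in for an embedding-based semantic similarity.
    """
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_sentences(old_sentences, new_sentences, threshold=0.5):
    """Pair each old sentence with its most similar new sentence.

    Returns (old, new, score) tuples; new is None when no candidate
    reaches the (hypothetical) similarity threshold.
    """
    matches = []
    for old in old_sentences:
        best, best_score = None, 0.0
        for new in new_sentences:
            score = cosine_similarity(tokenize(old), tokenize(new))
            if score > best_score:
                best, best_score = new, score
        if best_score < threshold:
            best = None
        matches.append((old, best, best_score))
    return matches


def word_changes(old, new):
    """List word-level insertions, deletions, and replacements."""
    old_tokens, new_tokens = tokenize(old), tokenize(new)
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, old_tokens[i1:i2], new_tokens[j1:j2]))
    return changes


# Hypothetical policy versions, echoing the audio-data example above.
old_policy = [
    "We share your location data with our partners.",
    "We store cookies on your device.",
]
new_policy = [
    "We store cookies on your device.",
    "We share your location and audio data with our partners.",
]

for old, new, score in match_sentences(old_policy, new_policy):
    if new is not None:
        print(old, "->", new, word_changes(old, new))
```

With these sample sentences, the sharing sentence is matched to its revised counterpart and the diff reports the inserted words "and" and "audio", while the unchanged cookie sentence yields no changes. A word-level diff only flags surface edits, so judging whether a change is relevant to the user would still require the kind of annotation and structural analysis the abstract describes.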

Copyright Date

January 2020

Rights Holder

Andrick Adhikari

Provenance

Received from ProQuest

File Format

application/pdf

File Size

116 p.

Recommended Citation

Adhikari, Andrick, "Automated Change Detection in Privacy Policies" (2020). Electronic Theses and Dissertations. 1706.
https://digitalcommons.du.edu/etd/1706

Discipline

Computer science, Artificial intelligence

Included in

Artificial Intelligence and Robotics Commons

Automated Change Detection in Privacy Policies
