A Statistical Approach to Detecting Low-Throughput Exfiltration through the Domain Name System Protocol

Emily Joback; Leslie Shing; Kenneth Alperin; Steven R. Gomez; Steven Jorgensen; Gabe Elkin

Conference Proceedings, Open Access

ACM International Conference Proceeding Series (2020)

DOI: 10.1145/3477997.3478007

Abstract

The Domain Name System (DNS) is a critical network protocol that resolves human-readable domain names to IP addresses. Because the Internet cannot function without it, DNS traffic is typically allowed to bypass firewalls and other security services. The protocol was also not designed for data transfer, so it is not as heavily monitored as other protocols. These properties make DNS an ideal tool for covert data exfiltration by a malicious actor. A typical company or organization sees tens to hundreds of thousands of DNS queries a day in its network traffic. An analyst cannot feasibly sift through such a vast dataset and investigate every domain to ensure its legitimacy, and an attacker can exploit this by hiding traces of malicious activity within a small percentage of total traffic. Recent research in this field has focused on applying supervised machine learning (ML) or one-class classifier techniques to build a predictive model that determines whether a DNS query is used for exfiltration; however, these models require labelled datasets. In the supervised approach, models require both legitimate and malicious data samples, but such models are difficult to train because realistic network datasets containing known DNS exploits are rarely made public. Prior studies instead used curated synthetic datasets, which can introduce bias. In addition, some studies have suggested that ML algorithms perform worse when the two classes are heavily imbalanced, as is the case for DNS exfiltration datasets. In the one-class classifier approach, models require a dataset known to be free of exfiltration data. Our model aims to circumvent these issues by identifying cases of DNS exfiltration within a network without requiring a labelled or curated dataset. By automatically detecting traffic indicative of exfiltration, our approach eliminates the need for a network analyst to sift through a high volume of DNS queries.
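The abstract does not say which statistics the model computes. As a hedged illustration only, the sketch below shows one generic way to flag suspicious domains without labels: group queries by base domain, count the unique subdomains seen under each, and flag domains whose count is a robust outlier (modified z-score using the median and the median absolute deviation). The function names, the two-label base-domain rule, and the 3.5 threshold are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict
from statistics import median


def shannon_entropy(text):
    """Shannon entropy of the characters in text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def split_query(qname):
    """Split a query name into (subdomain, base_domain).

    Assumption: the base domain is the last two labels. A real tool would
    consult the Public Suffix List instead.
    """
    labels = qname.rstrip(".").lower().split(".")
    return ".".join(labels[:-2]), ".".join(labels[-2:])


def score_domains(qnames):
    """Return {base_domain: (unique_subdomain_count, mean_subdomain_entropy)}."""
    subs = defaultdict(set)
    for q in qnames:
        sub, base = split_query(q)
        if sub:  # ignore queries that have no subdomain part
            subs[base].add(sub)
    return {
        base: (len(s), sum(shannon_entropy(x) for x in s) / len(s))
        for base, s in subs.items()
    }


def flag_outliers(scores, threshold=3.5):
    """Flag domains whose unique-subdomain count is a robust upper outlier.

    Uses the modified z-score 0.6745 * (x - median) / MAD, so no labelled
    or known-clean data is required. The threshold is an assumed default.
    """
    counts = [count for count, _ in scores.values()]
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        return []  # no spread to measure against; nothing is flagged
    return sorted(
        base for base, (count, _) in scores.items()
        if 0.6745 * (count - med) / mad > threshold
    )
```

Calling `flag_outliers(score_domains(query_names))` on a day of query names returns the base domains that received unusually many distinct subdomains. The mean entropy is reported alongside the count so an analyst can see whether the subdomains look encoded, but it is not used for flagging in this sketch.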

Citation (APA)

Joback, E., Shing, L., Alperin, K., Gomez, S. R., Jorgensen, S., & Elkin, G. (2020). A Statistical Approach to Detecting Low-Throughput Exfiltration through the Domain Name System Protocol. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3477997.3478007
