Domain Generation Algorithm Detection
Bots communicate with a Command & Control (C2) server to obtain instructions or to exfiltrate gathered data. Because connection attempts that use fixed IP addresses or fixed domain names are easy to block, botnets rely on Domain Generation Algorithms (DGAs). A DGA periodically generates a large number of algorithmically generated domains (AGDs) that serve as rendezvous points with the C2 server. The AGDs are derived pseudo-randomly from a seed, so the botnet herder can predict the generated domain names and register one or more of them in advance. When a bot queries one of the registered domain names, it obtains a valid IP address for its C2 server; all other queries result in Non-Existent Domain (NXD) responses. The use of DGAs makes it much harder to prevent all possible connection attempts of bots to their C2 server.
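The seed-based generation described above can be illustrated with a minimal sketch. It is a toy example and does not reproduce the algorithm of any real malware family: the function name toy_dga, the use of SHA-256, the date-based input, and the reserved ".example" suffix are all assumptions chosen for illustration.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, length: int = 12,
            tld: str = ".example") -> list[str]:
    """Derive `count` pseudo-random domain names from a seed and a date.

    Hypothetical illustration only: anyone who knows the seed and the
    date (bot or botnet herder) computes exactly the same list.
    """
    domains = []
    for index in range(count):
        # Mix seed, date, and a counter so every domain of the day differs.
        material = f"{seed}|{day.isoformat()}|{index}".encode()
        digest = hashlib.sha256(material).digest()
        # Map the first `length` digest bytes to lowercase letters.
        label = "".join(chr(ord("a") + byte % 26) for byte in digest[:length])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("demo-seed", date(2024, 1, 1)):
        print(name)
```

Because the output depends only on the seed and the date, a defender who recovers both could precompute the same list, while a bot that queries the whole list receives NXD responses for every name that was never registered.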
FANCI (Feature-based Automated NXDomain Classification and Intelligence) is a system that can detect infections with DGA-based malware by monitoring NXD responses in DNS traffic. It uses classical machine learning classifiers such as random forests or support vector machines to classify NXDs as either DGA-related or benign. FANCI does not require any additional context information, since the features for classification are extracted solely from the individual NXD that is to be classified.
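As a rough illustration of context-free feature extraction, the sketch below computes a few features from a single domain name. The feature set (length, label count, digit ratio, vowel ratio, character entropy) and the function names are assumptions made for this example; they are not FANCI's actual features.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(domain: str) -> dict[str, float]:
    """Compute illustrative features from the domain name alone.

    No timing, client, or other traffic context is used; the input
    string is the only source of information.
    """
    name = domain.lower().rstrip(".")
    labels = name.split(".")
    chars = name.replace(".", "")
    letters = [c for c in chars if c.isalpha()]
    digit_ratio = sum(c.isdigit() for c in chars) / len(chars) if chars else 0.0
    vowel_ratio = sum(c in VOWELS for c in letters) / len(letters) if letters else 0.0
    return {
        "length": float(len(name)),
        "num_labels": float(len(labels)),
        "digit_ratio": digit_ratio,
        "vowel_ratio": vowel_ratio,
        "entropy": shannon_entropy(chars),
    }


if __name__ == "__main__":
    print(extract_features("qxkzvbnwpjrt.example"))
```

A feature vector of this kind could then be passed to a classifier such as a random forest or a support vector machine, which is the role those models play in the description above.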
Ongoing work aims to further improve FANCI's performance and to add multiclass classification capabilities in order to attribute AGDs to specific malware families. In parallel, we are investigating different deep learning approaches for the binary and multiclass classification tasks.