A Platform for Automated Threat Report Collection and IOC Extraction

A few days ago I came across this project from the University of Madrid. Below is a summary, followed by the entire document. Enjoy the read 🙂

To adapt to a constantly evolving landscape of cyber threats, organizations need to actively collect Indicators of Compromise (IOCs), i.e., forensic artifacts that signal that a host or network might have been compromised. IOCs can be collected through open-source and commercial structured IOC feeds. However, they can also be extracted from a myriad of unstructured threat reports written in natural language and distributed through a wide array of sources such as blogs and social media. This work presents GoodFATR, an automated platform for collecting threat reports from a wealth of sources and extracting IOCs from them. GoodFATR supports 6 sources: RSS, Twitter, Telegram, Malpedia, APTnotes, and ChainSmith. GoodFATR continuously monitors the sources, downloads new threat reports, extracts 41 indicator types from the collected reports, and filters generic indicators to output the IOCs. We propose a novel majority-vote methodology for evaluating the accuracy of indicator extraction tools, and apply it to compare 7 popular tools with GoodFATR’s indicator extraction module. We run GoodFATR over 15 months to collect 472,891 reports from the 6 sources, extract 1,043,932 indicators from the reports, and identify 655,971 IOCs. We analyze the collected data to identify the top IOC contributors and the IOC class distribution. Finally, we present a case study on how GoodFATR can assist in tracking cybercrime relations on the Bitcoin blockchain.
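To make the pipeline in the summary a bit more concrete, here is a minimal Python sketch of the three ideas it mentions: extracting indicators from report text, filtering out generic ones, and combining several tools by majority vote. This is not GoodFATR's code. The regular expressions, the `GENERIC` list, and the function names (`refang`, `extract_indicators`, `filter_generic`, `majority_vote`) are all assumptions made for illustration, and only three indicator types are covered instead of the 41 the paper reports.

```python
import re
from collections import Counter

# Hypothetical, simplified patterns (assumption): a real extractor handles many more types.
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
PATTERNS = {
    "ipv4": re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
}

# Illustrative list of generic values (assumption): loopback, a public resolver,
# and the MD5 of the empty file. None of these identifies a specific threat.
GENERIC = {"127.0.0.1", "8.8.8.8", "d41d8cd98f00b204e9800998ecf8427e"}


def refang(text):
    """Undo two common defanging conventions used in threat reports."""
    return text.replace("[.]", ".").replace("hxxp", "http")


def extract_indicators(text):
    """Return a set of (type, value) pairs found in the report text."""
    text = refang(text)
    found = set()
    for kind, pattern in PATTERNS.items():
        for value in pattern.findall(text):
            found.add((kind, value.lower()))
    return found


def filter_generic(indicators):
    """Keep only indicators that are not on the generic list."""
    return {(kind, value) for kind, value in indicators if value not in GENERIC}


def majority_vote(tool_outputs):
    """Keep indicators reported by more than half of the tools.

    One plausible reading of a majority-vote scheme; the paper's exact
    procedure may differ.
    """
    votes = Counter(ind for output in tool_outputs for ind in set(output))
    threshold = len(tool_outputs) / 2
    return {ind for ind, count in votes.items() if count > threshold}


if __name__ == "__main__":
    report = ("C2 at 203[.]0[.]113[.]7, resolver 8.8.8.8, "
              "sample 44d88612fea8a8f36de82e1278abb02f")
    print(sorted(filter_generic(extract_indicators(report))))
```

The split between `extract_indicators` and `filter_generic` mirrors the distinction the summary draws between indicators (everything that matches a pattern) and IOCs (what is left after generic values are removed).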
