Friday, 28 December 2012

Timeline Pivot Points with the Malware Domain List

I thought, as it's the end of the year, it would be a good opportunity to briefly break away from the SANS Forensic Artifact posts I've been writing. In my own time I've been playing around with some code that parses a timeline file for any URLs discovered within it and then compares them with the URLs listed in the Malware Domain List (MDL).

If a match is found, it lists the malicious URL from the MDL along with the description explaining why that URL was listed on the MDL. I'm creating this to make it easier to find the "Pivot Points" that both Rob Lee and Harlan Carvey mention, which serve as anchors for our investigations. Pivot Points can come in a variety of forms and will hopefully give us a starting point or an area of focus. The less time we spend poking around an image, the more time we can spend providing value to our customers or employers.

So to get started, I first downloaded a copy of the Malware Domain List. You can get yourself a copy at the following location -> Once you have the list, create an SQLite database and import the MDL into it. I used the Firefox addon SQLite Manager, which is easy to install:
  1. Create a new database in the same directory as the script called malwaredomainlist.sqlite
  2. Import the MDL from CSV into new table called mdomain
  3. See screenshot below for appropriate field names to use for the table
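If you'd rather script the import than use the SQLite Manager addon, the steps above can be sketched in Python like this. The database filename, the table name (mdomain) and the domain/description fields come from this post; the CSV column positions (domain in column 1, description in column 4) are my assumption about the MDL export, so check them against your copy of the list.

```python
import csv
import sqlite3

def build_mdl_db(csv_path, db_path="malwaredomainlist.sqlite"):
    """Create the mdomain table and load it from an MDL CSV export."""
    conn = sqlite3.connect(db_path)
    # Only the two fields the lookup script queries are kept here;
    # add the remaining MDL columns if you want the full record.
    conn.execute("CREATE TABLE IF NOT EXISTS mdomain (domain TEXT, description TEXT)")
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if len(row) > 4:
                # Assumed layout: column 1 = domain, column 4 = description.
                conn.execute("INSERT INTO mdomain VALUES (?, ?)", (row[1], row[4]))
    conn.commit()
    return conn
```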

Above are some basic steps to get you up and running, and if you review the screenshot you'll see the table and field names I've used. If you decide to use the tool I post, make sure your filename, table name and field names match mine, otherwise you'll generate some errors.

I had a few attempts at tackling how to compare the domains discovered within my timeline to those within the MDL. I felt the only way to do this accurately was to reduce both URLs down to their domain name, including the suffix / TLD / gTLD. I had a few attempts at coding this myself but always found that some domain would break the script at some point. In the end I went with a pre-packaged module -> To install the module, type the following command from a command prompt (assuming, obviously, that you've already installed Perl). Below is the command plus its output:

 ppm install Domain::PublicSuffix  
 Downloading Domain-PublicSuffix-0.07...done  
 Downloading Data-Validate-Domain-0.10...done  
 Downloading Net-Domain-TLD-1.69...done  
 Unpacking Domain-PublicSuffix-0.07...done  
 Unpacking Data-Validate-Domain-0.10...done  
 Unpacking Net-Domain-TLD-1.69...done  
 Generating HTML for Domain-PublicSuffix-0.07...done  
 Generating HTML for Data-Validate-Domain-0.10...done  
 Generating HTML for Net-Domain-TLD-1.69...done  
 Updating files in site area...done  
  11 files installed  

The above module makes use of a Firefox dat file to identify the TLD / suffix of a domain. So in order for the script to work you'll also need to download this dat file, which you can find at the following -> and save it in the same directory as the script.
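To illustrate why a plain split on dots isn't enough, here is a toy Python sketch of what the module does: reduce a hostname to its registered ("root") domain by matching against a public-suffix list. The tiny hardcoded suffix set and the function name `root_domain` are mine; the real module loads the full effective_tld_names.dat file instead.

```python
from urllib.parse import urlparse

# Toy stand-in for the Mozilla public suffix list -- the real file
# contains thousands of entries, including multi-label suffixes.
SUFFIXES = {"com", "org", "net", "co.uk", "com.au"}

def root_domain(url):
    """Reduce a URL to its registered domain (root domain + public suffix)."""
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Find the longest matching public suffix, then keep one extra label;
    # a naive "last two labels" rule would mangle e.g. example.co.uk.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in SUFFIXES:
            return ".".join(labels[max(i - 1, 0):])
    return host
```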

Now that we have our database sorted, I've created the following script. At this point it's still a work in progress and I haven't commented it very well. As always, my code is provided "as is" and I offer no additional support or responsibility for the output it provides. I'm no coding guru and always appreciate feedback on a better or more efficient way of doing things, so feel free to shout out.

 #! c:\perl\bin\perl.exe
 use strict;
 use Domain::PublicSuffix;
 use DBI;
 use Getopt::Long;
 use Regexp::Common qw /URI/;
 use URI;
 use List::MoreUtils qw/ uniq /;

 my %config = ();
 GetOptions(\%config, qw(file|f=s system|s=s user|u=s help|?|h));

 if ($config{help} || ! %config) {
      _syntax();
      exit 1;
 }
 die "You must enter a path.\n" unless ($config{file});
 #die "File not found.\n" unless (-e $config{file} && -f $config{file});

 my $file = $config{file};
 my @uniq_domains;

 # Domain::PublicSuffix reduces a hostname to its registered (root) domain
 # using Mozilla's effective_tld_names.dat file.
 my $suffix = new Domain::PublicSuffix ({
      'data_file' => 'effective_tld_names.dat'
 });

 # Pull every HTTP/HTTPS URL out of the timeline and reduce it to a root domain
 open( my $fh, '<', $file ) or die "Can't open $file: $!";
 while ( my $line = <$fh> ) {
      my @url = $line =~ m/($RE{URI}{HTTP}{-scheme => qr(https?)})/g;
      next unless @url;
      my $temp_domain = URI->new( $url[0] );
      my $domain = $temp_domain->host;
      push @uniq_domains, getDomain($domain);
 }
 close $fh;

 # Check each unique timeline domain against the MDL table
 my $db = DBI->connect("dbi:SQLite:dbname=malwaredomainlist.sqlite","","") || die( "Unable to connect to database\n" );
 my @unique = uniq @uniq_domains;
 foreach ( @unique ) {
      if ($_) {
           my $all = $db->selectall_arrayref("SELECT domain,description FROM mdomain WHERE domain LIKE '%$_%'");
           foreach my $row (@$all) {
                my ($maldomain,$description) = @$row;
                # Strip any path, then any port, so only the hostname remains
                my @splitdomain = split('/',$maldomain);
                @splitdomain = split(':',$splitdomain[0]);
                my $tempmdomain = getDomain($splitdomain[0]);
                if ($_ eq $tempmdomain) {
                     print $_.",".$maldomain.",".$description."\n";
                }
           }
      }
 }

 sub getDomain {
      my $root = $suffix->get_root_domain($_[0]);
      return $root;
 }

 sub _syntax {
      print<< "EOT";
 Produce list of malware domain hits from timeline output
 -f file..................path to timeline file
 -h ......................Help (print this information)
 **All times printed as GMT/UTC
 copyright 2012 Sploit
 EOT
 }

At this point if you run a command such as the following:

 perl malwaredomainlist.pl -f timeline.csv > output.txt  

You'll be presented with output in CSV format (assuming my instructions made sense) where the fields presented are the domain in question, the complete malware domain URL and the description/comments. In my sample output the hits carried descriptions such as "RFI", "Mebroot calls home", "Rogue" and "compromised site directs to exploits".

As you can see from the above, some domains will consistently generate false positives. My script grabs the unique URLs listed within a timeline, and certain common domains will almost always be among them.
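One way to tame those false positives is to post-process the script's CSV output against a small allowlist of domains you expect to see in every timeline. This is a Python sketch rather than part of the Perl tool above; the allowlist contents and the function name `filter_hits` are placeholders of mine.

```python
# Domains you expect in nearly every timeline; populate with whatever
# benign domains keep turning up in your own output.
ALLOWLIST = {"example.com"}

def filter_hits(lines):
    """Drop output rows whose leading domain field is on the allowlist."""
    # Each row from the tool is "domain,malware_url,description".
    return [l for l in lines if l.split(",", 1)[0] not in ALLOWLIST]
```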

At this point I'm not sure of the value of this tool. It's fairly quick to run, and if you find yourself with a massive timeline file and you're not sure where to start, then this might potentially be your next best bet. While I'm still tweaking the code I haven't created the executable version of it yet; however, I have uploaded the code to my Google Code repository to save you any issues with copying the source above.

Hopefully you get some value out of the tool; please let me know if you have any success using it. In the meantime I'll continue to tweak and update the code. It would be nice to have an option to download a fresh MDL and update the database. Overall this wouldn't take long to do manually, but it would be nice for it to be automatic.
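As a sketch of that update idea: once you've fetched a fresh copy of the MDL CSV (by whatever means you prefer), repopulating the table could look like the Python below. The table and field names follow this post; the CSV column positions (domain in column 1, description in column 4) remain my assumption about the MDL export.

```python
import csv
import io
import sqlite3

def refresh_mdl(csv_text, db_path="malwaredomainlist.sqlite"):
    """Wipe and repopulate the mdomain table from fresh MDL CSV text."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS mdomain (domain TEXT, description TEXT)")
    # Simplest refresh strategy: delete everything and reinsert, so stale
    # entries that were removed from the list don't linger in the table.
    conn.execute("DELETE FROM mdomain")
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) > 4:
            conn.execute("INSERT INTO mdomain VALUES (?, ?)", (row[1], row[4]))
    conn.commit()
    conn.close()
```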


  1. Excellent post! This really illustrates how timelines can be used to augment and enhance other forms of analysis, and vice versa. Great job!

  2. Agree, great post and idea.

    Using the new log2timeline, plaso, you can filter on URL in an event one-off; however, not in batch like this. Perhaps we could add a way to point it to a db or text file to "filter" against. Think post-processing.

    Also, you could do a string search in 4n6time; however, this is not as intuitive as your method. Perhaps I should add a way to search for these more easily, or even integrate with the malware domain list. Hmmm, so many ideas :-)

    1. Hi Davnads,

      I'm so glad I could give you some ideas that may in some way contribute to 4n6time. My implementation could do with some code tweaking, but for someone of your coding ability that shouldn't be a problem. I have done similar implementations to the above using MSSQL and imported proxy log files.