Tagging and Timelines: Part 1
During a recent attempt at answering the Honeynet Log Mysteries Challenge, I wrote a series of reasoned analyses for the supplied Honeynet logging data. Unfortunately, teaching workloads stopped me from submitting any realistic challenge answer.
Inspired by the idea of applying the Scientific Method to Digital Forensics (see Casey2009 and Carrier2006) and using data visualisation (see Conti2007 and Marty2008), I set about attempting to apply the same principles to analysing the Log Mysteries data sets.
When analysing the auth.log sudo commands, we often want to group these commands by (for example) their intended system or user functionality. Most modern operating systems employ some form of package management system to maintain and update installed commands. If we can correctly relate commands to their corresponding packages, we can use package meta-information to extract descriptions of system and user functionality, which the initial command then inherits by package association.
In this blog post, we intend to tag and classify the auth.log sudo commands using information screen scraped from Debian's package tagging project, debtags (see the vocabulary file for a description of each Debian tag). In the next blog post, we plan to use these debtag-derived taggings to locate interesting events using a timeline.
In Apache2 Version Analysis: Ubuntu Packaging, we concluded that we were working with an Ubuntu server. Both Debian and Ubuntu use .deb-formatted packages and target a file system organised in a manner similar to the Linux Standard Base. However, in this article, we choose not to quantify the similarities and differences of these filesystems under their respective package managers. Instead, we opt to build an approximate model, using the Debian package management system to define a mostly correct map from commands to (Debian) package names. When a sudo command belongs to multiple Debian packages, we apply the principle of minimal common knowledge and choose the intersection of all the debtag lists as the mapping's result.
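The intersection rule can be sketched in a few lines of plain Ruby. This is a minimal illustration rather than the application's own code: the helper name common_debtags is hypothetical, and the tag lists are made up for the example.

```ruby
# Hypothetical helper: given one debtag list per candidate package,
# keep only the tags that every package agrees on.
def common_debtags(tag_lists)
  return [] if tag_lists.empty?
  tag_lists.reduce { |common, tags| common & tags }
end

# Made-up tag lists for two packages that both ship the same command.
lists = [
  ["role::program", "interface::commandline", "use::editing"],
  ["role::program", "interface::commandline", "use::configuring"]
]

p common_debtags(lists)
# => ["role::program", "interface::commandline"]
```

Taking the intersection errs on the side of saying less: a tag is only attached to a command when every package that could have supplied the command carries it.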
A basic Rails application has been implemented to hold the parsed sudo events from sanitized_log/auth.log within a Sudo model. The rake task db:seed is used to build up the underlying database, and test:units is used to verify that the database has been correctly built (the data on this page has been verified against the master copy using these unit tests).
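To give a feel for what a parsed sudo event contains, here is a minimal sketch of pulling the fields out of one auth.log line. It assumes the usual syslog layout that sudo writes (user, TTY, PWD, USER and COMMAND separated by " ; "). The constant SUDO_LINE, the helper parse_sudo_line, the host name and the sample line are all hypothetical and are not taken from the application or the challenge data.

```ruby
# Assumed layout of a sudo entry in auth.log (extended regex, named groups).
SUDO_LINE = /
  \A(?<timestamp>\w{3}\s+\d+\s[\d:]{8})\s  # syslog timestamp
  (?<host>\S+)\ssudo:\s+                   # host name, then the sudo tag
  (?<user>\S+)\s:\s                        # invoking user
  TTY=(?<tty>\S+)\s;\s
  PWD=(?<pwd>.+?)\s;\s                     # working directory
  USER=(?<run_as>\S+)\s;\s                 # target user
  COMMAND=(?<command>.+)\z                 # command with its arguments
/x

# Hypothetical helper: returns a hash of fields, or nil for lines that
# are not sudo command entries (for example sshd or PAM session lines).
def parse_sudo_line(line)
  match = SUDO_LINE.match(line.chomp)
  return nil unless match
  match.names.map(&:to_sym).zip(match.captures).to_h
end

# Illustrative line only; it does not come from sanitized_log/auth.log.
sample = "Apr 19 05:41:46 host1 sudo:    user1 : TTY=pts/0 ; " \
         "PWD=/opt/software/web ; USER=root ; COMMAND=/usr/bin/tee /tmp/example.conf"

event = parse_sudo_line(sample)
puts event[:user], event[:pwd], event[:command]
```

Lines that do not follow this layout, such as failed password attempts, simply return nil in this sketch.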
Within this Rails application, the Sudo model uses:
- the function package_lookup to define our command to Debian package name mapping by screen scraping dpkg
- the function debtags_lookup to define a Debian package name to debtag list mapping by screen scraping debtags
- the virtual attribute debian_tags to map the current model instance (ie. an auth.log entry) to a list of associated debtags via the previous functions.
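As a rough illustration of the first mapping, the sketch below parses one line of output in the style of dpkg -S, which prints the owning package names followed by the file path. Whether the application scrapes exactly this output is an assumption on our part, and the helper name packages_from_dpkg_line is hypothetical rather than taken from the Sudo model.

```ruby
# Hypothetical parser for one line of `dpkg -S <path>`-style output,
# assumed to look like "pkg1, pkg2: /path/to/file".
def packages_from_dpkg_line(line)
  owners, separator, path = line.strip.partition(": ")
  # Ignore lines without an "owners: path" shape, and diversion notices.
  return [] if separator.empty? || !path.start_with?("/")
  return [] if owners.start_with?("diversion ")
  # Drop any architecture qualifier, e.g. "libc6:amd64" becomes "libc6".
  owners.split(", ").map { |name| name.split(":").first }
end

p packages_from_dpkg_line("coreutils: /usr/bin/tee")
# => ["coreutils"]
```

A path owned by several packages yields several names, which is the case where the intersection of debtag lists comes into play.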
tag:with:debtags is used to build up tagging relationships for commands, Debian packages and debtags, using the virtual attribute debian_tags. The Rails gem acts-as-taggable-on provides our tagging implementation here.
Using this code framework, we can now extract frequency data (see the honeynet controller's index view) using the following ActiveRecord code pattern:
Sudo.tagged_with(tag_list).tag_counts_on(tag_context)
.map { |t| [t.name, t.count] }
tag_list is the list of tags that we wish to filter on and tag_context is one of the tagging contexts commands, packages or debtags.
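For readers without the Rails application to hand, the filter-then-count shape of this query can be approximated in plain Ruby. The sketch does not use acts-as-taggable-on; it assumes that filtering requires every tag in tag_list to be present in some context. The Event struct, the helper tag_counts and the sample records are hypothetical.

```ruby
# Hypothetical stand-in for a tagged Sudo record: a hash mapping each
# context (:commands, :packages, :debtags) to a list of tag names.
Event = Struct.new(:tags)

# Keep events carrying every tag in tag_list (in any context), then count
# how often each tag of tag_context occurs among the selected events.
def tag_counts(events, tag_list, tag_context)
  selected = events.select do |event|
    (tag_list - event.tags.values.flatten).empty?
  end
  counts = Hash.new(0)
  selected.each do |event|
    event.tags.fetch(tag_context, []).each { |name| counts[name] += 1 }
  end
  # Most frequent first, ties broken alphabetically.
  counts.sort_by { |name, count| [-count, name] }
end

# Made-up records; the debtag assignments are illustrative only.
events = [
  Event.new({ commands: ["tee"], packages: ["coreutils"],  debtags: ["role::program"] }),
  Event.new({ commands: ["tee"], packages: ["coreutils"],  debtags: ["role::program"] }),
  Event.new({ commands: ["svn"], packages: ["subversion"], debtags: ["role::program"] })
]

p tag_counts(events, ["role::program"], :commands)
# => [["tee", 2], ["svn", 1]]
```

Dividing each count by the number of selected events gives percentage figures of the kind used in frequency analysis.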
The resulting taggings, along with their subsequent frequency analysis, can be seen by clicking on the image below:
Using this tagging frequency data, we are now able to make the following observations:
- Command Usage Overview: 60.1% of all sudo commands involve restarts of Apache2, the tee command and subversion; 76.5% of administrator sudo activity is due to configuration; 96.1% of sudo commands work with the HTML format and 73.1% work on files; 60.9% of all sudo commands are implemented in C, Perl or Python; 37.5% of network related sudo commands are clients and 76.0% are server commands; 86.8% of all sudo commands involve user1 using root privileges, whilst 0.3% involve dhg (a user associated with the keywords psybnc and eggdrop - see below).
- System Web Directory Location: Ubuntu and Debian both use /var/www as their default (public facing) system web directory. By examining the other pwd directory names, we can see that /opt/software/web has 76.7% of all sudo commands occurring within it (only 0.2% occur within /var/www).
- Keyword Analysis using Snort Rules: ad-hoc keyword searching using the snort rules allows one to discover that the keyword psybnc can be associated with an IRC bouncer program. Based on prior Honeynet challenges (eg. see An Introduction to psyBNC 2.3.1 and Know your Enemy: Web Application Threats), we further have that both psybnc and eggdrop can be associated as parts of attacker kits for use in the post-compromise phase of an attack. We use these observations as the basis for tagging log events as threats.
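The keyword-based threat tagging described in the last observation can be sketched as follows. The two keywords come from the text above; the helper name threat_keywords and the sample commands are hypothetical, and the real application may well tag events differently.

```ruby
# Keywords associated with attacker kits in the observations above.
THREAT_KEYWORDS = %w[psybnc eggdrop].freeze

# Hypothetical helper: return the threat keywords that appear
# (case-insensitively) anywhere in a command string.
def threat_keywords(command)
  lowered = command.downcase
  THREAT_KEYWORDS.select { |keyword| lowered.include?(keyword) }
end

# Made-up commands, not taken from the challenge logs.
commands = [
  "/usr/bin/tee /tmp/example.conf",
  "/bin/tar xzf /tmp/psyBNC.tar.gz"
]

commands.each do |command|
  hits = threat_keywords(command)
  label = hits.empty? ? "untagged" : "threats (#{hits.join(', ')})"
  puts "#{label}: #{command}"
end
```

A plain substring match like this is deliberately crude. It will flag any command that merely mentions a keyword, so each hit still needs a human look before it is treated as evidence of compromise.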
In the next blog post, we plan to use these debtag-derived taggings to define timelines for use in locating interesting events.
Tools Used
JGR to initially explore and visualise data
Protovis 3.2 to plot graphs in the Rails application.