April 16, 2010

Ad Hoc Queries

Posted in Uncategorized at 4:05 pm by dgcombs

The firewall database I’ve been using has been designed to store very large integers for the IP addresses. IP Addresses that most humans are used to consist of four numbers separated by periods, each number less than 256. In order to store these as a whole number, we’re going to have to do some math. But that kind of math is hard to do on your fingers. Some Internet-aware systems such as MySQL and PHP already have a functions which convert an IP address into an integer. I’ll use their convention to do the same thing directly in MongoDB.

Let’s say 10.20.30.15 is an IP address we want to store as an integer. I’ll multiply the first number by 256 to the third power (256 ** 3), the second number by 256 squared (256 ** 2), the third number by 256 (256 * 1) and the final number by one (256 ** 0). Adding these numbers together, will be a unique integer representation of the IP address. Here’s the JavaScript for this function.
function inet_aton(ip) {     var ips = ip.split(".");     var iplong = ips[0] * Math.pow(256, 3) + ips[1] * Math.pow(256, 2) + ips[2] * 256 + parseInt(ips[3]);     return iplong; }
The parseInt function is used to make sure the string of characters making up the final number of the IP address gets used as a number by JavaScript. The other octets are involved in mathematical operations, but this one is taken by itself (since it is multiplied by one). parseInt is an easy way to convert it to an integer.
The inverse takes the large integer number and returns an IP address.

function inet_ntoa(ip) {

var ip1 = Math.floor(ip / Math.pow(256, 3));

var ip2 = Math.floor(ip % Math.pow(256, 3) / Math.pow(256, 2));

var ip3 = Math.floor(ip % Math.pow(256, 3) % Math.pow(256, 2) / 256);

var ip4 = Math.floor(ip % Math.pow(256, 3) % Math.pow(256, 2) % 256 / 1);

return ip1 + “.” + ip2 + “.” + ip3 + “.” + ip4;

}

Even with these functions at my fingertips, using them is cumbersome. So MongoDB offers a feature called server side code execution. This feature allows you to store JavaScript code and functions along with the database as database values and then execute it using db.eval(). In this case, I want to store the function inet_aton along with the database. In order to do this I’ll use a command like this:

db.system.js.save( { _id : “foo” , value : function( x , y ) { return x + y; } } );

system.js is a special collection within the database ready to save functions for reuse. MongoDB uses the _id field as a unique representation of each item in the database. In this case it is pressed into service as the public name of the server side function. Once saved, I can use inet_aton in any database expression. If I want a count of all transactions where 10.64.0.115 is the source IP address, it isn’t necessary for me to make that quick calculation in my head any more.

> db.logs.find({src:db.eval(“inet_aton(‘10.64.0.115’)”)}).count();

> 2579492

The implementation of FW1-Loggrabber permits the use of a very large integer to store the date/time. This is the way UNIX does it. UNIX calculates time internally by the number of seconds since the beginning of the epoch which it defines as January 1, 1970. Some contend this is based on the date man first walked on the moon. Others believe it is the zeroth second of the first operating system. Either way, it helps index firewall logs but also hinders quick, ad hoc searching.
JavaScript comes to the rescue with a Date feature that makes this conversion in either direction a snap. date2unix converts a date string in the form of “April 1, 1990” to a UNIX integer number representation.

function date2unix(datestr) {

var myDate = new Date(datestr);

return myDate.getTime() / 1000;

}

The following function, unix2date goes the other way, converting a UNIX number of seconds since epoch into a date string, including the number of minutes and seconds.

function unix2date(nbr) {

var myDate = new Date(nbr * 1000);

return myDate.toLocaleString();

}

Now to find all the firewall log entries that occurred at midnight on Tax Day, 2010, I use the command:

> db.logs.find({time:db.eval(“date2unix(‘April 15, 2010’)”)}).count();

73

Combined with a well indexed MongoDB collection, I can now do quick ad hoc searches for specific types of information. So when that over wrought manager steps into my office and furtively discloses he believes one of his best workers is looking for a job on http://www.monster.com… I can tell him. ” Not today, my friend. Not today.”

Posted via email from Meyeview (Posterous Style)

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: