UPDATE 2011-08-11: As of today, Flowtbag is in a working state comparable to NetMate. Development is still active, but Flowtbag is much simpler to set up and run. If you’re not terribly concerned about configuration, it may be a better solution to your problem.
We use NetMate at NIMS to convert capture files of network traffic into flow statistics. NetMate lets you generate a comma-separated value file containing flows from a capture file. A flow is a sequence of packets from a source to a destination (IP and port combination) over a certain period of time (see RFC 2724).
NetMate is an open source tool for *nix systems, and combined with netAI one can easily generate the statistics we use at the lab.
This article should get you started on converting network captures into a set of flows and their attributes including a modified script to capture header lengths.
Installing NetMate with netAI
Head over to NetMate-flowcalc and pick up the NetMate+netAI package. This package includes some modifications I made to the code for readability, and it also calculates the total header length of flows.
Extract the contents, and go to the directory in a terminal.
./configure --enable-threads --prefix=<install dir>
make
make install
If you don’t include --prefix in the configure stage, NetMate will install to the default directory /usr/local/bin/ and you will need to be root for the make install command (so use su, sudo, or whatever).
A million things can go wrong at this stage. Fear not: although NetMate is older than time itself, it will compile. You just need to figure out how to make it compile. If you run into problems, check below for a solution. If it isn’t in here, then either try to fix it yourself, or let me know. Send me an email, leave a comment, or whatever works. If you decide to fix it on your own, please email me with the error and how you fixed it, so that I may include it here.
Make sure you have the following libraries installed:
readline
libpcap
libxml2
libxslt
libcurl
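One quick way to see which of these the build can find is to ask pkg-config before running ./configure. This is only a sketch: check_libs is a helper name I made up, and the module names in the example (readline, libpcap, libxml-2.0, libxslt, libcurl) are assumptions about how your distribution registers these libraries. Older systems may not ship a .pc file for readline or libpcap at all, so treat a "missing" line as a hint rather than proof.

```shell
# Hypothetical helper: print each pkg-config module that cannot be found.
# Returns 0 if every module was found, 1 otherwise.
check_libs() {
    result=0
    for mod in "$@"; do
        # --exists is silent and only sets the exit status.
        if ! pkg-config --exists "$mod" 2>/dev/null; then
            echo "missing: $mod"
            result=1
        fi
    done
    return "$result"
}

# Example (module names are assumptions, adjust for your distribution):
# check_libs readline libpcap libxml-2.0 libxslt libcurl
```

If pkg-config itself is not installed, every module is reported as missing, which is at least a reminder to install it.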
If you’re getting errors related to libtool, the likely cause is a discrepancy between different versions of libtool. Try the following code to rebuild and reconfigure libtool.
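A common way to do this with the standard autotools is autoreconf with the --force and --install options, run from the top of the source directory. I offer it as an assumption, not as a fix tested against NetMate. The helper name regen_build_files is invented for this sketch, and it assumes autoconf, automake, and libtool are installed.

```shell
# Hypothetical helper: regenerate the autotools/libtool files in a source tree.
# Usage: regen_build_files [<source dir>]   (defaults to the current directory)
regen_build_files() {
    srcdir="${1:-.}"
    # autoreconf works from configure.ac (or the older configure.in).
    if [ ! -f "$srcdir/configure.ac" ] && [ ! -f "$srcdir/configure.in" ]; then
        echo "no configure.ac or configure.in in $srcdir" >&2
        return 1
    fi
    # --force regenerates files even if they look up to date;
    # --install copies in missing auxiliary files.
    (cd "$srcdir" && autoreconf --force --install)
}

# Example:
# regen_build_files .
```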
Then start over from the beginning.
error: asm/bitsperlong.h: No such file or directory
This might only be related to Gentoo systems. I haven’t had a whole lot of time to look into it. I know that this is caused by a problem in the kernel headers, and you’ll need to make this link to get it to compile. Be sure to make the /usr/src/linux/include/asm directory if it doesn’t already exist (it likely won’t). This problem is a result of certain headers in the Linux kernel not being updated correctly.
mkdir -p /usr/src/linux/include/asm
ln -snf /usr/src/linux/include/asm-generic/bitsperlong.h /usr/src/linux/include/asm/bitsperlong.h
If everything went well, you can now use NetMate to calculate flow statistics from a capture file.
<prefix>netmate -r <RULE FILE> -f <CAPTURE>
You probably want to use the rule file included in the package called netAI-rules-stats-ni.xml, which is located in the base directory. This creates flows from TCP and UDP packets with a timeout of 600 seconds. Alternatively, if your flows are labelled using the Differentiated Services Code Point (DSCP), you’ll likely want to use the netAI-rules-stats-dscp.xml rule file instead.
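If you have a whole directory of captures, the same command can be wrapped in a loop. The sketch below uses only the -r and -f options shown above. It assumes netmate is on your PATH; NETMATE is an override variable I made up for this example, as is the function name.

```shell
# Path to the netmate binary; NETMATE is a hypothetical override variable.
NETMATE="${NETMATE:-netmate}"

# Run NetMate once per capture file, using the same rule file each time.
# Usage: run_netmate_on <rule file> <capture> [<capture> ...]
run_netmate_on() {
    rules="$1"
    shift
    for capture in "$@"; do
        echo "processing $capture" >&2
        # Stop at the first capture that NetMate fails on.
        "$NETMATE" -r "$rules" -f "$capture" || return 1
    done
}

# Example:
# run_netmate_on netAI-rules-stats-ni.xml captures/*.pcap
```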
IMPORTANT: You’ll need to edit the line
<PREF NAME="Filename">/home/darndt/netmate.out</PREF> to reflect your desired output location.
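If you would rather not open the rule file in an editor, sed can rewrite that element for you. This is a sketch that assumes the Filename preference sits on a single line exactly as shown above; set_netmate_output is a name invented here.

```shell
# Hypothetical helper: print a copy of the rule file with the output path replaced.
# Usage: set_netmate_output <rule file> <new output path> > <new rule file>
set_netmate_output() {
    rules="$1"
    out_path="$2"
    # '|' is the sed delimiter so slashes in the path need no escaping.
    # Paths containing '|', '&' or '\' would need extra escaping.
    sed "s|<PREF NAME=\"Filename\">[^<]*</PREF>|<PREF NAME=\"Filename\">$out_path</PREF>|" "$rules"
}

# Example:
# set_netmate_output netAI-rules-stats-ni.xml "$HOME/flows/netmate.out" > my-rules.xml
```

Writing to a new file rather than editing in place keeps the packaged rule file untouched, so you can always go back to it.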
NOTE: Currently, netmate-flowcalc doesn’t check for a packet in each direction for flows that aren’t TCP, so if you’d like to use other protocols you’ll likely want to remove these instances yourself after processing, with a script or similar.
As mentioned earlier, this particular build will produce an extra two features in addition to the original 38, namely total_fhlen and total_bhlen, which correspond to the total forward header length and total backward header length respectively.
The output, if you used one of the rule files included in the package, will be a comma-separated list of values. The columns correspond to each attribute in the following ARFF header (i.e., column 1 is srcip).
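One possible clean-up script is sketched here. It assumes the column order given in the ARFF header in this article (column 5 is proto, column 8 is total_bpackets), and it assumes that "these instances" means non-TCP flows that never saw a packet in the backward direction. The function name is made up.

```shell
# Hypothetical helper: drop non-TCP flows that have no backward packets.
# Reads NetMate CSV on stdin and writes the filtered CSV to stdout.
# Assumes column 5 is the IP protocol number (6 = TCP) and column 8 is
# total_bpackets, as in the header listed in this article.
drop_one_way_flows() {
    # Keep a row if it is TCP, or if it has at least one backward packet.
    awk -F, '$5 == 6 || $8 > 0'
}

# Example:
# drop_one_way_flows < netmate.out > netmate.filtered.out
```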
Formatting the output file for Weka
This is what the header looks like:
@RELATION <40-flow-features>
@ATTRIBUTE srcip STRING
@ATTRIBUTE srcport NUMERIC
@ATTRIBUTE dstip STRING
@ATTRIBUTE dstport NUMERIC
@ATTRIBUTE proto NUMERIC
@ATTRIBUTE total_fpackets NUMERIC
@ATTRIBUTE total_fvolume NUMERIC
@ATTRIBUTE total_bpackets NUMERIC
@ATTRIBUTE total_bvolume NUMERIC
@ATTRIBUTE min_fpktl NUMERIC
@ATTRIBUTE mean_fpktl NUMERIC
@ATTRIBUTE max_fpktl NUMERIC
@ATTRIBUTE std_fpktl NUMERIC
@ATTRIBUTE min_bpktl NUMERIC
@ATTRIBUTE mean_bpktl NUMERIC
@ATTRIBUTE max_bpktl NUMERIC
@ATTRIBUTE std_bpktl NUMERIC
@ATTRIBUTE min_fiat NUMERIC
@ATTRIBUTE mean_fiat NUMERIC
@ATTRIBUTE max_fiat NUMERIC
@ATTRIBUTE std_fiat NUMERIC
@ATTRIBUTE min_biat NUMERIC
@ATTRIBUTE mean_biat NUMERIC
@ATTRIBUTE max_biat NUMERIC
@ATTRIBUTE std_biat NUMERIC
@ATTRIBUTE duration NUMERIC
@ATTRIBUTE min_active NUMERIC
@ATTRIBUTE mean_active NUMERIC
@ATTRIBUTE max_active NUMERIC
@ATTRIBUTE std_active NUMERIC
@ATTRIBUTE min_idle NUMERIC
@ATTRIBUTE mean_idle NUMERIC
@ATTRIBUTE max_idle NUMERIC
@ATTRIBUTE std_idle NUMERIC
@ATTRIBUTE sflow_fpackets NUMERIC
@ATTRIBUTE sflow_fbytes NUMERIC
@ATTRIBUTE sflow_bpackets NUMERIC
@ATTRIBUTE sflow_bbytes NUMERIC
@ATTRIBUTE fpsh_cnt NUMERIC
@ATTRIBUTE bpsh_cnt NUMERIC
@ATTRIBUTE furg_cnt NUMERIC
@ATTRIBUTE burg_cnt NUMERIC
@ATTRIBUTE total_fhlen NUMERIC
@ATTRIBUTE total_bhlen NUMERIC
@DATA
You have two choices. The first is to change this line in the rule file; the second is to prepend the header to the output file separately, which allows multiple outputs to go to one file.
Automated prepending of header by altering the rule file
This will add a header to each output produced by NetMate.
IMPORTANT: This will create a new header for each file you analyze. If you’d like to analyze several capture files and output them to one location, this can cause problems. This brings forth the benefit of the second method.
Prepend the header manually
I like this method because it allows me to analyze several capture files and output them to one location. Save the above header to a text file. You can then either edit the rule file to output to this header file (all output will be appended), or attach the header when you need the output in .arff format.
cat header netmate.out > netmate.arff
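When several capture files have each produced their own output file, the same idea extends to any number of inputs. The sketch below, with make_arff as a made-up name, writes the header once and then every data file. It also warns on stderr about any row that doesn’t have 44 comma-separated columns, which is the number of attributes in the header above.

```shell
# Hypothetical helper: write the header once, followed by every output file.
# Usage: make_arff <header file> <netmate output> [<netmate output> ...] > flows.arff
make_arff() {
    header="$1"
    shift
    cat "$header"
    # Pass every row through unchanged, but warn about rows whose column
    # count differs from the 44 attributes declared in the header.
    awk -F, '
        NF != 44 { printf "warning: %s line %d has %d columns\n", FILENAME, FNR, NF > "/dev/stderr" }
        { print }
    ' "$@"
}

# Example:
# make_arff header monday.out tuesday.out > week.arff
```

The check only warns and never drops rows, so the result is the same as plain cat with some extra feedback.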
Questions, comments, corrections? Please e-mail me at [dan @ my lastname dot ca] or use the commenting option below.