(disclaimer: the author of Savarin, Matthieu Kaczmarek, is a colleague working in the office next door and a friend of mine)
Savarin is a free online binary classification service (you can think of it as automatic diff’ing against large databases of programs). It is in beta, not fully polished yet, but you can still squeeze some interesting results out of it. Here is your daily shot of binary analysis, freshly brewed.
You will need:
- two different samples from the same malware family. We are going to use Sasser.A (already in Savarin’s database) and an unpacked Sasser.G (md5 b973853d0863070aca89ce00d4ee0fb9, from offensivecomputing.net)
- IDA with IDAPython for the actual diff’ing (I have IDA 5.5, I don’t know if this works with the free version)
To classify the sample, follow these steps:
- open Savarin
- in “Classification against custom database”, choose SasserA
- upload the Sasser.G sample
- in the results page, click More to see the similarity with other binaries in the Sasser family
- you can see that the sample is 41.95% similar to a sample with md5 edc66a4031f5a41f9ddf08595a1d4c92
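Since everything here is keyed on md5 values, it does not hurt to check that the file you uploaded really is the sample listed above. Here is a minimal sketch using Python's standard hashlib; the helper names and the file name in the usage comment are mine, not something Savarin provides.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex md5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    handle = open(path, "rb")
    try:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    finally:
        handle.close()
    return digest.hexdigest()


def matches_md5(path, expected_hex):
    """Compare case-insensitively so a pasted uppercase hash also matches."""
    return md5_of_file(path) == expected_hex.strip().lower()


# Usage (hypothetical file name for the unpacked Sasser.G sample):
# matches_md5("sasser_g_unpacked.exe", "b973853d0863070aca89ce00d4ee0fb9")
```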
At this point, you have a classification of a sample against a (small) database of programs. You can therefore see the distance between this sample and other samples. If you ask me, it’s a lot better to see that unknownsample.exe is 80% similar to badguy.exe and 90% similar to badguy2.0.exe than just “infected” or “not infected”.
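To give a feel for what a percentage like this can mean, here is a toy sketch. It is not Savarin's algorithm (nothing above says how Savarin computes its score); it is a stand-in that treats each program as a list of instruction mnemonics and takes the Jaccard similarity of their n-gram sets. All function names and the input format are assumptions made for illustration.

```python
def ngrams(items, n=3):
    """Return the set of length-n sliding windows over a sequence."""
    return set(tuple(items[i:i + n]) for i in range(len(items) - n + 1))


def similarity_percent(sample, reference, n=3):
    """Jaccard similarity of the two n-gram sets, as a percentage."""
    a, b = ngrams(sample, n), ngrams(reference, n)
    if not a and not b:
        return 100.0  # two inputs too short to yield n-grams count as identical
    return 100.0 * len(a & b) / len(a | b)


def rank_against_database(sample, database, n=3):
    """Return (name, percent) pairs for every reference, most similar first."""
    scores = [(name, similarity_percent(sample, ref, n))
              for name, ref in database.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Ranking an unknown sample against a small dictionary of known ones gives the kind of graded answer described above, instead of a flat “infected” or “not infected”.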
For the actual diff’ing, follow these steps:
- open the Sasser.G sample in IDA
- download the IDAPython analysis report on Savarin’s analysis page (this report contains all the data needed to visualize the binary differences in IDA)
- execute the IDAPython analysis report
- right now, the situation is pretty anticlimactic since you should see no change apart from a few lines in the console. Wait until the next step for the interesting stuff. Yes, you had nothing to do in this step, so what?
- type SavColor('md5.edc66a4031f5a41f9ddf08595a1d4c92', 0x0088ff) in the IDAPython console (the hash is the md5 of the Sasser.A sample; use straight quotes, as curly ones are a Python syntax error)
- type SavComment('md5.edc66a4031f5a41f9ddf08595a1d4c92') in the IDAPython console
- that’s it: you can now browse the Sasser.G sample, and the parts it shares with Sasser.A will be colored. Additionally, each matching instruction shows the corresponding address in the Sasser.A sample.
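If you are curious what the coloring and commenting steps boil down to, here is a rough sketch. This is not the code of Savarin's report: the `matches` dictionary format and the `apply_matches` and `ida_callbacks` helpers are hypothetical. The `idc.SetColor` and `idc.MakeComm` calls are the IDA 5.x-era IDAPython names as I remember them (newer releases renamed them), so treat that wiring as an assumption too.

```python
try:
    import idc  # only importable inside IDA
except ImportError:
    idc = None


def apply_matches(matches, color, set_color, set_comment):
    """Color each matched address and annotate it with its counterpart.

    matches: dict mapping an address in the open sample to the matching
             address in the reference sample (hypothetical format).
    set_color / set_comment: callables doing the actual database update.
    Returns the number of addresses processed.
    """
    for local_ea, remote_ea in sorted(matches.items()):
        set_color(local_ea, color)
        set_comment(local_ea, "matches 0x%08X in reference" % remote_ea)
    return len(matches)


def ida_callbacks():
    """Return (set_color, set_comment) backed by the old-style idc API."""
    if idc is None:
        raise RuntimeError("this helper only works inside IDA")
    # IDA color values are laid out as 0xBBGGRR, so 0x0088ff is an orange.
    set_color = lambda ea, color: idc.SetColor(ea, idc.CIC_ITEM, color)
    set_comment = lambda ea, text: idc.MakeComm(ea, text)
    return set_color, set_comment
```

Passing the two callbacks in explicitly keeps the loop testable outside IDA; inside IDA you would call something like `apply_matches(matches, 0x0088ff, *ida_callbacks())`.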
The Fine Screenshots: