Self Advertisement

———————————————————————-
*** Malware 2010 Call for Papers ***
———————————————————————-
Important dates
———————————————————————-
Submission of papers: June 30, 2010, 23:59:59 EST
Notification of acceptance: August 27, 2010, 23:59:59 EST
Camera-ready paper: September 10, 2010, 23:59:59 EST
Workshop dates: October 20-21, 2010
———————————————————————-
Overview
———————————————————————-
The 5th IEEE International Conference on Malicious and Unwanted
Software (Malware 2010) will be held in Nancy, France, Oct. 20-21,
2010. The conference is designed to bring together experts from
industry, academia, and government to present and discuss, in an open
environment, the latest advances and discoveries in the field of
malicious and unwanted software. Techniques, economics and legal
issues surrounding the topic of Malware, and the methods to detect and
control them will be discussed.
This year’s conference will pay particular attention to (and will also
be extensively discussed in a panel session) the pressing topic of
“Malware and Cloud Computing”.  As low-cost netbooks become popular,
Google’s Chrome OS enters the mainstream, and social networks
(Facebook, YouTube, Twitter, LinkedIn, and so forth) become
ubiquitous, the security dangers associated with the new computing
paradigm increase exponentially.  In effect, “Cloud Computing”,
Multi-tenant, Single Schema, Single Server Platforms (C2S3P) increase
vulnerabilities by providing a single point of failure and attack for
organized criminal networks. Critical/sensitive/private information is
at risk, and very much like previous technology adoption trends, such
as wireless networks, the dash for success is trumping the need for
security.
Thus, the organizers of Malware 2010 solicit original written
contributions addressing these issues and research questions.
Manuscripts focusing on the security properties of Cloud Computing,
the risks associated with the deployment of such networks, and the
analysis of real incidents where a breach has occurred will be
particularly welcomed.
———————————————————————-
Submissions are solicited in, but not limited to, the following areas
———————————————————————-
Theoretical aspects and new directions on Malware related research
Analysis and measurements of real malware incidents
Worms, viruses and other propagating Malware
Spyware, keystroke loggers, information theft Malware
Honeypots and other sample collection methodologies
Botnet attacks, detection/tracking and defense
Malware economics and black market studies
Code reverse engineering tools and practices
Malware performance, analysis and capture tools
Anti-spam and anti-phishing techniques and practices
Legal aspects of unwanted software use
Malware and its impact in social networking and cloud computing
Rootkit and virtualization techniques
Malware in wireless mobile devices
———————————————————————-
Publication
———————————————————————-
The proceedings of the conference will be published in printed and
DVD form and will be included in the IEEE Xplore digital library.  In
addition, the Conference’s Technical Program Committee will select one
manuscript as a recipient of the “Best Paper Award”.  The Best Paper
Award author, together with the authors of a few selected manuscripts
from the conference, will be invited to submit an extended version to
a special issue of the Journal.
———————————————————————-
Paper Submission Information
———————————————————————-
Papers should be submitted through the EDAS system at:
Submitted manuscripts must be 10-point font size, and should not
exceed 8 single-spaced pages in length, including the abstract,
figures, and references. Authors whose manuscript exceeds the 8 page
limit may be allowed to include two additional pages for an extra
charge.  However, under no circumstances shall a submitted manuscript
exceed the 10 page limit. Submitted papers must not substantially
overlap with papers that have been published or that are
simultaneously submitted to a journal or a conference with
proceedings.
———————————————————————-
Additional Information
———————————————————————-
For more information on Malware 2010 or if you are interested in
contributing to the organization of the conference please contact
Dr. Fernando C. Colon Osorio, General Program Chair, Malware 2010. For
information concerning submission of an original manuscript to the
conference, please contact the Technical Program Committee Chairs
(TPC).
———————————————————————-
Chairs of Malware 2010
———————————————————————-
Jean-Yves Marion, Nancy University, France, TPC
Noam Rathaus, Beyond Security, USA, TPC
Cliff Zhou, University of Central Florida, USA, TPC
———————————————————————-
PC members
———————————————————————-
Anthony Arrott, Trend Micro, USA
Pierre-Marc Bureau, ESET, Canada
Mila Dalla Preda, Verona University, Italy
Saumya Debray, Arizona University, USA
Thomas Engel,  University of Luxembourg, Luxembourg
Jose M. Fernandez, Ecole Polytechnique de Montreal, Canada
Olivier Festor, INRIA Nancy Grand-Est, France
Brent Kang, North Carolina University, USA
Felix Leder, Bonn University, Germany
Bo Olsen, Kaspersky, USA
Jose Nazario, Arbor Networks, USA
Phil Porras, SRI International, USA
Fred Raynal, Sogeti, France
Andrew Walenstein, Lafayette University, USA
Jeff Williams, Microsoft, USA
Yang Xiang, Deakin University, Australia
———————————————————————-
Publicity chair: Daniel Reynaud, Nancy University – Loria, France
Local chair: Matthieu Kaczmarek, INRIA Nancy Grand-Est, France

Getting Started with Savarin

(disclaimer: the author of Savarin, Matthieu Kaczmarek, is a colleague working in the office next door and a friend of mine)

Savarin is a free online binary classification service (you can think of it as automatic diff’ing against large databases of programs). It is in beta, not fully polished yet, but you can still squeeze some interesting results out of it. Here is your daily shot of binary analysis, freshly brewed.

You will need:

  • 2 different malware samples in the same malware family. We are going to use Sasser.A (already in Savarin’s database) and an unpacked Sasser.G (md5 b973853d0863070aca89ce00d4ee0fb9 [offensivecomputing.net])
  • IDA with IDAPython for the actual diff’ing (I have IDA 5.5, I don’t know if this works with the free version)

Let’s go:

  1. open Savarin
  2. in “Classification against custom database”, choose SasserA
  3. upload the Sasser.G sample
  4. in the results page, click More to see the similarity with other binaries in the Sasser family
  5. you can see that the sample is 41.95% similar to a sample with md5 edc66a4031f5a41f9ddf08595a1d4c92

At this point, you have a classification of a sample against a (small) database of programs. You can therefore see the distance between this sample and other samples. If you ask me, it’s a lot better to see that unknownsample.exe is 80% similar to badguy.exe and 90% similar to badguy2.0.exe than just “infected” or “not infected”.

For the actual diff’ing, follow these steps:

  1. open the Sasser.G sample in IDA
  2. download the IDAPython analysis report on Savarin’s analysis page (this report contains all the data needed to visualize the binary differences in IDA)
  3. execute the IDAPython analysis report
  4. right now, the situation is pretty anticlimactic since you should see no change apart from a few lines in the console. Wait until next step for the interesting stuff. Yes, you had nothing to do in this step, so what?
  5. type SavColor('md5.edc66a4031f5a41f9ddf08595a1d4c92', 0x0088ff) in the IDAPython console (it is the md5 value of the Sasser.A sample)
  6. type SavComment('md5.edc66a4031f5a41f9ddf08595a1d4c92') in the IDAPython console
  7. this is it, now you can browse the Sasser.G sample, and the common parts with Sasser.A will be colored. Additionally, for two matching instructions you will see the corresponding address in the Sasser.A sample.

The Fine Screenshots:

A look at anti-virtualization in malware samples

In previous posts, I described PuppetMaster, a way to dynamically detect and control CPU-based VMM detection methods in malware samples. We ran it on 2 sets of malware samples, and here are the results.

1. 60k samples from a Nepenthes honeypot

  • 62498 samples on the honeypot
  • 59554 of them being executable files
  • 48404 were analysed “correctly”
  • 13409 samples were terminated due to a 2-minute timeout

The number of samples trying to detect virtualization is surprisingly low:

  • 71 (0.15%) binaries used at least one anti-virtualization technique
  • 65 (0.13%) binaries used the SIDT anti-virtualization technique
  • 0 (0.00%) binaries used the STR anti-virtualization technique
  • 0 (0.00%) binaries used the SLDT anti-virtualization technique
  • 0 (0.00%) binaries used the SGDT anti-virtualization technique
  • 14 (0.03%) binaries used the VMware channel anti-virtualization technique
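
The percentages above appear to be relative to the 48404 samples that were analysed correctly, not to the full set (this is inferred from the numbers; the post does not say so explicitly). A minimal sketch of that arithmetic follows; `percent_of` and `print_share` are hypothetical helper names, not part of PuppetMaster:

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Share of `count` in `total`, as a percentage rounded to two decimals.
double percent_of(unsigned count, unsigned total) {
    assert(total > 0);  // an empty sample set has no meaningful share
    return std::round(10000.0 * count / total) / 100.0;
}

// Print one line in the same shape as the bullet lists above.
void print_share(unsigned count, unsigned total, const char *technique) {
    std::printf("%u (%.2f%%) binaries used %s\n",
                count, percent_of(count, total), technique);
}
```

For instance, `percent_of(71, 48404)` gives 0.15, whereas dividing by the 59554 executable files would give 0.12, which does not match the published figures.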

2. 25k samples from uh… somewhere

These samples were shared by Paul Royal, so thanks Paul :)

  • 25118 samples
  • 23104 of them being executable files
  • 18670 were analysed “correctly”
  • 8298 samples were terminated due to a 2-minute timeout

Again, the number of samples trying to detect virtualization is very low:

  • 117 (0.63%) binaries used at least one anti-virtualization technique
  • 56 (0.30%) binaries used the SIDT anti-virtualization technique
  • 0 (0.00%) binaries used the STR anti-virtualization technique
  • 2 (0.01%) binaries used the SLDT anti-virtualization technique
  • 6 (0.03%) binaries used the SGDT anti-virtualization technique
  • 58 (0.31%) binaries used the VMware channel anti-virtualization technique

Conclusion

There are a few potential reasons why the numbers are so low:

  1. the samples used other techniques that we do not support (such as detecting the VMware tools, or hardware version)
  2. or the samples we got are really not representative of malware samples in the wild. Indeed, our 60k samples contain mostly Allaple samples.
  3. or anti-virtualization techniques are not that common in actual malware samples…

It would be interesting to run the test on better malware repositories; unfortunately, these are not easy to get our hands on. So if you have a big malware repo ready to be dissected, and you would like to share it with an academic lab for free, I’d be glad to hear from you: reynaudd at loria dot fr.

Do We Really Need Malware Analysis?

Recently I’ve been wondering: how is malware analysis different from traditional program analysis? The fundamental difference is that programs can, in general, modify themselves. There is a direct consequence: with malware we have to admit that we don’t have static access to the program listing (which rules out standard program analyses). And since turning self-modifying code (SMC) into normal code is undecidable, we end up with only technical (i.e. partial) solutions. This is why virtually every paper on malware analysis ends up being a report on how a given technology/implementation is better/faster/stronger than the others.
This has a corollary too: since we have only partial solutions, in some cases they don’t work, and malware authors actively exploit that by implementing techniques to defeat our implementations. This opens a sub-research field: the production of techniques to defeat the analysis-defeating techniques. Yes, there is some irony in this; for instance, think about packing -> emulation-based unpacking -> anti-emulation techniques -> other-wonderful-unpacking-techniques…
Now, you might wonder, how did we get into this quagmire? As Schneier (http://www.schneier.com/blog/archives/2007/05/do_we_really_ne.html) pointed out before me, this is an accident: a historic by-product of the way the IT industry evolved. The x86 architecture allowed self-modifying code, and operating systems did nothing to prevent or regulate that. And bam, a research niche was born.

Puppetmaster Strikes Back

Vincent Mussot and I implemented new virtualization counter-countermeasures in puppetmaster. This time we can detect and thwart 6 tests out of 7 in ScoopyNG. In addition to the SIDT test, we counter the SLDT, SGDT and STR techniques in a similar way: instrument the binary until one of these instructions is found, intercept the memory address it writes to, and patch the value stored there.
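
To make the patching idea concrete, here is a small sketch that works on a plain byte buffer instead of live CPU state. It assumes the 32-bit layout the tool relies on for SIDT/SGDT (a 2-byte limit followed by a 4-byte base) and the detection rule quoted in the tool's comments (a base whose top byte is 0xff means VMware). The names `table_base`, `looks_virtualized` and `poison_base` are made up for this illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed 32-bit SIDT/SGDT result layout: 2-byte limit, then 4-byte base.
const std::size_t kBaseOffset = 2;

// Read the 32-bit base from the buffer (memcpy avoids unaligned access).
std::uint32_t table_base(const unsigned char *buf) {
    assert(buf != nullptr);
    std::uint32_t base;
    std::memcpy(&base, buf + kBaseOffset, sizeof base);
    return base;
}

// Detection heuristic from the tool's comments: top byte 0xff => VMware.
bool looks_virtualized(const unsigned char *buf) {
    return (table_base(buf) >> 24) == 0xff;
}

// What the analysis routine does once the instruction has written its
// result: overwrite the base and leave the 2-byte limit untouched.
void poison_base(unsigned char *buf, std::uint32_t fake_base) {
    assert(buf != nullptr);
    std::memcpy(buf + kBaseOffset, &fake_base, sizeof fake_base);
}
```

Once a buffer holding a base such as 0xffc18000 (an arbitrary example value) has been passed through `poison_base(buf, 0xd00dbeef)`, `looks_virtualized(buf)` no longer fires, which is the effect the SIDT test sees.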

The 2 other tests use the VMware backdoor (the in eax, dx instruction, with the magic value 0x564D5868 in eax and the port number 0x5658 in dx). We thwart it by detecting the backdoor trigger and changing dx before the instruction executes (this way an exception is raised, as if there were no backdoor). We then restore the original values afterwards to add a little bit of stealth.
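
The decision logic can be sketched on plain values rather than live registers. The `Regs` struct and the function names below are hypothetical and only model the before/after steps; the two constants and the replacement value 0x0004 are the ones used by the tool:

```cpp
#include <cassert>
#include <cstdint>

const std::uint32_t kVmwareMagic = 0x564D5868;  // "VMXh", expected in eax
const std::uint16_t kVmwarePort  = 0x5658;      // "VX", expected in dx

// Hypothetical snapshot of the two registers the backdoor looks at.
struct Regs {
    std::uint32_t eax;
    std::uint16_t dx;
};

bool is_backdoor_trigger(const Regs &r) {
    return r.eax == kVmwareMagic && r.dx == kVmwarePort;
}

// Before the `in`: break the trigger. Returns true if a restore is needed.
bool poison(Regs &r) {
    if (!is_backdoor_trigger(r)) return false;
    r.dx = 0x0004;                    // no longer the backdoor port
    assert(!is_backdoor_trigger(r));  // the `in` now behaves as on bare metal
    return true;
}

// After the `in`: put the original values back so the sample sees no change.
void restore(Regs &r, bool poisoned) {
    if (poisoned) {
        r.eax = kVmwareMagic;
        r.dx = kVmwarePort;
    }
}
```

Ordinary port I/O is left alone: `poison` only acts when both values match, so an `in eax, dx` instruction with any other port passes through unchanged.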

You can compile and run puppetmaster with the latest version (27887) of Pin on Windows and Linux. The makefile for the free version of Visual C++ is given below.

// puppetmaster.cpp
// Usage: pin -t <puppetmaster.dll> -- <binary> [arguments]
// Currently supported anti-virtualisation techniques: SIDT, SLDT, SGDT, STR, VMware channel
// works with pin-2.6-27887-msvc9

#include <string>
#include "pin.H"

int needRestore = 0;

VOID poisonSIDT(ADDRINT memIp) {
 char *data = (char *)memIp;
 unsigned int* m = (unsigned int *)(data+2);
 *m = 0xd00dbeef; // if ((idt_base >> 24) == 0xff) -> vmware detected
}

void poisonSLDT(ADDRINT memIp) {
 char *data = (char *)memIp;
 unsigned int* m = (unsigned int *)(data);
 *m = 0xdead0000; // if (ldt_base != 0xdead0000) -> vmware detected
}

void poisonSGDT(ADDRINT memIp) {
 char *data = (char *)memIp;
 unsigned int* m = (unsigned int *)(data+2);
 *m = 0xdeadbabe; // if ((gdt_base >> 24) == 0xff) -> vmware detected
}

void poisonSTR(ADDRINT memIp) {
 char *data = (char *)memIp;
 unsigned int* m = (unsigned int *)(data);
 *m = 0xbebaadde; // if ((mem[0] == 0x00) && (mem[1] == 0x40)) -> vmware detected
}

void poisonVMWareChannel() {
 unsigned int EAX_save;
 unsigned short int DX_save;
 __asm {
 mov EAX_save, eax
 mov DX_save, dx
 }
 if ((EAX_save == 0x564D5868) && (DX_save == 0x5658)){
 __asm {
 mov dx, 0x0004
 }
 needRestore = 1;
 }
 else needRestore = 0;
}

void restoreVMWareChannel() {
 if (needRestore == 1) {
 __asm {
 mov eax, 0x564D5868
 mov dx, 0x5658
 }
 needRestore = 0;
 }
}

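// Instrumentation callback: called by Pin for each instruction, matches on the disassembled mnemonic
VOID Instruction(INS ins, VOID *v) {
 std::string buffer = INS_Disassemble(ins);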
 if (buffer.substr(0,4) == "sidt")
 INS_InsertCall(ins, IPOINT_AFTER, (AFUNPTR)poisonSIDT, IARG_MEMORYWRITE_EA, IARG_END);
 else if (buffer.substr(0,4) == "sldt")
 INS_InsertCall(ins, IPOINT_AFTER, (AFUNPTR)poisonSLDT, IARG_MEMORYWRITE_EA, IARG_END);
 else if (buffer.substr(0,4) == "sgdt")
 INS_InsertCall(ins, IPOINT_AFTER, (AFUNPTR)poisonSGDT, IARG_MEMORYWRITE_EA, IARG_END);
 else if (buffer.substr(0,3) == "str")
 INS_InsertCall(ins, IPOINT_AFTER, (AFUNPTR)poisonSTR, IARG_MEMORYWRITE_EA, IARG_END);
 else if (buffer.substr(0,6) == "in eax") {
 INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)poisonVMWareChannel, IARG_END);
 INS_InsertCall(ins, IPOINT_AFTER, (AFUNPTR)restoreVMWareChannel, IARG_END);
 }
}

int main(int argc, char * argv[]) {
 PIN_Init(argc, argv);
 INS_AddInstrumentFunction(Instruction, 0);
 PIN_StartProgram();
 return 0;
}

Here is the Nmakefile for the Nmake utility:


# Nmakefile
######################################################################################
# This is the NMAKE file for building and testing PIN tools contained in one of the
# subdirectories of the PinTool project or PIN kit.
#
# For description of targets and options, see Nmakefile in the root directory.
######################################################################################

!if "$(PIN_HOME)"==""
PIN_HOME=..
!endif

# Define tools to be built and tested
######################################################################################
COMMON_TOOLS=puppetmaster.dll

# Include building and testing rules from the root Nmakefile.
######################################################################################
INCLUDE_SUB_RULES=1
!INCLUDE $(PIN_HOME)\Nmakefile

Build instructions:

  • download and install Pin and Visual C++
  • under $PIN_HOME\source\tools, create a puppetmaster directory
  • put puppetmaster.cpp and the Nmakefile in that directory
  • from that directory, run ..\nmake.bat puppetmaster.dll

Now you can run it and use your newly acquired ninja skills on ScoopyNG:

C:\pin-2.6-27887-msvc9-ia32_intel64-windows\source\tools\puppetmaster>pin -t obj-ia32\puppetmaster.dll -- ScoopyNG.exe
####################################################
::       ScoopyNG - The VMware Detection Tool     ::
::              Windows version v1.0              ::

[+] Test 1: IDT
IDT base: 0xd00dbeef
Result  : Native OS

[+] Test 2: LDT
LDT base: 0xdead0000
Result  : Native OS

[+] Test 3: GDT
GDT base: 0xdeadbabe
Result  : Native OS

[+] Test 4: STR
STR base: 0xdeadbabe
Result  : Native OS

[+] Test 5: VMware "get version" command
Result  : Native OS

[+] Test 6: VMware "get memory size" command
Result  : Native OS

::                   tk,  2008                    ::
::               [ www.trapkit.de ]               ::
####################################################

Note: we do not support the last test in ScoopyNG because Pin does not currently support far rets in different code segments. As far as I can tell, though, the bug that this last test relies on has been patched: I was not able to trigger it. It should probably be considered deprecated.

Reversing CUDA Software

After my Ruxcon talk on GPGPU malware, some people doubted that malware could use GPUs at all, and argued that even if it did, it would behave just like normal malware (and since I did not provide any code sample at the conference, I can understand the frustration).

Here is a small code sample to convince the unconvinced: it contains encrypted strings that are sent to the GPU to be decrypted. Once decrypted, they are executed in a shell.

#include <stdio.h>
#include <stdlib.h>  // malloc, free, system
#include <string.h>  // memset, memcpy, strlen
#include <cuda.h>
#define MAX_SIZE 255

// caution: kickass encryption ahead
__global__ void decodeOnDevice(char *a) {
  char cap;
  int i = 0;
  while(i<MAX_SIZE && a[i]) {  // check the bound before reading a[i]
    cap = a[i] & 32;
    a[i] &= ~cap;
    a[i] = ((a[i] >= 'A') && (a[i] <= 'Z') ? ((a[i] - 'A' + 13) % 26 + 'A') : a[i]) | cap;
    i++;
  }
}

int main(void) {
  char *temp_host;       // pointers to host memory
  char *temp_device;     // pointers to device memory
  char commands[2][MAX_SIZE];
  int i;

  // allocate arrays on host
  temp_host = (char *)malloc(MAX_SIZE);

  // allocate arrays on device
  cudaMalloc((void **) &temp_device, MAX_SIZE);

  // initialize host data
  memset(commands[0], 0, MAX_SIZE);
  memset(commands[1], 0, MAX_SIZE);

  // these are the encoded commands
  memcpy(commands[0], "rpub Jung vf lbhe anzr, unaqfbzr xavtug?", strlen("rpub Jung vf lbhe anzr, unaqfbzr xavtug?"));
  memcpy(commands[1], "rpub - Fve Tnynunq... gur Punfgr.", strlen("rpub - Fve Tnynunq... gur Punfgr."));

  for(i = 0; i<2; i++) {
    memset(temp_host, 0, MAX_SIZE);
    memcpy(temp_host, commands[i], strlen(commands[i]));

    // send data from host to device
    cudaMemcpy(temp_device, temp_host, MAX_SIZE, cudaMemcpyHostToDevice);

    // data copied on device, invoking kernel
    decodeOnDevice <<< 1, 1 >>> (temp_device);

    // retrieve data from device
    cudaMemcpy(temp_host, temp_device, MAX_SIZE, cudaMemcpyDeviceToHost);

    // execute the decoded command
    system(temp_host);
  }

  // release the device and host buffers
  cudaFree(temp_device);
  free(temp_host);
  return 0;
}
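
To check what the two encoded strings contain without a GPU at hand, the same decoding can be written as a CPU-side reference. This is only a sketch that mirrors the kernel's bit trick (save the lowercase bit, rotate the uppercase letter by 13, put the bit back); `decode_on_host` is a hypothetical name and is not part of the sample above:

```cpp
#include <cassert>
#include <string>

// CPU reference for the decoding done by decodeOnDevice.
std::string decode_on_host(std::string s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        char cap = static_cast<char>(s[i] & 32);   // remember the lowercase bit
        char c = static_cast<char>(s[i] & ~cap);   // fold letters to uppercase
        if (c >= 'A' && c <= 'Z') {
            c = static_cast<char>((c - 'A' + 13) % 26 + 'A');  // rotate by 13
            assert(c >= 'A' && c <= 'Z');          // still an uppercase letter
        }
        s[i] = static_cast<char>(c | cap);         // restore the original case
    }
    return s;
}
```

Applied to the first command, it yields `echo What is your name, handsome knight?`; the second decodes to `echo - Sir Galahad... the Chaste.` Because rotating by 13 twice is the identity, the same function also encodes.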

PIN me if you can

Or how to escape PIN in 5 instructions, using the self-modification technique seen in the previous post. Ready? Go:

#include <stdio.h>
main() {
  asm("call foo\n\t"
      "foo: pop %rax\n\t"
      "movl $0x4004e7, 10(%eax)\n\t"  // put @nottraced() in the next mov
      "movl $0x4004fb, %eax\n\t"      // @traced(), will be overwritten
                                      // by @nottraced() if not instrumented
      "call *%rax\n\t");
}
// we don't want PIN to analyse this
nottraced() {
  printf("trace me if you can!\n");
}
// we want PIN to analyse this, a dummy function
traced() {
  printf("you're not supposed to get here\n");
}

As usual: compile, make the .text section and the program header writable, and run.

reynaudd@lhs-2:~/test/packed$ ./escape2
trace me if you can!
reynaudd@lhs-2:~/test/packed$ pin -t ../pin-2.5-24110-gcc.4.0.0-ia32_intel64-linux/source/tools/ManualExamples/obj-intel64/inscount0.so -- ./escape2
you're not supposed to get here

‘Nuff said.

UPDATE: as the authors of PIN pointed out, this situation is handled correctly by PIN with the option -smc_strict. For performance reasons (and standards compliance), PIN assumes by default that there is at least one taken branch between a modification of the code and its execution (i.e. no basic block modifies itself). My example violates this assumption.