The Mythical Man-Month Reading Notes

  • the man-month is a fallacious and dangerous myth, for it implies that men and months are interchangeable
  • Brooks’s Law: adding manpower to a late project makes it later (effort is needed to repartition the work, train the new people, and more time is spent on intercommunication)
  • “Good cooking takes time” (Restaurant Antoine’s Chef)
  • discipline is good for art: providing an architecture enhances the creative style of an implementing group
  • the purpose of organization is to reduce the amount of communication and coordination necessary
  • “The programmer delivers satisfaction of a user need rather than any tangible product” (Cosgrove)
  • all repairs tend to destroy structure, to increase the entropy of a system
  • “How does one project get to be a year late? One day at a time”
  • the first step in controlling a project on a tight schedule is to have a schedule with concrete, measurable milestones. A programmer will rarely lie about a milestone if it’s so sharp he can’t deceive himself.
  • chronic schedule slippage is a morale-killer: “if you miss one deadline, make sure you make the next one” (Mac Carthy)

A note on the x86 semantics modeling in JPC

JPC is a full PC emulator (à la Bochs, but in Java), including the BIOS, VGA BIOS, floppy drive and other hardware components. Of particular interest to me is the way x86 instructions are modeled and executed. It works purely at the binary level (no fancy disassembly) and compiles x86 instructions to a simpler microcode language representing “atomic” operations. This microcode language is then straightforward to execute, although it is a bit more complex than similar micro-languages (such as VEX, REIL or BIL).

The core of the x86 semantics is contained in the x86-to-microcode compiler, found in org.jpc.emulator.memory.codeblock.optimised.ProtectedModeUDecoder.decodeOpcode(). This method takes a binary x86 instruction and decodes its prefixes, opcode, ModRM, SIB, displacement and immediate fields. It then delegates the microcode translation to this sequence of methods:


writeInputOperands(prefices, opcode, modrm, sib, displacement, immediate);
writeOperation(prefices, opcode, modrm);
writeOutputOperands(prefices, opcode, modrm, sib, displacement);
writeFlags(prefices, opcode, modrm);

For instance, if we take the binary instruction 04 42 (add al, 0x42), it is decoded with opcode = 0x04 and immediate = 0x42. Based on these values, the instruction is then translated to the following microcode sequence:


// writeInputOperands:
LOAD0_AL
LOAD1_IB 0x42
// writeOperation:
ADD
// writeOutputOperands:
STORE0_AL
// writeFlags:
ADD_O8_FLAGS

Now, understanding the semantics of an x86 instruction reduces to understanding the semantics of the microcode language. For this we need the microcode interpreter, org.jpc.emulator.memory.codeblock.optimised.ProtectedModeUBlock.execute(). The microcode language is relatively simple execution-wise, with 5 general-purpose registers, but it counts roughly 750 opcodes. The execution of the above microcodes translates to this Java sequence:


reg0 = cpu.eax & 0xff;
reg1 = 0x42 & 0xff;
reg2 = reg0; reg0 = reg2 + reg1;
cpu.eax = (cpu.eax & ~0xff) | (reg0 & 0xff);
cpu.setZeroFlag((byte)reg0);
cpu.setParityFlag(reg0);
cpu.setSignFlag((byte)reg0);
cpu.setCarryFlag(reg0, Processor.CY_TWIDDLE_FF);
cpu.setAuxiliaryCarryFlag(reg2, reg1, reg0, Processor.AC_XOR);
cpu.setOverflowFlag(reg0, reg2, reg1, Processor.OF_ADD_BYTE);
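
To make this concrete, here is a toy Python model (my own sketch, not JPC code) of what the microcode sequence computes; the flag rules are the standard x86 definitions for an 8-bit add:

# Toy model of the microcode sequence for "add al, 0x42" (not JPC code).

def add_al_imm8(eax, imm):
    reg0 = eax & 0xff                        # LOAD0_AL
    reg1 = imm & 0xff                        # LOAD1_IB
    reg2 = reg0                              # ADD saves the first operand...
    reg0 = reg2 + reg1                       # ...and computes the raw sum
    eax = (eax & ~0xff) | (reg0 & 0xff)      # STORE0_AL

    # ADD_O8_FLAGS: derive the arithmetic flags from the raw sum
    flags = {
        'ZF': (reg0 & 0xff) == 0,
        'SF': (reg0 & 0x80) != 0,
        'PF': bin(reg0 & 0xff).count('1') % 2 == 0,  # parity of the low byte
        'CF': reg0 > 0xff,                           # carry out of bit 7
        'AF': ((reg2 ^ reg1 ^ reg0) & 0x10) != 0,    # carry out of bit 3
        'OF': ((reg2 ^ reg0) & (reg1 ^ reg0) & 0x80) != 0,
    }
    return eax, flags

print(add_al_imm8(0x00000001, 0x42))   # al becomes 0x43, all flags clear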

Python Idioms: + versus join

I was told to use ''.join() instead of the '+' operator in Python. However, a (bad) benchmark showed '+' to be a lot faster. I think it is reasonable to say that in some cases '+' is faster; here is my test:

def test0(b, c, d, e, f):
    for i in xrange(10**7):
        a = b + c + d + e + f
    print(a)

def test1():
    l = ['hello ', 'world ', 'with ', '+ ', 'operator']
    for i in xrange(10**7):
        a = ''
        for j in l:
            a += j
    print(a)

def test2():
    l = ['hello', 'world', 'with', 'join', 'function']
    for i in xrange(10**7):
        a = ' '.join(l)
    print(a)

test0('hello ', 'world ', 'with ', '+ ', 'operator')
test1()
test2()

And the result of the test:

$ python -m cProfile -s cumulative test.py
hello world with + operator
hello world with + operator
hello world with join function

   10000007 function calls in 14.968 CPU seconds
   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        ...
        1    6.838    6.838    6.838    6.838 test.py:7(test1)
        1    2.683    2.683    5.113    5.113 test.py:15(test2)
        1    3.016    3.016    3.016    3.016 test.py:2(test0)

So clearly the worst way of using '+' is iterating over a list of strings and accumulating the concatenations in a variable (function test1): each += generally copies the whole accumulated string, making the loop quadratic in the total length. But there is nothing wrong with performing multiple '+' operations in a single expression and then storing the result in a variable (function test0).

A quick look at the bytecode of the function confirms this intuition: we can see a bunch of LOADs and ADDs and only one STORE:

>>> import dis
>>> dis.dis(test0)
...
             19 LOAD_FAST                0 (b)
             22 LOAD_FAST                1 (c)
             25 BINARY_ADD          
             26 LOAD_FAST                2 (d)
             29 BINARY_ADD          
             30 LOAD_FAST                3 (e)
             33 BINARY_ADD          
             34 LOAD_FAST                4 (f)
             37 BINARY_ADD          
             38 STORE_FAST               6 (a)

The test was performed with Python 2.5.4 on Debian sid. It would be nice to see whether the results hold for newer versions of the Python interpreter.
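
For reference, here is how I would redo the comparison on a more recent interpreter (timeit.timeit exists since Python 2.6; the statements are the same idioms as above, and absolute timings will obviously differ):

import timeit

setup = "l = ['hello ', 'world ', 'with ', 'some ', 'strings']"
# one million iterations each, by default
print(timeit.timeit("a = ''.join(l)", setup=setup))
print(timeit.timeit("a = ''\nfor j in l: a += j", setup=setup))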

A Quick Survey on Intermediate Representations for Program Analysis

This is mostly a note to myself, but I guess people interested in automating reverse engineering will at some point be interested in IRs suitable for low-level abstractions. I consider both the top-down IRs used by optimizing compilers and the bottom-up IRs used by decompilers and other reversing tools.

Intermediate Representations for Reverse Engineering

REIL. Used in BinNavi, the Reverse Engineering Intermediate Language defines a very simple RISC architecture (17 instructions), with the nice property that each instruction has at most one side effect. Thomas Dullien and Sebastian Porst recently presented an abstract interpretation framework for REIL at CanSecWest (paper, slides). Given x86 -> REIL and REIL -> x86 translators, it is possible to write analyses and transformation passes on REIL without getting into the complexity of the whole x86 architecture.

Here are some sample REIL instructions:

1006E4B00: str edi, , edi
1006E4D00: sub esp, 4, esp
1006E4D01: and esp, 4294967295, esp
1006E4D02: stm ebp, , esp

Language Reference
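
To illustrate why such a small instruction set is pleasant to work with, here is a toy Python representation of the block above, with a trivial analysis pass (my own sketch, not BinNavi's actual API):

from collections import namedtuple

# A REIL instruction: an address, a mnemonic and three operands
# (unused operands are left empty, as in the listing above).
Insn = namedtuple('Insn', 'addr mnemonic op1 op2 op3')

block = [
    Insn(0x1006E4B00, 'str', 'edi', '', 'edi'),
    Insn(0x1006E4D00, 'sub', 'esp', 4, 'esp'),
    Insn(0x1006E4D01, 'and', 'esp', 4294967295, 'esp'),
    Insn(0x1006E4D02, 'stm', 'ebp', '', 'esp'),
]

# A trivial analysis pass: which registers does the block define?
# (stm writes to memory, so its third operand is an address, not a target.)
defined = set(i.op3 for i in block if i.mnemonic != 'stm')
print(defined)   # -> set(['edi', 'esp'])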

Hex-Rays Microcode. Presented at Black Hat USA 2008 by Ilfak Guilfanov (paper, slides), it is an IR used during decompilation. From the paper: “The microcode language is very detailed and precisely represents how each instruction modifies the memory, registers, and processor condition codes. Typically one CPU instruction is converted into 5-15 microinstructions”. According to the REIL paper, REIL and the microcode language differ significantly; for instance, microinstructions can have a variable number of operands and perform multiple side effects.

Sample microcode (this sequence amounts to mov eax, [esi+4]):

mov esi.4, eoff.4
mov ds.2, seg.2
add eoff.4, #4.4, eoff.4
ldx seg.2, eoff.4, et1.4
mov et1.4, eax.4

I couldn’t find the language reference.

ELIR. Part of the ERESI project, the goal of ELIR is to simplify static analysis by providing a platform independent abstraction. An overview was presented at Ekoparty08 (slides) and some ideas appeared in Phrack 64, but 30s of Googling didn’t get me to the language reference or a code sample, so that’s all I will say about ELIR for the moment.

Pin Inspection API. Pin, Intel’s dynamic binary instrumentation framework, provides a very handy instruction inspection API. This is not an IR, but it provides the same type of information about complex instructions without requiring giant switch statements. For instance, this is the way to log memory writes with Pin, given an instruction:

VOID RecordMemWrite(VOID * addr, UINT32 size) {
    fprintf(trace,",%dW%p", size, addr);
}

// this function is called each time an instruction is encountered
VOID Instruction(INS ins, VOID *v) {
    // isn't that a nice API?
    if (WRITES && INS_IsMemoryWrite(ins)) {
        INS_InsertPredicatedCall(
            ins, IPOINT_BEFORE, (AFUNPTR)RecordMemWrite,
            IARG_MEMORYWRITE_EA,
            IARG_MEMORYWRITE_SIZE,
            IARG_END);
    }
}

API Documentation

Valgrind IR. On my todo list.

FermaT Transformation System. I’ll have to write something about it someday. Oh lucky you: a Wikipedia entry and a bunch of papers!

Optimizing Compilers Intermediate Representations

LLVM Bitcode. This language uses low-level RISC-like instructions in SSA form with type information. It is clean and well defined, and is a very suitable target for platform-independent analysis and optimization. It is designed to convey high-level information through lower-level operations, so converting machine code to LLVM bitcode probably requires some intensive work.

Here is the hello world example:

; Declare the string constant as a global constant...
@.LC0 = internal constant [13 x i8] c"hello world\0A\00"

; External declaration of the puts function
declare i32 @puts(i8 *)                                           

; Definition of main function
define i32 @main() {
        ; Convert [13 x i8]* to i8*...
        %cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0 ; i8 *

        ; Call puts function to write out the string to stdout...
        call i32 @puts(i8 * %cast210)
        ret i32 0
}

Language reference

Register Transfer Language. One of the IRs used in GCC, it is an architecture-neutral assembly language that represents instructions in a LISP-like form (d’oh), like this:

(insn 2 49 3 test.c:3 (set (mem/c/i:SI (plus:DI (reg/f:DI 6 bp)
                (const_int -20 [0xffffffffffffffec])) [0 argc+0 S4 A32])
        (reg:SI 5 di [ argc ])) 47 {*movsi_1} (nil))

It feels a bit old-fashioned and less clean than LLVM bitcode, but this is just a gut feeling. Use gcc -fdump-rtl-all to see what it looks like.

Side note: the idea of dumping RTL to a file, performing transformations on it and feeding it back to GCC is quite common, but RMS qualifies it as “not feasible”, even though the creator of RTL says it is not only feasible but actually quite useful.

Python XML for Real Men Cheat Sheet

Howdy again,

Now let’s do some serious XML output in Python using cElementTree.

# this is included in Python 2.5+
from xml.etree.cElementTree import ElementTree, Element, dump

# let's create the root element
root = Element("teabag")

# give it a child with an attribute
child1 = Element("spam")
child1.attrib["name"] = "value"
root.append(child1)

# and a child with text content
child2 = Element("eggs")
child2.text = "spam and eggs"
root.append(child2)

# print the whole thing to stdout
dump(root)

# or to a file
ElementTree(root).write("teabag.xml")

See the author’s website for cElementTree downloads and usage information.

Using this API, I was able to create a 47 MB XML file in a few minutes, burning roughly 300 MB of heap space. This XML file represents a graph of graphs, namely the control flow graph of each function in an IDA Pro database. I used yEd for the visualization part.
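
yEd reads GraphML, so producing such a file with cElementTree boils down to something like this sketch (the graph and node ids are made up for the example):

from xml.etree.cElementTree import ElementTree, Element, SubElement

# Minimal GraphML skeleton that yEd can open (no yEd-specific styling).
root = Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
graph = SubElement(root, "graph", id="cfg_sub_401000", edgedefault="directed")

# one node per basic block, one edge per control flow transfer
SubElement(graph, "node", id="bb0")
SubElement(graph, "node", id="bb1")
SubElement(graph, "edge", source="bb0", target="bb1")

ElementTree(root).write("cfg.graphml")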

Python XML Cheat Sheet

[UPDATED] Finally, xml.dom.minidom sucks balls: it can burn hundreds of megabytes, even gigabytes, of sweet memory when working with “large” XML files (10 MB or more). See this post for a really lightweight implementation.

Howdy,

Here is a quick reference of how to create an XML document and output it in Python.

import xml.dom.minidom

# create the document
doc = xml.dom.minidom.Document()

# populate it with an element
root = doc.createElement("teabag")
doc.appendChild(root)

# time to give some children to the root element, one with an attribute for instance
child1 = doc.createElement("spam")
child1.setAttribute("name", "value")
root.appendChild(child1)

# and another one with some text
child2 = doc.createElement("eggs")
text = doc.createTextNode("spam and eggs!")
child2.appendChild(text)
root.appendChild(child2)

# let's get the output, as a string
print doc.toprettyxml()

# you're supposed to get the following output:
#<?xml version="1.0" ?>
#<teabag>
#    <spam name="value"/>
#    <eggs>
#        spam and eggs!
#    </eggs>
#</teabag>

How nice is that? Yep, a lot.

A Quick Survey on Automatic Unpacking Techniques

This is a non-comprehensive list of papers and tools dealing with automated unpacking. Please let me know if I’ve missed another technique or if I misunderstood any of the techniques below.

Ring0/Ring3 components, using manual unpacking and heuristics

OllyBonE:

OllyBonE (Break on Execution) uses a Windows driver to prevent memory pages from being executed, and an OllyDbg plugin communicating with the driver. As such it is not an automatic unpacker and requires manual tagging of the pages in which the unpacked code is expected to be found.

Technology used: Windows driver to prevent memory page execution, debugger plugin

Handles unknown packers: no.

Drawbacks: requires a priori knowledge of the memory location of the unpacked code, vulnerable to anti-debugging techniques, modification of the integrity of the host operating system due to the driver.

Code Available: yes, http://www.joestewart.org/ollybone/.

Original Site

(Updated) Dream of Every Reverser / Generic Unpacker:

It is a Windows driver used to hook ring-3 memory accesses. The same author uses it in a project called Generic Unpacker to find the original entry point. The tool then tries to find all import references, dumps the file and fixes the imports. It is reported to work against UPX, FSG and ASPack, but not against more complex packers.

Technology used: Windows driver to hook userland memory access

Handles unknown packers: no.

Drawbacks: requires a priori knowledge of the memory location of the unpacked code, modification of the integrity of the host operating system due to the driver.

Code Available: yes, http://deroko.phearless.org/GenericUnpacker.rar.

Original Site

(updated) RL!Depacker

No description for this one; however, it looks similar to Dream of Every Reverser / Generic Unpacker.

Code Available: yes,  http://ap0x.jezgra.net/RL!dePacker.rar.

Original Site

(updated) QuickUnpack

Again, no real description, but it looks similar to RL!Depacker and DOER / Generic Unpacker. It is a scriptable engine using a debugging API. It is reported to work against 60+ simple packers.

Code Available: yes, http://www.team-x.ru/guru-exe/?path=Tools/Unpackers/QuickUnpack/

Original Site (in Russian)

Universal PE Unpacker:

This is an IDA Pro plugin using the IDA Pro debugger interface. It waits for the packer to call GetProcAddress, then activates single-stepping mode until EIP falls in a predefined range (an estimate of the OEP). It only works well against UPX, Morphine, ASPack, FSG and MEW (according to the authors of Renovo).

Technology used: Debugging and heuristics.

Handles unknown packers: no, needs an approximation of the OEP and assumes that the unpacker will call GetProcAddress before calling the original code.

Drawbacks: not fully automatic, very vulnerable to debugger detection, does not necessarily work against all packers or self-modifying code.

Code Available: yes, since IDA Pro 4.9

Original Site

Instruction-level analysis, comparison between written addresses and executed addresses

Renovo:

Built on TEMU (BitBlaze), it uses full system emulation to record memory writes (and mark those memory locations as dirty). Each time a new basic block is executed, if it contains a dirty memory location, a hidden layer has been found. Cost: 8 times slower than normal execution. It seems to unpack everything correctly except Armadillo and Obsidium (due to incorrect system emulation?), and it only obtains partial results against Themida with the VM option on.

Technology used: Full system emulation.

Handles unknown packers: yes.

Drawbacks: order-of-magnitude slowdown, possible detection of the emulator.

Code Available: I couldn’t find it.

Original Site, Local Copy
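
The written-then-executed heuristic used by Renovo (and, with variations, by the tools below) fits in a few lines of Python; on_memory_write and on_basic_block stand for hypothetical emulator callbacks:

# Sketch of the written-then-executed heuristic (hypothetical callbacks).
dirty = set()   # addresses written since the program started

def on_memory_write(addr, size):
    dirty.update(range(addr, addr + size))

def on_basic_block(addr, size):
    block = set(range(addr, addr + size))
    if block & dirty:
        # executing bytes that were written at runtime: hidden layer found
        print("hidden layer found at %#x (%d bytes)" % (addr, size))
        dirty.difference_update(block)

# Toy run: the "packer" writes 16 bytes at 0x401000, then jumps there.
on_memory_write(0x401000, 16)
on_basic_block(0x401000, 16)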

Azure:

Paul Royal’s solution, named after BluePill because it is based on KVM, a Linux-based hypervisor. It uses Intel’s VT extensions to trace the target process at the instruction level, by setting the trap flag and intercepting the resulting exceptions. The memory writes are then recorded and compared to the address of the current instruction. According to the paper, it handles every packer correctly (including Armadillo, Obsidium and Themida VM).

Technology used: Hardware assisted virtualization and virtual machine introspection.

Handles unknown packers: yes.

Drawbacks: detection of the hypervisor. Slowdown?

Code Available: yes, http://blackhat.com/presentations/bh-usa-08/Royal/Royal_Extras.zip.

Original Site, Local Copy

Saffron:

Developed by Danny Quist and Valsmith, the first version uses Intel’s Pin to dynamically instrument the analyzed code. It actually inserts instructions in the code flow, allowing lightweight fine-grained control (no need for emulation or virtualization), but it modifies the integrity of the packer. A second version modifies the page fault handler of Windows and traps when a written memory page is executed. It has mixed results against Molebox, Themida and Obsidium, and doesn’t handle Armadillo correctly (according to Paul Royal).

Technology used: Dynamic instrumentation, Pagefault handling (with a kernel component in the host operating system).

Handles unknown packers: yes.

Drawbacks: modifies the integrity of the code (with DI) and of the host operating system. It likely does not work inside a virtual machine. The dynamic instrumentation is very slow. The memory monitoring of the page fault handler is coarse-grained (pages are aligned on a 4 KB boundary), so some memory accesses can go unnoticed.

Code Available: dynamic instrumentation available, what about the driver?

Original Site, Local Copy

(updated) OmniUnpack:

Uses a technique similar to the second version of Saffron: a Windows driver to enforce a W^X policy on memory pages.

Technology used: Pagefault handling and system call tracing (with a kernel component in the host operating system)

Handles unknown packers: yes.

Drawbacks: modifies the integrity of the host operating system. It likely does not work inside a virtual machine. The memory monitoring of the page fault handler is coarse-grained, leading to spurious unpacking stages.

Code Available: ?

Original Site, Local Copy

Pandora’s Bochs:

Developed by Lutz Böhne, it is based on Bochs, which is used to monitor memory writes and compare them with branch targets. Interestingly, the assumptions about the program are stated explicitly (which is a GOOD thing): the unpacking does not involve multiple processes, it does not happen in kernel mode, the unpacked code is reached through a branch instruction (not a fall-through edge), etc. Another interesting point of this approach is that it uses no component in the guest OS (as opposed to Renovo, for example); all the information is retrieved from outside the matrix (as with Azure).

Technology used: Full system emulation based on Bochs.

Handles unknown packers: yes.

Drawbacks: as stated in the paper, the limitations are speed and compatibility (not all packed samples seemed to run under Bochs); detection of the OEP and reconstruction of imports sometimes failed.

Code Available: http://damogran.de/blog/archives/21-To-release,-or-not-to-release-….html

Original Site, Local Copy

Other techniques (comparison with static disassembly or disk image)

Secure and Advanced Unpacking by Sebastien Josse:

The idea developed by Sebastien Josse is to use full system emulation (based on QEMU?) and to compare the basic blocks that are about to be executed by the virtual CPU with the equivalent addresses in the file image of the executable. If the memory and disk versions differ, the code has been generated on the fly and therefore a hidden layer has been found. Josse then proposes techniques to rebuild a fully functional executable from the memory dump. This technique seems to work well (but sometimes requires human intervention) against several packers, including Armadillo, ASProtect, PEtite, UPX, yC…

Technology used: Full system emulation, comparison between memory images and disk images.

Handles unknown packers: yes, manual intervention might be required in some cases.

Drawbacks: slowdown due to the full system emulation, full reconstruction of the unpacked program is not always possible.

Code Available: ?

Original Site

PolyUnpack:

The idea behind PolyUnpack is to address the fundamental nature of unpacking, which is runtime code generation. To identify code that has been generated at runtime, PolyUnpack uses a conceptually elegant technique: it first statically analyses the program to build a map of statically accessible code, then traces the execution of the program. The dynamically intercepted instructions are compared with the static disassembly; if they do not appear there, they have been generated at runtime.

Technology used: comparison between static disassembly and dynamic tracing. The dynamic trace is extracted with single-step debugging APIs.

Handles unknown packers: yes.

Drawbacks: vulnerable to debugger detection. Note that this is a limitation of the implementation, not of the concept.

Code Available: http://polyunpack.cc.gt.atl.ga.us/polyunpack.zip (updated 26/06/2009)

Original Site, Local Copy
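
The core check of PolyUnpack is easy to sketch; here addresses stand in for full instructions, while PolyUnpack actually compares instruction sequences:

# Sketch of PolyUnpack's check: anything executed that is not in the
# static disassembly must have been generated at runtime.
static_code = set([0x401000, 0x401002, 0x401005])   # from the disassembler

def runtime_generated(trace):
    return [addr for addr in trace if addr not in static_code]

# the instruction at 0x770000 was generated at runtime
print(runtime_generated([0x401000, 0x401002, 0x770000]))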

Trusted Computing, change we’re supposed to believe in

I just came back from ETISS08, the 3rd European Trusted Infrastructure Summer School, at Oxford. The event was really well organised, with a number of high-level speakers including David Grawrock, Graeme Proudler (HP Labs), Paul Congdon (HP ProCurve CTO), Robert Thibadeau (Seagate Chief Technologist), Paul England (Microsoft) and many others.

For those who don’t know what Trusted Computing is, have a look at the Wikipedia page, insight from Bruce Schneier or pure hatred from RMS. The idea is to use a hardware component called the Trusted Platform Module (TPM) as a secure cryptographic device to secure the boot process of your computer and ensure the integrity of “important components”.

Here are some key concepts and concerns about Trusted Computing and my totally biased opinion about them:

Tamper Resistance: it is quite common to hear that the TPM is a tamper-resistant hardware module (that’s what security hardware is for, right?). Actually, the story is quite different: the Trusted Computing Group doesn’t care much about hardware attacks, mainly because tamper resistance = $$$.

Secure boot: this is one of the core features of TC. It allows you to boot securely by measuring (the rest of the world calls that hashing) each component before executing it and storing the hash securely in the TPM. BitLocker already uses this feature: if your boot sequence has changed, BitLocker won’t be able to automatically extract the encryption key of your hard disk. This is a nice feature; notice however that we can only “lock” the boot process by checking whether “something” has changed.
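
For the curious, “measuring” is a hash chain: each component is hashed into a Platform Configuration Register before it runs. Here is a minimal Python model of the TPM 1.2 extend operation (SHA-1, as in the 1.2 spec; the component names are made up):

import hashlib

def pcr_extend(pcr, component):
    # TPM 1.2: new PCR value = SHA-1(old PCR value || measurement)
    measurement = hashlib.sha1(component).digest()
    return hashlib.sha1(pcr + measurement).digest()

pcr = b'\x00' * 20   # PCRs start zeroed at boot
for component in (b'bios', b'bootloader', b'kernel'):
    pcr = pcr_extend(pcr, component)

# swapping two boot stages produces a completely different final value
other = b'\x00' * 20
for component in (b'bootloader', b'bios', b'kernel'):
    other = pcr_extend(other, component)
print(pcr == other)   # -> False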

Attestation: the other Big Thing about TC. Remember that hashes of your boot process, operating system and other components are stored in the TPM? Now the idea of attestation is that some remote party would like to ensure that you are using a trusted OS or piece of software (notice the use of “trusted” and not “secure”). Well, TC can do that for you! Your TPM contains a secret key called the Endorsement Key, which can be used to sign the values of your Platform Configuration Registers (i.e. the hashes of your system configuration) and send them over the network. The remote party can check that the hash has been signed by a valid EK and is the hash of a valid boot with a trusted OS and software. Now, if you don’t see the obvious implementation problems, think again. Basically, there are so many valid systems, configurations and pieces of software that the number of hashes is unmanageable (plus they change each time you apply patches, and they depend on the order in which you hash things). And you can only attest remotely that the software was OK when it was launched, not that it hasn’t been exploited in the meantime (and as David Grawrock pointed out, hashing a memory image of a running program is HARD).

Privacy: there have been concerns about having a secret key (the EK) in a hardware module, somewhat out of reach of the user, which might be used to aggregate personal information on remote services. Therefore the system has been carefully designed to use the EK in very limited circumstances and to use indirect signing for attestation using Attestation Identity Keys or zero-knowledge proofs such as Direct Anonymous Attestation. There was a consensus that your privacy is more at risk when you purchase something on the internet or have a Facebook account than when you use a TPM, and I think this is true.

Security: you might have noticed that the keyword here is trusted, not secure. There was a lot of philosophical debate at the summer school about what is to be trusted, what is trustworthy, and how all that relates to correctness and security. Overall, TC is not a revolution for computer security; it’s more a way to bypass the question.

Open Source Software: a lot of concern about TC comes from the FOSS community, probably because the Trusted Computing Group is manned by large companies and you can somehow feel a big red DRM hovering behind it. However, most practical sessions were based on Linux and Xen, and there is an opentc.net research initiative. Plus there was a really hot debate when Graeme Proudler asked “can you trust open source software?”

Digital Rights Management: this is probably the most common concern about TC. Why should I trust Trusted Computing if it’s just a tool to implement stronger DRM techniques? It’s true that DRM is one of the possibilities of TC, but honestly I think that most actual DRM implementations will rely on attestation (to ensure you use a “trusted” mp3 player, for example), so as long as attestation doesn’t work in practice, there can be no DRM based on it.

Loss of Control: the other big concern about TC is the idea of having a hardware chip containing some secret code and cryptographic keys that you can’t control (to some extent). Graeme and David insisted that the Platform Owner (TC terminology) remains in control of the TPM, by design. But I think there is still some loss of control due to remote attestation, and this is also by design: the whole point of remote attestation and the technologies based on it (such as Trusted Network Connect) is not to protect your computer from malware, but to protect your corporate network from you. Therefore, if the remote party doesn’t want to attest your OS/software configuration, you’re out. In Ian Levy’s terminology, it is a way to mitigate the “wetware” risk. This is probably a good way to control your network infrastructure and corporate network, but it doesn’t solve the porn surfing workstation problem. Remember, the monkey behind the keyboard really wants to see the dancing bunnies!

Oh and by the way, I also met the cool guy behind Joebox there! If you have a suspicious executable file to analyse, Joebox is the right tool for you.

(updated: Sven Türpe has posted some nice photos of Oxford and his presentation about BitLocker on his blog)


Ant Skeleton Build.xml

Here is a quick Ant skeleton build file, based on the Ant manual.

  • Install and configure Ant (you must set a few environment variables)
  • Save the following file as ‘build.xml’ in your base directory (noted as ‘.’).
  • Replace PROJECTNAME with what you want and MAINCLASSNAME with the class that must be launched in your resulting jar (the class with a static void main(String[]) method).
  • Put your source files in ./src, and any external libraries (jar files) in ./lib. Ant will then compile the classes in ./build/classes and store them as a jar file in ./build/jar.
  • Run Ant in the base directory with the command ‘ant’ or ‘ant main’; it will execute the directives in build.xml (clean, compile, package and run your program).
<project name="PROJECTNAME" basedir="." default="main">

  <!-- set global properties for this build -->
  <property name="main-class"  value="MAINCLASSNAME"/>
  <property name="src.dir"     value="src"/>
  <property name="build.dir"   value="build"/>
  <property name="classes.dir" value="${build.dir}/classes"/>
  <property name="jar.dir"     value="${build.dir}/jar"/>
  <property name="lib.dir"     value="lib"/>

  <!-- adds every jar in the lib directory to the classpath -->
  <path id="classpath">
    <fileset dir="${lib.dir}" includes="**/*.jar"/>
  </path>

  <target name="clean">
    <delete dir="${build.dir}"/>
  </target>

  <target name="compile">
    <mkdir dir="${classes.dir}"/>
    <javac srcdir="${src.dir}" destdir="${classes.dir}" classpathref="classpath"/>
  </target>

  <target name="jar" depends="compile">
    <mkdir dir="${jar.dir}"/>
    <jar destfile="${jar.dir}/${ant.project.name}.jar" basedir="${classes.dir}">
      <manifest>
        <attribute name="Main-Class" value="${main-class}"/>
      </manifest>
    </jar>
  </target>

  <target name="run" depends="jar">
    <java fork="true" classname="${main-class}">
      <classpath>
        <path refid="classpath"/>
        <path location="${jar.dir}/${ant.project.name}.jar"/>
      </classpath>
    </java>
  </target>

  <target name="clean-build" depends="clean,jar"/>

  <target name="main" depends="clean,run"/>
</project>