If you don’t feel like reading, here is the short version: **at most 41.25% going first, and 50.98% going second**.

Let’s start simple, and assume that we are going first (i.e., we draw 3 cards) and have only 1 copy of Undertaker. Of course, no one plays only one copy of Undertaker but just bear with me. Using binomial coefficients, we know that there are C(30, 3) ways to draw 3 cards out of 30, which is 4060. Out of these possible 4060 starting hands, how many have the only Undertaker? The answer is C(29, 2) = 406 (i.e., Undertaker plus all the 2 card combinations of the remaining 29 cards), so** the probability with one Undertaker is 406 / 4060 = 10%**.

Now, our beloved Hunters more likely have 2 copies of Undertaker. This complicates the analysis a bit, and it’s actually simpler to look at it this way:

P(having at least one Undertaker in the initial draw) = 1 – P(not having any Undertaker in the initial draw)

There are C(28, 3) ways to not draw Undertaker at all (all the 3 card combinations in the remaining 28 cards of the deck), so **the probability with 2 undertakers is 1 – 3276 / 4060 = 19.31%**.

If we assume that we toss everything that is not an Undertaker in the opening hand, we get to draw 3 cards again. So the probability to have at least one Undertaker after mulligan is 1 – P(no Undertaker before mulligan) * P(no Undertaker after mulligan). We already know the probability to get no Undertaker in 3 cards from above (that is: 1 – 19.31% = 80.69%). For the probability after mulligan, we apply the same logic but in the remaining 27 cards, so **the probability after mulligan is 1 – 0.8069 * C(25, 3) / C(27, 3) = 36.55%**.

Now, what matters is to play an Undertaker by turn 1, which means there is an additional card draw to take into account. The probability to have at least one is P(not having any after mulligan) + P(not having any after mulligan) * P(drawing one on turn 1) = 36.55% + (1 – 36.55%) * 2 / 27 = **41.25%**

When going second, the initial card draw is 4, so we just need to apply the same logic as in the steps above. In the end, we find that the **probability Undertaker will show up on turn 1 for the second player is 50.98%**.

Of course, there is nothing special about the Undertaker, this gives you an estimate of how likely you are to mulligan what you want. I have double-checked my numbers by randomly sampling 1 million openings, the code is available in this gist.

**EDIT: I updated the numbers to reflect that after the mulligan phase, you can’t draw the cards that have been tossed (as pointed out by masharpe)**

A gentle introduction to abstract interpretation

Get Big Fast vs Organic Growth (Joel Spolsky’s talk at Startup School 2012, @12:50)

]]>]]>Life… is like a grapefruit. It’s orange and squishy, and has a few pips in it, and some folks have half a one for breakfast.

In object-oriented languages, constructing a call graph is complicated by the presence of virtual method calls. This is a problem known as the *dynamic dispatch problem*, and is common in many programming languages and paradigms. The problem can be formulated as follows for Java: if m is a virtual method (i.e. not private or static), then the method that o.m() resolves to depends on the run-time type of o (as opposed to its declaration type). Determining the run-time type of o requires points-to information (what object may o point to?), which in turn requires a call graph to be computed (because what values o may point to depend on what methods get called). This mutual recursion is a fundamental problem, that is not quite trivial to tackle.

A number of static analyses have been proposed to do the job, but since the problem is undecidable no algorithm can return both a precise and correct answer. As a result, we have to decide if we want fast or precise analyses, since they can not be both. Some of the characteristics that affect the precision and cost of the algorithms are:

- flow sensitivity: does the order of statements matter?
- context sensitivity: how to distinguish between two calls of the same procedure in different contexts?
- aggregate structure modeling: how are arrays, lists, and hash tables abstracted?
- heap modeling: how to differentiate the potentially unbounded number of objects that may be created at a given allocation site?

Below are some notes about flow sensitivity and context sensitivity, followed by a discussion of some of the classic algorithms for call graph construction.

- Flow-insensitive analyses are only concerned with what statements are present in the program, not with the order or the reachability of statements. They return properties of variables valid in the whole program.
- Flow-sensitive analyses compute what values may flow to variables by considering valid paths in the program. They return properties of variables valid at certain program points. Therefore, they are more precise but more expensive to compute.

Example (courtesy of smcc): a parity analysis for the following code snippet yields different results depending on the flow sensitivity of the analysis. A flow-insensitive analysis would return that x could be anything, while a flow-sensitive analysis would return that x is even at line 3 and odd at line 4.

void someMethod(int y) { int x; x = 2 * y; x = x + 1; }

A common way to obtain a form of flow sensitivity is to transform the program in SSA form (thereby encoding control dependencies in variable names) and then to apply flow-insensitive analyses. Algorithms for efficient translation to SSA form are given in Appel, *Modern Compiler Implementation in Java*. For an implementation that transforms Java bytecode to SSA form, see Soot. In SSA form, the above example would become:

void someMethod(int y) { int x_1 = 2 * y; int x_2 = x_1 + 1; }

A flow-insensitive analysis would then be sufficient to return that x_1 is even and x_2 is odd.

Although instances of the standard dataflow frameworks (Kildall, *A Unified Approach to Global Program Optimization*, POPL 1973) lend themselves particularly well to flow-sensitive analyses, Nielson et al. in *Principles of Program Analysis* give an example of a flow-insensitive instance, by simply considering empty kill sets in the dataflow equations (i.e., it loses information about which variables are overwritten).

Context sensitivity matters for inter-procedural analyses, where a single method may be invoked in different contexts. So, “what is a context?”, you may ask. Well, it is a parameter of the analysis, so the answer is “it depends”. The most common abstractions seem to be:

- call sites, so that the same method invoked from different line numbers would be distinct. However, if a method is invoked multiple times from the same program point (in a loop for example), then all these invocations would look exactly the same to the analysis, because they are all abstracted to come from the same place.
- receiver objects, so that the same method applied to two different (abstract) objects would be distinct. Objects are commonly abstracted by their allocation site (i.e., the place where they are new’ed). The same restriction as above applies, multiple allocations in a loop will be merged. The technique is described in
*Parameterized Object Sensitivity for Points-to Analysis for Java*(ISSTA 2002, 129 citations) by Milanova, Rounev and Ryder.

**Call site sensitivity example** (from Lhoták’s PhD thesis): A context-insensitive points-to analysis would analyse id() once, and return that both c and d may point to b and a. A context-sensitive analysis using call sites as the context abstraction, on the other hand, would analyze id() in 2 different contexts and would properly return that c points to a and d points to b.

Object id(Object o) { return o; } void f() { Object a = new Object(); Object b = new Object(); Object c = id(a); Object d = id(b); }

**Object sensitivity example** (from Chord’s PLDI tutorial): in the following example, we want to examine whether floors.elems and events.elems may point to the same elements. An object-insensitive points-to analysis would return that they could point to the same objects, because it does not distinguish between different instances of List. An object-sensitive analysis would abstract events and floors using two different allocation sites, and would maintain separate points-to information for evens.elems and floors.elems. Therefore, it would be able to determine that these two arrays have disjoint points-to sets.

class List { Object[] elems; List() { elems = new Object[42]; } } class Bldg { List events, floors; Bldg() { this.events = new List(); this.floors = new List(); for (int i=0; i < events.elems.length; i++) events.elems[i] = new Object(); for (int i=0; i < floors.elems.length; i++) floors.elems[i] = new Object(); } }

In each case (object sensitivity and call site sensitivity), we could get further precision by examining not only the last call site/receiver object, but the list of call sites/receiver objects that lead to the current context (i.e., the call stack is the context). In general, these lists may be infinite, so a common abstraction is to bound them by a parameter k. Hence, we refer to k-object-sensitivity or k-call-site-sensitivity. However, if m in the number of abstract context abstractions (allocation sites or call sites), there are m^k potential abstract contexts to consider. It quickly becomes impractical for values of k greater than 2 or 3.

The added precision of context-sensitivity depends on the client analysis. For instance, context-sensitivity does not add much to the precision of call graph construction and virtual method resolution. In Java, 1-object-sensitivity seems to provide the best tradeoff between performance and precision for most client analyses.

Example:

abstract class Animal { public abstract void saySomething(); } class Cat extends Animal { public void saySomething() { System.out.println("purr"); } } class Dog extends Animal { public void saySomething() { System.out.println("woof"); } } class Fish extends Animal { public void saySomething() { System.out.println("..."); } } class Car { // not an Animal public void saySomething() { System.out.println("honk!"); } } public class Main { static Animal neverCalled() { return new Fish(); } static Animal selectAnimal() { return new Cat(); } public static void main(String[] args) { Animal animal = selectAnimal(); animal.saySomething(); } }

There are 4 methods called saySomething in this application, and the problem is to find which one(s) may be called at run-time (the correct answer is Cat.saySomething()). A dumb and incorrect analysis would just report that any method called saySomething could be invoked, including the one in the Car class. Many algorithms have been proposed to solve this problem, here are a few examples (they are all flow-insensitive, and n stands for the program size):

**CHA**(Class Hierarchy Analysis)- by Dean, Grove and Chambers in
*Optimization of object-oriented programs using static class hierarchy analysis*(ECOOP 1995, 490 citations) - complexity: O(n)?
- it does the only trivial thing to do: it first builds the class hierarchy (i.e., it figures that Cat, Fish and Dog are subclasses of Animal) and determines that saySomething() could be applied to any subclass of Animal, thereby eliminating Car.
**RTA**(Rapid Type Analysis)- by Bacon and Sweeney in
*Fast Static Analysis of C++ Virtual Function Calls*(OOPSLA 1996, 387 citations) - complexity: O(n)?
- simply put, it prunes from the CHA call graph methods that can never be reached because their enclosing class is never instantiated. In our example, it would determine that saySomething() could be applied to a Fish or a Cat but not a Dog, since no instance of Dog is ever created. It is strictly more powerful than CHA and still very fast and simple.
**VTA**(Variable Type Analysis)- by Sundaresan et al. in
*Practical Virtual Method Call Resolution for Java*(OOPSLA 2000, 206 citations) - complexity: O(n)?
- it can be thought of as a refined version of RTA. When RTA collects all objects that can be created in the whole program and uses that information to prune the call graph, VTA does this for each variable instead, providing more precise information. In our example, VTA would correctly determine that saySomething() can not be applied to a Fish, because there is no corresponding edge in the type propagation graph.
**k-CFA**(Control Flow Analysis of order k):- by Shivers in
*Control-flow analysis of higher-order languages*(1991 PhD thesis, 476 citations) - complexity: O(n
^{3}/log(n)) for 0CFA - initially formulated for the functional language Scheme, it has apparently evolved to mean different things in the functional and OO communities (I am still looking for a simple formulation for OO languages though). In
*Resolving and Exploiting the k-CFA Paradox*(PLDI 2010), it is described as a “k-call-site-sensitive, ﬁeld-sensitive points-to analysis algorithm with a context-sensitive heap and with on-the-ﬂy call-graph construction”. It seems that any analysis that uses k-call-strings as contexts could be called a k-CFA analysis. - in the PLDI paper, Might et al. also note that “[in the OO setting] 1- and 2-CFA analysis is considered heavy but certainly possible”.
- in our example, 0CFA would return that saySomething() at line 40 has to be applied to a Cat, because only a Cat can flow to the animal variable.

**ZCWL**- by Zhu and Calman in
*Symbolic pointer analysis revisited*(PLDI 2004, 69 citations) and Whaley and Lam in*Cloning-based context-sensitive pointer alias analysis using binary decision diagrams*(PLDI 2004, 327 citations) - complexity: massive
- essentially a k-CFA analysis where k is the depth of the call graph (plus some fancy-schmancy operations on strongly connected components of the call graph). Requires a context-insensitive call-graph to boot, which it then makes context-sensitive.

Based on this simple example, we see that CHA and RTA return an imprecise call graph, while VTA, 0CFA and ZCWL return a precise call graph. VTA is by far the simplest and most efficient in our case, because we only want a rather coarse-grained call graph, not variable-level points-to information that k-CFA and ZCWL provide.

- for a detailed discussion of the different sensitivities/abstractions: Lhoták,
*Program analysis using binary decision diagrams*(PhD thesis, 2006) - for a theoretical comparison of the relative precision of different algorithms: Grove et al.,
*Call Graph Construction in Object-Oriented Languages*(ACM SIGPLAN 1997, 209 citations) - for an empirical comparison of the relative precision and performance of different algorithms: Lhoták and Hendren,
*Context-sensitive points-to analysis: is it worth it?*(CC 2006, 75 citations)

You don’t expect Antarctica to be a particularly comfortable place, but the book does a very good job at recounting the life (and death) of the expedition in excruciating details. Their boat almost sank on their way to Antarctica as they passed the Roaring Forties. They hoped that ponies could be used to lay out dépôts, but the poor beasts had the greatest pains to adapt to the cold and had to be finished with pickaxes and fed to the dogs. Their two pre-WWI motor sledges just did not work. They were to travel more than 3000 km by foot, dragging sledges with their food and fuel supplies behind them, over a period of 5 months. They had to thaw their way into their sleeping bags at night, and to cut their boots open to give them extra room for the frostbite-induced blisters. There was the constant threat of crevasses, week-long blizzards, scurvy and snow-blindness.

Yet, they were experienced, organized, fit and cheerful. Reminiscent of a huge but flat Himalayan ascent (although the Pole actually lies 2800m above sea level), they built a shelter for the winter and laid out dépôts with supplies for the return of the polar party. After a year of preparation, the polar journey was finally ready. They crossed the immensity of the Ross Ice Shelf (roughly the size of France). They ascended the 160 km long Beardmore glacier, and reached the Polar Plateau. They were ready for the final push to the Pole into uncharted territory. All was looking good.

January 4. A last note from a hopeful position. I think it’s going to be all right. We have a fine party going forward and arrangements are all going well.— entry in the journal of Captain Scott before the final “ascent”

When they reached the Pole after months of such hardships, it was only to find a tent sporting a Norwegian flag and a note by Amundsen, whose expedition was the first to reach the Pole. Just a month earlier.

On the return journey, problems started to arise. The cold was more intense than expected at this time of year, making the sledges harder to pull on ice crystals. Hunger was persistent, as the rations proved to be insufficient (the summit rations were roughly 4900 calories a day, but with no vitamins intake). Frostbite was creeping its way on hands and feet. One man collapsed, and never woke up. Another could not go any further and suggested that they should leave him behind in his sleeping bag. Ultimately, he said “I am just going outside and may be some time”. He walked out into the blizzard never to be seen again. The three remaining explorers fought until the end, but ran out of fuel and food and died 11 miles from a dépôt with plenty. What follows are the last entries in the Captain’s journal.

]]>Monday, March 19. Lunch. We camped with difficulties last night and were dreadfully cold till after our supper of cold pemmican and biscuit and a half pannikin of cocoa cooked over the spirit. Then, contrary to expectations, we got warm and slept well. To-day we started in the usual dragging manner. Sledge dreadfully heavy. We are 15.5 miles from the dépôt and ought to get there in three days. What progress ! We have two days’ food but barely a day’s fuel. All our feet are getting bad — Wilson’s best, my right foot worse, the left all right. There is no chance to nurse one’s feet till we can get hot food into us. Amputation is the least I can hope for now, but will the trouble spread ? That is the serious question. The weather doesn’t give us a chance — the wind from N. to N.W. and -40 degrees. temp. to-day.Wednesday, March 21. Got within 11 miles of dépôt Monday night ; had to lay up all yesterday in severe blizzard. To-day forlorn hope, Wilson and Bowers going to dépôt for fuel.22 and 23. Blizzard bad as ever — Wilson and Bowers unable to start — to-morrow last chance — no fuel and only one or two of food left — must be near the end. Have decided it shall be natural — we shall march for the dépôt with or without our effects and die in our tracks.Thursday, March 29. Since the 21st we have had a continuous gale from W.S.W. and S.W. We had fuel to make two cups of tea apiece and bare food for two days on the 20th. Every day we have been ready to start for our dépôt 11 miles away, but outside the door of the tent it remains a scene of whirling drift. I do not think we can hope for any better things now. We shall stick it out to the end, but we are getting weaker, of course, and the end cannot be far.It seems a pity, but I do not think I can write more.

He is at the origin of k-CFA, a class of static analyses that tackle the *dynamic dispatch problem* with context-sensitive data-flow analysis. Even though the analysis is targeted at functional languages, it also applies to object-oriented languages with virtual function calls (see Matt Might for a gentle introduction). From Shivers’ thesis:

So, if we wish to have a control-flow graph for a piece of Scheme code, we need to answer the following question: for every procedure call in the program, what are the possible lambda expressions that call could be a jump to? But this is a flow analysis question! So with regard to flow analysis in a [higher-order language], we are faced with the following unfortunate situation:

- In order to do flow analysis, we need a control-flow graph.
- In order to determine control-flow graphs, we need to do flow analysis.

The problem could be rephrased for Java as: “if we wish to have an [inter-procedural] control flow graph for a Java application, we need to answer the following question: for every instance method call, what classes could the object be an instance of?”

]]>
]]>

The algorithm, adapted from the maximal fixedpoint algorithm in the dragon book, takes a control flow graph as input and outputs IN and OUT (maps from basic block to abstract state at their entry and exit (an abstract state maps variables to abstract values)). It is parametric, you must supply it with a few functions that will determine the output of your analysis:

- analysis.meet, takes two abstract states and returns an abstract state (see lattices). To guarantee that the algorithm terminates, this function should be monotone (and your lattice of abstract values of finite height).
- analysis.step_forward (resp. analysis.step_backward), a function that takes an instruction and an abstract state at its entry (resp. exit) and “executes” it, transforming the abstract state. They are used to automatically compute the transfer function for each basic block in the cfg.

It looks like this:

def forward_transfer_function(analysis, bb, IN_bb): OUT_bb = IN_bb.copy() for insn in bb: analysis.step_forward(insn, OUT_bb) return OUT_bb def backward_transfer_function(analysis, bb, OUT_bb): IN_bb = OUT_bb.copy() for insn in reversed(bb): analysis.step_backward(insn, IN_bb) return IN_bb def update(env, bb, newval, todo_set, todo_candidates): if newval != env[bb]: print '{0} has changed, adding {1}'.format(bb, todo_candidates) env[bb] = newval todo_set |= todo_candidates def maximal_fixed_point(analysis, cfg, init={}): # state at the entry and exit of each basic block IN, OUT = {}, {} for bb in cfg.nodes: IN[bb] = {} OUT[bb] = {} IN[cfg.entry_point] = init # first make a pass over each basic block todo_forward = cfg.nodes todo_backward = cfg.nodes while todo_backward or todo_forward: while todo_forward: bb = todo_forward.pop() #### # compute the environment at the entry of this BB new_IN = reduce(analysis.meet, map(OUT.get, cfg.pred[bb]), IN[bb]) update(IN, bb, new_IN, todo_backward, cfg.pred[bb]) #### # propagate information for this basic block new_OUT = forward_transfer_function(analysis, bb, IN[bb]) update(OUT, bb, new_OUT, todo_forward, cfg.succ[bb]) while todo_backward: bb = todo_backward.pop() #### # compute the environment at the exit of this BB new_OUT = reduce(analysis.meet, map(IN.get, succ[bb]), OUT[bb]) update(OUT, bb, new_OUT, todo_forward, cfg.succ[bb]) #### # propagate information for this basic block (backwards) new_IN = backward_transfer_function(analysis, bb, OUT[bb]) update(IN, bb, new_IN, todo_backward, cfg.pred[bb]) #### # IN and OUT have converged return IN, OUT

Ideally, to propagate dataflow information in one pass, you would like to have visited every predecessors of a basic block B for a forward pass before analyzing B. Unfortunately, due to irreducible flow graphs you are not guaranteed to be able to do this. Instead, this algorithm

- starts with an empty state at some arbitrary basic block
- makes a forward pass and a backward pass over each basic block, adding the successors/predecessors to a worklist when changes are detected
- continues until the worklist is empty.

The meet function is here to “combine” information from multiple paths, for instance if B2 is reachable from B0 and B1, then IN(B2) = meet(OUT(B1), OUT(B2)). If you wanted to collect value set information and you had:

- OUT(B0) = [a->{1}]
- OUT(B1) = [a-> {-1}]
- then meet could output IN(B2) = [a -> {1, -1}]

Depending on how meet is defined, it can look for information true for all paths coming to a basic block, or for information from at least one path.

Now, some sample code to implement a simple constant propagation analysis. It is forward only for simplicity, but the algorithm works for bidirectional analyses such as type inference.

def meet_val(lhs, rhs): result = None if lhs == 'NAC' or rhs == 'NAC': result = 'NAC' elif lhs == 'UNDEF' or rhs == 'UNDEF': result = 'UNDEF' else: result = 'CONST' return result def meet_env(lhs, rhs): lhs_keys = set(lhs.keys()) rhs_keys = set(rhs.keys()) result = {} for var in lhs_keys - rhs_keys: result[var] = lhs[var] for var in rhs_keys - lhs_keys: result[var] = rhs[var] for var in lhs_keys & rhs_keys: result[var] = meet_val(lhs[var], rhs[var]) return result def abstract_value(env, expr): if expr.isdigit(): return 'CONST' try: return env[expr] except KeyError: return 'UNDEF' def step_forward(insn, env_in): if type(insn) == str: return var, op, expr = insn # insn is var = c if len(expr) == 1: env_in[var] = abstract_value(env_in, expr) else: e1, op, e2 = expr val1 = abstract_value(env_in, e1) val2 = abstract_value(env_in, e2) env_in[var] = meet_val(val1, val2) def step_backward(insn, env_in): pass

The function step_forward defines the abstract semantics for the statements or instructions of the language you want to analyze and for the analysis you want to implement. For instance here we only collect if a variable at some program point is constant, undefined, or not a constant (NAC). To do the actual propagation, we could also collect the allocation site of the constant.

Let’s consider a super simple language, where variables are numbers that can only be affected to or added together. The function meet_val computes the meet for two abstract values, according to this table:

UNDEF CONST NAC ----------------- UNDEF | UNDEF UNDEF NAC CONST | UNDEF CONST NAC NAC | NAC NAC NAC

Let’s consider a simple program in this “language” where we don’t specify the constructs for the control flow. The algorithm just assumes that every edge in the CFG is reachable. This is obviously not the case in practice, but that only means that we are going to miss some patterns (the analysis is sound but imprecise in order to terminate).

Now, we want to find if a and ret are constants. Here is the code necessary to setup and run the example (you need networkx to run it):

import networkx as nx class SomeObject: pass def instructionify(somestr): toks = somestr.split() if '+' in somestr: return (toks[0], toks[1], (toks[2], toks[3], toks[4])) return tuple(somestr.split()) # setup the program's cfg prog = nx.DiGraph() s0 = ('entry'), s1 = instructionify('b = x'), s2 = instructionify('c = 2'), s3 = instructionify('a = 40 + c'), s4 = instructionify('ret = a + x'), prog.add_edge(s0, s1) prog.add_edge(s1, s2) prog.add_edge(s2, s1) prog.add_edge(s1, s3) prog.add_edge(s3, s3) prog.add_edge(s3, s4) # initialize pred and succ pred, succ = {}, {} for bb in prog: pred[bb] = set(prog.predecessors(bb)) succ[bb] = set(prog.successors(bb)) cfg = SomeObject() cfg.nodes = set(prog.nodes()) cfg.pred = pred cfg.succ = succ cfg.entry_point = s0 analysis = SomeObject() analysis.meet = meet_env analysis.step_forward = step_forward analysis.step_backward = step_backward # run the whole thing IN, OUT = maximal_fixed_point(analysis, cfg) print 'a at program point s3 is', OUT[s3]['a'] print 'ret at program point s4 is', OUT[s4]['ret']

And the output is:

a at program point s3 is CONST ret at program point s4 is UNDEF

As a final note: it is possible to speed things up a bit by choosing a better ordering for basic blocks than just going randomly at first (because we initially fail to propagate lots of information). This might end up in another blog post. Cheers!