A little about how CrossHair works: Part 1

CrossHair and Such

04 Feb 2020

CrossHair is an ambitous project, and a lot of it seems pretty magical. It’s less magical than you might think, and Python does a lot of the heavy lifting for us.

This is part one in hopefully a series of posts about how CrossHair works.

I recommend the following reading as well, depending on your familiarity with other work in this space.

The Basic Idea

CrossHair just repeatedly calls the function you want to analyze and passes in special objects that behave like things your function expects.

CrossHair doesn’t do any kind of AST or bytecode analysis. It just calls your function.

These “special objects” behave differently on each execution. That’s how CrossHair explores different paths through your function.

These special CrossHair objects hold one or more Z3 expressions which are used in the Z3 solver. Here are some examples:

When your function takes a parameter with this python type, we supply an object of this CrossHair type, which holds an expression with this Z3 sort:
int SmtInt IntSort()
bool SmtBool BoolSort()
str SmtStr StringSort()
dict SmtDict ArraySort(K, V) and IntSort() for the length

Let’s Explore

We can initialize a CrossHair object by giving it a name:

>>> crosshair_x = SmtInt('x')

We can access the .var attribute of any CrossHair object to get the Z3 variable(s) that it holds:

>>> crosshair_x.var
>>> type(crosshair_x.var)
<class 'Z3.Z3.ArithRef'>

This takes the Z3 variable we just defined and adds one to it:

>>> expr = crosshair_x.var + Z3.IntVal(1)
>>> expr
x + 1
>>> type(expr)
<class 'Z3.Z3.ExprRef'>

We can create CrossHair objects not only for fresh variables, but also for Z3 expressions. So, if we wanted to wrap x + 1 back into a CrossHair object, we’d write:

>>> SmtInt(crosshair_x.var + Z3.IntVal(1))

The SmtInt class defines the __add__ method so that you don’t have to spell that out, though. You can just say crosshair_x + 1, and SmtInt does the necessary unwrapping and re-wrapping:

>>> type(crosshair_x + 1)
<class 'crosshair.libimpl.builtinslib.SmtInt'>

SmtInt also defines the comparison methods so that they return symbolic booleans:

>>> type(crosshair_x >= 0)
<class 'crosshair.libimpl.builtinslib.SmtBool'>

The symbolic boolean holds an equivalent Z3 expression:

>>> (crosshair_x >= 0).var
0 <= x

So far, everything is symbolic. But eventually, the Python interpreter needs a real value; consider:

>>> if crosshair_x > 0:
>>>   print('bigger than zero')

Should this execute the print or not? When python executes the if statement, it calls __bool__ on the SmtBool object. This method does something very special. It consults Z3:

CrossHair will remember what decisions it has made so that it can make different decisions on future executions. Ultimately, we’re looking for some target thing to happen: an exception to be raised, or a postcondition to return False. When that happens, we ask Z3 for a model and report it as a counterexample.

That’s the core of how CrossHair works.

Simple, right?

Well, if there is an accomplishment about CrossHair, it’s that it tries hard to get the details right. And there are a lot of details.

Here are some of the topics I’m considering talking about next. Let me know which ones interest you the most!