Boolean Splitter Obfuscation Algorithm

Author

Kamlesh Kantilal (kamlesh@cs.arizona.edu)

Description

This algorithm detects boolean variables and arrays and modifies all uses and definitions of these variables. Every boolean variable or array element is split into 2 variables or array elements, and the state of the original variable is reflected in the combined state of these 2 variables or array elements. This algorithm consists of two components. One is boolean variable identification. The other is splitting.

Examples

Application of the algorithm

An assignment to a boolean variable in java source code generates the following java bytecode sequence:

ifeq l1
iconst 0
goto l2
l1:
iconst 1
l2:
istore boolean var
    

An boolean operation is implemented similarly. For example, b=a&&c generates the following code sequence:

iload a
ifeq l1
iload c
ifeq l1
iconst 1
goto l2
l1:
iconst 0
l2:
istore 1
    

Boolean variable detection

We do data flow analysis to identify the possbile value range of any variable, if it is [0,1], we consider it as boolean variable. This information is partly obtained using the stack simulator. But it does not work for the following cases:
b=a, whose code sequence is:

iload a
istore b
    

Hence, we have to do something else. An iterative algorithm is used wherein on every iteration we identify non boolean variables and eliminate them. This iteration stops when no elimination were made in the last iteration as we have identified all the non boolean variables. The remaining variables are considered as boolean.

XOR Splitting

The basic idea of the XOR splitting technique is as follows:

1 boolean a = true;
2 boolean b = false;
    

is converted to:

1 boolean a1 = true,a2 = false;
2 boolean b1 = false,b2 = false;
    

or

1 boolean a1 = false,a2 = true;
2 boolean b1 = true,b2 = true;
    

In general, a single boolean variable will be replaced with a pair of boolean variables such that the state of the original variable at any program point is equal to the XOR of the 2 new variables. Likewise, a boolean array 'a' will be replaced by an array 'b' with twice as many elements, in which the state of array element a[i] in the original array at any program point is equal to the XOR of b[2*i] and b[2*i + 1].

Code Transformation Techniques

Our implementation only obfuscates boolean variables where all assignments to the variables are in one of the following forms:

1 iconst_0
2 istore_1
3
4 
5
6 iconst_1
7 istore_1
    

or

iload_1

istore_2
    

or

1 
2 ifeq L1
3 iconst_1
4 goto L2
5 L1:
6 iconst_0
7 L2:
8 istore_1
    

These patterns are easily modi ed to work as split variables:

1 ifeq L1
2 iconst_1
3 iconst_0
4 goto L2
5 L1:
6 iconst_0
7 iconst_0
8 L2:
9 istore_1
10 istore_2
    

Boolean Array Splitting Techniques

Our implementation only obfuscates arrays that are allocated in a function and used in speci c ways exclusively in that function. Allocation must look like this (all examples use the XOR obfuscation):

1 newarray boolean
2 astore_1

This pattern is converted to this pattern in the obfuscated code, which produces an array that is twice as long:

1 iconst_1
2 ishl
3 newarray boolean
4 astore_1

All stores must look approximately like this:

1 aload_1
2 iload_2
3 iload_2
4 ifeq L1
5 iconst_1
6 goto L2
7 L1:
8 iconst_0
9 L2:
10 bastore
    

This pattern is obfuscated as follows:

ALOAD_1
ILOAD_2
DUP2_X1
POP
DUP_X1
POP
DUP
DUP2_X1;
POP
POP
ICONST_1
ISHL
DUP
DUP_X2
POP
ICONST_1
IADD
ICONST_0
BASTORE
DUP_X2
POP
DUP_X2
POP
BASTORE
POP
    

All uses must look approximately like this:

1 aload_1
2 iload_2
3 baload
    

This pattern is obfuscated as follows:

1 aload_1
2 iload_2
3 iconst_1
4 ishl
5 dup2
6 iconst_1
7 iadd
8 baload
9 dup_x2
10 pop
11 baload
12 ixor 
    

Configuration

There are no extra configuration parameters necessary to run this obfuscator.

References