Sunday, August 6, 2017

DEFCON Finals 2017 - Introduction & Rubix

Over the weekend, July 28-31 2017 RPISEC competed with Lab RATs and Techsec in DEFCON Finals, one of the most important CTFs of every year. Only 15 teams in the world get to qualify for the event each year, and our team under Lab RATs was able to earn the right to compete among 14 other globally professional teams.

This writeup covers the first challenge presented at DEFCON Finals, which kept us on our toes throughout the competition. It was deceptively simple, and also served as an introduction to the cLEMENCy architecture as well to the toolchain that was used to create all future binaries.

The majority of this writeup will involve getting our feet wet with the cLEMENCy architecture, a terrifying 9-bit middle endian architecture, as well as setting up a comfortable environment for reverse engineering the later challenges released at DEFCON.

DEFCON Challenge Format

For the CTF we are given only an emulator and a debugger to solve each of the challenges; these challenges are compiled for a 9-bit middle endian architecture, cLEMENCy. When we first run the rubix challenge, we note that only garbage gets printed to the terminal. This output doesn't change each run, and it's puzzling what exactly this is. What if it's the emulator printing out 9-bit ascii instead of 8? Indeed, when we transform from 9-bit to 8-bit between the debugger and our terminal, we end up with the following:

This service implements a rubix cube.  Solve the cube and win.
Give me 54 Bytes(ie: 53,12,5,etc): <input 54 comma-delimited chars>

             R6 R7 B0
             L1 T4 R5
             A0 B7 R2
  Left         Front       Right       Back
L8 L7 L0     T0 A7 F0    R8 F3 L6    B6 T3 T6
T5 L4 B1     F7 F4 A1    T1 R4 T7    F1 A4 R1
B8 B3 F8     A6 R3 L2    T2 L3 B2    R0 L5 A8
             A2 F5 F6
             B5 B4 A3
             T8 A5 F2

Note that for the input too, we must transform the 54 comma-delimited 9-bit integers from 8-bit ascii to 9-bit ascii. Because interacting with the emulator was so painful yet so crucial to all challenges hereafter, we wrote our own internal 9-bit to 8-bit translator stand-in for netcat, dubbed cLEMnc.

Symbolizing LibC

Before we go any further, we attempt to dig into the code. First, we note the following strings in the binary (using a tool which specifically searches for ascii strings in 9-bit format):

memtst %d %d  %ld
memtst: increase MEMTSTSZ
memtstleak %8p %8ld %s
memtstfail %8ld %s
memtstfree %8p %s

... among many other strings. However, these strings are present in almost every binary--it's natural to assume that there exists a libc somewhere which is shared between all challenges. Searching for these strings gives us a link to neatlibc; it turns out that starting at address 0x60 (in this binary), we can work our way up from atoi, as each function in the binary is in exactly the same order as that specified in the enclosing file, with each file being compiled in lexigographic order. For example, atoi is immediately followed by atol (see atoi.c), followed by isascii, isblank, etc (from ctypes.c). With a few exceptions, we are able to symbolize all of libc for all future challenges, as functions are defined immediately one after the other, with no gaps. The full script can be found here.

Thanks to Kareem (irc: krx) for developing the majority of the binja script!

Alternatively, we could consider a more generic approach which takes advantage of flare or flirt signatures, but our quick and dirty script worked on all the challenges anyway.

Analyzing Rubix

We now start disassembling from address 0, which is similar to _start on more familiar platforms; this start template is closely shared among all challenges, with slight variation.

This second call to 0x5fd0 turns out to be the call to main, whereas 0x8b4 is simply setting up some state. In binaries for which the stack grows in the opposite direction (Weird, right? One other quirk of this architecture) _start looks slightly larger, but still has the same structure.

Manually combing through main, we see the following:

The binary maintains 3 cubes throughout the program: cube1 and cube2 are initially set directly to our input. cube3 is only used for printing out the cube and can be ignored, and is also not seen in the main function.

After our call to srand(), we perform a random shuffle on cube1, shown below:

This randomly performs actions on cube1 and cube3 in such a way that the resulting cube is always solvable. Then, the program enters a read loop to parse an action and perform it on cube1 (PerformAction()); once cube1 matches cube2 (which was initialized to our custom input and never touched again), that means we've solved the puzzle and are now able to execute it as shellcode.


We would like to execute arbitrary shellcode that fits into 54 bytes. It turns out that because RandomShuffle() performs only valid actions on cube1, we can always reverse this by simply knowing the seed, which is in fact the first three numbers we give it, converted to a trie. A valid solution script, which recieves shellcode and returns the actions which need to be performed to solve the cube is shown below (minus the IO portion):

The given bytes should be interpreted as valid shellcode which prints out the flag (e.g. printf(0x401000)). The example shellcode used in the example above disassembles to the following:

Recommended Patch

Unfortunately, DEFCON Finals requires much more past the initial exploit; we need to patch our own services in a way that does not violate the service check, and in a way that disrupts all future exploits from other teams. This is challenging, because we are unable to simply kill the call to mprotect or the call to the user's shellcode; the service check likely depends on being able to run shellcode, even if it does not necessarily have to print out the flag.

Fortunately, LegitBS dropped us a nice hint in the binary in the form of FilterFunction(), shown below:

Clearly, we're being asked to "filter" the globally stored shellcode, and insert it into this huge code cave. In other words, to completely patch the program, we are asked to write a static or dynamic shellcode analyzer to filter out shellcode which could potentially leak the flag. Hideous!

Unfortunately, neither our team nor any others wrote a full static analyzer within the time constraints, and every patch was circumventable. By the third day, Lab RATs had accumulated a working shellcode for every individual patch, so we were consistently popping flags on all other teams. As a result, this challenge kept everyone on their toes the entire competition.

Thursday, July 6, 2017

Google CTF Quals 2017 - Web Assembly

This is a writeup on Web Assembly challenge from Google CTF Quals 2017, which took place on 6/16/17-6/18/17, and was categorized as a web challenge. This writeup was also submitted to the Google CTF Writeup Contest, and earned us $500. Thanks to Nick Burnett (irc: itszn) for solving the challenge and for authoring the writeup!

Challenge Description

This challenge takes place on a very retro looking page the lets you drag assembly instructions from the sidebar into the main page. One button lets us compile the code, and another lets us run it. There are also a list of test cases. A quick glance at the source shows that they have implemented a simple assembly architecture and vm on the client side.

We are given several unminimized javascript files. I will quickly list what each is responsible for:
  1. asm.js - This file parses the input, and converts it into a "bytecode", which encodes the instructions as raw bytes.
  2. vm.js - This file contains the VM implementation that decodes the bytecode and runs the instructions.
  3. test.js - This file contains code to run a webworker with the VM. It also gives it the testcase input, and compares the worker's output to the expected output.
  4. worker.js - This is run by the webworker. It takes the input, runs the VM, and then responds with the output.
One of the testcases checks our code's output against the flag. If it matches, it will print the flag out to us. Since we cannot know the flag to output it, we can assume that we need to find a bug in the VM and gain javscript execution.

Bytecode Compilation

The first step is in asm.js, where our data is parsed and compiled to a byte code. This process is fairly straight forward. There are three datatypes:
  • int is simply a 32 bit integer value
  • float is simply a 64 bit float value
  • string is a set of bytes, which is prefixed by the length as an integer. Strings are decoded back to normal javascript strings later one.
If a label is used, it looks up the location as an integer. However, it also sets the high bit of the byte that represents what data type it is. This is important for later, so I modified the code to let me set the bit by prepending the type with a *.

The actual instructions are also encoded as a byte. Finally, the 'data section' is encoded similarly to the data types, and stored at the beginning of the output byte array.

VM Implementation

The virtual machine first decodes all of the instructions. Each one is replaced by a function calls the action with the given arguments. Each argument is decoded, and passed to this function. If the high bit is set (marking it as a pointer), the argument is replaced by a function that offset of memory: 

Here is the code for two important opcodes:
  • mov - Moves the second value into some offset of memory

  • get - Call the function associated with a file descriptor, and store the value in memory

getValue() will recursively call its input until an error is thrown. This is used to either return the non-function for the normal values, or call the function for the pointer values: 

The other opcodes are fairly straight forward, but we will only need these two for the final exploit anyway. All these functions are put into an array called memory, with the data section starting at index 1.

Breaking out of the VM

The bug in this implementation is pretty simple. When we access some value in memory, we are not limited to numbers, since we can use force a string to be a pointer by setting the high bit. When we do this we gain access to all the attributes of the javascript array such as __proto__.

Doing something like mov int 0 *string __proto__ ultimately performs the operation memory[0] = memory['__proto__'].

Background: __proto__ 

In Javascript pretty much every type is an Object. Objects have attributes that define what they do, many of which are backed by the native interpreter, depending on the object's type. These attributes can be accessed with either the . operator, or ['key'] notation.

Objects also have __proto__ attribute (which is also an object), that defines all attributes for the class of the object. When you access an attribute that is not a direct property of the instance of the object, Javscript will try to access it on the object's __proto__. Of course if it isn't a direct attribute of the
__proto__, it will check the __proto__'s __proto__ (remember, __proto__ is just an object too!). This is how Javascript does inheritance. If the attribute is not found anywhere, and a null __proto__ is reached, then it returns it as undefined.

Note that the same
__proto__ is shared for a given class, so if you modify it, objects of the same type will also be affected by the changes.

Accessing a Function Constructor

In javascript there are many ways to try and escape sandboxes. Our eventual goal will be to call eval('our data') or Function('our data')().

If our goal is to run Function('our data')(), we need to be able to arbitrarily call Function, however we don't actually have a reference to it anywhere. Luckily, you can also use constructor, as long as you have the constructor reference from a function.

Unfortunately for our current situation, memory is an array, not a function, so memory['constructor'] will only ever create an array. To bypass this, we can change the __proto__ of memory. As I said above, javascript will recursively search __proto__ until it finds the attribute you are looking for. If we are asking for constructor, it will search memory.__proto__ for constructor, and if not found look for it in memory.__proto__.__proto__.

So what if we replace memory.__proto__ with some function? Well constructor will be found in memory.__proto__.__proto__ which will happen to be the function's original __proto__!

If so many __proto__s confuse you, the TL;DR is that we can turn memory into a function object temporally, allowing us to access a function constructor.

All we need to do is mov string __proto__ *string someArrayFunction which hopefully become memory['__proto__'] = memory['someArrayFunction'].

The only problem now, is getValue(). As we recall, getValue()
will continue to call what ever we try to access. If we want to store a function, we need getValue() to return a function. The only way to do that is to cause an exception. Looking at the array's functions, I found __defineGetter__, which takes two arguments. If only one is given, such as in getValue(), it throws an exception. Perfect! So far our exploit is:

Calling the Function's Constructor

First we want to grab the constructor from the now-function memory object with mov int 0 *string constructor, which will do memory[0] = memory['constructor'].

The next challenge is to actually call it with our payload. It is easy to call, all we need to do is mov int 0 *int 0 but this will end up doing memory['constructor'](memory) thanks to getValue(). Unfortunately this throws an exception as it tries to call memory.toString(), but toString() is function's toString(), which does not expect an array.

We can fix this by restoring memory's __proto__ with an array, much like how we made it a function before.

However, where do we get an array? We can't even call memory's constructor to make one, since it is a function now... Luckily we recall the get opcode, mentioned earlier.  fds is an array with a normal __proto__, so we can do get string __proto__ string constructor which will run memory['__proto__'] = fds['__constructor'](). This makes memory an array again.

memory.toString() works again, but what does it actually produce? For an array, it functions like memory.join(','). This will give us our data separated by commas.

For this to be valid javascript, we can stick our payload at the start, and comment out the rest:

To do this, we can simply stick our payload in the data section, and move it to index 0, while moving a */ to a very far off index. Here is our payload now:

All this to was done in order to call Function('PAYLOAD/*,,,,,,*/')()!

Passing the Flag Test

Now that we have arbitrary javascript running, we need to figure out how to get the flag. The code is running in a webworker, which is somewhat sandboxed. It cannot access the dom, nor the location of the old page, which is where the flag is located.

So now we can look at how the parent is reading the response from the worker. We can send any responses we want now, so there may be a bug there too.

test.js sets the worker, with a callback for onmessage

We can see that if the answer attribute of the returned data object is not equal to the expected output, it will reject, and terminate our worker. However, if we are able to cause an exception before worker.terminate();, we will be able to continue sending guesses.

Looking at TestCaseError we can see that data.test is appended to a string, meaning toString() will be called:

It is easy to cause an exception here, by making test.toString not a function: 

Now we can guess as many times as we want. As long as we get it right once this code will be called:

We can do this easily:

At this point we would have to make this code both small enough to make <400 bytes encoded, and also make it play nicly with the parser. I decided not to do this, and instead use a nice feature of the webworker, importScripts. importScripts will synchronously load and run a javascript file, which is nice, because we can make our payload as long as we want now. Here is the final payload (remember the `*` syntax is something I modified myself to set the pointer bit): 

Running this with the 'Guess The Flag' test causes all the test cases to pass, and have it print the flag.

Now we just need to submit it so it will run on the remote server. It took a few tries, because I kept getting 500 errors (although I knew it was working because I was getting requests for the payload file). Finally it went though:

The final flag is CTF{_r3m0v3_th3_c0mm4s_plz_kthxbye_}

Google CTF Quals 2017 - Moon

This writeup is for the reversing challenge "Moon" we solved during 2017 Google CTF Quals. This writeup and 3 others were also submitted to the Google CTF Writeup Competition.

Dealing with GLEW

A big problem we noticed early on was the use of GL3W, which generates code to lazily load all OpenGL functions at runtime--at all places where an OpenGL function was used we would simply see a call to some offset in the data section. Unfortunately we couldn't find a script to re-symbolize function calls, but we plan to make one soon-ish :)

We can see the huge routine which calls LoadLibrary on every OpenGL function, at sub_4032c0.

Instead of symbolizing (by hand) all of the symbols, we only bothered to load ones which had valid XREFs to them, saving a bit of time.

Running the Program

Unfortunately all the RPI-sold computers from our year are not yet reported to support OpenGL 4.3, so first and foremost we had to patch the OpenGL verification check from 4.3 to 4.2. Surprisingly, this "just worked" despite the code making use of Compute Shaders which I had thought to be introduced in OpenGL 4.3.

The program simply opens up a window, and asks for a password. After we've entered 32 characters, the program either responds "good" (presumably), or "Nope".

When we XREF the string Nope, we see that it is used when constructing the texture to be printed for this SDL event loop iteration. Not too far from "Nope" do we find "Good", and we notice that "Good" is only selected if a particular global variable is set. We trace this back to the following code in main:

We want should_compute here to be 2, meaning the memcmp succeeded. Buf2 is the following string:


So, once our input is hashed, it must match the above value. The hashing of our input seems to occur at sub_401BF0, and through windbg we can confirm that our input is the first argument, and the hash is written out to the second argument. Nothing much happens here, however, though we do see references to glUseProgram and glDispatchCompute, as seen below:

Although we can see the fragment and vertex shaders in clear view in the strings, we can't see any reference to the compute shader glsl source. We assume it's encrypted somehow, so we break on glCompileShader and wait until the glsl compute shader is decrypted. This is a wild guess, as the shader could have been precompiled somehow but we punt on this.

The dumped source is as follows, after adding the proper formatting:

Reversing the Compute Shader

We note a few properties of the hashing function below:

  1. The h variable is constant for all iterations of this compute shader. It is dependent only on the password.
  2.  The only place in which the index, idx affects final is where we compute final ^= idx << i in a loop. This is completely reversible.
  3. Other than these 2 conditions, final is completely dependent on the current character.

Our goal here is to exploit these characteristics to find an idx-independent and password-independent hash for each character. This would allow us to brute force the password character by character. We can find the h of the real password like so:

hash_C  = reverse_idx(final_C) ⊕ hash_h
hash_C' = reverse_idx(final_C) ⊕ hash_h'

Here, hash_C is the value of the first index corresponding to the 'C' character in 'CTF' (we're assuming the password starts thus because of the flag format), for the real password, or the password we'd like to find. hash_C' is the value of the same index in our dummy string, say CTF{AAAAAAAAAAAAAAAAAAAAAAAAAAAA which also has a 'C' at the same index. hash_h represents the h value used to xor with the output of the hash function for the real password, and likewise hash_h' is this value of h for our dummy password.

Note that final_C is the same for both the real and the dummy password; it's passed out of the hash function. The reverse_idx function removes final_C's dependence on idx, reproduced below:

reverse_idx(final,idx) = final ⊕ (idx<<0) 
                               ⊕ (idx<<6) 
                               ⊕ ... 
                               ⊕ (idx<<26)

Finally, to compute hash_h, we simply need to perform the following:
hash_h = hash_C' ⊕ hash_C ⊕ hash_h'

Although it took a lot of work, we now have h of the target password, since we know hash_C', hash_C, and hash_h'. With this, we can figure out the hash value of every character in the password, independent of the character's index, and brute force them character by character.

Brute Forcing the Password

Unfortunately we were unable to replicate this algorithm forwards in C++, for some unknown reasons (probably something to do with the calc function). Since we were running out of time, we opted instead to compute a lexicon of characters up front with the debugger. We then took the hash of each character and made them idx-independent, as well as un-xor'd their h terms, like so:

Now, all we need to do is take Buf2's characters and render them also idx-independent, un-xor them with the known h for this password (which turns out to be 0x6f6f6f6f, computed in the previous section) and then match each hash with the known values in our lexicon, as below:

The resulting flag is: CTF{OpenGLMoonMoonG0esT0TheMoon}

Google CTF Quals 2017 - The X Sanitizer

This is a writeup on The X Sanitizer challenge from Google CTF Quals 2017, which took place on 6/16/17-6/18/17, and was categorized as a web challenge. This writeup was also submitted to the Google CTF Writeup Contest. Thanks to Nick Burnett (itszn on IRC) for solving the challenge and for authoring the writeup!

Challenge Description

Investigation: Index page

The site contains a text box, which we can enter html into. When the button is clicked, it runs some kind of sanitization program, and finally renders the output back to the screen. The page claims that the entire process is client side, and that there is no hidden server logic. From this and description, I would guess that the goal is to preform a Cross Site Scripting (XSS) attack on the page.

Background: Cross Site Scripting

Browsers try to protect users from malicious websites by using something called the Same Origin Policy (SOP). This policy controls what a website can and cannot do. For example a website can access its own cookies and read its own web pages, but it cannot read the cookies or data of another webpage. To define what a webpage is, we use the term origin. A page's origin in most cases is based on the domain name. So is one origin, while is another.

The fact that SOP blocks cookies is a good thing for the user, because most websites use cookies to tell if you are logged in. Reading another site's cookie would allow an attacker to log in as you.

However, I mentioned that websites can access their own cookies. Here is where XSS comes into play. If an attacker can run javascript on a website, they will have all the same permissions as the website, even if the script was not originally from the website (hence the name cross site scripting). Executing javascript on this origin will be our goal for this challenge.

Investigation: Santization system

Included from the index page was two javscript script file sanitize.js. We can see that it first takes our input in the Sanitize function. The code then spawns a service worker. Service workers are a feature in chrome which allow the client to server response to requests for a script. Below we can see the responses it sends as part of the fetch function:

  • /sandbox will append the contents of the url parameter html to this html which loads the sanitize script:

  • /sanitize will respond with a script that sets up a 1 second timer to respond to the parent, as well as a Content Security Policy (more on that in a second). It also creates a remove function which will either delete a given html node, or remove the documents contents:

  • Any other request will respond with a page that is designed to either be html or javascript, either way it will run the javascript (since it will request x which just returns this page again.) The purpose of this code is to delete whatever requested it, either the script tag, or the HTML import (we'll look at this more soon too):

The sanitize function first tries to remove a few black listed words from our input:

Since they run it with a loop, we cannot bypass it by simply doing something like <me<meta>ta>. However, it is good to keep in mind what they are trying to block.

Finally the function creates an iframe pointed at /sandbox?html=<OURINPUT> and lets it run. As we saw above, after 1 second the page will send its contents to us. Once we get that back, the script writes the contents to the page without any further sanitization. If we can get any javascript into here, we should be able to steal the cookies.

Investigation: Sandbox page

As I said, the script run in the sandbox page sets up a Content Security Policy (CSP) using the meta html tag. This policy consits of default-src 'none'; script-src *. This means that by default all requests and inline content will be blocked, but all script requests will be allowed (but not inline content). Seeing this we can also check the CSP of the main page to find it is script-src 'self' which will block all script requests not going to the same origin.

Background: Content Security Policy

A CSP is another tool the browser uses to protect sites. Like SOP it dictates what a site is allowed to do. However, these restrictions are actually enabled by the site itself, to protect it from things it might not normally do. For example, if a site never expects to run unsigned script tags, then if one appears, it is probably an attacker trying to preform an XSS attack. By setting script-src in the CSP, the site knows to block that tag. Good CSPs are very effective and can be very difficult to bypass.

To check for a CSP, first check the response headers of the site. If there is not one there, it can still be enabled with a <meta> HTML tag in the page header.


The CSP on the sandbox page also has a special feature. The sanitize script sets up a callback which will be called on securitypolicyviolation which will happen any time a request is blocked by the CSP. It calls the remove function, which will delete the element that caused the CSP to trigger, removing them from the final output of the sandbox!

The second feature is that any scripts we run will respond with the javascript that removes the script tag. This also tries to stop HTML imports. HTML imports are a way of loading another HTML page into the current page, and is useful for XSS since the browser will run anything we put on the other page (assuming the CSP doesn't stop it.) It is done like this: <link rel="import" href="page to load">

Here, importing any page will also respond with this response. The javascript will be ignored, but <script src=x></script> will run, and the same script will be loaded. querySelector('link[rel="import"]') looks for the link tag doing the import.

At first glance it seems that every way for us to run javascript is either blocked, or will cause our tags to be removed from the final output!

Sandbox Bypass

I found two ways to bypass the sandbox, and inject script tags into the main page. Both of them use the HTML import feature.

Method 1:

To respond to the parent, we saw that the sandbox uses a one second time:

When this timer triggers, anything still on the page will be send back to the script.

I found that by using the async feature of HTML imports, I could cause some to remain when time was up. Adding async to the import tag, causes the import to actually be loaded after the page has finished loading. This means that onload would have been triggered, and the timer would have started counting down. By adding a large number of these tags (around 500), some will remain by the time 1 second is up.

Method 2:

A simpler method, (probably the intended solution) is due to a flaw in their code. When the import removing code is run, it uses querySelector('link[rel="import"]') to find the link tag. However this will only locate the first link tag.

If we put <link rel="import"> and also <link rel="import" href="page to load>, then only the first will be deleted when the second is loaded!

Using either method, we can now do a HTML import on the main page. However there is a new problem! As I mentioned above, the main page has a CSP with script-src 'self'. This means that we can only run scripts and import pages from the domain.

Bypassing script-src 'self'

Our goal is still to run javscript, but now we must find a way to load it from the somewhere on the challenge.

Injecting a Script Tag

Lets start by injecting a script tag using the HTML import we smuggled out of the sandbox. This is relatively easy, thanks to the sandbox page. We can url encode the script tag with javascript and put it as the html url parameter.

Requesting /sandbox?html=%3Cscript%20src%3D%22target%22%3E%3C%2Fscript%3E gives us

You may be worried about the sanitize script being run again, but luckily since our code doesn't actually 'activate' it, there is no client yet, so the logic causes it to 404:

Our payload so far:

Putting Javascript on /sandbox

Now we can load a script, but we can still only load from the domain. We can try to put our script on /sandbox like we did the script tag, put that gives us problems, since the tags in the first part of the page is not valid javascript.

To bypass this we can use an encoding attack. An encoding attack is where we specifiy a multibyte encoding for the script. If we are lucky, all of the html junk will turn into one large valid identifier, thanks to javascript's unicode support.

If we were to load the page as utf16 big endian (specified as utf-16be), beginning turns into


To prevent this from causing an error, we can append =0\n. Now we can also append our own cookie stealing payload and encode it as utf-16be and urlencode (for normal characters in utf-16be, the character is prepended by a null byte):

We can load it like this:

The script that is run is this:

Putting it all together

Now we can put that script into the import like we did before, and we should be good to go:

This gives us the final long payload

However, if we try this, we find there is still one problem! The original sandbox removes 'utf-16be' from our input:

This is easy to bypass, as we can just url encode utf-16be to utf-16b%65 with this:

The final corrected payload is

Waiting for the request back we see

And we have captured the flag! CTF{no-problem-this-can-be-fixed-by-adding-a-single-if}

Monday, June 26, 2017

Google CTF Quals 2017 - Food

This writeup is for the "Food" challenge found in Google CTF Quals 2017, from the reversing category. This writeup and others were also submitted to the Google CTF Writeup Competition.

JNI-Native Reversing

We are only given a food.apk, and from there we immediately unzip it and run dex2jar, followed by jd-gui to view the source code. Fortunately, the source is very succinct:

Although I'm not well versed in Android development, it looks like it's loading a library from either one of lib/{armeabi,x86}. We turn our attention to the arm library, and symbolize it using a script we found here. This resolves certain awkward indirect function calls that otherwise would appear as unnamed constant offsets into the JNI struct (e.g. JNI_FindClass).

The bizarre disassembly in JNI_Onload seems to indicate that all useful strings have been encrypted, and are only decrypted at runtime. One such example of this is shown below:

Note that decrypt_string here, as we learn later is a variadic function with the following signature: char *decrypt_string(int a1, ...). We implement the decrypt_string function in python so that we can determine all the strings used by the application:

We can now resolve many important strings which are passed as arguments to fopen and fwrite, shown below, as well as to later JNI runtime functions.

Unfortunately, binwalk doesn't recognize dex files, but clearly we're writing an embedded dex file to the location /deta/data/, which we can dump from the binary with IDA.

Before we analyze this dex file, we continue reversing the JNI_Onload routine, to discover a particularly scary routine which parses /proc/self/maps and patches itself at runtime, a little later on.

At a high level, it reads /proc/self/maps until it finds the base address of /d.dex, and looks for the magic header, dex\n0. It then proceeds to patch the JVM bytecode at runtime with a sinister binary string of length 0x90 that's xor'd with 0x5a. This d.dex, we presume is the same d.dex file we dumped from earlier.

We go to offset 0x720 of the d.dex file, to discover what appears to be 0x90 bytes of jvm-bytecode nops. We patch in the unxor'd content into the dex file using IDA r2, and decompile it in much the same way we did with classes.dex.

Reversing d.dex

Unfortunately the files were a bit large, but we note the following routines in

In the unpatched d.dex, cc() was originally nopped to thwart static analysis. sets up a view in which one can request foods in a particular order, by pressing the corresponding buttons. onRecieve will take the indices of each order and save it in this.k, and once all 8 have been entered cc() xors k with arrayOfBytes, and compares it to a new string.

Once we've figured out the right k, we will be given the flag. We can figure out this k with the following python code:

To save some time, we can construct a simple class, and simply importℝ (since we couldn't get any of this to run in an android emulator).

This spits out the flag, CTF{bacon_lettuce_tomato_lobster_soul}

Tuesday, May 23, 2017

NorthSec 2017 rao_bash

This is a posthumous writeup of the rao_bash challenge hosted by NorthSec 2017. It features a recompiled and backdoored bash with a broken ELF header. This was solved after the CTF, because a small hint was dropped to us afterwards.

Fixing the ELF Header

First and foremost, we’d like to fix the ELF header of the executable. This ELF header interferes with all of our tools, and even the venerable file is having trouble with it:

$ file ./rao_bash_orig
./rao_bash_orig: ELF 64-bit MSB *unknown arch 0x3e00* (SYSV)

If that’s not enough of a hint, readelf seems to think that the entrypoint is some very large number,

$ readelf -a ./rao_bash_orig
ELF Header:
 Magic:   7f 45 4c 46 02 02 01 00 00 00 00 00 00 00 00 00
 Class:                             ELF64
 Data:                              2's complement, big endian
 Machine:                           <unknown>: 0x3e00
 Version:                           0x1000000
 Entry point address:               0x300c420000000000

At this point, it should be obvious that the entry point is correct, if interpreted in big endian, and readelf is so nice to show as much that the data itself is all big endian. We simply revert this by flipping the “\x02” to a “\x01” in the 5th byte of the ELF header (see the magic line). Now, all of our tools work as appropriate.


Here is where the hint comes into place: bash is a huge executable, and unfortunately we could not find anything largely different between rao_bash and a bash we compiled ourselves with a similar compiler version, i.e. by using diaphora. However, it seems we missed a very quick-and-dirty xor, as seen below in the main function.


It looks like argv is checksummed, and if the checksum is valid it moves to sh_login_init(). Obviously, we don’t quite care about the checksum, as it’s very lossy and the interesting stuff is really happening in sh_login_init().

Unfortunately, everything after this is a little hairy; the disassembly involves self-modifying code, slightly obscured control flow and finally a reversing function that doesn’t decompile well. For testing purposes, we jump right into this function by marking it as the entrypoint, i.e. with rabin2 -O e/0x46390d ./rao_bash_entry. This jumps immediately into the code (which importantly is completely independent), and reveals some more to us about the program.

$ ./rao_bash_entry
Welcome home RAO
Would you like to destroy the planet today?
Enter the supreme Password:<whatever>
zsh: segmentation fault (core dumped)  ./rao_bash

For starters, sh_login_init() immediately copies 90 bytes of the data section onto the stack, jumping to it. It then proceeds to xor the next 700 bytes of the data section with 0xcf, and of course jumps into that. This section is quite interesting; the disassembly is slightly obfuscated, in that it flattens control flow by introducing a dispatch basic block, as reproduced below, symbolized to obviate the meaning of offsets, etc.

 lea r15, dispatch
 xor r14, r14
 mov r14w, write_prep1 - dispatch
 add r14, r15
 push r14

Here, the dispatch basic block will jump to any offset from itself, where the offset is stored in r14 and the address of dispatch is stored in r15. This is used to implement the reading and writing, as below.

 xor rdx, rdx
 call write_prep2
welcome_msg: db "Welcome home RAO", 0xa, ...
 pop rsi
 mov dl,  0x5a ; sizeof welcome_msg
 mov r14, write - dispatch
 mov r13, read  - dispatch
 push r15      ; jump back to dispatch
 mov r14, r13  ; prepare to jump to read
 push r15      ; jump back to dispatch
 jnz verify_input

Here, read pulls 32 bytes from the user and places them onto the stack, crashing if read failed (by jumping into another instruction). Then, we immediately begin verifying the user input, as below:


Here, we see some obfuscation techniques at hand as well, notably the use of the rdrand instruction to generate some random values which are xor’d with the hashed user input. However, this is only there to thwart static analysis tools like binary ninja (which does not currently parse the instruction as of this writing), as well as some other nifty tools like angr. Furthermore, the effects of this instruction are reverted because we redo the xor on the hashed user input with these random values before checking against the global checksum data. Thus, it’s safe to ignore this in our analysis.

As for the rest of the verification function, it can be easily mimicked with the following python code:

def enc_byte(byte):
    lzcnt = float(64 - len(bin(byte)[2:]))
    byte  = float(byte)
    xmm0  = float(lzcnt) / float(byte)
    xmm0  =  float_to_int(xmm0)
    xmm0 ^=  float_to_int(lzcnt)
    return c_uint32(xmm0).value
enc = [ 0x44444444, 0xD60864B9, 0xD83BA686, 0x89D89D8A,
        0x38E38E39, 0xEA0EA0EA, 0x33333333, 0x32323232,
        0xE1E1E1E2, 0x97829CBC, 0xffffffffffffffff,
        0xE54975FC, 0x97829CBC, 0x31DEC0D5, 0xE54975FC,
        0x89D89D8A ]
for j,i in zip(enc, user_input)[::-1]:
    if j == 0xffffffffffffffff:
        j = -j - 1
        j = c_ulonglong(j).value
    if enc_byte(ord(i)) != j:

Now, we simply have to brute the password, character by character until we get the full flag, 4sM1sr1f3_Fl4gzZ.