I was to reverse a file for a challenge, MD5 hash 85c9feed0cb0f240a62b1e50d1ab0419.
The challenge was called mio cuggino, purposefully misspelled with two g letters. It asks for three numbers. The challenge led me to a brute force of the password with a Python script, learning how to interact with a subprocess stdin and stdout (SKIP to next section if you don't care about context but only want the code).
Looking at the assembly with Radare, the first thing it does is to check that the numbers are non-negative and in increasing order. In details, it checks that:
- exactly three inputs have been provided;
- the first two are non-negative;
- the third is bigger than the second;
- the second is bigger than the first;
- the third is non-negative.
Very good, so the input pattern is three non-negative integers in increasing order. Fine. No clue about what those numbers should be though, yet.
Scroll the assembly just enough to unravel the magic.
A (pointer to) string is loaded into ebx, which contains the following Italian sentence:
Mi ha detto mio cuggino che una volta e' stato co' una che poi gli ha scritto sullo specchio benvenuto nell'AIDS, mio cuggino mio cuggino
The assembly basically takes the characters in the string that correspond to the first and second input (for ex, 0 as first input would map to the first char, M) and checks whether they are equal. If this is not satisfied, a Nope message is shown and the binary returns.
If this is satisfied, the same check is repeated with the third input (with the first one, although this doesn't matter). If this is satisfied as well, a tricky sub.puts_640 function is called (with 5 inputs), and a Uhm message is shown.
Going to looking into that routine is absolutely useless as it's completely unreadable, and even makes a bunch of additional calls that are further jumbled.
So let us summarize what we know up to here: three non-negative integers, in increasing order, with the corresponding chars in the italian string being equal. Seems easy: just try with the space character! Pick integers whose position in the string are spaces, such as 2 5 11. No luck, we just get a Uhm.
At this point I must say I was stupid enough not to catch the hint, because I really was close to the solution! Cuggino is purposefully misspelled in the italian sentence: it has two g s! So the g was probably the letter to go... But I had been banging my head on this binary for a couple of hours after dealing with other challenges and the competition was going to end soon, so I really was tired.
Brute forcing a binary file input with Python
Later on, I was told that a brute-force approach could have worked as well, since the string length was not prohibitive. Indeed, in the worst scenario and with the most naive technique, I would have had to try out \(136^3 = 2515456\) inputs (with 136 being the string length). Not small, but not prohibitive.
So back home, I tried this approach. I wrote a small Python script that would generate all possible triplets of numbers (picking them at random) and feeding them as inputs one by one, looking at the output of the file to see when the flag was output. Sure enough, in roughly 20 minutes the flag was found! The correct input was indeed 18 19 63, corresponding to three g letters in the string.
In 1247 seconds, (more than) 1316248 attempts were made, scoring a rate of 1055 attempts/second. I was running this on an Intel i3 so this was not particularly powerful.
There was not a ready made script online to do this, so I though I would share it. In particular, running a file, feeding it an input NOT through arguments (i.e. writing to its stdin) and reading its output did not seem easy.
The core of the script is the following three lines
test = str( randint( 0, 136 ) ) + ' ' + str( randint( 0, 136 ) ) + ' ' + str( randint( 0, 136 ) ) p = subprocess.Popen(["/home/stefano/Binary Reverse/mio_cuggino_85c9feed0cb0f240a62b1e50d1ab0419"], stdin=subprocess.PIPE, stdout=subprocess.PIPE) stdout = p.communicate(input = test.encode()).decode()
First, input to be tested is built, picking three integers at random. Second, a process is spawned calling the file to crack. The two subsequent arguments to that call are needed to be able to interact with the process input and output spaces. The object of this call is stored into the p var. Finally, we communicate with that project passing as input the bytearray version of the test string (test.encode()), reading the output as the first entry of the result.
So stdout contains the output string of the program, and we can look at what it contains to see whether the attempt was successful or not.
More sophisticated approaches could be tried, such as keeping a log of the already tried inputs and not replaying them. This, however, was 10 times slower than the naive approach (took 1+ hour). Because of the random nature of the script, launching several instances of it could yield to success quicker than with just one.
One successful attempt was to restrict the random numbers range in \([0, 90]\), hoping that a valid triplet was found in that space - and it worked! Just keep in mind that we are definitely over-optimizing something that should be "just good enough" to work, and we already have a decently working script.
Here is the full Python script code for brute forcing a password:
from random import randint import subprocess import time if __name__ == "__main__": found = 0 n = 0 start_time = time.time() while found == 0: n += 1 test = str( randint( 0, 136 ) ) + ' ' + str( randint( 0, 136 ) ) + ' ' + str( randint( 0, 136 ) ) p = subprocess.Popen(["/home/stefano/Binary Reverse/mio_cuggino_85c9feed0cb0f240a62b1e50d1ab0419"], stdin=subprocess.PIPE, stdout=subprocess.PIPE) stdout = p.communicate(input = test.encode()).decode() if 'Uhm' in stdout: print('ATTEMPT : ' + str( n )) print('TESTING : ' + test) print('OUTPUT : ' + stdout) if 'flag' in stdout: print( ' ~~~~~~ FOUND!!!! ~~~~~' ) print(test) print(stdout) found = 1 break print("--- %s seconds ---" % (time.time() - start_time))