Getting Started With Reverse Engineering
Solving MalwareTech’s reversing challenge
There are many verticals of cyber security, such as Mobile Security, Web/API Security, Forensics, Reverse Engineering, etc. I have to admit, reverse engineering is very challenging and I am not very proficient at it, let alone at malware analysis. Getting better at both is one of my short-term goals.
So I decided to read online articles about the topic and ran across MalwareTech’s blog. Luckily, he has some challenges to help people like me get started. They have helped me understand what reverse engineering is all about and how to approach it. I hope this can be a jumpstart for others who are interested in learning. Let’s jump right into it.
I am using Windows 10 and IDA for these exercises. I’ll be covering the first three Hide and Seek exercises, which you can find here. To copy the executables to my Windows machine, I had to turn off real-time protection, because Windows flags these executables as malware and will not let me save them to my desktop while the protection is enabled.
[Hide and Seek] Strings1
The first thing I do in any challenge or exam is read the instructions carefully. If I don’t, there is a chance that I might miss a hint given there, or I may “accidentally” cheat my way through. Of course you can cheat on this one, and no one will punish you for it, but you will not get the intended knowledge by cheating.
MalwareTech already gave us the format of the flag: FLAG{EXAMPLE-FLAG}.
I’m going to start by loading this executable in IDA. It gives me a nice disassembly listing of the program.
Reading through the assembly instructions, there is a call to md5_hash at 004022BA, and we know that the program will output an MD5 hash of the flag when run. But we are not going to run the program. Even though this is just a simple challenge, we are going to treat it as if we were analyzing real malware (Windows 10 already flagged this executable as malicious, after all), so everything here is done statically.
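Since the program only ever prints a hash, a flag recovered statically can be checked by hashing it ourselves and comparing. Here is a minimal Python sketch, assuming that md5_hash is a standard MD5 over the ASCII bytes of the flag (I have not verified that, and the helper name md5_hex is my own):

```python
import hashlib


def md5_hex(candidate: str) -> str:
    """Return the hex MD5 digest of a candidate flag (assumes plain ASCII input)."""
    return hashlib.md5(candidate.encode("ascii")).hexdigest()


# The example flag from the challenge description, not a real answer.
example = "FLAG{EXAMPLE-FLAG}"
print(md5_hex(example))
```

If the digest matches the hash given for the challenge, the candidate is right, and we never had to execute the binary.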
Here is the analysis of the assembly code
004022BA: md5_hash is called with a (char *) argument
004022B9: EAX is pushed onto the stack as that (char *) argument (so the content of EAX is what we are interested in)
004022B4: the string pointer stored at offset 432294 is moved into EAX
OK, I think this is enough for our starting analysis. Now I will look at what offset 432294 (off_432294) contains. In IDA this is very easy to do. Double-clicking the offset takes me to its contents, or I can just hover over it and IDA will show me the value.
Awesome, looks like we got our first flag!
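Because the flag in this exercise is stored unencrypted, the same idea can be sketched outside IDA by scanning raw bytes for the FLAG{...} format. This is only an illustration: the function name, the 64-character limit, and the sample bytes are my own assumptions, and a scan like this can return several candidates, which is why following the reference from the md5_hash call is the more reliable route.

```python
import re
from typing import List

# Matches the documented flag format; the 64-character cap is an arbitrary assumption.
FLAG_PATTERN = re.compile(rb"FLAG\{[^}\x00]{1,64}\}")


def find_flag_candidates(data: bytes) -> List[str]:
    """Return every substring of data that looks like FLAG{...}."""
    return [m.group(0).decode("ascii", errors="replace")
            for m in FLAG_PATTERN.finditer(data)]


# Synthetic bytes standing in for a file's contents (not taken from the challenge).
sample = b"\x00\x01junk\x00FLAG{EXAMPLE-FLAG}\x00more\x00"
print(find_flag_candidates(sample))  # ['FLAG{EXAMPLE-FLAG}']
```

To try it on a real file, read the file in binary mode and pass the bytes to find_flag_candidates; reading a file does not execute it.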
[Hide and Seek] Strings2
This challenge has the same rules and information as the previous one, so let’s just jump into it by loading it up in IDA. As usual, I will start at the beginning of the program.
There are a lot of those BYTE PTR -xxh lines, so I’ll just take a short screenshot of them. I’m going to trace back from where the program actually calls the md5_hash function.
It looks very similar to the previous challenge. A (char *) is passed to the md5_hash function, and again it comes from EAX, which is pushed onto the stack right before the call.
Here, at 00402342, the instruction LEA EAX, [EBP+var_28] loads the address of [EBP+var_28] into EAX, and that address is then passed to the md5_hash function.
Let’s walk through the mov instructions.
It looks like the program is moving hexadecimal byte values, one at a time, into stack locations at fixed offsets from EBP.
Here is a little walkthrough of the code:
It moves 46h (which is ‘F’ in ASCII) into [EBP+var_28], that is, the byte at EBP-28h, and so on for each following byte.
If we decode all of the hexadecimal values written to these offsets, in order, we will get our flag! Moving on to the last one.
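The decoding step can be sketched in a few lines of Python. I am only using the bytes implied by the flag format (FLAG{) here; the remaining values have to be read off the mov instructions in IDA, in order of increasing address. The function name is my own.

```python
def decode_stack_bytes(values):
    """Turn a sequence of byte values into text, stopping at a NUL terminator."""
    chars = []
    for value in values:
        if value == 0:  # C strings end at the first zero byte
            break
        chars.append(chr(value))
    return "".join(chars)


# Only the known prefix of the flag format, starting at var_28 (EBP-28h).
prefix = [0x46, 0x4C, 0x41, 0x47, 0x7B]
print(decode_stack_bytes(prefix))  # FLAG{
```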
[Hide and Seek] Strings3
Initial look... OK, this looks different.
But again, what’s being passed to the md5_hash function is a (char *), which once again comes from the EAX register. Things go haywire from here, though, so let’s trace it step by step.
There are two functions that are new to us, both colored pink: LoadStringA and FindResourceA. Why pink? As far as I can tell, IDA uses that color for imported functions such as Windows API calls.
A quick Google search told me what these functions do.
LoadStringA
Loads a string resource from the executable file associated with a specified module and either copies the string into a buffer with a terminating null character or returns a read-only pointer to the string resource itself.
Here is the syntax:
int LoadStringA(HINSTANCE hInstance, UINT uID, LPSTR lpBuffer, int cchBufferMax);
After reading through the documentation, the interesting parameter is uID (of type UINT), which is the identifier of the string to be loaded. But what exactly is the value of uID?
To answer this question, we have to understand how function parameters are passed on the stack. With the calling conventions used for Win32 API functions on 32-bit x86, the caller pushes the arguments onto the stack, from right to left, before executing the call instruction.
For example, if there is a function
void function(int param1, int param2)
{
    /* do something here */
}
The assembly code would result in something like this
push param2
push param1
call function
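To make the ordering concrete, here is a toy Python model of the stack. It is a deliberate simplification and entirely my own construction: a real call also pushes a return address, and the callee reads its arguments relative to ESP/EBP instead of popping them.

```python
stack = []


def push(value):
    """Model of the x86 push instruction: put a value on top of the stack."""
    stack.append(value)


def call(func, argc):
    """Toy 'call': hand the top argc stack entries to func, topmost first."""
    args = [stack.pop() for _ in range(argc)]
    return func(*args)


def function(param1, param2):
    return (param1, param2)


# Same order as the assembly above: the last parameter is pushed first.
push("param2")
push("param1")
result = call(function, 2)
print(result)  # ('param1', 'param2')
```

The takeaway is that the push closest to the call instruction is the first parameter, the one above it is the second, and so on.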
So now it is getting clearer: we are looking for the second parameter (uID) passed to the LoadStringA function. Since the last push before the call is the first parameter, uID is the second push instruction counting upward from the call to LoadStringA.
That is what we are looking for. Now we just have to trace what value EDX holds. From the instruction above it, EDX is loaded with the value stored at [EBP+var_4], so now we have to trace what was written there. This doesn’t seem to be as simple as the previous two exercises.
Looking closer, [EBP+var_4] is assigned the value of EAX. So let’s just trace EAX instead. I will start from the beginning.
The first instruction moves the value 1 into EAX, then EAX is shifted left by 8 (giving 100h). Next, EDX is XORed with itself, which zeroes it, then incremented to 1 and shifted left by 4 (giving 10h). EAX is then ORed with EDX, giving 110h. This value is stored in [EBP+var_4], later loaded into EDX, and finally pushed as the uID argument to LoadStringA.
Here is a representation of the registers after each of the instructions.
The final value in EAX (and later in EDX) is 110h, which is 272 in decimal.
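The same arithmetic can be replayed in Python as a sanity check. The variable names mirror the registers, the 32-bit mask stands in for the register width, and the starting value of EDX is arbitrary (it is wiped by the XOR anyway).

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide

eax = 1                      # mov eax, 1
eax = (eax << 8) & MASK32    # shl eax, 8   -> 0x100
edx = 0xDEADBEEF             # arbitrary leftover value, for illustration only
edx ^= edx                   # xor edx, edx -> 0
edx = (edx + 1) & MASK32     # inc edx      -> 1
edx = (edx << 4) & MASK32    # shl edx, 4   -> 0x10
eax |= edx                   # or eax, edx  -> 0x110

var_4 = eax                  # value stored at [ebp+var_4]
u_id = var_4                 # later reloaded into EDX and pushed as uID
print(hex(u_id), u_id)       # 0x110 272
```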
Now it’s time to analyze the next function.
FindResourceA
Determines the location of a resource with the specified type and name in the specified module.
Here is the syntax for it:
HRSRC FindResourceA(HMODULE hModule, LPCSTR lpName, LPCSTR lpType);
After reading through the parameter descriptions, there is one parameter type that I am interested in, LPCSTR, which is used for the resource name and type. Looking at the assembly code, it is fairly easy to determine that this executable is fetching its resource from rc.rc.
So now it becomes clearer that the executable loads a resource from rc.rc and looks for the string identified by the uID of 272.
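One detail that helps when browsing string resources: Windows stores string-table entries in blocks of 16, and the block ID is the string ID divided by 16, plus 1. Here is a small Python sketch of that calculation (the function name is my own, and how a given resource viewer labels the blocks may differ):

```python
def string_table_location(string_id: int):
    """Return (block ID, index within the block) for a string resource ID."""
    block_id = (string_id >> 4) + 1   # 16 strings per block, block IDs start at 1
    index = string_id & 0xF           # position of the string inside its block
    return block_id, index


print(string_table_location(272))  # (18, 0)
```

So string 272 should be the first entry of string-table block 18.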
After some more searching on Google, I found a tool called Resource Hacker that lets me view the contents of the resources embedded in the executable.
Flag found!
Game Over
Well, that concludes the first three challenges from MalwareTech. I hope you had a good read and now have a good starting point for reversing more applications. I’ll try to post more write-ups in the future to share more interesting stuff :)