Static malware analysis: LoroBot

I’ve just watched the video by Paul Rascagneres from Malware.lu (https://youtu.be/Xop__W_h0tc [FR]) and I wanted to share his analysis with you, but in much more detail.

Before starting the analysis, I wanted to introduce Malware.lu to you. Funnily enough, it is actually quite complicated to find malware samples to analyse. Usually, a client sends you the malware to analyse or you collect samples through a honeypot, but whenever you want to analyse a specific one, AV companies are not keen on sharing. Therefore, Malware.lu decided to create a malware repository where you can search for specific malware (usually based on the MD5 or SHA1). They currently have 4,950,424 samples.

In order to download them, you will need to register. They will ask you why you need access to the repository, but it is more a way for them to say “we do not distribute malware to everyone on the internet” if authorities complain. Therefore, if you want to play with malware, I would recommend that you register on that website: https://avcaesar.malware.lu

Now, let’s get started!

 

Introduction

The malware analysed in this short report is the sweet LoroBot and has the following MD5: fec60c1e5fbff580d5391bba5dfb161a
It is available here: https://www.dropbox.com/s/efecd2wawswk0r9/d1a8c50ee845f68d1f95655dfe6e2ed99177676579f03bac39128352a5fba01a.zip (with the password “infected”).
Also available on malware.lu: https://avcaesar.malware.lu/sample/d1a8c50ee845f68d1f95655dfe6e2ed99177676579f03bac39128352a5fba01a
It is ransomware that will “encrypt” all the documents on your computer.

To be honest, this malware is really nice and easy to understand as an introduction to malware RE. Fortunately, the malware is not packed, so we can skip the unpacking step and move straight to the static analysis. The following presentation is quite “dry”, so I would recommend that you follow the analysis with the malware loaded in IDA.

Initialisation

The malware first gets the path of the Windows directory with GetWindowsDirectoryA [@0x4012AB]. Typically, it will return “C:\Windows”. The path (a string) is copied into the static array Filename.
Filename is then concatenated with String2, which is actually the string “\CryptLogFile.txt”.
We can see that both String2 and Filename are pushed just before the call to lstrcatA [@0x4012BA].
As described in the MSDN documentation of the function, the result (the concatenation of both strings) is stored in the first argument of the function, i.e. Filename.
The first letter of Filename (most likely “C”) is then copied to AL, then AL is copied to File.
The function GetCommandLineA is then called [@0x4012C9]. This function doesn’t take any argument and returns a pointer to the command-line string for the current process.
As usual, the return value can be found in EAX, which is then moved to ESI.
An address in memory is then loaded in EDI.
We will see later that this address is used to store a copy of the command-line string.

Copying the command-line string

If you switch your view to the graphical workflow (with the space bar), you will clearly identify a loop on the right-hand side after the “Initialisation”. Let’s focus on that one first. This loop uses two instructions not explicitly covered during the SANS FOR610: lodsb and stosb.

lodsb loads the byte pointed to by (E)SI into AL. Load is not the same as move: it doesn’t copy the content of the (E)SI register into AL but the value that (E)SI points to. So in this case, it copies the first character of the command-line string to the register AL. Once copied, (E)SI is automatically incremented (or decremented) without any additional instruction. For your information, the flag DF is used to decide whether (E)SI should be incremented or decremented.

DF=0 > incremented | DF=1 > decremented

So after that instruction, (E)SI is pointing to the second character of the command line. AL is then compared with the value 0x22. When using IDA to convert that hex value to its ASCII representation, we get a double quote (i.e. “). If the comparison fails (i.e. if the first character of the command line is not a double quote), it moves to the next block. The next block then compares AL (i.e. the first character) to 0 (i.e. NULL). Once again, if it fails, it moves to the next block with stosb.

stosb stores AL at the address pointed to by (E)DI. Store is not the same as a simple move: stosb writes the content of AL to the memory that (E)DI points to, not to the register itself. In this case, the first character of the command line is copied to byte_403F28. Once copied, (E)DI is automatically incremented (or decremented) without any additional instruction. For your information, the flag DF is used to decide whether (E)DI should be incremented or decremented.

DF=0 > incremented | DF=1 > decremented.

So after that instruction, (E)DI is pointing to byte_403F28 + 1. The next instruction is a simple jump back to the beginning of the loop. Now that we completely understand the loop, we can use IDA to rename byte_403F28 to “aCommandLine”.

Here is the loop equivalent:
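
A minimal Python sketch of that loop. It assumes that a double quote is simply skipped (the branch taken when the comparison with 0x22 succeeds is not detailed above) and that the NULL byte ends the copy. The function name is made up for illustration.

```python
def copy_command_line(command_line: bytes) -> bytes:
    """Copy a NUL-terminated command line, dropping double quotes."""
    a_command_line = bytearray()      # plays the role of byte_403F28
    for al in command_line:           # lodsb: AL = [ESI], ESI += 1
        if al == 0x22:                # cmp al, 22h (double quote)
            continue                  # assumption: the quote is skipped
        if al == 0x00:                # cmp al, 0 (end of string)
            break
        a_command_line.append(al)     # stosb: [EDI] = AL, EDI += 1
    return bytes(a_command_line)
```

With a quoted path such as "C:\sample.exe" followed by a NULL byte, this would leave the bare path in the destination buffer, which is what the rest of the analysis relies on.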

Set a new wallpaper

Once the command-line string is copied to aCommandLine, the execution exits the loop and moves to the next block [@4012E2]. The first instruction of this block is a call to loc_4015B5.

When looking at that function in the graphical diagram mode of IDA, the flow looks like a “Do something, check, do something, check, etc”.

This function first calls CreateFileA [@4015CF], which creates… not a file but a handle. In this case, a handle to the file named in aCommandLine. Basically, the executable opens itself.

Since dwShareMode is set to 1 (FILE_SHARE_READ), other processes can still open the file for reading while this handle is open (the access mode itself is given by dwDesiredAccess). CreateFileA returns (in EAX) the open handle if successful or INVALID_HANDLE_VALUE, which is actually -1, i.e. 0xFFFFFFFF. The first check after opening the handle is to make sure the function didn’t return -1.

If the handle has been successfully created, the open handle is saved in hObject (we can change the name to “hExeHandler”).

GetFileSize is then called with hExeHandler as argument. The function returns (in EAX) the file size if successful or INVALID_FILE_SIZE (0xFFFFFFFF) if it fails. 4 is then subtracted from EAX, i.e. file size - 4. This new value is then copied to lDistanceToMove.

SetFilePointer is then called with this new value and hExeHandler as arguments. Basically, the file pointer of hExeHandler now points to the last 4 bytes of the file.

And lastly, the function ReadFile is called with the number of bytes to read (4) and hExeHandler as arguments.
The 4 bytes read are stored at the address NumberOfBytesToRead. We will see it later, but NumberOfBytesToRead is actually the size of a data block that the executable will extract from itself.

It then verifies that more than 0 bytes have been read and moves to the next block.

In the next block, it first saves NumberOfBytesToRead (actually the pointer to that variable) in EAX. It then calls GlobalAlloc in order to create a buffer on the heap. The argument 0x40 is used to initialise the memory with 0’s. GlobalAlloc returns (in EAX) the pointer to this new buffer. The pointer is saved in lpBuffer. It then moves lDistanceToMove into EAX. lDistanceToMove was set to “size of the exe - 4” and was used to read the last 4 bytes of the executable.

Now it subtracts from EAX the value read in the last 4 bytes. SetFilePointer is then called to move the file pointer of hExeHandler to that new offset located in EAX.

Lastly, ReadFile is called to extract NumberOfBytesToRead bytes from the binary, starting at offset (file size - 4 - NumberOfBytesToRead).
The content is saved in the buffer pointed to by lpBuffer. We can rename lpBuffer to “lExtractedFile”.

Here is a quick summary: the malware author appended a file to the malware, followed by the size of this file. The last four bytes contain the size, which is used to locate exactly the beginning of the appended file.

In the next block, hExeHandler is closed. It then calls GetEnvironmentVariable to get the path of the TEMP folder. The function returns the string in byte_404B41, which was pushed before the call. We can rename byte_404B41 to “tmpPath”. Here I didn’t find any information that explains why 0x40300C means the TEMP folder; I had to use OllyDbg. If anyone has an explanation, please leave a comment.

It then appends the string “\wallpaper.bmp” to tmpPath with lstrcatA. The concatenation is saved in tmpPath. We can actually already change the name to “wallpaperPath”.

DeleteFileA is then used to delete the wallpaper file in case it was already present. And lastly, it calls CreateFileA to open a new handle for the file wallpaperPath. The function returns the handle in EAX, and once again the code makes sure the creation of the handle was successful.

Last block (finally!) of the function. It first writes the content of lExtractedFile to the wallpaperPath file thanks to WriteFile. It then closes the file handle.

Next it calls an interesting function: SystemParametersInfo. SystemParametersInfo can do many things on the system. In this case, the action used is 0x14. As explained in the MSDN, 0x14 (SPI_SETDESKWALLPAPER) is used to set the desktop wallpaper, and the pvParam used in that call is wallpaperPath. So this call replaces the desktop wallpaper with the file extracted from the executable.

We’re now done with that local function. We can actually now rename sub_4015B5 to setNewWallpaper.

I decided to write a little Python script to extract that wallpaper from the binary.
The Python script is available here: http://pastebin.com/8wV066rd (please don’t mind my clumsy way of converting the size to an integer). It works! We get a BMP file that contains the scam. Apparently, AES encryption is ahead, scary 🙂
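
Independently of the pastebin script, the layout described above (appended file, then its size in the last 4 bytes) can be sketched in a few lines of Python. This is not the original script: it assumes the size is stored little-endian, as a DWORD read by ReadFile on x86 would be, and the function name is hypothetical.

```python
import struct


def extract_appended_file(data: bytes) -> bytes:
    """Return the blob appended before the trailing 4-byte size field."""
    if len(data) < 4:
        raise ValueError("file too small to contain a size field")
    # Last 4 bytes: size of the appended file (assumed little-endian).
    (size,) = struct.unpack("<I", data[-4:])
    start = len(data) - 4 - size      # same offset the malware seeks to
    if start < 0:
        raise ValueError("size field is larger than the file itself")
    return data[start:len(data) - 4]
```

Called on the raw bytes of the sample, this should return the embedded BMP, which can then be written to disk and opened in an image viewer.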

 

Browse the drives

Let’s get back to where setNewWallpaper was called [@4012E2]. Once the new wallpaper is set, it calls CreateFile to create a new handle for the file Filename. Remember that during the “Initialisation”, Filename was set to the CryptLogFile.txt path.

The open handle is once again returned in EAX. Since dwShareMode is set to 2 (FILE_SHARE_WRITE), other processes can still open the file for writing while this handle is open. Note that dwCreationDisposition is set to 1. As mentioned in the MSDN, this means that if the specified file exists, the function fails.

The next instructions verify whether the handle has been properly created (it will not be if, for instance, the file already exists). If it has, execution moves to the next block.
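
The “create only if it does not exist yet” behaviour works as a run-once marker. A rough Python analogue of dwCreationDisposition = 1 (CREATE_NEW) is the exclusive-creation mode "x"; the function name below is hypothetical and the sketch only illustrates the semantics, not the malware’s actual code.

```python
def create_marker(path: str) -> bool:
    """Create the marker file; return False if it already exists."""
    try:
        # "x" fails with FileExistsError when the file is already there,
        # much like CreateFile with CREATE_NEW.
        with open(path, "x"):
            return True
    except FileExistsError:
        return False
```

The first call on a given path returns True (first run), and any later call returns False, which corresponds to the branch taken when CryptLogFile.txt is already present.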

The next block starts with a call to SetErrorMode with the argument 0x1 (SEM_FAILCRITICALERRORS) [@401344]. According to the MSDN, this is to prevent error mode dialogs from hanging the application.

Then it calls GetLogicalDrives in order to get a bitmask representing the currently available disk drives. Bit position 0 (the least-significant bit) is drive A, bit position 1 is drive B, bit position 2 is drive C, and so on. So 00000001 means drive A is available, 00000101 means drives A and C are available, etc.

It then moves 0x19 into ECX.

The next blocks look like a loop (if you look at the graphical diagram view in IDA) [@401353]. The loop starts by moving 0x1 into EBX, then it executes the instruction shl. shl shifts the bits of the first operand to the left by the number of bits specified in the second operand.

In this case, the first operand is EBX, and EBX contains 0x00000001. The second operand is CL. CL is the low byte of ECX, and ECX contains 0x00000019. So this means 0x00000001 will be shifted to the left by 0x19 = 25 bits, which gives the mask for drive Z.

This shifted value is then AND’ed with EAX (which contains the drives bitmap). In my case, the bitmap = 0xC.

0000 0000 0000 0000 0000 0000 0000 1100 > Drive C: and D:

Here the AND between EAX and the shifted value results in 0x0; therefore, the jump will be taken.

The JZ instruction jumps to the end of the loop, where ECX is decremented. Short note here: DEC (decrement) updates the zero and sign flags according to the result. Therefore, the JGE (jump if greater or equal) keeps jumping back as long as ECX is still greater than or equal to 0, and the loop only exits once ECX drops below 0.

Basically, the loop starts at 25 (the end of the alphabet) and checks each drive from Z to A. Whenever the drive is present, it steps into another block.
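
As a rough illustration, the same walk over the bitmask can be written in Python. The function name is made up, and the sketch only mirrors the logic described above (shift 1 left by the index, AND with the mask, count down from 25 to 0).

```python
def drives_from_mask(mask: int) -> list:
    """List drive letters set in a GetLogicalDrives-style bitmask, Z to A."""
    found = []
    for ecx in range(25, -1, -1):        # mov ecx, 19h ... dec ecx / jge
        ebx = 1 << ecx                   # mov ebx, 1 / shl ebx, cl
        if mask & ebx:                   # and with the drives bitmap
            found.append(chr(0x41 + ecx))  # 0x41 is ASCII "A"
    return found
```

With the bitmap from my machine, drives_from_mask(0xC) gives D first and then C, since the loop runs from the end of the alphabet.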

In this other block [@40135E], 0x41 is added to CL. For those who have already done some fuzzing/buffer overflow hunting, you should know that 0x41 is the ASCII code of “A”. Basically, at that stage in the executable, CL is the position in the alphabet of the letter used for the drive. Therefore, it converts that position into its ASCII representation. The letter is then saved at dword_403327+1.

CL is set back to its initial value by subtracting 0x41. Then a specific value is copied to the address just after the drive letter. Once converted to ASCII with IDA, this value gives: “.*\:”.

Then another one is moved just after. Finally, the null character is copied at the end of it.

So if we have reached the C: drive, IDA shows us the pieces as “*.*\:C”. Since x86 is little-endian, the bytes are stored in memory in the reverse order, which gives the string “C:\*.*”.
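
The byte-order point is easier to see with a small Python sketch. The constant 0x2E2A5C3A is my reconstruction of the value IDA displays as “.*\:” (it is not copied from the disassembly), and the function name is hypothetical.

```python
import struct


def build_search_pattern(drive_index: int) -> str:
    """Rebuild the search pattern for a drive index (0 = A, 2 = C, ...)."""
    letter = bytes([0x41 + drive_index])      # add cl, 41h
    # The dword shown as ".*\:" by IDA; packed little-endian, its bytes
    # land in memory in the opposite order.
    tail = struct.pack("<I", 0x2E2A5C3A)
    last = b"*"                               # the value moved just after
    return (letter + tail + last).decode("ascii")
```

For drive index 2, this yields the C:\*.* pattern that is later handed to FindFirstFile.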

EAX and ECX are then saved in memory for later.

The next instruction then calls a local function, sub_401000, which is the function used for encryption.

 

Encrypt all the files

The encrypt function is quite long and I think I have lost most if not all of the readers by now. Therefore, I will go a little bit faster through the analysis of this local function. But if you are reading this line, well, I’m surprised and also proud of you 🙂

Let’s analyse that “AES” encryption function!

So it first starts with the normal prologue [@401000], i.e. push EBP, move ESP into EBP and adjust ESP to make room for the local variables.
It then calls FindFirstFile, which searches for a file or subdirectory with a name that matches a specific pattern.
The argument sent to the function is the “C:\*.*” built previously.

This function returns a search handle that will be used by FindNextFile in order to browse all the files in that directory.

The information about the first file found with FindFirstFile is copied into the first parameter pushed, i.e. hFindFile. If the function fails, or if no matching file can be found, it returns INVALID_HANDLE_VALUE, i.e. 0xFFFFFFFF or -1. Therefore, the next instruction increments EAX and then checks whether it is equal to 0. If it is not equal to zero, execution jumps to the next block; otherwise, it jumps to the end of the function.

EAX is then set back to its original value (i.e. DEC), then saved in a local variable (EBP-6).

The next block checks if the file found is a directory. For this, it moves the first field of the hFileFind structure into EAX. hFileFind [EBP-144h] is a WIN32_FIND_DATA structure and starts with dwFileAttributes. File attributes are the metadata of the file.

For instance, if the file is a directory, the file attribute will be set to 0x10. Therefore, the block XORs the file attribute with 0x10 and checks whether the result is 0 (which means the attribute is exactly 0x10, i.e. a directory) or not.

Directory

If it is a directory [@401034], it makes sure this is not the current directory (single dot) or the parent directory (double dots).
If you wonder what [EBP-118h] is, look at the WIN32_FIND_DATA structure:

[EBP-118] is indeed cFileName.
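
The offset can be double-checked with a small ctypes sketch of the ANSI version of the structure, following the field order documented in the MSDN. Fixed-size ctypes integer types are used on purpose so that the layout is the same on any platform; the constant names at the end are mine.

```python
import ctypes


class FILETIME(ctypes.Structure):
    _fields_ = [("dwLowDateTime", ctypes.c_uint32),
                ("dwHighDateTime", ctypes.c_uint32)]


class WIN32_FIND_DATAA(ctypes.Structure):
    _fields_ = [("dwFileAttributes", ctypes.c_uint32),
                ("ftCreationTime", FILETIME),
                ("ftLastAccessTime", FILETIME),
                ("ftLastWriteTime", FILETIME),
                ("nFileSizeHigh", ctypes.c_uint32),
                ("nFileSizeLow", ctypes.c_uint32),
                ("dwReserved0", ctypes.c_uint32),
                ("dwReserved1", ctypes.c_uint32),
                ("cFileName", ctypes.c_char * 260),       # MAX_PATH
                ("cAlternateFileName", ctypes.c_char * 14)]


STRUCT_BASE = 0x144                       # structure lives at [EBP-144h]
NAME_OFFSET = WIN32_FIND_DATAA.cFileName.offset
NAME_LOCATION = STRUCT_BASE - NAME_OFFSET  # distance below EBP
```

cFileName sits 0x2C bytes into the structure, and 0x144 - 0x2C = 0x118, which matches the [EBP-118h] seen in the disassembly.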

If it is the current or the parent directory, it just moves to the next file. However, if it is another directory, it takes the current path (i.e. C:\*.*) and removes the last 3 characters. Then it concatenates the current path with the name of the directory and adds another “\*.*” at the end of that new string.

In the next instruction, the function calls itself.

After that call, it resets the path string to what it used to be (i.e. C:\*.*).

In the next block, it simply moves to the next file.

Here, we have a recursive function (i.e. a function that calls itself). This is commonly used to browse a data tree, in this case the file system.
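
For readers less familiar with recursion, here is a harmless Python sketch of the same traversal pattern. It only reports file paths to a callback and modifies nothing; the function and parameter names are hypothetical, and os.scandir stands in for the FindFirstFile/FindNextFile pair.

```python
import os


def walk_tree(directory: str, visit) -> None:
    """Call visit(path) for every file below directory, recursively."""
    # os.scandir never yields "." or "..", so the explicit check the
    # malware performs on those two names is not needed here.
    for entry in os.scandir(directory):
        if entry.is_dir(follow_symlinks=False):
            walk_tree(entry.path, visit)   # the function calls itself
        else:
            visit(entry.path)
```

Passing something like a list’s append method as visit collects every file path under the starting directory, one subdirectory level at a time.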

File

Now let’s see what happens if the selected file is not a directory, i.e. if dwFileAttributes is not equal to 0x10 [@4010C3].

It first extracts the last 4 characters of the selected filename.

[eax+ebp-11Ch] > EAX is the length of the filename.
EBP - 0x118 is the location where the filename is.
So the last 4 characters start at EBP - 0x118 + EAX - 4 = EAX + EBP - 0x11C.

The last 4 characters (most likely the extension) are then saved.

It then browses all the extensions located at 0x403095, starting with “.zip” (the last one).

There is a little trick though when the extension being checked is “.db” (database). Indeed, since .db is 3 characters long (and not 4), it copies the last character before the extension in the filename into the extension currently being tested.

The two extensions are then simply compared. If they don’t match, it jumps back to the beginning of the loop, where the next extension is compared.
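
The matching logic, including the “.db” trick, can be sketched as follows. The extension list is only a short illustrative sample (the real table is at 0x403095), the comparison is assumed to be case-sensitive, and the names are hypothetical.

```python
# Illustrative subset only; the real list lives at 0x403095.
SAMPLE_EXTENSIONS = [".doc", ".xls", ".db", ".zip"]


def extension_matches(filename: str, extensions=SAMPLE_EXTENSIONS) -> bool:
    """Compare the last 4 characters of filename with each extension."""
    tail = filename[-4:]
    for ext in reversed(extensions):     # the malware starts from the last one
        candidate = ext
        if len(ext) == 3:
            # 3-character extension such as ".db": borrow the character
            # that precedes it in the filename to get 4 characters.
            candidate = tail[:1] + ext
        if tail == candidate:
            return True
    return False
```

For a file called data.db, the saved tail is “a.db”, and the tested extension becomes “a.db” as well, so the comparison succeeds.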

The filename is then appended to the drive path, with the *.* removed, in order to create the full path of the current file. The full path is then used to create a handle. It then makes sure that the file actually exists and moves to the next block.

In this block [@401188], it first calculates the size of the file and stores it. Then it reads the file and copies the entire content into a buffer.
If the file is not empty, it moves to the next block.

NumberOfBytesRead contains a pointer to the variable that receives the number of bytes read.
The value is saved in EAX, then an internal function is called.

The function starts by moving EAX (the number of bytes read) into ECX. ECX is then shifted two bits to the right, which is the equivalent of dividing by 4: ECX now holds the number of 4-byte blocks (double words) to process. EAX is then compared to 0x200000, which is the maximum size expected when the file was read. If the size was bigger than the expected size, ECX is incremented.

The pointer to the content of the file is then moved into ESI and EDI.
EDX (a counter) is then initialised to 0 and execution jumps into a loop.
EDX is set back to 0 if EDX = 0x10. This is the equivalent of EDX = EDX % 16.
The double word pointed to by ESI is then loaded into EAX thanks to lodsd (remember we covered lodsb, which deals with bytes; lodsd deals with double words).
EAX is then XORed with 4 bytes of the string located at 0x403060. Since EDX advances by 4 on each iteration and wraps at 16, it most likely serves as the offset into that string, which would make it a 16-byte key.
EAX is then stored at the address pointed to by EDI thanks to stosd (remember we covered stosb, which deals with bytes; stosd deals with double words). Since ESI and EDI started at the same address, the buffer is modified in place.
Then 4 is added to EDX and ECX is decremented.
If ECX is not equal to zero, it jumps back to the beginning of the loop.

Here is the equivalent of this function:
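
A minimal Python sketch of this routine, under a few assumptions: the key is 16 bytes long and is walked 4 bytes at a time, which is the same as XORing byte i of the buffer with byte i % 16 of the key; trailing bytes that do not fill a whole double word are handled the same way here, which may differ from the real code; and the key used in the usage note below is a placeholder, not the string stored at 0x403060.

```python
from itertools import cycle

KEY_LENGTH = 16   # assumption: EDX wraps at 0x10, so the key is 16 bytes


def xor_buffer(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating 16-byte key."""
    if len(key) != KEY_LENGTH:
        raise ValueError("expected a 16-byte key")
    # zip stops at the end of data; cycle repeats the key as needed.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))
```

Applying xor_buffer twice with the same key, for example a placeholder such as bytes(range(16)), gives back the original data. That property is why running the same routine again is enough to “decrypt” the files.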

So we found the encryption function 🙂

My encryption knowledge is a bit rusty, but I don’t recall the AES algorithm being a simple XOR with a static secret… Anyway, let’s step out of the function and look at what happens right after the call [@4011CD].

The file pointer is set back to the beginning of the file. The buffer that contains the encrypted content is then used to overwrite the initial file. So basically, the original file is replaced by its encrypted form.

The full path of the encrypted file is then written to CryptLogFile.txt [@401224]. Then a new line is created with “TMP”.

Finally, it calls FindNextFile to move to the next file in the folder [@40124C].

 

Create the warning

Once all the encryption blocks have been executed, it moves on to the next drive [@40138C].
Once all drives have been browsed, it finally moves to the last block [@40138E].

This final block creates a new file with a name that contains weird characters (see 0x403031).
Then it writes the text “Very bad news…” in it.
And finally, it calls ShellExecuteA [@4013EB] to open the file with the bad news notice.

 

The author’s backup solution

Let’s get back to the call where the handle for CryptLogFile.txt was created [@4012FB]. We noticed that if the file already exists, it goes to another block [@40130A].

This block pushes the static variable aRafarpnkcucmghg, which is actually a string. The function lstrlenA is then called. lstrlenA takes a single argument (a string) and returns the length of that string in EAX (classic).

EAX is then pushed to the stack just to save that value. aCommandLine is then pushed to the stack and its length is also calculated with lstrlenA. The length of aCommandLine is then saved in EBX and the length of aRafarpnkcucmghg is moved back into EAX. EAX is then subtracted from EBX. Basically, it is strlen(command_line) - strlen(aRafarpnkcucmghg).

EBX is then added to the pointer to aCommandLine. This means that EBX is now pointing into the middle of the string that contains the command line. To be accurate, it is pointing at the character at position (strlen(command_line) - strlen(aRafarpnkcucmghg)).

aRafarpnkcucmghg is then compared with lstrcmpiA to the remaining part of the string that EBX is pointing to. lstrcmpiA returns 0 if the compared strings are equal. EAX is then ORed with itself. If EAX OR EAX = 0, it jumps to a call instruction for the encryption function.

In other words, the function checks whether the executable has the following extension: “.rafarpnkcucmghgklmgtiftqgtswqtrim”. If yes, it runs the encryption function.
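
The check boils down to a case-insensitive “ends with” test, since lstrcmpiA ignores case. Here is a small Python sketch; the function name is made up, and the explicit length guard is my addition (nothing in the steps described above shows such a check in the malware).

```python
MAGIC_SUFFIX = ".rafarpnkcucmghgklmgtiftqgtswqtrim"


def has_magic_suffix(command_line: str, suffix: str = MAGIC_SUFFIX) -> bool:
    """Case-insensitive check that command_line ends with suffix."""
    offset = len(command_line) - len(suffix)   # EBX = len(cmd) - len(suffix)
    if offset < 0:
        return False                           # guard added for the sketch
    # Compare the tail of the command line with the suffix, ignoring case.
    return command_line[offset:].lower() == suffix.lower()
```

A path such as C:\sample.exe.rafarpnkcucmghgklmgtiftqgtswqtrim passes the test, while the plain C:\sample.exe does not.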

This might sound a bit silly, but this is actually the solution the author used to decrypt all his files in case shit happened and he unintentionally encrypted his own files. Since the “encryption” is a XOR with a static secret, running it a second time restores the original content.

Another solution would simply require the user to delete the file CryptLogFile.txt… Too complex for the author, I guess…

 

Wrap up

After that analysis, we have identified exactly what this malware is doing. Although this is usually not possible (or needed), every single instruction should be understood now. We know that the malware:

  • Extracts a BMP out of itself and sets it as the new wallpaper
  • Encrypts all the doc, xls, docs, etc. files (see list @403095) by XORing each file with the string located @403060
  • Lists each encrypted file in a file called CryptLogFile.txt
  • Creates another text file with a weird filename (@403032) with the text: “Very bad news…”

These could be used as IOCs.
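
As an illustration, a very small Python check for the two file-based indicators with known names could look like this. It is only a sketch: the function name is hypothetical, the directories are passed in by the caller, and a wallpaper.bmp in TEMP is a weak indicator on its own.

```python
import os


def find_indicators(windows_dir: str, temp_dir: str) -> list:
    """Return the LoroBot-related files that exist in the given folders."""
    candidates = [
        os.path.join(windows_dir, "CryptLogFile.txt"),  # list of encrypted files
        os.path.join(temp_dir, "wallpaper.bmp"),        # extracted ransom wallpaper
    ]
    return [path for path in candidates if os.path.isfile(path)]
```

On a Windows machine, the two arguments would typically come from the environment (for instance the values of the WINDIR and TEMP variables).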

In order to clean up after the malware, we just need to add the extension “.rafarpnkcucmghgklmgtiftqgtswqtrim” to the executable and run it again in order to decrypt all files. Then remove the malware, the wallpaper, the text file with “Very bad news…” and CryptLogFile.txt.
