Dumping MPress packed binaries
------------------------------

(c) 2009 Fractal Guru (reverser AT put.as , http://reverse.put.as)

Target: MPress v2.12
        http://www.matcode.com/mpress.htm
Tools used: GDB, IDA, 0xED, otool, gdbinit
Platform: Mac OS X Leopard 10.5.7 @ x86
Document version: 1.0 (22/07/2009)

Index:

0 - Introduction
1 - Disassembling
2 - Breakpointing
3 - Decrypting, unpacking and dumping!
4 - Conclusion

0 - Introduction
----------------

"MPRESS is a free, high-performance executable packer for PE32/PE32+/.NET/MAC-DARWIN
executable formats! MPRESS makes programs and libraries smaller, and decrease start time
when the application loaded from a slow removable media or from the network. It uses
in-place decompression technique, which allows to decompress the executable without
memory overhead or other drawbacks; it also protects programs against reverse engineering
by non-professional hackers. Programs compressed with MPRESS run exactly as before, with
no runtime performance penalties. MPRESS is absolutely free of charge software."

Well, I'm a very curious non-professional hacker, so I decided to give it a go...
The program itself is packed with its own technology, so it makes a good first target
before trying the method on something else.

The best way to start is by looking at the headers. Otool is our friend! Here is the result:

$ otool -l mmpress
mmpress:
Load command 0
      cmd LC_SEGMENT
  cmdsize 56
  segname __MPRESS__v.1.21
   vmaddr 0x0000a000
   vmsize 0x0000394a
  fileoff 0
 filesize 14666
  maxprot 0x00000007
 initprot 0x00000007
   nsects 0
    flags 0x0
Load command 1
        cmd LC_UNIXTHREAD
    cmdsize 80
     flavor i386_THREAD_STATE
      count i386_THREAD_STATE_COUNT
            eax 0x00000000 ebx    0x00000000 ecx 0x00000000 edx 0x00000000
            edi 0x00000000 esi    0x00000000 ebp 0x00000000 esp 0x00000000
            ss  0x00000000 eflags 0x00000000 eip 0x0000d6b0 cs  0x00000000
            ds  0x00000000 es     0x00000000 fs  0x00000000 gs  0x00000000

There is a single LC_SEGMENT with an unusual segname. Not something we are used to!

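The same two load commands can be read without otool. What follows is only a minimal sketch, assuming a thin (non-fat) little-endian 32-bit Mach-O file and the standard layouts of mach_header, segment_command and the i386 thread state; the function names parse_load_commands and vm_to_file_offset are made up for this illustration.

```python
import struct

MH_MAGIC = 0xFEEDFACE   # 32-bit Mach-O magic
LC_SEGMENT = 0x1
LC_UNIXTHREAD = 0x5


def parse_load_commands(data):
    """Return a list of dicts for the load commands of a thin 32-bit Mach-O image."""
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags = struct.unpack_from("<7I", data, 0)
    if magic != MH_MAGIC:
        raise ValueError("not a thin little-endian 32-bit Mach-O file")
    commands = []
    offset = 28  # sizeof(struct mach_header)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        if cmd == LC_SEGMENT:
            # segname is a fixed 16-byte field and may have no NUL terminator.
            segname = data[offset + 8:offset + 24].rstrip(b"\0").decode("ascii", "replace")
            vmaddr, vmsize, fileoff, filesize = struct.unpack_from("<4I", data, offset + 24)
            commands.append({"cmd": "LC_SEGMENT", "segname": segname, "vmaddr": vmaddr,
                             "vmsize": vmsize, "fileoff": fileoff, "filesize": filesize})
        elif cmd == LC_UNIXTHREAD:
            # cmd, cmdsize, flavor, count, then 16 registers; eip is the 11th one.
            eip = struct.unpack_from("<I", data, offset + 16 + 10 * 4)[0]
            commands.append({"cmd": "LC_UNIXTHREAD", "eip": eip})
        else:
            commands.append({"cmd": hex(cmd)})
        offset += cmdsize
    return commands


def vm_to_file_offset(commands, address):
    """Translate a virtual address into a file offset using the LC_SEGMENT entries."""
    for c in commands:
        if c["cmd"] == "LC_SEGMENT" and c["vmaddr"] <= address < c["vmaddr"] + c["filesize"]:
            return address - c["vmaddr"] + c["fileoff"]
    raise ValueError("address is not backed by any segment in the file")
```

Something like parse_load_commands(open("mmpress.i386", "rb").read()) would be the way to try it on the thin binary. With the values otool printed above (vmaddr 0xa000, fileoff 0), the translation is plain arithmetic: an address such as the entrypoint 0xd6b0 would sit at file offset 0x36b0 of a thin file, which is handy whenever an address has to be located in a hex editor.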
1 - Disassembling
-----------------

The next logical step is to try to disassemble and see what happens... First attempt:

$ otool -t -V mmpress
mmpress:
(__TEXT,__text) section

Hum... nothing... I was kind of expecting it, since that segname is not normal and otool
doesn't recognize it. Otx gives the same result since it depends on otool for the disassembly.

IDA should be able to do the job, since we can load this binary as an unknown type and
point IDA to the correct place to start disassembling. The EIP is at 0xd6b0, so that's our
starting point. I had to lipo the binary because IDA complained about it having a 64-bit
version. It's easier to work with non-fat binaries, so I extracted the i386-only version
(lipo -thin i386 -output mmpress.i386 mmpress).

You can load the binary as Mach-O. Nothing will be disassembled and you will get a
collapsed segment header. Expand it... Now you get the whole program bytes... Press 'c' and
directly convert everything to code. Now press 'g' (or via menu Jump -> Jump to address) and
go to the entrypoint address, 0xd6b0. Now press 'c' again... Voila! Finally we have some
disassembly. It's a small block of code and, if you take a quick look, there is an
interesting sysenter call!

HEADER:0000D708 sub_D708        proc near       ; CODE XREF: sub_D70E+4p
HEADER:0000D708                                 ; sub_D70E+13p
HEADER:0000D708 0F B6 C0        movzx   eax, al ; <- eax = SYS_mmap (0xc5 / 197)
HEADER:0000D708                                 ; mmap -- map files or devices into memory
HEADER:0000D708                                 ; mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t offset);
HEADER:0000D70B 5A              pop     edx
HEADER:0000D70C 0F 34           sysenter

The code reference comes from:

HEADER:0000D70E sub_D70E        proc near       ; CODE XREF: HEADER:0000D6EBp
HEADER:0000D70E
HEADER:0000D70E arg_0           = dword ptr  4
HEADER:0000D70E
HEADER:0000D70E B0 C5           mov     al, 0C5h        ; SYS_mmap (0xc5)
HEADER:0000D710 8B CC           mov     ecx, esp
HEADER:0000D712 E8 F1 FF FF FF  call    sub_D708
HEADER:0000D717 73 0D           jnb     short locret_D726 ; mmap returned ok ?
HEADER:0000D717                                         ; jump if ok, else don't jump and exit
HEADER:0000D719 89 44 24 04     mov     [esp+arg_0], eax
HEADER:0000D71D B0 01           mov     al, 1           ; SYS_exit (0x1)
HEADER:0000D71F 8B CC           mov     ecx, esp
HEADER:0000D721 E8 E2 FF FF FF  call    sub_D708
HEADER:0000D726
HEADER:0000D726 locret_D726:                            ; CODE XREF: sub_D70E+9j
HEADER:0000D726 C2 1C 00        retn    1Ch

Since I love the debugger, I went straight to gdb to start understanding the magic of this
packer. This takes us to chapter 2!

2 - Breakpointing
-----------------

The best place for a breakpoint is where everything starts, so the EIP at 0xd6b0 is where I
want to set it. Here is what happened...

$ gdb
GNU gdb 6.3.50-20050815 (Apple version gdb-768) (Sun Mar 22 01:47:54 UTC 2009)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-apple-darwin".
gdb$ exec-file mmpress.i386
gdb$ b *0xd6b0
Breakpoint 1 at 0xd6b0
gdb$ r
MATCODE comPRESSor for mac os x executables
Copyright (C) 2007,2008, MATCODE Software, MACHO-MPRESS v1.21

Usage: mmpress [options]
options:
 -b - create backup file
 -u - do not remove unsupported architectures (mac os x - ub)
 -i - ignore compression result
 -L - display short license

Program exited with code 01.
--------------------------------------------------------------------------[regs]
  EAX:Error while running hook_stop:
No registers.
gdb$ hbreak *0xd6b0
Hardware assisted breakpoint 2 at 0xd6b0
gdb$ r
MATCODE comPRESSor for mac os x executables
Copyright (C) 2007,2008, MATCODE Software, MACHO-MPRESS v1.21

Usage: mmpress [options]
options:
 -b - create backup file
 -u - do not remove unsupported architectures (mac os x - ub)
 -i - ignore compression result
 -L - display short license

Program exited with code 01.
--------------------------------------------------------------------------[regs]
  EAX:Error while running hook_stop:
No registers.
gdb$

Software and hardware breakpoints aren't working... I bet it's because of the weird
segname. We could try to change some opcode to 0xCC (INT 3) and see if gdb respects it and
breaks (I bet it does!). A good candidate is the instruction at 0xd6b2 (mov edi, ebx). We
can replace the bytes 8B FB with CC 90 (int3 plus a nop to fill the extra byte). Losing that
mov looks harmless here: EBX has just been zeroed and the thread state in the header starts
EDI at zero anyway. Hexedit the binary, save and launch gdb again.

$ gdb
GNU gdb 6.3.50-20050815 (Apple version gdb-768) (Sun Mar 22 01:47:54 UTC 2009)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-apple-darwin".
gdb$ exec-file mmpress.i386
gdb$ r

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000d6b3 in ?? ()
--------------------------------------------------------------------------[regs]
  EAX: 00000000  EBX: 00000000  ECX: 00000000  EDX: 9618A452  o d I t s Z a P c
  ESI: 00000000  EDI: 00000000  EBP: 00000000  ESP: BFFFF864  EIP: 0000D6B3
  CS: 0017  DS: 001F  ES: 001F  FS: 0000  GS: 0000  SS: 001F
[0017:0000D6B3]-----------------------------------------------------------[code]
0xd6b3: nop
0xd6b4: call   0xd6b9
0xd6b9: pop    eax
0xd6ba: add    eax,0x27c
0xd6bf: push   DWORD PTR [eax]
0xd6c1: pusha
0xd6c2: mov    ecx,DWORD PTR [eax]
0xd6c4: mov    edx,DWORD PTR [eax+0x4]
--------------------------------------------------------------------------------
gdb$

Voila... :) But another problem shows up: if we try to step over, gdb gives an error and
the gdbinit pretty display doesn't appear. I don't know why this happens, but the single
step is executed, because if we issue the 'context' command after the error we can see that
EIP has advanced.
Probably it's another bug not fixed by Apple. Now we can set any other breakpoint. It will
be respected if we restart the program! Let's proceed to the third chapter...

3 - Decrypting, unpacking and dumping!
--------------------------------------

I always recommend a quick skim of the disassembly listing first, to get the big picture of
what is happening. Let me paste the listing of that small block of code:

HEADER:0000D6B0 2B DB           sub     ebx, ebx
HEADER:0000D6B2 8B FB           mov     edi, ebx
HEADER:0000D6B4 E8 00 00 00 00  call    $+5
HEADER:0000D6B9 58              pop     eax
HEADER:0000D6BA 05 7C 02 00 00  add     eax, 27Ch
HEADER:0000D6BF FF 30           push    dword ptr [eax]
HEADER:0000D6C1 60              pusha
HEADER:0000D6C2 8B 08           mov     ecx, [eax]
HEADER:0000D6C4 8B 50 04        mov     edx, [eax+4]
HEADER:0000D6C7 B7 04           mov     bh, 4
HEADER:0000D6C9 2B E3           sub     esp, ebx
HEADER:0000D6CB 8B 68 08        mov     ebp, [eax+8]
HEADER:0000D6CE 55              push    ebp
HEADER:0000D6CF 8B DA           mov     ebx, edx
HEADER:0000D6D1 32 DB           xor     bl, bl
HEADER:0000D6D3 2B D9           sub     ebx, ecx
HEADER:0000D6D5 8B 40 0C        mov     eax, [eax+0Ch]
HEADER:0000D6D8 50              push    eax
HEADER:0000D6D9 52              push    edx
HEADER:0000D6DA 51              push    ecx
HEADER:0000D6DB 57              push    edi
HEADER:0000D6DC 57              push    edi
HEADER:0000D6DD 6A FF           push    0FFFFFFFFh
HEADER:0000D6DF 68 12 10 00 00  push    1012h
HEADER:0000D6E4 6A 07           push    7
HEADER:0000D6E6 53              push    ebx
HEADER:0000D6E7 51              push    ecx
HEADER:0000D6E8 8D 71 1C        lea     esi, [ecx+1Ch]
HEADER:0000D6EB E8 1E 00 00 00  call    sub_D70E        ; map file into memory
HEADER:0000D6F0 59              pop     ecx
HEADER:0000D6F1 5A              pop     edx
HEADER:0000D6F2 E8 32 00 00 00  call    sub_D729        ; unpack
HEADER:0000D6F7 0B ED           or      ebp, ebp
HEADER:0000D6F9 75 08           jnz     short loc_D703
HEADER:0000D6FB 81 C4 04 04 00 00 add   esp, 404h
HEADER:0000D701 61              popa
HEADER:0000D702 58              pop     eax
HEADER:0000D703
HEADER:0000D703 loc_D703:                               ; CODE XREF: HEADER:0000D6F9j
HEADER:0000D703 E9 3D 02 00 00  jmp     loc_D945
(...)
HEADER:0000D945 loc_D945:                               ; CODE XREF: HEADER:loc_D703j
HEADER:0000D945 E9 B6 C6 FF FF  jmp     loc_A000        ; return to unpacked/decrypted entrypoint ?
HEADER:0000D945 HEADER          ends
HEADER:0000D945
HEADER:0000D945
HEADER:0000D945                 end

If we ignore those calls, we can clearly see that the jump at 0xd703 will always be
executed! The JNZ at 0xd6f9 only skips the stack cleanup; it cannot avoid the jump.
loc_D945 holds another jump and then there's no more code available. If you look at
location A000, this is what you get:

HEADER:0000A000 CE              into
HEADER:0000A001 FA              cli
HEADER:0000A002 ED              in      eax, dx
HEADER:0000A003 FE 07           inc     byte ptr [edi]

What is this telling us? That code doesn't make any sense, so it's either encrypted or
packed :)

There are two calls, at 0xd6eb and 0xd6f2. If you follow the first one, you will see that
it calls the mmap function. So it must be the second call that does the
decryption/unpacking. It's very easy to test. Set a breakpoint before the call and another
after the call, and then look at the instructions at memory location 0xa000. Let me show you:

gdb$ b *0x0000D6F2
Breakpoint 1 at 0xd6f2
gdb$ b *0x0000D6F7
Breakpoint 2 at 0xd6f7
gdb$ c

Breakpoint 1, 0x0000d6f2 in ?? ()
--------------------------------------------------------------------------[regs]
  EAX: 00001000  EBX: 00009000  ECX: 00001000  EDX: 0000A0A4  o d I t s z a P c
  ESI: 0000101C  EDI: 00000000  EBP: 00001384  ESP: BFFFF438  EIP: 0000D6F2
  CS: 0017  DS: 001F  ES: 001F  FS: 0000  GS: 0000  SS: 001F
[0017:0000D6F2]-----------------------------------------------------------[code]
0xd6f2: call   0xd729
0xd6f7: or     ebp,ebp
0xd6f9: jne    0xd703
0xd6fb: add    esp,0x404
0xd701: popa
0xd702: pop    eax
0xd703: jmp    0xd945
0xd708: movzx  eax,al
--------------------------------------------------------------------------------
gdb$ x/5i 0xa000
0xa000: into
0xa001: cli
0xa002: in     eax,dx
0xa003: inc    BYTE PTR [edi]
0xa005: add    BYTE PTR [eax],al
gdb$ c

Breakpoint 2, 0x0000d6f7 in ?? ()
--------------------------------------------------------------------------[regs]
  EAX: 000091C3  EBX: 00009000  ECX: 0000360C  EDX: 0000A0A4  o d I t s Z a P c
  ESI: 0000101C  EDI: 00000000  EBP: 00001384  ESP: BFFFF43C  EIP: 0000D6F7
  CS: 0017  DS: 001F  ES: 001F  FS: 0000  GS: 0000  SS: 001F
[0017:0000D6F7]-----------------------------------------------------------[code]
0xd6f7: or     ebp,ebp
0xd6f9: jne    0xd703
0xd6fb: add    esp,0x404
0xd701: popa
0xd702: pop    eax
0xd703: jmp    0xd945
0xd708: movzx  eax,al
0xd70b: pop    edx
--------------------------------------------------------------------------------
gdb$ x/5i 0xa000
0xa000: lods   eax,DWORD PTR ds:[esi]
0xa001: dec    eax
0xa002: jne    0xa056
0xa004: mov    eax,DWORD PTR [esi+0x28]
0xa007: test   al,0x4
gdb$

The code is now decrypted. Since I didn't want to spend too much time understanding the
decryption, I just decided to dump the memory contents. Using our beloved vmmap command, we
get the following output:

$ vmmap -w -resident -interleaved -submap -allSplitLibs 826
(don't forget 826 is my PID, find what's yours!)
(...)
mapped file 0000a000-0000d000 [ 12K/ 12K] rwx/rwx SM=COW
(...)

So, while still at the 0xd6f7 breakpoint, we can dump that memory range:

gdb$ dump memory dump1 0xa000 0xd000

Now we have to load dump1 into IDA. IDA can't recognize its format, so load it as binary
and disassemble as 32-bit code. Once again, go to the first byte and press 'c'. You get
another small block of code (blue colour in IDA). It's useful to rebase the addresses so we
can follow along in gdb (Edit -> Segments -> Rebase program). Use the address 0xa000.

This time the code is a bit more complicated, but one thing is used again: the sysenter
stub, this time at 0xa16b. I breakpointed there and started monitoring which calls were
being used. One important detail: you must use hardware breakpoints, because software
breakpoints will make the program crash (a checksum or something else that detects the
software breakpoints).

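Why a software breakpoint could upset the stub while a hardware one does not can be illustrated with a toy example. This is only a sketch of the guess made above, not MPRESS's actual check: the byte-sum checksum and the helper names are hypothetical, and the stub bytes are the encodings of movzx eax, al / pop edx / sysenter as seen in the first listing.

```python
def checksum(code):
    """Sum of all bytes, truncated to 32 bits (a deliberately simple, hypothetical scheme)."""
    return sum(code) & 0xFFFFFFFF


def set_software_breakpoint(code, offset):
    """Return a copy of code with INT3 (0xCC) written at offset, as a debugger would do."""
    patched = bytearray(code)
    patched[offset] = 0xCC
    return bytes(patched)


# movzx eax, al / pop edx / sysenter
STUB = bytes([0x0F, 0xB6, 0xC0, 0x5A, 0x0F, 0x34])

reference = checksum(STUB)
patched = set_software_breakpoint(STUB, 0)

# Any code that re-reads its own bytes now sees different data.
print("checksum changed:", checksum(patched) != reference)
```

A software breakpoint rewrites a byte of the code itself, so anything that reads those bytes back (a checksum, or an unpacker that still treats them as data) gets a different result. A hardware breakpoint lives in the debug registers and leaves memory untouched. With that caveat in mind, here is the sysenter stub at 0xa16b: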
seg000:0000A16B sub_A16B        proc near       ; CODE XREF: sub_A171+4p
seg000:0000A16B                                 ; sub_A17D+4p ...
seg000:0000A16B                 movzx   eax, al ; SYS_mprotect (0x4a / 74)
seg000:0000A16B                                 ; mprotect -- control the protection of pages
seg000:0000A16B                                 ; mprotect(void *addr, size_t len, int prot);
seg000:0000A16B                                 ;
seg000:0000A16B                                 ; SYS_open (0x5)
seg000:0000A16B                                 ;
seg000:0000A16B                                 ; SYS_pread (0x99)
seg000:0000A16B                                 ;
seg000:0000A16B                                 ; SYS_mmap (0xc5)
seg000:0000A16B                                 ;
seg000:0000A16B                                 ; SYS_close (0x6)
seg000:0000A16E                 pop     edx
seg000:0000A16F                 sysenter

Those were the calls used. The close is the last one, and after it the program runs (I only
had a breakpoint at that address, and after seeing the close call the program executed
normally). So I traced past the close call and it came back to address 0xa1b2. This is the
code:

seg000:0000A1A7 sub_A1A7        proc near       ; CODE XREF: seg000:0000A077p
seg000:0000A1A7                 mov     al, 6   ; SYS_close -> put the syscall number into EAX
seg000:0000A1A9                 mov     ecx, esp
seg000:0000A1AB                 call    sub_A16B
seg000:0000A1B0                 jb      short loc_A1B5
seg000:0000A1B2                 retn    4

To use sysenter you push the arguments onto the stack and put the syscall number in EAX
(the stubs above also copy ESP into ECX and pop the return address into EDX before the
sysenter). The syscall numbers are available in syscall.h.

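The numbers collected so far can be put in a small lookup sketch. The syscall numbers are the ones from the listings in this text; the PROT_* and MAP_* values are the Darwin <sys/mman.h> constants as far as I know them, and the helper names syscall_name and decode_bits are made up for this illustration.

```python
# Syscall numbers seen in the stubs (value loaded into AL before sysenter).
SYSCALLS = {
    0x01: "exit",
    0x05: "open",
    0x06: "close",
    0x4A: "mprotect",
    0x99: "pread",
    0xC5: "mmap",
}

# Darwin <sys/mman.h> bits (assumed values, check your own headers).
PROT_BITS = {0x1: "PROT_READ", 0x2: "PROT_WRITE", 0x4: "PROT_EXEC"}
MAP_BITS = {0x0001: "MAP_SHARED", 0x0002: "MAP_PRIVATE",
            0x0010: "MAP_FIXED", 0x1000: "MAP_ANON"}


def syscall_name(al):
    """Name of the syscall for the number found in AL, or a hex fallback."""
    return SYSCALLS.get(al, "unknown (%#x)" % al)


def decode_bits(value, table):
    """Split a flag word into the names of the bits that are set."""
    names = [name for bit, name in sorted(table.items()) if value & bit]
    leftover = value & ~sum(table)  # bits not described by the table
    if leftover:
        names.append(hex(leftover))
    return names


# Arguments pushed before "call sub_D70E" in the first stub: prot = 7, flags = 1012h.
print(syscall_name(0xC5), decode_bits(7, PROT_BITS), decode_bits(0x1012, MAP_BITS))
```

Read this way, the pushes before call sub_D70E in the first stub look like mmap(ecx, ebx, 7, 0x1012, -1, 0): an anonymous, private, fixed mapping with read, write and execute permissions. With the register values shown at breakpoint 1 (ECX 0x1000, EBX 0x9000) that would cover 0x1000-0xa000. Back in the dump, the SYS_close wrapper sub_A1A7 shown above is the one to follow.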
IDA easily identifies the code reference for sub_A1A7:

seg000:0000A056 loc_A056:                       ; CODE XREF: seg000:0000A002j
seg000:0000A056                 pop     eax
seg000:0000A057                 push    edi
seg000:0000A058                 push    edi
seg000:0000A059                 push    eax
seg000:0000A05A                 call    sub_A199        ; call SYS_open
seg000:0000A05F                 mov     ebx, eax
seg000:0000A061                 mov     esi, esp
seg000:0000A063                 push    edi
seg000:0000A064                 push    edi
seg000:0000A065                 push    400h
seg000:0000A06A                 push    esi
seg000:0000A06B                 push    eax
seg000:0000A06C                 call    sub_A18B        ; call SYS_pread
seg000:0000A071                 call    sub_A09C
seg000:0000A076                 push    ebx
seg000:0000A077                 call    sub_A1A7        ; call SYS_close
seg000:0000A07C                 add     esp, 400h
seg000:0000A082                 call    $+5
seg000:0000A087                 pop     eax
seg000:0000A088                 add     eax, 0Eh
seg000:0000A08D                 mov     [eax], edi
seg000:0000A08F                 popa
seg000:0000A090                 call    sub_A099

You can trace the earlier calls in the same way to see what they do. After the close, the
call at 0xA090 is what we need to analyse... It's simply this:

seg000:0000A099 sub_A099        proc near       ; CODE XREF: seg000:0000A090p
seg000:0000A099                 pop     eax
seg000:0000A09A                 jmp     dword ptr [eax] ; CRACKME!
seg000:0000A09A sub_A099        endp ; sp-analysis failed

That jump at 0xa09a is very suspicious... If you follow it you will land at address
0x8fe01010 (might be different for you). Using vmmap again to find out what this address
corresponds to, we get:

(...)
__TEXT 8fe00000-8fe2e000 [ 184K/ 184K] r-x/rwx SM=COW /usr/lib/dyld
(...)

It belongs to dyld, the dynamic link editor... Continue stepping, ignoring the first and
second calls that appear there. The last instruction will be a jmp eax. EAX points to
0x2414. Disassembling that address:

gdb$ x/10i 0x2414
0x2414: push   0x0
0x2416: mov    ebp,esp
0x2418: and    esp,0xfffffff0
0x241b: sub    esp,0x10
0x241e: mov    ebx,DWORD PTR [ebp+0x4]
0x2421: mov    DWORD PTR [esp+0x0],ebx
0x2425: lea    ecx,[ebp+0x8]
0x2428: mov    DWORD PTR [esp+0x4],ecx
0x242c: add    ebx,0x1
0x242f: shl    ebx,0x2

Now that looks like real code...

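Before moving on, the "suspicious" jump at 0xa09a can be checked with a little address arithmetic. Both stubs use the call/pop trick to find their own position, and a near call (opcode E8 plus a 4-byte displacement) pushes the address of the instruction after it. The sketch below only redoes that arithmetic with the addresses from the listings; the helper name return_address is made up.

```python
CALL_REL32_SIZE = 5  # opcode E8 plus a 4-byte displacement


def return_address(call_address):
    """Address pushed on the stack by a near call located at call_address."""
    return call_address + CALL_REL32_SIZE


# Second stub: call $+5 at 0xA082, pop eax, add eax, 0Eh, mov [eax], edi
slot_written = return_address(0xA082) + 0x0E

# Second stub: call sub_A099 at 0xA090, then pop eax / jmp dword ptr [eax]
slot_read = return_address(0xA090)

# First stub: call $+5 at 0xD6B4, pop eax, add eax, 27Ch
table_base = return_address(0xD6B4) + 0x27C

print(hex(slot_written), hex(slot_read), hex(table_base))
# prints: 0xa095 0xa095 0xd935
```

Both slot addresses come out as 0xa095, so the stub appears to store EDI in the dword right after the call and then jump through it; EDI presumably holds the dyld entry point at that moment. In the first stub the base lands at 0xd935, 16 bytes before loc_D945, which fits the four dwords it reads at +0, +4, +8 and +0Ch. With the jump explained and the entry point at 0x2414 reached, the remaining question is what sits around it in memory.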
Using vmmap again:

$ vmmap -w -resident -interleaved -submap -allSplitLibs 826
Virtual Memory Map of process 826 (mmpress.i386)
Output report format:  2.2  -- 32-bit process

==== regions for process 826  (non-writable and writable regions are interleaved)
system     00000000-00001000 [    4K/    0K] ---/--- SM=NUL  /Users/reverser/Projects/mpress/mac-mpress/mmpress.i386
__TEXT     00001000-00003000 [    8K/    8K] r-x/rwx SM=COW  /Users/reverser/Projects/mpress/mac-mpress/mmpress.i386
__TEXT     00003000-00006000 [   12K/   12K] r-x/rwx SM=ZER  /Users/reverser/Projects/mpress/mac-mpress/mmpress.i386
__DATA     00006000-00007000 [    4K/    4K] rw-/rwx SM=ZER  /Users/reverser/Projects/mpress/mac-mpress/mmpress.i386
__IMPORT   00007000-00008000 [    4K/    4K] rwx/rwx SM=ZER  /Users/reverser/Projects/mpress/mac-mpress/mmpress.i386
__LINKEDIT 00008000-00009000 [    4K/    4K] r--/rwx SM=ZER  /Users/reverser/Projects/mpress/mac-mpress/mmpress.i386

If you already have some experience with vmmap you will quickly recognize that as a fully
mapped binary :) I bet it's the unpacked binary waiting for us to dump it!!! Let's give it
a try:

gdb$ dump memory mmpress.unpacked 0x1000 0x9000

In another terminal window:

$ chmod +x mmpress.unpacked
$ ./mmpress.unpacked
MATCODE comPRESSor for mac os x executables
Copyright (C) 2007,2008, MATCODE Software, MACHO-MPRESS v1.21

Usage: mmpress [options]
options:
 -b - create backup file
 -u - do not remove unsupported architectures (mac os x - ub)
 -i - ignore compression result
 -L - display short license

Voila! It's unpacked in full glory. Otool is able to read its header and disassemble it
without a problem.

This method will work for command line binaries not linked to Cocoa. If you dump
Objective-C binaries, you will get errors about the NIBs. It's not the packer's fault but a
problem with dumping Objective-C binaries. I have been working on this to understand what
needs to be fixed when these binaries are dumped. You can still disassemble the unpacked
binary and have a look at what's happening.
Maybe that's enough for your objectives.

4 - Conclusion
--------------

And that's it! There's not much science in manually dumping mpress packed binaries, and a
non-professional hacker can do it ;). It's very easy to do, even with the Objective-C
problem. I intend to investigate that problem and release the info when I find it. If you
have any hints, feel free to leave a comment or mail me! I appreciate it :)

Sorry if there are any mistakes or if something is not well explained! It's not easy to
write tutorials!

Have fun!
fG!