My company recently sent me this fabulous t-shirt:
The text printed on it looks very much like the output of the objdump -d shellcode.o command.
Let's figure out if the shellcode actually works.
If you have ever tried to write your own shellcode, you will immediately recognize this jmp, call, pop pattern.
In pure form it looks like this (NASM syntax):
    jmp short _end
_run:
    pop esi        ; now esi register contains address of data label
    ; actual shellcode instructions go here
_end:
    call _run
data:
    db 'some data'
When we use shellcode on a real system, we don't know at which memory address our shellcode will be loaded. Sometimes this information can be very useful, e.g. when our shellcode contains not only code but also data.
While jmps and calls can operate on relative addresses, thus allowing us to write position independent code (PIC), the data access instructions (movs) need absolute addresses.
NOTE: The last sentence is no longer true on the x86_64 architecture, because it introduced a new addressing mode called "RIP relative addressing".
When we use jmp and call instructions in relative address mode, we are actually using offsets relative to the next instruction following the jmp or call opcode.
So a relative jump jmp short 0 (again NASM syntax, with the operand read as the encoded displacement) will just jump to the next instruction, and jmp short -2 will create an infinite loop (assuming that the entire jmp instruction takes two bytes).
The call offset instruction is more interesting, as it will not only jump to the offset, but will also push the address of the next instruction on the stack (the so-called return address).
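To make the arithmetic concrete, here is a minimal C sketch of how a relative jmp or call is resolved. The function names branch_target and return_address are made up for this illustration, and addresses are modeled as plain 32-bit integers rather than real pointers.

```c
#include <stdint.h>

/* Target of a relative jmp/call: the displacement is added to the
 * address of the NEXT instruction, not to the address of the branch. */
uint32_t branch_target(uint32_t insn_addr, uint32_t insn_len, int32_t disp)
{
    /* Unsigned arithmetic wraps around, which matches 32-bit address math. */
    return insn_addr + insn_len + (uint32_t)disp;
}

/* Value that call pushes on the stack before jumping:
 * the address of the instruction (or data) that follows the call. */
uint32_t return_address(uint32_t insn_addr, uint32_t insn_len)
{
    return insn_addr + insn_len;
}
```

For a two byte jmp placed at address 0x100, branch_target(0x100, 2, 0) gives 0x102 (the next instruction) and branch_target(0x100, 2, -2) gives 0x100 (the jmp itself, hence the infinite loop).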
Now we can see how the jmp, call, pop pattern works.
First we need to place the call instruction just before the data whose address we want to get. Then we perform a relative jump to the call. The call will put the address of the next instruction (in this case our data) on the stack and will again perform a relative jump to the specified offset. Now we have the address of our data on the stack, so we can just pop it into a register of our choice.
When we take a look at the t-shirt again, we can see that the actual offsets printed there are wrong. jmp 0x2b should actually be jmp 0x2a, because the address of the call instruction is 0x2f = 0x05 + 0x2a. The call instruction, on the other hand, should jump to the pop esi instruction, so its offset must satisfy 0x2f (call addr) + 0x05 (size of call instruction) + offset = 0x05, which gives offset = -0x2f (using 2's complement this value can be represented as 0xffffffd1).
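The corrected offsets can be double-checked with a few lines of C. This is only a sketch: branch_disp is a hypothetical helper, and the instruction addresses (jmp at 0x00, pop esi at 0x05, call at 0x2f, both branches five bytes long) are taken from the calculation above.

```c
#include <stdint.h>

/* Displacement that must be encoded so that a relative branch located at
 * insn_addr (insn_len bytes long) lands on target. */
int32_t branch_disp(uint32_t insn_addr, uint32_t insn_len, uint32_t target)
{
    /* The subtraction wraps modulo 2^32; converting the result to int32_t
     * reinterprets it as a two's complement value on mainstream compilers. */
    return (int32_t)(target - (insn_addr + insn_len));
}
```

branch_disp(0x00, 5, 0x2f) yields 0x2a for the jmp, and branch_disp(0x2f, 5, 0x05) yields -0x2f, which is 0xffffffd1 when viewed as an unsigned 32-bit value.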
Right after pop esi we have a sequence of three move instructions:
mov dword ptr [esi+0x8], esi
mov byte ptr [esi+0x7], 0x0
mov dword ptr [esi+0xc], 0x0
We know now that esi points to the area right after our last shellcode instruction.
We can illustrate this memory area as:
ESI+0: ??|??|??|??
ESI+4: ??|??|??|??
ESI+8: ??|??|??|??
ESI+c: ??|??|??|??
After executing all these move instructions (in the Intel syntax that we have here, it is always mov dest, src) our memory area will look like this:
ESI+0: ??|??|??|??
ESI+4: ??|??|??|00
ESI+8: [value of esi register]
ESI+c: 00|00|00|00
Now this is interesting. It looks like we have a seven character string terminated by zero, then a pointer to that string and a NULL value.
A seven character string that fits here is /bin/sh 😀 So it looks like the shellcode on the t-shirt is truncated; the last two instructions should look like this:
call 0xffffffd1 ; should be a relative call
db '/bin/sh'
And our mysterious memory area should be:
ESI+0: /|b|i|n
ESI+4: /|s|h|00
ESI+8: [value of esi register]
ESI+c: 00|00|00|00
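The effect of the three movs on this area can be replayed in C. The sketch below models the 16 bytes starting at esi as a byte array and treats esi as a plain 32-bit number; apply_movs is a hypothetical name, and any address passed to it is just an example value.

```c
#include <stdint.h>
#include <string.h>

/* Replay the three mov instructions on a 16-byte area that would start
 * at address esi. Offsets are the same as in the shellcode. */
void apply_movs(uint8_t area[16], uint32_t esi)
{
    uint32_t zero = 0;

    memcpy(area + 0x8, &esi, sizeof esi);    /* mov dword ptr [esi+0x8], esi */
    area[0x7] = 0x00;                        /* mov byte ptr [esi+0x7], 0x0  */
    memcpy(area + 0xc, &zero, sizeof zero);  /* mov dword ptr [esi+0xc], 0x0 */
}
```

If the first seven bytes of the area already hold /bin/sh, then after apply_movs the area contains a zero terminated string, followed by a pointer to it, followed by a NULL.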
Now that we know what the missing bytes are, we can guess that our shellcode is calling execve, one of the exec family of functions.
In C, execve is declared in unistd.h as:
int execve(const char *path, char *const argv[], char *const envp[]);
It takes three arguments that should be known to every C programmer out there.
Both the argv and envp arrays contain pointers to strings and must be terminated by an entry containing NULL. Here is how we can use execve in C:
#include <unistd.h>

int main(int argc, char **argv) {
    char *args[] = { "/bin/sh", NULL };
    char *env[] = { NULL };
    execve(args[0], args, env);
}
Actually, when env is empty we can compress this code a bit (by reusing the NULL already present in the args array):
#include <unistd.h>

int main(int argc, char **argv) {
    char *args[] = { "/bin/sh", NULL };
    execve(args[0], &args[0], &args[1]);
}
Notice that the args array looks similar to our memory area starting at ESI+8.
When we return to the t-shirt code and check the next instructions we see:
mov eax, 0xb        ; execve(filename, argv, envp)
mov ebx, esi
lea ecx, [esi+0x8]
lea edx, [esi+0xc]
int 0x80
The int 0x80 instruction is the traditional way to call the Linux kernel from 32-bit code (64-bit code nowadays usually uses the syscall instruction).
When we call a system function, we pass the function arguments in the ebx, ecx, edx, esi, edi and ebp registers, in exactly that order.
The eax register is used to select the function itself. We can see the list of all available functions here.
For example, to call exit(0), first we need to check the number that the system assigned to the exit function (0x01) and put it in the eax register.
exit(0) takes one argument. We must put that argument value in the ebx register (subsequent arguments would go in ecx, then edx and so on).
Finally we can call the kernel using the int 0x80 software interrupt:
; C equivalent:
; exit(0);
mov eax, 0x1
mov ebx, 0x0
int 0x80
The execve function is assigned the number 0x0b. And when we look at the t-shirt again, there, after a block of movs, we can see that exactly this function is called:
mov eax, 0xb        ; execve(filename, argv, envp)
mov ebx, esi
lea ecx, [esi+0x8]
lea edx, [esi+0xc]
int 0x80
The lea instruction is used to load the address of the operand into the specified register.
But here, since we use indirect memory addressing, lea ecx, [esi+0x8] is equivalent to ecx = esi + 0x08 in C.
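The difference between lea and a mov that reads from the same operand is easy to miss, so here is a small C model of both. It is only an illustration: memory is a byte array indexed by address, and the names lea and mov_load are made up for this sketch.

```c
#include <stdint.h>
#include <string.h>

/* lea reg, [base+disp]: computes the address only, memory is never read. */
uint32_t lea(uint32_t base, uint32_t disp)
{
    return base + disp;
}

/* mov reg, dword ptr [base+disp]: reads the four bytes stored at that
 * address. mem models memory as a byte array indexed by address. */
uint32_t mov_load(const uint8_t *mem, uint32_t base, uint32_t disp)
{
    uint32_t value;
    memcpy(&value, mem + base + disp, sizeof value);
    return value;
}
```

With esi equal to 0x10, lea(0x10, 0x8) is simply 0x18, while mov_load returns whatever value happens to be stored at address 0x18.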
After all these movs and leas we have the address of the /bin/sh string in ebx, the address of the args array (pointer to /bin/sh followed by NULL) in ecx and finally the address of NULL in edx.
In other words, our code is equivalent to the C code that we saw earlier:
#include <unistd.h>

int main(int argc, char **argv) {
    char *args[] = { "/bin/sh", NULL };
    execve(args[0], &args[0], &args[1]);
}
What follows the call to execve is a call to exit(0). This is a standard technique used in shellcode to just exit the program without crashing it. This way we leave no traces of our code (think no coredumps).
All in all, the code on my t-shirt should look like this:
    jmp short _sh_last
_sh_start:
    pop esi
    mov dword [esi+0x8], esi
    mov byte [esi+0x7], 0x0
    mov dword [esi+0xc], 0x0
    mov eax, 0xb        ; execve(filename, argv, envp)
    mov ebx, esi
    lea ecx, [esi+0x8]
    lea edx, [esi+0xc]
    int 0x80
    mov eax, 0x1        ; exit(0)
    mov ebx, 0x0
    int 0x80
_sh_last:
    call _sh_start
    db '/bin/sh'
Now the moral of this story: always put a working shellcode on t-shirts to avoid further embarrassment by posts like this one 😉
Bonus: This repo contains a Makefile that will build the shellcode and also prepare a C header file containing the shellcode bytes. There is also a wrapper program that will demonstrate that the shellcode indeed works. The only thing that you need is a 32-bit Linux.
You can check if a Linux system is 32-bit using the uname -a command:
uname -a
Linux 4.15.0-133-generic #137~16.04.1-Ubuntu SMP Fri Jan 15 02:55:05 UTC 2021 i686 i686 i686 GNU/Linux
If you see i386 or i686 then your system is 32-bit.
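The same check can be done from C, which may be handy if you want a wrapper to refuse to run on the wrong system. This is a sketch under assumptions: the helper names is_32bit_x86 and running_on_32bit_x86 are made up here, and only the two machine strings mentioned above are recognized. The uname() call and the machine field come from the POSIX sys/utsname.h header.

```c
#include <string.h>
#include <sys/utsname.h>

/* Returns 1 when the machine string reported by uname is one of the
 * 32-bit x86 names mentioned above, 0 otherwise. */
int is_32bit_x86(const char *machine)
{
    return strcmp(machine, "i386") == 0 || strcmp(machine, "i686") == 0;
}

/* Asks the running kernel for its machine string and checks it. */
int running_on_32bit_x86(void)
{
    struct utsname info;

    if (uname(&info) < 0)
        return 0; /* query failed, so do not claim a 32-bit system */
    return is_32bit_x86(info.machine);
}
```

is_32bit_x86("i686") returns 1, while a string such as "x86_64" returns 0.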
To compile the assembly code you will need nasm. You can install it using apt-get.
make clean
make all
./shellcode
./wrapper