How can I reprogram my shellcode snippet to avoid null bytes?

Question

I have written a small piece of x64 Linux assembly. All it does is print the line "Hello world". What I want to do is copy the bytes from its object file with objdump, so that I can make my own shellcode for my buffer overflow attacks.

The problem I am facing is that the shellcode contains lots of null bytes, and those will terminate the execution of my shellcode.

root@kali:~/C scripts/shellcode/Assembly Based Shellcode# cat print.asm
section .text
 
global _start
 
_start:
 
    mov rax, 1
    mov rdi, 1
    mov rsi, message
    mov rdx, 12
    syscall
 
    mov rax, 60
    xor rdi, rdi
    syscall
 
message:
    db "Hello world", 10
root@kali:~/C scripts/shellcode/Assembly Based Shellcode# nasm -f elf64 print.asm && ld print.o -o print && ./print
Hello world
root@kali:~/C scripts/shellcode/Assembly Based Shellcode# objdump -D print.o
 
print.o:     file format elf64-x86-64
 
 
Disassembly of section .text:
 
0000000000000000 <_start>:
   0:   b8 01 00 00 00          mov    $0x1,%eax
   5:   bf 01 00 00 00          mov    $0x1,%edi
   a:   48 be 00 00 00 00 00    movabs $0x0,%rsi
  11:   00 00 00
  14:   ba 0c 00 00 00          mov    $0xc,%edx
  19:   0f 05                   syscall
  1b:   b8 3c 00 00 00          mov    $0x3c,%eax
  20:   48 31 ff                xor    %rdi,%rdi
  23:   0f 05                   syscall
 
0000000000000025 <message>:
  25:   48                      rex.W
  26:   65 6c                   gs insb (%dx),%es:(%rdi)
  28:   6c                      insb   (%dx),%es:(%rdi)
  29:   6f                      outsl  %ds:(%rsi),(%dx)
  2a:   20 77 6f                and    %dh,0x6f(%rdi)
  2d:   72 6c                   jb     9b <message+0x76>
  2f:   64                      fs
  30:   0a                      .byte 0xa
root@kali:~/C scripts/shellcode/Assembly Based Shellcode#

I hoped the shellcode would be free of null bytes, but it is not. Can someone help me correct my code?

Use byte loads and find other alternatives to avoid zero bytes. Consult an instruction set reference. E.g. `mov rax, 1` can be written as `xor eax, eax; mov al, 1`. — Jester, Sep 06 '19 at 12:44
Unfortunately, I cannot find that instruction set reference. If you have it, please give it to me as well. Even this helped me, thank you! — nltc, Sep 06 '19 at 12:53
@nltc See [Intel® 64 and IA-32 Architectures Software Developer Manuals](https://software.intel.com/en-us/articles/intel-sdm). Another useful reference is http://ref.x86asm.net/ and https://c9x.me/x86/. — fuz, Sep 06 '19 at 12:58
Note that in addition to fixing the NUL bytes, you also need to make your code position independent. Let me see if I can write an appropriate answer. — fuz, Sep 06 '19 at 13:00
@nltc Position independent means that the address your code is loaded to need not be known at assembly time. For example, your `mov rsi, message` is not position independent because it needs to know what absolute address `message` is located at. To fix this, you need to use something like `lea rsi, [rel message]`. — fuz, Sep 06 '19 at 17:35
Oh, thank you! Sorry, I am very much a beginner at assembly programming and don't know everything about registers and instructions like **lea**. Anyway, I have tried to remove almost all the null bytes; just one is left. I will post the edited code along with the output on pastebin. Here is the link to it: https://pastebin.com/CjUpRzcd. What can I do to remove that remaining null byte? — nltc, Sep 06 '19 at 18:20
You can for example add a big offset that you then subtract, e.g. `lea rsi, [rel message+0x11111111]; sub rsi, 0x11111111` — Jester, Sep 06 '19 at 18:41
Thank you so much, that helped me! I successfully removed all the null bytes from my shellcode. However, I think the shellcode is useless, because when I test it through a C script it doesn't work. Again, everything is posted here: https://pastebin.com/haBhcB4s — nltc, Sep 06 '19 at 19:13
You should use `void*` not `int`. Even the warning says `int` is of the wrong size. Also the `+2` offset may not be correct, you'd have to look at the compiler generated assembly code to check that. To invoke your shellcode the simplest way is to use a function pointer. This happens to work for me: `void **ret = (void**)&ret + 2; *ret = shellcode;` — Jester, Sep 06 '19 at 19:34
@Jester sorry for my late reply. I have been trying out many things, and thank you that code helped me a lot. Now I programmed another assembly code. I tried to remove null bytes from it as well but some are not getting solved. How can I do that? Code and output here: https://pastebin.com/85zDFeQm — nltc, Sep 07 '19 at 13:00
You put `name` into a separate section. Don't do that, shellcode does not have sections, and especially not `.bss`. Put it at the end and fill it with something other than zeroes, e.g. put 64 `A` letters or something. — Jester, Sep 07 '19 at 13:48
Okay, I have done that, but I am encountering another error: https://pastebin.com/2q4S972Y. Since `name` is already filled with 64 `A`s, when I take input and store it in `name`, it gets executed in bash. How can I overcome this now? — nltc, Sep 07 '19 at 15:15
For the standalone version `.text` is not writable (should work in shellcode though). Use `ld -N` for linking. Also, you always print the full 64 bytes even if you did not read that much. — Jester, Sep 07 '19 at 17:36

Vuln X · Accepted Answer · 2020-09-30T12:26:26.343

You seem to be confused about both assembly and buffer overflows.

I rewrote the assembly file like this:

section .text

GLOBAL _start

_start:

    xor rax, rax                  ; Clear the RAX register
    push rax                      ; Push 8 zero bytes [ not actually needed: write takes an explicit length ]
    add al, 0x1                   ; RAX = 1, the system call number for write
    mov rdi, rax                  ; RDI = 1, the first argument of write ( the file descriptor ). The value for stdout is 1.
    lea rsi, [rel msg+0x41414141] ; RSI = RIP-relative address of msg plus a large 4-byte offset, so the encoded displacement has no NULL bytes.
    sub rsi, 0x41414141           ; Subtract the offset again so that RSI points at msg.
    xor rdx, rdx                  ; Clear RDX, the third argument of write
    mov dl, 0xc                   ; RDX = 12, the length of msg in bytes
    syscall                       ; system call

msg db "Hello world", 0xa

EDIT: As per the discussion in the comments, I have removed the terminating NULL byte:

section .text

GLOBAL _start

_start:

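    push 0x1                      ; RAX = 1 ( write ) via push/pop, no zero bytes
    pop rax
    mov rdi, rax                  ; RDI = 1 ( stdout )
    mov rbx, 'AAAAArld'           ; "rld" in the top 3 bytes, 5 bytes of filler below
    shr rbx, 0x28                 ; Shift the filler out: RBX = "rld" followed by zero bytes
    push rbx
    mov rbx, 'Hello wo'
    push rbx                      ; The stack now holds "Hello world" followed by zero bytes
    mov rsi, rsp                  ; RSI = address of the string
    push 0xc                      ; RDX = 12; note that the 12th byte written is 0x00, not a newline
    pop rdx
    syscall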

and then I assembled and linked the program like this:

root@kali:~/Desktop/assembly# nasm -f elf64 main.asm; ld main.o -o main.elf; ./main.elf 
Hello world
Segmentation fault

The segfault doesn't really matter, because you want to use this as shellcode for a buffer overflow attack, where the process ends up crashing anyway.

Now, extract the bytes from the object code (the bytes shown below are those of the first listing above):

root@kali:~/Desktop/assembly# for i in $(objdump -D main.o | grep "^ " | cut -f2); do echo -n "\x$i"; done; echo
\x48\x31\xc0\x50\x04\x01\x48\x89\xc7\x48\x8d\x35\x4f\x41\x41\x41\x48\x81\xee\x41\x41\x41\x41\x48\x31\xd2\xb2\x0c\x0f\x05\x48\x65\x6c\x6c\x6f\x20\x77\x6f\x72\x6c\x64\x0a

[ OPTIONAL ]: You can test your shellcode before using it in a memory corruption exploit.

For that, copy the bytes from the above command into a new C file [ it seems you are confused about the C script as well ]:

#include <stdio.h>

int main(void) {

    char shellcode[] = "\x48\x31\xc0\x50\x04\x01\x48\x89\xc7\x48\x8d\x35\x4f\x41\x41\x41\x48\x81\xee\x41\x41\x41\x41\x48\x31\xd2\xb2\x0c\x0f\x05\x48\x65\x6c\x6c\x6f\x20\x77\x6f\x72\x6c\x64\x0a";
    // ret is a pointer to a function returning int [ the empty () means no parameters are declared ].
    // The cast converts the address of the shellcode array to that function pointer type,
    // so the bytes in the array can be called like a function.
    int (*ret)() = (int (*)())shellcode;
    ret(); // Execute the shellcode

}

Then compile the program and make sure to make the stack executable; otherwise you will get a segfault right here and the shellcode will not execute.

root@kali:~/Desktop/assembly# gcc -z execstack test.c; ./a.out
Hello world
Segmentation fault

From the above output it looks like the shellcode works just fine!

I have tried this on a basic application and it works so your problem should be solved.

Another point worth mentioning: you only have to remove NULL bytes if the application you are going to exploit reads its input with functions that stop at a NULL byte, like strcpy().

If it uses input functions like gets() or fgets(), then you don't need to worry about NULL bytes [ unless you are also going after a format string vulnerability ]. This is from the man page of fgets:

fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.

This means that NULL bytes shouldn't bother your exploit in that case. A newline (0x0a) is what ends the input instead.

I hope this clears up your doubts!

Thank you! Your answer seems to clarify my doubts and yes the new assembly file works perfectly! Marking this as the correct answer. — nltc, Sep 30 '20 at 05:09
`push rax` seems pointless. The `write` system call takes an explicit length, not a 0-terminated string. So it's unnecessary even if this did happen to get executed with RSP pointing 8 bytes after the end of `msg: db ...`. There's also several wasted REX prefixes. Also, you can avoid the `sub rsi, 0x41...` by jumping forward over the string so the `rel32` is negative, with high `0xff` bytes instead of zeros. (If you want to avoid `0xff` bytes as well, that's a good trick. Or I guess to make sure the newline is at the end, for fgets with wrong buffer sizes.) — Peter Cordes, Sep 30 '20 at 05:19
Also, `push 1` / `pop rax` is a more compact way to set a register to a small immediate constant. [Tips for golfing in x86/x64 machine code](https://codegolf.stackexchange.com/q/132981) — Peter Cordes, Sep 30 '20 at 05:19
@PeterCordes you are correct the code is a little bit long and somewhat unnecessary as well. But after all, from the question it seems like the user is not very experienced with assembly so I have picked up some points from the other comments to make things easier for the user to understand. — Vuln X, Sep 30 '20 at 05:30
Yes, I upvoted for the useful comments / text in the rest of the answer. But especially for an inexperienced user, introducing a `push rax` that wasn't there in the question is just misleading and confusing. It only 0-terminates `msg` under very specific circumstances which you don't mention, and that isn't needed. — Peter Cordes, Sep 30 '20 at 05:43
I agree with you! The NULL-terminated string was not required. I was probably thinking about posting a /bin/sh shellcode, so I had pushed the NULL byte on the stack. Moreover, this is the first time I have seen shellcode which prints "Hello World" rather than something which establishes a reverse connection, pops a shell or something like that. The purpose of asking for the shellcode might be to understand how to evade NULL bytes, and I posted the answer with that in mind! Anyway, I have edited the code in the answer and it decreased the shellcode length too, so it should be better! — Vuln X, Sep 30 '20 at 12:31
