syscalls? huh? what is that?

Well, the man page for syscalls says that “The system call is the fundamental interface between an application and the Linux kernel.” It also says “System calls are generally not invoked directly, but rather via wrapper functions in glibc (or perhaps some other library)”. For a user-level application to interact with the kernel, syscalls are required. But what are those wrapper functions in glibc? Let's understand this using an example.

Here is a small program written in C.

key_check.c

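#include <stdio.h>
#include <string.h>   /* strcmp */

int main(int argc, char *argv[]) {
    char secret[20];
    printf("Enter a secret key:\n");

    /* Read at most 19 characters so secret[] cannot overflow. */
    if (scanf("%19s", secret) != 1) {
        return 1;
    }

    if (strcmp(secret, "youhaveme") == 0) {
        printf("Welcome!\n");
    } else {
        printf("Try again!\n");
    }

    return 0;
}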

This is just a simple C program that compares the string entered by the user with “youhaveme”. If they are equal it prints “Welcome!” to the console, otherwise it prints “Try again!”.

Now compile this using gcc:

gcc key_check.c -o key_check

Now trace the system calls made by the binary using strace:

strace ./key_check


what the heck is all that!!!!???? x_x

Well, this is the list of syscalls made by our key_check binary. We can ignore most of it for now and focus on the last few lines.

It contains the string we passed to printf, but instead of printf, write is used. 🤔

Basically, printf is a libc function that formats the text, buffers it, and then hands it to the kernel through the write syscall. We can also call write directly in our program. According to the man page of write, it takes three parameters: a file descriptor, the address of a buffer, and the number of bytes to write from that buffer.

One important point: running strace on a binary actually executes that binary.

We can use 1 as the file descriptor, which denotes standard output. For the second argument we can pass a string literal directly; the compiler stores the string in the binary and passes its address to write. The third argument is the number of bytes to write.

write(1, "YOUHAVEME\n", 10);

You can scroll down the man page of syscalls and have a look at the different syscalls available. Now let’s come back to our key_check binary.

We can save the output of strace in a text file so that we can filter or analyse it better. To do so, we use the -o flag.

strace -o key_check_syscalls ./key_check

All the lines before the write syscall in the strace output are syscalls made while the program is starting up: execve runs the binary, and the dynamic loader maps it and its shared libraries (such as libc) into memory before main gets to run.

The values after the equals sign are the return values of each syscall. For example, the return value of the write syscall is the number of bytes it wrote. In our example it’s 20, the length of “Enter a secret key:\n”.

So many useless syscalls for our analysis…. what do I do?

Well, we can tell strace to show only the syscalls we care about. Let’s trace all the read syscalls from our binary. For this we have to use the “-e” flag.

strace -e trace='read' ./key_check

With this filter, strace shows only the read syscalls, which makes it much more convenient to analyse one specific part of the binary.

Now let's try something different 🥶

Let's create a child process and see if strace can trace the syscalls made by the child process.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
void create_process()
{
    
    if (fork() == 0) {
        printf("Hello from Child!\n");
    } else {
        printf("Hello from Parent!\n");
    }
}

int main()
{
    create_process();
    return 0;
}

Now compile it into a binary named child_proc and analyse the output of strace.

Let’s filter for just the write syscalls:

strace -e trace='write' ./child_proc


We can see the write syscall made by the parent process, but where is the syscall made by the child??? 🤔🤔

Well, by default strace does not trace the syscalls made by child processes. To trace them as well, we can use ‘-f’ or ‘--follow-forks’.

strace -f ./child_proc

We can see that the output is somewhat bigger than the previous output. It includes all the syscalls made by both the parent and the child process.

Let’s filter for the write syscalls again, this time keeping the -f flag:

strace -f -e trace='write' ./child_proc

Now both parent’s and child’s syscalls are visible.

Thanks for reading if you made it this far… <3