71

I want to write a signal handler to catch SIGSEGV. I protect a block of memory against reads and writes using:

char *buffer;
char *p;
char a;
int pagesize = 4096;

mprotect(buffer, pagesize, PROT_NONE);

This protects pagesize bytes of memory starting at buffer against any reads or writes.

Second, I try to read the memory:

p = buffer;
a = *p;

This will generate a SIGSEGV, and my handler will be called. So far so good. My problem is that, once the handler is called, I want to change the access rights of the memory by doing

mprotect(buffer,pagesize,PROT_READ);

and then continue normal execution of my code. I do not want to exit the function. On future writes to the same memory, I want to catch the signal again, change the access rights, and record that event.

Here is the code:

#include <signal.h>
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>

#define handle_error(msg) \
    do { perror(msg); exit(EXIT_FAILURE); } while (0)

char *buffer;
int flag=0;

static void handler(int sig, siginfo_t *si, void *unused)
{
    printf("Got SIGSEGV at address: 0x%lx\n",(long) si->si_addr);
    printf("Implements the handler only\n");
    flag=1;
    //exit(EXIT_FAILURE);
}

int main(int argc, char *argv[])
{
    char *p; char a;
    int pagesize;
    struct sigaction sa;

    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = handler;
    if (sigaction(SIGSEGV, &sa, NULL) == -1)
        handle_error("sigaction");

    pagesize=4096;

    /* Allocate a buffer aligned on a page boundary;
       initial protection is PROT_READ | PROT_WRITE */

    buffer = memalign(pagesize, 4 * pagesize);
    if (buffer == NULL)
        handle_error("memalign");

    printf("Start of region:        0x%lx\n", (long) buffer);
    printf("Start of region:        0x%lx\n", (long) buffer+pagesize);
    printf("Start of region:        0x%lx\n", (long) buffer+2*pagesize);
    printf("Start of region:        0x%lx\n", (long) buffer+3*pagesize);
    //if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
    if (mprotect(buffer + pagesize * 0, pagesize,PROT_NONE) == -1)
        handle_error("mprotect");

    //for (p = buffer ; ; )
    if(flag==0)
    {
        p = buffer+pagesize/2;
        printf("It comes here before reading memory\n");
        a = *p; //trying to read the memory
        printf("It comes here after reading memory\n");
    }
    else
    {
        if (mprotect(buffer + pagesize * 0, pagesize,PROT_READ) == -1)
        handle_error("mprotect");
        a = *p;
        printf("Now i can read the memory\n");

    }
/*  for (p = buffer;p<=buffer+4*pagesize ;p++ ) 
    {
        //a = *(p);
        *(p) = 'a';
        printf("Writing at address %p\n",p);

    }*/

    printf("Loop completed\n");     /* Should never happen */
    exit(EXIT_SUCCESS);
}

The problem is that only the signal handler runs and I can't return to the main function after catching the signal.

Jeff Hammond
Adi
  • Thanks nos for editing. I appreciate that. I need to spend some time learning to edit my questions. – Adi Apr 18 '10 at 18:59
  • When compiling, always enable all the warnings, then fix them (for `gcc`, at a minimum use `-Wall -Wextra -pedantic`; I also use `-Wconversion -std=gnu99`). The compiler will tell you: 1) parameter `argc` unused; 2) parameter `argv` unused (suggest using the `main()` signature `int main( void )`); 3) local variable `p` used in the `else` code block without being initialized; 4) parameter `unused` unused (suggest adding the statement `(void)unused;` as the first line in that function); 5) local variable `a` set but not used. – user3629249 Sep 03 '16 at 20:51
  • NEVER use `printf()` in a signal handler! The function `write()` would be ok to use, but best to not do any I/O in a signal handler, just set a flag and let the main line of code be checking that flag – user3629249 Sep 03 '16 at 20:52
  • the variable `pagesize` is declared as an `int`, but it should be declared as a `size_t` – user3629249 Sep 03 '16 at 20:54
  • the `sig` parameter should be compared to SIGSEGV, as there are other signals, and such a comparison would remove the compiler message about an unused `sig` parameter – user3629249 Sep 03 '16 at 20:59
  • the function: `memalign()` is obsolete. the code should be using: `posix_memalign()` which also requires the definition of `_POSIX_C_SOURCE` with a value greater/equal to 200112L – user3629249 Sep 03 '16 at 21:08
  • the code (probably) should use: `getpagesize()` rather than hardcoding the value `4096`. – user3629249 Sep 03 '16 at 21:19
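
The last few comments (query the page size, prefer `posix_memalign()`) can be sketched roughly as follows. The helper name `alloc_pages` is hypothetical and not part of the question's code, and `sysconf(_SC_PAGESIZE)` is used here as the POSIX way of asking for the page size in place of `getpagesize()`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* needed for posix_memalign in strict modes */
#endif
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate `npages` pages aligned on a page boundary.
 * Stores the page size in *pagesize_out (if non-NULL).
 * Returns NULL on failure; release the memory with free(). */
static char *alloc_pages(size_t npages, size_t *pagesize_out)
{
    long ps = sysconf(_SC_PAGESIZE);   /* ask the system instead of assuming 4096 */
    if (ps <= 0)
        return NULL;

    void *mem = NULL;
    /* posix_memalign returns 0 on success and an error number otherwise */
    if (posix_memalign(&mem, (size_t)ps, npages * (size_t)ps) != 0)
        return NULL;

    if (pagesize_out != NULL)
        *pagesize_out = (size_t)ps;
    return mem;
}
```

A caller would write something like `buffer = alloc_pages(4, &pagesize);` and check for `NULL`. Note that POSIX only specifies `mprotect()` for memory obtained from `mmap()`, so protecting heap memory like this works on Linux but is not guaranteed elsewhere.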

5 Answers

74

When your signal handler returns (assuming it doesn't call exit or longjmp or something that prevents it from actually returning), the code will continue at the point the signal occurred, reexecuting the same instruction. Since at this point, the memory protection has not been changed, it will just throw the signal again, and you'll be back in your signal handler in an infinite loop.

So to make it work, you have to call mprotect in the signal handler. Unfortunately, as Steven Schlansker notes, mprotect is not async-signal-safe, so you can't safely call it from the signal handler. So, as far as POSIX is concerned, you're screwed.

Fortunately on most implementations (all modern UNIX and Linux variants as far as I know), mprotect is a system call, so is safe to call from within a signal handler, so you can do most of what you want. The problem is that if you want to change the protections back after the read, you'll have to do that in the main program after the read.
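
A minimal sketch of that approach, assuming Linux and assuming (as discussed above) that calling `mprotect` from the handler is tolerated by the implementation. The names `region`, `segv_handler` and `read_protected_byte` are made up for this illustration, and the page comes from `mmap` rather than `memalign` because POSIX only specifies `mprotect` for mapped memory.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE               /* for MAP_ANONYMOUS in strict modes */
#endif
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *region;                        /* the protected page */
static size_t region_size;
static volatile sig_atomic_t fault_count;   /* how often the handler ran */

static void segv_handler(int sig, siginfo_t *si, void *ctx)
{
    uintptr_t a = (uintptr_t)si->si_addr;
    (void)sig; (void)ctx;
    /* Only handle faults inside our page; anything else is a real crash. */
    if (a < (uintptr_t)region || a >= (uintptr_t)region + region_size)
        _exit(EXIT_FAILURE);
    fault_count++;
    /* Not async-signal-safe per POSIX; relies on mprotect being a plain syscall. */
    if (mprotect(region, region_size, PROT_READ) == -1)
        _exit(EXIT_FAILURE);
    /* Returning re-executes the faulting load, which now succeeds. */
}

/* Returns the byte read from the protected page, or -1 on setup failure. */
static int read_protected_byte(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = segv_handler;
    if (sigaction(SIGSEGV, &sa, NULL) == -1)
        return -1;

    region_size = (size_t)sysconf(_SC_PAGESIZE);
    region = mmap(NULL, region_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;
    region[region_size / 2] = 'x';
    if (mprotect(region, region_size, PROT_NONE) == -1)
        return -1;

    volatile char *p = region + region_size / 2;
    return *p;    /* faults once; the handler unprotects; the load is retried */
}
```

Execution continues in `read_protected_byte()` right after the read, which is the behaviour the question asks for. To trap the next access as well, the main code has to call `mprotect(region, region_size, PROT_NONE)` again once the read is done.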

Another possibility is to do something with the third argument to the signal handler, which points at an OS and arch specific structure that contains info about where the signal occurred. On Linux, this is a ucontext structure, which contains machine-specific info about the $PC address and other register contents where the signal occurred. If you modify this, you change where the signal handler will return to, so you can change the $PC to be just after the faulting instruction so it won't re-execute after the handler returns. This is very tricky to get right (and non-portable too).

edit

The ucontext_t structure is defined in <ucontext.h>. Within it, the field uc_mcontext contains the machine context, and within that, the array gregs contains the general register context. So in your signal handler (on x86-64 Linux; glibc requires _GNU_SOURCE to be defined for REG_RIP):

ucontext_t *u = (ucontext_t *)unused;
unsigned char *pc = (unsigned char *)u->uc_mcontext.gregs[REG_RIP];

will give you the pc where the exception occurred. You can read it to figure out what instruction it was that faulted, and do something different.
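
Put together, a handler that records both the faulting address and the pc might look like the sketch below. It assumes x86-64 Linux with glibc; on any other target it simply stores `NULL` for the pc. All names (`page`, `fault_pc`, `record_fault`, `trigger_fault`) are invented for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE               /* glibc only exposes REG_RIP with this defined */
#endif
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

static char *page;
static size_t page_len;
static void *volatile fault_addr;   /* address whose access faulted */
static void *volatile fault_pc;     /* instruction that faulted, if known */

static void record_fault(int sig, siginfo_t *si, void *context)
{
    ucontext_t *u = context;
    uintptr_t a = (uintptr_t)si->si_addr;
    (void)sig;
    fault_addr = si->si_addr;
#if defined(__x86_64__) && defined(REG_RIP)
    fault_pc = (void *)(uintptr_t)u->uc_mcontext.gregs[REG_RIP];
#else
    (void)u;
    fault_pc = NULL;                /* other architectures are not covered here */
#endif
    /* Unprotect so the retried load succeeds; bail out on foreign faults. */
    if (a < (uintptr_t)page || a >= (uintptr_t)page + page_len ||
        mprotect(page, page_len, PROT_READ) == -1)
        _exit(EXIT_FAILURE);
}

/* Returns 0 once the expected fault has been recorded, -1 otherwise. */
static int trigger_fault(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = record_fault;
    if (sigaction(SIGSEGV, &sa, NULL) == -1)
        return -1;

    page_len = (size_t)sysconf(_SC_PAGESIZE);
    page = mmap(NULL, page_len, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    volatile char *p = page + page_len / 2;
    char c = *p;                    /* faults; the handler records and unprotects */
    (void)c;
    return fault_addr == (void *)(page + page_len / 2) ? 0 : -1;
}
```

Skipping the faulting instruction would additionally require decoding its length and writing the new value back into `gregs[REG_RIP]`, which this sketch deliberately leaves out.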

As far as the portability of calling mprotect in the signal handler is concerned, any system that follows either the SVID spec or the BSD4 spec should be safe -- they allow calling any system call (anything in section 2 of the manual) in a signal handler.

Chris Dodd
  • Right, you can perform the memory access on behalf of the program (like a VM) and then update the instruction pointer. Calling `mprotect` is definitely easier. – Ben Voigt Apr 18 '10 at 19:22
  • hi chris, You have given me some useful information. Thanks for that.. Can you tell me how can i read the info in the ucontext structure (3rd argument and change the $PC) . I am curious to know about it. – Adi Apr 18 '10 at 19:25
  • @Ben Voigt, I did not clearly understand what you are saying; could you elaborate a little? – Adi Apr 18 '10 at 19:26
  • @chris, looks like i can do mprotect inside the signal handler and then return back safely to do my normal execution. I am not sure about portability as you guys mentioned , but I hope it is fine in my case. Thanks all for the help.. – Adi Apr 18 '10 at 19:43
  • @chris thanks for the explanation, i will see the PC using ur technique . – Adi Apr 19 '10 at 06:34
  • `mprotect` is not formally async-signal-safe, but in reality, there is no good reason an implementation of `mprotect` would fail to be AS-safe, since it has to be implemented as a syscall. – R.. GitHub STOP HELPING ICE Jun 22 '13 at 23:41
  • One can also use SIGTRAP to detect the start of the next instruction. – Cel Skeggs Jun 14 '15 at 04:44
  • @col6y: that requires replacing the next instruction with a breakpoint instruction, which is possible, though non-trivial (finding the next instruction on x86 is tricky). I'm not aware of any UNIX variant that will allow single-stepping via a signal handler, though most allow it via another process and ptrace(2) – Chris Dodd Jun 14 '15 at 21:12
  • @ChrisDodd If I'm not mistaken, you can also set the trap flag, which will cause SIGTRAP on every instruction. – Cel Skeggs Jun 15 '15 at 00:26
  • @col6y: that sounds plausible, though I've never seen anything like that done. The OS might interfere with setting the flag. Even if it didn't, you might trap the same instruction again -- need to check the CPU manual to understand precisely how the trap flag works on your processor. – Chris Dodd Jun 15 '15 at 02:09
  • @ChrisDodd take a look [at this question](http://stackoverflow.com/q/27080741/3369324) - that's where I found out about it. – Cel Skeggs Jun 15 '15 at 05:27
  • "mprotect is a system call, so is safe to call from within a signal handler": if you have any docs to support that, please add the quote at: https://stackoverflow.com/questions/11675040/does-linux-allow-any-system-call-to-be-made-from-signal-handlers – Ciro Santilli新疆棉花TRUMP BAN BAD Sep 20 '17 at 07:34
  • This is a great answer! – GL2014 Jan 04 '21 at 21:22
25

You've fallen into the trap that all people do when they first try to handle signals. The trap? Thinking that you can actually do anything useful with signal handlers. From a signal handler, you are only allowed to call async-signal-safe (reentrant) library functions.

See this CERT advisory as to why and a list of the POSIX functions that are safe.

Note that printf(), which you are already calling, is not on that list.

Nor is mprotect. You're not allowed to call it from a signal handler. It might work, but I can promise you'll run into problems down the road. Be really careful with signal handlers, they're tricky to get right!

EDIT

Since I'm being a portability douchebag at the moment already, I'll point out that you also shouldn't write to shared (i.e. global) variables without taking the proper precautions.
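
As a rough illustration of what a strictly conforming handler is limited to: set a flag of type `volatile sig_atomic_t`, and if output is really needed, use `write()`, which is on the POSIX async-signal-safe list. The sketch uses SIGUSR1 and `raise()` only to keep the demonstration harmless; `on_signal` and `install_and_raise` are hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction in strict modes */
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* The only object type that may portably be written from a handler. */
static volatile sig_atomic_t got_signal;

static void on_signal(int sig)
{
    static const char msg[] = "signal received\n";
    int saved_errno = errno;      /* write() may clobber errno */
    (void)sig;
    got_signal = 1;
    /* write() is async-signal-safe; printf() is not. */
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    errno = saved_errno;
}

/* Installs the handler, raises SIGUSR1, and returns the flag (-1 on error). */
static int install_and_raise(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_signal;
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    got_signal = 0;
    if (raise(SIGUSR1) != 0)      /* in a single-threaded program the handler runs before raise() returns */
        return -1;
    return got_signal;
}
```

The main line of code then checks `got_signal` and does the real work (logging, counters, cleanup) outside the handler.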

igk
Steven Schlansker
  • Hi Steven, if I can't do anything useful inside the signal handler, I will be OK if I can update some counters inside it and return to main and run my code normally. Is that possible? – Adi Apr 18 '10 at 19:03
  • quoting from the CERT advisory, "they may call other functions provided that all implementations to which the code is ported guarantee that these functions are asynchronous—safe". On linux that includes a lot more functions. – Ben Voigt Apr 18 '10 at 19:29
  • Sure, but you have to just be aware of the problem! I can't name off the top of my head which functions are and aren't signal safe, and I doubt many could! – Steven Schlansker Apr 19 '10 at 01:59
  • The CERT Secure Coding is a great site, I didn't know about it. It seems I got some new reading for a while :) – alecov Aug 12 '10 at 05:47
  • If you can't do anything useful in signal handlers, why do they exist? – Mawg says reinstate Monica Jun 29 '16 at 15:36
  • Best is to put the sighandler data into a reentrant queue and then process it from the main process loop. – peterh Feb 10 '17 at 12:45
  • "You're not allowed to call [mprotect] from a signal handler." Only if you need to strictly conform to POSIX. Glibc's mprotect today is async-signal-safe: https://www.gnu.org/software/libc/manual/html_node/Memory-Protection.html#index-mprotect – Joseph Sible-Reinstate Monica Sep 26 '18 at 19:59
13

You can recover from SIGSEGV on Linux. You can also recover from segmentation faults on Windows (you'll see a structured exception instead of a signal). But the POSIX standard doesn't guarantee recovery, so your code will be very non-portable.

Take a look at libsigsegv.

Jeff Hammond
Ben Voigt
5

You should not return from the signal handler, because the behavior is then undefined. Rather, jump out of it with longjmp.

This is only okay if the signal is generated in an async-signal-safe function. Otherwise, behavior is undefined if the program ever calls another async-signal-unsafe function. Hence, the signal handler should only be established immediately before it is necessary, and disestablished as soon as possible.

In fact, I know of very few uses of a SIGSEGV handler:

  • use an async-signal-safe backtrace library to log a backtrace, then die.
  • in a VM such as the JVM or CLR: check if the SIGSEGV occurred in JIT-compiled code. If not, die; if so, then throw a language-specific exception (not a C++ exception), which works because the JIT compiler knew that the trap could happen and generated appropriate frame unwind data.
  • clone() and exec() a debugger (do not use fork() – that calls callbacks registered by pthread_atfork()).

Finally, note that any action that triggers SIGSEGV is probably UB, as this is accessing invalid memory. However, this would not be the case if the signal was, say, SIGFPE.

Demi
  • mmap() & mprotect() are often used in combination with a SIGSEGV handler to trap memory accesses to certain regions, and the behavior is defined in this case as the memory access is not invalid, but protected. – Bogatyr Jan 22 '21 at 13:26
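
The usage described in that comment, which is also what the question is after (catch writes and record them), might be sketched like this. It assumes Linux, where calling `mprotect` from the handler works in practice; `NPAGES`, `dirty`, `on_write_fault` and `track_writes` are names invented for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE               /* for MAP_ANONYMOUS in strict modes */
#endif
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define NPAGES 4                              /* hypothetical region size */

static char *base;
static size_t psize;
static volatile sig_atomic_t dirty[NPAGES];   /* 1 once a page was written */

static void on_write_fault(int sig, siginfo_t *si, void *ctx)
{
    uintptr_t a = (uintptr_t)si->si_addr;
    (void)sig; (void)ctx;
    if (a < (uintptr_t)base || a >= (uintptr_t)base + NPAGES * psize)
        _exit(EXIT_FAILURE);                  /* not ours: a genuine crash */
    size_t idx = (a - (uintptr_t)base) / psize;
    dirty[idx] = 1;                           /* record the event */
    /* Open just this page; the faulting store is then retried. */
    if (mprotect(base + idx * psize, psize, PROT_READ | PROT_WRITE) == -1)
        _exit(EXIT_FAILURE);
}

/* Returns the number of pages that were written, or -1 on setup failure. */
static int track_writes(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_write_fault;
    if (sigaction(SIGSEGV, &sa, NULL) == -1)
        return -1;

    psize = (size_t)sysconf(_SC_PAGESIZE);
    base = mmap(NULL, NPAGES * psize, PROT_READ,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    volatile char *p = base;
    p[1 * psize + 10] = 'a';    /* first write to page 1: faults */
    p[1 * psize + 11] = 'b';    /* page 1 is already open: no fault */
    p[3 * psize] = 'c';         /* first write to page 3: faults */

    int count = 0;
    for (int i = 0; i < NPAGES; i++)
        count += dirty[i];
    return count;
}
```

Only the first write to each page is seen; to observe later writes, the main code has to set the page back to `PROT_READ` after it has processed the recorded event.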
0

There is a compilation problem when using ucontext_t or struct ucontext (declared in /usr/include/sys/ucontext.h). See:

http://www.mail-archive.com/arch-general@archlinux.org/msg13853.html

j0k