How to check for a valid pointer?
Recently, for one of the projects I’m working on, I was asked a seemingly naive question: “How can I be sure a function pointer is valid?” The question may be naive, but the answer isn’t that easy to find. In most programs, checking a pointer just means checking whether it is null; if it isn’t, we assume it is valid. That test is sufficient in most cases, but sometimes it is not enough.
So my goal here was to find a way to catch invalid non-null pointers. I immediately thought of huge, ugly solutions in which checking a pointer would have been slower than crashing and restarting.
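Then, close to giving up, you try silly things, such as typing: man end. There was the solution. Three external symbols are provided by ld when linking a program, and resolved to their final addresses once the program is loaded: etext, edata, end. Their addresses are the first addresses past the end of the text segment, the initialized data segment and the uninitialized data (bss) segment, respectively.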
Let’s step back and look at the structure of a binary. When you build a program with GCC, without playing with sections, it is split into three main parts: .text, .data and .bss. The interesting one for us is .text: this is where the program code is stored. So a function pointer either points into the .text section or is not valid. etext, meaning end of text (section), is the first address past the end of the text segment, so every valid function pointer has to be lower than etext. This gives the first half of the check.
Then I wondered: the pointer also has to be higher than something. What, and how to find it? The obvious candidate is the address at which the program’s code begins in memory, so I had to find the address of the first instruction. While browsing the web, I found that another symbol is available at link time: _start. This is the address of the program’s entry point in memory (it usually belongs to the C runtime startup code, which eventually calls main()). And most of the time, the entry point sits at the beginning of the text section.
So, I wrote the following function:
int is_fct_ptr_valid(void *p)
{
    /* Linker-provided symbols: end of text and program entry point. */
    extern char _etext, _start;

    return (((char *) p < &_etext) && ((char *) p > &_start));
}
This way, you can check more accurately whether a function pointer is valid.
Then another question arose: “OK, you can check function pointers in a nice way, but what about memory?” Memory is harder to check, and I was again a bit lost about how to proceed. I found a first approach: most of the time, the memory of a program is located after the end of the program image in memory. So the pointer address has to be higher than end (cf. above). And finding the upper bound is easy: you just have to use sbrk(0). sbrk is the function used to grow the heap of your program; it takes the size of the increment as its parameter. 0 means no increment, so it simply returns the current program break, that is, the current highest heap address.
So I implemented that and tested it. But it failed. I was naive to think it would always work. In fact, a program has two kinds of memory: heap and stack. The method described above is only good for checking the heap. So the question became: how to check the stack? There was no direct way to do it, unlike everything else so far. Then I thought of a little trick. When you call a function, memory for its local variables is allocated from the stack: you just subtract the size you need from SP (the stack pointer) and use the stack beyond that new SP. This happens each time a function is called, even from inside another function. And main() is the outermost function. So if you know an address on the stack in main(), you know that every other stack pointer has to be lower than it. And when you call the checking function, you know it is the most recently called function, so the pointer to check has to be higher than the address of a local variable in that function. The tricky method was born.
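char *sstack; /* address of a local variable in main(), set at startup */

int is_mem_ptr_valid(void *p)
{
    char estack = 0; /* local variable: lowest stack address in use */
    extern char _end;

    /* Valid if in the heap (_end to program break) or in the live stack. */
    return (((((char *) p > &_end) && (p < sbrk(0))) || (((char *) p < sstack) && ((char *) p > &estack))));
}

int main()
{
    char start_stack;

    sstack = &start_stack;
    /* … */
}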
Here, you have the whole process to check memory.
Now, the test program, to show you the whole process:
/* Pointer checking example */
/* Author: Pierre Schweitzer */
#define _DEFAULT_SOURCE 1 /* exposes sbrk() on current glibc */
#define _BSD_SOURCE 1     /* same, for older glibc */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

typedef struct _call_t
{
    void (*function)(void *);
    void *data;
} call_t;

char *sstack; /* address of a local variable in main() */

int is_fct_ptr_valid(void *p)
{
    extern char _etext, _start;

    return (((char *) p < &_etext) && ((char *) p > &_start));
}

int is_mem_ptr_valid(void *p)
{
    char estack = 0;
    extern char _end;

    return (((((char *) p > &_end) && (p < sbrk(0))) || (((char *) p < sstack) && ((char *) p > &estack))));
}

void stupid(void *data)
{
    (void)data; /* unused */
    printf("function's been called :)\n");
}

void callfunction(call_t *ct)
{
    if (is_mem_ptr_valid(ct))
    {
        if (is_fct_ptr_valid((void *)ct->function))
        {
            (ct->function)(ct->data);
        }
        else
        {
            printf("Incorrect function pointer: %p\n", (void *)ct->function);
        }
    }
    else
    {
        printf("Incorrect memory pointer: %p\n", (void *)ct);
    }
}

int main()
{
    char start_stack;
    call_t call1, call2, *call3;

    sstack = &start_stack;

    /* Valid structure on the stack, valid function. */
    call1.function = stupid;
    call1.data = NULL;

    /* Valid structure on the stack, garbage function pointer. */
    call2.function = (void(*)(void *))time(NULL);
    call2.data = NULL;

    /* Valid structure on the heap, valid function. */
    call3 = malloc(sizeof(call_t));
    if (call3 == NULL)
    {
        return 1;
    }
    call3->function = stupid;
    call3->data = NULL;

    printf("Test #1: function will be called\n");
    callfunction(&call1);
    printf("Test #2: function error will be raised\n");
    callfunction(&call2);
    printf("Test #3: memory error will be raised\n");
    callfunction((call_t *)time(NULL));
    printf("Test #4: function will be called\n");
    callfunction(call3);

    free(call3);
    return 0;
}
The functions given above, even if they are more accurate than a simple null check, are not fail-proof. Several assumptions have been made because they hold in most cases. But if memory isn’t contiguous, or if the entry point isn’t at the beginning of the text section, those functions become meaningless. Furthermore, they only check whether a pointer points into a valid memory zone, not whether the contents are valid. Your program can still have issues due to off-by-one mistakes and the like.
But that’s a nice way to begin!
June 6th, 2010 at 10:24 pm
Your findings are fun, of course, but all of this looks like one big, heavy, fat hack. In the same way, we could make up quite a few hack-checks to ensure the validity of some random pointer, e.g. by checking that it lies inside the heap’s virtual address range, inside the user-mode address range, etc.
But all of this is just a hack. The only correct way is to probe the pointer inside a SEH block, and if you receive this function pointer as a callback, to always wrap the call to it in SEH.
If your concern is that GCC doesn’t provide SEH, then I’m afraid GCC is not the right tool. You can drive a nail into wood with a microscope, but it’s better to use a hammer.
June 7th, 2010 at 7:03 pm
First, thanks for your comment :).
Then, the main purpose of these findings is not to replace SEH. They aim to be a “soft” substitute for SEH. Indeed, on a Linux system there is no exception handling in C like the one found on Microsoft platforms. You can play with signals and jumps, but that’s not really easy to handle. Furthermore, building a “nice” exception handling system quickly becomes really hard and heavy.
The solution I propose here is a way to ensure some basic things at a lower cost. When writing that, I am thinking about the trycatch library, which you can find here: http://llg.cubic.org/trycatch/.
Indeed, it is a bit hackish and not complete, but it is, in my opinion, a good start for getting rid of some errors in a program.