2
__inline int my_sprintf (char *dest,char *format,...)
{
    va_list va;
    va_start(va,format);
    return vsprintf(dest,format,va);
}

My issue is that I can't add the buffer size parameter to my_sprintf because it's used in more than 50k places, this I can't replace vsprintf with vsprintf_s or vsnprintf.

any alternative to make the above function safer ?

user327843
  • 442
  • 6
  • 17

5 Answers5

5

What you're asking here is a specialization of the (in)famous question: "how to get the size of an array if all I have is a pointer?"

There's no way to figure out what's the size of the object pointed to by dest. Your best option is likely to bite the bullet and change those 50k places to pass the size.


There might be more to your code that you're not telling us. For example, in those "50k" places that you mention, is the size known? If so you could get away with a dirty variadic macro that uses sizeof behind the scenes and then calls a function that takes a length parameter.

cnicutar
  • 164,886
  • 23
  • 329
  • 361
3

Sorry, there's no silver bullet for you here, as @cnicutar already mentioned.

You can start by restricting the buffer size and asserting on overflows. Something like:

#define SPRINTF_TRACE_BUFSIZE 4096

int my_sprintf( char* dest, const char* fmt, ... )
{
    /* in threaded code use malloc(3) instead */
    static char trace_buf[SPRINTF_TRACE_BUFSIZE];

    va_list va;
    va_start( va, fmt );
    int rc = vsnprintf( trace_buf, SPRINTF_TRACE_BUFSIZE, fmt, va );

    assert( rc != -1 && rc < SPRINTF_TRACE_BUFSIZE );

    memcpy( dest, trace_buf, rc + 1 ); /* +1 for \0 terminator */
    return rc;
}

And then start lowering the tracing buffer size until asserts start firing. At that point you can find and fix the offending calls.

This will of course slow down the overall system, but we are not talking about performance here.

Just to underscore it - this is a quick and dirty hack to fight with large old existing code-base, do not use it for new development.

Nikolai Fetissov
  • 77,392
  • 11
  • 105
  • 164
  • wouldn't memcpy cause of buffer overflow if the dest is too small ? – user327843 May 28 '15 at 15:13
  • 1
    Yes it will., but this is not about securing bad upstream code from overflowing their buffers, this is about finding the offenders and cleaning them up. – Nikolai Fetissov May 28 '15 at 15:16
  • I have inherited an extermly huge legacy code (millions of LOC) with no automated unit-tests. I will be dead before I can complete that. – user327843 May 28 '15 at 15:21
  • I like this as s simple way to get handle on the size of the issue. Alternatively, code code use a BA buffer and simple record the max length used in a global variable for later debug examination. – chux - Reinstate Monica May 28 '15 at 16:51
2

OP commented "a large number of buffers are dynamically allocated, ... ". The malloc(), realloc(), calloc(), free(), etc could be re-written with a wrapper function that stores the size.

typedef union {
  max_align_t align;
  size_t sz;
} my_header;

void* my_malloc(size_t size) {
  my_header *p = malloc(sizeof *p + size);
  if (p) {
    p->sz = size;
    p++;
  }
  return p;
}

size_t my_size(const void *p) {
  if (p) {
    const my_header *head = p;
    return head[-1].sz;
  }
  return 0;
}

void my_free(void *p) {
  if (p) {
    my_header *head = p;
    free(--head);
  }
}

All other *.c files call some *.h file with

#define malloc my_malloc
#define free my_free
void *my_malloc(size_t size);
void my_free(void *p);
size_t my_size(const void *p);

Now when my_sprintf() is called with an allocated pointer ...

int my_sprintf (char *dest,char *format,...) {
  va_list va;
  va_start(va,format);
  size_t n = my_size(dest);
  return vsnprintf(dest,n,format,va);
}

Further, a magic number could also be pre-pended to help identify if the pointer passed is truly a my_allcoated() one.

Wrapping allocation functions is also a way to determine various allocation concerns: double free, max usage, all pointers free'd, ...


[Edit] after 5 years.

Code needs to ensure alignment - code re-worked.


With pre-C11 use a union of wide types in lieu of max_align_t.

typedef union {
  double d;
  long l;
  void *p;
  void (*fp)();
  // With C99
  complex long double cld;
  long long ll;

  size_t sz;
} my_header; 
chux - Reinstate Monica
  • 113,725
  • 11
  • 107
  • 213
  • nice, but a bit unreliable... I think it could be improved if you add some magic, say 0xC0DE, in the first place to my_header, thus attempts to unrestricted printing to static buffers can be filtered out/detected – pmod Apr 16 '20 at 20:19
  • @pmod Yes good idea. It is mentioned in the the answer with "Further, a magic number could also be pre-pended to help identify ...". I have even coded further with a sentinel at the end too. Useful to detect overruns. – chux - Reinstate Monica Apr 16 '20 at 20:22
  • Oh, missed that piece :) – pmod Apr 16 '20 at 20:25
0

This large-scale exercise refactoring effort begs for automation. Changing the behavior of the my_sprintf function itself is trivial, so I'll leave that exercise to the reader. As you noted, the calls are the hard part. I'll assume those calls structured something like:

ret = my_sprintf(dest, "%d:%s", arg1, arg2);

Many IDE/text editors support regular expressions in their search/replace options. One of mine, SlickEdit, is particularly nice in that it allows Perl regular expressions. Here is an example that will that transform the above call into the following:

ret = my_sprintf(dest, MAX_SIZE, "%d:%s", arg1, arg2);

Note that SlickEdit has been instructed to recursively apply this pattern to all the C source files in my project tree.

enter image description here

Note that the dialog box entries should be equivalent to a regex of s/(\".+\")/MAX_SIZE, $1/. Does your dev environment have similar capabilities? Many do, so take a look.

If not, there are other options, like independent scripts. As an example, assume you have some *nix-like shell and Perl available. Google "perl one liner replace" to find some nice examples, like this and this. Couple these with judicious use of the find command (examples), and your problem is solved.

*Disclaimer: I cannot vouch for the elegance of my example regex. It worked in my cases, but regex gurus can undoubtedly improve it.

Throwback1986
  • 5,597
  • 1
  • 26
  • 21
0

You can use a macro to help in those cases where your my_sprintf() is passed something other than a true char *.

#define my_sprintf(dest,format,...) \
    my_sprintf_func( (dest),sizeof(dest), ( format ), __VA_ARGS__ )

__inline int my_sprintf_func(char *dest,size_t size,char *format,...)
{
    va_list va;
    va_start(va,format);
    if ( size == sizeof(dest) )
        return vsprintf(dest,format,va);
    return vsnprintf(dest,size,format,va);

}

Yes it does evaluate dest twice, but without knowing your code base I can't say if that's a problem. But it would at least help in those cases where the call is made using an array.

I'd also seriously recommend seeing if something like Purify is worth buying.

Andrew Henle
  • 27,654
  • 3
  • 23
  • 49