free() inconsistencies



Hi, folks!


Last week I was porting a program from uClibc to glibc. Everything went fine until I found that one part of it always crashed. The failure was in a call to the free() function. The first thing that came to mind: why the hell is it crashing now, when it used to run fine on uClibc? I wrote a simple program that reproduces the problem:


#include <stdio.h>
#include <stdlib.h>

typedef struct {
  char *field1;
} s_test;

s_test test = {
  .field1 = NULL
};

int main (int argc, char **argv) {
  s_test *t;

  t = &test;
  free (t);  /* t points to a static object, not to heap memory */
  t = &test;
  t->field1 = "bug";
  printf ("%s\n", t->field1);

  return 0;
}


Look at line 16, the call to free(). I’m calling free() on a pointer to a statically allocated variable instead of a pointer to heap memory (previously allocated with malloc() or similar). A crash is expected here, right? Maybe! Yes, if you’re using glibc. No, if you’re using uClibc: there the code above works as [not] expected. Weird! Everything we learned in programming school is ruined now :D !


So, we have similar code here that has worked for a long time, precisely because it was compiled and run on top of uClibc. I’ve seen this and other behavior differences between uClibc and glibc. The solution? Change the code to make it portable, not only so that it compiles, but also so that it behaves the same on every platform.
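One way the portable version could look, as a minimal sketch rather than the actual fix that went into the real program: the object is created on the heap, so the pointer handed to free() always comes from malloc(). The names s_test_new() and s_test_free() are made up for this illustration, and field1 is declared const here on the assumption that it only ever points to strings the struct does not own.

```c
#include <stdlib.h>

typedef struct {
  const char *field1;  /* borrowed pointer: never freed by s_test_free() */
} s_test;

/* Hypothetical constructor: allocate an s_test on the heap.
   Returns NULL if malloc() fails. */
s_test *s_test_new (const char *value) {
  s_test *t = malloc (sizeof *t);

  if (t != NULL)
    t->field1 = value;
  return t;
}

/* Hypothetical destructor: release an object obtained from s_test_new().
   Passing NULL is harmless, because free(NULL) does nothing. */
void s_test_free (s_test *t) {
  free (t);
}
```

A caller would write s_test *t = s_test_new ("bug"), use t->field1, and finish with s_test_free (t). The static test object from the program above is simply never passed to free().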


I thought it was a bug in uClibc, but I was told it doesn’t violate the standards. In fact, the standards say that in this case the behavior is “undefined”. Ah, standards :) … So, to avoid surprises like that, here is what I learned: always code the right way, even if it takes more work. Don’t say: “hey, it’s working, let’s deploy it!”.
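A small defensive habit in the same spirit, sketched under the assumption that the pointer really came from malloc(): the C standard says free() does nothing when given a null pointer, so resetting a pointer to NULL right after freeing it turns an accidental second free() of that variable into a no-op. FREE_AND_NULL and release_twice() are hypothetical names used only for this example.

```c
#include <stdlib.h>

/* Hypothetical helper macro: free a heap pointer and reset it to NULL,
   so that a second release of the same variable becomes free(NULL). */
#define FREE_AND_NULL(p) \
  do {                   \
    free (p);            \
    (p) = NULL;          \
  } while (0)

/* Demonstration: releasing the same variable twice is harmless here.
   Returns 1 when p ends up as NULL. */
int release_twice (void) {
  int *p = malloc (sizeof *p);

  FREE_AND_NULL (p);  /* frees the block (or does nothing if malloc failed) */
  FREE_AND_NULL (p);  /* p is NULL now, so this is just free(NULL) */
  return p == NULL;
}
```

This does nothing for copies of the pointer held elsewhere, and it cannot detect a pointer that never came from malloc(), as in the program above. It only removes one common way of hitting the same undefined behavior.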


See you!


4 comments ↓

#1 Simon on 01.19.12 at 18:42

In fact, standards say, in that case, the behavior is “undefined”.

Microsoft’s Raymond Chen has had much to say on that subject over the years, most of which amounts to this: if the API says behavior is officially undefined in a particular situation, don’t rely on the observed behavior being consistent. It may change with different implementations of the API, it may change when bug fixes go in, it may even change based on the phase of the moon.

#2 Tiago on 01.19.12 at 18:55

Hmm, did you check C99 to be sure? I imagine nothing is written there anyway, because the part of the documentation on dynamic allocation probably says nothing about automatic variables.

#3 napsy on 01.20.12 at 5:37

The man pages say:
… The free() function frees the memory space pointed to by ptr, which must have been ***returned by a previous call to malloc(), calloc() or realloc() *** …

#4 R on 01.21.12 at 9:58

The C standard has three notions of underspecification: unspecified behavior, implementation-defined behavior and undefined behavior. The interesting thing about undefined behavior (which indeed occurs in your example) is that an implementation is basically allowed to do whatever it likes for the remainder of the execution of your program. So it is even allowed to format your hard disk or to let your computer explode.

Your example is not very interesting though, since C99 clearly states that

Otherwise, if the argument does not match a
pointer earlier returned by the calloc, malloc, or
realloc function, or if the space has been
deallocated by a call to free or realloc, the
behavior is undefined.

Hence it is not an inconsistency. Now let us look at real inconsistencies with respect to free in the C99 standard. Consider

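#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
  int *p = malloc(sizeof(int));
  free(p);
  int *q = malloc(sizeof(int));
  *q = 10;
  /* compare the bit representations of the two pointers */
  if (!memcmp(&p, &q, sizeof(p)))
    printf("%d\n", *p);  /* p is indeterminate after free() */
}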

Here it is very likely that the second malloc call yields a pointer to the same piece of memory as the first. After that we check whether the bit representations of the pointers p (that has been freed, and has an indeterminate value according to the standard) and q (that is not freed) are equal. In this case one might wonder whether we could use p and q interchangeably, after all, their bit representations are equal (and bits is all there is). But on the other hand, p has an indeterminate value, so dereferencing would be undefined behavior.

To make it even more fun, clang -O3 (version 2.7-3) prints 10 twice in the following example!

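#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
  int *p = malloc(sizeof(int));
  free(p);
  int *q = malloc(sizeof(int));
  *q = 10;
  /* compare the bit representations of the two pointers */
  if (!memcmp(&p, &q, sizeof(p))) {
    printf("%d\n", *p);
    *q = 20;
    printf("%d\n", *p);  /* may still print 10 when optimized */
  }
}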

This is due to the fact that it assumes that p and q are not aliased.

Now the infamous Defect Report 260 [1] comes to the rescue. But AFAIK these issues have still not been clearly incorporated in recent versions of the standard.

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm