
The documentation for Helgrind, the Valgrind thread error detection tool, warns that if you use GCC to compile your OpenMP code, GCC's OpenMP runtime library (libgomp.so) will cause a flood of false positive data race reports, because it uses atomic machine instructions and Linux futex system calls instead of POSIX pthreads primitives. It adds that you can solve this problem by recompiling GCC with the --disable-linux-futex configuration option.

So I tried this. I compiled and installed to a local directory (~/GCC_Valgrind/gcc_install) a new GCC version 4.7.0 (the latest release as of this writing) with the --disable-linux-futex configuration option. I then created a small OpenMP test program (test1.c) that has no visible data races:

/* test1.c */

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 2

int a[NUM_THREADS];

int main(void) {
        int i;
#pragma omp parallel num_threads(NUM_THREADS)
        {
                int tid = omp_get_thread_num();
                a[tid] = tid + 1;
        }
        for (i = 0; i < NUM_THREADS; i++)
                printf("%d ", a[i]);
        printf("\n");
        return EXIT_SUCCESS;
}

I compiled this program as follows:

~/GCC_Valgrind/gcc_install/bin/gcc -Wall -fopenmp  -static -L~/GCC_Valgrind/gcc_install/lib64 -L~/GCC_Valgrind/gcc_install/lib -o test1 test1.c

However, when I ran Helgrind on the resulting binary, I got 30 false positive data race reports, all occurring in libgomp code. I then compiled test1.c without the -static flag and ran Helgrind on it again. This time I got only 9 false positive data race reports, but that is still too many, and without the -static flag I cannot trace the supposed races in the libgomp code.
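For anyone who wants to reproduce the counts, here is a minimal sketch of how such a run could be scripted. It assumes `valgrind` is on the PATH and that the binary is `./test1`; the log file name `helgrind.log` and the helper `helgrind_error_count` are made up for illustration. Note that the number on Valgrind's ERROR SUMMARY line counts errors, which may not match the number of distinct reports quoted above.

```shell
#!/bin/sh
# Sketch: run Helgrind on a binary and print the error count from its summary.
# Assumptions: valgrind is installed, ./test1 is the program under test,
# and helgrind.log is a hypothetical log file name.

# Print N from a line like "==1234== ERROR SUMMARY: N errors from M contexts ..."
helgrind_error_count() {
    sed -n 's/.*ERROR SUMMARY: \([0-9][0-9]*\) errors.*/\1/p' "$1" | tail -n 1
}

# Only run if both the tool and the binary are actually present.
if command -v valgrind >/dev/null 2>&1 && [ -x ./test1 ]; then
    valgrind --tool=helgrind --log-file=helgrind.log ./test1
    echo "Helgrind errors: $(helgrind_error_count helgrind.log)"
fi
```

Running this once for the -static build and once for the dynamic build makes the two counts easy to compare.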

Has anybody found a way to reduce, if not eliminate, the number of false positive data race reports from Helgrind applied to an OpenMP program compiled with GCC? Thanks!

Chris
Amittai Aviram
  • Just a wild guess - could it be that your recompiled **gcc** links against the recompiled version of **libgomp** but the dynamic linker still loads the system supplied **libgomp** at runtime? Try to recompile with `-Wl,-rpath,/path/to/recompiled/lib`. – Hristo 'away' Iliev May 17 '12 at 20:33
  • Just a side comment - give a try to the Thread Analyzer tool from Oracle Solaris Studio for Linux while the toolset is still free :) – Hristo 'away' Iliev May 17 '12 at 20:39
  • Have you looked at adding error suppressions? http://valgrind.org/docs/manual/manual-core.html#manual-core.suppress – johlo May 18 '12 at 21:07
  • Just to make sure, could you mark `tid` as private? – user1202136 Jun 05 '12 at 14:05
  • @user1202136 How do you do that? – autistic Apr 28 '13 at 17:12
  • @undefinedbehaviour I now realize that `tid` is supposed to be private, since it is declared *inside* the parallel section. Anyway, the syntax would have been `#pragma omp parallel private(tid)`. – user1202136 Apr 29 '13 at 12:43
  • From the website you mentioned: " Fortunately, this can be solved using a configuration-time option (for GCC). Rebuild GCC from source, and configure using --disable-linux-futex. This makes libgomp.so use the standard POSIX threading primitives instead. Note that this was tested using GCC 4.2.3 and has not been re-tested using more recent GCC versions. We would appreciate hearing about any successes or failures with more recent versions." Did you report your problem there? – Andrew W May 29 '13 at 15:26
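The suppression idea from the comments could look roughly like the sketch below. The file name `libgomp.supp` and the entry name `libgomp_race_sketch` are made up, and the `obj:` pattern assumes a dynamically linked binary, so it would not match anything in the -static build. It also hides any real race whose innermost frame is in libgomp, so treat it as a blunt instrument rather than a fix.

```shell
#!/bin/sh
# Sketch: suppress Helgrind race reports whose innermost frame is in libgomp.so.
# Assumptions: libgomp.supp and libgomp_race_sketch are hypothetical names,
# and ./test1 is dynamically linked against libgomp.

cat > libgomp.supp <<'EOF'
{
   libgomp_race_sketch
   Helgrind:Race
   obj:*/libgomp.so*
}
EOF

# Only run if both the tool and the binary are actually present.
if command -v valgrind >/dev/null 2>&1 && [ -x ./test1 ]; then
    valgrind --tool=helgrind --suppressions=libgomp.supp ./test1
fi
```

Running Valgrind with `--gen-suppressions=all` prints ready-made entries for each report, which can be narrower than the wildcard above.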

3 Answers


Sorry to put this in as an answer since it's more of a comment, but it's too long to fit in as a comment, so here goes:

From the site you referenced:

Runtime support library for GNU OpenMP (part of GCC), at least for GCC versions 4.2 and 4.3. The GNU OpenMP runtime library (libgomp.so) constructs its own synchronisation primitives using combinations of atomic memory instructions and the futex syscall, which causes total chaos since in Helgrind since it cannot "see" those.

Fortunately, this can be solved using a configuration-time option (for GCC). Rebuild GCC from source, and configure using --disable-linux-futex. This makes libgomp.so use the standard POSIX threading primitives instead. Note that this was tested using GCC 4.2.3 and has not been re-tested using more recent GCC versions. We would appreciate hearing about any successes or failures with more recent versions.

As you mentioned in your post, this has to do with libgomp.so, but that's a shared object, so I don't see how you can pass the -static flag and still use that library. Am I just misinformed?

Andrew W

Steps which will make it work:

  1. Recompile gcc (including libgomp) using --disable-linux-futex
  2. Make sure you use the futex-free gcc when compiling your program.
  3. Make sure the system loads the futex-free libgomp when executing your program (the library is usually in GCC-OBJ-DIR/PLATFORM/libgomp/.libs), for example by setting the LD_LIBRARY_PATH environment variable:

export LD_LIBRARY_PATH=~/gcc-4.8.1-nofutex/x86_64-unknown-linux-gnu/libgomp/.libs:
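To check that step 3 actually took effect, something like the following sketch may help. It assumes the build directory from the example above and a dynamically linked `./test1`; the variable name `NOFUTEX_LIBS` is made up. It prepends the directory without leaving an empty entry in the path (the glibc loader treats an empty entry as the current directory), then asks `ldd` which libgomp would be loaded.

```shell
#!/bin/sh
# Sketch: put the futex-free libgomp first in the loader path and verify it.
# Assumptions: the directory below is the example build tree from this answer,
# and ./test1 is a dynamically linked OpenMP binary.

NOFUTEX_LIBS="$HOME/gcc-4.8.1-nofutex/x86_64-unknown-linux-gnu/libgomp/.libs"

# Prepend; the ${VAR:+...} form adds the separator only if the variable is set.
export LD_LIBRARY_PATH="$NOFUTEX_LIBS${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Show which libgomp the dynamic linker resolves for the binary, if present.
if command -v ldd >/dev/null 2>&1 && [ -x ./test1 ]; then
    ldd ./test1 | grep libgomp || echo "libgomp not listed (static binary?)"
fi
```

If the path printed by `ldd` still points at the system libgomp, Helgrind is analysing the futex-based runtime regardless of which gcc compiled the program.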

Eduard Wirch

Also note that if the code uses omp_set_lock, the compiler must pick up the omp.h that belongs to the futex-free build rather than the system one, because the lock struct size differs between the two. See https://xrunhprof.wordpress.com/2018/08/27/tsan-with-openmp/
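One way to see which omp.h a given compiler picks up is to preprocess a one-line file and read the line markers, as in this sketch. The helper name `find_omp_header` is made up, and `CC` defaults to whatever `gcc` is on the PATH, so set it to the futex-free compiler to check that one.

```shell
#!/bin/sh
# Sketch: print the path of the omp.h that a compiler resolves for <omp.h>.
# Assumptions: find_omp_header is a hypothetical helper; CC defaults to "gcc".

CC="${CC:-gcc}"

# Preprocess "#include <omp.h>" and extract header paths from the line markers,
# which look like: # 1 "/some/dir/omp.h" 1 3 4
find_omp_header() {
    printf '#include <omp.h>\n' | "$1" -fopenmp -E -x c - 2>/dev/null |
        sed -n 's/^# [0-9]* "\(.*omp\.h\)".*/\1/p' | sort -u
}

# Only run if the compiler is actually available.
if command -v "$CC" >/dev/null 2>&1; then
    find_omp_header "$CC"
fi
```

If the printed path is not the one from the rebuilt GCC, passing `-I` with the directory that holds the rebuilt omp.h is one possible way to substitute it; that part is an assumption and depends on where your build placed the header.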

Andrew