Pthread Mutex: pthread_mutex_unlock() consumes lots of time

Question

I wrote a multithreaded program with pthreads, using the producer-consumer model.

When I profiled the program with Intel VTune, I found that the producer and consumer spend a lot of time in pthread_mutex_unlock. I don't understand why. I can see that a thread may wait a long time before it acquires a mutex, but releasing a mutex should be fast, right?

The snapshot below is from Intel VTune. It shows the code where the consumer tries to fetch an item from the buffer, and the time consumed by each line.

My question is: why does pthread_mutex_unlock have such overhead? Is the problem with the pthread mutex itself or with the way I use it?

[Image: Intel VTune snapshot of the consumer code]

Unlocking a mutex can be slow if there's a lot of contention on that mutex, because part of the work of unlocking is waking up any threads waiting on the mutex. — caf, Jun 04 '13 at 03:21
I think it would be interesting to see the results if you move the `pthread_mutex_unlock()` call above the call to `pthread_cond_signal()`. There's no requirement to hold the mutex while signaling the condition variable (only when waiting on it), and I suspect that what happens is that the signal causes contention on the mutex because the thread that gets released immediately attempts to acquire the mutex, which the signaling thread still holds. — Michael Burr, Jun 05 '13 at 06:35
@MichaelBurr Good point! I tested your suggestion and the program is about 40% faster now. — lei_z, Jun 05 '13 at 14:55
@stone199141: I appreciate you letting us know the results. If it's not too much trouble, I'd be interested in having the equivalent screenshot from vtune from after the change added to the question (or I guess it should be in an answer). — Michael Burr, Jun 05 '13 at 15:48
@MichaelBurr: That is not a safe change to make in his code! If he signals while holding the mutex, the signal is guaranteed to wake a thread that chose to block in the full state. If he signals after releasing the mutex, another thread may block on the condition variable in the empty state and he may wake that thread up when he signals. That thread will just go back to sleep and the signal will be lost. If you use the same condition variable to signal more than one state, you cannot safely unlock the mutex before signalling. (You can before broadcasting.) — David Schwartz, Jun 15 '13 at 12:32
@DavidSchwartz: the code uses two separate condition variables, `non_empty` and `non_full`. Though one can't be 100% sure without seeing all of the code manipulating the shared resource, assuming that this is a standard use of the condition variables the change should be safe. Of course, assumptions can be dangerous (particularly with threading); I suppose that could have been mentioned, but honestly I didn't think of it. — Michael Burr, Jun 18 '13 at 00:32
@DavidSchwartz: also, you might be interested in the following glibc bug: http://sourceware.org/bugzilla/show_bug.cgi?id=13165 If that behavior is still in glibc (the bug is not marked as fixed) then even if you're holding the mutex when signalling a condvar, the thread that is 'released' might be one that wasn't waiting when the signal was issued. At least if you're using glibc. As far as I can determine from the bug's comments, the glibc maintainer for this part of the library interpreted POSIX as permitting that behavior until a recent clarification/change to the standard. — Michael Burr, Jun 18 '13 at 00:39
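
The change discussed in the comments above (calling `pthread_mutex_unlock()` before `pthread_cond_signal()`, with separate `non_empty` and `non_full` condition variables) can be sketched roughly as follows. The asker's code is not shown here, so the `bounded_buf` type, `BUF_CAP`, `buf_put`, `buf_get`, and `run_demo` are hypothetical names for illustration only. The sketch assumes each condition variable signals exactly one state and every waiter rechecks its predicate in a loop.

```c
#include <assert.h>
#include <pthread.h>

#define BUF_CAP 8 /* hypothetical capacity, not from the original post */

typedef struct {
    int items[BUF_CAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t non_empty; /* signaled after an item is added */
    pthread_cond_t non_full;  /* signaled after a slot is freed */
} bounded_buf;

void buf_init(bounded_buf *b) {
    b->head = b->tail = b->count = 0;
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->non_empty, NULL);
    pthread_cond_init(&b->non_full, NULL);
}

void buf_put(bounded_buf *b, int item) {
    pthread_mutex_lock(&b->lock);
    while (b->count == BUF_CAP) /* recheck the predicate after every wakeup */
        pthread_cond_wait(&b->non_full, &b->lock);
    b->items[b->tail] = item;
    b->tail = (b->tail + 1) % BUF_CAP;
    b->count++;
    pthread_mutex_unlock(&b->lock);     /* release the mutex first ... */
    pthread_cond_signal(&b->non_empty); /* ... then wake one consumer */
}

int buf_get(bounded_buf *b) {
    pthread_mutex_lock(&b->lock);
    while (b->count == 0)
        pthread_cond_wait(&b->non_empty, &b->lock);
    assert(b->count > 0); /* invariant once the wait loop exits */
    int item = b->items[b->head];
    b->head = (b->head + 1) % BUF_CAP;
    b->count--;
    pthread_mutex_unlock(&b->lock);    /* release the mutex first ... */
    pthread_cond_signal(&b->non_full); /* ... then wake one producer */
    return item;
}

/* Hypothetical driver: one producer sends 0..n-1, one consumer sums them. */
typedef struct { bounded_buf buf; int n; long sum; } demo_ctx;

static void *producer(void *arg) {
    demo_ctx *c = arg;
    for (int i = 0; i < c->n; i++) buf_put(&c->buf, i);
    return NULL;
}

static void *consumer(void *arg) {
    demo_ctx *c = arg;
    for (int i = 0; i < c->n; i++) c->sum += buf_get(&c->buf);
    return NULL;
}

long run_demo(int n) {
    demo_ctx c = { .n = n, .sum = 0 };
    pthread_t p, q;
    buf_init(&c.buf);
    pthread_create(&p, NULL, producer, &c);
    pthread_create(&q, NULL, consumer, &c);
    pthread_join(p, NULL);
    pthread_join(q, NULL);
    return c.sum; /* only the consumer wrote sum, and it has been joined */
}
```

Called from `main` and built with `-pthread`, `run_demo(1000)` returns 499500, the sum of 0 through 999. Whether this ordering is actually faster depends on the pthreads implementation and the workload, so it is worth profiling both variants.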

score 3 · Accepted Answer · edited Dec 02 '19 at 13:23

The pthread_mutex_unlock() function shall release the mutex object referenced by mutex. The manner in which a mutex is released is dependent upon the mutex's type attribute. If there are threads blocked on the mutex object referenced by mutex when pthread_mutex_unlock() is called, resulting in the mutex becoming available, the scheduling policy shall determine which thread shall acquire the mutex.

If the mutex type is PTHREAD_MUTEX_NORMAL, deadlock detection shall not be provided. Attempting to relock the mutex causes deadlock. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, undefined behavior results.

If the mutex type is PTHREAD_MUTEX_ERRORCHECK, then error checking shall be provided. If a thread attempts to relock a mutex that it has already locked, an error shall be returned. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, an error shall be returned.
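
As a minimal sketch of the error-checking behavior described above, the following creates a `PTHREAD_MUTEX_ERRORCHECK` mutex and returns the error codes for a relock and for unlocking an unlocked mutex. The helper names (`errorcheck_mutex_init`, `relock_result`, `unlock_unlocked_result`) are hypothetical. The `_XOPEN_SOURCE` definition is an assumption that may be needed for the type constants to be declared under strict compiler modes.

```c
#define _XOPEN_SOURCE 700 /* assumption: exposes the mutex type constants */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialize *m as an error-checking mutex; returns 0 or an error number. */
int errorcheck_mutex_init(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock twice from the same thread and return the second call's result. */
int relock_result(void) {
    pthread_mutex_t m;
    if (errorcheck_mutex_init(&m) != 0) return -1;
    int rc = pthread_mutex_lock(&m);
    assert(rc == 0);             /* the first lock is expected to succeed */
    rc = pthread_mutex_lock(&m); /* relock by the owner: reported, no deadlock */
    pthread_mutex_unlock(&m);    /* the mutex is still held exactly once */
    pthread_mutex_destroy(&m);
    return rc;
}

/* Unlock a mutex that is not locked and return the result. */
int unlock_unlocked_result(void) {
    pthread_mutex_t m;
    if (errorcheck_mutex_init(&m) != 0) return -1;
    int rc = pthread_mutex_unlock(&m); /* not locked by anyone */
    pthread_mutex_destroy(&m);
    return rc;
}
```

POSIX specifies `EDEADLK` for the relock and `EPERM` for the unowned unlock on this mutex type, so `relock_result()` and `unlock_unlocked_result()` should return those values on a conforming implementation.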

If the mutex type is PTHREAD_MUTEX_RECURSIVE, then the mutex shall maintain the concept of a lock count. When a thread successfully acquires a mutex for the first time, the lock count shall be set to one. Every time a thread relocks this mutex, the lock count shall be incremented by one. Each time the thread unlocks the mutex, the lock count shall be decremented by one. When the lock count reaches zero, the mutex shall become available for other threads to acquire. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, an error shall be returned.
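
The lock count described above can be observed with a small sketch: one thread locks a recursive mutex twice, unlocks it a given number of times, and then a second thread probes it with `pthread_mutex_trylock()`. All helper names here (`recursive_mutex_init`, `probe_from_other_thread`, `probe_after_unlocks`) are hypothetical, and the `_XOPEN_SOURCE` definition is an assumption for strict compiler modes.

```c
#define _XOPEN_SOURCE 700 /* assumption: exposes the mutex type constants */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialize *m as a recursive mutex; returns 0 or an error number. */
int recursive_mutex_init(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

typedef struct { pthread_mutex_t *m; int rc; } probe;

static void *probe_thread(void *arg) {
    probe *p = arg;
    p->rc = pthread_mutex_trylock(p->m);          /* never blocks */
    if (p->rc == 0) pthread_mutex_unlock(p->m);   /* undo a successful probe */
    return NULL;
}

/* Ask a different thread whether it could acquire the mutex right now. */
static int probe_from_other_thread(pthread_mutex_t *m) {
    probe p = { m, -1 };
    pthread_t t;
    if (pthread_create(&t, NULL, probe_thread, &p) != 0) return -1;
    pthread_join(t, NULL);
    return p.rc;
}

/* Lock twice, unlock `unlocks` times (0..2), then probe from another thread. */
int probe_after_unlocks(int unlocks) {
    pthread_mutex_t m;
    assert(unlocks >= 0 && unlocks <= 2);
    if (recursive_mutex_init(&m) != 0) return -1;
    pthread_mutex_lock(&m);   /* lock count becomes 1 */
    pthread_mutex_lock(&m);   /* lock count becomes 2 */
    for (int i = 0; i < unlocks; i++)
        pthread_mutex_unlock(&m);
    int rc = probe_from_other_thread(&m);
    for (int i = unlocks; i < 2; i++) /* release whatever is still held */
        pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

After a single unlock the count is still one, so `probe_after_unlocks(1)` returns `EBUSY`. After two unlocks the count reaches zero and `probe_after_unlocks(2)` returns 0.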

If the mutex type is PTHREAD_MUTEX_DEFAULT, attempting to recursively lock the mutex results in undefined behavior. Attempting to unlock the mutex if it was not locked by the calling thread results in undefined behavior. Attempting to unlock the mutex if it is not locked results in undefined behavior.

I usually prefer to use PTHREAD_MUTEX_RECURSIVE mutexes, because in this case the mutex shall become available when the count reaches zero and the calling thread no longer has any locks on this mutex.
