1118

When writing multithreaded applications, one of the most common problems experienced is race conditions.

My questions to the community are:

What is the race condition?
How do you detect them?
How do you handle them?
Finally, how do you prevent them from occurring?

Arsen Khachaturyan
  • 6,472
  • 4
  • 32
  • 36
bmurphy1976
  • 25,066
  • 11
  • 29
  • 23
  • 5
    I'd like to mention that - without specifying the language - most parts of this question cannot be answered properly, because in different languages, the definition, the consequences and the tools to prevent them might differ. – MikeMB Apr 21 '15 at 17:18
  • @MikeMB. Agreed, except when analyzing byte code execution, like it is done by Race Catcher (see this thread http://stackoverflow.com/a/29361427/1363844) we can address all those approximately 62 languages that compile to byte code (see https://en.wikipedia.org/wiki/List_of_JVM_languages) – Ben Aug 12 '16 at 05:49

18 Answers18

1408

A race condition occurs when two or more threads can access shared data and they try to change it at the same time. Because the thread scheduling algorithm can swap between threads at any time, you don't know the order in which the threads will attempt to access the shared data. Therefore, the result of the change in data is dependent on the thread scheduling algorithm, i.e. both threads are "racing" to access/change the data.

Problems often occur when one thread does a "check-then-act" (e.g. "check" if the value is X, then "act" to do something that depends on the value being X) and another thread does something to the value in between the "check" and the "act". E.g:

if (x == 5) // The "Check"
{
   y = x * 2; // The "Act"

   // If another thread changed x in between "if (x == 5)" and "y = x * 2" above,
   // y will not be equal to 10.
}

The point being, y could be 10, or it could be anything, depending on whether another thread changed x in between the check and act. You have no real way of knowing.

In order to prevent race conditions from occurring, you would typically put a lock around the shared data to ensure only one thread can access the data at a time. This would mean something like this:

// Obtain lock for x
if (x == 5)
{
   y = x * 2; // Now, nothing can change x until the lock is released. 
              // Therefore y = 10
}
// release lock for x
Amit Joki
  • 53,955
  • 7
  • 67
  • 89
Lehane
  • 42,680
  • 14
  • 46
  • 53
  • 141
    What does the other thread do when it encounters the lock? Does it wait? Error? – Brian Ortiz Oct 19 '09 at 01:58
  • 195
    Yes, the other thread will have to wait until the lock is released before it can proceed. This makes it very important that the lock is released by the holding thread when it is finished with it. If it never releases it, then the other thread will wait indefinitely. – Lehane Oct 22 '09 at 09:01
  • 2
    @Ian In a multithreaded system there will always be times when resources need to be shared. To say that one approach is bad without giving an alternative just isn't productive. I'm always looking for ways to improve and if there is an alternative I will gladly research it and weigh the pro's and cons. – Despertar May 03 '12 at 05:16
  • 1
    @Despertar Yes but you can design your multi-threaded system such that data contention is minimized - that's an alternative. Anyhow, I don't see a problem with someone only mentioning a problem with an approach if they explain what the problem is. – Ian Warburton May 03 '12 at 15:07
  • 2
    @Despertar ...also, its not necessarily the case that resources will always need to be shared in a milti-threaded system. For example you might have an array where each element needs processing. You could possibly partition the array and have a thread for each partition and the threads can do their work completely independently of one another. – Ian Warburton May 03 '12 at 17:29
  • 1
    +1 for reminding us to lock the variables used in the if condition. At first glance I would only put a lock around y since it's the data being changed, but that wouldn't do any good in your example. – goku_da_master Oct 18 '12 at 15:02
  • 15
    For a race to occur it's enough that a single thread attempts to change the shared data while rest of the threads can either read or change it. – SomeWittyUsername Nov 09 '12 at 16:13
  • How would the other thread gain access to that variable though? Isn't it restricted in scope(local variable)? – committedandroider Mar 07 '15 at 07:16
  • By this definition, there can be no race condition in programs on a single core, because threads are time-multiplexed and *never* run at the same time. Race condition should not be defined by runtime behavior; it should be a property of some code. – qznc Mar 15 '16 at 13:06
  • Narrowing down the concept of "race condition" to an unsynced access to a variable is a very misleading way to answer the original question. Race conditions are not in any way inherently tied to data access. @Baris Kasikci provided very good answers to what race condition really is. – AnT Apr 20 '16 at 16:28
  • @tsellon also provided a very good explanation to what race condition is. This answer on the other hand... Well, yes, unsynched access to data is an example of race condition, but it is such a specific and narrow example that using it as an answer to a general question about race condition is misleading at least. – AnT Apr 20 '16 at 16:30
  • I think it's worth noting that race condtion will only occur if at lease one competing thread will try to write/alter the resource. – FaceBro Jul 09 '16 at 06:08
  • I wrote a blog post about a race condition in Golang: https://joonas.fi/2017/02/20/concurrency-in-golang-and-the-importance-of-using-locks/ – joonas.fi Feb 21 '17 at 08:02
  • 2
    so let me guess? to prevent a race condition, you introduce a potential deadlock...C.S irony at its best – PirateApp Sep 03 '18 at 13:52
  • It's worth noting here that if x and y were local variables(i.e. if they were declared inside a function or passed as parameters), this race condition would NOT occur because each thread would have its own copy of x and y. – Gaurav B Dec 12 '18 at 06:38
  • 1
    Couldn't one just set the value of x to another variable as a placeholder? Such as the following `z = x; if(z == 2){ y = z * 2; }` – Isaac Apr 24 '19 at 21:18
  • you mentioned "Because the thread scheduling algorithm can swap between threads at any time, " what if scheduling algorithm swap thread while thread still inside the critical section and does not unlock mutex . – simplePerson43 Jun 17 '19 at 14:51
  • Yes and there is a problem of deadlock which occurs due to acquisition of lock from different threads. – Mike Aug 09 '19 at 19:54
  • what is the actual syntax to implement a lock in C? – mLstudent33 Dec 30 '19 at 22:17
  • not sure if I am just misreading this "A race condition occurs when two or more threads can access shared data and they try to change it at the same time." I understand this as "race condition = both try to change shared data at the same time", but in fact it is sufficient if one of them writes. Also your example is only reading from `x` (while a different thread writes to it) – 463035818_is_not_a_number Dec 03 '20 at 20:28
  • @BrianOrtiz Just wait – linjiejun Feb 02 '21 at 08:07
229

A "race condition" exists when multithreaded (or otherwise parallel) code that would access a shared resource could do so in such a way as to cause unexpected results.

Take this example:

for ( int i = 0; i < 10000000; i++ )
{
   x = x + 1; 
}

If you had 5 threads executing this code at once, the value of x WOULD NOT end up being 50,000,000. It would in fact vary with each run.

This is because, in order for each thread to increment the value of x, they have to do the following: (simplified, obviously)

Retrieve the value of x
Add 1 to this value
Store this value to x

Any thread can be at any step in this process at any time, and they can step on each other when a shared resource is involved. The state of x can be changed by another thread during the time between x is being read and when it is written back.

Let's say a thread retrieves the value of x, but hasn't stored it yet. Another thread can also retrieve the same value of x (because no thread has changed it yet) and then they would both be storing the same value (x+1) back in x!

Example:

Thread 1: reads x, value is 7
Thread 1: add 1 to x, value is now 8
Thread 2: reads x, value is 7
Thread 1: stores 8 in x
Thread 2: adds 1 to x, value is now 8
Thread 2: stores 8 in x

Race conditions can be avoided by employing some sort of locking mechanism before the code that accesses the shared resource:

for ( int i = 0; i < 10000000; i++ )
{
   //lock x
   x = x + 1; 
   //unlock x
}

Here, the answer comes out as 50,000,000 every time.

For more on locking, search for: mutex, semaphore, critical section, shared resource.

IKavanagh
  • 5,704
  • 11
  • 38
  • 44
privatehuff
  • 2,549
  • 1
  • 13
  • 7
  • See http://jakob.engbloms.se/archives/65 for an example of a program to test how oiften such things go bad... it really depends on the memory model of the machine you are running on. – jakobengblom2 Oct 12 '08 at 19:54
  • 1
    How can it get to 50 million if it has to stop at 10 million? –  Oct 26 '15 at 14:24
  • 10
    @nocomprende: By 5 threads executing the same code at a time, as described directly below the snippet... – Jon Skeet Nov 12 '15 at 18:47
  • 4
    @JonSkeet You are right, I confused the i and the x. Thank you. –  Nov 12 '15 at 18:57
  • 1
    Double check locking in implementing Singleton pattern is such an example of preventing race condition. – Bharat Dodeja Aug 31 '16 at 08:02
161

What is a Race Condition?

You are planning to go to a movie at 5 pm. You inquire about the availability of the tickets at 4 pm. The representative says that they are available. You relax and reach the ticket window 5 minutes before the show. I'm sure you can guess what happens: it's a full house. The problem here was in the duration between the check and the action. You inquired at 4 and acted at 5. In the meantime, someone else grabbed the tickets. That's a race condition - specifically a "check-then-act" scenario of race conditions.

How do you detect them?

Religious code review, multi-threaded unit tests. There is no shortcut. There are few Eclipse plugin emerging on this, but nothing stable yet.

How do you handle and prevent them?

The best thing would be to create side-effect free and stateless functions, use immutables as much as possible. But that is not always possible. So using java.util.concurrent.atomic, concurrent data structures, proper synchronization, and actor based concurrency will help.

The best resource for concurrency is JCIP. You can also get some more details on above explanation here.

ekostadinov
  • 6,642
  • 3
  • 24
  • 46
Vishal Shukla
  • 2,370
  • 1
  • 12
  • 18
  • Code reviews and unit tests are secondary to modeling the flow between your ears, and making less use of shared memory. – Acumenus Nov 25 '13 at 22:49
  • 2
    I appreciated the real world example of a race condition – Tom O. Jun 06 '17 at 17:45
  • 14
    Like the answer *thumbs up*. Solution is: you lock the tickets between 4-5 with mutex (mutual exception, c++). In real world it is called ticket reservation :) – Volt Aug 25 '17 at 12:11
  • 2
    would be a decent answer if you dropped the java-only bits (the question is not about Java, but rather race conditions in general) – Corey Goldberg Aug 26 '18 at 07:12
  • No. This isn't a race condition. From a "business" perspective you just waited too long. Obviously backorder is not a solution. Try a scalper otherwise just buy the ticket as insurance – csherriff Dec 03 '18 at 11:59
  • Great answer indeed, Race conditions will occur if at at all we come across "CHECK AND ACT" and "READ_MODIFY_UPDATE" situations – Sumanth Varada Jan 10 '19 at 09:50
71

There is an important technical difference between race conditions and data races. Most answers seem to make the assumption that these terms are equivalent, but they are not.

A data race occurs when 2 instructions access the same memory location, at least one of these accesses is a write and there is no happens before ordering among these accesses. Now what constitutes a happens before ordering is subject to a lot of debate, but in general ulock-lock pairs on the same lock variable and wait-signal pairs on the same condition variable induce a happens-before order.

A race condition is a semantic error. It is a flaw that occurs in the timing or the ordering of events that leads to erroneous program behavior.

Many race conditions can be (and in fact are) caused by data races, but this is not necessary. As a matter of fact, data races and race conditions are neither the necessary, nor the sufficient condition for one another. This blog post also explains the difference very well, with a simple bank transaction example. Here is another simple example that explains the difference.

Now that we nailed down the terminology, let us try to answer the original question.

Given that race conditions are semantic bugs, there is no general way of detecting them. This is because there is no way of having an automated oracle that can distinguish correct vs. incorrect program behavior in the general case. Race detection is an undecidable problem.

On the other hand, data races have a precise definition that does not necessarily relate to correctness, and therefore one can detect them. There are many flavors of data race detectors (static/dynamic data race detection, lockset-based data race detection, happens-before based data race detection, hybrid data race detection). A state of the art dynamic data race detector is ThreadSanitizer which works very well in practice.

Handling data races in general requires some programming discipline to induce happens-before edges between accesses to shared data (either during development, or once they are detected using the above mentioned tools). this can be done through locks, condition variables, semaphores, etc. However, one can also employ different programming paradigms like message passing (instead of shared memory) that avoid data races by construction.

Community
  • 1
  • 1
Baris Kasikci
  • 2,329
  • 1
  • 13
  • 8
  • The difference is critical to understand race condition. Thanks! – ProgramCpp Apr 03 '18 at 08:33
  • "It is a flaw that occurs in the timing or the ordering of events that leads to erroneous program behavior." Perfect definition! Indeed, there is no reason to assume that the events must occur within one instance of an application. Multiple instances are just as applicable. – truefusion Jul 08 '20 at 04:11
43

A sort-of-canonical definition is "when two threads access the same location in memory at the same time, and at least one of the accesses is a write." In the situation the "reader" thread may get the old value or the new value, depending on which thread "wins the race." This is not always a bug—in fact, some really hairy low-level algorithms do this on purpose—but it should generally be avoided. @Steve Gury give's a good example of when it might be a problem.

Chris Conway
  • 52,725
  • 40
  • 121
  • 150
  • 3
    Could you please give an example of how race conditions can be useful? Googling didn't help. – Alex V. Dec 11 '13 at 14:47
  • 3
    @Alex V. At this point, I have no idea what I was talking about. I think this may have been a reference to lock-free programming, but it's not really accurate to say that depends on race conditions, per se. – Chris Conway Dec 12 '13 at 15:31
34

A race condition is a kind of bug, that happens only with certain temporal conditions.

Example: Imagine you have two threads, A and B.

In Thread A:

if( object.a != 0 )
    object.avg = total / object.a

In Thread B:

object.a = 0

If thread A is preempted just after having check that object.a is not null, B will do a = 0, and when thread A will gain the processor, it will do a "divide by zero".

This bug only happen when thread A is preempted just after the if statement, it's very rare, but it can happen.

Blorgbeard
  • 93,378
  • 43
  • 217
  • 263
Steve Gury
  • 13,568
  • 6
  • 34
  • 41
26

A race condition is a situation on concurrent programming where two concurrent threads or processes compete for a resource and the resulting final state depends on who gets the resource first.

Jorge Córdoba
  • 47,433
  • 11
  • 77
  • 126
24

Race condition is not only related with software but also related with hardware too. Actually the term was initially coined by the hardware industry.

According to wikipedia:

The term originates with the idea of two signals racing each other to influence the output first.

Race condition in a logic circuit:

enter image description here

Software industry took this term without modification, which makes it a little bit difficult to understand.

You need to do some replacement to map it to the software world:

  • "two signals" => "two threads"/"two processes"
  • "influence the output" => "influence some shared state"

So race condition in software industry means "two threads"/"two processes" racing each other to "influence some shared state", and the final result of the shared state will depend on some subtle timing difference, which could be caused by some specific thread/process launching order, thread/process scheduling, etc.

Community
  • 1
  • 1
nybon
  • 7,315
  • 7
  • 48
  • 59
19

Race conditions occur in multi-threaded applications or multi-process systems. A race condition, at its most basic, is anything that makes the assumption that two things not in the same thread or process will happen in a particular order, without taking steps to ensure that they do. This happens commonly when two threads are passing messages by setting and checking member variables of a class both can access. There's almost always a race condition when one thread calls sleep to give another thread time to finish a task (unless that sleep is in a loop, with some checking mechanism).

Tools for preventing race conditions are dependent on the language and OS, but some comon ones are mutexes, critical sections, and signals. Mutexes are good when you want to make sure you're the only one doing something. Signals are good when you want to make sure someone else has finished doing something. Minimizing shared resources can also help prevent unexpected behaviors

Detecting race conditions can be difficult, but there are a couple signs. Code which relies heavily on sleeps is prone to race conditions, so first check for calls to sleep in the affected code. Adding particularly long sleeps can also be used for debugging to try and force a particular order of events. This can be useful for reproducing the behavior, seeing if you can make it disappear by changing the timing of things, and for testing solutions put in place. The sleeps should be removed after debugging.

The signature sign that one has a race condition though, is if there's an issue that only occurs intermittently on some machines. Common bugs would be crashes and deadlocks. With logging, you should be able to find the affected area and work back from there.

tsellon
  • 2,286
  • 5
  • 24
  • 30
12

Microsoft actually have published a really detailed article on this matter of race conditions and deadlocks. The most summarized abstract from it would be the title paragraph:

A race condition occurs when two threads access a shared variable at the same time. The first thread reads the variable, and the second thread reads the same value from the variable. Then the first thread and second thread perform their operations on the value, and they race to see which thread can write the value last to the shared variable. The value of the thread that writes its value last is preserved, because the thread is writing over the value that the previous thread wrote.

Konstantin Dinev
  • 30,746
  • 13
  • 64
  • 91
8

What is a race condition?

The situation when the process is critically dependent on the sequence or timing of other events.

For example, Processor A and processor B both needs identical resource for their execution.

How do you detect them?

There are tools to detect race condition automatically:

How do you handle them?

Race condition can be handled by Mutex or Semaphores. They act as a lock allows a process to acquire a resource based on certain requirements to prevent race condition.

How do you prevent them from occurring?

There are various ways to prevent race condition, such as Critical Section Avoidance.

  1. No two processes simultaneously inside their critical regions. (Mutual Exclusion)
  2. No assumptions are made about speeds or the number of CPUs.
  3. No process running outside its critical region which blocks other processes.
  4. No process has to wait forever to enter its critical region. (A waits for B resources, B waits for C resources, C waits for A resources)
Adnan Qureshi
  • 477
  • 4
  • 12
2

A race condition is an undesirable situation that occurs when a device or system attempts to perform two or more operations at the same time, but because of the nature of the device or system, the operations must be done in the proper sequence in order to be done correctly.

In computer memory or storage, a race condition may occur if commands to read and write a large amount of data are received at almost the same instant, and the machine attempts to overwrite some or all of the old data while that old data is still being read. The result may be one or more of the following: a computer crash, an "illegal operation," notification and shutdown of the program, errors reading the old data, or errors writing the new data.

1

You can prevent race condition, if you use "Atomic" classes. The reason is just the thread don't separate operation get and set, example is below:

AtomicInteger ai = new AtomicInteger(2);
ai.getAndAdd(5);

As a result, you will have 7 in link "ai". Although you did two actions, but the both operation confirm the same thread and no one other thread will interfere to this, that means no race conditions!

Vishal Shukla
  • 2,370
  • 1
  • 12
  • 18
1

Here is the classical Bank Account Balance example which will help newbies to understand Threads in Java easily w.r.t. race conditions:

public class BankAccount {

/**
 * @param args
 */
int accountNumber;
double accountBalance;

public synchronized boolean Deposit(double amount){
    double newAccountBalance=0;
    if(amount<=0){
        return false;
    }
    else {
        newAccountBalance = accountBalance+amount;
        accountBalance=newAccountBalance;
        return true;
    }

}
public synchronized boolean Withdraw(double amount){
    double newAccountBalance=0;
    if(amount>accountBalance){
        return false;
    }
    else{
        newAccountBalance = accountBalance-amount;
        accountBalance=newAccountBalance;
        return true;
    }
}

public static void main(String[] args) {
    // TODO Auto-generated method stub
    BankAccount b = new BankAccount();
    b.accountBalance=2000;
    System.out.println(b.Withdraw(3000));

}
Blorgbeard
  • 93,378
  • 43
  • 217
  • 263
realPK
  • 1,850
  • 24
  • 20
0

Try this basic example for better understanding of race condition:

    public class ThreadRaceCondition {

    /**
     * @param args
     * @throws InterruptedException
     */
    public static void main(String[] args) throws InterruptedException {
        Account myAccount = new Account(22222222);

        // Expected deposit: 250
        for (int i = 0; i < 50; i++) {
            Transaction t = new Transaction(myAccount,
                    Transaction.TransactionType.DEPOSIT, 5.00);
            t.start();
        }

        // Expected withdrawal: 50
        for (int i = 0; i < 50; i++) {
            Transaction t = new Transaction(myAccount,
                    Transaction.TransactionType.WITHDRAW, 1.00);
            t.start();

        }

        // Temporary sleep to ensure all threads are completed. Don't use in
        // realworld :-)
        Thread.sleep(1000);
        // Expected account balance is 200
        System.out.println("Final Account Balance: "
                + myAccount.getAccountBalance());

    }

}

class Transaction extends Thread {

    public static enum TransactionType {
        DEPOSIT(1), WITHDRAW(2);

        private int value;

        private TransactionType(int value) {
            this.value = value;
        }

        public int getValue() {
            return value;
        }
    };

    private TransactionType transactionType;
    private Account account;
    private double amount;

    /*
     * If transactionType == 1, deposit else if transactionType == 2 withdraw
     */
    public Transaction(Account account, TransactionType transactionType,
            double amount) {
        this.transactionType = transactionType;
        this.account = account;
        this.amount = amount;
    }

    public void run() {
        switch (this.transactionType) {
        case DEPOSIT:
            deposit();
            printBalance();
            break;
        case WITHDRAW:
            withdraw();
            printBalance();
            break;
        default:
            System.out.println("NOT A VALID TRANSACTION");
        }
        ;
    }

    public void deposit() {
        this.account.deposit(this.amount);
    }

    public void withdraw() {
        this.account.withdraw(amount);
    }

    public void printBalance() {
        System.out.println(Thread.currentThread().getName()
                + " : TransactionType: " + this.transactionType + ", Amount: "
                + this.amount);
        System.out.println("Account Balance: "
                + this.account.getAccountBalance());
    }
}

class Account {
    private int accountNumber;
    private double accountBalance;

    public int getAccountNumber() {
        return accountNumber;
    }

    public double getAccountBalance() {
        return accountBalance;
    }

    public Account(int accountNumber) {
        this.accountNumber = accountNumber;
    }

    // If this method is not synchronized, you will see race condition on
    // Remove syncronized keyword to see race condition
    public synchronized boolean deposit(double amount) {
        if (amount < 0) {
            return false;
        } else {
            accountBalance = accountBalance + amount;
            return true;
        }
    }

    // If this method is not synchronized, you will see race condition on
    // Remove syncronized keyword to see race condition
    public synchronized boolean withdraw(double amount) {
        if (amount > accountBalance) {
            return false;
        } else {
            accountBalance = accountBalance - amount;
            return true;
        }
    }
}
Morsu
  • 1
  • 1
0

You don't always want to discard a race condition. If you have a flag which can be read and written by multiple threads, and this flag is set to 'done' by one thread so that other thread stop processing when flag is set to 'done', you don't want that "race condition" to be eliminated. In fact, this one can be referred to as a benign race condition.

However, using a tool for detection of race condition, it will be spotted as a harmful race condition.

More details on race condition here, http://msdn.microsoft.com/en-us/magazine/cc546569.aspx.

kiriloff
  • 22,522
  • 32
  • 127
  • 207
  • What language is your answer based on? – MikeMB Apr 21 '15 at 17:14
  • Frankly it seems to me that if you have race conditions _per se_, you are not architecting your code in a tightly-controlled manner. Which, while it may not be an issue in your theoretical case, is evidence of larger issues with the way you design & develop software. Expect to face painful race condition bugs sooner or later. – Engineer Jul 13 '16 at 12:57
0

Consider an operation which has to display the count as soon as the count gets incremented. ie., as soon as CounterThread increments the value DisplayThread needs to display the recently updated value.

int i = 0;

Output

CounterThread -> i = 1  
DisplayThread -> i = 1  
CounterThread -> i = 2  
CounterThread -> i = 3  
CounterThread -> i = 4  
DisplayThread -> i = 4

Here CounterThread gets the lock frequently and updates the value before DisplayThread displays it. Here exists a Race condition. Race Condition can be solved by using Synchronzation

bharanitharan
  • 2,278
  • 5
  • 28
  • 30
0

A race condition is an undesirable situation that occurs when two or more process can access and change the shared data at the same time.It occurred because there were conflicting accesses to a resource . Critical section problem may cause race condition. To solve critical condition among the process we have take out only one process at a time which execute the critical section.

rashedcs
  • 2,709
  • 2
  • 29
  • 32