Java Multithreading Value Corruption

Question

Asking a question from https://www.baeldung.com/java-thread-safety.

Code given is

public class MathUtils {

    public static BigInteger factorial(int number) {
        BigInteger f = new BigInteger("1");
        for (int i = 2; i <= number; i++) {
            f = f.multiply(BigInteger.valueOf(i));
        }
        return f;
    }
}

The linked article says this method is stateless, so multiple threads can run it at the same time and get correct results. My question is: doesn't the value of the number variable get corrupted when many threads call this?

`number` is a parameter. Each method call ends up with an entirely separate variable. — Jon Skeet, Apr 28 '21 at 12:01
Not just every thread - but every method call gets its own copy of all local variables, including parameters. (So if you call a method recursively, each of those calls will have its own set of local variables.) — Jon Skeet, Apr 28 '21 at 12:20

score 2 · Answer 1 · answered Apr 28 '21 at 12:24

The specific mechanism you need to be aware of is the stack.

Each thread gets its own stack. The stack is a traditional last-in-first-out setup: You can 'push' things on it, and you can 'pop' things off it, which will retrieve and remove the most recently pushed thing.

The stack is used for both local variables and execution pointers. Imagine this code:

public static void main(String[] args) {
    a(5);
    System.out.println("Done");
}

public static void a(int x) {
    b();
    System.out.println("In a: " + x);
}

public static void b() {}

Now imagine you're a CPU. You are just pointed at an instruction and are supposed to run it. You don't know java and you don't know what a loop is. You just know about basic instructions, including 'GO TO'. But that's all you have.

How would you know where to go back to once b() is done running? How would you know that you have to jump back to the midpoint of the a method and continue at System.out.println("In a")?

The stack is the answer. When executing b(), what happens under the hood is:

PUSH [position in this method we're at right now]
GOTO [position of the start of the b method]

And the b() method ends in an instruction that means: POP a number off of the stack and then GOTO that number.

Local variables are ALSO stored on the stack. So, the instruction set of all this is essentially:

1 PUSH 4 // position to return back to once a is done
2 PUSH 5 // from a(5)
3 GOTO 8 // a method
4 PUSH 7
5 CREATE_OBJECT "Done" // pushes pos of new object on stack
6 GOTO [position of System.out.println]
7 EXIT_APPLICATION

8 PUSH 10 // position to return back to
9 GOTO 16 // b() method
10 CREATE_OBJECT "In a: " // pushes pos of new object on stack
11 FLIP // flip the top two stack entries
12 CONCAT_STRINGS
13 PUSH 15 // position to return back to
14 GOTO [position of System.out.println]
15 RET // pop number and go to it

16 RET

(This is HIGHLY oversimplified; CPUs and bytecode are way more complicated than this, but it gets the point across, hopefully!)
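To see the same idea from the Java side, here is a minimal sketch (the class and method names are made up for illustration). Each nested call gets its own x, and when the nested call returns, the outer call's x is untouched:

```java
import java.util.ArrayList;
import java.util.List;

class StackFrameDemo {

    // Hypothetical helper: records the value of 'x' that each invocation
    // sees AFTER its nested call has returned.
    static void countDown(int x, List<Integer> seen) {
        if (x > 0) {
            countDown(x - 1, seen); // the nested call gets its own, separate 'x'
        }
        // Back in this invocation: 'x' still holds the value this call was given.
        seen.add(x);
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        countDown(3, seen);
        System.out.println(seen); // prints [0, 1, 2, 3]
    }
}
```

The innermost call finishes first, so the values are recorded from 0 upwards; no call ever sees another call's x.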

The stack as a concept explains how java works:

Methods are 're-entrant', and each local variable and parameter is a unique copy every time the method runs. That's because these are represented by things on the stack, and you can of course just keep adding things to it. In that sense the stack is a relative thing: "System.out.println(b)", if b is a local var or parameter, is like an instruction to read a line in a book 'two lines above where you are currently reading' (that'll be a new line as long as you keep reading), vs an instruction to read 'line 8 in this book', which is the same line every time.
Java is pass-by-value, which means everything you get is a copy:

int x = 5;
add1(x);
System.out.println(x);

public void add1(int x) {
   x = x + 1;
}

the above prints 5 and not 6, because add1(x) is shorthand for:

PUSH current_value_of_whatever_the_x_variable_holds
CALL add1

and add1 is going to operate on that pushed value, and not on your x variable. It gets a little convoluted when we involve objects, because objects in Java are represented by their reference: a pointer. Imagine an object is a house, and a reference is more like an address. I can have an address to my house, and then hand you a copy of that address on a piece of paper. You can take a pen and change that paper all you like; it does not affect either my address book or my house. But if you drive over to the house and toss a brick through the window, even though I handed you a copy of a page of my address book, that's still my window. So:

List<String> list = new ArrayList<String>();
list.add("Hello");
add1(list);
System.out.println(list);

public void add1(List<String> list) {
  list.add("World!");
}

This would print [Hello, World!], because . is the Java equivalent of 'drive over to the house that the address in list points to'. Had I written list = List.of("Hello", "World!") inside add1 instead, nothing would appear to happen, as = is the Java equivalent of 'wipe out the address card and write a new address on it', which affects neither my house nor my address book.
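Putting the three cases side by side as one runnable sketch (the class name and the addWorld/replace method names are mine, not from the snippets above; List.of assumes Java 9 or later):

```java
import java.util.ArrayList;
import java.util.List;

class PassByValueDemo {

    // Reassigning a primitive parameter changes only the callee's copy.
    static void add1(int x) {
        x = x + 1;
    }

    // Following the reference with '.' mutates the one list object both sides share.
    static void addWorld(List<String> list) {
        list.add("World!");
    }

    // Reassigning with '=' only overwrites the callee's copy of the reference.
    static void replace(List<String> list) {
        list = List.of("Replaced");
    }

    public static void main(String[] args) {
        int x = 5;
        add1(x);
        System.out.println(x); // prints 5

        List<String> list = new ArrayList<>();
        list.add("Hello");
        addWorld(list);
        System.out.println(list); // prints [Hello, World!]

        replace(list);
        System.out.println(list); // prints [Hello, World!]
    }
}
```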

score 1 · Accepted Answer · answered Apr 28 '21 at 12:02

Multiple threads can indeed call MathUtils.factorial() safely.

Each activation of factorial will have its own copy of f, accessible to it alone. That is the meaning of a "local variable".

The argument number is not modified, and acts like a local variable of factorial in any case.

As to your question in the comment: no, there's one copy of the code, and there is no need to have more than one. But each thread has its own execution of the code, so if it helps to think of that as a 'separate copy', not much conceptual harm is done.
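If you want to convince yourself, a rough sketch like the following runs the method from several pool threads at once. The factorial body follows the question's MathUtils; the class name, the computeInParallel helper and the pool size of 4 are assumptions for the demo only.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentFactorialDemo {

    // Same logic as MathUtils.factorial from the question.
    static BigInteger factorial(int number) {
        BigInteger f = BigInteger.ONE;
        for (int i = 2; i <= number; i++) {
            f = f.multiply(BigInteger.valueOf(i));
        }
        return f;
    }

    // Hypothetical helper: computes factorial(1..count) on a small thread pool
    // and returns the results in input order.
    static List<BigInteger> computeInParallel(int count) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<BigInteger>> tasks = new ArrayList<>();
            for (int n = 1; n <= count; n++) {
                final int input = n; // each task captures its own value
                tasks.add(() -> factorial(input));
            }
            List<BigInteger> results = new ArrayList<>();
            // invokeAll returns the futures in the same order as the tasks.
            for (Future<BigInteger> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<BigInteger> results = computeInParallel(10);
        System.out.println(results.get(4)); // factorial(5): prints 120
        System.out.println(results.get(9)); // factorial(10): prints 3628800
    }
}
```

Every task runs the same single copy of the factorial code, but each invocation has its own number, f and i, so the results do not interfere.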

something


So even if a static method is run by different threads, threads will have their own copy of the method? – Abhijeet Apr 28 '21 at 12:06
@Abhijeet No, each method invocation will have its own set of local variables, but overall there is only one copy of the method itself. – Mark Rotteveel Apr 28 '21 at 12:43
It's a wonder... something more has to be explained here, I think. But thanks, of course – Abhijeet Apr 28 '21 at 13:06
I think I understood this... rereading helps... – Abhijeet Apr 28 '21 at 13:28

score 0 · Answer 3 · answered Apr 28 '21 at 12:26

Whenever any thread (the main thread included) starts executing the factorial(int number) method, the value of number is copied onto that thread's own stack as a local variable, so no other thread can change it.

However, if the value passed as number comes from a shared object (one shared by multiple threads), a thread may copy it onto its stack and some other thread may then change the shared value. In that case the copy and the shared state can disagree, so there is a chance of data inconsistency (check 'volatile').
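A small sketch of that situation, with a made-up Settings holder standing in for the shared object (none of these names come from the question). The local copy keeps the old value even after another thread changes the shared field:

```java
import java.math.BigInteger;

class SharedInputDemo {

    // Hypothetical shared object; 'volatile' makes a write by one thread
    // visible to later reads by other threads.
    static class Settings {
        volatile int number = 5;
    }

    // Same logic as MathUtils.factorial from the question.
    static BigInteger factorial(int number) {
        BigInteger f = BigInteger.ONE;
        for (int i = 2; i <= number; i++) {
            f = f.multiply(BigInteger.valueOf(i));
        }
        return f;
    }

    // Returns { factorial(local copy), factorial(current shared value) }.
    static BigInteger[] snapshotThenChange() {
        Settings shared = new Settings();
        int snapshot = shared.number; // copy the shared value into a local variable

        Thread writer = new Thread(() -> shared.number = 10);
        writer.start();
        try {
            writer.join(); // wait until the other thread has changed the field
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new BigInteger[] { factorial(snapshot), factorial(shared.number) };
    }

    public static void main(String[] args) {
        BigInteger[] results = snapshotThenChange();
        System.out.println(results[0]); // prints 120 (from the local copy, 5)
        System.out.println(results[1]); // prints 3628800 (from the updated field, 10)
    }
}
```

The factorial call itself is safe either way; what can surprise you is which value of the shared field you happened to read before calling it.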
