
I have 8 files. Each one of them is about 1.7 GB. I'm reading those files into a byte array and that operation is fast enough.

Each file is then read as follows:

BufferedReader br = new BufferedReader(new InputStreamReader(new ByteArrayInputStream(data)));

When processed sequentially on a single core, each file takes about 60 seconds to complete. However, when distributing the computation over 8 separate cores, it takes far longer than 60 seconds per file.

Since the data are all in memory and no IO operations are performed, I would have presumed that processing a single file should take no longer than 60 seconds per core, so all 8 files should complete in just over 60 seconds in total. But this is not the case.

Am I missing something about BufferedReader behaviour, or about any of the other readers used in the code above?

It might be worth mentioning that I'm using this code to load the files first:

byte[] content=org.apache.commons.io.FileUtils.readFileToByteArray(new File(filePath));

Overall, the code looks like this (a rough Java sketch follows below):

For each file
 read the file into a byte[]
 add the byte[] to a list
end For
For each item in the list
 create a thread and pass a byte[] to it
end For
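
In concrete terms, the threading part is roughly equivalent to this sketch (simplified; process() is just a placeholder for the real per-file work, not the actual code):

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.io.FileUtils;

public class FileProcessor {

    void processAll(List<String> filePaths) throws Exception {
        // read every file fully into memory first
        List<byte[]> contents = new ArrayList<byte[]>();
        for (String filePath : filePaths) {
            contents.add(FileUtils.readFileToByteArray(new File(filePath)));
        }

        // start one thread per file; each thread gets its own byte[]
        List<Thread> threads = new ArrayList<Thread>();
        for (final byte[] data : contents) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    process(data);
                }
            });
            threads.add(t);
            t.start();
        }
        // wait for all files to finish
        for (Thread t : threads) {
            t.join();
        }
    }

    void process(byte[] data) {
        // placeholder: read from a BufferedReader over the byte[] here
    }
}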
DotNet
  • How many disk drives are the files distributed over? Or are they all stored on the same drive? – Ramón J Romero y Vigil Feb 27 '13 at 13:42
  • For such big files I would strongly recommend using NIO. Please check this article: http://www.javalobby.org/java/forums/t17036.html, it might be helpful – n1ckolas Feb 27 '13 at 13:44
  • Files are already stored in memory in a byte[]; disk drives are not relevant here. @RJRyV – DotNet Feb 27 '13 at 13:44
  • @n1ckolas Thanks. As in my previous comment and in the question, all files are in memory and no IO is involved. – DotNet Feb 27 '13 at 13:45
  • I disagree that the number of drives is not relevant. If you have one disk drive (i.e. one head) then multi-coring the problem would only serve to cause thread contention for a single resource. This would make the multi-threaded approach slower than the serial method... – Ramón J Romero y Vigil Feb 27 '13 at 13:47
  • Could be that the cost of moving data from main memory into each core's cache is slowing down your processes in a non-linear fashion when multiple cores are used. – Perception Feb 27 '13 at 13:48
  • How are you choosing the core to process each file? – hamilton.lima Feb 27 '13 at 13:48
  • @RJRyV - disk drives are *irrelevant* to the question. Just pretend like the OP has 8 large byte arrays he is trying to process concurrently in multiple cores. – Perception Feb 27 '13 at 13:49
  • It might have little to do with the readers themselves, and more to do with the overhead of creating and reading from them concurrently. There could also be a bottleneck in your multi-thread code which keeps the different read operations from actually proceeding in the different threads. It would help if we could see the code that reads from the BufferedReaders as well as the threading code... – RudolphEst Feb 27 '13 at 13:50
  • @RJRyV Yes I agree with you, but files are in-memory already! – DotNet Feb 27 '13 at 13:50
  • Can you show the code you are using to distribute the task? It could be that the way you are doing it is causing contention. – John Kane Feb 27 '13 at 13:51
  • The questions from Brett are actually relevant: What do you mean by "distributing the computation"? Does every thread have its own copy of the data? Is there synchronization on a shared Reader? – Stephan Feb 27 '13 at 13:54
  • @AdamDyga that is exactly what I said in my answer below. I agree this is a likely cause for the problem. – KingCronus Feb 27 '13 at 13:56
  • Do I understand your question correctly, that when sequentially processing the files, it takes about 60s to process each file, i.e. ~8 minutes in total? And that if you process them in separate threads, it takes "far longer than 60s" in total? How much is "far longer"? – jarnbjo Feb 27 '13 at 14:24

2 Answers


How are you actually "distributing the computation"? Is there synchronization involved? Are you simply creating 8 threads to read the 8 files?

What platform are you running on (Linux, Windows, etc.)? I have seen seemingly strange behavior from the Windows scheduler before, where it moves a single process from core to core to try to balance the load among the cores. This ended up causing slower performance than just letting a single core be utilized more than the rest.
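
If the threads share any state (a common reader, a synchronized collection, and so on), that alone can serialize the work. One way to rule that out, purely as a sketch assuming each task only touches its own byte[], is a fixed-size thread pool:

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

void processConcurrently(List<byte[]> contents) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(8);
    for (final byte[] data : contents) {
        pool.submit(new Runnable() {
            public void run() {
                // each task builds its own reader over its own array,
                // so nothing is shared or synchronized between threads
                BufferedReader br = new BufferedReader(
                        new InputStreamReader(new ByteArrayInputStream(data)));
                // ... read from br here ...
            }
        });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
}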

Brett Okken

How much memory is your system rocking?

8 x 1.7 GB, plus operating system overhead, might mean that virtual memory / paging has to come into play, which is obviously much slower than RAM.

I appreciate you say each file is in memory, but do you actually have 16 GB of free RAM, or is there more going on at an abstracted level?

If the context switches are also forcing constant page swapping, that would explain the increased time.
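
A quick way to sanity-check this from inside the JVM (just a sketch) is to print the heap figures before and after loading the arrays:

Runtime rt = Runtime.getRuntime();
System.out.println("max heap:   " + rt.maxMemory() / (1024 * 1024) + " MB");
System.out.println("total heap: " + rt.totalMemory() / (1024 * 1024) + " MB");
System.out.println("free heap:  " + rt.freeMemory() / (1024 * 1024) + " MB");

If the max heap reported there is not comfortably above the ~14 GB of file data, the JVM itself is the limit, regardless of how much physical RAM the machine has.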

KingCronus
  • Thanks for your answer. I have enough memory to accommodate the data; no paging or use of virtual memory is involved. – DotNet Feb 27 '13 at 14:00