Questions tagged [combiners]
104 questions
17
votes
2 answers
"Combiner" class in a MapReduce job
A Combiner runs after the Mapper and before the Reducer. It receives as input all data emitted by the Mapper instances on a given node, then emits output to the Reducers.
Also, if a reduce function is both commutative and associative, then it…
wayen wan
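A minimal sketch of the idea in the excerpt above, in plain Python rather than the Hadoop API. It assumes a word-count job and models the combiner per map task; `split_a`, `split_b`, and all function names are hypothetical. Because addition is commutative and associative, the same summing logic can serve as both combiner and reducer.

```python
from collections import defaultdict


def map_words(line):
    """Mapper: emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner: sum counts per key locally, before the shuffle."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def reduce_counts(pairs):
    """Reducer: sum all (possibly pre-combined) counts per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Two hypothetical map tasks, each with its own input split.
split_a = ["to be or not to be"]
split_b = ["to do or not to do"]

combined_a = combine(p for line in split_a for p in map_words(line))
combined_b = combine(p for line in split_b for p in map_words(line))

# The reducer sees fewer records than the mappers emitted.
result = reduce_counts(combined_a + combined_b)
```

In this sketch the first map task emits six records but only four leave the combiner, which is the traffic saving a combiner is meant to provide.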
14
votes
3 answers
On what basis does the MapReduce framework decide whether to launch a combiner?
As per definition "The Combiner may be called 0, 1, or many times on each key between the mapper and reducer."
I want to know on what basis the MapReduce framework decides how many times the combiner will be launched.
banjara
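Whatever rule the framework uses, the quoted definition implies that a job has to produce the same result for zero, one, or several combiner runs. The sketch below checks that property for a summing combiner in plain Python; it does not model how Hadoop makes the decision, and all names are hypothetical.

```python
from collections import defaultdict


def combine(pairs):
    """Sum values per key; usable any number of times on partial data."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def reduce_sum(pairs):
    """Reducer: final per-key sum."""
    return dict(combine(pairs))


map_output = [("a", 1), ("b", 1), ("a", 1), ("a", 1), ("b", 1)]

# Combiner never runs.
zero_times = reduce_sum(map_output)
# Combiner runs once over the whole map output.
one_time = reduce_sum(combine(map_output))
# Combiner runs on two chunks, then again on the merged partial results.
many_times = reduce_sum(
    combine(combine(map_output[:2]) + combine(map_output[2:]))
)
```

All three variables hold the same dictionary, which is the property a combiner must preserve.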
13
votes
3 answers
Multiple git commands in single command executed in order they are encountered by compiler
I have the following list of commands that I run in order so that a source project can be committed and pushed to the repository on Bitbucket:
git init
git remote add origin https://[BitBucket Username]@bitbucket.org/[BitBucket…
Vicky Dev
11
votes
4 answers
Can the combiner and reducer be different?
In many MapReduce programs, I see a reducer being used as a combiner as well. I know this is because of the specific nature of those programs. But I am wondering if they can be different.
kee
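One case where the two plausibly have to differ is a reducer that filters on the final total. The sketch below is plain Python, not Hadoop code; `MIN_COUNT` and the sample data are made up for illustration. The combiner only pre-aggregates, while the reducer sums and then applies the threshold.

```python
from collections import defaultdict

MIN_COUNT = 3  # hypothetical threshold, chosen for illustration


def sum_by_key(pairs):
    """Shared helper: sum values per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return totals


def combiner(pairs):
    """Only pre-aggregates; it must not filter, because it sees partial counts."""
    return list(sum_by_key(pairs).items())


def reducer(pairs):
    """Sums the partial counts, then keeps keys that reach MIN_COUNT."""
    return {k: v for k, v in sum_by_key(pairs).items() if v >= MIN_COUNT}


task_1 = [("x", 1), ("x", 1), ("y", 1)]
task_2 = [("x", 1), ("y", 1), ("z", 1)]

# Separate combiner and reducer: "x" reaches the threshold across both tasks.
correct = reducer(combiner(task_1) + combiner(task_2))

# Reusing the reducer as the combiner drops partial counts below the threshold.
wrong = reducer(list(reducer(task_1).items()) + list(reducer(task_2).items()))
```

Here `correct` keeps `"x"` with a total of 3, while `wrong` is empty because each task's partial count was filtered out too early.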
10
votes
4 answers
Hadoop combiner sort phase
When running a MapReduce job with a specified combiner, is the combiner run during the sort phase? I understand that the combiner is run on mapper output for each spill, but it seems like it would also be beneficial to run during intermediate steps…
Michael Mior
5
votes
1 answer
Turn list of key/value pairs into list of values per key in spark
We need to efficiently convert large lists of key/value pairs, like this:
val providedData = List(
(new Key("1"), new Val("one")),
(new Key("1"), new Val("un")),
(new Key("1"), new Val("ein")),
(new Key("2"), new…
Bradjcox
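The transformation being asked about can be sketched in plain Python, without Spark, to show the intended shape of the result. The sample pairs below are illustrative stand-ins, not the question's truncated data, and `values_per_key` is a hypothetical name. In Spark itself, `groupByKey` on a pair RDD is a common starting point for this kind of grouping.

```python
from collections import defaultdict


def values_per_key(pairs):
    """Group a flat list of (key, value) pairs into key -> list of values."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)  # keeps values in input order
    return dict(grouped)


# Illustrative sample data only.
sample = [("1", "one"), ("1", "un"), ("1", "ein"), ("2", "two")]
grouped = values_per_key(sample)
```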
5
votes
1 answer
Difference between combiner and in-mapper combiner in mapreduce?
I'm new to hadoop and mapreduce. Could someone clarify the difference between a combiner and an in-mapper combiner or are they the same thing?
Billy02
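As a rough illustration of the in-mapper variant: instead of emitting one record per input item and leaving aggregation to a separate combiner, the mapper keeps running totals in memory and emits them once at the end of the task. The class below is a plain-Python sketch under that assumption; the class and method names are hypothetical and only loosely mirror a mapper's lifecycle.

```python
from collections import defaultdict


class InMapperCombiningMapper:
    """Word-count mapper that aggregates inside the map task itself."""

    def __init__(self):
        # State that lives for the whole map task, not a single map() call.
        self.totals = defaultdict(int)

    def map(self, line):
        # Nothing is emitted here; counts are only accumulated.
        for word in line.split():
            self.totals[word] += 1

    def cleanup(self):
        # Emit one record per distinct key when the task finishes.
        return sorted(self.totals.items())


mapper = InMapperCombiningMapper()
for line in ["a b a", "b a c"]:
    mapper.map(line)
emitted = mapper.cleanup()
```

The trade-off in this design is memory: the mapper holds every distinct key until cleanup, whereas a regular combiner is invoked by the framework and is not guaranteed to run at all.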
4
votes
1 answer
Have 3 matrices of same dimensions - I want to get the highest value of each cell of the three different matrices
Basically I have 3 matrices of the same dimensions. They only consist of the values 0, 1, 2, 3. I would like to create a new matrix that takes the highest value from each of the corresponding matrices.
For example, if the first row of the matrices are…
Ethan Hunt
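A cell-by-cell maximum over same-sized matrices can be sketched with nested lists in plain Python. The question's own language and data are not shown in the excerpt, so the matrices below are made-up examples and `elementwise_max` is a hypothetical helper.

```python
def elementwise_max(*matrices):
    """Return a matrix whose cells hold the largest value at each position."""
    return [
        [max(cells) for cells in zip(*rows)]  # compare one cell across matrices
        for rows in zip(*matrices)            # walk the matrices row by row
    ]


# Made-up 2x2 examples using only the values 0 to 3.
a = [[0, 1], [2, 3]]
b = [[3, 0], [1, 1]]
c = [[1, 2], [2, 0]]

result = elementwise_max(a, b, c)
```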
4
votes
3 answers
combine array entries with every other entry
Sorry for the title, as it looks like most of the other questions about combining arrays, but I don't know how to word it more specifically.
I need a PHP function, which combines the entries of one array (dynamic size from 1 to any) to strings in every…
Lucker
4
votes
4 answers
PHP, make with all other array values
I have to make a complex array with other array values.
The original array is:
Array (
[0] => Array (
[0] => A
[1] => B
[2] => C
)
[1] => Array (
[0] => D
[1] => E
[2] => F
)
)
I'm…
Neteor Informatique
4
votes
1 answer
Why is the number of combiner input records more than the number of outputs of maps?
A Combiner runs after the Mapper and before the Reducer. It receives as input all data emitted by the Mapper instances on a given node, then emits output to the Reducers. So the number of combiner input records should be less than the maps…
alex
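One commonly given explanation is that the combiner can run once per spill of the map output and again when the spill files are merged, so a counter that sums over every invocation can exceed the number of map output records. The sketch below simulates that in plain Python; `SPILL_SIZE` and the counter variable are hypothetical and do not reproduce Hadoop's actual spill logic.

```python
from collections import defaultdict

combine_input_records = 0  # simulated counter, summed over all combiner runs


def combine(pairs):
    """Summing combiner that also counts every record it is handed."""
    global combine_input_records
    pairs = list(pairs)
    combine_input_records += len(pairs)
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


map_output = [("a", 1), ("b", 1), ("a", 1), ("b", 1), ("a", 1), ("c", 1)]
SPILL_SIZE = 2  # hypothetical: the buffer spills after every two records

# First pass: the combiner runs once per spill.
spills = [
    combine(map_output[i:i + SPILL_SIZE])
    for i in range(0, len(map_output), SPILL_SIZE)
]

# Second pass: the combiner runs again while the spills are merged.
merged = combine(pair for spill in spills for pair in spill)
```

With this data the mapper emits 6 records, yet the simulated counter ends at 12, because the merge pass re-reads the partially combined records.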
4
votes
3 answers
How can I find out if a task is a reducer or a combiner during run time in Hadoop?
If the operation performed with MapReduce is not commutative and associative, then the combiner cannot be the same as the reducer.
For example, when calculating an average value, the combiner sums the values for a key and the reducer sums them and…
Calin-Andrei Burloiu
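A sketch of the averaging case in plain Python, assuming the usual approach of carrying (sum, count) pairs so that the combiner and reducer are separate functions and neither needs to detect its role at run time. All names and the sample values are hypothetical.

```python
from collections import defaultdict


def mapper(records):
    """Emit (key, (value, 1)) so that counts travel with the sums."""
    return [(key, (value, 1)) for key, value in records]


def combiner(pairs):
    """Add up partial sums and counts per key; safe to run repeatedly."""
    partial = defaultdict(lambda: (0, 0))
    for key, (total, count) in pairs:
        prev_total, prev_count = partial[key]
        partial[key] = (prev_total + total, prev_count + count)
    return list(partial.items())


def reducer(pairs):
    """Add up the partials one last time, then divide once."""
    return {key: total / count for key, (total, count) in combiner(pairs)}


task_1 = [("t", 10), ("t", 20)]
task_2 = [("t", 60)]

averages = reducer(combiner(mapper(task_1)) + combiner(mapper(task_2)))
```

Averaging the two per-task averages directly would give 37.5 here, while carrying sums and counts yields the true mean of 30.0.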
3
votes
0 answers
Why are combiner input records more than mapper output records?
The combiner works on the output records of the mapper. If the mapper output records are fed to the combiner, then why are my combiner input records more than the mapper output records?
I got these 80 extra records. I have no idea where they came from & what…
shriyog
3
votes
5 answers
Which gets a chance to execute first, the Combiner or the Partitioner?
I'm confused after reading the passage below from Hadoop: The Definitive Guide, 4th edition (page 204):
Before it writes to disk, the thread first divides the data into
partitions corresponding to the reducers that they will ultimately be
sent to.…
Prashant
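The order described in the quoted passage (partition first, then sort within each partition, then combine the sorted output) can be sketched in plain Python. This is a simulation under that assumption, not Hadoop's spill code; `NUM_REDUCERS`, `partition`, and `spill` are hypothetical names, and the CRC-based hash merely stands in for a real partitioner.

```python
import zlib
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

NUM_REDUCERS = 2  # hypothetical number of reduce tasks


def partition(key):
    """Stand-in partitioner: deterministic hash of the key modulo reducers."""
    return zlib.crc32(key.encode("utf-8")) % NUM_REDUCERS


def spill(buffer):
    """Partition the buffer, sort each partition by key, then combine it."""
    partitions = defaultdict(list)
    for key, value in buffer:                   # step 1: partition
        partitions[partition(key)].append((key, value))

    spilled = {}
    for part, pairs in partitions.items():
        pairs.sort(key=itemgetter(0))           # step 2: sort within partition
        spilled[part] = [                       # step 3: combine sorted runs
            (key, sum(value for _, value in group))
            for key, group in groupby(pairs, key=itemgetter(0))
        ]
    return spilled


buffer = [("b", 1), ("a", 1), ("b", 1), ("c", 1), ("a", 1)]
spilled = spill(buffer)
```

Because combining happens per partition, the combiner in this sketch never sees records destined for a different reducer.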
3
votes
2 answers
Partial aggregation vs. Combiners: which one is faster?
There are notes about how Cascading/Scalding optimizes map-side evaluation.
They use so-called Partial Aggregation.
Is it actually a better approach than Combiners? Are there any performance comparisons on some common Hadoop tasks (word count for…
yura
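To make the comparison concrete, here is a plain-Python sketch of partial aggregation, assuming it works as a bounded map-side cache that evicts its least recently used key when full. That assumption, the `capacity` parameter, and the function name are illustrative and are not taken from Cascading's or Scalding's code. It says nothing about which approach is faster.

```python
from collections import OrderedDict


def partial_aggregate(pairs, capacity):
    """Sum values per key in a bounded cache, emitting evicted partial sums."""
    cache = OrderedDict()
    emitted = []
    for key, value in pairs:
        if key in cache:
            cache[key] += value
            cache.move_to_end(key)  # mark the key as most recently used
        else:
            if len(cache) >= capacity:
                # Evict the least recently used key and emit its partial sum.
                emitted.append(cache.popitem(last=False))
            cache[key] = value
    emitted.extend(cache.items())   # flush whatever is left at the end
    return emitted


pairs = [("a", 1), ("b", 1), ("a", 1), ("c", 1), ("a", 1), ("b", 1)]
partials = partial_aggregate(pairs, capacity=2)
```

A key can be emitted more than once with partial sums, so the reducer still has to add them up; the benefit is fewer records without sorting or spilling on the map side.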