Questions tagged [combiners]

104 questions
17
votes
2 answers

"Combiner" class in a MapReduce job

A Combiner runs after the Mapper and before the Reducer. It receives as input all data emitted by the Mapper instances on a given node, then emits output to the Reducers. Also, if a reduce function is both commutative and associative, then it…
wayen wan
  • 207
  • 1
  • 2
  • 7
14
votes
3 answers

On what basis does the MapReduce framework decide whether to launch a combiner?

As per the definition, "The Combiner may be called 0, 1, or many times on each key between the mapper and reducer." I want to know on what basis the MapReduce framework decides how many times the combiner will be launched.
banjara
  • 3,561
  • 2
  • 33
  • 53
13
votes
3 answers

Multiple git commands in a single command, executed in the order they are encountered by the compiler

I have the following list of commands that I run in order so that a source project can be committed and pushed to the repository on Bitbucket: git init git remote add origin https://[BitBucket Username]@bitbucket.org/[BitBucket…
Vicky Dev
  • 1,219
  • 1
  • 18
  • 40
11
votes
4 answers

Can the combiner and reducer be different?

In many MapReduce programs, I see a reducer being used as a combiner as well. I know this is because of the specific nature of those programs. But I am wondering if they can be different.
kee
  • 8,915
  • 18
  • 85
  • 149
10
votes
4 answers

Hadoop combiner sort phase

When running a MapReduce job with a specified combiner, is the combiner run during the sort phase? I understand that the combiner is run on mapper output for each spill, but it seems like it would also be beneficial to run during intermediate steps…
Michael Mior
  • 26,133
  • 8
  • 80
  • 110
5
votes
1 answer

Turn list of key/value pairs into list of values per key in spark

We need to efficiently convert large lists of key/value pairs, like this: val providedData = List( (new Key("1"), new Val("one")), (new Key("1"), new Val("un")), (new Key("1"), new Val("ein")), (new Key("2"), new…
Bradjcox
  • 1,468
  • 1
  • 17
  • 30
5
votes
1 answer

Difference between combiner and in-mapper combiner in MapReduce?

I'm new to Hadoop and MapReduce. Could someone clarify the difference between a combiner and an in-mapper combiner, or are they the same thing?
Billy02
  • 51
  • 1
  • 2
4
votes
1 answer

Have 3 matrices of same dimensions - I want to get the highest value of each cell of the three different matrices

Basically I have 3 matrices of the same dimensions. They only consist of the values 0, 1, 2, 3. I would like to create a new matrix that takes the highest value from each of the corresponding matrices. For example, if the first row of the matrices are…
Ethan Hunt
  • 85
  • 4
4
votes
3 answers

combine array entries with every other entry

Sorry for the title, as it looks like most of the other questions about combining arrays, but I don't know how to word it more specifically. I need a PHP function that combines the entries of one array (dynamic size from 1 to any) to strings in every…
Lucker
  • 43
  • 4
4
votes
4 answers

PHP, make with all other array values

I have to make a complex array with other array values. The original array is: Array ( [0] => Array ( [0] => A [1] => B [2] => C ) [1] => Array ( [0] => D [1] => E [2] => F ) ) I'm…
4
votes
1 answer

Why is the number of combiner input records greater than the number of map output records?

A Combiner runs after the Mapper and before the Reducer. It receives as input all data emitted by the Mapper instances on a given node, then emits output to the Reducers. So the number of combiner input records should be less than the maps…
alex
  • 41
  • 3
4
votes
3 answers

How can I find out if a task is a reducer or a combiner during run time in Hadoop?

If the operation performed with MapReduce is not commutative and associative, then the combiner cannot be the same as the reducer. For example, when calculating an average value, the combiner sums the values for a key and the reducer sums them and…
Calin-Andrei Burloiu
  • 1,411
  • 2
  • 13
  • 25
3
votes
0 answers

Why are there more combiner input records than mapper output records?

The combiner works on the output records of the mapper. If the mapper output records are fed to the combiner, then why are there more combiner input records than mapper output records? I got these 80 extra records. I have no idea where they came from & what…
shriyog
  • 838
  • 13
  • 23
3
votes
5 answers

Who will get a chance to execute first, Combiner or Partitioner?

I'm getting confused after reading the passage below from Hadoop: The Definitive Guide, 4th edition (page 204): Before it writes to disk, the thread first divides the data into partitions corresponding to the reducers that they will ultimately be sent to.…
3
votes
2 answers

Partial aggregation vs. combiners: which one is faster?

There are notes about how Cascading/Scalding optimize map-side evaluation: they use so-called partial aggregation. Is it actually a better approach than combiners? Are there any performance comparisons on some common Hadoop tasks (word count for…
yura
  • 14,149
  • 19
  • 71
  • 123