Questions tagged [emr]

Questions relating to Amazon's Elastic MapReduce (EMR) product.

Amazon Elastic MapReduce is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3).

http://aws.amazon.com/elasticmapreduce/

1202 questions
-3
votes
1 answer

How to run parallel clustering using Amazon EMR / Spark from files in S3

I have 200,000 points in a 1000-dimensional space. If I load all these points using sc.textFile and exhaustively calculate the distance between every pair of points, how can I do it in parallel? Will Spark automatically parallelize the work for me?
Rodrigo Stv
-4
votes
1 answer

How to install mosh on an EMR master

I've been having issues with SSH connections terminating, and thought it might be better to install mosh on the EMR master, so that there would be some protection from loss of connectivity. I've created the following script to act as part of my…
theheadofabroom