Questions tagged [giraph]

Apache Giraph is an iterative graph processing system built for high scalability.

For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.

Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google and described in this paper.

Both systems are inspired by the Bulk Synchronous Parallel model of distributed computation introduced by Leslie Valiant.

The Bulk Synchronous Parallel (BSP) abstract computer is a bridging model for designing parallel algorithms. It differs from the parallel random-access machine (PRAM) by not taking communication and synchronization for granted. An important part of analyzing a BSP algorithm is quantifying the synchronization and communication it needs.
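
As a rough illustration of the BSP structure, here is a minimal single-process Python sketch of vertex-centric supersteps that propagates the maximum value through a graph. It is not Giraph's API: the function name `bsp_max_value`, the dictionary-based graph, and the message counter are assumptions made for this example. Each loop iteration is one superstep (local computation, then message exchange at a barrier), and the returned counts show the synchronization and communication costs that a BSP analysis tries to quantify.

```python
from collections import defaultdict


def bsp_max_value(graph, values):
    """Propagate the maximum value through a graph in BSP-style supersteps.

    graph:  dict mapping every vertex to a list of its neighbours
            (assumed: each neighbour also appears as a key)
    values: dict mapping every vertex to its initial numeric value
    Returns (final_values, supersteps_run, messages_sent).
    """
    values = dict(values)  # work on a copy, leave the caller's dict alone
    messages_sent = 0

    # Superstep 0: every vertex sends its value to all of its neighbours.
    inbox = defaultdict(list)
    for vertex, neighbours in graph.items():
        for neighbour in neighbours:
            inbox[neighbour].append(values[vertex])
            messages_sent += 1
    supersteps = 1

    # Keep going while messages are in flight; a vertex with no incoming
    # messages stays halted for that superstep.
    while inbox:
        outbox = defaultdict(list)
        # Local computation phase: each vertex sees only its own messages.
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                values[vertex] = best
                # Communication phase: notify neighbours of the new value.
                for neighbour in graph[vertex]:
                    outbox[neighbour].append(best)
                    messages_sent += 1
        # Barrier: messages sent now are delivered in the next superstep.
        inbox = outbox
        supersteps += 1

    return values, supersteps, messages_sent


if __name__ == "__main__":
    chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    final, steps, sent = bsp_max_value(chain, {"a": 3, "b": 6, "c": 2})
    print(final, steps, sent)
```

In a real deployment the vertices would be partitioned across workers and the barrier would be a distributed synchronization point; the sketch only mirrors the control flow.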

Giraph adds several features beyond the basic Pregel model, including master computation, sharded aggregators, edge-oriented input, out-of-core computation, and more.

With a steady development cycle and a growing community of users worldwide, Giraph is a natural choice for unleashing the potential of structured datasets at a massive scale.

159 questions
19
votes
3 answers

Neo4j or GraphX / Giraph what to choose?

I have just started exploring graph processing methods and tools. What we basically do is compute some standard metrics such as PageRank, clustering coefficient, triangle count, diameter, connectivity, etc. In the past I was happy with Octave, but when we…
Roman
  • 227
  • 1
  • 2
  • 4
14
votes
2 answers

Whats the difference between .ppk and .pem . Where .pem is stored in amazons ec2 cluster?

I am using Amazon's EC2 cluster for running GraphLab. They want the location of my .pem file, which is my private key. After searching, I still could not find the file in Ubuntu. I am using PuTTY.
Divine Retribution
  • 155
  • 1
  • 1
  • 3
7
votes
2 answers

Gremlin - Giraph - GraphX ? On TitanDb

I need some help to confirm my choice, and any information you can give me. My storage database is TitanDb with Cassandra. I have a very large graph. My goal is to use MLlib on the graph later. My first idea: use Titan with…
dede
  • 91
  • 1
  • 7
5
votes
0 answers

Run giraph on Hadoop yarn 2.6.0

I'm trying to use Giraph on Hadoop 2.6.0 with YARN. I've managed to build it by removing STATIC_SASL_SYMBOL in the yarn profile, with the command: sudo mvn -Phadoop_yarn -Dhadoop.version=2.6.0 -DskipTests package Then I've set up…
5
votes
1 answer

Apache Giraph - Cannot run in split master / worker mode since there is only 1 task at a time

I ran Giraph 1.0.0 with hadoop 2.2.0 using the PageRank Benchmark example here. Suddenly I got this error result: Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, must…
Algorithman
  • 1,101
  • 13
  • 32
5
votes
1 answer

Giraph Shortest Paths Example ClassNotFoundException

I am trying to run the shortest paths example from the giraph incubator (https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example). However instead of executing the example from the giraph-*-dependencies.jar, I have created my own…
alien01
  • 1,330
  • 2
  • 14
  • 31
4
votes
2 answers

Building Giraph with Hadoop

I am trying to set up Giraph with Hadoop 2.7.1. Try as I might, it doesn't seem to work. I have tried following the below…
Ghazanfar
  • 1,353
  • 12
  • 21
4
votes
3 answers

java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1

I'm having some problems with custom classes in Giraph. I made a VertexInput and Output format, but I always get the following error: java.io.IOException: ensureRemaining: Only * bytes remaining, trying to read * with different values where the…
chomp
  • 1,242
  • 12
  • 28
4
votes
1 answer

Maven project generation failure

I used the following command to generate the project: $ mvn archetype:generate The error presented during build failure is: INFO] ---------------------------------------------------------------------------- [INFO] Using following parameters for…
Anish Gupta
  • 263
  • 1
  • 5
  • 18
3
votes
1 answer

Graph OLAP processing - Giraph vs. Tinkerpop3 GraphComputer

My use case is a graph of several hundreds of millions of vertices (say 100M to 1B). Each vertex has a set of 10 properties which are basically scores that are computed based on the weights of the vertex's edges and the scores of the adjacent…
Fabien Coppens
  • 243
  • 3
  • 12
3
votes
0 answers

Custom Graph Partitioning algorithms in Giraph

There have been mentions of using custom partitioning algorithms for Giraph applications. However, how to do so is not clearly documented anywhere. As Castagna pointed out here in how to partition graph for pregel to maximize processing speed?, there may not be…
Sharukh Mohamed
  • 65
  • 1
  • 10
3
votes
1 answer

Which Giraph I/O format can be used for property graph?

There are several built-in input/output formats in Giraph, but all of them support only numerical IDs and values. So is there a way to process a property graph such that both vertices and edges can have multiple keys and values, or anything close? I'm…
Parth
  • 639
  • 7
  • 22
3
votes
1 answer

Could not find or load main class org.apache.giraph.yarn.GiraphApplicationMaster

I am attempting to get Giraph running on a YARN cluster (Hadoop 2.5.2), but I'm stuck at this error: Could not find or load main class org.apache.giraph.yarn.GiraphApplicationMaster I've tried everything I can find in previous messages on this…
mindcrime
  • 615
  • 8
  • 22
3
votes
1 answer

Giraph job never ends

I'm trying to run the SimpleShortestPathsComputation example using the latest Giraph code and Hadoop 2.5.2. My command line looks like this: hadoop jar…
mindcrime
  • 615
  • 8
  • 22
3
votes
1 answer

Giraph: Class not Found Exception on custom Job

I am developing an algorithm using Giraph. I am working with version 1.0.0 on Hadoop 1.2.1. I am pretty new to developing Giraph, so please be gentle ;) My custom job is split into three packages: io: contains the input and output format…
Alessio Arleo
  • 71
  • 1
  • 1
  • 6