Current Version: 1.0.2-BETA
Arabesque is a distributed graph mining system that enables quick and easy development of graph mining algorithms, while providing a scalable and efficient execution engine running on top of Hadoop.
Benefits of Arabesque:
Arabesque is open-source with the Apache 2.0 license.
import io.arabesque.ArabesqueContext
println (s"spark application ID: ${sc.applicationId}")
// arabesque context is built on top of SparkContext
val arab = new ArabesqueContext (sc)
println (s"arabesque context = ${arab}")
// get local path for the sample graph
val localPath = s"${System.getenv ("ARABESQUE_HOME")}/data/citeseer-single-label.graph"
println (s"localPath = ${localPath}")
// a single ArabesqueContext can be used to build several arabesque graphs
val arabGraph = arab.textFile (localPath)
println (s"arabesque graph = ${arabGraph}")
// generating motifs of size 3
val motifs = arabGraph.motifs (3).set ("agg_ic", true).set ("comm_ss", "embedding")
println (s"arabesque result = ${motifs}")
println (motifs.config.getOutputPath)
// embeddings RDD
val embeddings = motifs.embeddings
println (motifs.config.getOutputPath)
println (s"two sample embeddings:\n${embeddings.take(2).mkString("\n")}")
// getting aggregations one at a time, by key
val aggKeys = motifs.registeredAggregations
println (s"aggKeys = ${aggKeys.mkString(" ")}")
val motifsAgg = motifs.aggregation (aggKeys(0))
println (motifsAgg)
// getting all aggregations
val allAggs = motifs.aggregations
println (allAggs)
arab.stop
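
To make the `motifs (3)` step above more concrete, here is a minimal sketch in plain Scala of what counting motifs of size 3 means on an undirected, single-label graph. It is not Arabesque's implementation and uses no Arabesque or Spark calls. The names `Motif3Sketch` and `countMotifs3` and the toy edge list are hypothetical, introduced only for illustration. The sketch assumes that a motif of size 3 is a connected induced subgraph on three vertices, so the only two shapes are a triangle and an open path.

```scala
// Illustrative only: plain Scala, no Arabesque or Spark calls.
// Motif3Sketch and countMotifs3 are hypothetical names, not part of any library.
object Motif3Sketch {
  type Edge = (Int, Int)

  /** Counts connected induced 3-vertex subgraphs by shape: (triangles, open paths). */
  def countMotifs3(edges: Seq[Edge]): (Long, Long) = {
    // Build an undirected adjacency map, ignoring self-loops and duplicate edges.
    val adj: Map[Int, Set[Int]] = edges
      .filter { case (u, v) => u != v }
      .flatMap { case (u, v) => Seq(u -> v, v -> u) }
      .groupBy(_._1)
      .map { case (u, pairs) => u -> pairs.map(_._2).toSet }

    var closedWedges = 0L // wedges whose two endpoints are adjacent
    var openWedges = 0L   // wedges whose two endpoints are not adjacent

    // A wedge is a center vertex together with two distinct neighbors.
    for ((_, neighbors) <- adj) {
      val ns = neighbors.toVector.sorted
      for (i <- ns.indices; j <- (i + 1) until ns.length) {
        if (adj(ns(i)).contains(ns(j))) closedWedges += 1
        else openWedges += 1
      }
    }

    // Each triangle is seen once from each of its three vertices;
    // each open path is seen exactly once, from its middle vertex.
    (closedWedges / 3, openWedges)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical toy graph: triangle 0-1-2 plus a pendant edge 2-3.
    val edges = Seq((0, 1), (1, 2), (0, 2), (2, 3))
    val (triangles, paths) = countMotifs3(edges)
    println(s"triangles = $triangles, open paths = $paths")
    // prints "triangles = 1, open paths = 2"
  }
}
```

On the toy graph, the three connected vertex triples are {0, 1, 2} (a triangle), {0, 2, 3} and {1, 2, 3} (both open paths). This brute-force enumeration only suits small graphs held in memory; Arabesque exists to distribute this kind of exploration.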