Some boilerplate

Import the library and set up Spark. Apache Spark is not required for spylon, but it is used here because it makes getting the correct classpath for the Scala parts simpler.


In [1]:
import spylon
import spylon.spark as sp

c = sp.SparkConfiguration()
# Point spylon at a local Spark distribution and run on 4 local cores.
c._spark_home = "/path/to/spark-1.6.2-bin-hadoop2.6"
c.master = ["local[4]"]

In [2]:
(sc, sqlContext) = c.sql_context("MyApplicationName")
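
As a quick check that everything is wired up, the returned pair behaves like an ordinary PySpark SparkContext and SQLContext. A minimal sketch (the example DataFrame here is illustrative, not from the original session):

# sc and sqlContext are regular PySpark objects.
print(sc.version)  # the version of the Spark distribution above, e.g. 1.6.2
df = sqlContext.createDataFrame([(1, 'a'), (2, 'b')], ['id', 'label'])
df.show()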

In [4]:
from spylon.spark.spark import SparkJVMHelpers
helpers = SparkJVMHelpers(sc)

Simple calls

Create a new Java object


In [6]:
rand = helpers.jvm.java.util.Random()

In [8]:
rand


Out[8]:
JavaObject id=o19
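
The py4j repr only shows an opaque object id. Calling toString() on the proxy returns the JVM-side representation (output illustrative):

# toString() is forwarded to the Java object itself.
print(rand.toString())  # e.g. java.util.Random@1a2b3c4d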

You can also get documentation for a class instance.


In [11]:
print(rand.__doc__)


Help on class Random in package java.util:

Random implements java.io.Serializable {
|  
|  Methods defined here:
|  
|  nextBoolean() : boolean
|  
|  nextBytes(byte[]) : void
|  
|  nextDouble() : double
|  
|  nextFloat() : float
|  
|  nextGaussian() : double
|  
|  nextInt(int) : int
|  
|  nextInt() : int
|  
|  nextLong() : long
|  
|  setSeed(long) : void
|  
|  ------------------------------------------------------------
|  Fields defined here:
|  
|  ------------------------------------------------------------
|  Internal classes defined here:
|  
}

Call a method on that Java object.


In [10]:
rand.nextInt(10000)


Out[10]:
4047
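
Static methods are reachable the same way: helpers.jvm gives access to any class on the driver classpath. A small sketch using standard JDK classes:

# Call static JDK methods directly through the py4j JVM view.
millis = helpers.jvm.java.lang.System.currentTimeMillis()
biggest = helpers.jvm.java.lang.Math.max(3, 7)  # py4j resolves the overload; returns 7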

Conversions from Python to Scala types

Conversion from Python types to a Scala Seq


In [13]:
o = helpers.to_scala_seq([1, 2, 3, 4])

In [14]:
o


Out[14]:
JavaObject id=o29

In [7]:
o.getClass().toString()


Out[7]:
u'class scala.collection.convert.Wrappers$JListWrapper'

In [8]:
o.toString()


Out[8]:
u'Buffer(1, 2, 3, 4)'

In [9]:
o.toList().toString()


Out[9]:
u'List(1, 2, 3, 4)'
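
Since the wrapper really is a Scala collection, Scala's own collection methods can be invoked on it from Python. A brief sketch (the methods shown are illustrative):

o.mkString(", ")  # u'1, 2, 3, 4', via Scala's TraversableOnce.mkString
o.length()        # 4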

Python dictionaries to Scala Maps


In [10]:
m = helpers.to_scala_map({'a': 1, 'b': 2})

In [11]:
m.toString()


Out[11]:
u'Map(b -> 2, a -> 1)'

In [12]:
c = m.getClass()

In [13]:
c.getCanonicalName()


Out[13]:
u'scala.collection.convert.Wrappers.JMapWrapper'
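
The wrapped Map likewise exposes Scala's Map API, so lookups work from the Python side. A minimal sketch:

m.apply('a')     # 1, equivalent to m("a") in Scala
m.contains('b')  # True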