Scala Example for SFTP server

This example shows how to use Spark Generic Connector with Apache Spark to process files stored in a directory on an SFTP server.

Load dependencies for Apache Spark 2.x


In [ ]:
%dep

z.load("alvsanand:spark-generic-connector:0.2.0-spark_2x-s_2.11")

Import dependencies


In [ ]:
import es.alvsanand.sgc.ftp.{FTPCredentials, FTPSlot}
import es.alvsanand.sgc.ftp.secure.{SFTPSgcConnectorFactory, SFTPParameters, KeyConfig}
import org.apache.spark.streaming.sgc._

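Create the SgcConnectorParameters with the desired parameters, replacing the upper-case placeholders (HOST, PORT, DIRECTORY, USER and so on) with your own values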

  • Using user and password authentication:

In [ ]:
val parameters = SFTPParameters("HOST", PORT, "DIRECTORY", FTPCredentials("USER", Option("PASSWORD")))
  • Using private key authentication:

In [ ]:
val parameters = SFTPParameters("HOST", PORT, "DIRECTORY", FTPCredentials("USER"),
                                   pconfig = Option(KeyConfig("PRIVATE_KEY_URL", "PUBLIC_KEY_URL")))
  • Using encrypted private key authentication:

In [ ]:
val parameters = SFTPParameters("HOST", PORT, "DIRECTORY", FTPCredentials("USER"),
                                   pconfig = Option(KeyConfig("PRIVATE_KEY_URL", "PUBLIC_KEY_URL",
                                                       Option("PRIVATE_KEY_PASSWORD"))))
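
The placeholders above do not have to be hard-coded in the notebook. A minimal sketch, assuming the connection settings are supplied through environment variables, is shown below. The `SftpSettings` case class, the `loadSftpSettings` helper and the `SFTP_*` variable names are hypothetical and not part of Spark Generic Connector; port 22 is used as the default because it is the standard SSH/SFTP port.

```scala
// Hypothetical container for the connection settings (not part of the library).
case class SftpSettings(host: String, port: Int, directory: String,
                        user: String, password: Option[String])

// Hypothetical helper: resolves the settings from a key/value map such as sys.env,
// falling back to a default when a key is missing.
// Note: a non-numeric SFTP_PORT value makes toInt throw a NumberFormatException.
def loadSftpSettings(env: Map[String, String]): SftpSettings =
  SftpSettings(
    host = env.getOrElse("SFTP_HOST", "localhost"),
    port = env.get("SFTP_PORT").map(_.toInt).getOrElse(22),
    directory = env.getOrElse("SFTP_DIRECTORY", "/"),
    user = env.getOrElse("SFTP_USER", "anonymous"),
    password = env.get("SFTP_PASSWORD")
  )

val settings = loadSftpSettings(sys.env)

// With the imports from this notebook in scope, the values could then be passed on:
// val parameters = SFTPParameters(settings.host, settings.port, settings.directory,
//                                 FTPCredentials(settings.user, settings.password))
```

Taking the map as an argument, rather than reading `sys.env` inside the helper, keeps the lookup easy to check with a hand-written `Map`.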

Create the RDD passing the SgcConnectorFactory and the parameters


In [ ]:
val rdd = sc.createSgcRDD(SFTPSgcConnectorFactory, parameters)

Use the RDD as desired


In [ ]:
// Inspect the slot assigned to each partition of the RDD
rdd.partitions.map(_.asInstanceOf[SgcRDDPartition[FTPSlot]].slot)
// Print the first 10 records
rdd.take(10).foreach(println)