Saturday, January 14, 2017

Creating a SparkContext

In this post I am going to explain what a SparkContext is and how to create one in Scala, Python and Java.

SparkContext is the entry point to Spark for a Spark application and establishes a connection to the cluster. Through the SparkContext you can get and set the configuration used to run or deploy the application, create distributed objects, schedule and cancel jobs, and much more.

Now we will see how to create a new SparkContext in Scala, Python and Java.

Scala
import org.apache.spark.{SparkConf, SparkContext}

// 1. Create Spark configuration
val conf = new SparkConf()
  .setAppName("Your Spark Application Name")
  .setMaster("local[*]")  // local mode

// 2. Create Spark context
val sc = new SparkContext(conf)

Python
from pyspark import SparkContext
from pyspark import SparkConf

# 1. Create Spark configuration
conf = (SparkConf()
        .setAppName("Your Spark Application Name")
        .setMaster("local[*]"))  # local mode

# 2. Create Spark context
sc = SparkContext(conf=conf)

Java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// 1. Create Spark configuration
SparkConf conf = new SparkConf()
  .setAppName("Your Spark Application Name")
  .setMaster("local[*]");  // local mode

// 2. Create Spark context
JavaSparkContext sc = new JavaSparkContext(conf);

Once a SparkContext instance is created, you can use it to create RDDs, accumulators and broadcast variables, access Spark services and run jobs, until the SparkContext is stopped.
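
As a quick illustration, here is a minimal Scala sketch that reuses the sc created above (it assumes the Spark 2.x accumulator API): it creates an RDD, an accumulator and a broadcast variable, runs a small job and then stops the context.

// Assumes the Scala SparkContext `sc` created above (Spark 2.x API)

// 1. Create an RDD from a local collection
val numbers = sc.parallelize(1 to 100)

// 2. Create an accumulator and a broadcast variable
val sum = sc.longAccumulator("sum")
val factor = sc.broadcast(2)

// 3. Run a job that uses both
numbers.foreach(n => sum.add((n * factor.value).toLong))
println(s"Sum of doubled numbers: ${sum.value}")

// 4. Stop the context when the application is done
sc.stop()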

Enjoy Spark!
