kafkaSendDataPy

This notebook sends data to Kafka on the topic 'test'. A message that gives the current time is sent every second

Add dependencies


In [ ]:
import os
os.environ['PYSPARK_SUBMIT_ARGS'] = '--conf spark.ui.port=4041 --packages org.apache.kafka:kafka_2.11:0.10.0.0,org.apache.kafka:kafka-clients:0.10.0.0  pyspark-shell'

Load modules and start SparkContext

Note that SparkContext must be started to effectively load the package dependencies. One core is used.


In [ ]:
from pyspark import SparkContext
sc = SparkContext("local[1]", "KafkaSendStream") 
from kafka import KafkaProducer
import time

Start Kafka producer

One message giving current time is sent every second to the topic test


In [ ]:
producer = KafkaProducer(bootstrap_servers='localhost:9092')
while True:
    message=time.strftime("%Y-%m-%d %H:%M:%S")
    producer.send('test', message)
    time.sleep(1)

In [ ]: