Apache Spark Training Course

Course Language

This course is delivered in English.

Course Code

Duration

14 hours (usually 2 days including breaks)

Course Outline

Why Spark?

  • Problems with Traditional Large-Scale Systems
  • Introducing Spark

Spark Basics

  • What is Apache Spark?
  • Using the Spark Shell
  • Resilient Distributed Datasets (RDDs)
  • Functional Programming with Spark

Working with RDDs

  • RDD Operations
  • Key-Value Pair RDDs
  • MapReduce and Pair RDD Operations

The Hadoop Distributed File System

  • Why HDFS?
  • HDFS Architecture
  • Using HDFS

Running Spark on a Cluster

  • Overview
  • A Spark Standalone Cluster
  • The Spark Standalone Web UI

Parallel Programming with Spark

  • RDD Partitions and HDFS Data Locality
  • Working With Partitions
  • Executing Parallel Operations

Caching and Persistence

  • RDD Lineage
  • Caching Overview
  • Distributed Persistence

Writing Spark Applications

  • Spark Applications vs. Spark Shell
  • Creating the SparkContext
  • Configuring Spark Properties
  • Building and Running a Spark Application
  • Logging

Spark, Hadoop, and the Enterprise Data Center

  • Overview
  • Spark and the Hadoop Ecosystem
  • Spark and MapReduce

Spark Streaming

  • Spark Streaming Overview
  • Example: Streaming Word Count
  • Other Streaming Operations
  • Sliding Window Operations
  • Developing Spark Streaming Applications

Common Spark Algorithms

  • Iterative Algorithms
  • Graph Analysis
  • Machine Learning

Improving Spark Performance

  • Shared Variables: Broadcast Variables
  • Shared Variables: Accumulators
  • Common Performance Issues

Guaranteed to run even with a single delegate!
Public Classroom
Participants come from multiple organisations. Topics usually cannot be customised.
From $6150
Private Classroom
Participants are from one organisation only. No external participants are allowed. The course is usually customised to a specific group; topics are agreed between the client and the trainer.
From $6150
Private Remote
The instructor and the participants are in different physical locations and communicate via the Internet.
From $3850

The more delegates, the greater the savings per delegate. The table shows the price per delegate and is for illustration purposes only; actual prices may differ.

Number of Delegates   Public Classroom   Private Classroom   Private Remote
1                     $6150              $6150               $3850
2                     $3495              $3420               $2270
3                     $2610              $2510               $1743
4                     $2168              $2055               $1480

Course Discounts

Course                                           Venue           Course Date                 Course Price [Remote/Classroom]
CP306A: Google Container Engine and Kubernetes   Remote Course   Fri, Aug 26 2016, 9:30 am   $1220 / N/A
Forecasting with R                               Remote Course   Tue, Aug 30 2016, 9:30 am   $2450 / N/A
SQL Fundamentals                                 Remote Course   Fri, Sep 16 2016, 9:30 am   $750 / N/A

Upcoming Courses

Venue                                   Course Date                 Course Price [Remote/Classroom]
SK, Saskatoon                           Mon, Aug 15 2016, 9:30 am   $3850 / $6850
BC, Vancouver Park Place                Tue, Aug 16 2016, 9:30 am   $3850 / $6250
ON, Ottawa - Fairmont Chateau Laurier   Tue, Aug 16 2016, 9:30 am   $3850 / $6550
MB, Winnipeg - 201 Portage Avenue       Tue, Aug 16 2016, 9:30 am   $3850 / $6350
NS, Halifax - Purdy's Wharf             Thu, Aug 18 2016, 9:30 am   $3850 / $6350
