USA: +1 734 418 2465 | India: +91 40 4018 1306 | info@learntek.org


LEARNTEK

Scala and Spark Training


Product Description

What is Scala?

Scala is a modern multi-paradigm programming language designed to express common programming patterns in a concise, elegant, and type-safe way. The name comes from “Scalable Language”. Scala is a hybrid language that smoothly integrates the features of object-oriented and functional programming, and it compiles to run on the Java Virtual Machine. Scala was created by Martin Odersky and first released in 2003.
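
As a small illustration of how object-oriented and functional features can sit side by side, here is a minimal sketch. The `Course` class, the sample data, and the function names are hypothetical and were invented for this example; they are not part of the course material.

```scala
// A case class is an object-oriented construct: an immutable data type with named fields.
final case class Course(title: String, hours: Int)

object ScalaTaste {
  // Hypothetical sample data, invented for this illustration.
  val courses: List[Course] = List(
    Course("Scala Basics", 10),
    Course("Spark Core", 12),
    Course("Spark SQL", 8)
  )

  // Functional style: a higher-order function (map) with a function literal, no mutable state.
  def totalHours(cs: List[Course]): Int = cs.map(_.hours).sum

  // Pattern matching with a guard; the compiler checks the field types statically.
  def describe(c: Course): String = c match {
    case Course(title, hours) if hours >= 10 => s"$title (long module, $hours h)"
    case Course(title, hours)                => s"$title (short module, $hours h)"
  }

  def main(args: Array[String]): Unit = {
    println(totalHours(courses))
    courses.map(describe).foreach(println)
  }
}
```

The same program in Java would typically need explicit getters, constructors, and loops, which is one reason Scala code is often described as concise.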

Why Scala?

There are several reasons to learn Scala.

Many established companies that depend on Java for business-critical applications are turning to Scala to boost development productivity, application scalability, and overall reliability.

Scala is a type-safe JVM language that incorporates both object-oriented and functional programming features into a concise, logical, and powerful language.

Scala offers a “better Java” alternative by keeping its syntax close to Java’s, which minimizes the learning difficulty.

Scala was created specifically with the goal of being a better language, avoiding the features of Java that are restrictive, overly tedious, or frustrating.

Scala is a much cleaner and better-organized language that is ultimately easier to use and increases productivity.

What is Spark?

Spark is a cluster computing technology designed for fast computation. It builds on the ideas of Hadoop MapReduce and extends the MapReduce model so that it can be used efficiently for more types of computation, including interactive queries and stream processing. Spark can use Hadoop in two ways: for storage and for processing. Because Spark has its own cluster management, it typically uses Hadoop for storage only.

Spark was started in 2009 at UC Berkeley’s AMPLab by Matei Zaharia. It was open sourced in 2010 under a BSD license and donated to the Apache Software Foundation in 2013. Apache Spark became a top-level Apache project in February 2014.

Why Spark?

Spark was created to speed up the kind of large-scale computation traditionally done with Hadoop.

The main feature of Spark is its in-memory cluster computing, which greatly increases the processing speed of an application.

Spark is designed to cover a wide range of workloads, such as batch applications, iterative algorithms, interactive queries, and streaming applications, which reduces the management burden of maintaining separate tools.

Apache Spark also has the following features.

  • Speed: Spark can run an application on a Hadoop cluster up to 100 times faster than MapReduce in memory, and up to 10 times faster on disk, by reducing the number of disk read/write operations and keeping intermediate processing data in memory.
  • Supports multiple languages: Spark comes with 80 high-level operators for interactive querying and provides built-in APIs for application development in Java, Scala, and Python.
  • Advanced Analytics: Spark supports not only ‘map’ and ‘reduce’ operations but also SQL queries, streaming data, machine learning (ML), and graph algorithms.
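
The features above can be sketched with a classic word count using Spark's RDD API in Scala. This is a minimal sketch, not course material: it assumes Apache Spark (the `spark-core` library) is on the classpath, and the object name, function name, and sample sentences are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCountSketch {
  // Counts how often each word occurs in the given lines.
  def countWords(sc: SparkContext, lines: Seq[String]): Map[String, Int] = {
    // Transformations are lazy: nothing is computed until an action is called.
    val counts = sc.parallelize(lines)
      .flatMap(_.split(" "))     // split each line into words
      .map(word => (word, 1))    // the "map" step: emit (word, 1) pairs
      .reduceByKey(_ + _)        // the "reduce" step: sum the counts per word
      .cache()                   // ask Spark to keep the result in memory for reuse

    // Two actions reuse the cached RDD instead of recomputing it from the input.
    println(s"distinct words: ${counts.count()}")
    counts.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark inside this JVM on all cores; no Hadoop cluster is required.
    val conf = new SparkConf().setAppName("WordCountSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical input, invented for this illustration.
      val result = countWords(sc, Seq("spark is fast", "scala is concise", "spark uses scala"))
      result.foreach { case (word, n) => println(s"$word: $n") }
    } finally {
      sc.stop() // always release Spark resources
    }
  }
}
```

The `cache()` call is where the in-memory behavior shows up: without it, each action would recompute the whole chain of transformations from the input.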

 

The following topics will be covered in our Scala and Spark Online Training:

Introduction to Scala

Overview of Scala

Installing Scala

Scala Basics

IDE for Scala

Scala Programming

Variables & Methods

Literals

Reserved Words

Operators

Precedence Rules

If Expression

For Expression

Exception handling with Try Expression

Match Expression

While Loops

Do-While Loops

Implicit Conversion

Functions in Scala

Methods

First class Function

Higher Order Methods

Function Literal

Partially Applied Function

Tail Recursion

Closure

Currying

Control Abstraction

Traits & OOP in Scala

Traits

Classes & Objects

Abstract Class

Access Modifiers

Functional Programming

Scala Class Hierarchy

Package and Imports

Case Class & Pattern Matching

Pattern type

Pattern Guard

Sealed Class

Option Type

Extractor

Scala Collection

Immutable and Mutable Collections

Array

Sets

Lists

Tuples

Maps

Introduction to Spark

Problems with Traditional Large-Scale Systems

Introducing Spark

What is Spark?

Spark Basics

Spark Installation

Configure HDP 2.4 (or 2.5) on local machine

Spark Shell

Storage layers for Spark

Overview of Spark architecture

Initializing a SparkContext and building applications

IDEs for Spark Applications

SBT and its overview

IntelliJ

Eclipse

Resolving dependencies for Spark applications

RDDs

RDD Basics

RDD transformations and Actions

Lazy evaluation

Element-wise transformations

Pair RDDs

Key-Value Pair RDD

Creating Pair RDDs

Transformations on Pair RDD

Grouping, Joining, Sorting on Pair RDD

Data Partitioning

Determining a partitioner of Pair RDD

Operations that Benefit from Partitioning

Operations that Affect Partitioning

Page Rank Example

Advanced Concepts in Spark

Accumulator

Broadcast

Working on per-partition basis

Launching Spark on cluster

Configure and launch Spark Cluster on AWS

Configure and launch Spark Cluster on Microsoft Azure

Running Spark on Cluster

Spark Runtime Architecture

Driver

Executor

Cluster Manager

Components of Execution: Job, Stage and Task

Spark Web UI

Driver and Executor logs

Spark-submit command

Caching and Persistence

RDD Lineage

Caching Overview

Distributed Persistence

Spark Algorithms

Spark SQL

Spark Streaming

MLlib

GraphX

Duration & Timings:

Duration – 30 Hours.

Course Fee: $300 (Discount Offer)

Training Type: Online Live Interactive Session.

Faculty: Experienced.

Weekend Session – Sat – Sun, 9:30 AM – 12:30 PM EST – 5 Weeks. December 2, 2017.

Weekend Session – Sat – Sun, 9:30 AM – 12:30 PM EST – 5 Weeks. January 20, 2018.

If you have any questions, please submit an inquiry.


About Learntek

Learntek is a global online training provider for Big Data, Hadoop, Data Analytics, and other IT and Management courses. We are dedicated to designing, developing, and implementing training programs for students, corporate employees, and business professionals.

Our job is to make sure your training and learning experience is everything it should be – exciting, enjoyable, stimulating and successful.