Hadoop Development MODULES

JAVA

Module - 1

Duration 53 mins

Java Essentials

  • What is Java?
  • What are the JRE and JDK?
  • What is the JVM?
  • How does Java work?
  • Installation of Java and Eclipse IDE
  • Data types in Java

Module - 2

Duration 2 hrs 40 mins

OOP Concepts

  • Classes and Objects
  • Methods
  • Constructors
  • Arrays
  • The this keyword
  • The super and final keywords

Module - 3

Duration 1 hr 30 mins

OOP Concepts Part 2

  • Inheritance
  • Concept of Polymorphism
  • Abstract class
  • Interface
  • StringTokenizer
  • BufferedReader
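As a taste of the last two topics, here is a minimal sketch that reads text line by line with BufferedReader and splits each line with StringTokenizer. The class name TokenizeLines and the sample input are hypothetical and chosen only for illustration; the course's own exercises may differ.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical example class, not taken from the course material.
public class TokenizeLines {

    // Reads the text one line at a time and collects the
    // whitespace-separated tokens of every line, in order.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                StringTokenizer st = new StringTokenizer(line);
                while (st.hasMoreTokens()) {
                    out.add(st.nextToken());
                }
            }
        } catch (IOException e) {
            // A StringReader should not fail, but readLine() declares IOException.
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("hello hadoop\nhello java"));
    }
}
```

The same line-then-token loop is the shape a word-count mapper typically takes later in the MapReduce modules.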

Module - 4

Duration 55 mins

Collections

  • List
  • Set
  • Map
  • ArrayList
  • HashMap
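The collection types listed above can be sketched together in a few lines. This is an illustrative example under assumed names (CollectionsDemo, countWords, distinct), not material from the course itself: a List holds the input in order, a Set removes duplicates, and a Map counts occurrences.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical example class, not taken from the course material.
public class CollectionsDemo {

    // Counts how often each word occurs, using a HashMap.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // merge() inserts 1 for a new key, otherwise adds 1 to the old value.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Returns the distinct words, using a HashSet (iteration order is unspecified).
    static Set<String> distinct(List<String> words) {
        return new HashSet<>(words);
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("pig", "hive", "pig"));
        System.out.println(countWords(words));
        System.out.println(distinct(words));
    }
}
```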

MEET HADOOP

Module - 5

Duration 1 hr 20 mins

Meet Hadoop

  • What is Data? (25 mins)
  • Classification of Data (27 mins)
  • Detailed description of Structured Data (20 mins)
  • Sources of Structured Data (8 mins)
  • Detailed description of Semi-Structured Data
  • Sources of Semi-Structured Data

Module - 6

Duration 47 mins

Hadoop Architecture Part 1

  • Architecture of Hadoop (20 mins)
  • Detailed description of Hadoop ecosystem (27 mins)
  • Classification of Hadoop Ecosystem
  • Introduction of different components of Hadoop: Hive, Pig, Sqoop, HBase, Flume, etc.
  • Hadoop Core Components: HDFS and MapReduce
  • Detailed description of HDFS

Module - 7

Duration 37 mins

Hadoop Architecture Part 2

  • Processing Data with Hadoop (21 mins)
  • Introduction to MapReduce (16 mins)
  • Languages used in MapReduce
  • MapReduce Daemons
  • Introduction to Job Tracker
  • Introduction to Task Tracker

Module - 8

Duration 23 mins

Environment Setup

  • Operational modes of Hadoop (23 mins)
  • Description of Standalone mode
  • Description of Pseudo-Distributed mode
  • Description of Fully Distributed mode
  • Environment Setup of Hadoop
  • Installation of Cloudera VM

Module - 9

Duration 1 hr

HDFS

  • Browsing the HDFS (20 mins)
  • HDFS Commands and Operations (22 mins)
  • Listing all the Hadoop file system Commands (18 mins)
  • Checking version of Hadoop
  • How to run a JAR file in Hadoop
  • Making a directory in HDFS

MAPREDUCE PROGRAMMING

Module - 10

Duration 1 hr 10 mins

MapReduce Programming Part 1

  • Introduction to MapReduce (27 mins)
  • High level view of MapReduce Processing (30 mins)
  • Descriptive Details of various steps involved in MapReduce (22 mins)
  • MapReduce version 1
  • Issues in MRv1: Need for MRv2
  • MRv2: YARN

Module - 11

Duration 1 hr 35 mins

MapReduce Programming Part 2

  • Browsing the HDFS (20 mins)
  • HDFS Commands and Operations (22 mins)
  • Listing all the Hadoop file system Commands (18 mins)
  • Checking version of Hadoop
  • How to run a JAR file in Hadoop
  • Making a directory in HDFS

Module - 12

Duration 1 hr 40 mins

MapReduce Programming Part 3

  • Introduction to MapReduce (27 mins)
  • High level view of MapReduce Processing (30 mins)
  • Descriptive Details of various steps involved in MapReduce (22 mins)
  • MapReduce version 1
  • Issues in MRv1: Need for MRv2
  • MRv2: YARN

Module - 13

Duration 1 hr

MapReduce Programming Part 4

  • Use case of Partitioner: Segregating patient data based on some condition using partitioners (29 mins)
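The routing idea behind this use case can be sketched in plain Java, without a Hadoop dependency. The outline does not say which condition is used to segregate the patient data, so the age bands below are purely an assumption for illustration, as are the class name AgeBandPartitioner and the simplified method signature. In Hadoop itself the equivalent logic would live in a subclass of Partitioner, inside its getPartition(key, value, numPartitions) method.

```java
// Hypothetical example class, not taken from the course material.
public class AgeBandPartitioner {

    // Maps a patient's age to a reducer index in [0, numPartitions).
    // The three age bands are an assumed condition chosen for illustration.
    static int getPartition(int age, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        int band;
        if (age < 18) {
            band = 0;      // minors
        } else if (age < 60) {
            band = 1;      // adults
        } else {
            band = 2;      // seniors
        }
        // Fold the band into the available range when fewer reducers are configured.
        return band % numPartitions;
    }

    public static void main(String[] args) {
        int[] ages = {10, 35, 70};
        for (int age : ages) {
            System.out.println(age + " -> partition " + getPartition(age, 3));
        }
    }
}
```

With three reducers, each age band ends up in its own output file; with fewer reducers, the modulo keeps the returned index valid at the cost of mixing bands.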

Module - 14

Duration 1 hr 15 mins

Advanced MapReduce Part 1

  • What are Custom data types? (29 mins)
  • Common rules for creating Custom data types (42 mins)
  • Use case: Processing online music data using custom data types (5 mins)

Module - 15

Duration 1 hr 30 mins

Advanced MapReduce Part 2

  • What are Counters? (25 mins)
  • Built-in counters (31 mins)
  • Types of Counters and their description (11 mins)
  • User-defined counters
  • Use case 1: Processing weblog entries using counters
  • Use case 2: Processing customer complaint data using counters

Module - 16

Duration 50 mins

Advanced MapReduce Part 3

  • What is Distributed Cache?
  • Setting up the cache for a job
  • Use case: Processing news data using distributed cache

Module - 17

Duration 1 hr

Advanced MapReduce Part 4

  • Joining data in MapReduce (28 mins)
  • What are map joins? (34 mins)
  • How data is joined on the map side
  • Use case: Joining enterprise datasets

Module - 18

Duration 43 mins

Advanced MapReduce Part 5

  • Operational modes of Hadoop (23 mins)
  • Description of Standalone mode
  • Description of Pseudo-Distributed mode
  • Description of Fully Distributed mode
  • Environment Setup of Hadoop
  • Installation of Cloudera VM

Module - 19

Duration 2 hrs 10 mins

Advanced MapReduce Part 6

  • What are Sequence files?
  • Formats of Sequence files
  • Structure of sequence files with and without record compression
  • Structure of sequence files with and without block compression
  • Sequence file header
  • Writing a sequence file

PIG

Module - 20

Duration 2 hrs

Pig Part 1

  • Why do we need Pig? (37 mins)
  • Why should we go for Pig when we have MapReduce? (19 mins)
  • What is Pig? (13 mins)
  • Where to use Pig? (13 mins)
  • Anatomy of Pig (15 mins)
  • Pig on Hadoop

Module - 21

Duration 41 mins

Pig Part 2

  • Use case: Processing the weblogs using Pig

Module - 22

Duration 1 hr 20 mins

Pig Part 3

  • Functions in Pig (31 mins)
  • AVG (19 mins)
  • CONCAT (30 mins)
  • COUNT
  • MAX
  • MIN

HIVE

Module - 23

Duration 45 mins

Hive Part 1

  • What is Hive? (45 mins)
  • Hive query language
  • Why Hive when Pig is there?
  • Hive vs. Pig
  • Brief History of Hive
  • Features of Hive

Module - 24

Duration 2 hrs 30 mins

Hive Part 2

  • Hive query language (29 mins)
  • Database commands (27 mins)
  • Creating a database (30 mins)
  • Listing all databases (30 mins)
  • Using a specific database (26 mins)
  • Tables: Managed and External Tables

Module - 25

Duration 40 mins

Hive Part 3

  • Use case: Processing stocks data to calculate covariance

Module - 26

Duration 32 mins

Hive Part 4

  • Hive UDFs
  • Use case: Analyzing news data using user-defined functions (32 mins)

HBASE

Module - 27

Duration 1 hr 17 mins

HBase Part 1

  • What are NoSQL databases? (29 mins)
  • Types of NoSQL databases (15 mins)
  • NoSQL technology landscape (33 mins)
  • Limitations of Hadoop
  • What is HBase?
  • Brief History of HBase

Module - 28

Duration 3 hrs

HBase Part 2

  • HBase Data Coordinates (21 mins)
  • Multi Map Structure (29 mins)
  • Java client APIs (27 mins)
  • Creating a table using the Java API (27 mins)
  • Listing tables using the Java API (35 mins)
  • Disabling a table using the Java API (27 mins)

SQOOP

Module - 29

Duration 1 hr 30 mins

Sqoop Part 1

  • Why Sqoop? (29 mins)
  • What is Sqoop? (22 mins)
  • How does Sqoop work? (26 mins)
  • Sqoop import and Sqoop export (16 mins)
  • Controlling Parallelism
  • Direct mode of importing data

Module - 30

Duration 2 hrs

Sqoop Part 2

  • Importing data into HBase (22 mins)
  • Sqoop-import-all-tables (36 mins)
  • Sqoop-export (36 mins)
  • Sqoop-job
  • Saved jobs and incremental imports
  • Sqoop-eval

FLUME

Module - 31

Duration 2 hrs 50 mins

Flume

  • What is Flume? (33 mins)
  • Why Flume? (32 mins)
  • Advantages of Flume (13 mins)
  • Architecture of Flume (25 mins)
  • Flume event (23 mins)
  • Flume agents (23 mins)

PROJECT

Module - 32

Duration 1 hr 50 mins

Project 1: Processing Web Log Data

  • Generating dummy web log data using a script (27 mins)
  • Loading the data into Hive (28 mins)
  • Analysing the DDoS attacks
  • Plotting the refined data in Power View (30 mins)

Module - 33

Duration 1 hr 50 mins

Project 2: Sentiment Analysis

  • Popup Handling
  • Managing different Windows
  • Difference between Close and Quit
  • Concept of WebTables
  • Dynamic WebTable Handling
  • Extracting Data From WebTable