Get an early-bird 15% discount - batch starting on 29th Aug (IST)
  • US Contact Number: +1(917)-745-8787

  • India Contact Number: (+91) 8968585110

Hadoop Development Course Details

Total Duration: 46+ hrs

Total Modules: 32

Java

Duration 53 mins

Module - 1

Java Essentials

  • Topics
  • What is Java?
  • What are JRE and JDK?
  • What is JVM?
  • How does Java work?
  • Installation of Java and Eclipse IDE
  • Data types in Java
  • Variables
  • Operators and their types

Duration 2 hrs 40 mins

Module - 2

OOPs Concepts

  • Topics
  • Classes and Objects
  • Methods
  • Constructors
  • Arrays
  • this keyword
  • super and final keywords
  • Control flow statements
  • if else
  • switch
  • for loop
  • while loop
  • do...while loop
  • Modifiers
  • Concept of static and non-static
  • Packages

Duration 1 hr 30 mins

Module - 3

OOPs Concepts Part 2

  • Topics
  • Inheritance
  • Concept of Polymorphism
  • Abstract class
  • Interface
  • StringTokenizer
  • BufferedReader

Duration 55 mins

Module - 4

Collections

  • Topics
  • List
  • Set
  • Map
  • ArrayList
  • HashMap

Meet Hadoop

Duration 1 hr 20 min

Module - 5

Meet Hadoop

  • Topics
  • What is Data? (25 mins)
  • Classification of Data (27 mins)
  • Detailed description of Structured Data (20 mins)
  • Sources of Structured Data (8 mins)
  • Detailed description of Semi-Structured Data
  • Sources of Semi-Structured Data
  • Detailed description of Unstructured Data
  • Sources of Unstructured Data
  • How do we deal with Unstructured Data?
  • Definition of Big Data
  • Concept of the 3 V's in Big Data: Variety, Volume and Velocity
  • Extracting Business Value from Data
  • Characteristics of Big Data
  • Real-world example of Big Data Analytics
  • Importance of Big Data Analytics
  • Distributed File System and why we need it
  • What is Hadoop?
  • Brief History of Hadoop
  • Characteristics of Hadoop
  • Difference between RDBMS and Hadoop
  • What is Grid Computing?
  • Grid Computing v/s Hadoop
  • Why do we need Hadoop?

Duration 47 min

Module - 6

Hadoop Architecture Part 1

  • Topics
  • Architecture of Hadoop (20 mins)
  • Detailed description of Hadoop ecosystem (27 mins)
  • Classification of Hadoop Ecosystem
  • Introduction of different components of Hadoop: Hive, Pig, Sqoop, HBase, Flume, etc.
  • Hadoop Core Components: HDFS and MapReduce
  • Detailed description of HDFS
  • Hadoop Nomenclature
  • Daemons in HDFS: NameNode, DataNode, Secondary NameNode
  • HDFS architecture
  • Detailed description of various nodes
  • Communication between nodes
  • Anatomy of File Read
  • Anatomy of File Write
  • Concept of Rack Awareness in Hadoop

Duration 37 min

Module - 7

Hadoop Architecture Part 2

  • Topics
  • Processing Data with Hadoop (21 mins)
  • Introduction to MapReduce (16 mins)
  • Languages used in MapReduce
  • MapReduce Daemons
  • Introduction to Job Tracker
  • Introduction to Task Tracker
  • Interaction between Job Tracker and Task Tracker
  • How does MapReduce work?
  • Hadoop versions
  • Limitations of Hadoop 1.0 Architecture
  • HDFS federation in Hadoop
  • Features of HDFS 2
  • Active and Passive NameNodes and their interaction
  • What is YARN and its functionality?
  • YARN Daemons: ResourceManager, NodeManager, ApplicationMaster

Duration 23 min

Module - 8

Environment Setup

  • Topics
  • Operational modes of Hadoop (23 mins)
  • Description of Standalone mode
  • Description of Pseudo Distributed mode
  • Description of Fully distributed mode
  • Environment Setup of Hadoop
  • Installation of Cloudera VM
  • Description of different configuration files

Duration 1 hr

Module - 9

HDFS

  • Topics
  • Browsing the HDFS (20 mins)
  • HDFS Commands and Operations (22 mins)
  • Listing all the Hadoop file system Commands (18 mins)
  • Checking version of Hadoop
  • How to run a Jar file in Hadoop
  • Making a directory in HDFS
  • Creating a file in HDFS
  • Listing files and directories in HDFS
  • Checking the file size
  • Reading the contents of a file
  • Copying a file from local filesystem to HDFS
  • Copying a file from HDFS to local filesystem
  • Put and Get commands
  • Removing a file from HDFS
  • Removing a directory from HDFS

MapReduce Programming

Duration 1 hr 10 mins

Module - 10

MapReduce Programming Part 1

  • Topics
  • Introduction to MapReduce (27 mins)
  • High level view of MapReduce Processing (30 mins)
  • Descriptive Details of various steps involved in MapReduce (22 mins)
  • MapReduce version 1
  • Issues in MRv1: Need of MRv2
  • MRv2: YARN
  • Detailed description of ResourceManager
  • Detailed description of NodeManager
  • Execution flow of a YARN application
  • Life cycle of a Job
  • MapReduce JobHistory Server
  • Input and Output of MapReduce Job
  • What are input splits?
  • Relationship between Input Split and HDFS blocks

Duration 1 hr 35 mins

Module - 11

MapReduce Programming Part 2

  • Topics
  • What is Map Task? (30 mins)
  • Detailed explanation of Mapper (29 mins)
  • What is Reduce Task? (34 mins)
  • Detailed explanation of Reducer
  • What is Writable?
  • Providing data to the Mapper
  • How data is read by Mapper
  • Input Format Class
  • FileInputFormat and its types
  • Step by step understanding of MapReduce Program using Word Count Example
  • MapReduce job
  • Combiners in MapReduce
  • Detailed working of Combiner
  • Partitioners in MapReduce
  • Detailed working of Partitioner
  • Job Configurations

Duration 1 hr 40 mins

Module - 12

MapReduce Programming Part 3

  • Topics
  • Hands On session (32 mins)
  • Writing a Word Count Program (21 mins)
  • Use Case: Processing customer complaints info using combiner (32 mins)

Duration 1 hr

Module - 13

MapReduce Programming Part 4

  • Topics
  • Use case of Partitioner: Segregating patient data based on some condition using partitioners (29 mins)

Duration 1 hr 15 mins

Module - 14

Advanced MapReduce Part 1

  • Topics
  • What are Custom data types? (29 mins)
  • Common rules for creating Custom data types (42 mins)
  • Use case: Processing Online music data using custom data types (5 mins)

Duration 1 hr 30 mins

Module - 15

Advanced MapReduce Part 2

  • Topics
  • What are Counters? (25 mins)
  • Built in counters (31 mins)
  • Types of Counters and their description (11 mins)
  • User defined counters
  • Use case 1: Processing weblog entries using counters
  • Use case 2: Processing Customer complaint data using counters

Duration 50 mins

Module - 16

Advanced MapReduce Part 3

  • Topics
  • What is Distributed Cache?
  • Setting up the cache for a job
  • Use case: Processing news data using distributed cache

Duration 1 hr

Module - 17

Advanced MapReduce Part 4

  • Topics
  • Joining data in MapReduce (28 mins)
  • What are Map Joins? (34 mins)
  • How the data is joined at the map side
  • Use Case: Joining Enterprise datasets

Duration 43 mins

Module - 18

Advanced MapReduce Part 5

  • Topics
  • What are Reduce Joins?
  • How the data is joined at the reduce side
  • Use Case: Joining customer transaction datasets
  • Reduce Joins

Duration 2 hrs 10 mins

Module - 19

Advanced MapReduce Part 6

  • Topics
  • What are Sequence files?
  • Formats of Sequence files
  • Structure of Sequence files with and without record compression
  • Structure of Sequence files with and without block compression
  • Sequence file header
  • Writing a sequence file
  • Reading a sequence file
  • Use case: Reading and writing a sequence file
  • Use case: Image processing using sequence file

PIG

Duration 2 hrs

Module - 20

Pig Part 1

  • Topics
  • Why do we need Pig? (37 mins)
  • Why should we go for Pig when we have MapReduce? (19 mins)
  • What is Pig? (13 mins)
  • Where to use Pig? (13 mins)
  • Anatomy of Pig (15 mins)
  • Pig on Hadoop
  • Basic Program Structure in Pig
  • Data Models
  • Execution modes of Pig
  • Relational operators in Pig Latin
  • Loading and storing in Pig
  • Filtering in Pig
  • Joins in Pig
  • Sorting of data using Pig
  • Combining and Splitting data
  • File loaders in Pig Latin
  • Filter Operator in Pig Latin
  • Foreach operator
  • Group Operator
  • Order By Operator
  • Diagnostic Operators
  • Piggy Bank
  • Use case: Word count program using Pig

Duration 41 mins

Module - 21

Pig Part 2

  • Topics
  • Use case: Processing the weblogs using Pig

Duration 1 hr 20 mins

Module - 22

Pig Part 3

  • Topics
  • Functions in Pig (31 mins)
  • AVG (19 mins)
  • CONCAT (30 mins)
  • COUNT
  • MAX
  • MIN
  • SIZE
  • TOKENIZE
  • SUM
  • User defined functions in Pig
  • Use case: Data correction using UDFs

HIVE

Duration 45 mins

Module - 23

Hive Part 1

  • Topics
  • What is Hive? (45 mins)
  • Hive query language
  • Why Hive when Pig is there?
  • Hive v/s Pig
  • Brief History of Hive
  • Features of Hive
  • Data units in Hive
  • Hive Architecture
  • Metastores
  • Embedded Metastore
  • Local Metastore
  • Remote Metastore
  • Hive Data Types
  • Collection Data Types
  • File formats in Hive
  • Starting a Hive Shell

Duration 2 hrs 30 mins

Module - 24

Hive Part 2

  • Topics
  • Hive Query Language (29 mins)
  • Database commands (27 mins)
  • Creating a database (30 mins)
  • Listing all databases (30 mins)
  • Using a specific database (26 mins)
  • Tables: Managed and External Tables
  • Views
  • Inserting output into another Table
  • Inserting output into HDFS
  • Inserting output in local filesystem
  • Hive Script
  • Differences from traditional RDBMS
  • Partitions in Hive
  • Bucketing in Hive
  • Joins
  • Serialization Formats
  • GUI - Hue
  • Beeswax

Duration 40 mins

Module - 25

Hive Part 3

  • Topics
  • Use Case: Processing Stocks Data to calculate covariance

Duration 32 mins

Module - 26

Hive Part 4

  • Topics
  • Hive UDFs
  • Use case: Analyzing news data using user defined functions (32 mins)

HBASE

Duration 1 hr 17 mins

Module - 27

HBase Part 1

  • Topics
  • What are NoSQL databases? (29 mins)
  • Types of NoSQL databases (15 mins)
  • NoSQL technology landscape (33 mins)
  • Limitations of Hadoop
  • What is HBase?
  • Brief History of HBase
  • Companies using HBase
  • HDFS v/s HBase
  • HBase Architecture
  • Master, Region Servers and Regions
  • Region Server Components
  • Logical Architecture
  • Storage in HBase
  • HBase Shell
  • HBase Shell Commands
  • Table Management Commands
  • Data Manipulation Commands

Duration 3 hrs

Module - 28

HBase Part 2

  • Topics
  • HBase Data Coordinates (21 mins)
  • Multi Map Structure (29 mins)
  • Java Client APIs (27 mins)
  • Creating a table using Java API (27 mins)
  • Listing tables using Java API (35 mins)
  • Disabling a table using Java API (27 mins)
  • Alter using Java API
  • Deleting a column family
  • Create/Save data to HBase
  • Filters
  • Bulk Loading into HBase
  • Built-in TSV bulk loader

SQOOP

Duration 1 hr 30 min

Module - 29

Sqoop Part 1

  • Topics
  • Why Sqoop? (29 mins)
  • What is Sqoop? (22 mins)
  • How does Sqoop work? (26 mins)
  • Sqoop import and Sqoop export (16 mins)
  • Controlling Parallelism
  • Direct mode of importing data
  • Staging table - Auxiliary Table
  • Incremental Imports
  • Insert and Updates in Sqoop
  • Importing data into Hive

Duration 2 hrs

Module - 30

Sqoop Part 2

  • Topics
  • Importing data into HBase (22 mins)
  • Sqoop-import-all-tables (36 mins)
  • Sqoop-export (36 mins)
  • Sqoop-job
  • Saved jobs and incremental imports
  • Sqoop-eval

FLUME

Duration 2 hrs 50 mins

Module - 31

Flume

  • Topics
  • What is Flume? (33 mins)
  • Why Flume? (32 mins)
  • Advantages of Flume (13 mins)
  • Architecture of Flume (25 mins)
  • Flume event (23 mins)
  • Flume agents (23 mins)
  • Source, Channel and Sink
  • Additional components of Flume
  • Data flow in Flume
  • Multiplexing
  • Working with Flume
  • Flume configuration file
  • Different sources, sinks and channels in Flume
  • Sequence generator source example
  • Spooling Directory source example

PROJECT

Duration 1 hr 50 mins

Module - 32

Project 1: Processing web log data

  • Topics
  • Generating web log dummy data using a script (27 mins)
  • Loading the data into Hive (28 mins)
  • Analysing the DDoS attacks
  • Plotting the refined data in Power View (30 mins)

Duration 1 hr 50 mins

Module - 33

Project 2: Sentiment Analysis

  • Topics
  • Getting live data from Twitter about launch of a movie (30 mins)
  • Loading the data into Hive (34 mins)
  • Assigning the polarity to each tweet
  • Plotting the polarity in Power View
