
CETPA INFOTECH PVT. LTD.

CURRICULUM OF BIG DATA HADOOP


Duration: 3 MONTHS

Introduction to Big Data
• What is RDBMS?
• What is Big Data?
• Problems with RDBMS and other existing systems
• Requirement for a new approach
• Solution to the problem of handling huge data
• Difference between relational databases and NoSQL databases
• Need for NoSQL databases
• Problems in processing Big Data with traditional systems
• How to process and store Big Data?
• Where to use Hadoop?
Hadoop Basic Concepts
• What is Hadoop?
• Why use Hadoop?
• Architecture of Hadoop
• Difference between Hadoop 1.x and Hadoop 2.x
• What is YARN?
• Advantages of Hadoop 2.x over Hadoop 1.x
• Use cases for Hadoop
• Components of Hadoop
• Hadoop Distributed File System (HDFS)
• Map Reduce
Hadoop Distributed File System
• Components of HDFS
• What was the need for HDFS?
• DataNode, NameNode, and Secondary NameNode
• High Availability and Fault Tolerance
• Command-line interface
• Data Ingestion
• Hadoop Commands

Hadoop Cluster
• Installation of Hadoop
• Understanding the configuration of Hadoop
• Starting the Hadoop-related processes
• Visualization of Hadoop in the UI
• Writing files to HDFS (see the sketch below)
• Reading files from the Hadoop cluster
• Workflow of a job
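For illustration, a minimal sketch of writing a file to HDFS and reading it back with the Hadoop FileSystem Java API; the NameNode address and file path below are assumptions for a single-node setup, not part of the course material.

// Minimal sketch: write a small text file to HDFS and read it back.
// "hdfs://localhost:9000" and the path are assumed values for illustration.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadWriteDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000"); // assumed NameNode address

        try (FileSystem fs = FileSystem.get(conf)) {
            Path file = new Path("/user/demo/sample.txt");

            // Write a file to HDFS
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.write("Hello HDFS".getBytes(StandardCharsets.UTF_8));
            }

            // Read the same file back
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
                System.out.println(in.readLine());
            }
        }
    }
}

The equivalent shell operations (hadoop fs -put, hadoop fs -cat) are covered under "Hadoop Commands" above.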
Map Reduce Programming
• Overview of Map Reduce
• History of Map Reduce
• Flow of Map Reduce
• Working of Map Reduce with a simple example (word-count sketch below)
• Difference between the Map phase and the Reduce phase
• Concept of the Partitioner and Combiner phases in Map Reduce
• Submission of a Map Reduce job to the Hadoop cluster and its completion
• File support in Hadoop
• Achieving different goals using Map Reduce programs
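As the simple example referred to above, a minimal word-count sketch using the org.apache.hadoop.mapreduce API; the input and output HDFS paths are passed as arguments, and the reducer is reused as the combiner.

// Minimal word-count sketch: the Map phase emits (word, 1), the Reduce phase sums the counts.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: emit (word, 1) for every word in the input line
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reduce phase: sum the counts for each word
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class); // combiner phase reuses the reducer
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Packaged into a jar, such a job is submitted to the cluster with hadoop jar (jar name and paths here are hypothetical): hadoop jar wordcount.jar WordCount /input /output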
Sqoop
• What is Sqoop?
• Use cases for Sqoop
• Configuring Sqoop
• Importing and exporting data using Sqoop
• Importing data into Hive using Sqoop
• Code generation using Sqoop
• Using Map Reduce with Sqoop

PIG
• Introduction to Apache Pig
• Architecture of Apache Pig
• Why Pig?
• RDBMS Vs Apache Pig
• Loading data using Pig
• Different modes of execution of Pig commands
• Pig Vs Map Reduce coding
• Diagnostic operations in Pig
• Combining and filtering operations in Pig

Flume
• What is Flume?
• Architecture of Flume
• Why do we need Flume?
• Problems with the traditional export method
• Configuring Flume
• Different channels in Flume
• Importing data using Flume
• Using Map Reduce with Flume

HBASE
• What is HBASE?
• Why is HBASE needed?
• HBASE Architecture and Schema Design
• Column-oriented and row-oriented databases
• HBASE Vs RDBMS
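A minimal sketch of the HBase Java client API, storing one row and reading it back; the table name, column family, and values are assumptions for illustration, and hbase-site.xml is assumed to be on the classpath.

// Minimal sketch: put one cell into an HBase table and get it back by row key.
// Table "employee", family "info", and the data are assumed values.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath

        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("employee"))) {

            // Write a single cell: row key "1001", column family "info", qualifier "name"
            Put put = new Put(Bytes.toBytes("1001"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Asha"));
            table.put(put);

            // Read the cell back by row key
            Result result = table.get(new Get(Bytes.toBytes("1001")));
            byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println(Bytes.toString(name));
        }
    }
}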
HIVE
• Introduction to HIVE
• Architecture of HIVE
• Why HIVE?
• RDBMS Vs HIVE
• Introduction to HiveQL (example below)
• Loading data using HIVE
• HIVE Vs Map Reduce coding
• Different functions supported in HIVE
• Partitioning and Bucketing in HIVE
• Hive built-in operators and functions
• Why do we need Partitioning and Bucketing in HIVE?
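A minimal sketch of running HiveQL through HiveServer2 over JDBC; the connection URL, table, columns, and HDFS file path are assumptions for illustration, and the hive-jdbc driver is assumed to be on the classpath.

// Minimal sketch: create a table, load data, and run an aggregate query with HiveQL over JDBC.
// The URL, table "sales", and "/user/demo/sales.csv" are assumed values.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQlDemo {
    public static void main(String[] args) throws Exception {
        // Default HiveServer2 endpoint on a local single-node setup (assumed)
        String url = "jdbc:hive2://localhost:10000/default";

        try (Connection conn = DriverManager.getConnection(url, "", "");
             Statement stmt = conn.createStatement()) {

            stmt.execute("CREATE TABLE IF NOT EXISTS sales (item STRING, amount DOUBLE) "
                    + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");
            stmt.execute("LOAD DATA INPATH '/user/demo/sales.csv' INTO TABLE sales");

            // HiveQL aggregate query, executed as a Map Reduce (or Tez) job by Hive
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT item, SUM(amount) FROM sales GROUP BY item")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getDouble(2));
                }
            }
        }
    }
}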
Analysis Using R Language
• Introduction to the R language
• Introduction to RStudio
• Why use R?
• R Vs other languages
• Using R to analyze the data extracted using Map Reduce
• Introduction to the ggplot package
• Plotting graphs of the data extracted from Map Reduce using R
MongoDB
• What is MongoDB?
• Difference between MongoDB and RDBMS
• Advantages of MongoDB over RDBMS
• Installing MongoDB
• What are Collections and Documents?
• Creating Databases and Collections
• Working with Databases and Collections
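A minimal sketch of working with databases and collections from the MongoDB Java driver: insert one document and query it back. The connection string, database, collection, and document contents are assumptions for illustration.

// Minimal sketch: insert a document into a collection and read it back.
// "trainingdb", "students", and the document fields are assumed values.
import org.bson.Document;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;

public class MongoDemo {
    public static void main(String[] args) {
        // Assumed local mongod instance on the default port
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("trainingdb");
            MongoCollection<Document> students = db.getCollection("students");

            // Insert one document
            students.insertOne(new Document("name", "Ravi").append("course", "Big Data Hadoop"));

            // Query it back by a field value
            Document found = students.find(new Document("course", "Big Data Hadoop")).first();
            System.out.println(found.toJson());
        }
    }
}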
Mini Project to Use Hadoop and Related Technologies on a Dataset

HEAD OFFICE: 200 Purwavali, 2nd Floor (Opp. Railway Ticket Agency), Railway Road, Ganeshpur,
Roorkee – 247667, Ph. No.: 09219602769, 01332-270218, Fax: 1332-274960
CORPORATE OFFICE: D-58, Sector-2, Near Red FM, Noida-201301, Uttar Pradesh
Contact Us: +91-9212172602, 0120-4535353
BRANCH OFFICE: 401 A, 4th Floor, Lekhraj Khazana, Faizabad Road, Indira Nagar,
Lucknow-220616 (U.P.), Ph. No.: +91-522-6590802, +91-9258017974
BRANCH OFFICE: 105, Mohit Vihar, Near Kamla Palace, GMS Road, Dehradun-248001, UK
Contact: +91-9219602771, 0135-6006070
Toll Free: 1800-8333-999 (from any network)
