
1.4 Hadoop Architecture - HDFS


Leons Petrazickis, Bradley Steinfeld, Marius Butuc

Disclaimer
Copyright IBM Corporation 2012. All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE. IBM, the IBM logo, ibm.com, and DB2 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (R or TM), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at http://www.ibm.com/legal/copytrade.shtml. Other company, product, or service names may be trademarks or service marks of others.

Agenda
- Terminology review
- HDFS
- MapReduce
- Types of nodes
- Topology awareness
- Configuring Hadoop

InfoSphere BigInsights - A Full Hadoop Stack

[Stack diagram]
- User Interface: Management Console, Development Tooling (ODS), Analytics Visualization
- Application: ZooKeeper, Avro, Pig, Hive, MapReduce, AdaptiveMR, Oozie, Jaql
- Analytics: ML Analytics, Text Analytics, Lucene
- Storage: HDFS, HBase, GPFS-SNC
- Data Sources / Connectors: Streams, DataStage, Flume, DB2 LUW, DB2 z, Informix, Netezza, Teradata, Oracle


Agenda
- Terminology review
- HDFS
- MapReduce
- Types of nodes
- Topology awareness
- Configuring Hadoop

Terminology review

[Diagram, built up over several slides: individual nodes (Node 1, Node 2, ... Node n) are grouped into racks (Rack 1, Rack 2, ... Rack n), and the racks together form a Hadoop cluster.]

Hadoop architecture

Two main components:
- Hadoop Distributed File System (HDFS)
- MapReduce Engine

14

Agenda
- Terminology review
- HDFS
- MapReduce
- Types of nodes
- Topology awareness
- Configuring Hadoop

15

Hadoop Distributed File System (HDFS)
- Hadoop file system that runs on top of the existing file system
- Designed to handle very large files with streaming data access patterns
- Uses blocks to store a file or parts of a file
- Can create, delete, and copy files, but NOT update them

16

HDFS - Blocks

File blocks:
- 64 MB (default), 128 MB (recommended); compare to 4 KB blocks in UNIX
- Behind the scenes, one HDFS block is supported by multiple operating system (OS) blocks

[Diagram: one 128 MB HDFS block mapped onto many smaller OS blocks]
17

HDFS - Blocks

Fits well with replication to provide fault tolerance and availability.

Advantages of blocks:
- Fixed size: easy to calculate how many fit on a disk
- A file can be larger than any single disk in the network
- If a file or a chunk of the file is smaller than the block size, only the needed space is used. E.g., a 420 MB file is split as 128 MB + 128 MB + 128 MB + 36 MB.
18
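To see how HDFS has actually split a given file into blocks, the fsck utility (covered later in this deck) can report the block list for a path. A minimal sketch; the file path is illustrative:

hadoop fsck /user/hadoop/big_file.dat -files -blocks -locations
# Lists each block of the file, its size, and the DataNodes holding its replicas.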

HDFS - Replication
- Blocks with data are replicated to multiple nodes
- Allows for node failure without data loss

[Diagram: the same blocks replicated across Node 1, Node 2, and Node 3]
19

Writing a file to HDFS

[Diagram sequence, built up over several slides, illustrating the steps of writing a file to HDFS.]

HDFS Command line interface


Type hadoop from the Linux shell to get different options

31

namenode -format
Before it can be used, a new HDFS installation needs to be formatted

hadoop namenode -format


May need to stop Hadoop first using stop.sh hadoop

32
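Putting the two commands above together, a typical sequence for preparing a brand-new HDFS looks like the sketch below. It assumes the BigInsights start.sh/stop.sh helpers mentioned on the slide are on the PATH; a plain Apache install would use stop-dfs.sh/start-dfs.sh instead. Formatting wipes any existing HDFS metadata, so it is only done for a new installation:

stop.sh hadoop            # stop the Hadoop services first (BigInsights helper script)
hadoop namenode -format   # initialize a brand-new HDFS namespace
start.sh hadoop           # bring the services back up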


fsck - file system check

E.g., to delete corrupted files:

hadoop fsck <path> -delete

34
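A few common fsck invocations, as a sketch (the paths are illustrative):

hadoop fsck /                                         # overall health report for the whole namespace
hadoop fsck /user/hadoop -files -blocks -locations    # per-file and per-block detail
hadoop fsck / -move                                   # move corrupted files to /lost+found
hadoop fsck / -delete                                 # delete corrupted files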

fs - file system shell

File System Shell (fs)

Invoked as follows:

hadoop fs <args>

Example: listing the current directory in HDFS

hadoop fs -ls .
35

fs - file system shell

FS shell commands take URIs as arguments. URI format: scheme://authority/path
- For the local filesystem, the scheme is file
- For HDFS, the scheme is hdfs
- The authority is the hostname and port of the NameNode

hadoop fs -copyFromLocal file://myfile.txt hdfs://localhost:9000/user/keith/myfile.txt

Scheme and authority are optional; defaults are taken from the configuration file core-site.xml.
36
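For example, assuming fs.default.name is set to hdfs://localhost:9000 and the command runs as user keith, the following listings are equivalent (the path is illustrative):

hadoop fs -ls hdfs://localhost:9000/user/keith   # fully qualified URI
hadoop fs -ls /user/keith                        # scheme and authority come from core-site.xml
hadoop fs -ls .                                  # relative paths resolve against the user's HDFS home directory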

fs - file system shell

Many POSIX-like commands:
cat, chgrp, chmod, chown, cp, du, ls, mkdir, mv, rm, stat, tail

Some HDFS-specific commands:
copyFromLocal, put, copyToLocal, get, getmerge, setrep

37

HDFS FS shell commands

copyFromLocal / put: copy files from the local file system into fs

hadoop fs -copyFromLocal <localsrc> ... <dst>

or

hadoop fs -put <localsrc> ... <dst>

38
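A couple of illustrative uses (the local and HDFS paths below are examples only):

hadoop fs -put data.txt /user/hadoop/data.txt                  # copy a single local file into HDFS
hadoop fs -put file1.txt file2.txt /user/hadoop/               # multiple sources: destination must be a directory
echo "hello hdfs" | hadoop fs -put - /user/hadoop/hello.txt    # a localsrc of "-" reads from stdin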

HDFS FS shell commands

copyToLocal / get: copy files from fs into the local file system

hadoop fs -copyToLocal [-ignorecrc] [-crc] <src> <localdst>

or

hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>

39

HDFS FS shell commands

getmerge: get all the files in the directories that match the source file pattern, and merge and sort them into a single file on the local fs. <src> is kept.

hadoop fs -getmerge <src> <localdst>

40
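A typical use is collapsing the part-* files of a MapReduce output directory into one local file; the paths here are illustrative:

hadoop fs -getmerge /user/hadoop/wordcount-output ./wordcount-all.txt
# Concatenates every file under the source directory into a single local file.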

HDFS FS shell commands

setrep: set the replication level of a file.
- The -R flag requests a recursive change of replication level for an entire tree.
- If -w is specified, the command waits until the new replication level is achieved.

hadoop fs -setrep [-R] [-w] <rep> <path/file>

41
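For example (paths and replication factors are illustrative):

hadoop fs -setrep 2 /user/hadoop/scratch/tmp_file    # change the replication of one file
hadoop fs -setrep -R -w 3 /user/hadoop/important     # change a whole tree and wait until done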

HDFS FS shell commands

cat
Usage: hadoop fs -cat URI [URI ...]
Copies source paths to stdout. Example:
hadoop fs -cat hdfs:/mydir/test_file1 hdfs:/mydir/test_file2
hadoop fs -cat file:///file3 /user/hadoop/file4

chgrp
Usage: hadoop fs -chgrp [-R] GROUP URI [URI ...]
Change the group association of files. With -R, make the change recursively through the directory structure.

chmod
Usage: hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]
Change the permissions of files. With -R, make the change recursively through the directory structure.
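The slide shows a cat example; chgrp and chmod follow the same pattern, as in the sketch below (the group name and modes are illustrative):

hadoop fs -chmod 640 /user/hadoop/report.csv       # owner read/write, group read
hadoop fs -chmod -R 750 /user/hadoop/shared        # apply recursively to a directory tree
hadoop fs -chgrp -R analysts /user/hadoop/shared   # change the group association recursively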

HDFS FS shell commands

chown
Usage: hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ...]
Change the owner of files. With -R, make the change recursively through the directory structure.

count
Usage: hadoop fs -count [-q] <paths>
Count the number of directories, files and bytes under the paths that match the specified file pattern. The output columns are: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME. The output columns with -q are: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME.

Example:
hadoop fs -count hdfs:/mydir/test_file1 hdfs:/mydir/test_file2
hadoop fs -count -q hdfs:/mydir/test_file1

HDFS FS shell commands

cp
Usage: hadoop fs -cp URI [URI ...] <dest>
Copy files from source to destination. This command allows multiple sources as well, in which case the destination must be a directory. Example:
hadoop fs -cp hdfs:/mydir/test_file file:///home/hdpadmin/foo
hadoop fs -cp file:///home/hdpadmin/foo file:///home/hdpadmin/boo hdfs:/mydir

du
Usage: hadoop fs -du URI [URI ...]
Displays the aggregate length of files contained in the directory, or the length of the file in case it is just a file.

HDFS FS shell commands

dus
Usage: hadoop fs -dus <args>
Displays a summary of file lengths.

expunge
Usage: hadoop fs -expunge
Empty the Trash
When a file is deleted by a user or an application, it is not immediately removed from HDFS. Instead, HDFS first renames it to a file in the /trash directory. The file can be restored quickly as long as it remains in /trash. A file remains in /trash for a configurable amount of time
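A sketch of how the trash and expunge interact, assuming trash is enabled via fs.trash.interval (the exact trash location varies by Hadoop version; a common layout is a .Trash directory under the user's HDFS home):

hadoop fs -rm /user/hadoop/old.log   # with trash enabled, the file is moved into the trash, not removed
hadoop fs -expunge                   # force a trash checkpoint so older trash contents are purged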

HDFS FS shell commands

ls
Usage: hadoop fs -ls <args>
For a file, returns stat on the file with the following format:
permissions number_of_replicas userid groupid filesize modification_date modification_time filename
For a directory, it returns the list of its direct children, as in Unix. A directory is listed as:
permissions userid groupid modification_date modification_time dirname

Example:
hadoop fs -ls hdfs:/mydir/test_file

lsr
Usage: hadoop fs -lsr <args>
Recursive version of ls. Similar to Unix ls -R. Example:
hadoop fs -lsr hdfs:/mydir

HDFS FS shell commands

mkdir
Usage: hadoop fs -mkdir <paths>
Takes path URIs as arguments and creates directories. The behavior is much like Unix mkdir -p, creating parent directories along the path.

Example:
hadoop fs -mkdir hdfs:/mydir/foodir hdfs:/mydir/boodir

HDFS FS shell commands

mv
Usage: hadoop fs -mv URI [URI ...] <dest>
Moves files from source to destination. This command allows multiple sources as well, in which case the destination needs to be a directory. Moving files across filesystems is not permitted. Example:
hadoop fs -mv file:///home/hdpadmin/test_file file:///home/hdpadmin/test_file1
hadoop fs -mv hdfs:/mydir/file1 hdfs:/mydir/file2 hdfs:/mydir2

HDFS FS shell commands

rm
Usage: hadoop fs -rm [-skipTrash] URI [URI ...]
Delete files specified as args; directories are not deleted with this command. If the -skipTrash option is specified, the trash, if enabled, is bypassed and the specified file(s) are deleted immediately. This can be useful when it is necessary to delete files from an over-quota directory. Refer to rmr for recursive deletes. Example:
hadoop fs -rm hdfs:/home/hdpadmin/test_file

HDFS FS shell commands

rmr
Usage: hadoop fs -rmr [-skipTrash] URI [URI ...]
Recursive version of delete. If the -skipTrash option is specified, the trash, if enabled, is bypassed and the specified file(s) are deleted immediately. Example:
hadoop fs -rmr file:///home/hdpadmin/mydir
hadoop fs -rmr -skipTrash hdfs:/mydir

HDFS FS shell commands

stat
Usage: hadoop fs -stat URI [URI ...]
Returns the stat information on the path. Example:
hadoop fs -stat hdfs:/mydir/test_file

tail
Usage: hadoop fs -tail [-f] URI
Displays the last kilobyte of the file to stdout. The -f option can be used as in UNIX. Example:
hadoop fs -tail hdfs:/mydir/test_file

HDFS FS shell commands

test
Usage: hadoop fs -test -[ezd] URI
Options:
-e  check to see if the file exists. Returns 0 if true.
-z  check to see if the file is zero length. Returns 0 if true.
-d  check to see if the path is a directory. Returns 0 if true.

Example:
hadoop fs -test -e hdfs:/mydir/test_file
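Because -test signals its result through the exit code, it composes naturally with shell conditionals. A small sketch (the directory name is illustrative):

if hadoop fs -test -d /user/hadoop/input; then
  echo "input directory already exists"
else
  hadoop fs -mkdir /user/hadoop/input
fi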

Agenda
- Terminology review
- HDFS
- MapReduce
- Types of nodes
- Topology awareness
- Configuring Hadoop

53

MapReduce engine
- Technology from Google
- A MapReduce program consists of map and reduce functions
- A MapReduce job is broken into tasks that run in parallel

54

Agenda
- Terminology review
- HDFS
- MapReduce
- Types of nodes
- Topology awareness
- Configuring Hadoop

55

Types of nodes - Overview

HDFS nodes:
- NameNode (master)
- DataNode (slaves)
- Checkpoint Node
- Secondary NameNode (deprecated)
- Backup Node

MapReduce nodes:
- JobTracker (master)
- TaskTracker (slaves)

56

Types of nodes - Overview

[Diagram, built up over several slides, showing these node types across the cluster.]

Types of nodes - NameNode
- Manages the filesystem namespace and metadata
- No data goes through the NameNode
- Only one per Hadoop cluster
- Single point of failure; mitigated by writing state to multiple filesystems
- Don't use inexpensive commodity hardware for this node: it has large memory requirements

61

Types of nodes - NameNode

The entire metadata is kept in RAM:
- Ensure enough RAM in the NameNode
- If it runs out of RAM, the NameNode will crash

The NameNode's persistent state mainly consists of:
- fsimage: contains the metadata on disk (not an exact copy of what is in RAM, but a checkpoint copy)
- edit logs: record all write operations, synchronized with the metadata in RAM after each write

In case of a power failure on the NameNode:
- Can recover using fsimage + edit logs

The NameNode must be formatted before use:

hadoop namenode -format
62

Types of nodes - Checkpoint Node
- Used to reduce the size of the edit logs
- Periodically creates checkpoints of the NameNode filesystem namespace
- The Checkpoint Node should run on a different machine than the NameNode
- Should have the same storage requirements as the NameNode
- There can be many Checkpoint Nodes per cluster
63


Types of nodes - Secondary NameNode
- Like a Checkpoint Node, but it doesn't copy the new fsimage back to the NameNode
- Keeps the edit logs on the Secondary NameNode under control, but not those on the NameNode
- If there's a problem on the NameNode, it can read from the Secondary NameNode
- Should have the same storage requirements as the NameNode
65

Types of nodes - Backup Node
- Used to reduce the size of the edit logs (like a Checkpoint Node)
- The difference from a Checkpoint Node is that it also keeps an up-to-date copy of the metadata in RAM
- Same RAM requirements as the NameNode
- Can only have one Backup Node per cluster
- If a Backup Node is used, there cannot be Checkpoint Nodes running at the same time
66

Types of nodes - DataNode
- Many per Hadoop cluster
- Manages blocks with data and serves them to clients
- Periodically reports to the NameNode the list of blocks it stores
- Use inexpensive commodity hardware for this node

67

Types of nodes - JobTracker
- One per Hadoop cluster
- Receives job requests submitted by clients
- Schedules and monitors MapReduce jobs on TaskTrackers

68

Types of nodes - TaskTracker
- Many per Hadoop cluster
- Executes MapReduce operations
- Reads blocks from DataNodes

69

Agenda
- Terminology review
- HDFS
- MapReduce
- Types of nodes
- Topology awareness
- Configuring Hadoop

70

Topology awareness (or Rack awareness)

Bandwidth becomes progressively smaller in the following scenarios:
1. Process on the same node
2. Different nodes on the same rack
3. Nodes on different racks in the same data center
4. Nodes in different data centers

75

Agenda
- Terminology review
- HDFS
- MapReduce
- Types of nodes
- Topology awareness
- Configuring Hadoop

76

Configuration modes
Standalone (local) mode:
- Single machine
- No daemons are running
- Everything runs in a single JVM
- Standard OS storage
- Good for development and testing with small data, but will not catch all errors

Configuration modes
Pseudo-distributed mode:
- Single machine, but a cluster is simulated
- Daemons run, in separate JVMs
- Good for development and debugging

Fully-distributed mode:
- Hadoop runs on a cluster of machines
- Daemons run
- Production environment

Configuration files
- hadoop-env.sh: environment variables that are used in the scripts to run Hadoop
- core-site.xml: configuration settings for Hadoop Core, such as I/O settings that are common to HDFS and MapReduce
- hdfs-site.xml: configuration settings for the HDFS daemons: the NameNode, the Secondary NameNode, and the DataNodes
- mapred-site.xml: configuration settings for the MapReduce daemons: the JobTracker and the TaskTrackers
- masters: a list of machines (one per line) that each run a Secondary NameNode
- slaves: a list of machines (one per line) that each run a DataNode and a TaskTracker
- hadoop-metrics.properties: properties for controlling how metrics are published in Hadoop
- log4j.properties: properties for system log files, the NameNode audit log, and the task log for the TaskTracker child process

BigInsights configuration directory: /opt/ibm/biginsights/hadoop-conf

hadoop-env.sh settings
- Most variables keep their defaults and are not set
- Only export JAVA_HOME is required; it should point to the Java JDK
- HADOOP_HEAPSIZE: heap size used by the JVM of each daemon; can be overridden per daemon:
  - NameNode: HADOOP_NAMENODE_OPTS
  - DataNode: HADOOP_DATANODE_OPTS
  - Secondary NameNode: HADOOP_SECONDARYNAMENODE_OPTS
  - JobTracker: HADOOP_JOBTRACKER_OPTS
  - TaskTracker: HADOOP_TASKTRACKER_OPTS
- BIGINSIGHTS_HOME: points to code & config (/opt/ibm/biginsights)
- BIGINSIGHTS_VAR: keeps logs (/var/ibm/biginsights)
- Other environment variables: HADOOP_CLASSPATH, HADOOP_PID_DIR, JAQL_HOME
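An illustrative excerpt from hadoop-env.sh; the JDK path and heap sizes below are examples only, not recommended values:

# hadoop-env.sh (excerpt)
export JAVA_HOME=/opt/ibm/java            # required: location of the Java JDK
export HADOOP_HEAPSIZE=2000               # heap size in MB used by each daemon's JVM
# Per-daemon overrides:
export HADOOP_NAMENODE_OPTS="-Xmx4096m"
export HADOOP_DATANODE_OPTS="-Xmx1024m"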

core-site.xml settings
- fs.default.name: the name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The URI's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class, and the URI's authority is used to determine the host, port, etc. for a filesystem. Default: file:///
- hadoop.tmp.dir: a base for other temporary directories. Default: /tmp/hadoop-${user.name}
- fs.trash.interval: number of minutes between trash checkpoints. If zero, the trash feature is disabled (default). When greater than zero, erased files are moved to the trash directory in the user's home directory.
- io.file.buffer.size: the size of the buffer for use in sequence files. The size of this buffer should be a multiple of the hardware page size (4096 on Intel x86), and it determines how much data is buffered during read and write operations.

core-site.xml settings (continued)
- hadoop.rpc.socket.factory.class.default: default SocketFactory to use. This parameter is expected to be formatted as "package.FactoryClassName".
- hadoop.rpc.socket.factory.class.ClientProtocol: SocketFactory to use to connect to a DFS. If null or empty, use hadoop.rpc.socket.class.default. This socket factory is also used by DFSClient to create sockets to DataNodes.
- hadoop.rpc.socket.factory.class.JobSubmissionProtocol: SocketFactory to use to connect to the JobTracker. If null or empty, uses hadoop.rpc.socket.class.default.

Recommendation: leave all three parameters above empty and mark them as FINAL.

hdfs-site.xml settings
- dfs.data.dir: list of directories where the DataNode stores its data blocks
- dfs.name.dir: list of directories where the NameNode stores its persistent metadata. Recommendation: remote-mount an NFS disk to back up the metadata on the NameNode (soft mount).
- dfs.block.size: HDFS block size. Default is 64 MB. Recommendation: set the block size to 128 MB or as appropriate for your data.

hdfs-site.xml settings (continued)
- dfs.namenode.handler.count: number of threads the NameNode uses to handle requests. Default: 10. Recommendation: increase for larger clusters.
- dfs.replication: the number of times each file block should be replicated in HDFS. Default: 3. Recommendation: set it to 1 when not on a cluster.
- dfs.hosts: name of a file containing an approved list of hostnames allowed to access the NameNode.
- dfs.hosts.exclude: name of a file containing a list of hostnames not allowed to access the NameNode.
- dfs.permissions: enables/disables Unix-like permissions on HDFS. Enabling permissions usually makes things harder to work with while bringing limited advantages (it is not so much about securing things as about preventing users from mistakenly messing up other users' data).
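A minimal hdfs-site.xml sketch tying these properties together; the directory paths are illustrative, and the block size is expressed in bytes (134217728 = 128 MB). It is written here as a shell heredoc purely for compactness:

cat > hdfs-site.xml.sample <<'EOF'
<configuration>
  <property><name>dfs.name.dir</name><value>/hadoop/name,/mnt/nfs-backup/name</value></property>
  <property><name>dfs.data.dir</name><value>/hadoop/data1,/hadoop/data2</value></property>
  <property><name>dfs.block.size</name><value>134217728</value></property>
  <property><name>dfs.replication</name><value>3</value></property>
</configuration>
EOF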

mapred-site configuration
- mapred.hosts: names a file that contains the list of nodes that may connect to the JobTracker. If the value is empty, all hosts are permitted.
- mapred.hosts.exclude: names a file that contains the list of hosts that should be excluded by the JobTracker. If the value is empty, no hosts are excluded.
- mapred.max.tracker.failures: the number of task failures on a TaskTracker for a given job after which new tasks of that job aren't assigned to it. Default is 4.
- mapred.max.tracker.blacklists: the number of blacklists for a TaskTracker by various jobs after which the TaskTracker could be blacklisted across all jobs. The tracker is then given no tasks (for a day); it becomes a healthy tracker again after a restart. Default is 4.

mapred-site configuration (continued)
- mapred.reduce.tasks: the default number of reduce tasks per job. Typically set to 99% of the cluster's reduce capacity, so that if a node fails the reduces can still be executed in a single wave. Ignored when mapred.job.tracker is "local". Default: 1. Recommendation: set it to 90%.
- mapred.map.tasks.speculative.execution: if true, multiple instances of some map tasks may be executed in parallel. Default: true.
- mapred.reduce.tasks.speculative.execution: if true, multiple instances of some reduce tasks may be executed in parallel. Default: true. Recommendation: false.
- mapred.tasktracker.map.tasks.maximum: the maximum number of map tasks that will be run simultaneously by a TaskTracker. Default: 2. Recommendation: set relative to the number of CPUs and amount of memory on each data node.
- mapred.tasktracker.reduce.tasks.maximum: the maximum number of reduce tasks that will be run simultaneously by a TaskTracker. Default: 2. Recommendation: set relative to the number of CPUs and amount of memory on each data node.

mapred-site configuration (continued)
- mapred.jobtracker.taskScheduler: the class responsible for scheduling the tasks. Defaults to the FIFO scheduler. Recommendation: use the Fair Scheduler (org.apache.hadoop.mapred.FairScheduler).
- mapred.jobtracker.restart.recover: recover failed jobs when the JobTracker restarts. For production clusters, recommended to be set to true.
- mapred.local.dir: the local directory where MapReduce stores intermediate data files. May be a comma-separated list of directories on different devices in order to spread disk I/O. Directories that do not exist are ignored. Default: ${hadoop.tmp.dir}/mapred/local

Setting Rack Topology (Rack Awareness)
- Rack topology can be defined by a script which specifies which node is on which rack.
- The script is referenced by the topology.script.file.name property in core-site.xml. Example of the property:

<property>
  <name>topology.script.file.name</name>
  <value>/opt/ibm/biginsights/hadoop-conf/rack-aware.sh</value>
</property>

The network topology script (topology.script.file.name in the example above) receives as arguments one or more IP addresses of nodes in the cluster. It returns on stdout a list of rack names, one for each input. The input and output order must be consistent.
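A minimal sketch of such a topology script (the rack-aware.sh referenced above); the subnets and rack names are purely illustrative:

#!/bin/sh
# Receives one or more node IPs/hostnames as arguments and prints one rack name
# per argument, in the same order. Unknown nodes fall back to a default rack.
for node in "$@"; do
  case "$node" in
    192.168.1.*) echo "/datacenter1/rack1" ;;
    192.168.2.*) echo "/datacenter1/rack2" ;;
    *)           echo "/default-rack" ;;
  esac
done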

Hadoop core lab - Part 1

Thank you!
http://bit.ly/cascon2012 @leonsp, @bsteinfe, @mariusbutuc
