Aim:
To write a word count program to demonstrate the use of Map and Reduce tasks.
Description:
MapReduce is a processing technique and programming model for distributed computing, implemented in Java in the Hadoop framework. The MapReduce algorithm contains two important tasks, namely Map and Reduce.
Map stage : The map or mapper's job is to process the input data. Generally the input
data is in the form of a file or directory and is stored in the Hadoop Distributed File
System (HDFS). The input file is passed to the mapper function line by line. The mapper
processes the data and emits several small chunks of data as intermediate key-value pairs.
Reduce stage : The reducer's job is to process the intermediate key-value pairs produced
by the mappers. After the framework shuffles and sorts these pairs, grouping the values
by key, the reducer aggregates each group and writes the final output back to HDFS.
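The Map and Reduce stages described above can be illustrated without a Hadoop cluster by a plain-Java sketch that mimics the same flow (class and method names here are illustrative, not part of the Hadoop API):

```java
import java.util.*;

// Plain-Java simulation of the MapReduce word-count flow (no Hadoop needed).
public class WordCountSim {

    // "Map" phase: emit a (word, 1) pair for every token of every line.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            StringTokenizer tok = new StringTokenizer(line);
            while (tok.hasMoreTokens()) {
                pairs.add(new AbstractMap.SimpleEntry<>(tok.nextToken(), 1));
            }
        }
        return pairs;
    }

    // "Reduce" phase: group the pairs by word and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    public static Map<String, Integer> countWords(List<String> lines) {
        return reduce(map(lines));
    }

    public static void main(String[] args) {
        // prints {hello=2, world=1}
        System.out.println(countWords(Arrays.asList("hello world hello")));
    }
}
```

For the line "hello world hello", the map phase emits (hello,1), (world,1), (hello,1); the reduce phase then receives hello -> [1, 1] and world -> [1], and emits (hello, 2) and (world, 1).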
4. Add external JARs (all libs in "client"): select all the JARs and click OK once again.
6. Add the program.
Program:
package wordcount;

import java.io.IOException;
import java.util.*;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
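The listing above shows only the package declaration and imports; the class body is not included. A complete version, following the standard Apache Hadoop WordCount tutorial (the package name wordcount matches the imports above; input and output paths are assumed to come from the command line), would look like this:

```java
package wordcount;

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: for each line of input, emit (word, 1) for every token.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sum the counts received for each word.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // Driver: configure and submit the job.
    // args[0] = HDFS input path, args[1] = HDFS output path.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The reducer is also registered as a combiner, which is safe here because summing counts is associative and commutative; this reduces the data shuffled between the map and reduce stages.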
To list files:
hadoop fs -ls <path>
To create a directory:
hadoop fs -mkdir <directory>
To run:
hadoop jar <jar file> <main class> <input path> <output path>
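Putting the commands together, a typical session might look as follows (the jar name wordcount.jar, the main class wordcount.WordCount, and the HDFS paths /input and /output are hypothetical and should be replaced with your own):

```shell
hadoop fs -mkdir /input                  # create the input directory in HDFS
hadoop fs -put input.txt /input          # copy the local input file into HDFS
hadoop fs -ls /input                     # verify the file is present
hadoop jar wordcount.jar wordcount.WordCount /input /output   # run the job
hadoop fs -cat /output/part-r-00000      # view the word counts
```

The output directory must not already exist when the job is submitted; Hadoop creates it and writes one part-r-NNNNN file per reducer.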
Result:
Thus the word count program demonstrating the use of Map and Reduce tasks was
written and executed, and its output was verified.