
Ex No: 9

Word count program to demonstrate the use of Map and Reduce tasks

Date:

Aim:
To write a word count program to demonstrate the use of Map and Reduce tasks.

Description:
MapReduce is a processing technique and a programming model for distributed computing
based on Java. The MapReduce algorithm contains two important tasks, namely Map and
Reduce.

- Map stage: The map or mapper's job is to process the input data. Generally the input
data is in the form of a file or directory and is stored in the Hadoop Distributed File
System (HDFS). The input file is passed to the mapper function line by line. The mapper
processes the data and creates several small chunks of data.

- Reduce stage: This stage is the combination of the Shuffle stage and the Reduce stage.
The Reducer's job is to process the data that comes from the mapper. After processing,
it produces a new set of output, which is stored in HDFS.
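For illustration (this sample input is not part of the original exercise), suppose the
input file contains the single line:

hello world hello

Map output:    (hello, 1), (world, 1), (hello, 1)
After shuffle: hello -> [1, 1], world -> [1]
Reduce output: (hello, 2), (world, 1)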

Steps in Eclipse IDE:

1. File -> New -> Java Project -> Next. Enter "wordcount" as the project name and click Finish.

2. Right-click on the wordcount project and select Properties.

3. Click Add External JARs and browse to File System -> usr -> lib -> hadoop.

4. Select all the JARs and click OK. Once again, click Add External JARs and add all the libraries in the "client" directory.

5. Right-click on src, then New -> Class. Name the class WordCount and click Finish.

6. Add the program given below.

Exporting the jar:

1. Right-click on the wordcount project and select Export -> Java -> JAR file.

2. Select the destination and the files to include.

To view the input file:

cat /home/<location of file>/wordcount.txt
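For illustration, if wordcount.txt holds the sample line from the Description's example,
the cat command would simply print:

hello world hello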

Program:

package wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Split each input line into tokens and emit (word, 1) for every token.
        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        // Sum the counts collected for each word and emit (word, total).
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Job.getInstance replaces the deprecated new Job(conf, name) constructor.
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // The reducer doubles as a combiner: summing is associative and commutative.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // args[0]: HDFS input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // args[1]: HDFS output path (must not exist yet)
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
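Note on the combiner: because IntSumReducer is also set as the combiner, partial sums are
computed on the map side before the shuffle. For the sample line above, the map output
(hello, 1), (hello, 1), (world, 1) would be combined locally into (hello, 2), (world, 1),
reducing the data transferred between nodes. This is safe only because addition is
associative and commutative.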

Creating and setting up the path:

To list files:

hadoop fs -ls

To create a directory:

hadoop fs -mkdir input
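Note: a relative path such as input is created under the user's HDFS home directory,
typically /user/<username>/input. It can be checked with (the <username> placeholder is
illustrative):

hadoop fs -ls /user/<username>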


Execution:

hadoop fs -put /home/loc/wordcount.txt input/wordcount1.txt

hadoop fs -ls input

To Run:

hadoop jar /home/.../wordcount.jar wordcount.WordCount input/wordcount1.txt output


Output Commands:

hadoop fs -ls output

hadoop fs -cat output/part-r-00000
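For the sample line "hello world hello" used above, the cat command would print
tab-separated word counts along the lines of:

hello	2
world	1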

Result:
Thus the word count program to demonstrate the use of Map and Reduce tasks was
written, executed, and the output was verified.
