
Using counters in Hadoop MapReduce

Posted on July 11, 2012


Sometimes when running MapReduce jobs, you want to know whether, or how often, a
certain event has occurred during execution. Imagine an iterative algorithm that
should run until no changes were made to the data during an iteration. Let's
assume that the change happens in the map function.
A common mistake would be to use the context object and set a value in the
configuration object:
context.getConfiguration().set("event", "hasOccured");
This approach only works when executing the job on a single machine. When
running on a cluster of computers, each mapper or reducer has its own copy of
the configuration object, so a value set in one task is not visible to the other
tasks or to the driver. Global reading and writing is therefore not possible.
The configuration can only be used to store information before the job starts
and to pass it to the mappers and reducers, e.g. filenames of auxiliary files.
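To make the problem concrete, here is a minimal plain-Java simulation. It does not use any Hadoop classes: the map-based "configuration" and the `runTask` method are hypothetical stand-ins, and the sketch simply assumes, as described above, that every task works on its own copy of the configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java simulation, NOT Hadoop code: a Map stands in for the job configuration.
class ConfigCopySketch {

    // Hypothetical stand-in for a task: it receives its own copy of the
    // configuration, writes to that copy, and returns it.
    static Map<String, String> runTask(Map<String, String> driverConf) {
        Map<String, String> taskConf = new HashMap<>(driverConf); // per-task copy
        taskConf.put("event", "hasOccured");                      // only the copy changes
        return taskConf;
    }

    public static void main(String[] args) {
        Map<String, String> driverConf = new HashMap<>();
        driverConf.put("auxiliary.file", "lookup.txt"); // stored before the job starts

        Map<String, String> taskConf = runTask(driverConf);

        // The task can read what the driver stored before submission...
        System.out.println("task sees auxiliary.file = " + taskConf.get("auxiliary.file"));
        // ...but the driver never sees what the task wrote.
        System.out.println("driver sees event = " + driverConf.get("event"));
    }
}
```

In this simulation the driver's lookup of "event" yields null, which mirrors why a value set through the context inside a mapper cannot be read back after the job.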
In order to know how often a mapper has changed a data item on a global level,
we will use a so-called counter. First you have to define an enum, which
represents a group of counters. In this example we will only have one counter.
public enum MyCounters {
    Counter
}
From within the map method of our mapper, we can access the counter and
increment it whenever we change a data item. The counter is identified by the
enum value.
context.getCounter(MyCounters.Counter).increment(1);
Finally, we can read the counter after job execution and see whether the data
has changed.
job.getCounters().findCounter(MyCounters.Counter).getValue();
All counters are displayed during job execution.
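For the iterative algorithm from the introduction, the driver would repeat the job until the counter reads zero. The following is only a sketch of that loop in plain Java: the `LongSupplier` is a hypothetical stand-in for "configure and run one job, then return job.getCounters().findCounter(MyCounters.Counter).getValue()", and the `maxIterations` cap is an added safety assumption, not something the approach requires.

```java
import java.util.function.LongSupplier;

// Sketch of an iterative driver loop. The LongSupplier is a hypothetical
// stand-in for "run one MapReduce job and return the value of its counter".
class IterativeDriverSketch {

    // Runs iterations until one reports zero changes or maxIterations is reached.
    // Returns the number of iterations that were executed.
    static int runUntilStable(LongSupplier runJobAndReadCounter, int maxIterations) {
        int iterations = 0;
        while (iterations < maxIterations) {
            long changes = runJobAndReadCounter.getAsLong();
            iterations++;
            if (changes == 0) {
                break; // no mapper changed any data item, so we can stop
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        // Made-up counter values for three successive jobs: 5, 2, then 0 changes.
        long[] fakeCounterValues = {5, 2, 0};
        int[] next = {0};
        int iterations = runUntilStable(() -> fakeCounterValues[next[0]++], 10);
        System.out.println("iterations run: " + iterations);
    }
}
```

With the made-up values above, the loop stops after the third job, because that is the first one whose counter is zero.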
Summary: Counters are a useful feature provided by the Hadoop framework to
globally track certain values during job execution. They can also be used to
count how many damaged or malformed records were in the input data.
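As an illustration of the malformed-record use case, here is a small plain-Java sketch. It runs without Hadoop: the `RecordQuality` enum, the assumed "key, tab, integer" line format, and the `EnumMap` are all made up for this example. In a real mapper, the `counts.merge(...)` call would be replaced by `context.getCounter(...).increment(1)` as shown earlier.

```java
import java.util.EnumMap;
import java.util.Map;

// Plain-Java sketch, NOT Hadoop code: counts well-formed and malformed input lines.
class MalformedRecordSketch {

    // Hypothetical counter group for this illustration.
    enum RecordQuality { WELL_FORMED, MALFORMED }

    // Assumed record format: "<non-empty key>\t<integer value>".
    static boolean isWellFormed(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length != 2 || fields[0].isEmpty()) {
            return false;
        }
        try {
            Integer.parseInt(fields[1]);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Stand-in for the map phase: classify every line and bump the matching count.
    static Map<RecordQuality, Long> classify(String[] lines) {
        Map<RecordQuality, Long> counts = new EnumMap<>(RecordQuality.class);
        for (String line : lines) {
            RecordQuality quality =
                isWellFormed(line) ? RecordQuality.WELL_FORMED : RecordQuality.MALFORMED;
            counts.merge(quality, 1L, Long::sum); // in a mapper: context.getCounter(quality).increment(1)
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] lines = {"a\t1", "b\tx", "no-tab", "c\t7"};
        System.out.println(classify(lines));
    }
}
```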
http://www.antony-neu.com/2012/07/11/using-counters-in-hadoop/
MapReduce Counter
Hadoop MapReduce counters provide a way to measure the progress or the number of
operations that occur within MapReduce programs. The MapReduce framework
provides a number of built-in counters that measure basic I/O operations, such
as FILE_BYTES_READ/WRITTEN and Map/Combine/Reduce input/output records. These
counters are especially useful when you evaluate MapReduce programs. In
addition, MapReduce lets users define their own counters. Since counters are
automatically aggregated over the map and reduce phases, they are one of the
easiest ways to investigate the internal behavior of MapReduce programs. In this
post, I'm going to introduce how to use your own MapReduce counters. The example
sources described in this post are based on the Hadoop 0.21 API.
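To illustrate what "automatically aggregated" means, here is a plain-Java simulation of the idea. It is not how Hadoop implements counters internally, and it uses no Hadoop classes: the `DemoCounter` enum and the per-task maps are hypothetical. It only assumes that each task keeps its own counts and that the job-level value is their sum.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation, NOT the Hadoop API: per-task counts are summed per counter.
class CounterAggregationSketch {

    // Hypothetical counter names, used only for this illustration.
    enum DemoCounter { RECORDS_SEEN, RECORDS_CHANGED }

    // Sums the counters reported by each task into one job-level total.
    static Map<DemoCounter, Long> aggregate(List<Map<DemoCounter, Long>> perTaskCounters) {
        Map<DemoCounter, Long> totals = new EnumMap<>(DemoCounter.class);
        for (Map<DemoCounter, Long> taskCounters : perTaskCounters) {
            for (Map.Entry<DemoCounter, Long> entry : taskCounters.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<DemoCounter, Long> task1 = new EnumMap<>(DemoCounter.class);
        task1.put(DemoCounter.RECORDS_SEEN, 100L);
        task1.put(DemoCounter.RECORDS_CHANGED, 3L);

        Map<DemoCounter, Long> task2 = new EnumMap<>(DemoCounter.class);
        task2.put(DemoCounter.RECORDS_SEEN, 80L); // this task changed nothing

        List<Map<DemoCounter, Long>> perTask = new ArrayList<>();
        perTask.add(task1);
        perTask.add(task2);

        System.out.println(aggregate(perTask));
    }
}
```

The point of the sketch is that the code inside a task only ever increments its local count; the summing across tasks is done for you.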
Incrementing your counter
For your own MapReduce counter, you first define an enum type as follows:
public static enum MATCH_COUNTER {
    INCOMING_GRAPHS,
    PRUNING_BY_NCV,
    PRUNING_BY_COUNT,
    PRUNING_BY_ISO,
    ISOMORPHIC
};
And then, when you want to increment your own counter, you should call the
increment method as follows:
context.getCounter(MATCH_COUNTER.INCOMING_GRAPHS).increment(1);
You can access the context instance within the setup, cleanup, map, and reduce
methods of a Mapper or Reducer class. You get the desired counter by calling the
context.getCounter method with an enum value.
Finding your counter
You can get some Counters from a finished job as follows:
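import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
Cluster cluster = new Cluster(conf);
Job job = Job.getInstance(cluster, conf);
boolean result = job.waitForCompletion(true);  // true if the job succeeded
...
Counters counters = job.getCounters();         // all counters of the finished job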
The Counters instance contains all of the counters obtained from a job. So, when
you want to get your own counter, you should call the findCounter method with an
enum value as follows:
Counter c1 = counters.findCounter(MATCH_COUNTER.INCOMING_GRAPHS);
System.out.println(c1.getDisplayName() + ":" + c1.getValue());
The example below shows how to list all counter groups of a job, including the
built-in groups that Hadoop provides.
for (CounterGroup group : counters) {
    System.out.println("* Counter Group: " + group.getDisplayName()
        + " (" + group.getName() + ")");
    System.out.println("  number of counters in this group: " + group.size());
    for (Counter counter : group) {
        System.out.println("  - " + counter.getDisplayName() + ": "
            + counter.getName() + ": " + counter.getValue());
    }
}
