
Hi,

Greetings from PVR America Inc.

Below is an opening with one of our clients. If you are available and
interested, please reply with your updated resume so that we can discuss
further.

Senior Hadoop Admin


Responsibilities
• Deploying and maintaining Hadoop clusters: adding and removing nodes
using cluster-management tools such as Cloudera Manager, configuring
NameNode high availability, and keeping track of all running Hadoop jobs
• Implementing, managing and administering the overall Hadoop
infrastructure
• Taking care of the day-to-day running of Hadoop clusters
• Working closely with the database, network, BI and application teams to
make sure that all big data applications are highly available and
performing as expected
• Understanding the setup of all configuration files, such as
core-site.xml, hdfs-site.xml, yarn-site.xml and mapred-site.xml
• Capacity planning and estimating the requirements for scaling the
Hadoop cluster down or up
• Ensuring the Hadoop cluster is up and running at all times
• Monitoring cluster connectivity and performance
• Managing and reviewing Hadoop log files
• Backup and recovery tasks
• Resource and security management
• Troubleshooting application errors and ensuring they do not recur
• Upgrading Hadoop cluster services and distributions
• Hadoop development and implementation using tools such as Spark, Scala,
Python, Impala, etc.
• Loading data from disparate data sets using tools such as NiFi, Sqoop
and Spark
• Pre-processing using Hive
• Translating complex functional and technical requirements into detailed
designs
• Performing analysis of vast data stores to uncover insights
• Creating scalable, high-performance web services for data tracking
• High-speed querying
• Taking part in POC efforts to help build new Hadoop clusters
• Testing prototypes and overseeing handover to operational teams
• Proposing best practices and standards
Skillsets
• Excellent knowledge of UNIX/Linux, since Hadoop runs on Linux
• Knowledge of configuration-management and automation tools such as
Puppet or Chef for non-trivial installations
• Knowledge of cluster monitoring tools such as Ambari and Cloudera Manager
• Knowledge of core Java is a plus, but not mandatory
• Good understanding of OS concepts, process management and resource
scheduling
• Basics of networking, CPU, memory and storage
• Good knowledge of shell scripting
• Excellent knowledge of all the components in the Hadoop ecosystem, such
as Hive, Impala, Spark, YARN, MapReduce, HDFS, Sentry, ZooKeeper,
JournalNode, Oozie, etc.
Experience
• 4+ years of experience working in the Hadoop/big data field
• Working experience with tools such as Hive, Spark, Sqoop, Impala, Oozie,
MapReduce, etc.
• 4+ years of hands-on programming experience in Java, Scala, Python and
shell scripting
• Experience in the end-to-end design and build of near-real-time and
batch data pipelines
• Strong experience with SQL and data modelling
• Experience working in an Agile development process and a good
understanding of the phases of the Software Development Life Cycle
• Experience using source code and version control systems such as Git
• Deep understanding of the Hadoop ecosystem and strong conceptual
knowledge of Hadoop architecture components
• Self-starter who works with minimal supervision; able to work in a team
with diverse skill sets
• Ability to comprehend customer requests and provide the correct solution
• Strong analytical mind to help solve complicated problems
• Willingness to dive into and resolve issues

Regards,

Chetana Shah
PVR America Inc.
chetana@pvramerica.com
