
SUMMARY

• Total X years of experience in IT
• Working on Hadoop for 2.5 years
• Working on Python, Spark, and Scala for 1 year
• Strong knowledge of Data Warehousing
• Expertise in ETL

TECHNICAL SKILLS
• Hadoop Ecosystem: HDFS, Pig, Hive, Spark, Scala, Spark SQL, YARN, Sqoop, Hue, Oozie, Shell Script
• Databases/Tools: Oracle, DB2, MySQL, SQL Server 2008, AQT
• Server/Platforms: Unix, CentOS, Windows

WORK EXPERIENCE
Project 1
Company:
Client:
Team Size: 7
Profile: Hadoop and Spark Developer
Duration:
Tools: Oracle, MySQL, HDFS, Hive, Spark SQL, Shell Script

Description: We are building a 360-degree data view of the client's complete business across multiple repositories covering different products and purchases.
Responsibilities:
• Data extraction from different servers into the data lake
• Data processing using Python and Spark (see the sketch after this list)
• Data ingestion into the Hadoop data lake
• Data validation using quality checks
• Job execution monitoring
• Job automation using shell scripts
• Data pipeline optimization
• Working in an Agile model
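A minimal sketch of the extract-and-ingest pattern described above, written in PySpark; the connection URL, table name, and lake path are placeholders rather than the client's actual systems.

    # Illustrative PySpark sketch of the extract/ingest pattern above.
    # Connection details, table names, and paths are placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("data-lake-ingestion")
             .enableHiveSupport()
             .getOrCreate())

    # Extract from a source server over JDBC (placeholder Oracle URL and table)
    source_df = (spark.read.format("jdbc")
                 .option("url", "jdbc:oracle:thin:@//source-host:1521/SERVICE")
                 .option("dbtable", "SALES.ORDERS")
                 .option("user", "etl_user")
                 .option("password", "****")
                 .load())

    # Basic quality check before ingestion: drop rows missing the business key
    clean_df = source_df.dropna(subset=["ORDER_ID"])

    # Ingest into the Hadoop data lake as partitioned Parquet
    clean_df.write.mode("append").partitionBy("ORDER_DATE").parquet("/datalake/raw/orders")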

Project 2
Company:
Client:
Team Size: 5
Profile: Hadoop and Spark Developer
Duration:
Tools: Oracle, HDFS, Pig, Hive, Spark SQL
Database: Oracle
ETL Tools: Pig, Spark SQL
Data Warehouse: Hive
Description: This tool is implemented for banking and financial services and provides several services for batch processing and real-time data streaming. We are working on a stock market project to handle real-time and batch data sourced from Oracle, GMI, Stream core, and the Hadoop ecosystem. For reporting we are using the IBM Cognos tool.

Responsibilities:
• Data mapping across data lake layers
• Worked on more than 70 scenarios for development and enhancement
• Data analysis and pattern comparison
• Transformation script writing in Pig
• Processed datasets in several semi-structured formats in Pig, such as JSON, XML, CSV, and fixed-length files
• Reprocessed bad files in Pig using UDFs
• Implemented custom UDFs in Pig
• Worked on Hadoop and Spark integration from scratch
• Data streaming using Spark
• Used Spark SQL to connect to the Hive warehouse (see the sketch after this list)
• Processed large datasets with Spark Datasets
• Worked on RDDs, transformations, and actions
• Worked with several functions in the Scala library to build Spark applications using Spark SQL
• Linear regression
• Worked in Development, QA, Pre-prod, and Production environments
• Migration from Oracle to Hive using Sqoop
• Daily batch runs and real-time bug fixing
• Worked on critical scenarios and delivered feasible and efficient solutions
• Product development and release management
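A minimal sketch of the Spark SQL-to-Hive pattern referenced in the list above; the database, table, and column names are placeholders, not the project's actual schema.

    # Illustrative sketch: Spark SQL against the Hive warehouse (placeholder names).
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-warehouse-query")
             .enableHiveSupport()   # lets Spark SQL resolve tables from the Hive metastore
             .getOrCreate())

    # Query a Hive table directly with Spark SQL
    trades = spark.sql("SELECT symbol, trade_date, price, volume FROM stock_db.trades")

    # Typical transformation chain: aggregate daily traded value per symbol
    daily_value = (trades
                   .withColumn("value", trades.price * trades.volume)
                   .groupBy("symbol", "trade_date")
                   .sum("value"))

    # Write the result back to the Hive warehouse for reporting
    daily_value.write.mode("overwrite").saveAsTable("stock_db.daily_traded_value")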

Project 3
Company:
Client:
Team Size: 5
Profile: Hadoop Developer
Duration:
Tools: HDFS, MapReduce, Pig, Hive, Hue, Sqoop, Flume, Cloudera cluster
Databases: DB2, SQL Server 2008, Oracle
Description:
Daimler Trucks North America (DTNA) is a brand with several sub-brands that supplies trucks and engines to the market. The main objective of the project is to outsource the strategy, design, development, coding, testing, deployment, and maintenance of the Enterprise Data Repository used for reporting, so the client can gain the required information from their large volumes of data. Data comes from more than 40 source systems, and the tools used in this project are the Hadoop ecosystem tools listed above.

Responsibilities:
• Worked on more than 90 scenarios for development and enhancement purposes
• Worked in Development and Enhancement environments
• Migration from Oracle to Hive (see the sketch after this list)
• Transformation script writing in Pig
• Daily batch runs and real-time bug fixing
• Worked on critical scenarios and delivered feasible and efficient solutions
• Stretched many times to deliver on time
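Sqoop is listed among the project tools for this kind of migration; the sketch below only illustrates the same Oracle-to-Hive copy pattern in PySpark, with placeholder connection details and table names.

    # Illustrative PySpark equivalent of an Oracle-to-Hive copy (the project lists
    # Sqoop for this); connection details and table names are placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("oracle-to-hive")
             .enableHiveSupport()
             .getOrCreate())

    oracle_df = (spark.read.format("jdbc")
                 .option("url", "jdbc:oracle:thin:@//oracle-host:1521/EDR")
                 .option("dbtable", "EDR.SOURCE_TABLE")
                 .option("user", "edr_reader")
                 .option("password", "****")
                 .option("fetchsize", "10000")   # larger fetch size for bulk reads
                 .load())

    # Land the data as a managed Hive table for downstream reporting
    oracle_df.write.mode("overwrite").saveAsTable("edr_staging.source_table")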

Project 4
Company:
Client:
Team Size: 6
Profile: ETL Developer on Informatica PowerCenter and Teradata
Duration:
Tools:
Databases: Oracle, DB2, SQL Server

Description: Gates Corporation is the world's leading manufacturer of power transmission belts and a premier global manufacturer of fluid power products. We worked for the sales, manufacturing, billing, and shipping departments of the client to migrate data from 18 ERPs based on Oracle, SQL Server, and DB2, and loaded the data into the Teradata warehouse in the Staging, Data Store, and Data Warehouse layers.

Responsibilities:
• Worked as an ETL Developer
• Also worked as an ETL tester in Dev and QA environments
• Worked on SCD-1 and SCD-2 (see the sketch after this list)
• Created mapping specification documents
• Worked on Teradata and implemented test cases with mimic queries for mappings
• Worked in Development and Enhancement environments
• Worked on critical scenarios and delivered feasible and efficient solutions
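The SCD-1/SCD-2 handling here was built in Informatica PowerCenter against Teradata; purely to illustrate the SCD Type 2 idea (expire the current row, insert a new current version), below is a minimal PySpark sketch with hypothetical table and column names.

    # Illustrative SCD Type 2 sketch (the project used Informatica/Teradata);
    # customer_dim, customer_updates, and their columns are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scd2-sketch").enableHiveSupport().getOrCreate()

    # Current dimension rows and incoming staged changes
    dim = spark.table("dw.customer_dim").filter("is_current = 1").alias("d")
    stg = spark.table("staging.customer_updates").alias("s")

    # Rows whose tracked attribute changed since the current dimension version
    changed = (dim.join(stg, F.col("d.customer_id") == F.col("s.customer_id"))
                  .filter(F.col("d.address") != F.col("s.address")))

    # SCD-2 step 1: expire the old version of each changed row
    expired = (changed.select("d.*")
                      .withColumn("is_current", F.lit(0))
                      .withColumn("end_date", F.current_date()))

    # SCD-2 step 2: insert a new current version built from the staging record
    new_rows = (changed.select("s.*")
                       .withColumn("is_current", F.lit(1))
                       .withColumn("start_date", F.current_date())
                       .withColumn("end_date", F.lit(None).cast("date")))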
