
ORACLE DATA SHEET

ORACLE BIG DATA CONNECTORS


BIG DATA FOR THE ENTERPRISE

Built from the ground up by Oracle, Oracle Big Data Connectors delivers a high-performance Hadoop to Oracle Database integration solution and enables optimized analysis directly on Hadoop data using Oracle's distribution of open source R. By providing efficient connectivity, Big Data Connectors enables analysis of all data in the enterprise, both structured and unstructured.

Oracle Big Data Connectors

One of the key issues facing organizations in the big data era is how to integrate data stored in current IT systems (such as structured data in relational data warehouses) with data collected in Hadoop, Oracle NoSQL Database, and other NoSQL systems. Oracle Big Data Appliance makes it easy for these organizations to acquire and organize new types of data, while Oracle Big Data Connectors enables an integrated data set for analysis by providing:

Oracle Loader for Hadoop
Oracle Direct Connector for Hadoop Distributed File System (HDFS)
Oracle Data Integrator Application Adapter for Hadoop
Oracle R Connector for Hadoop

Oracle Big Data Connectors bundles all of these capabilities in a single, optimized, and supported product from Oracle.

KEY FEATURES

Oracle Loader for Hadoop delivers high-performance loading of Hadoop data into Oracle Database
Oracle Direct Connector for HDFS provides efficient, high-performance access to Hadoop data from SQL
Oracle Data Integrator natively interprets Hive metadata and generates optimized HiveQL code
Oracle R Connector for Hadoop enables interactive access to Hadoop data from R

KEY BENEFITS

Optimized, high-performance data loading between Hadoop and Oracle Database
Lower CPU utilization on Oracle RDBMS while improving load rates
Optimized connector for Oracle R to analyze raw data on HDFS leveraging Hadoop
Integrated and tested on Big Data Appliance
Easy to use through graphical user interface

Oracle Direct Connector for Hadoop Distributed File System

Oracle Direct Connector for Hadoop Distributed File System (HDFS) is a high-speed connector for accessing data on HDFS directly from Oracle Database. Oracle Direct Connector for HDFS gives users the flexibility of accessing and importing data from HDFS at any time, as needed by their application. It allows the creation of an external table in Oracle Database, enabling direct SQL access to data stored in HDFS. The data stored in HDFS can then be queried via SQL, joined with data stored in Oracle Database, or loaded into Oracle Database. Access to the data on HDFS is parallelized and optimized for fast data movement, with automatic load balancing. Data on HDFS can be in delimited files or in Oracle Data Pump files created by Oracle Loader for Hadoop.
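
As an illustration of the automatic load balancing mentioned for Oracle Direct Connector for HDFS, the following minimal Python sketch spreads a set of HDFS files across parallel readers by size. It is not the connector's actual algorithm, which this data sheet does not describe. The function name `balance_files`, the file names, and the sizes are all hypothetical, and the greedy largest-first heuristic is an assumption chosen for simplicity.

```python
import heapq


def balance_files(file_sizes, num_readers):
    """Assign files to readers so the total size per reader stays roughly even.

    Greedy heuristic (an assumption, not the connector's method): take files
    largest first and give each one to the reader with the least data so far.
    """
    if num_readers < 1:
        raise ValueError("num_readers must be at least 1")
    # Heap entries are (size_assigned, reader_index); the lightest reader pops first.
    heap = [(0, reader) for reader in range(num_readers)]
    heapq.heapify(heap)
    assignment = {reader: [] for reader in range(num_readers)}
    # Sort by descending size, then by name so ties are resolved deterministically.
    for name, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, reader = heapq.heappop(heap)
        assignment[reader].append(name)
        heapq.heappush(heap, (load + size, reader))
    return assignment


# Hypothetical delimited files on HDFS, with sizes in megabytes.
files = {"part-0": 400, "part-1": 300, "part-2": 200, "part-3": 100}
plan = balance_files(files, 2)
print(plan)
```

With these sample sizes each of the two readers ends up with 500 MB of data, so neither reader becomes the bottleneck for the parallel scan.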

Oracle Loader for Hadoop


Oracle Loader for Hadoop is a MapReduce utility that optimizes data loading from Hadoop into Oracle Database. Oracle Loader for Hadoop sorts, partitions, and converts data into Oracle Database formats in Hadoop, then loads the converted data into the database. By preprocessing the data to be loaded as a Hadoop job on a Hadoop cluster, Oracle Loader for Hadoop dramatically reduces the CPU and I/O utilization on the database that is commonly seen when ingesting data from Hadoop. An added benefit of presorting data is faster index creation once the data is in the database.
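
The partition-then-sort preprocessing described for Oracle Loader for Hadoop can be pictured with a short Python sketch. This is only a conceptual illustration running in a single process, not the loader's implementation: the function name `partition_and_sort` is hypothetical, and modulo partitioning on an integer key is an assumption standing in for whatever partitioning scheme the target table actually uses.

```python
from collections import defaultdict


def partition_and_sort(records, key, num_partitions):
    """Group records into partitions by `key`, then sort each partition by `key`.

    Modulo partitioning is an illustrative assumption; a real target table
    could be partitioned by range, list, or hash instead.
    """
    partitions = defaultdict(list)
    for record in records:
        partitions[record[key] % num_partitions].append(record)
    # Each partition is sorted independently, as a separate reducer would do.
    return {
        part: sorted(rows, key=lambda row: row[key])
        for part, rows in sorted(partitions.items())
    }


# Hypothetical input rows as they might arrive from Hadoop, in no particular order.
rows = [
    {"id": 7, "value": "g"},
    {"id": 2, "value": "b"},
    {"id": 5, "value": "e"},
    {"id": 4, "value": "d"},
]
batches = partition_and_sort(rows, "id", 2)
for part, batch in batches.items():
    print(part, [row["id"] for row in batch])
```

Each batch arrives at the database already grouped by partition and ordered by key, which is the property the data sheet credits with reducing database-side work.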


Oracle Loader for Hadoop

Features

On-Line Load Option
  Reducer nodes connect to the database for load, using JDBC or direct path load options.
Off-Line Load Option
  Reducer nodes write Oracle Data Pump binary files or delimited text files for loading into the database.
Load Balancing
  Distributes work evenly to all reducers.
Input Formats
  Supports multiple input formats: delimited text files, Hive tables, or build your own.

Oracle Data Integrator Application Adapter for Hadoop


Oracle Data Integrator (ODI) Application Adapter for Hadoop provides native Hadoop integration within ODI. Specific ODI Knowledge Modules optimized for Hive and Oracle Loader for Hadoop are included within ODI Application Adapter for Hadoop. The knowledge modules can be used to build Hadoop metadata within ODI, load data into Hadoop, transform data within Hadoop, and load data easily and directly into Oracle Database utilizing Oracle Loader for Hadoop.

Hadoop implementations typically require complex Java MapReduce code to be written and executed on the Hadoop cluster. Using ODI and the ODI Application Adapter for Hadoop, developers instead use a graphical user interface to create these programs. ODI generates optimized HiveQL, which Hive in turn compiles into native MapReduce programs that are executed on the Hadoop cluster. Once the data is processed and organized on the Hadoop cluster, ODI loads the data directly into Oracle Database utilizing the Oracle Loader for Hadoop support within the Application Adapter for Hadoop.
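
To give a feel for generating HiveQL from a declarative mapping rather than hand-writing MapReduce code, here is a minimal Python sketch. It is not ODI's knowledge module logic, which this data sheet does not show. The function `build_hiveql` and the table and column names are hypothetical, and the sketch handles only a simple column mapping with an optional filter.

```python
def build_hiveql(target, source, columns, where=None):
    """Render a HiveQL INSERT ... SELECT statement from a simple mapping.

    `columns` maps each target column name to the source expression that fills it.
    """
    select_list = ", ".join(f"{expr} AS {alias}" for alias, expr in columns.items())
    statement = f"INSERT OVERWRITE TABLE {target} SELECT {select_list} FROM {source}"
    if where:
        statement += f" WHERE {where}"
    return statement


# Hypothetical mapping: target column -> source expression.
mapping = {"customer_id": "cust_id", "amount_usd": "amount * rate"}
hiveql = build_hiveql("sales_clean", "sales_raw", mapping, where="amount > 0")
print(hiveql)
```

The developer describes what should land in each target column, and the statement that Hive later turns into MapReduce jobs is produced mechanically from that description.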

Oracle Data Integrator Application Adapter for Hadoop

Features

Optimized for Developer Productivity
  Familiar ODI graphical user interface
  End-to-end coordination of Hadoop jobs
  Map-reduce jobs created and orchestrated by ODI
Native Integration with Hadoop
  Native integration with Hadoop using Hive
  Ability to represent Hive metadata within ODI
  Transformations and filtering occur directly in Hadoop
  Transformations written in SQL-like HiveQL
Optimized for Performance
  Optimized Hadoop ODI knowledge modules
  High-performance load to Oracle Database using ODI with Oracle Loader for Hadoop
  Ability to configure and execute Oracle Loader for Hadoop

RELATED PRODUCTS AND SERVICES

Oracle Big Data Connectors is a set of Oracle products that deliver high-performance integration and load capabilities between Hadoop, Oracle R, and Oracle Database.

RELATED PRODUCTS

The following are related products available from Oracle:
Oracle Big Data Appliance
Oracle Exadata
Oracle Advanced Analytics Option
Oracle Exalytics

RELATED SERVICES

The following services are available from Oracle Support Services:
Advanced Customer Services
Product Support Services
Consulting Services
Oracle University Courses

Oracle R Connector for Hadoop

Oracle R Connector for Hadoop is an R package that provides transparent access to Hadoop and to data stored in HDFS. R Connector for Hadoop gives users of the open-source statistical environment R the ability to analyze data stored in HDFS, and to scalably run R models against large volumes of data leveraging MapReduce processing, without requiring R users to learn yet another API or language. End users can leverage over 3,500 open source R packages to analyze data stored in HDFS, while administrators do not need to learn R to schedule R MapReduce models in production environments.

R Connector for Hadoop enables users to write mapper and reducer functions in R and execute them on the Hadoop cluster through R. In addition, users can easily transition R scripts from test environments to production. Hadoop-based R programs can be deployed on a Hadoop cluster without needing to know Hadoop internals, the Hadoop or HDFS command line interfaces, or IT infrastructure.

R Connector for Hadoop can optionally be used together with the Oracle Advanced Analytics Option for Oracle Database. The Oracle Advanced Analytics Option enables R users to work transparently with database-resident data without having to learn SQL or database concepts, with R computations executing directly in-database.

Oracle R Connector for Hadoop

Features

Interactive R access to HDFS
  Manipulate and explore data in HDFS using R functions
  Transparently from within the R environment, move data between HDFS and R, Oracle Database, and the user's local file system
R integration with Hadoop
  Leverage the map-reduce programming paradigm in the familiar context of R without having to learn Hadoop concepts
  Develop map-reduce R scripts independent of the Hadoop cluster for testing on a local R user's desktop before deploying for production execution on a Hadoop cluster
  Supports full mapper, combiner, and reducer R functions through intuitive and easy-to-use APIs
  No additional metadata encoding of files is required and there are no external/additional architectural dependencies
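
The mapper, combiner, and reducer roles named for Oracle R Connector for Hadoop, and the idea of testing a job locally before running it on a cluster, can be sketched in a few lines. The sketch below uses Python rather than R and is not the connector's API. The helper `run_local` and the word-count functions are hypothetical names, and the whole job runs in a single process purely to show the data flow.

```python
from collections import defaultdict


def run_local(records, mapper, reducer, combiner=None, num_splits=2):
    """Run a map/combine/reduce job in-process, mimicking the Hadoop data flow."""
    # Split the input the way a cluster would hand blocks to separate map tasks.
    splits = [records[i::num_splits] for i in range(num_splits)]
    shuffled = defaultdict(list)
    for split in splits:
        mapped = defaultdict(list)
        for record in split:
            for key, value in mapper(record):
                mapped[key].append(value)
        for key, values in mapped.items():
            # A combiner pre-aggregates one map task's output before the shuffle.
            if combiner is not None:
                values = [combiner(key, values)]
            shuffled[key].extend(values)
    # The reducer sees every value emitted for a key, across all splits.
    return {key: reducer(key, values) for key, values in sorted(shuffled.items())}


def word_mapper(line):
    """Emit (word, 1) for every word in a line of text."""
    for word in line.split():
        yield word.lower(), 1


def sum_values(key, values):
    """Add up the values for one key; usable as both combiner and reducer."""
    return sum(values)


lines = ["big data", "Big Data Connectors", "data"]
counts = run_local(lines, word_mapper, sum_values, combiner=sum_values)
print(counts)
```

Because the same mapper and reducer functions are passed in unchanged, a job checked this way on a desktop has the same logic that would later run against the full data set on the cluster.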

Contact Us
For more information about Oracle Big Data Appliance, visit oracle.com or call +1.800.ORACLE1 to speak to an Oracle representative.

Copyright 2011, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Cloudera, Cloudera CDH, and Cloudera Manager are registered and unregistered trademarks of Cloudera, Inc. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark licensed through X/Open Company, Ltd. 0611
