Hive HBase Integration
Hive
The Apache Hive ™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.
HBase
Use Apache HBase when you need random, realtime read/write access to your Big Data. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
Steps for Hive and HBase integration
Step 1: In the HBase shell, create a table named demotable with a column family named emp.
create 'demotable','emp'
-------------------------------------------------------------------------------------------------------------
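Step 2: Add the following jars in the Hive shell (adjust the paths to match your HBase installation), then list them to confirm they were registered.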
add jar /home/username/hbase-0.94.15/hbase-0.94.15.jar;
add jar /home/username/hbase-0.94.15/lib/protobuf-java-2.4.0a.jar;
add jar /home/username/hbase-0.94.15/lib/zookeeper-3.4.5.jar;
add jar /home/username/hbase-0.94.15/hbase-0.94.15-tests.jar;
add jar /home/username/hbase-0.94.15/lib/guava-11.0.2.jar;
list jars;
-------------------------------------------------------------------------------------------------------------
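Step 3: In the Hive shell, point Hive at the ZooKeeper quorum that HBase uses (localhost for a single-node setup).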
set hbase.zookeeper.quorum=localhost;
-------------------------------------------------------------------------------------------------------------
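Step 4: Create an external table named hivedemotable in Hive, backed by the HBase table demotable. The empno column maps to the HBase row key (:key); the remaining columns map to qualifiers in the emp column family.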
create external table hivedemotable(empno int, empname string, empsal int, gender string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,emp:empname,emp:empsal,emp:gender")
TBLPROPERTIES ("hbase.table.name" = "demotable");
-------------------------------------------------------------------------------------------------------------
Step 5: Overwrite hivedemotable with the contents of an existing Hive table named employee. You must create employee beforehand with the same columns and load some rows into it.
insert overwrite table hivedemotable select empno, empname, empsal, gender from employee;
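Step 5 assumes that the employee table already exists. If it does not, it could be set up along these lines before running the insert. This is only a sketch: the comma-delimited layout, the file path /home/username/employee.csv, and the sample row in the comment are assumptions for illustration, not part of the original setup.

```sql
-- Hypothetical source table for step 5; columns mirror hivedemotable.
CREATE TABLE IF NOT EXISTS employee (
  empno   INT,
  empname STRING,
  empsal  INT,
  gender  STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Assumed local file with one employee per line, e.g. 1,alice,50000,F
LOAD DATA LOCAL INPATH '/home/username/employee.csv' INTO TABLE employee;
```

LOAD DATA is used here rather than INSERT ... VALUES because older Hive releases, such as those typically paired with HBase 0.94, do not support inserting literal rows.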
-------------------------------------------------------------------------------------------------------------
Step 6: Query the table in the Hive shell to confirm the rows were written.
select * from hivedemotable;
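Because empno is mapped to the HBase row key (:key) in step 4, each empno value identifies exactly one row. A narrower query such as the following can be used to spot-check a single record; the value 1 is an assumed sample key, so substitute one that exists in your data.

```sql
-- Look up one employee by the column that is mapped to the HBase row key.
-- The key value 1 is a placeholder for illustration.
SELECT empname, empsal, gender
FROM hivedemotable
WHERE empno = 1;
```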
-------------------------------------------------------------------------------------------------------------
Step 7: Scan the underlying table in the HBase shell.
scan 'demotable'
-------------------------------------------------------------------------------------------------------------
The rows inserted through hivedemotable in Hive now appear in the HBase table demotable.