Thursday, 21 June 2018

Hive HBase Integration

Hive

The Apache Hive ™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.

HBase


Use Apache HBase when you need random, realtime read/write access to your Big Data. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.


Steps for Hive and HBase integration

Step1: In the HBase shell, create a table named demotable with a column family named emp

create 'demotable','emp'

-------------------------------------------------------------------------------------------------------------
Step2: Add the following jars in the Hive shell (adjust the paths to match your HBase installation)

add jar /home/username/hbase-0.94.15/hbase-0.94.15.jar;
add jar /home/username/hbase-0.94.15/lib/protobuf-java-2.4.0a.jar;
add jar /home/username/hbase-0.94.15/lib/zookeeper-3.4.5.jar;
add jar /home/username/hbase-0.94.15/hbase-0.94.15-tests.jar;
add jar /home/username/hbase-0.94.15/lib/guava-11.0.2.jar;
list jars;

-------------------------------------------------------------------------------------------------------------
Step3: In the Hive shell, set the ZooKeeper quorum that HBase uses

set hbase.zookeeper.quorum=localhost;

-------------------------------------------------------------------------------------------------------------
Step4: Create an external table named hivedemotable in Hive, mapped to the HBase table demotable

CREATE EXTERNAL TABLE hivedemotable(empno int, empname string, empsal int, gender string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,emp:empname,emp:empsal,emp:gender")
TBLPROPERTIES ("hbase.table.name" = "demotable");
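
The hbase.columns.mapping string is matched to the Hive columns by position: the first entry belongs to the first Hive column, the second entry to the second, and so on. The short Python sketch below only illustrates that pairing rule for the table in this step. It is not Hive's own code, and the function name pair_columns is made up for this example. Entries are split on commas without trimming, on the assumption that spaces inside the mapping string would be read as part of the column name.

```python
# Illustration only: mimics the positional pairing rule, not Hive's implementation.
HIVE_COLUMNS = ["empno", "empname", "empsal", "gender"]
COLUMNS_MAPPING = ":key,emp:empname,emp:empsal,emp:gender"


def pair_columns(hive_columns, columns_mapping):
    """Pair each Hive column with the mapping entry at the same position."""
    # No stripping: whitespace is assumed to be significant in the mapping.
    entries = columns_mapping.split(",")
    if len(entries) != len(hive_columns):
        raise ValueError(
            f"{len(hive_columns)} Hive columns but {len(entries)} mapping entries"
        )
    return dict(zip(hive_columns, entries))


if __name__ == "__main__":
    for hive_col, target in pair_columns(HIVE_COLUMNS, COLUMNS_MAPPING).items():
        # ":key" is the HBase row key; the rest are family:qualifier pairs.
        print(f"{hive_col:8} -> {target}")
```

Here empno is paired with :key, so it becomes the HBase row key, and the other three columns are stored under the emp column family.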

-------------------------------------------------------------------------------------------------------------
Step5: Load hivedemotable from an existing Hive table named employee. You have to create employee in Hive beforehand, with the same columns and some rows of data in it.

insert overwrite table hivedemotable select empno, empname, empsal, gender from employee;

-------------------------------------------------------------------------------------------------------------
Step6: Run this query in the Hive shell

select * from hivedemotable;
-------------------------------------------------------------------------------------------------------------
Step7: Run this scan in the HBase shell

scan 'demotable'
-------------------------------------------------------------------------------------------------------------
The rows inserted through hivedemotable now appear in demotable in HBase.
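
To make sense of the scan output, it helps to picture how one Hive row is laid out in HBase: the empno value becomes the row key, and every other column becomes one cell under the emp column family. The Python sketch below is a simplified model of that layout, not the storage handler's code. The function name row_to_cells and the sample row are hypothetical, and it assumes values are stored as strings, which is the handler's default as far as I know.

```python
# Illustration only: a simplified model of how one Hive row lands in HBase.
MAPPING = {
    "empno": ":key",
    "empname": "emp:empname",
    "empsal": "emp:empsal",
    "gender": "emp:gender",
}


def row_to_cells(row, mapping):
    """Return (row_key, {"family:qualifier": value}) for one Hive row."""
    row_key = None
    cells = {}
    for column, target in mapping.items():
        value = str(row[column])  # assumption: default string storage
        if target == ":key":
            row_key = value  # the key column is not stored as a cell
        else:
            cells[target] = value
    return row_key, cells


if __name__ == "__main__":
    # Hypothetical sample row, not taken from the original post.
    sample = {"empno": 101, "empname": "asha", "empsal": 25000, "gender": "F"}
    key, cells = row_to_cells(sample, MAPPING)
    print(key, cells)
```

Because empno is the row key, two rows in employee with the same empno end up as a single row in demotable, with the later write replacing the earlier one.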
