HBase Installation and Configuration

This blog post covers the HBase installation and the important configuration needed to get a first run working. You can refer to HBase – An Introduction for the basic ideas behind this NoSQL framework. Running HBase on fewer than five data nodes is not advised, since HBase runs on top of HDFS, whose default replication factor is three.

HBase Installation
HBase is installed by downloading the HBase binary from the Apache HBase site, configuring it, and then updating the .bashrc file, as we did for the other Hadoop frameworks.

wget <hbase binary download link>
tar -zxvf <hbase-binary.gz>
mv <extracted hbase directory> hbase
Open the .bashrc file (vi ~/.bashrc) and add the lines below:
export HBASE_HOME=/home/user/hbase
export PATH=$PATH:$HBASE_HOME/bin
Save the .bashrc file, then run this command to apply the changes to the current shell:
source ~/.bashrc
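
If the setup is likely to be repeated, the .bashrc step can be scripted. The following is only a sketch: add_hbase_env is a hypothetical helper name, not part of HBase, and it assumes the install path used above. It appends the two export lines once and leaves the file alone when an HBASE_HOME export is already there.

```shell
# Hypothetical helper (not part of HBase): add the HBase variables to a
# profile file, but only if they are not there yet.
add_hbase_env() {
    profile="$1"      # profile file to update, e.g. "$HOME/.bashrc"
    hbase_home="$2"   # HBase install directory, e.g. /home/user/hbase
    touch "$profile"
    # Do nothing when an HBASE_HOME export is already present.
    if grep -q '^export HBASE_HOME=' "$profile"; then
        return 0
    fi
    {
        printf 'export HBASE_HOME=%s\n' "$hbase_home"
        # Append to the existing PATH instead of replacing it.
        printf '%s\n' 'export PATH=$PATH:$HBASE_HOME/bin'
    } >> "$profile"
}
```

It could be called as add_hbase_env "$HOME/.bashrc" /home/user/hbase, followed by source ~/.bashrc.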

HBase Configuration
Once the binary is in place, configure hbase-env.sh with the three parameters below:
export JAVA_HOME=<Java path>
export HBASE_REGIONSERVERS=<path to the regionservers file, usually in the HBase conf folder>
export HBASE_MANAGES_ZK=true    # ZK = ZooKeeper; true lets HBase manage its own ZooKeeper instance

The other configuration file is hbase-site.xml, with the following properties:

<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:54310/hbase</value>
</property>
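The directory shared by the region servers. The URL should be fully qualified, including the file system scheme. By default, hbase.rootdir points to a location under ${hbase.tmp.dir}, which on most systems is under /tmp. This needs to be changed; otherwise data will be lost when the machine restarts.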

<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
This property indicates the mode the cluster runs in. Set it to true for pseudo-distributed or fully distributed mode, and to false for standalone mode.

<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
The nodes on which the ZooKeeper servers run. For pseudo-distributed mode this can be a single node (localhost); for a fully distributed cluster it should be a comma-separated list of the ZooKeeper quorum servers.

<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
The port on which clients connect to ZooKeeper.

<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/user/hbase/zookeeper</value>
</property>
The directory where ZooKeeper stores its snapshots.
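
For reference, the properties above can be put together into one file. This is a sketch only: write_hbase_site is a hypothetical helper name, and the values are the pseudo-distributed ones used in this post (HDFS on localhost:54310, ZooKeeper data under /home/user/hbase/zookeeper), so adjust them for your own cluster. Note that in the real file all properties sit inside a single <configuration> element.

```shell
# Hypothetical helper (not part of HBase): write a pseudo-distributed
# hbase-site.xml into the given conf directory, using this post's values.
write_hbase_site() {
    conf_dir="$1"   # e.g. /home/user/hbase/conf
    # Quoted 'EOF' so the shell does not expand anything inside the XML.
    cat > "$conf_dir/hbase-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:54310/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/user/hbase/zookeeper</value>
  </property>
</configuration>
EOF
}
```

It could be called as write_hbase_site /home/user/hbase/conf. It overwrites any existing hbase-site.xml in that folder, so keep a backup if you already have one.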

Another important configuration is the /etc/hosts file.
This example uses the pseudo-distributed mode of installation, so it should contain:
127.0.0.1              localhost
127.0.0.1              ubuntu

In a real cluster, however, it should look like the following on the master node (NameNode):
<actual ip>     hbaseMasterServer
<actual ip>     hbaseRegionServer1
<actual ip>     hbaseRegionServer2
<actual ip>     NameNode

On each slave machine (region server), the /etc/hosts file should contain:
<Actual ip>         hbaseMasterServer

If it is not updated properly, HBase will throw the following exception when starting up: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
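
Before starting HBase, it can help to confirm that the expected names are really present in the hosts file. The sketch below is one possible check, not an HBase tool: missing_hosts is a hypothetical helper that prints every name with no entry in the given file and returns non-zero if any are missing. The host names in the usage line are the example names from this post.

```shell
# Hypothetical helper (not part of HBase): print each name that has no
# entry in a hosts-style file; return 1 if at least one is missing.
missing_hosts() {
    hosts_file="$1"   # e.g. /etc/hosts
    shift
    rc=0
    for name in "$@"; do
        # Drop comments, then look for the name in any column after the IP.
        if ! awk -v n="$name" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit (found ? 0 : 1) }
            ' "$hosts_file"; then
            echo "$name"
            rc=1
        fi
    done
    return "$rc"
}
```

On the master node it could be called as missing_hosts /etc/hosts hbaseMasterServer hbaseRegionServer1 hbaseRegionServer2 NameNode. It only checks the file contents, so it does not prove that the IP addresses themselves are correct.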

HBase Starting and HBase Shell
Once everything is set, go to the bin folder and run ./start-hbase.sh. After HBase has started, open the HBase shell by running the command hbase shell from the CLI, as below.
hbase shell
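
To confirm the daemons came up, the output of the JDK's jps tool can be checked. The sketch below assumes the setup described in this post (distributed mode with HBASE_MANAGES_ZK=true), in which case HMaster, HRegionServer and HQuorumPeer should all appear. check_hbase_daemons is a hypothetical helper that reads jps output on standard input.

```shell
# Hypothetical helper (not part of HBase): read `jps` output on stdin and
# report which of the expected HBase daemons are not listed.
check_hbase_daemons() {
    jps_output=$(cat)
    rc=0
    # HQuorumPeer is only expected when HBase manages ZooKeeper itself.
    for daemon in HMaster HRegionServer HQuorumPeer; do
        if ! printf '%s\n' "$jps_output" | grep -qw "$daemon"; then
            echo "not running: $daemon"
            rc=1
        fi
    done
    return "$rc"
}
```

It could be called as jps | check_hbase_daemons. If anything is reported as not running, the log files under the HBase logs folder are the first place to look.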