HBase Architecture

What is HBase Architecture?

HBase is a column-oriented key-value data store. It is a natural fit for deployment as a layer on top of HDFS because it suits the type of data that Hadoop handles.

It is also very fast for both read and write operations, and it keeps that speed even with very large data sets.

Components of HBase Architecture

There are three major components of HBase Architecture:

  • Zookeeper
  • HMaster server
  • Region server

Zookeeper

HBase uses Zookeeper as a distributed coordination service to maintain the state of the servers in the cluster. Zookeeper tracks which servers are alive and available and provides notification of server failures. It uses consensus to guarantee a common shared state. Because consensus requires a majority of the machines to agree, a Zookeeper ensemble should have an odd number of machines, typically three or five.
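
The reason for choosing three or five machines can be checked with a little arithmetic. The sketch below is only an illustration of majority-based agreement, not Zookeeper code; the function names quorum_size and tolerated_failures are made up for this example.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble must contain at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare ensembles of three, four, and five machines.
for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this model, four machines tolerate one failure, the same as three, so the extra machine adds no fault tolerance. That is why odd ensemble sizes are the usual choice.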

HMaster

HMaster is the implementation of the master server in the HBase architecture. It monitors all Region Server instances in the cluster and acts as the interface for all metadata changes. In a distributed cluster, the master typically runs on the NameNode host.

HMaster performs the following important roles in HBase:

  • Assigns regions to the region servers.
  • Controls failover.
  • Handles DDL operations.
  • Performs administration tasks.

Coordinating the region servers

  • The master assigns regions on startup and re-assigns them for recovery or load balancing.
  • The master monitors all Region Server instances in the HBase cluster.
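
The two coordination duties above can be pictured with a toy model. This is a simplified sketch, assuming a plain round-robin policy and a "least-loaded survivor" recovery rule; it is not the balancer HBase actually ships, and the names assign_round_robin, reassign_after_failure, and the rs1/r1 labels are hypothetical.

```python
def assign_round_robin(regions, servers):
    """Spread regions over the live servers in round-robin order."""
    if not servers:
        raise ValueError("no live region servers")
    assignment = {server: [] for server in servers}
    for i, region in enumerate(regions):
        assignment[servers[i % len(servers)]].append(region)
    return assignment


def reassign_after_failure(assignment, failed_server):
    """Move the failed server's regions onto the least-loaded survivors."""
    survivors = {s: list(r) for s, r in assignment.items() if s != failed_server}
    if not survivors:
        raise ValueError("no surviving region servers")
    for region in assignment.get(failed_server, []):
        # Pick the survivor that currently holds the fewest regions.
        target = min(survivors, key=lambda s: len(survivors[s]))
        survivors[target].append(region)
    return survivors


# Startup: six regions across three region servers.
plan = assign_round_robin(["r1", "r2", "r3", "r4", "r5", "r6"], ["rs1", "rs2", "rs3"])
# Recovery: rs2 is reported dead, so its regions move to the other servers.
recovered = reassign_after_failure(plan, "rs2")
print(recovered)
```

In a real cluster the failure notice would come from Zookeeper, and the master would also have to recover the failed server's unflushed writes; the sketch covers only the bookkeeping of which server owns which region.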

Admin functions

  • Acts as an interface for creating, deleting, and updating tables in HBase.

Regions

HBase tables are divided horizontally into "Regions" by row key range. A region contains all rows in the table between the region's start key and end key. Regions are assigned to nodes in the cluster, known as "Region Servers," and these serve data for reads and writes. A region server can serve approximately 1,000 regions.
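
Splitting by row key range means that finding the region for a row is a search over sorted start keys. The following is a minimal sketch of that idea, assuming each region covers the range from its start key (inclusive) up to the next region's start key (exclusive). The boundaries, server names, and the locate_region function are invented for illustration, and Python string comparison stands in for the byte-wise ordering HBase applies to row keys.

```python
import bisect

# Hypothetical region boundaries, sorted by start key.
# The empty start key marks the first region of the table.
REGION_START_KEYS = ["", "g", "n", "t"]
# Hypothetical owner of each region, in the same order.
REGION_SERVERS = ["rs1", "rs2", "rs1", "rs3"]


def locate_region(row_key: str) -> int:
    """Return the index of the region whose key range contains row_key."""
    # bisect_right counts the start keys <= row_key; the last of them
    # belongs to the region that holds the row.
    return bisect.bisect_right(REGION_START_KEYS, row_key) - 1


for key in ("apple", "g", "monkey", "zebra"):
    index = locate_region(key)
    print(key, index, REGION_SERVERS[index])
```

A row key equal to a start key, such as "g" here, lands in the region that begins at that key, which matches the inclusive start and exclusive end convention.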