Because HBase runs on Section 9.9, “HDFS” it is important to understand how it works and how it affects HBase.
The original use-case for HDFS was batch processing. As such, there low-latency reads were historically not a priority. With the increased adoption of Apache HBase this is changing, and several improvements are already in development. See the Umbrella Jira Ticket for HDFS Improvements for HBase.
Since Hadoop 1.0.0 (also 0.22.1, 0.23.1, CDH3u3 and HDP 1.0) via HDFS-2246, it is possible for the DFSClient to take a "short circuit" and read directly from the disk instead of going through the DataNode when the data is local. What this means for HBase is that the RegionServers can read directly off their machine's disks instead of having to open a socket to talk to the DataNode, the former being generally much faster. See JD's Performance Talk. Also see HBase, mail # dev - read short circuit thread for more discussion around short circuit reads.
To enable "short circuit" reads, it will depend on your version of Hadoop. The original
shortcircuit read patch was much improved upon in Hadoop 2 in HDFS-347. See http://blog.cloudera.com/blog/2013/08/how-improved-short-circuit-local-reads-bring-better-performance-and-security-to-hadoop/
for details on the difference between the old and new implementations. See Hadoop
shortcircuit reads configuration page for how to enable the latter, better version
of shortcircuit. For example, here is a minimal config. enabling short-circuit reads added
<property> <name>dfs.client.read.shortcircuit</name> <value>true</value> <description> This configuration parameter turns on short-circuit local reads. </description> </property> <property> <name>dfs.domain.socket.path</name> <value>/home/stack/sockets/short_circuit_read_socket_PORT</value> <description> Optional. This is a path to a UNIX domain socket that will be used for communication between the DataNode and local HDFS clients. If the string "_PORT" is present in this path, it will be replaced by the TCP port of the DataNode. </description> </property>
Be careful about permissions for the directory that hosts the shared domain socket; dfsclient will complain if open to other than the hbase user.
If you are running on an old Hadoop, one that is without HDFS-347 but that has
must set two configurations. First, the hdfs-site.xml needs to be amended. Set the property
dfs.block.local-path-access.user to be the only
user that can use the shortcut. This has to be the user that started HBase. Then in
dfs.client.read.shortcircuit to be
Services -- at least the HBase RegionServers -- will need to be restarted in order to pick up the new configurations.
The default for this value is too high when running on a highly trafficed HBase. In HBase, if this value has not been set, we set it down from the default of 1M to 128k (Since HBase 0.98.0 and 0.96.1). See HBASE-8143 HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM). The Hadoop DFSClient in HBase will allocate a direct byte buffer of this size for each block it has open; given HBase keeps its HDFS files open all the time, this can add up quickly.
A fairly common question on the dist-list is why HBase isn't as performant as HDFS files in a batch context (e.g., as a MapReduce source or sink). The short answer is that HBase is doing a lot more than HDFS (e.g., reading the KeyValues, returning the most current row or specified timestamps, etc.), and as such HBase is 4-5 times slower than HDFS in this processing context. There is room for improvement and this gap will, over time, be reduced, but HDFS will always be faster in this use-case.