If snappy is installed, HBase can make use of it (courtesy of hadoop-snappy [37]).

  1. Build and install snappy on all nodes of your cluster (see below). Neither HBase nor Hadoop can bundle snappy because of licensing issues. The libhadoop.so that ships under Hadoop's native dir does not include snappy; note also that the shipped .so may be built for a 32-bit architecture, which has tripped up people who assumed it was 64-bit. The notes below cover installing snappy for HBase use only. You may also want snappy available in your Hadoop context; that is not covered here. HBase and Hadoop currently look for the snappy .so in different locations: Hadoop picks the files up from ./lib, while HBase finds the .so in ./lib/[PLATFORM].

  2. Use CompressionTest to verify snappy support is enabled and the libs can be loaded ON ALL NODES of your cluster:

    $ hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://host/path/to/hbase snappy

  3. Create a column family with snappy compression and verify it in the hbase shell:

    hbase> create 't1', { NAME => 'cf1', COMPRESSION => 'SNAPPY' }
    hbase> describe 't1'

    Make sure the output of the "describe" command lists "COMPRESSION => 'SNAPPY'".
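Because the CompressionTest check has to pass on every node, it can be convenient to loop over the cluster. The following is a minimal sketch, not part of HBase: it assumes passwordless ssh, a hypothetical hosts file with one hostname per line, and that CompressionTest exits non-zero when the codec cannot be loaded. The function name and the RUNNER override are illustrative.

```shell
# Hypothetical helper: run CompressionTest for snappy on each listed host.
#   $1 = file with one hostname per line (assumed layout)
#   $2 = path handed to CompressionTest, e.g. hdfs://host/path/to/hbase
# RUNNER defaults to ssh; override it to use another remote-exec tool.
check_snappy_on_nodes() {
  hosts_file=$1
  test_path=$2
  failed=0
  while IFS= read -r host; do
    # Skip blank lines in the hosts file.
    [ -z "$host" ] && continue
    # stdin is redirected so the remote command cannot swallow the host list.
    if ${RUNNER:-ssh} "$host" hbase org.apache.hadoop.hbase.util.CompressionTest \
         "$test_path" snappy </dev/null >/dev/null 2>&1; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done < "$hosts_file"
  # Non-zero if at least one host failed the check.
  return "$failed"
}

# Example (hosts.txt is an assumed file name):
#   check_snappy_on_nodes hosts.txt hdfs://host/path/to/hbase
```

Any host reported as FAIL should be checked by hand with the command from step 2 so that the actual error message is visible.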

C.5.1.  Installation

Snappy is used by hbase to compress HFiles on flush and when compacting.

You will find the snappy library file under the .libs directory of your Snappy build (for example, /home/hbase/snappy-1.0.5/.libs/). The file is called libsnappy.so.1.x.x, where 1.x.x is the version of the snappy code you are building. Either copy this file into your hbase lib directory, under lib/native/PLATFORM, naming it libsnappy.so, or simply create a symbolic link to it (see ./bin/hbase for how it sets up the library path for native libs).

The second file you need is the hadoop native library. You will find this file in your hadoop installation directory under lib/native/Linux-amd64-64/ or lib/native/Linux-i386-32/. The file you are looking for is libhadoop.so.1.x.x. Again, you can simply copy this file or link to it from under hbase in lib/native/PLATFORM (e.g. Linux-amd64-64, etc.), using the name libhadoop.so.

At the end of the installation, you should have both libsnappy.so and libhadoop.so present, as links or files, in lib/native/Linux-amd64-64 or in lib/native/Linux-i386-32 (where the last part of the directory path is the PLATFORM you built on and are running the native lib on).
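The linking described above can be sketched as a small shell function. This is an illustration under assumptions: the function name is hypothetical, and the example paths in the comments (including HBASE_HOME) stand in for wherever your snappy build, Hadoop install and HBase install actually live.

```shell
# Hypothetical helper: link a versioned native library (e.g. libsnappy.so.1.x.x)
# into an HBase native directory under its unversioned name (e.g. libsnappy.so).
#   $1 = directory holding the versioned .so
#   $2 = library base name without extension (libsnappy or libhadoop)
#   $3 = target directory, e.g. <hbase>/lib/native/Linux-amd64-64
link_native_lib() {
  mkdir -p "$3" || return 1
  for candidate in "$1"/"$2".so.*; do
    # An unmatched glob stays literal, so test that the file really exists.
    if [ -f "$candidate" ]; then
      ln -sf "$candidate" "$3/$2.so"
      return 0
    fi
  done
  echo "no $2.so.* found in $1" >&2
  return 1
}

# Example calls (all paths are assumptions; adjust to your layout):
#   link_native_lib /home/hbase/snappy-1.0.5/.libs libsnappy \
#       "$HBASE_HOME/lib/native/Linux-amd64-64"
#   link_native_lib /pathtoyourhadoop/lib/native/Linux-amd64-64 libhadoop \
#       "$HBASE_HOME/lib/native/Linux-amd64-64"
```

A symbolic link keeps the HBase directory in step with the original file's location; copying instead avoids a dangling link if the snappy build directory is later removed.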

To point hbase at snappy support, in hbase-env.sh set

export HBASE_LIBRARY_PATH=/pathtoyourhadoop/lib/native/Linux-amd64-64
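Before restarting HBase, it can help to confirm that the directory named in HBASE_LIBRARY_PATH really holds both libraries. A minimal sketch with a hypothetical function name follows; it takes the directory as an argument and falls back to HBASE_LIBRARY_PATH when none is given.

```shell
# Hypothetical helper: report whether libsnappy.so and libhadoop.so are
# present in a native-library directory.
#   $1 = directory to check (defaults to $HBASE_LIBRARY_PATH)
check_native_dir() {
  dir=${1:-$HBASE_LIBRARY_PATH}
  missing=0
  for lib in libsnappy.so libhadoop.so; do
    # -e follows symlinks, so a dangling link is reported as missing.
    if [ -e "$dir/$lib" ]; then
      echo "found   $dir/$lib"
    else
      echo "missing $dir/$lib"
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   check_native_dir /pathtoyourhadoop/lib/native/Linux-amd64-64
```

This only checks that the files exist; loading them is still verified by CompressionTest.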

In /pathtoyourhadoop/lib/native/Linux-amd64-64 you should have something like:


[37] See Alejandro's note on the list about the difference between Snappy in Hadoop and Snappy in HBase.
