Class HBaseFsck

java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.hadoop.hbase.util.HBaseFsck
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.hadoop.conf.Configurable

@Deprecated @LimitedPrivate("Tools") @Evolving public class HBaseFsck extends org.apache.hadoop.conf.Configured implements Closeable
Deprecated.
For removal in hbase-4.0.0. Use HBCK2 instead.
HBaseFsck (hbck) is a tool for checking and repairing region consistency and table integrity problems in a corrupted HBase cluster. This tool was written for hbase-1.x. It does not work with hbase-2.x: it can read state but is not allowed to change it, i.e. it cannot effect a 'repair'. Even though it can read state, so much has changed between how hbase1 and hbase2 operate that it will often misread it. See hbck2 (HBASE-19121) for an hbck tool for hbase2. This class is deprecated.

Region consistency checks verify that hbase:meta, region deployment on region servers, and the state of data in HDFS (.regioninfo files) are all in agreement.

Table integrity checks verify that all possible row keys resolve to exactly one region of a table. This means there are no individual degenerate or backwards regions, no holes between regions, and no overlapping regions.
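The integrity conditions above can be illustrated with a minimal sketch. It is not HBaseFsck's actual implementation: the RegionChainCheck and Range names are hypothetical, keys are modelled as Java strings rather than byte arrays, and an empty string is assumed to mean an unbounded start or end key.

```java
import java.util.ArrayList;
import java.util.List;

final class RegionChainCheck {
    /** Hypothetical stand-in for a region's key range; "" is assumed to mean "unbounded". */
    static final class Range {
        final String start;
        final String end;
        Range(String start, String end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + ", " + end + ")"; }
    }

    /** Returns a description of every integrity problem found; empty if there are none. */
    static List<String> check(List<Range> regions) {
        List<String> problems = new ArrayList<>();
        List<Range> valid = new ArrayList<>();
        for (Range r : regions) {
            if (!r.start.isEmpty() && r.start.equals(r.end)) {
                problems.add("degenerate " + r);   // endkey == startkey
            } else if (!r.end.isEmpty() && r.start.compareTo(r.end) > 0) {
                problems.add("backwards " + r);    // startkey sorts after endkey
            } else {
                valid.add(r);
            }
        }
        // Walk the remaining regions in start-key order.
        valid.sort((a, b) -> a.start.compareTo(b.start));

        String cursor = "";          // every key below cursor is already covered
        boolean reachedEnd = false;  // true once a region with an unbounded end key is seen
        for (Range r : valid) {
            if (reachedEnd || r.start.compareTo(cursor) < 0) {
                problems.add("overlap at " + r.start);
            } else if (r.start.compareTo(cursor) > 0) {
                problems.add("hole [" + cursor + ", " + r.start + ")");
            }
            if (r.end.isEmpty()) {
                reachedEnd = true;
            } else if (r.end.compareTo(cursor) > 0) {
                cursor = r.end;
            }
        }
        if (!reachedEnd) {
            problems.add("hole [" + cursor + ", )");
        }
        return problems;
    }
}
```

For the ranges ["", b), [c, m), [k, "") this sketch reports a hole [b, c) and an overlap at k. Real row keys are compared as bytes, so the string ordering used here is only an approximation.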

The general repair strategy works in two phases:

  1. Repair Table Integrity on HDFS (merge or fabricate regions).
  2. Repair Region Consistency with hbase:meta and assignments.

For table integrity repairs, the tables' region directories are scanned for .regioninfo files. Each table's integrity is then verified. If there are any orphan regions (regions with no .regioninfo files) or holes, new regions are fabricated. Backwards regions and empty degenerate (endkey==startkey) regions are sidelined. If there are any overlapping regions, a new region is created and all data is merged into it.

Table integrity repairs deal solely with HDFS and could potentially be done offline -- neither the hbase region servers nor the master need to be running. This phase can eventually be used to completely reconstruct the hbase:meta table in an offline fashion.

Region consistency requires three conditions -- 1) a valid .regioninfo file is present in an HDFS region dir, 2) a valid row with .regioninfo data is present in hbase:meta, and 3) the region is deployed only at the regionserver it was assigned to, with proper state in the master.
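The three conditions can be sketched as a single check over one region's observed state. The RegionConsistencyCheck class, its parameters, and its messages are hypothetical and do not reflect HBaseFsck's own code. As a simplifying assumption, the sketch also treats a region that is not deployed anywhere as inconsistent.

```java
import java.util.ArrayList;
import java.util.List;

final class RegionConsistencyCheck {
    /**
     * Hypothetical check of the three consistency conditions for a single region.
     *
     * @param inHdfs         a valid .regioninfo file exists in the HDFS region dir
     * @param inMeta         a valid row for the region exists in hbase:meta
     * @param assignedServer server the region was assigned to, or null if none
     * @param deployedOn     region servers that report the region as deployed
     * @return descriptions of the violated conditions; empty if consistent
     */
    static List<String> check(boolean inHdfs, boolean inMeta,
                              String assignedServer, List<String> deployedOn) {
        List<String> problems = new ArrayList<>();
        if (!inHdfs) {
            problems.add("no valid .regioninfo in HDFS");
        }
        if (!inMeta) {
            problems.add("no valid row in hbase:meta");
        }
        if (deployedOn.isEmpty()) {
            // Simplifying assumption: an undeployed region counts as inconsistent.
            problems.add("not deployed on any region server");
        } else if (deployedOn.size() > 1) {
            problems.add("deployed on multiple region servers");
        } else if (!deployedOn.get(0).equals(assignedServer)) {
            problems.add("deployed on a server other than the assigned one");
        }
        return problems;
    }
}
```

A region is considered consistent here only when the returned list is empty. Each entry names one violated condition, which is roughly the information a repair step would need in order to choose an action.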

Region consistency repairs require hbase to be online so that hbck can contact the HBase master and region servers. The hbck#connect() method must be called successfully first. Much of the region consistency information is transient and is therefore less risky to repair.

If hbck is run from the command line, there are a handful of arguments that can be used to limit the kinds of repairs hbck will do. See the code in printUsageAndExit() for more details.