The Apache HBase™ Reference Guide

Revision History
Revision 2.0.0-SNAPSHOT 2014-07-28T15:24

Abstract

This is the official reference guide for Apache HBase™, a distributed, versioned, big data store built on top of Apache Hadoop™ and Apache ZooKeeper™.


Table of Contents

Preface
1. Getting Started
1.1. Introduction
1.2. Quick Start - Standalone HBase
2. Apache HBase Configuration
2.1. Basic Prerequisites
2.2. HBase run modes: Standalone and Distributed
2.3. Running and Confirming Your Installation
2.4. Configuration Files
2.5. Example Configurations
2.6. The Important Configurations
3. Upgrading
3.1. HBase version numbers
3.2. Upgrading from 0.98.x to 1.0.x
3.3. Upgrading from 0.96.x to 0.98.x
3.4. Upgrading from 0.94.x to 0.98.x
3.5. Upgrading from 0.94.x to 0.96.x
3.6. Upgrading from 0.92.x to 0.94.x
3.7. Upgrading from 0.90.x to 0.92.x
3.8. Upgrading to HBase 0.90.x from 0.20.x or 0.89.x
4. The Apache HBase Shell
4.1. Scripting
4.2. Shell Tricks
5. Data Model
5.1. Conceptual View
5.2. Physical View
5.3. Namespace
5.4. Table
5.5. Row
5.6. Column Family
5.7. Cells
5.8. Data Model Operations
5.9. Versions
5.10. Sort Order
5.11. Column Metadata
5.12. Joins
5.13. ACID
6. HBase and Schema Design
6.1. Schema Creation
6.2. On the number of column families
6.3. Rowkey Design
6.4. Number of Versions
6.5. Supported Datatypes
6.6. Joins
6.7. Time To Live (TTL)
6.8. Keeping Deleted Cells
6.9. Secondary Indexes and Alternate Query Paths
6.10. Constraints
6.11. Schema Design Case Studies
6.12. Operational and Performance Configuration Options
7. HBase and MapReduce
7.1. HBase, MapReduce, and the CLASSPATH
7.2. Bundled HBase MapReduce Jobs
7.3. HBase as a MapReduce Job Data Source and Data Sink
7.4. Writing HFiles Directly During Bulk Import
7.5. RowCounter Example
7.6. Map-Task Splitting
7.7. HBase MapReduce Examples
7.8. Accessing Other HBase Tables in a MapReduce Job
7.9. Speculative Execution
8. Secure Apache HBase
8.1. Secure Client Access to Apache HBase
8.2. Simple User Access to Apache HBase
8.3. Tags
8.4. Access Control
8.5. Secure Bulk Load
8.6. Visibility Labels
8.7. Transparent Server Side Encryption
9. Architecture
9.1. Overview
9.2. Catalog Tables
9.3. Client
9.4. Client Request Filters
9.5. Master
9.6. RegionServer
9.7. Regions
9.8. Bulk Loading
9.9. HDFS
9.10. Timeline-consistent High Available Reads
10. Apache HBase APIs
11. Apache HBase External APIs
11.1. Non-Java Languages Talking to the JVM
11.2. REST
11.3. Thrift
11.4. C/C++ Apache HBase Client
12. Thrift API and Filter Language
12.1. Filter Language
13. Apache HBase Coprocessors
13.1. Coprocessor Framework
13.2. Examples
13.3. Building A Coprocessor
13.4. Check the Status of a Coprocessor
13.5. Status of Coprocessors in HBase
14. Apache HBase Performance Tuning
14.1. Operating System
14.2. Network
14.3. Java
14.4. HBase Configurations
14.5. ZooKeeper
14.6. Schema Design
14.7. HBase General Patterns
14.8. Writing to HBase
14.9. Reading from HBase
14.10. Deleting from HBase
14.11. HDFS
14.12. Amazon EC2
14.13. Collocating HBase and MapReduce
14.14. Case Studies
15. Troubleshooting and Debugging Apache HBase
15.1. General Guidelines
15.2. Logs
15.3. Resources
15.4. Tools
15.5. Client
15.6. MapReduce
15.7. NameNode
15.8. Network
15.9. RegionServer
15.10. Master
15.11. ZooKeeper
15.12. Amazon EC2
15.13. HBase and Hadoop version issues
15.14. Running unit or integration tests
15.15. Case Studies
15.16. Cryptographic Features
15.17. Operating System Specific Issues
16. Apache HBase Case Studies
16.1. Overview
16.2. Schema Design
16.3. Performance/Troubleshooting
17. Apache HBase Operational Management
17.1. HBase Tools and Utilities
17.2. Region Management
17.3. Node Management
17.4. HBase Metrics
17.5. HBase Monitoring
17.6. Cluster Replication
17.7. HBase Backup
17.8. HBase Snapshots
17.9. Capacity Planning and Region Sizing
17.10. Table Rename
18. Building and Developing Apache HBase
18.1. Apache HBase Repositories
18.2. IDEs
18.3. Building Apache HBase
18.4. Releasing Apache HBase
18.5. Generating the HBase Reference Guide
18.6. Updating hbase.apache.org
18.7. Tests
18.8. Maven Build Commands
18.9. Getting Involved
18.10. Developing
18.11. Submitting Patches
19. ZooKeeper
19.1. Using existing ZooKeeper ensemble
19.2. SASL Authentication with ZooKeeper
20. Community
20.1. Decisions
20.2. Community Roles
20.3. Commit Message format
A. FAQ
B. hbck In Depth
B.1. Running hbck to identify inconsistencies
B.2. Inconsistencies
B.3. Localized repairs
B.4. Region Overlap Repairs
C. Compression and Data Block Encoding In HBase
C.1. Which Compressor or Data Block Encoder To Use
C.2. Compressor Configuration, Installation, and Use
C.3. Enable Data Block Encoding
D. YCSB: The Yahoo! Cloud Serving Benchmark and HBase
E. HFile format version 2
E.1. Motivation
E.2. HFile format version 1 overview
E.3. HBase file format with inline blocks (version 2)
F. Other Information About HBase
F.1. HBase Videos
F.2. HBase Presentations (Slides)
F.3. HBase Papers
F.4. HBase Sites
F.5. HBase Books
F.6. Hadoop Books
G. HBase History
H. HBase and the Apache Software Foundation
H.1. ASF Development Process
H.2. ASF Board Reporting
I. Enabling Dapper-like Tracing in HBase
I.1. SpanReceivers
I.2. Client Modifications
I.3. Tracing from HBase Shell
J. 0.95 RPC Specification
J.1. Goals
J.2. TODO
J.3. RPC
J.4. Notes
Index

List of Figures

9.1. HFile Version 1
C.1. ColumnFamily with No Encoding
C.2. ColumnFamily with Prefix Encoding
C.3. ColumnFamily with Diff Encoding

List of Tables

1.1. Distributed Cluster Demo Architecture
2.1. Java
2.2. Hadoop version support matrix
5.1. Table webtable
5.2. ColumnFamily anchor
5.3. ColumnFamily contents
8.1. Operation To Permission Mapping
8.2. ACL Matrix
9.1. Parameters Used by Compaction Algorithm

List of Examples

1.1. Example /etc/hosts File for Ubuntu
1.2. Example hbase-site.xml for Standalone HBase
1.3. node-a jps Output
1.4. node-b jps Output
1.5. node-c jps Output
2.1. Calculate the Potential Number of Open Files
2.2. Example Distributed HBase Cluster
5.1. Examples
5.2. Examples
8.1. Grant
8.2. Revoke
8.3. Alter
8.4. User Permission
9.1. Pre-Creating a HConnection
10.1. Create a Table Using Java
10.2. Add, Modify, and Delete a Table
12.1. Compound Operators
12.2. Precedence Example
12.3. Example 1
12.4. Example 2
12.5. Example 3
12.6. Example 4
13.1. Example RegionObserver Configuration
13.2. Load a Coprocessor On a Table Using HBase Shell
13.3. Unload a Coprocessor From a Table Using HBase Shell
C.1. Enabling Compression on a ColumnFamily of an Existing Table using HBase Shell
C.2. Creating a New Table with Compression On a ColumnFamily
C.3. Verifying a ColumnFamily's Compression Settings
C.4. LoadTestTool Usage
C.5. Example Usage of LoadTestTool
C.6. Enable Data Block Encoding On a Table
C.7. Verifying a ColumnFamily's Data Block Encoding