The Apache HBase™ Reference Guide

Revision History
Revision 0.99.0-SNAPSHOT 2014-03-17T15:52

Abstract

This is the official reference guide for Apache HBase™, a distributed, versioned big data store built on top of Apache Hadoop™ and Apache ZooKeeper™.


Table of Contents

Preface
1. Getting Started
1.1. Introduction
1.2. Quick Start
2. Apache HBase Configuration
2.1. Basic Prerequisites
2.2. HBase run modes: Standalone and Distributed
2.3. Configuration Files
2.4. Example Configurations
2.5. The Important Configurations
3. Upgrading
3.1. HBase version numbers
3.2. Upgrading from 0.96.x to 0.98.x
3.3. Upgrading from 0.94.x to 0.96.x
3.4. Upgrading from 0.92.x to 0.94.x
3.5. Upgrading from 0.90.x to 0.92.x
3.6. Upgrading to HBase 0.90.x from 0.20.x or 0.89.x
4. The Apache HBase Shell
4.1. Scripting
4.2. Shell Tricks
5. Data Model
5.1. Conceptual View
5.2. Physical View
5.3. Namespace
5.4. Table
5.5. Row
5.6. Column Family
5.7. Cells
5.8. Data Model Operations
5.9. Versions
5.10. Sort Order
5.11. Column Metadata
5.12. Joins
5.13. ACID
6. HBase and Schema Design
6.1. Schema Creation
6.2. On the number of column families
6.3. Rowkey Design
6.4. Number of Versions
6.5. Supported Datatypes
6.6. Joins
6.7. Time To Live (TTL)
6.8. Keeping Deleted Cells
6.9. Secondary Indexes and Alternate Query Paths
6.10. Constraints
6.11. Schema Design Case Studies
6.12. Operational and Performance Configuration Options
7. HBase and MapReduce
7.1. Map-Task Splitting
7.2. HBase MapReduce Examples
7.3. Accessing Other HBase Tables in a MapReduce Job
7.4. Speculative Execution
8. Secure Apache HBase
8.1. Secure Client Access to Apache HBase
8.2. Simple User Access to Apache HBase
8.3. Tags
8.4. Access Control
8.5. Secure Bulk Load
8.6. Visibility Labels
8.7. Transparent Server Side Encryption
9. Architecture
9.1. Overview
9.2. Catalog Tables
9.3. Client
9.4. Client Request Filters
9.5. Master
9.6. RegionServer
9.7. Regions
9.8. Bulk Loading
9.9. HDFS
10. Apache HBase External APIs
10.1. Non-Java Languages Talking to the JVM
10.2. REST
10.3. Thrift
10.4. C/C++ Apache HBase Client
11. Apache HBase Coprocessors
12. Apache HBase Performance Tuning
12.1. Operating System
12.2. Network
12.3. Java
12.4. HBase Configurations
12.5. ZooKeeper
12.6. Schema Design
12.7. HBase General Patterns
12.8. Writing to HBase
12.9. Reading from HBase
12.10. Deleting from HBase
12.11. HDFS
12.12. Amazon EC2
12.13. Collocating HBase and MapReduce
12.14. Case Studies
13. Troubleshooting and Debugging Apache HBase
13.1. General Guidelines
13.2. Logs
13.3. Resources
13.4. Tools
13.5. Client
13.6. MapReduce
13.7. NameNode
13.8. Network
13.9. RegionServer
13.10. Master
13.11. ZooKeeper
13.12. Amazon EC2
13.13. HBase and Hadoop version issues
13.14. Running unit or integration tests
13.15. Case Studies
13.16. Cryptographic Features
14. Apache HBase Case Studies
14.1. Overview
14.2. Schema Design
14.3. Performance/Troubleshooting
15. Apache HBase Operational Management
15.1. HBase Tools and Utilities
15.2. Region Management
15.3. Node Management
15.4. HBase Metrics
15.5. HBase Monitoring
15.6. Cluster Replication
15.7. HBase Backup
15.8. HBase Snapshots
15.9. Capacity Planning and Region Sizing
15.10. Table Rename
16. Building and Developing Apache HBase
16.1. Apache HBase Repositories
16.2. IDEs
16.3. Building Apache HBase
16.4. Releasing Apache HBase
16.5. Generating the HBase Reference Guide
16.6. Updating hbase.apache.org
16.7. Tests
16.8. Maven Build Commands
16.9. Getting Involved
16.10. Developing
16.11. Submitting Patches
17. ZooKeeper
17.1. Using existing ZooKeeper ensemble
17.2. SASL Authentication with ZooKeeper
18. Apache HBase Coprocessors
19. Community
19.1. Decisions
19.2. Community Roles
19.3. Commit Message format
A. FAQ
B. hbck In Depth
B.1. Running hbck to identify inconsistencies
B.2. Inconsistencies
B.3. Localized repairs
B.4. Region Overlap Repairs
C. Compression In HBase
C.1. CompressionTest Tool
C.2. hbase.regionserver.codecs
C.3. LZO
C.4. GZIP
C.5. SNAPPY
C.6. Changing Compression Schemes
D. YCSB: The Yahoo! Cloud Serving Benchmark and HBase
E. HFile format version 2
E.1. Motivation
E.2. HFile format version 1 overview
E.3. HBase file format with inline blocks (version 2)
F. Other Information About HBase
F.1. HBase Videos
F.2. HBase Presentations (Slides)
F.3. HBase Papers
F.4. HBase Sites
F.5. HBase Books
F.6. Hadoop Books
G. HBase History
H. HBase and the Apache Software Foundation
H.1. ASF Development Process
H.2. ASF Board Reporting
I. Enabling Dapper-like Tracing in HBase
I.1. SpanReceivers
I.2. Client Modifications
I.3. Tracing from HBase Shell
J. 0.95 RPC Specification
J.1. Goals
J.2. TODO
J.3. RPC
J.4. Notes
Index

List of Tables

2.1. Hadoop version support matrix
5.1. Table webtable
5.2. ColumnFamily anchor
5.3. ColumnFamily contents
8.1. Operation To Permission Mapping
9.1.
15.1.