Chapter 3. Upgrading

Table of Contents

3.1. HBase version number and compatibility
3.1.1. Post 1.0 versions
3.1.2. Pre 1.0 versions
3.1.3. Rolling Upgrades
3.2. Upgrading from 0.98.x to 1.0.x
3.2.1. Changes of Note!
3.2.2. Rolling upgrade from 0.98.x to HBase 1.0.0
3.2.3. Upgrading to 1.0 from 0.94
3.3. Upgrading from 0.96.x to 0.98.x
3.4. Upgrading from 0.94.x to 0.98.x
3.5. Upgrading from 0.94.x to 0.96.x
3.5.1. Executing the 0.96 Upgrade
3.6. Upgrading from 0.92.x to 0.94.x
3.7. Upgrading from 0.90.x to 0.92.x
3.7.1. You can’t go back!
3.7.2. MSLAB is ON by default
3.7.3. Distributed Log Splitting is on by default
3.7.4. Memory accounting is different now
3.7.5. On the Hadoop version to use
3.7.6. HBase 0.92.0 ships with ZooKeeper 3.4.2
3.7.7. Online alter is off by default
3.7.8. WebUI
3.7.9. Security tarball
3.7.10. Changes in HBase replication
3.7.11. RegionServer now aborts if OOME
3.7.12. HFile V2 and the “Bigger, Fewer” Tendency
3.8. Upgrading to HBase 0.90.x from 0.20.x or 0.89.x

You cannot skip major versions when upgrading. If you are upgrading from version 0.90.x to 0.94.x, you must first go from 0.90.x to 0.92.x and then from 0.92.x to 0.94.x.

Note

It may be possible to skip across versions (for example, going from 0.92.2 straight to 0.98.0 by following just the 0.96.x upgrade instructions), but we have not tried it and so cannot say whether it works.
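
The conservative rule above (step through each release line in turn) can be sketched as a small planner. This is only an illustration: the class name UpgradePath and the RELEASE_LINES list are assumptions made for this example, built from the release lines named in this chapter's section titles. It models only the no-skipping rule and does not capture the direct hops that some sections below document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: list the release lines to step through, one at a time, without skipping. */
final class UpgradePath {
    // Release lines named in this chapter, oldest first ("Development" series omitted).
    static final List<String> RELEASE_LINES =
        Arrays.asList("0.90", "0.92", "0.94", "0.96", "0.98", "1.0");

    /** Returns every release line from 'from' to 'to' inclusive, in upgrade order. */
    static List<String> plan(String from, String to) {
        int start = RELEASE_LINES.indexOf(from);
        int end = RELEASE_LINES.indexOf(to);
        if (start < 0 || end < 0) {
            throw new IllegalArgumentException("unknown release line");
        }
        if (start > end) {
            throw new IllegalArgumentException("downgrades are not covered by this sketch");
        }
        return new ArrayList<>(RELEASE_LINES.subList(start, end + 1));
    }

    public static void main(String[] args) {
        System.out.println(plan("0.90", "0.94")); // prints [0.90, 0.92, 0.94]
    }
}
```

Each adjacent pair in the returned list corresponds to one of the "Upgrading from X to Y" sections in this chapter, which remain the authority on what each step requires.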

Review Chapter 2, Apache HBase Configuration, in particular the section on Hadoop version.

3.1. HBase version number and compatibility

HBase has two versioning schemes, pre-1.0 and post-1.0. Both are detailed below.

3.1.1. Post 1.0 versions

Starting with the 1.0.0 release, HBase uses Semantic Versioning for its release versioning. In summary:

Given a version number MAJOR.MINOR.PATCH, increment the:

  • MAJOR version when you make incompatible API changes,
  • MINOR version when you add functionality in a backwards-compatible manner, and
  • PATCH version when you make backwards-compatible bug fixes.
  • Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
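
As a rough illustration of the scheme above, the following sketch classifies the step between two version strings by the first component that differs. The class and method names (SemVerStep, parse, kind) are hypothetical and not part of HBase; pre-release and build labels are simply stripped rather than compared.

```java
/** Sketch: classify the step between two MAJOR.MINOR.PATCH version strings. */
final class SemVerStep {
    /** Parses "MAJOR.MINOR.PATCH", ignoring any "-label" or "+build" suffix. */
    static int[] parse(String version) {
        String core = version.split("[-+]", 2)[0];
        String[] parts = core.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected MAJOR.MINOR.PATCH: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2])
        };
    }

    /** Returns "MAJOR", "MINOR", "PATCH", or "NONE" for the first component that differs. */
    static String kind(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return "MAJOR";
        if (a[1] != b[1]) return "MINOR";
        if (a[2] != b[2]) return "PATCH";
        return "NONE";
    }

    public static void main(String[] args) {
        System.out.println(kind("1.0.0", "1.0.1")); // prints PATCH
        System.out.println(kind("1.0.3", "1.1.0")); // prints MINOR
        System.out.println(kind("1.2.0", "2.0.0")); // prints MAJOR
    }
}
```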

3.1.1.1. Compatibility Dimensions

In addition to the usual API versioning considerations, HBase has other compatibility dimensions that we need to consider.

3.1.1.1.1. Client-Server wire protocol compatibility
  • Allows updating client and server out of sync.
  • We might only allow upgrading the server first, i.e. the server would be backward compatible with an old client. That way, new APIs are OK.
  • Example: A user should be able to use an old client to connect to an upgraded cluster.
3.1.1.1.2. Server-Server protocol compatibility
  • Servers of different versions can co-exist in the same cluster.
  • The wire protocol between servers is compatible.
  • Workers for distributed tasks, such as replication and log splitting, can co-exist in the same cluster.
  • Dependent protocols (such as using ZK for coordination) will also not be changed.
  • Example: A user can perform a rolling upgrade.
3.1.1.1.3. File format compatibility
  • Support file formats that are backward and forward compatible.
  • Example: File formats, ZK encoding, and directory layout are upgraded automatically as part of an HBase upgrade. A user can roll back to the older version and everything will continue to work.
3.1.1.1.4. Client API compatibility
  • Allow changing or removing existing client APIs.
  • An API needs to be deprecated for a major version before we will change or remove it.
  • Example: A user using a newly deprecated API does not need to modify application code that makes HBase API calls until the next major version.
3.1.1.1.5. Client Binary compatibility
  • Old client code can run unchanged (no recompilation needed) against new jars.
  • Example: Old compiled client code will work unchanged with the new jars.
3.1.1.1.6. Server-Side Limited API compatibility (taken from Hadoop)
  • Internal APIs are marked as Stable, Evolving, or Unstable
  • This implies binary compatibility for coprocessors and plugins (pluggable classes, including replication) as long as these are only using marked interfaces/classes.
  • Example: Old compiled Coprocessor, Filter, or Plugin code will work unchanged with the new jars.
3.1.1.1.7. Dependency Compatibility
  • An upgrade of HBase will not require an incompatible upgrade of a dependent project, including the Java runtime.
  • Example: An upgrade of Hadoop will not invalidate any of the compatibility guarantees we made.
3.1.1.1.8. Operational Compatibility
  • Metric changes
  • Behavioral changes of services
  • Web page APIs
3.1.1.1.9. Summary
  • A patch upgrade is a drop-in replacement. Any change that is not Java binary compatible would not be allowed.[1]
  • A minor upgrade requires no application/client code modification. Ideally it would be a drop-in replacement, but client code, coprocessors, filters, etc. might have to be recompiled if new jars are used.
  • A major upgrade allows the HBase community to make breaking changes.
3.1.1.1.10. Compatibility Matrix [2]

(Y means we support the compatibility. N means we can break it.)

Table 3.1. Compatibility Matrix

                                          Major   Minor   Patch
Client-Server wire Compatibility          N       Y       Y
Server-Server Compatibility               N       Y       Y
File Format Compatibility                 N[a]    Y       Y
Client API Compatibility                  N       Y       Y
Client Binary Compatibility               N       N       Y
Server-Side Limited API Compatibility
  • Stable                                N       Y       Y
  • Evolving                              N       N       Y
  • Unstable                              N       N       N
Dependency Compatibility                  N       Y       Y
Operational Compatibility                 N       N       Y

[a] Running an offline upgrade tool without rollback might be needed. We will typically only support migrating data from major version X to major version X+1.
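
The matrix lends itself to a simple lookup. The sketch below encodes the table's cells so that a release checklist or script could ask whether a given dimension is expected to hold for a given upgrade type. The class name, row keys, and method are assumptions made for this example; nothing like it ships with HBase, and per footnote [2] an N means a dimension could break, not that it will.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: the compatibility matrix as a lookup table. */
final class CompatibilityMatrix {
    // Each value holds the Major, Minor, and Patch cells, in that order.
    private static final Map<String, String> ROWS = new LinkedHashMap<>();
    static {
        ROWS.put("Client-Server wire", "NYY");
        ROWS.put("Server-Server", "NYY");
        ROWS.put("File Format", "NYY"); // Major: see footnote [a]
        ROWS.put("Client API", "NYY");
        ROWS.put("Client Binary", "NNY");
        ROWS.put("Server-Side Limited API (Stable)", "NYY");
        ROWS.put("Server-Side Limited API (Evolving)", "NNY");
        ROWS.put("Server-Side Limited API (Unstable)", "NNN");
        ROWS.put("Dependency", "NYY");
        ROWS.put("Operational", "NNY");
    }

    /** Returns true if the matrix marks the dimension Y for the given upgrade kind. */
    static boolean isPreserved(String dimension, String upgradeKind) {
        String cells = ROWS.get(dimension);
        if (cells == null) {
            throw new IllegalArgumentException("unknown dimension: " + dimension);
        }
        int column;
        switch (upgradeKind) {
            case "MAJOR": column = 0; break;
            case "MINOR": column = 1; break;
            case "PATCH": column = 2; break;
            default: throw new IllegalArgumentException("unknown upgrade kind: " + upgradeKind);
        }
        return cells.charAt(column) == 'Y';
    }

    public static void main(String[] args) {
        System.out.println(isPreserved("Client Binary", "MINOR")); // prints false
        System.out.println(isPreserved("Client API", "MINOR"));    // prints true
    }
}
```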


3.1.1.1.11. HBase API surface

HBase has a lot of API points, but for the compatibility matrix above, we differentiate between Client API, Limited Private API, and Private API. HBase uses a version of Hadoop's Interface classification. HBase's Interface classification classes can be found here.

  • InterfaceAudience: captures the intended audience. Possible values are Public (for end users and external projects), LimitedPrivate (for other projects, coprocessors, or other plugin points), and Private (for internal use).
  • InterfaceStability: describes what types of interface changes are permitted. Possible values are Stable, Evolving, Unstable, and Deprecated.

3.1.1.1.11.1. HBase Client API

The HBase Client API consists of all the classes or methods that are marked with the InterfaceAudience.Public annotation. All main classes in hbase-client and dependent modules carry an InterfaceAudience.Public, InterfaceAudience.LimitedPrivate, or InterfaceAudience.Private marker. Not all classes in other modules (hbase-server, etc.) have the marker. If a class is not annotated with one of these, it is assumed to be an InterfaceAudience.Private class.

3.1.1.1.11.2. HBase LimitedPrivate API

The LimitedPrivate annotation comes with a set of target consumers for the interfaces. Those consumers are coprocessors, Phoenix, replication endpoint implementations, or similar. At this point, HBase only guarantees source and binary compatibility for these interfaces between patch versions.

3.1.1.1.11.3. HBase Private API

All classes annotated with InterfaceAudience.Private, and all classes that do not have the annotation, are for HBase internal use only. Their interfaces and method signatures can change at any point in time. If you are relying on a particular interface that is marked Private, you should open a JIRA issue to propose changing the interface to Public or LimitedPrivate, or to propose adding an interface exposed for this purpose.
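
The rule that an unannotated class counts as Private can be sketched with reflection. To stay self-contained, the example below declares its own stand-in annotations (Public, LimitedPrivate, Private) and example classes; these are assumptions for illustration and are not HBase's InterfaceAudience classes, whose retention policy and exact shape may differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Sketch: resolve the intended audience of a class, defaulting to Private. */
final class AudienceSketch {
    // Local stand-ins for illustration only; these are NOT HBase's annotation classes.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE) @interface Public {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE) @interface LimitedPrivate {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE) @interface Private {}

    // Hypothetical classes standing in for client, plugin-point, and internal code.
    @Public static class ExampleClientClass {}
    @LimitedPrivate static class ExampleCoprocessorHook {}
    static class ExampleUnannotatedClass {}

    /** Returns the audience name; a class with no marker is treated as Private. */
    static String audienceOf(Class<?> type) {
        if (type.isAnnotationPresent(Public.class)) return "Public";
        if (type.isAnnotationPresent(LimitedPrivate.class)) return "LimitedPrivate";
        return "Private";
    }

    public static void main(String[] args) {
        System.out.println(audienceOf(ExampleClientClass.class));      // prints Public
        System.out.println(audienceOf(ExampleUnannotatedClass.class)); // prints Private
    }
}
```

Application code that wants to stay within the compatibility guarantees would depend only on classes that resolve to Public.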

3.1.2. Pre 1.0 versions

Before adopting the semantic versioning scheme (that is, pre-1.0), HBase tracked either Hadoop's versions (0.2x) or 0.9x versions. If you are into the arcane, check out our old wiki page on HBase Versioning, which tries to connect the HBase version dots. The sections below cover ONLY the releases before 1.0.

3.1.2.1. Odd/Even Versioning or "Development" Series Releases

Ahead of big releases, we have been putting up preview versions to start the feedback cycle turning over earlier. These "Development" Series releases, always odd-numbered, come with no guarantees, not even regarding the ability to upgrade between two sequential releases (we reserve the right to break compatibility across "Development" Series releases). Needless to say, these releases are not for production deploys. They are a preview of what is coming, in the hope that interested parties will take the release for a test drive and flag us early if there are issues we've missed ahead of our rolling a production-worthy release.

Our first "Development" Series was the 0.89 set that came out ahead of HBase 0.90.0. HBase 0.95 is another "Development" Series that portends HBase 0.96.0. 0.99.x is the last series in "developer preview" mode before 1.0. Afterwards, we will be using the semantic versioning naming scheme (see above).

3.1.2.2. Binary Compatibility

When we say two HBase versions are compatible, we mean that the versions are wire and binary compatible. Compatible HBase versions means that clients can talk to compatible but differently versioned servers. It also means that you can swap out the jars of one version and replace them with the jars of another, compatible version and all will just work. Unless otherwise specified, HBase point versions are (mostly) binary compatible. You can safely do rolling upgrades between binary compatible versions, i.e. across point versions, e.g. from 0.94.5 to 0.94.6. See the "Does compatibility between versions also mean binary compatibility?" discussion on the hbase dev mailing list.

3.1.3. Rolling Upgrades

A rolling upgrade is the process by which you update the servers in your cluster one server at a time. You can perform a rolling upgrade across HBase versions if they are binary or wire compatible. See <xlnk></xlnk> for more on what this means. Coarsely, a rolling upgrade gracefully stops each server, updates the software, and then restarts it. You do this for each server in the cluster. Usually you upgrade the Master first and then the regionservers. See <xlink></xlink> for tools that can help with the rolling upgrade process.

For example, in the setup below, hbase was symlinked to the actual HBase install. On upgrade, before running a rolling restart over the cluster, we changed the symlink to point at the new HBase software version and then ran:

$ HADOOP_HOME=~/hadoop-2.6.0-CRC-SNAPSHOT ~/hbase/bin/rolling-restart.sh --config ~/conf_hbase

The rolling-restart script will first gracefully stop and restart the master, and then each of the regionservers in turn. Because the symlink was changed, on restart the server will come up using the new hbase version. Check logs for errors as the rolling upgrade proceeds.

3.1.3.1. Rolling Upgrade between versions that are Binary/Wire compatible

Unless otherwise specified, HBase point versions are binary compatible. You can do a <xlink></xlink> between hbase point versions. For example, you can go to 0.94.6 from 0.94.5 by doing a rolling upgrade across the cluster replacing the 0.94.5 binary with a 0.94.6 binary.

In the version-specific sections below, we call out where versions are wire/protocol compatible; in those cases, it is also possible to do a <xlink>rolling upgrade</xlink>. For example, in <xlink></xlink>, we state that it is possible to do a rolling upgrade between hbase-0.98.x and hbase-1.0.0.
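
The two cases described in this section can be sketched as a pre-flight check. The class RollingUpgradeCheck and its methods are hypothetical. The check encodes only what this section states: point versions within one release line are eligible, as is the explicitly named 0.98.x to 1.0.0 pair. A false result means no guarantee is stated here, not that a rolling upgrade is known to fail; the version-specific sections remain the authority.

```java
/** Sketch: is a rolling upgrade between two versions called out as supported here? */
final class RollingUpgradeCheck {
    /** Returns the release line of a version, e.g. "0.94" for "0.94.5". */
    static String releaseLine(String version) {
        int lastDot = version.lastIndexOf('.');
        if (lastDot < 0) {
            throw new IllegalArgumentException("expected a dotted version: " + version);
        }
        return version.substring(0, lastDot);
    }

    static boolean canRoll(String from, String to) {
        // Point versions within one release line are binary compatible
        // unless otherwise specified.
        if (releaseLine(from).equals(releaseLine(to))) {
            return true;
        }
        // The one cross-line pair this section names explicitly.
        return releaseLine(from).equals("0.98") && to.equals("1.0.0");
    }

    public static void main(String[] args) {
        System.out.println(canRoll("0.94.5", "0.94.6")); // prints true
        System.out.println(canRoll("0.98.4", "1.0.0"));  // prints true
        System.out.println(canRoll("0.94.5", "0.98.0")); // prints false
    }
}
```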



[1] http://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html

[2] Note that this indicates what could break, not that it will break. We will/should add specifics in our release notes.
