@InterfaceAudience.Private public class HFileWriterImpl extends Object implements HFile.Writer
HFile writers.

| Modifier and Type | Field and Description |
|---|---|
| private List<HFileBlock.BlockWritable> | additionalLoadOnOpenData: Additional data items to be written to the "load-on-open" section. |
| protected HFileDataBlockEncoder | blockEncoder: The data block encoding which will be used. |
| protected HFileBlock.Writer | blockWriter: The block writer. |
| protected CacheConfig | cacheConf: Cache configuration for caching data on write. |
| protected boolean | closeOutputStream: True if we opened the outputStream (and so will close it). |
| private HFileBlockIndex.BlockIndexWriter | dataBlockIndexWriter |
| private int | encodedBlockSizeLimit: Block size limit after encoding, used to unify encoded block cache entry sizes. |
| protected long | entryCount: Total # of key/value entries. |
| protected HFileInfo | fileInfo: A "file info" block: a key-value map of file-wide metadata. |
| protected Cell | firstCellInBlock: First cell in a block. |
| private long | firstDataBlockOffset: The offset of the first data block, or -1 if the file is empty. |
| protected HFileContext | hFileContext |
| private List<InlineBlockWriter> | inlineBlockWriters: Inline block writers for the multi-level block index and compound Bloom filters. |
| static int | KEY_VALUE_VER_WITH_MEMSTORE: Version for KeyValue which includes the memstore timestamp. |
| static byte[] | KEY_VALUE_VERSION: KeyValue version in FileInfo. |
| protected Cell | lastCell: The Cell previously appended. |
| private Cell | lastCellOfPreviousBlock: The last (stop) Cell of the previous data block. |
| protected long | lastDataBlockOffset: The offset of the last data block, or 0 if the file is empty. |
| private static org.slf4j.Logger | LOG |
| static byte[] | MAX_MEMSTORE_TS_KEY |
| protected long | maxMemstoreTS |
| private int | maxTagsLength |
| private HFileBlockIndex.BlockIndexWriter | metaBlockIndexWriter |
| protected List<org.apache.hadoop.io.Writable> | metaData: Writables representing meta block data. |
| protected List<byte[]> | metaNames: Meta block names. |
| protected String | name: Name for this object used when logging or in toString. |
| protected org.apache.hadoop.fs.FSDataOutputStream | outputStream: FileSystem stream to write into. |
| protected org.apache.hadoop.fs.Path | path: May be null if we were passed a stream. |
| protected long | totalKeyLength: Used for calculating the average key length. |
| protected long | totalUncompressedBytes: Total uncompressed bytes; maybe calculate a compression ratio later. |
| protected long | totalValueLength: Used for calculating the average value length. |
| static String | UNIFIED_ENCODED_BLOCKSIZE_RATIO: If this feature is enabled, pre-calculate the encoded data size before the real encoding happens. |
| private static long | UNSET |

| Constructor and Description |
|---|
| HFileWriterImpl(org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.FSDataOutputStream outputStream, HFileContext fileContext) |
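Callers normally obtain this writer through HFile.getWriterFactory(...) rather than invoking the constructor directly; the factory's create() hands back an HFileWriterImpl behind the HFile.Writer interface. A minimal, hedged sketch, assuming a default configuration and an illustrative path:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileContext;
import org.apache.hadoop.hbase.io.hfile.HFileContextBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class HFileWriteSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);
    Path path = new Path("/tmp/example.hfile"); // illustrative path

    // Block size and other file-level settings travel in the HFileContext.
    HFileContext context = new HFileContextBuilder()
        .withBlockSize(64 * 1024)
        .build();

    HFile.Writer writer = HFile.getWriterFactory(conf, new CacheConfig(conf))
        .withPath(fs, path)
        .withFileContext(context)
        .create();
    try {
      // append() enforces key order via checkKey(); see append(Cell) below.
      writer.append(new KeyValue(Bytes.toBytes("row1"), Bytes.toBytes("f"),
          Bytes.toBytes("q"), Bytes.toBytes("v1")));
      writer.append(new KeyValue(Bytes.toBytes("row2"), Bytes.toBytes("f"),
          Bytes.toBytes("q"), Bytes.toBytes("v2")));
    } finally {
      writer.close(); // writes trailer, file info, and load-on-open section
    }
  }
}
```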
| Modifier and Type | Method and Description |
|---|---|
| private void | addBloomFilter(BloomFilterWriter bfw, BlockType blockType) |
| void | addDeleteFamilyBloomFilter(BloomFilterWriter bfw): Store delete family Bloom filter in the file, which is only supported in HFile V2. |
| void | addGeneralBloomFilter(BloomFilterWriter bfw): Store general Bloom filter in the file. |
| void | addInlineBlockWriter(InlineBlockWriter ibw): Adds an inline block writer such as a multi-level block index writer or a compound Bloom filter writer. |
| void | append(Cell cell): Add key/value to file. |
| void | appendFileInfo(byte[] k, byte[] v): Add to the file info. |
| void | appendMetaBlock(String metaBlockName, org.apache.hadoop.io.Writable content): Add a meta block to the end of the file. |
| void | beforeShipped(): The action that needs to be performed before Shipper.shipped() is performed. |
| protected void | checkBlockBoundary(): At a block boundary, writes all the inline blocks and opens a new block. |
| protected boolean | checkKey(Cell cell): Checks that the given Cell's key does not violate the key order. |
| protected void | checkValue(byte[] value, int offset, int length): Checks the given value for validity. |
| void | close() |
| static Compression.Algorithm | compressionByName(String algoName) |
| protected static org.apache.hadoop.fs.FSDataOutputStream | createOutputStream(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, InetSocketAddress[] favoredNodes): A helper method to create HFile output streams in constructors. |
| private void | doCacheOnWrite(long offset): Caches the last written HFile block. |
| private void | finishBlock(): Clean up the data block that is currently being written. |
| protected void | finishClose(FixedFileTrailer trailer) |
| protected void | finishFileInfo() |
| protected void | finishInit(org.apache.hadoop.conf.Configuration conf): Additional initialization steps. |
| HFileContext | getFileContext(): Return the file context for the HFile this writer belongs to. |
| Cell | getLastCell() |
| private String | getLexicalErrorMessage(Cell cell) |
| protected int | getMajorVersion() |
| static Cell | getMidpoint(CellComparator comparator, Cell left, Cell right): Try to return a Cell that falls between left and right but that is shorter, i.e. takes up less space. |
| private static byte[] | getMinimumMidpointArray(byte[] leftArray, int leftOffset, int leftLength, byte[] rightArray, int rightOffset, int rightLength): Try to get a byte array that falls between left and right, as short as possible in lexicographic order. |
| private static byte[] | getMinimumMidpointArray(ByteBuffer left, int leftOffset, int leftLength, ByteBuffer right, int rightOffset, int rightLength): Try to create a new byte array that falls between left and right, as short as possible in lexicographic order. |
| protected int | getMinorVersion() |
| org.apache.hadoop.fs.Path | getPath(): Returns Path or null if we were passed a stream rather than a Path. |
| long | getPos() |
| protected void | newBlock(): Ready a new block for writing. |
| String | toString() |
| protected void | writeFileInfo(FixedFileTrailer trailer, DataOutputStream out): Sets the file info offset in the trailer, finishes up populating fields in the file info, and writes the file info into the given data output. |
| private void | writeInlineBlocks(boolean closing): Gives inline block writers an opportunity to contribute blocks. |

Field Detail
private static final org.slf4j.Logger LOG
private static final long UNSET
public static final String UNIFIED_ENCODED_BLOCKSIZE_RATIO
private final int encodedBlockSizeLimit
protected org.apache.hadoop.fs.FSDataOutputStream outputStream
protected final boolean closeOutputStream
True if we opened the outputStream (and so will close it).
protected long entryCount
protected long totalKeyLength
protected long totalValueLength
protected long totalUncompressedBytes
protected List<org.apache.hadoop.io.Writable> metaData
Writables representing meta block data.
protected Cell firstCellInBlock
protected final org.apache.hadoop.fs.Path path
protected final CacheConfig cacheConf
protected final String name
protected final HFileDataBlockEncoder blockEncoder
The data block encoding which will be used. NoOpDataBlockEncoder.INSTANCE if there is no encoding.
protected final HFileContext hFileContext
private int maxTagsLength
public static final byte[] KEY_VALUE_VERSION
public static final int KEY_VALUE_VER_WITH_MEMSTORE
private List<InlineBlockWriter> inlineBlockWriters
protected HFileBlock.Writer blockWriter
private HFileBlockIndex.BlockIndexWriter dataBlockIndexWriter
private HFileBlockIndex.BlockIndexWriter metaBlockIndexWriter
private long firstDataBlockOffset
protected long lastDataBlockOffset
private Cell lastCellOfPreviousBlock
private List<HFileBlock.BlockWritable> additionalLoadOnOpenData
protected long maxMemstoreTS
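A hedged configuration sketch for the encoded-block size fields above: encodedBlockSizeLimit is derived from the block size times the ratio configured under the UNIFIED_ENCODED_BLOCKSIZE_RATIO key. The float value type and the effect described in the comments are assumptions inferred from the field descriptions; referencing the constant avoids hard-coding the key string.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.io.hfile.HFileWriterImpl;

public class EncodedBlockSizeSketch {
  public static Configuration configure() {
    Configuration conf = HBaseConfiguration.create();
    // Assumption: a ratio of 1.0 caps encoded blocks at the configured
    // block size, so block-cache entries for encoded data are uniform.
    conf.setFloat(HFileWriterImpl.UNIFIED_ENCODED_BLOCKSIZE_RATIO, 1.0f);
    return conf;
  }
}
```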
Constructor Detail

public HFileWriterImpl(org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.FSDataOutputStream outputStream, HFileContext fileContext)
Method Detail

public void appendFileInfo(byte[] k, byte[] v) throws IOException
Add to the file info.
Specified by: appendFileInfo in interface HFile.Writer
Parameters: k - Key; v - Value
Throws: IOException - in case the key or the value are invalid
See Also: HFile.Reader.getHFileInfo()

protected final void writeFileInfo(FixedFileTrailer trailer, DataOutputStream out) throws IOException
Sets the file info offset in the trailer, finishes up populating fields in the file info, and writes the file info into the given data output. The reason the data output is not always outputStream is that we store file info as a block in version 2.
Parameters: trailer - fixed file trailer; out - the data output to write the file info to
Throws: IOException

public long getPos() throws IOException
Throws: IOException
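A hedged fragment, reusing the writer from the first sketch; the key and value bytes are illustrative, not HBase constants:

```java
// File info entries are small key-value metadata stored in the file's
// "file info" block; add them before close(). Illustrative key/value only.
writer.appendFileInfo(Bytes.toBytes("WRITER_NOTE"), Bytes.toBytes("compaction-2024"));
```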
protected boolean checkKey(Cell cell) throws IOException
Checks that the given Cell's key does not violate the key order.
Parameters: cell - Cell whose key to check
Throws: IOException - if the key or the key order is wrong

private String getLexicalErrorMessage(Cell cell)

protected void checkValue(byte[] value, int offset, int length) throws IOException
Checks the given value for validity.
Throws: IOException

public org.apache.hadoop.fs.Path getPath()
Returns Path or null if we were passed a stream rather than a Path.
Specified by: getPath in interface HFile.Writer

public static Compression.Algorithm compressionByName(String algoName)

protected static org.apache.hadoop.fs.FSDataOutputStream createOutputStream(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, InetSocketAddress[] favoredNodes) throws IOException
A helper method to create HFile output streams in constructors.
Throws: IOException

protected void finishInit(org.apache.hadoop.conf.Configuration conf)
Additional initialization steps.
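compressionByName maps a configured codec name to a Compression.Algorithm. A minimal sketch, assuming the standard algorithm names ("none", "gz", "snappy", "lz4", ...):

```java
import org.apache.hadoop.hbase.io.compress.Compression;
import org.apache.hadoop.hbase.io.hfile.HFileWriterImpl;

public class CompressionLookupSketch {
  public static void main(String[] args) {
    // Resolve the gzip codec by its configuration name; an unrecognized
    // name is rejected rather than silently falling back to a default.
    Compression.Algorithm algo = HFileWriterImpl.compressionByName("gz");
    System.out.println(algo);
  }
}
```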
protected void checkBlockBoundary() throws IOException
At a block boundary, writes all the inline blocks and opens a new block.
Throws: IOException

private void finishBlock() throws IOException
Clean up the data block that is currently being written.
Throws: IOException

public static Cell getMidpoint(CellComparator comparator, Cell left, Cell right)
Try to return a Cell that falls between left and right but that is shorter, i.e. takes up less space. This trick is used building the HFile block index. It is an optimization and it does not always work; in that case we just return the right cell.
Returns: A cell that sorts between left and right.

private static byte[] getMinimumMidpointArray(byte[] leftArray, int leftOffset, int leftLength, byte[] rightArray, int rightOffset, int rightLength)
Try to get a byte array that falls between left and right, as short as possible in lexicographic order.

private static byte[] getMinimumMidpointArray(ByteBuffer left, int leftOffset, int leftLength, ByteBuffer right, int rightOffset, int rightLength)
Try to create a new byte array that falls between left and right, as short as possible in lexicographic order.
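To make the midpoint trick concrete, here is a standalone sketch of the shortest-separator idea on plain byte arrays. It illustrates the technique only; the HBase implementation must additionally handle Cell structure and the prefix edge cases:

```java
import java.util.Arrays;

public class MidpointSketch {
  // Find a short key that sorts strictly after left and strictly before
  // right, so the block index can store it instead of a full-length key.
  static byte[] shortestSeparator(byte[] left, byte[] right) {
    int common = 0;
    int minLen = Math.min(left.length, right.length);
    while (common < minLen && left[common] == right[common]) {
      common++;
    }
    // If the first differing byte can be incremented without reaching
    // right's byte, the (common+1)-byte prefix is a valid shorter midpoint.
    if (common < minLen && (right[common] & 0xff) - (left[common] & 0xff) > 1) {
      byte[] mid = Arrays.copyOf(left, common + 1);
      mid[common]++;
      return mid;
    }
    return right; // the optimization "does not always work"; fall back
  }

  public static void main(String[] args) {
    byte[] mid = shortestSeparator("abcdef".getBytes(), "abzzzz".getBytes());
    System.out.println(new String(mid)); // prints "abd"
  }
}
```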
private void writeInlineBlocks(boolean closing) throws IOException
Gives inline block writers an opportunity to contribute blocks.
Throws: IOException

private void doCacheOnWrite(long offset)
Caches the last written HFile block.
Parameters: offset - the offset of the block we want to cache. Used to determine the cache key.

protected void newBlock() throws IOException
Ready a new block for writing.
Throws: IOException
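doCacheOnWrite is driven by the CacheConfig passed to the writer. A hedged sketch of enabling cache-on-write for data blocks; the configuration key string is an assumption based on CacheConfig's documented settings:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;

public class CacheOnWriteSketch {
  public static CacheConfig cacheOnWriteConfig() {
    Configuration conf = HBaseConfiguration.create();
    // With this flag set, the writer calls doCacheOnWrite() for each
    // finished data block so reads can hit the block cache immediately.
    conf.setBoolean("hbase.rs.cacheblocksonwrite", true); // assumed key
    return new CacheConfig(conf);
  }
}
```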
public void appendMetaBlock(String metaBlockName, org.apache.hadoop.io.Writable content)
Add a meta block to the end of the file. Call before close(). Metadata blocks are expensive; fill one with a bunch of serialized data rather than do a metadata block per metadata instance. If metadata is small, consider adding it to the file info using appendFileInfo(byte[], byte[]).
Specified by: appendMetaBlock in interface HFile.Writer
Parameters: metaBlockName - name of the block; content - will call readFields to get data later (DO NOT REUSE)

public void close() throws IOException
Specified by: close in interface Closeable; close in interface AutoCloseable
Throws: IOException

public void addInlineBlockWriter(InlineBlockWriter ibw)
Adds an inline block writer such as a multi-level block index writer or a compound Bloom filter writer.
Specified by: addInlineBlockWriter in interface HFile.Writer

public void addGeneralBloomFilter(BloomFilterWriter bfw)
Store general Bloom filter in the file.
Specified by: addGeneralBloomFilter in interface HFile.Writer

public void addDeleteFamilyBloomFilter(BloomFilterWriter bfw)
Store delete family Bloom filter in the file, which is only supported in HFile V2.
Specified by: addDeleteFamilyBloomFilter in interface HFile.Writer

private void addBloomFilter(BloomFilterWriter bfw, BlockType blockType)
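A hedged fragment, again reusing the writer from the first sketch; the block name and payload are illustrative, not HBase constants:

```java
// Meta blocks are written at the end of the file when close() runs; the
// Writable's readFields will be used to deserialize the content later.
writer.appendMetaBlock("example.meta", new org.apache.hadoop.io.Text("payload"));
writer.close();
```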
public HFileContext getFileContext()
Return the file context for the HFile this writer belongs to.
Specified by: getFileContext in interface HFile.Writer

public void append(Cell cell) throws IOException
Add key/value to file. Keys must be added in an order that agrees with the Comparator passed on construction.
Specified by: append in interface CellSink
Parameters: cell - the cell to be added
Throws: IOException

public void beforeShipped() throws IOException
The action that needs to be performed before Shipper.shipped() is performed.
Specified by: beforeShipped in interface ShipperListener
Throws: IOException

public Cell getLastCell()
protected void finishFileInfo() throws IOException
Throws: IOException

protected int getMajorVersion()

protected int getMinorVersion()

protected void finishClose(FixedFileTrailer trailer) throws IOException
Throws: IOException

Copyright © 2007–2020 The Apache Software Foundation. All rights reserved.