Package org.apache.hadoop.hbase.io.hfile
Class HFileBlockIndex.BlockIndexWriter
java.lang.Object
org.apache.hadoop.hbase.io.hfile.HFileBlockIndex.BlockIndexWriter
- All Implemented Interfaces:
InlineBlockWriter
- Enclosing class:
- HFileBlockIndex
Writes the block index into the output stream, generating the tree from the bottom up. The leaf
level is written to disk as a sequence of inline blocks if it is larger than a certain number of
bytes. If the leaf level is not large enough, all entries are written to the root level instead.
After all leaf blocks have been written, we end up with an index referencing the resulting leaf
index blocks. If that index is larger than the allowed root index size, the writer breaks
it up into reasonably sized intermediate-level index block chunks, writes those chunks out, and
creates another index referencing those chunks. This is repeated until the remaining index
is small enough to become the root index. In most practical cases, however, there are only
leaf-level blocks and the root index, or just the root index.
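The bottom-up procedure described above can be illustrated with a small model. This is only a sketch under simplifying assumptions: `BottomUpIndexSketch`, `countLevels`, and `maxEntriesPerBlock` are hypothetical names that are not part of HBase, and block capacity is counted in entries, whereas the real writer works with serialized sizes in bytes.

```java
/** Hypothetical model, not HBase code: index blocks are measured in entries, not bytes. */
final class BottomUpIndexSketch {

  /**
   * Counts the index levels needed for the given number of data blocks, assuming every
   * index block (the root included) may hold at most maxEntriesPerBlock entries.
   */
  static int countLevels(int numDataBlocks, int maxEntriesPerBlock) {
    if (maxEntriesPerBlock < 2) {
      throw new IllegalArgumentException("need at least 2 entries per block to make progress");
    }
    int levels = 1;               // the root level always exists, even if it is empty
    int entries = numDataBlocks;  // entries in the level currently being considered
    while (entries > maxEntriesPerBlock) {
      // Too large to be the root: split into blocks and index those blocks one level up.
      entries = (entries + maxEntriesPerBlock - 1) / maxEntriesPerBlock;
      levels++;
    }
    return levels;
  }

  public static void main(String[] args) {
    System.out.println(countLevels(10, 16));   // 1: root only
    System.out.println(countLevels(100, 16));  // 2: leaf level plus root
    System.out.println(countLevels(1000, 16)); // 3: leaf, one intermediate level, root
  }
}
```

With a realistic block capacity the loop rarely runs more than once, which matches the remark that most files end up with only a leaf level and a root index.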
-
Field Summary
Modifier and Type | Field | Description
private HFileBlock.Writer | blockWriter |
private CacheConfig | cacheConf | CacheConfig, or null if cache-on-write is disabled
private BlockIndexChunk | curInlineChunk | Current leaf-level chunk.
private byte[] | firstKey |
private HFileIndexBlockEncoder | indexBlockEncoder | Type of encoding used for index blocks in HFile
private int | maxChunkSize | The maximum size guideline of all multi-level index blocks.
private int | minIndexNumEntries | The minimum number of entries an index block should hold before it is written out.
private String | nameForCaching | Name to use for computing cache keys
private int | numLevels | The number of block index levels.
private BlockIndexChunk | rootChunk | While the index is being written, this represents the current block index referencing all leaf blocks, with one exception.
private boolean | singleLevelOnly | Whether we require this block index to always be single-level.
private long | totalBlockOnDiskSize | Total compressed size of all index blocks.
private long | totalBlockUncompressedSize | Total uncompressed size of all index blocks.
private long | totalNumEntries | The total number of leaf-level entries, i.e. entries referenced by leaf-level blocks.
-
Constructor Summary
Constructor | Description
BlockIndexWriter() | Creates a single-level block index writer
BlockIndexWriter(HFileBlock.Writer blockWriter, CacheConfig cacheConf, String nameForCaching, HFileIndexBlockEncoder indexBlockEncoder) | Creates a multi-level block index writer.
-
Method Summary
Modifier and Type | Method | Description
void | addEntry(byte[] firstKey, long blockOffset, int blockDataSize) | Add one index entry to the current leaf-level block.
void | blockWritten(long offset, int onDiskSize, int uncompressedSize) | Called after an inline block has been written so that we can add an entry referring to that block to the parent-level index.
void | ensureSingleLevel() |
private void | expectNumLevels(int expectedNumLevels) |
boolean | getCacheOnWrite() | Returns true if inline blocks produced by this writer should be cached
BlockType | getInlineBlockType() | The type of blocks this block writer produces.
int | getNumLevels() | Returns the number of levels in this block index.
final int | getNumRootEntries() | Returns how many block index entries there are in the root level
long | getTotalUncompressedSize() | The total uncompressed size of the root index block, intermediate-level index blocks, and leaf-level index blocks.
void | setMaxChunkSize(int maxChunkSize) |
void | setMinIndexNumEntries(int minIndexNumEntries) |
boolean | shouldWriteBlock(boolean closing) | Whether there is an inline block ready to be written.
long | writeIndexBlocks(org.apache.hadoop.fs.FSDataOutputStream out) | Writes the root level and intermediate levels of the block index into the output stream, generating the tree from bottom up.
void | writeInlineBlock(DataOutput out) | Write out the current inline index block.
private void | writeIntermediateBlock(org.apache.hadoop.fs.FSDataOutputStream out, BlockIndexChunk parent, BlockIndexChunk curChunk) |
private BlockIndexChunk | writeIntermediateLevel(org.apache.hadoop.fs.FSDataOutputStream out, BlockIndexChunk currentLevel) | Split the current level of the block index into intermediate index blocks of permitted size and write those blocks to disk.
void | writeSingleLevelIndex(DataOutput out, String description) | Writes the block index data as a single level only.
-
Field Details
-
rootChunk
While the index is being written, this represents the current block index referencing all leaf blocks, with one exception. If the file is being closed and there are not enough blocks to complete even a single leaf block, no leaf blocks get written and this contains the entire block index. After all levels of the index have been written by writeIndexBlocks(FSDataOutputStream), this contains the final root-level index.
-
curInlineChunk
Current leaf-level chunk. New entries referencing data blocks get added to this chunk until it grows large enough to be written to disk. -
numLevels
The number of block index levels. This is one if there is only a root level (even an empty one), two if there is a leaf level and a root level, and higher if there are intermediate levels. This is only final after writeIndexBlocks(FSDataOutputStream) has been called. The initial value accounts for the root level, and is increased to two as soon as we find out in blockWritten(long, int, int) that there is a leaf level.
-
blockWriter
-
firstKey
-
totalNumEntries
The total number of leaf-level entries, i.e. entries referenced by leaf-level blocks. For the data block index this is equal to the number of data blocks. -
totalBlockOnDiskSize
Total compressed size of all index blocks. -
totalBlockUncompressedSize
Total uncompressed size of all index blocks. -
maxChunkSize
The maximum size guideline of all multi-level index blocks. -
minIndexNumEntries
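The minimum number of entries an index block should hold before it is written out. Requiring a minimum keeps the number of levels in a multi-level index from growing too large.
-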
singleLevelOnly
Whether we require this block index to always be single-level. -
cacheConf
CacheConfig, or null if cache-on-write is disabled -
nameForCaching
Name to use for computing cache keys -
indexBlockEncoder
Type of encoding used for index blocks in HFile
-
-
Constructor Details
-
BlockIndexWriter
public BlockIndexWriter()
Creates a single-level block index writer
-
BlockIndexWriter
public BlockIndexWriter(HFileBlock.Writer blockWriter, CacheConfig cacheConf, String nameForCaching, HFileIndexBlockEncoder indexBlockEncoder)
Creates a multi-level block index writer.
- Parameters:
blockWriter - the block writer to use to write index blocks
cacheConf - used to determine when and how a block should be cached-on-write.
-
-
Method Details
-
setMaxChunkSize
-
setMinIndexNumEntries
-
writeIndexBlocks
Writes the root level and intermediate levels of the block index into the output stream, generating the tree from bottom up. Assumes that the leaf level has been inline-written to the disk if there is enough data for more than one leaf block. We iterate by breaking the current level of the block index, starting with the index of all leaf-level blocks, into chunks small enough to be written to disk, and generate its parent level, until we end up with a level small enough to become the root level. If the leaf level is not large enough, there is no inline block index anymore, so we only write that level of block index to disk as the root level.
- Parameters:
out - FSDataOutputStream
- Returns:
position at which we entered the root-level index.
- Throws:
IOException
-
writeSingleLevelIndex
Writes the block index data as a single level only. Does not do any block framing.
- Parameters:
out - the buffered output stream to write the index to. Typically a stream writing into an HFile block.
description - a short description of the index being written. Used in a log message.
- Throws:
IOException
-
writeIntermediateLevel
private BlockIndexChunk writeIntermediateLevel(org.apache.hadoop.fs.FSDataOutputStream out, BlockIndexChunk currentLevel) throws IOException
Split the current level of the block index into intermediate index blocks of permitted size and write those blocks to disk. Return the next level of the block index referencing those intermediate-level blocks.
- Parameters:
currentLevel - the current level of the block index, such as a chunk referencing all leaf-level index blocks
- Returns:
the parent level block index, which becomes the root index after a few (usually zero) iterations
- Throws:
IOException
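A single splitting step of this kind might look roughly like the following. The sketch is not the HBase implementation: `IntermediateLevelSketch` is a hypothetical class, index entries are reduced to their first keys, the chunk limit is counted in entries rather than bytes, and "writing to disk" is replaced by appending to an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model, not HBase code: an index level is just a list of first keys. */
final class IntermediateLevelSketch {

  /**
   * Splits currentLevel into chunks of at most maxEntriesPerChunk keys, records each chunk
   * in writtenBlocks (standing in for a disk write), and returns the parent level, which
   * holds the first key of every chunk.
   */
  static List<String> writeIntermediateLevel(
      List<String> currentLevel, int maxEntriesPerChunk, List<List<String>> writtenBlocks) {
    if (maxEntriesPerChunk < 1) {
      throw new IllegalArgumentException("chunk size must be positive");
    }
    List<String> parent = new ArrayList<>();
    for (int start = 0; start < currentLevel.size(); start += maxEntriesPerChunk) {
      int end = Math.min(start + maxEntriesPerChunk, currentLevel.size());
      List<String> chunk = new ArrayList<>(currentLevel.subList(start, end));
      writtenBlocks.add(chunk);  // stands in for writing one intermediate block
      parent.add(chunk.get(0));  // the parent refers to the chunk by its first key
    }
    return parent;
  }

  public static void main(String[] args) {
    List<List<String>> written = new ArrayList<>();
    List<String> parent =
        writeIntermediateLevel(List.of("a", "b", "c", "d", "e"), 2, written);
    System.out.println(parent);   // [a, c, e]
    System.out.println(written);  // [[a, b], [c, d], [e]]
  }
}
```

Calling the step again on the returned parent level, until the result is small enough, gives the repeated splitting that the class description refers to.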
-
writeIntermediateBlock
private void writeIntermediateBlock(org.apache.hadoop.fs.FSDataOutputStream out, BlockIndexChunk parent, BlockIndexChunk curChunk) throws IOException
- Throws:
IOException
-
getNumRootEntries
Returns how many block index entries there are in the root level -
getNumLevels
Returns the number of levels in this block index. -
expectNumLevels
-
shouldWriteBlock
Whether there is an inline block ready to be written. In general, we write a leaf-level index block as an inline block as soon as its size, as serialized in the non-root format, reaches a certain threshold.
- Specified by:
shouldWriteBlock in interface InlineBlockWriter
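The decision can be sketched as a pure function. This is an illustration pieced together from the descriptions of shouldWriteBlock and rootChunk on this page, not the actual method body: `ShouldWriteBlockSketch` and all of its parameters are hypothetical, and any further conditions the real method may apply (for example, one involving minIndexNumEntries) are not modelled.

```java
/** Hypothetical model, not HBase code: the writer's state is passed in as plain numbers. */
final class ShouldWriteBlockSketch {

  /**
   * @param chunkEntries      entries buffered in the current leaf-level chunk
   * @param chunkNonRootSize  size of that chunk if serialized in the non-root format
   * @param maxChunkSize      size threshold for a leaf-level index block
   * @param leafBlocksWritten number of leaf index blocks already written inline
   * @param closing           whether the file is being closed
   */
  static boolean shouldWriteBlock(int chunkEntries, int chunkNonRootSize, int maxChunkSize,
      int leafBlocksWritten, boolean closing) {
    if (chunkEntries == 0) {
      return false; // nothing buffered, so nothing to write
    }
    if (closing) {
      // On close, flush the remainder as a leaf block only if a leaf level already exists;
      // otherwise the buffered entries become the root index and no leaf block is written.
      return leafBlocksWritten > 0;
    }
    return chunkNonRootSize >= maxChunkSize;
  }

  public static void main(String[] args) {
    System.out.println(shouldWriteBlock(5, 120, 100, 0, false)); // true: threshold reached
    System.out.println(shouldWriteBlock(5, 50, 100, 0, true));   // false: entries go to the root
    System.out.println(shouldWriteBlock(5, 50, 100, 2, true));   // true: flush the last leaf block
  }
}
```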
-
writeInlineBlock
Write out the current inline index block. Inline blocks are non-root blocks, so the non-root index format is used.
- Specified by:
writeInlineBlock in interface InlineBlockWriter
- Throws:
IOException
-
blockWritten
Called after an inline block has been written so that we can add an entry referring to that block to the parent-level index.
- Specified by:
blockWritten in interface InlineBlockWriter
- Parameters:
offset - the offset of the block in the stream
onDiskSize - the on-disk size of the block
uncompressedSize - the uncompressed size of the block
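The interplay of addEntry, shouldWriteBlock, writeInlineBlock, and blockWritten can be simulated in a few lines. The class below is a hypothetical stand-in rather than the HBase writer: it reuses the method names for readability but drops most parameters, counts the leaf threshold in entries instead of bytes, and invents block sizes purely to produce offsets.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model, not HBase code: simulates the inline index block callbacks. */
final class InlineIndexProtocolSketch {
  static final int MAX_LEAF_ENTRIES = 3; // assumed threshold, in entries rather than bytes

  final List<Long> currentLeaf = new ArrayList<>(); // data block offsets in the open leaf chunk
  final List<Long> rootEntries = new ArrayList<>(); // offsets of leaf index blocks written so far
  int numLevels = 1;                                // the root level always exists

  void addEntry(long dataBlockOffset) {
    currentLeaf.add(dataBlockOffset);
  }

  boolean shouldWriteBlock() {
    return currentLeaf.size() >= MAX_LEAF_ENTRIES;
  }

  void writeInlineBlock() {
    currentLeaf.clear(); // pretend the chunk was serialized, then start a new one
  }

  void blockWritten(long leafBlockOffset) {
    rootEntries.add(leafBlockOffset);   // the parent level now references the leaf block
    numLevels = Math.max(numLevels, 2); // a leaf level exists from this point on
  }

  /** Drives the callbacks the way a file writer might, one data block at a time. */
  static InlineIndexProtocolSketch simulate(int numDataBlocks) {
    InlineIndexProtocolSketch index = new InlineIndexProtocolSketch();
    long nextOffset = 0;
    for (int i = 0; i < numDataBlocks; i++) {
      index.addEntry(nextOffset);
      nextOffset += 64; // pretend each data block occupies 64 bytes
      if (index.shouldWriteBlock()) {
        index.writeInlineBlock();
        index.blockWritten(nextOffset);
        nextOffset += 16; // pretend each leaf index block occupies 16 bytes
      }
    }
    return index;
  }

  public static void main(String[] args) {
    InlineIndexProtocolSketch index = simulate(7);
    System.out.println(index.rootEntries); // [192, 400]
    System.out.println(index.numLevels);   // 2
  }
}
```

In this model numLevels stays at one until the first leaf block is reported through blockWritten, which mirrors the description of the numLevels field.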
-
getInlineBlockType
Description copied from interface: InlineBlockWriter
The type of blocks this block writer produces.
- Specified by:
getInlineBlockType in interface InlineBlockWriter
-
addEntry
Add one index entry to the current leaf-level block. When the leaf-level block gets large enough, it will be flushed to disk as an inline block. -
ensureSingleLevel
- Throws:
IOException - if we happened to write a multi-level index.
-
getCacheOnWrite
Description copied from interface: InlineBlockWriter
Returns true if inline blocks produced by this writer should be cached
- Specified by:
getCacheOnWrite in interface InlineBlockWriter
- Returns:
true if we are using cache-on-write. This is configured by the caller of the constructor by either passing a valid block cache or null.
-
getTotalUncompressedSize
The total uncompressed size of the root index block, intermediate-level index blocks, and leaf-level index blocks.
- Returns:
the total uncompressed size of all index blocks
-