@InterfaceAudience.Private public class KeyValueHeap extends NonReversedNonLazyKeyValueScanner implements KeyValueScanner, InternalScanner
Implements a heap merge across any number of KeyValueScanners. Implements KeyValueScanner itself.
This class is used at the Region level to merge across Stores and at the Store level to merge across the memstore and StoreFiles.
In the Region case, we also need InternalScanner.next(List), so this class also implements InternalScanner. WARNING: As is, if you try to use this as an InternalScanner at the Store level, you will get runtime exceptions.
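The merge described above can be illustrated with a small, self-contained sketch. It is not the HBase implementation: `HeapMergeSketch` and `IntScanner` are hypothetical stand-ins that merge sorted integer lists instead of Cells, and ordering uses plain integer comparison rather than a KVScannerComparator. It only shows the general shape of a heap merge in which the winning sub-scanner is held outside the priority queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative stand-in only; none of these types are HBase classes. */
final class HeapMergeSketch {

  /** Hypothetical sub-scanner over a sorted list, exposing peek() and next(). */
  static final class IntScanner {
    private final Iterator<Integer> it;
    private Integer head;

    IntScanner(List<Integer> sorted) {
      this.it = sorted.iterator();
      this.head = it.hasNext() ? it.next() : null;
    }

    /** Next value without advancing, or null when exhausted. */
    Integer peek() {
      return head;
    }

    /** Next value, advancing the scanner; null when exhausted. */
    Integer next() {
      Integer out = head;
      head = it.hasNext() ? it.next() : null;
      return out;
    }
  }

  private static final Comparator<IntScanner> BY_HEAD = Comparator.comparing(IntScanner::peek);

  private final PriorityQueue<IntScanner> heap = new PriorityQueue<>(BY_HEAD);
  private IntScanner current; // the winner, deliberately kept outside the heap

  HeapMergeSketch(List<List<Integer>> inputs) {
    for (List<Integer> input : inputs) {
      IntScanner s = new IntScanner(input);
      if (s.peek() != null) {
        heap.add(s); // exhausted scanners never enter the heap
      }
    }
    current = heap.poll();
  }

  Integer peek() {
    return current == null ? null : current.peek();
  }

  Integer next() {
    if (current == null) {
      return null;
    }
    Integer out = current.next();
    if (current.peek() != null) {
      heap.add(current); // put it back so it competes on its new head
    }
    current = heap.poll(); // pull the new winner out
    return out;
  }

  static List<Integer> mergeAll(List<List<Integer>> inputs) {
    HeapMergeSketch h = new HeapMergeSketch(inputs);
    List<Integer> out = new ArrayList<>();
    for (Integer v = h.next(); v != null; v = h.next()) {
      out.add(v);
    }
    return out;
  }

  public static void main(String[] args) {
    List<List<Integer>> inputs = Arrays.asList(
        Arrays.asList(1, 4, 7),
        Arrays.asList(2, 5),
        Collections.<Integer>emptyList(),
        Arrays.asList(3, 6));
    System.out.println(mergeAll(inputs)); // prints [1, 2, 3, 4, 5, 6, 7]
  }
}
```

Keeping the winner outside the queue means `peek()` never touches the heap; the heap is only reordered when the winner advances.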
Modifier and Type | Class and Description |
---|---|
protected static class |
KeyValueHeap.KVScannerComparator |
Modifier and Type | Field and Description |
---|---|
protected KeyValueHeap.KVScannerComparator |
comparator |
protected KeyValueScanner |
current
The current sub-scanner, i.e. the one that contains the next key/value to return to the client.
|
protected PriorityQueue<KeyValueScanner> |
heap |
private static org.slf4j.Logger |
LOG |
protected List<KeyValueScanner> |
scannersForDelayedClose |
Fields inherited from interface KeyValueScanner: NO_NEXT_INDEXED_KEY
Constructor and Description |
---|
KeyValueHeap(List<? extends KeyValueScanner> scanners,
CellComparator comparator)
Constructor.
|
KeyValueHeap(List<? extends KeyValueScanner> scanners,
KeyValueHeap.KVScannerComparator comparator)
Constructor.
|
Modifier and Type | Method and Description |
---|---|
void |
close()
Close the KeyValue scanner.
|
private boolean |
generalizedSeek(boolean isLazy,
Cell seekKey,
boolean forward,
boolean useBloom) |
(package private) KeyValueScanner |
getCurrentForTesting() |
PriorityQueue<KeyValueScanner> |
getHeap()
Returns the current Heap
|
Cell |
getNextIndexedKey() |
(package private) boolean |
isLatestCellFromMemstore() |
Cell |
next()
Return the next Cell in this scanner, iterating the scanner
|
boolean |
next(List<Cell> result,
ScannerContext scannerContext)
Gets the next row of keys from the top-most scanner.
|
Cell |
peek()
Look at the next Cell in this scanner, but do not iterate scanner.
|
protected KeyValueScanner |
pollRealKV()
Fetches the top sub-scanner from the priority queue, ensuring that a real seek has been done on
it.
|
boolean |
requestSeek(Cell key,
boolean forward,
boolean useBloom)
Similar to
KeyValueScanner.seek(org.apache.hadoop.hbase.Cell) (or KeyValueScanner.reseek(org.apache.hadoop.hbase.Cell) if forward is true) but only does a seek operation
after checking that it is really necessary for the row/column combination specified by the key
parameter. |
boolean |
reseek(Cell seekKey)
This function is identical to the
seek(Cell) function except that
scanner.seek(seekKey) is changed to scanner.reseek(seekKey). |
boolean |
seek(Cell seekKey)
Seeks all scanners at or below the specified seek key.
|
void |
shipped()
Called after a batch of rows scanned and set to be returned to client.
|
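Methods inherited from class NonReversedNonLazyKeyValueScanner: backwardSeek, seekToLastRow, seekToPreviousRow
Methods inherited from class NonLazyKeyValueScanner: doRealSeek, enforceSeek, getFilePath, isFileScanner, realSeekDone, shouldUseScanner
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface KeyValueScanner: backwardSeek, enforceSeek, getFilePath, getScannerOrder, isFileScanner, realSeekDone, seekToLastRow, seekToPreviousRow, shouldUseScanner
Methods inherited from interface InternalScanner: next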
private static final org.slf4j.Logger LOG
protected PriorityQueue<KeyValueScanner> heap
protected List<KeyValueScanner> scannersForDelayedClose
protected KeyValueScanner current
The current sub-scanner, i.e. the one that contains the next key/value to return to the client.
This scanner is NOT included in heap (but we frequently add it back to the heap and
pull the new winner out). We maintain an invariant that the current sub-scanner has already
done a real seek, and that current.peek() is always a real key/value (or null) except for the
fake last-key-on-row-column supplied by the multi-column Bloom filter optimization, which is OK
to propagate to StoreScanner. In order to ensure that, always use pollRealKV() to update current.
protected KeyValueHeap.KVScannerComparator comparator
public KeyValueHeap(List<? extends KeyValueScanner> scanners, CellComparator comparator) throws IOException
Throws: IOException
KeyValueHeap(List<? extends KeyValueScanner> scanners, KeyValueHeap.KVScannerComparator comparator) throws IOException
Throws: IOException
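public Cell peek()
Description copied from interface: KeyValueScanner
Look at the next Cell in this scanner, but do not iterate the scanner.
Specified by: peek in interface KeyValueScanner
boolean isLatestCellFromMemstore()
public Cell next() throws IOException
Description copied from interface: KeyValueScanner
Return the next Cell in this scanner, iterating the scanner.
Specified by: next in interface KeyValueScanner
Throws: IOException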
public boolean next(List<Cell> result, ScannerContext scannerContext) throws IOException
Gets the next row of keys from the top-most scanner.
This method takes care of updating the heap.
This can ONLY be called when you are using Scanners that implement InternalScanner as well as
KeyValueScanner (a StoreScanner).
Specified by: next in interface InternalScanner
Parameters:
result - return output array
Returns: true if more rows exist after this one, false if scanner is done
Throws: IOException
public void close()
Description copied from interface: KeyValueScanner
Close the KeyValue scanner.
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Specified by: close in interface InternalScanner
Specified by: close in interface KeyValueScanner
public boolean seek(Cell seekKey) throws IOException
Seeks all scanners at or below the specified seek key.
As individual scanners may run past their ends, those scanners are automatically closed and removed from the heap.
This function (and reseek(Cell)) does not do multi-column Bloom filter and lazy-seek
optimizations. To enable those, call requestSeek(Cell, boolean, boolean).
Specified by: seek in interface KeyValueScanner
Parameters:
seekKey - KeyValue to seek at or after
Throws: IOException
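The seek behavior described here, where sub-scanners that run past their ends are closed and dropped from the heap, can be sketched as follows. This is a simplified model under stated assumptions, not the HBase code: `HeapSeekSketch` and `ListScanner` are hypothetical, keys are integers, and the sub-scanner's `seek` only moves forward for brevity. The early exit once the smallest head is already at or after the key is a design choice of this sketch.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative stand-in only; none of these types are HBase classes. */
final class HeapSeekSketch {

  /** Hypothetical sub-scanner over a sorted list. */
  static final class ListScanner {
    private final List<Integer> sorted;
    private int pos = 0;
    boolean closed = false;

    ListScanner(List<Integer> sorted) {
      this.sorted = sorted;
    }

    Integer peek() {
      return pos < sorted.size() ? sorted.get(pos) : null;
    }

    /** Moves to the first element at or after key; false if the scanner ran past its end. */
    boolean seek(int key) {
      while (pos < sorted.size() && sorted.get(pos) < key) {
        pos++;
      }
      return pos < sorted.size();
    }

    void close() {
      closed = true;
    }
  }

  private static final Comparator<ListScanner> BY_HEAD = Comparator.comparing(ListScanner::peek);

  private final PriorityQueue<ListScanner> heap = new PriorityQueue<>(BY_HEAD);
  private ListScanner current; // winner, kept outside the heap

  HeapSeekSketch(List<ListScanner> scanners) {
    for (ListScanner s : scanners) {
      if (s.peek() != null) {
        heap.add(s);
      } else {
        s.close(); // nothing to read, so never enters the heap
      }
    }
    current = heap.poll();
  }

  Integer peek() {
    return current == null ? null : current.peek();
  }

  /** Seeks every sub-scanner positioned below key; returns false if nothing remains. */
  boolean seek(int key) {
    if (current == null) {
      return false;
    }
    heap.add(current); // current competes again while we reposition
    current = null;
    ListScanner s;
    while ((s = heap.poll()) != null) {
      if (s.peek() >= key) {
        // The smallest head is already at or after key, so every other head is too.
        current = s;
        return true;
      }
      if (s.seek(key)) {
        heap.add(s); // re-rank on its new head
      } else {
        s.close(); // ran past its end: close and leave it out of the heap
      }
    }
    return false;
  }

  public static void main(String[] args) {
    ListScanner a = new ListScanner(Arrays.asList(1, 5, 9));
    ListScanner b = new ListScanner(Arrays.asList(2, 3));
    ListScanner c = new ListScanner(Arrays.asList(4, 10));
    HeapSeekSketch h = new HeapSeekSketch(Arrays.asList(a, b, c));
    System.out.println(h.seek(4) + " " + h.peek() + " " + b.closed); // prints true 4 true
  }
}
```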
public boolean reseek(Cell seekKey) throws IOException
This function is identical to the seek(Cell) function except that
scanner.seek(seekKey) is changed to scanner.reseek(seekKey).
Specified by: reseek in interface KeyValueScanner
Parameters:
seekKey - seek value (should be non-null)
Throws: IOException
public boolean requestSeek(Cell key, boolean forward, boolean useBloom) throws IOException
Similar to KeyValueScanner.seek(org.apache.hadoop.hbase.Cell)
(or KeyValueScanner.reseek(org.apache.hadoop.hbase.Cell)
if forward is true) but only does a seek operation
after checking that it is really necessary for the row/column combination specified by the key
parameter. This function was added to avoid unnecessary disk seeks by checking row-column Bloom
filters before a seek on multi-column get/scan queries, and to optimize by looking up more
recent files first.
Specified by: requestSeek in interface KeyValueScanner
Overrides: requestSeek in class NonLazyKeyValueScanner
Parameters:
forward - do a forward-only "reseek" instead of a random-access seek
useBloom - whether to enable multi-column Bloom filter optimization
Throws: IOException
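The idea of deferring a seek until it is really needed, and consulting more recent sources first, can be sketched in a few lines. This is a loose model and not the HBase implementation: `LazySeekSketch`, `LazyScanner`, the integer keys, the `order` tie-breaker (lower meaning more recent), and the `realSeeks` counter are all assumptions made for illustration, and Bloom filters are not modeled. The sketch relies on the placeholder key never being greater than the real next key of that scanner.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative stand-in only; none of these types are HBase classes. */
final class LazySeekSketch {

  /** Hypothetical sub-scanner that can defer the expensive part of a seek. */
  static final class LazyScanner {
    final int order;            // assumption: lower value means a more recent source
    private final List<Integer> sorted;
    private int pos = 0;
    private Integer pendingKey; // non-null while a requested seek is still deferred
    int realSeeks = 0;          // how many expensive seeks were actually performed

    LazyScanner(int order, List<Integer> sorted) {
      this.order = order;
      this.sorted = sorted;
    }

    /** Real head, or the deferred seek key acting as a lower-bound placeholder. */
    Integer peek() {
      if (pendingKey != null) {
        return pendingKey;
      }
      return pos < sorted.size() ? sorted.get(pos) : null;
    }

    boolean realSeekDone() {
      return pendingKey == null;
    }

    /** Cheap: only records the key; no positioning happens yet. */
    void requestSeek(int key) {
      pendingKey = key;
    }

    /** Expensive: moves to the first element at or after the deferred key. */
    void enforceSeek() {
      int key = pendingKey;
      pendingKey = null;
      realSeeks++;
      while (pos < sorted.size() && sorted.get(pos) < key) {
        pos++;
      }
    }
  }

  /** Orders by head key, breaking ties in favor of the more recent source. */
  static final Comparator<LazyScanner> BY_HEAD_THEN_ORDER = (x, y) -> {
    int c = Integer.compare(x.peek(), y.peek());
    return c != 0 ? c : Integer.compare(x.order, y.order);
  };

  /** Polls until the top scanner has done a real seek; others stay deferred. */
  static LazyScanner pollRealKV(PriorityQueue<LazyScanner> heap) {
    LazyScanner top;
    while ((top = heap.poll()) != null) {
      if (top.realSeekDone()) {
        return top;
      }
      top.enforceSeek(); // its head may grow, so it must be ranked again
      if (top.peek() != null) {
        heap.add(top);
      }
    }
    return null;
  }

  public static void main(String[] args) {
    LazyScanner newest = new LazyScanner(0, Arrays.asList(1, 4, 8));
    LazyScanner older = new LazyScanner(1, Arrays.asList(2, 9));
    LazyScanner oldest = new LazyScanner(2, Arrays.asList(3, 5));
    PriorityQueue<LazyScanner> heap = new PriorityQueue<>(BY_HEAD_THEN_ORDER);
    for (LazyScanner s : Arrays.asList(newest, older, oldest)) {
      s.requestSeek(4);
      heap.add(s);
    }
    LazyScanner top = pollRealKV(heap);
    // The newest source holds key 4, so the two older sources never perform a real seek.
    System.out.println(top.peek() + " " + newest.realSeeks + " " + older.realSeeks
        + " " + oldest.realSeeks); // prints 4 1 0 0
  }
}
```

In this model, a scanner whose deferred key is still the smallest gets resolved only when it reaches the top of the queue, which is what lets the older sources skip work when a newer one already answers the request.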
private boolean generalizedSeek(boolean isLazy, Cell seekKey, boolean forward, boolean useBloom) throws IOException
Parameters:
isLazy - whether we are trying to seek to exactly the given row/col. Enables Bloom
filter and most-recent-file-first optimizations for multi-column get/scan queries.
seekKey - key to seek to
forward - whether to seek forward (also known as reseek)
useBloom - whether to optimize seeks using Bloom filters
Throws: IOException
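protected KeyValueScanner pollRealKV() throws IOException
Fetches the top sub-scanner from the priority queue, ensuring that a real seek has been done on it.
Throws: IOException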
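public PriorityQueue<KeyValueScanner> getHeap()
Returns: the current Heap
KeyValueScanner getCurrentForTesting()
public Cell getNextIndexedKey()
Specified by: getNextIndexedKey in interface KeyValueScanner
Overrides: getNextIndexedKey in class NonLazyKeyValueScanner
public void shipped() throws IOException
Description copied from interface: Shipper
Called after a batch of rows scanned and set to be returned to client.
Specified by: shipped in interface Shipper
Overrides: shipped in class NonLazyKeyValueScanner
Throws: IOException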
Copyright © 2007–2020 The Apache Software Foundation. All rights reserved.