Package org.apache.hadoop.hbase.filter
Class FuzzyRowFilter
java.lang.Object
org.apache.hadoop.hbase.filter.Filter
org.apache.hadoop.hbase.filter.FilterBase
org.apache.hadoop.hbase.filter.FuzzyRowFilter
- All Implemented Interfaces:
HintingFilter
This is an optimized version of the standard FuzzyRowFilter. It filters data based on a fuzzy row key and performs fast-forwards during scanning. It takes pairs (row key, fuzzy info) to match row keys, where fuzzy info is a byte array with 0 or 1 as its values:
- 0 - means that this byte in the provided row key is fixed, i.e. the row key's byte at the same position must match
- 1 - means that this byte in the provided row key is NOT fixed, i.e. the row key's byte at this position can be different from the one in the provided row key
For example:
row key = "????_99_????_01" (one can use any value instead of "?")
fuzzy info = "\x01\x01\x01\x01\x00\x00\x00\x00\x01\x01\x01\x01\x00\x00\x00"
I.e. the fuzzy info tells that the matching mask is "????_99_????_01", where "?" can be any value.
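The matching rule described above can be sketched in plain Java without any HBase dependency. This is only an illustration of the 0/1 mask semantics, not the filter's actual implementation; the class name FuzzyMatchSketch and its helpers are hypothetical, and the handling of rows shorter than the mask (treated here as non-matching) is a simplifying assumption.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; not the HBase implementation.
final class FuzzyMatchSketch {

    // Returns true when every fixed position (fuzzyInfo[i] == 0) of rowKey
    // holds the same byte as fuzzyKey. Positions marked 1 are ignored.
    // Assumption: a row shorter than the mask never matches.
    static boolean matches(byte[] rowKey, byte[] fuzzyKey, byte[] fuzzyInfo) {
        if (fuzzyKey.length != fuzzyInfo.length) {
            throw new IllegalArgumentException("key and fuzzy info must have the same length");
        }
        if (rowKey.length < fuzzyKey.length) {
            return false;
        }
        for (int i = 0; i < fuzzyInfo.length; i++) {
            if (fuzzyInfo[i] == 0 && rowKey[i] != fuzzyKey[i]) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: encode an ASCII string as bytes.
    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] key = ascii("????_99_????_01");
        byte[] info = {1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0};
        System.out.println(matches(ascii("u001_99_2015_01"), key, info)); // true
        System.out.println(matches(ascii("u001_98_2015_01"), key, info)); // false
    }
}
```

The row keys "u001_99_2015_01" and "u001_98_2015_01" are made-up sample values chosen to fit the mask length.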
-
Nested Class Summary
Modifier and Type, Class, Description
- private static enum: Abstracts directional comparisons based on scan direction.
- private class: If we have multiple fuzzy keys, row tracker should improve overall performance.
- (package private) static enum
Nested classes/interfaces inherited from class org.apache.hadoop.hbase.filter.Filter
Filter.ReturnCode
-
Field Summary
Modifier and Type, Field, Description
- private boolean
- private boolean
- private int: The index of a last successfully found matching fuzzy string (in fuzzyKeysData).
- private final FuzzyRowFilter.RowTracker: Row tracker (keeps all next rows after SEEK_NEXT_USING_HINT was returned)
- private static final boolean
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
- (package private) boolean areSerializedFieldsEqual: Returns true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other.
- boolean equals
- boolean filterAllRemaining(): Filters that never filter all remaining can inherit this implementation that never stops the filter early.
- filterCell(Cell c): A way to filter based on the column family, column qualifier and/or the column value.
- boolean filterRow(): Filters that never filter by rows based on previously gathered state from Filter.filterCell(Cell) can inherit this implementation that never filters a row.
- getFuzzyKeys(): Returns the Fuzzy keys in the format expected by the constructor.
- getNextCellHint(Cell currentCell): Filters that are not sure which key must be next seeked to, can inherit this implementation that, by default, returns a null Cell.
- (package private) static byte[] getNextForFuzzyRule(boolean reverse, byte[] row, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
- (package private) static byte[] getNextForFuzzyRule(boolean reverse, byte[] row, int offset, int length, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta): Find out the closest next byte array that satisfies the fuzzy rule and is after the given one.
- (package private) static byte[] getNextForFuzzyRule(byte[] row, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
- int hashCode()
- private boolean isPreprocessedMask(byte[] mask)
- static FuzzyRowFilter parseFrom(byte[] pbBytes): Parse a serialized representation of FuzzyRowFilter
- private byte[] preprocessMask(byte[] mask): We need to preprocess the mask array, since we treat 2's as unfixed positions and -1 (0xff) as fixed positions
- private void preprocessSearchKey(Pair<byte[], byte[]> p)
- void reset(): Filters that are purely stateless and do nothing in their reset() methods can inherit this null/empty implementation.
- (package private) static FuzzyRowFilter.SatisfiesCode satisfies(boolean reverse, byte[] row, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
- (package private) static FuzzyRowFilter.SatisfiesCode satisfies(boolean reverse, byte[] row, int offset, int length, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
- (package private) static FuzzyRowFilter.SatisfiesCode satisfies(byte[] row, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
- (package private) static FuzzyRowFilter.SatisfiesCode satisfiesNoUnsafe(boolean reverse, byte[] row, int offset, int length, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
- byte[] toByteArray(): Returns the filter serialized using pb
- toString(): Return filter's info for debugging and logging purpose.
- private static byte[] trimTrailingZeroes(byte[] result, byte[] fuzzyKeyMeta, int toInc): For forward scanner, next cell hint should not contain any trailing zeroes unless they are part of fuzzyKeyMeta. For example, hint = '\x01\x01\x01\x00\x00' will skip valid row '\x01\x01\x01'.
Methods inherited from class org.apache.hadoop.hbase.filter.FilterBase
createFilterFromArguments, filterRowCells, filterRowKey, hasFilterRow, isFamilyEssential, transformCell
Methods inherited from class org.apache.hadoop.hbase.filter.Filter
isReversed, setReversed
-
Field Details
-
UNSAFE_UNALIGNED
-
fuzzyKeysData
-
filterRow
-
done
-
lastFoundIndex
The index of the last successfully found matching fuzzy string (in fuzzyKeysData). We will start matching the next KV with this one. If they do not match, we fall back to the one-by-one iteration over fuzzyKeysData.
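The "try the last match first, then fall back" idea behind lastFoundIndex can be sketched as follows. This is a simplified illustration, not the filter's code: the class LastFoundIndexSketch, its rule representation (a two-element array holding key and mask), and the matches helper are all hypothetical, and the mask uses the public convention of 0 for fixed and 1 for non-fixed.

```java
import java.util.List;

// Illustrative sketch only; names are hypothetical, not HBase internals.
final class LastFoundIndexSketch {
    // Each rule is {fuzzyKey, fuzzyInfo}; fuzzyInfo uses 0 = fixed, 1 = non-fixed.
    private final List<byte[][]> rules;
    private int lastFoundIndex = -1;

    LastFoundIndexSketch(List<byte[][]> rules) {
        this.rules = rules;
    }

    // Assumption: rows shorter than the key never match.
    static boolean matches(byte[] row, byte[] key, byte[] info) {
        if (row.length < key.length) {
            return false;
        }
        for (int i = 0; i < key.length; i++) {
            if (info[i] == 0 && row[i] != key[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the index of a matching rule, or -1 if none matches.
    // The rule that matched last time is tried first; on a miss we
    // iterate over the remaining rules one by one.
    int findMatch(byte[] row) {
        if (lastFoundIndex >= 0) {
            byte[][] r = rules.get(lastFoundIndex);
            if (matches(row, r[0], r[1])) {
                return lastFoundIndex;
            }
        }
        for (int i = 0; i < rules.size(); i++) {
            if (i == lastFoundIndex) {
                continue; // already tried above
            }
            byte[][] r = rules.get(i);
            if (matches(row, r[0], r[1])) {
                lastFoundIndex = i;
                return i;
            }
        }
        lastFoundIndex = -1;
        return -1;
    }
}
```

The shortcut pays off when consecutive rows tend to match the same fuzzy key, which is common because rows arrive in sorted order.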
tracker
Row tracker (keeps all next rows after SEEK_NEXT_USING_HINT was returned)
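One way to picture a row tracker for multiple fuzzy keys is a priority queue of candidate "next rows", one per fuzzy key, ordered as row keys are sorted (unsigned, lexicographic). The sketch below is an assumption about the general shape only: RowTrackerSketch and its methods are hypothetical, and stale candidates are simply dropped here, whereas a real tracker would more likely recompute them.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Illustrative sketch only; not the HBase RowTracker.
final class RowTrackerSketch {
    // Unsigned lexicographic order; a shorter array sorts before any
    // longer array that it is a prefix of.
    static final Comparator<byte[]> ROW_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    };

    private final PriorityQueue<byte[]> nextRows = new PriorityQueue<>(ROW_ORDER);

    // Record the next candidate row computed for one fuzzy key.
    void offer(byte[] candidate) {
        nextRows.add(candidate);
    }

    // Discard candidates at or before currentRow, then return the
    // smallest remaining candidate, or null if there is none.
    byte[] nextHint(byte[] currentRow) {
        while (!nextRows.isEmpty() && ROW_ORDER.compare(nextRows.peek(), currentRow) <= 0) {
            nextRows.poll();
        }
        return nextRows.peek();
    }
}
```

Keeping the candidates sorted means the smallest one is available without re-evaluating every fuzzy key on each cell.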
-
-
Constructor Details
-
FuzzyRowFilter
-
-
Method Details
-
preprocessSearchKey
-
preprocessMask
We need to preprocess the mask array, since we treat 2's as unfixed positions and -1 (0xff) as fixed positions.
- Returns:
- mask array
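Read together with the class description, this suggests that a user-supplied mask value of 0 (fixed) becomes -1 and a value of 1 (non-fixed) becomes 2. The following sketch shows that mapping under this assumption; PreprocessMaskSketch is a hypothetical name, and unlike a private in-place helper it returns a copy so the caller's array is left untouched.

```java
// Illustrative sketch only; the 0 -> -1 and 1 -> 2 mapping is an assumption
// inferred from the description above.
final class PreprocessMaskSketch {

    // Returns a copy of mask with 0 (fixed) replaced by -1 (0xff) and
    // 1 (non-fixed) replaced by 2. Other values are left unchanged.
    static byte[] preprocess(byte[] mask) {
        byte[] out = mask.clone();
        for (int i = 0; i < out.length; i++) {
            if (out[i] == 0) {
                out[i] = -1;
            } else if (out[i] == 1) {
                out[i] = 2;
            }
        }
        return out;
    }

    // A mask counts as already preprocessed when it holds only -1 and 2.
    static boolean isPreprocessed(byte[] mask) {
        for (byte b : mask) {
            if (b != -1 && b != 2) {
                return false;
            }
        }
        return true;
    }
}
```

Because -1 and 2 are not valid input values, applying preprocess twice gives the same result as applying it once.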
-
isPreprocessedMask
-
getFuzzyKeys
Returns the Fuzzy keys in the format expected by the constructor.
- Returns:
- the Fuzzy keys in the format expected by the constructor
-
reset
Description copied from class: FilterBase
Filters that are purely stateless and do nothing in their reset() methods can inherit this null/empty implementation. Reset the state of the filter between rows. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Overrides:
- reset in class FilterBase
- Throws:
- IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
filterRow
Description copied from class: FilterBase
Filters that never filter by rows based on previously gathered state from Filter.filterCell(Cell) can inherit this implementation that never filters a row. Last chance to veto a row based on previous Filter.filterCell(Cell) calls. The filter needs to retain state and then return a particular value for this call if it wishes to exclude a row, for example when a certain column is missing. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Overrides:
- filterRow in class FilterBase
- Returns:
- true to exclude row, false to include row.
- Throws:
- IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
filterCell
Description copied from class: Filter
A way to filter based on the column family, column qualifier and/or the column value. Return code is described below. This allows filters to filter only a certain number of columns, then terminate without matching every column. If filterRowKey returns true, filterCell needs to be consistent with it. filterCell can assume that filterRowKey has already been called for the row. If your filter returns ReturnCode.NEXT_ROW, it should return ReturnCode.NEXT_ROW until Filter.reset() is called, just in case the caller calls for the next row. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Overrides:
- filterCell in class Filter
- Parameters:
- c - the Cell in question
- Returns:
- code as described below
- See Also:
-
getNextCellHint
Description copied from class: FilterBase
Filters that are not sure which key must be next seeked to, can inherit this implementation that, by default, returns a null Cell. If the filter returns the match code SEEK_NEXT_USING_HINT, then it should also tell which is the next key it must seek to. After receiving the match code SEEK_NEXT_USING_HINT, the QueryMatcher would call this function to find out which key it must next seek to. Concrete implementers can signal a failure condition in their code by throwing an IOException. NOTICE: The filter will be evaluated at server side, so the returned Cell must be an ExtendedCell, although it is marked as IA.Private.
- Overrides:
- getNextCellHint in class FilterBase
- Returns:
- KeyValue which must be next seeked. Returns null if the filter is not sure which key to seek to next.
-
filterAllRemaining
Description copied from class: FilterBase
Filters that never filter all remaining can inherit this implementation that never stops the filter early. If this returns true, the scan will terminate. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Overrides:
- filterAllRemaining in class FilterBase
- Returns:
- true to end scan, false to continue.
-
toByteArray
Returns the filter serialized using pb
- Overrides:
- toByteArray in class FilterBase
- Returns:
- The filter serialized using pb
-
parseFrom
Parse a serialized representation of FuzzyRowFilter
- Parameters:
- pbBytes - A pb serialized FuzzyRowFilter instance
- Returns:
- An instance of FuzzyRowFilter made from bytes
- Throws:
- DeserializationException - if an error occurred
- See Also:
-
toString
Description copied from class: FilterBase
Return filter's info for debugging and logging purpose.
- Overrides:
- toString in class FilterBase
-
satisfies
static FuzzyRowFilter.SatisfiesCode satisfies(byte[] row, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
-
satisfies
static FuzzyRowFilter.SatisfiesCode satisfies(boolean reverse, byte[] row, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
-
satisfies
static FuzzyRowFilter.SatisfiesCode satisfies(boolean reverse, byte[] row, int offset, int length, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
-
satisfiesNoUnsafe
static FuzzyRowFilter.SatisfiesCode satisfiesNoUnsafe(boolean reverse, byte[] row, int offset, int length, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
-
getNextForFuzzyRule
-
getNextForFuzzyRule
static byte[] getNextForFuzzyRule(boolean reverse, byte[] row, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
-
getNextForFuzzyRule
static byte[] getNextForFuzzyRule(boolean reverse, byte[] row, int offset, int length, byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta)
Find out the closest next byte array that satisfies the fuzzy rule and is after the given one. In the reverse case it returns an increased byte array to make sure that the proper row is selected next.
- Returns:
- byte array which is after the given row and which satisfies the fuzzy rule if it exists, null otherwise
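To make "closest next byte array that satisfies the fuzzy rule" concrete, here is a simplified forward-only sketch. It is not the method's actual code: NextForFuzzyRuleSketch is a hypothetical name, the mask uses the public convention (0 = fixed, 1 = non-fixed) rather than any preprocessed form, the row, key and mask are assumed to have equal length, and reverse scans and trailing-zero trimming are left out.

```java
// Illustrative forward-only sketch; not the HBase implementation.
final class NextForFuzzyRuleSketch {

    // Returns the smallest byte array (unsigned lexicographic order) that is
    // strictly greater than row and equals key on every fixed position, or
    // null if no such array exists.
    // Assumptions: mask[i] == 0 means fixed, anything else means non-fixed;
    // row, key and mask all have the same length.
    static byte[] next(byte[] row, byte[] key, byte[] mask) {
        int n = key.length;
        byte[] result = new byte[n];
        for (int i = 0; i < n; i++) {
            // Start from the smallest candidate: key bytes on fixed
            // positions, zero on non-fixed positions.
            result[i] = mask[i] == 0 ? key[i] : 0;
        }
        int toInc = -1;            // right-most non-fixed position seen that is not 0xff
        boolean increased = false; // true once a fixed byte already exceeds the row
        for (int i = 0; i < n; i++) {
            int r = row[i] & 0xFF;
            if (mask[i] != 0) {
                result[i] = row[i];
                if (r != 0xFF) {
                    toInc = i;
                }
            } else {
                int k = key[i] & 0xFF;
                if (r < k) {
                    increased = true; // the fixed byte makes the result larger
                    break;
                }
                if (r > k) {
                    break; // must bump an earlier non-fixed byte instead
                }
            }
        }
        if (!increased) {
            if (toInc < 0) {
                return null; // nothing left to increment
            }
            result[toInc]++;
            // Reset non-fixed bytes to the right so the result is minimal.
            for (int i = toInc + 1; i < n; i++) {
                if (mask[i] != 0) {
                    result[i] = 0;
                }
            }
        }
        return result;
    }
}
```

For a mask of {1, 0, 1} with a fixed middle byte of 5, the row {1, 7, 2} has no match sharing its first byte, so the first byte is incremented and the result is {2, 5, 0}.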
-
trimTrailingZeroes
For forward scanner, next cell hint should not contain any trailing zeroes unless they are part of fuzzyKeyMeta. For example, hint = '\x01\x01\x01\x00\x00' will skip valid row '\x01\x01\x01'.
- Parameters:
- toInc - position of incremented byte
- Returns:
- trimmed version of result
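The reason a hint with trailing zeroes can skip a valid row is that, in unsigned lexicographic order, a shorter array sorts before any longer array it is a prefix of. The sketch below demonstrates this with the example values from the description and shows a simplified trim. TrailingZeroSketch is a hypothetical name, and the minLength parameter is a stand-in for "the part of the hint that must be kept"; how the real method derives that bound from fuzzyKeyMeta and toInc is not shown here.

```java
import java.util.Arrays;

// Illustrative sketch only; not the HBase trimTrailingZeroes method.
final class TrailingZeroSketch {

    // Unsigned lexicographic comparison; a proper prefix sorts first.
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // Drops trailing 0x00 bytes, but never shortens below minLength.
    // minLength is a hypothetical parameter for this sketch.
    static byte[] trim(byte[] hint, int minLength) {
        int end = hint.length;
        while (end > minLength && end > 0 && hint[end - 1] == 0) {
            end--;
        }
        return Arrays.copyOf(hint, end);
    }

    public static void main(String[] args) {
        byte[] validRow = {1, 1, 1};
        byte[] hint = {1, 1, 1, 0, 0};
        // Negative: the valid row sorts before the untrimmed hint, so a
        // seek to the hint would jump past it.
        System.out.println(compareRows(validRow, hint) < 0);        // true
        // After trimming, the hint equals the valid row.
        System.out.println(compareRows(validRow, trim(hint, 3)));   // 0
    }
}
```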
-
areSerializedFieldsEqual
Returns true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other. Used for testing.
- Overrides:
- areSerializedFieldsEqual in class FilterBase
- Returns:
- true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other. Used for testing.
-
equals
-
hashCode
-