Package org.apache.hadoop.hbase.filter
Class MultiRowRangeFilter
java.lang.Object
org.apache.hadoop.hbase.filter.Filter
org.apache.hadoop.hbase.filter.FilterBase
org.apache.hadoop.hbase.filter.MultiRowRangeFilter
- All Implemented Interfaces:
HintingFilter
Filter to support scanning multiple row key ranges. It constructs the row key ranges from the
passed list, which can be accessed by each region server. HBase is quite efficient when scanning
only one small row key range. If a user needs to specify multiple row key ranges in one scan, the
typical solutions are: 1. a FilterList, which is a list of row key Filters; 2. a SQL layer over
HBase, such as Hive or Phoenix, used to join with two tables. However, both solutions are
inefficient: neither can use the range information to fast-forward during the scan, so the scan
is time-consuming. If the number of ranges is very large (e.g. millions), a join is a proper
solution even though it is slow. However, there are cases where a user wants to scan a small
number of ranges (e.g. <1000 ranges), and neither solution provides satisfactory performance in
such cases. MultiRowRangeFilter supports this use case (scanning multiple row key ranges): it
constructs the row key ranges from a user-specified list and performs fast-forwarding during the
scan, so the scan is quite efficient.
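The fast-forwarding idea described above can be illustrated with a minimal, self-contained sketch. It is not the HBase implementation: the class `RangeSeekSketch`, its nested `Range` type, and the method `nextRowToRead` are hypothetical names, row keys are modeled as `String` rather than `byte[]`, and every range is assumed to be half-open, sorted by start, and non-overlapping.

```java
import java.util.List;

final class RangeSeekSketch {
    /** Hypothetical half-open row range [start, stop); not an HBase type. */
    static final class Range {
        final String start;
        final String stop;

        Range(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    /**
     * Returns the row the scan should read next: the row itself when it lies
     * inside a range, the start of the next range when it lies in a gap, or
     * null when it is past every range (the scan can stop).
     * Assumes the list is sorted by start and the ranges do not overlap.
     */
    static String nextRowToRead(List<Range> sorted, String row) {
        int lo = 0;
        int hi = sorted.size();
        // Binary search for the first range whose stop is greater than row.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).stop.compareTo(row) <= 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        if (lo == sorted.size()) {
            return null; // row is beyond the last range
        }
        Range candidate = sorted.get(lo);
        // Inside the range: keep the row. In a gap: jump to the range start.
        return candidate.start.compareTo(row) <= 0 ? row : candidate.start;
    }
}
```

With ranges [b, d) and [m, p), a scan positioned at row "e" would be told to jump directly to "m" instead of examining every row in between, which is the saving that a list of independent row filters cannot provide.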
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
private static class | |
private static class | MultiRowRangeFilter.RangeIteration | Abstraction over the ranges of rows to return from this filter, regardless of forward or reverse scans being used.
private static class | | Internal RowRange that reverses the sort-order to handle reverse scans.
static class | MultiRowRangeFilter.RowRange |
Nested classes/interfaces inherited from class org.apache.hadoop.hbase.filter.Filter
Filter.ReturnCode -
Field Summary
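Fields
Modifier and Type | Field | Description
private Filter.ReturnCode | currentReturnCode |
private boolean | done |
private int | index |
private final List<MultiRowRangeFilter.RowRange> | rangeList |
private final MultiRowRangeFilter.RangeIteration | ranges |
private static final int | ROW_BEFORE_FIRST_RANGE |
-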
Constructor Summary
Constructors
Constructor | Description
MultiRowRangeFilter(byte[][] rowKeyPrefixes) | Constructor for creating a MultiRowRangeFilter from multiple rowkey prefixes.
-
Method Summary
Modifier and Type | Method | Description
(package private) boolean | areSerializedFieldsEqual | Returns true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other.
private static List<MultiRowRangeFilter.RowRange> | createRangeListFromRowKeyPrefixes(byte[][] rowKeyPrefixes) |
boolean | equals |
boolean | filterAllRemaining | Filters that never filter all remaining can inherit this implementation that never stops the filter early.
 | filterCell(Cell ignored) | A way to filter based on the column family, column qualifier and/or the column value.
boolean | filterRowKey(Cell firstRowCell) | Filters a row based on the row key.
 | getNextCellHint(Cell currentKV) | Filters that are not sure which key must be next seeked to, can inherit this implementation that, by default, returns a null Cell.
int | hashCode() |
static MultiRowRangeFilter | parseFrom(byte[] pbBytes) | Parse a serialized representation of MultiRowRangeFilter
static List<MultiRowRangeFilter.RowRange> | sortAndMerge(List<MultiRowRangeFilter.RowRange> ranges) | Sorts the ranges and merges any that overlap.
private static void | throwExceptionForInvalidRanges(List<MultiRowRangeFilter.RowRange> invalidRanges, boolean details) |
byte[] | toByteArray | Returns the filter serialized using pb
Methods inherited from class org.apache.hadoop.hbase.filter.FilterBase
createFilterFromArguments, filterRow, filterRowCells, hasFilterRow, isFamilyEssential, reset, toString, transformCell
Methods inherited from class org.apache.hadoop.hbase.filter.Filter
isReversed, setReversed
-
Field Details
-
ROW_BEFORE_FIRST_RANGE
-
rangeList
-
ranges
-
done
-
index
-
range
-
currentReturnCode
-
-
Constructor Details
-
MultiRowRangeFilter
- Parameters:
list - A list of RowRange
-
MultiRowRangeFilter
Constructor for creating a MultiRowRangeFilter from multiple rowkey prefixes. As the MultiRowRangeFilter class description notes (see solution 1 there), a filter list that scans the row keys matching given prefixes (e.g., a FilterList composed of multiple PrefixFilters) is inefficient; this constructor provides a way to avoid creating one.
- Parameters:
rowKeyPrefixes - the array of byte arrays, one per rowkey prefix
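A prefix can be treated as a range because every key that starts with the prefix lies between the prefix itself (inclusive) and the prefix with its last byte incremented (exclusive). The sketch below shows that conversion under stated assumptions: `PrefixRangeSketch` and `stopKeyForPrefix` are hypothetical names, not HBase API, and using an empty array to mean "no upper bound" is a convention of this sketch.

```java
import java.util.Arrays;

final class PrefixRangeSketch {
    /**
     * Returns the smallest key that sorts after every key starting with
     * the given prefix, comparing bytes as unsigned values. An empty array
     * means there is no upper bound (assumption of this sketch).
     */
    static byte[] stopKeyForPrefix(byte[] prefix) {
        int len = prefix.length;
        // A trailing 0xFF byte cannot be incremented, so drop it and
        // increment the byte before it instead.
        while (len > 0 && prefix[len - 1] == (byte) 0xFF) {
            len--;
        }
        if (len == 0) {
            return new byte[0]; // prefix was empty or all 0xFF bytes
        }
        byte[] stop = Arrays.copyOf(prefix, len);
        stop[len - 1]++; // increment the last remaining byte
        return stop;
    }
}
```

For example, the ASCII prefix "abc" yields the stop key "abd", so the prefix corresponds to the range ["abc", "abd").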
-
-
Method Details
-
createRangeListFromRowKeyPrefixes
private static List<MultiRowRangeFilter.RowRange> createRangeListFromRowKeyPrefixes(byte[][] rowKeyPrefixes) -
getRowRanges
-
filterAllRemaining
Description copied from class: FilterBase
Filters that never filter all remaining can inherit this implementation that never stops the filter early. If this returns true, the scan will terminate. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Overrides:
filterAllRemaining in class FilterBase
- Returns:
true to end scan, false to continue.
-
filterRowKey
Description copied from class: Filter
Filters a row based on the row key. If this returns true, the entire row will be excluded. If false, each KeyValue in the row will be passed to Filter.filterCell(Cell) below. If Filter.filterAllRemaining() returns true, then Filter.filterRowKey(Cell) should also return true. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Overrides:
filterRowKey in class FilterBase
- Parameters:
firstRowCell - The first cell coming in the new row
- Returns:
true to remove the entire row, false to include the row (maybe).
-
filterCell
Description copied from class: Filter
A way to filter based on the column family, column qualifier and/or the column value. Return code is described below. This allows filters to filter only a certain number of columns, then terminate without matching every column. If filterRowKey returns true, filterCell needs to be consistent with it. filterCell can assume that filterRowKey has already been called for the row. If your filter returns ReturnCode.NEXT_ROW, it should return ReturnCode.NEXT_ROW until Filter.reset() is called, just in case the caller calls for the next row. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Overrides:
filterCell in class Filter
- Parameters:
ignored - the Cell in question
- Returns:
code as described below
-
getNextCellHint
Description copied from class: FilterBase
Filters that are not sure which key to seek to next can inherit this implementation that, by default, returns a null Cell. If the filter returns the match code SEEK_NEXT_USING_HINT, then it should also tell which is the next key it must seek to. After receiving the match code SEEK_NEXT_USING_HINT, the QueryMatcher would call this function to find out which key it must next seek to. Concrete implementers can signal a failure condition in their code by throwing an IOException. NOTICE: Filters are evaluated on the server side, so the returned Cell must be an ExtendedCell, although it is marked as IA.Private.
- Overrides:
getNextCellHint in class FilterBase
- Returns:
The KeyValue to seek to next, or null if the filter is not sure which key to seek to next.
-
toByteArray
Returns the filter serialized using pb
- Overrides:
toByteArray in class FilterBase
- Returns:
The filter serialized using pb
-
parseFrom
Parse a serialized representation of MultiRowRangeFilter
- Parameters:
pbBytes - A pb serialized instance
- Returns:
An instance of MultiRowRangeFilter
- Throws:
DeserializationException - if an error occurred
-
areSerializedFieldsEqual
Returns true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other. Used for testing.
- Overrides:
areSerializedFieldsEqual in class FilterBase
- Returns:
true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other. Used for testing.
-
sortAndMerge
public static List<MultiRowRangeFilter.RowRange> sortAndMerge(List<MultiRowRangeFilter.RowRange> ranges)
Sorts the ranges and merges any that overlap.
- Parameters:
ranges - the list of ranges to sort and merge.
- Returns:
the ranges after sort and merge.
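The sort-then-merge step can be sketched in plain Java as follows. This is an illustration under simplifying assumptions, not the library's code: `SortAndMergeSketch` and its `Range` type are hypothetical, row keys are `String` instead of `byte[]`, every range is half-open [start, stop), and ranges that merely touch are merged as well. The real RowRange may carry inclusive or exclusive bounds that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SortAndMergeSketch {
    /** Hypothetical half-open row range [start, stop); not an HBase type. */
    static final class Range {
        final String start;
        final String stop;

        Range(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    /** Sorts ranges by start and merges any that overlap or touch. */
    static List<Range> sortAndMerge(List<Range> ranges) {
        List<Range> sorted = new ArrayList<>(ranges); // leave the input untouched
        sorted.sort(Comparator.comparing((Range r) -> r.start));

        List<Range> merged = new ArrayList<>();
        for (Range r : sorted) {
            if (!merged.isEmpty()) {
                Range last = merged.get(merged.size() - 1);
                if (r.start.compareTo(last.stop) <= 0) {
                    // Overlapping or adjacent: extend the previous range
                    // to the larger of the two stop keys.
                    String stop = last.stop.compareTo(r.stop) >= 0 ? last.stop : r.stop;
                    merged.set(merged.size() - 1, new Range(last.start, stop));
                    continue;
                }
            }
            merged.add(r);
        }
        return merged;
    }
}
```

Given [m, p), [a, c) and [b, e), the sketch returns two ranges, [a, e) and [m, p). A sorted, non-overlapping list is what makes it possible to locate the next range to seek to with a simple search.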
-
throwExceptionForInvalidRanges
private static void throwExceptionForInvalidRanges(List<MultiRowRangeFilter.RowRange> invalidRanges, boolean details) -
equals
-
hashCode
-