Package org.apache.hadoop.hbase.filter
Class FilterBase
java.lang.Object
org.apache.hadoop.hbase.filter.Filter
org.apache.hadoop.hbase.filter.FilterBase
- Direct Known Subclasses:
AccessControlFilter, ColumnCountGetFilter, ColumnPaginationFilter, ColumnPrefixFilter, ColumnRangeFilter, ColumnValueFilter, CompareFilter, FilterAllFilter, FilterList, FilterListBase, FirstKeyOnlyFilter, FuzzyRowFilter, InclusiveStopFilter, KeyOnlyFilter, MobReferenceOnlyFilter, MultipleColumnPrefixFilter, MultiRowRangeFilter, PageFilter, PrefixFilter, RandomRowFilter, SingleColumnValueFilter, SkipFilter, TimestampsFilter, VisibilityController.DeleteVersionVisibilityExpressionFilter, VisibilityLabelFilter, WhileMatchFilter
Abstract base class to help you implement new Filters. Common "ignore" or NOOP type methods can
go here, helping to reduce boilerplate in an ever-expanding filter library. If you could
instantiate FilterBase, it would end up being a "null" filter, that is, one that never filters
anything.
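The "null filter" idea can be sketched without HBase on the classpath. The following is a simplified model under stated assumptions: MiniFilter, MiniFilterBase, and RowPrefixFilter are hypothetical stand-ins (not HBase classes), row keys are plain Strings rather than Cells, and the stand-in base class is concrete so that it can be instantiated for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the abstract filter contract (not an HBase type).
abstract class MiniFilter {
    abstract void reset();
    abstract boolean filterAllRemaining();
    abstract boolean filterRowKey(String rowKey);
    abstract boolean filterRow();
}

// Hypothetical stand-in for FilterBase: every hook is a no-op that filters nothing.
// Unlike the real FilterBase it is concrete, so the "null" filter can be observed.
class MiniFilterBase extends MiniFilter {
    @Override void reset() { }
    @Override boolean filterAllRemaining() { return false; }
    @Override boolean filterRowKey(String rowKey) { return false; }
    @Override boolean filterRow() { return false; }
}

// A concrete filter overrides only the hook it cares about.
class RowPrefixFilter extends MiniFilterBase {
    private final String prefix;

    RowPrefixFilter(String prefix) { this.prefix = prefix; }

    @Override boolean filterRowKey(String rowKey) { return !rowKey.startsWith(prefix); }
}

class NullFilterDemo {
    // Returns the row keys that survive the filter.
    static List<String> scan(List<String> rowKeys, MiniFilter filter) {
        List<String> kept = new ArrayList<>();
        for (String rowKey : rowKeys) {
            filter.reset();
            if (filter.filterAllRemaining()) break;
            if (filter.filterRowKey(rowKey)) continue;
            if (filter.filterRow()) continue;
            kept.add(rowKey);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a1", "a2", "b1");
        System.out.println(scan(rows, new MiniFilterBase()));     // prints [a1, a2, b1]
        System.out.println(scan(rows, new RowPrefixFilter("a"))); // prints [a1, a2]
    }
}
```

The subclass stays short because every hook it does not override keeps the "filter nothing" default.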
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.hbase.filter.Filter
Filter.ReturnCode -
Field Summary
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method: Description
(package private) boolean areSerializedFieldsEqual(Filter other): Default implementation so that writers of custom filters aren't forced to implement.
static Filter createFilterFromArguments(ArrayList<byte[]> filterArguments): Given the filter's arguments it constructs the filter
boolean filterAllRemaining(): Filters that never filter all remaining can inherit this implementation that never stops the filter early.
boolean filterRow(): Filters that never filter by rows based on previously gathered state from Filter.filterCell(Cell) can inherit this implementation that never filters a row.
void filterRowCells(List<Cell> ignored): Filters that never filter by modifying the returned List of Cells can inherit this implementation that does nothing.
boolean filterRowKey(Cell cell): Filters a row based on the row key.
Cell getHintForRejectedRow(Cell firstRowCell): Filters that cannot provide a seek hint after row-key rejection can inherit this no-op implementation.
Cell getNextCellHint(Cell currentCell): Filters that are not sure which key must be next seeked to, can inherit this implementation that, by default, returns a null Cell.
Cell getSkipHint(Cell skippedCell): Filters that cannot provide a structural-skip seek hint can inherit this no-op implementation.
boolean hasFilterRow(): Filters that never filter by modifying the returned List of Cells can inherit this implementation that does nothing.
boolean isFamilyEssential(byte[] name): By default, we require all scan's column families to be present.
void reset(): Filters that are purely stateless and do nothing in their reset() methods can inherit this null/empty implementation.
byte[] toByteArray(): Return length 0 byte array for Filters that don't require special serialization
String toString(): Return filter's info for debugging and logging purpose.
Cell transformCell(Cell v): By default no transformation takes place. Give the filter a chance to transform the passed Cell.
Methods inherited from class org.apache.hadoop.hbase.filter.Filter
filterCell, isReversed, parseFrom, setReversed
-
Constructor Details
-
FilterBase
public FilterBase()
-
-
Method Details
-
reset
Filters that are purely stateless and do nothing in their reset() methods can inherit this null/empty implementation. Reset the state of the filter between rows. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Specified by:
reset in class Filter
- Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
filterRowKey
Description copied from class: Filter
Filters a row based on the row key. If this returns true, the entire row will be excluded. If false, each KeyValue in the row will be passed to Filter.filterCell(Cell) below. If Filter.filterAllRemaining() returns true, then Filter.filterRowKey(Cell) should also return true. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Specified by:
filterRowKey in class Filter
- Parameters:
cell - The first cell coming in the new row
- Returns:
true to remove the entire row, false to include the row (maybe).
- Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
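The relationship between filterRowKey and filterAllRemaining can be illustrated with a simplified model. StopAfterPrefixFilter and RowKeyScanDemo below are hypothetical stand-ins (not HBase classes), row keys are Strings, and the scan loop is an assumption about how a caller might consult the two methods, not the actual region scanner.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not an HBase class: row keys are Strings instead of Cells.
class StopAfterPrefixFilter {
    private final String prefix;
    private boolean passedPrefix = false;
    int rowKeyCalls = 0; // instrumentation for the demo only

    StopAfterPrefixFilter(String prefix) { this.prefix = prefix; }

    // True once the scan has moved beyond every row that could match.
    boolean filterAllRemaining() { return passedPrefix; }

    // True means "exclude the entire row".
    boolean filterRowKey(String rowKey) {
        rowKeyCalls++;
        if (filterAllRemaining()) return true;        // keep both answers consistent
        if (rowKey.startsWith(prefix)) return false;  // row may be included
        // Input is sorted, so a key greater than the prefix means no later row can match.
        if (rowKey.compareTo(prefix) > 0) passedPrefix = true;
        return true;
    }
}

class RowKeyScanDemo {
    // Walks row keys in sorted order and returns the ones that are not excluded.
    static List<String> scan(List<String> sortedRowKeys, StopAfterPrefixFilter filter) {
        List<String> kept = new ArrayList<>();
        for (String rowKey : sortedRowKeys) {
            if (filter.filterAllRemaining()) break; // terminate the scan early
            if (!filter.filterRowKey(rowKey)) kept.add(rowKey);
        }
        return kept;
    }

    public static void main(String[] args) {
        StopAfterPrefixFilter filter = new StopAfterPrefixFilter("b");
        List<String> rows = List.of("a1", "b1", "b2", "c1", "c2", "d1");
        System.out.println(scan(rows, filter));  // prints [b1, b2]
        System.out.println(filter.rowKeyCalls);  // prints 4
    }
}
```

Only four of the six row keys are examined: the first key past the prefix flips the flag, and the next check of filterAllRemaining ends the scan.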
-
filterAllRemaining
Filters that never filter all remaining can inherit this implementation that never stops the filter early. If this returns true, the scan will terminate. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Specified by:
filterAllRemaining in class Filter
- Returns:
true to end scan, false to continue.
- Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
transformCell
By default no transformation takes place. Give the filter a chance to transform the passed Cell. If the Cell is changed, a new Cell object must be returned. NOTICE: Filters are evaluated on the server side, so the returned Cell must be an ExtendedCell, although it is marked as IA.Private.
- Specified by:
transformCell in class Filter
- Parameters:
v - the Cell in question
- Returns:
the changed Cell
- Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
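The rule that a changed Cell must be returned as a new object can be sketched as follows. MiniCell and MiniTransforms are hypothetical stand-ins (not HBase classes), and the value is a String rather than a byte array to keep the model immutable.

```java
// Hypothetical stand-in, not an HBase class: an immutable key/value pair.
final class MiniCell {
    final String key;
    final String value;

    MiniCell(String key, String value) {
        this.key = key;
        this.value = value;
    }
}

class MiniTransforms {
    // Default behavior: no transformation, so the same instance is returned.
    static MiniCell identity(MiniCell cell) {
        return cell;
    }

    // Key-only behavior: the input is left untouched and a new cell is returned.
    static MiniCell stripValue(MiniCell cell) {
        return new MiniCell(cell.key, "");
    }

    public static void main(String[] args) {
        MiniCell cell = new MiniCell("row1/cf:q", "payload");
        System.out.println(identity(cell) == cell);   // prints true
        System.out.println(stripValue(cell) == cell); // prints false
        System.out.println(cell.value);               // prints payload
    }
}
```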
-
filterRowCells
Filters that never filter by modifying the returned List of Cells can inherit this implementation that does nothing. Chance to alter the list of Cells to be submitted. Modifications to the list will carry on. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Specified by:
filterRowCells in class Filter
- Parameters:
ignored - the list of Cells to be filtered
- Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
hasFilterRow
Filters that never filter by modifying the returned List of Cells can inherit this implementation that does nothing. Primarily used to check for conflicts with scans (such as scans that do not read a full row at a time).
- Specified by:
hasFilterRow in class Filter
- Returns:
True if this filter actively uses filterRowCells(List) or filterRow().
-
filterRow
Filters that never filter by rows based on previously gathered state from Filter.filterCell(Cell) can inherit this implementation that never filters a row. Last chance to veto a row based on previous Filter.filterCell(Cell) calls. To exclude a row when, for example, a certain column is missing, the filter needs to retain state across those calls and then return the appropriate value here. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Specified by:
filterRow in class Filter
- Returns:
true to exclude row, false to include row.
- Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
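The "exclude a row if a certain column is missing" case can be sketched as a small stateful filter. RequireColumnFilter and FilterRowDemo are hypothetical stand-ins (not HBase classes); a cell is reduced to its column name, and filterCell returns a boolean here, whereas the real method returns a Filter.ReturnCode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in, not an HBase class: a "cell" is reduced to its column name.
class RequireColumnFilter {
    private final String requiredColumn;
    private boolean sawRequiredColumn = false;

    RequireColumnFilter(String requiredColumn) { this.requiredColumn = requiredColumn; }

    // Clears per-row state before each new row.
    void reset() { sawRequiredColumn = false; }

    // Called once per cell. Simplified to a boolean: true means "include the cell".
    // Also records the state that filterRow() consults later.
    boolean filterCell(String column) {
        if (column.equals(requiredColumn)) sawRequiredColumn = true;
        return true;
    }

    // Last chance to veto: true excludes the whole row.
    boolean filterRow() { return !sawRequiredColumn; }
}

class FilterRowDemo {
    // table maps each row key to the column names present in that row.
    static List<String> scan(Map<String, List<String>> table, RequireColumnFilter filter) {
        List<String> keptRows = new ArrayList<>();
        for (Map.Entry<String, List<String>> row : table.entrySet()) {
            filter.reset();
            for (String column : row.getValue()) {
                filter.filterCell(column);
            }
            if (!filter.filterRow()) keptRows.add(row.getKey());
        }
        return keptRows;
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new TreeMap<>();
        table.put("row1", List.of("name", "email"));
        table.put("row2", List.of("name"));
        table.put("row3", List.of("email"));
        System.out.println(scan(table, new RequireColumnFilter("email"))); // prints [row1, row3]
    }
}
```

If reset() did not clear the flag, row2 would be kept by mistake, because the state gathered while reading row1 would leak into the next row.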
-
getNextCellHint
Filters that are not sure which key to seek to next can inherit this implementation that, by default, returns a null Cell. If the filter returns the match code SEEK_NEXT_USING_HINT, then it should also tell which is the next key it must seek to. After receiving the match code SEEK_NEXT_USING_HINT, the QueryMatcher would call this function to find out which key it must seek to next. Concrete implementers can signal a failure condition in their code by throwing an IOException. NOTICE: Filters are evaluated on the server side, so the returned Cell must be an ExtendedCell, although it is marked as IA.Private.
- Specified by:
getNextCellHint in class Filter
- Returns:
The KeyValue to seek to next, or null if the filter is not sure which key to seek to next.
- Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
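The handshake between SEEK_NEXT_USING_HINT and getNextCellHint can be modeled over a sorted set of keys. MiniReturnCode, RangeStartFilter, and SeekHintDemo are hypothetical stand-ins (not HBase classes); keys are Strings, and the scan loop is an assumed simplification of what the QueryMatcher and scanner do together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical stand-ins, not HBase classes: only two return codes are modeled.
enum MiniReturnCode { INCLUDE, SEEK_NEXT_USING_HINT }

class RangeStartFilter {
    private final String rangeStart;
    int filterCellCalls = 0; // instrumentation for the demo only

    RangeStartFilter(String rangeStart) { this.rangeStart = rangeStart; }

    MiniReturnCode filterCell(String key) {
        filterCellCalls++;
        return key.compareTo(rangeStart) < 0
            ? MiniReturnCode.SEEK_NEXT_USING_HINT
            : MiniReturnCode.INCLUDE;
    }

    // Consulted only after filterCell returned SEEK_NEXT_USING_HINT.
    // Returning null would mean "not sure where to seek next".
    String getNextCellHint(String currentKey) { return rangeStart; }
}

class SeekHintDemo {
    static List<String> scan(NavigableSet<String> keys, RangeStartFilter filter) {
        List<String> kept = new ArrayList<>();
        String current = keys.isEmpty() ? null : keys.first();
        while (current != null) {
            if (filter.filterCell(current) == MiniReturnCode.INCLUDE) {
                kept.add(current);
                current = keys.higher(current);
                continue;
            }
            String hint = filter.getNextCellHint(current);
            // Seek to the first key at or after the hint. Without a usable hint,
            // step to the next key so the loop always makes progress.
            boolean usable = hint != null && hint.compareTo(current) > 0;
            current = usable ? keys.ceiling(hint) : keys.higher(current);
        }
        return kept;
    }

    public static void main(String[] args) {
        NavigableSet<String> keys =
            new TreeSet<>(List.of("a1", "a2", "a3", "a4", "a5", "m1", "m2"));
        RangeStartFilter filter = new RangeStartFilter("m");
        System.out.println(scan(keys, filter));      // prints [m1, m2]
        System.out.println(filter.filterCellCalls);  // prints 3
    }
}
```

Seven keys are stored but the filter is consulted only three times, because the hint lets the scan jump from a1 directly to m1.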
-
getHintForRejectedRow
Filters that cannot provide a seek hint after row-key rejection can inherit this no-op implementation. Subclasses whose row-key logic (e.g. a range pointer advanced inside filterRowKey(Cell)) makes a better seek target available should override this.
Provides a seek hint to bypass row-by-row scanning after Filter.filterRowKey(Cell) rejects a row. When filterRowKey returns true, the scan pipeline would normally iterate through every remaining cell in the rejected row one by one (via nextRow()) before moving on. If the filter can determine a better forward position (for example, the next range boundary in a MultiRowRangeFilter), it should return that target cell here, allowing the scanner to seek directly past the unwanted rows.
Contract:
- Only called after Filter.filterRowKey(Cell) has returned true for the same firstRowCell.
- Implementations may use state that was set during Filter.filterRowKey(Cell) (e.g. an updated range pointer), but must not invoke Filter.filterCell(Cell) logic; the caller guarantees that filterCell has not been called for this row.
- The returned Cell, if non-null, must be an ExtendedCell because filters are evaluated on the server side.
- Returning null (the default) falls through to the existing nextRow() behaviour, preserving full backward compatibility.
- For reversed scans (Scan.isReversed()), the hint must point to a smaller row key (that is, further along in the reverse-scan direction). The scanner validates hint direction and falls back to nextRow() if the hint does not advance in the scan direction.
- Composite filter limitation: FilterList, SkipFilter, and WhileMatchFilter do not currently delegate this method to wrapped sub-filters. Hints from filters used inside these wrappers will be silently ignored.
- Overrides:
getHintForRejectedRow in class Filter
- Parameters:
firstRowCell - the first cell encountered in the rejected row; contains the row key that was passed to filterRowKey
- Returns:
a Cell representing the earliest position the scanner should seek to, or null if this filter cannot provide a better position than a sequential skip
- Throws:
IOException - in case an I/O or filter-specific failure needs to be signaled
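The contract for this method can be sketched with a small range filter and a scan loop that validates the hint direction. MiniMultiRangeFilter and RejectedRowHintDemo are hypothetical stand-ins (not HBase classes); each row is a single String, so the saving shows up as fewer row-key checks rather than fewer cells, and the validation logic is an assumption about what "advances in the scan direction" means.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical stand-in, not an HBase class: ranges are sorted {start, stop} pairs, stop exclusive.
class MiniMultiRangeFilter {
    private final List<String[]> ranges;
    private int rangeIndex = 0;
    int rowKeyCalls = 0; // instrumentation for the demo only

    MiniMultiRangeFilter(List<String[]> ranges) { this.ranges = ranges; }

    boolean filterAllRemaining() { return rangeIndex >= ranges.size(); }

    // Advances the range pointer past ranges that end at or before this row,
    // then reports whether the row falls outside the current range.
    boolean filterRowKey(String row) {
        rowKeyCalls++;
        while (rangeIndex < ranges.size() && row.compareTo(ranges.get(rangeIndex)[1]) >= 0) {
            rangeIndex++;
        }
        if (rangeIndex >= ranges.size()) return true;
        return row.compareTo(ranges.get(rangeIndex)[0]) < 0;
    }

    // Reuses the pointer set by filterRowKey: the start of the current range
    // is the next row worth visiting. Null means "no better position known".
    String getHintForRejectedRow(String rejectedRow) {
        return rangeIndex < ranges.size() ? ranges.get(rangeIndex)[0] : null;
    }
}

class RejectedRowHintDemo {
    // A hint is usable only if it moves strictly ahead in the scan direction.
    static boolean advances(String current, String hint, boolean reversed) {
        if (hint == null) return false;
        int cmp = hint.compareTo(current);
        return reversed ? cmp < 0 : cmp > 0;
    }

    // Forward scan over sorted rows.
    static List<String> scan(NavigableSet<String> rows, MiniMultiRangeFilter filter) {
        List<String> kept = new ArrayList<>();
        String current = rows.isEmpty() ? null : rows.first();
        while (current != null && !filter.filterAllRemaining()) {
            if (!filter.filterRowKey(current)) {
                kept.add(current);
                current = rows.higher(current);
                continue;
            }
            String hint = filter.getHintForRejectedRow(current);
            // Fall back to the next row when the hint is missing or points backwards.
            current = advances(current, hint, false) ? rows.ceiling(hint) : rows.higher(current);
        }
        return kept;
    }

    public static void main(String[] args) {
        NavigableSet<String> rows =
            new TreeSet<>(List.of("a1", "a2", "a3", "c1", "c2", "e1", "g1", "g2"));
        MiniMultiRangeFilter filter = new MiniMultiRangeFilter(
            List.of(new String[] {"c", "d"}, new String[] {"g", "h"}));
        System.out.println(scan(rows, filter));  // prints [c1, c2, g1, g2]
        System.out.println(filter.rowKeyCalls);  // prints 6
    }
}
```

Rows a2 and a3 are never examined: rejecting a1 yields the hint "c", and the scan seeks straight to c1.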
-
getSkipHint
Filters that cannot provide a structural-skip seek hint can inherit this no-op implementation. Subclasses with purely configuration-driven, stateless hint computation (e.g. a fixed column range or fuzzy-row pattern) may override this to avoid cell-by-cell advancement when the time-range, column, or version gate fires.
Provides a seek hint for cells that are structurally skipped by the scan pipeline before Filter.filterCell(Cell) is ever reached. The pipeline short-circuits on several criteria (time-range mismatch, column-set exclusion, and version-limit exhaustion), and in each case the filter is bypassed entirely. When an implementation can compute a meaningful forward position purely from the cell's coordinates (without needing the filterCell call sequence), it should return that position here so the scanner can seek ahead instead of advancing one cell at a time.
Contract:
- May be called for cells that have never been passed to Filter.filterCell(Cell).
- Implementations must not modify any filter state; this method is treated as logically stateless. Only filters whose hint computation is based solely on immutable configuration (e.g. a fixed column range or a fuzzy-row pattern) should override this.
- The returned Cell, if non-null, must be an ExtendedCell because filters are evaluated on the server side.
- Returning null (the default) falls through to the existing structural skip/seek behaviour, preserving full backward compatibility.
- For reversed scans, the returned cell must have a smaller row key (that is, further along in the reverse-scan direction) than the skippedCell. Hints that do not advance in the scan direction are silently ignored.
- Composite filter limitation: FilterList, SkipFilter, and WhileMatchFilter do not currently delegate this method to wrapped sub-filters. Hints from filters used inside these wrappers will be silently ignored.
- Overrides:
getSkipHint in class Filter
- Parameters:
skippedCell - the cell that was rejected by the time-range, column, or version gate before filterCell could be consulted
- Returns:
a Cell representing the earliest position the scanner should seek to, or null if this filter cannot provide a better position than the structural hint
- Throws:
IOException - in case an I/O or filter-specific failure needs to be signaled
-
isFamilyEssential
By default, we require all of the scan's column families to be present. Our subclasses may be more precise. Check that the given column family is essential for the filter to check a row. Most filters always return true here. But some could have more sophisticated logic which could significantly reduce the scanning process by not even touching columns until we are 100% sure that their data is needed in the result. Concrete implementers can signal a failure condition in their code by throwing an IOException.
- Specified by:
isFamilyEssential in class Filter
- Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
createFilterFromArguments
Given the filter's arguments, constructs the filter.
- Parameters:
filterArguments - the filter's arguments
- Returns:
constructed filter object
-
toString
Returns the filter's info for debugging and logging purposes.
-
toByteArray
Return length 0 byte array for Filters that don't require special serialization.
- Specified by:
toByteArray in class Filter
- Returns:
The filter serialized using pb
- Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
areSerializedFieldsEqual
Default implementation so that writers of custom filters aren't forced to implement.
- Specified by:
areSerializedFieldsEqual in class Filter
- Returns:
true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other. Used for testing.
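The pairing of toByteArray with areSerializedFieldsEqual can be illustrated with a round trip. MiniPageFilter is a hypothetical stand-in (not an HBase class), and a fixed-width ByteBuffer encoding is swapped in for the protobuf serialization that real filters use.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in, not an HBase class. Real filters serialize with protobuf;
// a fixed-width long is used here only to keep the sketch self-contained.
class MiniPageFilter {
    private final long pageSize;
    private int rowsAccepted = 0; // runtime state, deliberately not serialized

    MiniPageFilter(long pageSize) { this.pageSize = pageSize; }

    void acceptRow() { rowsAccepted++; }

    // Writes only the configuration that a remote copy needs.
    byte[] toByteArray() {
        return ByteBuffer.allocate(Long.BYTES).putLong(pageSize).array();
    }

    // Rebuilds a filter from the bytes produced by toByteArray().
    static MiniPageFilter parseFrom(byte[] bytes) {
        return new MiniPageFilter(ByteBuffer.wrap(bytes).getLong());
    }

    // Compares exactly the fields that toByteArray() writes, ignoring runtime state.
    boolean areSerializedFieldsEqual(MiniPageFilter other) {
        return other != null && pageSize == other.pageSize;
    }

    public static void main(String[] args) {
        MiniPageFilter original = new MiniPageFilter(25L);
        original.acceptRow();
        MiniPageFilter copy = parseFrom(original.toByteArray());
        System.out.println(original.areSerializedFieldsEqual(copy));                   // prints true
        System.out.println(original.areSerializedFieldsEqual(new MiniPageFilter(10L))); // prints false
    }
}
```

A round-trip check of this shape is the kind of test the comparison is meant for: a filter that forgets to serialize a field would still compare equal to itself but not to its deserialized copy once that field is included in the comparison.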
-