Package org.apache.hadoop.hbase.filter
Class SkipFilter
java.lang.Object
org.apache.hadoop.hbase.filter.Filter
org.apache.hadoop.hbase.filter.FilterBase
org.apache.hadoop.hbase.filter.SkipFilter
A wrapper filter that filters an entire row if any of the Cell checks do not pass.
For example, suppose all columns in a row represent weights of different things, with the values being
the actual weights, and we want to filter out the entire row if any of its weights is zero. In
this case, we want to prevent a row from being emitted if a single cell in it is filtered. Combine this
filter with a ValueFilter:
scan.setFilter(new SkipFilter(new ValueFilter(CompareOp.NOT_EQUAL,
    new BinaryComparator(Bytes.toBytes(0)))));
Any row that contains a column whose value is 0 will be filtered out (since
ValueFilter will not pass that Cell). Without this filter, the other non-zero valued columns in
the row would still be emitted.
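The difference between the bare ValueFilter and the wrapped one can be illustrated without an HBase cluster. The following is a minimal sketch in plain Java, assuming each row is modeled as a list of integer weights. The class `SkipSemanticsSketch` and its methods `withValueFilterOnly` and `withSkipFilter` are hypothetical names for illustration only and are not part of HBase.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipSemanticsSketch {

    // Models a bare ValueFilter(NOT_EQUAL, 0): zero-valued cells are dropped,
    // but the remaining cells of the same row are still emitted.
    public static List<List<Integer>> withValueFilterOnly(List<List<Integer>> rows) {
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> row : rows) {
            List<Integer> kept = new ArrayList<>();
            for (int weight : row) {
                if (weight != 0) {
                    kept.add(weight);
                }
            }
            if (!kept.isEmpty()) {
                out.add(kept);
            }
        }
        return out;
    }

    // Models SkipFilter wrapping that ValueFilter: a single rejected cell
    // causes the whole row to be dropped.
    public static List<List<Integer>> withSkipFilter(List<List<Integer>> rows) {
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> row : rows) {
            boolean anyRejected = false;
            for (int weight : row) {
                if (weight == 0) {
                    anyRejected = true;
                }
            }
            if (!anyRejected) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> rows = List.of(List.of(3, 5, 7), List.of(4, 0, 9));
        System.out.println(withValueFilterOnly(rows)); // [[3, 5, 7], [4, 9]]
        System.out.println(withSkipFilter(rows));      // [[3, 5, 7]]
    }
}
```

In the sketch, the second row survives the bare filter with its zero removed, while the wrapped variant drops it entirely.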
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.hbase.filter.Filter
Filter.ReturnCode -
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
(package private) boolean | areSerializedFieldsEqual(Filter o) | Returns true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other.
private void | changeFR(boolean value) |
boolean | equals(Object obj) |
Filter.ReturnCode | filterCell(Cell c) | A way to filter based on the column family, column qualifier and/or the column value.
boolean | filterRow() | Filters that never filter by rows based on previously gathered state from Filter.filterCell(Cell) can inherit this implementation that never filters a row.
boolean | filterRowKey(Cell cell) | Filters a row based on the row key.
Filter | getFilter() |
Cell | getHintForRejectedRow(Cell firstRowCell) | Filters that cannot provide a seek hint after row-key rejection can inherit this no-op implementation.
Cell | getSkipHint(Cell skippedCell) | Filters that cannot provide a structural-skip seek hint can inherit this no-op implementation.
boolean | hasFilterRow() | Filters that never filter by modifying the returned List of Cells can inherit this implementation that does nothing.
int | hashCode() |
boolean | isFamilyEssential(byte[] name) | By default, we require all scan's column families to be present.
static SkipFilter | parseFrom(byte[] pbBytes) | Parse a serialized representation of SkipFilter.
void | reset() | Filters that are purely stateless and do nothing in their reset() methods can inherit this null/empty implementation.
byte[] | toByteArray() | Returns the filter serialized using pb.
String | toString() | Return filter's info for debugging and logging purpose.
Cell | transformCell(Cell v) | By default no transformation takes place. Give the filter a chance to transform the passed Cell.
Methods inherited from class org.apache.hadoop.hbase.filter.FilterBase
createFilterFromArguments, filterAllRemaining, filterRowCells, getNextCellHint
Methods inherited from class org.apache.hadoop.hbase.filter.Filter
isReversed, setReversed
-
Field Details
-
filterRow
-
filter
-
-
Constructor Details
-
SkipFilter
-
-
Method Details
-
getFilter
-
reset
Description copied from class: FilterBase
Filters that are purely stateless and do nothing in their reset() methods can inherit this null/empty implementation. Reset the state of the filter between rows. Concrete implementers can signal a failure condition in their code by throwing an IOException.
Overrides:
reset in class FilterBase
Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
changeFR
-
filterRowKey
Description copied from class: Filter
Filters a row based on the row key. If this returns true, the entire row will be excluded. If false, each KeyValue in the row will be passed to Filter.filterCell(Cell) below. If Filter.filterAllRemaining() returns true, then Filter.filterRowKey(Cell) should also return true. Concrete implementers can signal a failure condition in their code by throwing an IOException.
Overrides:
filterRowKey in class FilterBase
Parameters:
cell - The first cell coming in the new row
Returns:
true to remove the entire row; false to include the row (maybe).
Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
filterCell
Description copied from class: Filter
A way to filter based on the column family, column qualifier and/or the column value. Return code is described below. This allows filters to filter only a certain number of columns, then terminate without matching every column. If filterRowKey returns true, filterCell needs to be consistent with it. filterCell can assume that filterRowKey has already been called for the row. If your filter returns ReturnCode.NEXT_ROW, it should return ReturnCode.NEXT_ROW until Filter.reset() is called, just in case the caller calls for the next row. Concrete implementers can signal a failure condition in their code by throwing an IOException.
Overrides:
filterCell in class Filter
Parameters:
c - the Cell in question
Returns:
code as described below
Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
transformCell
Description copied from class: FilterBase
By default no transformation takes place. Give the filter a chance to transform the passed Cell. If the Cell is changed, a new Cell object must be returned. NOTICE: Filters are evaluated on the server side, so the returned Cell must be an ExtendedCell, although it is marked as IA.Private.
Overrides:
transformCell in class FilterBase
Parameters:
v - the Cell in question
Returns:
the changed Cell
Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
filterRow
Description copied from class: FilterBase
Filters that never filter by rows based on previously gathered state from Filter.filterCell(Cell) can inherit this implementation that never filters a row. Last chance to veto row based on previous Filter.filterCell(Cell) calls. The filter needs to retain state then return a particular value for this call if they wish to exclude a row if a certain column is missing (for example). Concrete implementers can signal a failure condition in their code by throwing an IOException.
Overrides:
filterRow in class FilterBase
Returns:
true to exclude row, false to include row.
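The "retain state, then veto the row" pattern described for filterRow can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the HBase source: the field name `filterRow` and the helper `changeFR` are taken from the member lists on this page, but their bodies, the stand-in `Code` enum, the integer cell values, and the helper `rowExcluded` are assumptions made for the example.

```java
public class SkipStateSketch {

    // Stand-in for Filter.ReturnCode; only two values are needed here.
    public enum Code { INCLUDE, SKIP }

    // Stand-in for the wrapped filter: rejects cells whose value is 0.
    static Code innerFilterCell(int value) {
        return value != 0 ? Code.INCLUDE : Code.SKIP;
    }

    // Sticky per-row flag: true once any cell of the current row was rejected.
    private boolean filterRow = false;

    // Assumed behaviour: the flag can only be raised, never lowered, within a row.
    private void changeFR(boolean value) {
        filterRow = filterRow || value;
    }

    // Delegates to the wrapped filter and records any non-INCLUDE result.
    public Code filterCell(int value) {
        Code rc = innerFilterCell(value);
        changeFR(rc != Code.INCLUDE);
        return rc;
    }

    // Last chance to veto the row, based on the state gathered in filterCell.
    public boolean filterRow() {
        return filterRow;
    }

    // Clears the per-row state before the next row starts.
    public void reset() {
        filterRow = false;
    }

    // Hypothetical driver: feeds one row through the filter and reports the verdict.
    public boolean rowExcluded(int... values) {
        reset();
        for (int v : values) {
            filterCell(v);
        }
        return filterRow();
    }

    public static void main(String[] args) {
        SkipStateSketch f = new SkipStateSketch();
        System.out.println(f.rowExcluded(4, 0, 9)); // true
        System.out.println(f.rowExcluded(3, 5, 7)); // false
    }
}
```

The second call returns false only because the driver calls reset() first. Without the reset, the flag raised by the previous row would carry over.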
-
hasFilterRow
Description copied from class: FilterBase
Filters that never filter by modifying the returned List of Cells can inherit this implementation that does nothing. Primarily used to check for conflicts with scans (such as scans that do not read a full row at a time).
Overrides:
hasFilterRow in class FilterBase
Returns:
True if this filter actively uses filterRowCells(List) or filterRow().
-
toByteArray
Returns the filter serialized using pb.
Overrides:
toByteArray in class FilterBase
Returns:
The filter serialized using pb
Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
parseFrom
Parse a serialized representation of SkipFilter.
Parameters:
pbBytes - A pb serialized SkipFilter instance
Returns:
An instance of SkipFilter made from bytes
Throws:
DeserializationException - if an error occurred
-
areSerializedFieldsEqual
Returns true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other. Used for testing.
Overrides:
areSerializedFieldsEqual in class FilterBase
Returns:
true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other. Used for testing.
-
getHintForRejectedRow
Description copied from class: FilterBase
Filters that cannot provide a seek hint after row-key rejection can inherit this no-op implementation. Subclasses whose row-key logic (e.g. a range pointer advanced inside FilterBase.filterRowKey(Cell)) makes a better seek target available should override this.
Provides a seek hint to bypass row-by-row scanning after Filter.filterRowKey(Cell) rejects a row. When filterRowKey returns true, the scan pipeline would normally iterate through every remaining cell in the rejected row one by one (via nextRow()) before moving on. If the filter can determine a better forward position (for example, the next range boundary in a MultiRowRangeFilter), it should return that target cell here, allowing the scanner to seek directly past the unwanted rows.
Contract:
- Only called after Filter.filterRowKey(Cell) has returned true for the same firstRowCell.
- Implementations may use state that was set during Filter.filterRowKey(Cell) (e.g. an updated range pointer), but must not invoke Filter.filterCell(Cell) logic; the caller guarantees that filterCell has not been called for this row.
- The returned Cell, if non-null, must be an ExtendedCell because filters are evaluated on the server side.
- Returning null (the default) falls through to the existing nextRow() behaviour, preserving full backward compatibility.
- For reversed scans (Scan.isReversed()), the hint must point to a smaller row key (that is, further along in the reverse-scan direction). The scanner validates hint direction and falls back to nextRow() if the hint does not advance in the scan direction.
- Composite filter support: FilterList (both MUST_PASS_ALL and MUST_PASS_ONE), SkipFilter, and WhileMatchFilter delegate this method to their sub-filters and merge the results. For AND (MUST_PASS_ALL), only sub-filters whose filterRowKey individually returned true are consulted, and the farthest (maximal-step) hint among them is returned. For OR (MUST_PASS_ONE), the nearest hint is returned only when every non-terminated sub-filter provides one; any null collapses the OR result to null.
Overrides:
getHintForRejectedRow in class FilterBase
Parameters:
firstRowCell - the first cell encountered in the rejected row; contains the row key that was passed to filterRowKey
Returns:
a Cell representing the earliest position the scanner should seek to, or null if this filter cannot provide a better position than a sequential skip
Throws:
IOException - in case an I/O or filter-specific failure needs to be signaled
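The AND/OR merge rule for composite filters can be sketched as follows. This is an illustration only, assuming a forward scan and modeling each hint as a row-key String compared lexicographically, with null meaning "no hint". The class `HintMergeSketch` and its methods are hypothetical and not part of HBase. Ignoring null hints in the AND case is an assumption, since the text above does not spell out that case.

```java
import java.util.Arrays;
import java.util.List;

public class HintMergeSketch {

    // AND (MUST_PASS_ALL): the farthest hint among the sub-filters that
    // rejected the row. Null hints are ignored here (an assumption).
    public static String mergeAnd(List<String> hintsFromRejectingFilters) {
        String farthest = null;
        for (String hint : hintsFromRejectingFilters) {
            if (hint != null && (farthest == null || hint.compareTo(farthest) > 0)) {
                farthest = hint;
            }
        }
        return farthest;
    }

    // OR (MUST_PASS_ONE): the nearest hint, but only when every
    // non-terminated sub-filter supplied one; any null yields null.
    public static String mergeOr(List<String> hintsFromActiveFilters) {
        String nearest = null;
        for (String hint : hintsFromActiveFilters) {
            if (hint == null) {
                return null;
            }
            if (nearest == null || hint.compareTo(nearest) < 0) {
                nearest = hint;
            }
        }
        return nearest;
    }

    public static void main(String[] args) {
        System.out.println(mergeAnd(Arrays.asList("row-20", null, "row-50"))); // row-50
        System.out.println(mergeOr(Arrays.asList("row-20", "row-50")));        // row-20
        System.out.println(mergeOr(Arrays.asList("row-20", null)));            // null
    }
}
```

The asymmetry is intentional: under AND, every sub-filter must pass, so it is safe to jump to the farthest target; under OR, a sub-filter without a hint might still accept the very next row, so no jump is safe.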
-
getSkipHint
Description copied from class: FilterBase
Filters that cannot provide a structural-skip seek hint can inherit this no-op implementation. Subclasses with purely configuration-driven, stateless hint computation (e.g. a fixed column range or fuzzy-row pattern) may override this to avoid cell-by-cell advancement when the time-range, column, or version gate fires.
Provides a seek hint for cells that are structurally skipped by the scan pipeline before Filter.filterCell(Cell) is ever reached. The pipeline short-circuits on several criteria (time-range mismatch, column-set exclusion, and version-limit exhaustion), and in each case the filter is bypassed entirely. When an implementation can compute a meaningful forward position purely from the cell's coordinates (without needing the filterCell call sequence), it should return that position here so the scanner can seek ahead instead of advancing one cell at a time.
Contract:
- May be called for cells that have never been passed to Filter.filterCell(Cell).
- Implementations must not modify any filter state; this method is treated as logically stateless. Only filters whose hint computation is based solely on immutable configuration (e.g. a fixed column range or a fuzzy-row pattern) should override this.
- The returned Cell, if non-null, must be an ExtendedCell because filters are evaluated on the server side.
- Returning null (the default) falls through to the existing structural skip/seek behaviour, preserving full backward compatibility.
- For reversed scans, the returned cell must have a smaller row key (that is, further along in the reverse-scan direction) than the skippedCell. Hints that do not advance in the scan direction are silently ignored.
- Composite filter support: FilterList (both MUST_PASS_ALL and MUST_PASS_ONE), SkipFilter, and WhileMatchFilter delegate this method to their sub-filters and merge the results (maximal step for AND; for OR, the nearest hint is returned only when every non-terminated sub-filter provides one, and any null collapses the OR result to null).
Overrides:
getSkipHint in class FilterBase
Parameters:
skippedCell - the cell that was rejected by the time-range, column, or version gate before filterCell could be consulted
Returns:
a Cell representing the earliest position the scanner should seek to, or null if this filter cannot provide a better position than the structural hint
Throws:
IOException - in case an I/O or filter-specific failure needs to be signaled
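The direction rule for hints (a hint that does not advance in the scan direction is ignored) can be sketched as below. This is a simplified model, assuming positions are plain row-key Strings compared lexicographically. The class `HintDirectionSketch`, its methods, and the `fallback` parameter (standing in for whatever position the scanner would use without a hint) are hypothetical and not part of HBase.

```java
public class HintDirectionSketch {

    // True when 'hint' lies strictly ahead of 'current' in scan order:
    // a larger key for forward scans, a smaller key for reversed scans.
    public static boolean advances(String current, String hint, boolean reversed) {
        if (hint == null) {
            return false;
        }
        int cmp = hint.compareTo(current);
        return reversed ? cmp < 0 : cmp > 0;
    }

    // Uses the hint only if it advances; otherwise keeps the fallback position.
    public static String seekTarget(String current, String hint, String fallback,
                                    boolean reversed) {
        return advances(current, hint, reversed) ? hint : fallback;
    }

    public static void main(String[] args) {
        // Reversed scan at row-30: a smaller key advances, a larger one does not.
        System.out.println(seekTarget("row-30", "row-10", "row-29", true));  // row-10
        System.out.println(seekTarget("row-30", "row-40", "row-29", true));  // row-29
        // Forward scan with no hint: the fallback is used.
        System.out.println(seekTarget("row-30", null, "row-31", false));     // row-31
    }
}
```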
-
isFamilyEssential
Description copied from class: FilterBase
By default, we require all of the scan's column families to be present. Our subclasses may be more precise. Checks whether the given column family is essential for the filter to check the row. Most filters always return true here, but some could have more sophisticated logic that significantly reduces the scanning work by not even touching columns until we are 100% sure that their data is needed in the result. Concrete implementers can signal a failure condition in their code by throwing an IOException.
Overrides:
isFamilyEssential in class FilterBase
Throws:
IOException - in case an I/O or a filter-specific failure needs to be signaled.
-
toString
Description copied from class: FilterBase
Return filter's info for debugging and logging purpose.
Overrides:
toString in class FilterBase
-
equals
-
hashCode
-