Class Filter

java.lang.Object
org.apache.hadoop.hbase.filter.Filter
Direct Known Subclasses:
FilterBase, FilterWrapper

@Public public abstract class Filter extends Object
Interface for row and column filters directly applied within the regionserver. A filter can expect the following call sequence:
  • reset() : reset the filter state before filtering a new row.
  • filterAllRemaining(): true means row scan is over; false means keep going.
  • filterRowKey(Cell): true means drop this row; false means include.
  • getHintForRejectedRow(Cell): if filterRowKey returned true, optionally provide a seek hint to skip past the rejected row efficiently.
  • getSkipHint(Cell): when a cell is structurally skipped (time-range, column, or version gate) before filterCell is reached, optionally provide a seek hint.
  • filterCell(Cell): decides whether to include or exclude this Cell. See Filter.ReturnCode.
  • transformCell(Cell): if the Cell is included, let the filter transform the Cell.
  • filterRowCells(List): allows direct modification of the final list to be submitted.
  • filterRow(): last chance to drop the entire row based on the sequence of filter calls. For example, filter a row if it doesn't contain a specified column.
Filter instances are created one per region/scan. This abstract class replaces the old RowFilterInterface. When implementing your own filters, consider extending FilterBase to reduce boilerplate.
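The call sequence above can be pictured with a small driver loop. This is a minimal sketch using stand-in types (SketchCell, SketchFilter and CallSequenceSketch are hypothetical names, not HBase classes). It leaves out transformCell, filterRowCells and the seek hints, and reduces the filterCell return code to a boolean. The real scanner is more involved and may interleave these calls differently.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for an HBase Cell; NOT the real org.apache.hadoop.hbase.Cell.
final class SketchCell {
  final String row;
  final String qualifier;

  SketchCell(String row, String qualifier) {
    this.row = row;
    this.qualifier = qualifier;
  }
}

// Hypothetical filter shape mirroring a subset of the documented callbacks.
abstract class SketchFilter {
  void reset() {}

  boolean filterAllRemaining() { return false; }

  boolean filterRowKey(SketchCell first) { return false; }

  // true = include; a simplification of Filter.ReturnCode.
  boolean filterCell(SketchCell c) { return true; }

  boolean filterRow() { return false; }
}

final class CallSequenceSketch {
  // Drives one filter over the rows in the documented order and
  // returns the cells that survive every check.
  static List<SketchCell> scan(List<List<SketchCell>> rows, SketchFilter f) {
    List<SketchCell> out = new ArrayList<>();
    for (List<SketchCell> row : rows) {
      f.reset();                                  // fresh state for each row
      if (f.filterAllRemaining()) {
        break;                                    // scan is over
      }
      if (row.isEmpty() || f.filterRowKey(row.get(0))) {
        continue;                                 // whole row dropped by key
      }
      List<SketchCell> kept = new ArrayList<>();
      for (SketchCell c : row) {
        if (f.filterCell(c)) {
          kept.add(c);
        }
      }
      if (!f.filterRow()) {                       // last chance to veto the row
        out.addAll(kept);
      }
    }
    return out;
  }
}
```

A filter in this sketch overrides only the callbacks it cares about, which is the same convenience FilterBase is described as offering for real filters.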
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static enum 
    Filter.ReturnCode
    Return codes for filterCell(Cell).
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected boolean
    reversed
  • Constructor Summary

    Constructors
    Constructor
    Description
    Filter()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    (package private) abstract boolean
    areSerializedFieldsEqual(Filter other)
    Compares the serialized fields of this filter with the corresponding fields in other.
    abstract boolean
    filterAllRemaining()
    If this returns true, the scan will terminate.
    Filter.ReturnCode
    filterCell(Cell c)
    A way to filter based on the column family, column qualifier and/or the column value.
    abstract boolean
    filterRow()
    Last chance to veto row based on previous filterCell(Cell) calls.
    abstract void
    filterRowCells(List<Cell> kvs)
    Chance to alter the list of Cells to be submitted.
    abstract boolean
    filterRowKey(Cell firstRowCell)
    Filters a row based on the row key.
    Cell
    getHintForRejectedRow(Cell firstRowCell)
    Provides a seek hint to bypass row-by-row scanning after filterRowKey(Cell) rejects a row.
    abstract Cell
    getNextCellHint(Cell currentCell)
    If the filter returns the match code SEEK_NEXT_USING_HINT, then it should also tell which is the next key it must seek to.
    Cell
    getSkipHint(Cell skippedCell)
    Provides a seek hint for cells that are structurally skipped by the scan pipeline before filterCell(Cell) is ever reached.
    abstract boolean
    hasFilterRow()
    Primarily used to check for conflicts with scans (such as scans that do not read a full row at a time).
    abstract boolean
    isFamilyEssential(byte[] name)
    Checks whether the given column family is essential for the filter to check a row.
    boolean
    isReversed()
    static Filter
    parseFrom(byte[] pbBytes)
    Creates a Filter instance from its pb-serialized bytes.
    abstract void
    reset()
    Reset the state of the filter between rows.
    void
    setReversed(boolean reversed)
    Alters the reversed scan flag.
    abstract byte[]
    toByteArray()
    Returns the filter serialized using pb.
    abstract Cell
    transformCell(Cell v)
    Give the filter a chance to transform the passed Cell.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • reversed

      protected transient boolean reversed
  • Constructor Details

  • Method Details

    • reset

      public abstract void reset() throws IOException
      Reset the state of the filter between rows. Concrete implementers can signal a failure condition in their code by throwing an IOException.
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
    • filterRowKey

      public abstract boolean filterRowKey(Cell firstRowCell) throws IOException
      Filters a row based on the row key. If this returns true, the entire row will be excluded. If false, each Cell in the row will be passed to filterCell(Cell). If filterAllRemaining() returns true, then filterRowKey(Cell) should also return true. Concrete implementers can signal a failure condition in their code by throwing an IOException.
      Parameters:
      firstRowCell - The first cell coming in the new row
      Returns:
      true to remove the entire row; false to include the row, subject to the later per-cell and per-row checks.
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
    • filterAllRemaining

      public abstract boolean filterAllRemaining() throws IOException
      If this returns true, the scan will terminate. Concrete implementers can signal a failure condition in their code by throwing an IOException.
      Returns:
      true to end scan, false to continue.
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
    • filterCell

      public Filter.ReturnCode filterCell(Cell c) throws IOException
      A way to filter based on the column family, column qualifier and/or the column value. The return code tells the scanner what to do with the Cell; see Filter.ReturnCode. This allows filters to filter only a certain number of columns, then terminate without matching every column. If filterRowKey returns true, filterCell needs to be consistent with it. filterCell can assume that filterRowKey has already been called for the row. If your filter returns ReturnCode.NEXT_ROW, it should return ReturnCode.NEXT_ROW until reset() is called, just in case the caller calls for the next row. Concrete implementers can signal a failure condition in their code by throwing an IOException.
      Parameters:
      c - the Cell in question
      Returns:
      a Filter.ReturnCode value telling the scanner how to handle this Cell
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
      See Also:
      Filter.ReturnCode
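The NEXT_ROW rule is easiest to see in a filter that keeps per-row state. The following is a minimal sketch, not an HBase filter: SketchReturnCode stands in for two of the Filter.ReturnCode values, and FirstNCellsSketch is a hypothetical class that takes a qualifier string instead of a Cell.

```java
// Stand-in for a subset of Filter.ReturnCode; only the values this sketch needs.
enum SketchReturnCode { INCLUDE, NEXT_ROW }

// Hypothetical filter: include at most `limit` cells per row,
// then keep answering NEXT_ROW until reset() is called.
final class FirstNCellsSketch {
  private final int limit;
  private int seen;

  FirstNCellsSketch(int limit) {
    this.limit = limit;
  }

  // Called between rows; clears the per-row counter.
  void reset() {
    seen = 0;
  }

  SketchReturnCode filterCell(String qualifier) {
    if (seen >= limit) {
      // Sticky answer: repeated calls within the same row stay NEXT_ROW.
      return SketchReturnCode.NEXT_ROW;
    }
    seen++;
    return SketchReturnCode.INCLUDE;
  }
}
```

Because the counter is only cleared in reset(), a caller that asks again within the same row gets the same NEXT_ROW answer.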
    • transformCell

      public abstract Cell transformCell(Cell v) throws IOException
      Give the filter a chance to transform the passed Cell. If the Cell is changed a new Cell object must be returned.

      NOTICE: Filters are evaluated on the server side, so the returned Cell must be an ExtendedCell, even though ExtendedCell is marked as IA.Private.

      Parameters:
      v - the Cell in question
      Returns:
      the changed Cell
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
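The rule that a changed Cell must be a new object can be sketched as follows. ValueCellSketch and KeyOnlyTransformSketch are hypothetical stand-ins, not HBase types. The transform shown here (stripping the value) is only an illustration, and the ExtendedCell requirement is not modeled.

```java
// Stand-in for a Cell with a qualifier and a value; NOT an HBase type.
final class ValueCellSketch {
  final String qualifier;
  final byte[] value;

  ValueCellSketch(String qualifier, byte[] value) {
    this.qualifier = qualifier;
    this.value = value;
  }
}

// Hypothetical transform that strips the value from each included cell.
final class KeyOnlyTransformSketch {
  ValueCellSketch transformCell(ValueCellSketch c) {
    if (c.value.length == 0) {
      return c;  // nothing to change, so the same object is returned
    }
    // The cell changes, so a new object is returned and the input is left untouched.
    return new ValueCellSketch(c.qualifier, new byte[0]);
  }
}
```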
    • filterRowCells

      public abstract void filterRowCells(List<Cell> kvs) throws IOException
      Chance to alter the list of Cells to be submitted. Modifications to the list will carry on. Concrete implementers can signal a failure condition in their code by throwing an IOException.
      Parameters:
      kvs - the list of Cells to be filtered
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
    • hasFilterRow

      public abstract boolean hasFilterRow()
      Primarily used to check for conflicts with scans (such as scans that do not read a full row at a time).
      Returns:
      True if this filter actively uses filterRowCells(List) or filterRow().
    • filterRow

      public abstract boolean filterRow() throws IOException
      Last chance to veto a row based on previous filterCell(Cell) calls. A filter that wants to exclude a row when, for example, a certain column is missing must retain state across those calls and return true here. Concrete implementers can signal a failure condition in their code by throwing an IOException.
      Returns:
      true to exclude row, false to include row.
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
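The missing-column case mentioned above might look like the following. This is a minimal sketch with a hypothetical class, RequireColumnSketch, that works on qualifier strings and a boolean include flag rather than on Cell and Filter.ReturnCode.

```java
// Hypothetical filter: drop any row that never shows the required column.
final class RequireColumnSketch {
  private final String required;
  private boolean found;  // per-row state, cleared in reset()

  RequireColumnSketch(String required) {
    this.required = required;
  }

  void reset() {
    found = false;
  }

  // Simplified filterCell: always includes the cell, but remembers
  // whether the required column has appeared in the current row.
  boolean filterCell(String qualifier) {
    if (required.equals(qualifier)) {
      found = true;
    }
    return true;
  }

  // This filter relies on filterRow(), so it reports that here.
  boolean hasFilterRow() {
    return true;
  }

  // true = exclude the row.
  boolean filterRow() {
    return !found;
  }
}
```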
    • getNextCellHint

      public abstract Cell getNextCellHint(Cell currentCell) throws IOException
      If the filter returns the match code SEEK_NEXT_USING_HINT, then it should also tell which is the next key it must seek to. After receiving the match code SEEK_NEXT_USING_HINT, the QueryMatcher would call this function to find out which key it must next seek to. Concrete implementers can signal a failure condition in their code by throwing an IOException. NOTICE: Filters are evaluated on the server side, so the returned Cell must be an ExtendedCell, even though ExtendedCell is marked as IA.Private.
      Returns:
      the Cell to seek to next, or null if the filter is not sure which key to seek to next.
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
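The pairing of SEEK_NEXT_USING_HINT with getNextCellHint can be sketched with a filter that only wants a fixed set of qualifiers. HintReturnCode and QualifierSeekSketch are hypothetical stand-ins, and qualifiers are plain strings here rather than Cells.

```java
import java.util.Collection;
import java.util.TreeSet;

// Stand-in for a subset of Filter.ReturnCode.
enum HintReturnCode { INCLUDE, SEEK_NEXT_USING_HINT, NEXT_ROW }

// Hypothetical filter that keeps only a fixed, sorted set of qualifiers.
final class QualifierSeekSketch {
  private final TreeSet<String> wanted;

  QualifierSeekSketch(Collection<String> wanted) {
    this.wanted = new TreeSet<>(wanted);
  }

  HintReturnCode filterCell(String qualifier) {
    if (wanted.contains(qualifier)) {
      return HintReturnCode.INCLUDE;
    }
    // If a wanted qualifier sorts after this one, ask the caller to seek to it;
    // otherwise nothing else in this row can match.
    return wanted.ceiling(qualifier) != null
        ? HintReturnCode.SEEK_NEXT_USING_HINT
        : HintReturnCode.NEXT_ROW;
  }

  // Smallest wanted qualifier at or after the current one, or null if none.
  String getNextCellHint(String currentQualifier) {
    return wanted.ceiling(currentQualifier);
  }
}
```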
    • getHintForRejectedRow

      public Cell getHintForRejectedRow(Cell firstRowCell) throws IOException
      Provides a seek hint to bypass row-by-row scanning after filterRowKey(Cell) rejects a row. When filterRowKey returns true the scan pipeline would normally iterate through every remaining cell in the rejected row one-by-one (via nextRow()) before moving on. If the filter can determine a better forward position — for example, the next range boundary in a MultiRowRangeFilter — it should return that target cell here, allowing the scanner to seek directly past the unwanted rows.

      Contract:

      • Only called after filterRowKey(Cell) has returned true for the same firstRowCell.
      • Implementations may use state that was set during filterRowKey(Cell) (e.g. an updated range pointer), but must not invoke filterCell(Cell) logic — the caller guarantees that filterCell has not been called for this row.
      • The returned Cell, if non-null, must be an ExtendedCell because filters are evaluated on the server side.
      • Returning null (the default) falls through to the existing nextRow() behaviour, preserving full backward compatibility.
      • For reversed scans (Scan.isReversed()), the hint must point to a smaller row key, that is, a row that sorts before the rejected row and therefore lies ahead in the reverse-scan direction. The scanner validates hint direction and falls back to nextRow() if the hint does not advance in the scan direction.
      • Composite filter limitation: FilterList, SkipFilter, and WhileMatchFilter do not currently delegate this method to wrapped sub-filters. Hints from filters used inside these wrappers will be silently ignored.
      Parameters:
      firstRowCell - the first cell encountered in the rejected row; contains the row key that was passed to filterRowKey
      Returns:
      a Cell representing the earliest position the scanner should seek to, or null if this filter cannot provide a better position than a sequential skip
      Throws:
      IOException - in case an I/O or filter-specific failure needs to be signaled
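The range-boundary example in the description can be sketched as below. RowRangeSketch and RangeHintSketch are hypothetical names, and this is not how MultiRowRangeFilter is implemented. Row keys are plain strings, only forward scans are handled, and the ranges are assumed to be sorted and non-overlapping.

```java
import java.util.List;

// Half-open row range [start, stop); a stand-in, not an HBase type.
final class RowRangeSketch {
  final String start;
  final String stop;

  RowRangeSketch(String start, String stop) {
    this.start = start;
    this.stop = stop;
  }
}

// Hypothetical forward-scan filter over sorted, non-overlapping ranges.
final class RangeHintSketch {
  private final List<RowRangeSketch> ranges;
  private String nextStart;  // set by filterRowKey, read by the hint method

  RangeHintSketch(List<RowRangeSketch> ranges) {
    this.ranges = ranges;
  }

  // true = reject the row. Remembers the next range boundary, if any.
  boolean filterRowKey(String row) {
    nextStart = null;
    for (RowRangeSketch r : ranges) {
      if (row.compareTo(r.start) < 0) {
        nextStart = r.start;  // row falls in the gap before this range
        return true;
      }
      if (row.compareTo(r.stop) < 0) {
        return false;         // row is inside this range
      }
    }
    return true;              // row is past every range
  }

  // Only meaningful right after filterRowKey returned true for the same row.
  // Returns the row to seek to, or null when no better position is known.
  String getHintForRejectedRow(String row) {
    return nextStart;
  }
}
```

The hint method reads only the state that filterRowKey just set, which matches the contract item allowing such state while keeping filterCell logic out of the picture.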
    • getSkipHint

      public Cell getSkipHint(Cell skippedCell) throws IOException
      Provides a seek hint for cells that are structurally skipped by the scan pipeline before filterCell(Cell) is ever reached. The pipeline short-circuits on several criteria — time-range mismatch, column-set exclusion, and version-limit exhaustion — and in each case the filter is bypassed entirely. When an implementation can compute a meaningful forward position purely from the cell's coordinates (without needing the filterCell call sequence), it should return that position here so the scanner can seek ahead instead of advancing one cell at a time.

      Contract:

      • May be called for cells that have never been passed to filterCell(Cell).
      • Implementations must not modify any filter state; this method is treated as logically stateless. Only filters whose hint computation is based solely on immutable configuration (e.g. a fixed column range or a fuzzy-row pattern) should override this.
      • The returned Cell, if non-null, must be an ExtendedCell because filters are evaluated on the server side.
      • Returning null (the default) falls through to the existing structural skip/seek behaviour, preserving full backward compatibility.
      • For reversed scans, the returned cell must have a smaller row key than the skippedCell, that is, it must sort before the skippedCell and therefore lie ahead in the reverse-scan direction. Hints that do not advance in the scan direction are silently ignored.
      • Composite filter limitation: FilterList, SkipFilter, and WhileMatchFilter do not currently delegate this method to wrapped sub-filters. Hints from filters used inside these wrappers will be silently ignored.
      Parameters:
      skippedCell - the cell that was rejected by the time-range, column, or version gate before filterCell could be consulted
      Returns:
      a Cell representing the earliest position the scanner should seek to, or null if this filter cannot provide a better position than the structural hint
      Throws:
      IOException - in case an I/O or filter-specific failure needs to be signaled
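A hint computed purely from immutable configuration, as the contract requires, might look like this. ColumnRangeSkipHintSketch is a hypothetical class: it uses row and qualifier strings in place of Cells and returns the seek target as a two-element array.

```java
// Hypothetical filter configured with a fixed lower bound on the qualifier.
final class ColumnRangeSkipHintSketch {
  private final String minQualifier;  // immutable configuration, never updated

  ColumnRangeSkipHintSketch(String minQualifier) {
    this.minQualifier = minQualifier;
  }

  // Reads only the skipped cell's coordinates and the fixed configuration;
  // no filter state is modified. Returns {row, qualifier} to seek to, or
  // null when this filter knows no better position.
  String[] getSkipHint(String row, String qualifier) {
    if (qualifier.compareTo(minQualifier) < 0) {
      return new String[] { row, minQualifier };
    }
    return null;
  }
}
```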
    • isFamilyEssential

      public abstract boolean isFamilyEssential(byte[] name) throws IOException
      Checks whether the given column family is essential for the filter to check a row. Most filters always return true here, but some can apply more sophisticated logic that significantly reduces scanning work by not touching a column family until it is certain that its data is needed in the result. Concrete implementers can signal a failure condition in their code by throwing an IOException.
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
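As an illustration of a filter that does not return true for every family, the sketch below treats only the family holding its condition column as essential. EssentialFamilySketch is a hypothetical class, and the UTF-8 encoding of the family name is an assumption made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical filter whose row decision depends on a single column family.
final class EssentialFamilySketch {
  private final byte[] conditionFamily;

  EssentialFamilySketch(String family) {
    // Assumption for this sketch: family names are UTF-8 encoded.
    this.conditionFamily = family.getBytes(StandardCharsets.UTF_8);
  }

  // Only the family that holds the condition column is needed to decide
  // whether a row passes; other families can be loaded afterwards.
  boolean isFamilyEssential(byte[] name) {
    return Arrays.equals(name, conditionFamily);
  }
}
```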
    • toByteArray

      public abstract byte[] toByteArray() throws IOException
      Returns the filter serialized using pb. Concrete implementers can signal a failure condition in their code by throwing an IOException.
      Returns:
      The filter serialized using pb
      Throws:
      IOException - in case an I/O or a filter-specific failure needs to be signaled.
    • parseFrom

      public static Filter parseFrom(byte[] pbBytes) throws DeserializationException
      Creates a Filter instance from its pb-serialized bytes. A failure to deserialize is signaled by throwing a DeserializationException.
      Parameters:
      pbBytes - A pb serialized Filter instance
      Returns:
      An instance of Filter made from bytes
      Throws:
      DeserializationException - if an error occurred
    • areSerializedFieldsEqual

      abstract boolean areSerializedFieldsEqual(Filter other)
      Compares the serialized fields of this filter with the corresponding fields in other.
      Returns:
      true if and only if the fields of the filter that are serialized are equal to the corresponding fields in other. Used for testing.
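How toByteArray, parseFrom and areSerializedFieldsEqual relate can be shown with a round trip. This is a minimal sketch in which the hypothetical PrefixFilterSketch serializes its single field as raw bytes. Real filters serialize with pb, which is not modeled here.

```java
import java.util.Arrays;

// Hypothetical filter with one serialized field. Raw bytes stand in for
// the pb encoding that real filters use.
final class PrefixFilterSketch {
  private final byte[] prefix;

  PrefixFilterSketch(byte[] prefix) {
    this.prefix = prefix.clone();
  }

  // Serializes the filter's configuration.
  byte[] toByteArray() {
    return prefix.clone();
  }

  // Rebuilds a filter from the bytes produced by toByteArray().
  static PrefixFilterSketch parseFrom(byte[] bytes) {
    return new PrefixFilterSketch(bytes);
  }

  // True if and only if the serialized fields match; handy in tests.
  boolean areSerializedFieldsEqual(PrefixFilterSketch other) {
    return other != null && Arrays.equals(prefix, other.prefix);
  }
}
```

A round trip through toByteArray and parseFrom should yield a filter for which areSerializedFieldsEqual returns true, which is the kind of check the method is described as supporting.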
    • setReversed

      public void setReversed(boolean reversed)
      Alters the reversed scan flag.
      Parameters:
      reversed - the new value of the reversed scan flag
    • isReversed

      public boolean isReversed()