@InterfaceAudience.Public @InterfaceStability.Stable public class Scan extends Query
All operations are identical to Get, with the exception of instantiation. Rather than specifying a single row, an optional startRow and stopRow may be defined. If rows are not specified, the Scanner will iterate over all rows.
To get all columns from all rows of a Table, create an instance with no constraints using the Scan() constructor. To constrain the scan to specific column families, call addFamily on your Scan instance for each family to retrieve.
To get specific columns, call addColumn for each column to retrieve.
To retrieve only columns within a specific range of version timestamps, call setTimeRange.
To retrieve only columns with a specific timestamp, call setTimeStamp.
To limit the number of versions of each column to be returned, call setMaxVersions.
To limit the maximum number of values returned for each call to next(), call setBatch.
To add a filter, call setFilter.
Expert: To explicitly disable server-side block caching for this scan, call setCacheBlocks(boolean).
Note: Using a Scan instance alters it. Internally, attributes are updated as the Scan runs and, if enabled, metrics accumulate in the Scan instance. Keep this in mind when you clone a Scan instance or reuse one you created earlier; it is safer to create a new Scan instance per use.
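The effect of setBatch described above (a cap on the number of values returned by each call to next()) can be pictured with a small plain-Java sketch. It does not use the HBase client library; the class `BatchSketch`, the method `splitRow`, and the treatment of a non-positive batch as "no limit" are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; this is not HBase code. */
final class BatchSketch {

    /**
     * Splits the cells of one row into chunks of at most {@code batch} cells,
     * mimicking how a batched scan hands a wide row back over several next() calls.
     * A non-positive batch is treated as "no limit" (an assumption of this sketch).
     */
    static List<List<String>> splitRow(List<String> rowCells, int batch) {
        List<List<String>> chunks = new ArrayList<>();
        if (batch <= 0 || rowCells.size() <= batch) {
            // The whole row fits into a single chunk.
            chunks.add(new ArrayList<>(rowCells));
            return chunks;
        }
        for (int from = 0; from < rowCells.size(); from += batch) {
            int to = Math.min(from + batch, rowCells.size());
            chunks.add(new ArrayList<>(rowCells.subList(from, to)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> row = List.of("cf:a", "cf:b", "cf:c", "cf:d", "cf:e");
        // With batch = 2, the five cells arrive as three separate chunks.
        System.out.println(splitRow(row, 2)); // [[cf:a, cf:b], [cf:c, cf:d], [cf:e]]
    }
}
```

Each chunk in the sketch corresponds to one partial view of the row, which is why a caller that sets a batch should expect to see a row spread over several results.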
Modifier and Type | Field and Description |
---|---|
private boolean | allowPartialResults |
private int | batch |
private boolean | cacheBlocks |
private int | caching |
private Map<byte[],NavigableSet<byte[]>> | familyMap |
private boolean | getScan |
static String | HINT_LOOKAHEAD: Deprecated. Without replacement. This is now a no-op; SEEKs and SKIPs are optimized automatically. Will be removed in 2.0+. |
private static org.apache.commons.logging.Log | LOG |
private long | maxResultSize |
private int | maxVersions |
private static String | RAW_ATTR |
private boolean | reversed |
static String | SCAN_ATTRIBUTES_METRICS_DATA: Deprecated. |
static String | SCAN_ATTRIBUTES_METRICS_ENABLE: Deprecated. Since 1.0.0. Use setScanMetricsEnabled(boolean). |
static String | SCAN_ATTRIBUTES_TABLE_NAME |
private boolean | small: Set to true for a small scan to get better performance. A small scan should use pread, while a big scan can use seek + read. Seek + read is fast but can cause two problems: (1) resource contention and (2) too much network I/O. See "[89-fb] Using pread for non-compaction read request", https://issues.apache.org/jira/browse/HBASE-7266. In addition, when set to true, openScanner, next and closeScanner are done in one RPC call. |
private byte[] | startRow |
private byte[] | stopRow |
private int | storeLimit |
private int | storeOffset |
private TimeRange | tr |
Fields inherited from class Query: colFamTimeRangeMap, consistency, filter, loadColumnFamiliesOnDemand, targetReplicaId
Fields inherited from class OperationWithAttributes: ID_ATRIBUTE
Constructor and Description |
---|
Scan(): Create a Scan operation across all rows. |
Scan(byte[] startRow): Create a Scan operation starting at the specified row. |
Scan(byte[] startRow, byte[] stopRow): Create a Scan operation for the range of rows specified. |
Scan(byte[] startRow, Filter filter) |
Scan(Get get): Builds a scan object with the same specs as get. |
Scan(Scan scan): Creates a new instance of this class while copying all values. |
Modifier and Type | Method and Description |
---|---|
Scan | addColumn(byte[] family, byte[] qualifier): Get the column from the specified family with the specified qualifier. |
Scan | addFamily(byte[] family): Get all columns from the specified family. |
private byte[] | calculateTheClosestNextRowKeyForPrefix(byte[] rowKeyPrefix): When scanning for a prefix the scan should stop immediately after the last row that has the specified prefix. |
(package private) static Scan | createGetClosestRowOrBeforeReverseScan(byte[] row): Utility that creates a Scan that will do a small scan in reverse from passed row looking for next closest row. |
boolean | getAllowPartialResults() |
int | getBatch() |
boolean | getCacheBlocks(): Get whether blocks should be cached for this Scan. |
int | getCaching() |
byte[][] | getFamilies() |
Map<byte[],NavigableSet<byte[]>> | getFamilyMap(): Getting the familyMap |
Filter | getFilter() |
Map<String,Object> | getFingerprint(): Compile the table and column family (i.e. |
long | getMaxResultSize() |
int | getMaxResultsPerColumnFamily() |
int | getMaxVersions() |
int | getRowOffsetPerColumnFamily(): Method for retrieving the scan's offset per row per column family (#kvs to be skipped) |
ScanMetrics | getScanMetrics() |
byte[] | getStartRow() |
byte[] | getStopRow() |
TimeRange | getTimeRange() |
boolean | hasFamilies() |
boolean | hasFilter() |
boolean | isGetScan() |
boolean | isRaw() |
boolean | isReversed(): Get whether this scan is a reversed one. |
boolean | isScanMetricsEnabled() |
boolean | isSmall(): Get whether this scan is a small scan |
private boolean | isStartRowAndEqualsStopRow() |
int | numFamilies() |
Scan | setACL(Map<String,Permission> perms) |
Scan | setACL(String user, Permission perms) |
Scan | setAllowPartialResults(boolean allowPartialResults): Setting whether the caller wants to see the partial results that may be returned from the server. |
Scan | setAttribute(String name, byte[] value): Sets an attribute. |
Scan | setAuthorizations(Authorizations authorizations): Sets the authorizations to be used by this Query |
Scan | setBatch(int batch): Set the maximum number of values to return for each call to next(). |
Scan | setCacheBlocks(boolean cacheBlocks): Set whether blocks should be cached for this Scan. |
Scan | setCaching(int caching): Set the number of rows for caching that will be passed to scanners. |
Scan | setColumnFamilyTimeRange(byte[] cf, long minStamp, long maxStamp): Get versions of columns only within the specified timestamp range, [minStamp, maxStamp) on a per CF basis. |
Scan | setConsistency(Consistency consistency): Sets the consistency level for this operation |
Scan | setFamilyMap(Map<byte[],NavigableSet<byte[]>> familyMap): Setting the familyMap |
Scan | setFilter(Filter filter): Apply the specified server-side filter when performing the Query. |
Scan | setId(String id): This method allows you to set an identifier on an operation. |
Scan | setIsolationLevel(IsolationLevel level): Set the isolation level for this query. |
Scan | setLoadColumnFamiliesOnDemand(boolean value): Set the value indicating whether loading CFs on demand should be allowed (cluster default is false). |
Scan | setMaxResultSize(long maxResultSize): Set the maximum result size. |
Scan | setMaxResultsPerColumnFamily(int limit): Set the maximum number of values to return per row per Column Family |
Scan | setMaxVersions(): Get all available versions. |
Scan | setMaxVersions(int maxVersions): Get up to the specified number of versions of each column. |
Scan | setRaw(boolean raw): Enable/disable "raw" mode for this scan. |
Scan | setReplicaId(int Id): Specify region replica id where Query will fetch data from. |
Scan | setReversed(boolean reversed): Set whether this scan is a reversed one |
Scan | setRowOffsetPerColumnFamily(int offset): Set offset for the row per Column Family. |
Scan | setRowPrefixFilter(byte[] rowPrefix): Set a filter (using stopRow and startRow) so the result set only contains rows where the rowKey starts with the specified prefix. |
Scan | setScanMetricsEnabled(boolean enabled): Enable collection of ScanMetrics. |
Scan | setSmall(boolean small): Set whether this scan is a small scan |
Scan | setStartRow(byte[] startRow): Set the start row of the scan. |
Scan | setStopRow(byte[] stopRow): Set the stop row. |
Scan | setTimeRange(long minStamp, long maxStamp): Get versions of columns only within the specified timestamp range, [minStamp, maxStamp). |
Scan | setTimeStamp(long timestamp): Get versions of columns with the specified timestamp. |
Map<String,Object> | toMap(int maxCols): Compile the details beyond the scope of getFingerprint (row, columns, timestamps, etc.) into a Map along with the fingerprinted information. |
Methods inherited from class Query: doLoadColumnFamiliesOnDemand, getACL, getAuthorizations, getColumnFamilyTimeRange, getConsistency, getIsolationLevel, getLoadColumnFamiliesOnDemandValue, getReplicaId
Methods inherited from class OperationWithAttributes: getAttribute, getAttributeSize, getAttributesMap, getId
private static final org.apache.commons.logging.Log LOG
private static final String RAW_ATTR
private byte[] startRow
private byte[] stopRow
private int maxVersions
private int batch
private boolean allowPartialResults
Partial Results are Results that must be combined to form a complete Result. The Results had to be returned in fragments (i.e. as partials) because the size of the cells in the row exceeded max result size on the server. Typically partial results will be combined client side into complete results before being delivered to the caller. However, if this flag is set, the caller is indicating that they do not mind seeing partial results (i.e. they understand that the results returned from the Scanner may only represent part of a particular row). In such a case, any attempt to combine the partials into a complete result on the client side will be skipped, and the caller will be able to see the exact results returned from the server.
private int storeLimit
private int storeOffset
private boolean getScan
@Deprecated public static final String SCAN_ATTRIBUTES_METRICS_ENABLE
Deprecated. Since 1.0.0. Use setScanMetricsEnabled(boolean).
@Deprecated public static final String SCAN_ATTRIBUTES_METRICS_DATA
Deprecated. Use getScanMetrics().
public static final String SCAN_ATTRIBUTES_TABLE_NAME
@Deprecated public static final String HINT_LOOKAHEAD
private int caching
private long maxResultSize
private boolean cacheBlocks
private boolean reversed
private TimeRange tr
private Map<byte[],NavigableSet<byte[]>> familyMap
private boolean small
public Scan()
Create a Scan operation across all rows.
public Scan(byte[] startRow, Filter filter)
public Scan(byte[] startRow)
Create a Scan operation starting at the specified row.
If the specified row does not exist, the Scanner will start from the next closest row after the specified row.
startRow - row to start scanner at or after
public Scan(byte[] startRow, byte[] stopRow)
Create a Scan operation for the range of rows specified.
startRow - row to start scanner at or after (inclusive)
stopRow - row to stop scanner before (exclusive)
public Scan(Scan scan) throws IOException
Creates a new instance of this class while copying all values.
scan - The scan instance to copy from.
IOException - When copying the values fails.
public Scan(Get get)
Builds a scan object with the same specs as get.
get - get to model scan after
public boolean isGetScan()
private boolean isStartRowAndEqualsStopRow()
public Scan addFamily(byte[] family)
Get all columns from the specified family.
Overrides previous calls to addColumn for this family.
family - family name
public Scan addColumn(byte[] family, byte[] qualifier)
Get the column from the specified family with the specified qualifier.
Overrides previous calls to addFamily for this family.
family - family name
qualifier - column qualifier
public Scan setTimeRange(long minStamp, long maxStamp) throws IOException
Get versions of columns only within the specified timestamp range, [minStamp, maxStamp).
minStamp - minimum timestamp value, inclusive
maxStamp - maximum timestamp value, exclusive
Throws: IOException
See Also: setMaxVersions(), setMaxVersions(int)
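The half-open range [minStamp, maxStamp) and its interaction with the version limit can be sketched in plain Java. This is an illustrative model, not HBase code: the class `TimeRangeSketch`, the method `select`, and the use of a `TreeMap` keyed by timestamp to stand in for the versions of one column are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical illustration only; this is not HBase code. */
final class TimeRangeSketch {

    /**
     * Returns the values whose timestamp satisfies minStamp <= ts < maxStamp,
     * newest first, and never more than maxVersions of them.
     */
    static List<String> select(NavigableMap<Long, String> versions,
                               long minStamp, long maxStamp, int maxVersions) {
        List<String> out = new ArrayList<>();
        // subMap(min, true, max, false) is exactly the half-open interval [minStamp, maxStamp).
        NavigableMap<Long, String> inRange = versions.subMap(minStamp, true, maxStamp, false);
        for (Map.Entry<Long, String> e : inRange.descendingMap().entrySet()) {
            if (out.size() == maxVersions) {
                break; // version limit reached
            }
            out.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> versions = new TreeMap<>();
        versions.put(100L, "v1");
        versions.put(200L, "v2");
        versions.put(300L, "v3");
        // Timestamp 300 is excluded because maxStamp is exclusive.
        System.out.println(select(versions, 100L, 300L, Integer.MAX_VALUE)); // [v2, v1]
        // With a limit of one version, only the newest match is returned.
        System.out.println(select(versions, 100L, 300L, 1)); // [v2]
    }
}
```

The second call shows why the version-limit methods are cross-referenced here: a wide time range still yields a single value per column when only one version is requested.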
public Scan setTimeStamp(long timestamp) throws IOException
Get versions of columns with the specified timestamp.
timestamp - version timestamp
Throws: IOException
See Also: setMaxVersions(), setMaxVersions(int)
public Scan setColumnFamilyTimeRange(byte[] cf, long minStamp, long maxStamp)
Description copied from class: Query
Get versions of columns only within the specified timestamp range, [minStamp, maxStamp) on a per CF basis.
Overrides: setColumnFamilyTimeRange in class Query
cf - the column family for which you want to restrict
minStamp - minimum timestamp value, inclusive
maxStamp - maximum timestamp value, exclusive
public Scan setStartRow(byte[] startRow)
Set the start row of the scan.
startRow - row to start scan on (inclusive)
Note: To make startRow exclusive, add a trailing 0 byte.
public Scan setStopRow(byte[] stopRow)
Set the stop row.
stopRow - row to end at (exclusive)
Note: To make stopRow inclusive, add a trailing 0 byte.
Note: To filter on a rowKey prefix, use setRowPrefixFilter(byte[]). The 'trailing 0' will not yield the desired result.
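The trailing-0-byte notes above rely on row keys being ordered byte by byte. A minimal plain-Java sketch, assuming unsigned lexicographic ordering and using a `TreeMap` as a stand-in for a table, shows why appending a 0 byte flips a bound between inclusive and exclusive. The class `RowRangeSketch` and its helpers are hypothetical and are not part of the HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical illustration only; this is not HBase code. */
final class RowRangeSketch {

    /** Unsigned, byte-by-byte lexicographic order (assumed to match row key ordering). */
    static final Comparator<byte[]> ROW_ORDER = Arrays::compareUnsigned;

    /** Creates an empty stand-in "table" sorted by row key. */
    static NavigableMap<byte[], String> newTable() {
        return new TreeMap<>(ROW_ORDER);
    }

    /** Converts a string to a row key. */
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Returns row with one 0x00 byte appended: the smallest key that sorts after row. */
    static byte[] withTrailingZero(byte[] row) {
        return Arrays.copyOf(row, row.length + 1); // the new last byte defaults to 0
    }

    /** Returns the rows in [startRow, stopRow): start inclusive, stop exclusive. */
    static NavigableMap<byte[], String> scan(NavigableMap<byte[], String> table,
                                             byte[] startRow, byte[] stopRow) {
        return table.subMap(startRow, true, stopRow, false);
    }

    public static void main(String[] args) {
        NavigableMap<byte[], String> table = newTable();
        table.put(key("row1"), "a");
        table.put(key("row2"), "b");
        table.put(key("row3"), "c");

        // Default bounds: row2 is the stop row, so it is excluded.
        System.out.println(scan(table, key("row1"), key("row2")).values()); // [a]
        // Trailing 0 on the stop row: row2 is now included.
        System.out.println(scan(table, key("row1"), withTrailingZero(key("row2"))).values()); // [a, b]
        // Trailing 0 on the start row: row1 is now excluded.
        System.out.println(scan(table, withTrailingZero(key("row1")), key("row3")).values()); // [b]
    }
}
```

Because no key can sort strictly between a row and that row followed by a 0 byte, the adjusted bound selects or skips exactly the one row in question.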
public Scan setRowPrefixFilter(byte[] rowPrefix)
Set a filter (using stopRow and startRow) so the result set only contains rows where the rowKey starts with the specified prefix.
This is a utility method that converts the desired rowPrefix into the appropriate values for the startRow and stopRow to achieve the desired result.
This can safely be used in combination with setFilter.
NOTE: Calling setStartRow(byte[]) and/or setStopRow(byte[]) after this method will yield undefined results.
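The conversion from a rowPrefix to a stop row mentioned above can be sketched as "increment the prefix as if it were an unsigned number, dropping trailing 0xFF bytes first". The following plain-Java sketch illustrates that idea under stated assumptions and is not presented as the library's implementation: the class `PrefixStopRowSketch`, the method `closestRowAfterPrefix`, and the use of an empty array to mean "scan to the end of the table" are all hypothetical names and conventions.

```java
import java.util.Arrays;

/** Hypothetical illustration only; this is not HBase code. */
final class PrefixStopRowSketch {

    /** In this sketch an empty array stands for "no stop row" (scan to the end of the table). */
    static final byte[] END_OF_TABLE = new byte[0];

    /** Returns the smallest row key that sorts after every key starting with prefix. */
    static byte[] closestRowAfterPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so skip over them.
        int offset = prefix.length;
        while (offset > 0 && prefix[offset - 1] == (byte) 0xFF) {
            offset--;
        }
        if (offset == 0) {
            // Empty or all-0xFF prefix: no finite key follows it.
            return END_OF_TABLE;
        }
        // Keep the part before the trailing 0xFF bytes and increment its last byte.
        byte[] stopRow = Arrays.copyOfRange(prefix, 0, offset);
        stopRow[offset - 1]++;
        return stopRow;
    }

    public static void main(String[] args) {
        byte[] binaryPrefix = {0x12, 0x23, (byte) 0xFF, (byte) 0xFF};
        // The result is shorter than the prefix: { 0x12, 0x24 }.
        System.out.println(Arrays.toString(closestRowAfterPrefix(binaryPrefix))); // [18, 36]
        // For an ASCII prefix only the last byte changes: "abc" becomes "abd".
        System.out.println(new String(closestRowAfterPrefix("abc".getBytes()))); // abd
    }
}
```

A scan over [prefix, closestRowAfterPrefix(prefix)) then covers exactly the keys that start with the prefix, which is also why a plain trailing 0 byte on the stop row is not enough for prefix filtering.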
rowPrefix - the prefix all rows must start with. (Set null to remove the filter.)
private byte[] calculateTheClosestNextRowKeyForPrefix(byte[] rowKeyPrefix)
When scanning for a prefix the scan should stop immediately after the last row that has the specified prefix. This method calculates the closest next rowKey immediately following the given rowKeyPrefix.
IMPORTANT: This converts a rowKeyPrefix into a rowKey.
If the prefix is an 'ASCII' string put into a byte[] then this is easy because you can simply increment the last byte of the array. But if your application uses real binary rowids you may run into the scenario that your prefix is something like:
{ 0x12, 0x23, 0xFF, 0xFF }
In that case the trailing 0xFF bytes cannot be incremented, so the closest next rowKey is shorter than the prefix: { 0x12, 0x24 }.
rowKeyPrefix - the rowKeyPrefix.
public Scan setMaxVersions()
Get all available versions.
public Scan setMaxVersions(int maxVersions)
Get up to the specified number of versions of each column.
maxVersions - maximum versions for each column
public Scan setBatch(int batch)
Set the maximum number of values to return for each call to next(). Calling this method with any value is equivalent to calling setAllowPartialResults(boolean) with a value of true; partial results may be returned if this method is called. Use setMaxResultSize(long) to limit the size of a Scan's Results instead.
batch - the maximum number of values
public Scan setMaxResultsPerColumnFamily(int limit)
Set the maximum number of values to return per row per Column Family
limit - the maximum number of values returned / row / CF
public Scan setRowOffsetPerColumnFamily(int offset)
Set offset for the row per Column Family.
offset - is the number of kvs that will be skipped.
public Scan setCaching(int caching)
Set the number of rows for caching that will be passed to scanners. If not set, the configuration setting HConstants.HBASE_CLIENT_SCANNER_CACHING will apply. Higher caching values will enable faster scanners but will use more memory.
caching - the number of rows for caching
public long getMaxResultSize()
See Also: setMaxResultSize(long)
public Scan setMaxResultSize(long maxResultSize)
Set the maximum result size.
maxResultSize - The maximum result size in bytes.
public Scan setFilter(Filter filter)
Description copied from class: Query
Apply the specified server-side filter when performing the Query. Filter.filterKeyValue(Cell) is called AFTER all tests for ttl, column match, deletes and max versions have been run.
public Scan setFamilyMap(Map<byte[],NavigableSet<byte[]>> familyMap)
Setting the familyMap
familyMap - map of family to qualifier
public Map<byte[],NavigableSet<byte[]>> getFamilyMap()
Getting the familyMap
public int numFamilies()
public boolean hasFamilies()
public byte[][] getFamilies()
public byte[] getStartRow()
public byte[] getStopRow()
public int getMaxVersions()
public int getBatch()
public int getMaxResultsPerColumnFamily()
public int getRowOffsetPerColumnFamily()
public int getCaching()
public TimeRange getTimeRange()
public boolean hasFilter()
public Scan setCacheBlocks(boolean cacheBlocks)
Set whether blocks should be cached for this Scan.
This is true by default. When true, default settings of the table and family are used (this will never override caching blocks if the block cache is disabled for that family or entirely).
cacheBlocks - if false, default settings are overridden and blocks will not be cached
public boolean getCacheBlocks()
Get whether blocks should be cached for this Scan.
public Scan setReversed(boolean reversed)
Set whether this scan is a reversed one.
This is false by default, which means a forward (normal) scan.
reversed - if true, the scan will be in backward order
public boolean isReversed()
Get whether this scan is a reversed one.
public Scan setAllowPartialResults(boolean allowPartialResults)
Setting whether the caller wants to see the partial results that may be returned from the server.
allowPartialResults -
public boolean getAllowPartialResults()
ResultScanner.next()
public Scan setLoadColumnFamiliesOnDemand(boolean value)
Description copied from class: Query
Set the value indicating whether loading CFs on demand should be allowed (cluster default is false).
Overrides: setLoadColumnFamiliesOnDemand in class Query
public Map<String,Object> getFingerprint()
Specified by: getFingerprint in class Operation
public Map<String,Object> toMap(int maxCols)
Compile the details beyond the scope of getFingerprint (row, columns, timestamps, etc.) into a Map along with the fingerprinted information.
public Scan setRaw(boolean raw)
Enable/disable "raw" mode for this scan.
raw - True/False to enable/disable "raw" mode.
public boolean isRaw()
public Scan setSmall(boolean small)
Set whether this scan is a small scan.
A small scan should use pread, while a big scan can use seek + read. Seek + read is fast but can cause two problems: (1) resource contention and (2) too much network I/O. See "[89-fb] Using pread for non-compaction read request", https://issues.apache.org/jira/browse/HBASE-7266. In addition, when this is set to true, openScanner, next and closeScanner are done in one RPC call, which means better performance for a small scan [HBASE-9488]. Generally, if the scan range is within one data block (64KB), it can be considered a small scan.
small -
public boolean isSmall()
Get whether this scan is a small scan
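public Scan setAttribute(String name, byte[] value)
Description copied from interface: Attributes
Sets an attribute.
Specified by: setAttribute in interface Attributes
Overrides: setAttribute in class OperationWithAttributes
name - attribute name
value - attribute value
public Scan setId(String id)
Description copied from class: OperationWithAttributes
This method allows you to set an identifier on an operation.
Overrides: setId in class OperationWithAttributes
id - id to set for the scan
public Scan setAuthorizations(Authorizations authorizations)
Description copied from class: Query
Sets the authorizations to be used by this Query
Overrides: setAuthorizations in class Query
public Scan setACL(Map<String,Permission> perms)
public Scan setACL(String user, Permission perms)
public Scan setConsistency(Consistency consistency)
Description copied from class: Query
Sets the consistency level for this operation
Overrides: setConsistency in class Query
consistency - the consistency level
public Scan setReplicaId(int Id)
Description copied from class: Query
Specify region replica id where Query will fetch data from. Use this together with Query.setConsistency(Consistency) passing Consistency.TIMELINE to read data from a specific replicaId.
Overrides: setReplicaId in class Query
public Scan setIsolationLevel(IsolationLevel level)
Description copied from class: Query
Set the isolation level for this query.
Overrides: setIsolationLevel in class Query
level - IsolationLevel for this query
static Scan createGetClosestRowOrBeforeReverseScan(byte[] row)
Utility that creates a Scan that will do a small scan in reverse from passed row looking for next closest row.
row -
family -
Returns: a Scan for the passed row and family, to scan in reverse for one row only.
public Scan setScanMetricsEnabled(boolean enabled)
Enable collection of ScanMetrics. For advanced users.
enabled - Set to true to enable accumulating scan metrics
public boolean isScanMetricsEnabled()
public ScanMetrics getScanMetrics()
See Also: setScanMetricsEnabled(boolean)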
Copyright © 2007–2019 The Apache Software Foundation. All rights reserved.