Class StoreFileInfo
java.lang.Object
org.apache.hadoop.hbase.regionserver.StoreFileInfo
- All Implemented Interfaces:
org.apache.hadoop.conf.Configurable
Describe a StoreFile (hfile, reference, link)
-
Field Summary
Modifier and Type | Field | Description
private org.apache.hadoop.conf.Configuration
private RegionCoprocessorHost
private long
static final boolean
private final org.apache.hadoop.fs.FileSystem
private HDFSBlocksDistribution
private static final Pattern
Regex that will work for hfiles
static final String
A non-capture group, for hfiles, so that this can be embedded.
private HFileInfo
private final org.apache.hadoop.fs.Path
private final HFileLink
private static final org.slf4j.Logger
private final boolean
private final boolean
private static final Pattern
Regex that will work for straight reference names (<hfile>.<parentEncRegion>) and hfilelink reference names (<table>=<region>-<hfile>.<parentEncRegion>).
private final AtomicInteger
private final Reference
private static final String
Cells in a bulkloaded file don't have a sequenceId since they don't go through memstore.
private static final int
private long
static final String
-
Constructor Summary
Modifier | Constructor | Description
StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, long createdTimestamp, org.apache.hadoop.fs.Path initialPath, long size, Reference reference, HFileLink link, boolean primaryReplica)
Create a Store File Info from an HFileLink and a Reference
private StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus, org.apache.hadoop.fs.Path initialPath, boolean primaryReplica, StoreFileTracker sft)
StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus, HFileLink link)
Create a Store File Info from an HFileLink
StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus, Reference reference)
Create a Store File Info from a Reference
StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus, Reference reference, HFileLink link)
Create a Store File Info from an HFileLink and a Reference
-
Method Summary
Modifier and Type | Method | Description
computeHDFSBlocksDistribution
(org.apache.hadoop.fs.FileSystem fs) Compute the HDFS Block Distribution for this StoreFile
private HDFSBlocksDistribution
computeHDFSBlocksDistributionInternal
(org.apache.hadoop.fs.FileSystem fs)
private static HDFSBlocksDistribution
computeRefFileHDFSBlockDistribution
(org.apache.hadoop.fs.FileSystem fs, Reference reference, org.apache.hadoop.fs.FileStatus status) Helper function to compute HDFS blocks distribution of a given reference file. For reference file, we don't compute the exact value.
createReader
(ReaderContext context, CacheConfig cacheConf) (package private) ReaderContext
createReaderContext
(boolean doDropBehind, long readahead, ReaderContext.ReaderType type) static StoreFileInfo
createStoreFileInfoForHFile
(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path initialPath, boolean primaryReplica) (package private) int
boolean
static String
formatBulkloadSeqId
(long seqId)
Return the active file name that contains the real data.
static OptionalLong
getBulkloadSeqId
(org.apache.hadoop.fs.Path path) org.apache.hadoop.conf.Configuration
getConf()
long
Returns timestamp when this file was created (as returned by filesystem)
org.apache.hadoop.fs.FileStatus
Returns The FileStatus of the file
(package private) org.apache.hadoop.fs.FileSystem
Returns the HDFS block distribution
long
Returns the modification time of the file.
org.apache.hadoop.fs.Path
getPath()
Returns The Path of the file
(package private) int
org.apache.hadoop.fs.FileStatus
getReferencedFileStatus
(org.apache.hadoop.fs.FileSystem fs) Get the FileStatus of the file referenced by this StoreFileInfo
static org.apache.hadoop.fs.Path
getReferredToFile
(org.apache.hadoop.fs.Path p) getReferredToRegionAndFile
(String referenceFile) long
getSize()
Size of the HFile
static boolean
hasBulkloadSeqId
(org.apache.hadoop.fs.Path path) int
hashCode()
(package private) int
(package private) void
void
initHFileInfo
(ReaderContext context) static boolean
static boolean
isHFile
(org.apache.hadoop.fs.Path path) boolean
isLink()
Returns True if the store file is a link
static boolean
isMobFile
(org.apache.hadoop.fs.Path path) Checks if the file is a MOB file
static boolean
isMobRefFile
(org.apache.hadoop.fs.Path path) Checks if the file is a MOB reference file, created by snapshot
(package private) boolean
boolean
Returns True if the store file is a Reference
static boolean
isReference
(String name) static boolean
isReference
(org.apache.hadoop.fs.Path path) boolean
Returns True if the store file is a top Reference
static boolean
isValid
(org.apache.hadoop.fs.FileStatus fileStatus) Return if the specified file is a valid store file or not.
(package private) StoreFileReader
postStoreFileReaderOpen
(ReaderContext context, CacheConfig cacheConf, StoreFileReader reader) (package private) StoreFileReader
preStoreFileReaderOpen
(ReaderContext context, CacheConfig cacheConf) void
setConf
(org.apache.hadoop.conf.Configuration conf) void
setRegionCoprocessorHost
(RegionCoprocessorHost coprocessorHost) Sets the region coprocessor env.
toString()
static boolean
validateStoreFileName
(String fileName) Validate the store file name.
-
Field Details
-
LOG
-
HFILE_NAME_REGEX
A non-capture group, for hfiles, so that this can be embedded. HFiles are uuid ([0-9a-z]+). Bulk loaded hfiles have (_SeqId_[0-9]+_) as a suffix. The mob del file has (_del) as a suffix.
- See Also:
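The naming rules described for HFILE_NAME_REGEX can be sketched with plain java.util.regex. This is a minimal illustration rebuilt from the prose above, not the constant shipped in HBase: the class name HFileNameSketch, the constant names, and the helper looksLikeHFile are hypothetical, and the character class is taken from the description as written.

```java
import java.util.regex.Pattern;

// Illustrative only: a pattern rebuilt from the field description,
// not the value of StoreFileInfo.HFILE_NAME_REGEX itself.
final class HFileNameSketch {
  // Non-capturing group, so the expression can be embedded in a larger pattern.
  // Optional suffixes: _SeqId_<digits>_ for bulk loaded files, _del for mob del files.
  static final String HFILE_NAME_REGEX_SKETCH =
      "(?:[0-9a-z]+(?:(?:_SeqId_[0-9]+_)|(?:_del))?)";

  static final Pattern HFILE_NAME_PATTERN_SKETCH =
      Pattern.compile("^" + HFILE_NAME_REGEX_SKETCH + "$");

  // Hypothetical helper: true if the whole name matches the sketch pattern.
  static boolean looksLikeHFile(String name) {
    return HFILE_NAME_PATTERN_SKETCH.matcher(name).matches();
  }
}
```

Because the group is non-capturing, embedding it in a larger expression does not shift the numbering of that expression's own capture groups.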
-
HFILE_NAME_PATTERN
Regex that will work for hfiles -
REF_NAME_PATTERN
Regex that will work for straight reference names (<hfile>.<parentEncRegion>) and hfilelink reference names (<table>=<region>-<hfile>.<parentEncRegion>). If reference, then the regex has more than just one group. Group 1, hfile/hfilelink pattern, is this file's id. Group 2 '(.+)' is the reference's parent region name.
-
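The two-group layout described for REF_NAME_PATTERN could look roughly like the sketch below. It assumes simplified patterns for the hfile and hfilelink parts (the real link pattern is defined elsewhere in HBase and is likely stricter); RefNameSketch, HFILE, LINK, REF and parse are hypothetical names used only for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: simplified stand-ins for the hfile and hfilelink patterns.
final class RefNameSketch {
  static final String HFILE = "(?:[0-9a-z]+(?:(?:_SeqId_[0-9]+_)|(?:_del))?)";
  // Assumed link layout: <table>=<region>-<hfile>
  static final String LINK = "(?:[^=]+=[0-9a-z]+-" + HFILE + ")";
  // Group 1: this file's id (hfile or hfilelink). Group 2: parent region name.
  static final Pattern REF = Pattern.compile("^(" + HFILE + "|" + LINK + ")\\.(.+)$");

  // Returns {fileId, parentRegion}, or empty if the name is not a reference name.
  static Optional<String[]> parse(String name) {
    Matcher m = REF.matcher(name);
    if (!m.matches()) {
      return Optional.empty();
    }
    return Optional.of(new String[] { m.group(1), m.group(2) });
  }
}
```

With these assumptions, a plain hfile name without a ".<parentEncRegion>" suffix does not match, which is what distinguishes a reference name from an ordinary store file name.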
STORE_FILE_READER_NO_READAHEAD
- See Also:
-
DEFAULT_STORE_FILE_READER_NO_READAHEAD
- See Also:
-
conf
-
fs
-
hdfsBlocksDistribution
-
hfileInfo
-
reference
-
link
-
initialPath
-
coprocessorHost
-
createdTimestamp
-
size
-
-
noReadahead
-
refCount
-
SEQ_ID_MARKER
Cells in a bulkloaded file don't have a sequenceId since they don't go through memstore. When a bulkload file is committed, the current memstore ts is stamped onto the file name as the sequenceId of the file. At read time, the sequenceId is copied onto all of the cells returned so that they can be properly sorted relative to other cells in other files. Further, when opening multiple files for scan, the sequence id is used to ensure that the bulkload file's scanner is properly sorted amongst the other scanners. Non-bulkloaded files get their sequenceId from the MAX_MEMSTORE_TS_KEY since those go through the memstore and have true sequenceIds.
- See Also:
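The stamping and recovery of a sequence id in a file name can be sketched as follows. This assumes the marker text is "_SeqId_" followed by the digits and a closing underscore, as suggested by the (_SeqId_[0-9]+_) suffix in the HFILE_NAME_REGEX description; BulkloadSeqIdSketch and its methods are hypothetical stand-ins that work on plain strings rather than Path objects.

```java
import java.util.OptionalLong;

// Illustrative only: string-based stand-ins for the bulkload SeqId helpers.
final class BulkloadSeqIdSketch {
  // Assumed marker text, inferred from the (_SeqId_[0-9]+_) suffix.
  static final String SEQ_ID_MARKER = "_SeqId_";

  // Builds the suffix appended to a bulkloaded file name.
  static String format(long seqId) {
    return SEQ_ID_MARKER + seqId + "_";
  }

  // True if the name carries the marker at all.
  static boolean has(String fileName) {
    return fileName.contains(SEQ_ID_MARKER);
  }

  // Extracts the stamped sequence id, or empty if the name is not well formed.
  static OptionalLong get(String fileName) {
    int start = fileName.indexOf(SEQ_ID_MARKER);
    if (start < 0) {
      return OptionalLong.empty();
    }
    int from = start + SEQ_ID_MARKER.length();
    int end = fileName.indexOf('_', from);
    if (end <= from) {
      return OptionalLong.empty();
    }
    try {
      return OptionalLong.of(Long.parseLong(fileName.substring(from, end)));
    } catch (NumberFormatException e) {
      return OptionalLong.empty();
    }
  }
}
```

Returning OptionalLong mirrors the return type listed for getBulkloadSeqId and avoids a sentinel value for files that were not bulk loaded.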
-
SEQ_ID_MARKER_LENGTH
-
-
Constructor Details
-
StoreFileInfo
private StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus, org.apache.hadoop.fs.Path initialPath, boolean primaryReplica, StoreFileTracker sft) throws IOException - Throws:
IOException
-
StoreFileInfo
public StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus, HFileLink link)
Create a Store File Info from an HFileLink
- Parameters:
conf - The Configuration to use
fs - The current file system to use
fileStatus - The FileStatus of the file
-
StoreFileInfo
public StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus, Reference reference)
Create a Store File Info from a Reference
- Parameters:
conf - The Configuration to use
fs - The current file system to use
fileStatus - The FileStatus of the file
reference - The reference instance
-
StoreFileInfo
public StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus, Reference reference, HFileLink link)
Create a Store File Info from an HFileLink and a Reference
- Parameters:
conf - The Configuration to use
fs - The current file system to use
fileStatus - The FileStatus of the file
reference - The reference instance
link - The link instance
-
StoreFileInfo
public StoreFileInfo(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, long createdTimestamp, org.apache.hadoop.fs.Path initialPath, long size, Reference reference, HFileLink link, boolean primaryReplica)
Create a Store File Info from an HFileLink and a Reference
- Parameters:
conf - The Configuration to use
fs - The current file system to use
fileStatus - The FileStatus of the file
reference - The reference instance
link - The link instance
-
-
Method Details
-
getConf
- Specified by:
getConf
in interfaceorg.apache.hadoop.conf.Configurable
-
setConf
- Specified by:
setConf
in interfaceorg.apache.hadoop.conf.Configurable
-
getSize
Size of the Hfile -
setRegionCoprocessorHost
Sets the region coprocessor env. -
getReference
- Returns:
- the Reference object associated to this StoreFileInfo. null if the StoreFile is not a reference.
-
isReference
Returns True if the store file is a Reference -
isTopReference
Returns True if the store file is a top Reference -
isLink
Returns True if the store file is a link -
getHDFSBlockDistribution
Returns the HDFS block distribution -
createReader
public StoreFileReader createReader(ReaderContext context, CacheConfig cacheConf) throws IOException - Throws:
IOException
-
createReaderContext
ReaderContext createReaderContext(boolean doDropBehind, long readahead, ReaderContext.ReaderType type) throws IOException - Throws:
IOException
-
computeHDFSBlocksDistribution
public HDFSBlocksDistribution computeHDFSBlocksDistribution(org.apache.hadoop.fs.FileSystem fs) throws IOException Compute the HDFS Block Distribution for this StoreFile- Throws:
IOException
-
computeHDFSBlocksDistributionInternal
private HDFSBlocksDistribution computeHDFSBlocksDistributionInternal(org.apache.hadoop.fs.FileSystem fs) throws IOException - Throws:
IOException
-
getReferencedFileStatus
public org.apache.hadoop.fs.FileStatus getReferencedFileStatus(org.apache.hadoop.fs.FileSystem fs) throws IOException
Get the FileStatus of the file referenced by this StoreFileInfo
- Parameters:
fs - The current file system to use.
- Returns:
- The FileStatus of the file referenced by this StoreFileInfo
- Throws:
IOException
-
getPath
Returns The Path of the file
-
getFileStatus
Returns The FileStatus of the file
- Throws:
IOException
-
getModificationTime
Returns the modification time of the file.
- Throws:
IOException
-
toString
-
hasBulkloadSeqId
- Returns:
- True if the file name looks like a bulkloaded file, based on the presence of the SeqId marker added to those files.
- See Also:
-
getBulkloadSeqId
- Returns:
- If the path is a properly named bulkloaded file, returns the sequence id stamped at the end of the file name.
- See Also:
-
formatBulkloadSeqId
- Returns:
- A string value for appending to the end of a bulkloaded file name, containing the properly formatted SeqId marker.
- See Also:
-
isHFile
- Parameters:
path - Path to check.
- Returns:
- True if the path has the format of an HFile.
-
isHFile
-
isMobFile
Checks if the file is a MOB file
- Parameters:
path - path to a file
- Returns:
- true if the file is a MOB file, false otherwise
-
isMobRefFile
Checks if the file is a MOB reference file, created by snapshot
- Parameters:
path - path to a file
- Returns:
- true if the file is a MOB reference file, false otherwise
-
isReference
- Parameters:
path - Path to check.
- Returns:
- True if the path has format of a HStoreFile reference.
-
isReference
- Parameters:
name - file name to check.
- Returns:
- True if the name has format of a HStoreFile reference.
-
getCreatedTimestamp
Returns timestamp when this file was created (as returned by filesystem) -
getReferredToFile
-
getReferredToRegionAndFile
-
validateStoreFileName
Validate the store file name.
- Parameters:
fileName - name of the file to validate
- Returns:
- true if the file could be a valid store file, false otherwise
-
isValid
Returns whether the specified file is a valid store file.
- Parameters:
fileStatus - The FileStatus of the file
- Returns:
- true if the file is valid
- Throws:
IOException
-
computeRefFileHDFSBlockDistribution
private static HDFSBlocksDistribution computeRefFileHDFSBlockDistribution(org.apache.hadoop.fs.FileSystem fs, Reference reference, org.apache.hadoop.fs.FileStatus status) throws IOException
Helper function to compute the HDFS blocks distribution of a given reference file. For a reference file, we don't compute the exact value; we use an estimate instead, given it might be good enough. We assume the bottom part takes the first half of the reference file and the top part takes the second half of the reference file. This is just an estimate, given that the midkey of the region != the midkey of the HFile, and the number and size of keys vary. If this estimate isn't good enough, we can improve it later.
- Parameters:
fs - The FileSystem
reference - The reference
status - The reference FileStatus
- Returns:
- HDFS blocks distribution
- Throws:
IOException
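The half-file estimate described for computeRefFileHDFSBlockDistribution amounts to picking a byte range before looking up block locations. A minimal sketch of that range arithmetic, assuming the split point is simply the file length divided by two (RefFileRangeSketch and estimatedRange are hypothetical names, and the block lookup itself is left out):

```java
// Illustrative only: the byte range a reference is assumed to cover.
final class RefFileRangeSketch {
  /**
   * Returns {startOffset, length} for the assumed half of the file.
   * Bottom references take the first half, top references the second half.
   */
  static long[] estimatedRange(boolean top, long fileLength) {
    long half = fileLength / 2;
    if (top) {
      // Second half; picks up the extra byte when the length is odd.
      return new long[] { half, fileLength - half };
    }
    return new long[] { 0L, half };
  }
}
```

The two ranges together cover the whole file exactly once, so summing the estimates for a top and a bottom reference gives the same total as the full file.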
-
equals
-
hashCode
-
getActiveFileName
Return the active file name that contains the real data. For a referenced hfile, we return the name of the reference file, as it will be used to construct the StoreFileReader. For a linked hfile, we return the name of the file being linked.
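The selection rule for the active file name can be sketched on plain strings. This assumes a link name has the layout <table>=<region>-<hfile>, as in the REF_NAME_PATTERN description, and that the hfile part contains no '-'; ActiveFileNameSketch and activeFileName are hypothetical names, not the HBase implementation.

```java
// Illustrative only: which name is treated as holding the real data.
final class ActiveFileNameSketch {
  static String activeFileName(String fileName, boolean isReference, boolean isLink) {
    // References (including references to links) and plain hfiles keep their own name.
    if (isReference || !isLink) {
      return fileName;
    }
    // Assumed link layout <table>=<region>-<hfile>: the linked hfile name
    // is whatever follows the last '-'.
    return fileName.substring(fileName.lastIndexOf('-') + 1);
  }
}
```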
-
getFileSystem
org.apache.hadoop.fs.FileSystem getFileSystem() -
isNoReadahead
boolean isNoReadahead() -
getHFileInfo
-
initHDFSBlocksDistribution
- Throws:
IOException
-
preStoreFileReaderOpen
StoreFileReader preStoreFileReaderOpen(ReaderContext context, CacheConfig cacheConf) throws IOException - Throws:
IOException
-
postStoreFileReaderOpen
StoreFileReader postStoreFileReaderOpen(ReaderContext context, CacheConfig cacheConf, StoreFileReader reader) throws IOException - Throws:
IOException
-
initHFileInfo
- Throws:
IOException
-
getRefCount
int getRefCount() -
increaseRefCount
int increaseRefCount() -
decreaseRefCount
int decreaseRefCount() -
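Since the refCount field is listed as an AtomicInteger and the three accessors return int, the counting could plausibly be as thin as the sketch below. This is an assumption about the shape of the code, not a copy of it; RefCountSketch is a hypothetical class.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a reference counter backed by AtomicInteger.
final class RefCountSketch {
  private final AtomicInteger refCount = new AtomicInteger(0);

  int getRefCount() {
    return refCount.get();
  }

  // Returns the count after the increment.
  int increaseRefCount() {
    return refCount.incrementAndGet();
  }

  // Returns the count after the decrement.
  int decreaseRefCount() {
    return refCount.decrementAndGet();
  }
}
```

Using an atomic counter lets concurrent readers register and release themselves without external locking.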
createStoreFileInfoForHFile
public static StoreFileInfo createStoreFileInfoForHFile(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path initialPath, boolean primaryReplica) throws IOException - Throws:
IOException
-