Package org.apache.hadoop.hbase.util
Class FSUtils
java.lang.Object
org.apache.hadoop.hbase.util.FSUtils
Utility methods for interacting with the underlying file system.
-
Nested Class Summary
Modifier and Type	Class	Description
static class	FSUtils.BlackListDirFilter
    Directory filter that doesn't include any of the directories in the specified blacklist.
static class	FSUtils.DirFilter
    A PathFilter that only allows directories.
static class	FSUtils.FamilyDirFilter
    Filter for all dirs that are legal column family names.
(package private) static class	FSUtils.FileFilter
    A PathFilter that returns only regular files.
static class	FSUtils.HFileFilter
    Filter for HFiles that excludes reference files.
static class	FSUtils.HFileLinkFilter
    Filter for HFileLinks (StoreFiles and HFiles not included).
(package private) static interface	FSUtils.ProgressReporter
    Called every so-often by the storefile map builder getTableStoreFilePathMap to report progress.
static class	FSUtils.ReferenceAndLinkFileFilter
static class	FSUtils.ReferenceFileFilter
static class	FSUtils.RegionDirFilter
    Filter for all dirs that don't start with '.'.
static class	FSUtils.UserTableDirFilter
    A PathFilter that returns usertable directories.
-
Field Summary
-
Constructor Summary
-
Method Summary
Modifier and Type	Method	Description
static void	addToHDFSBlocksDistribution(HDFSBlocksDistribution blocksDistribution, org.apache.hadoop.fs.BlockLocation[] blockLocations)
    Update blocksDistribution with blockLocations.
static boolean	checkClusterIdExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, long wait)
    Checks that a cluster ID file exists in the HBase root directory.
static void	checkDfsSafeMode(org.apache.hadoop.conf.Configuration conf)
    Check whether dfs is in safe mode.
static void	checkFileSystemAvailable(org.apache.hadoop.fs.FileSystem fs)
    Checks to see if the specified file system is available.
static void	checkShortCircuitReadBufferSize(org.apache.hadoop.conf.Configuration conf)
    Check if the short circuit read buffer size is set and, if not, set it to the hbase value.
static void	checkVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, boolean message)
    Verifies the current version of the file system.
static void	checkVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, boolean message, int wait, int retries)
    Verifies the current version of the file system.
static HDFSBlocksDistribution	computeHDFSBlocksDistribution(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus status, long start, long length)
    Compute the HDFS blocks distribution of a given file, or a portion of the file.
static HDFSBlocksDistribution	computeHDFSBlocksDistribution(org.apache.hadoop.hdfs.client.HdfsDataInputStream inputStream)
    Compute the HDFS block distribution of a given HdfsDataInputStream.
private static List<org.apache.hadoop.fs.Path>	copyFiles(org.apache.hadoop.fs.FileSystem srcFS, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.FileSystem dstFS, org.apache.hadoop.fs.Path dst, org.apache.hadoop.conf.Configuration conf, ExecutorService pool, List<Future<Void>> futures)
static List<org.apache.hadoop.fs.Path>	copyFilesParallel(org.apache.hadoop.fs.FileSystem srcFS, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.FileSystem dstFS, org.apache.hadoop.fs.Path dst, org.apache.hadoop.conf.Configuration conf, int threads)
static org.apache.hadoop.fs.FSDataOutputStream	create(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.permission.FsPermission perm, InetSocketAddress[] favoredNodes)
    Create the specified file on the filesystem.
static boolean	deleteRegionDir(org.apache.hadoop.conf.Configuration conf, RegionInfo hri)
    Delete the region directory if it exists.
static List<org.apache.hadoop.fs.FileStatus>	filterFileStatuses(Iterator<org.apache.hadoop.fs.FileStatus> input, FileStatusFilter filter)
    Filters FileStatuses in an iterator and returns a list.
static List<org.apache.hadoop.fs.FileStatus>	filterFileStatuses(org.apache.hadoop.fs.FileStatus[] input, FileStatusFilter filter)
    Filters FileStatuses in an array and returns a list.
static ClusterId	getClusterId(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
    Returns the value of the unique cluster ID stored for this HBase instance.
static org.apache.hadoop.hdfs.DFSHedgedReadMetrics	getDFSHedgedReadMetrics(org.apache.hadoop.conf.Configuration c)
    Returns the DFSClient DFSHedgedReadMetrics instance, or null if it can't be found or we are not on hdfs.
static List<org.apache.hadoop.fs.Path>	getFamilyDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path regionDir)
    Given a particular region dir, return all the family dirs inside it.
private static List<org.apache.hadoop.fs.Path>	getFilePaths(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.fs.PathFilter pathFilter)
private static String[]	getHostsForLocations(org.apache.hadoop.hdfs.protocol.LocatedBlock block)
static List<org.apache.hadoop.fs.Path>	getLocalTableDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
private static Set<InetSocketAddress>	getNNAddresses(org.apache.hadoop.hdfs.DistributedFileSystem fs, org.apache.hadoop.conf.Configuration conf)
    Returns a set containing all namenode addresses of fs.
static List<org.apache.hadoop.fs.Path>	getReferenceAndLinkFilePaths(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path familyDir)
static List<org.apache.hadoop.fs.Path>	getReferenceFilePaths(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path familyDir)
static Map<String,Map<String,Float>>	getRegionDegreeLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf)
    Scans the root path of the file system to get the degree of locality for each region on each of the servers having at least one block of that region.
static Map<String,Map<String,Float>>	getRegionDegreeLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf, String desiredTable, int threadPoolSize)
    Scans the root path of the file system to get the degree of locality for each region on each of the servers having at least one block of that region.
static org.apache.hadoop.fs.Path	getRegionDirFromRootDir(org.apache.hadoop.fs.Path rootDir, RegionInfo region)
static org.apache.hadoop.fs.Path	getRegionDirFromTableDir(org.apache.hadoop.fs.Path tableDir, String encodedRegionName)
static org.apache.hadoop.fs.Path	getRegionDirFromTableDir(org.apache.hadoop.fs.Path tableDir, RegionInfo region)
static List<org.apache.hadoop.fs.Path>	getRegionDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path tableDir)
    Given a particular table dir, return all the region dirs inside it, excluding files such as .tableinfo.
private static void	getRegionLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf, String desiredTable, int threadPoolSize, Map<String,Map<String,Float>> regionDegreeLocalityMapping)
    Scans the root path of the file system to get either the mapping between the region name and its best-locality region server, or the degree of locality of each region on each of the servers having at least one block of that region.
static int	getRegionReferenceAndLinkFileCount(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p)
static int	getRegionReferenceFileCount(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p)
static List<org.apache.hadoop.fs.Path>	getTableDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
static Map<String,Integer>	getTableFragmentation(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir)
    Runs through the HBase rootdir and checks how many stores for each table have more than one file in them.
static Map<String,Integer>	getTableFragmentation(HMaster master)
    Runs through the HBase rootdir and checks how many stores for each table have more than one file in them.
static Map<String,org.apache.hadoop.fs.Path>	getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> map, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, TableName tableName)
    Runs through the HBase rootdir/tablename and creates a reverse lookup map for table StoreFile names to the full Path.
static Map<String,org.apache.hadoop.fs.Path>	getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> resultMap, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, TableName tableName, org.apache.hadoop.fs.PathFilter sfFilter, ExecutorService executor, FSUtils.ProgressReporter progressReporter)
    Runs through the HBase rootdir/tablename and creates a reverse lookup map for table StoreFile names to the full Path.
static Map<String,org.apache.hadoop.fs.Path>	getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> resultMap, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, TableName tableName, org.apache.hadoop.fs.PathFilter sfFilter, ExecutorService executor, HbckErrorReporter progressReporter)
    Deprecated. Since 2.3.0.
static Map<String,org.apache.hadoop.fs.Path>	getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir)
    Runs through the HBase rootdir and creates a reverse lookup map for table StoreFile names to the full Path.
static Map<String,org.apache.hadoop.fs.Path>	getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, org.apache.hadoop.fs.PathFilter sfFilter, ExecutorService executor, FSUtils.ProgressReporter progressReporter)
    Runs through the HBase rootdir and creates a reverse lookup map for table StoreFile names to the full Path.
static Map<String,org.apache.hadoop.fs.Path>	getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, org.apache.hadoop.fs.PathFilter sfFilter, ExecutorService executor, HbckErrorReporter progressReporter)
    Deprecated. Since 2.3.0.
static int	getTotalTableFragmentation(HMaster master)
    Returns the total overall fragmentation percentage.
static String	getVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
    Verifies the current version of the file system.
static boolean	isDistributedFileSystem(org.apache.hadoop.fs.FileSystem fs)
    Returns true if fs is an instance of DistributedFileSystem.
private static boolean	isInSafeMode(org.apache.hadoop.fs.FileSystem dfs)
    Inquire the Active NameNode's safe mode status.
static boolean	isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, org.apache.hadoop.fs.Path pathTail)
    Compare the path component of the Path URI; e.g. hdfs://a/b/c and /a/b/c compare as the '/a/b/c' part.
static boolean	isSameHdfs(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem srcFs, org.apache.hadoop.fs.FileSystem desFs)
static List<org.apache.hadoop.fs.FileStatus>	listStatusWithStatusFilter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, FileStatusFilter filter)
    Calls fs.listStatus() and treats FileNotFoundException as non-fatal.
static boolean	metaRegionExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootDir)
    Checks if the meta region exists.
(package private) static String	parseVersionFrom(byte[] bytes)
    Parse the content of the ${HBASE_ROOTDIR}/hbase.version file.
static void	renameFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dst)
private static void	rewriteAsPb(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, org.apache.hadoop.fs.Path p, ClusterId cid)
static void	setClusterId(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, ClusterId clusterId, long wait)
    Writes a new unique identifier for this cluster to the "hbase.id" file in the HBase root directory.
static void	setupShortCircuitRead(org.apache.hadoop.conf.Configuration conf)
    Do our short circuit read setup.
static void	setVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
    Sets the version of the file system.
static void	setVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, int wait, int retries)
    Sets the version of the file system.
static void	setVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, String version, int wait, int retries)
    Sets the version of the file system.
static boolean	supportSafeMode(org.apache.hadoop.fs.FileSystem fs)
(package private) static byte[]	toVersionByteArray(String version)
    Create the content to write into the ${HBASE_ROOTDIR}/hbase.version file.
static void	waitOnSafeMode(org.apache.hadoop.conf.Configuration conf, long wait)
    If DFS, check safe mode and, if active, wait until we clear it.
-
Field Details
-
LOG
-
THREAD_POOLSIZE
-
DEFAULT_THREAD_POOLSIZE
-
WINDOWS
Set to true on Windows platforms -
safeModeClazz
-
safeModeActionClazz
-
safeModeGet
-
-
Constructor Details
-
FSUtils
private FSUtils()
-
-
Method Details
-
isDistributedFileSystem
public static boolean isDistributedFileSystem(org.apache.hadoop.fs.FileSystem fs) throws IOException
Returns true if fs is an instance of DistributedFileSystem.
- Throws:
IOException
-
isMatchingTail
public static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, org.apache.hadoop.fs.Path pathTail)
Compare the path component of the Path URI; e.g. for hdfs://a/b/c and /a/b/c, it compares the '/a/b/c' part. If you passed in hdfs://a/b/c and b/c, it would return true. Does not consider the scheme; i.e. if the schemes differ but the path or a subpath matches, the two equate.
- Parameters:
  pathToSearch - Path we will be trying to match.
- Returns:
  True if pathTail is a tail on the path of pathToSearch
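For instance, a minimal sketch of how the comparison behaves (the paths are illustrative):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.util.FSUtils;

    Path search = new Path("hdfs://namenode:8020/hbase/data/default/t1"); // hypothetical
    // The tail is compared component by component, ignoring scheme and authority.
    boolean sameTail = FSUtils.isMatchingTail(search, new Path("data/default/t1")); // expected true
    boolean otherTail = FSUtils.isMatchingTail(search, new Path("data/other/t1"));  // expected false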
-
deleteRegionDir
public static boolean deleteRegionDir(org.apache.hadoop.conf.Configuration conf, RegionInfo hri) throws IOException
Delete the region directory if it exists.
- Returns:
- True if deleted the region directory.
- Throws:
IOException
-
create
public static org.apache.hadoop.fs.FSDataOutputStream create(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.permission.FsPermission perm, InetSocketAddress[] favoredNodes) throws IOException
Create the specified file on the filesystem. By default, this will:
- overwrite the file if it exists
- apply the umask in the configuration (if it is enabled)
- use the fs configured buffer size (or 4096 if not set)
- use the configured column family replication, or the default replication if it is set to ColumnFamilyDescriptorBuilder.DEFAULT_DFS_REPLICATION
- use the default block size
- not track progress
- Parameters:
conf - configurations
fs - FileSystem on which to write the file
path - Path to the file to write
perm - permissions
favoredNodes - favored data nodes
- Returns:
- output stream to the created file
- Throws:
IOException
- if the file cannot be created
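As a usage sketch (the path and favored-node address below are hypothetical), favored nodes let the caller hint where block replicas of the new file should land:

    import java.net.InetSocketAddress;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsPermission;
    import org.apache.hadoop.hbase.util.FSUtils;

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Hypothetical datanode to favor for replica placement.
    InetSocketAddress[] favored = { new InetSocketAddress("datanode1.example.com", 50010) };
    try (FSDataOutputStream out = FSUtils.create(conf, fs, new Path("/hbase/tmp/example"),
        FsPermission.getFileDefault(), favored)) {
      out.writeUTF("example payload");
    }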
-
checkFileSystemAvailable
public static void checkFileSystemAvailable(org.apache.hadoop.fs.FileSystem fs) throws IOException
Checks to see if the specified file system is available.
- Parameters:
  fs - filesystem
- Throws:
IOException
- e
-
isInSafeMode
private static boolean isInSafeMode(org.apache.hadoop.fs.FileSystem dfs) throws IOException
Inquire the Active NameNode's safe mode status.
- Parameters:
  dfs - A DistributedFileSystem object representing the underlying HDFS.
- Returns:
  whether we're in safe mode
- Throws:
IOException
-
checkDfsSafeMode
public static void checkDfsSafeMode(org.apache.hadoop.conf.Configuration conf) throws IOException
Check whether dfs is in safe mode.
- Throws:
IOException
-
getVersion
public static String getVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir) throws IOException, DeserializationException
Verifies the current version of the file system.
- Parameters:
  fs - filesystem object
  rootdir - root hbase directory
- Returns:
  null if no version file exists, version string otherwise
- Throws:
  IOException - if the version file fails to open
  DeserializationException - if the version data cannot be translated into a version
-
parseVersionFrom
static String parseVersionFrom(byte[] bytes) throws DeserializationException
Parse the content of the ${HBASE_ROOTDIR}/hbase.version file.
- Parameters:
  bytes - The byte content of the hbase.version file
- Returns:
  The version found in the file as a String
- Throws:
DeserializationException
- if the version data cannot be translated into a version
-
toVersionByteArray
static byte[] toVersionByteArray(String version)
Create the content to write into the ${HBASE_ROOTDIR}/hbase.version file.
- Parameters:
  version - Version to persist
- Returns:
  Serialized protobuf with the version content and a bit of pb magic for a prefix.
-
checkVersion
public static void checkVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, boolean message) throws IOException, DeserializationException
Verifies the current version of the file system.
- Parameters:
  fs - file system
  rootdir - root directory of HBase installation
  message - if true, issues a message on System.out
- Throws:
  IOException - if the version file cannot be opened
  DeserializationException - if the contents of the version file cannot be parsed
-
checkVersion
public static void checkVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, boolean message, int wait, int retries) throws IOException, DeserializationException
Verifies the current version of the file system.
- Parameters:
  fs - file system
  rootdir - root directory of HBase installation
  message - if true, issues a message on System.out
  wait - wait interval
  retries - number of times to retry
- Throws:
  IOException - if the version file cannot be opened
  DeserializationException - if the contents of the version file cannot be parsed
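A minimal startup sketch (root directory hypothetical): verify the hbase.version file, retrying while the filesystem settles:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.util.FSUtils;

    Configuration conf = new Configuration();
    Path rootdir = new Path("hdfs://namenode:8020/hbase"); // hypothetical root
    FileSystem fs = rootdir.getFileSystem(conf);
    // Print a message on problems, wait 10s between attempts, retry up to 3 times.
    FSUtils.checkVersion(fs, rootdir, true, 10_000, 3);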
-
setVersion
public static void setVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir) throws IOException
Sets the version of the file system.
- Parameters:
  fs - filesystem object
  rootdir - hbase root
- Throws:
IOException
- e
-
setVersion
public static void setVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, int wait, int retries) throws IOException
Sets the version of the file system.
- Parameters:
  fs - filesystem object
  rootdir - hbase root
  wait - time to wait for retry
  retries - number of times to retry before failing
- Throws:
IOException
- e
-
setVersion
public static void setVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, String version, int wait, int retries) throws IOException
Sets the version of the file system.
- Parameters:
  fs - filesystem object
  rootdir - hbase root directory
  version - version to set
  wait - time to wait for retry
  retries - number of times to retry before throwing an IOException
- Throws:
IOException
- e
-
checkClusterIdExists
public static boolean checkClusterIdExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, long wait) throws IOException
Checks that a cluster ID file exists in the HBase root directory.
- Parameters:
  fs - the root directory FileSystem
  rootdir - the HBase root directory in HDFS
  wait - how long to wait between retries
- Returns:
  true if the file exists, otherwise false
- Throws:
IOException
- if checking the FileSystem fails
-
getClusterId
public static ClusterId getClusterId(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir) throws IOException
Returns the value of the unique cluster ID stored for this HBase instance.
- Parameters:
  fs - the root directory FileSystem
  rootdir - the path to the HBase root directory
- Returns:
  the unique cluster identifier
- Throws:
IOException
- if reading the cluster ID file fails
-
rewriteAsPb
private static void rewriteAsPb(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, org.apache.hadoop.fs.Path p, ClusterId cid) throws IOException - Throws:
IOException
-
setClusterId
public static void setClusterId(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, ClusterId clusterId, long wait) throws IOException
Writes a new unique identifier for this cluster to the "hbase.id" file in the HBase root directory. If any operation on the ID file fails, and wait is a positive value, the method will keep retrying to produce the ID file until the thread is forcibly interrupted.
- Parameters:
  fs - the root directory FileSystem
  rootdir - the path to the HBase root directory
  clusterId - the unique identifier to store
  wait - how long (in milliseconds) to wait between retries
- Throws:
  IOException - if writing to the FileSystem fails and no positive wait value is set
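A hedged sketch of the bootstrap sequence these cluster-ID methods support (root directory hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.ClusterId;
    import org.apache.hadoop.hbase.util.FSUtils;

    Configuration conf = new Configuration();
    Path rootdir = new Path("hdfs://namenode:8020/hbase"); // hypothetical root
    FileSystem fs = rootdir.getFileSystem(conf);
    if (!FSUtils.checkClusterIdExists(fs, rootdir, 1000)) {
      // No hbase.id file yet: write a freshly generated cluster ID, retrying every second.
      FSUtils.setClusterId(fs, rootdir, new ClusterId(), 1000);
    }
    ClusterId id = FSUtils.getClusterId(fs, rootdir);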
-
waitOnSafeMode
public static void waitOnSafeMode(org.apache.hadoop.conf.Configuration conf, long wait) throws IOException
If DFS, check safe mode and, if active, wait until we clear it.
- Parameters:
  conf - configuration
  wait - sleep between retries
- Throws:
IOException
- e
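For example, a startup routine might block until HDFS leaves safe mode before touching the root directory (a sketch; the 10-second poll interval is arbitrary):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.util.FSUtils;

    Configuration conf = new Configuration();
    // Re-check the NameNode's safe mode status every 10 seconds until it clears.
    FSUtils.waitOnSafeMode(conf, 10_000);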
-
supportSafeMode
public static boolean supportSafeMode(org.apache.hadoop.fs.FileSystem fs)
-
metaRegionExists
public static boolean metaRegionExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootDir) throws IOException
Checks if the meta region exists.
- Parameters:
  fs - file system
  rootDir - root directory of HBase installation
- Returns:
  true if it exists
- Throws:
IOException
-
computeHDFSBlocksDistribution
public static HDFSBlocksDistribution computeHDFSBlocksDistribution(org.apache.hadoop.hdfs.client.HdfsDataInputStream inputStream) throws IOException
Compute the HDFS block distribution of a given HdfsDataInputStream. All HdfsDataInputStreams are backed by a series of LocatedBlocks, which are fetched periodically from the namenode. This method retrieves those blocks from the input stream and uses them to calculate the HDFSBlocksDistribution. The underlying method in DFSInputStream does attempt to use locally cached blocks, but may hit the namenode if the cache is determined to be incomplete. The method also makes copies of all LocatedBlocks rather than returning the underlying blocks themselves.
- Throws:
IOException
-
getHostsForLocations
private static String[] getHostsForLocations(org.apache.hadoop.hdfs.protocol.LocatedBlock block)
-
computeHDFSBlocksDistribution
public static HDFSBlocksDistribution computeHDFSBlocksDistribution(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus status, long start, long length) throws IOException
Compute the HDFS blocks distribution of a given file, or a portion of the file.
- Parameters:
  fs - file system
  status - file status of the file
  start - start position of the portion
  length - length of the portion
- Returns:
  The HDFS blocks distribution
- Throws:
IOException
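An illustrative sketch (the file path is hypothetical) that measures how a file's blocks spread across hosts:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HDFSBlocksDistribution;
    import org.apache.hadoop.hbase.util.FSUtils;

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    FileStatus status = fs.getFileStatus(new Path("/hbase/data/default/t1/r/cf/hfile")); // hypothetical
    HDFSBlocksDistribution dist =
        FSUtils.computeHDFSBlocksDistribution(fs, status, 0, status.getLen());
    // Fraction of the file's bytes stored locally on the given host, in [0, 1].
    float locality = dist.getBlockLocalityIndex("datanode1.example.com"); // hypothetical host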
-
addToHDFSBlocksDistribution
public static void addToHDFSBlocksDistribution(HDFSBlocksDistribution blocksDistribution, org.apache.hadoop.fs.BlockLocation[] blockLocations) throws IOException
Update blocksDistribution with blockLocations.
- Parameters:
  blocksDistribution - the hdfs blocks distribution
  blockLocations - an array containing block locations
- Throws:
IOException
-
getTotalTableFragmentation
public static int getTotalTableFragmentation(HMaster master) throws IOException
Returns the total overall fragmentation percentage. Includes hbase:meta and -ROOT- as well.
- Parameters:
  master - The master defining the HBase root and file system
- Returns:
  the total fragmentation percentage
- Throws:
IOException
- When scanning the directory fails
-
getTableFragmentation
public static Map<String,Integer> getTableFragmentation(HMaster master) throws IOException
Runs through the HBase rootdir and checks how many stores for each table have more than one file in them. Checks -ROOT- and hbase:meta too. The total percentage across all tables is stored under the special key "-TOTAL-".
- Parameters:
  master - The master defining the HBase root and file system.
- Returns:
  A map for each table and its percentage (never null).
- Throws:
IOException
- When scanning the directory fails.
-
getTableFragmentation
public static Map<String,Integer> getTableFragmentation(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir) throws IOException
Runs through the HBase rootdir and checks how many stores for each table have more than one file in them. Checks -ROOT- and hbase:meta too. The total percentage across all tables is stored under the special key "-TOTAL-".
- Parameters:
  fs - The file system to use
  hbaseRootDir - The root directory to scan
- Returns:
  A map for each table and its percentage (never null)
- Throws:
IOException
- When scanning the directory fails
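A small sketch of reading the result (root directory hypothetical); the special "-TOTAL-" key carries the cluster-wide figure:

    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.util.FSUtils;

    Configuration conf = new Configuration();
    Path rootdir = new Path("hdfs://namenode:8020/hbase"); // hypothetical
    FileSystem fs = rootdir.getFileSystem(conf);
    Map<String, Integer> frag = FSUtils.getTableFragmentation(fs, rootdir);
    System.out.println("Cluster-wide fragmentation: " + frag.get("-TOTAL-") + "%");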
-
renameFile
public static void renameFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dst) throws IOException - Throws:
IOException
-
getTableDirs
public static List<org.apache.hadoop.fs.Path> getTableDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir) throws IOException - Throws:
IOException
-
getLocalTableDirs
public static List<org.apache.hadoop.fs.Path> getLocalTableDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir) throws IOException
- Returns:
  All the table directories under rootdir. Ignores non-table hbase folders such as .logs, .oldlogs, and .corrupt.
- Throws:
IOException
-
getRegionDirs
public static List<org.apache.hadoop.fs.Path> getRegionDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path tableDir) throws IOException
Given a particular table dir, return all the region dirs inside it, excluding files such as .tableinfo.
- Parameters:
  fs - A file system for the Path
  tableDir - Path to a specific table directory <hbase.rootdir>/<tabledir>
- Returns:
  List of paths to valid region directories in the table dir.
- Throws:
IOException
-
getRegionDirFromRootDir
public static org.apache.hadoop.fs.Path getRegionDirFromRootDir(org.apache.hadoop.fs.Path rootDir, RegionInfo region) -
getRegionDirFromTableDir
public static org.apache.hadoop.fs.Path getRegionDirFromTableDir(org.apache.hadoop.fs.Path tableDir, RegionInfo region) -
getRegionDirFromTableDir
public static org.apache.hadoop.fs.Path getRegionDirFromTableDir(org.apache.hadoop.fs.Path tableDir, String encodedRegionName) -
getFamilyDirs
public static List<org.apache.hadoop.fs.Path> getFamilyDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path regionDir) throws IOException
Given a particular region dir, return all the family dirs inside it.
- Parameters:
  fs - A file system for the Path
  regionDir - Path to a specific region directory
- Returns:
  List of paths to valid family directories in the region dir.
- Throws:
IOException
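Taken together, getTableDirs, getRegionDirs, and getFamilyDirs let you walk the on-disk layout rootdir -> table -> region -> family; a sketch (root directory hypothetical):

    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.util.FSUtils;

    Configuration conf = new Configuration();
    Path rootdir = new Path("hdfs://namenode:8020/hbase"); // hypothetical
    FileSystem fs = rootdir.getFileSystem(conf);
    for (Path tableDir : FSUtils.getTableDirs(fs, rootdir)) {
      for (Path regionDir : FSUtils.getRegionDirs(fs, tableDir)) {
        List<Path> familyDirs = FSUtils.getFamilyDirs(fs, regionDir);
        System.out.println(regionDir.getName() + " has " + familyDirs.size() + " families");
      }
    }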
-
getReferenceFilePaths
public static List<org.apache.hadoop.fs.Path> getReferenceFilePaths(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path familyDir) throws IOException - Throws:
IOException
-
getReferenceAndLinkFilePaths
public static List<org.apache.hadoop.fs.Path> getReferenceAndLinkFilePaths(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path familyDir) throws IOException - Throws:
IOException
-
getFilePaths
private static List<org.apache.hadoop.fs.Path> getFilePaths(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.fs.PathFilter pathFilter) throws IOException - Throws:
IOException
-
getRegionReferenceAndLinkFileCount
public static int getRegionReferenceAndLinkFileCount(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p) -
getTableStoreFilePathMap
public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> map, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, TableName tableName) throws IOException, InterruptedException
Runs through the HBase rootdir/tablename and creates a reverse lookup map for table StoreFile names to the full Path.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744
- Parameters:
  map - map to add values. If null, this method will create and populate one to return
  fs - The file system to use.
  hbaseRootDir - The root directory to scan.
  tableName - name of the table to scan.
- Returns:
  Map keyed by StoreFile name with a value of the full Path.
- Throws:
  IOException - When scanning the directory fails.
  InterruptedException
-
getTableStoreFilePathMap
@Deprecated
public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> resultMap, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, TableName tableName, org.apache.hadoop.fs.PathFilter sfFilter, ExecutorService executor, HbckErrorReporter progressReporter) throws IOException, InterruptedException
Deprecated. Since 2.3.0. For removal in hbase4. Use the ProgressReporter override instead.
Runs through the HBase rootdir/tablename and creates a reverse lookup map for table StoreFile names to the full Path. Note that because this method can be called on a 'live' HBase system, we will skip files that no longer exist by the time we traverse them; similarly, the user of the result needs to consider that some entries in this map may no longer exist by the time this call completes.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744
- Parameters:
  resultMap - map to add values. If null, this method will create and populate one to return
  fs - The file system to use.
  hbaseRootDir - The root directory to scan.
  tableName - name of the table to scan.
  sfFilter - optional path filter to apply to store files
  executor - optional executor service to parallelize this operation
  progressReporter - Instance or null; gets called every time we move to a new region of a family dir and for each store file.
- Returns:
  Map keyed by StoreFile name with a value of the full Path.
- Throws:
  IOException - When scanning the directory fails.
  InterruptedException
-
getTableStoreFilePathMap
public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> resultMap, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, TableName tableName, org.apache.hadoop.fs.PathFilter sfFilter, ExecutorService executor, FSUtils.ProgressReporter progressReporter) throws IOException, InterruptedException
Runs through the HBase rootdir/tablename and creates a reverse lookup map for table StoreFile names to the full Path. Note that because this method can be called on a 'live' HBase system, we will skip files that no longer exist by the time we traverse them; similarly, the user of the result needs to consider that some entries in this map may no longer exist by the time this call completes.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744
- Parameters:
  resultMap - map to add values. If null, this method will create and populate one to return
  fs - The file system to use.
  hbaseRootDir - The root directory to scan.
  tableName - name of the table to scan.
  sfFilter - optional path filter to apply to store files
  executor - optional executor service to parallelize this operation
  progressReporter - Instance or null; gets called every time we move to a new region of a family dir and for each store file.
- Returns:
  Map keyed by StoreFile name with a value of the full Path.
- Throws:
  IOException - When scanning the directory fails.
  InterruptedException - the thread is interrupted, either before or during the activity.
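A hedged usage sketch (table name and thread count are illustrative). Because FSUtils.ProgressReporter is package-private, the sketch assumes code living in the org.apache.hadoop.hbase.util package:

    package org.apache.hadoop.hbase.util;

    import java.util.Map;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.TableName;

    public class StoreFileMapSketch { // hypothetical helper class
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path rootdir = new Path("hdfs://namenode:8020/hbase"); // hypothetical
        FileSystem fs = rootdir.getFileSystem(conf);
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
          Map<String, Path> storeFiles = FSUtils.getTableStoreFilePathMap(
              null,                                  // let the method create the result map
              fs, rootdir, TableName.valueOf("t1"),  // hypothetical table
              null,                                  // no extra store-file filter
              pool,
              status -> System.out.println("visited " + status.getPath()));
          System.out.println(storeFiles.size() + " store files found");
        } finally {
          pool.shutdown();
        }
      }
    }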
-
getRegionReferenceFileCount
public static int getRegionReferenceFileCount(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p) -
getTableStoreFilePathMap
public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir) throws IOException, InterruptedException
Runs through the HBase rootdir and creates a reverse lookup map for table StoreFile names to the full Path.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744
- Parameters:
  fs - The file system to use.
  hbaseRootDir - The root directory to scan.
- Returns:
  Map keyed by StoreFile name with a value of the full Path.
- Throws:
  IOException - When scanning the directory fails.
  InterruptedException
-
getTableStoreFilePathMap
@Deprecated
public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, org.apache.hadoop.fs.PathFilter sfFilter, ExecutorService executor, HbckErrorReporter progressReporter) throws IOException, InterruptedException
Deprecated. Since 2.3.0. Will be removed in hbase4. Use getTableStoreFilePathMap(FileSystem, Path, PathFilter, ExecutorService, ProgressReporter) instead.
Runs through the HBase rootdir and creates a reverse lookup map for table StoreFile names to the full Path.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744
- Parameters:
  fs - The file system to use.
  hbaseRootDir - The root directory to scan.
  sfFilter - optional path filter to apply to store files
  executor - optional executor service to parallelize this operation
  progressReporter - Instance or null; gets called every time we move to a new region of a family dir and for each store file.
- Returns:
  Map keyed by StoreFile name with a value of the full Path.
- Throws:
  IOException - When scanning the directory fails.
  InterruptedException
-
getTableStoreFilePathMap
public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, org.apache.hadoop.fs.PathFilter sfFilter, ExecutorService executor, FSUtils.ProgressReporter progressReporter) throws IOException, InterruptedException
Runs through the HBase rootdir and creates a reverse lookup map for table StoreFile names to the full Path.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744
- Parameters:
  fs - The file system to use.
  hbaseRootDir - The root directory to scan.
  sfFilter - optional path filter to apply to store files
  executor - optional executor service to parallelize this operation
  progressReporter - Instance or null; gets called every time we move to a new region of a family dir and for each store file.
- Returns:
  Map keyed by StoreFile name with a value of the full Path.
- Throws:
  IOException - When scanning the directory fails.
  InterruptedException
-
filterFileStatuses
public static List<org.apache.hadoop.fs.FileStatus> filterFileStatuses(org.apache.hadoop.fs.FileStatus[] input, FileStatusFilter filter)
Filters FileStatuses in an array and returns a list.
- Parameters:
  input - An array of FileStatuses
  filter - A required filter to filter the array
- Returns:
  A list of FileStatuses
-
filterFileStatuses
public static List<org.apache.hadoop.fs.FileStatus> filterFileStatuses(Iterator<org.apache.hadoop.fs.FileStatus> input, FileStatusFilter filter)
Filters FileStatuses in an iterator and returns a list.
- Parameters:
  input - An iterator of FileStatuses
  filter - A required filter to filter the iterator
- Returns:
  A list of FileStatuses
-
listStatusWithStatusFilter
public static List<org.apache.hadoop.fs.FileStatus> listStatusWithStatusFilter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, FileStatusFilter filter) throws IOException
Calls fs.listStatus() and treats FileNotFoundException as non-fatal. This accommodates differences between Hadoop versions: Hadoop 1 does not throw a FileNotFoundException but returns an empty FileStatus[], while Hadoop 2 throws FileNotFoundException.
- Parameters:
  fs - file system
  dir - directory
  filter - file status filter
- Returns:
  null if dir is empty or doesn't exist, otherwise FileStatus list
- Throws:
IOException
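For instance (directory path hypothetical; this assumes FileStatusFilter's single accept(FileStatus) method), note the null return for a missing or empty directory:

    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.util.FSUtils;
    import org.apache.hadoop.hbase.util.FileStatusFilter;

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    FileStatusFilter dirsOnly = status -> status.isDirectory(); // keep only directories
    List<FileStatus> dirs =
        FSUtils.listStatusWithStatusFilter(fs, new Path("/hbase/data"), dirsOnly); // hypothetical path
    if (dirs == null) {
      System.out.println("directory missing or empty");
    }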
-
getRegionDegreeLocalityMappingFromFS
public static Map<String,Map<String,Float>> getRegionDegreeLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf) throws IOException
This function scans the root path of the file system to get the degree of locality for each region on each of the servers having at least one block of that region. This is used by the tool RegionPlacementMaintainer.
- Parameters:
  conf - the configuration to use
- Returns:
  the mapping from region encoded name to a map of server names to locality fraction
- Throws:
  IOException - in case of file system errors or interrupts
-
getRegionDegreeLocalityMappingFromFS
public static Map<String,Map<String,Float>> getRegionDegreeLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf, String desiredTable, int threadPoolSize) throws IOException
This function scans the root path of the file system to get the degree of locality for each region on each of the servers having at least one block of that region.
- Parameters:
  conf - the configuration to use
  desiredTable - the table you wish to scan locality for
  threadPoolSize - the thread pool size to use
- Returns:
  the mapping from region encoded name to a map of server names to locality fraction
- Throws:
  IOException - in case of file system errors or interrupts
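A sketch of consuming the locality mapping (table name and thread count hypothetical):

    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.util.FSUtils;

    Configuration conf = new Configuration();
    // Region encoded name -> (server name -> fraction of that region's blocks on the server).
    Map<String, Map<String, Float>> locality =
        FSUtils.getRegionDegreeLocalityMappingFromFS(conf, "t1", 8);
    locality.forEach((region, byServer) ->
        byServer.forEach((server, degree) ->
            System.out.println(region + " @ " + server + " = " + degree)));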
-
getRegionLocalityMappingFromFS
private static void getRegionLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf, String desiredTable, int threadPoolSize, Map<String,Map<String,Float>> regionDegreeLocalityMapping) throws IOException
This function scans the root path of the file system to get either the mapping between the region name and its best-locality region server, or the degree of locality of each region on each of the servers having at least one block of that region. The output map parameter is optional.
- Parameters:
  conf - the configuration to use
  desiredTable - the table you wish to scan locality for
  threadPoolSize - the thread pool size to use
  regionDegreeLocalityMapping - the map into which to put the locality degree mapping, or null; must be a thread-safe implementation
- Throws:
  IOException - in case of file system errors or interrupts
-
setupShortCircuitRead
public static void setupShortCircuitRead(org.apache.hadoop.conf.Configuration conf)
Do our short circuit read setup. Checks the buffer size to use and whether to do checksumming in hbase or hdfs.
-
checkShortCircuitReadBufferSize
public static void checkShortCircuitReadBufferSize(org.apache.hadoop.conf.Configuration conf)
Check if the short circuit read buffer size is set and, if not, set it to the hbase value.
-
getDFSHedgedReadMetrics
public static org.apache.hadoop.hdfs.DFSHedgedReadMetrics getDFSHedgedReadMetrics(org.apache.hadoop.conf.Configuration c) throws IOException
Returns the DFSClient DFSHedgedReadMetrics instance, or null if it can't be found or we are not on hdfs.
- Throws:
IOException
-
copyFilesParallel
public static List<org.apache.hadoop.fs.Path> copyFilesParallel(org.apache.hadoop.fs.FileSystem srcFS, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.FileSystem dstFS, org.apache.hadoop.fs.Path dst, org.apache.hadoop.conf.Configuration conf, int threads) throws IOException - Throws:
IOException
-
copyFiles
private static List<org.apache.hadoop.fs.Path> copyFiles(org.apache.hadoop.fs.FileSystem srcFS, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.FileSystem dstFS, org.apache.hadoop.fs.Path dst, org.apache.hadoop.conf.Configuration conf, ExecutorService pool, List<Future<Void>> futures) throws IOException - Throws:
IOException
-
getNNAddresses
private static Set<InetSocketAddress> getNNAddresses(org.apache.hadoop.hdfs.DistributedFileSystem fs, org.apache.hadoop.conf.Configuration conf)
Returns a set containing all namenode addresses of fs.
-
isSameHdfs
public static boolean isSameHdfs(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem srcFs, org.apache.hadoop.fs.FileSystem desFs)
- Parameters:
  conf - the Configuration of HBase
- Returns:
  Whether srcFs and desFs are on the same hdfs or not
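For example, a data-movement path might use this check to decide whether a cheap rename suffices or a copy is needed (a sketch; both URIs hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.util.FSUtils;

    Configuration conf = new Configuration();
    FileSystem srcFs = new Path("hdfs://clusterA:8020/staging").getFileSystem(conf); // hypothetical
    FileSystem dstFs = new Path("hdfs://clusterB:8020/hbase").getFileSystem(conf);   // hypothetical
    if (FSUtils.isSameHdfs(conf, srcFs, dstFs)) {
      FSUtils.renameFile(srcFs, new Path("/staging/f"), new Path("/hbase/f")); // same HDFS: rename
    } else {
      FSUtils.copyFilesParallel(srcFs, new Path("/staging/f"),
          dstFs, new Path("/hbase/f"), conf, 4); // different filesystems: parallel copy
    }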
-