Package org.apache.hadoop.hbase.util

Class CommonFSUtils

java.lang.Object
  org.apache.hadoop.hbase.util.CommonFSUtils

Utility methods for interacting with the underlying file system.

Note that setStoragePolicy(FileSystem, Path, String) is tested in TestFSUtils and pre-commit will run the hbase-server tests if there is a code change in this class. See HBASE-20838 for more details.
Nested Class Summary
static class CommonFSUtils.StreamLacksCapabilityException
    Helper exception for those cases where the place where we need to check a stream capability is not where we have the needed context to explain the impact and mitigation for a lack.
Field Summary
-
Constructor Summary
-
Method Summary
static void checkShortCircuitReadBufferSize(org.apache.hadoop.conf.Configuration conf)
    Check if short circuit read buffer size is set and if not, set it to the hbase value.

static org.apache.hadoop.fs.FSDataOutputStream create(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.permission.FsPermission perm, boolean overwrite)
    Create the specified file on the filesystem.

static boolean delete(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean recursive)
    Calls fs.delete() and returns the value returned by fs.delete().

static boolean deleteDirectory(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
    Delete if exists.

static org.apache.hadoop.fs.FileSystem getCurrentFileSystem(org.apache.hadoop.conf.Configuration conf)
    Returns the filesystem of the hbase rootdir.

static long getDefaultBlockSize(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
    Return the number of bytes that large input files should optimally be split into to minimize i/o time.

static int getDefaultBufferSize(org.apache.hadoop.fs.FileSystem fs)
    Returns the default buffer size to use during writes.

static short getDefaultReplication(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)

static String getDirUri(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path p)
    Returns the URI in string format.

static org.apache.hadoop.fs.permission.FsPermission getFilePermissions(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, String permssionConfKey)
    Get the file permissions specified in the configuration, if they are enabled.

static org.apache.hadoop.fs.Path getNamespaceDir(org.apache.hadoop.fs.Path rootdir, String namespace)
    Returns the Path object representing the namespace directory under path rootdir.

static String getPath(org.apache.hadoop.fs.Path p)
    Return the 'path' component of a Path.

static org.apache.hadoop.fs.Path getRegionDir(org.apache.hadoop.fs.Path rootdir, TableName tableName, String regionName)
    Returns the Path object representing the region directory under path rootdir.

static org.apache.hadoop.fs.Path getRootDir(org.apache.hadoop.conf.Configuration c)
    Get the path for the root data directory.

static org.apache.hadoop.fs.FileSystem getRootDirFileSystem(org.apache.hadoop.conf.Configuration c)

static org.apache.hadoop.fs.Path getTableDir(org.apache.hadoop.fs.Path rootdir, TableName tableName)
    Returns the Path object representing the table directory under path rootdir.

static TableName getTableName(org.apache.hadoop.fs.Path tablePath)
    Returns the TableName object representing the table directory under path rootdir.

static org.apache.hadoop.fs.FileSystem getWALFileSystem(org.apache.hadoop.conf.Configuration c)

static org.apache.hadoop.fs.Path getWALRegionDir(org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName)
    Returns the WAL region directory based on the given table name and region name.

static org.apache.hadoop.fs.Path getWALRootDir(org.apache.hadoop.conf.Configuration c)
    Get the path for the root directory for WAL data.

static org.apache.hadoop.fs.Path getWALTableDir(org.apache.hadoop.conf.Configuration conf, TableName tableName)
    Returns the table directory under the WALRootDir for the specified table name.

static org.apache.hadoop.fs.Path getWrongWALRegionDir(org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName)
    Deprecated. For compatibility, will be removed in 4.0.0.

private static void invokeSetStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy)

static boolean isExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
    Calls fs.exists().

static boolean isHDFS(org.apache.hadoop.conf.Configuration conf)
    Return true if this is a filesystem whose scheme is 'hdfs'.

static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, String pathTail)
    Compare path component of the Path URI; e.g. if hdfs://a/b/c and /a/b/c, it will compare the '/a/b/c' part.

static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, org.apache.hadoop.fs.Path pathTail)
    Compare path component of the Path URI; e.g. if hdfs://a/b/c and /a/b/c, it will compare the '/a/b/c' part.

static boolean isRecoveredEdits(org.apache.hadoop.fs.Path path)
    Checks if the given path is the one with the 'recovered.edits' dir.

static boolean isStartingWithPath(org.apache.hadoop.fs.Path rootPath, String path)
    Compare of path component.

private static boolean isValidWALRootDir(org.apache.hadoop.fs.Path walDir, org.apache.hadoop.conf.Configuration c)

static List<org.apache.hadoop.fs.LocatedFileStatus> listLocatedStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
    Calls fs.listFiles() to get FileStatus and BlockLocations together, reducing rpc calls.

static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
    Calls fs.listStatus() and treats FileNotFoundException as non-fatal. This accommodates differences between hadoop versions.

static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.fs.PathFilter filter)
    Calls fs.listStatus() and treats FileNotFoundException as non-fatal. This accommodates differences between hadoop versions, where hadoop 1 does not throw a FileNotFoundException and returns an empty FileStatus[], while Hadoop 2 will throw FileNotFoundException.

static void logFileSystemState(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path root, org.slf4j.Logger log)
    Log the current state of the filesystem from a certain root directory.

private static void logFSTree(org.slf4j.Logger log, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path root, String prefix)
    Recursive helper to log the state of the FS.

static String removeWALRootPath(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf)
    Checks for the presence of the WAL log root path (using the provided conf object) in the given path.

static boolean renameAndSetModifyTime(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dest)

static void setFsDefault(org.apache.hadoop.conf.Configuration c, String uri)

static void setFsDefault(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)

static void setRootDir(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)

static void setStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy)
    Sets storage policy for given path.

(package private) static void setStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy, boolean throwException)

static void setWALRootDir(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)

static org.apache.hadoop.fs.Path validateRootPath(org.apache.hadoop.fs.Path root)
    Verifies root directory path is a valid URI with a scheme.
-
Field Details
-
LOG
-
HBASE_WAL_DIR
Parameter name for HBase WAL directory
-
UNSAFE_STREAM_CAPABILITY_ENFORCE
Parameter to disable stream capability enforcement checks
-
FULL_RWX_PERMISSIONS
Full access permissions (starting point for a umask)
-
warningMap
-
-
Constructor Details
-
CommonFSUtils
private CommonFSUtils()
-
-
Method Details
-
isStartingWithPath
public static boolean isStartingWithPath(org.apache.hadoop.fs.Path rootPath, String path)
Compare of path component. Does not consider schema; i.e. if schemas are different but path starts with rootPath, then the function returns true.
Parameters:
rootPath - value to check for
path - subject to check
Returns:
True if path starts with rootPath
-
isMatchingTail
public static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, String pathTail)
Compare path component of the Path URI; e.g. if hdfs://a/b/c and /a/b/c, it will compare the '/a/b/c' part. Does not consider schema; i.e. if schemas are different but the path or subpath matches, the two will equate.
Parameters:
pathToSearch - Path we will be trying to match against.
pathTail - what to match
Returns:
True if pathTail is tail on the path of pathToSearch
-
isMatchingTail
public static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, org.apache.hadoop.fs.Path pathTail)
Compare path component of the Path URI; e.g. if hdfs://a/b/c and /a/b/c, it will compare the '/a/b/c' part. If you passed in 'hdfs://a/b/c' and 'b/c', it would return true. Does not consider schema; i.e. if schemas are different but the path or subpath matches, the two will equate.
Parameters:
pathToSearch - Path we will be trying to match against
pathTail - what to match
Returns:
True if pathTail is tail on the path of pathToSearch
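The tail comparison can be sketched on plain path strings. This is a simplified, illustrative re-implementation (class and method names here are hypothetical, and scheme/authority handling is omitted entirely); the real method works on org.apache.hadoop.fs.Path objects and their URIs.

```java
// Illustrative sketch of tail matching on plain path strings, not the HBase source.
public class PathTailMatcher {
    public static boolean isMatchingTail(String pathToSearch, String pathTail) {
        String[] searchParts = pathToSearch.split("/");
        String[] tailParts = pathTail.split("/");
        // Walk both component lists from the end; every tail component must match.
        int i = searchParts.length - 1;
        int j = tailParts.length - 1;
        while (j >= 0 && !tailParts[j].isEmpty()) {
            if (i < 0 || !searchParts[i].equals(tailParts[j])) {
                return false;
            }
            i--;
            j--;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isMatchingTail("/a/b/c", "b/c"));    // true
        System.out.println(isMatchingTail("/a/b/c", "/a/b/c")); // true
        System.out.println(isMatchingTail("/a/b/c", "a/b"));    // false
    }
}
```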
-
deleteDirectory
public static boolean deleteDirectory(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir) throws IOException
Delete if exists.
Parameters:
fs - filesystem object
dir - directory to delete
Returns:
True if deleted dir
Throws:
IOException - e
-
getDefaultBlockSize
public static long getDefaultBlockSize(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
Return the number of bytes that large input files should optimally be split into to minimize i/o time.
Parameters:
fs - filesystem object
Returns:
the default block size for the path's filesystem
-
getDefaultReplication
public static short getDefaultReplication(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
-
getDefaultBufferSize
public static int getDefaultBufferSize(org.apache.hadoop.fs.FileSystem fs)
Returns the default buffer size to use during writes. The size of the buffer should probably be a multiple of hardware page size (4096 on Intel x86), and it determines how much data is buffered during read and write operations.
Parameters:
fs - filesystem object
Returns:
default buffer size to use during writes
-
create
public static org.apache.hadoop.fs.FSDataOutputStream create(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.permission.FsPermission perm, boolean overwrite) throws IOException
Create the specified file on the filesystem. By default, this will:
- apply the umask in the configuration (if it is enabled)
- use the fs configured buffer size (or 4096 if not set)
- use the default replication
- use the default block size
- not track progress
Parameters:
fs - FileSystem on which to write the file
path - Path to the file to write
perm - initial permissions
overwrite - Whether or not the created file should be overwritten.
Returns:
output stream to the created file
Throws:
IOException - if the file cannot be created
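The "apply the umask" step above works POSIX-style: the effective permission is the requested permission with the umask bits cleared. A minimal sketch with illustrative values (HBase itself applies the mask through FsPermission; this demo class is hypothetical):

```java
// Sketch of umask application: effective = requested & ~umask.
public class UmaskDemo {
    public static int applyUmask(int perm, int umask) {
        return perm & ~umask;
    }

    public static void main(String[] args) {
        int full = 0777;  // rwxrwxrwx, cf. FULL_RWX_PERMISSIONS above
        int umask = 022;  // a common default umask
        // 0777 with umask 022 yields 0755 (rwxr-xr-x)
        System.out.println(Integer.toOctalString(applyUmask(full, umask)));
    }
}
```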
-
getFilePermissions
public static org.apache.hadoop.fs.permission.FsPermission getFilePermissions(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, String permssionConfKey)
Get the file permissions specified in the configuration, if they are enabled.
Parameters:
fs - filesystem that the file will be created on.
conf - configuration to read for determining if permissions are enabled and which to use
permssionConfKey - property key in the configuration to use when finding the permission
Returns:
the permission to use when creating a new file on the fs. If special permissions are not specified in the configuration, then the default permissions on the fs will be returned.
-
validateRootPath
public static org.apache.hadoop.fs.Path validateRootPath(org.apache.hadoop.fs.Path root) throws IOException
Verifies root directory path is a valid URI with a scheme
Parameters:
root - root directory path
Returns:
Passed root argument.
Throws:
IOException - if not a valid URI with a scheme
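A minimal sketch of the kind of check described above: the root directory string must parse as a URI and carry an explicit scheme. Class and method names are illustrative, not the HBase implementation.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch: does a root path string carry a URI scheme?
public class RootPathCheck {
    public static boolean hasScheme(String root) {
        try {
            return new URI(root).getScheme() != null;
        } catch (URISyntaxException e) {
            return false; // not a valid URI at all
        }
    }

    public static void main(String[] args) {
        System.out.println(hasScheme("hdfs://namenode:9000/hbase")); // true
        System.out.println(hasScheme("/hbase"));                     // false (no scheme)
    }
}
```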
-
removeWALRootPath
public static String removeWALRootPath(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf) throws IOException
Checks for the presence of the WAL log root path (using the provided conf object) in the given path. If it exists, this method removes it and returns the String representation of the remaining relative path.
Parameters:
path - must not be null
conf - must not be null
Returns:
String representation of the remaining relative path
Throws:
IOException - from underlying filesystem
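The prefix-stripping idea behind this method can be sketched on plain strings: when the path lies under the WAL root, drop the root plus separator and return the remainder as a relative path; otherwise return the path unchanged. Paths and names below are illustrative, not HBase's actual layout or code.

```java
// Hypothetical sketch of WAL-root prefix stripping on plain strings.
public class WalRootStripper {
    public static String stripRoot(String walRoot, String path) {
        if (path.startsWith(walRoot + "/")) {
            return path.substring(walRoot.length() + 1);
        }
        return path; // not under the WAL root
    }

    public static void main(String[] args) {
        String walRoot = "hdfs://nn:9000/hbase-wal";
        System.out.println(stripRoot(walRoot, walRoot + "/WALs/server-1")); // WALs/server-1
    }
}
```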
-
getPath
public static String getPath(org.apache.hadoop.fs.Path p)
Return the 'path' component of a Path. In Hadoop, Path is a URI. This method returns the 'path' component of a Path's URI: e.g. if a Path is hdfs://example.org:9000/hbase_trunk/TestTable/compaction.dir, this method returns /hbase_trunk/TestTable/compaction.dir. This method is useful if you want to print out a Path without qualifying the Filesystem instance.
Parameters:
p - Filesystem Path whose 'path' component we are to return.
Returns:
Path portion of the Filesystem
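Since a Hadoop Path wraps a URI, the 'path' component described above can be shown with plain java.net.URI: getPath() drops the scheme and authority and keeps only the path part. The demo class name is illustrative.

```java
import java.net.URI;

// Demonstrating the 'path' component of a URI, as getPath does for a Hadoop Path.
public class PathComponentDemo {
    public static String pathComponent(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        System.out.println(
            pathComponent("hdfs://example.org:9000/hbase_trunk/TestTable/compaction.dir"));
        // /hbase_trunk/TestTable/compaction.dir
    }
}
```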
-
getRootDir
public static org.apache.hadoop.fs.Path getRootDir(org.apache.hadoop.conf.Configuration c) throws IOException
Get the path for the root data directory
Parameters:
c - configuration
Returns:
Path to hbase root directory from configuration as a qualified Path.
Throws:
IOException - e
-
setRootDir
public static void setRootDir(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)
-
setFsDefault
public static void setFsDefault(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)
-
setFsDefault
public static void setFsDefault(org.apache.hadoop.conf.Configuration c, String uri)
-
getRootDirFileSystem
public static org.apache.hadoop.fs.FileSystem getRootDirFileSystem(org.apache.hadoop.conf.Configuration c) throws IOException
Throws:
IOException
-
getWALRootDir
public static org.apache.hadoop.fs.Path getWALRootDir(org.apache.hadoop.conf.Configuration c) throws IOException
Get the path for the root directory for WAL data
Parameters:
c - configuration
Returns:
Path to hbase log root directory: e.g. "hbase.wal.dir" from configuration as a qualified Path. Defaults to HBase root dir.
Throws:
IOException - e
-
getDirUri
public static String getDirUri(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path p) throws IOException
Returns the URI in string format
Parameters:
c - configuration
p - path
Returns:
the URI in string format
Throws:
IOException
-
setWALRootDir
public static void setWALRootDir(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)
-
getWALFileSystem
public static org.apache.hadoop.fs.FileSystem getWALFileSystem(org.apache.hadoop.conf.Configuration c) throws IOException - Throws:
IOException
-
isValidWALRootDir
private static boolean isValidWALRootDir(org.apache.hadoop.fs.Path walDir, org.apache.hadoop.conf.Configuration c) throws IOException - Throws:
IOException
-
getWALRegionDir
public static org.apache.hadoop.fs.Path getWALRegionDir(org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName) throws IOException
Returns the WAL region directory based on the given table name and region name
Parameters:
conf - configuration to determine WALRootDir
tableName - Table that the region is under
encodedRegionName - Region name used for creating the final region directory
Returns:
the region directory used to store WALs under the WALRootDir
Throws:
IOException - if there is an exception determining the WALRootDir
-
getWALTableDir
public static org.apache.hadoop.fs.Path getWALTableDir(org.apache.hadoop.conf.Configuration conf, TableName tableName) throws IOException
Returns the table directory under the WALRootDir for the specified table name
Parameters:
conf - configuration used to get the WALRootDir
tableName - Table to get the directory for
Returns:
a path to the WAL table directory for the specified table
Throws:
IOException - if there is an exception determining the WALRootDir
-
getWrongWALRegionDir
@Deprecated
public static org.apache.hadoop.fs.Path getWrongWALRegionDir(org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName) throws IOException
Deprecated. For compatibility, will be removed in 4.0.0.
For backward compatibility with HBASE-20734, where we store recovered edits in a wrong directory without BASE_NAMESPACE_DIR. See HBASE-22617 for more details.
Throws:
IOException
-
getTableDir
public static org.apache.hadoop.fs.Path getTableDir(org.apache.hadoop.fs.Path rootdir, TableName tableName)
Returns the Path object representing the table directory under path rootdir
Parameters:
rootdir - qualified path of HBase root directory
tableName - name of table
Returns:
Path for table
-
getRegionDir
public static org.apache.hadoop.fs.Path getRegionDir(org.apache.hadoop.fs.Path rootdir, TableName tableName, String regionName)
Returns the Path object representing the region directory under path rootdir
Parameters:
rootdir - qualified path of HBase root directory
tableName - name of table
regionName - The encoded region name
Returns:
Path for region
-
getTableName
public static TableName getTableName(org.apache.hadoop.fs.Path tablePath)
Returns the TableName object representing the table directory under path rootdir
Parameters:
tablePath - path of table
Returns:
TableName for table
-
getNamespaceDir
public static org.apache.hadoop.fs.Path getNamespaceDir(org.apache.hadoop.fs.Path rootdir, String namespace)
Returns the Path object representing the namespace directory under path rootdir
Parameters:
rootdir - qualified path of HBase root directory
namespace - namespace name
Returns:
Path for namespace
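The directory helpers above can be sketched as string joining: table data lives below a base namespace directory ("data"; cf. the BASE_NAMESPACE_DIR mentioned under getWrongWALRegionDir), then the namespace, then the table. This layout sketch is an illustration under that assumption, with plain strings standing in for org.apache.hadoop.fs.Path.

```java
// Hypothetical sketch of the <rootdir>/data/<namespace>/<table> layout.
public class HBaseLayoutSketch {
    public static String namespaceDir(String rootdir, String ns) {
        return rootdir + "/data/" + ns;
    }

    public static String tableDir(String rootdir, String ns, String table) {
        return namespaceDir(rootdir, ns) + "/" + table;
    }

    public static void main(String[] args) {
        System.out.println(tableDir("hdfs://nn/hbase", "default", "TestTable"));
        // hdfs://nn/hbase/data/default/TestTable
    }
}
```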
-
setStoragePolicy
public static void setStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy)
Sets storage policy for given path. If the passed path is a directory, we'll set the storage policy for all files created in the future in said directory. Note that this change in storage policy takes place at the FileSystem level; it will persist beyond this RS's lifecycle. If we're running on a version of FileSystem that doesn't support the given storage policy (or storage policies at all), then we'll issue a log message and continue. See http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
Parameters:
fs - We only do anything if it implements a setStoragePolicy method
path - the Path whose storage policy is to be set
storagePolicy - Policy to set on path; see hadoop 2.6+ org.apache.hadoop.hdfs.protocol.HdfsConstants for the possible list, e.g. 'COLD', 'WARM', 'HOT', 'ONE_SSD', 'ALL_SSD', 'LAZY_PERSIST'.
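The note that we "only do anything if it implements a setStoragePolicy method" suggests a reflective probe. Here is a hedged sketch of that pattern using a stand-in class; FakeFs and trySetStoragePolicy are hypothetical, and newer Hadoop releases expose FileSystem.setStoragePolicy directly, so this only illustrates the probe-and-fall-back idea.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: look up setStoragePolicy by name, invoke it if
// present, otherwise just report failure and continue.
public class StoragePolicySketch {
    public static class FakeFs {
        public String lastPolicy;
        public void setStoragePolicy(String path, String policy) {
            this.lastPolicy = policy;
        }
    }

    public static boolean trySetStoragePolicy(Object fs, String path, String policy) {
        try {
            Method m = fs.getClass().getMethod("setStoragePolicy", String.class, String.class);
            m.invoke(fs, path, policy);
            return true;
        } catch (ReflectiveOperationException e) {
            // No such method (or invocation failed): log-and-continue territory.
            return false;
        }
    }

    public static void main(String[] args) {
        FakeFs fs = new FakeFs();
        System.out.println(trySetStoragePolicy(fs, "/hbase/data", "ONE_SSD")); // true
        System.out.println(fs.lastPolicy);                                     // ONE_SSD
        System.out.println(trySetStoragePolicy(new Object(), "/x", "HOT"));    // false
    }
}
```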
-
setStoragePolicy
static void setStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy, boolean throwException) throws IOException - Throws:
IOException
-
invokeSetStoragePolicy
private static void invokeSetStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy) throws IOException - Throws:
IOException
-
isHDFS
public static boolean isHDFS(org.apache.hadoop.conf.Configuration conf) throws IOException
Return true if this is a filesystem whose scheme is 'hdfs'.
Throws:
IOException - from underlying FileSystem
-
isRecoveredEdits
public static boolean isRecoveredEdits(org.apache.hadoop.fs.Path path)
Checks if the given path is the one with the 'recovered.edits' dir.
Parameters:
path - must not be null
Returns:
True if we recovered edits
-
getCurrentFileSystem
public static org.apache.hadoop.fs.FileSystem getCurrentFileSystem(org.apache.hadoop.conf.Configuration conf) throws IOException Returns the filesystem of the hbase rootdir.- Throws:
IOException
- from underlying FileSystem
-
listStatus
public static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.fs.PathFilter filter) throws IOException
Calls fs.listStatus() and treats FileNotFoundException as non-fatal. This accommodates differences between hadoop versions, where hadoop 1 does not throw a FileNotFoundException and returns an empty FileStatus[], while Hadoop 2 will throw FileNotFoundException. Where possible, prefer FSUtils#listStatusWithStatusFilter(FileSystem, Path, FileStatusFilter) instead.
Parameters:
fs - file system
dir - directory
filter - path filter
Returns:
null if dir is empty or doesn't exist, otherwise FileStatus array
Throws:
IOException
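The null-on-missing-directory contract can be sketched with java.nio standing in for the Hadoop FileSystem API: a missing directory is treated as non-fatal and mapped to null, as is an empty listing, while other I/O errors still surface. Class and method names are illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the listStatus null contract using java.nio.
public class SafeListing {
    public static List<Path> listOrNull(String dir) {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(Paths.get(dir))) {
            for (Path p : stream) {
                result.add(p);
            }
        } catch (NoSuchFileException e) {
            return null; // missing directory: non-fatal, mirror the null contract
        } catch (IOException e) {
            throw new UncheckedIOException(e); // other I/O errors still surface
        }
        return result.isEmpty() ? null : result;
    }

    public static void main(String[] args) {
        System.out.println(listOrNull("definitely-missing-dir")); // null
    }
}
```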
-
listStatus
public static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir) throws IOException
Calls fs.listStatus() and treats FileNotFoundException as non-fatal. This accommodates differences between hadoop versions.
Parameters:
fs - file system
dir - directory
Returns:
null if dir is empty or doesn't exist, otherwise FileStatus array
Throws:
IOException
-
listLocatedStatus
public static List<org.apache.hadoop.fs.LocatedFileStatus> listLocatedStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir) throws IOException
Calls fs.listFiles() to get FileStatus and BlockLocations together, reducing rpc calls
Parameters:
fs - file system
dir - directory
Returns:
LocatedFileStatus list
Throws:
IOException
-
delete
public static boolean delete(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean recursive) throws IOException
Calls fs.delete() and returns the value returned by fs.delete()
Parameters:
fs - must not be null
path - must not be null
recursive - delete tree rooted at path
Returns:
the value returned by fs.delete()
Throws:
IOException - from underlying FileSystem
-
isExists
public static boolean isExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) throws IOException
Calls fs.exists(). Checks if the specified path exists
Parameters:
fs - must not be null
path - must not be null
Returns:
the value returned by fs.exists()
Throws:
IOException - from underlying FileSystem
-
logFileSystemState
public static void logFileSystemState(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path root, org.slf4j.Logger log) throws IOException
Log the current state of the filesystem from a certain root directory
Parameters:
fs - filesystem to investigate
root - root file/directory to start logging from
log - log to output information
Throws:
IOException - if an unexpected exception occurs
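The recursive traversal behind logFileSystemState and its logFSTree helper can be sketched with java.nio and a String list standing in for the Hadoop FileSystem and Logger: each entry is emitted with a prefix that grows with depth. The class name and prefix format are illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of recursive FS-tree logging with depth-based indentation.
public class FsTreeSketch {
    public static void walk(Path root, String prefix, List<String> out) {
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(root)) {
            for (Path p : stream) {
                if (Files.isDirectory(p)) {
                    out.add(prefix + p.getFileName() + "/");
                    walk(p, prefix + "  ", out); // recurse with a deeper indent
                } else {
                    out.add(prefix + p.getFileName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small temp tree, then "log" it into a list.
        Path root = Files.createTempDirectory("fstree");
        Files.createDirectories(root.resolve("WALs"));
        Files.createFile(root.resolve("WALs").resolve("wal.0"));
        List<String> lines = new ArrayList<>();
        walk(root, "", lines);
        lines.forEach(System.out::println);
    }
}
```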
-
logFSTree
private static void logFSTree(org.slf4j.Logger log, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path root, String prefix) throws IOException
Recursive helper to log the state of the FS
Throws:
IOException
-
renameAndSetModifyTime
public static boolean renameAndSetModifyTime(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dest) throws IOException - Throws:
IOException
-
checkShortCircuitReadBufferSize
public static void checkShortCircuitReadBufferSize(org.apache.hadoop.conf.Configuration conf)
Check if short circuit read buffer size is set and if not, set it to the hbase value.
Parameters:
conf - must not be null
-