Package org.apache.hadoop.hbase.util
Class CommonFSUtils
java.lang.Object
org.apache.hadoop.hbase.util.CommonFSUtils
Utility methods for interacting with the underlying file system.
Note that setStoragePolicy(FileSystem, Path, String) is tested in TestFSUtils, and pre-commit will run the hbase-server tests if there is a code change in this class. See HBASE-20838 for more details.
-
Nested Class Summary
Nested Classes
static class
  Helper exception for those cases where the place where we need to check a stream capability is not where we have the needed context to explain the impact and mitigation for a lack.
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
static void checkShortCircuitReadBufferSize(org.apache.hadoop.conf.Configuration conf)
  Check if the short circuit read buffer size is set and, if not, set it to the HBase value.
static org.apache.hadoop.fs.FSDataOutputStream create(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.permission.FsPermission perm, boolean overwrite)
  Create the specified file on the filesystem.
static boolean delete(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean recursive)
  Calls fs.delete() and returns its result.
static boolean deleteDirectory(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
  Delete if exists.
static org.apache.hadoop.fs.FileSystem getCurrentFileSystem(org.apache.hadoop.conf.Configuration conf)
  Returns the filesystem of the hbase rootdir.
static long getDefaultBlockSize(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
  Return the number of bytes that large input files should optimally be split into to minimize I/O time.
static int getDefaultBufferSize(org.apache.hadoop.fs.FileSystem fs)
  Returns the default buffer size to use during writes.
static short getDefaultReplication(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
static String getDirUri(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path p)
  Returns the URI in string format.
static org.apache.hadoop.fs.permission.FsPermission getFilePermissions(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, String permssionConfKey)
  Get the file permissions specified in the configuration, if they are enabled.
static org.apache.hadoop.fs.Path getNamespaceDir(org.apache.hadoop.fs.Path rootdir, String namespace)
  Returns the Path object representing the namespace directory under path rootdir.
static String getPath(org.apache.hadoop.fs.Path p)
  Return the 'path' component of a Path.
static org.apache.hadoop.fs.Path getRegionDir(org.apache.hadoop.fs.Path rootdir, TableName tableName, String regionName)
  Returns the Path object representing the region directory under path rootdir.
static org.apache.hadoop.fs.Path getRootDir(org.apache.hadoop.conf.Configuration c)
  Get the path for the root data directory.
static org.apache.hadoop.fs.FileSystem getRootDirFileSystem(org.apache.hadoop.conf.Configuration c)
static org.apache.hadoop.fs.Path getTableDir(org.apache.hadoop.fs.Path rootdir, TableName tableName)
  Returns the Path object representing the table directory under path rootdir.
static TableName getTableName(org.apache.hadoop.fs.Path tablePath)
  Returns the TableName object representing the table directory under path rootdir.
static org.apache.hadoop.fs.FileSystem getWALFileSystem(org.apache.hadoop.conf.Configuration c)
static org.apache.hadoop.fs.Path getWALRegionDir(org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName)
  Returns the WAL region directory based on the given table name and region name.
static org.apache.hadoop.fs.Path getWALRootDir(org.apache.hadoop.conf.Configuration c)
  Get the path for the root directory for WAL data.
static org.apache.hadoop.fs.Path getWALTableDir(org.apache.hadoop.conf.Configuration conf, TableName tableName)
  Returns the table directory under the WALRootDir for the specified table name.
static org.apache.hadoop.fs.Path getWrongWALRegionDir(org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName)
  Deprecated. For compatibility, will be removed in 4.0.0.
private static void invokeSetStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy)
static boolean isExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
  Calls fs.exists().
static boolean isHDFS(org.apache.hadoop.conf.Configuration conf)
  Return true if this is a filesystem whose scheme is 'hdfs'.
static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, String pathTail)
  Compare the path component of the Path URI.
static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, org.apache.hadoop.fs.Path pathTail)
  Compare the path component of the Path URI.
static boolean isRecoveredEdits(org.apache.hadoop.fs.Path path)
  Checks if the given path is the one with the 'recovered.edits' dir.
static boolean isStartingWithPath(org.apache.hadoop.fs.Path rootPath, String path)
  Compare path components.
private static boolean isValidWALRootDir(org.apache.hadoop.fs.Path walDir, org.apache.hadoop.conf.Configuration c)
static List<org.apache.hadoop.fs.LocatedFileStatus> listLocatedStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
  Calls fs.listFiles() to get FileStatus and BlockLocations together, reducing RPC calls.
static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
  Calls fs.listStatus() and treats FileNotFoundException as non-fatal; this accommodates differences between Hadoop versions.
static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.fs.PathFilter filter)
  Calls fs.listStatus() and treats FileNotFoundException as non-fatal; this accommodates differences between Hadoop versions, where Hadoop 1 does not throw a FileNotFoundException and returns an empty FileStatus[], while Hadoop 2 throws FileNotFoundException.
static void logFileSystemState(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path root, org.slf4j.Logger log)
  Log the current state of the filesystem from a certain root directory.
private static void logFSTree(org.slf4j.Logger log, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path root, String prefix)
  Recursive helper to log the state of the FS.
static String removeWALRootPath(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf)
  Checks for the presence of the WAL root path (using the provided conf object) in the given path.
static boolean renameAndSetModifyTime(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dest)
static void setFsDefault(org.apache.hadoop.conf.Configuration c, String uri)
static void setFsDefault(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)
static void setRootDir(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)
static void setStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy)
  Sets the storage policy for the given path.
(package private) static void setStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy, boolean throwException)
static void setWALRootDir(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)
static org.apache.hadoop.fs.Path validateRootPath(org.apache.hadoop.fs.Path root)
  Verifies the root directory path is a valid URI with a scheme.
-
Field Details
-
LOG
-
HBASE_WAL_DIR
Parameter name for HBase WAL directory
-
UNSAFE_STREAM_CAPABILITY_ENFORCE
Parameter to disable stream capability enforcement checks
-
FULL_RWX_PERMISSIONS
Full access permissions (starting point for a umask)
-
warningMap
-
-
Constructor Details
-
CommonFSUtils
private CommonFSUtils()
-
-
Method Details
-
isStartingWithPath
public static boolean isStartingWithPath(org.apache.hadoop.fs.Path rootPath, String path)
Compares path components. Does not consider the scheme; i.e. if the schemes differ but path starts with rootPath, this function returns true.
- Parameters:
  rootPath - value to check for
  path - subject to check
- Returns:
  True if path starts with rootPath
-
isMatchingTail
Compare the path component of the Path URI; e.g. if hdfs://a/b/c and /a/b/c, it will compare the '/a/b/c' part. Does not consider the scheme; i.e. if the schemes differ but the path or subpath matches, the two will equate.
- Parameters:
  pathToSearch - Path we will be trying to match against
  pathTail - what to match
- Returns:
  True if pathTail is tail on the path of pathToSearch
-
isMatchingTail
public static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, org.apache.hadoop.fs.Path pathTail)
Compare the path component of the Path URI; e.g. if hdfs://a/b/c and /a/b/c, it will compare the '/a/b/c' part. If you passed in 'hdfs://a/b/c' and 'b/c', it would return true. Does not consider the scheme; i.e. if the schemes differ but the path or subpath matches, the two will equate.
- Parameters:
  pathToSearch - Path we will be trying to match against
  pathTail - what to match
- Returns:
  True if pathTail is tail on the path of pathToSearch
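The tail-matching behavior described above can be sketched with plain java.net.URI handling. This is an illustrative approximation, not the Hadoop Path implementation; the class and method names below are hypothetical:

```java
import java.net.URI;

// Sketch: compare only the 'path' component of two URIs, ignoring scheme and
// authority, and check whether pathTail's components trail pathToSearch's.
public class TailMatchSketch {
    static boolean isMatchingTail(String pathToSearch, String pathTail) {
        String search = URI.create(pathToSearch).getPath();
        String tail = URI.create(pathTail).getPath();
        if (tail.isEmpty()) {
            return false;
        }
        // Match whole components: either the paths are equal, or the longer
        // path ends with "/" followed by the tail.
        return search.equals(tail) || search.endsWith("/" + stripLeadingSlash(tail));
    }

    static String stripLeadingSlash(String p) {
        return p.startsWith("/") ? p.substring(1) : p;
    }
}
```

For example, `hdfs://namenode:9000/a/b/c` matches the tail `b/c` but not `/x/c`.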
-
deleteDirectory
public static boolean deleteDirectory(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir) throws IOException
Delete if exists.
- Parameters:
  fs - filesystem object
  dir - directory to delete
- Returns:
  True if deleted dir
- Throws:
  IOException
-
getDefaultBlockSize
public static long getDefaultBlockSize(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
Return the number of bytes that large input files should optimally be split into to minimize I/O time.
- Parameters:
  fs - filesystem object
- Returns:
  the default block size for the path's filesystem
-
getDefaultReplication
public static short getDefaultReplication(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) -
getDefaultBufferSize
Returns the default buffer size to use during writes. The size of the buffer should probably be a multiple of hardware page size (4096 on Intel x86), and it determines how much data is buffered during read and write operations.
- Parameters:
  fs - filesystem object
- Returns:
  default buffer size to use during writes
-
create
public static org.apache.hadoop.fs.FSDataOutputStream create(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.permission.FsPermission perm, boolean overwrite) throws IOException
Create the specified file on the filesystem. By default, this will:
- apply the umask in the configuration (if it is enabled)
- use the fs configured buffer size (or 4096 if not set)
- use the default replication
- use the default block size
- not track progress
- Parameters:
  fs - FileSystem on which to write the file
  path - Path to the file to write
  perm - initial permissions
  overwrite - whether or not the created file should be overwritten
- Returns:
  output stream to the created file
- Throws:
  IOException - if the file cannot be created
-
getFilePermissions
public static org.apache.hadoop.fs.permission.FsPermission getFilePermissions(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, String permssionConfKey)
Get the file permissions specified in the configuration, if they are enabled.
- Parameters:
  fs - filesystem that the file will be created on
  conf - configuration to read for determining if permissions are enabled and which to use
  permssionConfKey - property key in the configuration to use when finding the permission
- Returns:
  the permission to use when creating a new file on the fs. If special permissions are not specified in the configuration, then the default permissions on the fs will be returned.
-
validateRootPath
public static org.apache.hadoop.fs.Path validateRootPath(org.apache.hadoop.fs.Path root) throws IOException
Verifies the root directory path is a valid URI with a scheme.
- Parameters:
  root - root directory path
- Returns:
  the passed root argument
- Throws:
  IOException - if not a valid URI with a scheme
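The validation contract can be sketched with java.net.URI. The real method takes and returns a Hadoop Path, so this String-based version is only an assumed approximation:

```java
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

// Sketch: the root must parse as a URI and carry a scheme, else IOException.
public class ValidateRootSketch {
    static String validateRootPath(String root) throws IOException {
        try {
            URI uri = new URI(root);
            if (uri.getScheme() == null) {
                // A rootdir like "/hbase" lacks a scheme and is rejected.
                throw new IOException("Root directory does not have a scheme: " + root);
            }
            return root; // valid: hand the argument back unchanged
        } catch (URISyntaxException e) {
            throw new IOException("Root directory path is not a valid URI: " + root, e);
        }
    }
}
```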
-
removeWALRootPath
public static String removeWALRootPath(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf) throws IOException
Checks for the presence of the WAL root path (using the provided conf object) in the given path. If it exists, this method removes it and returns the String representation of the remaining relative path.
- Parameters:
  path - must not be null
  conf - must not be null
- Returns:
  String representation of the remaining relative path
- Throws:
  IOException - from underlying filesystem
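A rough sketch of the prefix-stripping behavior, using plain strings in place of Hadoop Path and Configuration. The behavior for paths outside the WAL root (returned unchanged here) is an assumption of this sketch:

```java
// Sketch: if walRoot prefixes path, strip it and return the relative remainder.
public class WalRootSketch {
    static String removeWALRootPath(String path, String walRoot) {
        if (!path.startsWith(walRoot)) {
            return path; // not under the WAL root; leave untouched
        }
        String rest = path.substring(walRoot.length());
        // Drop the separating slash so the result is a relative path.
        return rest.startsWith("/") ? rest.substring(1) : rest;
    }
}
```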
-
getPath
Return the 'path' component of a Path. In Hadoop, Path is a URI. This method returns the 'path' component of a Path's URI: e.g. if a Path is hdfs://example.org:9000/hbase_trunk/TestTable/compaction.dir, this method returns /hbase_trunk/TestTable/compaction.dir. This method is useful if you want to print out a Path without the qualifying FileSystem instance.
- Parameters:
  p - Filesystem Path whose 'path' component we are to return
- Returns:
  the 'path' portion of the Path
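Since a Hadoop Path is URI-backed, the extraction can be illustrated with java.net.URI alone (the class name here is hypothetical):

```java
import java.net.URI;

// Sketch: pull out only the 'path' component of a qualified filesystem URI.
public class PathComponentSketch {
    static String getPath(String qualified) {
        return URI.create(qualified).getPath();
    }
}
```

This reproduces the example from the javadoc: the scheme and authority (`hdfs://example.org:9000`) are dropped, leaving only the path.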
-
getRootDir
public static org.apache.hadoop.fs.Path getRootDir(org.apache.hadoop.conf.Configuration c) throws IOException
Get the path for the root data directory.
- Parameters:
  c - configuration
- Returns:
  Path to the hbase root directory from configuration as a qualified Path
- Throws:
  IOException
-
setRootDir
public static void setRootDir(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root) -
setFsDefault
public static void setFsDefault(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root) -
setFsDefault
-
getRootDirFileSystem
public static org.apache.hadoop.fs.FileSystem getRootDirFileSystem(org.apache.hadoop.conf.Configuration c) throws IOException - Throws:
IOException
-
getWALRootDir
public static org.apache.hadoop.fs.Path getWALRootDir(org.apache.hadoop.conf.Configuration c) throws IOException
Get the path for the root directory for WAL data.
- Parameters:
  c - configuration
- Returns:
  Path to the hbase log root directory, e.g. "hbase.wal.dir" from configuration as a qualified Path. Defaults to the HBase root dir.
- Throws:
  IOException
-
getDirUri
public static String getDirUri(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path p) throws IOException
Returns the URI in string format.
- Parameters:
  c - configuration
  p - path
- Returns:
  the URI in string format
- Throws:
  IOException
-
setWALRootDir
public static void setWALRootDir(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root) -
getWALFileSystem
public static org.apache.hadoop.fs.FileSystem getWALFileSystem(org.apache.hadoop.conf.Configuration c) throws IOException - Throws:
IOException
-
isValidWALRootDir
private static boolean isValidWALRootDir(org.apache.hadoop.fs.Path walDir, org.apache.hadoop.conf.Configuration c) throws IOException - Throws:
IOException
-
getWALRegionDir
public static org.apache.hadoop.fs.Path getWALRegionDir(org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName) throws IOException
Returns the WAL region directory based on the given table name and region name.
- Parameters:
  conf - configuration to determine the WALRootDir
  tableName - table that the region is under
  encodedRegionName - region name used for creating the final region directory
- Returns:
  the region directory used to store WALs under the WALRootDir
- Throws:
  IOException - if there is an exception determining the WALRootDir
-
getWALTableDir
public static org.apache.hadoop.fs.Path getWALTableDir(org.apache.hadoop.conf.Configuration conf, TableName tableName) throws IOException
Returns the table directory under the WALRootDir for the specified table name.
- Parameters:
  conf - configuration used to get the WALRootDir
  tableName - table to get the directory for
- Returns:
  a path to the WAL table directory for the specified table
- Throws:
  IOException - if there is an exception determining the WALRootDir
-
getWrongWALRegionDir
@Deprecated public static org.apache.hadoop.fs.Path getWrongWALRegionDir(org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName) throws IOException
Deprecated. For compatibility, will be removed in 4.0.0.
For backward compatibility with HBASE-20734, where we store recovered edits in a wrong directory without BASE_NAMESPACE_DIR. See HBASE-22617 for more details.
- Throws:
  IOException
-
getTableDir
public static org.apache.hadoop.fs.Path getTableDir(org.apache.hadoop.fs.Path rootdir, TableName tableName)
Returns the Path object representing the table directory under path rootdir.
- Parameters:
  rootdir - qualified path of HBase root directory
  tableName - name of table
- Returns:
  Path for table
-
getRegionDir
public static org.apache.hadoop.fs.Path getRegionDir(org.apache.hadoop.fs.Path rootdir, TableName tableName, String regionName)
Returns the Path object representing the region directory under path rootdir.
- Parameters:
  rootdir - qualified path of HBase root directory
  tableName - name of table
  regionName - the encoded region name
- Returns:
  Path for region
-
getTableName
Returns the TableName object representing the table directory under path rootdir.
- Parameters:
  tablePath - path of table
- Returns:
  TableName for table
-
getNamespaceDir
public static org.apache.hadoop.fs.Path getNamespaceDir(org.apache.hadoop.fs.Path rootdir, String namespace)
Returns the Path object representing the namespace directory under path rootdir.
- Parameters:
  rootdir - qualified path of HBase root directory
  namespace - namespace name
- Returns:
  Path for namespace
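For illustration, a sketch of the directory layout the namespace and table helpers compose. The 'data' base directory corresponds to HBase's BASE_NAMESPACE_DIR, but treat the exact layout here as an assumption of this sketch rather than a contract:

```java
// Sketch: compose namespace and table directories under the HBase root dir.
public class LayoutSketch {
    static String getNamespaceDir(String rootdir, String namespace) {
        // Namespaces live under the "data" base directory.
        return rootdir + "/data/" + namespace;
    }

    static String getTableDir(String rootdir, String namespace, String table) {
        // A table directory nests inside its namespace directory.
        return getNamespaceDir(rootdir, namespace) + "/" + table;
    }
}
```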
-
setStoragePolicy
public static void setStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy)
Sets the storage policy for the given path. If the passed path is a directory, we will set the storage policy for all files created in the future in said directory. Note that this change in storage policy takes place at the FileSystem level; it will persist beyond this RS's lifecycle. If we are running on a version of FileSystem that does not support the given storage policy (or storage policies at all), then we will issue a log message and continue. See http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
- Parameters:
  fs - we only do anything if it implements a setStoragePolicy method
  path - the Path whose storage policy is to be set
  storagePolicy - policy to set on path; see hadoop 2.6+ org.apache.hadoop.hdfs.protocol.HdfsConstants for the possible list, e.g. 'COLD', 'WARM', 'HOT', 'ONE_SSD', 'ALL_SSD', 'LAZY_PERSIST'
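The "only do anything if it implements a setStoragePolicy method" behavior suggests a reflective lookup, as in the private invokeSetStoragePolicy helper. A hypothetical sketch of that pattern (the method name trySetStoragePolicy and the FakeFs stand-in are inventions of this sketch, not Hadoop classes):

```java
import java.lang.reflect.Method;

// Sketch: look up setStoragePolicy(path, policy) on the filesystem object and
// invoke it if present; a filesystem without the method is not an error.
public class StoragePolicySketch {
    static boolean trySetStoragePolicy(Object fs, Object path, String policy) {
        try {
            Method m = fs.getClass().getMethod("setStoragePolicy", path.getClass(), String.class);
            m.invoke(fs, path, policy);
            return true;
        } catch (NoSuchMethodException e) {
            // No storage-policy support: the real code logs and continues.
            return false;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    // Stand-in for a filesystem that does support storage policies.
    public static class FakeFs {
        public String applied;
        public void setStoragePolicy(String path, String policy) {
            applied = path + "=" + policy;
        }
    }
}
```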
-
setStoragePolicy
static void setStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy, boolean throwException) throws IOException - Throws:
IOException
-
invokeSetStoragePolicy
private static void invokeSetStoragePolicy(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, String storagePolicy) throws IOException - Throws:
IOException
-
isHDFS
Return true if this is a filesystem whose scheme is 'hdfs'.- Throws:
IOException- from underlying FileSystem
-
isRecoveredEdits
Checks if the given path is the one with the 'recovered.edits' dir.
- Parameters:
  path - must not be null
- Returns:
  True if the path is a recovered edits path
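The check can be sketched as a scan of the path's components for the 'recovered.edits' directory name; this is an approximation of the real Path-based check:

```java
// Sketch: a path counts as recovered-edits if any of its components is the
// 'recovered.edits' directory.
public class RecoveredEditsSketch {
    static boolean isRecoveredEdits(String path) {
        for (String component : path.split("/")) {
            if ("recovered.edits".equals(component)) {
                return true;
            }
        }
        return false;
    }
}
```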
-
getCurrentFileSystem
public static org.apache.hadoop.fs.FileSystem getCurrentFileSystem(org.apache.hadoop.conf.Configuration conf) throws IOException Returns the filesystem of the hbase rootdir.- Throws:
IOException- from underlying FileSystem
-
listStatus
public static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.fs.PathFilter filter) throws IOException
Calls fs.listStatus() and treats FileNotFoundException as non-fatal. This accommodates differences between Hadoop versions, where Hadoop 1 does not throw a FileNotFoundException and returns an empty FileStatus[], while Hadoop 2 throws FileNotFoundException. Where possible, prefer FSUtils#listStatusWithStatusFilter(FileSystem, Path, FileStatusFilter) instead.
- Parameters:
  fs - file system
  dir - directory
  filter - path filter
- Returns:
  null if dir is empty or doesn't exist, otherwise FileStatus array
- Throws:
  IOException
-
listStatus
public static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir) throws IOException
Calls fs.listStatus() and treats FileNotFoundException as non-fatal. This accommodates differences between Hadoop versions.
- Parameters:
  fs - file system
  dir - directory
- Returns:
  null if dir is empty or doesn't exist, otherwise FileStatus array
- Throws:
  IOException
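The missing-directory-as-null contract of both listStatus overloads can be sketched with java.nio, where NoSuchFileException plays the role of FileNotFoundException:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch: list a directory, returning null both for an empty directory and
// for one that does not exist, instead of propagating the exception.
public class ListStatusSketch {
    static List<Path> listStatusOrNull(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            List<Path> result = entries.collect(Collectors.toList());
            return result.isEmpty() ? null : result;
        } catch (NoSuchFileException e) {
            return null; // missing directory is treated like an empty one
        }
    }
}
```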
-
listLocatedStatus
public static List<org.apache.hadoop.fs.LocatedFileStatus> listLocatedStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir) throws IOException
Calls fs.listFiles() to get FileStatus and BlockLocations together, reducing RPC calls.
- Parameters:
  fs - file system
  dir - directory
- Returns:
  LocatedFileStatus list
- Throws:
  IOException
-
delete
public static boolean delete(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean recursive) throws IOException
Calls fs.delete() and returns its result.
- Parameters:
  fs - must not be null
  path - must not be null
  recursive - delete tree rooted at path
- Returns:
  the value returned by fs.delete()
- Throws:
  IOException - from underlying FileSystem
-
isExists
public static boolean isExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) throws IOException
Calls fs.exists(). Checks if the specified path exists.
- Parameters:
  fs - must not be null
  path - must not be null
- Returns:
  the value returned by fs.exists()
- Throws:
  IOException - from underlying FileSystem
-
logFileSystemState
public static void logFileSystemState(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path root, org.slf4j.Logger log) throws IOException
Log the current state of the filesystem from a certain root directory.
- Parameters:
  fs - filesystem to investigate
  root - root file/directory to start logging from
  log - log to output information
- Throws:
  IOException - if an unexpected exception occurs
-
logFSTree
private static void logFSTree(org.slf4j.Logger log, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path root, String prefix) throws IOException
Recursive helper to log the state of the FS.
- Throws:
  IOException
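A sketch of such a recursive tree walk, using java.nio and a StringBuilder in place of the Hadoop FileSystem and slf4j Logger:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch: walk the tree under root, appending one line per entry, indenting
// by extending the prefix on each recursion into a subdirectory.
public class FsTreeSketch {
    static void logFSTree(StringBuilder out, Path root, String prefix) throws IOException {
        List<Path> entries;
        try (Stream<Path> s = Files.list(root)) {
            entries = s.sorted().collect(Collectors.toList());
        }
        for (Path p : entries) {
            if (Files.isDirectory(p)) {
                out.append(prefix).append(p.getFileName()).append("/\n");
                logFSTree(out, p, prefix + "  "); // recurse with a deeper prefix
            } else {
                out.append(prefix).append(p.getFileName()).append('\n');
            }
        }
    }
}
```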
-
renameAndSetModifyTime
public static boolean renameAndSetModifyTime(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dest) throws IOException - Throws:
IOException
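A plausible reading of this helper, sketched with java.nio: rename, then stamp the result with the current time, since a plain rename may carry over the old modification time. Whether the real method stamps the source before the rename or the destination after it is not documented here, so treat the ordering as an assumption:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.FileTime;

// Sketch: move src to dest, then refresh dest's modification time to "now".
public class RenameSketch {
    static boolean renameAndSetModifyTime(Path src, Path dest) throws IOException {
        Files.move(src, dest, StandardCopyOption.REPLACE_EXISTING);
        Files.setLastModifiedTime(dest, FileTime.fromMillis(System.currentTimeMillis()));
        return true;
    }
}
```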
-
checkShortCircuitReadBufferSize
Check if the short circuit read buffer size is set and, if not, set it to the HBase value.
- Parameters:
  conf - must not be null
-