Package org.apache.hadoop.hbase.backup
Class HFileArchiver
java.lang.Object
org.apache.hadoop.hbase.backup.HFileArchiver
Utility class to handle the removal of HFiles (or the respective StoreFiles) for a HRegion from the FileSystem. The hfiles will be archived or deleted, depending on the state of the system.
-
Nested Class Summary
Modifier and Type, Class, Description
private static class
Wrapper to handle file operations uniformly
private static class
private static class
private static class
Adapt a type to match the HFileArchiver.File interface, which is used internally for handling archival/removal of files
private static class
Convert a FileStatus to something we can manage in the archiving
private static class
Convert the HStoreFile into something we can manage in the archive methods
-
Field Summary
Modifier and Type, Field, Description
private static ThreadPoolExecutor
private static final int
Number of retries in case of fs operation failure
private static final Function<HFileArchiver.File,org.apache.hadoop.fs.Path>
private static final org.slf4j.Logger
private static final String
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
private static void archive(org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, byte[] family, Collection<HStoreFile> compactedFiles, org.apache.hadoop.fs.Path storeArchiveDir)
static void archiveFamily(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, RegionInfo parent, org.apache.hadoop.fs.Path tableDir, byte[] family)
Remove from the specified region the store files of the specified column family, either by archiving them or by outright deletion.
static void archiveFamilyByFamilyDir(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, RegionInfo parent, org.apache.hadoop.fs.Path familyDir, byte[] family)
Removes from the specified region the store files of the specified column family, either by archiving them or by outright deletion.
static void archiveRecoveredEdits(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, byte[] family, Collection<HStoreFile> replayedEdits)
Archive recovered edits using existing logic for archiving store files.
static void archiveRegion(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info)
Cleans up all the files for a HRegion by archiving the HFiles to the archive directory.
static boolean archiveRegion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, org.apache.hadoop.fs.Path tableDir, org.apache.hadoop.fs.Path regionDir)
Remove an entire region from the table directory via archiving the region's hfiles.
static void archiveRegions(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path tableDir, List<org.apache.hadoop.fs.Path> regionDirList)
Archive the specified regions in parallel.
static void archiveStoreFile(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, org.apache.hadoop.fs.Path tableDir, byte[] family, org.apache.hadoop.fs.Path storeFile)
Archive the store file.
static void archiveStoreFiles(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, org.apache.hadoop.fs.Path tableDir, byte[] family, Collection<HStoreFile> compactedFiles)
Remove the store files, either by archiving them or by outright deletion.
private static boolean deleteRegionWithoutArchiving(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path regionDir)
Without regard for backup, delete a region.
private static void deleteStoreFilesWithoutArchiving(Collection<HStoreFile> compactedFiles)
Just do a simple delete of the given store files.
static boolean exists(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info)
Returns true if the region exists in the filesystem.
private static ThreadPoolExecutor getArchiveExecutor(org.apache.hadoop.conf.Configuration conf)
private static ThreadFactory
private static List<HFileArchiver.File> resolveAndArchive(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path baseArchiveDir, Collection<HFileArchiver.File> toArchive, long start)
Resolve any conflict with an existing archive file via timestamp-append renaming of the existing file and then archive the passed in files.
private static boolean resolveAndArchiveFile(org.apache.hadoop.fs.Path archiveDir, HFileArchiver.File currentFile, String archiveStartTime)
Attempt to archive the passed in file to the archive directory.
-
Field Details
-
LOG
-
SEPARATOR
-
DEFAULT_RETRIES_NUMBER
Number of retries in case of fs operation failure
-
FUNC_FILE_TO_PATH
-
archiveExecutor
-
-
Constructor Details
-
HFileArchiver
private HFileArchiver()
-
-
Method Details
-
exists
public static boolean exists(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info) throws IOException
Returns true if the region exists in the filesystem.
Throws:
IOException
-
archiveRegion
public static void archiveRegion(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info) throws IOException
Cleans up all the files for a HRegion by archiving the HFiles to the archive directory.
Parameters:
conf - the configuration to use
fs - the file system object
info - RegionInfo for region to be deleted
Throws:
IOException
-
archiveRegion
public static boolean archiveRegion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, org.apache.hadoop.fs.Path tableDir, org.apache.hadoop.fs.Path regionDir) throws IOException
Remove an entire region from the table directory via archiving the region's hfiles.
Parameters:
fs - FileSystem from which to remove the region
rootdir - Path to the root directory where hbase files are stored (for building the archive path)
tableDir - Path to where the table is being stored (for building the archive path)
regionDir - Path to where a region is being stored (for building the archive path)
Returns:
true if the region was successfully deleted. false if the filesystem operations could not complete.
Throws:
IOException - if the request cannot be completed
-
archiveRegions
public static void archiveRegions(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path tableDir, List<org.apache.hadoop.fs.Path> regionDirList) throws IOException
Archive the specified regions in parallel.
Parameters:
conf - the configuration to use
fs - FileSystem from which to remove the region
rootDir - Path to the root directory where hbase files are stored (for building the archive path)
tableDir - Path to where the table is being stored (for building the archive path)
regionDirList - Path to where regions are being stored (for building the archive path)
Throws:
IOException - if the request cannot be completed
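The general pattern of archiving several region directories in parallel can be sketched with plain JDK classes. This is an illustration only, not the HFileArchiver implementation: the class ParallelArchiveSketch, the method archiveAll, and the use of java.nio.file in place of the Hadoop FileSystem API are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: moves each region directory into an archive root on a worker thread.
class ParallelArchiveSketch {

  static void archiveAll(List<Path> regionDirs, Path archiveRoot, int threads)
      throws IOException, InterruptedException {
    Files.createDirectories(archiveRoot);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Submit one move task per region directory.
      List<Future<Path>> futures = new ArrayList<>();
      for (Path dir : regionDirs) {
        futures.add(pool.submit(() -> Files.move(dir, archiveRoot.resolve(dir.getFileName()))));
      }
      // Wait for every task; surface the first failure as an IOException.
      for (Future<Path> future : futures) {
        try {
          future.get();
        } catch (ExecutionException e) {
          throw new IOException("Failed to archive a region directory", e.getCause());
        }
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

A caller would pass the list of region directories and a target directory, for example ParallelArchiveSketch.archiveAll(regionDirs, archiveRoot, 4). Waiting on every Future before returning mirrors the documented contract that an IOException is raised if the request cannot be completed.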
-
getArchiveExecutor
-
getThreadFactory
-
archiveFamily
public static void archiveFamily(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, RegionInfo parent, org.apache.hadoop.fs.Path tableDir, byte[] family) throws IOException
Remove from the specified region the store files of the specified column family, either by archiving them or by outright deletion.
Parameters:
fs - the filesystem where the store files live
conf - Configuration to examine to determine the archive directory
parent - Parent region hosting the store files
tableDir - Path to where the table is being stored (for building the archive path)
family - the family hosting the store files
Throws:
IOException - if the files could not be correctly disposed.
-
archiveFamilyByFamilyDir
public static void archiveFamilyByFamilyDir(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, RegionInfo parent, org.apache.hadoop.fs.Path familyDir, byte[] family) throws IOException
Removes from the specified region the store files of the specified column family, either by archiving them or by outright deletion.
Parameters:
fs - the filesystem where the store files live
conf - Configuration to examine to determine the archive directory
parent - Parent region hosting the store files
familyDir - Path to where the family is being stored
family - the family hosting the store files
Throws:
IOException - if the files could not be correctly disposed.
-
archiveStoreFiles
public static void archiveStoreFiles(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, org.apache.hadoop.fs.Path tableDir, byte[] family, Collection<HStoreFile> compactedFiles) throws IOException
Remove the store files, either by archiving them or by outright deletion.
Parameters:
conf - Configuration to examine to determine the archive directory
fs - the filesystem where the store files live
regionInfo - RegionInfo of the region hosting the store files
family - the family hosting the store files
compactedFiles - files to be disposed of. No further reading of these files should be attempted; doing so is likely to cause an IOException
Throws:
IOException - if the files could not be correctly disposed.
-
archiveRecoveredEdits
public static void archiveRecoveredEdits(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, byte[] family, Collection<HStoreFile> replayedEdits) throws IOException
Archive recovered edits using existing logic for archiving store files. This is currently only relevant when hbase.region.archive.recovered.edits is true, as recovered edits shouldn't be kept after replay. In theory, we could use the very same method available for archiving store files, but supporting WAL dir and store files on different FileSystems added the need for extra validation of the passed FileSystem instance and the path where the archived edits should be placed.
Parameters:
conf - Configuration to determine the archive directory.
fs - the filesystem used for storing WAL files.
regionInfo - RegionInfo, a pseudo region representation for the archiving logic.
family - a pseudo family representation for the archiving logic.
replayedEdits - the recovered edits to be archived.
Throws:
IOException - if files can't be archived due to some internal error.
-
archive
private static void archive(org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, byte[] family, Collection<HStoreFile> compactedFiles, org.apache.hadoop.fs.Path storeArchiveDir) throws IOException
Throws:
IOException
-
archiveStoreFile
public static void archiveStoreFile(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, org.apache.hadoop.fs.Path tableDir, byte[] family, org.apache.hadoop.fs.Path storeFile) throws IOException
Archive the store file.
Parameters:
fs - the filesystem where the store files live
regionInfo - region hosting the store files
conf - Configuration to examine to determine the archive directory
tableDir - Path to where the table is being stored (for building the archive path)
family - the family hosting the store files
storeFile - file to be archived
Throws:
IOException - if the files could not be correctly disposed.
-
resolveAndArchive
private static List<HFileArchiver.File> resolveAndArchive(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path baseArchiveDir, Collection<HFileArchiver.File> toArchive, long start) throws IOException
Resolve any conflict with an existing archive file via timestamp-append renaming of the existing file and then archive the passed in files.
Parameters:
fs - FileSystem on which to archive the files
baseArchiveDir - base archive directory to store the files. If any of the files to archive are directories, will append the name of the directory to the base archive directory name, creating a parallel structure.
toArchive - files/directories that need to be archived
start - time the archiving started, used for resolving archive conflicts.
Returns:
the list of files that failed to archive.
Throws:
IOException - if an unexpected file operation exception occurred
-
resolveAndArchiveFile
private static boolean resolveAndArchiveFile(org.apache.hadoop.fs.Path archiveDir, HFileArchiver.File currentFile, String archiveStartTime) throws IOException
Attempt to archive the passed in file to the archive directory.
If the same file already exists in the archive, it is moved to a timestamped directory under the archive directory and the new file is put in its place.
Parameters:
archiveDir - Path to the directory that stores the archives of the hfiles
currentFile - Path to the original HFile that will be archived
archiveStartTime - time the archiving started, to resolve naming conflicts
Returns:
true if the file is successfully archived. false if there was a problem, but the operation still completed.
Throws:
IOException - on failure to complete FileSystem operations.
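The conflict-resolution idea can be illustrated with a small standalone sketch. It follows the timestamp-append renaming variant mentioned for resolveAndArchive and is not the HFileArchiver code: the class TimestampConflictSketch, the method archiveFile, the "." separator value, and the use of java.nio.file instead of the Hadoop FileSystem API are assumptions for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of timestamp-based conflict resolution when archiving a file.
class TimestampConflictSketch {

  // Assumed separator between the file name and the timestamp suffix.
  static final String SEPARATOR = ".";

  static Path archiveFile(Path archiveDir, Path currentFile, String archiveStartTime)
      throws IOException {
    Files.createDirectories(archiveDir);
    Path target = archiveDir.resolve(currentFile.getFileName());
    if (Files.exists(target)) {
      // An older copy is already archived: rename it with the timestamp suffix
      // so the new file can take its place.
      Path backup = archiveDir.resolve(target.getFileName() + SEPARATOR + archiveStartTime);
      Files.move(target, backup);
    }
    // Move the current file into the archive under its original name.
    return Files.move(currentFile, target);
  }
}
```

With this approach the archive always holds the newest copy under the original name, while earlier copies remain recoverable under their timestamped names.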
-
deleteRegionWithoutArchiving
private static boolean deleteRegionWithoutArchiving(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path regionDir) throws IOException
Without regard for backup, delete a region. Should be used with caution.
Parameters:
regionDir - Path to the region to be deleted.
fs - FileSystem from which to delete the region
Returns:
true on successful deletion, false otherwise
Throws:
IOException - on filesystem operation failure
-
deleteStoreFilesWithoutArchiving
private static void deleteStoreFilesWithoutArchiving(Collection<HStoreFile> compactedFiles) throws IOException
Just do a simple delete of the given store files.
A best effort is made to delete each of the files, rather than bailing on the first failure.
Parameters:
compactedFiles - store files to delete from the file system.
Throws:
IOException - if a file cannot be deleted. Deletion is attempted for every file before the exception is thrown, rather than failing at the first file.
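The best-effort behaviour described above, where every file is attempted before any failure is reported, can be sketched as follows. This is an illustrative pattern using java.nio.file, not the HFileArchiver implementation; the class BestEffortDeleteSketch and the method deleteAll are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch: try to delete every file, then report all failures at once.
class BestEffortDeleteSketch {

  static void deleteAll(Collection<Path> files) throws IOException {
    List<IOException> errors = new ArrayList<>();
    for (Path file : files) {
      try {
        Files.delete(file);
      } catch (IOException e) {
        // Remember the failure and keep going with the remaining files.
        errors.add(e);
      }
    }
    if (!errors.isEmpty()) {
      IOException failure = new IOException(
          "Failed to delete " + errors.size() + " of " + files.size() + " files");
      // Attach each individual failure so callers can inspect them.
      errors.forEach(failure::addSuppressed);
      throw failure;
    }
  }
}
```

Collecting the individual exceptions as suppressed exceptions is one design choice among several; it keeps a single thrown IOException while preserving the cause for each file that could not be deleted.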
-