Package org.apache.hadoop.hbase.backup
Class HFileArchiver
java.lang.Object
org.apache.hadoop.hbase.backup.HFileArchiver
Utility class to handle the removal of HFiles (or the respective StoreFiles) for a HRegion from the FileSystem. The hfiles will be archived or deleted, depending on the state of the system.
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
- private static class: Wrapper to handle file operations uniformly
- private static class
- private static class
- private static class: Adapt a type to match the HFileArchiver.File interface, which is used internally for handling archival/removal of files
- private static class: Convert a FileStatus to something we can manage in the archiving
- private static class: Convert the HStoreFile into something we can manage in the archive methods
-
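The nested classes describe an adapter arrangement: different file sources are wrapped behind one uniform file abstraction before archival or removal. A minimal sketch of that idea using only the JDK follows. The names `ArchivableFile`, `PathFile`, and `toArchivable` are hypothetical and are not the HBase classes; `java.nio.file.Path` stands in for the Hadoop types.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

final class FileAdapterSketch {
  /** Hypothetical uniform view of something that can be archived or deleted. */
  interface ArchivableFile {
    String getName();

    boolean isFile();

    void delete() throws IOException;
  }

  /** Adapts a java.nio.file.Path to the uniform interface. */
  static final class PathFile implements ArchivableFile {
    private final Path path;

    PathFile(Path path) {
      this.path = path;
    }

    @Override
    public String getName() {
      return path.getFileName().toString();
    }

    @Override
    public boolean isFile() {
      return Files.isRegularFile(path);
    }

    @Override
    public void delete() throws IOException {
      Files.deleteIfExists(path);
    }
  }

  /** Converts each source object with the supplied adapter function. */
  static <T> List<ArchivableFile> toArchivable(
      List<T> sources, Function<T, ArchivableFile> converter) {
    return sources.stream().map(converter).collect(Collectors.toList());
  }
}
```

With this shape, the archiving logic only needs to know about the uniform interface, and each new source type needs just one converter.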
Field Summary
Fields (Modifier and Type, Field, Description)
- private static ThreadPoolExecutor archiveExecutor
- private static final int DEFAULT_RETRIES_NUMBER: Number of retries in case of fs operation failure
- private static final Function<HFileArchiver.File,org.apache.hadoop.fs.Path> FUNC_FILE_TO_PATH
- private static final org.slf4j.Logger LOG
- private static final String SEPARATOR
-
Constructor Summary
Constructors
- private HFileArchiver()
-
Method Summary
Methods (Modifier and Type, Method, Description)
- private static void archive(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, byte[] family, Collection<HStoreFile> compactedFiles, org.apache.hadoop.fs.Path storeArchiveDir)
- static void archiveFamily(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, RegionInfo parent, org.apache.hadoop.fs.Path tableDir, byte[] family): Remove from the specified region the store files of the specified column family, either by archiving them or outright deletion
- static void archiveFamilyByFamilyDir(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, RegionInfo parent, org.apache.hadoop.fs.Path familyDir, byte[] family): Removes from the specified region the store files of the specified column family, either by archiving them or outright deletion
- private static void archiveFilesConcurrently(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path baseArchiveDir, List<HFileArchiver.File> files, Queue<HFileArchiver.File> failures, String startTime)
- static void archiveRecoveredEdits(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, byte[] family, Collection<HStoreFile> replayedEdits): Archive recovered edits using existing logic for archiving store files.
- static boolean archiveRegion(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, org.apache.hadoop.fs.Path tableDir, org.apache.hadoop.fs.Path regionDir): Remove an entire region from the table directory via archiving the region's hfiles.
- static void archiveRegion(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info): Cleans up all the files for a HRegion by archiving the HFiles to the archive directory
- static void archiveRegion(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path tableDir): Cleans up all the files for a HRegion by archiving the HFiles to the archive directory
- static void archiveRegions(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path tableDir, List<org.apache.hadoop.fs.Path> regionDirList): Archive the specified regions in parallel.
- static void archiveStoreFile(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, org.apache.hadoop.fs.Path tableDir, byte[] family, org.apache.hadoop.fs.Path storeFile): Archive the store file
- static void archiveStoreFiles(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, org.apache.hadoop.fs.Path tableDir, byte[] family, Collection<HStoreFile> compactedFiles): Remove the store files, either by archiving them or outright deletion
- private static boolean deleteRegionWithoutArchiving(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path regionDir): Without regard for backup, delete a region.
- private static void deleteStoreFilesWithoutArchiving(Collection<HStoreFile> compactedFiles): Just do a simple delete of the given store files
- private static void ensureArchiveDirectoryExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path baseArchiveDir)
- static boolean exists(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info): Returns true if the region exists in the filesystem.
- private static ThreadPoolExecutor getArchiveExecutor(org.apache.hadoop.conf.Configuration conf)
- private static ThreadFactory getThreadFactory(String archiverName)
- private static void handleDirectory(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path baseArchiveDir, Queue<HFileArchiver.File> failures, HFileArchiver.File directory, long start)
- private static List<HFileArchiver.File> resolveAndArchive(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path baseArchiveDir, Collection<HFileArchiver.File> toArchive, long start): Resolve any conflict with an existing archive file via timestamp-append renaming of the existing file and then archive the passed in files.
- private static boolean resolveAndArchiveFile(org.apache.hadoop.fs.Path archiveDir, HFileArchiver.File currentFile, String archiveStartTime): Attempt to archive the passed in file to the archive directory.
-
Field Details
-
LOG
-
SEPARATOR
-
DEFAULT_RETRIES_NUMBER
Number of retries in case of fs operation failure.
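A bounded retry loop is one common way to use such a constant. The sketch below is an illustration only and makes no claim about how HFileArchiver applies DEFAULT_RETRIES_NUMBER. `FsOperation` and `withRetries` are hypothetical names, and the attempt limit is passed in rather than assumed.

```java
import java.io.IOException;

final class RetrySketch {
  /** Hypothetical filesystem operation that reports success or failure. */
  interface FsOperation {
    boolean run() throws IOException;
  }

  /**
   * Runs op until it succeeds or maxAttempts is reached. An IOException counts as a
   * failed attempt. Returns true on success, false if every attempt failed.
   */
  static boolean withRetries(FsOperation op, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        if (op.run()) {
          return true;
        }
      } catch (IOException e) {
        // Swallowed on purpose in this sketch; real code would log the failure.
      }
    }
    return false;
  }
}
```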
-
FUNC_FILE_TO_PATH
-
archiveExecutor
-
-
Constructor Details
-
HFileArchiver
private HFileArchiver()
-
-
Method Details
-
exists
public static boolean exists(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info) throws IOException
Returns true if the region exists in the filesystem.
- Throws:
IOException
-
archiveRegion
public static void archiveRegion(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info) throws IOException
Cleans up all the files for a HRegion by archiving the HFiles to the archive directory.
- Parameters:
conf - the configuration to use
fs - the file system object
info - RegionInfo for region to be deleted
- Throws:
IOException
-
archiveRegion
public static void archiveRegion(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo info, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path tableDir) throws IOException
Cleans up all the files for a HRegion by archiving the HFiles to the archive directory.
- Parameters:
conf - the configuration to use
fs - the file system object
info - RegionInfo for region to be deleted
rootDir - Path to the root directory where hbase files are stored (for building the archive path)
tableDir - Path to where the table is being stored (for building the archive path)
- Throws:
IOException
-
archiveRegion
public static boolean archiveRegion(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, org.apache.hadoop.fs.Path tableDir, org.apache.hadoop.fs.Path regionDir) throws IOException
Remove an entire region from the table directory via archiving the region's hfiles.
- Parameters:
fs - FileSystem from which to remove the region
rootdir - Path to the root directory where hbase files are stored (for building the archive path)
tableDir - Path to where the table is being stored (for building the archive path)
regionDir - Path to where a region is being stored (for building the archive path)
- Returns:
- true if the region was successfully deleted. false if the filesystem operations could not complete.
- Throws:
IOException - if the request cannot be completed
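The parameter notes say the root, table, and region paths are used for building the archive path. A minimal sketch of one way to build a parallel archive location and move a region directory there, using `java.nio.file` instead of the Hadoop FileSystem API. The `archive` directory name and the helper names are assumptions for illustration, not taken from this page.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class RegionArchivePathSketch {
  /** Assumed archive directory name, for illustration only. */
  static final String ARCHIVE_DIR = "archive";

  /** Mirrors regionDir's position relative to rootDir underneath rootDir/archive. */
  static Path archivePathFor(Path rootDir, Path regionDir) {
    return rootDir.resolve(ARCHIVE_DIR).resolve(rootDir.relativize(regionDir));
  }

  /** Moves the whole region directory to its mirrored location in the archive. */
  static void moveToArchive(Path rootDir, Path regionDir) throws IOException {
    Path target = archivePathFor(rootDir, regionDir);
    // Create the parent chain so the move itself is a single rename.
    Files.createDirectories(target.getParent());
    Files.move(regionDir, target);
  }
}
```

Keeping the relative layout identical under the archive root means an archived file can be located from its original path alone.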
-
archiveRegions
public static void archiveRegions(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path tableDir, List<org.apache.hadoop.fs.Path> regionDirList) throws IOException
Archive the specified regions in parallel.
- Parameters:
conf - the configuration to use
fs - FileSystem from which to remove the region
rootDir - Path to the root directory where hbase files are stored (for building the archive path)
tableDir - Path to where the table is being stored (for building the archive path)
regionDirList - Path to where regions are being stored (for building the archive path)
- Throws:
IOException - if the request cannot be completed
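Archiving regions in parallel can be pictured as submitting one task per region directory to a thread pool and waiting for all of them. The sketch below shows that pattern with a plain `ExecutorService`. `RegionAction` and `runAll` are hypothetical names, and returning the failed entries (rather than throwing) is a simplification chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelArchiveSketch {
  /** Hypothetical per-region action; stands in for archiving one region directory. */
  interface RegionAction<T> {
    void apply(T regionDir) throws Exception;
  }

  /**
   * Submits one task per region, waits for all of them, and returns the regions
   * whose task threw an exception.
   */
  static <T> List<T> runAll(List<T> regionDirs, RegionAction<T> action, int threads)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (T dir : regionDirs) {
        Callable<Void> task = () -> {
          action.apply(dir);
          return null;
        };
        futures.add(pool.submit(task));
      }
      List<T> failed = new ArrayList<>();
      for (int i = 0; i < futures.size(); i++) {
        try {
          futures.get(i).get();
        } catch (ExecutionException e) {
          failed.add(regionDirs.get(i));
        }
      }
      return failed;
    } finally {
      pool.shutdown();
    }
  }
}
```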
-
getArchiveExecutor
-
getThreadFactory
-
archiveFamily
public static void archiveFamily(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, RegionInfo parent, org.apache.hadoop.fs.Path tableDir, byte[] family) throws IOException
Remove from the specified region the store files of the specified column family, either by archiving them or outright deletion.
- Parameters:
fs - the filesystem where the store files live
conf - Configuration to examine to determine the archive directory
parent - Parent region hosting the store files
tableDir - Path to where the table is being stored (for building the archive path)
family - the family hosting the store files
- Throws:
IOException - if the files could not be correctly disposed.
-
archiveFamilyByFamilyDir
public static void archiveFamilyByFamilyDir(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, RegionInfo parent, org.apache.hadoop.fs.Path familyDir, byte[] family) throws IOException
Removes from the specified region the store files of the specified column family, either by archiving them or outright deletion.
- Parameters:
fs - the filesystem where the store files live
conf - Configuration to examine to determine the archive directory
parent - Parent region hosting the store files
familyDir - Path to where the family is being stored
family - the family hosting the store files
- Throws:
IOException - if the files could not be correctly disposed.
-
archiveStoreFiles
public static void archiveStoreFiles(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, org.apache.hadoop.fs.Path tableDir, byte[] family, Collection<HStoreFile> compactedFiles) throws IOException
Remove the store files, either by archiving them or outright deletion.
- Parameters:
conf - Configuration to examine to determine the archive directory
fs - the filesystem where the store files live
regionInfo - RegionInfo of the region hosting the store files
family - the family hosting the store files
compactedFiles - files to be disposed of. No further reading of these files should be attempted; doing so is likely to cause an IOException
- Throws:
IOException - if the files could not be correctly disposed.
-
archiveRecoveredEdits
public static void archiveRecoveredEdits(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, byte[] family, Collection<HStoreFile> replayedEdits) throws IOException
Archive recovered edits using existing logic for archiving store files. This is currently only relevant when hbase.region.archive.recovered.edits is true, as recovered edits shouldn't be kept after replay. In theory, we could use the very same method available for archiving store files, but supporting WAL dir and store files on different FileSystems added the need for extra validation of the passed FileSystem instance and the path where the archived edits should be placed.
- Parameters:
conf - Configuration to determine the archive directory.
fs - the filesystem used for storing WAL files.
regionInfo - RegionInfo, a pseudo region representation for the archiving logic.
family - a pseudo family representation for the archiving logic.
replayedEdits - the recovered edits to be archived.
- Throws:
IOException - if files can't be archived due to some internal error.
-
archive
private static void archive(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, byte[] family, Collection<HStoreFile> compactedFiles, org.apache.hadoop.fs.Path storeArchiveDir) throws IOException
- Throws:
IOException
-
archiveStoreFile
public static void archiveStoreFile(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, RegionInfo regionInfo, org.apache.hadoop.fs.Path tableDir, byte[] family, org.apache.hadoop.fs.Path storeFile) throws IOException
Archive the store file.
- Parameters:
fs - the filesystem where the store files live
regionInfo - region hosting the store files
conf - Configuration to examine to determine the archive directory
tableDir - Path to where the table is being stored (for building the archive path)
family - the family hosting the store files
storeFile - file to be archived
- Throws:
IOException - if the files could not be correctly disposed.
-
resolveAndArchive
private static List<HFileArchiver.File> resolveAndArchive(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path baseArchiveDir, Collection<HFileArchiver.File> toArchive, long start) throws IOException
Resolve any conflict with an existing archive file via timestamp-append renaming of the existing file and then archive the passed in files.
- Parameters:
fs - FileSystem on which to archive the files
baseArchiveDir - base archive directory to store the files. If any of the files to archive are directories, will append the name of the directory to the base archive directory name, creating a parallel structure.
toArchive - files/directories that need to be archived
start - time the archiving started - used for resolving archive conflicts.
- Returns:
- the list of files that failed to archive.
- Throws:
IOException - if an unexpected file operation exception occurred
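Timestamp-append renaming can be sketched as follows: if the archive already holds a file with the same name, the older copy is renamed with a timestamp suffix before the new file is moved in. This is an illustration with `java.nio.file`, not the HBase implementation. The `.` separator value and the `archive` helper are assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ArchiveConflictSketch {
  /** Assumed separator between the file name and the timestamp suffix. */
  static final String SEPARATOR = ".";

  /**
   * Moves source into archiveDir. If a file with the same name is already archived,
   * that older copy is first renamed to name + SEPARATOR + startTime.
   * Returns the path of the newly archived file.
   */
  static Path archive(Path archiveDir, Path source, long startTime) throws IOException {
    Files.createDirectories(archiveDir);
    Path target = archiveDir.resolve(source.getFileName());
    if (Files.exists(target)) {
      Path backup = archiveDir.resolve(source.getFileName() + SEPARATOR + startTime);
      Files.move(target, backup);
    }
    return Files.move(source, target);
  }
}
```

Using the archiving start time as the suffix keeps every file displaced by the same archiving run under one recognizable timestamp.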
-
ensureArchiveDirectoryExists
private static void ensureArchiveDirectoryExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path baseArchiveDir) throws IOException
- Throws:
IOException
-
handleDirectory
private static void handleDirectory(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path baseArchiveDir, Queue<HFileArchiver.File> failures, HFileArchiver.File directory, long start)
-
archiveFilesConcurrently
private static void archiveFilesConcurrently(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path baseArchiveDir, List<HFileArchiver.File> files, Queue<HFileArchiver.File> failures, String startTime)
-
resolveAndArchiveFile
private static boolean resolveAndArchiveFile(org.apache.hadoop.fs.Path archiveDir, HFileArchiver.File currentFile, String archiveStartTime) throws IOException
Attempt to archive the passed in file to the archive directory.
If the same file already exists in the archive, it is moved to a timestamped directory under the archive directory and the new file is put in its place.
- Parameters:
archiveDir - Path to the directory that stores the archives of the hfiles
currentFile - Path to the original HFile that will be archived
archiveStartTime - time the archiving started, to resolve naming conflicts
- Returns:
- true if the file is successfully archived. false if there was a problem, but the operation still completed.
- Throws:
IOException - on failure to complete FileSystem operations.
-
deleteRegionWithoutArchiving
private static boolean deleteRegionWithoutArchiving(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path regionDir) throws IOException
Without regard for backup, delete a region. Should be used with caution.
- Parameters:
regionDir - Path to the region to be deleted.
fs - FileSystem from which to delete the region
- Returns:
- true on successful deletion, false otherwise
- Throws:
IOException - on filesystem operation failure
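Deleting a region without archiving amounts to a recursive directory delete with no backup copy. A minimal `java.nio.file` sketch of that operation is below. `deleteRecursively` is a hypothetical helper, and returning false for a missing directory is a choice made for this sketch only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class RecursiveDeleteSketch {
  /** Deletes dir and everything below it; returns true if dir is gone afterwards. */
  static boolean deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return false;
    }
    List<Path> entries;
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order puts children before their parent directories.
      entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path entry : entries) {
      Files.delete(entry);
    }
    return !Files.exists(dir);
  }
}
```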
-
deleteStoreFilesWithoutArchiving
private static void deleteStoreFilesWithoutArchiving(Collection<HStoreFile> compactedFiles) throws IOException
Just do a simple delete of the given store files.
A best effort is made to delete each of the files, rather than bailing on the first failure.
- Parameters:
compactedFiles - store files to delete from the file system.
- Throws:
IOException - if a file cannot be deleted. Deletion is attempted for every file before the exception is thrown, rather than failing at the first file.
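The best-effort behaviour described here (attempt every file, then report failures once) can be sketched with `java.nio.file` as follows. `deleteAll` is a hypothetical helper, and attaching the individual errors as suppressed exceptions is a design choice of this sketch rather than something this page specifies.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class BestEffortDeleteSketch {
  /**
   * Attempts to delete every file. Failures are collected, and a single IOException
   * carrying them as suppressed exceptions is thrown only after all deletions
   * were attempted.
   */
  static void deleteAll(Collection<Path> files) throws IOException {
    List<IOException> errors = new ArrayList<>();
    for (Path file : files) {
      try {
        Files.delete(file);
      } catch (IOException e) {
        errors.add(e);
      }
    }
    if (!errors.isEmpty()) {
      IOException summary = new IOException(
          "Failed to delete " + errors.size() + " of " + files.size() + " files");
      errors.forEach(summary::addSuppressed);
      throw summary;
    }
  }
}
```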
-