Package org.apache.hadoop.hbase.wal
Class WALSplitUtil
java.lang.Object
org.apache.hadoop.hbase.wal.WALSplitUtil
This class provides static methods to support WAL splitting related works
-
Nested Class Summary
Modifier and TypeClassDescriptionstatic class
A struct used by getMutationsFromWALEntry -
Field Summary
-
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescription(package private) static void
archive
(org.apache.hadoop.fs.Path wal, boolean corrupt, org.apache.hadoop.fs.Path oldWALDir, org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.conf.Configuration conf) Moves processed logs to a oldLogDir after successful processing Moves corrupted logs (any log that couldn't be successfully parsed to corruptDir (.corrupt) for later investigationstatic void
finishSplitLogFile
(String logfile, org.apache.hadoop.conf.Configuration conf) Completes the work done by splitLogFile by archiving logs(package private) static String
formatRecoveredEditsFileName
(long seqid) (package private) static org.apache.hadoop.fs.Path
getCompletedRecoveredEditsFilePath
(org.apache.hadoop.fs.Path srcPath, long maximumEditWALSeqNum) Get the completed recovered edits file path, renaming it to be by last edit in the file from its first edit.static long
getMaxRegionSequenceId
(org.apache.hadoop.conf.Configuration conf, RegionInfo region, IOExceptionSupplier<org.apache.hadoop.fs.FileSystem> rootFsSupplier, IOExceptionSupplier<org.apache.hadoop.fs.FileSystem> walFsSupplier) Deprecated.Only for compatibility, will be removed in 4.0.0.static long
getMaxRegionSequenceId
(org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.fs.Path regionDir) Get the max sequence id which is stored in the region directory.private static long
getMaxSequenceId
(org.apache.hadoop.fs.FileStatus[] files) static List<WALSplitUtil.MutationReplay>
getMutationsFromWALEntry
(org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos.WALEntry entry, ExtendedCellScanner cells, Pair<WALKey, WALEdit> logEntry, Durability durability) Deprecated.Since 3.0.0, will be removed in 4.0.0.static org.apache.hadoop.fs.FileStatus[]
getRecoveredHFiles
(org.apache.hadoop.fs.FileSystem rootFS, org.apache.hadoop.fs.Path regionDir, String familyName) private static org.apache.hadoop.fs.Path
getRecoveredHFilesDir
(org.apache.hadoop.fs.Path regionDir, String familyName) static org.apache.hadoop.fs.Path
getRegionDirRecoveredEditsDir
(org.apache.hadoop.fs.Path regionDir) (package private) static org.apache.hadoop.fs.Path
getRegionSplitEditsPath
(TableName tableName, byte[] encodedRegionName, long seqId, String fileNameBeingSplit, String tmpDirName, org.apache.hadoop.conf.Configuration conf) Path to a file under RECOVERED_EDITS_DIR directory of the region found inlogEntry
named for the sequenceid in the passedlogEntry
: e.g.private static org.apache.hadoop.fs.FileStatus[]
getSequenceIdFiles
(org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.fs.Path regionDir) static NavigableSet<org.apache.hadoop.fs.Path>
getSplitEditFilesSorted
(org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.fs.Path regionDir) Returns sorted set of edit files made by splitter, excluding files with '.temp' suffix.private static String
getTmpRecoveredEditsFileName
(String fileName) static boolean
hasRecoveredEdits
(org.apache.hadoop.conf.Configuration conf, RegionInfo regionInfo) Check whether there is recovered.edits in the region dirstatic boolean
isSequenceIdFile
(org.apache.hadoop.fs.Path file) Is the given file a region open sequence id file.private static void
mkdir
(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir) static org.apache.hadoop.fs.Path
moveAsideBadEditsFile
(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path edits) Move aside a bad edits file.static void
moveWAL
(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p, org.apache.hadoop.fs.Path targetDir) Move WAL.(package private) static org.apache.hadoop.fs.Path
tryCreateRecoveredHFilesDir
(org.apache.hadoop.fs.FileSystem rootFS, org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName, String familyName) Return path to recovered.hfiles directory of the region's column family: e.g.static void
writeRegionSequenceIdFile
(org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.fs.Path regionDir, long newMaxSeqId) Create a file with name as region's max sequence id
-
Field Details
-
LOG
-
EDITFILES_NAME_PATTERN
-
RECOVERED_LOG_TMPFILE_SUFFIX
- See Also:
-
SEQUENCE_ID_FILE_SUFFIX
- See Also:
-
OLD_SEQUENCE_ID_FILE_SUFFIX
- See Also:
-
SEQUENCE_ID_FILE_SUFFIX_LENGTH
-
-
Constructor Details
-
WALSplitUtil
private WALSplitUtil()
-
-
Method Details
-
finishSplitLogFile
public static void finishSplitLogFile(String logfile, org.apache.hadoop.conf.Configuration conf) throws IOException Completes the work done by splitLogFile by archiving logsIt is invoked by SplitLogManager once it knows that one of the SplitLogWorkers have completed the splitLogFile() part. If the master crashes then this function might get called multiple times.
- Throws:
IOException
-
archive
static void archive(org.apache.hadoop.fs.Path wal, boolean corrupt, org.apache.hadoop.fs.Path oldWALDir, org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.conf.Configuration conf) throws IOException Moves processed logs to a oldLogDir after successful processing Moves corrupted logs (any log that couldn't be successfully parsed to corruptDir (.corrupt) for later investigation- Throws:
IOException
-
mkdir
private static void mkdir(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir) throws IOException - Throws:
IOException
-
moveWAL
public static void moveWAL(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p, org.apache.hadoop.fs.Path targetDir) throws IOException Move WAL. Used to move processed WALs to archive or bad WALs to corrupt WAL dir. WAL may have already been moved; makes allowance.- Throws:
IOException
-
getRegionSplitEditsPath
static org.apache.hadoop.fs.Path getRegionSplitEditsPath(TableName tableName, byte[] encodedRegionName, long seqId, String fileNameBeingSplit, String tmpDirName, org.apache.hadoop.conf.Configuration conf) throws IOException Path to a file under RECOVERED_EDITS_DIR directory of the region found inlogEntry
named for the sequenceid in the passedlogEntry
: e.g. /hbase/some_table/2323432434/recovered.edits/2332. This method also ensures existence of RECOVERED_EDITS_DIR under the region creating it if necessary. And also set storage policy for RECOVERED_EDITS_DIR if WAL_STORAGE_POLICY is configured.- Parameters:
tableName
- the table nameencodedRegionName
- the encoded region nameseqId
- the sequence id which used to generate file namefileNameBeingSplit
- the file being split currently. Used to generate tmp file name.tmpDirName
- of the directory used to sideline old recovered edits fileconf
- configuration- Returns:
- Path to file into which to dump split log edits.
- Throws:
IOException
-
getTmpRecoveredEditsFileName
-
getCompletedRecoveredEditsFilePath
static org.apache.hadoop.fs.Path getCompletedRecoveredEditsFilePath(org.apache.hadoop.fs.Path srcPath, long maximumEditWALSeqNum) Get the completed recovered edits file path, renaming it to be by last edit in the file from its first edit. Then we could use the name to skip recovered edits when doing HRegion#replayRecoveredEditsIfAny(Map, CancelableProgressable, MonitoredTask).- Returns:
- dstPath take file's last edit log seq num as the name
-
formatRecoveredEditsFileName
-
getRegionDirRecoveredEditsDir
public static org.apache.hadoop.fs.Path getRegionDirRecoveredEditsDir(org.apache.hadoop.fs.Path regionDir) - Parameters:
regionDir
- This regions directory in the filesystem.- Returns:
- The directory that holds recovered edits files for the region
regionDir
-
hasRecoveredEdits
public static boolean hasRecoveredEdits(org.apache.hadoop.conf.Configuration conf, RegionInfo regionInfo) throws IOException Check whether there is recovered.edits in the region dir- Parameters:
conf
- confregionInfo
- the region to check- Returns:
- true if recovered.edits exist in the region dir
- Throws:
IOException
-
getMaxRegionSequenceId
@Deprecated public static long getMaxRegionSequenceId(org.apache.hadoop.conf.Configuration conf, RegionInfo region, IOExceptionSupplier<org.apache.hadoop.fs.FileSystem> rootFsSupplier, IOExceptionSupplier<org.apache.hadoop.fs.FileSystem> walFsSupplier) throws IOException Deprecated.Only for compatibility, will be removed in 4.0.0.This method will check 3 places for finding the max sequence id file. One is the expected place, another is the old place under the region directory, and the last one is the wrong one we introduced in HBASE-20734. See HBASE-22617 for more details. Notice that, you should always call this method instead ofgetMaxRegionSequenceId(FileSystem, Path)
until 4.0.0 release.- Throws:
IOException
-
getSplitEditFilesSorted
public static NavigableSet<org.apache.hadoop.fs.Path> getSplitEditFilesSorted(org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.fs.Path regionDir) throws IOException Returns sorted set of edit files made by splitter, excluding files with '.temp' suffix.- Parameters:
walFS
- WAL FileSystem used to retrieving split edits files.regionDir
- WAL region dir to look for recovered edits files under.- Returns:
- Files in passed
regionDir
as a sorted set. - Throws:
IOException
-
moveAsideBadEditsFile
public static org.apache.hadoop.fs.Path moveAsideBadEditsFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path edits) throws IOException Move aside a bad edits file.- Parameters:
fs
- the file system used to rename bad edits file.edits
- Edits file to move aside.- Returns:
- The name of the moved aside file.
- Throws:
IOException
-
isSequenceIdFile
Is the given file a region open sequence id file. -
getSequenceIdFiles
private static org.apache.hadoop.fs.FileStatus[] getSequenceIdFiles(org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.fs.Path regionDir) throws IOException - Throws:
IOException
-
getMaxSequenceId
-
getMaxRegionSequenceId
public static long getMaxRegionSequenceId(org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.fs.Path regionDir) throws IOException Get the max sequence id which is stored in the region directory. -1 if none.- Throws:
IOException
-
writeRegionSequenceIdFile
public static void writeRegionSequenceIdFile(org.apache.hadoop.fs.FileSystem walFS, org.apache.hadoop.fs.Path regionDir, long newMaxSeqId) throws IOException Create a file with name as region's max sequence id- Throws:
IOException
-
getMutationsFromWALEntry
@Deprecated public static List<WALSplitUtil.MutationReplay> getMutationsFromWALEntry(org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos.WALEntry entry, ExtendedCellScanner cells, Pair<WALKey, WALEdit> logEntry, Durability durability) throws IOExceptionDeprecated.Since 3.0.0, will be removed in 4.0.0.This function is used to construct mutations from a WALEntry. It also reconstructs WALKey & WALEdit from the passed in WALEntry- Parameters:
logEntry
- pair of WALKey and WALEdit instance stores WALKey and WALEdit instances extracted from the passed in WALEntry.- Returns:
- list of Pair<MutationType, Mutation> to be replayed
- Throws:
IOException
-
tryCreateRecoveredHFilesDir
static org.apache.hadoop.fs.Path tryCreateRecoveredHFilesDir(org.apache.hadoop.fs.FileSystem rootFS, org.apache.hadoop.conf.Configuration conf, TableName tableName, String encodedRegionName, String familyName) throws IOException Return path to recovered.hfiles directory of the region's column family: e.g. /hbase/some_table/2323432434/cf/recovered.hfiles/. This method also ensures existence of recovered.hfiles directory under the region's column family, creating it if necessary.- Parameters:
rootFS
- the root file systemconf
- configurationtableName
- the table nameencodedRegionName
- the encoded region namefamilyName
- the column family name- Returns:
- Path to recovered.hfiles directory of the region's column family.
- Throws:
IOException
-
getRecoveredHFilesDir
private static org.apache.hadoop.fs.Path getRecoveredHFilesDir(org.apache.hadoop.fs.Path regionDir, String familyName) - Parameters:
regionDir
- This regions directory in the filesystemfamilyName
- The column family name- Returns:
- The directory that holds recovered hfiles for the region's column family
-
getRecoveredHFiles
public static org.apache.hadoop.fs.FileStatus[] getRecoveredHFiles(org.apache.hadoop.fs.FileSystem rootFS, org.apache.hadoop.fs.Path regionDir, String familyName) throws IOException - Throws:
IOException
-