Interface ExportSnapshot.CustomFileGrouper

All Known Implementing Classes:
ExportSnapshot.NoopCustomFileGrouper
Enclosing class:
ExportSnapshot

@Public public static interface ExportSnapshot.CustomFileGrouper
You may implement a CustomFileGrouper to influence how ExportSnapshot assigns input files to the MapReduce job's InputSplits. Your implementation must return a data structure that contains each input file exactly once. Files that appear in separate entries of the top-level returned Collection are guaranteed not to be placed in the same InputSplit. This lets you segregate input files by the rack or host on which they are available, which, used in conjunction with ExportSnapshot.FileLocationResolver, can improve the performance of your ExportSnapshot runs. To use a custom grouper, pass the --custom-file-grouper argument with the fully qualified class name of a CustomFileGrouper implementation that is on the classpath. If this argument is not given, no particular grouping logic is applied.
Method Summary

    Modifier and Type: Collection<Collection<Pair<org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotFileInfo,Long>>>
    Method: getGroupedInputFiles(Collection<Pair<org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotFileInfo,Long>> snapshotFiles)
    Description: (none provided)
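The grouping contract described above (every input file appears exactly once, and files in different top-level entries never share an InputSplit) can be illustrated with a minimal, self-contained sketch. It does not depend on HBase: `Map.Entry<String, Long>` stands in for `Pair<SnapshotFileInfo, Long>`, and the class name `GroupingSketch`, the helper `groupBy`, and the idea of deriving a rack key from a path prefix are all hypothetical, chosen only for illustration. In a real implementation, logic of this shape would presumably form the body of `getGroupedInputFiles`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a CustomFileGrouper implementation; not part of HBase.
class GroupingSketch {

  // Partitions the input by a caller-supplied key (for example, a rack or host name).
  // Each element is added to exactly one group, so the result contains every
  // input file exactly once, as the interface contract requires.
  static <T, K> Collection<Collection<T>> groupBy(Collection<T> files, Function<T, K> keyOf) {
    // LinkedHashMap keeps groups in first-seen order, which makes runs reproducible.
    Map<K, Collection<T>> groups = new LinkedHashMap<>();
    for (T file : files) {
      groups.computeIfAbsent(keyOf.apply(file), k -> new ArrayList<>()).add(file);
    }
    // Each map value becomes one top-level entry of the returned Collection.
    return new ArrayList<>(groups.values());
  }
}
```

Calling `GroupingSketch.groupBy(files, f -> rackOf(f))`, where `rackOf` is an assumed helper that looks up a file's location, would yield one top-level entry per rack. Because an element is only ever appended to the group for its own key, no file can be duplicated or dropped.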