Class StochasticLoadBalancer
- All Implemented Interfaces: ConfigurationObserver, LoadBalancer, Stoppable
- Direct Known Subclasses: CacheAwareLoadBalancer, FavoredStochasticBalancer
This is a best-effort load balancer. Given a cost function F(C) => x, it randomly tries to mutate the cluster state C into a new state Cprime. If F(Cprime) < F(C), the new cluster state becomes the plan. It includes cost functions that compute the cost of:
- Region Load
- Table Load
- Data Locality
- Memstore Sizes
- Storefile Sizes
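The accept-if-cheaper search described above can be sketched on a toy model. This is only an illustration under stated assumptions, not the balancer's code: the class name, the int[] cluster model (regions per server), and the imbalance-based cost are all hypothetical.

```java
import java.util.Random;

/** Toy sketch of the accept-if-cheaper search; not the HBase implementation. */
final class StochasticSearchSketch {

  /** Hypothetical cost F(C) in [0, 1]: spread between the busiest and idlest server. */
  static double cost(int[] regionsPerServer) {
    int min = Integer.MAX_VALUE;
    int max = 0;
    int total = 0;
    for (int n : regionsPerServer) {
      min = Math.min(min, n);
      max = Math.max(max, n);
      total += n;
    }
    return total == 0 ? 0.0 : (double) (max - min) / total;
  }

  /** Tries random single-region moves, keeping a move only if it lowers the cost. */
  static int[] search(int[] initial, int steps, long seed) {
    Random random = new Random(seed);
    int[] current = initial.clone();
    double currentCost = cost(current);
    for (int i = 0; i < steps; i++) {
      int from = random.nextInt(current.length);
      int to = random.nextInt(current.length);
      if (from == to || current[from] == 0) {
        continue; // nothing to move in this step
      }
      current[from]--; // mutate C into Cprime
      current[to]++;
      double newCost = cost(current);
      if (newCost < currentCost) {
        currentCost = newCost; // F(Cprime) < F(C): keep Cprime
      } else {
        current[from]++; // otherwise undo the move
        current[to]--;
      }
    }
    return current;
  }
}
```

Because a move is kept only when it strictly lowers the cost, the returned state never costs more than the initial one, and the total number of regions is unchanged.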
Every cost function returns a number between 0 and 1 inclusive, where 0 is the lowest cost (the best solution) and 1 is the highest possible cost (the worst solution). The computed costs are scaled by their respective multipliers:
- hbase.master.balancer.stochastic.regionLoadCost
- hbase.master.balancer.stochastic.moveCost
- hbase.master.balancer.stochastic.tableLoadCost
- hbase.master.balancer.stochastic.localityCost
- hbase.master.balancer.stochastic.memstoreSizeCost
- hbase.master.balancer.stochastic.storefileSizeCost
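As a rough illustration of the two ideas above, a raw measurement can be normalized into [0, 1] and then scaled by a multiplier. The helper names and the linear normalization are assumptions made for this sketch; this page does not state how each built-in cost function normalizes its value.

```java
/** Hypothetical helpers showing a cost normalized to [0, 1] and then scaled. */
final class ScaledCostSketch {

  /** Linearly maps value from [min, max] into [0, 1], clamping out-of-range input. */
  static double scale(double min, double max, double value) {
    if (max <= min || value <= min) {
      return 0.0; // degenerate range or at/below the best case
    }
    return Math.min(1.0, (value - min) / (max - min));
  }

  /** Applies a configured multiplier to an already normalized cost. */
  static double weighted(double normalizedCost, float multiplier) {
    return normalizedCost * multiplier;
  }
}
```

With an illustrative multiplier of 500, a normalized cost of 0.5 contributes 250 to the total, so a larger multiplier makes that cost function dominate the comparison between cluster states.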
You can also add custom cost functions by setting the following configuration value:
- hbase.master.balancer.stochastic.additionalCostFunctions
All custom cost functions need to extend CostFunction.
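The shape of such a subclass might look like the sketch below. SketchCostFunction is a hypothetical stand-in written for this example, not HBase's CostFunction; the real base class's abstract methods and visibility should be taken from its own Javadoc. The example cost is also an assumption.

```java
/** Hypothetical stand-in for the balancer's CostFunction base class; not the HBase type. */
abstract class SketchCostFunction {
  /** Returns a cost in [0, 1], where 0 is best and 1 is worst. */
  abstract double cost(int[] regionsPerServer);
}

/** Example custom cost: the fraction of servers that hold no regions. */
final class EmptyServerCostSketch extends SketchCostFunction {
  @Override
  double cost(int[] regionsPerServer) {
    if (regionsPerServer.length == 0) {
      return 0.0; // no servers, nothing to penalize
    }
    int empty = 0;
    for (int n : regionsPerServer) {
      if (n == 0) {
        empty++;
      }
    }
    return (double) empty / regionsPerServer.length;
  }
}
```

The point of the sketch is the contract rather than the metric: whatever a custom function measures, it should return a value in the same 0-to-1 range as the built-in ones so that its multiplier is comparable to theirs.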
In addition to the above configurations, the balancer can be tuned by the following configuration values:
- hbase.master.balancer.stochastic.maxMoveRegions, which controls the maximum number of regions that can be moved in a single invocation of this balancer.
- hbase.master.balancer.stochastic.stepsPerRegion, the coefficient by which the number of regions is multiplied to get the number of times the balancer will try to mutate all servers.
- hbase.master.balancer.stochastic.maxSteps, which controls the maximum number of times that the balancer will try to mutate all the servers. The balancer uses the minimum of this value and the stepsPerRegion computation above.
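Read literally, the last two settings combine as in the sketch below. The class and method names are hypothetical, and the arithmetic follows only the description above; the actual calculateMaxSteps may take further inputs into account (for example, the effect of RUN_MAX_STEPS_KEY is not described on this page).

```java
/** Hypothetical sketch of how stepsPerRegion and maxSteps bound the search length. */
final class MaxStepsSketch {

  /** Returns the smaller of maxSteps and numRegions * stepsPerRegion. */
  static long effectiveSteps(int numRegions, int stepsPerRegion, int maxSteps) {
    long computed = (long) numRegions * stepsPerRegion; // widen to avoid int overflow
    return Math.min(maxSteps, computed);
  }
}
```

For example, with illustrative values of 100 regions, 800 steps per region, and a cap of 1,000,000, the computed 80,000 steps win; with 10,000 regions the cap applies instead.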
This balancer is best used with hbase.master.loadbalance.bytable set to false so that the balancer gets the full picture of all loads on the cluster.
-
Nested Class Summary
-
Field Summary
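Modifier and Type  Field  Description
protected List<CandidateGenerator> candidateGenerators
protected static final String COST_FUNCTIONS_COST_FUNCTIONS_KEY
protected List<CostFunction> costFunctions
private double[] curFunctionCosts
private double curOverallCost
protected static final int DEFAULT_KEEP_REGION_LOADS
protected static final long DEFAULT_MAX_RUNNING_TIME
protected static final int DEFAULT_MAX_STEPS
protected static final float DEFAULT_MIN_COST_NEED_BALANCE
protected static final boolean DEFAULT_RUN_MAX_STEPS
protected static final int DEFAULT_STEPS_PER_REGION
protected static final String KEEP_REGION_LOADS
(package private) Map<String, Deque<BalancerRegionLoad>> loads
private LocalityBasedCandidateGenerator localityCandidateGenerator
private ServerLocalityCostFunction localityCost
private static final org.slf4j.Logger LOG
protected static final String MAX_RUNNING_TIME_KEY
protected static final String MAX_STEPS_KEY
private long maxRunningTime
private int maxSteps
protected static final String MIN_COST_NEED_BALANCE_KEY
private float minCostNeedBalance
private int numRegionLoadsToRemember
static final String OVERALL_COST_FUNCTION_NAME
private RackLocalityCostFunction rackLocalityCost
(package private) Map<String, Pair<ServerName, Float>> regionCacheRatioOnOldServerMap
private RegionReplicaHostCostFunction regionReplicaHostCostFunction
private RegionReplicaRackCostFunction regionReplicaRackCostFunction
protected static final String RUN_MAX_STEPS_KEY
private boolean runMaxSteps
protected static final String STEPS_PER_REGION_KEY
private int stepsPerRegion
private float sumMultiplier
private static final String TABLE_FUNCTION_SEP
private double[] tempFunctionCosts
private double[] weightsOfGenerators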
Fields inherited from class org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer
BALANCER_DECISION_BUFFER_ENABLED, BALANCER_REJECTION_BUFFER_ENABLED, clusterStatus, DEFAULT_BALANCER_DECISION_BUFFER_ENABLED, DEFAULT_BALANCER_REJECTION_BUFFER_ENABLED, DEFAULT_HBASE_MASTER_LOADBALANCE_BYTABLE, isByTable, masterServerName, metricsBalancer, MIN_SERVER_BALANCE, provider, rackManager, regionFinder, slop, useRegionFinder
Fields inherited from interface org.apache.hadoop.hbase.master.LoadBalancer
BOGUS_SERVER_NAME, HBASE_RSGROUP_LOADBALANCER_CLASS
-
Constructor Summary
Constructor  Description
StochasticLoadBalancer()
  The constructor that passes a MetricsStochasticBalancer to BaseLoadBalancer to replace its default MetricsBalancer.
StochasticLoadBalancer(MetricsStochasticBalancer metricsStochasticBalancer)
-
Method Summary
Modifier and Type  Method  Description
private void addCostFunction(List<CostFunction> costFunctions, CostFunction costFunction)
private boolean areSomeRegionReplicasColocated
protected List<RegionPlan> balanceTable(TableName tableName, Map<ServerName, List<RegionInfo>> loadOfOneTable)
  Given the cluster state, this will try to approach an optimal balance.
private long calculateMaxSteps(BalancerClusterState cluster)
(package private) static String composeAttributeName(String tableName, String costFunctionName)
  A helper function to compose the attribute name from the table name and the cost function name.
(package private) double computeCost(BalancerClusterState cluster, double previousCost)
  This is the main cost function.
protected List<CandidateGenerator> createCandidateGenerators
private static CostFunction createCostFunction(Class<? extends CostFunction> clazz, org.apache.hadoop.conf.Configuration conf)
protected List<CostFunction> createCostFunctions(org.apache.hadoop.conf.Configuration conf)
private List<RegionPlan> createRegionPlans(BalancerClusterState cluster)
  Create all of the RegionPlans needed to move from the initial cluster state to the desired state.
protected String functionCost
private String getBalanceReason(double total, double sumMultiplier)
(package private) List<CandidateGenerator> getCandidateGenerators
(package private) String[] getCostFunctionNames
  Get the names of the cost functions.
(package private) List<CostFunction> getCostFunctions
protected CandidateGenerator getRandomGenerator
  Select the candidate generator to use based on the costs of the cost functions.
(package private) void initCosts(BalancerClusterState cluster)
protected void loadConf(org.apache.hadoop.conf.Configuration conf)
private void loadCustomCostFunctions(org.apache.hadoop.conf.Configuration conf)
(package private) boolean needsBalance(TableName tableName, BalancerClusterState cluster)
(package private) BalanceAction nextAction(BalancerClusterState cluster)
private void sendRegionPlansToRingBuffer(List<RegionPlan> plans, double currentCost, double initCost, String initFunctionTotalCosts, long step)
protected void sendRejectionReasonToRingBuffer(Supplier<String> reason, List<CostFunction> costFunctions)
(package private) void setRackManager(RackManager rackManager)
private String totalCostsPerFunc
void updateBalancerLoadInfo(Map<TableName, Map<ServerName, List<RegionInfo>>> loadOfAllTable)
  In some scenarios, the balancer needs to update internal status or information according to the current table load.
private void updateBalancerTableLoadInfo(TableName tableName, Map<ServerName, List<RegionInfo>> loadOfOneTable)
void updateClusterMetrics
  Set the current cluster status.
(package private) void updateCostsAndWeightsWithAction(BalancerClusterState cluster, BalanceAction action)
  Update both the costs of the cost functions and the weights of the candidate generators.
(package private) void updateMetricsSize(int size)
  Update the number of metrics that are reported to JMX.
private void updateRegionLoad
  Store the current region loads.
private void updateStochasticCosts(TableName tableName, double overall, double[] subCosts)
  Update costs to JMX.
Methods inherited from class org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer
balanceCluster, getConf, getDefaultSlop, idleRegionServerExist, initialize, isStopped, onConfigurationChange, postMasterStartupInitialize, preBalanceCluster, randomAssignment, regionOffline, regionOnline, retainAssignment, roundRobinAssignment, setClusterInfoProvider, sloppyRegionServerExist, stop, toEnsumbleTableLoad, updateBalancerStatus
-
Field Details
-
LOG
-
STEPS_PER_REGION_KEY
-
DEFAULT_STEPS_PER_REGION
-
MAX_STEPS_KEY
-
DEFAULT_MAX_STEPS
-
RUN_MAX_STEPS_KEY
-
DEFAULT_RUN_MAX_STEPS
-
MAX_RUNNING_TIME_KEY
-
DEFAULT_MAX_RUNNING_TIME
-
KEEP_REGION_LOADS
-
DEFAULT_KEEP_REGION_LOADS
-
TABLE_FUNCTION_SEP
-
MIN_COST_NEED_BALANCE_KEY
-
DEFAULT_MIN_COST_NEED_BALANCE
-
COST_FUNCTIONS_COST_FUNCTIONS_KEY
-
OVERALL_COST_FUNCTION_NAME
-
loads
-
maxSteps
-
runMaxSteps
-
stepsPerRegion
-
maxRunningTime
-
numRegionLoadsToRemember
-
minCostNeedBalance
-
regionCacheRatioOnOldServerMap
-
costFunctions
-
sumMultiplier
-
curOverallCost
-
tempFunctionCosts
-
curFunctionCosts
-
weightsOfGenerators
-
localityCandidateGenerator
-
localityCost
-
rackLocalityCost
-
regionReplicaHostCostFunction
-
regionReplicaRackCostFunction
-
candidateGenerators
-
-
Constructor Details
-
StochasticLoadBalancer
public StochasticLoadBalancer()
The constructor that passes a MetricsStochasticBalancer to BaseLoadBalancer to replace its default MetricsBalancer.
-
StochasticLoadBalancer
-
-
Method Details
-
createCostFunction
private static CostFunction createCostFunction(Class<? extends CostFunction> clazz, org.apache.hadoop.conf.Configuration conf)
-
loadCustomCostFunctions
-
getCandidateGenerators
-
createCandidateGenerators
-
createCostFunctions
-
loadConf
- Overrides: loadConf in class BaseLoadBalancer
-
updateClusterMetrics
Description copied from interface: LoadBalancer
Set the current cluster status. This allows a LoadBalancer to map host name to a server.
- Specified by: updateClusterMetrics in interface LoadBalancer
- Overrides: updateClusterMetrics in class BaseLoadBalancer
-
updateBalancerTableLoadInfo
private void updateBalancerTableLoadInfo(TableName tableName, Map<ServerName, List<RegionInfo>> loadOfOneTable)
-
updateBalancerLoadInfo
Description copied from interface: LoadBalancer
In some scenarios, the balancer needs to update internal status or information according to the current table load.
- Parameters:
loadOfAllTable - region load of servers for all tables
-
updateMetricsSize
Update the number of metrics that are reported to JMX.
-
areSomeRegionReplicasColocated
-
getBalanceReason
-
needsBalance
-
nextAction
-
getRandomGenerator
Select the candidate generator to use based on the costs of the cost functions. The chance of selecting a candidate generator is proportional to the share of the total cost contributed by the cost functions that benefit from it.
-
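The proportional selection described for getRandomGenerator can be illustrated with a roulette-wheel pick over a weight array. This is a sketch under assumptions: the class name, the method signature, and the uniform fallback when all weights are zero are hypothetical, and how the balancer derives the weights themselves is not shown here.

```java
/** Hypothetical sketch of picking an index with probability proportional to its weight. */
final class WeightedPickSketch {

  /**
   * Picks an index from weights using r, a uniform random number in [0, 1).
   * Falls back to a uniform pick when every weight is zero.
   */
  static int pick(double[] weights, double r) {
    double sum = 0.0;
    for (double w : weights) {
      sum += w;
    }
    if (sum <= 0.0) {
      return (int) (r * weights.length); // no cost signal: choose uniformly
    }
    double target = r * sum;
    double cumulative = 0.0;
    for (int i = 0; i < weights.length; i++) {
      cumulative += weights[i];
      if (target < cumulative) {
        return i; // target falls inside this index's slice of the wheel
      }
    }
    return weights.length - 1; // guard against floating-point rounding
  }
}
```

A caller would pass something like new java.util.Random().nextDouble() as r; with weights {1, 3}, the second index is chosen for three quarters of the possible values of r.
-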
setRackManager
-
calculateMaxSteps
-
balanceTable
protected List<RegionPlan> balanceTable(TableName tableName, Map<ServerName, List<RegionInfo>> loadOfOneTable)
Given the cluster state, this will try to approach an optimal balance. This should always approach the optimal state given enough steps.
- Specified by: balanceTable in class BaseLoadBalancer
- Parameters:
tableName - the table to be balanced
loadOfOneTable - region load of servers for the specific table
- Returns:
List of plans
-
sendRejectionReasonToRingBuffer
protected void sendRejectionReasonToRingBuffer(Supplier<String> reason, List<CostFunction> costFunctions)
-
sendRegionPlansToRingBuffer
private void sendRegionPlansToRingBuffer(List<RegionPlan> plans, double currentCost, double initCost, String initFunctionTotalCosts, long step)
-
updateStochasticCosts
Update costs to JMX.
-
addCostFunction
-
functionCost
-
getCostFunctions
-
totalCostsPerFunc
-
createRegionPlans
Create all of the RegionPlans needed to move from the initial cluster state to the desired state.
- Parameters:
cluster - The state of the cluster
- Returns:
List of RegionPlans that represent the moves needed to get to the desired final state.
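Conceptually, this amounts to diffing the initial and final region-to-server assignments. The sketch below shows that idea on plain arrays; the class, the Move type, and the array encoding (index = region, value = server) are assumptions for illustration and not the BalancerClusterState or RegionPlan types.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: derive moves by diffing initial and final assignments. */
final class RegionMoveSketch {

  /** One region relocation from a source server to a destination server. */
  static final class Move {
    final int region;
    final int fromServer;
    final int toServer;

    Move(int region, int fromServer, int toServer) {
      this.region = region;
      this.fromServer = fromServer;
      this.toServer = toServer;
    }
  }

  /** Emits a Move for every region whose server differs between the two states. */
  static List<Move> createPlans(int[] initialServerOfRegion, int[] finalServerOfRegion) {
    if (initialServerOfRegion.length != finalServerOfRegion.length) {
      throw new IllegalArgumentException("both states must cover the same regions");
    }
    List<Move> moves = new ArrayList<>();
    for (int region = 0; region < initialServerOfRegion.length; region++) {
      int from = initialServerOfRegion[region];
      int to = finalServerOfRegion[region];
      if (from != to) {
        moves.add(new Move(region, from, to)); // unchanged regions need no plan
      }
    }
    return moves;
  }
}
```

Diffing only the end states means a region that was moved several times during the search still produces at most one move, and a region that ended up back where it started produces none.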
-
updateRegionLoad
Store the current region loads.
-
initCosts
-
updateCostsAndWeightsWithAction
Update both the costs of the cost functions and the weights of the candidate generators.
-
getCostFunctionNames
Get the names of the cost functions.
-
computeCost
This is the main cost function. It computes a cost associated with a proposed cluster state. All the different costs are combined with their multipliers to produce a single double cost.
- Parameters:
cluster - The state of the cluster
previousCost - the previous cost. This is used as an early out.
- Returns:
a double cost associated with the proposed cluster state. This cost is an aggregate of all individual cost functions.
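One way to read the "early out" is that the weighted sum stops as soon as it already exceeds the previous cost, since the proposed state can no longer win. The sketch below illustrates that reading with lazily evaluated costs; the class name and the use of DoubleSupplier are assumptions, not the method's actual signature or internals.

```java
import java.util.List;
import java.util.function.DoubleSupplier;

/** Hypothetical sketch of a weighted cost sum that stops once it cannot improve. */
final class EarlyOutCostSketch {

  /**
   * Adds multiplier * cost for each function in order and stops as soon as the
   * running total exceeds previousCost, skipping the remaining evaluations.
   */
  static double computeCost(List<DoubleSupplier> costs, float[] multipliers, double previousCost) {
    double total = 0.0;
    for (int i = 0; i < costs.size(); i++) {
      total += multipliers[i] * costs.get(i).getAsDouble();
      if (total > previousCost) {
        break; // already worse than the state we would replace
      }
    }
    return total;
  }
}
```

When the early out triggers, the returned value is only a lower bound on the full cost, which is enough for a comparison against previousCost but should not be reported as the state's true cost.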
-
composeAttributeName
A helper function to compose the attribute name from the table name and the cost function name.
-