Package org.apache.hadoop.metrics2.util
Class MetricSampleQuantiles
java.lang.Object
org.apache.hadoop.metrics2.util.MetricSampleQuantiles
Implementation of the Cormode, Korn, Muthukrishnan, and Srivastava algorithm for streaming
calculation of targeted high-percentile epsilon-approximate quantiles. This is a generalization
of the earlier work by Greenwald and Khanna (GK), which essentially allows different error bounds
on the targeted quantiles, which allows for far more efficient calculation of high-percentiles.
See: Cormode, Korn, Muthukrishnan, and Srivastava "Effective Computation of Biased Quantiles over
Data Streams" in ICDE 2005 Greenwald and Khanna, "Space-efficient online computation of quantile
summaries" in SIGMOD 2001
-
Nested Class Summary
Modifier and TypeClassDescriptionprivate static class
Describes a measured value passed to the estimator, tracking additional metadata required by the CKMS algorithm. -
Field Summary
Modifier and TypeFieldDescriptionprivate long[]
Buffers incoming items to be inserted in batch.private int
private long
Total number of items in streamprivate final MetricQuantile[]
Array of Quantiles that we care about, along with desired error.Current list of sampled items, maintained in sorted order with error bounds -
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionprivate double
allowableError
(int rank) Specifies the allowable error for this rank, depending on which quantiles are being targeted.void
clear()
Resets the estimator, clearing out all previously inserted itemsprivate void
compress()
Try to remove extraneous items from the set of sampled items.long
getCount()
Returns the number of items that the estimator has processedint
Returns the number of samples kept by the estimatorvoid
insert
(long v) Add a new value from the stream.private void
Merges items from buffer into the samples array in one pass.private long
query
(double quantile) Get the estimated value at the specified quantile.snapshot()
Get a snapshot of the current values of all the tracked quantiles.
-
Field Details
-
count
Total number of items in stream -
samples
Current list of sampled items, maintained in sorted order with error bounds -
buffer
Buffers incoming items to be inserted in batch. Items are inserted into the buffer linearly. When the buffer fills, it is flushed into the samples array in its entirety. -
bufferCount
-
quantiles
Array of Quantiles that we care about, along with desired error.
-
-
Constructor Details
-
MetricSampleQuantiles
-
-
Method Details
-
allowableError
Specifies the allowable error for this rank, depending on which quantiles are being targeted. This is the f(r_i, n) function from the CKMS paper. It's basically how wide the range of this rank can be. the index in the list of samples -
insert
Add a new value from the stream.- Parameters:
v
- the value to insert
-
insertBatch
Merges items from buffer into the samples array in one pass. This is more efficient than doing an insert on every item. -
compress
Try to remove extraneous items from the set of sampled items. This checks if an item is unnecessary based on the desired error bounds, and merges it with the adjacent item if it is. -
query
Get the estimated value at the specified quantile.- Parameters:
quantile
- Queried quantile, e.g. 0.50 or 0.99.- Returns:
- Estimated value at that quantile.
- Throws:
IOException
-
snapshot
Get a snapshot of the current values of all the tracked quantiles.- Returns:
- snapshot of the tracked quantiles if no items have been added to the estimator
- Throws:
IOException
-
getCount
Returns the number of items that the estimator has processed- Returns:
- count total number of items processed
-
getSampleCount
Returns the number of samples kept by the estimator- Returns:
- count current number of samples
-
clear
Resets the estimator, clearing out all previously inserted items
-