public abstract class KllSketch extends Object implements QuantilesAPI
KLL is an implementation of a very compact quantiles sketch with a lazy compaction scheme and nearly optimal accuracy per retained quantile.
Reference: Optimal Quantile Approximation in Streams.
The default k of 200 yields a "single-sided" epsilon of about 1.33% and a "double-sided" (PMF) epsilon of about 1.65%, with a confidence of 99%.
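The relationship between k and epsilon described above can be sketched in plain Java. This is a minimal illustration, not the library's implementation: the class name `KllErrorModel`, both method names, and the four fitted constants are assumptions that do not appear on this page. The constants were chosen so that k = 200 lands near the 1.33% and 1.65% figures quoted above. The inverse is also simplified: it omits any clamping of k to a minimum or to MAX_K.

```java
// Illustrative sketch only. The coefficients are ASSUMED values for an
// empirical power-law fit (epsilon is roughly COEF / k^EXP); they are not
// stated on this page and may differ from the library's internals.
final class KllErrorModel {
    static final double CDF_COEF = 2.296;  // assumed, "single-sided" queries
    static final double CDF_EXP  = 0.9723; // assumed
    static final double PMF_COEF = 2.446;  // assumed, "double-sided" (PMF) queries
    static final double PMF_EXP  = 0.9433; // assumed

    // Approximate normalized rank error (a fraction between 0 and 1) for a given k.
    static double normalizedRankError(int k, boolean pmf) {
        return pmf
            ? PMF_COEF / Math.pow(k, PMF_EXP)
            : CDF_COEF / Math.pow(k, CDF_EXP);
    }

    // Simplified inverse: the smallest k whose modeled error does not exceed epsilon.
    // No clamping to a minimum or maximum k is applied here.
    static int kFromEpsilon(double epsilon, boolean pmf) {
        double coef = pmf ? PMF_COEF : CDF_COEF;
        double exp  = pmf ? PMF_EXP  : CDF_EXP;
        return (int) Math.ceil(Math.pow(coef / epsilon, 1.0 / exp));
    }

    public static void main(String[] args) {
        System.out.printf("single-sided epsilon at k=200: %.4f%n",
            normalizedRankError(200, false));
        System.out.printf("double-sided epsilon at k=200: %.4f%n",
            normalizedRankError(200, true));
        System.out.println("k for 1% single-sided error: " + kFromEpsilon(0.01, false));
    }
}
```

In real code, prefer the static methods documented on this page, getNormalizedRankError(int k, boolean pmf) and getKFromEpsilon(double epsilon, boolean pmf), over a hand-rolled model like this one.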
See Also: QuantilesAPI

| Modifier and Type | Class and Description |
|---|---|
| static class | KllSketch.SketchType - Used to define the variable type of the current instance of this class. |

| Modifier and Type | Field and Description |
|---|---|
| static int | DEFAULT_K - The default K |
| static int | MAX_K - The maximum K |

| Modifier and Type | Method and Description |
|---|---|
| int | getCurrentCompactSerializedSizeBytes() - Deprecated. As of version 4.0.0, use getSerializedSizeBytes(). |
| int | getCurrentUpdatableSerializedSizeBytes() - Deprecated. As of version 4.0.0, use getSerializedSizeBytes(). |
| abstract int | getK() - Gets the user configured parameter k, which controls the accuracy of the sketch and its memory space usage. |
| static int | getKFromEpsilon(double epsilon, boolean pmf) - Gets the approximate k to use given epsilon, the normalized rank error. |
| static int | getMaxSerializedSizeBytes(int k, long n, KllSketch.SketchType sketchType, boolean updatableMemFormat) - Returns an upper bound on the serialized size of a KllSketch given the following parameters. |
| abstract long | getN() - Gets the length of the input stream. |
| double | getNormalizedRankError(boolean pmf) - Gets the approximate rank error of this sketch normalized as a fraction between zero and one. |
| static double | getNormalizedRankError(int k, boolean pmf) - Gets the normalized rank error given k and pmf. |
| int | getNumRetained() - Gets the number of quantiles retained by the sketch. |
| int | getSerializedSizeBytes() - Returns the current number of bytes this sketch would require if serialized. |
| boolean | hasMemory() - Returns true if this sketch's data structure is backed by Memory or WritableMemory. |
| boolean | isDirect() - Returns true if this sketch's data structure is off-heap (a.k.a., Direct or Native memory). |
| boolean | isEmpty() - Returns true if this sketch is empty. |
| boolean | isEstimationMode() - Returns true if this sketch is in estimation mode. |
| boolean | isMemoryUpdatableFormat() - Returns true if the backing WritableMemory is in updatable format. |
| boolean | isReadOnly() - Returns true if this sketch is read only. |
| boolean | isSameResource(org.apache.datasketches.memory.Memory that) - Returns true if the backing resource of this is identical with the backing resource of that. |
| void | merge(KllSketch other) - Merges another sketch into this one. |
| String | toString() - Returns a summary of the key parameters of the sketch. |
| String | toString(boolean withLevels, boolean withData) - Returns a summary of the sketch as a string. |

Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface QuantilesAPI: getRankLowerBound, getRankUpperBound, reset

public static final int DEFAULT_K

public static final int MAX_K

public static int getKFromEpsilon(double epsilon, boolean pmf)
epsilon - the normalized rank error between zero and one.
pmf - if true, this function returns the k assuming the input epsilon is the desired "double-sided" epsilon for the getPMF() function. Otherwise, this function returns k assuming the input epsilon is the desired "single-sided" epsilon for all the other queries.

public static int getMaxSerializedSizeBytes(int k, long n, KllSketch.SketchType sketchType, boolean updatableMemFormat)
k - parameter that controls size of the sketch and accuracy of estimates
n - stream length
sketchType - either DOUBLES_SKETCH or FLOATS_SKETCH
updatableMemFormat - true if updatable Memory format, otherwise the standard compact format.

public static double getNormalizedRankError(int k, boolean pmf)
k - the configuration parameter
pmf - if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.

@Deprecated public final int getCurrentCompactSerializedSizeBytes()
Deprecated. Use getSerializedSizeBytes().

@Deprecated public final int getCurrentUpdatableSerializedSizeBytes()
Deprecated. Use getSerializedSizeBytes().

public abstract int getK()
Specified by: getK in interface QuantilesAPI

public abstract long getN()
Specified by: getN in interface QuantilesAPI

public final double getNormalizedRankError(boolean pmf)
pmf - if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.

public final int getNumRetained()
Specified by: getNumRetained in interface QuantilesAPI

public int getSerializedSizeBytes()

public boolean hasMemory()
Specified by: hasMemory in interface QuantilesAPI

public boolean isDirect()
Specified by: isDirect in interface QuantilesAPI

public final boolean isEmpty()
Specified by: isEmpty in interface QuantilesAPI

public final boolean isEstimationMode()
Specified by: isEstimationMode in interface QuantilesAPI

public final boolean isMemoryUpdatableFormat()

public final boolean isReadOnly()
Specified by: isReadOnly in interface QuantilesAPI

public final boolean isSameResource(org.apache.datasketches.memory.Memory that)
that - A different non-null object

public final void merge(KllSketch other)
other - sketch to merge into this one

public final String toString()
Specified by: toString in interface QuantilesAPI
Overrides: toString in class Object

public String toString(boolean withLevels, boolean withData)
withLevels - if true include information about levels
withData - if true include sketch data

Copyright © 2015–2022 The Apache Software Foundation. All rights reserved.