public abstract class KllSketch extends Object
Please refer to the documentation in the package-info:
org.apache.datasketches.kll
| Modifier and Type | Class and Description |
|---|---|
| `static class` | `KllSketch.SketchType` Used to define the variable type of the current instance of this class. |
| Modifier and Type | Field and Description |
|---|---|
| `static int` | `DEFAULT_K` The default value of K |
| `static int` | `MAX_K` The maximum value of K |
| Modifier and Type | Method and Description |
|---|---|
| `int` | `getCurrentCompactSerializedSizeBytes()` Returns the current number of bytes this sketch would require to store in the compact Memory Format. |
| `int` | `getCurrentUpdatableSerializedSizeBytes()` Returns the current number of bytes this sketch would require to store in the updatable Memory Format. |
| `abstract int` | `getK()` Returns the user configured parameter k |
| `static int` | `getKFromEpsilon(double epsilon, boolean pmf)` Gets the approximate value of k to use given epsilon, the normalized rank error. |
| `static int` | `getMaxSerializedSizeBytes(int k, long n)` Deprecated. Instead use getMaxSerializedSizeBytes(int, long, boolean) from the descendants of this class, or getMaxSerializedSizeBytes(int, long, SketchType, boolean) from this class. Version 3.2.0 |
| `static int` | `getMaxSerializedSizeBytes(int k, long n, KllSketch.SketchType sketchType, boolean updatableMemFormat)` Returns upper bound on the serialized size of a KllSketch given the following parameters. |
| `abstract long` | `getN()` Returns the length of the input stream in items. |
| `double` | `getNormalizedRankError(boolean pmf)` Gets the approximate rank error of this sketch normalized as a fraction between zero and one. |
| `static double` | `getNormalizedRankError(int k, boolean pmf)` Gets the normalized rank error given k and pmf. |
| `int` | `getNumRetained()` Returns the number of retained items (samples) in the sketch. |
| `int` | `getSerializedSizeBytes()` Returns the current number of bytes this Sketch would require if serialized. |
| `boolean` | `hasMemory()` Returns true if this sketch's data structure is backed by Memory or WritableMemory. |
| `boolean` | `isDirect()` Returns true if the backing resource is direct, i.e., actually allocated in off-heap memory. |
| `boolean` | `isEmpty()` Returns true if this sketch is empty. |
| `boolean` | `isEstimationMode()` Returns true if this sketch is in estimation mode. |
| `boolean` | `isMemoryUpdatableFormat()` Returns true if the backing WritableMemory is in updatable format. |
| `boolean` | `isReadOnly()` Returns true if this sketch is read only. |
| `boolean` | `isSameResource(org.apache.datasketches.memory.Memory that)` Returns true if the backing resource of this is identical with the backing resource of that. |
| `void` | `merge(KllSketch other)` Merges another sketch into this one. |
| `void` | `reset()` This resets the current sketch back to zero entries. |
| `byte[]` | `toByteArray()` Returns serialized sketch in a compact byte array form. |
| `String` | `toString()` |
| `String` | `toString(boolean withLevels, boolean withData)` Returns a summary of the sketch as a string. |
public static final int DEFAULT_K
public static final int MAX_K
public static int getKFromEpsilon(double epsilon,
boolean pmf)
epsilon - the normalized rank error between zero and one.
pmf - if true, this function returns the value of k assuming the input epsilon is the desired "double-sided" epsilon for the getPMF() function. Otherwise, this function returns the value of k assuming the input epsilon is the desired "single-sided" epsilon for all the other queries.
Please refer to the documentation in the package-info:
org.apache.datasketches.kll
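
The relationship between epsilon and k described above can be illustrated with a standalone sketch that does not use the sketch library. It assumes the rank error follows a power-law fit of the form eps = c / k^p. The class name `KllErrorFit`, the fit constants, and the bounds on k are assumptions made for illustration (none of them are stated on this page), so treat the numbers as indicative rather than as what `getKFromEpsilon` actually computes.

```java
public class KllErrorFit {
    // Assumed fit constants (not taken from this page): eps = C / k^P.
    static final double C_SINGLE = 2.296, P_SINGLE = 0.9723; // "single-sided" queries
    static final double C_PMF = 2.446, P_PMF = 0.9433;       // "double-sided" getPMF()

    // Assumed bounds on k, for illustration only.
    static final int ASSUMED_MIN_K = 8, ASSUMED_MAX_K = 65535;

    /** Fitted normalized rank error for a given k (decreases as k grows). */
    static double rankError(int k, boolean pmf) {
        return pmf ? C_PMF / Math.pow(k, P_PMF) : C_SINGLE / Math.pow(k, P_SINGLE);
    }

    /** Inverts the fit: rounds the exact solution up, then clamps to the assumed bounds. */
    static int kFromEpsilon(double epsilon, boolean pmf) {
        if (!(epsilon > 0.0 && epsilon < 1.0)) {
            throw new IllegalArgumentException("epsilon must be between zero and one");
        }
        double kExact = pmf
            ? Math.pow(C_PMF / epsilon, 1.0 / P_PMF)
            : Math.pow(C_SINGLE / epsilon, 1.0 / P_SINGLE);
        int k = (int) Math.ceil(kExact); // rounding up keeps the fitted error at or below epsilon
        return Math.max(ASSUMED_MIN_K, Math.min(ASSUMED_MAX_K, k));
    }

    public static void main(String[] args) {
        System.out.printf("k=200, single-sided eps: %.4f%n", rankError(200, false));
        System.out.printf("k=200, double-sided eps: %.4f%n", rankError(200, true));
        System.out.println("k for eps=0.01 (single-sided): " + kFromEpsilon(0.01, false));
        System.out.println("k for eps=0.01 (double-sided): " + kFromEpsilon(0.01, true));
    }
}
```

Under this fit, the same epsilon needs a larger k when `pmf` is true, because the double-sided error at a given k is larger than the single-sided one.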
@Deprecated public static int getMaxSerializedSizeBytes(int k, long n)
k - parameter that controls size of the sketch and accuracy of estimates
n - stream length
public static int getMaxSerializedSizeBytes(int k, long n, KllSketch.SketchType sketchType, boolean updatableMemFormat)
k - parameter that controls size of the sketch and accuracy of estimates
n - stream length
sketchType - either DOUBLES_SKETCH or FLOATS_SKETCH
updatableMemFormat - true if updatable Memory format, otherwise the standard compact format.
public static double getNormalizedRankError(int k, boolean pmf)
k - the configuration parameter
pmf - if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.
public final int getCurrentCompactSerializedSizeBytes()
public final int getCurrentUpdatableSerializedSizeBytes()
public abstract int getK()
public abstract long getN()
public final double getNormalizedRankError(boolean pmf)
pmf - if true, returns the "double-sided" normalized rank error for the getPMF() function.
Otherwise, it is the "single-sided" normalized rank error for all the other queries.
The epsilon value returned is a best fit to the 99th percentile of the empirically measured maximum error over thousands of trials.
Please refer to the documentation in the package-info:
org.apache.datasketches.kll
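
To make "normalized rank error" concrete: a normalized rank is a fraction of the stream, so an error of eps on a stream of n items corresponds to roughly eps * n positions in sorted order. The standalone sketch below uses plain Java and no sketch library; `RankErrorDemo` and its helpers are hypothetical names. It assumes the rank of a value is the fraction of items strictly less than it, which may differ from the convention the library uses.

```java
import java.util.Arrays;

public class RankErrorDemo {
    /** Exact normalized rank: fraction of items strictly less than v (assumed convention). */
    static double exactRank(double[] sorted, double v) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) { // binary search for the first index whose value is >= v
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < v) lo = mid + 1; else hi = mid;
        }
        return (double) lo / sorted.length;
    }

    /** Band that an estimate within +/- eps of trueRank would lie in, clipped to [0, 1]. */
    static double[] rankBand(double trueRank, double eps) {
        return new double[] { Math.max(0.0, trueRank - eps), Math.min(1.0, trueRank + eps) };
    }

    /** The same uncertainty expressed as a count of items instead of a fraction. */
    static long errorInItems(long n, double eps) {
        return Math.round(eps * n);
    }

    public static void main(String[] args) {
        double[] data = new double[1000];
        for (int i = 0; i < data.length; i++) data[i] = i; // already sorted: 0, 1, ..., 999

        double r = exactRank(data, 250.0);   // 250 of the 1000 items are below 250.0
        double[] band = rankBand(r, 0.01);   // example eps of 0.01
        System.out.println("exact rank = " + r);
        System.out.println("band       = " + Arrays.toString(band));
        System.out.println("eps in items for n = 1,000,000: " + errorInItems(1_000_000L, 0.01));
    }
}
```

Because the returned epsilon is a 99th-percentile fit, an individual estimate can occasionally fall outside such a band.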
public final int getNumRetained()
public int getSerializedSizeBytes()
public boolean hasMemory()
public boolean isDirect()
public final boolean isEmpty()
public final boolean isEstimationMode()
public final boolean isMemoryUpdatableFormat()
public final boolean isReadOnly()
public final boolean isSameResource(org.apache.datasketches.memory.Memory that)
that - A different non-null object
public final void merge(KllSketch other)
other - sketch to merge into this one
public final void reset()
public byte[] toByteArray()
public String toString(boolean withLevels, boolean withData)
withLevels - if true include information about levels
withData - if true include sketch data

Copyright © 2015–2020 The Apache Software Foundation. All rights reserved.