All Classes

Class Description
AnotB
Computes a set difference, A-AND-NOT-B, of two theta sketches.
AnotB<S extends Summary>
Computes a set difference, A-AND-NOT-B, of two generic tuple sketches.
ArrayOfBooleansSerDe
Methods of serializing and deserializing arrays of Boolean as a bit array.
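The bit-array idea behind this entry can be sketched with the standard library alone. The following is a minimal illustration, not the library's implementation: the class name `BooleanBitPacking`, the method names, and the least-significant-bit-first layout are all assumptions made for this example.

```java
import java.util.Arrays;

// Hypothetical illustration only; the real ArrayOfBooleansSerDe layout may differ.
final class BooleanBitPacking {

    // Packs each Boolean into one bit, least-significant bit first within each byte.
    static byte[] pack(Boolean[] items) {
        byte[] bytes = new byte[(items.length + 7) / 8];
        for (int i = 0; i < items.length; i++) {
            if (items[i]) {
                bytes[i >>> 3] |= (byte) (1 << (i & 7));
            }
        }
        return bytes;
    }

    // Reverses pack(); the item count must be supplied because the last byte may be padded.
    static Boolean[] unpack(byte[] bytes, int numItems) {
        Boolean[] items = new Boolean[numItems];
        for (int i = 0; i < numItems; i++) {
            items[i] = (bytes[i >>> 3] & (1 << (i & 7))) != 0;
        }
        return items;
    }

    public static void main(String[] args) {
        Boolean[] in = {true, false, true, true, false, false, false, false, true};
        byte[] packed = pack(in);
        System.out.println(packed.length + " bytes, round trip ok: "
            + Arrays.equals(in, unpack(packed, in.length)));
    }
}
```

Nine Booleans fit in two bytes here, which is the space saving a bit-array encoding is meant to provide.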
ArrayOfDoublesAnotB
Computes a set difference of two tuple sketches of type ArrayOfDoubles
ArrayOfDoublesCombiner
Combines two arrays of double values for use with ArrayOfDoubles tuple sketches
ArrayOfDoublesCompactSketch
Top level compact tuple sketch of type ArrayOfDoubles.
ArrayOfDoublesIntersection
Computes the intersection of two or more tuple sketches of type ArrayOfDoubles.
ArrayOfDoublesSerDe
Methods of serializing and deserializing arrays of Double.
ArrayOfDoublesSetOperationBuilder
Builds set operations object for tuple sketches of type ArrayOfDoubles.
ArrayOfDoublesSketch
The base class for the tuple sketch of type ArrayOfDoubles, where an array of double values is associated with each key.
ArrayOfDoublesSketches
Convenient static methods to instantiate tuple sketches of type ArrayOfDoubles.
ArrayOfDoublesSketchIterator
Interface for iterating over tuple sketches of type ArrayOfDoubles
ArrayOfDoublesUnion
The base class for unions of tuple sketches of type ArrayOfDoubles.
ArrayOfDoublesUpdatableSketch
The top level for updatable tuple sketches of type ArrayOfDoubles.
ArrayOfDoublesUpdatableSketchBuilder
For building a new ArrayOfDoublesUpdatableSketch
ArrayOfItemsSerDe<T>
Base class for serializing and deserializing custom types.
ArrayOfLongsSerDe
Methods of serializing and deserializing arrays of Long.
ArrayOfNumbersSerDe
Methods of serializing and deserializing arrays of the object version of primitive types of Number.
ArrayOfStringsSerDe
Methods of serializing and deserializing arrays of String.
ArrayOfStringsSketch  
ArrayOfStringsSummary  
ArrayOfStringsSummaryDeserializer  
ArrayOfStringsSummaryFactory  
ArrayOfStringsSummarySetOperations  
ArrayOfUtf16StringsSerDe
Methods of serializing and deserializing arrays of String.
BinarySearch
Contains common equality binary search algorithms.
BinomialBoundsN
This class enables the estimation of error bounds given a sample set size, the sampling probability theta, the number of standard deviations and a simple noDataSeen flag.
BoundsOnBinomialProportions
Confidence intervals for binomial proportions.
BoundsOnRatiosInSampledSets
This class is used to compute the bounds on the estimate of the ratio |B| / |A|, where: |A| is the unknown size of a set A of unique identifiers. |B| is the unknown size of a subset B of A. a = |S_A| is the observed size of a sample of A that was obtained by Bernoulli sampling with a known inclusion probability f. b = |S_A ∩ B| is the observed size of a subset of S_A.
BoundsOnRatiosInThetaSketchedSets
This class is used to compute the bounds on the estimate of the ratio B / A, where: A is a Theta Sketch of population PopA. B is a Theta Sketch of population PopB that is a subset of A, obtained by an intersection of A with some other Theta Sketch C, which acts like a predicate or selection clause. The estimate of the ratio PopB/PopA is BoundsOnRatiosInThetaSketchedSets.getEstimateOfBoverA(A, B). The Upper Bound estimate on the ratio PopB/PopA is BoundsOnRatiosInThetaSketchedSets.getUpperBoundForBoverA(A, B). The Lower Bound estimate on the ratio PopB/PopA is BoundsOnRatiosInThetaSketchedSets.getLowerBoundForBoverA(A, B). Note: The theta of A cannot be greater than the theta of B.
BoundsOnRatiosInTupleSketchedSets
This class is used to compute the bounds on the estimate of the ratio B / A, where: A is a Tuple Sketch of population PopA. B is a Tuple or Theta Sketch of population PopB that is a subset of A, obtained by an intersection of A with some other Tuple or Theta Sketch C, which acts like a predicate or selection clause. The estimate of the ratio PopB/PopA is BoundsOnRatiosInTupleSketchedSets.getEstimateOfBoverA(A, B). The Upper Bound estimate on the ratio PopB/PopA is BoundsOnRatiosInTupleSketchedSets.getUpperBoundForBoverA(A, B). The Lower Bound estimate on the ratio PopB/PopA is BoundsOnRatiosInTupleSketchedSets.getLowerBoundForBoverA(A, B). Note: The theta of A cannot be greater than the theta of B.
ByteArrayUtil
Useful methods for byte arrays.
CompactDoublesSketch  
CompactSketch
The parent class of all the CompactSketches.
CompactSketch<S extends Summary>
CompactSketches are never created directly.
CompressionCharacterization
This code is used both by unit tests, for short running tests, and by the characterization repository for longer running, more exhaustive testing.
CpcSketch
This is a unique-counting sketch that implements the Compressed Probabilistic Counting (CPC, a.k.a. FM85) algorithms developed by Kevin Lang in his paper "Back to the Future: an Even More Nearly Optimal Cardinality Estimation Algorithm".
CpcUnion
The union (merge) operation for the CPC sketches.
CpcWrapper
This provides a read-only view of a serialized image of a CpcSketch, which can be on-heap or off-heap represented as a Memory object, or on-heap represented as a byte array.
DeserializeResult<T>
Returns an object and its size in bytes as a result of a deserialize operation
DoubleSketch  
DoublesSketch
This is a stochastic streaming sketch that enables near-real time analysis of the approximate distribution of real values from a very large stream in a single pass.
DoublesSketchBuilder
For building a new quantiles DoublesSketch.
DoublesSketchIterator
Iterator over DoublesSketch.
DoubleSummary
Summary for generic tuple sketches of type Double.
DoubleSummary.Mode
The aggregation modes for this Summary
DoubleSummaryDeserializer  
DoubleSummaryFactory
Factory for DoubleSummary.
DoubleSummarySetOperations
Methods for defining how unions and intersections of two objects of type DoubleSummary are performed.
DoublesUnion
The API for Union operations for quantiles DoublesSketches
DoublesUnionBuilder
For building a new DoublesSketch Union operation.
ErrorType
Specifies one of two types of error regions of the statistical classification Confusion Matrix that can be excluded from a returned sample of Frequent Items.
Family
Defines the various families of sketch and set operation classes.
FdtSketch
A Frequent Distinct Tuples sketch.
Filter<T extends Summary>
Class for filtering entries from a Sketch given a Summary
GenericInequalitySearch
This provides efficient, unique and unambiguous binary searching for inequalities for ordered arrays of values that may include duplicate values.
GenericInequalitySearch.Inequality
The enumerator of inequalities
Group
Defines a Group from a Frequent Distinct Tuple query.
HashIterator
This is used to iterate over the retained hash values of the Theta sketch.
HashOperations
Helper class for the common hash table methods.
HllSketch
This is a high-performance implementation of Philippe Flajolet’s HLL sketch but with significantly improved error behavior.
InequalitySearch
This provides efficient, unique and unambiguous binary searching for inequality comparison criteria for ordered arrays of values that may include duplicate values.
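To make "unique and unambiguous" concrete when duplicates are present, here is a minimal sketch of two inequality searches over a sorted `double[]`. The class `InequalitySearchDemo` and its method names are hypothetical and are not the library's API; the point is only that each query has exactly one correct index even when the target value repeats.

```java
// Hypothetical illustration only; not the library's InequalitySearch API.
final class InequalitySearchDemo {

    // Returns the lowest index i with arr[i] >= v, or -1 if no element qualifies.
    // arr must be sorted in ascending order and may contain duplicates.
    static int lowestIndexGE(double[] arr, double v) {
        int lo = 0;
        int hi = arr.length; // search window is [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (arr[mid] >= v) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo < arr.length ? lo : -1;
    }

    // Returns the highest index i with arr[i] < v, or -1 if no element qualifies.
    static int highestIndexLT(double[] arr, double v) {
        int ge = lowestIndexGE(arr, v);
        return ge == -1 ? arr.length - 1 : ge - 1;
    }

    public static void main(String[] args) {
        double[] arr = {1, 2, 2, 2, 5};
        System.out.println("lowest index >= 2: " + lowestIndexGE(arr, 2));
        System.out.println("highest index < 2: " + highestIndexLT(arr, 2));
    }
}
```

With the run of three 2s in the example array, the "greater than or equal" query lands on the first 2 and the "less than" query lands on the element just before the run, so the result never depends on which duplicate the search happens to hit.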
IntegerSketch  
IntegerSummary
Summary for generic tuple sketches of type Integer.
IntegerSummary.Mode
The aggregation modes for this Summary
IntegerSummaryDeserializer  
IntegerSummaryFactory
Factory for IntegerSummary.
IntegerSummarySetOperations
Methods for defining how unions and intersections of two objects of type IntegerSummary are performed.
Intersection
The API for intersection operations
Intersection<S extends Summary>
Computes an intersection of two or more generic tuple sketches or generic tuple sketches combined with theta sketches.
IntMemoryPairIterator
Iterates within a given Memory extracting integer pairs.
ItemsSketch<T>
This sketch is useful for tracking approximate frequencies of items of type <T> with optional associated counts (<T> item, long count) that are members of a multiset of such items.
ItemsSketch<T>
This is a stochastic streaming sketch that enables near-real time analysis of the approximate distribution of comparable items from a very large stream in a single pass.
ItemsSketch.Row<T>
Row class that defines the return values from a getFrequentItems query.
ItemsSketchIterator<T>
Iterator over ItemsSketch.
ItemsUnion<T>
The API for Union operations for generic ItemsSketches
JaccardSimilarity
Jaccard similarity of two Theta Sketches.
JaccardSimilarity
Jaccard similarity of two Tuple Sketches, or alternatively, of a Tuple and Theta Sketch.
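For reference, the quantity that both JaccardSimilarity classes estimate is |A ∩ B| / |A ∪ B|. The sketch below computes it exactly on small in-memory sets using only `java.util`; it does not use sketches and does not produce error bounds. The class name `ExactJaccard` and the convention that two empty sets have similarity 1.0 are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: exact Jaccard on small sets, no sketches involved.
final class ExactJaccard {

    // Returns |A ∩ B| / |A ∪ B|.
    static <T> double jaccard(Set<T> a, Set<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // assumed convention: two empty sets are treated as identical
        }
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new HashSet<>(Arrays.asList(2, 3, 4));
        System.out.println("J(A, B) = " + jaccard(a, b));
    }
}
```

A sketch-based estimate trades this exactness for bounded memory, which is why it is typically reported together with lower and upper bounds.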
KllFloatsSketch
Implementation of a very compact quantiles sketch with lazy compaction scheme and nearly optimal accuracy per retained item.
KllFloatsSketchIterator
Iterator over KllFloatsSketch.
LongsSketch
This sketch is useful for tracking approximate frequencies of long items with optional associated counts (long item, long count) that are members of a multiset of such items.
LongsSketch.Row
Row class that defines the return values from a getFrequentItems query.
MergingValidation
This code is used both by unit tests, for short running tests, and by the characterization repository for longer running, more exhaustive testing.
MurmurHash3
The MurmurHash3 is a fast, non-cryptographic, 128-bit hash function that has excellent avalanche and 2-way bit independence properties.
MurmurHash3Adaptor
A general purpose wrapper for the MurmurHash3.
PairwiseSetOperations
Deprecated. v2.0.0.
PostProcessor
This processes the contents of a FDT sketch to extract the primary keys with the most frequent unique combinations of the non-primary dimensions.
QuantilesHelper
Common static methods for quantiles sketches
QuickMergingValidation
This code is used both by unit tests, for short running tests, and by the characterization repository for longer running, more exhaustive testing.
QuickSelect
QuickSelect algorithm improved from Sedgewick.
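As background, a textbook quickselect finds the k-th smallest element without fully sorting the array. The following is a minimal generic version with a middle pivot and Hoare-style partitioning; it is not the library's improved variant, and the class name `QuickSelectDemo` is hypothetical.

```java
// Hypothetical illustration only: a textbook quickselect, not the library's QuickSelect.
final class QuickSelectDemo {

    // Partially reorders arr in place and returns its k-th smallest value (k is zero-based).
    static long select(long[] arr, int k) {
        int lo = 0;
        int hi = arr.length - 1;
        while (lo < hi) {
            long pivot = arr[lo + (hi - lo) / 2];
            int i = lo;
            int j = hi;
            // Hoare-style partition: values <= pivot move left, values >= pivot move right.
            while (i <= j) {
                while (arr[i] < pivot) { i++; }
                while (arr[j] > pivot) { j--; }
                if (i <= j) {
                    long tmp = arr[i];
                    arr[i] = arr[j];
                    arr[j] = tmp;
                    i++;
                    j--;
                }
            }
            if (k <= j) {
                hi = j;        // target lies in the left part
            } else if (k >= i) {
                lo = i;        // target lies in the right part
            } else {
                return arr[k]; // target sits between the two parts and equals the pivot
            }
        }
        return arr[k];
    }

    public static void main(String[] args) {
        long[] data = {9, 1, 8, 2, 7, 3};
        System.out.println("third smallest: " + select(data, 2));
    }
}
```

Only the side of the partition that contains index k is revisited, which gives linear expected time instead of the O(n log n) of a full sort.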
ReqDebug
The signaling interface that allows comprehensive analysis of the ReqSketch and ReqCompactor while eliminating code clutter in the main classes.
ReqIterator
Iterator over all retained items of the ReqSketch.
ReqSketch
This Relative Error Quantiles Sketch is the Java implementation based on the paper "Relative Error Streaming Quantiles", https://arxiv.org/abs/2004.01668, and loosely derived from a Python prototype written by Pavel Vesely.
ReqSketchBuilder
For building a new ReqSketch
ReservoirItemsSketch<T>
This sketch provides a reservoir sample over an input stream of items.
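The basic technique behind a reservoir sample can be sketched as the classic "Algorithm R": keep the first k items, then replace a random slot with probability k/n when the n-th item arrives. The class below is a hypothetical stand-alone illustration (`ReservoirDemo`, its constructor, and its methods are assumptions for this example) and does not reflect the library's API, serialization, or union support.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only: classic Algorithm R, not the library's ReservoirItemsSketch.
final class ReservoirDemo<T> {
    private final int k;           // maximum number of retained samples
    private final List<T> samples; // current reservoir
    private final Random rng;
    private long n;                // number of items seen so far

    ReservoirDemo(int k, long seed) {
        this.k = k;
        this.samples = new ArrayList<>(k);
        this.rng = new Random(seed);
    }

    void update(T item) {
        n++;
        if (samples.size() < k) {
            samples.add(item); // fill phase: keep everything until the reservoir is full
            return;
        }
        long j = (long) (rng.nextDouble() * n); // approximately uniform in [0, n)
        if (j < k) {
            samples.set((int) j, item); // happens with probability k/n
        }
    }

    List<T> getSamples() { return new ArrayList<>(samples); }

    long getN() { return n; }

    public static void main(String[] args) {
        ReservoirDemo<Integer> r = new ReservoirDemo<>(5, 42L);
        for (int i = 0; i < 1000; i++) { r.update(i); }
        System.out.println("kept " + r.getSamples().size() + " of " + r.getN() + " items");
    }
}
```

Every item seen so far has the same k/n chance of being in the reservoir at any point, so the sample stays uniform without knowing the stream length in advance.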
ReservoirItemsUnion<T>
Class to union reservoir samples of generic items.
ReservoirLongsSketch
This sketch provides a reservoir sample over an input stream of longs.
ReservoirLongsUnion
Class to union reservoir samples of longs.
ResizeFactor
For the Families that accept this configuration parameter, it controls the size multiple that determines how fast the internal cache grows when more space is required.
SampleSubsetSummary
A simple object to capture the results of a subset sum query on a sampling sketch.
SerializerDeserializer
Multipurpose serializer-deserializer for a collection of sketches defined by the enum.
SerializerDeserializer.SketchType
Defines the sketch classes that this SerializerDeserializer can handle.
SetOperation
The parent API for all Set Operations
SetOperationBuilder
For building a new SetOperation.
Sketch
The top-level class for all sketches.
Sketch<S extends Summary>
This is an equivalent to org.apache.datasketches.theta.Sketch with addition of a user-defined Summary object associated with every unique entry in the sketch.
Sketches
This class brings together the common sketch and set operation creation methods and the public static methods into one place.
Sketches
Convenient static methods to instantiate generic tuple sketches.
SketchesArgumentException
Illegal Arguments Exception class for the library
SketchesException
Exception class for the library
SketchesReadOnlyException
Write operation attempted on a read-only class.
SketchesStateException
Illegal State Exception class for the library
SketchIterator<S extends Summary>
Iterator over a generic tuple sketch
StreamingValidation
This code is used both by unit tests, for short running tests, and by the characterization repository for longer running, more exhaustive testing.
Summary
Interface for user-defined Summary, which is associated with every hash in a tuple sketch
SummaryDeserializer<S extends Summary>
Interface for deserializing user-defined Summary
SummaryFactory<S extends Summary>
Interface for user-defined SummaryFactory
SummarySetOperations<S extends Summary>
This is to provide methods of producing unions and intersections of two Summary objects.
TestUtil  
TgtHllType
Specifies the target type of HLL sketch to be created.
Union
This performs union operations for all HllSketches.
Union
Compute the union of two or more theta sketches.
Union<S extends Summary>
Compute the union of two or more generic tuple sketches or generic tuple sketches combined with theta sketches.
UniqueCountMap
This is a real-time, key-value HLL mapping sketch that tracks approximate unique counts of identifiers (the values) associated with each key.
UpdatableSketch<U, S extends UpdatableSummary<U>>
An extension of QuickSelectSketch<S>, which can be updated with many types of keys.
UpdatableSketchBuilder<U, S extends UpdatableSummary<U>>
For building a new generic tuple UpdatableSketch
UpdatableSummary<U>
Interface for updating user-defined Summary
UpdateDoublesSketch  
UpdateReturnState
UpdateSketch
The parent class for the Update Sketch families, such as QuickSelect and Alpha.
UpdateSketchBuilder
For building a new UpdateSketch.
Util
Common utility functions for Tuples
Util
Common utility functions.
VarOptItemsSamples<T>
This class provides access to the samples contained in a VarOptItemsSketch.
VarOptItemsSketch<T>
This sketch provides a variance optimal sample over an input stream of weighted items.
VarOptItemsUnion<T>
Provides a unioning operation over varopt sketches.
XxHash
The XxHash is a fast, non-cryptographic, 64-bit hash function that has excellent avalanche and 2-way bit independence properties.