Package org.apache.datasketches.tuple

The tuple package contains implementations of sketches that build on theta sketches by associating values with unique keys. Two families of tuple sketch classes are currently available: generic tuple sketches with a user-defined Summary, and a faster specialized implementation that uses an array of double values. See the unit tests for usage examples.
Author:
Alexander Saydakov
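The idea described above, a theta-style sample of hashed keys with a value attached to each retained key, can be illustrated with a small standalone sketch. This is not the library's implementation: the class ToyTupleSketch, its methods, the hash function, and the fixed sampling probability are all assumptions made for illustration (the real sketches adapt their threshold as they fill up, and use a user-defined Summary rather than a plain double sum).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy model of a tuple sketch; not part of org.apache.datasketches. */
final class ToyTupleSketch {
    // hash of key -> summary; here the summary is simply a running sum of values
    private final Map<Long, Double> entries = new HashMap<>();
    // fixed sampling threshold (assumption: real sketches lower theta adaptively)
    private final long thetaLong;

    ToyTupleSketch(double samplingProbability) {
        if (samplingProbability <= 0.0 || samplingProbability > 1.0) {
            throw new IllegalArgumentException("samplingProbability must be in (0, 1]");
        }
        this.thetaLong = (long) (samplingProbability * Long.MAX_VALUE);
    }

    /** Updates the summary for the key, if the key's hash falls below the threshold. */
    void update(String key, double value) {
        long hash = hash(key);
        if ((hash >>> 1) > thetaLong) {
            return; // key is not part of the sample
        }
        entries.merge(hash, value, Double::sum);
    }

    /** Estimated number of unique keys: retained entries scaled by 1/theta. */
    double getEstimate() {
        double theta = thetaLong / (double) Long.MAX_VALUE;
        return entries.size() / theta;
    }

    int getRetainedEntries() {
        return entries.size();
    }

    /** Returns the summary for the key, or null if the key was not retained. */
    Double getSummary(String key) {
        return entries.get(hash(key));
    }

    // Illustrative 64-bit hash (FNV-1a plus a mixing step); an assumption, not the library's hash.
    private static long hash(String key) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < key.length(); i++) {
            h ^= key.charAt(i);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        return h;
    }

    public static void main(String[] args) {
        ToyTupleSketch sketch = new ToyTupleSketch(1.0); // keep every key
        sketch.update("a", 1.0);
        sketch.update("b", 2.0);
        sketch.update("a", 3.0); // same key: summaries are combined, not duplicated
        System.out.println("estimate = " + sketch.getEstimate());
        System.out.println("summary(a) = " + sketch.getSummary("a"));
    }
}
```

With a sampling probability below 1.0, only a fraction of keys is retained and the estimate is scaled up accordingly, which is the essential trade-off that theta and tuple sketches make.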
  • Interface Summary 
    Interface Description
    Summary
    Interface for user-defined Summary, which is associated with every hash in a tuple sketch
    SummaryDeserializer<S extends Summary>
    Interface for deserializing user-defined Summary
    SummaryFactory<S extends Summary>
    Interface for user-defined SummaryFactory
    SummarySetOperations<S extends Summary>
    Provides methods for producing unions and intersections of two Summary objects.
    UpdatableSummary<U>
    Interface for updating user-defined Summary
  • Class Summary 
    Class Description
    AnotB<S extends Summary>
    Computes a set difference, A-AND-NOT-B, of two generic tuple sketches.
    CompactSketch<S extends Summary>
    CompactSketches are never created directly.
    DeserializeResult<T>
    Returns an object and its size in bytes as a result of a deserialize operation
    Filter<T extends Summary>
    Class for filtering entries from a Sketch given a Summary
    Intersection<S extends Summary>
    Computes an intersection of two or more generic tuple sketches or generic tuple sketches combined with theta sketches.
    JaccardSimilarity
    Jaccard similarity of two Tuple Sketches, or alternatively, of a Tuple and Theta Sketch.
    SerializerDeserializer
    Multipurpose serializer-deserializer for a collection of sketches defined by the enum.
    Sketch<S extends Summary>
    Equivalent to org.apache.datasketches.theta.Sketch, with the addition of a user-defined Summary object associated with every unique entry in the sketch.
    Sketches
    Convenient static methods to instantiate generic tuple sketches.
    SketchIterator<S extends Summary>
    Iterator over a generic tuple sketch
    Union<S extends Summary>
    Computes the union of two or more generic tuple sketches or generic tuple sketches combined with theta sketches.
    UpdatableSketch<U, S extends UpdatableSummary<U>>
    An extension of QuickSelectSketch<S>, which can be updated with many types of keys.
    UpdatableSketchBuilder<U, S extends UpdatableSummary<U>>
    For building a new generic tuple UpdatableSketch
    Util
    Common utility functions for Tuples
  • Enum Summary 
    Enum Description
    SerializerDeserializer.SketchType
    Defines the sketch classes that this SerializerDeserializer can handle.
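The set operations listed above (Union, Intersection, AnotB) differ from their theta counterparts mainly in that summaries of matching keys must be combined, which is the role SummarySetOperations describes. The following is a minimal conceptual sketch, assuming each sketch is reduced to a plain map from key hash to a double summary. The class ToySetOps and its methods are hypothetical, and the handling of theta (the real operations also reconcile sampling thresholds) is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Hypothetical illustration of summary-aware set operations; not the library API. */
final class ToySetOps {

    /** Keys from either input; summaries of keys present in both are merged. */
    static Map<Long, Double> union(Map<Long, Double> a, Map<Long, Double> b,
                                   BinaryOperator<Double> mergeSummaries) {
        Map<Long, Double> out = new HashMap<>(a);
        b.forEach((hash, summary) -> out.merge(hash, summary, mergeSummaries));
        return out;
    }

    /** Only keys present in both inputs; their summaries are merged. */
    static Map<Long, Double> intersection(Map<Long, Double> a, Map<Long, Double> b,
                                          BinaryOperator<Double> mergeSummaries) {
        Map<Long, Double> out = new HashMap<>();
        a.forEach((hash, summary) -> {
            Double other = b.get(hash);
            if (other != null) {
                out.put(hash, mergeSummaries.apply(summary, other));
            }
        });
        return out;
    }

    /** Keys of A that do not appear in B; A's summaries are kept unchanged. */
    static Map<Long, Double> aNotB(Map<Long, Double> a, Map<Long, Double> b) {
        Map<Long, Double> out = new HashMap<>(a);
        out.keySet().removeAll(b.keySet());
        return out;
    }

    public static void main(String[] args) {
        Map<Long, Double> a = new HashMap<>();
        a.put(1L, 1.0);
        a.put(2L, 2.0);
        Map<Long, Double> b = new HashMap<>();
        b.put(2L, 10.0);
        b.put(3L, 20.0);
        System.out.println("union        = " + union(a, b, Double::sum));
        System.out.println("intersection = " + intersection(a, b, Double::sum));
        System.out.println("A-not-B      = " + aNotB(a, b));
    }
}
```

Passing the merge rule as a function mirrors the design in which the policy for combining summaries (sum, min, max, and so on) is supplied by the user rather than fixed by the sketch.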