class documentation

class Vector: (source)

A vector is used to construct the vector space of documents and queries. These vectors support operations to determine the similarity between two documents, or between a document and a query.

Normally no parameters are required to initialize a vector, but when loading a previously dumped vector the raw elements can be passed to the constructor.

For performance reasons, vectors are implemented as a flat array in which each element's index is immediately followed by its value, e.g. [index, value, index, value]. This allows the underlying array to be as sparse as possible while still offering decent performance in vector calculations.

TODO: consider implementation as 2-tuples.
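
The flat layout described above can be illustrated with a small standalone sketch. The helper names (to_flat, indexes, values) are hypothetical and not part of this class, and keeping the pairs sorted by index is an assumption suggested by position_for_index rather than something this page states.

```python
def to_flat(pairs):
    """Flatten (index, value) pairs into [index, value, index, value, ...].

    Assumption: pairs are kept sorted by index.
    """
    flat = []
    for index, value in sorted(pairs):
        flat.extend((index, value))
    return flat


def indexes(flat):
    """Even positions hold the element indexes."""
    return flat[0::2]


def values(flat):
    """Odd positions hold the element values."""
    return flat[1::2]


flat = to_flat([(7, 0.5), (2, 1.5)])
print(flat)           # [2, 1.5, 7, 0.5]
print(indexes(flat))  # [2, 7]
print(values(flat))   # [1.5, 0.5]
```

Only the non-zero entries are stored, so a vector over a large vocabulary with two populated dimensions occupies four array slots.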

Method __init__ Undocumented
Method __iter__ Undocumented
Method __repr__ Undocumented
Method dot Calculates the dot product of this vector and another vector.
Method insert Inserts an element at an index within the vector.
Method position_for_index Calculates the position within the vector to insert a given index.
Method serialize Undocumented
Method similarity Calculates the cosine similarity between this vector and another vector.
Method to_list Converts the vector to an array of the elements within the vector.
Method upsert Inserts or updates an existing index within the vector.
Instance Variable elements Undocumented
Property magnitude Undocumented
Instance Variable _magnitude Undocumented
def __init__(self, elements=None): (source)

Undocumented

def __iter__(self): (source)

Undocumented

def __repr__(self): (source)

Undocumented

def dot(self, other): (source)

Calculates the dot product of this vector and another vector.
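
As a rough illustration, a dot product over two flat [index, value, ...] arrays can be computed by walking both arrays in step. This is a standalone sketch, not the method's actual source: sparse_dot is a hypothetical name, and both inputs are assumed to be sorted by index.

```python
def sparse_dot(a, b):
    """Dot product of two flat [index, value, ...] arrays sorted by index."""
    total = 0
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 2  # index present only in a, contributes nothing
        elif a[i] > b[j]:
            j += 2  # index present only in b, contributes nothing
        else:
            total += a[i + 1] * b[j + 1]  # shared index: multiply the values
            i += 2
            j += 2
    return total


# Only index 2 is shared, so the result is 3 * 4.
print(sparse_dot([0, 1, 2, 3], [2, 4, 5, 6]))  # 12
```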

def insert(self, insert_index, val): (source)

Inserts an element at an index within the vector. Duplicates are not allowed: an error is raised if there is already an entry for this index.
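
The duplicate check can be sketched as follows on a plain flat list. This is an illustration only: the standalone insert function and the use of ValueError are assumptions, and the exception the class actually raises is not stated on this page.

```python
from bisect import bisect_left


def insert(flat, insert_index, val):
    """Insert (insert_index, val) into a sorted flat array, rejecting duplicates."""
    # Each entry occupies two slots, so scale the entry position by 2.
    position = 2 * bisect_left(flat[0::2], insert_index)
    if position < len(flat) and flat[position] == insert_index:
        # Hypothetical error type, chosen for the sketch.
        raise ValueError(f"duplicate index: {insert_index}")
    flat[position:position] = [insert_index, val]


flat = []
insert(flat, 5, 1.0)
insert(flat, 2, 3.0)
print(flat)  # [2, 3.0, 5, 1.0]
```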

def position_for_index(self, index): (source)

Calculates the position within the vector at which to insert a given index. This is used internally by insert and upsert. If an entry for the index already exists, the returned position is the one at which that entry's value would be updated, but it is the caller's responsibility to check whether there is a duplicate at that position.
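
One way to picture this lookup is a binary search over the even slots of the flat array. The sketch below uses the standard library's bisect_left on a plain list. It is not the method's actual source, and it assumes the array is sorted by index.

```python
from bisect import bisect_left


def position_for_index(flat, index):
    """Return the flat-array position where `index` is, or should be inserted."""
    # Search only the index slots, then scale back to flat-array positions.
    return 2 * bisect_left(flat[0::2], index)


flat = [1, 10, 4, 20, 9, 30]        # indexes 1, 4, 9
print(position_for_index(flat, 4))   # 2 (existing entry, caller must check)
print(position_for_index(flat, 5))   # 4 (would go between 4 and 9)
print(position_for_index(flat, 10))  # 6 (append at the end)
```

The caller can tell the two cases apart by checking whether the slot at the returned position already holds the requested index.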

def serialize(self): (source)

Undocumented

def similarity(self, other): (source)

Calculates the cosine similarity between this vector and another vector.
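
For reference, the textbook definition of cosine similarity is the dot product divided by the product of both magnitudes. The sketch below applies that definition to flat [index, value, ...] arrays. The function name is hypothetical, and the method's actual normalization and zero-magnitude handling may differ from this.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Textbook cosine similarity of two flat [index, value, ...] arrays."""
    a_map = dict(zip(a[0::2], a[1::2]))
    b_map = dict(zip(b[0::2], b[1::2]))
    # Only indexes present in both vectors contribute to the dot product.
    dot = sum(v * b_map[i] for i, v in a_map.items() if i in b_map)
    mag_a = sqrt(sum(v * v for v in a_map.values()))
    mag_b = sqrt(sum(v * v for v in b_map.values()))
    if mag_a == 0 or mag_b == 0:
        return 0.0  # assumption: treat an empty vector as dissimilar
    return dot / (mag_a * mag_b)


print(cosine_similarity([0, 3, 1, 4], [0, 3, 1, 4]))  # 1.0
print(cosine_similarity([0, 1], [1, 1]))              # 0.0
```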

def to_list(self): (source)

Converts the vector to an array of the elements within the vector.

def upsert(self, insert_index, val, fn=None): (source)

Inserts or updates an existing index within the vector.

Args:
- insert_index (int): The index at which the element should be inserted.
- val (int|float): The value to be inserted into the vector.
- fn (callable, optional): An optional callable taking two arguments, the current value and the passed value, used to generate the final value stored at that position in case of a collision.
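
The role of fn is easiest to see in a small standalone sketch that operates on a plain flat list. This is not the method's actual source: the array is assumed to be sorted by index, and overwriting the stored value when fn is omitted is an assumption, since the default behavior is not stated here.

```python
from bisect import bisect_left


def upsert(flat, insert_index, val, fn=None):
    """Insert (insert_index, val), or merge val into an existing entry via fn."""
    if fn is None:
        # Assumption: without fn, the passed value replaces the current one.
        def fn(current, passed):
            return passed
    position = 2 * bisect_left(flat[0::2], insert_index)
    if position < len(flat) and flat[position] == insert_index:
        # Collision: let fn decide the final value.
        flat[position + 1] = fn(flat[position + 1], val)
    else:
        flat[position:position] = [insert_index, val]
    return flat


flat = []
upsert(flat, 3, 1.0)
upsert(flat, 1, 2.0)
upsert(flat, 3, 0.5, lambda current, passed: current + passed)
print(flat)  # [1, 2.0, 3, 1.5]
```

Passing an accumulating fn, as in the last call, is useful when several contributions to the same index need to be summed rather than replaced.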

elements = (source)

Undocumented

@property
magnitude = (source)

Undocumented

_magnitude: int = (source)

Undocumented