More "julian" idioms or keep it closer to HDC/VSA nomenclature? #14

@cvigilv

I was looking at the current implementation of the vector types and their associated functions and noticed that some operations could be better modeled by overloading Base Julia functions. For example, instead of having something like empty_vector(T::AbstractHDV) we could have zeros(T::AbstractHDV). I compiled a list of things we could implement to make the package more "julian", but I'm unsure whether this is better.

Here is the list of the ones that come to mind, which I can tackle once the refactor is done, if it makes sense:

  • Base.zeros(T::Type{<:AbstractHDV}, N::Int=10_000) for empty hypervector initialization
  • Base.ones(T::Type{<:AbstractHDV}, N::Int=10_000) for filled hypervector initialization
  • Base.inv(h::AbstractHDV) to invert the hypervector
  • Base.isequal(h::AbstractHDV, u::AbstractHDV) for comparing equality
  • Base.hash(h::AbstractHDV) for hashing hypervectors (this is necessary for some things)
  • Base.isapprox(u::T, v::T) where T<:AbstractHDV to check if hypervectors are similar (unlocks the ≈ operator)

If you have an idea for more, please add them to the list.
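To make the list concrete, here is a minimal sketch of those overloads on a stand-in type. Everything in it is an assumption for illustration: `ToyBipolarHDV` is a hypothetical type (only the field name `v` comes from this issue), `similarity` is assumed to be cosine similarity, `inv` is assumed to mean the inverse under binding (which for elementwise multiplication of ±1 vectors is the vector itself), and the `nsd` keyword is made up. It is not the package's current API.

```julia
using Random: AbstractRNG, default_rng

# Hypothetical stand-in for the package's bipolar hypervector type.
struct ToyBipolarHDV
    v::Vector{Int8}
end

# Random hypervector with entries drawn uniformly from {-1, +1}.
ToyBipolarHDV(n::Integer=10_000; rng::AbstractRNG=default_rng()) =
    ToyBipolarHDV(rand(rng, Int8[-1, 1], n))

Base.length(h::ToyBipolarHDV) = length(h.v)

# Initialization dispatches on the *type*, mirroring zeros(Float64, n).
Base.zeros(::Type{ToyBipolarHDV}, n::Integer=10_000) = ToyBipolarHDV(zeros(Int8, n))
Base.ones(::Type{ToyBipolarHDV}, n::Integer=10_000) = ToyBipolarHDV(ones(Int8, n))

# Binding as elementwise multiplication (assumed semantics of `*`).
Base.:*(a::ToyBipolarHDV, b::ToyBipolarHDV) = ToyBipolarHDV(a.v .* b.v)

# Assumption: inverse under binding. Since (+-1)^2 == 1, each vector is its own inverse.
Base.inv(h::ToyBipolarHDV) = h

# Exact equality compares contents; isequal falls back to == by default.
Base.:(==)(a::ToyBipolarHDV, b::ToyBipolarHDV) = a.v == b.v

# Two-argument hash is the form Base expects, so Dict and Set work.
Base.hash(h::ToyBipolarHDV, seed::UInt) = hash(h.v, hash(:ToyBipolarHDV, seed))

# Cosine similarity; widen to Int first so the Int8 products cannot overflow.
similarity(a::ToyBipolarHDV, b::ToyBipolarHDV) =
    sum(Int(x) * Int(y) for (x, y) in zip(a.v, b.v)) / length(a)

# "Similar" means more than `nsd` standard deviations above the mean similarity
# of two random hypervectors (mean 0, sd 1/sqrt(N)).
Base.isapprox(a::ToyBipolarHDV, b::ToyBipolarHDV; nsd::Real=3) =
    similarity(a, b) > nsd / sqrt(length(a))

a, b = ToyBipolarHDV(), ToyBipolarHDV()
a ≈ a                 # true: similarity is exactly 1
a * b * inv(b) == a   # true: unbinding recovers a exactly
```

One design note: defining `==` plus the two-argument `hash` is usually enough, because `isequal` defaults to `==` and the one-argument `hash(x)` forwards to `hash(x, zero(UInt))`.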

Why do I think this is important? Mainly to make the package friendlier and more idiomatic to use. I think aiming to keep this as abstract as possible would be nice, since we could then write pipelines as simple algebra. For example, let's replicate the "What's the dollar of Mexico?" exercise that Pentti Kanerva proposed:

using HyperdimensionalComputing

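# Some missing things (extend the Base functions so that ≈ and Dict/Set lookups dispatch here)
# Threshold assumes similarity is cosine: two random hypervectors have mean 0 and sd 1/sqrt(N)
Base.isapprox(u::T, v::T) where T<:BipolarHDV = similarity(u, v) > 3/sqrt(length(v)) # mean + 3*sd
Base.hash(u::AbstractHDV, h::UInt) = hash(u.v, hash(typeof(u), h))
# Compare contents directly; comparing hashes could report equality on a collision
Base.isequal(u::T, v::T) where T<:AbstractHDV = isequal(u.v, v.v)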

# Holistic representation of countries
# 1. Concept hypervectors
COUNTRY = BipolarHDV()
CAPITAL = BipolarHDV()
MONEY = BipolarHDV()

# 2. Values hypervectors
USA = BipolarHDV()
MEX = BipolarHDV()

WDC = BipolarHDV()
MXC = BipolarHDV()

DOL = BipolarHDV()
PES = BipolarHDV()

# 3. Country representations
USTATES = (COUNTRY * USA) + (CAPITAL * WDC) + (MONEY * DOL)
MEXICO  = (COUNTRY * MEX) + (CAPITAL * MXC) + (MONEY * PES)

# Are USA and Mexico similar?
USTATES ≈ MEXICO # returns: false

# If we now pair USTATES with MEXICO, we get a bundle that pairs USA with Mexico, Washington DC
# with Mexico City, and dollar with peso, plus noise.
# This is, in principle, a mapping code from USA <-> Mexico, which can be used to extract 
# information using similarity
F_UM = USTATES * MEXICO

# What in Mexico corresponds to United States' dollar:
DOL * F_UM ≈ PES # returns: true

# Let's add Sweden
SWE = BipolarHDV()
STO = BipolarHDV()
KRO = BipolarHDV()

SWEDEN = (COUNTRY * SWE) + (CAPITAL * STO) + (MONEY * KRO)

# Can we reconstruct the mapping Sweden <-> Mexico from the mapping Sweden <-> USA <-> Mexico?
F_UM = USTATES * MEXICO
F_SU = SWEDEN * USTATES
F_SM = SWEDEN * MEXICO

F_SU * F_UM ≈ F_SM # returns: true

Easy to read, implement, and maintain. Let me know what you think.
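For reference, here is a short derivation of why the dollar query lands on PES. It rests on assumptions about the operators rather than on the package's actual behavior: `*` is taken to be elementwise multiplication of ±1 vectors (written $\odot$, so $A \odot A = \mathbf{1}$), `+` is taken to be a plain elementwise sum with no thresholding, and all atomic hypervectors are independent and uniformly random with dimension $N$. The symbols $C$, $K$, $M$ stand for COUNTRY, CAPITAL, MONEY, and $U$, $X$ for USTATES, MEXICO.

```latex
\begin{aligned}
U &= C \odot \mathrm{USA} + K \odot \mathrm{WDC} + M \odot \mathrm{DOL}, \qquad
X = C \odot \mathrm{MEX} + K \odot \mathrm{MXC} + M \odot \mathrm{PES}, \\
F_{UM} &= U \odot X
  = \mathrm{USA} \odot \mathrm{MEX} + \mathrm{WDC} \odot \mathrm{MXC}
  + \mathrm{DOL} \odot \mathrm{PES} + (\text{6 cross terms}), \\
\mathrm{DOL} \odot F_{UM} &= \mathrm{PES} + (\text{8 noise terms}), \\
\frac{\langle \mathrm{DOL} \odot F_{UM},\, \mathrm{PES} \rangle}{N} &= 1 + \varepsilon, \qquad
\mathbb{E}[\varepsilon] = 0, \quad \operatorname{sd}(\varepsilon) = \sqrt{8/N}.
\end{aligned}
```

With $N = 10{,}000$ the noise has a standard deviation of about 0.028, so the signal term dominates. If similarity is cosine-normalized, the value drops to roughly $1/3$ because $\lVert \mathrm{DOL} \odot F_{UM} \rVert \approx 3\sqrt{N}$, which is still far above a threshold of $3/\sqrt{N} = 0.03$. If bundling thresholds its result back to ±1, these identities hold only approximately.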
