More "julian" idioms or keep it closer to HDC/VSA nomenclature?

I was looking at the current implementation of the vector types and their associated functions and noticed that some operations could be better modeled by overloading Base Julia functions. For example, instead of having something like `empty_vector(T::AbstractHDV)` we could have `zeros(T::AbstractHDV)`. I compiled a list of things we could implement to make the package more "julian", but I'm unsure whether this is better.

Here is the list of the ones that come to mind, which I can tackle once the refactor is done, if it makes sense:

- `Base.zeros(::Type{T}, N::Int=10_000) where T<:AbstractHDV` for empty hypervector initialization (dispatching on `Type{AbstractHDV}` would only match the abstract type itself, not its subtypes)
- `Base.ones(::Type{T}, N::Int=10_000) where T<:AbstractHDV` for filled hypervector initialization
- `Base.inv(h::AbstractHDV)` inverts the hypervector
- `Base.isequal(h::AbstractHDV, u::AbstractHDV)` for comparing equality
- `Base.hash(h::AbstractHDV, seed::UInt)` for hashing hypervectors (this is necessary for some things; the two-argument form is the one Base expects custom types to extend)
- `Base.isapprox(u::T, v::T) where T<:AbstractHDV` to check if hypervectors are similar (unlocks the `≈` operator, typed as `\approx<TAB>`)

If you have an idea for more, please add them to the list.
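To make the list above a bit more concrete, here is a minimal sketch of how a few of these overloads could look. It does not use the package: `AbstractToyHDV`, `ToyBipolarHDV`, the field `v`, the helper `cosine`, and the 3/sqrt(N) threshold are all hypothetical stand-ins chosen for illustration, so the real types may need different definitions.

```julia
using LinearAlgebra: dot, norm   # standard library

# Hypothetical stand-ins for the package types (names and fields are assumptions).
abstract type AbstractToyHDV end

struct ToyBipolarHDV <: AbstractToyHDV
    v::Vector{Int}
end

# `::Type{T} where T<:AbstractToyHDV` matches every concrete subtype.
Base.zeros(::Type{T}, n::Integer=10_000) where {T<:AbstractToyHDV} = T(zeros(Int, n))
Base.ones(::Type{T}, n::Integer=10_000) where {T<:AbstractToyHDV} = T(ones(Int, n))

# Equality compares contents; `isequal` falls back to `==` by default.
Base.:(==)(a::T, b::T) where {T<:AbstractToyHDV} = a.v == b.v

# Extend the two-argument `hash` so that equal vectors hash equally.
Base.hash(h::AbstractToyHDV, seed::UInt) = hash(h.v, hash(typeof(h), seed))

# Cosine similarity, used here as an assumed similarity measure.
cosine(a::AbstractToyHDV, b::AbstractToyHDV) = dot(a.v, b.v) / (norm(a.v) * norm(b.v))

# "Similar" means the cosine exceeds an assumed 3-sigma threshold for random vectors.
Base.isapprox(a::T, b::T) where {T<:AbstractToyHDV} = cosine(a, b) > 3 / sqrt(length(a.v))

# Usage
a = ToyBipolarHDV(rand([-1, 1], 10_000))
b = ToyBipolarHDV(copy(a.v))
a == b                    # same contents
hash(a) == hash(b)        # consistent with ==
a ≈ b                     # identical vectors have cosine 1
length(Set([a, b]))       # the Set sees a and b as one element
```

Defining `==` together with the two-argument `hash` is what lets hypervectors behave properly as `Dict` keys or `Set` elements, which is probably the main practical gain from this part of the list.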

Why do I think this is important? Mainly to make the package friendlier and more idiomatic to use. Keeping things as abstract as possible would be nice, since we could then write pipelines as simple algebra. For example, let's replicate the ["What's the dollar of Mexico?" exercise Pentti Kanerva proposed](https://redwood.berkeley.edu/wp-content/uploads/2020/05/kanerva2010what.pdf):

```julia
using HyperdimensionalComputing

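# Some missing things (extending the Base functions so that `≈`, `hash` and `isequal` pick them up)
# Threshold is mean + 3*sd, assuming `similarity` is a cosine similarity (sd = 1/sqrt(N) for random vectors)
Base.isapprox(u::T, v::T) where T<:BipolarHDV = similarity(u, v) > 3/sqrt(length(v))
Base.hash(u::AbstractHDV, h::UInt) = hash(u.v, hash(typeof(u), h))
# Compare the contents directly: equal hashes alone do not guarantee equality
Base.isequal(u::T, v::T) where T<:AbstractHDV = isequal(u.v, v.v)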

# Holistic representation of countries
# 1. Concept hypervectors
COUNTRY = BipolarHDV()
CAPITAL = BipolarHDV()
MONEY = BipolarHDV()

# 2. Values hypervectors
USA = BipolarHDV()
MEX = BipolarHDV()

WDC = BipolarHDV()
MXC = BipolarHDV()

DOL = BipolarHDV()
PES = BipolarHDV()

# 3. Country representations
USTATES = (COUNTRY * USA) + (CAPITAL * WDC) + (MONEY * DOL)
MEXICO  = (COUNTRY * MEX) + (CAPITAL * MXC) + (MONEY * PES)

# Are USA and Mexico similar?
USTATES ≈ MEXICO # returns: false

# If we now pair USTATES with MEXICO, we get a bundle that pairs USA with Mexico, Washington DC
# with Mexico City, and dollar with peso, plus noise.
# This is, in principle, a mapping code from USA <-> Mexico, which can be used to extract 
# information using similarity
F_UM = USTATES * MEXICO

# What in Mexico corresponds to United States' dollar:
DOL * F_UM ≈ PES # returns: true

# Let's add Sweden
SWE = BipolarHDV()
STO = BipolarHDV()
KRO = BipolarHDV()

SWEDEN = (COUNTRY * SWE) + (CAPITAL * STO) + (MONEY * KRO)

# Can we reconstruct the mapping Sweden <-> Mexico from the mapping Sweden <-> USA <-> Mexico?
# F_UM = USTATES * MEXICO was already computed above
F_SU = SWEDEN * USTATES
F_SM = SWEDEN * MEXICO

F_SU * F_UM ≈ F_SM # returns: true
```

Easy to read, implement, and maintain. Let me know what you think!
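For anyone who wants to try the idea without the package, here is a rough sketch of the dollar-of-Mexico query on plain arrays. It assumes binding is elementwise multiplication, bundling is elementwise addition without re-thresholding, and similarity is cosine similarity; `hv`, `cosine`, and `threshold` are hypothetical helper names, and the package's own operators may behave differently.

```julia
using LinearAlgebra: dot, norm
using Random: MersenneTwister

rng = MersenneTwister(42)        # fixed seed, only for reproducibility
N = 10_000
hv() = rand(rng, [-1, 1], N)     # random bipolar vector as a plain array
cosine(a, b) = dot(a, b) / (norm(a) * norm(b))

COUNTRY, CAPITAL, MONEY = hv(), hv(), hv()
USA, WDC, DOL = hv(), hv(), hv()
MEX, MXC, PES = hv(), hv(), hv()

# Bind roles to fillers, then bundle the three pairs per country.
USTATES = COUNTRY .* USA .+ CAPITAL .* WDC .+ MONEY .* DOL
MEXICO  = COUNTRY .* MEX .+ CAPITAL .* MXC .+ MONEY .* PES

# Mapping between the two records.
F_UM = USTATES .* MEXICO

threshold = 3 / sqrt(N)          # assumed 3-sigma cutoff for random vectors

sim_pes = cosine(DOL .* F_UM, PES)   # the matching filler
sim_mxc = cosine(DOL .* F_UM, MXC)   # an unrelated filler

println("PES: ", sim_pes, "  above threshold: ", sim_pes > threshold)
println("MXC: ", sim_mxc, "  above threshold: ", sim_mxc > threshold)
```

Under these assumptions, `DOL .* F_UM` is `PES` plus eight noise terms, so its similarity to `PES` should sit around 1/3, far above the threshold, while the similarity to `MXC` should stay near zero.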

