Make the notebook fully parametric #81

@grayswandyr

Hi, thanks for the good work! I'd like to generate my own layout based on my personal corpus (my mails, code, LaTeX reports, etc., in both French and English). I tried to adapt the code, but there are too many parts I don't understand well enough, and some hard-coded data also seems to be defined in several places in the code, so in the end I don't get a layout specific to my corpus.

Would it be possible to adapt the code so that interested people only have to provide a few arrays at the beginning of the notebook and then run everything to get candidate layouts at the end? In principle, it should be enough to provide a table of letter frequencies and a table of bigram frequencies, right?

As an example, this is what I did for my own letter and bigram frequencies:

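import numpy as np

# (letter, count) and (bigram, count) pairs from my corpus (lists truncated here)
my_24letters = [ ('E', 1286273), ('T', 911921), ('I', 785967), ('A', 767995), ... ]
my_bigrams = [('IN', 178498), ('ON', 149623), ('TH', 134033), ('TI', 132851), ('RE', 131569), ... ]

# Split the (letter, count) pairs into two parallel tuples
letters24, instances24 = zip(*my_24letters)
max_frequency = max(instances24)  # same as instances24[0] when sorted by count

# Same for the bigrams, then convert to NumPy arrays
bigrams_arr, bigram_freqs_arr = zip(*my_bigrams)
bigrams = np.array(bigrams_arr)
bigram_frequencies = np.array(bigram_freqs_arr)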

I don't know where to go from there... but if the code were fully parametric, I think it would make Engram useful to a much wider audience.
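
To make the request more concrete, here is a rough sketch of the kind of single entry point I have in mind. `CorpusStats` and its two methods are hypothetical names of mine, not anything taken from the Engram notebook, and the numbers are toy values rather than my real counts. The idea is only that the rest of the notebook would read its letter order and bigram matrix from one object like this instead of from hard-coded tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorpusStats:
    """Hypothetical container for the user-supplied inputs (not an Engram API)."""
    letter_counts: list   # [(letter, count), ...]
    bigram_counts: list   # [(bigram, count), ...]

    def letters(self):
        # Letters ordered from most to least frequent.
        ranked = sorted(self.letter_counts, key=lambda lc: lc[1], reverse=True)
        return [letter for letter, _ in ranked]

    def bigram_matrix(self):
        # Square matrix of relative bigram frequencies, indexed like letters().
        order = self.letters()
        index = {letter: i for i, letter in enumerate(order)}
        n = len(order)
        matrix = [[0.0] * n for _ in range(n)]
        # Keep only bigrams whose two letters both appear in the letter table.
        kept = [(bg, c) for bg, c in self.bigram_counts
                if len(bg) == 2 and bg[0] in index and bg[1] in index]
        total = sum(c for _, c in kept)
        for bg, c in kept:
            matrix[index[bg[0]]][index[bg[1]]] += c / total
        return matrix


# Toy values for illustration only, not taken from any real corpus.
stats = CorpusStats(
    letter_counts=[("E", 50), ("T", 30), ("N", 20)],
    bigram_counts=[("EN", 6), ("TE", 3), ("NT", 1), ("QU", 9)],
)
order = stats.letters()
matrix = stats.bigram_matrix()
```

Here the "QU" bigram is dropped because neither letter is in the letter table, which is the behaviour I would expect when the layout only covers a subset of letters. Whether the notebook needs relative or absolute frequencies is something I could not work out, so the normalization is an assumption.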
