DRAFT: Proper word alignment for translation #5

Open. valgardg wants to merge 1 commit into main from word_alignment

@valgardg
Owner

The current solution for Icelandic -> English word translation iterates through the words of a sentence and translates them one by one. This has several disadvantages:

  • Because translation is done per word, each word's context within the overall sentence is lost, so translations lose meaning and are often incorrect
  • The AWS translation API is currently used for translation. While its whole-sentence translations are good enough, its per-word translations are severely lacking: they are often inaccurate or incorrect and sometimes missing entirely, which is counterproductive to the goals of the webapp

This pull request aims to improve the translation of each word by using a fine-tuned word alignment model to align each word in the source sentence with its equivalent in the translated target sentence.
Model fine-tuning was done using awesome-align (https://github.com/neulab/awesome-align).
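
To illustrate the idea, here is a minimal sketch of how alignment output could be turned into per-word translations. It is not the code in this pull request. It assumes the sentence has already been translated as a whole and that the aligner returns links in the common Pharaoh `i-j` format (source index, target index). The function name `words_from_alignment` and the alignment string in the example are hypothetical and hand-written for illustration, not model output.

```python
from collections import defaultdict


def words_from_alignment(source, target, pharaoh):
    """Map each source word to the target word(s) it is aligned with.

    `pharaoh` is a whitespace-separated list of "i-j" links, where i is a
    0-based index into the source tokens and j into the target tokens.
    Hypothetical helper: the whitespace tokenization is a simplification.
    """
    src_tokens = source.split()
    tgt_tokens = target.split()

    # Collect all target indices linked to each source index.
    links = defaultdict(list)
    for pair in pharaoh.split():
        i, j = pair.split("-")
        links[int(i)].append(int(j))

    # Build (source word, translation) pairs; unaligned words get "".
    result = []
    for i, word in enumerate(src_tokens):
        translation = " ".join(tgt_tokens[j] for j in sorted(links[i]))
        result.append((word, translation))
    return result


# Hand-written example alignment (an assumption, not real model output).
pairs = words_from_alignment(
    "Ég á heima á Íslandi",
    "I live in Iceland",
    "0-0 1-1 2-1 3-2 4-3",
)
for word, translation in pairs:
    print(word, "->", translation)
```

In this example the word "á" appears twice and is mapped to "live" the first time and "in" the second time, which is the kind of context-dependent result a per-word translation cannot produce.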

@valgardg valgardg self-assigned this Nov 26, 2024