This repository was archived by the owner on Oct 13, 2022. It is now read-only.

Use tropical semiring for lm_paths.get_tot_scores #214

Merged
csukuangfj merged 1 commit into k2-fsa:master from csukuangfj:fix-n-best-rescoring on Jun 16, 2021

@csukuangfj

Collaborator

See #201 (comment)

The second argument to get_tot_scores() here, log_semiring, should be false, because ARPA-type language models are constructed so that the backoff probability is already included in the direct arc. That is, we would be double-counting if we were to sum the probabilities of the non-backoff and backoff arcs.

Changing from the log semiring to the tropical semiring does improve the WER. On the test-clean dataset, with num_paths=100 and lm_scale=1.2, the WER decreases from 6.06 to 5.98.
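
To make the double-counting argument concrete, here is a minimal sketch in plain Python. It does not use k2; `tot_score` is a hypothetical stand-in for the two semiring choices, and the probabilities are made-up numbers chosen only for illustration. It assumes two paths through the LM for the same word sequence: one via the direct n-gram arc and one via the backoff arc.

```python
import math


def tot_score(path_scores, log_semiring):
    """Combine the log-scores of alternative paths.

    log_semiring=True  -> log-sum-exp over paths (probabilities are added)
    log_semiring=False -> max over paths (tropical semiring, best path only)
    Hypothetical helper for illustration, not k2's implementation.
    """
    if log_semiring:
        m = max(path_scores)
        return m + math.log(sum(math.exp(s - m) for s in path_scores))
    return max(path_scores)


# Made-up numbers: the direct arc carries the full n-gram probability.
direct = math.log(0.20)
# The backoff path: backoff weight times the lower-order probability.
backoff = math.log(0.5) + math.log(0.1)

log_total = tot_score([direct, backoff], log_semiring=True)
tropical_total = tot_score([direct, backoff], log_semiring=False)

print(f"log semiring:      p = {math.exp(log_total):.2f}")
print(f"tropical semiring: p = {math.exp(tropical_total):.2f}")
```

With these numbers the log semiring reports a probability of 0.25, which is more than the 0.20 the model actually assigns through the direct arc, while the tropical semiring keeps only the best path and returns 0.20.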

@csukuangfj csukuangfj merged commit ad161f6 into k2-fsa:master Jun 16, 2021
@csukuangfj csukuangfj deleted the fix-n-best-rescoring branch June 16, 2021 10:27