Recipes for commonvoice ASR and LID by neillu23 · Pull Request #129 · hyperion-ml/hyperion

neillu23 · 2023-02-01T05:23:22Z

No description provided.

jesus-villalba

If you haven't already, could you run black on the Python files you changed? You can configure VS Code to do it automatically each time you save; otherwise you can just run "black file_path" on each file.

jesus-villalba · 2023-05-03T21:11:17Z

hyperion/bin/preprocess_audio_files.py

                t2 = time.time()

+                if output_sampling_rate is not None:
+                    x = signal.resample(x, int(x.shape[0]*output_sampling_rate/fs))


signal.resample may not be a good option here; I don't know whether it affects audio quality. I used this function for the VAD because I just wanted to stretch it from frame level to sample level, but I don't know if it is good for audio. Could you check the audio files you got?
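One possible alternative, sketched here as a suggestion rather than as what the PR ended up using: scipy.signal.resample uses a Fourier method and treats the signal as periodic, while scipy.signal.resample_poly applies a polyphase low-pass FIR filter, which is a common choice for audio. The helper name resample_audio is hypothetical; fs and output_sampling_rate follow the names in the diff above, and integer sampling rates are assumed.

```python
from math import gcd

import numpy as np
from scipy import signal


def resample_audio(x: np.ndarray, fs: int, output_sampling_rate: int) -> np.ndarray:
    """Resample a 1-D waveform from fs to output_sampling_rate (hypothetical helper)."""
    if output_sampling_rate == fs:
        return x
    # Reduce the rate ratio to the smallest integer up/down factors.
    g = gcd(int(output_sampling_rate), int(fs))
    up = int(output_sampling_rate) // g
    down = int(fs) // g
    # Polyphase filtering: upsample, anti-alias low-pass, then downsample.
    return signal.resample_poly(x, up, down)


# Usage: one second of a 440 Hz tone, 48 kHz -> 16 kHz.
fs = 48000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440.0 * t)
y = resample_audio(x, fs, 16000)
```

Listening to a few resampled files, or comparing their spectra against the originals, would still be the way to confirm that quality is acceptable.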

jesus-villalba · 2023-05-03T21:12:16Z

hyperion/bin/train_wav2vec2languageid.py

@@ -0,0 +1,261 @@
+#!/usr/bin/env python
+"""


Does this file do anything different from the wav2vec2xvector trainer?

jesus-villalba · 2023-05-03T21:15:32Z

hyperion/torch/data/audio_dataset.py

        else:
            assert "duration" in self.seg_set

+


Run black on this file and the other Python files you edited to remove the extra blank lines.

jesus-villalba · 2023-05-03T21:17:35Z

hyperion/torch/data/seg_sampler_factory.py

        parser.add_argument(
            "--base-sampler-type",
-            choices=["seg_sampler", "bucketing_seg_sampler"],
+            choices=["seg_sampler", "bucketing_seg_sampler", "bucketing_seg_sampler","class_weighted_seg_sampler"],


There is a repeated choice: "bucketing_seg_sampler" appears twice.
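A minimal sketch of the deduplicated argument, assuming the three sampler names in the diff above are the intended set. The constant name BASE_SAMPLER_TYPES is hypothetical and not part of hyperion; keeping the choices in one list makes a repeated entry easy to catch.

```python
import argparse

# Hypothetical constant; the sampler names are copied from the diff above.
BASE_SAMPLER_TYPES = [
    "seg_sampler",
    "bucketing_seg_sampler",
    "class_weighted_seg_sampler",
]
# Guard against accidentally listing the same choice twice.
assert len(BASE_SAMPLER_TYPES) == len(set(BASE_SAMPLER_TYPES))

parser = argparse.ArgumentParser()
parser.add_argument("--base-sampler-type", choices=BASE_SAMPLER_TYPES)

# Usage: parse an explicit argument list instead of sys.argv.
args = parser.parse_args(["--base-sampler-type", "class_weighted_seg_sampler"])
```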

jesus-villalba · 2023-05-03T21:18:24Z

hyperion/torch/models/__init__.py

 from .vae.vae import VAE
 from .vae.vq_vae import VQVAE
 from .transducer import RNNTransducer, RNNRNNTransducer
+from .wav2languageid import HFWav2Vec2ResNet1dLanguageID


Do we need this one?

jesus-villalba · 2023-05-03T21:24:08Z

hyperion/torch/models/transducer/rnn_transducer.py

        x: torch.Tensor,
        x_lengths: torch.Tensor,
-        y: k2.RaggedTensor,
+        y: Union[Dict, k2.RaggedTensor],


jesus-villalba · 2023-05-03T21:24:32Z

hyperion/torch/models/wav2languageid/__init__.py

@@ -0,0 +1,7 @@
+"""


Can we delete this directory?

jesus-villalba · 2023-05-03T21:25:58Z

hyperion/torch/models/wav2transducer_languageid/__init__.py

@@ -0,0 +1,7 @@
+"""


I would invert the name to languageid_transducer, since we do the language ID first and then want to use it for the ASR.

jesus-villalba · 2023-05-03T21:27:24Z

hyperion/torch/trainers/languageid_trainer.py

@@ -0,0 +1,212 @@
+"""


Can we delete this one?

neillu23 and others added 2 commits January 23, 2023 17:25

commonvoice speech recognition recipe

48b1e4e

update slurm configuration for rockfish

beab75c

neillu23 changed the title ~~Add recipe for commonvoice tranducer and slurm configuration~~ Recipe for commonvoice tranducer and slurm configuration Feb 1, 2023

ylu125 and others added 5 commits February 1, 2023 13:32

update data preparation for different languge

046b5f7

update config and add cer scripts

beb2ed5

temporal remove data preparation for duration

ff0fd55

Add combination for multiple languages

f179db4

Add language identification task for commonvoice

f816ed3

neillu23 changed the title ~~Recipe for commonvoice tranducer and slurm configuration~~ Recipes for commonvoice ASR and LID Feb 20, 2023

ylu125 and others added 8 commits March 23, 2023 20:36

Add Class Weighted Sampler for ASR and utterance-wise LID

b524b84

Remove the seg_weighted_mode for sequence-level task

07ddda6

Merge remote-tracking branch 'hyp/persephone-asr' into persephone-refactor

a2eff8e

Update the LID trainer for merging the new dataloader

396e020

add commonvoice config for rnnt transducer

2ecdebf

Add fine-tuning code for pruned RNN-T, LID, and Both

d33abe9

Add LID decode scripts

3b7e8ac

Merge pull request hyperion-ml#127 from hyperion-ml/persephone-asr

85282ac

Persephone asr

jesus-villalba reviewed May 3, 2023

jesus-villalba and others added 12 commits May 4, 2023 09:55

new vox2 dataprep

35391de

update the np.str to np.str_

ebef851

update np.str to np.str_

720bd6e

Merge branch 'persephone-refactor' of https://github.com/neillu23/hyperion into persephone-refactor

845d2e0

Add empty __init__.py

b112ebd

fix new vox2 dataprep durations, scp -> RecordingSet

cf861bc

some fixes in sre21

c408f74

update lid configs and np.str to str

9c28408

FiLM transducer

7f43376

Add FiLMed Transducer

20c13e7

remove unused function

f8c84a9

Add decode script and configurations

05474de

ylu125 and others added 30 commits July 4, 2023 17:34

update joint training for ASR-LID

acbfc06

Merge branch 'persephone-refactor' of https://github.com/neillu23/hyperion into persephone-asr

f2e5aad

merge commit

47fae72

update decode code

562498f

Merge branch 'persephone-refactor' of https://github.com/neillu23/hyperion into persephone-refactor

27e96ba

add rnn_original for film-rnn

458e65e

finished experiments of models 2.0 in voxceleb/v2

c1d193a

add configs for commonvoice speaker verification

26eca97

voxceleb v1.2 works up to snorm backend

89efce4

Add new parameters for feat_fusion_end

77bbad4

finished vox v1.2 except plda

89c6e20

introduce entry points

44f085a

make it work with cuda 11

6105476

Merge pull request hyperion-ml#137 from hyperion-ml/persephone-entry

e4a5be1

Persephone entry

started vox/v2.1 recipe and fix some readmes

392cd30

vox/v2.1 recipe done, not tested

ed35173

implemented lora in w2v2, not tested

8760d05

Merge branch 'persephone-refactor' of https://github.com/hyperion-ml/hyperion into persephone-refactor

09ba2f2

Merge branch 'persephone-refactor' of https://github.com/neillu23/hyperion into persephone-refactor

844b6e3

add lora into ASR (haven't tested)

71f629d

vox2.1 working and lora

a75610e

Merge branch 'persephone-refactor' of https://github.com/hyperion-ml/hyperion into persephone-refactor

d4823db

lora in wavlm and hubert

c23103e

fix bug in w2v constructors with lora

81c540b

Merge branch 'persephone-refactor' of https://github.com/hyperion-ml/hyperion into persephone-refactor

95ed74d

update default argument of lora_merge_weights to false

a54c963

update config for 4 langs experiment

6a72173

Add FiLM inside the Wav2vec2

e15b227

update FiLM Wav2vec2

9022d8a

add charachter based model for ASR

27fffa0

neillu23 wants to merge 116 commits into hyperion-ml:persephone-asr from neillu23:persephone-refactor