OSError: Character label file (csv format) doesn`t exist : ../../../data/vocab/aihub_character_vocabs.csv #168

@agag8945

python main.py --dataset_path $DATASET_PATH --vocab_dest $VOCAB_DEST --output_unit $OUTPUT_UNIT --preprocess_mode $PREPROCESS_MODE --vocab_size $VOCAB_SIZE
Running the command above successfully generated the transcript.txt and aihub_labels.csv files.
After that, I ran
python ./bin/main.py model=ds2 train=ds2_train train.dataset_path=$DATASET_PATH
to start training, and got the following output:

[2022-11-07 10:56:29,128][kospeech.utils][INFO] - Operating System : Linux 5.10.133+
[2022-11-07 10:56:29,128][kospeech.utils][INFO] - Processor : x86_64
[2022-11-07 10:56:29,129][kospeech.utils][INFO] - CUDA is available : False
[2022-11-07 10:56:29,129][kospeech.utils][INFO] - PyTorch version : 1.12.1+cu113
Error executing job with overrides: ['model=ds2', 'train=ds2_train', 'train.dataset_path=/content/drive/MyDrive/KoreanSpeech_dataset/KoreanSpeech_categori/KsponSpeech_01']
Traceback (most recent call last):
  File "/content/drive/MyDrive/kospeech_lastest/bin/kospeech/vocabs/ksponspeech.py", line 126, in load_vocab
    with open(label_path, 'r', encoding=encoding) as f:
FileNotFoundError: [Errno 2] No such file or directory: '../../../data/vocab/aihub_character_vocabs.csv'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/drive/MyDrive/kospeech_lastest/bin/main.py", line 162, in main
    last_model_checkpoint = train(config)
  File "/content/drive/MyDrive/kospeech_lastest/bin/main.py", line 85, in train
    output_unit=config.train.output_unit,
  File "/content/drive/MyDrive/kospeech_lastest/bin/kospeech/vocabs/ksponspeech.py", line 46, in __init__
    self.vocab_dict, self.id_dict = self.load_vocab(vocab_path, encoding='utf-8')
  File "/content/drive/MyDrive/kospeech_lastest/bin/kospeech/vocabs/ksponspeech.py", line 139, in load_vocab
    raise IOError("Character label file (csv format) doesn`t exist : {0}".format(label_path))
OSError: Character label file (csv format) doesn`t exist : ../../../data/vocab/aihub_character_vocabs.csv

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

This is the error I get.
It occurs even though the aihub_character_vocabs.csv file exists.
Is there a way to fix this?
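
One thing that may be worth checking is where the relative path in the traceback actually points. A relative path such as ../../../data/vocab/aihub_character_vocabs.csv is resolved against the process's current working directory, not against the location of main.py, and Hydra can change the working directory at run time depending on its version and configuration. The snippet below is only a minimal diagnostic sketch, not a confirmed fix: `check_relative_path` is a hypothetical helper (not part of kospeech), and that the working directory is the cause here is an assumption.

```python
import os
from typing import Optional, Tuple


def check_relative_path(relative_path: str, base_dir: Optional[str] = None) -> Tuple[str, bool]:
    """Show where a relative path resolves to and whether a file exists there.

    Hypothetical helper for debugging; not part of kospeech.
    base_dir defaults to the current working directory, which is what
    open() uses when it is given a relative path.
    """
    base = base_dir if base_dir is not None else os.getcwd()
    # normpath collapses the ".." segments lexically
    absolute = os.path.normpath(os.path.join(base, relative_path))
    return absolute, os.path.isfile(absolute)


if __name__ == "__main__":
    # Path copied from the error message above
    rel = "../../../data/vocab/aihub_character_vocabs.csv"
    absolute, exists = check_relative_path(rel)
    print("working directory:", os.getcwd())
    print("resolves to      :", absolute)
    print("file exists      :", exists)
```

If the resolved path is not the directory where the vocabulary file was written, passing an absolute path for the vocabulary file (or moving the file to the resolved location) would be the next thing to try.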