Error running BERT tagger on CoLi servers #4

@siyutao

Description

Currently getting an error while running the AllenNLP 0.8 BERT config tagger/tagger_with_bert_config.json after changing the label_encoding to "BIO" ("BIOUL" throws a different error).
Error output:

Traceback (most recent call last):
  File "/proj/irtg.shadow/conda/envs/allennlp/bin/allennlp", line 10, in <module>
    sys.exit(run())
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/run.py", line 18, in run
    main(prog="allennlp")
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/commands/__init__.py", line 102, in main
    args.func(args)
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/commands/train.py", line 116, in train_model_from_args
    args.cache_prefix)
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/commands/train.py", line 160, in train_model_from_file
    cache_directory, cache_prefix)
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/commands/train.py", line 243, in train_model
    metrics = trainer.train()
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/training/trainer.py", line 480, in train
    train_metrics = self._train_epoch(epoch)
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/training/trainer.py", line 322, in _train_epoch
    loss = self.batch_loss(batch_group, for_training=True)
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/training/trainer.py", line 263, in batch_loss
    output_dict = self.model(**batch)
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/models/crf_tagger.py", line 182, in forward
    embedded_text_input = self.text_field_embedder(tokens)
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/modules/text_field_embedders/basic_text_field_embedder.py", line 125, in forward
    return torch.cat(embedded_representations, dim=-1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 2. Got 433 and 422 in dimension 1 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:71
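The failure is the standard torch.cat shape constraint: all input tensors must agree in every dimension except the one being concatenated. Here two embedders produced sequences of different lengths (433 vs. 422 in dimension 1) for the same batch. A minimal standalone sketch of the same failure (batch size and embedding dims below are hypothetical, only the sequence lengths are taken from the log):

```python
import torch

# Two embedded representations of the "same" batch, but with
# mismatched sequence lengths (dimension 1), as in the traceback.
word_emb = torch.zeros(8, 433, 768)  # (batch, tokens, dim) - hypothetical dims
char_emb = torch.zeros(8, 422, 128)  # (batch, tokens, dim) - hypothetical dims

try:
    # basic_text_field_embedder concatenates along the last (feature) dim;
    # this raises because dimension 1 differs (433 vs. 422).
    torch.cat([word_emb, char_emb], dim=-1)
except RuntimeError as err:
    print("cat failed:", err)
```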

Another to-do: we need to add min_padding_length to the config. It may or may not be related to the current error; the warning below suggests it can cause subtle bugs either way.

/proj/irtg.shadow/conda/envs/allennlp/lib/python3.7/site-packages/allennlp/data/token_indexers/token_characters_indexer.py:55: UserWarning: You are using the default value (0) of `min_padding_length`, which can cause some subtle bugs (more info see https://github.com/allenai/allennlp/issues/1954). Strongly recommend to set a value, usually the maximum size of the convolutional layer size when using CnnEncoder.
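As a sketch of the fix the warning asks for (the exact indexer block depends on tagger_with_bert_config.json, which isn't shown here, so field names around it are assumptions): set min_padding_length on the token_characters indexer, typically to the largest ngram filter size of the CnnEncoder, e.g.:

```jsonnet
// Hypothetical fragment of the dataset_reader's token_indexers section.
"token_characters": {
  "type": "characters",
  // Set to the largest ngram_filter_size of the CnnEncoder,
  // per the warning and allenai/allennlp#1954.
  "min_padding_length": 3
}
```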

Metadata

Labels

bug (Something isn't working), wontfix (This will not be worked on)
