Skip to content

Construction-Material/Construction-Dataset-CONSD

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Construction-Dataset-CONSD

Introduction

It is an annotated dataset CONSD through the improved distantly supervised strategy Ont4RE for entity-property relation extraction in the construction industry.

More details about Ont4RE can be referred to another repo Ontology-for-Relation-Extraction-Ont4RE

img.png

Usage

  • corpus.txt is the file containing sentence pool;
  • corpus_chinese_word_segmentation.txt is the file containing chinese-segmented sentence pool;
  • CEMO_triples.txt is the file containing ontological classes;
  • \CONSD is the annotated sentences using the Ont4RE;
  • \CONSD_rule is the annotated sentences using the traditional distantly supervised strategy.

Citation

If you find CONSD dataset is helpful for your research, please consider giving a star and citing our paper:

Junjie Jiang, Chengke Wu, Wenjie Sun, Yong He, Yuanjun Guo, Yang Su, Zhile Yang. Ontology-based distant supervision for extracting entity-property relations in construction documents.

Contact

Any question please contact junj.chiang1102@gmail.com.

About

It's an annotated dataset CONSD through the improved distant supervision framework Ont4RE for relation extraction in construction industry. More detrails about Ont4RE can be referred to another repo Ontology-for-Relation-Extraction-Ont4RE

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors