Today, Thai have many word tokenizition model and It's hard to train own model, so I think we can add easy train model method.
Preprocessing
Get pandas.DataFrame to train/val/test model
| Domain | type(train/test/val) | text |
| news | train | ผม|กำลัง|อยู่|ที่|ทำเนียนัฐบาล |
and train/va/test by each domain.
Model
Train with
and more...
Inference
Get onnx to running model.
Today, Thai have many word tokenizition model and It's hard to train own model, so I think we can add easy train model method.
Preprocessing
Get pandas.DataFrame to train/val/test model
and train/va/test by each domain.
Model
Train with
and more...
Inference
Get onnx to running model.