Run the init.sh script, or install the Tagger project under the name SuperTagger and put the tagger.pt file in the 'models' directory. (You may need to modify 'model_tagger' in train.py.)
### Structure
The structure should look like this:
```
.
├── Configuration # Configuration
│ ├── Configuration.py # Contains the configuration loading function
│ └── config.ini # contains parameters
├── find_config.py # automatically configures dataset parameters (max sentence length, etc.) from the given dataset
├── requirements.txt # required libraries
├── Datasets # TLGbank data with links
├── SuperTagger # The Supertagger directory (that you need to install)
│ ├── Datasets # TLGbank data
│ ├── SuperTagger # Implementation of BertForTokenClassification
│ │ ├── SuperTagger.py # Main class
│ │ └── Tagging_bert_model.py # Bert model
│ ├── predict.py # Prediction example for the supertagger
│ └── train.py # Training example for the supertagger
├── Linker # The Linker directory
│ ├── ...
│ └── Linker.py # Linker class containing the neural network
├── models
│ └── supertagger.pt # the .pt file containing the pretrained supertagger (you need to install it)
├── Output # Directory where your linker models will be saved if checkpoint=True in train
├── TensorBoard # Directory where the stats will be saved if tensorboard=True in train
└── train.py # Training example for the linker
```
### Dataset format
The sentences should be in a column "X", the links (atoms with the '_x' postfix) in a column "Y", and the categories in a column "Z".
...
...
For the links, each atom_x goes with the one and only other atom_x in the sentence.
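For illustration, here is a minimal sketch of how such a file can be inspected with pandas. The file name is a placeholder, and the delimiter of the real dataset export is an assumption; adapt it to the files under Datasets.

```python
import pandas as pd

# Placeholder path -- point this at one of the files under Datasets/.
df = pd.read_csv("Datasets/my_dataset.csv")

# Each row holds one sentence and its annotations:
#   X : the sentence itself
#   Y : the atoms carrying an '_x' index postfix; every atom_x is linked
#       to exactly one other atom_x in the same sentence
#   Z : the supertag categories
print(df[["X", "Y", "Z"]].head())
```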
### Training
Launch train.py; if you look at it, you can pass a different dataset file and a different tagging model.
In train, if you use `checkpoint=True`, the model is automatically saved after each epoch in a folder Output/Training_XX-XX_XX-XX. Use `tensorboard=True` to save the logs in the TensorBoard folder (`tensorboard --logdir=logs` to view them).
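For illustration only, a sketch of what such a training call might look like. The `Linker` constructor argument and the `train` keyword names below are assumptions inferred from this README, not the confirmed API; refer to train.py and Linker/Linker.py for the actual signatures.

```python
# Hypothetical sketch -- names and signatures are assumptions, see train.py.
from Linker.Linker import Linker

# Assumed: the linker is built on top of the pretrained supertagger weights.
linker = Linker("models/supertagger.pt")

linker.train(
    "Datasets/my_dataset.csv",  # placeholder dataset file
    checkpoint=True,            # save the model in Output/Training_XX-XX_XX-XX after each epoch
    tensorboard=True,           # write stats to the TensorBoard folder
)
```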