Commit d4bebcce authored by ezzabady.morteza

updated readme

parent 81136d8d
@@ -2,38 +2,20 @@
Discourse segmenter for DISRPT 2021
Data for DISRPT 2021: https://github.com/disrpt/sharedtask2021
Website DISRPT 2021: https://sites.google.com/georgetown.edu/disrpt2021
Code for DISRPT 2019: https://gitlab.inria.fr/andiamo/tony
## Meeting 04.06.2021
TODO:
x- install allennlp 0.9 + tony19
x- train a model (on english for instance)
x- test it with tony script
x- begin reading the tutorial on allennlp
x- continue with general reading NLP
Next steps:
x- change to allennlp 1.xx
- switch to multilingual XLM
- play with some hyper-parameters
- revisit the architecture: add the CRF layer on top of BERT/LSTM
- group the corpora during training
- the sentence problem: how do we address it?
## Meeting 21.05.2021
TODO:
- install allennlp 0.9 + tony19
- train a model (on english for instance)
- test it with tony script
- begin reading the tutorial on allennlp
- continue with general reading NLP
Next steps:
- change to allennlp 1.xx
- switch to multilingual XLM
- group the corpora during training
- the sentence problem: how do we address it?
Useful Links:
- Data for DISRPT 2021: https://github.com/disrpt/sharedtask2021
- Website DISRPT 2021: https://sites.google.com/georgetown.edu/disrpt2021
- Code for DISRPT 2019: https://gitlab.inria.fr/andiamo/tony
Requirements:
- python 3.7
- requirements.txt: `pip install -r requirements.txt`
- pytorch: `pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio===0.9.0 -f https://download.pytorch.org/whl/torch_stable.html`
Usage:
- train: `bash expes.sh eng.rst.rstdt conllu bert train`
- test: `bash expes.sh eng.rst.rstdt conllu bert test`
- fine-tune with other model: `bash expes.sh eng.rst.rstdt conllu bert train eng`
- test on other model: `bash expes.sh eng.rst.rstdt conllu bert test eng`
- merge two datasets: `bash merger.sh eng.rst.rstdt eng.rst.gum eng`
- split with stanza: `python parse_corpus.py eng.rst.rstdt --parser stanza`
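The train and test commands above can be chained over several corpora. The following is a minimal sketch, not part of the repository: the helper name `run_corpus` and the `DRY_RUN` switch are hypothetical, and it assumes `expes.sh` is in the current directory and takes its arguments in the order shown in the usage list. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): train, then test, one corpus.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run_corpus() {
    corpus="$1"            # e.g. eng.rst.rstdt
    format="${2:-conllu}"  # input format, as in the usage list
    model="${3:-bert}"     # model name, as in the usage list
    for step in train test; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "bash expes.sh $corpus $format $model $step"
        else
            # Stop at the first failing step so a test never runs on a failed training.
            bash expes.sh "$corpus" "$format" "$model" "$step" || return 1
        fi
    done
}

# Corpus names taken from the usage list above.
for c in eng.rst.rstdt eng.rst.gum; do
    run_corpus "$c"
done
```

Run it with `DRY_RUN=0` once the printed commands look right.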