Commit d4bebcce authored by ezzabady.morteza's avatar ezzabady.morteza

updated readme

parent 81136d8d
@@ -2,38 +2,20 @@
 Discourse segmenter for DISRPT 2021
 
-Data for DISRPT 2021: https://github.com/disrpt/sharedtask2021
-Website DISRPT 2021: https://sites.google.com/georgetown.edu/disrpt2021
-Code for DISRTP 2019: https://gitlab.inria.fr/andiamo/tony
-
-## Meeting 04.06.2021
-
-TODO:
-x- install allennlp 0.9 + tony19
-x- train a model (on english for instance)
-x- test it with tony script
-x- begin reading the tutorial on allennlp
-x- continue with general reading NLP
-
-Next steps:
-x- change to allennlp 1.xx
-- switch to xlm multi lingual
-- play with some hyper-parameters
-- go back on the architecture: add the CRF layer on top of BERT/LSTM
-- grouping the corpora during training
-- the sentence problem ? how do we address it
-
-## Meeting 21.05.2021
-
-TODO:
-- install allennlp 0.9 + tony19
-- train a model (on english for instance)
-- test it with tony script
-- begin reading the tutorial on allennlp
-- continue with general reading NLP
-
-Next steps:
-- change to allennlp 1.xx
-- switch to xlm multi lingual
-- grouping the corpora during training
-- the sentence problem ? how do we address it
+Useful Links:
+- Data for DISRPT 2021: https://github.com/disrpt/sharedtask2021
+- Website DISRPT 2021: https://sites.google.com/georgetown.edu/disrpt2021
+- Code for DISRPT 2019: https://gitlab.inria.fr/andiamo/tony
+
+Requirements:
+- python 3.7
+- requirements.txt: `pip install -r requirements.txt`
+- pytorch: `pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio===0.9.0 -f https://download.pytorch.org/whl/torch_stable.html`
+
+Usage:
+- train: `bash expes.sh eng.rst.rstdt conllu bert train`
+- test: `bash expes.sh eng.rst.rstdt conllu bert test`
+- fine-tune with another model: `bash expes.sh eng.rst.rstdt conllu bert train eng`
+- test with another model: `bash expes.sh eng.rst.rstdt conllu bert test eng`
+- merge two datasets: `bash merger.sh eng.rst.rstdt eng.rst.gum eng`
+- split with stanza: `python parse_corpus.py eng.rst.rstdt --parser stanza`
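The `expes.sh` calls in the updated README all follow one positional calling convention. As a rough sketch of that pattern (the argument names and their interpretation here are assumptions, not documented by the repository), the following only prints the commands it would run, without executing `expes.sh`:

```shell
#!/usr/bin/env bash
# Assumed positional arguments for expes.sh:
#   1. corpus id        (e.g. eng.rst.rstdt)
#   2. input format     (e.g. conllu)
#   3. model            (e.g. bert)
#   4. mode             (train or test)
#   5. optional extra id (e.g. eng, a merged-dataset name)
# Dry run: echo each command instead of invoking the script.
corpus="eng.rst.rstdt"
fmt="conllu"
model="bert"
for mode in train test; do
    echo "bash expes.sh $corpus $fmt $model $mode"
done
```

Swapping the loop's `echo` for a direct invocation would run a full train-then-test cycle on one corpus.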