Commit f29eaba9 authored by Jurica Seva

Train a VAE on PubTator text for transfer learning, then fine-tune it on the corpora at hand. Use the VAE as input to the ranking function.

To Do Now:
1) start writing paper as you do each module!
2) Add corpora overview (i.e. unique documents, cancer types, ratio 0/1 for about_cancer and clinical tasks)
3) Describe preparation (link to BioNLP paper)
4) Describe training process for a) DL models (used FastText, 75th percentile for sentence length, used masking) with 5-fold CV b) sklearn models (custom tokenizer, tf-idf, no chi2, RandomizedSearchCV with 1- folds, char and word n-grams, hyperparameter optimization with 10 combinations)
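
Note on item 4a: one plausible reading of "75 percentile for sentence length, used masking" is that sequences are truncated or padded to the 75th-percentile length and a mask marks the real tokens. The sketch below shows that reading in plain Python. The function names, the nearest-rank percentile, and the pad id 0 are assumptions for illustration, not taken from the repository.

```python
import math


def percentile_length(token_lists, pct=75):
    """Nearest-rank percentile of the sequence lengths (assumed definition)."""
    lengths = sorted(len(tokens) for tokens in token_lists)
    if not lengths:
        raise ValueError("need at least one sequence")
    rank = max(1, math.ceil(pct / 100 * len(lengths)))
    return lengths[rank - 1]


def pad_and_mask(token_ids, max_len, pad_id=0):
    """Truncate or right-pad to max_len; mask is 1 for real tokens, 0 for padding."""
    clipped = token_ids[:max_len]
    n_pad = max_len - len(clipped)
    padded = clipped + [pad_id] * n_pad
    mask = [1] * len(clipped) + [0] * n_pad
    return padded, mask


# Toy token-id sequences (hypothetical data).
docs = [[5, 3], [7, 1, 4, 9], [2], [8, 6, 4, 2, 1, 3]]
max_len = percentile_length(docs, pct=75)
batch = [pad_and_mask(doc, max_len) for doc in docs]
```

With this choice, roughly a quarter of the sequences are truncated and the rest are padded, so the mask keeps padded positions from contributing to the model.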

To Do Future:
1) reindex VIST with new models (after selecting best one)
2) extend ranking function evaluation metrics with MAP, MAR
3) try Logistic Regression and (V)AE as ranking function
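
Note on future item 2: a minimal sketch of mean average precision (MAP) over ranked result lists, assuming binary relevance labels and that every relevant document appears somewhere in the ranking. The function names are hypothetical. MAR is not sketched here because its exact definition is not given in this message.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of 0/1 labels in ranked order (best first).
    Normalised by the number of relevant items found in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)


# Two toy queries (hypothetical relevance judgements).
ap_first = average_precision([1, 0, 1])
ap_second = average_precision([0, 1])
map_score = mean_average_precision([[1, 0, 1], [0, 1]])
```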
parent 63d16a7f
Branches main