Training VAE on PubTator text to use for transfer and fine tune on the corpora...
Training a VAE on PubTator text to use for transfer, then fine-tuning it on the corpora at hand. VAE as input to the ranking function.

To Do Now:
1) Start writing the paper as you do each module!
2) Add a corpora overview (i.e. unique documents, cancer types, ratio of 0/1 labels for the about_cancer and clinical tasks)
3) Describe the preparation (link to the BioNLP paper)
4) Describe the training process for
   a) DL models (used FastText, 75th percentile for sentence length, used masking) with 5-fold CV
   b) sklearn models (custom tokenizer, tf-idf, no chi2, RandomizedSearchCV with 10 folds, char and word n-grams, hyperparameter optimization with 10 combinations)

To Do Future:
1) Reindex VIST with the new models (after selecting the best one)
2) Extend the ranking function evaluation metrics with MAP, MAR
3) Try Logistic Regression and a (V)AE as the ranking function
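Item 4a mentions capping sentence length at the 75th percentile and using masking. A minimal sketch of that idea follows, in plain Python. The helper names `percentile_length` and `pad_and_mask`, the nearest-rank percentile definition, and the padding id `0` are all assumptions for illustration; the actual code in `train_vae.py` may differ.

```python
import math


def percentile_length(lengths, pct=75):
    """Nearest-rank percentile of sequence lengths (assumed definition)."""
    ordered = sorted(lengths)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def pad_and_mask(sequences, max_len, pad_id=0):
    """Truncate or pad each sequence to max_len and build a 0/1 mask.

    Mask entries are 1 for real tokens and 0 for padding, so a model
    can ignore the padded positions.
    """
    padded, masks = [], []
    for seq in sequences:
        clipped = list(seq[:max_len])
        n_pad = max_len - len(clipped)
        padded.append(clipped + [pad_id] * n_pad)
        masks.append([1] * len(clipped) + [0] * n_pad)
    return padded, masks


# Hypothetical token-id sequences standing in for tokenized sentences.
token_ids = [[5, 8, 2], [7, 1], [4, 9, 3, 6, 2, 8], [3]]
max_len = percentile_length([len(s) for s in token_ids])
padded, masks = pad_and_mask(token_ids, max_len)
```

Choosing the cut-off from a percentile rather than the maximum length keeps a few very long sentences from inflating the padded batch size, at the cost of truncating those sentences.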
Changed files:
- .idea/workspace.xml: 189 additions, 124 deletions
- data/original/relevance_User_ul.csv: 189 additions, 0 deletions
- data/original/relevance_User_ul.p: 0 additions, 0 deletions
- data/prepared/evaluation_vist.csv: 0 additions, 120 deletions
- notebooks/function_imports.py: 95 additions, 0 deletions
- train_vae.py: 19 additions, 1 deletion
- train_vae_pubmed.py: 87 additions, 0 deletions