The task was solved using two deep learning models: a multi-lingual sequence2sequence model with attention to generate a representation for each certificate line and a separate classification model to obtain the ICD10 code from these embeddings. The models were trained using only the data provided by the organizers, mostly due to our inability to obtain ICD10 dictionaries for selected languages.
\ No newline at end of file
The task was solved using two deep learning models: a sequence2sequence model to generate a representation for each certificate line and a separate classification model with attention to obtain the ICD10 labels. The models were trained using only the data provided by the organizers, mostly due to our inability to obtain ICD10 dictionaries for selected languages (besides French).