diff --git a/paper/wbi-eclef18.tex b/paper/wbi-eclef18.tex
index cebb1f3a4f6f82796965bddba59123364db90e4b..01ab85a6e06d7dc944a4242301bcacc2c90823f3 100644
--- a/paper/wbi-eclef18.tex
+++ b/paper/wbi-eclef18.tex
@@ -50,10 +50,10 @@
 This paper describes the participation of the WBI team in the CLEF eHealth 2018
 shared task 1 (``Multilingual Information Extraction - ICD-10 coding''). Our
-approach builds on two recurrent neural networks models to extract and classify
+approach builds on two recurrent neural network models to extract and classify
 causes of death from French, Italian and Hungarian death certificates. First, we
-employ a LSTM-based sequence-to-sequence model to obtain a disease name for each
-death certificate line. We then utilize a bidirectional LSTM model with
+employ an LSTM-based sequence-to-sequence model to obtain a symptom name from each
+death certificate line. We then utilize a bidirectional LSTM model with an
 attention mechanism to assign the respective ICD-10 codes to the received
-disease names. Our model achieves \ldots
+symptom names. Our model achieves \ldots
 
 \keywords{ICD-10 coding \and Biomedical information extraction \and
@@ -91,9 +91,9 @@
 was encouraged.
 \section{Methods}
-Our approach models the extraction and classification of death causes as
+Our approach models the extraction and classification of death causes as a
 two-step process. First, we employ a neural, multi-language sequence-to-sequence
-model to receive a disease name for a given death certificate line. We will then
+model to derive a symptom name for a given death certificate line. We then
 use a second classification model to assign the respective ICD-10 codes to the
-obtained disease names. The remainder of this section gives a short introduction
+obtained symptom names. The remainder of this section gives a short introduction
 to recurrent neural networks, followed by a detailed explanation of our two models.
 \subsection{Recurrent neural networks}
@@ -125,12 +125,12 @@
-data from left to right, and and backward chain, consuming the data in the
+data from left to right, and a backward chain, consuming the data in the
 opposite direction. The final representation is typically the concatenation or
 a linear combination of both states.
 
-\subsection{Disease Name Model}
-The first step in our pipeline is the extraction of a disease name from a given
+\subsection{Symptom Model}
+The first step in our pipeline is the extraction of a symptom name from a given
 death certificate line. We use the training certificate lines (with their
-corresponding ICD-10 codes) and the ICD-10 dictionaries as basis for
-our model. The dictionaries provide us with a disease name for each ICD-10 code.
-The goal of the model is to reassemble the dictionary disease name from the
+corresponding ICD-10 codes) and the ICD-10 dictionaries as the basis for
+our model. The dictionaries provide us with a symptom name for each ICD-10 code.
+The goal of the model is to reassemble the dictionary symptom name from the
 certificate line.
 For this we adopt the encoder-decoder architecture proposed in
@@ -149,19 +149,19 @@
-the word. The encoders final state represents the semantic meaning of the
-certificate line and serves as intial input for decoding process.
-As decoder with utilize another LSTM model. The initial input of the decoder is
-the final state of the encoder. Moreover, each token of the dictionary disease
+the word. The encoder's final state represents the semantic meaning of the
+certificate line and serves as initial input for the decoding process.
+As decoder we utilize another LSTM model. The initial input of the decoder is
+the final state of the encoder. Moreover, each token of the dictionary symptom
 name (padded with special start and end tag) serves as input for the different
-time steps. Again, we use FastEmbeddngs of all three languages to represent the
-token. The decoder predicts one-hot-encoded words of the disease name. During
+time steps. Again, we use FastText embeddings of all three languages to represent the
+token. The decoder predicts one-hot-encoded words of the symptom name. During
 test time we use the encoder to obtain a semantic representation of the
-certificate line and decode the disease name word by word starting with the
-special start tag. The decoding process finishs when the decoder outputs the
+certificate line and decode the symptom name word by word starting with the
+special start tag. The decoding process finishes when the decoder outputs the
 end tag.
 
 \subsection{ICD-10 Classification Model}
-The second step in our pipeline is to assign a ICD-10 code to the obtained
-disease name. For this purpose we employ a bidirectional LSTM model which is
-able to capture the past and future context for each token of a disease name.
+The second step in our pipeline is to assign an ICD-10 code to the obtained
+symptom name. For this purpose we employ a bidirectional LSTM model which is
+able to capture the past and future context for each token of a symptom name.
-Just as in our encoder-decoder disease name model we encode each token using the
+Just as in our encoder-decoder symptom model we encode each token using the
 concatenation of the FastText embeddings of the word from all three languages.
-To enable our model to attend to different parts of a disease name we add an
+To enable our model to attend to different parts of a symptom name we add an
@@ -171,6 +171,12 @@
 model using the provided ICD-10 dictionaries from all three languages. During
-development we also experimented with character-level RNNs, but couldn't
-achieve any approvements.
+development we also experimented with character-level RNNs, but could not
+achieve any improvements.
+
+\begin{figure}
+\includegraphics[width=\textwidth]{Input.pdf}
+\caption{\nj{TODO: Insert caption!}}
+\label{fig1}
+\end{figure}
 
 \section{Experiments and Results}
 \nj{TODO: Insert text!}
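As a note alongside the patch: the attention layer the classification model adds — a softmax-weighted sum over the BiLSTM's per-token states — can be sketched in a few lines. This is a minimal illustration only; the names (`softmax`, `attention_pool`) and the plain dot-product scoring vector are assumptions for clarity, not the paper's actual implementation, where the scoring parameters would be learned jointly with the network.

```python
import math

def softmax(scores):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(states, w):
    """Pool per-token states into one vector via attention.

    states: list of hidden vectors, one per token (e.g. BiLSTM outputs)
    w:      scoring vector (here fixed; learned in a real model)
    """
    # One scalar relevance score per token (dot product with w).
    scores = [sum(h_i * w_i for h_i, w_i in zip(h, w)) for h in states]
    # Normalize scores into attention weights that sum to 1.
    alphas = softmax(scores)
    # Weighted sum of the token states, dimension by dimension.
    dim = len(states[0])
    return [sum(a * h[d] for a, h in zip(alphas, states)) for d in range(dim)]

# Toy usage: 3 tokens with 3-dimensional hidden states.
states = [[1.0, 0.0, 0.5],
          [0.2, 1.0, 0.0],
          [0.9, 0.1, 0.3]]
w = [1.0, -1.0, 0.5]
pooled = attention_pool(states, w)  # single 3-dim summary vector
```

The pooled vector would then feed a softmax output layer over the ICD-10 label set; tokens whose states score higher against `w` dominate the summary, which is what lets the model focus on the informative parts of a symptom name.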