diff --git a/paper/encoder-decoder-model.docx b/paper/encoder-decoder-model.docx
new file mode 100644
index 0000000000000000000000000000000000000000..2b62f3a2df1223e6b0da7fb4886dcaa5e1290c94
Binary files /dev/null and b/paper/encoder-decoder-model.docx differ
diff --git a/paper/encoder-decoder-model.pdf b/paper/encoder-decoder-model.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..14e5c1cfef1f60cbbd2b6bec7bdfb42a0dc90e73
Binary files /dev/null and b/paper/encoder-decoder-model.pdf differ
diff --git a/paper/wbi-eclef18.tex b/paper/wbi-eclef18.tex
index 01ab85a6e06d7dc944a4242301bcacc2c90823f3..ba02265bb7a54a05004715209ca352fa303ffde5 100644
--- a/paper/wbi-eclef18.tex
+++ b/paper/wbi-eclef18.tex
@@ -134,9 +134,10 @@ The goal of the model is to reassemble the dictionary symptom name from the
 certificate line.
 
 For this we adopt the encoder-decoder architecture proposed in
-\cite{sutskever_sequence_2014}. As encoder we utilize a forward LSTM model,
-which takes the single words of a certificate line as inputs and scans the line
-from left to right. Each token will be represented using pre-trained FastText
+\cite{sutskever_sequence_2014}. Figure \ref{fig:encoder_decoder} illustrates the
+architecture of the model. As encoder we utilize a forward LSTM, which takes
+the individual words of a certificate line as input and scans the line from
+left to right. Each token is represented using pre-trained FastText
 word embeddings. Word embedding models represent words using a real-valued
 vector and capture syntactic and semantic similarities between them. FastText
 embeddings take sub-word information into account during training whereby the
@@ -168,15 +169,19 @@ To enable our model to attend to different parts of a disease name we add an
 extra attention layer \cite{raffel_feed-forward_2015} to the model. We train the
 model using the provided ICD-10 dictionaries from all three languages.
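+
+The following listing sketches this architecture in PyTorch. It is an
+illustrative reconstruction of the description above rather than our actual
+implementation; module names, dimensions, and the exact placement of the
+attention layer are placeholders.
+
+\begin{verbatim}
+# Illustrative sketch of the encoder-decoder with feed-forward
+# attention (Raffel & Ellis, 2015); names and sizes are placeholders.
+import torch
+import torch.nn as nn
+
+class FeedForwardAttention(nn.Module):
+    # Scores each encoder state with a small MLP and returns the
+    # softmax-weighted sum of all states as a context vector.
+    def __init__(self, dim):
+        super().__init__()
+        self.score = nn.Sequential(
+            nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))
+
+    def forward(self, states):                # (batch, seq, dim)
+        alpha = torch.softmax(self.score(states).squeeze(-1), dim=1)
+        return (alpha.unsqueeze(-1) * states).sum(dim=1)
+
+class SymptomSeq2Seq(nn.Module):
+    def __init__(self, emb_dim, hid_dim, vocab_size):
+        super().__init__()
+        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
+        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
+        self.attention = FeedForwardAttention(hid_dim)
+        self.out = nn.Linear(2 * hid_dim, vocab_size)
+
+    def forward(self, line_emb, target_emb):
+        # The encoder scans the certificate line left to right;
+        # its final state initialises the decoder.
+        enc_states, last = self.encoder(line_emb)
+        dec_states, _ = self.decoder(target_emb, last)
+        # Attend over the encoder states and predict word by word.
+        ctx = self.attention(enc_states)
+        ctx = ctx.unsqueeze(1).expand(-1, dec_states.size(1), -1)
+        return self.out(torch.cat([dec_states, ctx], dim=-1))
+\end{verbatim}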
 
-During development we also experimented with character-level RNNs, but
-couldn't achieve any approvements.
+During development we also experimented with character-level RNNs for better
+ICD-10 classification; however, we could not achieve any performance
+improvements.
 
 \begin{figure}
-\includegraphics[width=\textwidth]{Input.pdf}
-\caption{A figure caption is always placed below the illustration.
-Please note that short captions are centered, while long ones are
-justified by the macro package automatically.} 
-\label{fig1}
+\includegraphics[width=\textwidth,trim={0 17cm 0 3cm},clip=true]{encoder-decoder-model.pdf}
+\caption{Illustration of the neural encoder-decoder model for symptom
+extraction. The encoder processes a death certificate line token-wise from left
+to right. The final state of the encoder forms a semantic representation of the
+line and serves as the initial input for the decoding process. The decoder is
+trained to predict the symptom name word by word. All input tokens are
+represented using the concatenation of the FastText embeddings of all three
+languages.}
+\label{fig:encoder_decoder}
 \end{figure}
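+
+As a usage illustration of the input representation described in
+Figure \ref{fig:encoder_decoder}, a token vector could be assembled by
+concatenating per-language FastText embeddings. The sketch below assumes the
+official \texttt{fasttext} Python bindings and placeholder model file names:
+
+\begin{verbatim}
+# Hypothetical sketch: trilingual token representation built from
+# concatenated FastText vectors (model paths are assumptions).
+import numpy as np
+import fasttext
+
+models = [fasttext.load_model(path)
+          for path in ("fr.bin", "hu.bin", "it.bin")]
+
+def embed(token):
+    # One embedding per language, concatenated into a single
+    # input vector for the encoder and decoder.
+    return np.concatenate([m.get_word_vector(token) for m in models])
+\end{verbatim}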
 
 \section{Experiments and Results}