-
Hermann Stolte authoredd72807a0
SOUND
In the following, we are providing necessary setup steps as well as instructions on how to reproduce the SOUND experiments. All steps have been tested under MacOS Ventura 13.6.
Setup
Step 0: Dependencies
The following dependencies are not managed by our automatic setup and need to be installed beforehand.:
- git
- wget
- maven
- unzip
- OpenJDK 1.8
- python3 (+ pip3)
For Ubuntu 20.04: sudo apt-get install git wget maven unzip openjdk-8-jdk python3-pip python3-yaml libyaml-dev cython
Below we assume the repository has been cloned into the directory repo_dir
.
Step 1: Install Python Requirements
- Using a virtual environment is suggested.
- Initializing the environment and running the experiments requires:
gdown pyyaml tqdm
- Recreating the figures requires additionally:
pandas matplotlib seaborn numpy adjustText xlsxwriter
To install the python requirements automatically, from repo_dir
, run one of the following commands:
# Full Dependencies (Running Experiments and Plotting)
pip install -r requirements.txt
# Minimal Dependencies (Running Experiments Only)
pip install -r requirements-minimal.txt
Step 2: Automated Setup
This method will automatically download Apache Flink and the input datasets, and compile the framework.
- From
repo_dir
, run./auto-setup.sh
- Run
./init-configs.sh
to use the Flink configuration used in our experiments
(Alternative) Manual Setup
In case of problems with the automatic setup, you can prepare the environment manually:
- Download Apache Flink 1.14.0 and decompress in
repo_dir
- Get the input datasets, either using
./get-datasets.sh
or by manually downloading from here and decompressing inrepo_dir/data/input
- Compile the two experiment jars, from repo dir:
mvn -f helper_pom.xml clean package && mv target/helper*.jar jars; mvn clean package; mvn install
- Run
./init-configs.sh
to use the Flink configurations used in our experiments
Running Experiments
Automatic Reproduction of the Paper's Experiments and Plots
The reproduce/sound/
directory contains a script for each evaluation figure of the paper, which will run the experiment automatically, store the results, and create a plot. You need to provide the #repetitions
and duration
in minutes as arguments.
For example, to reproduce Figure 5, run (from repo_dir
):
# Reproduce Figure 5, left panel, of the paper for 5 repetitions of 1.5 minutes
./reproduce/sound/figure6_left.sh 5 1.5
To reproduce all experiments sequentially, run (from repo_dir
):
./reproduce/sound/complete.sh
Results are stored in the folder data/output
.
Some experiment scripts print debugging information that is usually safe to ignore, as long as the figures are generated successfully.