# SOUND

In the following, we are providing necessary setup steps as well as instructions on how to reproduce the
SOUND experiments. All steps have been tested under **MacOS Ventura 13.6**.

## Setup

### Step 0: Dependencies

The following dependencies are not managed by our automatic setup and need to be installed
beforehand.:

- git
- wget
- maven
- unzip
- OpenJDK 1.8
- python3 (+ pip3)

For Ubuntu 20.04: `sudo apt-get install git wget maven unzip openjdk-8-jdk python3-pip python3-yaml libyaml-dev cython`

Below we assume the repository has been cloned into the directory `repo_dir`. 

### Step 1: Install Python Requirements

- Using a virtual environment is suggested. 
- Initializing the environment and running the experiments requires: `gdown pyyaml tqdm`
- Recreating the figures requires additionally: `pandas matplotlib seaborn numpy adjustText xlsxwriter` 

To install the python requirements automatically, from `repo_dir`, run *one* of the following commands:

```bash
# Full Dependencies (Running Experiments and Plotting)
pip install -r requirements.txt
# Minimal Dependencies (Running Experiments Only)
pip install -r requirements-minimal.txt

```

### Step 2: Automated Setup

This method will automatically download Apache Flink and the input datasets,
and compile the framework.

1. From `repo_dir`, run `./auto-setup.sh`
2. Run `./init-configs.sh` to use the Flink configuration used in our experiments


#### (Alternative) Manual Setup

In case of problems with the automatic setup, you can prepare the environment manually:

1. Download [Apache Flink 1.14.0](https://archive.apache.org/dist/flink/flink-1.14.0/flink-1.14.0-bin-scala_2.11.tgz) and decompress in `repo_dir`
3. Get the input datasets, either using `./get-datasets.sh` or by manually downloading from [here](https://zenodo.org/records/14007044) and decompressing in `repo_dir/data/input`
4. Compile the two experiment jars, from repo dir: `mvn -f helper_pom.xml clean package && mv target/helper*.jar jars; mvn clean package; mvn install`
2. Run `./init-configs.sh` to use the Flink configurations used in our experiments


## Running Experiments

### Automatic Reproduction of the Paper's Experiments and Plots

The `reproduce/sound/` directory contains a script for each evaluation figure of the paper, which will run the experiment automatically, store the results, and create a plot. You need to provide the `#repetitions` and `duration` in minutes as arguments.
For example, to reproduce Figure 5, run (from `repo_dir`):

```bash
# Reproduce Figure 5, left panel, of the paper for 5 repetitions of 1.5 minutes
./reproduce/sound/figure6_left.sh 5 1.5 
```

To reproduce all experiments sequentially, run (from `repo_dir`):
```bash
./reproduce/sound/complete.sh
```

Results are stored in the folder `data/output`.

*Some experiment scripts print debugging information that
is usually safe to ignore, as long as the figures are generated successfully.*