initial commit

Commit a2e1c0aa authored 8 months ago by Hermann Stolte
--- a/.gitattributes
+++ b/.gitattributes
+# Auto detect text files and perform LF normalization
+* text=auto
--- a/.gitignore
+++ b/.gitignore
+.DS_Store
+reproduce/.DS_Store
--- a/README.md
+++ b/README.md
+# SOUND
+This document describes the required setup steps and how to reproduce the
+SOUND experiments. All steps have been tested under **macOS Ventura 13.6**.
+## Setup
+### Step 0: Dependencies
+The following dependencies are not managed by our automatic setup and need to be installed
+beforehand:
+- git
+- wget
+- maven
+- unzip
+- OpenJDK 1.8
+- python3 (+ pip3)
+For Ubuntu 20.04: `sudo apt-get install git wget maven unzip openjdk-8-jdk python3-pip python3-yaml libyaml-dev cython`
+Below we assume the repository has been cloned into the directory `repo_dir`. 
+### Step 1: Install Python Requirements
+- Using a virtual environment is suggested. 
+- Initializing the environment and running the experiments requires: `gdown pyyaml tqdm`
+- Recreating the figures additionally requires: `pandas matplotlib seaborn numpy adjustText xlsxwriter`
+To install the Python requirements automatically, run *one* of the following commands from `repo_dir`:
+```bash
+# Full Dependencies (Running Experiments and Plotting)
+pip install -r requirements.txt
+# Minimal Dependencies (Running Experiments Only)
+pip install -r requirements-minimal.txt
+```
+### Step 2: Automated Setup
+This method will automatically download Apache Flink and the input datasets,
+and compile the framework.
+1. From `repo_dir`, run `./auto-setup.sh`
+2. Run `./init-configs.sh` to use the Flink configuration used in our experiments
+#### (Alternative) Manual Setup
+In case of problems with the automatic setup, you can prepare the environment manually:
+1. Download [Apache Flink 1.14.0](https://archive.apache.org/dist/flink/flink-1.14.0/flink-1.14.0-bin-scala_2.11.tgz) and decompress it in `repo_dir`
+2. Get the input datasets, either using `./get-datasets.sh` or by manually downloading them from [here](https://drive.google.com/u/0/uc?id=1uNOUlCoa9CfH7WCxe3nSsCwQVVjf9teB) and decompressing them in `repo_dir/data/input`
+3. Compile the two experiment jars, from `repo_dir`: `mvn -f helper_pom.xml clean package && mv target/helper*.jar jars; mvn clean package; mvn install`
+4. Run `./init-configs.sh` to use the Flink configuration used in our experiments
+## Running Experiments
+### Automatic Reproduction of the Paper's Experiments and Plots
+The `reproduce/sound/` directory contains one script for each evaluation figure of the paper. Each script runs the experiment automatically, stores the results, and creates the plot. Each script takes two arguments: the number of repetitions and the duration of each repetition in minutes.
+For example, to reproduce the left panel of Figure 6, run (from `repo_dir`):
+```bash
+# Reproduce Figure 6, left panel, of the paper for 5 repetitions of 1.5 minutes
+./reproduce/sound/figure6_left.sh 5 1.5
+```
+To reproduce all experiments sequentially, run (from `repo_dir`):
+```bash
+./reproduce/sound/complete.sh
+```
+Results are stored in the folder `data/output`.
+*Some experiment scripts print debugging information that
+is usually safe to ignore, as long as the figures are generated successfully.*
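
The README describes the calling convention of the figure scripts: the number of repetitions first, then the duration in minutes. The following is a minimal Python sketch of a wrapper that validates both arguments before launching a script. The function names `build_reproduce_command` and `run_reproduce` are hypothetical and not part of the repository; only the `reproduce/sound/` path and the `figure6_left.sh` example come from the README.

```python
import subprocess
from pathlib import PurePosixPath


def build_reproduce_command(repo_dir, script_name, repetitions, duration_minutes):
    """Return the argv list for one figure script under reproduce/sound/."""
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")
    if duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    script = PurePosixPath(repo_dir) / "reproduce" / "sound" / script_name
    # Argument order follows the README: repetitions first, then minutes.
    return [str(script), str(repetitions), str(duration_minutes)]


def run_reproduce(repo_dir, script_name, repetitions, duration_minutes):
    """Run the script from repo_dir and raise if it exits with a non-zero status."""
    command = build_reproduce_command(repo_dir, script_name, repetitions, duration_minutes)
    subprocess.run(command, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Only prints the command; call run_reproduce(...) to actually start the experiment.
    print(build_reproduce_command("/path/to/repo_dir", "figure6_left.sh", 5, 1.5))
```

Separating command construction from execution keeps the argument handling checkable without starting a Flink job.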
--- a/auto-setup.sh
+++ b/auto-setup.sh
+#!/usr/bin/env bash
+set -e
+echo "Automatic setup about to start"
+sleep 3
+echo "Downloading Apache Flink..."
+wget -O flink-1.14.0-bin-scala_2.11.tgz https://archive.apache.org/dist/flink/flink-1.14.0/flink-1.14.0-bin-scala_2.11.tgz
+tar zxvf flink-1.14.0-bin-scala_2.11.tgz
+rm flink-1.14.0-bin-scala_2.11.tgz 
+echo "Done."
+#echo "Downloading Apache Kafka..."
+#wget -O kafka_2.13-3.1.0.tgz https://archive.apache.org/dist/kafka/3.1.0/kafka_2.13-3.1.0.tgz
+#tar -xzf kafka_2.13-3.1.0.tgz
+#rm kafka_2.13-3.1.0.tgz
+#echo "Done."
+bash get-datasets.sh
+echo
+echo "Compiling..."
+echo
+mvn -f helper_pom.xml clean package && mv target/helper*.jar jars
+mvn clean package
+mvn install
+echo "Done!"
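
Because `auto-setup.sh` uses `set -e`, it stops at the first failing command, for example when `wget` or `mvn` is not installed. A pre-flight check along the following lines could report all missing tools at once. This is a sketch only: the executable names in `REQUIRED_TOOLS` are assumptions derived from the Step 0 dependency list (maven as `mvn`, OpenJDK as `java`), and the check does not verify versions such as OpenJDK 1.8.

```python
import shutil

# Executable names assumed for the Step 0 dependencies (maven -> mvn, OpenJDK -> java).
REQUIRED_TOOLS = ["git", "wget", "mvn", "unzip", "java", "python3", "tar"]


def missing_tools(tools=REQUIRED_TOOLS, lookup=shutil.which):
    """Return the tools for which lookup() finds no executable on PATH."""
    return [tool for tool in tools if lookup(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `lookup` parameter defaults to `shutil.which` and exists only so that the function can be exercised without depending on the local machine.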
--- a/configs/flink/flink-conf.yaml
+++ b/configs/flink/flink-conf.yaml
+################################################################################
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+################################################################################
+#==============================================================================
+# Common
+#==============================================================================
+# The external address of the host on which the JobManager runs and can be
+# reached by the TaskManagers and any clients which want to connect. This setting
+# is only used in Standalone mode and may be overwritten on the JobManager side
+# by specifying the --host <hostname> parameter of the bin/jobmanager.sh executable.
+# In high availability mode, if you use the bin/start-cluster.sh script and setup
+# the conf/masters file, this will be taken care of automatically. Yarn
+# automatically configures the host name based on the hostname of the node where the
+# JobManager runs.
+jobmanager.rpc.address: localhost
+# The RPC port where the JobManager is reachable.
+jobmanager.rpc.port: 6123
+# The total process memory size for the JobManager.
+#
+# Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.
+jobmanager.memory.process.size: 1600m
+# The total process memory size for the TaskManager.
+#
+# Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.
+taskmanager.memory.process.size: 1728m
+# To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
+# It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
+#
+# taskmanager.memory.flink.size: 1280m
+# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
+taskmanager.numberOfTaskSlots: 4
+# The parallelism used for programs that did not specify any other parallelism.
+parallelism.default: 1
+# The default file system scheme and authority.
+# 
+# By default file paths without scheme are interpreted relative to the local
+# root file system 'file:///'. Use this to override the default and interpret
+# relative paths relative to a different file system,
+# for example 'hdfs://mynamenode:12345'
+#
+# fs.default-scheme
+#==============================================================================
+# High Availability
+#==============================================================================
+# The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
+#
+# high-availability: zookeeper
+# The path where metadata for master recovery is persisted. While ZooKeeper stores
+# the small ground truth for checkpoint and leader election, this location stores
+# the larger objects, like persisted dataflow graphs.
+# 
+# Must be a durable file system that is accessible from all nodes
+# (like HDFS, S3, Ceph, nfs, ...) 
+#
+# high-availability.storageDir: hdfs:///flink/ha/
+# The list of ZooKeeper quorum peers that coordinate the high-availability
+# setup. This must be a list of the form:
+# "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
+#
+# high-availability.zookeeper.quorum: localhost:2181
+# ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
+# It can be either "creator" (ZOO_CREATE_ALL_ACL) or "open" (ZOO_OPEN_ACL_UNSAFE)
+# The default value is "open" and it can be changed to "creator" if ZK security is enabled
+#
+# high-availability.zookeeper.client.acl: open
+#==============================================================================
+# Fault tolerance and checkpointing
+#==============================================================================
+# The backend that will be used to store operator state checkpoints if
+# checkpointing is enabled. Checkpointing is enabled when execution.checkpointing.interval > 0.
+#
+# Execution checkpointing related parameters. Please refer to CheckpointConfig and ExecutionCheckpointingOptions for more details.
+#
+# execution.checkpointing.interval: 3min
+# execution.checkpointing.externalized-checkpoint-retention: [DELETE_ON_CANCELLATION, RETAIN_ON_CANCELLATION]
+# execution.checkpointing.max-concurrent-checkpoints: 1
+# execution.checkpointing.min-pause: 0
+# execution.checkpointing.mode: [EXACTLY_ONCE, AT_LEAST_ONCE]
+# execution.checkpointing.timeout: 10min
+# execution.checkpointing.tolerable-failed-checkpoints: 0
+# execution.checkpointing.unaligned: false
+#
+# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the
+# <class-name-of-factory>.
+#
+# state.backend: filesystem
+# Directory for checkpoints filesystem, when using any of the default bundled
+# state backends.
+#
+# state.checkpoints.dir: hdfs://namenode-host:port/flink-checkpoints
+# Default target directory for savepoints, optional.
+#
+# state.savepoints.dir: hdfs://namenode-host:port/flink-savepoints
+# Flag to enable/disable incremental checkpoints for backends that
+# support incremental checkpoints (like the RocksDB state backend). 
+#
+# state.backend.incremental: false
+# The failover strategy, i.e., how the job computation recovers from task failures.
+# Only restart tasks that may have been affected by the task failure, which typically includes
+# downstream tasks and potentially upstream tasks if their produced data is no longer available for consumption.
+jobmanager.execution.failover-strategy: region
+#==============================================================================
+# Rest & web frontend
+#==============================================================================
+# The port to which the REST client connects. If rest.bind-port has
+# not been specified, then the server will bind to this port as well.
+#
+#rest.port: 8081
+# The address to which the REST client will connect
+#
+#rest.address: 0.0.0.0
+# Port range for the REST and web server to bind to.
+#
+#rest.bind-port: 8080-8090
+# The address that the REST & web server binds to
+#
+#rest.bind-address: 0.0.0.0
+# Flag to specify whether job submission is enabled from the web-based
+# runtime monitor. Uncomment to disable.
+#web.submit.enable: false
+# Flag to specify whether job cancellation is enabled from the web-based
+# runtime monitor. Uncomment to disable.
+#web.cancel.enable: false
+#==============================================================================
+# Advanced
+#==============================================================================
+# Override the directories for temporary files. If not specified, the
+# system-specific Java temporary directory (java.io.tmpdir property) is taken.
+#
+# For framework setups on Yarn, Flink will automatically pick up the
+# containers' temp directories without any need for configuration.
+#
+# Add a delimited list for multiple directories, using the system directory
+# delimiter (colon ':' on unix) or a comma, e.g.:
+#     /data1/tmp:/data2/tmp:/data3/tmp
+#
+# Note: Each directory entry is read from and written to by a different I/O
+# thread. You can include the same directory multiple times in order to create
+# multiple I/O threads against that directory. This is for example relevant for
+# high-throughput RAIDs.
+#
+# io.tmp.dirs: /tmp
+# The classloading resolve order. Possible values are 'child-first' (Flink's default)
+# and 'parent-first' (Java's default).
+#
+# Child first classloading allows users to use different dependency/library
+# versions in their application than those in the classpath. Switching back
+# to 'parent-first' may help with debugging dependency issues.
+#
+# classloader.resolve-order: child-first
+# The amount of memory going to the network stack. These numbers usually need 
+# no tuning. Adjusting them may be necessary in case of an "Insufficient number
+# of network buffers" error. The default min is 64MB, the default max is 1GB.
+# 
+# taskmanager.memory.network.fraction: 0.1
+# taskmanager.memory.network.min: 64mb
+# taskmanager.memory.network.max: 1gb
+#==============================================================================
+# Flink Cluster Security Configuration
+#==============================================================================
+# Kerberos authentication for various components - Hadoop, ZooKeeper, and connectors -
+# may be enabled in four steps:
+# 1. configure the local krb5.conf file
+# 2. provide Kerberos credentials (either a keytab or a ticket cache w/ kinit)
+# 3. make the credentials available to various JAAS login contexts
+# 4. configure the connector to use JAAS/SASL
+# The settings below configure how Kerberos credentials are provided. A keytab will be used instead of
+# a ticket cache if the keytab path and principal are set.
+# security.kerberos.login.use-ticket-cache: true
+# security.kerberos.login.keytab: /path/to/kerberos/keytab
+# security.kerberos.login.principal: flink-user
+# The configuration below defines which JAAS login contexts
+# security.kerberos.login.contexts: Client,KafkaClient
+#==============================================================================
+# ZK Security Configuration
+#==============================================================================
+# Below configurations are applicable if ZK ensemble is configured for security
+# Override below configuration to provide custom ZK service name if configured
+# zookeeper.sasl.service-name: zookeeper
+# The configuration below must match one of the values set in "security.kerberos.login.contexts"
+# zookeeper.sasl.login-context-name: Client
+#==============================================================================
+# HistoryServer
+#==============================================================================
+# The HistoryServer is started and stopped via bin/historyserver.sh (start|stop)
+# Directory to upload completed jobs to. Add this directory to the list of
+# monitored directories of the HistoryServer as well (see below).
+#jobmanager.archive.fs.dir: hdfs:///completed-jobs/
+# The address under which the web-based HistoryServer listens.
+#historyserver.web.address: 0.0.0.0
+# The port under which the web-based HistoryServer listens.
+#historyserver.web.port: 8082
+# Comma separated list of directories to monitor for completed jobs.
+#historyserver.archive.fs.dir: hdfs:///completed-jobs/
+# Interval in milliseconds for refreshing the monitored directories.
+#historyserver.archive.fs.refresh-interval: 10000
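
Most entries in this file are commented out, so only a handful of settings are active. The sketch below, written under the assumption that the file consists of flat `key: value` lines as shown above, lists the active settings and checks whether a requested parallelism fits into the task slots of a single TaskManager (with Flink's default slot sharing, a job needs as many slots as its highest operator parallelism). The helper names are hypothetical, and trailing inline comments after a value are not handled.

```python
def read_active_settings(text):
    """Parse flat 'key: value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values such as hdfs://host:port stay intact.
        key, separator, value = line.partition(":")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def fits_in_one_taskmanager(settings, parallelism):
    """True if the requested parallelism does not exceed the configured task slots."""
    return parallelism <= int(settings["taskmanager.numberOfTaskSlots"])


# Excerpt of the configuration above; the first line is commented out and ignored.
SAMPLE = """
# taskmanager.memory.flink.size: 1280m
taskmanager.memory.process.size: 1728m
taskmanager.numberOfTaskSlots: 4
parallelism.default: 1
"""

if __name__ == "__main__":
    active = read_active_settings(SAMPLE)
    print(active)
    print(fits_in_one_taskmanager(active, 4))
```

With `taskmanager.numberOfTaskSlots: 4`, the experiment templates that request `parallelism: 4` use every slot of one TaskManager.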
--- a/data/input/.keep
+++ b/data/input/.keep
--- a/data/output/.keep
+++ b/data/output/.keep
--- a/data/remote/.keep
+++ b/data/remote/.keep
--- a/experiments/checkresults/AstroCheckResultSparsityCase.yaml
+++ b/experiments/checkresults/AstroCheckResultSparsityCase.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'AstroOverhead'
+n_samples: 100
+CI: 950
+dimensions:
+  schema: variant.manual_sparsity
+  bufferDelay:
+    - 0
+  manual_sparsity:
+    #    - 1
+    - 0
+    - 5
+    - 3
+variants:
+  - name: SOUND
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.astro.CaseWithSparsity {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --nSamples {n_samples} --CI {CI}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/astro"
+experiment_args: "--inputFile extended"
+parallelism: 1
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
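
The templates rely on `{name}` placeholders that may themselves expand to further placeholders, for example `{flink_cmd}` contains `{repo_dir}`. How the repository's runner performs this substitution is not shown in this commit; the following is a minimal sketch of one way to do it, using repeated passes until nothing changes. The function `resolve` and the path `/opt/sound` are hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def resolve(template, values, max_passes=10):
    """Substitute {name} placeholders repeatedly, so values may contain placeholders too.

    Placeholders without an entry in `values` are left untouched.
    """
    text = template
    for _ in range(max_passes):
        expanded = PLACEHOLDER.sub(
            lambda match: str(values.get(match.group(1), match.group(0))), text
        )
        if expanded == text:
            return text
        text = expanded
    raise ValueError("placeholders did not settle; possible cyclic definition")


VALUES = {
    "repo_dir": "/opt/sound",  # hypothetical absolute path of the repository
    "flink_cmd": "{repo_dir}/flink-1.14.0/bin/flink run --class",
    "query_jar": "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar",
    "parallelism": 1,
}

TEMPLATE = (
    "{flink_cmd} io.stolther.soundcheck.usecases.astro.CaseWithSparsity "
    "{query_jar} --parallelism {parallelism} {extra_args}"
)

if __name__ == "__main__":
    print(resolve(TEMPLATE, VALUES))
```

In this sketch `{extra_args}` stays in the output because no value is supplied for it, which makes unresolved placeholders easy to spot.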
--- a/experiments/checkresults/AstroCheckResultVACase.yaml
+++ b/experiments/checkresults/AstroCheckResultVACase.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'AstroVA'
+CI: 950
+n_samples: 200
+dimensions:
+  schema: variant.manual_sparsity
+  bufferDelay:
+    - 0
+  #    - 3600
+  manual_sparsity:
+    #    - 3
+    - 0
+variants:
+  - name: SOUND
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.astro.CaseWithVA {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --nSamples {n_samples} --CI {CI}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/astro"
+experiment_args: "--inputFile extended"
+parallelism: 1
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
--- a/experiments/checkresults/AstroCheckResultWithoutSOUNDCase.yaml
+++ b/experiments/checkresults/AstroCheckResultWithoutSOUNDCase.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'AstroOverhead'
+CI: 997
+dimensions:
+  schema: variant.nSamples
+  bufferDelay:
+    - 0
+  #    - 3600
+  #  manual_sparsity:
+  #    - 3
+  #  #    - 0
+  nSamples:
+    - 0
+    - 200
+variants:
+  - name: SOUND
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.astro.CaseWithSparsity {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --manual_sparsity 0 --parallelism {parallelism} --CI {CI}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/astro"
+experiment_args: "--inputFile extended"
+parallelism: 1
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
--- a/experiments/checkresults/SmartGridAnomalyCheckResult.yaml
+++ b/experiments/checkresults/SmartGridAnomalyCheckResult.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'SmartGridAnomalyCheckResult'
+dimensions:
+  schema: variant.bufferDelay
+  bufferDelay:
+    - 0
+#    - 3600
+variants:
+  - name: SOUND
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.smartgrid.sound.SmartGridAnomaly {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/sg-debs"
+experiment_args: "--inputFile sg-debs-1G"
+parallelism: 2
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
--- a/experiments/checkresults/SmartGridAnomalyCheckResultCICase.yaml
+++ b/experiments/checkresults/SmartGridAnomalyCheckResultCICase.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'SmartGridAnomalyCaseWithUncertainty'
+n_samples: 200
+manual_value_uncertainty: 1
+dimensions:
+  schema: variant.CI
+  bufferDelay:
+    - 0
+  CI:
+    - 900
+    #    - 950
+    - 999
+variants:
+  - name: SOUND
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.smartgrid.sound.SmartGridAnomalyCaseWithUncertainty {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --nSamples {n_samples} --manual_value_uncertainty {manual_value_uncertainty}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/sg-debs"
+experiment_args: "--inputFile sg-debs-1G"
+parallelism: 2
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
--- a/experiments/checkresults/SmartGridAnomalyCheckResultNOSOUNDCase.yaml
+++ b/experiments/checkresults/SmartGridAnomalyCheckResultNOSOUNDCase.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'SmartGridAnomalyCaseWithUncertainty'
+manual_value_uncertainty: 1
+CI: 997
+dimensions:
+  schema: variant.nSamples
+  bufferDelay:
+    - 0
+  nSamples:
+    - 0
+    - 200
+variants:
+  - name: SOUND
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.smartgrid.sound.SmartGridAnomalyCaseWithUncertainty {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --CI {CI} --manual_value_uncertainty {manual_value_uncertainty}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/sg-debs"
+experiment_args: "--inputFile sg-debs-1G"
+parallelism: 2
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
--- a/experiments/checkresults/SmartGridAnomalyCheckResultNSamplesCase.yaml
+++ b/experiments/checkresults/SmartGridAnomalyCheckResultNSamplesCase.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'SmartGridAnomalyCaseWithUncertainty'
+CI: 900
+dimensions:
+  bufferDelay:
+    - 0
+  #    - 3600
+  nSamples:
+    - 10
+    - 200
+  schema: variant.nSamples
+variants:
+  - name: SOUND
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.smartgrid.sound.SmartGridAnomalyCaseWithUncertainty {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --manual_value_uncertainty {manual_value_uncertainty} --CI {CI}
+    args: ''
+#  --manual_value_uncertainty {manual_value_uncertainty}
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/sg-debs"
+experiment_args: "--inputFile sg-debs-1G"
+manual_value_uncertainty: 0.1
+parallelism: 2
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
--- a/experiments/checkresults/SmartGridAnomalyCheckResultNSamplesCaseB.yaml
+++ b/experiments/checkresults/SmartGridAnomalyCheckResultNSamplesCaseB.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'SmartGridAnomalyCaseWithUncertainty'
+CI: 990
+dimensions:
+  bufferDelay:
+    - 0
+  #    - 3600
+  nSamples:
+    - 10
+    - 200
+  schema: variant.nSamples
+variants:
+  - name: SOUND
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.smartgrid.sound.SmartGridAnomalyCaseWithUncertainty {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --manual_value_uncertainty {manual_value_uncertainty} --CI {CI}
+    args: ''
+#  --manual_value_uncertainty {manual_value_uncertainty}
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/sg-debs"
+experiment_args: "--inputFile sg-debs-1G"
+manual_value_uncertainty: 0.1
+parallelism: 2
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
--- a/experiments/checkresults/SmartGridAnomalyCheckResultUncertaintyCase.yaml
+++ b/experiments/checkresults/SmartGridAnomalyCheckResultUncertaintyCase.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'SmartGridAnomalyCaseWithUncertainty'
+dimensions:
+  schema: variant.manual_value_uncertainty
+  bufferDelay:
+    - 0
+  #    - 3600
+  manual_value_uncertainty:
+    - 0.01
+    #    - 0.1
+    - 1
+    - 10
+variants:
+  - name: SOUND
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.smartgrid.sound.SmartGridAnomalyCaseWithUncertainty {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --nSamples {n_samples} --CI {CI}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/sg-debs"
+experiment_args: "--inputFile sg-debs-1G"
+parallelism: 2
+n_samples: 100
+CI: 950
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
--- a/experiments/overhead/AstroOverhead.yaml
+++ b/experiments/overhead/AstroOverhead.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'AstroOverhead'
+dimensions:
+  schema: variant.bufferDelay
+  bufferDelay:
+    - 0
+  #    - 3600
+variants:
+  - name: NS
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.astro.Query {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism}
+    args: ''
+  #  - name: SOUND
+  #    spe_command: >
+  #      {flink_cmd}
+  #      io.stolther.soundcheck.usecases.smartgrid.sound.SmartGridAnomaly {query_jar}
+  #      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+  #      --parallelism {parallelism}
+  #    args: ''
+  - name: SOUNDNOOP
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.astro.QuerySoundNoOpSink {query_jar}
+      --inputFolder {input_folder} --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --nSamples {nSamples} --CI {CI}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+input_folder: "{repo_dir}/data/input/astro"
+experiment_args: "--inputFile extended"
+nSamples: 100
+CI: 950
+parallelism: 4
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
\ No newline at end of file
--- a/experiments/overhead/AstroOverheadByCI.yaml
+++ b/experiments/overhead/AstroOverheadByCI.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'AstroOverhead'
+nSamples: 100
+dimensions:
+  CI:
+    - 900
+    - 950
+    - 990
+  schema: variant.CI
+  bufferDelay:
+    - 0
+#    - 3600
+variants:
+  - name: NS
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.astro.Query {query_jar}
+      --inputFolder {repo_dir}/data/input/astro --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --nSamples {nSamples}
+    args: ''
+  - name: SOUNDNOOP
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.astro.QuerySoundNoOpSink {query_jar}
+      --inputFolder {repo_dir}/data/input/astro --statisticsFolder {statistics_folder} {args} --inputFile extended {extra_args}
+      --parallelism {parallelism} --nSamples {nSamples}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+experiment_args: "--inputFile extended"
+parallelism: 4
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'
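
Each template lists the values to sweep under `dimensions`. The commit does not show how the runner combines them or how it interprets `schema`, so the following sketch simply assumes a full cross product over all list-valued dimensions and skips scalar entries. The function `expand_dimensions` is hypothetical.

```python
from itertools import product


def expand_dimensions(dimensions):
    """Yield one dict per combination of the list-valued dimensions.

    Scalar entries such as 'schema' are skipped; treating the sweep as a
    full cross product is an assumption of this sketch.
    """
    names = [name for name, values in dimensions.items() if isinstance(values, list)]
    for combination in product(*(dimensions[name] for name in names)):
        yield dict(zip(names, combination))


# Mirrors the dimensions block of AstroOverheadByCI.yaml.
DIMENSIONS = {
    "CI": [900, 950, 990],
    "schema": "variant.CI",
    "bufferDelay": [0],
}

if __name__ == "__main__":
    for run in expand_dimensions(DIMENSIONS):
        print(run)
```

Under this assumption the template above yields three parameter combinations, one per `CI` value, each with `bufferDelay` set to 0.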
--- a/experiments/overhead/AstroOverheadByMaxSamples.yaml
+++ b/experiments/overhead/AstroOverheadByMaxSamples.yaml
+# Experiment template file
+# All instances of {repo_dir} will be replaced by the absolute path of the repository
+executor_script: './scripts/flink_do_run.sh'
+query: 'AstroOverhead'
+CI: 950
+dimensions:
+  nSamples:
+    - 10
+    - 100
+    - 200
+  schema: variant.nSamples
+  bufferDelay:
+    - 0
+#    - 3600
+variants:
+  - name: NS
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.astro.Query {query_jar}
+      --inputFolder {repo_dir}/data/input/astro --statisticsFolder {statistics_folder} {args} {experiment_args} {extra_args}
+      --parallelism {parallelism} --CI {CI}
+    args: ''
+  - name: SOUNDNOOP
+    spe_command: >
+      {flink_cmd}
+      io.stolther.soundcheck.usecases.astro.QuerySoundNoOpSink {query_jar}
+      --inputFolder {repo_dir}/data/input/astro --statisticsFolder {statistics_folder} {args} --inputFile extended {extra_args}
+      --parallelism {parallelism} --CI {CI}
+    args: ''
+flink_cmd: "{repo_dir}/flink-1.14.0/bin/flink run --class"
+query_jar: "{repo_dir}/target/streaming-why-not-1.0-SNAPSHOT.jar"
+experiment_args: "--inputFile extended"
+parallelism: 4
+utilization_command: './scripts/utilization-flink.sh {statistics_folder}'