Analyzing anonymized buying habits
--------------------------------------------------------------

This is the code for the BA thesis "Analyzing anonymized buying habits".
The main goal is to investigate whether buying habits can be analyzed using data that fully preserves the anonymity of the buyers.
The code contains the tool used to generate purchases based on recipes from chefkoch.de and then mine association rules in these purchases, as well as all other scripts used to generate data for the paper.
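
To illustrate what mining association rules in purchases means, here is a minimal sketch that computes support and confidence for item pairs. It is not the code from this repository; the toy baskets, the thresholds, and the function names `support` and `pair_rules` are assumptions made for the example.

```python
from itertools import combinations

# Toy purchases (hypothetical data, not taken from the thesis data set).
purchases = [
    {"flour", "eggs", "milk"},
    {"flour", "eggs", "sugar"},
    {"flour", "milk"},
    {"eggs", "sugar"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def pair_rules(baskets, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, baskets)
        if pair_support < min_support:
            continue  # pair is too rare to be interesting
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, baskets)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


rules = pair_rules(purchases)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

Only item sets are needed as input, so a sketch like this works on purchases that carry no buyer identity at all.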


Installation
-------------
To use the command-line interface, make sure all SQL files are in the configured directories and run `python3 assoc_rules.py`.

Using the web interface:
To use the web component, you need to install [Flask](http://flask.pocoo.org/), a micro web framework for Python.

- It is recommended to run everything in a **virtual environment** ([venv](https://docs.python.org/3/tutorial/venv.html)), where you can install **Flask** and run the whole project (**OPT A**). Alternatively, Flask can be installed globally and everything can be run from `./web_component` (**OPT B**).

Steps:
- make sure Python 3 is installed and working. The entire project targets Python 3 and was tested on 3.5 and 3.6.
- [pip](https://pypi.python.org/pypi/pip) (package manager for Python) is also required.
To install on Ubuntu/Debian: `sudo apt-get install python3-pip`, on CentOS/Red Hat: `yum install python-pip`<br/>
- make sure pip is up to date: on Ubuntu/Debian run `sudo -H pip3 install --upgrade pip`, on CentOS/Red Hat run `pip install -U pip`<br/>
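
The prerequisites above can also be checked from Python itself. The following is a small sketch, not part of the repository; the helper names `python_ok` and `module_available` are made up for this example, and the minimum version is taken from the versions listed above.

```python
import importlib.util
import sys

# Lowest version the project is stated to have been tested on.
MIN_VERSION = (3, 5)


def python_ok(version_info=sys.version_info):
    """True if the given interpreter version is at least Python 3.5."""
    return tuple(version_info[:2]) >= MIN_VERSION


def module_available(name):
    """True if the module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


print("Python OK:", python_ok())
print("pip available:", module_available("pip"))
```

Run it with the same `python3` that will later run the project, since each interpreter has its own set of installed packages.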

**OPT A**, using a virtual environment:
- install the virtual environment (**venv**) module: on Ubuntu/Debian run `sudo apt-get install python3-venv`, on CentOS/Red Hat run `pip install -U virtualenv`
- create the virtual environment: `python3 -m venv /path/to/your/virtual_env/venv_name`
- change to the virtual environment directory: `cd /path/to/your/virtual_env/venv_name`
- copy the `./web_component` files over: `cp -R /path/to/git/repo/web_component/* ./`
- copy the core files to the venv: `cp -R /path/to/git/repo/core ./`
- activate your virtual environment: `source /path/to/your/virtual_env/venv_name/bin/activate`<br/>
The command prompt should now show the environment name in parentheses, e.g. `(venv_name)`, in front of the command line.<br/>
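
If the prompt does not change, activation can be confirmed from inside Python. This is a minimal sketch that is not part of the repository; the function name `in_virtual_env` is an assumption for the example. It relies on `sys.prefix` differing from `sys.base_prefix` inside a venv, with a fallback for older `virtualenv` releases that set `sys.real_prefix`.

```python
import sys


def in_virtual_env():
    """True if the interpreter runs inside a venv or virtualenv."""
    base = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base or hasattr(sys, "real_prefix")


print("virtual environment active:", in_virtual_env())
print("interpreter prefix:", sys.prefix)
```

When the environment is active, the printed prefix should point to `/path/to/your/virtual_env/venv_name`.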

Both **OPT A** and **OPT B**:
- install Flask: `pip3 install Flask` inside the venv (**OPT A**) or `sudo -H pip3 install Flask` to install it globally (**OPT B**).<br/>
To verify that Flask is correctly installed, run `pip3 list` and check that it appears among the installed packages.
- set the path to the SQL data files (`chefkoch.w.data.500.db` or `chefkoch.w.data.200.db`) and the path to the results output folder via the `sql_data_dir` and `results_dir` variables in `./core/assoc_rule_finder.py`.
- run the app with `python3 app.py`
- for **OPT A**, type `deactivate` at any time to exit the virtual environment.
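
Before starting the app, it can help to check that the two configured paths exist. The sketch below is a standalone illustration, not the contents of `./core/assoc_rule_finder.py`: only the variable names `sql_data_dir` and `results_dir` and the data file names come from this README, while the path values and the helper `check_paths` are hypothetical.

```python
import os

# Hypothetical example values; replace them with paths on your machine.
sql_data_dir = "/path/to/sql/data"
results_dir = "/path/to/results"

# Data file names as listed in this README.
DB_FILES = ("chefkoch.w.data.500.db", "chefkoch.w.data.200.db")


def check_paths(data_dir, out_dir, db_files=DB_FILES):
    """Return a list of problems found; an empty list means the paths look usable."""
    problems = []
    if not os.path.isdir(data_dir):
        problems.append("data directory not found: " + data_dir)
    elif not any(os.path.isfile(os.path.join(data_dir, name)) for name in db_files):
        problems.append("no known .db file in: " + data_dir)
    if not os.path.isdir(out_dir):
        problems.append("results directory not found: " + out_dir)
    return problems


for problem in check_paths(sql_data_dir, results_dir):
    print("WARNING:", problem)
```

With the placeholder paths above, the script prints one warning per missing directory; with correctly configured paths it prints nothing.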

![screenshot](./paper/images/web_screenshot.png?raw=true "anonimized_purchase_data")