anonymize

Anonymizes data read from a CSV file or a database table and writes the result to a CSV file. (More sources and outputs to come.)

Setup

Requirements

Python 3.9
poetry (optional)

Installation

Clone the repository
Install dependencies

poetry install
or
pip install -r requirements.txt

Create a config.yml file describing the source, the output, and the rules to apply (see Config and the example config).
Run: python -m anonymize -c config.yml

Config

Sources (see `sources.py`)

Database

List of supported databases.

source:
  type: db
  uri: postgres://postgres:pass@localhost:5432/postgres
  table: mydata

CSV

source:
  type: csv
  path: /path/to/data.csv
  separator: '|' # Optional, default is ','
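
To make the separator option concrete, here is a minimal sketch of reading a delimited file with Python's standard csv module. The helper name `read_rows` is hypothetical and is not taken from sources.py; the real reader may use a different library.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_rows(handle: TextIO, separator: str = ",") -> List[Dict[str, str]]:
    """Read all rows from a delimited text stream, keyed by header name."""
    # The config's `separator` maps naturally onto csv's `delimiter` argument.
    return list(csv.DictReader(handle, delimiter=separator))


# In-memory stand-in for /path/to/data.csv with '|' as the separator.
sample = io.StringIO("name|email\nAlice|alice@example.com\n")
rows = read_rows(sample, separator="|")
```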

Outputs (see `outputs.py`)

CSV

output:
  type: csv
  path: /path/to/output.csv
  separator: '|' # Optional, default is ','
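
Writing with a custom separator can be sketched the same way. This is an illustration only: `write_rows` is a hypothetical helper, and the line terminator is set explicitly here for predictable output rather than because outputs.py is known to do so.

```python
import csv
import io
from typing import Dict, Iterable, List, TextIO


def write_rows(
    handle: TextIO,
    rows: Iterable[Dict[str, str]],
    columns: List[str],
    separator: str = ",",
) -> None:
    """Write a header line followed by one delimited line per row."""
    writer = csv.DictWriter(
        handle, fieldnames=columns, delimiter=separator, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# In-memory stand-in for /path/to/output.csv.
buffer = io.StringIO()
write_rows(buffer, [{"name": "CONFIDENTIAL", "city": "Paris"}], ["name", "city"], separator="|")
output_text = buffer.getvalue()
```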

Rules (see `rules.py`)

Each rule is validated (column name and method) and its method is then applied to the column.
If a rule names a column that is not found in the source, the rule is ignored.
If a column is not found in the rules list, it is kept as is.
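
The behaviour described above can be sketched as follows. This is only an illustration under assumptions: `apply_rules` and the column-to-values table layout are hypothetical and are not taken from rules.py.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory table: column name -> list of cell values.
Table = Dict[str, List[str]]


def apply_rules(table: Table, rules: Dict[str, Callable[[str], str]]) -> Table:
    """Apply each rule to its column; skip unknown columns, keep unlisted ones."""
    result = {name: list(values) for name, values in table.items()}
    for column, method in rules.items():
        if column not in result:
            # The rule targets a column that is absent from the source: ignore it.
            continue
        result[column] = [method(value) for value in result[column]]
    return result


table = {"name": ["Alice", "Bob"], "city": ["Paris", "Rome"]}
rules = {"name": lambda value: "CONFIDENTIAL", "phone": lambda value: "x"}
anonymized = apply_rules(table, rules)
```

Here `name` is rewritten, `city` passes through untouched because no rule mentions it, and the `phone` rule is skipped because the source has no such column.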

Hash

Available algorithms are those provided by Python's hashlib module.

rules:
  - column: credit_card
    method: hash
    algorithm: md5
    salt: my_very_secret_salt
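
A salted hash along these lines could look like the sketch below. The function name `hash_value` is hypothetical, and prepending the salt to the value is an assumption; the tool may combine them differently.

```python
import hashlib


def hash_value(value: str, algorithm: str = "md5", salt: str = "") -> str:
    """Return the hex digest of salt + value using a hashlib algorithm."""
    if algorithm not in hashlib.algorithms_available:
        raise ValueError(f"Unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm)
    # Assumption: the salt is prepended to the value before hashing.
    digest.update((salt + value).encode("utf-8"))
    # Note: shake_* algorithms need an explicit digest length and are not handled here.
    return digest.hexdigest()


hashed = hash_value("4111111111111111", algorithm="md5", salt="my_very_secret_salt")
```

The same input and salt always produce the same digest, so hashed columns can still be joined or grouped after anonymization.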

Fake

Available types are: email, firstname, lastname, fullname.

rules:
  - column: name
    method: fake
    faker_type: firstname

Mask right (last n characters)

rules:
  - column: email
    method: mask_right
    n_chars: 5
    mask_char: x
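
As a rough sketch of what masking the last n characters means (the function below is hypothetical and only mirrors the option names in the config):

```python
def mask_right(value: str, n_chars: int, mask_char: str) -> str:
    """Replace the last n_chars characters of value with mask_char."""
    if n_chars <= 0:
        return value
    # Never mask more characters than the value contains.
    n = min(n_chars, len(value))
    return value[: len(value) - n] + mask_char * n


masked_email = mask_right("john@example.com", n_chars=5, mask_char="x")
# masked_email == "john@examplxxxxx"
```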

Mask left (first n characters)

rules:
  - column: birthdate
    method: mask_left
    n_chars: 4
    mask_char: "*"
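
Masking the first n characters can be sketched in the same spirit (again a hypothetical function, not the code in rules.py):

```python
def mask_left(value: str, n_chars: int, mask_char: str) -> str:
    """Replace the first n_chars characters of value with mask_char."""
    if n_chars <= 0:
        return value
    # Never mask more characters than the value contains.
    n = min(n_chars, len(value))
    return mask_char * n + value[n:]


masked_birthdate = mask_left("1990-05-17", n_chars=4, mask_char="*")
# masked_birthdate == "****-05-17"
```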

Destroy (replace with a fixed value)

The name destroy is inspired by postgresql_anonymizer.

rules:
  - column: email
    method: destroy
    replace_with: "SOME VALUE" # Optional, default is "CONFIDENTIAL"

Shuffle

Shuffles letters and digits separately, leaving other characters in place (example: abc1.2!3 -> skM4.9!0)

rules:
  - column: email
    method: shuffle
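
One reading of the example above is that each letter is replaced with a random letter and each digit with a random digit, while punctuation stays where it is. The sketch below implements that reading only; `shuffle_value` is a hypothetical name, and the actual rule may work differently (for instance by permuting the existing characters).

```python
import random
import string


def shuffle_value(value: str, rng: random.Random) -> str:
    """Swap letters for random letters and digits for random digits."""
    out = []
    for ch in value:
        if ch in string.digits:
            out.append(rng.choice(string.digits))
        elif ch in string.ascii_letters:
            out.append(rng.choice(string.ascii_letters))
        else:
            # Punctuation, whitespace and non-ASCII characters are kept as is.
            out.append(ch)
    return "".join(out)


shuffled = shuffle_value("abc1.2!3", random.Random())
```

The output keeps the length and the letter/digit/punctuation layout of the input, which keeps values such as emails or phone numbers plausible in shape.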

Contributing

🍴 Fork the repository
⬇️ Install dev dependencies: poetry install --with=dev or pip install -r requirements-dev.txt
🌳 Create a branch: git checkout -b feature/my-feature
🔧 Make your changes
✅ Run formatting, linting and tests: poe all (see pyproject.toml)
🔃 Create a pull request
