dridk/steganodf
Hiding a message in a data table such as a CSV or parquet file
Steganodf
A steganography tool for hiding a message in a dataset, such as csv or parquet files.
This tool hides a payload by permuting the rows of the dataset. The encoding is
tolerant to modification thanks to a Reed-Solomon code and Luby's LT fountain code.
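To give an intuition for how a row order can carry data, here is a minimal sketch that maps a short message to one specific permutation of the rows and back, using the factorial number system (Lehmer code). This is an illustration only, not steganodf's actual algorithm: the function names (hide, reveal, int_to_permutation, permutation_to_int) are hypothetical, rows are assumed to be distinct and sortable, and there is no error correction, so unlike the real tool it would not survive any modification of the table.

```python
from math import factorial


def int_to_permutation(value: int, n: int) -> list[int]:
    """Map an integer in [0, n!) to a permutation of range(n) (Lehmer code)."""
    if not 0 <= value < factorial(n):
        raise ValueError("value does not fit in a permutation of n items")
    items = list(range(n))
    perm = []
    for i in range(n - 1, -1, -1):
        # Each factorial-base digit selects one of the remaining items.
        index, value = divmod(value, factorial(i))
        perm.append(items.pop(index))
    return perm


def permutation_to_int(perm: list[int]) -> int:
    """Inverse of int_to_permutation."""
    items = sorted(perm)
    value = 0
    for i, p in enumerate(perm):
        index = items.index(p)
        items.pop(index)
        value += index * factorial(len(perm) - 1 - i)
    return value


def hide(rows: list[tuple], message: bytes) -> list[tuple]:
    """Reorder distinct rows so that their order encodes the message."""
    ordered = sorted(rows)  # canonical order that the decoder can recompute
    value = int.from_bytes(message, "big")
    perm = int_to_permutation(value, len(ordered))
    return [ordered[i] for i in perm]


def reveal(rows: list[tuple], length: int) -> bytes:
    """Recover a message of known byte length from the row order."""
    ordered = sorted(rows)
    perm = [ordered.index(row) for row in rows]
    return permutation_to_int(perm).to_bytes(length, "big")


# Toy table: 10 distinct rows give 10! orderings, enough for a 2-byte message.
rows = [(i, f"name{i}") for i in range(10)]
stego = hide(rows, b"hi")
recovered = reveal(stego, 2)
```

The table content is unchanged; only the order of the rows differs. A table with n distinct rows has n! possible orderings, so it can carry at most about log2(n!) bits this way, which is why larger datasets can hold longer payloads.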
Demo
You can experiment with the Python API using this Google Colab notebook.
Installation
pip install steganodf
Usage
From command line
# Encoding
steganodf encode -m hello host.csv stegano.csv
steganodf encode -m hello host.parquet stegano.parquet
steganodf encode -m hello -p password host.parquet stegano.parquet
# Decoding
steganodf decode stegano.csv
steganodf decode stegano.csv -p password
From Python
import steganodf
import polars as pl
df = pl.read_csv("https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv")
new_df = steganodf.encode(df, "made by steganodf", password="secret")
# Extract your message
message = steganodf.decode(new_df, password="secret")
Citation
Sacha Schutz, Meganne Souprayen. Watermark tabular datasets with rows permutations and fountain code. TechRxiv. April 28, 2025.
DOI: 10.36227/techrxiv.174585796.61215338/v1