false-positive reduction lab

rule tuning + cost trade-offs

problem

financial crime teams face a trade-off: rules that catch more fraud also generate more false positives. too many alerts waste investigation time and money, while overly strict thresholds let fraud slip through.

dataset

500 customers (kyc risk, pep flag, sar history)
10,000 transactions (amount, country, mcc, channel, device, ip, timestamp)
~3% of transactions labelled as fraud (fraud_y) for evaluation

features engineered

amount_z : z-score of transaction amount relative to customer history
tx_count_1d, tx_count_7d : customer activity velocity
geo_mismatch : 1 if the transaction country differs from the customer's home country, else 0
device_fanout : distinct customers per device in last 7 days
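
the feature definitions above can be sketched in plain python as follows. this is a minimal illustration, not the notebook's actual code: the dict keys (customer_id, amount, country, device, timestamp), the add_features helper and the home_country lookup are assumed names. it also computes amount_z over each customer's full history, which is a simplification.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pstdev


def in_window(rows, end, days):
    """Rows whose timestamp falls in the window (end - days, end]."""
    start = end - timedelta(days=days)
    return [r for r in rows if start < r["timestamp"] <= end]


def add_features(transactions, home_country):
    """Return copies of the transaction dicts with the engineered features added.

    All key names here are assumptions made for this sketch.
    """
    by_customer = defaultdict(list)
    by_device = defaultdict(list)
    for tx in transactions:
        by_customer[tx["customer_id"]].append(tx)
        by_device[tx["device"]].append(tx)

    enriched = []
    for tx in transactions:
        history = by_customer[tx["customer_id"]]
        amounts = [h["amount"] for h in history]
        spread = pstdev(amounts)
        # Simplification: the z-score uses the customer's whole history.
        amount_z = (tx["amount"] - mean(amounts)) / spread if spread > 0 else 0.0

        # Distinct customers seen on this device in the last 7 days
        # (always >= 1 because the current transaction is included).
        recent = in_window(by_device[tx["device"]], tx["timestamp"], 7)
        device_fanout = len({r["customer_id"] for r in recent})

        enriched.append({
            **tx,
            "amount_z": amount_z,
            "tx_count_1d": len(in_window(history, tx["timestamp"], 1)),
            "tx_count_7d": len(in_window(history, tx["timestamp"], 7)),
            "geo_mismatch": int(tx["country"] != home_country[tx["customer_id"]]),
            "device_fanout": device_fanout,
        })
    return enriched


t0 = datetime(2024, 1, 1, 12)
example = [
    {"customer_id": "c1", "amount": 10.0, "country": "GB", "device": "d1", "timestamp": t0},
    {"customer_id": "c1", "amount": 30.0, "country": "FR", "device": "d1",
     "timestamp": t0 + timedelta(hours=2)},
    {"customer_id": "c2", "amount": 50.0, "country": "GB", "device": "d1",
     "timestamp": t0 + timedelta(days=3)},
]
features = add_features(example, home_country={"c1": "GB", "c2": "GB"})
```

the nested scans are quadratic within each customer and device, which is acceptable at this dataset's size; the same logic maps onto group-by and rolling-window operations in a dataframe library.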

rules

r1 : amount > 95th percentile per customer
r2 : tx_count_1d >= 10
r3 : geo_mismatch == 1 and amount > 80th percentile globally

final alert logic = r1 OR (r2 AND r3)
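
a minimal sketch of how the three rules and the combined alert logic could be applied. it assumes each transaction is a dict that already carries tx_count_1d and geo_mismatch, and that the tunable parameter p is the per-customer percentile used by r1. the percentile and flag_alerts helpers and their parameter names are hypothetical.

```python
from collections import defaultdict


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def flag_alerts(transactions, p=0.95, global_q=0.80, min_count_1d=10):
    """Evaluate r1, r2, r3 and the combined alert flag for every transaction."""
    per_customer = defaultdict(list)
    for tx in transactions:
        per_customer[tx["customer_id"]].append(tx["amount"])
    customer_cut = {cid: percentile(a, p) for cid, a in per_customer.items()}
    global_cut = percentile([tx["amount"] for tx in transactions], global_q)

    flagged = []
    for tx in transactions:
        r1 = tx["amount"] > customer_cut[tx["customer_id"]]
        r2 = tx["tx_count_1d"] >= min_count_1d
        r3 = tx["geo_mismatch"] == 1 and tx["amount"] > global_cut
        flagged.append({**tx, "r1": r1, "r2": r2, "r3": r3,
                        "alert": r1 or (r2 and r3)})
    return flagged


# Toy data: one customer with amounts 1..20, plus one fast, foreign transaction.
sample = [
    {"customer_id": "c1", "amount": float(a), "tx_count_1d": 1, "geo_mismatch": 0}
    for a in range(1, 21)
]
sample.append({"customer_id": "c2", "amount": 18.0, "tx_count_1d": 12, "geo_mismatch": 1})
flagged = flag_alerts(sample)
alerts = [tx for tx in flagged if tx["alert"]]
```

lowering p makes r1 fire on more transactions, which is the lever the threshold tuning explores.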

results

baseline (p=0.95): 304 alerts, precision ~1.6%, recall ~24%, weekly cost ~£12.5k
tuned (p≈0.70): 329 alerts, precision ~2.1%, recall ~33%, weekly cost ~£11.9k
drift monitoring showed r1 stable, r3 fluctuating, and higher alert rates in the online and mobile channels
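
the metrics reported above can be computed from alert flags and fraud_y labels along these lines. this is a sketch under stated assumptions: the cost model (a fixed cost per investigated alert plus a fixed cost per missed fraud) and both unit costs are hypothetical placeholders, not the values used in the analysis.

```python
def evaluate(alerts, labels, cost_per_alert=25.0, cost_per_missed_fraud=500.0):
    """Precision, recall and an illustrative cost for one threshold setting.

    alerts: iterable of booleans, one per transaction
    labels: iterable of 0/1 fraud labels (fraud_y)
    The two unit costs are made-up defaults for illustration only.
    """
    pairs = list(zip(alerts, labels))
    n_alerts = sum(1 for a, _ in pairs if a)
    n_fraud = sum(1 for _, y in pairs if y)
    true_pos = sum(1 for a, y in pairs if a and y)
    missed = n_fraud - true_pos
    return {
        "alerts": n_alerts,
        "precision": true_pos / n_alerts if n_alerts else 0.0,
        "recall": true_pos / n_fraud if n_fraud else 0.0,
        "cost": n_alerts * cost_per_alert + missed * cost_per_missed_fraud,
    }


def cheapest(results):
    """Pick the threshold with the lowest cost from {threshold: evaluate(...)}."""
    return min(results, key=lambda threshold: results[threshold]["cost"])


toy_alerts = [True, True, False, False, True]
toy_fraud_y = [1, 0, 1, 0, 0]
summary = evaluate(toy_alerts, toy_fraud_y)
```

under a cost model of this shape, a setting that raises more alerts can still be cheaper overall if it misses less fraud.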

plots

cost curve, pr curve and drift monitoring charts are in docs/.

disclaimer

this project uses synthetic data. it is an educational prototype, not a production aml engine or legal advice.

repository structure

false_positive_lab/
README.md # project overview and instructions
requirements.txt # dependencies
.gitignore # files to ignore in git
LICENSE # open source license (MIT)
data/ # synthetic customer and transaction datasets
docs/ # charts (cost curve, pr curve, drift monitoring)
notebooks/ # main analysis notebook
outputs/ # csv of threshold evaluation
tests/ # simple pytest scripts

basic validation checks are included in /tests/test_features.py:

transactions_fp.csv exists in /data
amount_z exists and has no nulls
geo_mismatch is binary (0/1)
device_fanout is always >= 1
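
the checks listed above could be expressed roughly as follows. this is an illustrative sketch, not the contents of test_features.py: the check_features helper and the set of strings treated as nulls are assumptions.

```python
import csv
from pathlib import Path

# Strings treated as null in the csv; an assumption for this sketch.
NULLS = {"", "nan", "NaN", "null", "None"}


def check_features(csv_path):
    """Return a list of problems found; an empty list means every check passed."""
    path = Path(csv_path)
    if not path.exists():
        return [f"{path} does not exist"]

    with path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        rows = list(reader)

    missing = [c for c in ("amount_z", "geo_mismatch", "device_fanout")
               if c not in columns]
    if missing:
        return [f"column {c} is missing" for c in missing]

    def cell(row, column):
        return (row.get(column) or "").strip()

    problems = []
    if any(cell(r, "amount_z") in NULLS for r in rows):
        problems.append("amount_z contains nulls")
    if any(cell(r, "geo_mismatch") not in {"0", "1"} for r in rows):
        problems.append("geo_mismatch is not binary")
    if any(cell(r, "device_fanout") in NULLS or float(cell(r, "device_fanout")) < 1
           for r in rows):
        problems.append("device_fanout below 1")
    return problems


# Inside a pytest test this could be used as:
#     assert check_features("data/transactions_fp.csv") == []
```

returning a list of problems rather than failing on the first one makes a single test run report every broken column.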

run tests

pytest -v

userenigmatic/false-positive-reduction-lab