Leaking Sensitive Financial Accounting Data in Plain Sight using Deep Autoencoder Neural Networks

The data leakage process applied to learn a steganographic model of real-world accounting data. The process is designed to encode sensitive Enterprise Resource Planning (ERP) system information into unobtrusive ‘day-to-day’ cover images and to decode it from them again.

Organizations collect vast quantities of sensitive information in ‘Enterprise Resource Planning’ (ERP) systems, such as accounting-relevant transactions, customer master data, or strategic sales price information. The leakage of such information poses a severe threat to companies, as the number of incidents and the reputational damage to those experiencing them continue to increase. At the same time, discoveries in deep learning research have revealed that machine learning models can be maliciously misused to create new attack vectors.

Understanding the nature of such attacks is becoming increasingly important for (internal) audit and fraud examination practice. This awareness is needed in particular for fraudulent data leakage based on deep learning steganographic techniques, which might remain undetected by state-of-the-art ‘Computer Assisted Audit Techniques’ (CAATs). In this work, we introduce a real-world ‘threat model’ designed to leak sensitive accounting data. In addition, we show that a deep steganographic process consisting of three neural networks can be trained to hide such data in unobtrusive ‘day-to-day’ images. Finally, we provide qualitative and quantitative evaluations on two publicly available real-world payment datasets.
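To make the three-network structure more concrete, the following is a minimal, untrained sketch of how such a pipeline could be wired together. The text above only states that three neural networks are used. The roles assigned here (a preparation network, a hiding network, and a reveal network), the single linear layers, the toy vector sizes, and the `beta`-weighted loss are all assumptions borrowed from a common deep steganography layout, not a description of the actual models.

```python
import random

def dense(n_in, n_out, rng):
    """Return a randomly initialised linear layer as (weights, bias)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(layer, x):
    """Apply y = W x + b to a flat list of floats."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def steganography_loss(cover, container, secret, revealed, beta=1.0):
    """Cover distortion plus beta-weighted secret reconstruction error (assumed objective)."""
    return mse(cover, container) + beta * mse(secret, revealed)

rng = random.Random(0)
N_SECRET, N_COVER, N_FEAT = 8, 16, 12  # hypothetical toy sizes

# Three hypothetical networks, each reduced to one linear layer for illustration.
prep_net = dense(N_SECRET, N_FEAT, rng)            # secret -> features
hide_net = dense(N_COVER + N_FEAT, N_COVER, rng)   # cover + features -> container
reveal_net = dense(N_COVER, N_SECRET, rng)         # container -> recovered secret

secret = [float(rng.randint(0, 1)) for _ in range(N_SECRET)]  # stand-in for encoded ERP record bits
cover = [rng.random() for _ in range(N_COVER)]                # stand-in for flattened cover pixels

features = forward(prep_net, secret)
container = forward(hide_net, cover + features)  # image that would be sent out
revealed = forward(reveal_net, container)        # what the receiver would decode

loss = steganography_loss(cover, container, secret, revealed)
```

In this sketch, training would minimise `loss` jointly over all three networks, so that the container stays close to the cover image while the secret remains recoverable. The weights here are random, so `revealed` does not yet match `secret`.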