Anomaly detection in double-entry bookkeeping data by federated learning system with non-model sharing approach.
Sota Mashiko, Yuji Kawamata, Tomoru Nakayama, Tetsuya Sakurai, Yukihiko Okada
Abstract
Anomaly detection is crucial in financial auditing, yet effective detection often requires large volumes of data from multiple organizations. However, confidentiality concerns hinder data sharing among audit firms. Existing journal entry anomaly detectors built on model-sharing federated learning (FL) avoid transferring raw data but still demand multiple parameter-exchange rounds with external servers, which forces devices holding confidential data to be connected to external networks. We propose a new framework based on data collaboration (DC) analysis, a non-model-sharing FL technique that enables anomaly detection without requiring devices holding confidential data to be directly connected to external networks. Our method first encodes journal entry data via dimensionality reduction to obtain secure intermediate representations, then transforms them into collaboration representations on which an autoencoder is trained. Notably, the approach does not require raw data to be exposed or devices to connect to external networks, and the process needs only one round of communication. We evaluated the framework on synthetic and real journal entry datasets from eight organizations. Experiments show that the DC-based approach not only surpasses models trained locally but also outperforms model-sharing FL methods such as FedAvg and FedProx, especially under non-i.i.d. conditions that reflect practical audits. This work demonstrates how organizational knowledge can be integrated while preserving confidentiality, advancing practical intelligent auditing systems.
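The pipeline summarized in the abstract (local dimensionality reduction, alignment into collaboration representations, then a reconstruction-based detector) can be illustrated with a minimal NumPy sketch. Several parts are assumptions not stated in the abstract: the shareable anchor dataset and the SVD/pseudo-inverse alignment follow the commonly described formulation of DC analysis; the data are synthetic and low-rank so that the alignment is exact; only two organizations are simulated; and a linear autoencoder (PCA reconstruction) stands in for the neural autoencoder. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for encoded journal entries (assumption):
# 6 observed features driven by 3 latent factors.
n_features, n_latent = 6, 3
mixing = rng.normal(size=(n_latent, n_features))

def sample(n):
    return rng.normal(size=(n, n_latent)) @ mixing

# Two organizations with private data, plus shareable anchor data (assumption).
private_data = [sample(200), sample(150)]
anchor = sample(50)

def fit_private_map(X, dim):
    """Uncentered truncated SVD basis; it stays inside the organization."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:dim].T  # shape (n_features, dim)

# Step 1 (local): intermediate representations of private and anchor data.
maps = [fit_private_map(X, n_latent) for X in private_data]
inter_data = [X @ F for X, F in zip(private_data, maps)]
inter_anchor = [anchor @ F for F in maps]

# Step 2 (analyst): build a common target Z from the stacked intermediate
# anchors, then solve min_G ||A_i G - Z||_F for each organization.
u, _, _ = np.linalg.svd(np.hstack(inter_anchor), full_matrices=False)
Z = u[:, :n_latent]
aligners = [np.linalg.pinv(A_i) @ Z for A_i in inter_anchor]

# Collaboration representations now live in one shared coordinate system.
collab_data = [X_i @ G for X_i, G in zip(inter_data, aligners)]
collab_anchor = [A_i @ G for A_i, G in zip(inter_anchor, aligners)]

# Step 3: linear autoencoder stand-in, scored by reconstruction error.
def fit_linear_autoencoder(X, bottleneck):
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:bottleneck].T

def anomaly_scores(X, mean, basis):
    centered = X - mean
    reconstruction = centered @ basis @ basis.T
    return np.sum((centered - reconstruction) ** 2, axis=1)

mean, basis = fit_linear_autoencoder(np.vstack(collab_data), bottleneck=2)
scores = [anomaly_scores(X, mean, basis) for X in collab_data]
```

In this sketch only the intermediate representations leave each organization, and they are sent once, which mirrors the single communication round described above. With real, noisy data the anchor alignment would hold only approximately rather than exactly.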