Deep Learning Approach to Fraud
Data scientists rely on domain knowledge and intuition to create feature spaces for adversarial use cases like fraud and infosec. The hope is that feature spaces made up of business attributes and statistical summaries can be used for anomaly / outlier detection by delineating normal and extraordinary user behaviors (Hooi).
Hooi offers the following feature space for "online commerce, in which fraudulent sellers write or purchase fake reviews to manipulate perception of their products and services."
Once the feature space is defined, a variety of statistical and unsupervised machine learning approaches are available for anomaly detection including:
DBSCAN
MeanShift
Gaussian Mixture Model (GMM)
Principal components analysis (PCA)
One-class SVM
Z-Score and Median absolute deviation (MAD)
Hierarchical clustering
Hidden Markov Model (HMM)
Self-Organizing Maps (SOM)
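As an illustration of the simplest entry in the list above, the following is a minimal sketch of a MAD-based outlier score. The data (reviews posted per day by one seller), the 0.6745 scaling constant, and the 3.5 cutoff are assumptions chosen for illustration; none of them come from the cited papers.

```python
import statistics

def modified_z_scores(values):
    """Score each value by its distance from the median, in units of scaled MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; more than half the values are identical")
    # 0.6745 makes the score comparable to a standard z-score for normal data.
    return [0.6745 * (v - med) / mad for v in values]

# Hypothetical feature: number of reviews one seller received on each day.
reviews_per_day = [2, 3, 1, 4, 2, 3, 2, 40, 3, 1]
scores = modified_z_scores(reviews_per_day)

# Assumed cutoff: flag days whose absolute score exceeds 3.5.
flagged = [i for i, s in enumerate(scores) if abs(s) > 3.5]
print(flagged)
```

Unlike a plain z-score, the median and MAD are not dragged toward the outlier itself, which is why the burst day stands out so clearly here.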
However, when the anomaly scores of these methods are compared against fully labeled datasets, their detection rate turns out to be rather poor: measured as an F1 score, it is often less than 0.5 (Domingues).
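For readers unfamiliar with the metric, here is a small sketch of how an F1 score is computed once labels are available. The label and prediction vectors are made up for illustration and are not taken from Domingues.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # fraud correctly flagged
    fp = sum(1 for t, p in pairs if not t and p)    # normal wrongly flagged
    fn = sum(1 for t, p in pairs if t and not p)    # fraud that was missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical ground truth and detector output for ten accounts.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0, 0, 0]
f1 = f1_score(y_true, y_pred)
print(round(f1, 3))
```

Because F1 ignores true negatives, a detector cannot score well simply by labeling everything as normal, which matters when fraud is rare.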
Sölch points out that one of the biggest drawbacks of such anomaly detection approaches is the common assumption "that data streams are i.i.d. in time and/or space." Hand-crafting a feature space that captures these temporal and spatial dependencies, and so does away with this assumption, may be beyond the ability of even the most skilled practitioners.
In many other domains, such as computer vision, speech recognition, and language translation, feature learning has been observed to produce significantly better results than manually engineered inputs. However, these domains have the benefit of labeled datasets.
In fraud, labeled datasets are often hard or impossible to come by. Instead, autoencoders and other generative approaches are being studied as a possible mechanism for anomaly detection. An autoencoder is a neural network trained to reconstruct its original inputs after passing them through a smaller hidden space. Anomaly detection can be accomplished with an autoencoder by using the reconstruction probability or reconstruction error (An): inputs that the trained network reconstructs poorly are flagged as anomalous.
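The reconstruction-error idea can be sketched with a deliberately tiny autoencoder. This is not An and Cho's variational model: it is a linear autoencoder with two inputs, one hidden unit, and no biases, written in plain Python and trained by gradient descent on synthetic data. The data, learning rate, epoch count, and threshold rule are all assumptions made for illustration.

```python
import random

def train_linear_autoencoder(data, lr=0.05, epochs=500, seed=0):
    """Fit x_hat = dec * (enc . x) for 2-D inputs with a single hidden unit."""
    rng = random.Random(seed)
    enc = [rng.uniform(-0.5, 0.5) for _ in range(2)]  # encoder weights
    dec = [rng.uniform(-0.5, 0.5) for _ in range(2)]  # decoder weights
    n = len(data)
    for _ in range(epochs):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            h = enc[0] * x[0] + enc[1] * x[1]            # hidden code
            r = [dec[0] * h - x[0], dec[1] * h - x[1]]   # reconstruction residual
            back = r[0] * dec[0] + r[1] * dec[1]         # gradient reaching h
            for j in range(2):
                g_dec[j] += 2 * r[j] * h / n
                g_enc[j] += 2 * back * x[j] / n
        for j in range(2):                               # full-batch update
            enc[j] -= lr * g_enc[j]
            dec[j] -= lr * g_dec[j]
    return enc, dec

def reconstruction_error(x, enc, dec):
    """Squared distance between an input and its reconstruction."""
    h = enc[0] * x[0] + enc[1] * x[1]
    return (dec[0] * h - x[0]) ** 2 + (dec[1] * h - x[1]) ** 2

# Synthetic "normal" behavior: the second feature is roughly twice the first.
data_rng = random.Random(42)
normal = []
for _ in range(200):
    t = data_rng.uniform(-1, 1)
    normal.append((t, 2 * t + data_rng.gauss(0, 0.05)))

enc, dec = train_linear_autoencoder(normal)

# Assumed rule: anything reconstructed worse than every training point is anomalous.
threshold = max(reconstruction_error(x, enc, dec) for x in normal)
for point in [(0.5, 1.0), (1.0, -2.0)]:
    err = reconstruction_error(point, enc, dec)
    print(point, "anomaly" if err > threshold else "normal")
```

A linear autoencoder like this one learns essentially the same subspace as PCA, so it only demonstrates the scoring mechanism. The approaches discussed here rely on nonlinear and probabilistic models to go beyond that baseline.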
To capture the spatial and temporal aspects of the data within the autoencoder, more sophisticated neural network architectures can be used including convnets and recurrent neural networks (Xie).
In limited testing, An and Cho show a significant improvement in F1 scores when using a variational autoencoder rather than PCA. Thus, the hidden units that these autoencoders learn may be a promising way to model the ingenuity of an active adversary.
References
An, Jinwon, and Sungzoon Cho. "Variational Autoencoder based Anomaly Detection using Reconstruction Probability." (2015).
Berniker, Max, and Konrad P. Kording. "Deep networks for motor control functions." Frontiers in computational neuroscience 9 (2015).
Domingues, Rémi. "Machine Learning for Unsupervised Fraud Detection." (2015).
Hooi, Bryan, et al. "BIRDNEST: Bayesian Inference for Ratings-Fraud Detection." arXiv preprint arXiv:1511.06030 (2015).
Sölch, Maximilian, et al. "Variational Inference for On-line Anomaly Detection in High-Dimensional Time Series." arXiv preprint arXiv:1602.07109 (2016).
Xie, Jianwen, et al. "A theory of generative convnet." arXiv preprint arXiv:1602.03264 (2016).