With over 128,000,000 accounts, 7,000,000 transactions a day, and $160 billion in transactions per year, if we can trim even 1/10th of 1% of the fraud, we can save tens of millions of dollars annually.

– Franz online banking client

Online fraud schemes are becoming more sophisticated every day.

To catch these virtually imperceptible patterns, this online bank needed a new analytic approach that could connect events and analyze:

  • Geo-Spatial – Where are transactions happening?
  • Temporal – When are these transactions taking place?
  • Social Network & Relationship Analysis – Who knows whom in this transaction group?
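
As an illustration of how the geo-spatial and temporal dimensions can be combined, the sketch below flags consecutive transactions on the same account that would require an implausible travel speed. The `Txn` record, the `implausible_pairs` helper, and the 900 km/h limit are assumptions made for this example; they are not details of the bank's system.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


@dataclass
class Txn:
    """Hypothetical transaction record (field names are assumptions)."""
    account: str
    when: datetime
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def implausible_pairs(txns, max_kmh=900.0):
    """Return (earlier, later) pairs of consecutive transactions on the
    same account whose implied travel speed exceeds max_kmh.

    max_kmh is an assumed threshold, roughly airliner cruising speed.
    """
    flagged = []
    last_seen = {}  # account -> most recent transaction seen so far
    for txn in sorted(txns, key=lambda t: t.when):
        prev = last_seen.get(txn.account)
        if prev is not None:
            hours = (txn.when - prev.when).total_seconds() / 3600
            km = haversine_km(prev.lat, prev.lon, txn.lat, txn.lon)
            # Compare by multiplication so a zero time gap cannot divide by zero.
            if km > max_kmh * hours:
                flagged.append((prev, txn))
        last_seen[txn.account] = txn
    return flagged
```

In practice the location would often be a proxy (for example an IP-derived or merchant address) rather than a precise coordinate, so any threshold of this kind would need tuning against real data.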

Their Goal

Deploy an industry-standard semantic analytics approach that is highly scalable and extends their existing use of Hadoop in fraud pattern detection.

Their Challenges

  • A Big Data problem that cannot be solved with Hadoop alone
  • Disambiguation of people and places
  • Computing the non-obvious relationships between people using multiple, varied social data sources
  • Multiple, disparate data sources
  • Scale to terabytes of content spanning multiple repositories and forms
  • Uncover relationships not discoverable by traditional name/place matching
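
To make the disambiguation challenge concrete, the sketch below shows one simple form of threshold-based name matching using Python's standard `difflib` module. It is an illustrative stand-in, not the method used in this deployment; the helper names and the 0.85 threshold are assumptions.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " "
                   for ch in name.lower())
    return " ".join(kept.split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_same(a, b, threshold=0.85):
    """True when two names are similar enough to be treated as a candidate
    match. The threshold is an assumed value and would need tuning."""
    return similarity(a, b) >= threshold


if __name__ == "__main__":
    for pair in [("Jon A. Smith", "John A Smith"),
                 ("John Smith", "Maria Garcia")]:
        print(pair, round(similarity(*pair), 3), likely_same(*pair))
```

String similarity alone cannot separate two different people who share a name, which is why the approach described here also draws on relationships, places, and events.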

The Solution

Hadoop platform for
  • Large dataset processing
  • Semi-structured data processing
  • Economical scalability of data storage and processing
  • Map-Reduce framework
Mahout machine learning platform
  • Extraction of blocks of metadata from large XML fields
  • Machine learning for field information
  • Streaming input to Hadoop file system (HDFS)
  • Map-Reduce framework
AllegroGraph Semantic platform
  • Creation of semantic triples
  • RDF, triple store and ontology platform
  • Resolution of ambiguous names, abbreviations, places
  • Identify affiliations/relationships
  • Threshold-based matching of people relationships
  • Analysis of connectedness, clusters of people and centrality
  • Assembly of the notion of an Event
    • Geographic Event correlation using proxies for location
    • Temporal Event correlation
    • Social inter-connectivity
    • Financial transaction Events
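
The ideas of semantic triples, connectedness, clusters, and centrality listed above can be sketched in a few lines of plain Python. This is not AllegroGraph code and does not use its API; the triples, predicate names, and helper functions are hypothetical and exist only to show the shape of the analysis.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; all identifiers are made up.
triples = [
    ("person:alice", "usesDevice", "device:d1"),
    ("person:bob", "usesDevice", "device:d1"),
    ("person:bob", "sharesAddress", "addr:a1"),
    ("person:carol", "sharesAddress", "addr:a1"),
    ("person:dave", "usesDevice", "device:d2"),
]


def person_graph(triples):
    """Link two subjects whenever they point at the same object."""
    by_object = defaultdict(set)
    for subject, _predicate, obj in triples:
        by_object[obj].add(subject)
    graph = defaultdict(set)
    for people in by_object.values():
        for person in people:
            graph[person] |= people - {person}
    return dict(graph)


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1)
            for node, neighbours in graph.items()}


def clusters(graph):
    """Connected components, found with an iterative depth-first search."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


if __name__ == "__main__":
    graph = person_graph(triples)
    print(degree_centrality(graph))
    print(clusters(graph))
```

In this toy data, bob links alice and carol through a shared device and a shared address, so he has the highest degree centrality, while dave forms a cluster of his own. A triple store would express the same questions as queries over far larger graphs.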

The Benefits

  • Enables sophisticated techniques to uncover the hidden connections between people and transactions and discover previously undetected fraud
  • System learns over time to improve unique fraud pattern recognition
  • Architecture scales to real-world needs with commercial Hadoop, keeping computing costs economical for tracking and discovery
  • Substantial cost savings from reduced fraud losses