Online banking

Fraud detection, big data and semantics

Online fraud schemes are becoming more sophisticated every day. To catch these virtually imperceptible patterns, this online bank needed a new analytic approach that could connect events and analyze them along three dimensions:

  • Geo-Spatial – Where are transactions happening?

  • Temporal – When are these transactions taking place?

  • Social Network & Relationship Analysis – Who knows whom in this transaction group?

Additionally, the solution needed to work with multiple Big Data sources, not all of which are built or managed by the bank itself, e.g., social media information and law enforcement records.
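To make the geo-spatial and temporal dimensions concrete, here is a minimal Python sketch of one way events could be connected: it flags pairs of events on the same account whose distance and time gap imply an implausible travel speed. This is an illustrative heuristic only, not the bank's actual method. The `Event` fields, the `implausible_pairs` function, and the 900 km/h speed limit are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import combinations
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Event:
    """A hypothetical transaction event: who, when, and where."""
    account: str
    when: datetime
    lat: float
    lon: float


def haversine_km(a: Event, b: Event) -> float:
    """Great-circle distance between two events, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def implausible_pairs(events, max_kmh=900.0):
    """Return same-account event pairs that imply travel faster than max_kmh.

    max_kmh is an assumed threshold (roughly airliner speed).
    """
    flagged = []
    ordered = sorted(events, key=lambda e: e.when)
    for a, b in combinations(ordered, 2):
        if a.account != b.account:
            continue
        hours = (b.when - a.when).total_seconds() / 3600
        # Compare distance against the farthest reachable distance,
        # which avoids dividing by zero for simultaneous events.
        if haversine_km(a, b) > max_kmh * hours:
            flagged.append((a, b))
    return flagged
```

A production system would use richer location proxies and learned thresholds; the point here is only that combining "where" and "when" exposes patterns that neither dimension shows alone.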

Their Goal:

Deploy an industry-standard semantic analytics approach that is highly scalable and extends their existing use of Hadoop in fraud pattern detection.

Their Challenges:

  • A Big Data problem that cannot be solved with Hadoop alone
  • Computing the non-obvious relationships between people using multiple, varied social data sources
  • Disambiguation of people and places
  • Multiple, disparate data sources
  • Scale to terabytes of content spanning multiple repositories and forms
  • Uncover relationships not discoverable by traditional name/place matching 
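The disambiguation challenge above can be illustrated with a small Python sketch of threshold-based fuzzy matching. It uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure; the case study does not say which measure was actually used. The abbreviation table, the function names, and the 0.85 threshold are assumptions chosen for this example.

```python
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real system would use a much larger,
# curated or learned vocabulary.
ABBREVIATIONS = {
    "wm": "william",
    "rob": "robert",
    "st": "street",
    "ave": "avenue",
}


def normalize(text: str) -> str:
    """Lowercase, drop simple punctuation, and expand known abbreviations."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_entity(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two names or places as the same entity above the threshold."""
    return similarity(a, b) >= threshold
```

With these assumptions, `same_entity("Wm. Smith", "William Smith")` is true because both normalize to the same string, while an exact string comparison would miss the match.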

The Solution:

  • Hadoop platform for:
    • Large dataset processing
    • Semi-structured data processing
    • Economical scalability of data storage and processing
    • Map-Reduce framework
  • Mahout machine learning platform
    • Extraction of blocks of metadata from large XML fields
    • Machine learning for field information
    • Streaming input to the Hadoop Distributed File System (HDFS)
    • Map-Reduce framework
  • AllegroGraph Semantic platform:
    • Creation of semantic triples
    • RDF, triple store and ontology platform
    • Resolution of ambiguous names, abbreviations, places
    • Identify affiliations/relationships
    • Threshold-based matching of people relationships
    • Analysis of connectedness, clusters of people and centrality
    • Assembly of the notion of an Event
      • Geography Event correlation using proxies for location
      • Temporal Event correlation
      • Social inter-connectivity
      • Financial transaction Events 
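As a rough illustration of the semantic layer listed above, the following Python sketch represents facts as subject-predicate-object triples, links people who share an attribute, and then measures connectedness, clusters, and centrality. It is an in-memory stand-in only and does not use the AllegroGraph API. The predicate names, identifiers, and sample data are hypothetical.

```python
# Hypothetical (subject, predicate, object) triples.
SAMPLE_TRIPLES = [
    ("person:alice", "usesDevice", "device:d1"),
    ("person:bob", "usesDevice", "device:d1"),
    ("person:bob", "hasAddress", "addr:42-main"),
    ("person:carol", "hasAddress", "addr:42-main"),
    ("person:dave", "hasAddress", "addr:7-oak"),
]


def link_people(triples):
    """Build an undirected graph linking subjects that share a (predicate, object)."""
    by_attribute = {}
    for subject, predicate, obj in triples:
        by_attribute.setdefault((predicate, obj), set()).add(subject)
    # Start every subject with no neighbours so isolated people still appear.
    graph = {subject: set() for subject, _, _ in triples}
    for people in by_attribute.values():
        for person in people:
            graph[person] |= people - {person}
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def clusters(graph):
    """Connected components, found with an iterative depth-first search."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result
```

In the sample data, bob shares a device with alice and an address with carol, so he has the highest degree centrality and the three form one cluster, while dave stands alone. A triple store would answer the same kind of question with graph queries over far larger and more varied data.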

The Benefits:

• Enables sophisticated techniques to uncover the hidden connections between people and transactions and discover previously undetected fraud
• System learns over time to improve recognition of unique fraud patterns
• Architecture scales to real-world needs with commercial Hadoop, keeping computing costs for tracking and discovery economical
• Substantial cost savings from reduced fraud losses

[Figure: online bank Big Data diagram]

AllegroGraph turns complex data into actionable business insights