Disambiguation of names – for identification

Los Alamos National Labs - 99% name recognition


With the AllegroGraph semantic graph platform, Los Alamos National Labs is able to disambiguate names with 99% accuracy. Their framework can disambiguate and resolve names that appear with different spellings, as abbreviations, or as initials only, while connecting these people to other people, events, geographic locations, and other conceptual data.


Their Goal:

Build a scalable application for processing terabytes of names and co-incident data using a demonstration dataset of structured and semi-structured bibliographic metadata to resolve authors, co-authors, all their associated publications, and shared affiliations.

Their Challenges:

  • Disambiguation of people’s names – for spelling variants, nicknames, misspellings, abbreviations
  • Uncovering entity relationships:
    • Events
    • Locations
    • Other people, items
    • Conceptual data
  • Structured and semi-structured data
  • Scalable architecture for terabytes of content spanning multiple repositories and forms
  • Uncover relationships not discoverable by traditional name matching
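
To make the name-variant challenge concrete, here is a minimal Python sketch of one way to test whether two name strings could refer to the same person. It is an illustration only, not the Los Alamos method: the function names, the "given names then surname" ordering, and the 0.8 similarity threshold are all assumptions chosen for the example.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, strip accents and punctuation, and split into tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(
        c if c.isalnum() or c.isspace() else " " for c in no_accents.lower()
    )
    return cleaned.split()


def tokens_compatible(a, b, threshold=0.8):
    """An initial matches any token with the same first letter;
    full tokens match when their spelling is similar enough."""
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return SequenceMatcher(None, a, b).ratio() >= threshold


def names_compatible(name_a, name_b, threshold=0.8):
    """Hypothetical rule: the last token is treated as the surname and must
    match; given names are compared pairwise, in order, as far as both exist."""
    ta, tb = normalize(name_a), normalize(name_b)
    if not ta or not tb:
        return False
    if not tokens_compatible(ta[-1], tb[-1], threshold):
        return False
    return all(
        tokens_compatible(x, y, threshold) for x, y in zip(ta[:-1], tb[:-1])
    )


if __name__ == "__main__":
    # Made-up names, used only to exercise the rule.
    print(names_compatible("Jonathan Q. Smith", "J. Smith"))
    print(names_compatible("Jon Smith", "John Smith"))
    print(names_compatible("J. Smith", "Robert Smith"))
```

String matching of this kind only produces candidate pairs. As the last challenge above notes, confirming that two variants are the same person generally needs additional evidence such as shared co-authors or affiliations.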

Their Solution:

  • AllegroGraph semantic platform:
    • RDF, triple store and ontology platform
    • Social Network Analysis query feature
      •  Analysis of connectedness, clusters of people and centrality
    • Identify affiliations/relationships
    • AllegroGraph triple store API
    • Custom triple store calculations
  • Gruff – graph database visualization and query creation
  • Commercial applications for text extraction, open source tools for phoneme extraction
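
The connectedness, cluster, and centrality analysis listed above can be illustrated with a small co-authorship graph. The sketch below uses plain Python rather than AllegroGraph's Social Network Analysis features, whose API is not shown here. The function names and the sample author lists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_graph(publications):
    """Link every pair of authors who share at least one publication."""
    graph = defaultdict(set)
    for authors in publications:
        for author in authors:
            graph[author]  # touching the key registers solo authors as nodes
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def clusters(graph):
    """Connected components, found with an iterative depth-first search."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result


if __name__ == "__main__":
    # Hypothetical author lists, one per publication.
    publications = [["A", "B", "C"], ["C", "D"], ["E"]]
    graph = build_coauthor_graph(publications)
    print(degree_centrality(graph))
    print(clusters(graph))
```

Degree centrality is only one of several centrality measures. It is used here because it is the simplest to compute and to check by hand.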

The Benefits:

This proof-of-concept for Los Alamos is the foundational work for identification of inter-relationships that represent potential threats to national security.


Download a detailed whitepaper on this use case, written by the Los Alamos team.


Los-alamos-disambiguation -hadoop
