With the AllegroGraph semantic graph platform, Los Alamos National Labs is able to disambiguate names with 99% accuracy. Their framework can resolve names that appear with different spellings, as abbreviations, or as initials only, while connecting these people to other people, events, geographic locations, and other conceptual data.
Mirroring (the) natural thought process we have created a prototype framework for disambiguating entities…
– Los Alamos National Labs
Their Goal
Build a scalable application for processing terabytes of names and co-occurring data. The demonstration dataset consists of structured and semi-structured bibliographic metadata, used to resolve authors, co-authors, their associated publications, and shared affiliations.
Their Challenges
Disambiguation of people’s names: spelling variants, nicknames, misspellings, abbreviations
Uncovering entity relationships:
- Events
- Locations
- Other people, items
- Conceptual data
Structured and semi-structured data
Scalable architecture for terabytes of content spanning multiple repositories and forms
Uncover relationships not discoverable by traditional name matching
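To make the name-variant challenge concrete, here is a minimal sketch of a first-pass candidate matcher in plain Python. It is not the Los Alamos framework or any AllegroGraph API; the function names and the similarity thresholds are assumptions chosen for illustration. It only flags pairs of names that could refer to the same person (initials, abbreviations, small misspellings). Confirming a match would still require the relationship evidence listed above, such as shared co-authors or affiliations.

```python
import re
from difflib import SequenceMatcher


def parse_name(raw):
    """Split a raw name into (given_tokens, surname), lowercased.

    Accepts "Surname, Given" or "Given Surname"; a hypothetical helper.
    """
    raw = raw.strip()
    if "," in raw:
        surname, given = [part.strip() for part in raw.split(",", 1)]
    else:
        parts = raw.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    given_tokens = [t for t in re.split(r"[\s.]+", given.lower()) if t]
    return given_tokens, surname.lower()


def tokens_compatible(a, b, threshold=0.8):
    """An initial matches any token with the same first letter;
    full tokens must be nearly identical (threshold is an assumption)."""
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return SequenceMatcher(None, a, b).ratio() >= threshold


def may_be_same_person(name_a, name_b, surname_threshold=0.85):
    """Return True if the two names are plausible variants of one name."""
    given_a, surname_a = parse_name(name_a)
    given_b, surname_b = parse_name(name_b)
    # Surnames must be close, which tolerates small misspellings.
    if SequenceMatcher(None, surname_a, surname_b).ratio() < surname_threshold:
        return False
    # Compare given names position by position; extra tokens are ignored.
    return all(tokens_compatible(x, y) for x, y in zip(given_a, given_b))
```

For example, `may_be_same_person("Smith, John A.", "J. Smith")` treats the initial as compatible with the full given name, while `may_be_same_person("Smith, John", "Smith, Jane")` rejects the pair because the given names differ too much.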
The Solution
AllegroGraph semantic platform:
RDF, triple store and ontology platform
Social Network Analysis query feature
- Analysis of connectedness, clusters of people and centrality
Identify affiliations/relationships
AllegroGraph triple store API
Custom triple store calculations
Gruff – graph database visualization and query creation
Commercial applications for text extraction, open source tools for phonemes and extraction
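The social network analysis items in the list above (connectedness, clusters of people, centrality) can be illustrated with a small sketch. This is plain Python standing in for a triple store, not AllegroGraph's query feature or API; the triples, the `hasAuthor` predicate, and the identifiers are hypothetical. It derives a co-author graph from subject-predicate-object triples, then computes degree centrality and connected clusters.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (subject, predicate, object) triples; all names are illustrative.
TRIPLES = [
    ("paper:1", "hasAuthor", "person:a"),
    ("paper:1", "hasAuthor", "person:b"),
    ("paper:2", "hasAuthor", "person:b"),
    ("paper:2", "hasAuthor", "person:c"),
    ("paper:3", "hasAuthor", "person:d"),
    ("paper:3", "hasAuthor", "person:e"),
]


def coauthor_graph(triples):
    """Link every pair of people who appear as authors of the same paper."""
    authors_by_paper = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "hasAuthor":
            authors_by_paper[subject].add(obj)
    graph = defaultdict(set)
    for authors in authors_by_paper.values():
        for author in authors:
            graph[author]  # make sure solo authors still appear as nodes
        for a, b in combinations(sorted(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other people each person is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def clusters(graph):
    """Connected components of the graph, found by depth-first search."""
    seen = set()
    result = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result
```

With the sample triples, `person:b` has the highest degree centrality because it co-authors with two of the other four people, and `clusters` separates the authors of papers 1 and 2 from the authors of paper 3.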
The Benefits
This proof-of-concept for Los Alamos is the foundational work for identification of inter-relationships that represent potential threats to national security.