
Automatic Key-phrase Extraction to Support the Understanding of Infrastructure Disaster Resilience

Xuan Lv, Syed Ahnaf Morshed and Lu Zhang
Pages 1276-1281 (2019 Proceedings of the 36th ISARC, Banff, Alberta, Canada)

Preventing natural disasters from causing substantial socioeconomic damage relies heavily on the disaster resilience of the nation’s critical infrastructure. According to the National Academy of Sciences, research on understanding and analyzing the disaster resilience of our infrastructure systems is a “national imperative”. To address this need, this paper proposes an automatic keyphrase extraction methodology to extract phrases relevant to disaster resilience from documents in the infrastructure domain. To develop the proposed methodology, a document collection of research papers and public reports is prepared. Noun phrases are first extracted from every sentence in the collection and, after a filtering procedure, form the keyphrase candidates. Each candidate phrase is then represented as a global semantic vector and a local semantic vector. To select phrases relevant to disaster resilience, a semantic similarity measure is proposed that incorporates the semantics of candidate phrases in both the general and infrastructure domains. Ten physical resilience concepts from a pre-developed community resilience hierarchy are selected as the target concepts to evaluate the performance of the proposed methodology. When evaluated on the document collection, the proposed methodology achieved an average precision of 66% at the top 20 extracted keyphrases.
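The ranking and evaluation steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state how the global and local vectors are combined, so the weighted sum of cosine similarities (with a hypothetical weight `alpha`) is an assumption, as are all function names and the toy two-dimensional vectors. The sketch also assumes the global vector reflects general-domain semantics and the local vector reflects infrastructure-domain semantics.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(candidate, target, alpha=0.5):
    """Assumed combination: weighted sum of global and local cosine similarity.

    `alpha` is a hypothetical weight, not a parameter taken from the paper.
    """
    global_sim = cosine(candidate["global"], target["global"])
    local_sim = cosine(candidate["local"], target["local"])
    return alpha * global_sim + (1 - alpha) * local_sim


def rank_candidates(candidates, target, alpha=0.5):
    """Return candidate phrases sorted by descending combined similarity to the target."""
    scored = [(phrase, combined_similarity(vecs, target, alpha))
              for phrase, vecs in candidates.items()]
    return [phrase for phrase, _ in sorted(scored, key=lambda x: x[1], reverse=True)]


def precision_at_k(ranked_phrases, relevant, k):
    """Fraction of the top-k ranked phrases that appear in the relevant set."""
    if k <= 0:
        return 0.0
    hits = sum(1 for phrase in ranked_phrases[:k] if phrase in relevant)
    return hits / k


# Toy data (illustrative only): 2-d vectors standing in for real embeddings.
candidates = {
    "bridge retrofit": {"global": [1.0, 0.0], "local": [1.0, 0.0]},
    "annual report": {"global": [0.0, 1.0], "local": [0.0, 1.0]},
    "levee failure": {"global": [1.0, 1.0], "local": [1.0, 0.0]},
}
target = {"global": [1.0, 0.0], "local": [1.0, 0.0]}  # one target concept

ranked = rank_candidates(candidates, target)
relevant = {"bridge retrofit", "levee failure"}  # hypothetical gold labels
print(ranked)
print(precision_at_k(ranked, relevant, k=2))
```

In the paper's setting, `k` would be 20 and the precision would be averaged over the ten target concepts; the toy example uses `k=2` only to keep the data small.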

Keywords: Infrastructure disaster resilience; Automatic keyphrase extraction; Natural language processing