Automatic Text Summarization of COVID-19 Scientific Research Topics Using Pre-trained Models from Hugging Face

Conference proceedings article

Authors/Editors

JONATHAN HOYIN CHAN

Publication Details

Author list: Sakdipat Ontoum, Jonathan H. Chan

Publication year: 2022

URL: https://www.ri2c-2022.com/home.aspx

Languages: English-Canada (EN-CA)

Abstract

Automatic text summarization helps the scientific and medical sectors by identifying and extracting relevant information from articles. It compresses text documents so that users can find the important information in the original text in less time. We first review recent work on summarization that uses deep learning approaches, and then describe the summarization of COVID-19 research papers. Readability refers to the ease with which a reader can grasp written text; in natural language processing, the content of a text determines its readability. We constructed word clouds from the most frequently used words in the abstracts. From these measurements, we determine the mean of ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-L-SUM. As a result, Distilbart-mnli-12-6 and GPT2-large outperform the other models.
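To make the evaluation step concrete, the following is a minimal sketch of how mean ROUGE-1, ROUGE-2, and ROUGE-L F1 scores can be computed over a set of candidate and reference summaries. It is not the authors' pipeline. It assumes lowercase whitespace tokenization with no stemming, so its numbers can differ from those of standard ROUGE packages, and it omits the summary-level variant (ROUGE-L-SUM). All function names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; no stemming or stopword removal.
    return text.lower().split()


def ngram_counts(tokens, n):
    # Multiset of all contiguous n-grams in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    # Harmonic mean of precision (overlap / candidate) and recall (overlap / reference).
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    # ROUGE-N: clipped n-gram overlap between candidate and reference.
    cand = ngram_counts(tokenize(candidate), n)
    ref = ngram_counts(tokenize(reference), n)
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    # Length of the longest common subsequence, by row-wise dynamic programming.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    # ROUGE-L: F1 based on the longest common subsequence of tokens.
    cand, ref = tokenize(candidate), tokenize(reference)
    return f1(lcs_length(cand, ref), len(cand), len(ref))


def mean_rouge(pairs):
    # Average each metric over (candidate, reference) pairs.
    if not pairs:
        return {"rouge1": 0.0, "rouge2": 0.0, "rougeL": 0.0}
    totals = {"rouge1": 0.0, "rouge2": 0.0, "rougeL": 0.0}
    for candidate, reference in pairs:
        totals["rouge1"] += rouge_n(candidate, reference, 1)
        totals["rouge2"] += rouge_n(candidate, reference, 2)
        totals["rougeL"] += rouge_l(candidate, reference)
    return {name: total / len(pairs) for name, total in totals.items()}


if __name__ == "__main__":
    pairs = [("the cat sat on the mat", "the cat is on the mat")]
    print(mean_rouge(pairs))
```

Comparing models then amounts to calling `mean_rouge` once per model on that model's generated summaries and the same reference abstracts, and ranking the resulting means.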

Index Terms: Automatic Summarization, COVID-19, COVID-19 Open Research Dataset (CORD-19), Hugging Face, Latent Dirichlet Allocation (LDA), Flesch Reading Ease, Recall-Oriented Understudy for Gisting Evaluation (ROUGE)

Keywords

Automatic Summarization, COVID-19, COVID-19 Open Research Dataset (CORD-19), Flesch Reading Ease, Hugging Face, Latent Dirichlet Allocation (LDA), Recall-Oriented Understudy for Gisting Evaluation (ROUGE)