CSECU-DSG at WNUT-2020 Task 2: Exploiting Ensemble of Transfer Learning and Hand-crafted Features for Identification of Informative COVID-19 English Tweets

Abstract

The COVID-19 pandemic has become a trending topic on Twitter, and people are interested in sharing diverse information, ranging from new cases and healthcare guidelines to medicine and vaccine news. Such information helps people stay updated about the situation and supports public safety personnel in decision making. However, the informal nature of Twitter makes it challenging to filter informative tweets from the huge tweet streams. To address these challenges, WNUT-2020 introduced a shared task focusing on the identification of informative COVID-19 tweets. In this paper, we describe our participation in this task. We propose a neural model that combines the strengths of transfer learning and hand-crafted features in a unified architecture. To extract the transfer learning features, we utilize the state-of-the-art pre-trained sentence embedding models BERT, RoBERTa, and InferSent, whereas various Twitter characteristics are exploited to extract the hand-crafted features. Next, various feature combinations are used to train a set of multilayer perceptrons (MLPs) as base classifiers. Finally, a majority-voting-based fusion approach is employed to determine the informative tweets. Our approach achieved competitive performance and outperformed the baseline by approximately 7%.
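The final fusion step described in the abstract can be illustrated with a minimal sketch. It assumes each base classifier has already produced one label per tweet. The function name `majority_vote`, the label strings, and the tie-breaking rule (falling back to `UNINFORMATIVE`) are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(
    predictions: Sequence[Sequence[str]],
    tie_label: str = "UNINFORMATIVE",  # assumed tie-breaking rule, not from the paper
) -> List[str]:
    """Fuse label lists (one inner list per base classifier) by majority vote."""
    if not predictions:
        return []
    n_items = len(predictions[0])
    if any(len(p) != n_items for p in predictions):
        raise ValueError("all classifiers must label the same number of tweets")

    fused = []
    # zip(*predictions) groups the votes that all classifiers cast for one tweet
    for votes in zip(*predictions):
        ranked = Counter(votes).most_common()
        # If the top two labels have equal counts, fall back to the tie label
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            fused.append(tie_label)
        else:
            fused.append(ranked[0][0])
    return fused


# Hypothetical outputs of three base classifiers for three tweets
base_predictions = [
    ["INFORMATIVE", "UNINFORMATIVE", "INFORMATIVE"],
    ["INFORMATIVE", "UNINFORMATIVE", "UNINFORMATIVE"],
    ["UNINFORMATIVE", "UNINFORMATIVE", "INFORMATIVE"],
]
fused_labels = majority_vote(base_predictions)
```

Using an odd number of base classifiers avoids ties in a two-class setting, so the tie rule only matters when the ensemble size is even.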

Publication
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
Fareen Tasneem
Research Assistant (Full Time)
Jannatun Naim
Research Assistant (Full Time)
Radiathun Tasnia
Research Assistant (Full Time)
Tashin Hossain
Research Assistant (Full Time)
Abu Nowshed Chy
Assistant Professor