SimpleText@SYMPOSIUM MADICS Home



SimpleText Workshop - RG AI (2021)

Simplification and Popularization of Scientific Texts

Topics

  • Automatic natural language processing
  • Information retrieval
  • Scientific journalism
  • Scientific popularization

Involved data

  • Scientific papers
  • Abstracts of scientific papers
  • Wikipedia articles
  • Scientific journalism papers

Scientific background

Scientific knowledge (including knowledge of health-related issues) is essential for citizens to make good decisions, evaluate the quality of information, maintain their physical and mental health, and avoid falling prey to impostors.

For example, the stories that individuals find believable may influence their response to the COVID-19 pandemic, including whether they practice social distancing or resort to dangerous, fake medical treatments. Unfortunately, texts published on social media maximize their virality (propagation) by appealing to our emotions, and they are often more easily accessible than scientific publications, which pursue an ideal of objectivity and perspective.

Improving the intelligibility of texts and adapting them to different audiences remain unresolved problems. Despite the existence of datasets such as WebSplit and WikiSplit, automatic text simplification is often reduced to the “Split and Rephrase” task (Aharoni and Goldberg, 2018; Botha et al., 2018; Narayan et al., 2017). Another existing dataset is based on Simple Wikipedia (Coster and Kauchak, 2011). There have been some attempts to address text intelligibility, but they rely mainly on readability formulas, which have not been convincingly shown to reduce text difficulty (Collins-Thompson and Callan, 2004; Leroy et al., 2013; Flesch, 1948; Gunning, 1968; Si and Callan, 2001). Recent research applies Transformer-based models (e.g., BERT) to sentence simplification (Fang and Stevens, 2019; Maruyama and Yamamoto, 2019; Zhao et al., 2018). Unlike previous work, the SimpleText workshop will target the lack of background knowledge, which can be a serious impediment to understanding scientific texts (O’Reilly et al., 2019).
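To make the notion of a readability formula concrete, here is a minimal sketch of the Flesch Reading Ease score (Flesch, 1948) in Python. The function names and the vowel-group syllable counter are illustrative assumptions, not part of any SimpleText tooling; real implementations usually rely on a pronunciation dictionary. The sketch shows that such a formula only measures surface features (sentence length and syllable counts) and says nothing about the background knowledge a reader needs.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of vowel groups (an assumption, not a dictionary lookup)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate text that is easier to read."""
    # Naive sentence and word segmentation, good enough for an illustration.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    plain = "The cat sat. The dog ran."
    dense = ("Automatic simplification of scientific publications "
             "remains an unresolved computational challenge.")
    print(flesch_reading_ease(plain))
    print(flesch_reading_ease(dense))
```

Short sentences built from short words score higher than long sentences built from polysyllabic words, regardless of whether the reader knows the concepts involved.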

Improving the intelligibility of texts and adapting them to different audiences raise societal, technical and evaluation challenges.

There is a wide range of important societal challenges that SimpleText aims to describe more precisely. Open science is one of them. Making research truly open and accessible to all implies presenting it in a way that is easy for the average reader to understand (Fecher and Friesike, 2014).

Text simplification must also solve technical challenges such as the selection of important passages, the summarization of these passages, and the readability of the resulting texts. SimpleText aims to address these technical challenges by mobilizing the different scientific disciplines involved.

The use, usability and evaluation of simplified texts are another area of focus for the SimpleText workshop.

Science popularization is related to scientific journalism. Unlike the MADICS MADONA Action (Mastering Interactive Data Analysis for Journalism), which aims to generate an article from the analysis of structured data, the SimpleText workshop is oriented towards generating an article from textual data (scientific publications and their abstracts).


Visit the SimpleText Workshop website