Health literacy and language teaching: data-based host language lexicons



Keywords:

Lexicons, Host Language Teaching, Corpus-Based, Health Literacy


Using a corpus-based approach, this paper presents methods and results for assessing, extracting, and describing the core vocabulary relevant to healthcare access among migrant populations. The aim is to bridge the gap between the basic information conveyed to people arriving in Portugal and the materials and other lexicographic resources used in language teaching. The work comprises identifying available resources and sources for compiling a dataset on healthcare access; selecting available corpus query tools; testing and comparing the results of the different functionalities and lexical statistics measures these tools offer; manually filtering the data; and analyzing the results and the extracted lexicon. The results cover the organization of the extracted lexicon into subdomains, the organization of items within each subdomain, the relationship with common vocabulary, and the extraction of authentic examples from the corpus.


  • Raquel Amaro, Universidade NOVA de Lisboa

    PhD in Computational Linguistics from the University of Lisbon (2010). Assistant Professor with tenure at the NOVA University of Lisbon - NOVA FCSH (Portugal). Researcher at the Linguistics Research Center of NOVA University of Lisbon - CLUNL (Portugal).


AMARO, Raquel. Health literacy and language teaching: data-based host language lexicons. Linha D’Água, São Paulo, v. 37, n. 2, p. 136–160, 2024. DOI: 10.11606/issn.2236-4242.v37i2p136-160. Available at: Accessed on: 1 Oct. 2024.

