Facilitating listeners’ understanding of scientific knowledge in TED Talks: A corpus-based analysis of code glosses as metadiscourse resources of popularization

Problem statement

In the 20th century, the need to bring science closer to laypeople contributed to the emergence of popular science discourse. This need has grown with the advent of the Internet, which brings new communicative dynamics that affect science discourse. The number of popular science texts presented on the Internet is growing, and new genres have appeared.

TED (Technology, Entertainment and Design) is considered to be one of the new ways of spreading scientific information online. It is a nonprofit organization that aims to spread ideas in the form of short talks and to share educational lessons through TED-Ed (the education-specific part of the TED platform). As stated on the website, TED’s mission is to spread ideas that spark imagination, embrace possibility, and catalyze impact. The organization welcomes speakers from different disciplines and cultures who seek a deeper understanding of the world and connection with others. The emergence and popularity of this type of discourse result from the growing interest in scientific information, which is satisfied by digital resources that popularize science. These include specialized web portals and author-run blogs on various platforms, which have become points of interaction with scientific information [Rysakova 2022].

With this abundance of popular science information, the question arises whether popular science communication is effective and to what extent a lay audience understands specialized knowledge. Since this knowledge is addressed to non-experts, different popularization strategies are required to make it understandable.

It is well known that understanding specialized texts can be arduous for non-experts. By contrast, popularization discourse uses more informal language, including a conversational style. Producers of popular science texts act as mediators between scientific content and popularized information, which should become accessible to laypeople through the right choice of lexical items and syntactic structures that the producers consider transparent and user-friendly. It is important for them either to avoid words that may be incomprehensible to non-experts or to employ popularization strategies.

Although the issues of popularization and scientific knowledge dissemination have attracted many discourse analysts [Anesa 2016; Boldyrev, Efimenko 2022; Boginskaya 2020; Boginskaya 2022; Calsamiglia, van Dijk 2004; Ciapuscio 2003; Gotti 2011; Mamonova 2023; Rysakova 2022], the ways in which specialized knowledge is recontextualized in TED Talks have not been analyzed, and code glosses used in this genre have barely been studied in terms of knowledge popularization. The present study seeks to fill this research gap, contributing to the linguistic literature on knowledge popularization through the use of metadiscourse.

The study aims to analyze code glosses as metadiscourse resources of popularization exploited by TED speakers to make scientific knowledge understandable for a lay audience and help them decode the speaker’s message. In achieving this purpose, the study intends to answer the following questions:

  1. What types of code glossing are employed in TED Talks with the aim of depicting specific concepts?
  2. What is the frequency of occurrence of these types of code glossing in the corpus?
  3. What linguistic items signal the types of code glossing that appear in the corpus?

Hence, this article has two focal points: it contributes to supporting the view of popularization in media and examines code glosses as metadiscourse resources employed to explain scientific knowledge in TED Talks as a popularizing genre in the media space. The rationale for this study lies in the need to contribute to our understanding of the popularization processes through which scientific information is presented on the TED platform.

This article is structured as follows. Prior studies on popularization of science, TED Talks as a genre of media discourse and metadiscourse patterns are reviewed in the following section. Then, the corpus, including the corpus selection criteria, and the methods employed to analyze the corpus are described. The ‘Analysis’ section considers code glosses found in the corpus through examples. Conclusions are drawn and avenues for further research are outlined in the final section.

Theoretical background

Popularization of science

Research on scientific knowledge popularization has been conducted by a large number of scholars who described linguistic features of expert-lay interactions [Anesa 2016; Calsamiglia 2003; Calsamiglia, van Dijk 2004; Ciapuscio 2003; Gülich 2003; Hilgartner 1990; Myers 2003; Rysakova 2022].

In the literature, the concept of popularization is often associated with the communication of scientific knowledge through modification or distortion. Hilgartner, for example, claimed that popularization can be considered both a positive and a negative process: it is an appropriate simplification of science for non-experts, but it also distorts science [Hilgartner 1990]. In a different light, Calsamiglia defined the concept of popularization as vulgarization, debasement, translation, transposition, or reformulation [Calsamiglia 2003]. Ciapuscio described knowledge popularization as the process of recontextualization, or putting something in a different context [Ciapuscio 2003], while Sarangi defined it as a transfer and transformation of discourse into texts divorced from the social interaction that created them [Sarangi 1998].

A different interpretation of the concept of popularization was provided by Gotti, who defined popularization as “a kind of redrafting that does not alter the disciplinary content — object of the transaction — as much as its language which needs to be remodeled to suit a new target audience” [Gotti 2014: 19]. Gotti claimed that popularized information is transferred linguistically in a way similar to periphrasis or to intralingual translation.

In contrast to Gotti, Anesa argued that “the notion of popularization goes beyond that of a form of intra-linguistic translation where specialized language is transposed into a simplified one” [Anesa 2016: 73]. The researcher described popularization “as a form of rediscoursification, intended as the creation of a new type of discourse, where specialized language is exploited differently according to a new set of participants, setting, objectives and rules, thus producing new culturally and historically located meanings” [Anesa 2016: 73].

One of the most comprehensive definitions of popularization, taken as the theoretical basis of the present study, was provided by Calsamiglia and van Dijk, who described popularization as “the transformation of specialized knowledge into ‘everyday’ or ‘lay’ knowledge” and claimed that popularization is formulated in such a way that laypeople are able to construct lay versions of scientific knowledge and integrate these with everyday knowledge [Calsamiglia, van Dijk 2004: 370].

TED Talks as a popularizing genre

TED Talks constitute a new popularizing genre that originated from a conference on technology and design held in California in 1984. TED rapidly became a platform for sharing individual experience with a wide audience, with an estimated one billion viewers. TED Talks are delivered by experts in different fields who deal with a variety of topics, ranging from the “hard” sciences to the humanities and covering areas such as architecture, health, history, and culture.

Participation in TED conferences is regulated by strict rules: each speaker has no more than 18 minutes and must carefully prepare the lecture. Conference participants submit applications and send their lecture plans to the organizers six months before the conference. The lecture is usually monologic, but in rare cases it can take the form of an interview. Sometimes the monologue is supplemented by answers to a facilitator’s questions. The lecture is delivered without written support and is sometimes accompanied by slides. TED lecturers may speak in pairs, taking turns in delivering their text. TED’s decision to launch the website in 2006 marked a new era in the history of this genre. Hundreds of videos of selected TED Talks are now available on the website, contributing to the formation of a secondary audience for video recordings of conferences.

In their presentations, TED speakers recontextualize scientific knowledge and employ various metadiscourse resources to interact with a lay audience in pursuit of both informational and promotional purposes. They explain abstract concepts and specialized terms to disseminate scientific knowledge. The popularization function realized by TED Talks therefore implies the possibility of transmitting scientific knowledge to laypeople and of disseminating knowledge on crucial scientific phenomena. As Hyland put it, “the languages of the academy have quietly begun to insert themselves into every cranny of our lives in the West, colonizing the discourses of technocracy, bureaucracy, entertainment and advertising. Almost unnoticed, academic discourses have reshaped our entire world view, becoming the dominant mode for interpreting reality and our own existence” [Hyland 1998: 2].

Despite the growing popularity of this genre, linguistic studies of TED lectures are not numerous. TED Talks are commonly explored within studies of popular science discourse focusing on their linguistic and culture-bound features, rhetorical strategies and moves, or translation strategies. Caliendo describes this genre as hybrid and claims that TED Talks lie at the intersection of a number of genres such as university lectures, newspaper articles, conference presentations and TV science programmes [Caliendo 2014]. Among the studies that reveal linguistic features of TED Talks is Caliendo and Compagnone’s research, which compares the use of pronouns in academic lectures and TED Talks. The researchers found that, unlike academic lecturers, TED speakers use the pronoun ‘we’ to refer to themselves, excluding the audience and presenting themselves as experts [Caliendo, Compagnone 2014]. In another study, Compagnone explored how academic discourse is reconceptualized as a professional practice via the web-mediated genre of TED Talks. The researcher focused on the way in which academics represent themselves discursively in the setting of the university classroom and that of TED [Compagnone 2015].

A recent study of TED Talks was conducted by Almaged, who investigated the linguistic mechanism of disseminating knowledge about terrorism [Almaged 2021]. Miranda and Moritz’s study, conducted in the same year, focused on the rhetorical structure of the genre of TED Talks. The analysis revealed a constant pattern of moves and steps: every talk under consideration contained five moves, namely topic introduction, speaker presentation, topic development, concluding messages, and acknowledgments/gratitude [Miranda, Moritz 2021].

In the Russian academic context, Nechaeva explored TED Talks as a special type of public lecture which features a number of linguistic resources used to attract attention and create a certain attitude to the issue under consideration. The researcher emphasized that figurative language makes TED Talks more emotional and enhances the impact on a lay audience, turning a lecture into a friendly, confidential conversation [Nechaeva 2016]. Viktorova analyzed TED Talks as a media genre, identified their place in the speech genre hierarchy, and specified their chronotope and linguistic features. The generic features of TED Talks were analyzed within a pragmatic framework built on discourse markers and covering discourse structure, addresser-addressee interaction, and authorial self-presentation. The study showed that the TED Talk is an independent genre realized both in live and online communication. TED Talks are marked by structural and logical transparency and precision, a strong orientation to the addressee’s needs, persuasiveness and dialogicity. In addition, TED lectures feature the compaction of several genres, which refers to the complication and enrichment of the generic characteristics of the text: a popular science lecture turns into media discourse [Viktorova 2019].

Metadiscourse as a crucial feature of popularizing discourse

The term “metadiscourse” was coined by Harris in 1959 to refer to a speaker’s attempts to guide a receiver’s perception of a text. Since then, the concept of metadiscourse has been further developed in a number of studies. Vande Kopple, for instance, defined metadiscourse as a range of rhetorical features such as hedges, connectives and commentaries used to influence the reader’s perception of the text [Vande Kopple 1985]. In the same vein, Hyland described metadiscourse as expressions that help writers/speakers to express a viewpoint and engage with readers, taking into account their understandings and values [Hyland 2005]. The researcher developed a model of metadiscourse that includes two categories: interactive metadiscourse and interactional metadiscourse. While interactive markers serve to organize the propositional content in coherent and convincing ways so that an audience finds it comprehensible, interactional devices help build a relationship with the reader by expressing doubt, certainty or various other attitudes towards the proposition [Hyland 2005]. It should be mentioned that although Hyland’s model is usually used in the analysis of academic texts, attempts to apply the model to other types of discourse are also recurrent [Chaemsaithong 2014; Fu 2012; Zou, Hyland 2020].

Code glosses, chosen for the present study, belong to the interactive metadiscourse resources which, along with endophoric markers, evidentials, frame markers and transition markers, play a crucial role in allowing speakers/writers to make their texts coherent, understandable, and user-friendly. They are intended to assist the audience with elaboration, specificity, clarification and examples, facilitating listeners’ comprehension of the propositional material by connecting sentences to the reader’s experience and knowledge base. Hyland distinguished between two types of code glossing, reformulation and exemplification, which were further subdivided into subtypes (see Figure).

Figure. Taxonomy of code glosses in Hyland’s model

Reformulation is a discourse function by means of which the second part rewords the first one using different words to reinforce the message [Hyland 2007]. It involves two procedures: expansion and reduction. Expansion can be realized through explanation or implication, which “restate an idea in such a way as to widen the sense in which the writer intends it to be understood” [Hyland 2007: 274]. While explanation increases the accessibility of a concept by expanding the reader’s understanding of material, implication allows speakers to “draw a conclusion or sum up the main import of the prior segment” [Hyland 2007: 275]. In contrast to expansion, reduction narrows the meaning of what has been said and the range of interpretations through paraphrase or specification. While paraphrase restates an idea in different words to provide a summary, specification details “features which are salient to the primary thesis in order to constrain how the reader might interpret it” [Hyland 2007: 276]. Exemplification is the second type of code glossing, which makes the speaker’s ideas accessible through illustration or scenario-based elaboration.

Examples of lexical items used for code glossing are presented below:

— explanation markers: referred to, that is, known as, called, defined as;

— implication markers: this/which means;

— paraphrase markers: that is, in other words, put it another way, put it differently, that is to say;

— specification markers: especially, more specifically, in particular to, particularly, in particular, specifically, more accurately speaking, to be exact, to be precise, namely;

— exemplification markers: such as, for example, for instance, an example of, like.


To explore the types and frequency of code glosses employed in TED lectures, the present study applied a quantitative method. To go beyond a mere list of code glosses that serve the popularization function, it also applied an interpretative method.

The analysis proceeded in the following steps:

    1. Identification of the types of code glossing based on the search items presented in Section “Metadiscourse as a crucial feature of popularizing discourse”.
    2. Identification of the frequency of each type of code glossing.
    3. Identification of the frequency of lexical items used as code glosses.
    4. Description of the pragmatic functions code glosses serve in the examples taken from the corpus.
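
Steps 1 to 3 can be pictured with a minimal Python sketch. It is not the procedure actually used in the study, only an illustration under stated assumptions: the search items are copied from the marker lists above, while the dictionary layout, the function names and the sample sentence are hypothetical. The ambiguous items that is and like are left out on purpose, because an automatic search cannot tell their subtypes or non-glossing uses apart; every hit is therefore only a candidate that still needs the manual, interpretative check described in step 4.

```python
import re
from collections import Counter

# Search items taken from the marker lists in the theoretical section.
# Grouping them in a dictionary is an illustrative assumption.
MARKERS = {
    "explanation": ["referred to", "known as", "called", "defined as"],
    "implication": ["this means", "which means"],
    "paraphrase": ["in other words", "put it another way",
                   "put it differently", "that is to say"],
    "specification": ["especially", "more specifically", "particularly",
                      "in particular", "specifically",
                      "more accurately speaking", "to be exact",
                      "to be precise", "namely"],
    "exemplification": ["such as", "for example", "for instance",
                        "an example of"],
}

# Map every marker back to its type of code glossing.
LOOKUP = {marker: gloss_type
          for gloss_type, markers in MARKERS.items()
          for marker in markers}

# Longest items first, so that "more specifically" is matched as a whole
# and is not counted a second time as "specifically".
_ALTERNATIVES = sorted(LOOKUP, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(m) for m in _ALTERNATIVES) + r")\b",
    re.IGNORECASE,
)


def find_candidates(text):
    """Count candidate code glosses in one transcript as (type, marker) pairs."""
    hits = Counter()
    for match in PATTERN.finditer(text):
        marker = match.group(1).lower()
        hits[(LOOKUP[marker], marker)] += 1
    return hits


def totals_by_type(hits):
    """Collapse (type, marker) counts into counts per type of code glossing."""
    totals = Counter()
    for (gloss_type, _marker), count in hits.items():
        totals[gloss_type] += count
    return totals


# Invented sample sentence, used only to show the call.
sample = "This means X. More specifically, fungi such as yeast."
candidates = find_candidates(sample)
per_type = totals_by_type(candidates)
```

Summing the counters over all transcripts would give raw candidate frequencies per type and per lexical item; items such as called still require reading each hit in context, since they also occur outside glossing.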

For the purpose of the present study, which is intended to reveal the types and frequency of code glosses employed in TED Talks to facilitate comprehension of scientific knowledge by a lay audience, a linguistic corpus was designed following the principles of Corpus Linguistics. The corpus includes 80 TED Talks whose transcripts were taken from the ted.com website. The website provides an opportunity to get acquainted with popular science lectures on a variety of issues ranging from technology to social problems. The website therefore represents an interesting locus for exploring popular science discourse. On the one hand, its content brings it closer to scientific articles and academic lectures. On the other hand, it is free from literary canons and not constrained by the rigor of science and academia.

To compile the corpus for the present study, the TED Talks were selected based on the following criteria:

1) thematic variety that ranges from technology to human rights;

2) recency: all TED Talks were delivered between 2020 and 2023, as the aim is to focus on synchronically comparable texts.

The size of the corpus is 151,667 words. The average length of the 80 presentations is 1,896 words.
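
The average length follows directly from the two reported figures. A short Python check (the variable names are illustrative) shows the rounding involved:

```python
corpus_words = 151_667   # reported corpus size in words
talks = 80               # reported number of TED Talks

mean_length = corpus_words / talks   # 1895.8375 words per talk
rounded_mean = round(mean_length)    # 1896, the figure given in the text
```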


Taking the types of code glossing presented in Figure as the starting point, the present study focuses on the popularization tools that frequently appear in the corpus of TED Talks.

The quantitative analysis revealed the frequency of occurrences of the code glosses used by TED speakers to popularize scientific knowledge:

— implication markers: 24 occurrences;

— explanation markers: 17 occurrences;

— paraphrase markers: 29 occurrences;

— specification markers: 42 occurrences;

— scenario-based elaboration markers: 67 occurrences;

— illustration markers: 130 occurrences.

As can be seen, illustration markers, scenario-based elaboration and specification markers are the most common code gloss signals in the corpus of TED Talks.
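
The shares discussed in the analysis can be recomputed from these six counts. The sketch below is only a worked check, and the variable names are illustrative. The grouping follows Hyland's taxonomy as described above: explanation and implication form expansion, paraphrase and specification form reduction, and illustration and scenario-based elaboration form exemplification.

```python
# Occurrences reported for each type of code glossing.
counts = {
    "implication": 24,
    "explanation": 17,
    "paraphrase": 29,
    "specification": 42,
    "scenario-based elaboration": 67,
    "illustration": 130,
}

total = sum(counts.values())                                          # 309
expansion = counts["explanation"] + counts["implication"]             # 41
reduction = counts["paraphrase"] + counts["specification"]            # 71
reformulation = expansion + reduction                                 # 112
exemplification = (counts["illustration"]
                   + counts["scenario-based elaboration"])            # 197

reformulation_share = round(100 * reformulation / total, 1)      # 36.2
exemplification_share = round(100 * exemplification / total, 1)  # 63.8
reduction_share = round(100 * reduction / reformulation, 1)      # 63.4
```

On these counts, reformulation accounts for roughly 36 % of all code gloss signals, and reduction for 63.4 % of the reformulation signals, which matches the percentages given in the analysis.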


Reformulation occurs when the TED speaker reformulates an utterance by expressing specialized concepts in a different way. Reformulation markers comprised about 36 % of the code gloss signals in the corpus. As the quantitative analysis revealed, reduction signals (paraphrase markers and specification markers), which narrow the meaning of what has been said and the range of interpretations, were the most frequently used type of reformulation (63.4 %). Here are two extracts from TED lectures which feature paraphrase as a type of reduction.

(1) In other words, if I want to study how the dragonfly does coordinate transformations, the neural circuit that I need to understand, the neural circuit that I need to study, can have at most four layers of neurons.

(2) Debate is a way to organize conversations about how the world is, could, should be. Or to put it another way, I would love to offer you my experience-backed, evidence-tested guide to talking to your cousin about politics at your next family dinner; reorganizing the way in which your team debates new proposals; thinking about how we change our public conversation.

The paraphrase markers are employed here to restate ideas in different words to make them comprehensible for a lay listener. Paraphrasing is purposeful, indicating that the TED speaker is making an effort to facilitate listeners’ comprehension and achieve a particular pragmatic effect. On the one hand, the reference expression and the treating expression are in a relationship of semantic equivalence; on the other hand, they are in a relationship of difference, since the treating expression contains new words and belongs to a different register. As Ciapuscio put it, when a speaker applies this procedure, he/she refers to a previous expression by means of a new one that somehow changes it. The procedure should be considered as an equivalence operation so that the two units are different ways of expressing a single meaning [Ciapuscio 2003].

Below are two examples of specification, another type of code glossing used to narrow the meaning.

(3) We would extract and sequence DNA, which allowed us to understand which microorganisms, and particularly fungi, live in each of these forests.

(4) Specifically, we could remove components of the translational machinery, specific tRNAs, that normally read the codons that we’ve removed from the genome.

In (3), the equivalence between the original statement and the reformulated one is signaled by the specification marker particularly, which details which microorganisms live in specific forests in order to constrain how the listener might interpret the statement. In (4), the specifier details which components of the translational machinery can be removed from the cell. Here is an example illustrating the use of the specification marker namely, which was less frequent in the corpus:

(5) You can be present-hedonistic, namely you focus on the joys of life, or present-fatalist — it doesn’t matter, your life is controlled.

Among the expansion markers, implication resources were employed more frequently. Here are two examples illustrating this type:

(6) And this means that most amino acids are encoded by more than one triplet codon.

(7) Schizophrenia is considered a syndrome, which means it may encompass a number of related disorders that have similar symptoms but varying causes.

It should be noted that despite its rather frequent appearance (24 occurrences) in the corpus, implication was realized only through the use of two linguistic items — this means and which means.

As for the explanation markers, they were surprisingly the least frequent in the corpus and appeared only 17 times. The most common code glosses functioning as explanation markers were known as and called. Here are two examples from the corpus:

(8) Now, proteins are amazing, but they’re just one example from a vast class of molecules known as polymers, which includes plastics, materials and drugs.

(9) So these circular structures in this root network are called root nodules.

The code gloss defined as, which frequently appears in scientific discourse, was rarely employed in the corpus (two occurrences). Below is an example of the use of this marker.

(10) And triplet codons that encode the same amino acid are defined as synonymous codons.

The analysis also revealed the most frequent linguistic resources used in the TED Talks to verbalize reformulation: in particular (24 % of the total number), known as (18 %), this means (13 %), specifically (11 %), that is and in other words (9 % each), namely (7 %), and called (6 %). The preponderance of two markers, in particular and known as, over the others implies that TED speakers used a limited set of reformulation items.


Exemplification includes the resources used by TED speakers to elaborate abstract concepts in terms of everyday experience. Exemplification as a metadiscourse function realized by code glosses is based on analogical processes that allow experts to link abstract concepts to specific objects or events. The corpus-driven analysis revealed that TED speakers frequently used this metadiscourse resource in an attempt to make scientific information comprehensible. In terms of relative frequency, exemplifiers were the most common code glosses in the corpus.

The analysis identified two types of exemplification in the corpus: scenario-based elaboration and illustration. The former involves sketching events and creating possible but imaginary situations, while in the case of illustration, meaning is clarified by a second unit which illustrates the first by providing an example. Both techniques, therefore, elaborate specialized concepts with the aim of making them less abstract. Such exemplifications are often easier to remember than general knowledge and hence are quite useful as popularization devices in expert-lay interactions [Calsamiglia, van Dijk 2004]. The following example illustrates a case where the TED speaker feels it necessary to ensure that the specialized concept is understood by the lay audience and provides a highly comprehensible example.

(11) Using synthetic biology, Lisa Nip hopes to harness special powers from microbes on Earth — such as the ability to withstand radiation — to make humans more fit for exploring space.

Here, the concept special powers from microbes is explained by offering an example which helps the listeners immediately understand the meaning of the proposition, thus improving its comprehensibility. The code gloss such as serves the exemplification purpose. Consider one more example.

(12) The brain not only categorizes that information as a particular odor, it may also begin to associate feelings, like pleasure or disgust and other moods and emotions with that odor for future reference. For example, you sniff bacon, you eat it, your taste buds get salt, and then your body gets a whack of fat, which is an energy source.

The scenario provided to elaborate the meaning of the proposition is a powerful appeal to understandings the speaker believes are recoverable from the exemplification. The speaker elaborates the meaning by creating an imaginary situation using the code gloss for example.

In the following statement, the TED speaker explains the meaning of the chemistry concept polymers by providing examples of the types of polymer known to the lay audience and used in everyday life. By selecting the exemplification strategy, the speaker tries to avoid comprehension difficulties on the part of the listener and formulates the specialized concept in a simplified manner that is closer to the lay audience, thereby facilitating understanding.

(13) Now, proteins are amazing, but they’re just one example from a vast class of molecules known as polymers, which includes plastics, materials and drugs.

There were also some occurrences of the preposition like as a code gloss to offer an instance of a general category:

(14) On the slightly uneven pink, on the beautiful gold. It would have glittered in an interior, a little like a little firework.

Regarding the linguistic realization of exemplification, the analysis revealed the frequent appearance of two exemplifiers: for example (44 % of the total number) and such as (32 %). Other exemplification markers were used far less frequently: for instance (12 %), one/an example of (9 %), and like (3 %).


The purpose of this study was to examine the process of dissemination of scientific knowledge by TED speakers and to observe which code glosses, as interactive metadiscourse resources serving a popularization function, are predominant in the corpus. The analysis identified six types of code glossing which are employed:

1) to reword or restate the message by:

— expanding the listener’s understanding of the content through the use of expansion markers;

— or narrowing the meaning of what has been said through the use of reduction markers;

2) to facilitate listeners’ comprehension and to make the speaker’s ideas accessible through the use of illustration or scenario-based elaboration markers.

The higher frequency of exemplification markers in the corpus is not surprising, since classical rhetoric has always considered examples useful for public speaking, allowing speakers to make their ideas accessible and recoverable from personal experience. Appeals to instances of a general category or a similar case are very helpful in addressing a non-expert audience, making statements both comprehensible and memorable. Although the quantitative analysis did not reveal a high frequency of reformulation markers in the corpus, the findings still allow us to speak about TED speakers’ attempts to take listeners’ needs, existing knowledge and experience into consideration through the re-elaboration of the propositional content. The study therefore contributes to our understanding of the discursive structure of TED Talks as a type of popularization discourse and the role of code glossing in clarifying scientific knowledge.

The results can be used by TED speakers and other presenters. In order to convince the audience and stir up an interest in the topic, they should present the material in a comprehensible way. Interactive metadiscourse markers, including code glosses, can be helpful in this respect. In addition, TED speakers should take the following points into consideration when preparing their lectures:

1) think about the audience: when seeking to engage the listeners with interesting ideas, speakers should make sure they are addressing the appropriate audience;

2) do not assume that everyone possesses the same knowledge base: by assuming that “everyone knows” something, speakers may marginalize people who lack this knowledge;

3) make sure that the listener recovers the intended meaning.

It should be admitted that the findings presented here are limited by the small size of the corpus and should be understood as trends in TED Talks that can be confirmed or disproved by a large-scale corpus-driven analysis. It is plausible that a larger corpus or a corpus built from different TED Talks will feature a significantly larger number of code glosses. I also suggest that further studies be undertaken to explore this area of research, either by extending the methodology or by examining other aspects of expert-lay interactions in the media space, including but not limited to the types of syntactic complexity that interfere with comprehension or the types of trope that improve it. Preferences for particular kinds of linguistic markers signalling popularization also represent an interesting empirical site for studying the transformation of scientific knowledge into popular science. The research could be extended further by carrying out studies in other specialized domains, where incomprehensible specialized texts create a demand for expert-to-lay translation, and on materials written in languages other than English.

