AI examines how we use and process language to better understand the development of Alzheimer’s disease


What can reading 26,000 books tell researchers about how language environment affects language behavior?

Brendan T. Johns, an assistant professor of communicative disorders and sciences in the University at Buffalo’s College of Arts and Sciences, has some answers that are helping to inform questions ranging from how we use and process language to better understanding the development of Alzheimer’s disease.

But let’s be clear: Johns didn’t read all of those books. He’s an expert in computational cognitive science who has published a computational modeling study suggesting that our experience and interaction with specific learning environments, like the characteristics of what we read, lead to differences in language behavior that were once attributed to differences in cognition.

“Previously in linguistics it was assumed a lot of our ability to use language was instinctual and that our environmental experience lacked the depth necessary to fully acquire the necessary skills,” says Johns. “The models that we’re developing today have us questioning those earlier conclusions. Environment does appear to be shaping behavior.”

Johns’ findings, with his co-author, Randall K. Jamieson, a professor in the University of Manitoba’s Department of Psychology, appear in the journal Behavior Research Methods.

Advances in natural language processing and computational resources allow researchers like Johns and Jamieson to examine once intractable questions.

The models, called distributional models, serve as analogies to the human language learning process.
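
To make the idea of a distributional model concrete, here is a minimal sketch of a generic count-based version: each word is represented by the words that occur near it, and two words count as similar when their contexts overlap. This is not the model used by Johns and Jamieson. The function names, the window size, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_vectors(sentences, window=2):
    """Count, for each word, how often every other word appears within `window` tokens."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented for illustration only (hypothetical data).
toy_corpus = [
    "the doctor treated the patient",
    "the nurse treated the patient",
    "the doctor examined the patient",
    "the nurse examined the patient",
    "the dog chased the ball",
]
vectors = build_vectors(toy_corpus)
sim_doctor_nurse = cosine(vectors["doctor"], vectors["nurse"])
sim_doctor_ball = cosine(vectors["doctor"], vectors["ball"])
```

In this toy corpus "doctor" and "nurse" appear in identical contexts, so they score as more similar to each other than "doctor" and "ball" do. A model trained on real books would learn such relationships from far more text.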

The 26,000 books that support the analysis of this research come from 3,000 different authors (about 2,000 from the U.S. and roughly 500 from the U.K.) who used over 1.3 billion total words.

George Bernard Shaw is often credited with saying Britain and America are two countries separated by a common language.

But the languages are not identical, and in order to establish and represent potential cultural differences, the researchers considered where each of the 26,000 books was located in both time (when the author was born) and place (where the book was published).

With that information established, the researchers analyzed data from 10 different studies involving more than 1,000 participants, using multiple psycholinguistic tasks.

“The question this paper tries to answer is, ‘If we train a model with similar materials that someone in the U.K. might have read versus what someone in the U.S. might have read, will they become more like these people?’” says Johns. “We found that the environment people are embedded in seems to shape their behavior.”

The culture-specific books in this study explain much of the variance in the data, according to Johns.

“It’s a huge benefit to have a culture-specific corpus, and an even greater benefit to have a time-specific corpus,” says Johns. “The differences we find in language environment and behavior as a function of time and place is what we call the ‘selective reading hypothesis.’”

Using these machine-learning approaches demonstrates the richly informative nature of these environments, and Johns has been working toward building machine-learning frameworks to optimize education.

This latest paper shows how you can take a person’s language behavior and estimate the types of materials they’ve read.
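
One simple way to picture estimating reading material from language behavior is to ask which collection of texts makes a person's word choices most probable. The sketch below does this with unigram word counts and add-one smoothing. It is a deliberately simplified stand-in, not the method in the paper, and the two tiny corpora and the names `unigram_model`, `log_likelihood`, and `best_match` are hypothetical.

```python
from collections import Counter
from math import log


def unigram_model(text):
    """Return word counts and total token count for a corpus given as one string."""
    counts = Counter(text.lower().split())
    return counts, sum(counts.values())


def log_likelihood(sample, model, vocab_size):
    """Log-probability of `sample` under a unigram model with add-one smoothing."""
    counts, total = model
    return sum(
        log((counts[w] + 1) / (total + vocab_size))
        for w in sample.lower().split()
    )


# Tiny invented corpora standing in for culture-specific book collections.
corpora = {
    "US": "the truck stopped at the gas station near the apartment on the sidewalk",
    "UK": "the lorry stopped at the petrol station near the flat on the pavement",
}
models = {name: unigram_model(text) for name, text in corpora.items()}

# Shared vocabulary across both corpora, used for smoothing.
vocab = set()
for counts, _ in models.values():
    vocab.update(counts)


def best_match(sample):
    """Name of the corpus whose unigram model gives `sample` the highest likelihood."""
    return max(
        models,
        key=lambda name: log_likelihood(sample, models[name], len(vocab)),
    )
```

For example, `best_match("she parked the lorry outside the flat")` picks the UK corpus because "lorry" and "flat" occur there and not in the US one. Words that appear in neither corpus receive the same smoothed probability under both models, so they do not affect the comparison.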

“We want to take someone’s past experience with language and develop a model of what that person knows,” says Johns. “That lets us identify which information can maximize that person’s learning potential.”

But Johns also studies clinical populations, and his work with Alzheimer’s patients has him thinking about how to apply his models to potentially help people at risk of developing the disease.

He says some people show slight memory loss without other indications of cognitive decline. These patients with mild cognitive impairment have a 10-15% chance of being diagnosed with Alzheimer’s in any given year, compared to 2% of the general population over age 65.
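
To get a feel for what those annual figures imply over several years, the short calculation below compounds them. It assumes, purely for illustration, that the annual rate stays constant and that each year is independent. The article does not state this, and real progression rates vary. The function name `cumulative_risk` is hypothetical.

```python
def cumulative_risk(annual_rate, years):
    """Probability of at least one diagnosis within `years`.

    Assumes a constant, independent annual rate, which is a simplification
    made only for this illustration.
    """
    if not 0.0 <= annual_rate <= 1.0:
        raise ValueError("annual_rate must be between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    return 1.0 - (1.0 - annual_rate) ** years


# Annual rates quoted in the article: 10-15% for MCI, 2% for the general
# population over age 65.
five_year_risk = {
    "MCI (10% per year)": cumulative_risk(0.10, 5),
    "MCI (15% per year)": cumulative_risk(0.15, 5),
    "General population (2% per year)": cumulative_risk(0.02, 5),
}
```

Under these assumptions the five-year figures come to roughly 41% and 56% for the MCI range, against about 10% for the general population.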

“We’re finding that people who go on to develop Alzheimer’s across time are showing specific types of language loss and production where they seem to be losing long-distance semantic associations between words, as well as low-frequency words,” he says.

“Can we develop tasks and stimuli that will allow that group to retain their language ability for longer, or develop a more personalized assessment to understand what type of information they’re losing in their cognitive system?

“This research program has the potential to inform these important questions.”

The increasing prevalence of dementia among the elderly population is a major societal challenge, leading to a growing demand for diagnostic services for deficits in memory and cognitive performance.

One major diagnostic focus is the early distinction between memory and cognitive complaints that are likely to evolve into neurodegenerative disease and those arising from functional symptoms or non-neurological disorders. In this area, terminology and diagnostic criteria are still under discussion.

For example, the general description of “a person reporting the feeling of an impairment of the cognitive function” is variously named “subjective cognitive impairment,” “subjective cognitive decline,” “subjective memory complaints,” or “functional memory disorder,” among other terms, and robust diagnostic criteria are not yet available (Stewart, 2012; Burmester et al., 2016), although a specific working group has proposed research criteria (Jessen et al., 2014). Moreover, extensive research over the past decades in the dementia field has recognized “an intermediate state of cognitive function between the changes seen in aging and those fulfilling the criteria for dementia and often Alzheimer disease (AD),” named Mild Cognitive Impairment (MCI; Petersen, 2011).

From a clinical point of view, MCI has since been categorized into two major subtypes, i.e., amnestic MCI (aMCI) and non-amnestic MCI (naMCI), each involving one (single) or more (multiple) cognitive domains (Petersen et al., 2014), which might or might not evolve into dementia. When it evolves into dementia, MCI is preceded by a very long biological history of the disease, as suggested by longitudinal models of the alteration of AD biomarkers including Aβ42 and tau in the cerebrospinal fluid (CSF), amyloid deposition at PET, MRI alterations and FDG PET abnormalities (Selkoe and Hardy, 2016).

This has led to the identification of new entities to be considered as research criteria, referred to as “prodromal AD” by the International Working Group-2 (IWG-2; Dubois et al., 2014) and “MCI due to AD” by the AD group at the National Institute on Aging-Alzheimer’s Association (NIA-AA; Albert et al., 2011). This preclinical period could offer a window of opportunity for drug development, risk assessment, and prevention (Calzà et al., 2015; Epelbaum et al., 2017; Ritchie et al., 2017).

Overall, these studies directed research attention to the feasibility of detecting early cognitive changes, and several initiatives and research efforts are in progress, focusing on identifying the best predictor among the available cognitive tests (Mortamais et al., 2017). Memory is probably the most investigated domain.

Episodic memory functioning seems to be a robust predictor of dementia in prospective studies based on in vivo amyloid imaging (Bäckman et al., 2005; Hedden et al., 2013). Some aspects of language have also been the subject of growing interest, and most of these studies focused on verbal ability, verbal learning and memory, naming, category or letter verbal fluency, verbal episodic memory, etc.

The evaluation of the linguistic functions is usually performed by means of traditional pencil-and-paper or corresponding computer-assisted tests (Ostberg et al., 2005; Duong et al., 2006; Cuetos et al., 2009; Joubert et al., 2010; Pakhomov et al., 2012). Composite scores exploring both memory and language have also been proposed, such as the Alzheimer’s Disease Cooperative Study Preclinical Alzheimer Cognitive Composite (ADCS-PACC) (Donohue et al., 2014) and the Alzheimer prevention initiative (API) composite score (Langbaum et al., 2014).

The API score is composed of seven test scores, i.e., category fluency (fruits and vegetables), Boston naming test, Logical Memory-delayed recall, East Boston naming test immediate recall, Ravens progressive matrices subset, symbol digit modalities, and the Mini-Mental State Examination (MMSE) orientation to time items. Composite scores are now being used as the primary end-point in secondary prevention trials in AD involving presenilin 1 E280A mutation carriers (API trial, Ayutyanont et al., 2014) or in anti-amyloid treatments in asymptomatic individuals that show early amyloid accumulation (Donohue et al., 2014).

While these tests have sometimes revealed significant differences between MCI and normal elderly participants, the range of variation of the scores in MCI often overlaps with that of normal people, making their clinical use unreliable in categorizing individual participants (Taler and Phillips, 2008). Even more confusing results emerged from the few studies on Subjective Cognitive Complaints (Martins et al., 2012).

New perspectives are being opened up by the interest in computerized analysis of spoken language (Natural Language Processing techniques), together with the availability of numerous algorithms for analysis and classification of “speech.”

The experience gained in studies of “electronic linguistic corpora,” chosen by virtue of their representativeness in characterizing a particular language or linguistic variety, is now opening up new perspectives for language analysis in clinical contexts, particularly because these approaches might quantify many aspects of language, at both the segmental and suprasegmental level, such as prosody and rhythm, that are not explored by conventional language tests.

When applied to “pathological language” (i.e., linguistic productions of subjects affected by a developmental or acquired speech and language disorder), this approach and related technologies would also have the significant advantage of representing a natural and spontaneous language record, outside the diagnostic set-up of the conventional neuropsychological test of language, potentially applicable to large sections of the population using low-cost tools.

With this connection established, we intended to investigate whether the analysis of the spontaneous speech performed by Natural Language Processing techniques could reveal alterations of the language performance in early cognitive decline.

This proof-of-concept study used Natural Language Processing techniques to analyze the spontaneous speech produced by participants in response to three tasks, i.e., the description of a drawing, the details of a last dream, and the description of a working day. The study included 96 participants, divided into a control group (CG, N = 48) and three pathological groups (PG), i.e., amnestic MCI (aMCI, N = 16), multiple domain MCI (mdMCI, N = 16) and early dementia (eD, N = 16).


The early recognition of cognitive decline is a widely shared goal in the aging global population, and the focus is rapidly moving from defined clinical entities, such as MCI, to pre-clinical or asymptomatic stages in a general and still poorly defined frame addressed as “cognitive frailty” (Calzà et al., 2015).

Specifically, early recognition helps to diagnose early dementia; identify dementia in at-risk individuals; design preventive clinical trials; identify reversible cognitive deficits associated with systemic diseases (metabolic, renal, cardiovascular, etc.), depression, or inappropriate pharmacological regimens; support secondary and tertiary prevention; and define more appropriate health and social policies (Sugimoto et al., 2018; Vella Azzopardi et al., 2018).

Language has a central role among the cognitive domains that may reveal early signs of decline, and it has become an established topic of research and clinical monitoring of AD progression (reviewed by Bucks et al., 2000; Kempler and Goral, 2008; Taler and Phillips, 2008; Shafto and Tyler, 2014; Szatloczki et al., 2015).

Extensive literature on the use of traditional tests for language assessment, especially with lexical and semantic access tasks, provides evidence that the lexico-semantic system is already affected in the initial stages of the disease, and patients have difficulties in tasks such as picture naming (Jacobson et al., 2002) and phonemic and semantic verbal fluency (Marczinski and Kertesz, 2006; Rascovsky et al., 2007). In contrast, the phonological, morphological and syntactic systems are believed to be relatively preserved in the initial stages of the disease, as indicated by tasks such as reading letters and words (Stilwell et al., 2016).

Studies evaluating early language signs in prodromal or preclinical stages, such as MCI (Taler and Phillips, 2008; Drummond et al., 2015; Szatloczki et al., 2015; Hernández-Domínguez et al., 2018), report inconsistent results. For example, some authors described significant differences between controls and MCI in semantic fluency and naming tests (Ostberg et al., 2005; Duong et al., 2006; Radanovic et al., 2007; Cuetos et al., 2009; Joubert et al., 2010; Ahmed et al., 2013; Mueller et al., 2016), while others did not confirm differences in Boston Naming, semantic and phonemic verbal fluency (Bschor et al., 2001).

Moreover, the conventional neuropsychological language tests used in these studies fail to explore different levels of the cognitive network involved in complex linguistic activities (phonological, morphosyntactic, semantic-lexical, semantic-pragmatic).

Thus, spontaneous speech analysis is attracting increasing interest in neuropsychological research for the early detection of cognitive decline (Drummond et al., 2015; Aramaki et al., 2016; Pistono et al., 2016), also because of the high complexity of tasks that require not just lexical-semantic abilities, but also memory and executive functions.

These novel approaches in language analysis may offer an opportunity to detect subclinical language changes that may be present several years before the clinical phase of the disease and can be considered one of the prodromal (or preclinical) manifestations of the disease.

Analysis at the discourse level is now possible using the computational tools of NLP, which allow the automatic detection of acoustic, lexical, semantic, syntactic and pragmatic parameters (Roark et al., 2011; Satt et al., 2013; König et al., 2015), thus leading to a quantitative description and analysis of speech elicited by visual stimuli (“please describe this picture”), or by episodic memory (“please describe your last dream”).

This approach has also been applied to the DementiaBank corpus, including narrative samples from 167 patients with “possible” or “probable” AD. By using two machine-learning classifiers, four factors distinguished AD vs. control narrative samples: semantic impairment, acoustic abnormality, syntactic impairment, and information impairment (Fraser et al., 2016).

Other studies in small cohorts of AD patients (mild, moderate, and severe) have indicated alterations in articulation rate, speech tempo, hesitation ratio, and rate of grammatical errors (Hoffmann et al., 2010); and in acoustic measurements, such as pitch level, pitch modulation, and speaking rate (Horley et al., 2010). However, it should be noted that the potential of spontaneous speech analysis is limited in AD patients, even at early stages of the disease, due to the already severe alteration of language performance.

Thus, increasing interest is directed toward subjects with MCI and toward those with subjective cognitive impairment or subjective memory complaints (Cuetos et al., 2009), a condition that could be a preclinical phase of MCI (Jessen et al., 2014; Eichler et al., 2015; Mendonça et al., 2016). In this proof-of-concept study we used NLP tools for the analysis of spontaneous discourse in early cognitive decline (aMCI and mdMCI) and in early Alzheimer disease (eD), included in the study as a “positive control.” According to other studies, the MCI group’s results could represent an intermediate stage between CG and eD (Drummond et al., 2015).

However, we demonstrated that aspects of language not considered in conventional neuropsychological tests are deeply affected in MCI compared to CG. In particular, the acoustic features of language (e.g., pause duration, speech segment duration, and phonation rate), which convey linguistic and paralinguistic information such as illocution, modality, emphasis, attitude, and emotion (Finegan, 2011), seem to be sensitive markers of early cognitive decline, also distinguishing amnestic from multiple domain MCI. Notably, pause alterations during autobiographical discourse collected by the EPITOUL ecological task (which explores episodic memory) have also been described in MCI by others (Pistono et al., 2016).

Consistent with previous scientific literature, the deterioration of verbal fluency, lexical retrieval process and discourse planning may result in longer hesitations, increased pauses and lower phonation rate. These acoustic features may discriminate between control groups and aMCI (König et al., 2015).
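
As a rough illustration of how timing features of this kind could be computed, the sketch below starts from a list of detected speech intervals and derives mean pause duration, mean speech segment duration, and phonation rate. The definitions used here are assumptions for the example (in particular, phonation rate is taken as speech time divided by total elapsed time), not the study's own procedure. The function name `acoustic_features` and both sample recordings are hypothetical.

```python
def acoustic_features(speech_segments):
    """Summarize timing features from (start, end) speech intervals in seconds.

    Assumes the intervals are sorted and non-overlapping, and treats every gap
    between consecutive intervals as a pause.
    """
    if not speech_segments:
        raise ValueError("need at least one speech segment")
    segment_durations = [end - start for start, end in speech_segments]
    pause_durations = [
        next_start - prev_end
        for (_, prev_end), (next_start, _) in zip(speech_segments, speech_segments[1:])
    ]
    total_time = speech_segments[-1][1] - speech_segments[0][0]
    if total_time <= 0:
        raise ValueError("recording must have positive duration")
    speech_time = sum(segment_durations)
    return {
        "mean_segment_duration": speech_time / len(segment_durations),
        "mean_pause_duration": (
            sum(pause_durations) / len(pause_durations) if pause_durations else 0.0
        ),
        # Assumed definition: share of elapsed time spent speaking.
        "phonation_rate": speech_time / total_time,
    }


# Invented interval lists for two 12-second samples (hypothetical data).
fluent = acoustic_features([(0.0, 4.0), (4.5, 9.0), (9.5, 12.0)])
hesitant = acoustic_features([(0.0, 2.0), (3.5, 5.0), (7.0, 8.0), (10.0, 12.0)])
```

In the invented example, the hesitant sample has longer pauses, shorter speech segments, and a lower phonation rate than the fluent one, which is the pattern the paragraph above describes. In practice the intervals would come from a voice activity detector or a time-aligned transcript.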

In contrast, speech rhythm seems to be rather preserved in the PGs included in our study, while in a study including eAD patients (MMSE > 24), a high variability of syllabic interval was reported (Martínez-Sánchez et al., 2017).

A number of studies have already demonstrated that the lexical-semantic system is often impaired in MCI and dementia: our results confirm the finding, showing that patients’ linguistic productions are semantically impoverished. Moreover, even though the correctness of grammatical form is generally preserved, syntax appears to be simplified overall.

Our study adds strong evidence to the emerging, but still puzzling, literature supporting spontaneous speech analysis as a potential tool for early detection of cognitive decline. The need to make these evaluation tools applicable on a large scale and at low cost has prompted researchers to devise automated forms of analysis in which collected speech samples, recorded and manually transcribed according to appropriate coding systems, are processed by software that can detect acoustic, lexical, semantic, syntactic, and pragmatic parameters (Thomas et al., 2005; Roark et al., 2011; Pakhomov et al., 2012; Satt et al., 2013). In these studies, speech was assessed through the collection of linguistic production samples obtained using various types of tasks.

An additional contribution from this study derives from the use of three different speech tasks. Although the description of a complex picture is the most widely used task (Goodglass et al., 2000; Bschor et al., 2001; Cuetos et al., 2009; König et al., 2015), we observed that the descriptions of a “working day” and “the last dream” seem to be more sensitive tasks, probably because they require memory recall and a more structured narration. Other researchers have investigated different aspects of language, such as communication abilities (Toledo et al., 2018), reading comprehension (Hudon et al., 2006; Schmitter-Edgecombe and Creamer, 2010), the repetition of complex sentences (König et al., 2015; Lust et al., 2015) or the ability to recognize grammatical correctness (Taler and Jarema, 2004). Other groups have applied discourse analysis to verbal productions recorded during classic episodic memory tests such as the Wechsler logical memory (Roark et al., 2007, 2011). The most ambitious studies have also used analytical tools applicable directly to the voice recordings of subjects (Meilán et al., 2014), showing a good correlation between the automated classification and that based on clinical and manual data processing (Roark et al., 2011; Satt et al., 2013; Hernández-Domínguez et al., 2018).
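
Two common, very coarse proxies for these properties are the type-token ratio (distinct words divided by total words) for lexical variety and mean sentence length for syntactic complexity. The sketch below computes both from a transcript. These measures are chosen here for illustration and are not necessarily the features used in the study. The function name `lexical_summary` and the two sample transcripts are invented.

```python
import re


def lexical_summary(transcript):
    """Return type-token ratio and mean sentence length (in words) for a transcript."""
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    # Lowercased word tokens: letters and apostrophes only.
    tokens = re.findall(r"[a-z']+", transcript.lower())
    if not tokens or not sentences:
        raise ValueError("transcript contains no words")
    return {
        "type_token_ratio": len(set(tokens)) / len(tokens),
        "mean_sentence_length": len(tokens) / len(sentences),
    }


# Invented transcripts for illustration only (hypothetical data).
varied = lexical_summary(
    "The woman is washing dishes while water overflows from the sink. "
    "Two children reach for cookies on a high shelf."
)
repetitive = lexical_summary(
    "The woman is there. The boy is there. The girl is there. The thing is there."
)
```

The repetitive transcript scores lower on both measures. Type-token ratio falls as a text gets longer, so a real comparison would need samples of similar length or a length-corrected measure.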


Results from this proof-of-concept study suggest that computerized speech analysis identifies alterations in MCI for language features not explored by conventional diagnostic neuropsychological tests, including language tests such as phonological and lexical fluency. Numerous acoustic features can distinguish between healthy controls and aMCI subjects, and lexical, rhythmic, and syntactic features may also be relevant, depending on the type of language task evaluated. While longitudinal studies (in progress) are necessary to confirm this hypothesis, we suggest that (i) speech analysis should be included as an exploratory end-point in AD prevention studies; (ii) speech analysis coupled with imaging studies in cognitive decline could provide new information on language neuroanatomy; and (iii) computerized speech analysis should also be considered for the development of novel tests for preclinical AD, thus contributing to the research priorities (prevention and identification of AD risk) identified by the WHO (Ministerial Conference on Global Action against Dementia, Shah et al., 2016).

University at Buffalo
Media Contacts:
Bert Gambini – University at Buffalo

Original Research: Closed access
“The influence of place and time on lexical behavior: A distributional analysis”. Brendan T. Johns, Randall K. Jamieson.
Behavior Research Methods doi:10.3758/s13428-019-01289-z.

