Lexical meaning is lower dimensional in psychosis.
Claudio Palominos, Frederike Stein, Tilo Kircher, Rosa Ayesa-Arriola, Lena Palaniyappan, Philipp Homan, Iris E Sommer, Wolfram Hinzen
Abstract
Diverse language models (LMs), including large language models (LLMs) based on deep neural networks, allow us to chart how people organize meanings in speech and how this process breaks down in mental disorders. Recent evidence has pointed to higher mean semantic similarities between words in people with psychosis, conceptualized as a 'shrunk' (more compressed) semantic space. Based on this, we hypothesized that the dimensionality of the vector spaces defined by the LM embeddings of speech samples would also be easier to reduce in psychosis. To test this, we used principal component analysis (PCA) to calculate different metrics serving as proxies for reducibility, including the number of components needed to reach 90% of variance and the cumulative variance explained by the first two components. For further exploration, intrinsic dimensionality (ID) was also estimated. Results, consistent across datasets in three languages, confirmed significantly higher reducibility of the semantic space in psychosis. This result points to the existence of an underlying intrinsic geometry of the space of semantic associations in speech, which may underlie more surface-level measurements such as semantic similarity. It also offers a new foundational approach to speech in mental disorders.
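As an illustration of the two PCA-based reducibility proxies named in the abstract, the following is a minimal sketch rather than the authors' pipeline. It assumes each speech sample is available as a matrix of word embeddings (rows are words, columns are embedding dimensions); the function name `pca_reducibility` and its parameters are hypothetical, and the abstract does not specify preprocessing, the embedding model, or how ID was estimated, so none of that is covered here.

```python
import numpy as np


def pca_reducibility(embeddings, variance_threshold=0.90):
    """Return two PCA-based proxies for how reducible an embedding matrix is.

    embeddings: array of shape (n_words, n_dims), one row per word.
    Returns (n_components, cum_first_two):
      n_components  = fewest principal components whose cumulative
                      explained variance reaches `variance_threshold`
      cum_first_two = cumulative variance explained by the first two components
    NOTE: hypothetical helper for illustration; not the paper's implementation.
    """
    X = np.asarray(embeddings, dtype=float)
    if X.ndim != 2 or X.shape[0] < 2:
        raise ValueError("expected a 2-D array with at least two rows")

    # PCA via SVD of the mean-centered data: squared singular values are
    # proportional to the variance captured by each principal component.
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()
    if total == 0:
        raise ValueError("embeddings have zero variance")

    cumulative = np.cumsum(variances / total)

    # First index at which the cumulative ratio reaches the threshold.
    idx = int(np.searchsorted(cumulative, variance_threshold))
    n_components = min(idx + 1, len(cumulative))

    cum_first_two = float(cumulative[min(1, len(cumulative) - 1)])
    return n_components, cum_first_two


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for embeddings (assumption, not real speech data):
    # one sample spread over all 50 dimensions, one confined to a 3-D subspace.
    spread = rng.normal(size=(200, 50))
    compressed = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 50))
    print("spread:    ", pca_reducibility(spread))
    print("compressed:", pca_reducibility(compressed))
```

Under these assumptions, a more reducible sample needs fewer components to reach the 90% threshold and has a larger share of variance in its first two components, which is the direction of the effect the abstract reports for psychosis.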