Psychology of Language by Dinesh Ramoo is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
© 2021 Dinesh Ramoo
The CC licence permits you to retain, reuse, copy, redistribute, and revise this book—in whole or in part—for free, provided it is for non-commercial purposes, adapted and reshared content retains the same licence, and the author is attributed as follows:
If you redistribute all or part of this book, it is recommended the following statement be added to the copyright page so readers can access the original book at no cost:
Download for free from the B.C. Open Collection.
Sample APA-style citation:
This textbook can be referenced. In APA citation style (7th edition), it would appear as follows:
Cover image attribution:
Ebook ISBN: 978-1-77420-132-9
Print ISBN: 978-1-77420-131-2
Visit BCcampus Open Education to learn about open education in British Columbia.
Anything of merit in this work is through the generosity of my teacher:
Dr. Andrew Olson.
BCcampus Open Education believes that education must be available to everyone. This means supporting the creation of free, open, and accessible educational resources. We are actively committed to increasing the accessibility and usability of the textbooks we produce.
The web version of this resource Psychology of Language has been designed to meet Web Content Accessibility Guidelines 2.0, level AA. In addition, it follows all guidelines in Appendix A: Checklist for Accessibility of the Accessibility Toolkit – 2nd Edition. It includes:
Element | Requirements | Pass? |
---|---|---|
Headings | Content is organized under headings and subheadings that are used sequentially. | Yes |
Images | Images that convey information include alternative text descriptions. These descriptions are provided in the alt text field, in the surrounding text, or linked to as a long description. | Yes |
Images | Images and text do not rely on colour to convey information. | Yes |
Images | Images that are purely decorative or are already described in the surrounding text contain empty alternative text descriptions. (Descriptive text is unnecessary if the image doesn’t convey contextual content information.) | Yes |
Tables | Tables include row and/or column headers that have the correct scope assigned. | Yes |
Tables | Tables include a title or caption. | Yes |
Tables | Tables do not have merged or split cells. | No |
Tables | Tables have adequate cell padding. | Yes |
Links | The link text describes the destination of the link. | Yes |
Links | Links do not open new windows or tabs. If they do, a textual reference is included in the link text. | Yes |
Links | Links to files include the file type in the link text. | Yes |
Audio | All audio content includes a transcript that includes all speech content and relevant descriptions of non-speech audio and speaker names/headings where necessary. | Yes |
Video | All videos include high-quality (i.e., not machine generated) captions of all speech content and relevant non-speech content. | Yes |
Video | All videos with contextual visuals (graphs, charts, etc.) are described audibly in the video. | Yes |
H5P | All H5P activities have been tested for accessibility by the H5P team and have passed their testing. | No |
H5P | All H5P activities that include images, videos, and/or audio content meet the accessibility requirements for those media types. | N/A |
Formulas | Formulas have been created using LaTeX and are rendered with MathJax. | N/A |
Formulas | If LaTeX is not an option, formulas are images with alternative text descriptions. | N/A |
Font | Font size is 12 point or higher for body text. | Yes |
Font | Font size is 9 point for footnotes or endnotes. | Yes |
Font | Font size can be zoomed to 200% in the webbook or eBook formats. | Yes |
We are always looking for ways to make our textbooks more accessible. If you have problems accessing this textbook, please contact us to let us know so we can fix the issue.
Please include the following information:
You can contact us one of the following ways:
This statement was last updated on July 28, 2022.
The Accessibility Checklist table was adapted from one originally created by the Rebus Community and shared under a CC BY 4.0 License.
This textbook is available in the following formats:
For more information about the accessibility of this textbook, see the Accessibility Statement in the front matter.
You can access the online webbook and download any of the formats for free here: Psychology of Language. To download the book in a different format, look for the “Download this book” drop-down menu and select the file type you want.
Format | Internet required? | Device | Required apps | Accessibility Features | Screen reader compatible |
---|---|---|---|---|---|
Online webbook | Yes | Computer, tablet, phone | An Internet browser (Chrome, Firefox, Edge, or Safari) | WCAG 2.0 AA compliant, option to enlarge text, and compatible with browser text-to-speech tools. | Yes |
PDF | No | Computer, print copy | Adobe Reader (for reading on a computer) or a printer | Ability to highlight and annotate the text. If reading on the computer, you can zoom in. | Unsure |
EPUB | No | Computer, tablet, phone | eReader app (EPUB) | Option to enlarge text, change font style, size, and colour. | Unsure |
HTML | No | Computer, tablet, phone | An Internet browser (Chrome, Firefox, Edge, or Safari) | WCAG 2.0 AA compliant and compatible with browser text-to-speech tools. | Yes |
The webbook includes a video and audio clips of different sounds as well as a number of interactive H5P activities. If you are not using the webbook to access this textbook, this content will not be included. Instead, your copy of the text will provide a link to where you can access that content online.
Even if you decide to use a PDF or a print copy to access the textbook, you can access the webbook and download any other formats at any time.
Psychology of Language by Dinesh Ramoo was funded by BCcampus Open Education.
BCcampus Open Education began in 2012 as the B.C. Open Textbook Project with the goal of making post-secondary education in British Columbia more accessible by reducing students’ costs through the use of open textbooks and other OER. BCcampus supports the post-secondary institutions of British Columbia as they adapt and evolve their teaching and learning practices to enable powerful learning opportunities for the students of B.C. BCcampus Open Education is funded by the British Columbia Ministry of Advanced Education and Skills Training and the Hewlett Foundation.
Open educational resources (OER) are teaching, learning, and research resources that, through permissions granted by the copyright holder, allow others to use, distribute, keep, or make changes to them. Our open textbooks are openly licensed using a Creative Commons licence and are offered in various eBook formats free of charge, or as printed books that are available at cost.
For more information about open education in British Columbia, please visit the BCcampus Open Education website. If you are an instructor who is using this book for a course, please fill out our Adoption of an Open Textbook form.
Dinesh Ramoo is a lecturer at Thompson Rivers University in Kamloops, Canada. After receiving his PhD in Psychology from the University of Birmingham in 2014, he worked in the United Kingdom, Sri Lanka, and Turkey before moving to Canada in 2019. He has also served as a consultant linguist for Google Inc. and Oxford University Press. His research interests include word-form encoding in English as well as Indian languages such as Hindi and Tamil. He employs experimental data from neurological patients with acquired language disorders as well as computational models to study language production.
Marie Bartlett is an Instructional Designer at Thompson Rivers University—Open Learning. Marie is dedicated to creating educational experiences that inspire and engage learners. She looks for opportunities to highlight active learning and creativity in her course design, and uses innovative pedagogical approaches to best utilize online environments for learning and teaching.
Marie grew up in the Czech Republic, where she started teaching high school level English as a Second Language, Studio Drawing, and History of Architecture. Her curiosity and sense of adventure took her to study and work in the United States and Canada. In 2006, Marie moved from face-to-face educational settings to work in distance education, which has been her passion ever since.
Marie’s educational background is in Learning and Technology (MA, Royal Roads University), and Art History and English Literature (BA honors, University of Victoria).
Nicole Singular is a Graphic Designer for Thompson Rivers University Open Learning. Nicole specializes in the intersection between educational technology and illustration with a focus on applying visual design within an online pedagogical context.
Eric Franks is a student at Thompson Rivers University. He is a Supplemental Learning Leader at TRU and a Youth Support Worker for Insight Support Services. He is fluent in English and Spanish.
Learning Objectives
Language is the most human of all qualities. No human population has been found that lacks language, and every population uses it not just for communication but as an instrument of cultural identity and transmission. Psycholinguistics is a discipline with roots in psychology and linguistics. It combines theories from both to develop a scientific understanding of language.
Given the centrality of language to human culture, its analysis and investigation go back thousands of years. The earliest formalization of a language was conducted almost 4000 years ago in Babylonia. As Sumerian was considered a language of prestige, word lists of the language were created to help people learn it as a foreign language. A similar position was held by Sanskrit in India. Around 1200 BCE, the oral transmission of the Vedas became standardized as people started to notice that the language was changing over time (Staal, 1986). Strict rules were developed to preserve the oral scriptures, which have survived to this day. Of the six canonical areas of knowledge considered necessary for the proper study of the Vedas, four dealt with language: śikṣā (phonetics and phonology), chandas (prosody), vyākaraṇa (grammar), and nirukta (etymology). This impetus towards standardization led Ancient Indians to analyze the language for its properties, and linguistics as a science was born. The 6th century BCE grammarian Pāṇini wrote the Aṣṭādhyāyī, a grammatical treatise on Sanskrit. The sounds of Sanskrit were organized into units based on place and manner of articulation (which we will discuss further in Chapter 2). These ideas influenced an interest in what appears to be early psycholinguistics in the form of the Sphoṭa school of linguistics. This school investigated how linguistic units are organized in the mind and produced as speech (we will visit this issue later in Chapter 9). At the same time, another school of linguistics was emerging in South India around India’s other classical language: Tamil. The Tolkāppiyam was written around the turn of the first millennium by an author known only as Tolkāppiyar (he who wrote the Tolkāppiyam). It adapted the ideas of the Sanskrit grammarians to an unrelated language.
At the same time, the Ancient Greeks were engaged in discussion about the origins of language. In Cratylus, Plato presents the idea that the meaning of words emerges from a natural process. His student Aristotle delved further into rhetoric and poetry as well as looking at language in terms of its possibilities for defining logic. The 4th century Roman grammarian Aelius Donatus compiled the Latin grammar Ars Grammatica which dominated linguistic thought in the Middle Ages. Indeed we still use his ideas for studying most European languages.
The Chinese were no less interested in linguistics, or Xiaoxue (小學). They divided their attention between three branches of knowledge: exegesis (Xungu, 訓詁), analysis of writing (Wenzi, 文字), and phonology (Yinyun, 音韻). The first glossary of words, the Erya, dates from the 3rd century BCE. Confucius was particularly concerned with the relationship between names and reality. In the Analects (12.11, 13.3) he considers moral and social collapse to be a result of people not acting according to their named roles. Similar efforts can be seen in the Middle East, with scholars attempting to standardize the description of Classical Arabic in the 8th century. Perhaps the most important injection of life into the field of linguistics was Sir William Jones’ 1786 work The Sanscrit Language [sic]. Jones proposed that Sanskrit and Persian resembled Classical Greek and Latin, starting off the field of comparative linguistics. Analyzing the sound rules that led to the divergence of these languages from a common ancestor has been a vibrant field within linguistics ever since.
Although linguistics has a venerable history across the world, psycholinguistics itself has a relatively recent history. Francis Galton studied word association as early as 1879 and Meringer and Mayer (1895) studied slips of the tongue which Freud (1901/1975) tried to analyse with his theory of psychodynamics.
The modern field of psycholinguistics can be traced to a 1951 conference held at Cornell University, USA. In describing the conference, Osgood and Sebeok (1954) were the first to use the word “psycholinguistics.” During this time, the dominant paradigm in psychology was behaviourism. Psychologists following the behaviourist tradition considered observable phenomena such as input (stimuli) and output (responses) to be the only things that need be investigated within psychology as a science. How the input was cogitated in the mind was considered too esoteric for scientific analysis because these processes were not measurable. As language was a type of behaviour, its acquisition and use were explained in behaviourist terms such as reinforcement and conditioning, as in Skinner’s famous book Verbal Behavior. Chomsky’s (1959) scathing review of Skinner’s book led to a revolution (of the cognitive variety) by showing how Skinner’s explanations for language acquisition and use fell short of empirical evidence and couldn’t explain natural language. He argued for a new theory called transformational grammar, in which underlying cognitive structure accounts for people’s intuitive grasp of language production and comprehension. The field of psycholinguistics was abloom with the rarified perfume of new ideas as researchers attempted to find empirical evidence for this new theory. This exploration continues to this day as we also delve into this field to explore language.
Consider the following poems:
Poem A
Hwæt. We Gardena in geardagum,
þeodcyninga, þrym gefrunon,
hu ða æþelingas ellen fremedon.
Poem B
Whan that Aprille with his shoures soote,
The droghte of March hath perced to the roote,
And bathed every veyne in swich licóur
Of which vertú engendred is the flour.
Poem C
Shall I compare thee to a summer’s day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer’s lease hath all too short a date;
You may be familiar with Poem C as a famous sonnet by Shakespeare. Some of the words, such as thee, may be a little odd, but you can understand it. Poem B may be a little more difficult to understand, but you may see some familiar words there. Try saying it out loud and you will hear even more familiar words from this beginning of Chaucer’s Canterbury Tales. Poem A (the beginning of Beowulf) would be unintelligible to a modern English speaker and sounds more like German. However, this is also English (or Old English), connected to us through a long line of speakers going back to the 7th century.
This shows us that languages change over time and may even change so much as to become different languages. Depending on how we count them, there are about 5000-6000 languages in the world. As languages change, is it possible that there were once fewer languages that then diverged into many over time? We do notice the apparent similarities between languages when we hear familiar words. As you can see in Figure 1.1, the word mother is quite similar in many European languages such as English (mother), Dutch (moeder) and Spanish (madre) but different from Turkish, Finnish and Hungarian. We can also see similarities with far-flung languages such as Sanskrit (Mata) and Persian (mâdar).
Detailed analysis of words and other linguistic elements across different languages has shown that most languages in Europe, West Asia and South Asia derive from a common ancestor called Proto-Indo-European (or PIE). Analyzing the common words found in these languages, scholars have tried to determine the common elements of this ancient culture. For example, the Proto-Indo-Europeans must have had horses, sheep and chariots, as these languages share words for them, while they do not have common words for palm trees and vines. There are some languages in Europe which are not descended from PIE, including Basque (a language isolate not related to any other language), Finnish, Hungarian and Estonian.
Indo-European consists of a large number of languages spread across the world. As seen in Figure 1.2, these can be broadly grouped into smaller families within the larger Indo-European language family. We have English and its closest cousins German, Dutch, Swedish, Norwegian, and Frisian grouped into the Germanic language family. Languages that descend from Latin, such as Italian, Spanish, French and Romanian, are classified as Romance languages. Russian, Polish, Czech, Slovak and Macedonian are classified as Slavic languages. A broad ribbon of related languages spreading from Eastern Turkey to India and Sri Lanka is known as the Indo-Iranian languages. These consist of languages descended from Ancient Persian, including Modern Persian, Pashto and Kurdish, and those descended from Vedic Sanskrit, including Hindi, Urdu, Marathi, Bengali, Punjabi (in India), Sinhalese (in Sri Lanka) and Dhivehi (in the Maldives). Some languages such as Greek, Albanian and Armenian remain isolated on their own within the larger Indo-European language family.
As we have already seen, Indo-European is not the only language family found in Europe. Finnish, Hungarian and Estonian fall within the Finno-Ugric language family. Some other language families include Afro-Asiatic (including languages spoken in North Africa and the Arabian Peninsula), Dravidian (spoken in Southern India as well as parts of Sri Lanka and Pakistan), Sino-Tibetan, as well as the plethora of language families in North America.
When talking about the Indigenous languages of North America, it is often the case that we confuse culture, tribe and language. As seen in Figure 1.3, Canada consists of six Indigenous cultural regions which codify the climate, outlook, and way of life of the people in them. If you compare Figure 1.3 with Figure 1.4, you will see that cultural regions do not necessarily overlap with language families. It is possible for Indigenous peoples of the same cultural region to speak very different languages.
As seen in Figure 1.4, there are 11 North American language families with 53 separate languages in Canada. This is a fraction of the over 296 languages belonging to 29 language families spoken north of Mexico. These languages and language families are as distinct from each other as the languages of Europe and Asia. This means we need to understand these languages with the same lens of diversity instead of grouping them under the category of Indigenous languages. The main legal categorization of these communities is under First Nation, Inuit and Métis consisting of 634 communities. These terms are continuously evolving and the term ‘First Nation’ itself consists of five sub-categories: Non-status, status treaty, status non-treaty, status Bill C-3 and status Bill C-31. These legal distinctions can overlap with cultural and linguistic boundaries.
Psycholinguistics employs a number of ways to understand language. These range from observational studies and speech error analysis to experiments and neuroimaging techniques. We also use computational models to simulate our theories about the language system. This section will explore some of the techniques employed by researchers. However, keep in mind that we are always developing new techniques to understand how language works.
The study of reaction time on cognitive tasks is a common psychological paradigm for trying to infer the duration, sequence and content of cognition. As seen in Figure 1.5, reaction time (or RT) is measured as the time between the onset of a stimulus and the response by the participant. The mean and the variance of reaction times are considered useful indices of processing speed. The most common form of reaction time experiment uses button presses. However, eye movements and voice onset (in repetition and reading tasks) can also be employed.
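For readers who like to see the arithmetic spelled out, the short Python sketch below shows how reaction times might be computed and summarized. The trial values are made up for illustration; this is not a description of any particular experiment.

```python
# A minimal sketch of summarizing reaction times (all trial values are invented).
from statistics import mean, variance

trials = [
    {"stimulus_onset_ms": 0, "response_ms": 412},
    {"stimulus_onset_ms": 0, "response_ms": 387},
    {"stimulus_onset_ms": 0, "response_ms": 455},
    {"stimulus_onset_ms": 0, "response_ms": 398},
]

# Reaction time = response timestamp minus stimulus onset, per trial.
rts = [t["response_ms"] - t["stimulus_onset_ms"] for t in trials]

print(f"Mean RT: {mean(rts):.1f} ms")        # central tendency of processing speed
print(f"RT variance: {variance(rts):.1f}")   # variability across trials
```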
One of the most popular reaction time paradigms is called priming. Priming is used in almost all areas of psychology. The basic idea is that if two things share some cognitive or psychological attribute, they will either facilitate or interfere with each other. However, if they do not share such similarities, there will be no such effect. For example, it is easier to recognize the word DOG if you have already seen the word CAT. This is a kind of semantic priming, in that both words belong to the same semantic category (ANIMAL). Speeding up of a response in this way is known as facilitation, while the slowing down of a response is known as interference.
As seen in Figure 1.6, the reasoning behind priming effects can be modelled as a web of interconnected ideas or concepts in the mind. Concepts that are connected semantically (dogs and frogs are both animals) or phonologically (dog and bog end with similar sounds) are more likely to produce priming. In Figure 1.6, semantic connections are indicated with straight lines while phonological connections are indicated with dotted lines. The idea is that encountering a stimulus (by seeing or hearing it) will not only activate that concept in the mind but also partially activate connected concepts to some degree. As such, when any one of those connected concepts is presented next, it will be retrieved more quickly because it has already been partially activated (or primed) by the previous activation.
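The following toy sketch illustrates this spreading-activation idea. The network, connection weights, and activation values are invented for illustration and are not taken from any model discussed in this book.

```python
# A toy spreading-activation sketch of priming (all weights are illustrative).
network = {
    "cat":  {"dog": 0.5, "mat": 0.3},                 # semantic and phonological neighbours
    "dog":  {"cat": 0.5, "bog": 0.3, "frog": 0.4},
    "frog": {"dog": 0.4, "bog": 0.3},
    "bog":  {"dog": 0.3, "frog": 0.3},
    "mat":  {"cat": 0.3},
}

activation = {word: 0.0 for word in network}

def present(word: str, amount: float = 1.0) -> None:
    """Activate a word and spread part of that activation to its neighbours."""
    activation[word] += amount
    for neighbour, weight in network[word].items():
        activation[neighbour] += amount * weight

present("cat")                                        # seeing CAT primes DOG
print(activation["dog"], activation["frog"])          # 0.5 vs 0.0: DOG is pre-activated, FROG is not
```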
As the brain is a vulnerable organ, it can be damaged by external or internal trauma. If blood flow and oxygen supply to neurons are constricted even for a few minutes, they begin to die. These sites of damage are called lesions. Such trauma can be from accidents, strokes, brain surgery, or the ingestion of certain toxins. Examining these lesions and associating them with the behavioural limitations of such patients can provide valuable information about which regions are responsible for which behaviour. Cognitive neuropsychology has contributed to psycholinguistics from the earliest times. Perhaps the earliest record of this is case 20 in the Edwin Smith Papyrus. It is the report of a patient with a head injury which led to the following observation: “…He is speechless. An ailment not to be cured.” This is a clear case of speech loss due to brain injury. Centuries later, Broca and Wernicke continued with such observations, and we will discuss them in Chapter 4. Cognitive neuropsychology attempts to relate the behavioural deficits of brain-damaged patients to models of normal processing. Shallice (1988) observed that cognitive neuropsychology has made significant advances in associating neurological disorders with cognitive models, emphasized the importance of single case studies over group studies, and contributed to the exploration of impaired brain behaviour as a way towards understanding unimpaired behaviour. While traditional lesion studies were conducted by post-mortem examination and backtracking to analyse the behaviour of the patient while alive, modern neuroimaging techniques allow us to examine lesions in patients while they are alive and conduct behavioural analysis in real time.
The advent of neuroimaging techniques has led to a flowering of new research in psycholinguistics. While traditional X-rays are not able to provide much detail on the brain, other technologies such as the measurement of electrical activity in the brain have provided valuable data. Such techniques include electroencephalography or EEG, which measures the brain’s electrical activity through electrodes placed on the scalp. An amplifier can then amplify the millivoltage differences across the scalp and provide a continuous reading of brain activity.
Psychologists go even further and measure such electrical activity by tying it to specific events (such as the presentation of a stimulus). Such event-related potentials or ERPs can have positive or negative polarities. These peaks in ERP readings are labelled according to their polarity (positive or negative) and the time difference from stimulus onset (in milliseconds). Some common ERPs include the N400 (detected 400 ms after stimulus onset as a negative voltage) and the P300 (detected 300 ms after stimulus onset as a positive voltage). As EEG and ERP measure electrical activity, they detect changes in the brain almost instantly. We can say they have very good temporal resolution. However, as they detect this electrical potential from the scalp, the signals tend to be averaged across multiple brain regions and neurons. Therefore, it is not always possible to pinpoint which brain region was actually involved in a particular EEG or ERP signal. In other words, these techniques have poor spatial resolution. Other techniques such as PET and MRI have been developed as a way to increase the spatial resolution of neuroimaging.
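In practice, an ERP waveform is usually obtained by averaging many EEG epochs that are time-locked to stimulus onset, so that random noise cancels out while the event-related signal remains. The sketch below illustrates this averaging idea with synthetic data; the epoch count, sampling assumptions, and the negative peak placed at 400 ms are all invented for the example.

```python
# A minimal sketch of ERP averaging over synthetic, noise-corrupted epochs.
import random

n_epochs, n_samples = 50, 700          # e.g., 700 samples read as 700 ms
signal = [0.0] * n_samples
signal[400] = -5.0                      # a hypothetical negative peak at 400 ms (an "N400")

epochs = [
    [signal[t] + random.gauss(0, 2.0) for t in range(n_samples)]  # signal plus noise
    for _ in range(n_epochs)
]

# Averaging across epochs: noise averages toward zero, the event-locked peak survives.
erp = [sum(epoch[t] for epoch in epochs) / n_epochs for t in range(n_samples)]
print(min(range(n_samples), key=lambda t: erp[t]))  # index of the most negative point, near 400
```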
PET (positron emission tomography) uses radioactive substances as tracers to produce images of brain activity. As the brain consumes a large amount of energy, injecting glucose into the body ensures that most of it ends up in brain regions that are active in a cognitive task. If the glucose contains isotopes that are radioactive, their emissions can be detected and transformed into images.
PET is employed both as a medical and a research tool. As seen in Figure 1.7, a short-lived radioactive isotope is injected into the participant. The most commonly used is F-18 labeled fluorodeoxyglucose (FDG). After a waiting period for the active molecule to become concentrated in the brain tissue (one hour for FDG), the participant is placed inside the scanner. As the tracer decays, its emissions are collected by the scanner. The scanner depends on detecting a pair of photons moving in opposite directions. Photons that do not have a temporal pair are ignored. Computational reconstruction uses statistical analysis and error correction to produce images such as Figure 1.8, which shows a scan of an unimpaired participant.
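The coincidence logic can be illustrated with a small sketch: two detection events are kept only if they arrive within a very short time window of each other, and unpaired events are discarded. The timestamps and window below are invented for illustration and do not reflect the parameters of any real scanner.

```python
# An illustrative sketch of coincidence detection (all values are made up).
events = [1.0, 1.000004, 5.0, 9.0, 9.000003]   # detector timestamps in microseconds
window = 0.00001                               # hypothetical coincidence window

pairs, i = [], 0
while i < len(events) - 1:
    if events[i + 1] - events[i] <= window:    # two photons close enough in time
        pairs.append((events[i], events[i + 1]))
        i += 2
    else:                                      # no temporal pair: ignore this event
        i += 1

print(pairs)   # [(1.0, 1.000004), (9.0, 9.000003)]
```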
As you can imagine, the main issue with PET is the injection of radioactive material into the body. Various jurisdictions set standards on the maximum amount of radiation that a person can be exposed to in a year. This means that the same participant can only take part in a small number of PET scans which limits the amount of data collection possible in psychological studies. Another factor is the expense of PET scanners and the radioactive tracers.
An alternative to PET that doesn’t use radioactive substances is magnetic resonance imaging (MRI). This employs powerful electromagnets to affect hydrogen atoms, which are abundant in the human body in water and fat. The atomic nuclei of hydrogen atoms are able to absorb radio frequency energy when placed in a magnetic field. The resulting spin polarization can produce a radio frequency signal that can be detected and analyzed. Varying the parameters of the radio pulse sequence can produce different contrasts between brain tissues based on the properties of their constituent hydrogen atoms. Computational processing of the signals can produce a highly detailed 3D image of the brain. However, this is a static image of the tissues without any indication of brain activity.
Recently, fMRI (functional magnetic resonance imaging) has come to the forefront as a way to overlay images of brain activity on MRI scans. It measures the signal given off by hemoglobin in the blood, which differs depending on whether the hemoglobin is carrying oxygen. It is assumed that the areas of the brain that are most active are the most likely to take in more blood (for energy). Therefore, measuring blood flow within different brain regions can indirectly show us a measure of their activation during particular cognitive tasks. This type of scan provides better temporal and spatial resolution than PET. However, as there is a 1-5 second lag between brain activation and detection, the temporal resolution of fMRI is inferior to EEG.
Neuroimaging is at the forefront of psycholinguistic research into language processing in the brain. These techniques can tell us about the time course of various cognitive processes and the extent to which mental processes interact with each other. However, they are still quite expensive and vary in terms of their temporal and spatial resolutions. As can be seen in Figure 1.10, different techniques vary in terms of how accurately they measure timing and active brain regions. EEG can detect brain activity with high temporal resolution but cannot tell us exactly where it originated. As signals are all detected on the surface of the head, we cannot be sure whether they originated in the cortex or in areas deeper inside the brain. On the other hand, PET and fMRI are quite good at providing spatial information. However, as they rely on the flow of fluids (blood), there is a temporal lag between when a brain region becomes active and when the signal is detected by the scanner.
Methodological limitations also exist, as most of these techniques require the participant to be still during the scan. This limits the ability to study overt speech or other movement. In addition, the use of powerful magnets in fMRI means that participants with certain metal implants cannot take part in such studies (the magnetic field can pull on or heat the metal and injure the participant).
A more serious limitation of any neuroimaging technique is the difficulty in interpreting the results. How do we know what is causing a particular activity? We can see when or where something is happening, but not necessarily how. Observing neural activity is not the same as observing mental activity. Many studies average the results from multiple participants. How can we be sure that all of them are using the same brain regions for similar activities? However, even with such limitations, these methods have opened up a wide range of insights into the neurological basis of language. As new methods are developed, we may even see these methods employed regularly for research and rehabilitation.
Figure 1.5 Reaction Time Experiment
A diagram showing the process of testing someone’s reaction time to seeing a number on a computer screen and pressing the number on their keyboard:
[Return to the place in text (Figure 1.5)]
Figure 1.7 Positron Emission Tomography Schema
Function of a PET machine. A scanner detects the emissions of the short-lived radioactive isotope in the brain of the subject, transmits this information to a Coincidence Processing Unit, which is subsequently used to reconstruct an image of the subject’s brain activity.
[Return to place in the text (Figure 1.7)]
Figure 1.9 fMRI Activation in an Emotional Stroop Task
fMRI scans of six brains. The first three images display the brain’s response to expressions, while the last three illustrate the brain’s response to words. Coloured marks from red to yellow are used to qualitatively assess the strength of the brain’s response, in addition to the location of brain activity.
[Return to place in the text (Figure 1.9)]
Figure 1.10 Comparing Brain Imaging Techniques
A labeled, three-dimensional graph comparing the several brain imaging techniques on the axes of Temporal Resolution, Portability, and Spatial Resolution.
Whole brain imaging techniques listed by spatial resolution from low to high:
Local brain imaging techniques listed by spatial resolution from low to high:
[Return to place in the text (Figure 1.10)]
Language is a multifaceted attribute of humanity. Indeed, while we consider language in terms of speech, most of our use of language is in the form of an internal monologue within our own minds. In addition, sign language is as much of a language as spoken language. Sign language is not just a set of gestures any more than spoken language is just a set of sounds. The way in which the sounds are organised, how they fit together to form words and meaning, as well as the order of the words in an utterance, are all amazingly complex subjects that we will touch upon in this book.
We also need to contend with language in its visual form. The written word allows us not just to communicate as usual (across space) but also across time. I can understand (or at least try to understand) what a poet or scholar was thinking thousands of years ago by reading what they wrote. This power to record language allows us to transcend the limitations of nature and transmit our ideas through the generations independent of biological constraints.
Living Language
This book will revolve around five overarching themes. First, we will try to immerse ourselves in the basics of the linguistic theories relevant to the exploration of psycholinguistics. Any understanding of the cognitive processes involved in language must first establish a thorough understanding of the linguistic ideas that underpin its study. This theme will be divided across the various chapters but will mostly be found in Chapters 2 and 3.
The second theme will explore the various processes involved in language processing and how they interact. For example, does our ability to read influence how we speak? Does memory play a role in how language comprehension and production occur?
The third theme will be the exploration of the various theories and models that are employed by psychologists to understand language. For example, how do we understand language production? What are its stages? If we speak more than one language, does that mean we need separate mechanisms to process those languages, or do they overlap? Do we use the same set of mechanisms to process words we know and novel words that we are learning? Models are a staple of the sciences, from the heliocentric model of the solar system to the various models of the atom. Psycholinguistics also employs models to illustrate how different psychological processes interact to produce human behaviour (in our case, language).
Fourth, we will explore what evidence exists to support these theories and models. After all, we can have an elegant model that explains some psychological process, and while it may make sense to us, without evidence we have no idea whether it holds in the real world. Sometimes the model evolves from existing data and sometimes the model precedes the evidence. For example, Copernicus proposed the heliocentric model for its elegance and simplicity, but the real evidence to support the model didn’t arrive until much later with Kepler and Galileo Galilei. Similarly, psychologists may propose models that appear to make sense and then modify them as the evidence appears from numerous studies. The evidence that we will look at ranges from observational studies and experimental data to computational models and speech error analysis. This last type of evidence can come from the errors made in everyday speech (meticulously gathered over the years by patient psycholinguists) or mistakes that appear in the speech of people who have suffered some impairment to their language system through brain damage (such as a stroke). Analysing and combining these disparate sources of knowledge can yield fecund ground for new theories and models for psycholinguists.
Finally, this book will explore how all of the preceding themes can be enriched by the inclusion of diverse languages, particularly the Indigenous languages of Canada. Psycholinguistics has been dominated by the study of European languages, as linguistics has its origins in the comparative studies between Western languages. Even though the recording and analysis of Indigenous languages was conducted in earnest by some linguists, this was always as a way to show the contrasts between these languages and the languages of the colonizers. This is regrettable because the diversity of languages across the world can provide us with new clues to how language and cognition work. For example, do the languages we speak influence the way we think and behave? Analysing a small set of languages with very similar cultural and historical backgrounds (as is the case with the languages of Europe) may not allow us to answer such questions. By opening ourselves to the rich linguistic treasures that are spread across Canada, from English and French to the many languages of the First Nations, Métis and Inuit communities, we can truly appreciate the possibilities of language as a human universal.
This chapter introduced us to the subject of psycholinguistics and the major themes that pervade this book. We explored the history of psycholinguistics and the classification of languages into language families. We also saw the linguistic diversity that exists in Canada which presents us with new opportunities to explore the cognitive and linguistic possibilities of human language. We also considered some of the research techniques employed by psycholinguists to study language and their limitations.
Key Takeaways
Exercises in Critical Thinking
Bod, R. (2014). A new history of the humanities: The search for principles and patterns from antiquity to the present. Oxford University Press.
Chomsky, N. (1959). Review of “Verbal behavior” by B. F. Skinner. Language, 35, 26–58.
Freud, S. (1901/1975). The psychopathology of everyday life (Trans. A. Tyson). Harmondsworth, UK: Penguin.
Meringer, R., & Mayer, K. (1895). Versprechen und Verlesen: Eine psychologisch-linguistische Studie. Stuttgart: Göschen.
Osgood, C. E., & Sebeok, T. A. (Eds.). (1954/1965). Psycholinguistics: A survey of theory and research problems. Bloomington: Indiana University Press.
Shallice, T. (1988). From neuropsychology to mental structure. Cambridge: Cambridge University Press.
Staal, J. F. (1986). The fidelity of oral tradition and the origins of science. North-Holland Publishing Company.
Learning Objectives
This chapter is an introduction to various articulatory building blocks that make up human language: sounds and syllables. We will explore how we define the specific sounds of language and how we can describe them. You may encounter a lot of technical terminology and it may appear difficult at first to understand these terms. However, you will find it worthwhile as you will gain valuable insight into how you speak. You will also find it easier to understand the rest of this book as the classification of sounds forms the basis for a lot of what we will be discussing later.
The sounds we produce can be described in terms of their physical properties and in terms of how they are articulated; the physical and acoustic details of speech sounds are studied in phonetics, while the way sounds are organized and used within a particular language is studied in phonology. Think about how you produce the ‘t’ at the beginning of the word ‘tin.’ If you are a native speaker of English, you will produce a small burst of air as you produce the ‘t’. This is not the case when you produce the ‘t’ in the word ‘sit.’ The ‘t’ in ‘tin’ is aspirated and the ‘t’ in ‘sit’ is unaspirated. Even if you produce the ‘t’ in ‘tin’ without aspiration, it may sound odd, but it doesn’t change the meaning of the word in English. We call these different sounds phones. However, in some languages (such as Hindi), aspiration does change meaning. Therefore, in Hindi there is a distinction between unaspirated [b] in [bɑːluː] ‘sand’ and aspirated [bʰ] in [bʰɑːluː] ‘bear’. As English doesn’t differentiate between aspirated and unaspirated sounds, Mowgli’s bear buddy in Rudyard Kipling’s ‘The Jungle Book’ is simply called Baloo. Similarly, the ‘gh’ in Bagheera is an aspirated [ɡʰ] sound which is not pronounced as such in English. When we write out a phone in linguistics, we place it between two square brackets (as seen above).
The smallest sound unit that can distinguish meaning in a language is known as a phoneme. In English, the aspirated and unaspirated ‘t’ sounds are both considered one phoneme as they are not distinguished by speakers of that language. When such sound variants occur without being differentiated by speakers of a language, they are known as allophones. However, in Hindi the aspirated and unaspirated ‘t’ sounds are separate phonemes. When we write out phonemes in linguistics, we place them between two forward slashes. So, a phonemic description of the word ‘pin’ would look like /pɪn/, while a phonetic description would look like [pʰɪn].
In order to discover all the phonemes in a language, we often employ minimal pairs. Two words that differ from each other in just one phoneme are known as a minimal pair. Consider ‘kit’ and ‘kid’. Substituting ‘t’ for ‘d’ changes the meaning of the word, but replacing [kɪt] with [kɪtʰ] would not. Therefore, /t/ and /d/ are separate phonemes in English, while [t] and [tʰ] are allophones of the same phoneme.
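The minimal-pair test is easy to express as a simple procedure: two words form a minimal pair if their phonemic transcriptions are the same length and differ in exactly one position. The sketch below applies this idea to a tiny invented lexicon; the transcriptions are simplified and purely illustrative.

```python
# A small sketch of the minimal-pair idea over an invented toy lexicon.
lexicon = {"kit": "kɪt", "kid": "kɪd", "cat": "kæt", "bat": "bæt", "bad": "bæd"}

def is_minimal_pair(a: str, b: str) -> bool:
    """Same length and exactly one differing segment."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

pairs = [
    (w1, w2)
    for w1 in lexicon for w2 in lexicon
    if w1 < w2 and is_minimal_pair(lexicon[w1], lexicon[w2])
]
print(pairs)   # e.g., ('kid', 'kit'), ('cat', 'kit'), ('bat', 'cat'), ('bad', 'bat')
```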
We speak by moving parts of our vocal tract (see Figure 2.1). These include the lips, teeth, mouth, tongue and larynx. The larynx or voice box is the basis for all the sounds we produce. It modifies the airflow to produce different frequencies of sound. By changing the shape of the vocal tract and the airflow, we are able to produce all the phonemes of spoken language. There are two basic categories of sound that can be classified in terms of the way in which the flow of air through the vocal tract is modified. Phonemes that are produced without any obstruction to the flow of air are called vowels. Phonemes that are produced with some kind of modification to the airflow are called consonants. Of course, nature is not as clear-cut as all that and we do make some sounds that are somewhere in between these two categories. These are called semivowels and are usually classified alongside consonants as they behave similarly to them.
While vowels do not require any modification of the airflow, the production of consonants does. This obstruction is produced by bringing parts of the vocal tract into contact. These places of contact are known as places of articulation. As seen in Figure 2.2, there are a number of places of articulation involving the lips, teeth, and tongue. Sometimes the articulators touch each other, as in the case of the two lips coming together to produce [b]. At other times, two different articulators come into contact, as when the lower lip is brought against the upper teeth to produce [f]. The tongue can touch different parts of the vocal tract to produce a variety of consonants by touching the teeth, the alveolar ridge, the hard palate or the soft palate (or velum).
While these places of articulation are sufficient for describing how English phonemes are produced, other languages also make use of the glottis and epiglottis among other parts of the vocal tract. We will explore these in more detail later.
Figure 2.1 Parts of the Human Vocal Tract
A labeled image of the anatomical components of the human vocal tract, including the nasal cavity, hard palate, soft palate or velum, alveolar ridge, lips, teeth, tongue, uvula, esophagus, trachea, and the parts of the larynx, which include the epiglottis, vocal cords, and glottis.
[Return to place in the text (Figure 2.1)]
Figure 2.2 Places of Articulation
A labeled image illustrating the anatomical components of the human vocal tract that are involved in English phonemes. These include the glottal, velar, palatal, dental, and labial structures.
[Return to place in the text (Figure 2.2)]
Consonants, as we saw earlier, are produced with some obstruction to the airflow through the vocal tract. This obstruction is created by bringing a variety of articulators together at what are called places of articulation. There can also be variation in how the airflow is controlled when travelling through the vocal tract, and this is known as manner of articulation. For example, when we say [p], [t] or [k], the flow of air is stopped for a moment before being released. On the other hand, the flow of air is released with some stricture when we produce [s], [f] or [ʃ] (the sound we write in English as ‘sh’). We call sounds that stop the flow of air for a moment stops or plosives. Sounds that are produced with some kind of friction, such as [s] and [f], are called fricatives. When we also let the flow of air travel through the nasal cavity, we produce nasal sounds such as [m] and [n]. Try holding your nostrils closed with your fingers while saying ‘ma’. You will find it difficult to do so as the flow of air needs to travel through your nasal passage to produce it. There are other sounds we produce which bring the articulators together with some degree of approximation. These sounds are called approximants and include [l] and [w]. When two consonants are produced in close association as if they were one sound, we call them affricates. English has two affricates, which are written in English as ‘ch’ and ‘j’. However, the International Phonetic Alphabet or IPA transcribes them as [tʃ] and [dʒ]. This shows that the ‘ch’ sound of ‘chair’ is produced by the combination of [t] and [ʃ]. Similarly, IPA uses [j] to refer to the sound written in English with ‘y’. The sound used to represent the ‘j’ sound of ‘juice’ is [dʒ]. This shows that this affricate is produced with both [d] and [ʒ]. Another difference between consonants is whether the vocal cords vibrate or not when they are produced. This is called voicing and is seen in the difference between the unvoiced [p] and the voiced [b].
Table 2.1 English Consonants
blank | Place of Articulation | ||||||
---|---|---|---|---|---|---|---|
Manner of Articulation | Labial | Dental | Alveolar | Post-alveolar | Palatal | Velar | Glottal |
Nasal | m [audio] | blank | n [audio] | blank | blank | ŋ [audio] | blank |
Unvoiced Stop | p [audio] | blank | t [audio] | blank | blank | k [audio] | blank |
Voiced Stop | b [audio] | blank | d [audio] | blank | blank | g [audio] | blank |
Unvoiced Affricate | blank | blank | blank | tʃ [audio] | blank | blank | blank |
Voiced Affricate | blank | blank | blank | dʒ [audio] | blank | blank | blank |
Unvoiced Fricative | f [audio] | θ [audio] | s [audio] | ʃ [audio] | blank | blank | h [audio] |
Voiced Fricative | v [audio] | ð [audio] | z [audio] | ʒ [audio] | blank | blank | blank |
Approximant | blank | blank | l [audio] | ɹ [audio] | j [audio] | w [audio] | blank |
Table 2.1 shows the full range of consonants found in English. The symbols used are from the IPA, allowing us to describe and discuss these phonemes across different languages without confusion. Most of the symbols will be familiar to those who write using the English alphabet, though some will be different. Let’s explore this chart and see how we can describe these phonemes. Using Table 2.1, we can classify [m] as a bilabial nasal, meaning that it is produced with the two lips coming into contact and the airflow directed through both the mouth and the nasal passage. [p] is an unvoiced bilabial stop. This means that it is produced with the two lips coming into contact and the airflow stopped briefly before release. When the airflow is released, the vocal cords do not vibrate (unvoiced). [b], on the other hand, is produced with the same place and manner of articulation but with the vocal cords vibrating; so it is called a voiced bilabial stop.
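One way to think of Table 2.1 is as a lookup table in which each IPA symbol is associated with its voicing, place, and manner of articulation. The sketch below encodes a handful of English consonants this way so that descriptions like “voiced bilabial stop” can be generated automatically; the selection of symbols is only a small illustrative sample.

```python
# A sketch of a few rows of Table 2.1 as a feature lookup table.
CONSONANTS = {
    "m": ("voiced",   "bilabial",      "nasal"),
    "p": ("unvoiced", "bilabial",      "stop"),
    "b": ("voiced",   "bilabial",      "stop"),
    "t": ("unvoiced", "alveolar",      "stop"),
    "s": ("unvoiced", "alveolar",      "fricative"),
    "ʃ": ("unvoiced", "post-alveolar", "fricative"),
}

def describe(symbol: str) -> str:
    voicing, place, manner = CONSONANTS[symbol]
    return f"[{symbol}]: {voicing} {place} {manner}"

print(describe("b"))   # [b]: voiced bilabial stop
print(describe("ʃ"))   # [ʃ]: unvoiced post-alveolar fricative
```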
Table 2.2 shows you examples of English words for each consonant. You can see the need for IPA symbols as the English alphabet doesn’t have separate graphemes for such phonemes. For example, ‘thin’ and ‘this’ both begin with a grapheme ‘th’ but are produced differently. The ‘th’ in ‘thin’ is an unvoiced dental fricative while the ‘th’ in ‘this’ is a voiced dental fricative. Similarly, we don’t have a grapheme to represent the voiced post-alveolar fricative seen in the ‘s’ of ‘pleasure’ and ‘measure.’ The velar nasal is written with two letters ‘ng’ found in ‘sing’, ‘ring’ and ‘walking.’ Some dialects may pronounce the ‘g’ while others may not.
Table 2.2 English Consonants with Examples
We can compare English phonology to Table 2.3 where we see the arrangement of French consonants. We see a lot of similarities with some variations. For example, French has a palatal nasal consonant which may be familiar to you as the ñ in Spanish señor. The velar nasal is not native to French but seen in loan words such as ‘camping.’
Table 2.3 French Consonants
blank | Place of Articulation | |||||
---|---|---|---|---|---|---|
Manner of Articulation | Labial | Dental/Alveolar | Post-alveolar | Palatal | Velar | Uvular |
Nasal | m [audio] | n [audio] | blank | ɲ [audio] | (ŋ) [audio] | blank |
Unvoiced Stop | p [audio] | t [audio] | blank | blank | k [audio] | blank |
Voiced Stop | b [audio] | d [audio] | blank | blank | g [audio] | blank |
Unvoiced Fricative | f [audio] | s [audio] | ʃ [audio] | blank | blank | blank |
Voiced Fricative | v [audio] | z [audio] | ʒ [audio] | blank | blank | ʁ [audio] |
Plain Approximant | blank | l [audio] | blank | j [audio] | blank | ʁ [audio] |
Labial Approximant | blank | blank | blank | ɥ [audio] | w [audio] | blank |
Table 2.4 and Table 2.5 show the phonology of languages found in other parts of Canada. Secwepemc (also known as Shuswap) is a language spoken in the Canadian province of British Columbia. It is the northernmost of the Interior Salish languages and is spoken by over 1600 people. We can see in Table 2.4 a number of new sounds. In particular, we see glottalized phonemes (marked with a superscript symbol like a question mark) as well as rounded phonemes (marked with a superscript w). Glottalized sounds are produced with a constriction of the glottis. Some dialects of British English produce a similar glottal sound in the pronunciation of the ‘t’ in ‘bottle.’
Table 2.4 Secwepemc Consonants
blank | Place of Articulation | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Manner of Articulation | Labial | Dental | Alveolar | Palatal – Plain | Palatal – Velarized | Velar – Plain | Velar – Rounded | Uvular – Plain | Uvular – Rounded | Laryngeal – Plain | Laryngeal – Rounded |
Nasal – Plain | m [audio] | n [audio] | blank | blank | blank | blank | blank | blank | blank | blank | blank |
Nasal – Glottalized | mˀ | nˀ | blank | blank | blank | blank | blank | blank | blank | blank | blank |
Stop – Plain | p [audio] | t [audio] | blank | blank | blank | k [audio] | kʷ | q [audio] | qʷ | ʕ [audio] | ʕʷ |
Stop – Glottalized | pˀ | tɬˀ | blank | blank | blank | kˀ | kʷˀ | qˀ | qʷˀ | ʔ [audio] | ʕʷˀ |
Affricate – Plain | blank | blank | tʃ [audio] | tʃ [audio] | blank | blank | blank | blank | blank | blank | blank |
Affricate – Glottalized | blank | blank | tʃˀ | tʃˀ | blank | blank | blank | blank | blank | blank | blank |
Fricative | blank | ɬ [audio] | blank | ʃ [audio] | blank | x [audio] | xʷ | χ [audio] | χʷ | h [audio] | blank |
Approximant – Plain | blank | blank | l [audio] | j [audio] | ɰ [audio] | ɰ [audio] | blank | blank | blank | blank | w [audio] |
Approximant – Glottalized | blank | blank | lˀ | jˀ | ɰˀ | ɰˀ | blank | blank | blank | blank | wˀ |
Table 2.5 shows the consonants of the Inuit languages. These languages are spoken by Indigenous communities in the northernmost parts of North America and in parts of Greenland. We will talk more about these languages in later discussions about their writing systems. As you can see, these languages have quite a lot of familiar consonants. We also see the uvular rhotic that is found in French along with a uvular stop produced at the very back of the mouth. The consonants in parentheses are specific to particular languages within the language family. The retroflex /ʂ/ and /ʐ/ appear only in Inupiatun. Retroflex consonants are produced with the tongue rolled back to touch the hard palate. You may be familiar with these consonants from their appearance in Indian languages. The palatal /ɟ/ appears only in Natsilingmiutut, having merged with /j/ in the other languages. One other aspect of these languages is the lack of minimal pairs for voicing for most consonants. They have only unvoiced [p], [t] and [s], with no contrast with voiced [b], [d] or [z]. There is, however, a contrast between the unvoiced velar stop [k] and the voiced [ɡ].
Table 2.5 Inuit Consonants
blank | Place of Articulation | ||||||
---|---|---|---|---|---|---|---|
Manner of Articulation | Labial | Alveolar – Central | Alveolar – Lateral | Retroflex | Palatal | Velar | Uvular |
Nasal | m [audio] | n [audio] | blank | blank | blank | ŋ [audio] | blank |
Unvoiced Stop | p [audio] | t [audio] | blank | blank | blank | k [audio] | q [audio] |
Voiced Stop | blank | blank | blank | blank | blank | g [audio] | blank |
Unvoiced Fricative | blank | s [audio] | blank | (ʂ) [audio] | (ɟ) [audio] | blank | blank |
Voiced Fricative | v [audio] | blank | blank | (ʐ) [audio] | blank | blank | blank |
Unvoiced Approximant | blank | blank | ɬ [audio] | blank | blank | blank | blank |
Voiced Approximant | blank | blank | l [audio] | blank | j [audio] | blank | ʁ [audio] |
The examples from these four languages show us the diversity of consonants available in human languages. Not all languages need to have all these phonemes. As languages evolve and change, there is a constant pull in two directions: economy of production (not wanting to take too much effort in producing a sound) versus distinctiveness (having enough distinct sounds to produce all the differences necessary for distinguishing words, or minimal pairs). This competition exists in all languages from generation to generation and has resulted in the diversity we see in modern languages.
Vowels are produced without any obstruction to the articulatory tract (Ladefoged & Maddieson, 1996). Unlike consonants which result from the contact between articulators, vowels allow for a free flow of air. Therefore, we cannot define vowels in terms of place and manner of articulation. Rather, we define vowels in terms of the shape and position of the tongue. This means that while consonants in different dialects of a language remain relatively constant, vowels can differ widely. The defining terms for vowels are height, backness and roundness.
Height refers to the vertical position of the tongue. Try saying ‘ee’ and ‘aa’ repeatedly. You will notice your tongue moving up and down. Therefore, we say that the vowel produced in saying ‘ee’ is a high vowel and that produced in saying ‘aa’ is a low vowel. Backness is based on the tongue’s horizontal position and shape. This can be noticed in saying ‘ee’ and ‘oo’, where the latter makes the tongue go back. Roundness is not a property of the tongue but of the lips, which you will notice in making sounds such as ‘oo.’ Table 2.6 shows you the vowels found in English.
Table 2.6 English Vowels
blank | Front | Central | Back |
---|---|---|---|
Close | i [audio] | blank | u [audio] |
ɪ [audio] | blank | ʊ [audio] | |
Mid | e [audio] | (ɜ) [audio] | o [audio] |
ɛ [audio] | ə [audio] | (ɔ) [audio] | |
Open | blank | blank | ɑ [audio] |
æ [audio] | ʌ [audio] | blank |
Some languages such as French and Hindi also have nasalized vowels. Consider beau /bo/ ‘beautiful’ and bon /bɔ̃/ ‘good’ in French, which form a minimal pair in terms of nasalization of the vowel. When two vowels are combined within a syllable, they form a diphthong. Diphthongs can be seen in words such as cow /kaʊ/, pie /paɪ/, and boy /bɔɪ/.
Table 2.7 English Vowels with Examples
IPA Symbol | Name | Example |
---|---|---|
/æ/ | Near-open front unrounded vowel | trap [audio] |
/ɑ/ | Open back unrounded vowel | lot [audio] |
/ɔ/ /ɑ/ | Open-mid back rounded vowel | caught [audio] |
/ɪ/ | Near-close front unrounded vowel | bit [audio] |
/i/ | Close front unrounded vowel | geese [audio] |
/ə/ | Mid central vowel (schwa) | about [audio] |
/ʌ/ | Open-mid back unrounded vowel | gut [audio] |
/ɛ/ | Open-mid front unrounded vowel | bet [audio] |
/ʊ/ | Near-close near-back rounded vowel | foot [audio] |
/u/ | Close back rounded vowel | moose [audio] |
Table 2.8 English Diphthongs
IPA Symbol | Example |
---|---|
/eɪ/ | face [audio] |
/oʊ/ | goat [audio] |
/aɪ/ | nice [audio] |
/ɔɪ/ | choice [audio] |
/aʊ/ | south [audio] |
Watch the video Diphthongs (3 minutes).
Table 2.7 and Table 2.8 show us the vowels found in most varieties of Canadian English with examples. While consonants tend to be similar across dialects, vowels can vary greatly between dialects and countries. Therefore, you will find that the English spoken in the United Kingdom, Australia and New Zealand will have very different vowels when producing the same words.
While phonemes are the smallest units of sound, we don’t actually speak in phonemes. If I say the word ‘cat’ /kæt/ and record it, I won’t be able to break it into three units of /k/, /æ/ and /t/. Therefore, the smallest unit of articulation is not the phoneme but rather the syllable. Most native speakers of a language will know how many syllables are in a word in their language. You can try this in English by saying a word slowly. For example, the word ‘elephant’ has three syllables: e-li-phant. As seen in Figure 2.3, all syllables must have a mandatory nucleus or peak. This is usually a vowel. Some languages can also have a syllabic consonant as a nucleus of a syllable as in the English word ‘button’ [bʌtn̩] where there are two syllables [bʌ] and [tn̩]. You can see that the second syllable has no vowels but a syllabic [n̩] as the nucleus.
Consonants that come before the nucleus of a syllable are known as onsets and those that come after it are called codas. The nucleus and coda of a syllable form a group called a rime. These onsets and codas can be complicated or simple depending on what is allowed in a language. English allows up to three consonants in the onset and even more in the coda. Consider the word ‘twelfths’ /twɛlfθs/. It has two consonants in the onset and four consonants in the coda. Generally, the onset is more restricted in which consonants are allowed.
In English, you can have almost any consonant other than the velar nasal /ŋ/ as an onset. If there are two consonants in the onset and the first one isn’t /s/, then the second has to be either /l/, /r/, /w/, or /j/. If there are three consonants in the onset, then the first has to be an /s/, the second has to be either /p/, /t/ or /k/, and the third has to be either /l/, /r/ or /w/.
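To make these onset constraints concrete, here is a minimal sketch in Python that encodes only the simplified rules stated above. It is an illustration under those assumptions, not a full account of English phonotactics (which has further restrictions, such as disallowing */tl/ onsets).

```python
# A minimal sketch of the simplified English onset constraints described above.
NON_ONSET = {"ŋ"}                      # the velar nasal never begins an English syllable
LIQUIDS_GLIDES = {"l", "r", "w", "j"}

def is_legal_onset(onset):
    """Return True if a tuple of phoneme symbols satisfies the simplified rules."""
    if len(onset) == 0:
        return True                    # a syllable may have no onset at all
    if len(onset) == 1:
        return onset[0] not in NON_ONSET
    if len(onset) == 2:
        first, second = onset
        # If the first consonant isn't /s/, the second must be /l/, /r/, /w/ or /j/.
        return first == "s" or second in LIQUIDS_GLIDES
    if len(onset) == 3:
        first, second, third = onset
        return first == "s" and second in {"p", "t", "k"} and third in {"l", "r", "w"}
    return False                       # English onsets never exceed three consonants

print(is_legal_onset(("s", "t", "r")))   # True, as in 'street'
print(is_legal_onset(("p", "s")))        # False in English (though legal in Greek)
```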
Living Language
Consider some words in your language and try to syllabify them. Think of what phonemes occur in the onset, nucleus and coda of these syllables. Can you come up with long onsets and codas in your language? Ask a friend who speaks another language to do the same. Are there any differences between your languages’ syllable structure?
As we saw earlier, what is allowed in the onset, nucleus and coda of a language can be different across languages. While a sequence such as /pl/ is allowed in English, /ps/ would not be allowed. However, /ps/ is a legal sequence in Greek which is why we still spell ‘psychology’ with the ‘ps’ sequence even though English speakers don’t pronounce the ‘p’. Similar examples of Greek onsets include /mn/ as in ‘mnemonic’. Figure 2.4 illustrates the syllable structure of the word Tk’emlúps ‘Kamloops’ in Secwepemc. As we can see, Secwepemc allows the sequence of an unvoiced dental stop and a glottalized velar stop /tkˀ/ in the onset of its syllable. The sequence /ml/ is not a legal onset in Secwepemc, so it gets separated between two syllables, the /m/ becoming the coda of the first syllable and the /l/ becoming the onset of the second.
As syllables are the smallest units of articulation, they provide the rhythmic patterns of a language. In languages such as English, syllables carry features such as stress. This determines which syllable in a word receives emphasis. Try saying ‘I am recording a song.’ You will stress the second syllable in the word ‘recording.’ Now say ‘That was a record.’ You will find yourself placing more stress on the first syllable of the word ‘record.’ Languages in which stressed syllables occur at roughly equal intervals are called stress-timed languages. English is a stress-timed language, and we can see how this is employed in Shakespeare’s sonnets with iambic pentameter. Each line consists of five iambs and each iamb consists of two syllables with the second one more stressed than the first. As we see in Figure 2.5, this creates a beautiful pattern of unstressed and stressed syllables that may even go across word boundaries. Read the sonnet out loud and you will notice the stress patterns.
Unlike English, other languages may produce each syllable with equal time. These are called syllable-timed languages. French is a good example of such a syllable-timed language. Poetry in syllable-timed languages will make less use of stress and will take into account what consonants appear in the coda of the syllable to determine the structure of their poems.
We learned about how English speakers will aspirate some phonemes. Is this a random act or can we figure out a pattern in this type of production? When considered carefully, we can notice that we only do it with /p/, /t/ and /k/. In addition, this only happens when these phonemes appear at the beginning of a syllable. When linguists figure out such a pattern, they can formally write it as a phonological rule. Generally, phonological rules map between two levels of representation: phonemes and phones (Goldsmith, 1995). Such rules define how we go from the abstract representation of phonemes in our mind to the actual articulation of phones. They start with an underlying representation (the string of phonemes) and produce a surface form (what is actually said).
The rule for aspiration in English could be stated as “All unvoiced stops will be aspirated when they appear as the onset of a syllable.” Each language varies in how phonological rules are applied and in what circumstances they appear. For example, German speakers will devoice (remove the voicing from) an obstruent if it appears as the coda of a syllable. So, they may pronounce Hund as [hʊnt], devoicing the [d] to a [t].
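As a rough illustration, the sketch below applies the aspiration rule stated above as a mapping from an underlying string of phonemes to surface phones. The list-of-syllables input format and the “ʰ” diacritic are illustrative conventions adopted here, not notation from the text.

```python
# A minimal sketch of the English aspiration rule: unvoiced stops are aspirated
# when they appear as the onset of a syllable.
UNVOICED_STOPS = {"p", "t", "k"}

def apply_aspiration(syllables):
    """Map underlying phonemes to surface phones, syllable by syllable."""
    surface = []
    for syllable in syllables:                   # each syllable is a list of phonemes
        phones = list(syllable)
        if phones and phones[0] in UNVOICED_STOPS:
            phones[0] += "ʰ"                     # mark the surface phone as aspirated
        surface.append(phones)
    return surface

# Underlying /pɪt/ surfaces as [pʰɪt]; the /t/ in the coda is left unaspirated.
print(apply_aspiration([["p", "ɪ", "t"]]))
```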
Living Language
Consider how to pronounce the /t/ when it appears between two vowels as in ‘butter’ or ‘notable.’ You will notice that most people in North America will not produce a hard [t] sound but a flap consonant [ɾ]. So /bʌtɚ/ becomes [bʌɾɚ] (the [ɚ] is a vowel with a rhotic or ‘r’ quality).
What other examples can you think of in how you make systematic changes to phonemes when you speak?
In this chapter, we learned about the basics of describing the sound forms of language. We saw that phonemes are the smallest units of sound and syllables are the smallest units of articulation in a language. We also learned how to classify consonants and vowels based on how they are produced in the vocal tract. Finally, we explored how these can be brought together to figure out recurring patterns in spoken language and formalized as phonological rules.
Key Takeaways
Exercises in Critical Thinking
Goldsmith, J. A. (1995). The handbook of phonological theory. Oxford: Blackwell.
Ladefoged, P., & Maddieson, I. (1996). The sounds of the world’s languages. Oxford: Blackwell.
Learning Objectives
This chapter is an introduction to words and their meaning. We will explore meaningful units of language and their typology across different languages. This includes isolating, agglutinative, fusional and polysynthetic morphologies. We will also look at inflectional versus derivational morphemes, as well as the unusual nonconcatenative morphology of Semitic languages. We will also look at syntax and the parts of speech that make up sentences or utterances. We will end this chapter with a look at word order and how it differs across languages.
It may seem a superficial question to ask “what is a word?” However, this question has stymied some of the greatest minds in history. Ferdinand de Saussure once said that a word is like a coin: it has two sides, form (the sounds that make up a word) and meaning (the concept associated with it). In this sense, we could say that a word links form with meaning.
Words also have some properties that go beyond these observations. For example, words are free as they can appear in isolation. “How was the hamburger?” “Delicious”. A perfectly sensible word that provides meaning on its own. Words are also movable. They are not bound to a particular position in a sentence. Consider these examples:
The word hamburger can appear as the first, last or middle word in a sentence. However, now consider the relationship between a word’s meaning and its form. We know we can break up a word’s form into phonemes. Can we break up a word’s meaning in the same way?
If we consider meaningful units in a language, we come to a unit beyond which we cannot derive further meaning. This smallest unit of meaning is known as a morpheme. Consider the word ‘dogs.’ It is composed of two morphemes: ‘dog’ and ‘s’ with the latter conveying the plural number. Here we see that while ‘dog’ can be a free morpheme, ‘s’ cannot. Such a morpheme which always needs to be connected to other morphemes is known as a bound morpheme.
One important issue to keep in mind is that while some words are morphemes, not all morphemes are words. Words can be made up of numerous morphemes. In a sentence such as “Jon found the box to be unbreakable” we know there are seven words. However, we can break that sentence into nine morphemes as: “Jon found the box to be un-break-able”.
In Figure 3.1 we see examples of free and bound morphemes. The -er and -ing in writer and talking are known as suffixes. These are morphemes that attach to the ends of other morphemes. Examples include the plural suffix -s and the past tense -ed. English also has prefixes, as in reheat, invisible and disagree.
Previously we came across the concept of an allophone. These were variations of the smallest sound unit in a language, or phoneme. Similarly, the smallest unit of meaning in a language, the morpheme, can also have variations called allomorphs. These allomorphs often vary depending on the environment. The most common example of this is the indefinite article ‘a’. It comes from the Old English ān meaning one or alone. Gradually, the n was lost before consonants by the 15th century, giving the allomorphs a and an. So, you say ‘a book’ but ‘an apple’. Some allomorphs have actually changed the form of words through reanalysis. For example, a norange over time became an orange because people thought the initial n was part of the indefinite article. Similarly, an ekename /iːkneɪm/ (from Middle English eke or supplement) was reanalysed as a nickname. This time the n in an became attached to the following word.
Another example of an allomorph in English is the plural suffix -s. This comes in three variations: [s], [z], and [əz]. After unvoiced consonants we get [s], as in carrots and books. It is pronounced [z] after voiced segments, as in friends and iguanas. It is pronounced [əz], and often spelled differently, after sibilant sounds, as in churches and bushes.
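The conditioning of the plural allomorph can be summarized as a small decision rule. The sketch below is an illustration using simplified phoneme sets; the sibilant and voiceless classes listed are assumptions for the example, not an exhaustive description of English.

```python
# A minimal sketch of how the three plural allomorphs are conditioned by the
# final sound of the stem.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}   # these trigger the [əz] form
VOICELESS = {"p", "t", "k", "f", "θ"}          # these trigger the [s] form

def plural_allomorph(final_sound):
    """Return the plural allomorph selected by the stem's final sound."""
    if final_sound in SIBILANTS:
        return "əz"    # church -> churches
    if final_sound in VOICELESS:
        return "s"     # book -> books
    return "z"         # vowels and other voiced sounds: dog -> dogs

print(plural_allomorph("t"))    # 's',  as in carrots
print(plural_allomorph("d"))    # 'z',  as in friends
print(plural_allomorph("tʃ"))   # 'əz', as in churches
```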
The way in which morphemes are employed to modify meaning can vary between languages. Morphological typology is a method used by linguists to classify languages according to their morphological structure. While a variety of classification types have been identified, we will look at a common method of classification: analytic, agglutinative and fusional. Figure 3.2 gives some examples of morphological typology across the world’s languages.
Analytic languages have a low ratio of morphemes to words. They are often isolating languages in that each morpheme is also a word and vice versa. These languages create sentences from independent root morphemes, with grammatical relations between words being expressed by separate words. Examples of analytic or isolating languages include the Chinese languages and Vietnamese. While in English we inflect for number (one day, two days), an analytic language such as Mandarin Chinese has no inflection: 一天, yì tiān “one day”, 三天, sān tiān “three day”. The Canadian linguist and translator Sonja Lang has created an analytic language, Toki Pona, as a minimalist creative endeavour.
Unlike analytic languages, synthetic languages employ inflection or agglutination to express syntactic relationships. Agglutinative languages combine two or more morphemes into one word. The distinguishing feature of these languages is that each morpheme remains individually identifiable as a meaningful unit even after combining into a word. Examples of agglutinative languages include Tamil, Secwepemc, Turkish, Japanese, Finnish, Basque and Hungarian. Figure 3.3 shows you an example of agglutination in Turkish. Each coloured morpheme is also given an approximate English translation. Figure 3.2 gives another example from Tamil.
Another type of synthetic language is the fusional language. Like agglutinative languages, fusional languages also combine morphemes to modify meaning. However, these combinations often do not remain distinct and fuse together. In addition, these languages also have a tendency to use a single inflectional morpheme to denote numerous grammatical or syntactic features. For example, the suffix -í in Spanish comí (“I ate”) denotes both first-person singular agreement and preterite tense. Examples of fusional languages include Indo-European languages such as Sanskrit, Spanish, Romanian, and German. Modern English could also be considered fusional, although it has tended to evolve to be more analytic. J. R. R. Tolkien’s fictional language Sindarin is fusional (another elvish language, Quenya, is agglutinative).
Figure 3.2 shows an additional morphological type named polysynthetic. These languages tend to have a high morpheme-to-word ratio as well as regular morphology. They often combine a large number of morphemes to form words that are the equivalent of entire sentences in other languages. Many languages in North America, such as Mohawk, have this type of morphology.
Inflectional morphemes add grammatical information to a word while retaining its core meaning and its grammatical category. The tense of a verb is indicated by inflectional morphology. You add -ed to walk to make walked. You can also make a past tense inflection through the change of a vowel, as in sang or wrote. Some languages have inflections for the future tense as well (which English does not have). Another example is number: in English you indicate the plural by adding the morpheme -s to the end of a singular noun. So, book can be made plural by adding -s to make books. The original stem doesn’t change in meaning and it remains a noun. While English only has singular and plural numbers, some languages have a dual number. Consider the following example from Ancient Greek (Smyth, 1920):
ὁ θεός (ho theós) “the god” (singular)
τὼ θεώ (tṑ theṓ) “the two gods” (dual)
οἱ θεοί (hoi theoí) “the gods” (plural)
Inuktitut, spoken in the territory of Nunavut, also has a dual number (Anderson, 2018):
Inuktitut | English translations |
---|---|
matu | door |
matuuk | doors (two) |
matuit | doors (three or more) |
nuvuja | cloud |
nuvujaak | clouds (two) |
nuvujait | clouds (three or more) |
qarasaujaq | computer |
qarasaujaak | computers (two) |
qarasaujait | computers (three or more) |
Another way in which morphemes modify meaning is through derivation. Here the original word is modified by the derivation and often changes its word category. For example, adding -er to the verb write will modify it into a noun: writer. The same is seen in teacher, walker and baker. In the same way, an adjective can be changed into a noun, as in sad and -ness becoming sadness.
Derivation often leads to the creation of new words. These new words can in turn serve as a base for further derivation. This can lead to some rather complex morphological forms. For example, a machine that computes may be called a computer (compute and -er). When we use a computer to complete a task, we could say we computerize it (computer and -ize), which in turn can be called computerization (computerize and -ation). One interesting observation is that inflecting a base makes further derivation impossible. So, making a plural out of computer into computers (computer and -s) means we cannot make it into *computersize.
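One way to picture this blocking effect is to treat a word as a stem plus a flag recording whether it has been inflected yet. The sketch below is an illustrative toy model under that assumption, not a formal morphological theory; the suffix inventory is chosen only for the example.

```python
# A minimal sketch of the observation that inflection closes a word to further derivation.
DERIVATIONAL = {"-er", "-ize", "-ation", "-ness"}
INFLECTIONAL = {"-s", "-ed", "-ing"}

def attach(word, suffix):
    """Attach a suffix unless the word has already been inflected."""
    stem, inflected = word
    if inflected:
        raise ValueError(f"*{stem}{suffix.lstrip('-')}: cannot derive from an inflected form")
    return (stem + suffix.lstrip("-"), suffix in INFLECTIONAL)

word = ("compute", False)
word = attach(word, "-er")     # computer       (derivation)
word = attach(word, "-ize")    # computerize    (further derivation is allowed)
word = attach(word, "-s")      # computerizes   (inflection closes the word)
# attach(word, "-ation")       # would raise an error: further derivation is blocked
print(word)                    # ('computerizes', True)
```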
Most of the morphological types we have seen make use of prefixes and suffixes to make changes in meaning. These involve making sequential changes to the stem. However, there are some languages that make morphological modifications to a word-root using non-sequential methods. This is known as nonconcatenative morphology, discontinuous morphology or introflection. This type of change is also seen in English foot /fʊt/ → feet /fiːt/ as well as freeze /ˈfriːz/ → froze /ˈfroʊz/, frozen /ˈfroʊzən/. While such rare cases exist in other Indo-European languages as well, this system is very well developed in Semitic languages such as Arabic. Consider some derivations of the Semitic root k-t-b in Arabic (Wehr, 1994) and Hebrew. This root is interleaved with other segments to create these morphological derivations.
Arabic | Transliteration | Hebrew | Transliteration | Translation |
---|---|---|---|---|
كتب | kataba | כתב | kataḇ | ‘he wrote’ |
كَتَبْتُ | katabtu | כתבתי | kāṯaḇti | ‘I wrote’ |
كاتب | kātib | כותב | koteḇ | ‘writer’ |
أكتب | ʾaktaba | הכתיב | hiḵtiḇ | ‘he dictated’ |
مكتب | maktab | מכתב | miḵtaḇ | ‘office’ (Arabic), ‘letter’ (Hebrew) |
استكتب | istaktaba | התכתב | hitkatteḇ | ‘he made (them) write’ (Arabic), ‘he corresponded’ (Hebrew) |
As we can see, the morphemes do not attach to the ends of the root but are infused within the triconsonantal root as infixes. Figure 3.4 and Figure 3.5 illustrate how this can be visualized from a language production standpoint. We see the consonantal roots act as separate morphemes from the infixes, which intertwine to form the final segmental sequence that is syllabified and spoken. This shows us that morphology can be more complex than simple additions to a stem.
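As an informal illustration of this root-and-pattern idea, the short sketch below interleaves a triconsonantal root with templates in which ‘C’ slots stand for root consonants. The templates are rough, transliteration-level approximations of the Arabic forms in the table above, not an accurate phonological analysis.

```python
# A minimal sketch of nonconcatenative (root-and-pattern) morphology.
def interleave(root, template):
    """Fill the 'C' slots of a template with the root consonants, in order."""
    consonants = iter(root)
    return "".join(next(consonants) if slot == "C" else slot for slot in template)

root = ["k", "t", "b"]
print(interleave(root, "CaCaCa"))   # kataba  'he wrote'
print(interleave(root, "CaaCiC"))   # kaatib  'writer'
print(interleave(root, "maCCaC"))   # maktab  'office'
```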
Figure 3.2 Examples of Morphological Typology
Provides examples of the morphological typology of Mandarin, an isolating language; Tamil, an agglutinative language; Spanish, a fusional language; and Mohawk, a polysynthetic language. The image illustrates the meanings of the morpheme components of the words or phrases, and how they combine to express meaning.
[Return to place in text (Figure 3.2)]
Figure 3.3 Example from Turkish, an Agglutinative Language
Two examples of agglutination from the Turkish language broken down into their morphological components.
[Return to place in text (Figure 3.3)]
Now that we are familiar with the units of sound, articulation and meaning, let us explore how these are put together in connected speech. Syntax is the set of rules and processes that govern sentence structure in a language. A basic description of syntax would be the sequence in which words can occur in a sentence. One of the earliest approaches to syntactic theory comes from the work of the Sanskrit grammarian Pāṇini (c. 4th century BC) and his seminal work, the Aṣṭādhyāyī. While the field has diversified into many schools, we will look at some basic issues of syntax and consider the contributions of Noam Chomsky.
Living Language
Look at these two sentences and decide which one seems normal to you:
Why is one not considered correct even though it contains all the same words? Can you articulate the rules that govern your decision or are they intuitive?
Grammar employs a finite set of rules to generate the infinite variety of output in a language. This is the basis for generative grammar. Chomsky argued for a system of sentence generation that took into account the underlying syntactic structure of sentences. He emphasised the intuition of any native speaker of a language to identify ill-formed sentences in that language. The speaker may not be able to provide a rationale for why some sentences are acceptable and others are not. However, it cannot be denied that such intuitions exist in every person. While Chomsky’s ideas have evolved over the years, the main conclusions appear to be that language is a rule-based system and that a finite set of syntactic rules can capture our knowledge of syntax.
A key aspect of language is that we can construct sentences with words using a finite set of rules. Phrase-structure rules are a way to describe how words can be combined into different structures. Sentences are constructed from smaller units. If a sentence is designated as S, we can use rewrite rules to expand it into other symbols such as noun phrases (NP) and verb phrases (VP) as in:
S → NP + VP
Phrase-structure grammar has words (terminal elements) and other constituent parts (non-terminal elements). This means that words usually form the lowest level of the structure, building up towards a sentence. The rules that we use to construct these sentences do not deal with individual words but with classes of words. Such classes include words that name objects (nouns), words for actions (verbs), words that describe nouns (adjectives), and words that qualify actions (adverbs). We can also think of words that specify nouns such as ‘the’, ‘a’ and ‘some’ (determiners), words that join constituents such as ‘and’ and ‘because’ (conjunctions), words that substitute for a noun or noun phrase as in ‘I’ and ‘she’ (pronouns), and words that express spatial or temporal relations as in ‘on’ and ‘in’ (prepositions).
These types of words combine to form phrases. Phrases that can take the place of nouns in sentences are called noun phrases. So ‘dog,’ ‘the dog’ or ‘the naughty dog’ are all noun phrases because they can fill the gap in a sentence such as ‘_____ ran through the park’. Phrases combine to form clauses. These contain a subject (what we are talking about) and a predicate (information about the subject). Every clause has to have a verb, and sentences can consist of one or more clauses. As we see in Figure 3.6, the sentence ‘the dog likes John’ consists of one clause composed of a noun phrase and a verb phrase. It contains a subject ‘the dog,’ a verb ‘likes,’ and an object ‘John.’
One way to think about how sentences are organized in the mind is through a notation called a tree diagram. They are called tree diagrams because they branch from a single point into phrases which in turn branch into words. Each place where the branches come together is called a node. A node indicates a set of words that act together as a unit or constituent. Consider Figure 3.6 which illustrates how a sentence can be depicted in a tree diagram.
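To make the idea of rewrite rules concrete, here is a minimal sketch that expands S top-down using a toy rule set and lexicon, producing a nested tree of the kind shown in Figure 3.6. The particular rules and words are illustrative assumptions chosen only to generate simple sentences such as ‘the dog likes John’; they are not a grammar of English.

```python
# A minimal sketch of phrase-structure rewrite rules applied top-down.
import random

RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"], ["Name"]],
    "VP": [["V", "NP"]],
}
LEXICON = {"Det": ["the"], "N": ["dog"], "V": ["likes"], "Name": ["John"]}

def expand(symbol):
    """Rewrite a symbol until only words remain, returning a nested (node, children) tree."""
    if symbol in LEXICON:
        return (symbol, random.choice(LEXICON[symbol]))       # terminal element: a word
    expansion = random.choice(RULES[symbol])                   # pick one rewrite rule
    return (symbol, [expand(child) for child in expansion])    # non-terminal: a constituent

print(expand("S"))   # e.g. ('S', [('NP', ...), ('VP', ...)])
```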
The order of syntactic constituents varies between languages. When talking about word order, linguists generally look at 1) the relative order of subject, object and verb in a sentence (constituent order), 2) the order of modifiers such as adjectives and numerals in a noun phrase, and 3) the order of adverbials. Here we will focus mostly on constituent word order.
English sentences generally display a word order consisting of subject-verb-object (SVO), as in ‘the dog [subject] likes [verb] John [object]’. Mandarin and Swahili are other examples of SVO languages. About a third of all languages have this type of word order (Tomlin, 1986). About half of all languages employ subject-object-verb (SOV). Japanese and Turkish, as well as the Indo-Aryan and Dravidian languages of India, are examples of SOV word order. Classical Arabic and Biblical Hebrew, as well as the Salishan languages of British Columbia, employ verb-subject-object (VSO). Rarer are typologies such as verb-object-subject (VOS), as is found in Algonquin. Unusual word ordering can be employed for dramatic effect, as in the object-subject-verb (OSV) word order of Yoda from Star Wars: ‘Powerful (object) you (subject) have become (verb). The dark side (O) I (S) sense (V) in you.’
We know that a sentence’s syntax has an influence on how its meaning is interpreted (semantics of the sentence). Any given string of words can have different meanings if they have different syntactic structures. However, syntax doesn’t necessarily need to be in line with semantics. Chomsky (1957) famously composed a sentence that was syntactically correct but semantically meaningless: “colorless green ideas sleep furiously.” The sentence is devoid of semantic content, but it is a perfectly grammatical sentence in English. The words “*Furiously sleep ideas green colorless” are the same but their order would not be considered grammatical by a native English speaker.
We have psycholinguistic evidence from electroencephalography to support the idea that syntax and semantics are processed independently of each other. In measuring event-related potentials (ERPs) for sentences, there are some interesting observations. For example, the sentence “He eats a ham and cheese …” sets up a very strong expectation in your mind about what word comes next. If the word that comes next is in line with your expectations, the ERP signal remains at baseline. However, if the next word violates your expectations, then we often see a sudden negative spike in the EEG voltage around 400 ms after the unexpected word. This ERP signal is called an N400 (where the N stands for negative and 400 indicates the approximate timing of the ERP after the stimulus). Numerous studies have found an N400 response when a semantically unexpected word is inserted into a sentence.
However, not every unexpected word elicits an N400 response. In some cases, where the unexpected word belongs to an unexpected word category (for example, a verb instead of a noun), we see a positive voltage around 600 ms after the unexpected word. This is known as a P600. Therefore, we see that violations of semantic expectations elicit an N400 while violations of syntactic expectations elicit a P600. This suggests that syntax and semantics are independently processed in our brains.
Figure 3.6 Sentence Structure in English
The sentence “the dog likes John” consists of a noun phrase “the dog” and a verb phrase “likes John.” The noun phrase consists of a determiner “the” and a noun “dog.” The verb phrase consists of a verb “likes” and a noun phrase “John.”
[Return to place in the text (Figure 3.6)]
In this chapter we learned about the smallest units of meaning in a language: morphemes. We saw how morphemes can be employed in different ways to modify meaning in different languages. We also saw the differences between inflectional and derivational morphology. The order in which the constituents of a sentence can be arranged is flexible in some languages but very strict in others (such as English). While subject-object-verb (SOV) is the most common word order in the world’s languages, there are other variations, such as the subject-verb-object (SVO) order found in English.
Key Takeaways
Exercises in Critical Thinking
Anderson, C. (2018). Essentials of linguistics. Canada: McMaster University.
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris.
Chomsky, N. (1995). Bare phrase structure. In G. Webelhuth (Ed.), Government and binding theory and the minimalist programme (pp. 383–400). Oxford: Blackwell.
Smyth, H. W. (1920). “Part II: Inflection”. A Greek grammar for colleges. Cambridge: American Book Company.
Tomlin, R. S. (1986). Basic word order: Functional principles. London: Croom Helm.
Wehr, H. (1994). A dictionary of modern written Arabic (Arabic-English) (4th ed.). Ithaca, NY: Spoken Language Services.
Learning Objectives
Living Language
As you may have heard, two people heard the humming noise of ducks and started an argument over whether the sound came from the air passing through their beaks or their wings. The chief called a great council to settle the matter. The people from all the nearby villages attended, and again they argued and could not agree. Eventually, the argument led some people to move far from this land, and they began to speak differently. Eventually, other languages formed and now we cannot understand each other.
A tale from the Salishan (Boas, 1917)
Long ago, Taikomol, He who goes alone, made the earth from a piece of coiled basket. Taikomol the creator walked with Coyote throughout the land. The creator laid out sticks at night which turned into people upon daybreak. As Taikomol made different people, they were given different customs, modes of life and a different language. Finally, the creator ascended to the sky where he still is.
A tale from the Yuki people of California (Kroeber, 1907)
These stories show us the centrality of language to human existence. All human societies have developed stories about the origins of language and how it is linked to their identity. We will explore some of the ideas about human language, its origins and current views on the biological basis of language in this chapter.
Is human language an extension of other forms of communication, or is it a unique attribute of humans? In investigating animal communication, we explore the extent to which aspects of language may be innate. Animals have a variety of communication systems that go beyond the scope of this book. We will explore just a small set of interesting systems. Communication is the transmission of a signal to convey information (Pearce, 2008). Communication has an element of intention that differentiates it from merely informative signals. A sneeze is a signal that may mean someone has a cold, but it is not communication. However, telling someone that you have a cold is communication.
Perhaps one of the most intriguing communication methods found in nature is the complex system of dances in honey bees. As seen in Figure 4.1, a bee waggles in a figure-of-eight shape to communicate the direction of sources of nectar relative to the sun. The rate of the waggle represents distance (von Frisch, 1950, 1974). This behaviour suggests that complex neural networks are not a necessity for communication to occur.
Animals with more complex brains, such as primates, use a variety of signals ranging from visual and auditory signals to olfactory and tactile sensations. Vervet monkeys have been observed to make sounds that differentiate between predators on the ground and those coming from above. The difference is seen in the interpretation of these calls by other monkeys, in terms of whether they rush into the trees or to the ground (depending on the type of predator).
Both of the above-mentioned methods of communication differ from language in that they are temporally linked to particular stimuli. Signals are linked to particular stimuli (nectar or predators) and produced only in their presence. Therefore, what features of language can be used to distinguish it from animal communication? How do we define language?
Language is difficult to define. Indeed, most books on language actually avoid giving a definition. In trying to tackle this tricky topic, Hockett (1960) set out a list of 16 design features of human language. He focused on the physical features of language rather than cognition. While this is not a perfect attempt, it is a useful frame to start our exploration.
Now consider these design features in terms of animal communication. Do animals lack all of these features? Language is obviously about meaning, and we may even add other features to human language such as creativity. But some animals have also shown the ability to lie as well as to control their communication methods. Recently, Chomsky and his colleagues have argued for the syntactic creativity of language as its defining feature. The way in which we can use finite symbols to create infinite variations through iteration and recursion has been claimed to be unique to human languages. Kako (1999) and Pinker (2002) have listed five properties of our syntactic system:
Animal communication doesn’t appear to have these features. In addition, we communicate about things that are separated in time and space (unlike animals). Monkeys, for example, make signals about immediate threats such as snakes and leopards. They don’t discuss the snake they saw last summer. Animals cannot discuss their own communication system using their communication system. Therefore, while animals possess rich communication systems that they use to convey messages between members of their group, these do not bear much similarity to human language. Our (perhaps) limitless ability to employ language to discuss anything and everything is mind-boggling. In essence, all non-human communication systems are different from human language.
“I cannot doubt that language owes its origin to the imitation and modification, aided by signs and gestures, of various natural sounds, the voices of other animals, and man’s own instinctive cries”
(Darwin, 1871, p. 56).
In 1866, the Linguistic Society of Paris (Société de Linguistique de Paris) banned any debate on the origins of language. This was based on the lack of empirical evidence for pursuing the topic. However, Darwinian ideas of human evolution by natural selection fueled continuous debate on the issue. It wasn’t until the early twentieth century that the topic regained scientific scrutiny. Speculations on the topic can be broadly divided into the following camps:
These can be further divided into theories that promote language as an innate faculty with a genetic basis and theories that consider language as a cultural system. It has been noted that some words are onomatopoeic: they sound like the things to which they refer. Examples include cuckoo (the sounds of that bird) and hiss (the sound of a snake). However, the idea that language has its origins in mimicry is challenged by the vast number of words that are not onomatopoeic with variations between languages.
The two main ideas about the evolution of language are that it was either a beneficial adaptation shaped by natural selection, or that it arose as a side effect of other evolved features of cognition (Hauser, Chomsky, & Fitch, 2002). The ‘language as a side-effect’ hypothesis is claimed by its proponents to be justified because language is too complex to have evolved within the short evolutionary period since our divergence from other apes. They also point out that an intermediate grammar system isn’t really possible and that the complex grammatical systems in human language don’t appear to confer any selective advantage. However, the idea of language as a product of natural selection has gained ground in recent years. There is now evidence that there was enough time for grammar to have evolved to communicate existing cognitive representations (Pinker, 2003; Pinker & Bloom, 1990; Pinker & Jackendoff, 2005). The rapid increase in brain capacity and complexity would also have played a large role in the evolution of language. Fossil evidence suggests that brain regions associated with language, such as Broca’s area, were present in hominids as far back as 2 million years ago.
The evolution of language did have its costs. The larynx is the organ in the neck containing the vocal folds. In humans, this organ is descended. This is not unique to humans, as some animals (such as goats and dogs) temporarily descend the larynx to produce loud noises (Fitch, 2000). However, in these cases, the hyoid remains undescended and so the tongue remains horizontal. The descent of both the larynx and the hyoid bone in humans is taken by some researchers as significant (Lieberman, 2007). It means that simple contact between the epiglottis and the velum is no longer possible. Therefore, the respiratory and digestive tracts are no longer separated during swallowing in humans, resulting in a choking hazard. It appears possible that the descent of the larynx and the resultant ability to speak outweighed the potential for choking in early hominids. One argument against this is that the descent of the larynx and hyoid would take longer to evolve than speech, a more recent component of human evolution.
The idea that language evolved in one giant leap or through the mutation of a single gene is unlikely. However, there is evidence that certain genes such as FOXP2 may play an important role in linguistic attributes such as grammar. This gene is associated with sensory and motor coordination in animals (Fisher & Marcus, 2006). However, in humans this gene has evolved to be vital for grammatical processing. If this gene is damaged in humans, it can lead to language acquisition problems. It is possible, therefore, that other genes may be discovered in the future and that the evolution of language may have involved the reconfiguration or appropriation of a number of brain structures.
Figure 4.2 Parts of the Human Larynx
The human larynx includes the hyoid bone, lateral thyrohyoid ligament, thyrohyoid membrane, median thyrohyoid ligament, superior cornu, laryngeal incisure, thyroid cartilage, oblique line, median cricothyroid ligament, conus elasticus, cricothyroid muscle, inferior cornu, cricoid cartilage, cricothyroid joint and trachea.
[Return to place in the text (Figure 4.2)]
The specialization of various parts of the brain is a concept you may have come across in other psychology courses. While people have often speculated on these issues in the past, it wasn’t until the 19th century that we really started to gain a scientific understanding of this concept. Now we have even more advanced neuroimaging techniques that have opened new avenues for exploring language. We know that the brain is divided into two hemispheres which are specialized for different functions. In most right-handed people, the left hemisphere is specialized for language processing and production. This is even true of people who use sign language (Corina et al., 2003).
The earliest record of brain damage leading to language loss, or aphasia, is in an Ancient Egyptian medical text now known as the Edwin Smith surgical papyrus from 1700 BCE (Minagar et al., 2003). Cases 20 and 22 of the papyrus describe an individual with an injury on the left side of the skull. He is described as speechless, “an ailment not to be treated.” Such an impairment to language production was not investigated in detail again until Paul Broca explored these issues in the 1860s. Broca observed several patients who had speech disorders resulting from damage to the left frontal lobe. Later post-mortem examination revealed damage to an area now known as Broca’s region (see Figure 4.3a). In 1874, the German neurologist Carl Wernicke observed language comprehension problems that arose from damage to the left temporal lobe. The region is now known as Wernicke’s region (see Figure 4.3b). He also proposed the first neurological model of language. As seen in Figure 4.4, we now know that the interpretation of language is associated with Wernicke’s region and that it is connected to Broca’s region via a group of nerve fibres known as the arcuate fasciculus. Broca’s region itself is associated with articulation and the control of speech. Geschwind (1972) further elaborated on this to describe the Wernicke–Geschwind model. This model brings together what was known about language at the time to describe language production (Broca’s region), comprehension (Wernicke’s region), reading (visual cortex) and spelling (angular gyrus).
This view of the neuroanatomy of language is too simplistic and has its limitations. Language is not always localized to the left hemisphere, and the right hemisphere is known to carry out some language functions. Subcortical regions may also have a role to play in language. Even within the cortex, areas other than Broca’s and Wernicke’s regions appear to play a role in language processing. With advances in neuroimaging techniques, our understanding of the neuroanatomy of the brain will continue to expand.
George Orwell’s novel 1984 presents a dystopian world where an authoritarian regime imposes absolute control over all aspects of human life. The rulers of the state use Newspeak, a deliberately restricted language, to control people’s thought. The idea is that if people don’t have a word for a concept such as freedom, then they can’t think it. This idea of thought being directly influenced by language is an extension of the Sapir-Whorf hypothesis.
The central idea of the Sapir-Whorf hypothesis is that the characteristics of a language influence the thought patterns of that language community. Indeed, it goes even further and states that the way people view and understand the world is influenced by their language. The hypothesis was originally proposed by the linguist Edward Sapir and developed by the amateur linguist Benjamin Lee Whorf (Whorf, 1956a, 1956b). Whorf studied the Indigenous languages of America to collect evidence for this hypothesis, which extends into two ideas: linguistic determinism and linguistic relativism.
Linguistic determinism states that the form and characteristics of a language determine the way we cogitate, remember and understand the world. Linguistic relativism is the idea that different cognitive structures emerge from the way languages map words onto things in the real world. Some have differentiated these ideas into three versions:
Whorf analysed Indigenous languages such as Hopi, Apache, and Aztec to differentiate what he termed their ‘world view.’ He claimed that Hopi has no words that refer to time and that the Hopi must therefore have a different concept of time compared to Europeans. These observations are now considered suspect (Malotki, 1983). Whorf used rather idiosyncratic translation methods that failed to capture the complexity of Indigenous languages. More direct evidence comes in the form of vocabulary.
It has been observed that some cultures have only a few words for particular concepts compared to others. A famous example is the claim that the Inuit have four different words for snow (Boas, 1911). The idea is that because the Inuit spend more time in the snow, they would develop more nuanced ways of differentiating this element and have a different perception of it. In fact, later observations only turned up two root words for snow in Inuktut: qanik for ‘snow in the air’ and aput for ‘snow on the ground.’ Indeed, English has at least four words for snow: snow, slush, sleet, and blizzard. Does this mean English speakers are now more advanced in snow-classification than the Inuit? In fact, differences in vocabulary only indicate familiarity with a concept and not differences in perception. Another example that is often brought to bear in this discussion is that of the colour hierarchy.
Languages differ in terms of how many basic colour terms they possess. By colour terms, I don’t mean magenta and mauve, but words like red, green, and blue. Berlin and Kay (1969) explored the basic colour terms in various languages. Basic terms are defined as words made up of one morpheme (blue rather than marine blue) and not contained within another colour (blue rather than azure, which is a kind of blue). These are also terms that are generally known. What cross-linguistic exploration of these basic colour terms has shown is that there is a clear progression in how languages acquire basic colour terms.
As seen in Figure 4.5, all languages have two basic colour terms: black and white. Some languages only have these two terms. If a language has three colour terms, then it will have terms for black, white and red. If a language has four basic colour terms, then it will have these three and either yellow or green. If a language has five colour terms, then it will have all five and so forth along the hierarchy. English has eleven basic colour terms. Does this mean that people see colours differently based on what terms they have in their language? Not very likely. For one thing, languages expand their colour vocabulary all the time. For example, Telugu expanded the term for yellow/green, పచ్చ /pət͡ʃt͡ʃə/, into two terms: yellow పసుప్పచ్చ /pəs̪uppət͡ʃt͡ʃə/ and green ఆకుపచ్చ /ɑːkupət͡ʃt͡ʃə/. English has had recent additions such as orange. Before this term came into English, English speakers used the term red for things that were orange, such as the red deer, red robin, and red fox. However, once the fruit was introduced into Europe, its name became associated with the colour. As seen in Figure 4.6, the colours form not a set of discrete units but a spectrum. We can cut up this spectrum into individual units and sometimes divide them further as the need arises. So, English speakers used red to cover a larger area of the spectrum until they adopted the term orange to divide it further. Telugu speakers used one term to refer to a unit spanning what we consider yellow and green.
What is interesting about the colour hierarchy is that participants in memory tasks often have difficulty with colours for which they don’t have differentiating terms. Brown and Lenneberg (1954) as well as Lantz and Stefflre (1964) used colour chips of different hues, brightness and saturation to test people’s memory for them. They found that participants were able to remember the chips more easily if they had the basic colour terms for them in their language. This appeared to support the Sapir-Whorf hypothesis.
Heider (1972) explored this further by working with the Dani. They have only two basic colour terms, for dark and light colours (see Figure 4.6). However, when Heider taught them some made-up colour terms, they learned the names for basic colours more easily than for other colours. They also remembered basic colours more easily than non-basic ones even though they have no names for them. As we saw earlier, the division of the colour spectrum is not arbitrary. It is done by dividing the spectrum along physiological lines. These observations suggest that while there are some biological and linguistic constraints on remembering colours, this is not the strongest evidence for the Sapir-Whorf hypothesis. Colour perception is not influenced by linguistic constraints on colour terms.
In this chapter we explored the differences between animal communication and language. We also looked at speculations about the origins of language, as well as some design features that help to differentiate human language from animal communication. We saw that the brain has specialized regions for language production and comprehension. We also looked at how language has been thought to determine cognitive processing and how there is limited evidence to support this view.
Key Takeaways
Exercises in Critical Thinking
Berlin, B., & Kay, P. (1969). Basic color terms: Their universality and evolution. Berkeley: University of California Press.
Boas, F. (1911). Introduction to the handbook of North American Indians (Vol. 1). Bureau of American Ethnology Bulletin, 40 (Part 1).
Boas, F. (ed.) (1917) “The origin of the different languages”. Folk-Tales of Salishan and Sahaptin Tribes (New York: American Folk-Lore Society)
Brown, R., & Lenneberg, E. H. (1954). A study in language and cognition. Journal of Abnormal and Social Psychology, 49, 454–462.
Chomsky, N. (1996). Powers and prospects: Reflections on human nature and the social order. London: Pluto Press.
Corina, D. P., Jose-Robertson, L., Guillermin, A., High, J., & Braun, A. R. (2003). Language lateralization in a bimanual language. Journal of Cognitive Neuroscience, 15, 718–730.
Darwin, C. (1871). The descent of man, and selection in relation to sex. London: Murray.
Fisher, S. E., & Marcus, G. F. (2006). The eloquent ape: Genes, brains and the evolution of language. Nature Reviews Genetics, 7, 9–20.
Fitch, W. T. (2000). The phonetic potential of nonhuman vocal tracts: Comparative cineradiographic observations of vocalizing animals. Phonetica, 57(2–4), 205–218.
Geschwind, N. (1972). Language and the brain. Scientific American, 226, 76–83.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569–1579.
Heider, E. R. (1972). Universals in colour naming and memory. Journal of Experimental Psychology, 93, 10–20.
Hockett, C. F. (1960). The origin of speech. Scientific American, 203, 89–96.
Kako, E. (1999). Elements of syntax in the systems of three language-trained animals. Animal Learning and Behavior, 27, 1–14.
Kroeber, A. L. (1907) Indian myths of south central California, American Archaeology and Ethnology, 4(4), 183-186.
Lantz, D., & Stefflre, V. (1964). Language and cognition revisited. Journal of Abnormal Psychology, 69, 472–481.
Lieberman, P. (2007). The evolution of human speech: Its anatomical and neural bases. Current Anthropology, 48(1), 39–66.
Malotki, E. (1983). Hopi time: A linguistic analysis of temporal concepts in the Hopi language. Berlin: Mouton.
Minagar, A., Ragheb, J., & Kelley, R. E. (2003). The Edwin Smith surgical papyrus: Description and analysis of the earliest case of aphasia. Journal of Medical Biography, 11(2), 114–117.
Pearce, J. M. (2008). Animal learning and cognition (3rd ed.). Hove, UK: Lawrence Erlbaum Associates.
Pinker, S. (2002). The blank slate. Harmondsworth: Penguin.
Pinker, S. (2003). Language as an adaptation to the cognitive niche. In M. H. Christiansen & S. Kirby (Eds.), Language Evolution (pp. 16–37). Oxford: Oxford University Press.
Pinker, S., & Bloom, P. (1990). Natural language and natural selection. Behavioral and Brain Sciences, 13, 707–784.
Pinker, S., & Jackendoff, R. (2005). The faculty of language: What’s special about it? Cognition, 95, 201–236.
Von Frisch, K. (1950). Bees, their vision, chemical senses, and language. Ithaca, NY: Cornell University Press.
Von Frisch, K. (1974). Decoding the language of bees. Science, 185, 663–668.
Whorf, B. L. (1956a). Language, thought, and reality: Selected writings of Benjamin Lee Whorf. New York: Wiley.
Whorf, B. L. (1956b). Science and linguistics. In J. B. Carroll (Ed.), Language, thought and reality: Selected writings of Benjamin Lee Whorf (pp. 207–219). Cambridge, MA: MIT Press.
Learning Objectives
The mystery of language acquisition has been tackled by philosophers and linguists since ancient times. Plato proposed that words mapped onto objects in the external world from some innate knowledge. Sanskrit grammarians debated whether the semantics of a word came from innate knowledge or from tradition passed on from one generation to the next (Matilal, 1990).
In The Twilight Zone episode “Mute” (1963), several children are raised without exposure to language in an effort to foster telepathic abilities. It may surprise you to know that such experiments have actually been carried out, albeit not to foster telepathy but to study language acquisition. This experiment is now known as “the forbidden experiment” (Shattuck, 1980/1994).
Herodotus (ca. 485 – 425 BCE) reports in his Histories (II.2, “An Account of Egypt”) that the Egyptian pharaoh Psamtik I carried out an experiment in which a child was brought up without exposure to language in an effort to see what language they would speak. The hypothesis was that whatever words came out would be the primordial language. They apparently concluded that the original language of humanity was Phrygian because the child said bekos, the Phrygian word for bread. Salimbene di Adam in his Chronicles reports a similar experiment carried out by the Holy Roman Emperor Frederick II in the 13th century. James IV of Scotland carried out a similar experiment in which the child apparently spoke good Hebrew.
While these earlier experiments seem to be based on the assumption that some original language would emerge from the subjects, an alternative hypothesis was postulated by the Mughal emperor Akbar. He claimed that language came from hearing, and a similar experiment of his found the child to be mute (Campbell & Grieve, 1981). Such experiments are obviously highly unethical and should not be attempted under any circumstances.
From the descriptions above, you may have discerned two major themes: language acquisition as innate and as a socially acquired skill. The former was supported by rationalists such as Plato and Descartes, while the latter was adopted by empiricists such as Locke and Hume. Locke (1690/1975) argued that knowledge was acquired from experience, with the famous image of the mind as a tabula rasa or blank slate onto which experience writes through sensations. The debate is alive and well today, with the empiricist camp being supported by the works of Piaget and the rationalist views supported by Chomsky. Perhaps the most influential voice today in this debate is Chomsky’s. He argued against the views of the Behaviourists, who claimed all behaviour was a product of rewards or punishments (operant conditioning). B. F. Skinner’s Verbal Behavior (1957) was perhaps the seminal work of the empiricist view on language acquisition. Skinner suggested that the successful use of a sign or symbol (such as a word) by a child elicits positive reinforcement from the listening adults, making the behaviour of linking the sign with an object more likely. This association of a word with an object gets reinforced over time to develop into language. This view was attacked by Chomsky (1959) in his review of Skinner’s book. He argued that children often ignore language corrections from adults and that Skinner fails to explain the fundamental role of syntactic knowledge in language competence.
Chomsky demonstrated that children acquire linguistic rules, or grammar, without exposure to an exhaustive sample of the language being acquired. In other words, children cannot learn the rules of grammar by mere exposure to a language (Chomsky, 1965). For one thing, children hear imperfect input. Adult speech is full of slips of the tongue, false starts and errors. Sometimes there are contractions such as gonna and wanna, and words are not necessarily separated in continuous speech. There is also a lack of examples of all the grammatical structures in a language for children to derive all linguistic rules from analysing the input. All of these phenomena are often labelled the “poverty of the stimulus” (Berwick, Pietroski, Yankama, & Chomsky, 2011). Poverty of the stimulus is often used as an argument for universal grammar. This is the claim that all languages have some underlying common structure within which all surface structures of language emerge.
Language development is perhaps one of the greatest mysteries in psycholinguistics. The rapidity of first language acquisition is astounding to anyone who has tried to learn a second language as an adult. This process can be broadly divided into stages based on the characteristics of the infants’ output. However, we must note that output doesn’t always give us a clear picture of the cognitive processes that are going on within the infants’ minds.
As seen in Figure 5.1, infants make vegetative sounds from birth. These include crying, sucking noises and burps. At around 6 weeks, we start getting cooing sounds followed by vocal play between 16 weeks and 6 months (Stark, 1986). This vocal play involves sounds that appear similar to speech but containing no meaning. Babbling is observed between 6 to 9 months. This is different from vocal play in that it contains true syllables (generally CV syllables as in ‘wa wa’ for ‘water’). Children produce single-word utterances around 10 to 11 months followed by an extraordinary expansion of vocabulary around 18 months. At the same time, we start to get two-word utterances. We also start to get telegraphic speech. These are utterances which lack grammatical elements (Brown & Bellugi, 1964). Grammatically complex utterances emerge around two and a half years.
Research methods that we can employ with adults are not always possible with infants. One technique is the sucking habituation paradigm. This paradigm measures the rate of sucking on an artificial pacifier as a measure of the infant’s interest in a novel stimulus. It has been observed that babies prefer novel stimuli as opposed to stimuli that are familiar. If they are presented with habituated (or familiar) stimuli and then a novel stimulus pops up, the rate of sucking increases. This can be used to see whether an infant can detect the difference between two stimuli. Another technique is the preferential looking technique. Here children look longer at scenes that are consistent with what they are hearing. Using such techniques (and others), psycholinguists try to determine at what age children can distinguish between phonemes and morphemes and understand syntax.
The simplest form of language acquisition would be simple imitation of adult language. While children do imitate adult behaviour to some extent, this alone cannot account for language development. The sentences produced by children acquiring language do not show imitation of adults. Children often make errors that adults don’t make. However, imitation may play a role in the acquisition of accents, speech mannerisms and specialized vocabulary.
Skinner (1957) argued that language acquisition happens through the same mechanisms of operant conditioning that operated on other human and animal behaviour. However, adults generally do not encourage children to speak like them. On the contrary, adults often imitate the childish speech of children when speaking to them. If any correction is made, it is regarding the accuracy of the statements rather than their syntax.
Another observation that learning theories cannot predict is the pattern of acquisition of irregular verb and noun forms. Saying *gived instead of gave or *gooses instead of geese are some examples of this. Children generally show a pattern of correct imitation of the stem but then incorrect production. These incorrect productions are usually because of over-regularization of the past tense or plural forms of the stems. Finally, children produce the correct forms. This is an example of U-shaped development: performance starting off well, then deteriorating before improving. In essence, language acquisition appears to be based on learning rules rather than learning associations.
Chomsky (1965) argued for the existence of a language acquisition device (LAD). This is hypothesized to be an innate structure separate from intellectual ability or cognition. If the poverty of the stimulus argument holds, then children need something in addition to language exposure to arrive at language competency. The language acquisition device was later replaced by the concept of universal grammar. According to this idea, the child has innate rules of inference that enable them to learn a language. This would be a set of parameters that constrain and guide language acquisition. As languages vary in terms of their grammar, syntax, and morphology, Chomsky hypothesized that language learning was essentially a matter of setting parameters using input from exposure to a language, which in turn set other parameters automatically. In other words, languages cannot vary in unlimited ways: there are basic parameters that influence each other.
We can look at some examples of parameter setting across languages. For example, if a language has subject-verb-object (SVO) word order, then question words (what, where, who, how) would come at the beginning of the sentence, while a language that is subject-object-verb (SOV) would put them at the end.
Some universals may be an innate part of grammar. For example, there is no obvious rationale for all SVO languages to put question words at the beginning of their sentences. It is also possible that the external environment in which we evolved may play a role in the development of universals. Languages often mark a difference between animate and inanimate objects, or between sentient and non-sentient beings. However, there is some criticism of the idea that true universals, common to all languages, might exist.
A pidgin language is a grammatically simplified communication method. It usually develops when two or more groups have to communicate but a common language doesn’t exist, typically when communities come together for trade. Pidgins are not considered complete languages and are not native to any speech community. A pidgin is built from the words and sounds of a number of languages, with a limited core vocabulary.
Pidgins usually have the following characteristics:
As Canada has a long history of contact between various language communities, it has had a large share of pidgins developing over the centuries. To facilitate transactions between their communities, the Inuvialuit, or Mackenzie River Inuit, and the Indigenous Athabaskan speakers used an Inuit trade jargon. Inuktitut-English Pidgin was used in Quebec and Labrador. Algonquian–Basque pidgin was used by Basque whalers and Algonquin communities in the Gulf of Saint Lawrence up to the 1710s. Labrador Inuit Pidgin French was a pidgin heavily influenced by French and spoken in Labrador until the 1760s.
The reason we are discussing pidgins here is to explore the extraordinary drive towards language in human beings. While pidgins exist as second languages for adult speakers, if the children of those adults are exposed to a pidgin, they do not grow up speaking it. Instead, they develop a complete language known as a creole. A creole is a pidgin language that has become the native language of the children of adult pidgin speakers. Unlike the simplified pidgins, creoles are syntactically rich and complete languages. This indicates that human beings have some in-built language mechanism that can develop a language from mere exposure to linguistic structures. A creole that developed among the Scottish Red River Métis in present-day Manitoba is Bungi Creole. It developed from pidgins of Scottish English, Scottish Gaelic, French, Norn, Cree, and Ojibwe.
Michif is another example of a fully developed creole language. Michif combines Cree and Métis French with words borrowed from English and some neighbouring Indigenous languages. Its noun phrase phonology, lexicon, morphology, and syntax, as well as its articles and adjectives, are derived from Métis French. Its verb phrase phonology, lexicon, morphology, and syntax, as well as its demonstratives, are from Cree.
Exploring the emergence of creoles from pidgins shows the inherent instinct for language in human beings. If language were merely a socially transmitted communication system, then children could grow up speaking a pidgin as their language. The fact that they develop a complete, syntactically rich language even when exposed only to pidgins suggests an internal language acquisition mechanism that takes in the input of linguistic structures and develops it into a language using some universal grammar.
Key Takeaways
Berwick, R. C., Pietroski, P., Yankama, B., & Chomsky, N. (2011). Poverty of the stimulus revisited. Cognitive Science, 35, 1207–1242.
Brown, R., & Bellugi, U. (1964). Three processes in the acquisition of syntax. Harvard Educational Review, 34, 133–151.
Campbell, R. N., & Grieve, R. (1981). Royal investigations of the origin of language. Historiographia Linguistica, 9(1–2), 43–74.
Chomsky, N. (1959). A review of B. F. Skinner’s Verbal behavior. Language, 35(1), 26–58.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Locke, J. (1690). Essay concerning human understanding (Ed. P. M. Nidditch, 1975). Oxford: Clarendon.
Matilal, B. K. (1990). The word and the world: India’s contribution to the study of language. Oxford: Oxford University Press.
Shattuck, R. (1980/1994). The forbidden experiment: The story of the wild boy of Aveyron. Kodansha International.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
Stark, R. E. (1986). Prespeech segmental feature development. In P. Fletcher & M. Garman (Eds.), Language acquisition (2nd ed., pp. 149–173). Cambridge: Cambridge University Press.
Learning Objectives
Bilinguals are people who are fluent in two languages. It is not necessary for bilinguals to be equally fluent in both languages. Fluency is not a binary classification but rather a continuum. While people often speak of a first language or mother tongue and a second language, psycholinguists refer to the language learned first as L1 and the language learned after that as L2. Sometimes the language learned second may become the primary language of use in everyday life and the language learned first may become the secondary language in later life. Bilingualism can also be categorised as follows:
Bilingualism is not always a matter of choice. Some societies have a history of attempting to impose a language on others. In others, one language may be held as having higher prestige or allowing for better opportunities. On the other hand, bilingualism (or multilingualism) was the norm throughout most of human history until the rise of linguistically and ethnically divided states in Europe. Most human beings lived in multilingual societies or used one language in common use while learning another as a language of higher education (as with Latin in Europe, Sanskrit in India, Classical Chinese in China and English in the modern world).
Bilingualism doesn’t appear to have any linguistic disadvantages (Snow, 1993). There have been cases of initial delay in vocabulary acquisition in one language, but this soon passes. Bilinguals tend to have a slight deficit in working memory tasks in L2. However, they have greater metalinguistic awareness and verbal fluency (Ben-Zeev, 1977; Bialystok, 2001; Cook, 1997). For example, children in Canadian French immersion programs tended to score more highly on creativity tests than monolinguals (Lambert, Tucker & d’Anglejan, 1973).
Being bilingual gives you the awareness that words are arbitrary symbols for things. Some researchers have found interference between L1 and L2 (Harley & Wang, 1997). However, there is evidence to suggest that bilingualism provides a general cognitive advantage. There is even some data indicating that bilingualism protects individuals from the development of Alzheimer’s disease by slowing down cognitive aging (Bialystok, Craik, & Luk, 2012).
The issue of how we switch between two languages is an interesting topic for bilingual research. Kroll and Stewart (1994) proposed a model of asymmetric translation between L1 and L2. The Word Association Model (see Figure 6.1) has the L1 word directly associated with its L2 equivalent. In order to access the concept of a word, L2 words must first activate their L1 equivalent.
In contrast to this, the Concept Mediation Model (see Figure 6.2) has words in L1 and L2 directly associated with the concepts. However, there are no direct links between the words in L1 and L2. Potter et al. (1984) tested these models by comparing the time it took bilinguals to translate from L1 to L2 with the time it took them to name pictures in L2. The assumption is that picture naming requires conceptual processing. The Word Association Model predicts that picture-naming in L2 should take longer than translation, because naming requires extra steps (the picture activates the concept, which must then access the L1 word before the L2 word, whereas translation uses the direct L1–L2 link). The Concept Mediation Model, on the other hand, predicts that both tasks should take more or less the same amount of time, as the concept is linked directly to the words in both L1 and L2. The results favoured the Concept Mediation Model.
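The logic of this comparison can be sketched in a few lines. The step latencies below are purely hypothetical and only serve to show how the two architectures make different predictions about the two tasks:

```python
# Counting the links each task must traverse under the two models.
# STEP_MS is an arbitrary illustrative cost per link, not an empirical estimate.
STEP_MS = 200

def rt(links):
    return links * STEP_MS

# Word Association Model: L2 words reach concepts only via their L1 equivalents.
word_association = {
    "translate L1 -> L2": rt(1),    # direct L1 -> L2 lexical link
    "name picture in L2": rt(3),    # picture -> concept -> L1 word -> L2 word
}

# Concept Mediation Model: both L1 and L2 words link directly to concepts.
concept_mediation = {
    "translate L1 -> L2": rt(2),    # L1 word -> concept -> L2 word
    "name picture in L2": rt(2),    # picture -> concept -> L2 word
}

print(word_association)   # predicts picture naming slower than translation
print(concept_mediation)  # predicts roughly equal times, the pattern Potter et al. found
```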
Kroll and Curley (1988) and Chen and Leung (1989) replicated the results of Potter et al. (1984) using participants of lower proficiency in L2. They found that learners at an earlier stage of acquisition were quicker in translation from L1 to L2 than L2 picture-naming. This supported the Word Association Model. While Kroll and Curley (1988) and Chen and Leung (1989) did replicate the findings of Potter et al. (1984) for highly proficient bilinguals, they did identify a transition phase which relied on translating between L1 and L2 with possible mediation of the concept. Therefore, Kroll and Stewart (1994) proposed the Revised Hierarchical Model (see Figure 6.3) which integrated the connections from the Word Association Model and the Concept Mediation Model.
The Revised Hierarchical Model makes two assumptions:
As learners become more proficient in L2, they begin to develop the ability to process L2 words directly. However, the connections for L1 remain stronger than for L2.
The most influential model of bilingualism is the Bilingual Interactive Activation Plus (BIA+) model (Dijkstra & van Heuven, 2002; Dijkstra, van Heuven, & Grainger, 1998). The model tries to bring together evidence from bilingual orthographic processing as well as the recognition of words that look the same in two languages (cognates). The model is composed of a network of nodes at every level of representation from segmental (orthographic/phonemic), sub-lexical, to lexical. As seen in Figure 6.4, the model is bottom-up. Word recognition isn’t affected by whether the task is naming or reading.
The model assumes a distinction between word identification and task-based sub-systems. The lexicon is integrated with parallel access as well as temporal decay of L2. This decay is based on the assumption that word frequency affects the resting potential of the nodes for particular words. The model is supported by neuroimaging studies. The Semantic, Orthographic, Phonological Interactive Activation (SOPHIA) model is the implemented version of the BIA+ model. It adds phonology and semantics with layers for letters/phonemes, clusters/syllables/words, and semantics. In SOPHIA, the language nodes no longer inhibit the non-target language. The model still uses orthographic information as input.
Figure 6.4 The Bilingual Interactive Activation Plus (BIA+) model
The BIA+ model is composed of three levels of representation. From the bottom up, the levels are orthographic or phonological features, followed by sub-lexical units, and then lexical orthography or lexical phonology, which eventually lead to the semantic meaning of the language.
[Return to place in the text (Figure 6.4)]
Figure 6.5 The Semantic, Orthographic, Phonological Interactive Activation (SOPHIA) model
The SOPHIA model is composed of four layers of representation. From the bottom up, the first layer of representation is letters or phonological features, followed by orthographic or phonological clusters, then orthographic or phonological syllables, and then orthographic or phonological words, which eventually lead to the semantic meaning of the language.
[Return to place in the text (Figure 6.5)]
Second language acquisition is the attempt to acquire a language while already competent in another. There is a distinction between a child being naturally exposed to two languages and a child or adult learning in a classroom setting. In terms of second language acquisition itself, linguistic structures such as syntax may be harder to grasp after a critical period. People also have less time and motivation to pursue language learning in earnest. In addition, the contrasts between L1 and L2 may aid or hinder acquisition. Generally, the more different L2 is from L1 in some feature, the more difficult it will be to learn.
Initial learning of L2 is good and then declines before the learner becomes more proficient (McLaughlin & Heredia, 1996). This decline is explained by the substitution of less complex internal representations with more complex ones. For example, the learner acquires the use of syntactic rules as opposed to repeating sentences by rote. The traditional method for second language teaching is based on translating from one language into another. On the other hand, direct learning involves learning conversational skills in L2. Some methods prefer speaking and listening over reading and writing. Immersive learning is a technique where all learning is conducted in L2.
In addition to various teaching methods, the characteristics of the individual learner also play a role in second language acquisition. Carroll (1981) identified four sources of variation:
This chapter explored the definition of bilingualism in its various forms. It also considered the advantages and disadvantages of bilingualism and various models for translating between L1 and L2. We also looked at computational models for bilingual reading and auditory comprehension. Finally, we explored some of the evidence-based methods for second language learning.
Key Takeaways
Exercises in Critical Thinking
Ben-Zeev, S. (1977). The influence of bilingualism on cognitive strategy and cognitive development. Child Development, 48, 1009–1018.
Bialystok, E. (2001). Metalinguistic aspects of bilingual processing. Annual Review of Applied Linguistics, 21, 169–181.
Bialystok, E., Craik, F. I. M., & Luk, G. (2012). Bilingualism: Consequences for mind and brain. Trends in Cognitive Sciences, 16, 240–250.
Bickerton, D. (1981). Roots of language. Ann Arbor, MI: Karoma.
Bickerton, D. (1984). The language bioprogram hypothesis. Behavioral and Brain Sciences, 7, 173–221.
Carroll, J. B. (1981). Twenty-five years of research on foreign language aptitude. In K. C. Diller (Ed.), Individual differences and universals in language learning aptitude (pp. 83–118). Rowley, MA: Newbury House.
Chen, H.-C., & Leung, Y.-S. (1989). Patterns of lexical processing in a nonnative language. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 316–325.
Cook, V. (1997). The consequences of bilingualism for cognitive processing. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 279–299). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Dijkstra, A., & Van Heuven, W. J. B. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5, 175–197.
Dijkstra, T., van Heuven, W. J. B., & Grainger, J. (1998). Simulating cross-language competition with the bilingual interactive activation model. Psychologica Belgica, 38, 177–196.
Harley, B., & Wang, W. (1997). The critical period hypothesis: Where are we now? In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 19–51). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Kroll, J. F., & Curley, J. (1988). Lexical memory in novice bilinguals: The role of concepts in retrieving second language words. In M. Gruneberg, P. Morris, & R. Sykes (Eds.), Practical aspects of memory (Vol. 2, pp. 389–395). London: Wiley.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
Lambert, W., Tucker, G., & d’Anglejan, A. (1973). Cognitive and attitudinal consequences of bilingual schooling. Journal of Educational Psychology, 65, 141-159.
McLaughlin, B., & Heredia, R. (1996). Information processing approaches to research on second language acquisition and use. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 213–228). London: Academic Press.
Potter, M. C. (1979). Mundane symbolism: The relations among objects, names, and ideas. In N. R. Smith & M. B. Franklin (Eds.), Symbolic functioning in childhood (pp. 41–65). Hillsdale, NJ: Erlbaum.
Snow, C. E. (1993). Bilingualism and second language acquisition. In J. B. Gleason & N. B. Ratner (Eds.), Psycholinguistics (pp. 391–416). Fort Worth, TX: Harcourt Brace Jovanovich.
Learning Objectives
In this chapter we will explore the basics of the written word. Writing involves the representation of language with written symbols. A system of writing is not in itself a language. It is a way to render language in a manner that is retrievable across space and time. Writing is not a natural part of language and most languages have not had a writing system for most of their history. We will not explore the history of writing in much detail. We will however look at some prominent writing systems that will give you some idea about how it may have developed in human societies. Look through the sentences below and think of how they differ from one another. Each of them represents a different writing paradigm which we will be exploring.
We are already familiar with the concept of the phoneme as a basic unit of sound in a language from the second chapter. We often think of the writing system of a language as consisting of letters. However, for our purposes here, we will use the term grapheme to refer to the smallest unit of writing. A grapheme is a letter or combination of letters that represents a phoneme. Just as we use forward slashes / / to indicate phonemes, we use angle brackets < > to indicate graphemes. For example, the word ‘dog’ has three graphemes <d>, <o> and <g> that correspond to three phonemes. That seems pretty straightforward. Now think of the word ‘ship.’ The <i> and <p> represent two phonemes, but the letters s and h combine to form the single grapheme <sh> to represent the voiceless palato-alveolar fricative /ʃ/. A phoneme is different from a grapheme in that different languages use different graphemes to indicate the same phoneme. For example, Basque uses <x>; Turkish, Azerbaijani, and Romanian use <ş>; French uses <ch>; Shuswap uses <s>; and German uses <sch>.
The examples we saw in terms of graphemes will be familiar to English speakers as they are all from alphabets. In alphabets, the basic correspondence for a grapheme is a phoneme. This is not the case for other writing systems. In addition, not all languages have a transparent connection between graphemes and phonemes. Italian and Spanish tend to be transparent while English and French tend not to be so clear in how the graphemes represent phonemes. For example, think of the variations for the grapheme <a> in the words ‘apple,’ ‘father,’ and ‘gate.’ There are also writing systems where graphemes represent not phonemes but syllables, morphemes or even words. As you will see, these variations have implications for psycholinguistic models as most research is done in languages with alphabetic writing systems.
The earliest writing systems in the world developed from logograms. In this type of writing system, each grapheme represents a word or morpheme. Examples of logographic systems include Chinese characters, Egyptian hieroglyphs, and Sumerian cuneiform. You can imagine how impractical it would be to have a separate symbol for every word or morpheme. Therefore, these systems take advantage of the rebus principle. For example, in English the words ‘bee’ and ‘leaf’ can be represented with separate pictures as graphemes. However, when it comes to writing ‘belief,’ instead of creating another picture, we could combine the two pictures to form BEE+LEAF, which can be sounded out when reading. A modern development of a logographic system would be the use of emojis. However, emojis are not formalized for any language and so cannot act as a writing system on their own.
There are some interesting examples of logographic writing systems in Canada. Examples include the Ojibwe wiigwaasabakoon. These are bark scrolls with symbols now known as Ojibwe hieroglyphs. They have not been deciphered, although it is said that some elders still know what they mean (Geniusz, 2009). They are supposed to contain songs and details of religious rituals and medicine. Another example is the Miꞌkmaw hieroglyphic writing system used on the east coast and islands of Canada. These are known as komqwejwi’kasikl, or “sucker-fish writings,” as they resemble the tracks left in the mud by sucker fish. It is possible that they evolved from mnemonic symbols that were used to aid in recalling information. As seen in the example in Figure 7.2 from 1880, later missionaries used the system to write prayers while destroying older scrolls that contained information about the Miꞌkmaw religion.
While a logogram represents an entire word or morpheme, a syllabary is a system where a grapheme represents an entire syllable. Typically, syllabaries use a system whereby there are symbols for individual vowels and consonant-vowel combinations. In most syllabaries, phonetically related syllables would not look similar. For example, the grapheme for /pa/ would not look similar to the grapheme for /pi/. Syllabaries are quite natural in that they represent the smallest units of articulation.
Syllabaries are generally used by languages that have relatively simple syllable structures. Examples of syllabaries include the hiragana and katakana syllabaries used for Japanese and Linear B used for Mycenaean Greek. A remarkable example of a syllabary being invented is the Cherokee syllabary (Figure 7.3), created by Sequoyah in the 1810s and 1820s. While the graphemes were borrowed from symbols Sequoyah had seen in European alphabets, they represent different phonemes and, unlike European writing systems, form a true syllabary. The story of Sequoyah’s extraordinary achievement is that he first attempted to create a grapheme for each word. Finding this too difficult, he went on to develop graphemes for each syllable. It was so successful that the Cherokee Nation’s literacy rate in the 1830s surpassed that of European settlers.
Syllabaries seem a very natural way to represent spoken language in writing. They are easily discernible as they represent the smallest unit of articulation. However, some of the earliest systems of writing invented by mankind may appear quite unusual. These are abjads, which only have symbols for consonants and not vowels. Why might this be? Try the following exercise.
Living Language
Consider the language that you speak and think of the various dialects that might exist for it. In English, for example, you can think of how people speak in Western parts of North America versus the East Coast. Now compare these against the dialects of English in the United Kingdom, Australia, India, and New Zealand.
Now, in imagining how these dialects sound (you can look at some movie clips from various regions), what are the phonemes that are common across them? Do they change a lot in terms of consonants or vowels?
If you went through this exercise, you will realize that vowels tend to be the most varied across dialects of a language. While it may seem natural to us to write vowels down, think of how a person coming up with a system to represent sounds would approach it. As long as you have only a few vowels in your language, you can get the gist of a word by simply writing the consonants. Such systems were used by the people of Ancient Egypt, which may have influenced the Phoenicians. The Phoenician abjad led to the Hebrew and Arabic abjads as well as the eventual invention of the alphabet by the Ancient Greeks. As in most writing systems, abjads don’t always remain true to their definition and often employ diacritic marks and some consonant graphemes to represent vowels in some contexts.
Figure 7.4 gives some examples of the Persian abjad system. Persian has adapted the Arabic abjad to represent its own unique sound inventory. As vowels are quite prominent, Persian employs the aleph symbol from Arabic (originally used to represent a glottal consonant) for vowels. The difference between short and long vowels is indicated by adding a particular consonant for long vowels. Finally, we see a complete word (written right-to-left) with just consonants. Sometimes, diacritics are employed to indicate geminates (overly long consonants). The word itself is an Arabic word (maḥabba) adapted into Persian as /mohæbbæt/ as well as into Urdu and Hindi as /muhabbat/.
We saw in Chapter 2 that syllables are the smallest units of articulation. Therefore, it is quite natural to represent syllables and words in writing. The alphabet, which represents individual vowels and consonants with separate graphemes, is a unique invention of the Ancient Greeks. Taking the abjad system of the Phoenicians, the Greeks adapted it to represent their own language around the 8th century BCE. As vowels played a more prominent role in Greek, they needed to indicate them with new symbols. The Greek alphabet is the ancestor of all modern European scripts, either through its adoption by the Romans for Latin (in the West) or through the development of Cyrillic (in the East). Canadian examples of alphabets include Secwepemc (Secwepemctsín), Squamish (Sḵwx̱wú7mesh sníchim), Thompson / Nlaka’pamux (Nłeʔkepmxcín), Okanagan (n̓səl̓xcin̓) and Algonquin (Anicinâbemowin).
Unlike an alphabet where the consonant and vowel share equal prominence, an abugida uses segments of consonant-vowel sequences where the consonant is prominent when preceding a vowel. The vowel is usually indicated by secondary notations. This may sound like a syllabary. However, unlike syllabaries, abugida segments can be split into consonants and vowels. In addition, similar segments share visual features.
Most of the writing systems of South Asia, Southeast Asia and Tibet are abugidas. This system is also prevalent in the Ethiopic script and in Canadian Aboriginal syllabics. Figure 7.5 gives an example of an abugida in the form of the Devanagari script used to write Sanskrit, Hindi, Marathi, and Nepali. Generally, you find separate graphemes for individual vowels and consonants. All consonant graphemes are pronounced with an inherent schwa vowel. If a consonant is followed by another vowel, that vowel is not written with its primary symbol (as in an alphabet), but with a secondary notation that can appear before, after, under or above the consonant’s grapheme. This is all well and good for simple CV syllables, but what about more complicated syllable structures and syllables that end with a consonant? Various strategies are employed for these scenarios. One is to fuse two consonant graphemes together to form the complex structure (as in the examples for /tr/ and /kj/). We can see that sometimes this fusion results in a new symbol (as in /tr/), or the graphemes for each consonant remain more or less the same (as in /kj/). Another method, employed in Hindi, is that the final vowel in a word is intuitively deleted based on context. In other languages (for example Sanskrit), there is a special vowel nullifier that is used to indicate that the inherent vowel should not be pronounced.
Figure 7.6 gives us a nice comparison between alphabets, syllabaries and abugidas. The alphabet is the Latin script used in English or Secwepemc. Each consonant and vowel has a separate grapheme. However, while in Secwepemc each grapheme represents a phoneme, in English a grapheme can represent different phonemes (making the English script less transparent). The syllabary is the Japanese hiragana equivalents for the Latin graphemes. As you can see, while they represent variations of /k/ with different vowels, the graphemes show no indication of this and are separate symbols. The examples of an abugida are from the Tamil script. Here, we can see the same symbol for /ka/ (in black). However, every other vowel (long /aː/, /i/, /u/, /e/ and /o/) is indicated with a secondary vowel notation (in red). This means you need fewer symbols in this writing system, as you don’t need a separate symbol for every consonant-vowel combination (as in a syllabary).
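A rough back-of-the-envelope calculation (illustrative arithmetic only, with made-up inventory sizes) shows why this matters for the number of distinct symbols a learner has to master:

```python
# Approximate symbol counts for a hypothetical language with 20 consonants and 5 vowels.
consonants, vowels = 20, 5

alphabet  = consonants + vowels           # one grapheme per phoneme
syllabary = consonants * vowels + vowels  # one grapheme per CV syllable, plus bare vowels
abugida   = consonants + vowels           # consonant bases plus a small set of vowel marks

print(alphabet, syllabary, abugida)       # 25, 105, 25
```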
An abugida that is prevalent in Canada is a family of writing systems known as Canadian Aboriginal syllabics (see Figure 7.7). Created by James Evans based on his knowledge of Devanagari and shorthand, these systems are used to write a number of Indigenous Canadian languages, including Cree, Inuktitut and Ojibwe. As seen in Figure 7.7, unlike most abugidas, which use secondary graphemes for different vowels, Canadian syllabics are unique in employing the orientation of the grapheme to indicate vowels. If a consonant appears on its own without an inherent vowel, then it is written as a superscript.
We have seen how phonemes can be classified according to their place and manner of articulation as well as voicing. A featural script notates these aspects of phonemes in a consistent manner in graphemes (Sampson, 1990). For example, all labials (phonemes produced with the lips) may have common visual elements in all graphemes that represent them. A writing system developed with just such featural elements is Korean hangul (see Figure 7.8).
As seen in Figure 7.8, the Korean graphemes are written in blocks arranged in two dimensions. Therefore, words are not written in a purely linear fashion. Hangul’s featural system is not always evident to its users, as the graphemes are used like an alphabetic writing system. Another featural writing system is Pitman’s shorthand, in which the thickness of the lines indicates featural differences. The fictional Tengwar script invented by Tolkien also employs featural notation: in it, stops look similar to each other with minor variations, as do sibilants, fricatives and nasals.
Unlike speech, reading is not a biologically evolved mechanism. Writing emerged in different cultures starting around 5,000–6,000 years ago, unlike speech, which may go back as far as 2 million years. Therefore, the mechanisms that we employ for reading and writing are those which evolved for other cognitive tasks but have been adapted for this new purpose. As we see in Figure 7.9, various brain regions which evolved to form associations between different modes of sensation and perception have been adapted to engage in reading.
The scientific study of reading is therefore a multidimensional area exploring everything from the linguistic aspects of reading (as we did in this chapter) to the psychological mechanisms and neural circuits involved. To begin our exploration of various psychological models of reading and the evidence supporting them, let us start with the Simple View of Reading. This view is based on the idea that reading requires word recognition and linguistic comprehension, which need to interact in order for reading to develop (Catts et al., 2016; Savage, 2001). This view also classifies readers into four broad categories: typical readers; poor readers (general reading disability); dyslexics; and hyperlexics (see Figure 7.10).
As we are dealing with a variety of writing systems from around the world, we need to familiarize ourselves with some basic concepts associated with reading. For example, we know that reading requires some way of associating graphemes with phonemes to build meaningful units (words and morphemes). In some cases, this is pretty straightforward in that graphemes map onto phonemes in a regular manner (as in the word “feet”). We don’t need special knowledge to pronounce the word. Even if we were to be presented with a non-word such as ‘pont,’ we don’t need any special knowledge to know how to pronounce it. However, not all words are regular. There are irregular or exception words. Consider place names such as ‘Leicester’ (pronounced /lɛstər/) or surnames such as Featherstonhaugh (pronounced /fænʃɔː/). These are irregular in that the graphemes don’t map clearly onto the phonemes they are supposed to represent. More familiar exceptions would be ‘have’ as opposed to ‘gave.’ English has a plethora of these irregular words, which makes it difficult for learners of English as a second language.
The fact that we can work out how to pronounce regular words and regularly spelled non-words suggests we must have a mechanism for storing and decoding regular spelling rules. Our ability to recognize irregularly spelled words suggests that there must be another route for retrieving their pronunciation. In other words, there is the suggestion of a dual-route model of reading. The classic dual-route model is based on the assumption of two routes for pronouncing words. There is a direct-access or lexical route, where the word-form needs to be retrieved from the lexicon along with its pronunciation. There must also be a non-lexical route, which has a grapheme-to-phoneme converter (GPC) that maps each grapheme to its corresponding phoneme using regular rules (Gough, 1972; Rubenstein, Lewis, & Rubenstein, 1971). The non-lexical route is also evident when children are learning to read letter by letter. In most versions of the dual-route model, whenever a word is encountered, there is a race between the two routes to access the pronunciation. Whichever route gets the final word-form out first produces the output. Whether these two routes are necessary for understanding reading is a primary question in most psycholinguistic research. In addition, not all writing systems are the same and may need to employ different strategies for decoding their graphemes. We will explore the evidence for these various models in the next chapter.
Figure 7.10 The Simple View of Reading
The simple view of reading classifies readers into four broad categories based on language comprehension and word recognition. These categories are:
[Return to the place in text (Figure 7.10)]
This chapter explored the basic concepts of graphemes in different writing systems. We saw how logograms, syllabaries, abjads, alphabets, abugidas and featural writing systems can all be useful in representing different languages across the world. This chapter also explored how the two basic types of spelling (regular and irregular) have led to the assumption of dual-route models for reading.
Key Takeaways
Exercises in Critical Thinking
Catts, H. W., Hogan, T. P., & Fey, M. E. (2016). Subgrouping poor readers on the basis of individual differences in reading-related abilities. Journal of Learning Disabilities, 36(2), 151–164.
Geniusz, W. D. (2009). Our knowledge is not primitive: Decolonizing botanical Anishinaabe teachings. Syracuse, N.Y: Syracuse University Press.
Gough, P. B. (1972). One second of reading. In J. F. Kavanaugh & I. G. Mattingly (Eds.), Language by ear and by eye (pp. 331–358). Cambridge, MA: MIT Press.
Rubenstein, H., Lewis, S. S., & Rubenstein, M. A. (1971). Evidence for phonemic recoding in visual word recognition. Journal of Verbal Learning and Verbal Behavior, 10, 645–658.
Sampson, G. (1990). Writing systems. Stanford, CA: Stanford University Press.
Savage, R. (2001). The simple view of reading: Some evidence and possible implications. Educational Psychology, 17, 17–33.
Learning Objectives
In this chapter, we will explore the evidence that informs reading research within psycholinguistics. We will look at the neurological disorders associated with reading. These include various types of dyslexia, whose specific deficit types have the potential to provide new insights into the process of normal reading. We will then look at some models of reading and the evidence they use to inform their stages and information processing.
Proposed by Morton (1969, 1970), the Logogen model assumes units called logogens which are used to understand words that are heard and read. Logogens are specialized recognition units that are used for word recognition. Logogen is from the Greek λόγος (logos, word) and γένος (genos, origin). So, every word we know has its own logogen which contains phonemic and graphemic information about that word. As we encounter a word, the logogen for that word accumulates activation until a given threshold is reached upon which the word is recognized. An important issue to remember is that the logogen itself doesn’t contain the word. Rather, it contains information that can be used to retrieve the word. Accessing words is direct and parallel for all words.
Each logogen has a resting activation level. As it receives more evidence that corresponds to its word, this activation level increases up to a threshold. For example, if the input contains the grapheme <p>, then all logogens containing that visual input get an increase in activation. Once enough graphemes are there to fully activate the logogen, it fires and word recognition occurs. The model is particularly good at including contextual information in recognizing words. One of the problems with this model was that it equated visual and auditory input for a word as using the same logogen process. A prediction of this model would be that a spoken prime should facilitate a written word just as much as a visual prime. However, experimental evidence contradicted this prediction (Winnick & Daniel, 1970). Following his own observations confirming these findings, Morton divided the model into different sets of logogens (see Figure 8.1).
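The core idea of evidence accumulating towards a threshold can be sketched as follows. This is an illustrative simplification with a made-up three-word lexicon and arbitrary thresholds, not Morton’s actual implementation:

```python
# Each logogen accumulates activation from matching graphemes and "fires" at threshold.
THRESHOLDS = {"pig": 3.0, "pit": 3.0, "dog": 3.0}   # hypothetical thresholds

def recognize(visual_input):
    activation = {word: 0.0 for word in THRESHOLDS}
    for grapheme in visual_input:                    # evidence arrives grapheme by grapheme
        for word in THRESHOLDS:
            if grapheme in word:
                activation[word] += 1.0              # matching evidence raises activation
        for word, level in activation.items():
            if level >= THRESHOLDS[word]:            # threshold reached: the logogen fires
                return word
    return None                                      # no logogen reached its threshold

print(recognize("pig"))  # -> 'pig'
```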
While the logogen model is quite successful in explaining word recognition in terms of semantics, there are some limitations. There have been challenges to the existence of the logogen as a unit of recognition: given that different pathways process information on the way to word recognition, is it really necessary to have such centralized units? The model’s limited scope for word recognition, which ignores innate syntactic rules and grammatical construction, is a further limitation.
McClelland and Rumelhart (1981) and Rumelhart and McClelland (1982) developed the Interactive Activation and Competition (IAC) model. The model accounts for word context effects: letters are easier to recognize when they appear in words than as isolated letters (also known as the word superiority effect). As seen in Figure 8.3, the model consists of three levels: visual feature units, individual letter units, and word units. Each unit is connected to units in the level immediately before and after it with connections that are excitatory (if appropriate) or inhibitory (if inappropriate).
Let’s look at the example in Figure 8.3 for the word leap. First, the individual features are recognized. For example, the vertical line feature excites <E>, <P>, and <L>, but combined with the horizontal line feature, it only excites <L>. All the letter units in turn excite words that contain them, while inhibitory signals flow towards words that do not contain those letters. Once enough letter units have passed activation to the word <LEAP>, that word is recognized. The feedback between the word and letter levels accounts for the word superiority effect: if no words are activated (because the letter is on its own), the letter receives no top-down support, but if the letter is within a word, the word unit facilitates its recognition.
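The division of labour between excitatory and inhibitory connections can be sketched with a toy, position-coded lexicon. The weights and the three-word vocabulary below are assumptions for illustration, not the published IAC parameters:

```python
WORDS = ["leap", "lead", "pear"]

def word_activation(letters, excite=0.2, inhibit=-0.1):
    """Letter units excite words that contain them in that position and inhibit the rest."""
    activation = {word: 0.0 for word in WORDS}
    for position, letter in enumerate(letters):
        for word in WORDS:
            activation[word] += excite if word[position] == letter else inhibit
    return activation

print(word_activation("leap"))
# LEAP accumulates the most activation and wins the competition at the word level;
# in the full model its activation would also feed back down to support its letters.
```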
Seidenberg and McClelland (1989) proposed a model (also known as SM) that accounts for letter recognition and pronunciation. Reading and speaking involve three types of coding: orthographic, semantic and phonological. In the SM model, these codes are connected with feedback connections. As seen in Figure 8.2, the model has a triangular shape. There is a direct route from orthography to phonology as well as a route via semantics. However, there are no explicit rules for grapheme-to-phoneme correspondence.
The model has three levels containing a number of simple units: the input, hidden and output layers. Each unit has an activation level and is connected to other units by weighted connections which can excite or inhibit activation. The main feature of these connections is that they are not set by hand but learned through back-propagation. This is an algorithmic method whereby the discrepancy between the actual output and the desired output is reduced by changing the weights of the connections. The model also does not have lexical entries for individual words; words exist only as patterns of connections between grapheme and phoneme units.
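A toy version of this learning mechanism is sketched below: a tiny orthography-to-phonology network whose weights are adjusted by back-propagation. The feature vectors are made-up codes for four hypothetical words, and the network is far smaller than the one Seidenberg and McClelland trained:

```python
import numpy as np

# Made-up distributed codes: 6 orthographic features -> 4 phonological features.
X = np.array([[1, 0, 0, 1, 0, 1],
              [0, 1, 0, 1, 1, 0],
              [1, 1, 0, 0, 0, 1],
              [0, 0, 1, 1, 1, 0]], dtype=float)
Y = np.array([[1, 0, 0, 1],
              [0, 1, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.5, (6, 5))          # orthography -> hidden weights
W2 = rng.normal(0, 0.5, (5, 4))          # hidden -> phonology weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for epoch in range(2000):
    hidden = sigmoid(X @ W1)             # hidden-layer activation
    output = sigmoid(hidden @ W2)        # phonological output
    error = output - Y                   # discrepancy between actual and desired output
    # back-propagate the discrepancy and nudge the weights to reduce it
    d_out = error * output * (1 - output)
    d_hid = (d_out @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * hidden.T @ d_out
    W1 -= 0.5 * X.T @ d_hid

print(np.round(sigmoid(sigmoid(X @ W1) @ W2)))   # the learned outputs approximate Y
```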
Coltheart et al. (1993) criticized the SM model for not accounting for how people read exception words, and non-words. They also stated that the model doesn’t account for how people perform visual lexical decision tasks as well as failing to account for data from reading disorders
such as dyslexia. Forster (1994) pointed out that just because a model can successfully replicate reading data using connectionist modelling doesn’t mean that it reflects how reading occurs in human beings. Norris (1994) argued that the model doesn’t reflect how readers can shift strategically between lexical and non-lexical information when reading.
The dual-route model is perhaps the most widely studied model of reading aloud. It assumes two separate mechanisms for reading: the lexical route and the non-lexical route. As seen in Figure 8.4, the lexical route is most effective for skilled readers who can recognize words that they already know. This is like looking up words in a dictionary. When readers see a word, they access the word in their mental lexicon and retrieve information about its meaning and pronunciation. However, this route cannot provide any help if you come across a new word for which there is no entry in the mental lexicon. For this you would need to use the non-lexical route.
The non-lexical or sub-lexical route is a mechanism for decoding novel words using the existing grapheme-to-phoneme rules of a language. This mechanism operates by identifying a word’s constituent parts (such as graphemes) and applying linguistic rules to decode them. For example, the grapheme <ch> would be pronounced as /tʃ/ in English. This route can be used to read non-words or regular words (that is, words with regular spelling).
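A minimal sketch of the two routes working on the same input is given below. The three-word lexicon and the letter-level GPC rules are toy assumptions, not the rule set of any implemented dual-route model:

```python
LEXICON = {"have": "hæv", "gave": "geɪv", "dog": "dɒg"}   # stored pronunciations
GPC = {"h": "h", "a": "eɪ", "v": "v", "e": "", "d": "d",
       "o": "ɒ", "g": "g", "p": "p", "n": "n", "t": "t"}  # toy regular rules

def lexical_route(word):
    return LEXICON.get(word)            # only succeeds for words already in the lexicon

def nonlexical_route(word):
    return "".join(GPC.get(ch, ch) for ch in word)   # applies rules letter by letter

def read_aloud(word):
    # a simplification of the "race": the lexical route wins for known words,
    # otherwise the non-lexical route supplies a rule-based pronunciation
    return lexical_route(word) or nonlexical_route(word)

print(read_aloud("dog"))           # 'dɒg'  (both routes agree)
print(read_aloud("pont"))          # 'pɒnt' (a non-word: only the non-lexical route can read it)
print(nonlexical_route("have"))    # 'heɪv' -- the regularization a surface dyslexic might produce
```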
Figure 8.3 Seidenberg and McClelland’s model of reading
The SM model is in a triangular shape. The three corners are:
The phonological processor and the orthographic processor are connected by phonics, and both are connected to the meaning processor.
[Return to place in the text (Figure 8.3)]
A flow chart of the dual-route model that has a lexical route and a sub-lexical route for reading aloud.
[Return to place in the text (Figure 8.4)]
Reading models can often be informed by data from people with reading disorders. In studying such disorders, we must differentiate between acquired disorders (those that arise from brain trauma, stroke or injury) and developmental disorders (those that may arise from disruption to the development of reading faculties). These dyslexias generally result from injury to the left hemisphere. If the dual-route model accurately captures the reading process, then we should be able to find patients who have damage to one route without impairment of the other. The evidence for such double dissociations in reading-aloud tasks provides strong support for dual-route models.
Patients with surface dyslexia have an impairment in reading irregular words. For example, they would have difficulty reading “quay” but can read regularly spelled words such as “dog.” They often over-regularize when reading aloud but can read regular words and regularly spelled non-words easily. In terms of the dual-route model, their lexical route is impaired while the non-lexical route is intact.
Patients with phonological dyslexia are unable to read regularly spelled non-words. However, they are able to read comparable real words. This suggests an impairment of the non-lexical (grapheme-to-phoneme) route.
Deep dyslexics often resemble phonological dyslexics in that these patients have difficulty with non-words. However, they also make semantic errors where they produce words that are related in meaning with the word they were supposed to read. Coltheart (1980) lists 12 characteristics of this disorder:
Languages with transparent scripts, such as Italian and Spanish, exhibit phonological and deep dyslexia but not surface dyslexia (Patterson, Marshall, & Coltheart, 1985a, 1985b). However, interesting observations can be made in languages that have more than one script. Take Japanese, for instance, which has a syllabary (kana) and a logographic script (kanji). In the latter, no information is available about pronunciation, as the symbols stand for whole words. As seen in Figure 8.5, while kana allows for non-lexical grapheme-to-phoneme processing, kanji can only access the lexical route. Therefore, a type of surface dyslexia is found in Japanese where patients cannot read kanji but can process kana. Phonological dyslexia in Japanese results in patients being able to read both kana and kanji but not being able to process non-words written in kana. This suggests that while the neuropsychological mechanisms for reading are common to all human beings, there may be contextual differences brought out by the features inherent in a particular language’s writing system.
Living Language
Canada has its own syllabic writing system used to write languages such as Cree, Ojibwe and Inuktitut. In this writing system, the orientation and size of the letters are what modify the pronunciation rather than other diacritics.
Key Takeaways
Exercises in Critical Thinking
Think of how you read words in your language.
Coltheart, M. (1980). Deep dyslexia: A right hemisphere hypothesis. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 326–380). London: Routledge & Kegan Paul.
Coltheart, M., Curtis, B., Atkins, P., & Haller, M. (1993). Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review, 100, 589–608.
Forster, K. I. (1994). Computational modeling and elementary process analysis in visual word recognition. Journal of Experimental Psychology: Human Perception and Performance, 20, 1292–1310.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of the basic findings. Psychological Review, 88, 375–407.
Morton, J. (1969). Interaction of information in word recognition. Psychological Review, 76, 165–178.
Morton, J. (1970). A functional model for human memory. In D. A. Norman (Ed.), Models of human memory (pp. 203–260). New York: Academic Press.
Norris, D. (1994b). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52, 189–234.
Patterson, K. E., Marshall, J. C., & Coltheart, M. (1985a). Surface dyslexia in various orthographies: Introduction. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia: Neuropsychological and cognitive studies of phonological reading (pp. 209–214). Hove, UK: Lawrence Erlbaum Associates.
Patterson, K. E., Marshall, J. C., & Coltheart, M. (Eds.). (1985b). Surface dyslexia: Neuropsychological and cognitive studies of phonological reading. Hove, UK: Lawrence Erlbaum Associates.
Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89, 60–94.
Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96(4), 523-568.
Winnick, W. A., & Daniel, S. A. (1970). Two kinds of response priming in tachistoscopic recognition. Journal of Experimental Psychology, 84, 74–81.
Learning Objectives
Living Language
As everyone knows, a long time ago, the world was full of water. As the floods receded, Raven flew over the waters. He saw a gigantic clam shell and, feeling bored, flew over and landed on it. He heard strange sounds emerging from the clamshell: “yakity yak-yak.” Having never heard such sounds, Raven was curious and, knowing that he had to soothe the fears of whatever was inside, he started to sing. After singing in a soothing voice, Raven called out, “Come out! I am Raven the creator and will not hurt you.” Finally, after more singing, the clamshell opened and out came little beings with two legs, two arms, two hands and no feathers. These beings were our ancestors. While they were frightened at first, they spoke to Raven and ate what he brought to feed them. In this way, they became the ancestors of all of us (Kenny, 1994).
This Haida legend is an illustration of the centrality of speech in creating the universe. We see this repeated in every culture, from ‘let there be light’ in the Bible to ‘let it be and it was so’ (كن فيكون) in the Qur’an. These examples illustrate the centrality of speech to making sense of the universe, indeed to creating it in our minds.
In this chapter, we will explore the manner in which we produce speech. We will consider the various stages of speech production and the evidence used to assume their existence. We will organize these stages into a standard model to serve as a basis for our discussion. Then we will explore three prominent speech production models from current psycholinguistic research and see where they agree and disagree in terms of how speech production occurs. The evidence that they use to justify their understanding of speech production will serve as a basis for understanding how researchers employ evidence in constructing psycholinguistic models.
The evidence used by psycholinguists in understanding speech production can be varied and interesting. It includes speech errors, reaction time experiments, neuroimaging, computational modelling, and analysis of patients with language disorders. Until recently, the most prominent set of evidence for understanding how we speak came from speech errors. These are spontaneous mistakes we sometimes make in casual speech. Ordinary speech is far from perfect and we often notice how we slip up. These slips of the tongue can be transcribed and analyzed for broad patterns. The most common method is to collect a large corpus of speech errors by recording all the errors one comes across in daily life.
Perhaps the most famous example of this type of analysis is what are termed ‘Freudian slips.’ Freud (1901/1975) proposed that slips of the tongue were a way to understand repressed thoughts. According to his theories about the subconscious, certain thoughts may be too uncomfortable to be processed by the conscious mind and can be repressed. However, sometimes these unconscious thoughts may surface in dreams and slips of the tongue. Even before Freud, Meringer and Mayer (1895) analysed slips of the tongue (although not in terms of psychoanalysis).
Speech errors can be categorized into a number of subsets in terms of the linguistic units or mechanisms involved. Linguistic units involved in speech errors could be phonemes, syllables, morphemes, words or phrases. The mechanisms of the errors can involve the deletion, substitution, insertion, or blending of these units in some way. Fromkin (1971; 1973) argued that the fact that these errors involve some definable linguistic unit established their mental existence at some level in speech production. We will consider these in more detail in discussing the various stages of speech production.
Error Type | Error | Target |
---|---|---|
Anticipation | leading list | reading list
Perseveration | black bloxes | black boxes |
Exchange | rat pack | pack rat |
Substitution | spencil | stencil |
Deletion | sippery | slippery |
Insertion | sakool | school |
Speech production falls into three broad areas: conceptualization, formulation and articulation (Levelt, 1989). In conceptualization, we determine what to say. This is sometimes known as message-level processing. Then we need to formulate the concepts into linguistic forms. Formulation takes conceptual entities as input and connects them with the relevant words associated with them to build a syntactic, morphological, and phonological structure. This structure is phonetically encoded and articulated, resulting in speech.
During conceptualization, we develop an intention and select relevant information from the internal (memory) or external (stimuli) environment to create an utterance. Very little is known about this level as it is pre-verbal. Levelt (1989) divided this stage into macroplanning and microplanning. Macroplanning is thought to be the elaboration of a communication goal into subgoals and connecting them with the relevant information. Microplanning assigns the correct shape to these pieces of information and decides on the focus of the utterance.
Formulation is divided into lexicalization and syntactic planning. In lexicalization, we select the relevant word-forms, and in syntactic planning we put these together into a sentence. In talking about word-forms, we need to consider the idea of lemmas. A lemma is the basic abstract conceptual form which is the basis for other derivations. For example, break can be considered a lemma which is the basis for other forms such as break, breaks, broke, broken and breaking. Lemma retrieval uses a conceptual structure to retrieve a lemma that makes syntactic properties available for encoding (Kempen & Hoenkamp, 1987). This can specify parameters such as number, tense, and gender. During word-form encoding, the information connected to lemmas is used to access the morphemes and phonemes linked to the word. The reason these two processing levels, lemma retrieval and word-form encoding, are assumed to exist comes from speech errors where words exchange within the same syntactic categories. For example, nouns exchange with nouns and verbs with verbs from different phrases. Bierwisch (1970), Garrett (1975, 1980) and Nooteboom (1967) provide some examples:
We see here that not only do words exchange within syntactic categories, but the function words associated with the exchanges appear to be added after the exchange (as in ‘its’ before ‘tongue’ and ‘the’ before ‘cat’). In contrast to entire words (which exchange across different phrases), segment exchanges usually occur within the same phrase and make no reference to syntactic categories. Garrett (1988) provides an example in “she is a real rack pat” instead of “she is a real pack rat.” In such errors, the segments involved often share phonetic similarities or the same syllable position (Dell, 1984). This suggests that these segments must be operating within some frame such as syllable structure. To state this in broader terms, word exchanges are assumed to occur during lemma retrieval, and segment exchanges occur during word-form encoding.
Putting these basic elements together, Meyer (2000) introduced the ‘Standard Model of Word-form Encoding’ (see Figure 9.2) as a summation of previously proposed speech production models (Dell, 1986; Levelt et al., 1999; Shattuck-Hufnagel, 1979, 1983; Fromkin, 1971, 1973; Garrett, 1975, 1980). The model is not complete in itself but rather a way of understanding the various levels assumed by most psycholinguistic models. The model represents levels for morphemes, segments, and phonetic representations.
We have already seen (in Chapter 3) that morphemes are the smallest units of meaning. A word can be made up of one or more morphemes. Speech errors involving morphemes affect the lemma level or the word-form level (Dell, 1986), as in:
In the first, we see that the morpheme indicating plural number has remained in place while the morphemes for ‘apple’ and ‘pie’ exchanged. This is also seen in the last example. This suggests that the exchange occurred after the parameters for number were set, indicating that lemmas can switch independently of their morphological and phonological representations (which occur further down in speech production).
While speech production models differ in their organisation and storage of segments, we will assume that segments have to be retrieved at some level of speech production. Between 60% and 90% of all speech errors tend to involve segments (Boomer & Laver, 1968; Fromkin, 1971; Nooteboom, 1969; Shattuck-Hufnagel, 1983). However, 10–30% of all speech errors also involve segment sequences (Stemberger, 1983; Shattuck-Hufnagel, 1983). Reaction time experiments have also been employed to justify this level. Roelofs (1999) asked participants to learn a set of word pairs, after which the first word in each pair was presented as a prompt to produce the second word. These test blocks were presented as either homogeneous or heterogeneous phonological forms. In the homogeneous blocks, the words shared onsets or their initial segments differed only in voicing. In the heterogeneous blocks, the initial segments contrasted in voicing and place of articulation. He found that there were priming effects in homogeneous blocks when the targets shared an initial segment but not when all but one feature was shared, suggesting that whole phonological segments, rather than distinctive features, are represented at some level.
The segmental level we just discussed is based on phonemes. The standard understanding of speech is that there must also be a phonetic level that represents the actual articulated speech as opposed to the stored representations of sound. We have already discussed this in Chapter 2 and will expand on it here. For example, in English, there are two realizations of unvoiced stops: the unaspirated [p], [k], and [t] and the aspirated [pʰ], [kʰ], and [tʰ]. This can be seen in the words pit [pʰɪt] and lip [lɪp], where syllable-initial stops are aspirated as a rule. The pronunciation of pit as *[pɪt] doesn’t change the meaning but will sound odd to a native speaker. This shows that /p/ has one phonemic value but two phonetic values: [p] and [pʰ]. Speech production can thus be understood as a move from an abstract level to a concrete one. Having familiarized ourselves with the basic levels of speech production, we can now go on to see how they are realized in actual speech production models.
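To make the aspiration rule just described concrete, here is a minimal sketch in Python. It is purely illustrative, not part of any of the speech production models discussed below, and it makes the simplifying assumption that each character is one phoneme. It maps a syllable’s phonemes onto phones, aspirating a voiceless stop only in syllable-initial position.

```python
# Illustrative sketch only: the English aspiration rule for voiceless stops,
# stated as a phoneme-to-allophone mapping conditioned on syllable position.
# Each character is treated as one phoneme, which is a simplification.
VOICELESS_STOPS = {"p", "t", "k"}

def realize(syllable: str) -> str:
    """Map a syllable's phonemes to phones, aspirating a voiceless stop
    when it occurs in syllable-initial position."""
    phones = []
    for i, phoneme in enumerate(syllable):
        if i == 0 and phoneme in VOICELESS_STOPS:
            phones.append(phoneme + "ʰ")  # aspirated allophone, e.g. [pʰ]
        else:
            phones.append(phoneme)        # unaspirated elsewhere
    return "".join(phones)

print(realize("pɪt"))  # pʰɪt: aspirated syllable-initial stop
print(realize("lɪp"))  # lɪp: the coda stop stays unaspirated
```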
Figure 9.2 The Standard Model of Speech Production
The Standard Model of Word-form Encoding as described by Meyer (2000), illustrated using the example word “tiger”. From top to bottom, the five levels are the conceptual, lemma, morpheme, phoneme, and phonetic levels.
[Return to place in text (Figure 9.2)]
Speech error analysis has been used as the basis for the model developed by Dell (1986, 1988). Dell’s spreading activation model (as seen in Figure 9.3) has features that are informed by the nature of speech errors that respect syllable position constraints. This is based on the observation that when segmental speech errors occur, they usually involve exchanges between onsets, peaks, or codas but rarely between different syllable positions. Dell (1986) states that word-forms are represented in a lexical network composed of nodes that represent morphemes, segments, and features. These nodes are connected by weighted bidirectional links.
As seen in Figure 9.3, when the morpheme node is activated, activation spreads through the lexical network, with each node transmitting a proportion of its activation to its direct neighbour(s). The morpheme is mapped onto its associated segments with the highest level of activation. The selected segments are encoded for particular syllable positions, which can then be slotted into a syllable frame. This means that the /p/ phoneme encoded for syllable onset is stored separately from the /p/ phoneme encoded for syllable coda position. This also accounts for the phonetic level: instead of having two separate levels for segments (phonological and phonetic), there is only one segmental level, in which the onset /p/ is stored with its characteristic aspiration as [pʰ] and the coda /p/ is stored in its unaspirated form [p]. Although this means that segments need to be stored twice, once for onset and once for coda positions, it simplifies the syllabification process, as the segments automatically slot into their respective positions. Dell’s model ensures the preservation of syllable constraints in that onset phonemes can only fit into onset slots in the syllable template (the same being true for peaks and codas). The model also has an implicit competition between phonemes that belong to the same syllable position, and this explains tongue-twisters such as the following:
In these examples, speakers are assumed to make errors because of competition between segments that share the same syllable position. As seen in Figure 9.3, Dell (1988) proposes a word-shape header node that contains the CV specifications for the word-form. This node activates the segment nodes one after the other. This is supported by the serial effects seen in implicit priming studies (Meyer, 1990, 1991) as well as by findings on the influence of phonological similarity on semantic substitution errors (Dell & Reich, 1981). For example, the model assumes that semantic errors (errors based on shared meaning) arise in lemma nodes. The word cat shares more segments with a target such as mat (the nucleus /æ/ and the coda /t/) than with sap (only the nucleus /æ/). Therefore, the lemma node of mat will have a higher activation level than the one for sap, creating the opportunity for a substitution error. In addition, feedback from morpheme nodes leads to a bias towards producing word rather than nonword errors. The model also takes into account the effect of speech rate on error probability (Dell, 1986) and the frequency distribution of anticipation, perseveration, and transposition errors (Nooteboom, 1969). The model accounts for differences between these error types by having an in-built bias for anticipation: activation spreads through time, so upcoming words receive activation (at a lower level than the current target). Speech rate also influences errors because higher speech rates may not give nodes enough time to reach a specified level of activation, leading to more errors.
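Two of the mechanisms just described, proportional spreading of activation and position-coded segments slotting into a syllable frame, can be sketched in a few lines of code. The following Python fragment is only an illustration of the general idea, not Dell’s actual simulation: the network, the spreading rate, and the competitor boost are invented values, and real implementations include decay, multiple time steps, and feedback.

```python
# Toy sketch of two ideas from Dell-style spreading activation:
# (1) a morpheme node passes a fixed proportion of its activation to its
#     connected segment nodes, and
# (2) segments are coded for syllable position, so the most active
#     onset/peak/coda candidate fills each slot of a CVC frame.
# All numbers are invented for illustration.

SPREAD_RATE = 0.5  # proportion of a node's activation passed to each neighbour

# Lexical network: morpheme node -> position-coded segment nodes (weight 1.0).
network = {
    "cat": {("onset", "k"): 1.0, ("peak", "æ"): 1.0, ("coda", "t"): 1.0},
    "mat": {("onset", "m"): 1.0, ("peak", "æ"): 1.0, ("coda", "t"): 1.0},
}

def spread(target, competitor, competitor_boost=0.3):
    """Activate the target morpheme (and, weakly, a competitor) and return
    the most active segment for each slot of a CVC syllable frame."""
    activation = {}
    for word, strength in ((target, 1.0), (competitor, competitor_boost)):
        for node, weight in network[word].items():
            activation[node] = activation.get(node, 0.0) + strength * weight * SPREAD_RATE
    frame = {}
    for slot in ("onset", "peak", "coda"):
        candidates = {seg: act for (pos, seg), act in activation.items() if pos == slot}
        frame[slot] = max(candidates, key=candidates.get)
    return frame

# The shared peak /æ/ and coda /t/ receive activation from both words,
# which is the kind of overlap that makes 'mat' a likely substitution for 'cat'.
print(spread("cat", "mat"))  # {'onset': 'k', 'peak': 'æ', 'coda': 't'}
```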
While the Dell model has considerable support for its architecture, there have been criticisms. The main evidence for the model, speech errors, has itself been questioned as a useful source of evidence for informing speech production models (Cutler, 1981). For instance, the listener might misinterpret the units involved in an error and may have a bias towards locating errors at the beginning of words (accounting for the large number of word-onset errors). Evidence for the CV header node is limited, as segment insertions usually create clusters only when the target word also has a cluster, and CV similarities are not found for peaks.
The model also has an issue with storage and retrieval, as segments need to be stored for each syllable position. For example, the /l/ in English needs to be stored as [l] for syllable onset, [ɫ] for coda, and [ḷ] when it appears as a syllabic consonant in the peak (as in bottle). However, while this may seem redundant and inefficient, recent calculations of storage costs based on information theory by Ramoo and Olson (2021) suggest that the Dell model may actually be more storage efficient than previously thought. They suggest that one of the main inefficiencies of the model arises during syllabification across word and morpheme boundaries. During the production of connected speech or polymorphemic words, segments from one morpheme or word will move to another (Chomsky & Halle, 1968; Selkirk, 1984; Levelt, 1989). For example, when we say “walk away” /wɔk.ə.weɪ/, we produce [wɔ.kə.weɪ], where the /k/ moves from the coda of one syllable to the onset of the next. As the Dell model codes segments for syllable position, it may not be possible for such segments to move from coda to onset position during resyllabification. These and other limitations have led researchers such as Levelt (1989) and his colleagues (Meyer, 1992; Roelofs, 2000) to propose a new model based on reaction time experiments.
The Levelt, Roelofs, and Meyer (LRM) model is one of the most popular models of speech production in psycholinguistics. It is also one of the most comprehensive in that it takes into account all stages from conceptualization to articulation (Levelt et al., 1999). The model is based on reaction time data from naming experiments and is a top-down model in which information flows from more abstract levels to more concrete stages. The Word-form Encoding by Activation and VERification (WEAVER) model is the computational implementation of the LRM model developed by Roelofs (1992, 1996, 1997a, 1997b, 1998, 1999). It is a spreading activation model inspired by Dell’s (1986) ideas about word-form encoding. It accounts for the syllable frequency effect and ambiguous syllable priming data (although the computational implementation has been more successful in illustrating syllable frequency effects than priming effects).
As we can see in Figure 9.4, the lemma node is connected to segment nodes. These connections are specified for serial position, and the segments are not coded for syllable position. Indeed, the only syllabic information stored in this model consists of syllable templates that indicate the stress pattern of each word (which syllables are stressed and which are not). These templates are used during speech production to syllabify the segments using the principle of onset maximization: all segments that can legally go into a syllable onset in the language are put into the onset, and the leftover segments go into the coda. This kind of syllabification during production accounts for resyllabification (which is a problem for the Dell model). The model also has a mental syllabary, which is hypothesized to contain the articulatory programs used to plan articulation.
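The principle of onset maximization described above can be illustrated with a short procedural sketch. The following Python fragment is not the WEAVER implementation; the segment inventory and the set of legal onsets are toy assumptions chosen only to make the “walk away” example from the previous section work.

```python
# Illustrative sketch of onset-maximization syllabification at production time.
# NOT the WEAVER implementation; the vowel set and legal onsets are toy values.
VOWELS = {"ɔ", "ə", "eɪ"}
LEGAL_ONSETS = {("w",), ("k",)}  # permissible onsets in this toy grammar

def syllabify(segments):
    """Group a flat list of segments into syllables, giving each vowel the
    longest legal onset available (onset maximization); remaining consonants
    stay in the preceding coda."""
    vowel_indices = [i for i, s in enumerate(segments) if s in VOWELS]
    syllables, start = [], 0
    for n, v in enumerate(vowel_indices):
        if n + 1 < len(vowel_indices):
            next_v = vowel_indices[n + 1]
            boundary = next_v
            # Pull consonants into the next syllable's onset for as long as
            # the resulting onset cluster is still legal.
            while boundary > v + 1 and tuple(segments[boundary - 1:next_v]) in LEGAL_ONSETS:
                boundary -= 1
        else:
            boundary = len(segments)
        syllables.append(segments[start:boundary])
        start = boundary
    return syllables

# "walk away": the /k/ resyllabifies from the coda of the first syllable
# into the onset of the second.
print(syllabify(["w", "ɔ", "k", "ə", "w", "eɪ"]))
# [['w', 'ɔ'], ['k', 'ə'], ['w', 'eɪ']]
```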
The model is interesting in that syllabification is only computed at the time of production. Phonemes are defined within the lexicon with regard to their serial position in the word or lemma. This allows resyllabification across morpheme and word boundaries without any difficulty. Roelofs and Meyer (1998) investigated whether syllable structures are stored in the mental lexicon. They employed an implicit priming paradigm in which participants produced one word out of a set of words in rapid succession. The words were either homogeneous (all words had the same word onsets) or heterogeneous. They found that priming depended on the targets having the same number of syllables and the same stress pattern but not the same syllable structure. This led them to conclude that syllable structure is not a stored component of speech production but is computed during speech (Cholin et al., 2004). Costa and Sebastian-Gallés (1998) employed a picture-word interference paradigm to investigate this further. They asked participants to name a picture while a word was presented 150 ms later. They found that participants were faster to name a picture when it shared the same syllable structure with the word. These results challenge the view that syllable structure is absent as an abstract encoding within the lexicon. The Lexicon with Syllable Structure (LEWISS) model has challenged the LRM model’s assumptions on this point.
Proposed by Romani et al. (2011), the Lexicon with Syllable Structure (LEWISS) model explores the possibility of stored syllable structure in phonological encoding. As seen in Figure 9.5, the organisation of segments in this model is based on a syllable structure framework (similar to proposals by Selkirk, 1982; Cairns & Feinstein, 1982). However, unlike in the Dell model, the segments are not coded for syllable position. The syllable structural hierarchy is composed of syllable constituent nodes (onset, peak, and coda), with the connections having different weights based on their relative positions. This means that the peak (the most important part of a syllable) has a more strongly weighted connection than onsets and codas. Within onsets and codas, the core positions are more strongly weighted than satellite positions. This weighting reflects positional variation in speech errors: onsets and codas are more vulnerable to errors than vowels (peaks), and within onsets and codas, satellite positions are more vulnerable than core positions. For example, in a word like print, the /r/ and /n/ in onset and coda satellite positions are more likely to be involved in errors than the /p/ and /t/, which occupy core positions. The main evidence for the LEWISS model comes from the speech errors of aphasic patients (Romani et al., 2011). It was observed that not only did these patients produce errors that affected the different syllable positions at different rates, but they also preserved the syllable structure of their targets even when making speech errors.
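The weighting of syllable constituent positions in LEWISS can also be pictured as a small data structure. The sketch below illustrates the idea for the word print only: the numeric weights are invented and are not taken from Romani et al. (2011); the code simply orders positions from most to least vulnerable, matching the pattern described above.

```python
# Toy sketch of a LEWISS-style stored syllable frame for "print".
# Segments hang off onset/peak/coda constituent positions, and each position
# carries a weight reflecting how strongly it is bound to the frame
# (peak > core > satellite). Weights are invented for illustration only.
POSITION_WEIGHTS = {
    "peak": 1.0,             # most strongly bound, least error-prone
    "onset_core": 0.8,
    "coda_core": 0.8,
    "onset_satellite": 0.5,  # most vulnerable positions
    "coda_satellite": 0.5,
}

print_frame = {
    "onset_core": "p",
    "onset_satellite": "r",
    "peak": "ɪ",
    "coda_satellite": "n",
    "coda_core": "t",
}

def error_vulnerability(frame):
    """Rank segments from most to least vulnerable: a lower positional weight
    means weaker binding to the frame and a higher predicted error rate."""
    return sorted(frame.items(), key=lambda item: POSITION_WEIGHTS[item[0]])

for position, segment in error_vulnerability(print_frame):
    print(f"/{segment}/ in {position}: weight {POSITION_WEIGHTS[position]}")
# /r/ and /n/ (satellites) come out as most vulnerable, /ɪ/ (peak) as least.
```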
In terms of syllabification, the LEWISS model syllabifies only at morpheme and word edges instead of having to syllabify the entire utterance each time it is produced. The evidence from speech errors supports the idea of syllable position constraints. While Romani et al. (2011) presented data from Italian, speech error analysis in Spanish also supports this view (García-Albea et al., 1989). The Spanish evidence is also interesting in that the errors are mostly word-medial rather than word-initial, as is the case in English (Shattuck-Hufnagel, 1987, 1992). Stemberger (1990) hypothesised that structural frames for CV structure encoding may be compatible with the phonological systems proposed by Clements and Keyser (1983) and Goldsmith (1990). This was supported by speech errors from German and Swedish (Stemberger, 1984), although such patterns were not observed in English. Costa and Sebastian-Gallés (1998) found that primed picture naming was facilitated by primes that shared CV structure with the targets, and Sevald, Dell and Cole (1995) found similar effects in repeated pronunciation tasks in English. Romani et al. (2011) brought these ideas to the fore with their analysis of speech errors made by Italian aphasic and apraxic patients, who performed repetition, reading, and picture-naming tasks. Both groups of patients produced errors that targeted vulnerable syllable positions such as onset and coda satellites, consistent with previous findings (Den Ouden, 2002). A large proportion of the errors also preserved the syllable structure of the target, as noted in previous work (Wilshire, 2002). Romani and Calabrese (1996) had earlier found that Italian patients replaced geminates with heterosyllabic clusters rather than homosyllabic clusters: for example, /ʤi.raf.fa/ became /ʤi.rar.fa/ rather than /ʤi.ra.fra/, preserving the original syllable structure of the target. While the Dell model, with its segments coded for syllable position, can also explain such errors, it cannot account for errors in which segments move from one syllable position to another. More recent computational calculations by Ramoo and Olson (2021) found that resyllabification rates in English and Hindi, as well as storage costs predicted by information theory, do not rule out LEWISS on the grounds of storage and computational costs.
An interactive H5P element has been excluded from this version of the text. You can view it online here:
https://opentextbc.ca/psyclanguage/?p=995#h5p-13
In this chapter, we discussed speech production models and some of the evidence used to support them. The main source of evidence for understanding speech production comes from speech errors. However, this has since been supplemented with evidence from reaction time experiments, computational modelling, and errors made by patients with acquired language disorders. We discussed three major speech production models: those of Dell (1986, 1988), Levelt, Roelofs, and Meyer (1999), and Romani, Galluzzi, Bureca, and Olson (2011).
Key Takeaways
Exercises in Critical Thinking
Bierwisch, M. (1970) Semantics. In J. Lyons (Ed.), New horizons in linguistics (pp. 166-184). Harmondsworth: Pelican.
Boomer, D. S., & Laver, J. D. M. (1968). Slips of the tongue. British Journal of Disorders of Communication, 3, 2-12.
Cholin, J., Schiller, N. O., & Levelt, W. J. M. (2004). The preparation of syllables in speech production. Journal of Memory and Language, 50, 47–61.
Chomsky, N. & Halle, M. (1968). The sound pattern of English. New York: Harper & Row.
Clements, G. N., & Keyser, S. J. (1983). CV phonology (Linguistic Inquiry Monograph Series, No. 9). Cambridge, MA: MIT Press.
Costa, A., & Sebastian-Gallés, N. (1998). Abstract phonological structure in language production: Evidence from Spanish. Journal of Experimental Psychology: Learning, Memory and Cognition, 24, 886-903.
Cutler, A. (1981). Making up materials is a confounded nuisance, or: Will we be able to run any psycholinguistic experiments at all in 1990? Cognition, 10, 65–70.
Dell, G. (1986). A spreading-activation theory of retrieval in speech production. Psychological Review, 93, 283-321.
Dell, G. S., & Reich, P. A. (1981). Stages in sentence production: An analysis of speech error data. Journal of Verbal Learning and Verbal Behavior, 20(6), 611-629. doi:10.1016/S0022-5371(81)90202-4
Dell, G. S. (1984). The representation of serial order in speech: Evidence from the repeated phoneme effect in speech errors. Journal of Experimental Psychology: Learning, Memory and Cognition, 10, 222-233.
Den Ouden, D.B. (2002). Phonology in aphasia: syllables and segments in level-specific deficits. Unpublished doctoral dissertation. Groningen University.
Freud, S. (1975). The psychopathology of everyday life (Trans. A. Tyson). Harmondsworth, UK: Penguin. [Originally published 1901.]
Fromkin, V.A. (1971). The non-anomalous nature of anomalous utterances. Language, 47, 27-52.
Fromkin, V.A. (1973). Introduction. In V.A. Fromkin (Ed.), Speech errors as linguistic evidence (pp. 11-45). The Hague, The Netherlands: Mouton.
García-Albea, J.E., del Viso, S., & Igoa, J.M. (1989). Movement errors and levels of processing in sentence production. Journal of Psycholinguistic Research, 18, 145-161.
Garrett, M.F. (1975). The analysis of sentence production. In G.H. Bower (Ed.), The psychology of learning and motivation (Vol. 9, pp. 133-175). New York: Academic Press.
Garrett, M.F. (1980). Levels of processing in sentence production. In B. Butterworth (Ed.), Language production: Vol. 1. Speech and talk (pp. 177-210). New York: Academic Press.
Garrett, M.F. (1988). Processes in language production. In F. J. Newmeyer (Ed.), Linguistics: The Cambridge survey (pp. 69-96). Cambridge, MA: Harvard University Press.
Garrett, M. F. (2001). The Psychology of Speech Errors. In N. J. Smelser & P. B. Baltes (Eds.), International encyclopedia of the social and behavioral sciences (pp. 14864-14870). New York: Elsevier.
Goldsmith, J. (1990). Autosegmental and metrical phonology. Cambridge, MA: Basil Blackwell.
Kempen, G., & Hoenkamp, E. (1987). An incremental procedural language for sentence formulation. Cognitive Science, 11, 201-258.
Kenny, C. (1994). Our legacy: Work and play. Keynote presentation. Proceedings of the Annual conference of the American Association for Music Therapy, “Connections: Integrating our Work and play.”
Levelt, W.J.M., Roelofs, A., & Meyer, A.S. (1999). A theory of lexical access in speech production. Behavioural and Brain Sciences, 22, 1-75.
Levelt, W.J.M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Meringer, R., & Mayer, K. (1895). Versprechen und Verlesen: Eine psychologisch-linguistische Studie [Misspeaking and misreading: A psychological-linguistic study]. Stuttgart: Göschen.
Meyer, A. S. (1990). The time course of phonological encoding in language production: The encoding of successive syllables of a word. Journal of Memory and Language, 29, 524-545.
Meyer, A. S. (1991). The time course of phonological encoding in language production: Phonological encoding inside a syllable. Journal of Memory and Language, 30, 69-89.
Meyer, A. S. (1992). Investigation of phonological encoding through speech error analyses: Achievements, limitations, and alternatives. Cognition, 42, 181-211.
Meyer, A.S. (2000). Form representations in word production. In L.R. Wheeldon (Ed.), Aspects of language production (pp. 49-70). East Sussex: Psychology Press.
Nooteboom, S. G. (1967). Some regularities in phonemic speech errors. IPO Annual Progress Report, 2, 65-70.
Nooteboom, S.G. (1969). The tongue slips into patterns. In A.G. Sciarone, A.J. van Essen, & A.A. Van Raad (Eds.), Nomen: Leyden studies in linguistics and phonetics (pp. 114-132). The Hague, The Netherlands: Mouton.
Ramoo, D. & Olson, A. (accepted). Lexeme and speech syllables in English and Hindi: A case for syllable structure. In Lowe, J. & Ghanshyam, S. (Eds.) Trends in South Asian Linguistics, Berlin/New York: Mouton De Gruyter.
Roelofs, A. (1992). A spreading-activation theory of lemma retrieval in speaking. Cognition, 42, 107-142.
Roelofs, A. (1996). Computational models of lemma retrieval. In T. Dijkstra, & K. De Smedt (Eds.), Computational psycholinguistics: AI and connectionist models of human language processing (pp. 308-327). London: Taylor & Francis.
Roelofs, A. (1997a). Syllabification in speech production: Evaluation of WEAVER. Language and Cognitive Processes, 12, 657-693.
Roelofs, A. (1997b). The WEAVER model of word-form encoding in speech production. Cognition, 64, 249-284.
Roelofs, A. (1998). Rightward incrementality in encoding simple phrasal forms in speech production: Verb-particle combinations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 904-921.
Roelofs, A. (1999). Phonological segments and features as planning units in speech production. Language and Cognitive Processes, 14, 173-200.
Roelofs, A. (2000). WEAVER++ and other computational models of lemma retrieval and word-form encoding. In L.R. Wheeldon (Ed.), Aspects of language production (pp. 71-114). East Sussex: Psychology Press.
Romani, C., & Calabrese, A. (1996). On the representation of geminate consonants: Evidence from aphasia. Journal of Neurolinguistics, 9, 219–235.
Romani, C., Galluzzi, C., Bureca, I., & Olson, A. (2011). Effects of syllable structure in aphasic errors: Implications for a new model of speech production. Cognitive Psychology, 62, 151-192.
Selkirk, E. O. (1984). On the major class features and syllable theory. In M. Aronoff & R. Oehrle (Eds.) Language sound structure (pp. 107-136). Cambridge, Mass.: MIT Press.
Shattuck-Hufnagel, S. (1979). Speech errors as evidence for a serial ordering mechanism in sentence production. In W.E. Cooper & E.C.T. Walker (Eds.), Sentence processing (pp. 295-342). Hillsdale, N.J.: Lawrence Erlbaum.
Shattuck-Hufnagel, S. (1983). Sublexical units and suprasegmental structure in speech production planning. In P.F. MacNeilage (Ed.), The production of speech (pp. 109-136). New York: Springer.
Shattuck-Hufnagel, S. (1987). The role of word onset consonants in speech production planning: New evidence from speech error patterns. In E. Keller & M. Gopnik (Eds.), Motor and sensory processing in language (pp. 17–51). Hillsdale, NJ: Erlbaum.
Shattuck-Hufnagel, S. (1992). The role of word structure in segmental serial ordering. Cognition, 42, 213–259.
Stemberger, J. P. (1983). Inflectional malapropisms: Form-based errors in English morphology. Linguistics, 21, 573–602.
Stemberger, J. P. (1984). Structural errors in normal and agrammatic speech. Cognitive Neuropsychology, 1, 281–313.
Stemberger, J. P. (1985). An interactive activation model of language production. In A. W. Ellis (Ed.). Progress in the psychology of language (Vol. 1). London: Erlbaum.
Stemberger, J. P. (1990). Word shape errors in language production. Cognition, 35, 123-157.
Wilshire, C. E. (2002). Where do aphasic phonological errors come from? Evidence from phoneme movement errors in picture naming. Aphasiology, 16, 169–197.
A writing system where each grapheme stands for a consonant with no or minimal representation of vowels.
A writing system where each grapheme stands for a consonant-vowel syllable. More complicated syllables are represented by combining these graphemes.
A medical condition that develops after conception.
A word that modifies a noun or noun phrase.
A word that modifies a verb, adjective, determiner, clause, preposition or sentence.
A consonant that begins as a stop and is released as a fricative.
A language which primarily employs agglutination (sticking morphemes together) in its morphology.
A variant form of a morpheme.
A phoneme that has different variants depending on its environment without changing the meaning of the word.
A writing system where graphemes exist for consonants and vowels.
A consonant produced with the tip of the tongue touching or close to the superior alveolar ridge.
A language that primarily employs helper words and word order to show the relationship between words.
An acquired disorder which affects language processing after brain trauma.
A consonant produced with the articulators approaching each other but not touching.
A speech disorder which mainly manifests itself in difficulties with articulation.
The formation of speech.
A consonant produced with a strong burst of breath.
A stage in language development where the infant experiments with articulation of sounds without any recognizable words.
In artificial intelligence neural networks, an error correction method whereby the output is fed back as an input into the system.
The quality of a vowel determined by the tongue’s horizontal position.
A person who speaks two languages.
The phenomenon of speaking two languages.
A morpheme that can only appear as part of a larger expression.
A region in the frontal lobe of the brain and usually found in the left hemisphere linked to speech production.
A group of words that contain a subject and a predicate within a complex or compound sentence.
The part of the syllable that includes all the consonants that follow the nucleus.
The imparting or sharing of information.
The process of forming a concept or idea.
The process of acquiring an association between a stimulus and a response; or the change in the frequency of a behaviour in response to an input.
A word used to connect clauses or sentences.
A speech sound that is produced with complete or partial closure of the vocal tract.
A word or group of words that can function as a single unit.
A stage in language development where the infant produces cooing sounds.
A natural language that develops from the simplification and mixing of two or more languages. Creoles often emerge among children brought up in an environment where adults speak a pidgin.
A reading disorder where patients substitute semantically similar words.
A consonant produced with the tongue touching the upper teeth.
Forming a new word from an existing word through the addition of a prefix or suffix.
A word that determines the type of reference of a noun or noun group.
A group of disorders originating in childhood with serious impairment in various areas of functioning.
Two adjacent vowels that are within the same syllable.
A reading model that proposes two separate paths for reading aloud; a lexical route and a sub-lexical route.
A disorder which manifests itself in reading difficulties.
Learning one language first and then acquiring another language in childhood.
Electroencephalography (EEG) is an electrophysiological measurement technique used to record electrical activity on the scalp. This activity reflects the electrical activity on the surface of the brain underneath the scalp.
A writing system where graphemes share common elements to represent phonological similarity.
The collective name used to refer to Indigenous peoples in what is now called Canada, distinct from the Métis and Inuit populations.
Functional magnetic resonance imaging or functional MRI is a measurement technique used to detect changes in blood flow within the brain.
The sounds that make up a word.
The creation of the word form during speech production.
A morpheme that can stand on its own without being dependent on other words or morphemes.
An unintentional speech error hypothesized by Sigmund Freud as indicating subconscious feelings.
A consonant produced with the forcing of air through a narrow gap between two articulators.
Another name for a language that employs inflectional morphology.
A linguistic theory that looks at linguistics as the discovery of innate grammatical structures.
A region in the parietal lobe of the brain linked to language and mathematical operations.
A consonant that is phonetically similar to a vowel but functions as a consonant. Also known as a semivowel.
A consonant produced using the glottis.
The smallest unit of representation in a writing system.
The hypothesized system that converts graphemes into phonemes during reading aloud.
The quality of a vowel determined by the tongue’s vertical position.
Reproducing observed behaviour.
A language that is native to a region.
A family of languages spoken in Europe, the Iranian Plateau, and the Indian subcontinent. This language family includes English, French, Spanish, Hindi, and Persian.
The process of changing word meaning through affixation and vowel change.
A model for representing memory in artificial intelligence networks. Usually comprised of three levels of interacting components of increasing complexity.
The group of culturally similar Indigenous peoples who traditionally lived in the Arctic regions of Canada, Greenland, and Alaska.
Words that appear in utterances alone and demarcated by moments of silence on either end.
A language that mostly has isolated morphemes as words with no inflectional morphology.
Syllabaries used to write Japanese phonological units.
Chinese characters adopted to write Japanese.
The language acquired before the end of the critical language acquisition period.
The language acquired after the end of the critical language acquisition period. Usually results in an inability to acquire native fluency without great effort.
A consonant produced with one or both lips.
The process of acquiring a language.
A hypothesized mechanism that is an instinctive mental capacity to acquire language.
The phenomenon whereby some permanent change is made in the features and uses of language across time.
A group of languages that are related by common descent from an ancestral language.
The way humans use language to communicate and how it is processed and comprehended.
Learning one language first and then acquiring another language in adolescence or adulthood.
The form of a word or set of words.
A damaged or abnormally changed tissue caused by disease or trauma.
In the dual-route model, the pathway that processes whole words.
The process of developing a word for production.
A theory about language that explores the nature of language and attempts to resolve some of the fundamental questions about it.
A consonant that is either a lateral approximant or rhotic.
A speech recognition model that uses units called logogens to explain word comprehension.
A written character that represents a word or morpheme.
The configuration and interaction of the articulators when producing phonemes.
The concept associated with a word.
The scientific study of processing speech on cognitive tasks.
A distinct Indigenous group in Canada. Their ancestors were French and Scottish men who migrated to Canada in the 17th and 18th centuries to work in the fur trade and who had children with First Nations women and then formed new communities (Definition source: Pulling Together: Foundations Guide, "Glossary of Terms." Licensed under a CC BY-NC 4.0 licence.).
A pair of words in a language that differ in only one phonological element (such as a phoneme).
A pure vowel sound.
The smallest unit of meaning in a language.
A method of classifying languages based on their common methods for modifying morphemes.
The concept that words are movable and not bound to a particular position in a sentence.
A person who speaks a number of languages.
The phenomenon of speaking many languages.
A component of event related potentials that is a negative peak around 400 milliseconds after the onset of a stimulus.
A consonant produced with the nasal passage open along with the oral tract.
A location in a diagram.
In the dual-route model, the pathway that processes words using grapheme-to-phoneme conversion.
Symbols or elements that can be replaced.
A type of morpheme modification that involves modifying the root without sequentially stringing units one after the other.
A word that refers to a thing, a person, a place, an animal, a quality, or an action.
A syntactic unit that has a noun as its head.
The part of the syllable that is mandatory and usually includes vowels and sometimes syllabic consonants. Also called the peak.
A grammatical category that expresses count distinctions (e.g., one versus many).
In subject-prominent languages, a noun that is distinguished from the subject by a transitive verb.
The part of the syllable that includes all the consonants that occur before the nucleus.
A component of event related potentials that is a positive peak around 600 milliseconds after the onset of a stimulus.
A consonant produced with the body of the tongue touching the hard palate.
The part of the syllable that is mandatory and usually includes vowels and sometimes syllabic consonants. Also called the nucleus.
Positron emission tomography is a technique that uses radioactive substances to measure metabolic changes and other physiological changes such as blood flow.
Any distinct speech sound.
The smallest unit of sound in a language. While a phone is not specific to any language, phoneme is language-specific and changing a phoneme would change the meaning of a word within a language.
A branch of linguistics that explores sound production and perception.
A reading disability which mainly affects the reading of novel non-words while preserving the ability to read familiar words.
A formal way of expressing phonological and morphological processes of sound change.
A branch of linguistics that explores the organization of sounds in languages.
A group of words that act as a grammatical unit.
A type of rewrite rule used to define a language’s syntax.
A grammatically simplified communication system that develops between two or more groups with no shared language.
The point of contact between the articulators.
A consonant produced with the airflow blocked before release. Also known as a stop consonant.
A language which has words composed of many morphemes.
Everything in a declarative sentence other than the subject.
An experimental technique in developmental psychology often used to study non-verbal participants (e.g., human infants and animals).
An affix that is placed before the word stem.
A category of words that can express spatial or temporal relations or mark semantic roles.
A phenomenon where exposure to a stimulus influences the response time to a subsequent stimulus.
A word that can stand in for a noun or noun phrase.
The theorized ancestral language of the modern Indo-European language family.
The temporal measure of the time taken between detecting a stimulus and the response to that stimulus.
The use of pre-existing pictograms purely for their sound value. For example, bee-leaf to represent the word “belief”.
The process by which segments that belong to one syllable move to another syllable during morphological changes and connected speech.
A consonant that is classified into a group of consonants similar to the phoneme /r/.
The part of the syllable consisting of the nucleus and coda.
The quality of a vowel determined by whether the lips are rounded when producing it.
The acquisition of a language in addition to one’s L1 (or native) language.
A consonant that is phonetically similar to a vowel but functions as a consonant. Also known as a glide.
Acquiring two languages at the same time.
A stage in language development where the infant produces individual words.
A fourteen-line poem using a number of formalized rhyming schemes.
An error in the production of speech.
A consonant produced with the airflow blocked before release. Also known as a plosive consonant.
In the dual-route model, the pathway that processes words using grapheme-to-phoneme conversion.
The person or thing about which a statement is made.
An experimental technique where infants are habituated to a stimulus and then suck on an artificial nipple faster when exposed to novel stimuli.
An affix that is placed after the word stem.
A reading disability which mainly affects the ability to recognize whole words while preserving the use of pronunciation rules to read words.
A writing system where graphemes represent entire syllables.
The process of putting individual segments into syllables based on language-specific rules.
The smallest unit of speech.
The structure of the syllable in terms of onset, peak (or nucleus) and coda.
The planning of word order in a sentence.
The study of how words and morphemes combine to create larger phrases and sentences.
A stage in language development where the infant produces sentences without many function words and lacking proper grammar.
A linguistic category that expresses time reference.
Symbols that may appear as the output of grammatical rules.
A representation of a hierarchical structure in graphic form.
A stage in language development where the infant produces two words at a time.
A consonant produced without a strong burst of breath.
A linguistic theory that postulates that a certain number of structural rules are innate to human beings.
A consonant that is produced without a vibration of the vocal cords.
A sound that is not meaningful, such as a cough or throat clearing.
A consonant produced with the back of the tongue touching the soft palate.
A word that conveys action or a state of being.
A syntactic unit composed of at least one verb and its dependents.
The set of words of a language that are acquired by an individual.
A stage in language development where the infant plays with vocalizations.
The articulatory process whereby the vocal cords either vibrate (voiced) or don’t (unvoiced).
A consonant that is produced with a vibration of the vocal cords.
A speech sound that is produced without complete or partial closure of the vocal tract.
A region in the temporal lobe of the brain and usually found in the left hemisphere linked to speech comprehension.
The smallest unit of language that conveys a particular meaning. A word can be made up of one or more morphemes.
The order of words in a sentence.
This page provides a record of edits and changes made to this book since its initial publication. Whenever edits or updates are made in the text, we provide a record and description of those changes here. If the change is minor, the version number increases by 0.01. If the edits involve substantial updates, the version number increases to the next full number.
The files posted for this book always reflect the most recent version. If you find an error in this book, please fill out the Report an Error form.
Version | Date | Change | Details |
---|---|---|---|
1.00 | October 29, 2021 | Book published. | |
1.01 | July 28, 2022 | 13 H5P activities inserted throughout the book. | The Accessibility Statement and For Students: How to Access and Use this Textbook sections were updated to account for these activities. |