Language and Genes: A tutorial

The development of spoken language involves the acquisition of a complex system of knowledge that is generally believed to depend on biological systems that are genetically influenced.

Introduction
The universal capacity of humans to acquire and use a complex language system for social interaction is often cited as what makes them unique among animals. Despite the superficial differences across human languages, linguists believe that all human languages conform to a common set of abstract features involving the use of a finite set of words that are systematically combined according to abstract grammatical principles to allow for the expression of a nearly limitless variety of messages. Despite the abstractness of language, human infants and toddlers show a remarkable ability to acquire this complex system rapidly, with little intentional tutoring by their parents.

These features of species specificity, species universality and rapid unaided development have provided the foundations for the claims that language, and most specifically grammar, is innate (Chomsky, 1988; Pinker, 1994). Within such a nativist account, it is claimed that the genome codes for neural systems that provide human infants with knowledge of critical universal features of all human languages. This genetically endowed knowledge allows children to rapidly exploit linguistic experience in order to acquire knowledge of the particular language of their community. An alternative to linguistic nativism comes from those who believe that species specificity, universality and rapid development could arise from an assemblage of biologically and potentially genetically influenced cognitive systems that are important, but not dedicated, to language (Bates, 1994; Elman et al., 1996; Tomasello, 1998). Thus, most contemporary views of language predict that the capacity for language in humans is, at least in part, dependent upon neural systems that are genetically influenced. There are considerable differences, however, with regard to the kinds of hypothesized systems these genes may affect.

It has been within this theoretical context that the research on genetics and language has been conducted. An additional motivation for this research comes from clinical interests in the etiology of impairments of language development. Thus, much of the research on genetics and language pertains to explanations of individual differences in children's development of language.

Heritability of Spoken Language Development and Disorders
Much of the empirical work concerning genes and language has addressed the question of the heritability of individual differences in spoken language using the twin method. The twins in this research were either sampled from the general population, and thus represent the full range of normal language development, or sampled because at least one twin had poor or impaired language development.

Typically developing children
There has been a modest amount of research concerning the heritability of various aspects of language among normally developing twins. Vocabulary is the aspect of language concerned with words and their meanings; in these studies it is measured by estimates of vocabulary size or degree of development. Studies examining the heritability of vocabulary development have typically shown that monozygotic (MZ) twins were more similar to each other with respect to vocabulary development than dizygotic (DZ) twins. Heritability estimates for vocabulary vary considerably, with higher levels of heritability found in older twins than in younger twins (Stromswold, 2001). Among studies of children over 3 years of age, levels of heritability in the range of 0.40-0.60 were typical, whereas values well below 0.40 have been common under the age of three.

Grammar refers to the system that governs the arrangement of words in sentences so that each word conveys its role in the sentence, such as subject or object, and it is this aspect of language that linguistic nativism argues is most likely to be innate. Children systematically acquire grammatical skills, beginning with rudimentary grammatical patterns that progressively approximate the mature adult grammar of their language. The level of grammatical development has been compared among twins in a small number of studies. In most cases, the MZ twins were more similar with respect to grammatical development than the DZ twins. Estimates of heritability varied considerably but were often over 0.30. Thus, the rate of vocabulary and grammar development in normally developing twins is likely to be moderately to strongly influenced by genetic factors. There does not appear to be evidence showing that grammatical ability is under greater genetic influence than vocabulary; however, there are suggestions that measures of language expression yield higher heritabilities than measures of language reception (Young et al., 2001). Furthermore, an analysis of the cross-twin vocabulary-grammar covariance has indicated that the genetic influence on these two aspects of language is likely to be due to the same genes (Dale et al., 2000).
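The logic of comparing MZ and DZ similarity can be illustrated with Falconer's formula, a standard back-of-the-envelope estimate in which heritability is twice the difference between the MZ and DZ twin correlations. The studies summarized here may well have used more elaborate model-fitting methods, so the following is only a minimal sketch of the reasoning. The function name and the correlation values are hypothetical and are not taken from any cited study.

```python
def falconer_estimates(r_mz, r_dz):
    """Split trait variance into three shares from twin correlations.

    r_mz, r_dz: within-pair correlations for MZ and DZ twins.
    Returns (h2, c2, e2): heritability, shared environment,
    and non-shared environment (including measurement error).
    """
    h2 = 2.0 * (r_mz - r_dz)  # MZ twins share about twice the segregating genes of DZ twins
    c2 = 2.0 * r_dz - r_mz    # similarity not explained by genes
    e2 = 1.0 - r_mz           # whatever makes MZ twins differ
    return h2, c2, e2


# Illustrative (made-up) vocabulary correlations for older twins.
h2, c2, e2 = falconer_estimates(r_mz=0.80, r_dz=0.55)
print(f"h2={h2:.2f}  c2={c2:.2f}  e2={e2:.2f}")
```

By construction the three shares sum to 1. The formula assumes, among other things, that MZ and DZ pairs experience equally similar environments.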

Adoption studies provide an alternative method for estimating the extent of a genetic effect on a trait such as language. One large-scale longitudinal adoption study, the Colorado Adoption Project, examined language in adopted children and their biological and adoptive parents and siblings. Heritability estimates for verbal development in these studies were usually found to be above 0.40 and increased to 0.64 when the adopted children reached adolescence (Alarcon et al., 1998; Cardon et al., 1993; Thompson and Plomin, 1988).

Children who are particularly slow in the development of spoken language despite normal hearing, linguistic experience and intellect have been described as having childhood dysphasia or specific language impairment (SLI). Owing to considerable evidence that SLI aggregates in families, several studies have used twin pairs in which at least one twin is affected to examine the heritability of language impairment. It has been estimated from these studies that the concordance rate is around 0.84 for MZ twins and 0.43 for DZ twins (Stromswold, 2001). Some of these studies used quantitative measures of language development, so heritability estimates for poor language development could be computed using the extremes method of DeFries and Fulker, which yields a group differences heritability (h2g). The heritability estimates in these twin pairs are typically above 0.60, particularly when the children with SLI were restricted to those whose language achievement was in the lowest 5% of their age group (Dale et al., 1999). Thus, poor language development appears to have a substantial genetic etiology. The heritability levels in twin pairs with SLI were typically higher than those found for children with normal language status, suggesting that there may be unique sources of genetic etiology for poor language achievement beyond those that produce variability in language development among normal learners.
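The idea behind the DeFries and Fulker extremes method is that co-twins of low-scoring probands should regress back toward the population mean less when they are MZ than when they are DZ. A simplified group-means version of that idea, not the full regression model, can be sketched as follows. The function name, the population mean of 100 and all of the scores are hypothetical values chosen for illustration, not data from the cited studies.

```python
from statistics import mean


def df_group_heritability(mz_pairs, dz_pairs, population_mean):
    """Group-means approximation to DeFries-Fulker h2g.

    mz_pairs, dz_pairs: lists of (proband_score, cotwin_score).
    Each co-twin mean is expressed as a deviation from the
    population mean and scaled by the proband mean deviation,
    then h2g = 2 * (MZ value - DZ value).
    """
    def transformed_cotwin_mean(pairs):
        proband_dev = mean(p for p, _ in pairs) - population_mean
        cotwin_dev = mean(c for _, c in pairs) - population_mean
        return cotwin_dev / proband_dev

    return 2.0 * (transformed_cotwin_mean(mz_pairs)
                  - transformed_cotwin_mean(dz_pairs))


# Made-up language scores; probands were selected for low scores.
mz = [(70, 76), (72, 74), (68, 75)]
dz = [(70, 86), (71, 84), (69, 85)]
h2g = df_group_heritability(mz, dz, population_mean=100)
print(f"h2g = {h2g:.2f}")
```

In this toy data the MZ co-twins stay close to the probands while the DZ co-twins move halfway back to the population mean, which is the pattern that produces a high group heritability.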

Specificity of Genetic Effects on Language and Cognition
Before it is possible to claim that there are genes specifically dedicated to language, it is necessary to show that the genetic influence found for language in these twin and adoption studies is not shared with non-language cognitive skills. It is common to find that language and nonverbal skills are moderately correlated, which could justify the claim that the genetic basis of language actually reflects the genetic influence on more generalized intellectual abilities. Several of the same studies concerned with language heritability have used multivariate methods to determine the extent to which language and non-language traits have shared genetic roots, that is, the extent to which they are genetically correlated. Using both twin and adoptive family sets, these studies have shown very high levels of genetic correlation between verbal and nonverbal skills (Alarcon et al., 1999). One recent study of twins focused on the comorbidity between poor vocabulary and poor nonverbal cognitive development and found evidence in support of a common set of genes influencing both poor nonverbal cognition and poor vocabulary development (Purcell et al., 2001). Thus, there is considerable evidence that argues against genes having language-specific effects. Several studies of young twins, however, indicate that this conclusion may need to be qualified. In these studies, data from twins under 3 years of age provided evidence of unique genetic effects on language (Dale et al., 2000; Price et al., 2000). Furthermore, a longitudinal study has demonstrated a shift from a unique genetic influence on language at around age two to a shared genetic influence on language and non-language development at age three (Young et al., 2001). Thus, there may be a specific genetic contribution to early stages of language development that diminishes at older ages.
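A genetic correlation can be approximated from twin data by extending the MZ-DZ comparison to cross-twin cross-trait correlations, that is, the correlation between one twin's verbal score and the co-twin's nonverbal score. The cited studies used multivariate model fitting, so the following is only a rough sketch of the underlying arithmetic. The function name and all input values are hypothetical.

```python
from math import sqrt


def genetic_correlation(r_mz_cross, r_dz_cross, h2_x, h2_y):
    """Falconer-style estimate of the genetic correlation.

    r_mz_cross, r_dz_cross: cross-twin cross-trait correlations
    (trait x in one twin with trait y in the co-twin).
    h2_x, h2_y: univariate heritabilities of the two traits.
    """
    if h2_x <= 0 or h2_y <= 0:
        raise ValueError("both heritabilities must be positive")
    # Share of the x-y covariance attributable to shared genes.
    genetic_cov = 2.0 * (r_mz_cross - r_dz_cross)
    return genetic_cov / sqrt(h2_x * h2_y)


# Made-up verbal (x) and nonverbal (y) values for illustration.
r_g = genetic_correlation(r_mz_cross=0.45, r_dz_cross=0.25,
                          h2_x=0.50, h2_y=0.40)
print(f"genetic correlation = {r_g:.2f}")
```

A value near 1 would suggest that largely the same genes influence both traits, while a value near 0 would suggest trait-specific genetic effects. Sampling error can push such simple estimates outside the range of -1 to 1.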

Molecular Genetic Studies of Language Disorder
Evidence that several genes or loci can have an impact on language development comes from the large number of genetic syndromes with associated mental retardation. Most, if not all, genetic syndromes of mental retardation involve deficits of language development. For instance, considerable research on the language development of children with Down syndrome has shown that these individuals have greater difficulties with language development than with nonverbal cognitive development (Chapman, 1995). The only mental retardation syndrome that has been described as having preserved language in the context of general intellectual deficits is Williams syndrome (WS), where reports of normal or near-normal development of grammar and speech sound skills are not uncommon (Bellugi et al., 1994). Despite these reports, others have noted that depressed language skills are typical among WS individuals (Bates et al., 2001). The large number of different genetic loci associated with mental retardation attests to the likelihood that there are several means by which genetic factors can perturb the development of the brain in ways that will affect language. If there are genes that have a particular effect on language, it has been hoped that it may be possible to identify them through the study of SLI.

The twin studies cited earlier, showing that SLI is genetically influenced, have provided an impetus to search for genes that have particular or predominant effects on language. This interest was intensified by reports concerning one large multigenerational family (the KE family) with a high rate of speech and language impairment. Some descriptions of the speech and language deficits of the affected members of this kindred have emphasized specific grammatical difficulties (Gopnik, 1990). However, most accounts have also indicated that these family members were impaired with respect to a wide range of speech, language and even nonverbal development (Vargha-Khadem et al., 1995). In particular, the affected family members have been described as having dyspraxia of speech, which is considered to be a motor speech impairment. A genome-wide search of the KE family by researchers at Oxford showed linkage to two markers on the long arm of chromosome 7 in the region of 7q31 (Fisher et al., 1998). The gene whose mutation accounts for the speech and language impairment in this family was identified as FOXP2 (Lai et al., 2001). Sequencing of this gene showed that it contains a forkhead box binding domain and that the mutation results in the protein product having an arginine-to-histidine substitution in this domain. One other case, an unrelated individual with a translocation involving a breakpoint in this gene, was reported by the same laboratory; however, they did not find evidence of association or instances of mutations of the FOXP2 gene among a larger set of SLI probands (Lai et al., 2000; Newbury et al., 2002). Thus, although FOXP2 appears to affect neural systems important for normal speech and language development, the extent to which it can lead to deficits that involve only language is not clear, and it does not appear to account for most cases of SLI.

The same laboratory that located the FOXP2 gene has also performed the only genome-wide scan for SLI in sib-pairs (The SLI Consortium, 2002). This scan resulted in loci on chromosomes 16q and 19q showing linkage to quantitative verbal measures, but it is noteworthy that they did not find linkage to the region of the FOXP2 mutation in 7q31. The chromosome 16q locus was linked to a measure of phonological working memory that requires remembering and repeating nonsense words. Previous work has shown that phonological memory is a sensitive indicator of SLI and may be an important cognitive skill for language development (Ellis Weismer et al., 1999). The language trait linked to chromosome 19q was one that represented the children's ability to express words and sentences.

It is not unreasonable to expect to find linkage of language to additional regions of the genome. Spoken language development has been shown to be very important for successful reading development. A large effort toward identifying genetic linkage to reading and reading disorder has been made during the past two decades, with several replicated linkages found, particularly in the 6p21 region of chromosome 6. Given the close relationship between reading and spoken language, these findings may have important implications for the genetics of language.
