The target for this study was a stratified cluster sample of 6,000 kindergarten children who were monolingual speakers. The sample was stratified by residential setting and clustered according to school building. Rather than choosing a single urban, suburban and rural area, the sample was drawn from various regions of the states of Iowa and Illinois. These regions were centered on large metropolitan areas that will hereafter be referred to as "population centers." Each population center was selected for the ability to contribute an urban sample, with the surrounding areas contributing the suburban and rural samples. Using several population centers reduced the potential bias in participant characteristics associated with a single geographic area. The four selected population centers were Des Moines, Cedar Rapids, Waterloo/Cedar Falls, and the "Quad Cities" that straddle the Mississippi River. The Quad Cities are Davenport, IA, Bettendorf, IA, Moline, IL and Rock Island, IL.

Although Iowa is considered overall to be a rural, farming state, the use of population centers provided the desired urban, suburban and rural residential strata. The selected population centers were the largest in the state. Des Moines, the capital and largest city, had a 1990 city population of 191,000 and a metropolitan area population of 338,000. The second largest city in Iowa is Cedar Rapids, with a 1990 city population of 110,000 and a metropolitan area population of 170,000. Davenport had a 1990 city population of 103,000 and a metropolitan area population of 384,000. Waterloo/Cedar Falls together had a population of 100,388.

In summary, the population centers selected provided a suitable sample for the study of SLI in monolingual English speaking children. The general population in the areas sampled provided the linguistic homogeneity desired to reduce the chances that the identified language deficits were confused with cultural and regional differences.

Strata General Definition
The targeted 6,000 kindergarten children were equally distributed into three residential strata: urban, suburban and rural settings. This stratified sampling was specified by the NIDCD contract, and allowed the sampling of children across a spectrum of living and demographic conditions. To achieve this stratified sampling, the attendance zones of the school buildings from the four population centers were drawn and designated as being predominately urban, suburban or rural. Subsequent to the study, each individual child was assigned to a stratum according to that child's home address, thus allowing for a more accurate assignment of residential strata.

The U.S. Census Bureau specifically defines urban and rural areas; suburban areas, however, are defined by default, relative to the definitions of urban and rural areas (Census of Population and Housing, 1990). Based on the U.S. Census Bureau 1990 definitions, "urban" is defined in terms of territory, population, and housing units, and urban areas are considered to be places of 2,500 or more persons living in incorporated or unincorporated areas included in urbanized areas. An urbanized area comprises one or more places ('central places') and the adjacent densely settled surrounding territory ('urban fringe') that together have a minimum of 50,000 persons. The urban fringe generally consists of contiguous territory having a density of at least 1,000 persons per square mile.

The urban fringe also includes outlying territory of such density, connected to the urban area or fringe, and either within 1.5 road miles of the urban core, or within 5 road miles of the core but separated by water or other undevelopable territory (Census of Population and Housing, 1990).

"Rural" is defined by the U.S. Census Bureau (1990) as territory, population, and housing units not classified as urban. Rural areas may be divided into "places of less than 2,500" and "not in places," a category that is comprised of rural areas outside incorporated and census designated places and the rural portions of extended cities.

A general rule to determine strata for this study was developed by the investigators based on the two variables of population density and distance from the urban center. Areas designated as being "urban" were within 2 miles of the center business district. "Urban" also included areas that were between 2 and 3 miles from the center business district if the population density was 3,000 or more people per square mile. The "suburban" designation was assigned to areas that had a population density greater than 2,000 persons per square mile and that did not qualify as being urban. "Rural" was considered to be areas with a population density less than 2,000 persons per square mile.

Because of the influence of the Mississippi River on the geographic layout of Rock Island, IL, the following definitions of residential strata for that population center were based solely on population density: Urban was considered to be greater than 3,000 people per square mile, suburban was between 2,000 and 3,000 people per square mile, and rural was designated as areas having less than 2,000 people per square mile.
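The two sets of strata definitions above can be illustrated with a minimal sketch. The function name, its parameters, and the handling of boundary values are assumptions made for illustration: the text does not say how an area with exactly 2,000 persons per square mile was classified under the general rule (treated here as rural), and "between 2,000 and 3,000" in the Rock Island rule is read here as inclusive.

```python
def assign_stratum(density, miles_from_center=None, density_only=False):
    """Return 'urban', 'suburban', or 'rural' for an area.

    density: persons per square mile.
    miles_from_center: distance from the center business district;
        not used when density_only is True (the Rock Island rule).
    Boundary handling is an assumption, not taken from the report.
    """
    if density_only:
        # Rock Island: strata based solely on population density.
        if density > 3000:
            return "urban"
        if density >= 2000:
            return "suburban"
        return "rural"

    if miles_from_center is None:
        raise ValueError("distance is required for the general rule")
    # General rule: distance first, then density.
    if miles_from_center <= 2:
        return "urban"
    if miles_from_center <= 3 and density >= 3000:
        return "urban"
    if density > 2000:
        return "suburban"
    return "rural"
```

For example, a dense area (3,500 persons per square mile) lying 5 miles from the center business district would be suburban under the general rule, but urban under the density-only rule used for Rock Island.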

Sampling of Elementary Schools
The method of sampling was a stratified cluster sample of school buildings located in the selected population centers. This sampling was accomplished by first contacting in writing the superintendent of each school district in the selected population centers. A written explanation of the study was accompanied by an invitation to participate in this 2-year study. Receipt of a district superintendent's consent to participate served as permission to contact all of the principals of the school buildings in that district. A total of 41 districts were contacted; 21 (51.22%) superintendents consented to participate, 15 (36.59%) superintendents refused participation, and no response was elicited from 5 (12.19%) districts. It should be noted that only public school districts were sampled; there was no sampling of private schools or children being home schooled.

As was described above, each participating school building was assigned a residential stratum (urban, suburban, rural) based on its attendance zone. Once the individual school buildings were sorted by population center and residential stratum, buildings within each stratum were assigned a number. Using a random number table, buildings were selected to obtain a minimum total sample of 1,000 students in each of the three strata across all population centers. For example, for the testing conducted during Field Year 1, a minimum of 333 children were selected from each of the rural, suburban and urban strata in each of the population centers of Des Moines, Waterloo/Cedar Falls, and the Quad Cities. (Cedar Rapids provided us with primarily an urban sample to supplement the urban sample from the Waterloo/Cedar Falls population center.) This procedure was repeated for Field Year 2. Therefore, because of the random sampling, some school buildings did not participate in this study; some were selected to participate in only one year of the study; and some school buildings were selected to participate in both years of the field testing. Because the population of Iowa does not contain a substantial number of African Americans, this sampling strategy was modified to oversample the urban stratum, since this stratum contained the largest proportion of African Americans.
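The building selection step can be sketched roughly as follows. This is an illustration only: the study used a random number table, for which a seeded pseudo-random shuffle is substituted here, and the function name, the building names, and the enrollment figures are hypothetical.

```python
import random


def select_buildings(buildings, minimum_children, rng):
    """Draw buildings in random order until the enrollment target is met.

    buildings: list of (name, kindergarten_enrollment) pairs for one
        stratum within one population center (hypothetical data).
    rng: a random.Random instance, standing in for the random number table.
    Returns the selected building names and their combined enrollment.
    """
    order = list(buildings)
    rng.shuffle(order)
    selected, total = [], 0
    for name, enrollment in order:
        if total >= minimum_children:
            break
        selected.append(name)
        total += enrollment
    return selected, total


# Hypothetical urban stratum for one population center, Field Year 1.
urban = [("A", 120), ("B", 90), ("C", 150), ("D", 80), ("E", 110)]
chosen, n_children = select_buildings(urban, 333, random.Random(0))
```

Repeating the draw independently for each field year is what allows a given building to be selected in neither year, one year, or both years.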

Table 1 presents the number of children who were sampled over the course of the study as age-eligible participants according to the study site and strata.

Table 1. Distribution of Participants by Strata and Study Site

Center           Rural   Suburban   Urban   Total
Des Moines         655        789     754   2,198
Waterloo/C.R.      888        665     957   2,510
Quad Cities        814        695   1,001   2,510


Screening Phase: Screening Instrument
The screening procedure only involved language performance. Children were not screened for hearing, nonverbal intelligence, or pervasive developmental disorder, the exclusionary criteria for SLI.

A language screening test was developed that had a very high predictive relationship with the diagnostic outcome. The screening tool consisted of 40 items from the "Test of Language Development-2:Primary" (Newcomer & Hammill, 1988). This screening instrument was administered to each child individually and took approximately 10 minutes to complete.

Screening Data Coding
All data were entered directly onto computerized scan forms by the examiner during the screening. The University of Iowa Testing Services department then scanned these forms and transferred the data onto computer disk for analysis. See Chapter IV for the procedures used to assign the screening outcomes of pass or fail.

Diagnostic Phase: Diagnostic Battery
The goal of the diagnostic testing phase was to identify those children who would serve as SLI cases or control subjects. The diagnostic battery included hearing, language, speech, cognitive, and pre-reading tasks, and gross motor observations. Because the examiners who had conducted the screening also administered the diagnostic tests, the children had become familiar with the examiners. Thus, introductory warm-up sessions were unnecessary. Testing was administered in a standardized manner. All children participated individually, and diagnostic testing took approximately 2 hours to complete. The diagnostic battery was completely administered during one testing session; when this was not possible due to scheduling reasons, the testing was completed within a week of its initiation. However, individual tests, such as the TOLD-2:P and WPPSI, were always administered in their entirety during a single session. The order of administration of the diagnostic tests was held constant when possible. Within individual tests, the same presentation order of subtests was maintained across examiners. The examiners provided written comments regarding the testing situation and impressions of the child's performance to supplement the objective measures.

The measures included in the diagnostic battery follow, along with a general discussion of the testing procedures.

Audiometric Testing
Because a hearing loss was an exclusionary criterion for the diagnosis of SLI, audiometric testing was performed. The purpose of the audiometric testing was to determine if the child had a persistent hearing loss that was suggestive of sensorineural or conductive origins. Therefore, both pure tone audiometric screenings and acoustic immittance/impedance audiometry were conducted.

Pure tone screening was conducted at 500 Hz and at 1, 2, and 4 kHz at 20 dB (American Speech-Language-Hearing Association, 1985). If the child failed the pure tone screening in an ear, pure tone thresholds were obtained and a visual inspection of the ear canal was done. Tympanometry was then done with four measures taken: Static Admittance (Y; passing range was .22 to .81); Ear Canal Volume (Vea; passing range was .42 to .97); Tympanometric Width (Gradient; passing range was 59 to 151); and Tympanometric Peak Pressure (TPP; passing range was -139 to +11). If any one of the four measures was failed in an ear, the child was considered to have failed the tympanometry testing for that ear.
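The per-ear tympanometry decision can be expressed as a short sketch. The passing ranges are those listed above; the dictionary keys and function name are illustrative, and treating the range endpoints as passing values is an assumption, since the text does not say whether the limits were inclusive.

```python
# Passing ranges as listed in the text; units are not stated there.
PASSING_RANGES = {
    "static_admittance": (0.22, 0.81),
    "ear_canal_volume": (0.42, 0.97),
    "tympanometric_width": (59, 151),
    "peak_pressure": (-139, 11),
}


def passes_tympanometry(measures):
    """Return True only if all four measures for one ear are in range.

    measures: dict with the same keys as PASSING_RANGES.
    Endpoints are treated as passing (an assumption).
    """
    return all(
        low <= measures[name] <= high
        for name, (low, high) in PASSING_RANGES.items()
    )
```

A single out-of-range measure, such as a peak pressure of -200, is enough for the ear to fail.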

If a child failed the pure tone screening bilaterally, no further procedures were done at that time, and the child was retested, usually after a period of two weeks. If the child failed the pure tone screening unilaterally, the diagnostic testing was continued at that time.

For the children who failed the pure tone testing unilaterally or bilaterally at the first screening, a letter was sent to the parents/guardians to notify them of the potential hearing problem and to suggest the appropriate audiologic or medical follow-up as based on the results of tympanometry (see Follow-up Procedures). If the child failed the second screening bilaterally, that child was disqualified from further testing. Regardless of whether the child failed the second screening unilaterally or bilaterally, a second follow-up letter was sent to the parents to notify them of the audiometric testing results.

Cognitive Testing
Because nonverbal cognitive ability was an exclusionary criterion for the diagnosis of SLI, nonverbal cognitive testing was performed. The Block Design and Picture Completion subtests of the "Wechsler Preschool and Primary Scale of Intelligence-Revised" (WPPSI; Wechsler, 1989) were administered. These subtests were chosen for two reasons: 1) of all the WPPSI performance subtests, these two were reported to be most highly correlated with the full performance scale score (Block Design r = .59, Picture Completion r = .60), and 2) they afford an objective scoring method. These two performance subtests have been reported in the literature as a short form of the WPPSI Performance scale (LoBello, 1991). Further, these are easily administered and scored, and provide important criteria to assure inter-examiner reliability across the multiple field examiners.

The scaled score for each subtest was reported. The two scaled scores were summed, and a score of 16 or greater was selected for passing decisions. This summed score reflects a performance intelligence score greater than 85.
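As a minimal illustration of the passing decision, the two scaled scores are summed and compared with the criterion of 16. The function and constant names are hypothetical.

```python
PASSING_SUM = 16  # criterion stated in the text


def passes_nonverbal_screen(block_design, picture_completion):
    """Sum the two WPPSI scaled scores and compare with the criterion."""
    return block_design + picture_completion >= PASSING_SUM
```

Under this rule a child with scaled scores of 8 and 8 passes, whereas a child with scaled scores of 7 and 8 does not.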

Diagnostic Language Testing
Language testing was conducted to identify the SLI cases and control subjects. Subtests of the "Test of Language Development-2:P" (TOLD-2:P; Newcomer & Hammill, 1988) were supplemented with a narrative story task (Culatta, Page, & Ellis, 1983). These tests were selected because they assessed multiple aspects of comprehension and production, and provided normative data that allowed the calculation of standard scores.

Further, these language measures were easily administered, and enabled development of scoring guidelines for reliable scoring across the multiple field examiners. The 5 TOLD-2:P subtests that were administered were Picture Vocabulary (PV), Oral Vocabulary (OV), Grammatic Understanding (GU), Sentence Imitation (SI), and Grammatic Completion (GC). The Word Articulation (WA) subtest was also administered; however, these results did not contribute to the language diagnosis. The subtests of the TOLD-2:P were administered according to the manual. Detailed scoring guidelines were developed to assure consistency within and across examiners during the field testing. Raw scores for each subtest were converted to standard scores based upon local norms.

Narrative Story Task
The Narrative Story task involved the retelling and comprehension of a short story about a birthday party. The examiner would read the story, and then ask the child to retell it. A maximum of three general recall prompts were provided if the child needed prompting. The reported score was the number of story events mentioned out of a possible 21.

The number-of-events-retold score was supplemented by the examiner's rating of the child's completeness and organization of recall. Completeness of recall was rated on the basis of whether or not all major components of the story were included in the retelling. To be considered complete, the child needed to include (1) Setting and problem (it was the boy's birthday, he wanted a puppy, his mom said "no"); (2) Complicating problem (the boy had a party and received presents, but did not get a puppy); and (3) Resolution (the boy was surprised and he got a puppy).

The organization of recall was rated on the basis of whether or not the main components of the story were retold in the correct sequential order, regardless of whether or not it was complete. Examiners rated organization of recall with a "yes" or "no." If the child retold no events or only one event, the organization of recall rating was scored "not applicable."

Following the child's retelling of the story, the examiner asked the child 10 questions to measure comprehension and memory of the story. The score reported was the total number of questions correctly answered out of a possible 10. The raw scores obtained for events mentioned during the retelling and for the comprehension questions were standardized.

Reading/Reading Readiness Tasks
Reading Readiness was a variable of interest for the risk factor study. However, rather than probe the parents about Reading Readiness (such as asking about reading in the home and the child's exposure to reading), measures were obtained directly from the child. The Letter Identification subtest of the "Woodcock Reading Mastery Tests-Revised" (Woodcock, 1987), the "Word-Sound Deletion Task" (Catts, 1991), and the "Random Animals-Colors Task" (Catts, 1991) were used to measure pre-reading skills.

Letter Identification
The Letter Identification subtest is a measure of Reading Readiness. This test measures the child's ability to identify verbally letters written in several forms and scripts (such as upper or lowercase; roman, italic, and bold types; cursive and printed characters). Administration and scoring of this subtest was in accordance with the manual.

Normative data for each month of the academic year are provided in the manual so that scores obtained from children who are tested later in the year are adjusted to control for learning that has occurred during the year. As specified by the manual, raw scores were first converted into a "W" score. A difference score was then calculated by subtracting from the W score a reference value based on the month that the child was tested. The difference score was then converted into a standard score, which is the reported score.

Reliability information for the Letter Identification subtest shows that the split-half reliability coefficient for Grade 1 is r = .94. The standard error of measurement for the W scores, as reported in the manual for 1st grade scores, is 4 W scale units.

Word-Sound Deletion
The Word-Sound Deletion task (Catts, 1991) is a sound segmentation, phonological awareness task. In this task, the child was required to delete the initial phoneme or syllable from a word and repeat only the remaining phoneme or syllable. Three example items, all compound words, were demonstrated using pictures. For example, the directions were "Say 'baseball'" as pictures of "base" and "ball" were shown. Then, with the first picture covered, the child was asked "Now say 'baseball' without the 'base'." If the child did not respond correctly, the correct answer was provided for the demonstration items.

When the child showed an understanding of the task, the testing began. The directions were the same for each of the 21 test items; however, the pictured stimuli were discontinued. The stimuli consisted of compound words, two-syllable words, and monosyllabic words. The sound sequence remaining as the correct response was always a high frequency word. One repetition of an item was provided if needed, and testing was discontinued when 6 consecutive items were incorrectly answered. The reported score was the raw score, and the maximum raw score was 21.
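The scoring and discontinue rule can be sketched as follows. This is an illustration under the assumption that the raw score is the count of correct items up to the point of discontinuation; the function name and its parameters are hypothetical.

```python
def score_word_sound_deletion(responses, stop_after=6, max_items=21):
    """Return the raw score for a list of booleans (True = correct).

    Administration stops once `stop_after` consecutive items are missed;
    any responses past that point are ignored.
    """
    score = 0
    misses_in_a_row = 0
    for correct in responses[:max_items]:
        if correct:
            score += 1
            misses_in_a_row = 0
        else:
            misses_in_a_row += 1
            if misses_in_a_row == stop_after:
                break
    return score
```

For instance, a child who answers the first 3 items correctly and then misses 6 in a row receives a raw score of 3, regardless of how the remaining items would have been answered.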

Random Animal-Colors
The Random Animal-Colors task measured rapid naming ability, a skill which has been reported to be a measure of phonetic coding ability (Catts, 1991). The Random Animal-Colors task involved showing the child an 11.5" x 17.5" page that contained images of 24 animals. Each of the 24 items was 1 of 3 animals (a pig, a horse, or a cow) colored in 1 of 3 colors (blue, red, or black), with the animal and the color selected at random. These colored animals were arranged in random order in 4 rows of 6 items each. The child was first given as much practice as needed to identify the animals and colors, and several demonstration items were provided to allow the child to practice responding with "adj+noun" responses. If the child did not know his/her colors or animals, this task was not administered. The examiner instructed the child to "name these as fast as you can" in sequential order.

The reported score for this task was the total time required for the child to name all of the colored animals, measured with a stopwatch. Thus, a lower total time score reflects better performance. A tally of incorrect responses was also kept for this task.

The Word-Sound Deletion and Random Animals-Colors tasks were reported by Catts (1991) to be the best combination of predictors of reading group membership for a group of kindergartners that comprised 41 SLI and 30 control subjects. Catts reported that these tasks together correctly classified 82.9% of the children in that study.

Iowa Severity Rating Scale
The Iowa Severity Rating Scale (ISRS; Jeffrey & Freilinger, 1986) was developed for use by speech-language pathologists working in the Iowa Public School system. The purpose of the ISRS measure in this study was to supplement the standardized measures of speech and language with a clinically significant measure, and to obtain an informal measure of voice and fluency.

The ISRS is a 5-point severity rating scale of speech, language, voice and fluency skills. The ratings are on a continuum, where 0 indicates adequate skills and 4 indicates a disorder. There are specified published criteria to guide the rating made by the speech-language pathologist, and these guidelines were used during this study (Jeffrey & Freilinger, 1986). Minimal additional guidelines were established for use during the Field Study because the examiners' subjective impressions were desired. Four guidelines were used:

(1) Articulation: The Word Articulation subtest of the TOLD-2:P was supplemented by the ISRS Articulation rating made by the examiner. A rating of "1" indicated developmental s, r, l problems. Because certain phonemes (initial k, m, v, n) are not sampled by the Word Articulation subtest, there was in some cases a discrepancy between the WA results and articulation severity rating.

(2) Language: The language severity rating was determined by the speech-language pathologist based on observation of the child in informal interactions as well as during the standardized language testing. In some cases, the TOLD-2:P raw scores were converted to standard scores by the examiners, and this information was considered when making the severity rating for language, as specified by the ISRS manual. If the examiner did not have sufficient information to make a language severity rating, a language sample was elicited. For children with low performance scores on the WPPSI, the language severity rating was made commensurate with cognitive ability.

(3) Voice: The only guidelines provided to supplement the manual were that mild hypernasality was rated as a "2," and moderate hypernasality was rated as a "3."

(4) Fluency: Ratings were made in accordance with the ISRS manual for fluency.

Motor skills
Two gross measures of motor skills were obtained: gait and handedness. The examiner observed the child walking to the examining room, and noted if there were any obvious gross motor problems when walking. The child was also asked to write her/his name, and the examiner made note of which hand the child used.

Diagnostic Data Entry and Diagnosis
The diagnostic data were coded using a data entry program. To minimize coding errors, the data were entered using a two-pass method. Data were first entered and then verified during a second entry of these same data. Thus, data were entered at two different times and usually by two different people. The data entry program also had range checking for data having a minimum low and maximum high value.
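The two-pass method with range checking can be illustrated with a short sketch. The data entry program used in the study is not described in detail, so the function, the field names, and the record layout are hypothetical; the example ranges reflect the score maxima given earlier (21 story events, 10 comprehension questions).

```python
def verify_double_entry(first_pass, second_pass, valid_ranges):
    """Compare two keyings of one record and range-check agreed values.

    first_pass, second_pass: dicts mapping field name to entered value.
    valid_ranges: dict mapping field name to (minimum, maximum).
    Returns a list of (field, problem) tuples; an empty list means the
    record passed both checks.
    """
    problems = []
    for field in sorted(set(first_pass) | set(second_pass)):
        a, b = first_pass.get(field), second_pass.get(field)
        if a != b:
            problems.append((field, f"entries differ: {a!r} vs {b!r}"))
        elif field in valid_ranges:
            low, high = valid_ranges[field]
            if a is None or not (low <= a <= high):
                problems.append((field, f"out of range: {a!r}"))
    return problems


# Hypothetical field names and one keying error on the second pass.
ranges = {"story_events": (0, 21), "comprehension": (0, 10)}
entry_1 = {"story_events": 12, "comprehension": 7}
entry_2 = {"story_events": 21, "comprehension": 7}
flagged = verify_double_entry(entry_1, entry_2, ranges)
```

The two checks catch different errors: the second pass catches keying slips that still yield a plausible value, while range checking catches impossible values even when both entries agree.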

Diagnostic outcomes were determined based on language performance as well as hearing and nonverbal cognitive performance. Based on the diagnostic results, the children were assigned to one of five diagnostic categories: (1) specific language impaired (SLI; failed language testing but passed hearing and nonverbal cognitive testing); (2) control (C; passed all language, hearing, and nonverbal cognitive testing); (3) language impaired (LI; failed the language testing and nonverbal cognitive testing, but passed hearing testing); (4) cognitive failure (CF; passed language and hearing testing, but failed nonverbal cognitive testing); and (5) hearing failure (HF; failed both hearing screenings and did not continue for further testing).
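The assignment of the five categories amounts to a small decision rule, sketched below. The function name and boolean inputs are illustrative; how each pass or fail outcome was itself determined is described in the preceding sections and is not reproduced here.

```python
def diagnostic_category(passed_language, passed_hearing, passed_cognitive):
    """Map three pass/fail outcomes to one of the five category codes.

    Children who failed both hearing screenings did not continue
    testing, so the other two arguments are not consulted for them.
    """
    if not passed_hearing:
        return "HF"
    if passed_language:
        return "C" if passed_cognitive else "CF"
    return "SLI" if passed_cognitive else "LI"
```

Because hearing is checked first, every child who continued testing falls into exactly one of the four remaining categories.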

Development of the Risk Factor Survey
A major research question in this study was concerned with identifying risk factors for SLI. In order to address this question, information was obtained from the parents of SLI children and non-SLI controls regarding a wide range of exposures to potential risk factors. This information was obtained through a telephone survey that lasted approximately 50 minutes.

The candidate risk factors were selected after a literature review of risk factors and language impairment. A pilot survey was developed to obtain information regarding these risk factors, and was administered to 6 parents. Several modifications were made based on the pilot. A data coding manual for the final Risk Factor Survey is contained in Appendix I. The general categories of risk factors selected for this study are:

  1. Parent exposure, prior to the study child's birth, to:
  2. Mother's pregnancy and delivery of the study child:
  3. The study child's health & development
  4. Factors associated with the study child's rearing environment

Risk Factor Survey Administration
The names of the parents who were to receive the Risk Factor Survey were sent from the lab at the University of Iowa to the Statistical Laboratory at Iowa State University (ISU), a subcontractor on this contract. The lab at the University of Iowa contacted these parents by letter to notify them that they had been selected to receive the Risk Factor Survey. The lab sent a list of chemicals for the parents' reference during the telephone call. Trained professional interviewers at the ISU Statistical Laboratory then called the parents to arrange a mutually convenient time to administer the survey. The interviewer usually administered the survey during one call. The parents almost always received the telephone survey prior to receiving any information regarding the outcome of their child.

A supplementary information booklet was developed to aid the telephone interviewers and to assure standardized administration of the survey. This booklet explained terms used in the survey, as well as general instructions for the interviewer, on a question-by-question basis. All interviewers were blind regarding the diagnostic outcome of the study child. 
