Retroactive Moral Judgment and the Evolution of Ethics
in Human Subjects Research: A Case Study in Context
June 17, 2001
[Last revised 20010709, 20010731, 20010816, 20010823, 20021010]
NOW WITH EPILOGUE
(What a difference a month makes!)
July 30, 2001
During the summer of 2001 journalists, university and association spokespersons struck a stance of moral rectitude and outrage as part of, and in response to, a national media blitz attacking a 62-year-old master’s thesis about the onset of stuttering.
Given the fact that the best current and impartial re-evaluation of the study's data (Ambrose and Yairi) suggests that its theory and evaluation of data were flawed, and that the researcher not only could not and did not "cause stuttering," but in fact did no harm:
This paper is neither a defense of, nor another attack upon, the study itself.
The paper puts forward the following analysis:
There are serious and sophisticated issues interwoven in the use of human subjects in research going on right now, such as that involving genetic engineering, use of stem cells, clinical trials of new drugs, human cloning, and use of human subjects in developing countries. There is an active trade in human organs "harvested" from the third world poor. The public, its representatives, and the research community can benefit from the understanding of these issues that can be promoted as a result of rational discussion. Awareness of the evolution of the ethical standards regulating human subjects research can contribute to that understanding.
Even after the creation of ethical standards following World War II there were hundreds, if not thousands, of studies that now seem to have violated those standards. Some involved knowingly doing harm to large numbers of uninformed and non-consenting subjects over long periods of time. Many were done by highly regarded professionals in prestigious institutions, sometimes with government funding. There is a substantial body of literature dealing with this history.
To the extent that history is thought relevant to today's understanding, those are the studies that should be one’s focus.
By contrast, the 1939 stuttering study involved no harm known in advance of the study, a handful of subjects, with consent, over a very short time. It occurred before the post-WWII standards were in place – and yet compares very favorably with post-standards studies. It was not a major, federally-funded study. It was a little master’s thesis experiment.
It may be that the sole purpose of the media attention was to trash the reputation of a decent and defenseless man now 36 years dead. Because otherwise, to focus on the details and ethics of this 1939 study today is bizarre, pointless and counterproductive as well as unethical, possibly illegal, and grossly unfair.
Mystery still surrounds the question of why the original journalist, and his newspaper, would (a) seek out such a study as “news” in the first place, (b) give it the supermarket tabloid-style, emotion-laden, dramatic treatment they did, (c) fail to present it in the context of human subject research ethics generally, and (d) endeavor to trash the stellar reputation of the highly regarded supervisor.
Selecting this study for such detailed analysis and publicity (a) contributes nothing to an understanding of the complexities of the issues that confront human subjects research ethics today, their resolution, or historical evolution, (b) provides no additional information to the millions of today’s stutterers about the resources available for their treatment, and (c) fails to increase public understanding of the body of stuttering research generally.
Notwithstanding the pointlessness of doing so, it is, of course, nonetheless possible to examine this 1939 study if one wishes. Individuals can differ in their evaluations of its various aspects, including its ethics, judged either by the standards of its time or the standards of today. Hopefully such evaluations will be factually informed, placed in context, balanced and rational.
But to speak of it in terms of moral outrage is fundamentally unfair, truly unjustified and outrageous. Such attacks are indicative either of gross ignorance of the history of human subjects research or an unbecoming willingness to cast undeserved moral aspersions upon others regardless of the facts.
1. As a former university vice president for research has said, the study “was well within the norms of the time.” He’s right. It was fully approved, even encouraged, by everyone remotely related to it at the time. That does not mean the facts of the study are beyond inquiry, or even the criticism derived from hindsight. If even one subject of the study sustained long-term harm, actually caused by the study, that is deeply regrettable. But it does mean that, when all the circumstances are examined, a civilized sense of decency and fairness precludes hurling moral judgments at the individual researcher and supervisor involved.
2. In fact, to position the study in the evolution of human subjects research ethics results in its looking very good indeed. This 1939 study was done before any ethical standards were in place. And yet the researchers imposed on themselves many of those of today.
3. Where it shines most dramatically is when it is compared with the studies that followed it from WWII to the present, after standards were in place – but widely misunderstood or ignored. Many of those studies had dramatic physical and ethical consequences going far beyond anything remotely close to this little master’s thesis.
4. “Everybody’s doing it” does not make "it" right. But if moral castigation is to be the order of the day, if Pew's “civic journalism” is also to be civil journalism, if meaningful public understanding is the goal, then at least some context is necessary to any story. And if the story is that the applicable ethical standards of today were not applied in the past, to focus on this 62-year-old master’s thesis misses over 99 percent of the studies that this story ought to have been about.
5. Finally, some of those casting the unwarranted aspersions, insisting that such a study would never be done today, are on pretty shaky ground. There are still a great many unresolved ethical issues, and continuing violations, in human research land.
These are some of the themes explored in the material that follows and its accompanying notes.
One large issue raised by the attack on the study involves the super-sensitivity those stories and comments reflect. They are indicative of something that is a serious issue today: the prevention of beneficial human research that needs to be done.
Here are two examples.
Schizophrenia is a serious disease that is the subject of great research interest. Indeed, Nobel prizes recently have been awarded for research contributions. The problem? Useful studies require the use of humans with schizophrenia.
So? So one of the most basic ethical requirements of human subject research is that the subjects give “informed consent” to their participation. What does “consent” mean to someone with paranoia or schizophrenia? Is it possible for them to consent?
If not, is the only option that the research not be done? Because without such research there are enormous limitations to the improvements that can be provided in their treatment and care. If the concern for their best interests is genuine, is that served by prohibiting any additional research?
The same issue arises for persons in prisons. Because of past abuses in human subject research with prisoners there are now severe restrictions in place on such research. But those who care about improving the mental and other health conditions of prisoners need the knowledge that only such research can provide.
Thus, it turns out that the same super-sensitivity regarding the ethics of human subject research that irrationally and unfairly castigates the morality of those conducting studies a half-century ago, when entirely different standards prevailed, is also inhibiting studies that need to be done today.
The defender of another study says its critics have displayed a "loss of common sense" in, among other things, comparing it to "Nazi human experimentation," which he considered "illogical and inflammatory, and [has] the potential of doing great harm." The observation is almost directly applicable to the criticisms of the stuttering study, which was characterized by journalists as "the monster study," and also involved a comparison of the researcher and supervisor to Nazis and Timothy McVeigh.
What is “human subject research”?
As the phrase suggests, it is research that has gone beyond the laboratory, beyond the testing that can be done with animals, and needs to involve humans if it is to be continued at all.
A common example would be the testing of a new pharmaceutical product. Under current law, before a new drug can be sold to the public the manufacturer must demonstrate that it won’t do serious harm, that it will actually alleviate whatever condition it’s designed to cure, and that its side effects are known and communicated. This usually requires that the drug be tested on humans.
Those who take the drug as a part of that test are called human subjects.
Over time, as with many practices such as the death penalty, majority thinking has shifted regarding human subjects research.
For example, issues involved in the participation of populations about which there was less concern in the early part of the last century, such as adult prisoners and institutionalized children, now receive a heightened level of sensitivity. (The analog today is the disparity in the ethical standards applied by U.S. researchers to human subjects who reside in North America and those living in developing countries.)
The earlier willingness to leave ethical considerations to the experts doing the research has given way to a demand for oversight and regulations.
There has been much progress over the last 70 years in thinking about, and regulating, human subjects research. Many more regulations and opportunities for review are in place.
Although it is still possible for harm to occur – or for future ethicists to look back and condemn practices widely accepted today – this is much less likely than a century, half-century, or even decade ago.
The research community is rightfully proud of that progress.
At the same time, two things must be said. (1) As with any increased consciousness and sophistication, such as “political correctness,” pendulums have a tendency to swing beyond the mid-point. This may be happening with concerns regarding human subject research. (2) Ethical violations are still taking place, whether measured by the articulated standards of today or the standards that may evolve in the future.
The thinking about such research has gone through an evolution which may be thought of as four phases.
The first includes roughly the first half of the Twentieth Century. During this phase there were few if any human subject research standards. The ethical issues, like the research design itself, were left almost entirely to individual researchers.
The second, and the primary focus of this paper, is the period from World War II through the 1970s. This is the time when abuses began coming to public attention, consciousness was raised, and standards were evolving.
The third might be thought to be the 1980s, when the new standards were finally in place and applied.
And the fourth might be the last five years or so, when concerns and procedures have been at their most intricate and intense – and some would say self-defeating – stage so far.
Phases III and IV are dealt with only in passing.
How did we progress from the Wild West of human subject research in the early 20th Century to where we are today?
It would be comforting to report that the improvements have come about from the heightened ethical sophistication of the research community.
Unfortunately, throughout these phases a reading of history suggests that research universities and other institutions have not been the driving force for increased regulation. Indeed, they have often joined with industry to fight it.
Some practices that would be universally thought unethical today triggered no one’s conscience then. If ethical problems were even perceived at all, sometimes there might have been a good old boy wink at practices that, in retrospect, might be considered somewhere between questionable and criminal.
The pattern of ethical evolution seems to be pretty consistent in the cases that have made their way into the media and the history books.
1. Professionals’ possible abuses occur and are ignored by colleagues and administrators alike.
2. Practices of the time, thought to be abuses (as distinguished from those 62 years earlier), come to the attention of the news media and are publicized.
3. Once the media exposure occurs, administrators respond. They jump on any available moral high horse and gallop off in all directions, disavowing knowledge, disowning their colleagues whether alive or dead, denying that such things would ever occur today, and shouting apologies over their shoulders.
4. Following the media attention, the professionals who kept the research to themselves, thinking that the public “just wouldn’t understand,” discover the public thinks it does understand.
5. The public makes clear it doesn’t like what’s been going on, and demands its elected representatives “do something.”
6. Legislative and regulatory bodies hold hearings in response to the public outcry.
7. Before the proposed remedies can be enacted they are watered down as a result of lobbying pressure from industry and research institutions.
Let’s test these conclusions in the context of the ethical standards for human subject research that have evolved during “Phase II.”
Following World War II a review of experiments conducted by German researchers resulted in what came to be called the Nuremberg Code of 1947. It provided that human subject research should only involve subjects who give informed consent and volunteer to participate.
This code is significant because it marks the beginning of Phase II. The first time such standards were ever set forth in an international agreement was 1947. (This is, of course, after the 1939 study.)
This was followed by the World Medical Association's Declaration of Helsinki in 1964, most recently revised in 1996. It spelled out some additional requirements, such as the suggestion that laboratory and animal research should precede human subject research.
It is worth noting, however, that even today these most rigorous standards do not forbid the taking of risks in human subject research. The necessary finding is simply that “risks to subjects are reasonable relative to anticipated benefits . . . and the importance of the knowledge that may reasonably be expected to result.”
Bear in mind that the examples that follow are not studies done by some crazed Frankenstein-like individuals, working alone in secret laboratories outside the realm and control of respectable research institutions or any other controls.
These are studies done by well-educated, accomplished and respected professionals. Most were reviewed in some fashion, and funded, by other professionals and institutions. They were often published in academic journals.
Few or no questions of their propriety were raised by anyone along the way.
Note also that, with the exception of the 1939 study, all were done after the ethical standards and regulations were in place that would seem to have been violated by one or more aspects of the study.
The classic study illustrating the ethical problems in human subject research is what has come to be known as the Tuskegee syphilis study.
It was designed and conducted by highly educated, professional physicians of the federal Public Health Service.
Over 400 African-Americans with syphilis were recruited. Not only was there no informed consent by them, they were affirmatively misinformed that they would receive “special free treatment.” They were not informed of the nature of their disease or that the research would offer them no therapeutic benefit.
Their complications got worse. Their death rate became twice that of the control subjects. Yet the study continued.
Even after penicillin became available, and was known to be effective in the treatment of syphilis, the men were neither informed of this nor treated.
When outside doctors diagnosed a subject as having the disease, researchers intervened to prevent treatment.
The existence of this study is well known by public and research professionals alike. What is not so widely known is that the Tuskegee study continued until 1973!
Thus, the Tuskegee experiment continued long after the Nuremberg Code and Helsinki Declaration were known and widely discussed.
How could this be?
Perhaps it was because the study was unknown to the research community.
No, that can’t be the explanation. There was nothing secret about it. It was widely reported in medical journals over a period of 40 years.
One of today’s administrative protections of subjects’ rights is the institutional review board, or IRB. A researcher’s colleagues must review and approve a human subject study and find that it complies with current administrative regulations, institutional procedures and ethical standards.
Perhaps that was the problem. There just was no IRB at that time.
No, that can’t be the answer either. Earlier versions of an IRB were in place. The Tuskegee study was periodically reviewed and approved by Public Health Service officials and medical societies. And since the PHS was a federal agency, presumably there was Congressional oversight and acquiescence as well.
Today such agencies and institutions would have detailed regulations in place. Maybe the PHS didn’t have any at that time.
No, that can’t be it either. The Public Health Service Policy for the Protection of Human Subjects became effective six or more years before the study was stopped.
What finally stopped the Tuskegee study? Obviously it was not the ethical concerns of a research community clearly willing to continue operations in violation of known standards.
No, this study was stopped only after it was brought to public attention by a journalist.
It was the media attention to the Tuskegee study that prodded Congress into holding hearings on human subject research standards.
Even then, after all the revelations, Senator Ted Kennedy’s bill to create a National Human Experimentation Board, as recommended by the Tuskegee Syphilis Study Ad Hoc Panel, was defeated. The hope for oversight of all federally funded research was shattered.
The compromise, the National Research Act of 1974, required regulations governing only the Department of Health, Education and Welfare. And even those regulations were watered down, leaving the grantee institutions free to regulate themselves through their self-appointed institutional review boards.
The subsequent Belmont Report, spelling out more ethical standards (“respect, beneficence and justice”), did not appear until 1979. The Department of Health and Human Services (DHHS) regulations based upon that report became available in 1981.
Other governmental agencies did not sign on until 10 years later. The DHHS regulations were formally adopted by over a dozen agencies in 1991 and are now referred to as the “Common Rule.” And even this set of rules provides for six categories of exemptions.
Is the Tuskegee story merely one unfortunate aberration in an otherwise stellar record of substantial and ethical accomplishment by research institutions? Unfortunately not.
Here are some additional examples from Phase II.
From 1946 to 1956, 19 boys who thought they were part of a science club were, without their consent or knowledge, drinking radioactive milk provided to them by researchers from Harvard and MIT.
The Calculated Risk of Atomic Bomb Radiation Exposure
Radioactive milk is one thing. But in 1949 the Atomic Energy Commission studied the question of whether the fallout from its atomic bomb tests could threaten the viability of life on earth. Apparently it concluded the risk was worth it, because the tests continued – including those that involved what it conceded was a “calculated risk” of radiation exposure to populations living downwind from the tests.
Doctors’ Patients as “Subjects”: The Thalidomide Babies
Until the 1960s pharmaceutical companies paid doctors who were willing to use their uninformed patients for human subject research. Participating doctors were provided free samples by the drug companies and were required to keep records of the consequences for their patients and to provide those results to the companies. The practice was so common and widespread that, even today, few speak of it.
In fact, the practice continues today in the guise of so-called “clinical trials” in academic medical centers. Most subjects have provided some form of informed consent (thereby relieving the institutions of legal liability), but the medicines are still free, doctors are still compensated by the pharmaceutical houses in a variety of ways, and the risks still fall on the patients.
In the 1960s there was no law that required drugs be tested before marketing. Companies didn’t have to show anyone that the products they were selling were even safe, let alone useful for the conditions for which prescribed.
In a sense, we were then an entire nation of uninformed, non-consenting human subjects – all for the profit of drug companies and doctors.
Injury, disease and death still result -- even after the required testing. In August 2001 it was reported that some 81 persons using cholesterol-lowering drugs had died from muscle cell degeneration. Philip J. Hilts, "Drug's Problems Raise Questions on Warnings," New York Times, August 21, 2001, reports that doctors often ignore warnings regarding usage and side effects. As few as 5 percent were found to be conducting the essential monthly liver tests of their patients.
And there is one sense in which every doctor's patient is a kind of "human subject" -- especially the patients of young doctors -- even if there is no clinical trial or other experiment as such. There are changes in procedures as well as pharmaceuticals. Good doctors want to keep up with both. But that means, necessarily, that there are increased risks involved while the doctor is perfecting the new techniques. This is especially true for specialists practicing outside of major research hospitals, for whom there may be relatively few cases on which to hone one's skills.
Why did the lucrative practice of paying doctors to test drugs on patients stop?
One of the 1950s tests involved a sedative from Germany called thalidomide. It was given to pregnant women to control sleep and nausea.
Unfortunately, however useful as a sedative, one of thalidomide’s nasty side effects is that it causes missing or deformed limbs and other severe deformities in fetuses. As a result, the human subjects in this research project, almost all of whom were in Europe, gave birth to some 12,000 so-called thalidomide babies who provided dramatic “visuals” for national television.
Following this media attention, as is often the case, Congressional hearings were held.
Notwithstanding the dramatic attention, however, the industry and research community were successful in weakening the legislation. Informed consent would be required, but “the best judgment of the doctors involved” would control whether consent was “feasible” or “in the best interests of the patient.”
With little or no thanks to the pharmaceutical industry or medical profession the law now authorizes the FDA to insist on the safety and efficacy of new drugs.
Even today, this law is still being attacked by industry on the grounds it is delaying the time it takes to get drugs to patients – and experiments with thalidomide continue, though hopefully not on pregnant women.
In 1962, the U.S. Army addressed human subject research issues with regard to its atomic, chemical and biological warfare experiments on soldiers and others. But with the urging of non-military consultants it expressly excluded “clinical research” involving military personnel from its standards.
By 2002 the Defense Department released more than two dozen reports of previously classified exercises from 1962 through 1973 involving the deliberate exposure of U.S. troops to chemical and biological weapons -- without the consent, or even knowledge, of the subjects. The agents used, "some of the most poisonous in the arsenal," included VX, sarin, soman, tabun, and Bacillus globigii (related to anthrax). As of 2002 the Department was trying to track down some 5500 known subjects. ["Defense Dept. Offers Details of Toxic Tests Done in Secret," New York Times, Oct. 10, 2002.]
As late as 1963 doctors in a New York hospital were deliberately injecting live cancer cells into subjects. The chief investigator was a physician from the Sloan-Kettering Cancer Research Institute. The study was reviewed and approved by the hospital’s medical director. There was no documentation of the subjects’ consent, nor were they informed what was being done to them.
After the study was revealed there were no immediate repercussions for the hospital, Sloan-Kettering, the university involved, or the PHS.
Such professional concern as did exist focused not so much on the ethics of the project as on the possible adverse impact of public knowledge on the continued funding of such research and the possibilities of legal liability.
The Chimpanzee’s Kidney Experiment and NIH Review
The same year a Tulane University doctor performed an unsuccessful transplant of a kidney from a chimpanzee into a human being. The procedure promised no benefit to the recipient or new scientific knowledge. It was funded by the National Institutes of Health after layers of bureaucratic evaluation and approval.
After thorough NIH review of “research protocols and procedures” the recommendation was for no changes whatsoever. The agency was concerned that, if it had standards, they might “inhibit, delay or distort the carrying out of clinical research.” It was simply “not in a position to shape the educational foundations of medical ethics.”
From 1956 to 1972 a New York University doctor led a study team at the Willowbrook State School for the Retarded in New York. The team wished to study hepatitis.
The children who were subjects were fed extracts of stools from infected individuals.
Did their parents consent? In theory yes, in reality no.
For starters, the consent form seemed to suggest that the children were going to receive a vaccine to protect against the virus rather than be deliberately infected with the disease.
Moreover, Willowbrook told parents it was overcrowded and unable to take more inmates – unless parents would consent to their children becoming a part of the study, in which case there was plenty of room.
Finally, the study could have as easily been done with children who already had the disease rather than infecting those who did not.
Was the doctor on a frolic of his own? No. The study was reviewed, approved and funded by the Armed Forces Epidemiological Board. It was further reviewed and approved by the executive faculty of the NYU School of Medicine.
NASA came up with an informed consent policy in 1968. However, it provides that the requirement can be waived in a number of circumstances, including when obtaining consent would seriously interfere with the research.
LSD From the CIA
It was not until 1975 that Congressional hearings brought to public attention the human subject research projects of the CIA and Defense Department.
The agencies wanted to know the extent to which it was possible to control human behavior through the use of psychoactive drugs, such as LSD, mescaline, and other chemical, biological and psychological means including radiation. The subjects used in these experiments had not given informed consent, and some died.
The project’s code name was MKULTRA, and it involved at least 150 individually reviewed, approved and funded projects conducted by presumably reputable research scientists.
The CIA director ordered all records of the studies deliberately destroyed in 1973.
One could go on with these examples.
Someone who tried to bring attention to such questionable studies was a researcher named Henry Beecher. He spoke at a convention of science journalists in 1965. He cited 22 examples of research with potentially serious ethical violations that he had found in published reports in medical journals.
In other words, apparently the authors were either oblivious to the ethical standards they had violated or they simply didn’t care.
Rather than distance himself from such abuses, however, Beecher was candid enough to acknowledge that “in years gone by work in my laboratory could have been criticized.”
His paper was rejected for publication by the Journal of the American Medical Association (JAMA).
One can draw a great many conclusions from these examples. But it is not necessary to pass judgment upon the individuals involved to conclude that the mere existence of institutions, regulations and ethical standards has not always proven adequate to protect the rights of human subjects.
Indeed, the evolution of human morality with regard to any category of issues is usually a very slow process.
A 1994 Department of Energy advisory committee report contains an historical account of Public Health Service employees’ site visits to research institutions some years ago. Those visits “revealed a wide range of compliance . . . confusion about how to assess risks and benefits, refusal by some researchers to cooperate with the [PHS] policy, and in many cases, indifference by those charged with administering research and its rules at local institutions.”
Since 1998 the NIH has shut down research programs at eight prestigious institutions for a variety of ethical violations -- including the September 1999 death of a human subject in a gene therapy study who, it is alleged, was not adequately informed of the risks.
As recently as October 2000 the NIH was still sufficiently concerned about researchers’ lack of knowledge, understanding and compliance with human research standards that it began to require proof of the education of researchers about the standards before studies are funded and undertaken.
Defenders of the 1939 master’s thesis experiment argue, and understandably so, that it is both unfair and analytically unsound to judge it by the standards that evolved over the period of Phase II.
Standards change over time with regard to many aspects of human behavior.
Even if we’re willing to take the judgmental personal risk of applying today’s standards to the human subject research of others in the 1920s and 1930s, we might first want to consider the consequences.
Because to apply today’s standards to yesterday’s research will mean that, for the next few years, professional societies, research universities and other institutions will be doing little other than issuing apologies to the experimental subjects of that time – if not writing checks for billions of dollars.
Indeed, one journalist has already seriously suggested they should be doing just that.
In the case of the 1939 stuttering study, even by the standards of Phase II (whether measured by the actual practices of that period or by its later ethical ideals), it is not clear that this small, brief student research project was that far out of line.
This is not to say that the stuttering study would be done today, nor that it did not violate, then, the standards we apply now – or perhaps should have applied then.
It is only to say that what this college student did 62 years ago pales by comparison with what was proposed, approved, funded and carried out by some of the nation’s top research professionals, institutions and government agencies both at the time and long after ethical standards were in place. However much it fairly may be subjected to rational analysis and criticism, it is both unproductive and unfair to single it out -- as, by implication, one of the worst -- of all the studies that might be mentioned in a survey of human subjects research ethics.
During the 1920s and 1930s there was little or no prior scientific research or useful books in the library regarding the condition we now call “stuttering.” Some of the techniques then used for treating the condition dated from the Middle Ages, such as cutting tongues.
Scientific research seemed warranted.
The master’s thesis in question was perhaps the first experiment to study the hypothesis that stuttering could be caused by well-meaning parents’ efforts to help a child speak.
Repetition is normal for young children trying to master this incredibly difficult skill we call speech. By “helping” with, say, suggestions that the child stop and start over, or speak more slowly, loving parents quickly teach the child that they do not approve of his or her repetitions. It is this consciousness of repetition, so the hypothesis went, that creates what becomes the abnormal repetition pattern called stuttering.
To test the hypothesis the master’s student went to an orphanage often used for human studies at that time with the approval of everyone involved. The study was scientifically well designed. There were a total of 22 subjects in roughly four groups. Both the control and test groups were made up of both previously stuttering subjects and those with normal speech. Thus, there were only about five or six subjects with normal speech who could potentially have been influenced by the study.
What the student did was to speak to those with normal speech in the language still used today by well-meaning parents to see whether that would increase the subjects’ disfluency. It did. By the rules of science the hypothesis thereby became a theory.
Newspaper reports of the study during 2001 were dramatic and emotionally charged tabloid journalism. They characterized it as a “monster study,” emphasized the participation of “orphans,” and made reference to research by “Nazis.” Those reports gave little or no attention to what it was, specifically, about the study that warranted such an attack.
It is no more analytically sound to defend “the study” than to attack “the study.”
The ethical questions that must be addressed, if a 62-year-old study is to be evaluated at all, are: What are the specific ethical objections that could be raised about this study?
For these purposes it may be unfair to hold this study to the standards of Phase II, since it was done long before they were in place.
But that’s what the study’s critics are doing. So, for the sake of argument, let’s use their unfair and inapplicable standards and pose some questions about this study.
Was there anything unusual about the involvement of children as subjects in this study?
Although today’s standards for child research subjects are quite strict, their participation is still possible at the present time. Indeed, as many as 95 percent of children with cancer are today involved in clinical trials!
Clearly children continued to be a part of many studies during Phase II. Consider, for example, the boys served radioactive milk, or the children infected with hepatitis.
Indeed, since the study involved a test of an hypothesis about the onset of stuttering in children it was necessary that they be involved if the study was to be done at all.
This would not appear to have been the case with the later Phase II experiments, approved as appropriate at the time, such as those involving children’s reaction to radioactive foods or hepatitis. Those studies presumably could have used adults.
So the mere fact that children were involved in the study is not, alone, a basis for adverse moral judgment.
Was there anything inappropriate about involving as subjects the residents of an institution?
Participation by institutionalized individuals, including children, was approved even after standards were in place during Phase II.
For example, the Willowbrook hepatitis study was proposed by a qualified research scientist and approved by the faculty of the NYU School of Medicine, among others. It involved institutionalized children who were mentally retarded. At least the children used in the stuttering study were of normal intelligence.
The study involving the injection of cancer cells used institutionalized adults.
Moreover, in 1939 it was common to involve subjects in the specific institution used by the student who did the stuttering experiment: the Iowa Soldiers’ Orphans’ Home. Many other University of Iowa professors and graduate students used the facility in this way. In fact, one of the stuttering study participants is quoted as saying, referring to other studies, “Every week somebody else from the university would come and start testing us.”
The Iowa State Board of Control, which oversaw the orphanage, encouraged this research, as did, presumably, the university. Permission from the orphanage was required, and was obtained.
No, it’s hard to fault the study for involving institutionalized subjects, even under today's standards, let alone the standards of its time.
Was informed consent not provided?
After standards were in place the Atomic Energy Commission didn’t get informed consent before risking radiation for large populations.
Doctors didn’t get the consent of their unwitting patients testing new drugs.
The Sloan-Kettering doctor didn’t get the consent of those he injected with cancer.
The 1968 NASA standards even permit the waiver of subjects’ informed consent when obtaining it would interfere with the research.
Much of the research in social psychology has required some measure of deception of the one human subject being studied in the group of those otherwise informed. Indeed, one can question the extent to which college students in psychology classes, even today, have provided anything fairly considered "informed consent" when participation in experiments is a requirement of the course.
Of course, the 1939 experiment was not a NASA study. But it may very well have met NASA’s 1968 standard. That is, it would have been a very different if not impossible experiment if the subjects were first told of its nature.
So it’s not clear, judged by the standards of Phase II, that the stuttering study would necessarily have been unethical had there been no consent.
But, in fact, it can be fairly argued that there was informed consent in this case.
Obviously, by definition no researcher could obtain the consent of the parents of orphans. Thus, the only person who could legally give consent on their behalf was the administrator of the orphanage. And all indications are that he did so.
So, for this variety of reasons, it seems inappropriate to criticize the study on grounds that consent was not obtained.
Did the researcher knowingly and deliberately do permanent harm to the subjects?
Injecting cancer or hepatitis into subjects is deliberately doing known harm.
Using LSD on unsuspecting subjects to test its possible utility as a military or intelligence weapon is deliberately doing harm.
Even today, testing the efficacy of new drugs on diseased human subjects by deliberately withholding the remedy from the half of them getting placebos risks a measure of harm -- in the case of young children in Thailand the harm we call AIDS.
If we could know that the researcher and supervisor of the 1939 study knew to a certainty that the result of the experiment would be to turn normal speakers into individuals who would stutter for the rest of their lives, an ethical, even moral, judgment would be warranted. But we don't know that. And every available scrap of evidence suggests exactly the opposite.
The 1939 study involved speaking to children in a manner and with words still used today by millions of well-meaning parents. That was all the study involved.
The hypothesis was that this would increase the child's disfluency. But what might be expected to result from four months of intermittent contact could reasonably, then, be presumed to be only temporary, or at worst something that would promptly respond to therapy. Given the pain the supervisor suffered from his own stuttering, and the reputation he had among those who knew him for extreme kindness and sensitivity -- especially with children -- it is simply inconceivable that there was, before the study, even a known risk, let alone a probability, of certain harm.
Looking back 62 years, with what we now know about stuttering that was then unknown, we can see the study did involve some risk.
But compare this experiment with those the supervisor of this research, who described himself as “a professional white rat,” was subjected to by his professors. As one journalist describes it, he “was hypnotized, psychoanalyzed, prodded with electrodes, and told to sit in cold water to have his tremors recorded. Like Demosthenes, the ancient Greek stutterer, he placed pebbles in his mouth [and] had his dominant arm, the right, placed in a cast to help prove his professor’s controversial ‘cerebral dominance’ theory . . ..”
The passage provides some perspective as to the acceptable range of human subject experimentation at that time, the passion the supervisor brought to a lifetime of stuttering research, the commitment to science, and the rather dramatic contrast between what he was quite willing to endure himself and what was being tested with the master’s thesis.
Most important, this 1939 study involved none of the approved physical contact, nuclear radiation, drug-induced behavior modification, exposure to disease, untested pharmaceuticals or other invasive techniques sometimes used in human subject research after standards were in place.
That, alone, doesn’t make it right. It does make one wonder, however, why it was selected from among all the studies during the past 70 years that might have been chosen to bring media attention to these ethical issues.
How much permanent harm came from this brief experiment?
All that is now available is a sensationalist journalist’s repetition of quotes from the subjects – at least one of whom is trying to build a lawsuit, even though she is quoted as acknowledging that she did not stutter during the 45 years of her marriage.
And, of course, “a correlation is not a cause.”
To the extent any subject suffered permanent harm following the experiment it is not clear how much, if any, of that harm can be traced in a causal way to the study. The orphans involved had many adverse conditions to deal with before, during and after their stay in the orphanage. Some had become stutterers before the study began.
Moreover, and perhaps one of the more serious indictments of the journalist's professional ethics and abilities, the very human subject he selected to highlight was one whose fluency actually improved during the course of the study! Whatever this may indicate regarding the validity of the theory drawn from the data by the researcher, it certainly seriously undercuts the journalist's efforts to trash the reputation of the supervisor because of the harm he did to the subjects.
But let’s assume for the sake of argument that a causal relation could be shown.
If we are to pass moral judgment on the researcher there then remains the additional and considerable question, which the passage of time prevents answering, as to how deliberate or predictable any of this harm was.
This was original research. No one had ever done it before. As discussed above, there was a substantial probability there would be no effect whatsoever on the subjects. The hypothesis, however interesting, might have proven to be totally invalid – like so many research scientists’ hypotheses before and since.
There is reason to know that the injection of cancer or hepatitis is going to cause temporary or permanent harm.
There was no reason to believe that even troublesome temporary, let alone permanent, harm would result from speaking to children in the ways parents do.
As it turned out, the hypothesis was far more insightful, the effect of adult comment far more powerful, than anyone could have dreamed at the time.
It would have been quite reasonable for the researchers to believe, knowing what was then known, that any disfluencies created in the six subjects’ speech during this brief, four-month experiment would quickly disappear.
That they did not disappear in all subjects is certainly regrettable. But it does not automatically follow that it represents a reprehensible moral and ethical lapse. This is true regardless of whether one evaluates it by the norms of Phase I, when it occurred, or by comparing it with numerous studies done after those standards were in place during Phase II.
Recall as well today’s standard with regard to risk. It is not that no risks may be taken. It is that “risks to subjects are reasonable relative to . . . the importance of the knowledge that may reasonably be expected to result.”
Given the millions of stutterers who have benefited from the body of stuttering research, and the millions of children who have not become stutterers because it has been communicated to parents, one could even argue that today’s standard of permissible risk was met.
No, whether one relies upon the standards of Phase I or even Phase IV to argue about the ethics of this study -- and rational debate on that issue is certainly possible even if pointless -- there are enough additional considerations involved to suggest that moral outrage is unjustified, unfair and unproductive.
Was there a way of testing this study's hypothesis without involving children?
The Willowbrook study could have been done without injecting subjects with hepatitis. There were many Willowbrook children who had contracted the disease naturally who could have been used.
There were undoubtedly some times during Phase II when an animal study might have been a preferable alternative to the use of human subjects.
This was not the case with the stuttering study.
Obviously, animal studies are of no use when studying human communication.
And if the focus of the study is on the onset of stuttering in young children it pretty much requires the participation of young children.
The conclusion may be that the dictates of human subjects research ethics prohibit humankind ever finding out what this study revealed. If so, that is a very heavy price to pay.
But even this conclusion makes the point that the study cannot be faulted because it failed to use an obvious alternative methodology that would have been less risky.
Were a large number of subjects affected?
It is regrettable if even one human subject is harmed by a research project.
But the fact is that very few were involved in the 1939 study – especially when compared with the numbers in the Phase II studies.
Tens of thousands were potentially involved in the atomic bomb tests.
Twelve thousand babies were affected by thalidomide.
Only two or three of the stuttering research subjects are even now alleged to have been permanently adversely affected.
Did the experiment continue after the results were known?
The stuttering experiment was a short-lived four-month experiment. Once the hypothesis was tested and proven the study ceased.
Compare this ethical response to what was done in the Tuskegee study over the course of not four months, but forty years.
Clearly the study cannot be faulted on grounds there was a purposeful, continuing, callous abuse of anyone.
Was there any after-study concern for the subjects?
An effort was made to provide recuperative therapy to those subjects whose speech suffered as a result of the study. Judging by the number of subjects who have told the media they suffered no long-term consequences the therapy may well have been helpful.
But this was the dawn of human understanding of stuttering. Therapies that would be routine today were simply unknown at the time. It is not clear that there was, then, anything more that could have been done than what was done.
Nonetheless, one can argue in hindsight that additional recuperative therapy should have been provided anyway – if for no other reason than to remove any possible question regarding the researcher’s desire to be helpful.
There was far more after-study concern and care of the subjects than has been provided in many human subject experiments since.
Were the results not published and the data destroyed?
Some media reported that the results were never published. Standing alone this is so misleading as to be false.
The thesis was bound, given to the library, cataloged, and made available to the public. There was no effort to suppress it.
Few master’s theses are commercially published or reprinted in academic journals. They are simply put in academic libraries. That is what was done with this one. It is apparently true that the study was not referred to very much if at all in subsequent academic articles. But that is also the fate of much scholarship.
It is not customary to save all the research data from a master’s thesis. But it is certainly inaccurate to suggest that the data in this study was “destroyed.” Indeed, some media reports indicate that much if not all of it was saved in this instance by the student who did the study.
This behavior – making the study available and saving the research data – especially after the concerns the researcher and supervisor must have had over the results, compares very well to the actions of the CIA director. As mentioned above, he maintained secrecy regarding the agency’s human subject research and then deliberately ordered destroyed the records of the agency’s 150 LSD studies.
The point of this discussion, as indicated at the outset, is not to say that there was nothing wrong with this study.
It is not to say that no aspect of it should ever be criticized merely because other studies, even later studies, can be found that are much worse.
It is only to say that it is both unfair and unproductive to take a dusty 62-year-old master’s thesis off the shelf, put it under an ethical microscope, and then subject the researcher, who is still alive, and the supervisor, who has been dead for 36 years, to a national torrent of moral outrage.
Reflect upon the history of Phase II and the hundreds or thousands of studies that must have been conceived, reviewed and approved, funded and carried out by academic research institutions during the last half of the 20th Century -- up until today. Obviously, many prominent and reputable academic researchers, institutions, and granting agencies believed that those studies were defensible after ethical standards were in place.
It is a real stretch to single out for moral judgment this 1939 master’s thesis, conceived and carried out long before any such standards existed.
As a former university vice president for research has said, the study "was fully within the norms of the time."
That being the case, why would an officer of a professional association that includes speech pathologists want to say -- not incidentally of the person most responsible for the creation of that association -- that the research "cannot be justified on theoretical, moral or ethical grounds and represented a serious error of judgment"?
Why would a current university administrator want to be quoted as saying, “This is not a study that should ever be considered defensible in any era. In no way would I ever think of defending this study. In no way. It’s more than unfortunate.”
It may be “more than unfortunate” that the media has brought it to national attention. But if it was so indefensible, if apologies are so ethically necessary, why was none of this said and done when the study was spread across a local newspaper in that university's town years ago? Or described in an academic journal years before that? Or in a novel two years ago?
Is this but one more example of a professional association and research institution responding more to the public relations demands of a negative national media blitz than to genuine concerns about the ethical issues of human subject research – including issues about what’s going on today, but unknown to the public?
Such comments, and the tabloid-style journalism that preceded and provoked them, are deliberate efforts that have the effect of removing the study from meaningful perspective.
But these comments and articles will have zero impact on today’s ethical standards and the administration of human subject research. Indeed, it truly would be shocking if this 62-year-old, Phase I master’s thesis were to raise ethical issues that have still not been addressed in the detailed Phase IV regulations of today.
It also might be worth harming a reputation if the revelations would help millions, thousands, hundreds, or even dozens of people. There are numerous examples in which that is the case involving everything from information about tobacco, asbestos, pharmaceuticals’ side effects, and lead to silicone breast implants, Ford vehicles and Firestone tires.
Today, it is not clear whether there were ever more than one or two individuals permanently adversely affected by the stuttering study.
The question is not whether a mere two people matter. Of course they do.
The question is whether, even as to them, the publication of these articles and administrators’ comments, at this time, does more harm than good.
It’s pointless to speculate as to a journalist’s motives. It does appear that there has been a deliberate effort to dramatize, and emotionalize, a 62-year-old master’s thesis into a national story discrediting the reputation of a man 36 years dead.
This is the stuff for which the law provides a remedy and damages – sometimes defamation, sometimes "false light." It involves the use of a fact here and there to present a damaging and false overall impression of someone.
It is never wise to proclaim one’s own moral superiority. As the Biblical admonition puts it, “Let he who is without sin cast the first stone.” But if it is to be done at all it should at least wait until the lab results are back from one’s last annual moral checkup.
Journalist Heal Thyself
The author of the stories in question was quick to formulate and cast moral opprobrium on the researcher and supervisor. He was considerably slower in coming to an examination of his own ethical lapses. In fact, it appears he never bothered to consider them at all.
Journalistic ethics is not the oxymoron many believe it to be. There is a Society of Professional Journalists which has a “Code of Ethics,” the most recent version of which was adopted in September 1996. There are a number of provisions in this Code that raise issues with regard to the ethics of the reporter’s and newspaper’s handling and promotion of the story about the 1939 master’s thesis.
The Code speaks of goals such as “public enlightenment” from journalism. Journalists have a “duty” to “further those ends by . . . providing a fair and comprehensive account of . . . issues. . . . [and] to serve the public with thoroughness and honesty.”
“Journalists should . . . examine their own cultural values and avoid imposing those values on others.”
They should “show compassion to those who may be affected adversely by news coverage. Recognize that gathering and reporting information may cause harm or discomfort . . . [and] that private people have a greater right to control information about themselves than do public officials . . .. Only an overriding public need can justify intrusion into anyone’s privacy.”
Finally, a standard perhaps more applicable to the editors responsible for a newspaper's promotion, or hype, "Journalists should . . . Make certain that headlines, news teases and promotional material, photos, video, audio, graphics, sound bites and quotations do not misrepresent. They should not oversimplify or highlight incidents out of context."
Consider these admonitions in turn.
The stories detracted from, rather than added to, “public enlightenment” about ethics in human subjects research. Their account of the issues was not fair, nor comprehensive, nor thorough, nor honest.
As noted above, "perhaps one of the more serious indictments of the journalist's professional ethics and abilities, the very human subject he selected to highlight was one whose fluency actually improved during the course of the study! Whatever this may indicate regarding the validity of the theory drawn from the data by the researcher, it certainly seriously undercuts the journalist's efforts to trash the reputation of the supervisor because of the harm he did to the subjects." One would think that a "fair" account, presented with "honesty," would have to have, at a minimum, a measure of factual accuracy.
The stories involved the imposition of the cultural values of the journalist on others – moreover, others who lived and acted in a different time and place, 62 years before the story was written.
There was no demonstration of compassion, only total indifference to the harm or discomfort the stories would cause to the named subjects, the researcher, and the family survivors of the supervisor of the study.
There was no overriding public need that justified this intrusion into the privacy of those individuals.
At least the researcher and supervisor had enough sensitivity to refer to the subjects by number rather than by name in order to avoid “intrusion into anyone’s privacy.” Unfortunately, the journalist chose to ignore both their ethical sense of decency and his own ethical standards in this regard.
As for the ethics of the promotional material, consider this promo in the journalist's paper a couple of days before the series was scheduled to run. The headline blared:
"[The name of the newspaper] Uncovers Secret Experiment to Make Orphans Stutter; Traces Living Legacy of Tormented Children and Haunted Researcher"
The promo began,
"In a chilling investigative series beginning Sunday, the [name of paper] reveals for the first time the complete story of a secret experiment conducted 60 years ago to induce a group of orphans to stutter. The study [was] designed and concealed by [the supervisor] . . .."
Consider the errors and exaggerations. There was nothing to "uncover." The study was not "revealed for the first time." It had never been secret. It was not concealed. The researcher did not "torment" the subjects. "Chilling"? "Haunted"?
As explained above, the study was published and available in the university's library just like any other master's thesis. It has been described in the professional literature and by prior newspapers.
Compare this promotion with the ethical standard. Are these "headlines, news teases and promotional material [that] do not misrepresent; [that do] not oversimplify or highlight incidents out of context"? Or do they (and the series itself) have more in common with sensationalist, tabloid, supermarket scandal sheets? Having read this paper so far, you be the judge.
It is not the point of this paper to pursue whether the journalist may also have violated legal rights referred to as defamation or false light.
But it does seem that he is in a weak position to be questioning the ethics of others, with regard to actions taken by them before the relevant ethical standards existed, when he has violated the ethical standards applicable to his actions today.
We are not looking back on his actions with the benefit of hindsight. We are not judging him with the standards of journalistic ethics at the end of their evolutionary process 62 years from now, in the year 2063. We are using the standards in place at the time he wrote, standards that presumably were well known to him.
Misplaced Administrative Rectitude
The university administrator and association officer quoted above are equally unwavering in their rectitude. “The university today has in place a strict policy and procedures [so that] experiments of this nature [the 1939 stuttering study] cannot happen again,” says one. “Such research is strictly prohibited under [the association’s] Code of Ethics,” says the other.
They have built themselves a very high pulpit from which to cast moral judgments on their predecessors below. Unfortunately, it sits atop a shaky scaffolding from which a fall from grace could be painful.
At a minimum, these executives are inviting exceedingly close examination of their own future actions – not unlike what happened to former Senator Gary Hart after he challenged the media to catch him in a moral slip and they promptly caught him with Donna Rice.
Are they aware how soon it may be coming?
After all the pious proclamations from the study’s critics that we have now evolved into creatures with a heightened ethical sense there is substantial evidence that things still are far from perfect in research land.
It may turn out that we have not yet earned the right to cast moral aspersions on those who have gone 60 years before us, even in this Phase IV period of heightened awareness.
Consider the April 2000 report of the DHHS Office of Inspector General, “Protecting Human Research Subjects.” The report notes the office’s concerns two years earlier: a “call for widespread reform,” “a sense of urgency,” “disturbing inadequacies in IRB oversight of clinical trials,” up to and including “the death of a teenager participating in a gene transfer clinical trial funded by NIH.”
Presumably death could be considered a kind of “permanent harm” at least the equivalent of stuttering.
Notwithstanding these concerns, “few of [the office’s] recommended reforms have been enacted.”
But what may happen to those journalists and other individuals who were so willing to strike a pose of moral superiority, looking down on a 1939 researcher and supervisor, is not the most serious consequence of their attacks.
There are plenty of serious issues involving human subjects research that cry out for additional analysis. The 1939 stuttering study is not one of them. But it does illustrate the harm that can come from such super-sensitivity to ethical issues.
Should we be concerned about the rights of the human subjects in research? Of course.
But we may have carried our concerns to such extremes that today we are both failing to do research that desperately needs to be done, and unfairly casting moral judgments about that which was done a half-century ago.
Most of us hope that history will judge our lives by the standards of our time. We owe Phase I researchers no less.