William Sealy Gosset and William A. Silverman: Two "Students" of Science
http://www.100md.com
Pediatrics
Pregnancy and Perinatology Branch, Center for Developmental Biology and Perinatal Medicine, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland
ABSTRACT
In 1908, William Sealy Gosset, a chemist in an Irish brewery, published his second article on statistics in Biometrika under the pseudonym "Student." He chose this pseudonym because his company did not allow its scientists to publish confidential data. In the article, Gosset described a procedure to assess population means by using small samples. This was the origin of the "Student's t test." Dr William Silverman (1917–2004), a pioneer neonatologist, also used the pseudonym "Student." He sent thousands of notes, clippings, anecdotes, and quotations to Pediatrics; since 1977 these have appeared as blurbs at the ends of articles with the signature line "Submitted by Student." Both Gosset and Silverman were rigorous students of science. Silverman chose pseudonyms to seek readers' responses to the message rather than the messenger. He also wished that one would remain a perpetual student, ready to say "I don't know," and strive to understand the human side of medicine. This brief article provides a perspective on these 2 "students" of science.
Key Words: ethics, history of medicine, history of statistics, neonatal intensive care, society and medicine
WILLIAM SEALY GOSSET: THE "STUDENT" WHO DEVELOPED THE "t" TEST
In 1899, William Sealy Gosset, a 23-year-old chemistry graduate, took up a job as brewer at Arthur Guinness, Son & Co, Ltd, in Dublin, Ireland. His task was to apply scientific methods to beer processing. Born on June 13, 1876, in Canterbury, England, Gosset was a brilliant student. Because of poor eyesight he decided not to study engineering like his father but to study chemistry and mathematics at New College in Oxford. He graduated in 1899 with a First Class in chemistry.1
To brew a perfect beer, one had to add exact amounts of yeast to the continuously fermenting barley; too little led to incomplete fermentation, and too much led to a bitter taste. Ambient temperature was also an unpredictable variable. Gosset's first task was to count the yeast colonies, for which he learned to use the newly developed hemacytometer. However, he had to overcome the challenge of estimating the quantity of colonies in entire jars based on small samples taken from them. Gosset used his mathematical and statistical skills, not chemistry, to solve this practical problem.
The conceptual basis for Gosset's solution had evolved over 150 years.1–4 Mathematicians knew that observations were prone to errors. As the measurement errors became smaller with improved technology, the hitherto-unrecognized "random error" became apparent, especially in biological measurements. In 1820, Laplace proposed that random errors (deviations of the observed from the predicted) can be plotted; such plots became known as the "normal" or "Gaussian-distribution" curves.3,4
The British mathematician and philosopher Karl Pearson (1857–1936) took the concept of distributions a step further.4 He proposed that all experiments provided only pieces of information regarding a larger, immeasurable, original scatter. He showed that measurements themselves, not just the random errors, had probability distribution properties, which could be described by using 4 parameters, namely, the mean, standard deviation, symmetry, and kurtosis. Pearson proposed the term "parameter," which had a Greek root meaning of "almost measurements."4
Pearson held that if one knew the 4 parameters, one could locate the probability that an observed number will be at a certain location in the population scatter. He proposed a family of skewed distribution curves to describe such scatters.1–4 Pearson was a towering personality. He founded Biometrika, a major statistical journal, the first issue of which appeared in October 1901.
Gosset noticed that his yeast colony counts did not fit any of Pearson's skewed curves but did fit the Poisson model, named for the 19th-century French mathematician Siméon Denis Poisson. In November 1904, Gosset presented a report to the Guinness Board titled "The Application of the ‘Law of Error’ to the Work of the Brewery."1
Student's First Article
Gosset first met Pearson in 1905, and the two became friends. Because Guinness did not allow its scientists to publish articles, Gosset had to negotiate with the company hierarchy for permission. Permission was granted, provided Gosset used a pseudonym and did not divulge any confidential data. Pearson, too, agreed to these conditions; he published Gosset's first article, "On the Error of Counting With a Hemacytometer," under the pseudonym "Student," in Biometrika in February 1907.5 The article explained how the scatters of colony counts were similar to the exponential limit of the binomial distribution.
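The "exponential limit" of the binomial distribution is what is now called the Poisson distribution. The following is a minimal sketch of that limit, not a reconstruction of Gosset's own calculations: the mean count of 4.0 per square and the function names are illustrative assumptions. It compares binomial probabilities with n trials and success probability 4.0/n against Poisson probabilities with the same mean as n grows.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Probability of a count of k when the mean count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_gap(lam, n, k_max=20):
    """Largest pointwise difference between Binomial(n, lam/n) and Poisson(lam)."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )


# Hypothetical mean of 4.0 cells per counting square (not Gosset's data).
for n in (10, 100, 10_000):
    print(n, round(max_gap(4.0, n), 5))
```

The gap shrinks as n increases, which is the sense in which counts of rare, independent events (a cell landing in a given square) follow the Poisson model.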
Developing the t Test
Pearson had strongly argued that only with large samples could one estimate population parameters. Because most researchers cannot obtain large samples, Gosset thought that formal methods ought to be developed by using small samples for estimating population means. He conducted a number of empirical experiments to develop such methods.
In 1 experiment, Gosset prepared 3000 pieces of cardboard, on each of which he wrote 2 sets of data on 3000 "criminals."6 One set of values consisted of heights, and the other of the lengths of the left middle fingers. Gosset shuffled the cards, drew at random 750 samples of 4 cards each, and computed the mean and standard deviation of each sample. Then he obtained the difference between each sample mean and the population mean (n = 3000) and divided the difference by the sample standard deviation to obtain 750 z scores. He plotted the scores as probability functions and discovered that even without knowing any of Pearson's 4 parameters, one could estimate the population mean and the associated error with a degree of certainty.4
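The card experiment can be imitated in a few lines. This is a sketch under stated assumptions, not Gosset's data or procedure in detail: the 3000 "heights" are synthetic values drawn from a normal distribution with a hypothetical mean and spread, and the sample standard deviation uses the divisor n, as in the 1908 paper.

```python
import math
import random
import statistics


def simulate_z_scores(population, n_samples=750, sample_size=4, seed=0):
    """Shuffle the 'cards', cut them into samples, and return one z per sample."""
    rng = random.Random(seed)
    cards = list(population)
    rng.shuffle(cards)
    pop_mean = statistics.fmean(cards)
    z_scores = []
    for i in range(n_samples):
        sample = cards[i * sample_size:(i + 1) * sample_size]
        mean = statistics.fmean(sample)
        s = statistics.pstdev(sample)  # standard deviation with divisor n
        z_scores.append((mean - pop_mean) / s)
    return z_scores


# Synthetic stand-in for the 3000 measured heights (hypothetical values, in cm).
rng = random.Random(1908)
population = [rng.gauss(166.0, 6.5) for _ in range(3000)]
z = simulate_z_scores(population)

# If the population standard deviation were known, |mean - mu| / sigma would
# exceed 1.96 / sqrt(4) in about 5% of samples of size 4.
threshold = 1.96 / math.sqrt(4)
tail_share = sum(abs(v) > threshold for v in z) / len(z)
print(f"share of |z| beyond {threshold:.2f}: {tail_share:.3f}")
```

With the standard deviation estimated from only 4 values, the share beyond the threshold should come out well above 5%. That excess in the tails is what the normal tables miss for small samples and what Gosset's alternative tables were meant to capture.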
Gosset published these results in his second article using the pseudonym "Student" under the title "The Probable Error of a Mean" in Biometrika in March 1908.7 Despite its long, algebraic discourses and mathematical arguments, the article is a classic. It is simple, lucid, and free from jargon, as a few introductory paragraphs from it reveal7:
The usual method of determining the probability that the mean of the population lies within a given distance of the mean of the sample is to assume a normal distribution about the mean of the sample with a standard deviation equal to s/√n, where s is the standard deviation of the sample, and to use the tables of probability.
But as we decrease the number of experiments [sample sizes], the value of the standard deviation found from the sample of experiments becomes itself subject to an increasing error, until judgments used in this way become altogether misleading ...
The aim of the present paper is to determine the point at which we may use the tables of the probability integral in judging of the significance of the mean of a series of experiments, and to furnish alternative tables for use when the numbers of experiments [sample sizes] is too low.
The test that Gosset described in this article became the famous t test. However, the article received little attention for a number of years, until Ronald A. Fisher provided a mathematical proof and showed the practical utility of the t test.1,2,4 Gosset had not called it a t test but used "z" instead to denote the key ratio. Because the convention was to use z for population parameters and t for samples, at Fisher's suggestion Gosset published another set of tables in 1925 for testing the significance of observations from small samples. He used the t ratio: t = z√(n − 1), where n − 1 was the number of degrees of freedom. This table became the "table of Student's t distributions" and the test became the "Student's t test."4,8
For 30 years, Gosset wrote a number of articles on statistics, all attempting to solve practical problems encountered in the brewery. Yet Student's identity remained a secret until his death on October 16, 1937. There were tributes and obituaries in Biometrika,9,10 and some of his friends solicited and obtained a gift from the Guinness company to publish Gosset's collected articles in 1942.8
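The link between Gosset's z and the modern t can be written out in a few steps. The notation here is an assumption for illustration: x̄ is the sample mean, μ the population mean, n the sample size, s the sample standard deviation with divisor n (as in the 1908 article), and σ̂ the now-customary estimate with divisor n − 1.

```latex
\begin{align*}
s^2 &= \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2, &
z &= \frac{\bar{x}-\mu}{s}, \\
\hat{\sigma}^2 &= \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2
               = \frac{n}{n-1}\,s^2, &
t &= \frac{\bar{x}-\mu}{\hat{\sigma}/\sqrt{n}}
   = \frac{(\bar{x}-\mu)\sqrt{n-1}}{s}
   = z\sqrt{n-1}.
\end{align*}
```

For samples of 4, as in the card experiment, this gives t = z√3 ≈ 1.73z, referred to the t distribution with 3 degrees of freedom.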
Gosset was also an avid writer of letters, maintaining regular correspondence with a large circle of friends and scientists. The famous statistician Egon Pearson (Karl Pearson's son) wrote the history of probability statistics and a statistical biography of Gosset based on the latter's correspondence.1,11
WILLIAM A. SILVERMAN (1917–2004): A TEACHER AND A STUDENT
When William Silverman died on December 16, 2004, tributes poured in from around the world, remembering him as a founding father of neonatology, a pioneer, and a highly influential figure in contemporary pediatrics.12,13 Silverman was a voracious reader and, like Gosset, a prolific writer. PubMed lists 67 articles he wrote since 1990 alone, 63 of which were single-author reports. In his writings, Silverman covered a vast array of topics, from statistics and evidence-based medicine to bedside medicine, from social and ethical aspects of medical care to intensive care.14–17 During the last decade of his life, he exhorted clinicians against what he called "therapeutic nihilism" in neonatal intensive care and implored them to consider parental wishes and societal values while treating critically ill neonates. Even in his e-mail he used the pen name "fumer."
Like Gosset, Silverman was also an avid writer of letters. He wrote to the editors of newspapers, magazines, and medical journals and to a wide circle of students, friends, and scientific colleagues, regardless of whether he knew them personally. Also similar to Gosset, Silverman often wrote anonymously. Beginning in 1977, he sent clippings and quotations from newspapers, reflections and notes from scientific articles, annotations from court documents, pieces from personal letters, and materials from obscure sources to Pediatrics. These were printed (and continue to be printed) as blurbs at the end of journal articles with the distinctive signature line "Submitted by Student."
I think that these materials may be worthy of study by students of medical history. Analyses of even a select thousand of these may provide a useful perspective on contemporary medicine, society, and ethics as seen through the eyes of a visionary.
Why did Silverman choose anonymity in an era when many people seek publicity? He explained this in the preface to his book, Where's the Evidence? Controversies in Modern Medicine,18 a monograph based on his columns, Fumes From the Spleen (also written under a pseudonym, "Malcontent").19,20 He felt, as did the famous Anglo-American poet W. H. Auden, that an unsigned work forced the reader to respond to the "reasoning, not to the reasoner."18
Perhaps such caution was not necessary. Despite disagreements with some of his views,21 Silverman was universally respected for his integrity and honesty and admired for his intellectual rigor. Like Gosset, Silverman was a quintessential guru and a perpetual student, striving to learn, as much as to teach, the human side of medicine. A picture in his book Retrolental Fibroplasia: A Modern Parable22 depicts Moses Maimonides holding a sign that reads, "Teach thy tongue to say I do not know and thou shalt progress." A more fitting epitaph for William Silverman cannot be found.
ACKNOWLEDGMENTS
I am indebted to John Sinclair, MD, Gilbert Ira Martin, MD, and Ashish Sen, PhD, for help during the preparation of this manuscript. I thank the respective publishers for permission to reproduce Gosset’s photograph (Fig 1) and portions from the page of his 1908 paper (Fig 2). I offer my sincere appreciation and thanks to Gilbert Ira Martin, MD, and Mrs Ruth Silverman for providing Dr William Silverman’s photograph.
FOOTNOTES
Accepted May 11, 2005.
No conflict of interest declared.
PEDIATRICS (ISSN 0031 4005). Published in the public domain by the American Academy of Pediatrics.
REFERENCES
1. Plackett RL. 'Student': A Statistical Biography of William Sealy Gosset. Oxford, United Kingdom: Clarendon Press; 1990
2. Newman JR, ed. The World of Mathematics: A Small Library of the Literature of Mathematics From A'h-mosé the Scribe to Albert Einstein. Vols 2 and 3. New York, NY: Simon and Schuster; 1956
3. Hald A. A History of Probability and Statistics and Their Applications Before 1750. New York, NY: Wiley Publications; 1990
4. Salsburg D. The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. New York, NY: WH Freeman and Co; 2001
5. Student. On the error of counting with a haemacytometer. Biometrika. 1907;5:351–360
6. Macdonell WR. On criminal anthropometry and the identification of criminals. Biometrika. 1901;1:117–127
7. Student. The probable error of a mean. Biometrika. 1908;6:1–25
8. Gosset WS. "Student"'s Collected Papers. Pearson ES, Wishart J, eds. Cambridge, United Kingdom: Cambridge University Press; 1942
9. McMullen L. "Student" as a man. Biometrika. 1939;30:205–210
10. Pearson ES. "Student" as statistician. Biometrika. 1939;30:210–250
11. Pearson ES. Studies in the history of probability and statistics. Some early correspondence between W. S. Gosset, R. A. Fisher and Karl Pearson, with notes and comments. Biometrika. 1968;55:445–457
12. Oransky I. William Silverman. Lancet. 2005;365:116
13. Chalmers I. Bill Silverman: a personal appreciation. Paediatr Perinat Epidemiol. 2005;19:82–85
14. Silverman WA. Rule-based folly at the moment of birth. Paediatr Perinat Epidemiol. 1998;12:366–369
15. Silverman WA. Suspended judgment. Is the scientific paper a fraud? Control Clin Trials. 1991;12:273–276
16. Silverman WA. Compassion or opportunism? Pediatrics. 2004;113:402–403
17. Silverman WA. Russian roulette in the delivery room. Pediatrics. 2005;115:192–193
18. Silverman WA. Where's the Evidence? Controversies in Modern Medicine. Oxford, United Kingdom: Oxford University Press; 1998
19. Malcontent. Fumes from the spleen. Paediatr Perinat Epidemiol. 1987;1:10–14
20. Malcontent. Fumes from the spleen. Paediatr Perinat Epidemiol. 1987;1:137–138
21. Lorenz JM. Compassion and perplexity. Pediatrics. 2004;113:403–404
22. Silverman WA. Retrolental Fibroplasia: A Modern Parable. New York, NY: Grune & Stratton; 1990
(Tonse N. K. Raju, MD, DCH)