IN MEMORIAM

Leo Breiman

Professor of Statistics, Emeritus

Berkeley

1928 — 2005

Leo Breiman, professor of statistics, at one time a leading probabilist, then and to the end of his life an applied statistician, and in his last 15 years one of the major leaders in machine learning, died on July 5, 2005, at his home in Berkeley after a long battle with cancer.

Leo Breiman was born in New York City on January 27, 1928, the only child of Eastern European immigrants Max and Lena Breiman. His father was a tailor and sewing machine operator, and his mother was a housewife. When Leo was five, his parents brought him to California—first to San Francisco and then to Los Angeles, where he graduated from Roosevelt High School in 1945. He earned his Ph.D. in mathematics from the University of California, Berkeley in 1954 and then was hired to teach probability theory at the University of California, Los Angeles (UCLA).

After this relatively conventional beginning, Leo’s strikingly distinct style began to emerge. While Breiman was in southern California, he helped poor Mexican youth learn English so that they could find employment in their country’s tourism industry. His interest in education led him to run for the Santa Monica School District board, where he ultimately served as president. Breiman took his first sabbatical working for UNESCO and was hired as an educational statistician in Liberia to find out exactly how many students were in the country’s schools. At the time, there were only 50 miles of paved road in the entire country, and many schools were in virtually inaccessible areas of the rainforest. Breiman formed 20 teams to walk throughout Liberia, going into villages, calling children out of schools and then counting them.

After obtaining tenure at UCLA based on a record that included production of what is now known as the Shannon-McMillan-Breiman theorem, a fundamental result in information theory showing that a Shannon lower bound on code length holds when the messages transmitted come from an arbitrary stationary ergodic process, he decided to become an applied statistician and resigned from UCLA. He proceeded to get academia out of his system, at least temporarily, by writing a book, Probability Theory, which has just been reprinted as a Classic of Applied Mathematics.

He spent the next 18 years as a highly successful consultant working with government and industry on various problems: traffic, pollution levels, and other questions of prediction. In the process he began to develop the tree-based methods of classification that became his trademark and arguably his most important contributions to science. Classification is the major focus of machine learning, an area of computer science overlapping heavily with statistics. It deals with the construction of algorithms by which a computer is trained by examples to construct rules for classifying items into one of a number of classes. For instance, automated medical diagnosis can be viewed in this way. In collaboration with colleagues Jerome Friedman and Richard Olshen at Stanford University and Charles Stone at Berkeley, he wrote CART: Classification and Regression Trees. The book and later methods of Breiman and Friedman have become standards in the data mining industry, an enterprise attempting to extract patterns from enormous amounts of complex data.

Upon joining Berkeley's Department of Statistics in 1980, Breiman plunged into moving our rather theoretical department toward modern applications and into training its students and faculty to deal with the new questions posed by the exponential rate of increase in computing power. He became director of the Statistical Laboratory, a then essentially inactive departmental unit founded by J. Neyman in 1938, and turned it, with the help of financing from campus and government sources, into the most sophisticated statistical computing facility in the United States. He was an active member of administrative committees on computing and communications policy. He revitalized our classical multivariate analysis course, bringing in the ideas, questions and techniques of machine learning. Through his stature in the machine learning community, he helped the department make a highly successful joint appointment with the Department of Electrical Engineering and Computer Sciences. His impact on the Department of Statistics has been profound.

Breiman retired from Berkeley in 1993. The years as an emeritus and Professor in the Graduate School were among his most active. He made major contributions to another important machine learning method, Boosting, and developed Random Forests, a method that he viewed as a culmination of his work and that is being developed by his collaborator Adele Cutler. Leo’s achievements were recognized by election to the National Academy of Sciences and the American Academy of Arts and Sciences. He is survived by his wife Mary Lou and his two daughters, Jessica and Rebekah, from a former marriage. His work and the memories of his vivid personality live on.

Peter Bickel

Michael Jordan

John Rice