Introductory Statistics: Saylor

Cover image: singing ringing tree (http://www.flickr.com/photos/mjtmail/5067352864/in/photostream/) by Mark Tighe (http://www.flickr.com/photos/mjtmail/), used under a CC-BY license (http://creativecommons.org/licenses/by/2.0/deed.en_CA)

Description: This book is meant to be a textbook for a standard one-semester introductory statistics course for general education students. Over time the core content of this course has developed into a well-defined body of material that is substantial for a one-semester course. The authors believe that the students in this course are best served by a focus on the core material and not by an exposure to a plethora of peripheral topics. Therefore in writing this book we have sought to present material that comprises fully a central body of knowledge that is defined according to convention, realistic expectation with respect to course duration and students’ maturity level, and our professional judgment and experience.

Author: Douglas S. Shafer, Zhiyi Zhang

Original source: flatworldknowledge.lardbucket.org www.saylor.org/books

Adoption (faculty): Contact us if you are using this textbook in your course

Adaptations: Support for adapting an open textbook

Open Textbook(s):

  1. DOWNLOAD PDF file: Introductory Statistics.pdf (38 MB)
  2. PRINT: Buy a print copy
  3. DOWNLOAD WORD file: Introductory Statistics.docx (36 MB)
  4. DOWNLOAD ZIP file: introductory-statistics_html.zip (29 MB)

Creative Commons License
Introductory Statistics: Saylor by Douglas S. Shafer, Zhiyi Zhang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.


Reviews for 'Introductory Statistics'

Number of reviews: 4
Average Rating: 4.18 out of 5

1. Reviewed by: Leslie Burkholder
  • Institution: University of British Columbia
  • Title/Position: Senior lecturer
  • Overall Rating: 4.4 out of 5
  • Date:
  • License: Creative Commons License

Q: The text covers all areas and ideas of the subject appropriately and provides an effective index and/or glossary

The consensus introductory statistics curriculum is typically presented in three major units: (1) Descriptive statistics and study design (first third of course), (2) Probability and sampling distributions (second third of course), and (3) Statistical inference (final third of course).

This textbook covers all of these topics. Topic 1 is chapters 1 and 2. Topic 2 is chapters 3 through 7. Topic 3 is done in chapters 8 to 14. There are more chapters on the third topic. Inevitably instructors might not use them all. Each chapter comes with plenty of exercises and exercise answers. There is a good index and glossary.

The coverage in each topic is very competent and clear. There is, however, nothing exciting or novel in the manner in which the topics are covered or in the pedagogical approach. Recent trends in teaching introductory statistics have emphasized statistics as a part of scientific investigations, integrating the learning of statistics into the understanding of science. This text does little of that. An emerging trend is to make heavy use of computer simulation and even physical simulation techniques to aid learning. This text does none of that.

Comprehensiveness Rating: 4 out of 5

Q: Content is accurate, error-free and unbiased

Content is very competent, accurate, error-free, and unbiased.

Instructors will find that many of the exercises are US-centric. They may want exercises that are not.

Content Accuracy Rating: 5 out of 5

Q: Content is up-to-date, but not in a way that will quickly make the text obsolete within a short period of time. The text is written and/or arranged in such a way that necessary updates will be relatively easy and straightforward to implement

The content is fairly timeless in its coverage. It is certainly arranged in ways that would make altering it -- for example, to update it or make it less US-centric -- pretty straightforward.

Relevance Rating: 4 out of 5

Q: The text is written in lucid, accessible prose, and provides adequate context for any jargon/technical terminology used

The textbook is very clear. The writing style is quite accessible. Many of our students do not have English as a first language. It doesn't look like the text would present issues for their understanding.

On the other hand BCCampus might consider having the textbook translated into other languages as its contribution.

Clarity Rating: 5 out of 5

Q: The text is internally consistent in terms of terminology and framework

The use of statistics terminology is consistent throughout the text. The organization of material is similar in each chapter.

Consistency Rating: 4 out of 5

Q: The text is easily and readily divisible into smaller reading sections that can be assigned at different points within the course (i.e., enormous blocks of text without subheadings should be avoided). The text should not be overly self-referential, and should be easily reorganized and realigned with various subunits of a course without presenting much disruption to the reader.

The textbook is broken into smaller chunks. It looks like an instructor could skip or reorder sections without there being a problem.

Modularity Rating: 5 out of 5

Q: The topics in the text are presented in a logical, clear fashion

The writing in this text is clear and the organization of material is logical.

Organization Rating: 4 out of 5

Q: The text is free of significant interface issues, including navigation problems, distortion of images/charts, and any other display features that may distract or confuse the reader

The text looks like a professionally published textbook. There isn't color and there aren't images. But in other respects it looks good. There aren't any navigation or user interface issues.

Interface Rating: 5 out of 5

Q: The text contains no grammatical errors

I could find no grammatical or spelling mistakes in this text.

Grammar Rating: 5 out of 5

Q: The text is not culturally insensitive or offensive in any way. It should make use of examples that are inclusive of a variety of races, ethnicities, and backgrounds

The text is not culturally offensive in any way. The examples and exercises are often US-centric. Instructors might want to adapt or modify these parts for a BC or Canadian audience.

Cultural Relevance Rating: 3 out of 5

Q: Are there any other comments you would like to make about this book, for example, its appropriateness in a Canadian context or specific updates you think need to be made?

As previously noted many examples and exercises are US-centric.
There is no investigation of causal studies. This is something some, although not all, introductory statistics courses cover.

2. Reviewed by: Shivanand Balram
  • Institution: Simon Fraser University
  • Title/Position: Senior Lecturer, Spatial Information Science
  • Overall Rating: 3.9 out of 5
  • Date:
  • License: Creative Commons License

Q: The text covers all areas and ideas of the subject appropriately and provides an effective index and/or glossary

Most introductory statistics texts use the logical structure of descriptive statistics, probability, and inferential statistics to deliver the materials to new students. This Introductory Statistics textbook by Shafer and Zhang is no exception. There is an introduction chapter (chapter 1) that sets out the main definitions and conceptual foundation for the rest of the book. Descriptive statistics is covered in one chapter (chapter 2). Probability and related concepts are covered across four chapters (chapters 3-6). Inferential statistics (chapters 7-9) and their applications to statistical model building and testing (chapter 10-11) form the remaining parts of the content. Collectively these topics form a useful (and standard) foundation for learning statistics. The online version of the text contains a detailed and functioning hyperlinked Table of Contents for the Chapters and Section headings. I was unable to find a glossary or index, but maybe the same functional benefits can be obtained by clicking on the appropriate topic hyperlink and scrolling through the text.

One aspect of the content that might be useful to include is the bigger picture notion of: How is statistics used in the real world? The examples and exercises sections provide some hints to students, but contemporary issues such as population growth, climate change and sea level rise are hardly ever mentioned. Including these issues and a connection to the statistical tools that can provide solutions to these problems would help make statistics fun for multidisciplinary students who often perceive statistics as boring and irrelevant.

Another aspect of the content is the heavy reliance on the use of a calculator to perform many of the statistical calculations. Whilst this may have some value in terms of flexibility for the instructor as stated by the authors in the Preface, the reality is that once students pursue further statistics and other related courses they will be confronted with the need to use computer software tools. Including this explicitly would have made the book more comprehensive and relevant to the modern statistics student. It should be noted that in the Large Data Set Exercises sections of the book there are some links to digital spreadsheet data that can be used as computer-based data analysis practice for students.

Comprehensiveness Rating: 4 out of 5

Q: Content is accurate, error-free and unbiased

The contents are free of errors. In the Acknowledgments section the authors listed at least 16 individuals linked to higher education that have provided feedback and suggestions for improving the materials. This adds confidence in the quality of the materials.

Many of the exercises and examples use concepts (SAT scores for example) and data that are best understood within the context of the United States. Using the textbook outside of that geographic context may prove to be a limitation in terms of asking students to grasp an understanding of the problem domain before attempting a statistical solution. However, there are a few examples that attempt to break the mold - Section 2.5, Application 20 outlines a problem related to hockey pucks.

Content Accuracy Rating: 4 out of 5

Q: Content is up-to-date, but not in a way that will quickly make the text obsolete within a short period of time. The text is written and/or arranged in such a way that necessary updates will be relatively easy and straightforward to implement

The statistical core that the textbook focuses on is relatively stable and so changes would be few and far between. This statistical core is up-to-date. The examples and exercises that wrap around the statistical core could use some modifications. For example, issues (climate change, population growth, etc.) that appeal to a wider background of multidisciplinary students can make the entire book more relevant. Making these changes to the existing online HTML files would be relatively easy and straightforward to implement.

Relevance Rating: 4 out of 5

Q: The text is written in lucid, accessible prose, and provides adequate context for any jargon/technical terminology used

The text is written in simple and clear prose. There are hardly any sentences more than 20 words long making the statistical messages easily digestible to students whose first language may not be English. Highlighted definition boxes and key takeaway boxes provide adequate explanations of terminology and key points.

Clarity Rating: 4 out of 5

Q: The text is internally consistent in terms of terminology and framework

The quality, layout, terminology, sections and overall value of each chapter are all internally consistent. The online Table of Contents also provides a consistent and easy means of accessing these materials.

Consistency Rating: 4 out of 5

Q: The text is easily and readily divisible into smaller reading sections that can be assigned at different points within the course (i.e., enormous blocks of text without subheadings should be avoided). The text should not be overly self-referential, and should be easily reorganized and realigned with various subunits of a course without presenting much disruption to the reader.

The text is highly modular. Each Chapter is broken down into smaller sections, and on the whole the materials are covered in a very efficient way making the chapters and sections relatively short. The Chapters are self-contained and can be re-ordered down to the Sections level to suit the needs of the instructor/curriculum.

Modularity Rating: 4 out of 5

Q: The topics in the text are presented in a logical, clear fashion

The topics are arranged in the standard statistical workflow process of Descriptive/Probability/Inferential/Modeling stages. There are a few instances where there is overflow of the topics from one chapter into another where it might not be a good fit. For example, the introduction chapter (Chapter 1) begins immediately to define core statistical concepts and to start familiarizing students with data presentation. The authors chose to continue data presentation (mainly histograms) in Chapter 2, which is titled Descriptive Statistics. In order to avoid any confusion in the minds of students, it would have been useful to focus the Descriptive Statistics chapter on mean, median, and mode concepts. The histogram material could have been merged with the data presentation materials of Chapter 1, perhaps with newer presentation forms such as maps and sparklines added, to make a more comprehensive data presentation chapter. Experience has shown that chunking materials using clearly defined boundaries helps students to learn better.

A particularly useful feature is the learning objective that has been given for each Section.

Organization Rating: 3 out of 5

Q: The text is free of significant interface issues, including navigation problems, distortion of images/charts, and any other display features that may distract or confuse the reader

The interface is well designed and organized to enable easy access and pleasing display of the materials. There is some color used throughout the text, and this improves the readability and contrast of the images and text. It is fair to say that figures (especially graphs) are used extensively to illustrate the concepts being discussed.

Interface Rating: 4 out of 5

Q: The text contains no grammatical errors

There is no evidence of grammatical errors. However, it should be noted that the online version of the material seems to be of the highest quality - the printed version of the book (to which I had access) had some symbols missing (Section 10.3), which might be due to the printing/conversion of some of the Greek symbols used to represent statistical parameters.

Grammar Rating: 4 out of 5

Q: The text is not culturally insensitive or offensive in any way. It should make use of examples that are inclusive of a variety of races, ethnicities, and backgrounds

There is no evidence that the text is culturally insensitive in any way. I suspect that the book was designed to be used in the United States and so many of the examples are within that context. If the book is to be used for a student population outside of that context, then some changes (either by the authors or instructors) in the diversity of examples will be needed.

Cultural Relevance Rating: 4 out of 5

Q: Are there any other comments you would like to make about this book, for example, its appropriateness in a Canadian context or specific updates you think need to be made?

Overall, this is a useful book. It does a good job at covering the breadth and depth of the topics one would expect for an introductory course. The content is well presented and easily accessible. The drawback is that statistical computing is not adequately emphasized and that students in Canada will find it a challenge to relate to some of the US-context questions and examples. Some immediate updates that are needed would be: (1) modify chapter 1 to show the links between statistics and real world solutions, (2) directly introduce computer software into the exercises, (3) adapt the questions and examples to be more relevant to an international audience.

3. Reviewed by: Dr. Erik C. Korolenko
  • Institution: University Canada West
  • Title/Position: Professor
  • Overall Rating: 3.7 out of 5
  • Date:
  • License: Creative Commons License

Q: The text covers all areas and ideas of the subject appropriately and provides an effective index and/or glossary

The text covers some of the areas of the subject, albeit not in depth.
Whether this approach is appropriate for an introductory course depends on
the plan for further study.

Like many other introductory textbooks, the text
leaves open the question of why the particular formulas apply.

A glossary is not provided other than chapter by chapter.

Comprehensiveness Rating: 3 out of 5

Q: Content is accurate, error-free and unbiased

The authors have gone to great lengths to ensure error-free and unbiased
content. As always in a text of this size, some errors still creep in
despite the best efforts. In particular:

1. Position of the mean on the illustration of a bimodal distribution
(page 92) is incorrect. FWHM [full width at half maximum] and the variances
for both Gaussian components of the distribution are identical, but the
components have different amplitudes. As FWHM is identical, the mean should
lie closer to the peak of the component with higher amplitude. Note that
if the FWHM of the left component was twice the FWHM of the right component,
the position of the mean would be nearly halfway between the modes.

2. Pages 224, 614: pictures and text inside are presented as mirror images of
proper orientation.

3. Multiple pages: Authors use Gaussian distribution plots to illustrate
Student distribution. While technically correct for large N, this gives a
wrong impression about the shape of the Student distribution.
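Item 1 above can be checked with a short calculation. The following is a minimal sketch, not the textbook's own figure data: the function name `mixture_mean` and the peak values are hypothetical. It relies on the fact that the area under a Gaussian peak is proportional to amplitude times FWHM, so the mean of a two-peak distribution is the area-weighted average of the peak centers.

```python
def mixture_mean(components):
    """Mean of a distribution built from Gaussian peaks.

    Each component is (amplitude, center, fwhm). The area under a Gaussian
    peak is proportional to amplitude * fwhm, so that product is the weight.
    """
    weights = [amplitude * fwhm for amplitude, _, fwhm in components]
    centers = [center for _, center, _ in components]
    return sum(w * c for w, c in zip(weights, centers)) / sum(weights)


# Hypothetical peaks (NOT read off the figure on page 92):
# left peak at 0 with amplitude 1, right peak at 6 with amplitude 2.
equal_width = [(1.0, 0.0, 1.0), (2.0, 6.0, 1.0)]
left_twice_as_wide = [(1.0, 0.0, 2.0), (2.0, 6.0, 1.0)]

print(mixture_mean(equal_width))         # 4.0, pulled toward the taller peak
print(mixture_mean(left_twice_as_wide))  # 3.0, the midpoint of the centers
```

With equal widths the mean sits closer to the taller peak. In the second case the midpoint result depends on the assumed 1:2 amplitude ratio, which makes the two areas equal; other ratios would shift it. For well-separated peaks the modes lie close to, but not exactly at, the component centers.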

Content Accuracy Rating: 4 out of 5

Q: Content is up-to-date, but not in a way that will quickly make the text obsolete within a short period of time. The text is written and/or arranged in such a way that necessary updates will be relatively easy and straightforward to implement

Content is marginally up-to-date. No attention is given to non-parametric
methods, Bayesian estimation, multivariate distributions, to name a few areas.

The number of included exercises is unnecessarily overwhelming, making the
text appear much longer than it actually is and making the actual text
material difficult to locate. Examples are easy to update, but would benefit
from a reduction in their count.

The text will not become obsolete any faster than similar introductory
statistics books.

Relevance Rating: 3 out of 5

Q: The text is written in lucid, accessible prose, and provides adequate context for any jargon/technical terminology used

The authors have made a very good effort towards producing an easily readable
and accessible text. However, the reader in most cases has to trust
the word of the text as not a single proof is presented - even when this
would be easy to achieve (Chebyshev's theorem). This is a problem with similar
introductory statistics textbooks that assume no prior knowledge of
algebra.
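As an illustration of how short such a proof can be, here is a sketch for a finite data set. It is not taken from the textbook, and the notation is an assumption: observations x_1, ..., x_n with mean \bar{x}, sample standard deviation s > 0, a constant k > 1, and m the number of observations at least k standard deviations from the mean.

```latex
% m = number of observations with |x_i - \bar{x}| \ge k s
\begin{align*}
(n-1)\,s^2 &= \sum_{i=1}^{n} (x_i - \bar{x})^2
  \;\ge\; \sum_{i:\,|x_i - \bar{x}| \ge k s} (x_i - \bar{x})^2
  \;\ge\; m\,k^2 s^2 \\
\Longrightarrow\quad \frac{m}{n} &\le \frac{n-1}{n\,k^2} \;<\; \frac{1}{k^2}
\end{align*}
```

So fewer than 1/k^2 of the observations lie k or more standard deviations from the mean, which means at least 1 - 1/k^2 of them lie within k standard deviations.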

Clarity Rating: 3 out of 5

Q: The text is internally consistent in terms of terminology and framework

The text is quite consistent in its terminology and structure. However, the
level of detail in the presentation of the starting chapters far exceeds that
of the last chapter (ANOVA).

Consistency Rating: 4 out of 5

Q: The text is easily and readily divisible into smaller reading sections that can be assigned at different points within the course (i.e., enormous blocks of text without subheadings should be avoided). The text should not be overly self-referential, and should be easily reorganized and realigned with various subunits of a course without presenting much disruption to the reader.

The text is clearly intended to be used in a sequential manner as it builds
upon the prior knowledge chapter by chapter. Thus, it is better suited
to truncation at the end rather than re-organization or dropping of the
intermediate subunits.

Modularity Rating: 3 out of 5

Q: The topics in the text are presented in a logical, clear fashion

Topics in the text are presented clearly but require a leap of faith on
the part of the reader in every instance a new formula is presented.

Organization Rating: 3 out of 5

Q: The text is free of significant interface issues, including navigation problems, distortion of images/charts, and any other display features that may distract or confuse the reader

The text has some inconsistencies in the layout of its components:
1. Overscripted variables are not typeset well in Word.
2. Formulas and text typeset in LaTeX on occasion import into Word with a loss
of resolution (see the example 21, page 93).
3. As noted before, illustrations are heavy on the Gaussian distribution
images, even where the Student distribution images are needed.
4. Navigation through the parts imported as images is visibly different
from navigation through the parts entered as text.
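The difference behind item 3 can be made concrete. The sketch below is an illustration only (the helper names are hypothetical, and the degrees of freedom are chosen arbitrarily); it evaluates the standard normal density and the Student t density from their textbook formulas and compares them three units from the center.

```python
import math


def normal_pdf(x):
    """Standard normal (Gaussian) density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def student_t_pdf(x, df):
    """Student t density with df degrees of freedom.

    lgamma is used instead of gamma so large df values do not overflow.
    """
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)


# Ratio of tail densities at x = 3 for a few arbitrary sample sizes.
for df in (3, 30, 300):
    ratio = student_t_pdf(3.0, df) / normal_pdf(3.0)
    print(f"df={df:>3}: t density / normal density at x=3 is {ratio:.2f}")
```

For small degrees of freedom the t density is several times the Gaussian density in the tails and lower at the center, which is why a Gaussian plot can mislead; the ratio approaches 1 as the degrees of freedom grow.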

Interface Rating: 4 out of 5

Q: The text contains no grammatical errors

Grammar has been very well proof-read.

Grammar Rating: 5 out of 5

Q: The text is not culturally insensitive or offensive in any way. It should make use of examples that are inclusive of a variety of races, ethnicities, and backgrounds

The text is not culturally offensive or insensitive, and makes use of
inclusive examples.

Cultural Relevance Rating: 5 out of 5

Q: Are there any other comments you would like to make about this book, for example, its appropriateness in a Canadian context or specific updates you think need to be made?

The text is a good attempt at presenting introductory statistics topics
to students with little previous experience with statistics and probability.

4. Reviewed by: Xiaowen Lei
  • Institution: Simon Fraser University
  • Title/Position: Teaching Assistant
  • Overall Rating: 4.7 out of 5
  • Date:
  • License: Creative Commons License

Q: The text covers all areas and ideas of the subject appropriately and provides an effective index and/or glossary

The book’s coverage is pretty good at an introductory level. It includes some suggestions for presenting data, the standard statistical concepts and the way to compute them, and even some linear models, which do not always appear in an introductory statistics book but would be useful for students going on to learn advanced statistics or econometrics.

Comprehensiveness Rating: 4 out of 5

Q: Content is accurate, error-free and unbiased

Yes, no major mistakes were found. However, using words to describe statistical concepts is always a bit harder than using rigorous math, so it’s natural that some of the verbal explanations are not fully rigorous.

Content Accuracy Rating: 4 out of 5

Q: Content is up-to-date, but not in a way that will quickly make the text obsolete within a short period of time. The text is written and/or arranged in such a way that necessary updates will be relatively easy and straightforward to implement

The presentation of the book is pretty classic, and no super new fancy stuff is introduced. On the other hand, it’s hard to do so in an introductory-level statistics book, since most of the new fancy concepts would be better suited to a graduate student text than to this one. Finally, the book seems to avoid examples that quote recent dates and events, which is a good way to maintain its longevity.

Relevance Rating: 5 out of 5

Q: The text is written in lucid, accessible prose, and provides adequate context for any jargon/technical terminology used

Yes, the book is very easy to read and requires only a little fluency in English, so it should be quite easy for foreign students. At the same time, the way the authors present the material is quite intuitive and clear.

Clarity Rating: 5 out of 5

Q: The text is internally consistent in terms of terminology and framework

The book is quite consistent. Although concepts are very much linked with each other across chapters, there are no major conflicts amongst them.

Consistency Rating: 5 out of 5

Q: The text is easily and readily divisible into smaller reading sections that can be assigned at different points within the course (i.e., enormous blocks of text without subheadings should be avoided). The text should not be overly self-referential, and should be easily reorganized and realigned with various subunits of a course without presenting much disruption to the reader.

This is not always the case for this book. In these math settings, every concept is linked to concepts introduced previously, so it would be a bit hard to view them separately.

Modularity Rating: 4 out of 5

Q: The topics in the text are presented in a logical, clear fashion

The structure of the book is extremely good. It starts with a very easy introduction to what statistics means, then gives a separate chapter to descriptive statistics. The separation of sections on discrete vs. continuous random variables makes it less confusing for students who haven’t had exposure to random variables. Then the book gets a bit harder with sampling and testing, all in separate sections. Overall, it’s a very nice way to structure the book. Graphs, tables and the appendix are very supportive as well.

Organization Rating: 5 out of 5

Q: The text is free of significant interface issues, including navigation problems, distortion of images/charts, and any other display features that may distract or confuse the reader

I found the presentation of graphs a bit coarse, and in some sections the sizes of graphs of similar importance or conveying similar ideas don’t match very well. But the graphs and charts themselves have no problems.

Interface Rating: 5 out of 5

Q: The text contains no grammatical errors

I found almost no grammatical errors. The textbook is written in plain English, which is easily understood as well.

Grammar Rating: 5 out of 5

Q: The text is not culturally insensitive or offensive in any way. It should make use of examples that are inclusive of a variety of races, ethnicities, and backgrounds

Yes. Just as basic statistics concepts apply everywhere, so this book applies to all of its readers. Though the book uses some examples from the United States, it’s not a problem for readers from other cultures to understand the basic mathematical mechanisms underlying these examples.

Cultural Relevance Rating: 5 out of 5

Q: Are there any other comments you would like to make about this book, for example, its appropriateness in a Canadian context or specific updates you think need to be made?

The authors could organize the book a bit better. Some of the equations look like pictures copied from somewhere else, which does not look very elegant. Also, graphs should be centered when they are presented.