For the teaching of statistics, the two stories have some things in common. Because the data and graphs that appear in Internet data analysis sometimes behave differently from what we teach students in the descriptive data analysis module, we can use these stories to reinforce, by contrast, what students already know. In addition, the data sets are huge, much bigger than the ones we usually use in our teaching, which confronts students with the mystery of dealing with such monsters. Last but not least, there are no definite answers yet, so students are exposed to the ongoing search for new paradigms in the engineering, computer science and statistics communities.

Another critical end point is computing measures of dispersion, such as the variance, which describe the spread of individual measurements around the mean value. In this phase, extreme values (‘outliers’) deserve special attention: they could represent an error in measurement, data recording or data entry, but they could also be legitimate values that provide critical insight into the association studied. In addition, visual tools of descriptive analysis may be used. For instance, a histogram shows the spread and skewness of the data, the presence of outliers and the presence of multiple modes. These features provide strong indications of the proper distributional model for the data. For qualitative (‘categorical’) variables, one-way tables or bar charts can be used. A bar chart displays the frequency (or relative frequency) of each level of a categorical variable. Instructions for descriptive data analysis are reported in several books (–).
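
The descriptive steps mentioned here (dispersion, outlier screening, and a one-way table for a categorical variable) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function names, the sample values and the choice of Tukey's 1.5 × IQR rule for flagging extreme values are assumptions made for the example, not something the text prescribes.

```python
import statistics
from collections import Counter


def describe(values):
    """Return the mean, sample variance and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),  # divides by n - 1
        "stdev": statistics.stdev(values),
    }


def flag_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: Tukey's rule with k = 1.5 is one common convention;
    a flagged value still has to be checked by hand, since it may be
    a recording error or a legitimate observation.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def frequency_table(levels):
    """One-way table: level -> (frequency, relative frequency)."""
    counts = Counter(levels)
    total = sum(counts.values())
    return {level: (n, n / total) for level, n in counts.items()}


# Hypothetical data, made up for illustration only.
measurements = [4.1, 4.3, 4.4, 4.6, 4.8, 5.0, 5.1, 5.3, 12.9]
blood_groups = ["A", "B", "A", "O", "A", "B"]

print(describe(measurements))
print(flag_outliers(measurements))
print(frequency_table(blood_groups))
```

The relative frequencies returned by `frequency_table` are the bar heights a bar chart would display. A histogram of `measurements` could be drawn with a plotting library, which is left out here to keep the sketch dependency-free.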

