Section author: Danielle J. Navarro and David R. Foxcroft

Histograms

Let’s begin with the humble histogram. Histograms are one of the simplest and most useful ways of visualising data. They make most sense when you have an interval or ratio scale variable (e.g., the afl.margins variable from the aflsmall_finalists data set that we used in Descriptive statistics) and what you want to do is get an overall impression of the variable. Most of you probably know how histograms work, since they’re so widely used, but for the sake of completeness I’ll describe them. All you do is divide up the possible values into bins and then count the number of observations that fall within each bin. This count is referred to as the frequency or density of the bin and is displayed as a vertical bar. The afl.margins variable contains 33 games in which the winning margin was less than 10 points, and it is this fact that is represented by the height of the leftmost bar in the histogram we showed earlier in Descriptive statistics and in Fig. 20.

For those earlier graphs we used an advanced plotting package in R which, for now, is beyond the capability of jamovi. But jamovi gets us close, and drawing this histogram in jamovi is pretty straightforward. Open up the Plots options under Exploration → Descriptives and click the Histogram check box, as shown in Fig. 21. jamovi defaults to labelling the y-axis as density and the x-axis with the variable name. The bins are selected automatically and, unlike Fig. 20, there is no scale or count information on the y-axis. This does not matter too much, because what we are really interested in is our impression of the shape of the distribution: is it normally distributed, or is there skew or kurtosis? Our first impressions of these characteristics come from drawing a histogram.

Histogram check box in jamovi

Fig. 21 jamovi screen showing the histogram check box
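If it helps to see the bin-and-count recipe spelled out, here is a minimal sketch in Python. It is not how jamovi draws its histogram (jamovi chooses the bins automatically and I am not reproducing that rule here). The bin width of 10, the function names and the small list of margins are all assumptions made up for illustration; they are not the afl.margins data. The second function shows one common way of turning counts into a density, by rescaling so that the bar areas add up to 1.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count how many observations fall in each bin [k*w, (k+1)*w)."""
    # Floor-divide each value by the bin width to find its bin index.
    counts = Counter(int(v // bin_width) for v in values)
    # Key each bin by its left edge; bins with no observations are omitted.
    return {k * bin_width: counts[k] for k in sorted(counts)}


def histogram_density(values, bin_width):
    """Rescale the counts so that the areas of the bars sum to 1."""
    n = len(values)
    return {left: count / (n * bin_width)
            for left, count in histogram_counts(values, bin_width).items()}


# Made-up winning margins, for illustration only (NOT the afl.margins data).
margins = [3, 7, 9, 12, 15, 18, 21, 24, 35, 41, 56]

counts = histogram_counts(margins, bin_width=10)
density = histogram_density(margins, bin_width=10)

for left, count in counts.items():
    print(f"{left:>3} to {left + 10:<3} {'#' * count}")
```

The height of each bar is just the count (or the rescaled count) for that bin, so changing bin_width changes the picture without changing the data.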

One additional feature that jamovi provides is the ability to plot a density curve. You can do this by clicking the Density check box under the Plots options (and unchecking Histogram), which gives the plot shown in Fig. 22. A density plot visualises the distribution of data over a continuous interval or time period. It is a variation of a histogram that uses kernel smoothing to plot the values, which smooths out the noise and gives a smoother distribution. The peaks of a density plot help show where values are concentrated over the interval. An advantage density plots have over histograms is that they are better at revealing the shape of the distribution, because they are not affected by the number of bins used (each bar in a typical histogram is one bin). A histogram made up of only 4 bins would not give as distinguishable a shape as a histogram with 20 bins. With density plots, however, this is not an issue.

Density plot for the afl.margins variable

Fig. 22 Density plot for the afl.margins variable plotted in jamovi
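To make the idea of kernel smoothing a little more concrete, here is a rough sketch in Python. I am assuming a Gaussian kernel and a hand-picked bandwidth of 8; the text above does not say which kernel or bandwidth jamovi uses, so treat both as illustrative assumptions. The function name and the data are also made up for the example (these are not the afl.margins values). Each observation contributes a small bell curve centred on itself, and the density estimate is the average of those curves.

```python
import math


def gaussian_kde(values, bandwidth):
    """Return a function that estimates the density at x by Gaussian kernel smoothing."""
    n = len(values)
    # Normalising constant so that the whole curve integrates to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum the bell curves centred on each observation, evaluated at x.
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


# Made-up winning margins, for illustration only (NOT the afl.margins data).
margins = [3, 7, 9, 12, 15, 18, 21, 24, 35, 41, 56]

# The bandwidth of 8 is an arbitrary choice for this sketch.
kde = gaussian_kde(margins, bandwidth=8.0)

# Evaluate the curve on a grid from -50 to 160 in steps of 0.5 and
# approximate the area under it with a simple Riemann sum.
step = 0.5
grid = [x * step for x in range(-100, 321)]
area = sum(kde(x) for x in grid) * step
```

There are no bins here, but the bandwidth plays a similar role: a small bandwidth gives a bumpy curve and a large one gives a very smooth curve.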

Although this image would need a lot of cleaning up to make a good presentation graphic (i.e., one you’d include in a report), it nevertheless does a pretty good job of describing the data. In fact, the big strength of a histogram or density plot is that (properly used) it shows the entire spread of the data, so you can get a pretty good sense of what it looks like. The downside of histograms is that they aren’t very compact. Unlike some of the other plots I’ll talk about, it’s hard to cram 20-30 histograms into a single image without overwhelming the viewer. And of course, if your data are nominal scale then histograms are useless.