Main exercise
The Hogwarts School of Witchcraft and Wizardry conducted a study investigating learning culture and magical abilities within the four houses Gryffindor, Hufflepuff, Ravenclaw and Slytherin. To this end, all students from one age cohort were asked how many hours per week they spent in the library on average. Additionally, all students had to perform the Wingardium Leviosa spell to levitate a feather. The height to which the feather levitated above the ground was used as an objective measure of magical ability. Finally, the researchers noted which of the four houses each student belonged to, giving three variables in total for this study.
library(tidyverse)
hogwarts_data <- read_csv("https://pzezula.pages.gwdg.de/data/HarryPotter.dat")
##
## ── Column specification ───────────────────────────────────────────────────────────────────────────────────────────────────────
## cols(
## Haus = col_character(),
## Lernzeit = col_double(),
## Schwebehoehe = col_double()
## )
hogwarts_data$Haus <- factor(hogwarts_data$Haus)
#Haus = house
head(hogwarts_data)
## # A tibble: 6 x 3
## Haus Lernzeit Schwebehoehe
## <fct> <dbl> <dbl>
## 1 Gryffindor 9.4 40.5
## 2 Gryffindor 10.2 46.1
## 3 Gryffindor 9.6 49.1
## 4 Gryffindor 10.2 0
## 5 Gryffindor 9.8 32.9
## 6 Gryffindor 10.6 40.4
library(psych)
describeBy(hogwarts_data, hogwarts_data$Haus)
##
## Descriptive statistics by group
## group: Gryffindor
## vars n mean sd median trimmed mad min max range skew kurtosis se
## Haus* 1 30 1.00 0.00 1.00 1.00 0.00 1 1.0 0.0 NaN NaN 0.00
## Lernzeit 2 30 9.24 1.16 9.55 9.27 1.19 7 11.7 4.7 -0.21 -0.70 0.21
## Schwebehoehe 3 30 29.75 16.57 33.90 31.00 11.93 0 51.8 51.8 -0.79 -0.73 3.02
## -----------------------------------------------------------------------------------------------
## group: Hufflepuff
## vars n mean sd median trimmed mad min max range skew kurtosis se
## Haus* 1 26 2.00 0.00 2.0 2.00 0.00 2 2.0 0.0 NaN NaN 0.00
## Lernzeit 2 26 13.40 1.25 13.6 13.41 1.56 11 15.5 4.5 -0.09 -1.21 0.25
## Schwebehoehe 3 26 17.13 12.12 19.8 16.94 12.75 0 38.3 38.3 -0.10 -1.25 2.38
## -----------------------------------------------------------------------------------------------
## group: Ravenclaw
## vars n mean sd median trimmed mad min max range skew kurtosis se
## Haus* 1 27 3.00 0.00 3.0 3.00 0.00 3.0 3.0 0.0 NaN NaN 0.00
## Lernzeit 2 27 12.03 1.12 12.2 12.15 1.04 8.8 13.6 4.8 -1.01 0.79 0.21
## Schwebehoehe 3 27 60.78 17.45 64.5 63.40 8.45 0.0 80.3 80.3 -1.78 3.57 3.36
## -----------------------------------------------------------------------------------------------
## group: Slytherin
## vars n mean sd median trimmed mad min max range skew kurtosis se
## Haus* 1 29 4.00 0.00 4.0 4.00 0.00 4.0 4.0 0.0 NaN NaN 0.00
## Lernzeit 2 29 7.67 1.14 7.6 7.61 1.19 5.8 10.4 4.6 0.36 -0.33 0.21
## Schwebehoehe 3 29 23.03 15.80 24.8 23.08 20.61 0.0 47.4 47.4 -0.26 -1.44 2.93
As befits a school of witchcraft and wizardry, the levitation height of the feather was of particular interest as the criterion. It was therefore regressed on house and learning time:
hogwarts_m1 <- lm(Schwebehoehe ~ Haus + Lernzeit, data = hogwarts_data)
# Schwebehoehe = levitation height
# Lernzeit = study time
summary(hogwarts_m1)
##
## Call:
## lm(formula = Schwebehoehe ~ Haus + Lernzeit, data = hogwarts_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -61.411 -9.118 2.057 10.871 26.685
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4.5630 11.8327 -0.386 0.700541
## HausHufflepuff -28.0998 6.5866 -4.266 4.31e-05 ***
## HausRavenclaw 20.6523 5.3118 3.888 0.000176 ***
## HausSlytherin -0.9114 4.3959 -0.207 0.836141
## Lernzeit 3.7149 1.2457 2.982 0.003544 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.13 on 107 degrees of freedom
## Multiple R-squared: 0.5709, Adjusted R-squared: 0.5549
## F-statistic: 35.59 on 4 and 107 DF, p-value: < 2.2e-16
The most important result is the significance of the learning time predictor: Hogwarts’ teaching style seems to be working. However, some of the house dummy variables reached significance too, indicating substantial differences in magical ability between the houses. Professor Snape particularly emphasizes Hufflepuff’s negative coefficient. To him, their highly significant dummy predictor is clear evidence that the Hufflepuffs are by far the worst witches and wizards at Hogwarts. They might even be hopeless cases on whom no further teaching capacity should be wasted.
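To make the role of the dummy variables concrete, the following sketch recomputes fitted values by hand from the coefficients printed above. It relies on Gryffindor being the reference category, which is what the coefficient names HausHufflepuff, HausRavenclaw and HausSlytherin suggest. The object names (`intercept`, `house_shift`, `mean_lernzeit` and so on) and the 10-hour comparison value are introduced here purely for illustration; the numbers themselves are copied from the summary() and describeBy() output.

```r
# Coefficients copied from summary(hogwarts_m1) above
intercept   <- -4.5630
b_lernzeit  <-  3.7149
house_shift <- c(Gryffindor =   0,        # reference level, has no dummy
                 Hufflepuff = -28.0998,
                 Ravenclaw  =  20.6523,
                 Slytherin  =  -0.9114)

# Mean study time (Lernzeit) per house, copied from the describeBy() output
mean_lernzeit <- c(Gryffindor = 9.24, Hufflepuff = 13.40,
                   Ravenclaw = 12.03, Slytherin = 7.67)

# Fitted levitation height at each house's own mean study time
fitted_at_mean <- intercept + house_shift + b_lernzeit * mean_lernzeit

# Fitted levitation height if every house studied the same amount
# (10 hours per week is an arbitrary illustrative value)
fitted_at_10h <- intercept + house_shift + b_lernzeit * 10

print(round(fitted_at_mean, 2))
print(round(fitted_at_10h, 2))
```

The first vector should agree with the observed house means of Schwebehoehe up to rounding. The second shows what each dummy coefficient actually measures: a difference from Gryffindor at equal study time, which is not the same as the raw difference between two house means.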
Professor Flitwick requested an additional analysis. He is convinced that the culture of striving in his house, Ravenclaw, leads his students to spend more time learning than students of the other houses. To show this, he ran a logistic regression, using learning time to predict whether a given student is a Ravenclaw or not:
# Ravenclaw_ja_nein = Ravenclaw_yes_no: 1 if the student is a Ravenclaw, otherwise 0
hogwarts_data <- mutate(hogwarts_data,
                        Ravenclaw_ja_nein = ifelse(Haus == "Ravenclaw", 1, 0))
hogwarts_m2 <- glm(Ravenclaw_ja_nein ~ Lernzeit,
data = hogwarts_data,
family = "binomial")
summary(hogwarts_m2)
##
## Call:
## glm(formula = Ravenclaw_ja_nein ~ Lernzeit, family = "binomial",
## data = hogwarts_data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.3858 -0.7013 -0.4795 -0.3123 2.0452
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.1608 1.2351 -4.178 2.94e-05 ***
## Lernzeit 0.3638 0.1051 3.462 0.000537 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 123.72 on 111 degrees of freedom
## Residual deviance: 109.34 on 110 degrees of freedom
## AIC: 113.34
##
## Number of Fisher Scoring iterations: 4
Learning time proves to be a significant predictor of ‘Ravenclaw-ness’, so Professor Flitwick feels validated in his belief that his house’s culture of striving leads to more learning.
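Because the coefficients of a logistic regression are on the log-odds scale, it can help to convert them before interpreting them. The sketch below does this by hand with the estimates printed above; the object names and the three study times are arbitrary choices made for illustration, not part of the original analysis.

```r
# Coefficients copied from summary(hogwarts_m2) above
b0 <- -5.1608   # intercept: log-odds of being a Ravenclaw at Lernzeit = 0
b1 <-  0.3638   # change in log-odds per additional hour of study time

# Odds ratio per additional hour of study time
odds_ratio <- exp(b1)

# Predicted probability of being a Ravenclaw at a few study times
# (8, 12 and 15 hours are arbitrary illustrative values)
lernzeit    <- c(8, 12, 15)
p_ravenclaw <- plogis(b0 + b1 * lernzeit)

print(round(odds_ratio, 3))
print(round(p_ravenclaw, 3))
```

`plogis()` is the inverse logit from base R's stats package, so no additional packages are needed. With the fitted model object available, `predict(hogwarts_m2, newdata = data.frame(Lernzeit = lernzeit), type = "response")` should give the same probabilities up to the rounding of the copied coefficients.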
The original analysis also included a visual representation of the collected data. However, the owl carrying these documents must have eaten that page. Are you able to use ggplot, the muggle alternative to visual wizardry, to create an adequate visualization?
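One possible starting point, offered as a sketch and not as the intended solution, is a scatter plot of levitation height against study time with one colour and one least-squares line per house. The helper `plot_hogwarts()` and the stand-in data frame `demo_rows` are hypothetical names introduced here; `demo_rows` contains only the six rows shown by head(hogwarts_data) above, so that the code runs without downloading the file.

```r
library(ggplot2)

# Scatter plot of levitation height against study time, coloured by house,
# with a separate least-squares line for each house.
plot_hogwarts <- function(data) {
  ggplot(data, aes(x = Lernzeit, y = Schwebehoehe, colour = Haus)) +
    geom_point(alpha = 0.7) +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
    labs(x = "Study time (hours per week)",
         y = "Levitation height",
         colour = "House")
}

# Stand-in data: the six rows printed by head(hogwarts_data) above.
# These are all Gryffindors, so this only demonstrates that the code runs.
demo_rows <- data.frame(
  Haus         = factor(rep("Gryffindor", 6)),
  Lernzeit     = c(9.4, 10.2, 9.6, 10.2, 9.8, 10.6),
  Schwebehoehe = c(40.5, 46.1, 49.1, 0, 32.9, 40.4)
)

p <- plot_hogwarts(demo_rows)
```

With the full data set loaded as above, `plot_hogwarts(hogwarts_data)` would draw all four houses in one panel. Plotting the raw points makes it possible to compare the houses and to spot unusual observations before trusting either model.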