title: "Lab 1 Online experiments Hertie"
output: html_document
author: Katharina Lawall

Step 1: Change the root directory
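The handout does not show the code for this step. A minimal sketch of one way to do it in base R follows; `project_dir` is a hypothetical placeholder (here `tempdir()`, so the example runs anywhere) and should be replaced with the folder that holds the lab data.

```r
# Hypothetical project folder; tempdir() is only a stand-in so this runs anywhere.
project_dir <- tempdir()

# setwd() changes the working directory and invisibly returns the previous one.
old_dir <- setwd(project_dir)

# Confirm where R is now looking for files.
getwd()
list.files()
```

Inside an R Markdown document, `setwd()` only affects the chunk it is called in; setting `knitr::opts_knit$set(root.dir = project_dir)` in the setup chunk applies the directory to all later chunks.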

Step 2: Load packages
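The package list is not shown in the handout. Judging from the functions used later (`%>%`, `group_by()`, `mutate()` and `count()` from dplyr, `simple_ra()` and `complete_ra()` from randomizr, `lm_robust()` from estimatr), a sketch of the loading step might look like this. The exact set of packages is an assumption.

```r
# Packages inferred from the functions used later in the lab (an assumption).
pkgs <- c("dplyr", "randomizr", "estimatr")

# Check which of them are installed without attaching anything yet.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Install first: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
}

# Attach the packages that are available.
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```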

Step 3: Reading in data
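The file name of the lab data is not given in the handout, so the sketch below writes a tiny made-up table to a temporary CSV and reads it back. Both `data_path` and the example values are hypothetical. The columns `X.2`, `X.1` and `X` in the output further down look like row indices that were saved into the file by earlier write/read cycles (an assumption); writing with `row.names = FALSE` avoids creating such columns.

```r
# Hypothetical file location; the lab's real data file is not named here.
data_path <- file.path(tempdir(), "lab1_example.csv")

# Tiny made-up table so the example runs on its own.
example <- data.frame(treat.f = c(0L, 3L), anonymity = c(1L, 2L))

# row.names = FALSE keeps the row index out of the file.
write.csv(example, data_path, row.names = FALSE)

# Read the file back in and inspect its structure.
data_example <- read.csv(data_path)
str(data_example)
```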

Step 4: Explore the data

What are the outcome variables? What are anonymity and log.followers? How many observations are there?

data %>% head()
##   X.2 X.1 X treat.f In_group high_followers anonymity log.followers
## 1   1   1 1       4        0              1         1      4.094345
## 2   2   2 2       4        0              1         2      7.007601
## 3   3   3 3       4        0              1         2      6.948897
## 4   4   4 4       2        0              0         2      8.270781
## 5   5   5 5       2        0              0         1      5.411646
## 6   6   6 6       3        1              1         2      3.044523
##   racism.scores.post.1wk racism.scores.pre.2mon racism.scores.post.2mon
## 1              1.4285714             0.00000000              0.22580645
## 2              0.1428571             0.04838710              0.17741935
## 3              0.0000000             0.01612903              0.00000000
## 4              0.1428571             0.03225806              0.22580645
## 5              0.5714286             0.01612903              0.06451613
## 6              3.2857143             0.19354839              0.75806452
##   racism.scores.post.1mon racism.scores.post.2wk
## 1               0.4516129             1.00000000
## 2               0.1935484             0.07142857
## 3               0.0000000             0.00000000
## 4               0.1290323             0.14285714
## 5               0.1290323             0.28571429
## 6               1.5161290             1.64285714
data %>% nrow()
## [1] 243

What is treat.f?

data %>% group_by(treat.f, In_group, high_followers) %>% count()
## # A tibble: 5 x 4
## # Groups:   treat.f, In_group, high_followers [5]
##   treat.f In_group high_followers     n
##     <int>    <int>          <int> <int>
## 1       0        0              0    52
## 2       1        1              0    49
## 3       2        0              0    44
## 4       3        1              1    50
## 5       4        0              1    48
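The table above shows that each value of `treat.f` corresponds to one combination of `In_group` and `high_followers`. The sketch below copies those cells into a small data frame and attaches readable labels. The label wording is an assumption, including reading `treat.f = 0` as the control group, since the two dummies alone cannot distinguish arms 0 and 2.

```r
# Cell structure and counts copied from the count table above.
cells <- data.frame(
  treat.f        = 0:4,
  In_group       = c(0, 1, 0, 1, 0),
  high_followers = c(0, 0, 0, 1, 1),
  n              = c(52, 49, 44, 50, 48)
)

# Hypothetical labels; arm 0 is assumed to be the control group.
cells$arm <- factor(
  cells$treat.f,
  levels = 0:4,
  labels = c("Control",
             "In-group / low followers",
             "Out-group / low followers",
             "In-group / high followers",
             "Out-group / high followers")
)

cells

# The cell counts add up to the number of rows in the data.
sum(cells$n)
```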

Step 5: Random assignment

# Simple random assignment to two groups
set.seed(123)

data <- data %>% mutate(assignment_simple = simple_ra(N = nrow(data), num_arms = 2))

data %>% group_by(assignment_simple) %>% count()
## # A tibble: 2 x 2
## # Groups:   assignment_simple [2]
##   assignment_simple     n
##   <fct>             <int>
## 1 T1                  126
## 2 T2                  117
# Complete random assignment to two groups
set.seed(123)

data <- data %>% mutate(assignment_complete = complete_ra(N = nrow(data), num_arms = 2))

data %>% group_by(assignment_complete) %>% count()
## # A tibble: 2 x 2
## # Groups:   assignment_complete [2]
##   assignment_complete     n
##   <fct>               <int>
## 1 T1                    122
## 2 T2                    121
# Q: What is the difference between simple and complete random assignment (RA)? Why is it important to set a seed?
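To make the contrast concrete, here is a base R sketch of the two procedures for 243 units and two arms. It does not use randomizr and is not how `simple_ra()` or `complete_ra()` are implemented internally; it only mimics their logic. Under simple RA each unit is assigned independently, so group sizes vary from draw to draw. Under complete RA the group sizes are fixed in advance and only the order is shuffled.

```r
n <- 243
arms <- c("T1", "T2")

# Simple RA: an independent coin flip per unit, so group sizes are random.
set.seed(123)
simple <- sample(arms, size = n, replace = TRUE)
table(simple)

# Complete RA: fix the group sizes (122 and 121), then shuffle who gets which.
# Here T1 always gets the extra unit, a simplification for this sketch.
set.seed(123)
complete <- sample(rep(arms, length.out = n))
table(complete)

# Setting the same seed reproduces exactly the same assignment.
set.seed(123)
complete_again <- sample(rep(arms, length.out = n))
identical(complete, complete_again)
```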

Exercise 5A.

Recreate the random assignment from the Munger experiment. How many treatment and control groups do you need? Are you going to use simple or complete RA? Bonus: can you name the treatment and control groups?

set.seed(123)

data <- data %>%
  mutate(assignment_exercise = complete_ra(N = nrow(data), num_arms = 5,
                                           conditions = c("Control", "T1", "T2", "T3", "T4")))

data %>% group_by(assignment_exercise) %>% count()
## # A tibble: 5 x 2
## # Groups:   assignment_exercise [5]
##   assignment_exercise     n
##   <fct>               <int>
## 1 Control                49
## 2 T1                     48
## 3 T2                     49
## 4 T3                     49
## 5 T4                     48
data %>% group_by(treat.f) %>% count()
## # A tibble: 5 x 2
## # Groups:   treat.f [5]
##   treat.f     n
##     <int> <int>
## 1       0    52
## 2       1    49
## 3       2    44
## 4       3    50
## 5       4    48

Step 6: Analysis
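The code for this step is hidden, but the `Call:` lines in the output show the model formula and that the standard errors are of type HC2. The sketch below illustrates what that means on simulated stand-in data. The column names follow the lab data, but every value is made up. It fits the model with `lm()` (the point estimates of `lm_robust()` are ordinary least squares estimates) and computes HC2 standard errors by hand. If estimatr is installed, the last lines compare the result with `lm_robust()`.

```r
set.seed(1)
n <- 243

# Simulated stand-in data: names follow the lab data, values are made up.
sim <- data.frame(
  treat.f                = factor(sample(rep(0:4, length.out = n))),
  log.followers          = rnorm(n, mean = 6, sd = 1.5),
  racism.scores.pre.2mon = rexp(n, rate = 10)
)
sim$racism.scores.post.1wk <-
  0.3 + 1.3 * sim$racism.scores.pre.2mon + rnorm(n, sd = 0.5)

# Formula read off the Call: line in the output.
model_formula <- racism.scores.post.1wk ~ treat.f + log.followers +
  racism.scores.pre.2mon

fit <- lm(model_formula, data = sim)

# HC2 sandwich: (X'X)^-1 X' diag(e_i^2 / (1 - h_ii)) X (X'X)^-1
X     <- model.matrix(fit)
e     <- residuals(fit)
h     <- hatvalues(fit)
bread <- solve(crossprod(X))
meat  <- crossprod(X * (e / sqrt(1 - h)))
hc2_se <- sqrt(diag(bread %*% meat %*% bread))

cbind(estimate = coef(fit), hc2_se = hc2_se)

# Optional comparison with estimatr, only if the package is available.
if (requireNamespace("estimatr", quietly = TRUE)) {
  robust_fit <- estimatr::lm_robust(model_formula, data = sim)
  print(all.equal(unname(robust_fit$std.error), unname(hc2_se)))
}
```

Because `treat.f` is a factor, the model reports one coefficient per treatment arm (`treat.f1` to `treat.f4`), each measured relative to the omitted level `0`.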

## 
## Call:
## lm_robust(formula = racism.scores.post.1wk ~ treat.f + log.followers + 
##     racism.scores.pre.2mon, data = data)
## 
## Standard error type:  HC2 
## 
## Coefficients:
##                         Estimate Std. Error  t value Pr(>|t|)  CI Lower
## (Intercept)             0.275806    0.14450  1.90875  0.05751 -0.008866
## treat.f1               -0.079670    0.12475 -0.63863  0.52369 -0.325443
## treat.f2               -0.012201    0.14614 -0.08349  0.93353 -0.300113
## treat.f3               -0.259299    0.12186 -2.12776  0.03440 -0.499386
## treat.f4               -0.073142    0.12308 -0.59426  0.55291 -0.315623
## log.followers           0.008617    0.02066  0.41698  0.67707 -0.032095
## racism.scores.pre.2mon  1.324999    0.57818  2.29168  0.02281  0.185926
##                        CI Upper  DF
## (Intercept)             0.56048 235
## treat.f1                0.16610 235
## treat.f2                0.27571 235
## treat.f3               -0.01921 235
## treat.f4                0.16934 235
## log.followers           0.04933 235
## racism.scores.pre.2mon  2.46407 235
## 
## Multiple R-squared:  0.2758 ,    Adjusted R-squared:  0.2573 
## F-statistic: 2.213 on 6 and 235 DF,  p-value: 0.04263
## 
## Call:
## lm_robust(formula = racism.scores.post.1wk ~ treat.f + log.followers + 
##     racism.scores.pre.2mon, data = filter(data, anonymity > 1))
## 
## Standard error type:  HC2 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)   CI Lower
## (Intercept)             0.36898    0.18696  1.9736  0.05024 -0.0003987
## treat.f1               -0.13303    0.17057 -0.7799  0.43665 -0.4700303
## treat.f2               -0.13730    0.20813 -0.6597  0.51046 -0.5484904
## treat.f3               -0.33887    0.16802 -2.0168  0.04548 -0.6708355
## treat.f4               -0.13345    0.16242 -0.8216  0.41259 -0.4543504
## log.followers           0.01092    0.02761  0.3956  0.69296 -0.0436315
## racism.scores.pre.2mon  1.28111    0.62276  2.0572  0.04138  0.0507318
##                         CI Upper  DF
## (Intercept)             0.738358 152
## treat.f1                0.203966 152
## treat.f2                0.273896 152
## treat.f3               -0.006906 152
## treat.f4                0.187453 152
## log.followers           0.065479 152
## racism.scores.pre.2mon  2.511483 152
## 
## Multiple R-squared:  0.2906 ,    Adjusted R-squared:  0.2626 
## F-statistic: 1.561 on 6 and 152 DF,  p-value: 0.1624

Exercise 6A.

Let’s estimate the same model for non-anonymous users (anonymity < 2 in the call below):

How do you interpret this?

## 
## Call:
## lm_robust(formula = racism.scores.post.1wk ~ treat.f + log.followers + 
##     racism.scores.pre.2mon, data = filter(data, anonymity < 2))
## 
## Standard error type:  HC2 
## 
## Coefficients:
##                         Estimate Std. Error  t value Pr(>|t|) CI Lower CI Upper
## (Intercept)             0.041638    0.17624  0.23626  0.81387 -0.30937  0.39265
## treat.f1                0.094023    0.11589  0.81130  0.41973 -0.13680  0.32484
## treat.f2                0.277580    0.15511  1.78956  0.07751 -0.03135  0.58651
## treat.f3               -0.044160    0.08542 -0.51694  0.60670 -0.21430  0.12598
## treat.f4                0.102838    0.13780  0.74629  0.45779 -0.17161  0.37729
## log.followers           0.002072    0.02440  0.08491  0.93255 -0.04652  0.05067
## racism.scores.pre.2mon  1.443077    0.59364  2.43090  0.01742  0.26074  2.62541
##                        DF
## (Intercept)            76
## treat.f1               76
## treat.f2               76
## treat.f3               76
## treat.f4               76
## log.followers          76
## racism.scores.pre.2mon 76
## 
## Multiple R-squared:  0.1963 ,    Adjusted R-squared:  0.1329 
## F-statistic: 2.662 on 6 and 76 DF,  p-value: 0.02132

Step 7: Graphs
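The plotting code for this step is not shown in the handout. One possible graph is a coefficient plot of the treatment effects. The sketch below uses base R graphics and the estimates and confidence intervals from the full-sample week 1 model in Step 6; the choice of plot and its styling are assumptions.

```r
# Estimates and 95% confidence intervals copied from the first model in Step 6.
coefs <- data.frame(
  term     = c("treat.f1", "treat.f2", "treat.f3", "treat.f4"),
  estimate = c(-0.079670, -0.012201, -0.259299, -0.073142),
  ci_lower = c(-0.325443, -0.300113, -0.499386, -0.315623),
  ci_upper = c(0.16610, 0.27571, -0.01921, 0.16934)
)

# Put the first term at the top of the plot.
y <- rev(seq_len(nrow(coefs)))

plot(coefs$estimate, y,
     pch = 19,
     xlim = range(coefs$ci_lower, coefs$ci_upper),
     ylim = c(0.5, nrow(coefs) + 0.5),
     yaxt = "n",
     xlab = "Estimate with 95% CI (HC2)",
     ylab = "",
     main = "Treatment effects on racism.scores.post.1wk")

# Horizontal confidence intervals, term labels, and a reference line at zero.
segments(coefs$ci_lower, y, coefs$ci_upper, y)
axis(2, at = y, labels = coefs$term, las = 1)
abline(v = 0, lty = 2)
```

An interval that does not cross the dashed line at zero corresponds to a coefficient that is significant at the 5% level; in this table that holds only for `treat.f3`.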

Additional material: Exercise 6B.

Let’s estimate all three models (full sample, anonymity > 1, anonymity < 2) for the two-week outcome, racism.scores.post.2wk:

How do you interpret the results? How does this compare to week 1 outcomes?

## 
## Call:
## lm_robust(formula = racism.scores.post.2wk ~ treat.f + log.followers + 
##     racism.scores.pre.2mon, data = data)
## 
## Standard error type:  HC2 
## 
## Coefficients:
##                         Estimate Std. Error  t value Pr(>|t|) CI Lower CI Upper
## (Intercept)             0.227471    0.08551  2.66023 0.008347  0.05901  0.39593
## treat.f1               -0.025022    0.07503 -0.33350 0.739058 -0.17284  0.12280
## treat.f2                0.005348    0.08402  0.06365 0.949300 -0.16018  0.17088
## treat.f3               -0.159445    0.06665 -2.39212 0.017538 -0.29076 -0.02813
## treat.f4               -0.001729    0.07854 -0.02202 0.982453 -0.15646  0.15300
## log.followers          -0.005768    0.01282 -0.44992 0.653185 -0.03103  0.01949
## racism.scores.pre.2mon  0.942644    0.28312  3.32948 0.001010  0.38487  1.50042
##                         DF
## (Intercept)            235
## treat.f1               235
## treat.f2               235
## treat.f3               235
## treat.f4               235
## log.followers          235
## racism.scores.pre.2mon 235
## 
## Multiple R-squared:  0.3446 ,    Adjusted R-squared:  0.3278 
## F-statistic: 3.829 on 6 and 235 DF,  p-value: 0.001159
## 
## Call:
## lm_robust(formula = racism.scores.post.2wk ~ treat.f + log.followers + 
##     racism.scores.pre.2mon, data = filter(data, anonymity > 1))
## 
## Standard error type:  HC2 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper
## (Intercept)             0.249512    0.10624  2.3485 0.020138  0.03961  0.45942
## treat.f1               -0.049439    0.10010 -0.4939 0.622081 -0.24720  0.14832
## treat.f2               -0.077952    0.11992 -0.6501 0.516639 -0.31487  0.15897
## treat.f3               -0.197423    0.08931 -2.2106 0.028557 -0.37387 -0.02098
## treat.f4               -0.079633    0.09230 -0.8628 0.389609 -0.26198  0.10272
## log.followers           0.001388    0.01556  0.0892 0.929039 -0.02936  0.03214
## racism.scores.pre.2mon  0.872096    0.26294  3.3168 0.001139  0.35262  1.39158
##                         DF
## (Intercept)            152
## treat.f1               152
## treat.f2               152
## treat.f3               152
## treat.f4               152
## log.followers          152
## racism.scores.pre.2mon 152
## 
## Multiple R-squared:  0.3648 ,    Adjusted R-squared:  0.3398 
## F-statistic: 3.041 on 6 and 152 DF,  p-value: 0.007745
## 
## Call:
## lm_robust(formula = racism.scores.post.2wk ~ treat.f + log.followers + 
##     racism.scores.pre.2mon, data = filter(data, anonymity < 2))
## 
## Standard error type:  HC2 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)  CI Lower CI Upper
## (Intercept)             0.05584    0.15113  0.3695 0.712800 -0.245168  0.35685
## treat.f1                0.09173    0.10161  0.9028 0.369492 -0.110640  0.29410
## treat.f2                0.20248    0.10618  1.9069 0.060315 -0.009003  0.41395
## treat.f3               -0.02377    0.08329 -0.2853 0.776170 -0.189663  0.14213
## treat.f4                0.14886    0.12428  1.1978 0.234722 -0.098664  0.39638
## log.followers          -0.01506    0.02066 -0.7291 0.468154 -0.056207  0.02608
## racism.scores.pre.2mon  1.78471    0.61518  2.9011 0.004861  0.559481  3.00995
##                        DF
## (Intercept)            76
## treat.f1               76
## treat.f2               76
## treat.f3               76
## treat.f4               76
## log.followers          76
## racism.scores.pre.2mon 76
## 
## Multiple R-squared:  0.3686 ,    Adjusted R-squared:  0.3188 
## F-statistic: 3.313 on 6 and 76 DF,  p-value: 0.005965