This tutorial uses data and reproduces a subset of analyses reported in the following manuscript:

Cosme et al. (Preprint) Message self and social relevance increases intentions to share content: Correlational and causal evidence from six studies

In these analyses, we test whether self and social relevance are causally related to sharing intentions in a preregistered experiment. Self and social relevance were experimentally manipulated by having participants explicitly reflect on the self or social relevance of messages.

Participants were randomly assigned to either the self (n = 200) or social (n = 197) condition. We used a mixed design in which all participants saw a set of 5 messages in the control condition and a set of 5 messages either in the self condition or the social condition. Therefore, relationships between the experimental condition (self or social) and the control condition were assessed within-person, whereas the difference between experimental conditions was assessed between-person. We manipulated self relevance by asking participants to write about why the article matters to them personally (self condition), and social relevance by asking them to write about why the article matters to people they know (social condition). In the control condition, participants did not reflect on relevance and instead were asked to write what the article is about. Messages consisted of a news headline and brief abstract from the New York Times about general health or climate change—two important societal issues that could benefit from individual and collective action.

Prep data

Load packages

if (!require(pacman)) {
  install.packages('pacman')
}
pacman::p_load(devtools, report, tidyverse, lmerTest, usmap, knitr, kableExtra, reactable, performance, wesanderson, install = TRUE)
devtools::install_github("hadley/emo")

Define aesthetics

palette_map = c("#3B9AB2", "#EBCC2A", "#F21A00")
palette_condition = c("#ee9b00", "#bb3e03", "#005f73")

plot_aes = theme_minimal() +
  theme(legend.position = "top",
        legend.text = element_text(size = 12),
        text = element_text(size = 16, family = "Futura Medium"),
        axis.text = element_text(color = "black"),
        axis.line = element_line(colour = "black"),
        axis.ticks.y = element_blank())

Define functions

# MLM results table function
table_model = function(model_data, reference = "control") {
  model_data %>%
  broom.mixed::tidy(conf.int = TRUE) %>%
  filter(effect == "fixed") %>%
  rename("SE" = std.error,
         "t" = statistic,
         "p" = p.value) %>%
  select(-group, -effect) %>%
  mutate_at(vars(-contains("term"), -contains("p.value")), round, 2) %>%
  mutate(term = gsub("article_cond", "", term),
         term = gsub("\\(Intercept\\)", !!reference, term),
         term = gsub("sharing_typemsg_sharing_narrow", "sharing type", term),
         term = gsub("msg_rel_self_between", "self-relevance", term),
         term = gsub("msg_rel_social_between", "social relevance", term),
         term = gsub(":", " x ", term),
         p = ifelse(p < .001, "< .001",
                    ifelse(p == 1, "1.000", gsub("0.(.*)", ".\\1", sprintf("%.3f", p)))),
         `b [95% CI]` = sprintf("%.2f [%0.2f, %.2f]", estimate, conf.low, conf.high)) %>%
  select(term, `b [95% CI]`, df, t, p) %>%
  kable() %>%
  kableExtra::kable_styling()
}

# source raincloud plot function
source("https://gist.githubusercontent.com/benmarwick/2a1bb0133ff568cbe28d/raw/fb53bd97121f7f9ce947837ef1a4c65a73bffb3f/geom_flat_violin.R")

Load data

data_raw = read.csv("data/study6_clean_long.csv", stringsAsFactors = FALSE)
demo = read.csv("data/study6_demo.csv", stringsAsFactors = FALSE)

Tidy data

Data transformations

None

Exclusions

No participants were excluded

# get condition information
cond_order = data_raw %>%
  filter(grepl("cond_order", survey_name)) %>%
  select(SID, value) %>%
  mutate(article_cond = strsplit(value, "\\|")) %>%
  unnest(article_cond) %>%
  select(-value) %>%
  bind_cols(., data.frame(item = rep(1:10, length(unique(data_raw$SID))))) %>%
  mutate(item = as.character(item),
         article_cond = gsub("other", "social", article_cond))

# isolate message ratings and merge with condition information
messages = data_raw %>%
  filter(grepl("sharing|relevance_self|relevance_social", survey_name)) %>%
  extract(item, "item", "([0-9]+)_.*") %>%
  mutate(value = as.numeric(value),
         survey_name = gsub("relevance", "rel", survey_name),
         survey_name = sprintf("msg_%s", survey_name),
         cond = gsub("other", "social", cond)) %>%
  left_join(., cond_order)

# prepare dataframe for modeling; keep sharing type and article condition long, but pivot the relevance variables wide
messages_mod = messages %>%
  group_by(survey_name, SID) %>%
  spread(survey_name, value) %>%
  gather(sharing_type, msg_share, contains("sharing")) %>%
  mutate(group = cond) %>%
  select(group, SID, item, article_cond, sharing_type, msg_share, contains("rel_"))

# summarize the demographic information
demo_summary = demo %>%
  mutate(value = ifelse(value == "Would rather not say", "Prefer not to say", value),
         value = ifelse(is.na(value), "Not reported", value)) %>%
  group_by(item, value) %>%
  summarize(n = n()) %>%
  ungroup() %>%
  filter(!item %in% c("gender_4_TEXT", "race_self")) %>%
  rename(" " = item)

Sample demographics

For this tutorial, these are made up demographics to illustrate how to summarize and present different characteristics

Sample size

messages %>%
  select(cond, SID) %>%
  unique() %>%
  group_by(cond) %>%
  summarize(n = n()) %>%
  reactable::reactable(striped = TRUE)

Geographic distribution

states = demo %>%
  filter(grepl("state", item)) %>%
  spread(item, value) %>%
  group_by(state) %>%
  summarize(n = n())

states %>%
  usmap::plot_usmap(data = ., values = "n", color = "grey50") +
  scale_fill_gradient2(low = palette_map[1], mid = palette_map[2], midpoint = max(states$n) / 2, high = palette_map[3],
                       name = "", na.value = NA, limits = c(0, max(states$n)), breaks = seq(0, max(states$n), 10)) +
  theme(text = element_text(size = 12, family = "Futura Medium"),
        legend.position = "right")

Age

Table

demo %>%
  filter(item == "age") %>%
  mutate(value = as.numeric(value)) %>%
  summarize(`age range` = sprintf("%s - %s", min(value, na.rm = TRUE), max(value, na.rm = TRUE)),
            `mean age` = round(mean(value, na.rm = TRUE), 0),
            `sd age` = round(sd(value, na.rm = TRUE), 0)) %>%
  reactable::reactable(striped = TRUE)

Plot

demo %>%
  filter(item == "age") %>%
  mutate(value = as.numeric(value)) %>%
  ggplot(aes(value)) +
  geom_density(fill = "#3B9AB2", color = NA) + 
  labs(x = "") +
  plot_aes +
  theme(legend.position = "right")

Gender

Table

demo_summary %>%
  filter(` ` == "gender") %>%
  mutate(total = sum(n),
         percent = sprintf("%.0f%s", (n / total) * 100, "%")) %>%
  select(-total, -` `) %>%
  bind_rows(data.frame(value = c("Non-binary / third gender", "Prefer to self-describe"), n = rep(0, 2), percent = rep("0%", 2))) %>%
  arrange(value) %>%
  rename("gender" = value) %>%
  reactable::reactable(striped = TRUE)

Plot

demo_summary %>%
  filter(` ` == "gender") %>%
  mutate(total = sum(n),
         percent = (n / total) * 100) %>%
  select(-total, -` `) %>%
  bind_rows(data.frame(value = c("Non-binary / third gender", "Prefer to self-describe"), n = rep(0, 2), percent = rep(0, 2))) %>%
  ggplot(aes("", percent, fill = reorder(value, percent))) +
  geom_bar(stat = "identity") + 
  coord_flip() +
  scale_fill_manual(name = "", values = wesanderson::wes_palette("Zissou1", n = 5, type = "continuous")) +
  labs(x = "") +
  plot_aes +
  theme(legend.position = "right")

Race and ethnicity

Table

hispanic_latinx = demo_summary %>%
  filter(` ` == "Hispanic or Latinx") %>%
  mutate(total = sum(n),
         percent = sprintf("%.1f%s", (n / total) * 100, "%")) %>%
  select(-total) %>%
  spread(value, percent) %>%
  select(-No, - ` `) %>%
  rename("percent" = Yes) %>%
  mutate(`race / ethnicity` = "Hispanic or Latinx")

demo_summary %>%
  filter(` ` == "race") %>%
  filter(!value %in% c("Hispanic", "Latino")) %>% #counted already in the hispanic_latinx item
  mutate(total = sum(n),
         percent = sprintf("%.1f%s", (n / total) * 100, "%"),
         value = ifelse(is.na(value), "Not reported", value)) %>%
  select(value, n, percent) %>%
  bind_rows(data.frame(value = c("American Indian or Alaskan Native", "Native Hawaiian or Other Pacific Islander"), n = rep(0, 2), percent = rep("0%", 2))) %>%
  rename("race & ethnicity" = value) %>%
  bind_rows(., hispanic_latinx) %>%
  arrange(`race / ethnicity`) %>%
  reactable::reactable(striped = TRUE)

Plot

demo_summary %>%
  filter(` ` == "race") %>%
  filter(!value %in% c("Hispanic", "Latino")) %>% #counted already in the hispanic_latinx item
  mutate(total = sum(n),
         percent = (n / total) * 100,
         value = ifelse(is.na(value), "Not reported", value)) %>%
  select(value, n, percent) %>%
  bind_rows(data.frame(value = c("American Indian or Alaskan Native", "Native Hawaiian or Other Pacific Islander"), n = rep(0, 2), percent = rep(0, 2))) %>%
  ggplot(aes("", percent, fill = reorder(value, percent))) +
  geom_bar(stat = "identity") + 
  coord_flip() +
  scale_fill_manual(name = "", values = wesanderson::wes_palette("Zissou1", n = 7, type = "continuous")) +
  labs(x = "") +
  plot_aes +
  theme(legend.position = "right")

Household income

Table

demo_summary %>%
  filter(` ` == "household income") %>%
  group_by(value) %>%
  summarize(n = sum(n)) %>%
  mutate(total = sum(n),
         percent = sprintf("%.1f%s", (n / total) * 100, "%"),
         value = ifelse(is.na(value), "Not reported", value),
         value = factor(value, levels = c("Less than $5,000", "$5,000 through $11,999", "$12,000 through $15,999", "$16,000 through $24,999", 
                                          "$25,000 through $34,999", "$35,000 through $49,999", "$50,000 through $74,999", "$75,000 through $99,999",
                                          "$100,000 and greater", "Not reported"))) %>%
  arrange(value) %>%
  select(value, n, percent) %>%
  rename("household income" = value) %>%
  mutate_if(is.character, funs(ifelse(is.na(.), "—", .))) %>%
  reactable::reactable(striped = TRUE)

Plot

demo_summary %>%
  filter(` ` == "household income") %>%
  group_by(value) %>%
  summarize(n = sum(n)) %>%
  mutate(total = sum(n),
         percent = (n / total) * 100,
         value = ifelse(is.na(value), "Not reported", value),
         value = factor(value, levels = c("Less than $5,000", "$5,000 through $11,999", "$12,000 through $15,999", "$16,000 through $24,999", 
                                          "$25,000 through $34,999", "$35,000 through $49,999", "$50,000 through $74,999", "$75,000 through $99,999",
                                          "$100,000 and greater", "Not reported"))) %>%
  arrange(value) %>%
  select(value, n, percent)  %>%
  ggplot(aes(value, percent, fill = value)) +
  geom_histogram(stat = "identity") + 
  geom_text(aes(label = sprintf("%.1f%s", percent, "%")), nudge_y = 2) + 
  coord_flip() + 
  scale_fill_manual(name = "", values = wesanderson::wes_palette("Zissou1", n = 9, type = "continuous")) +
  labs(x = "") +
  plot_aes +
  theme(legend.position = "none")

Education

Table

demo_summary %>%
  filter(` ` == "highest degree completed") %>%
  group_by(value) %>%
  summarize(n = sum(n)) %>%
  mutate(total = sum(n),
         percent = sprintf("%.1f%s", (n / total) * 100, "%")) %>%
  bind_rows(data.frame(value = "Less than high school", n = 0, percent = "0%")) %>%
  mutate(value = factor(value, levels = c("Less than high school", "High school graduate (diploma)", "High school graduate (GED)", 
                                          "Some college (1-4 years, no degree)", "Associate's degree (including occupational or academic degrees)",
                                          "Bachelor's degree (BA, BS, etc)", "Master's degree (MA, MS, MENG, MSW, etc)",
                                          "Professional school degree (MD, DDC, JD, etc)", "Doctorate degree (PhD, EdD, etc)"))) %>%
  arrange(value) %>%
  select(value, n, percent) %>%
  rename("highest degree completed" = value) %>%
  mutate_if(is.character, funs(ifelse(is.na(.), "—", .))) %>%
  reactable::reactable(striped = TRUE)

Plot

demo_summary %>%
  filter(` ` == "highest degree completed") %>%
  group_by(value) %>%
  summarize(n = sum(n)) %>%
  mutate(total = sum(n),
         percent = (n / total) * 100) %>%
  bind_rows(data.frame(value = "Less than high school", n = 0, percent = 0)) %>%
  mutate(value = factor(value, levels = c("Less than high school", "High school graduate (diploma)", "High school graduate (GED)", 
                                          "Some college (1-4 years, no degree)", "Associate's degree (including occupational or academic degrees)",
                                          "Bachelor's degree (BA, BS, etc)", "Master's degree (MA, MS, MENG, MSW, etc)",
                                          "Professional school degree (MD, DDC, JD, etc)", "Doctorate degree (PhD, EdD, etc)"))) %>%
  arrange(value) %>%
  select(value, n, percent)  %>%
  ggplot(aes(value, percent, fill = value)) +
  geom_histogram(stat = "identity") + 
  geom_text(aes(label = sprintf("%.1f%s", percent, "%")), nudge_y = 3) + 
  coord_flip() + 
  scale_fill_manual(name = "", values = wesanderson::wes_palette("Zissou1", n = 8, type = "continuous")) +
  labs(x = "") +
  plot_aes +
  theme(legend.position = "none")

Data quality checks

Missing data

Only a single rating (or 0.01%) is missing

messages %>%
  filter(is.na(value)) %>%
  group_by(SID, survey_name) %>%
  summarize(n = n()) %>%
  arrange(-n)

Outliers

There do not appear to be obvious outliers

messages %>%
  ggplot(aes(survey_name, value, fill = survey_name)) +
  geom_flat_violin(position = position_nudge(x = .1, y = 0), color = FALSE) +
  geom_point(aes(color = survey_name), position = position_jitter(width = .05), size = .1, alpha = .1) + 
  geom_boxplot(width = .1, outlier.shape = NA, fill = NA) +
  scale_fill_manual(values = wesanderson::wes_palette("Zissou1", n = 4, type = "continuous")) +
  scale_color_manual(values = wesanderson::wes_palette("Zissou1", n = 4, type = "continuous")) +
  labs(x = "") +
  coord_flip() +
  plot_aes +
  theme(legend.position = "none")

Descriptive statistics

Self-relevance (1 = strongly disagree, 100 = strongly agree)

This message is relevant to me.

Social relevance (1 = strongly disagree, 100 = strongly agree)

This message is relevant to people I know.

Broadcast sharing intention (1 = strongly disagree, 100 = strongly agree)

I would share this article by posting on social media (on Facebook, Twitter, etc).

Narrowcast sharing intention (1 = strongly disagree, 100 = strongly agree)

I would share this article directly with someone I know (via email, direct message, etc).

# means
means = messages %>%
  group_by(survey_name) %>%
  filter(!is.na(value)) %>%
  summarize(N = n(),
            `M (SD)` = sprintf("%s (%s)", round(mean(value, na.rm = TRUE), 2), round(sd(value, na.rm = TRUE), 2))) %>%
  mutate(`scale range` = "0-100") %>%
  select(`scale range`, everything()) %>%
  rename("variable" = survey_name) %>%
  mutate(variable = recode(variable, "msg_rel_social" = "social relevance",
                           "msg_rel_self" = "self-relevance",
                           "msg_sharing_broad" = "broadcast sharing intention",
                           "msg_sharing_narrow" = "narrowcast sharing intention"))

# repeated measures correlations
corrs = messages %>%
  spread(survey_name, value) %>%
  nest() %>%
  mutate(self_social = map(data, ~ rmcorr::rmcorr(as.factor(SID), msg_rel_self, msg_rel_social, .)),
         self_broad = map(data, ~ rmcorr::rmcorr(as.factor(SID), msg_rel_self, msg_sharing_broad, .)),
         social_broad = map(data, ~ rmcorr::rmcorr(as.factor(SID), msg_rel_social, msg_sharing_broad, .)),
         broad_narrow = map(data, ~ rmcorr::rmcorr(as.factor(SID), msg_sharing_broad, msg_sharing_narrow, .)),
         self_narrow = map(data, ~ rmcorr::rmcorr(as.factor(SID), msg_rel_self, msg_sharing_narrow, .)),
         social_narrow = map(data, ~ rmcorr::rmcorr(as.factor(SID), msg_rel_social, msg_sharing_narrow, .))) %>%
  select(-data) %>%
  gather(test, model) %>%
  group_by(test) %>%
  filter(!is.null(model)) %>%
  mutate(r = sprintf("%.2f [%.2f, %.2f]", model[[1]][[1]], model[[1]][[4]][1], model[[1]][[4]][2]),
         df = model[[1]][[2]],
         p = round(model[[1]][[3]], 3),
         p = ifelse(p == 0, "< .001", as.character(p))) %>%
  ungroup() %>%
  select(test, r) %>%
  extract(test, c("var1", "var2"), "(.*)_(.*)") %>%
  spread(var2, r) %>%
  select(var1, everything()) %>%
  group_by(var1) %>%
  mutate_if(is.character, ~ ifelse(is.na(.), "--", .)) %>%
  mutate(var1 = ifelse(var1 == "social", "social relevance",
                ifelse(var1 == "self", "self-relevance",
                ifelse(var1 == "broad", "broadcast sharing intention", "narrowcast sharing intention")))) %>%
  rename("variable" = var1,
         "broadcast sharing intention" = broad,
         "narrowcast sharing intention" = narrow,
         "social relevance" = social) %>%
  select(variable, `narrowcast sharing intention`, `broadcast sharing intention`, `social relevance`)

# merge descriptors and correlations to generate manuscript table
means %>%
  left_join(., corrs)%>%
  mutate_if(is.character, ~ ifelse(is.na(.), "--", .)) %>%
  reactable::reactable(striped = TRUE)

Visualize raw data

The following plots visualize the raw data and do not take into account repeated measures within-participant

Condition effects

messages %>%
  ggplot(aes(survey_name, value, color = article_cond)) +
  stat_summary(fun.data = "mean_cl_boot", geom = "pointrange", size = 1) +
  scale_color_manual(values = palette_condition) +
  plot_aes

Condition effects with data

messages %>%
  ggplot(aes(survey_name, value, color = article_cond)) +
  geom_jitter(alpha = .05) +
  stat_summary(fun.data = "mean_cl_boot", geom = "pointrange") +
  scale_color_manual(values = palette_condition) +
  plot_aes

Correlations between relevance and sharing

messages %>%
  spread(survey_name, value) %>%
  gather(sharing, share_value, msg_sharing_broad, msg_sharing_narrow) %>%
  gather(relevance, rel_value, msg_rel_self, msg_rel_social) %>%
  ggplot(aes(rel_value, share_value, color = article_cond, fill = article_cond)) +
  geom_point(alpha = .05) +
  geom_smooth(method = "lm") +
  facet_grid(sharing ~ relevance) +
  scale_color_manual(values = palette_condition) +
  scale_fill_manual(values = palette_condition) +
  plot_aes

Correlations between self and social relevance

messages %>%
  spread(survey_name, value) %>%
  ggplot(aes(msg_rel_self, msg_rel_social, color = article_cond, fill = article_cond)) +
  geom_point(alpha = .1) +
  geom_smooth(method = "lm") +
  scale_color_manual(values = palette_condition) +
  scale_fill_manual(values = palette_condition) +
  plot_aes

Preregistered analyses

The following hypotheses are preregistered. The preregistration is available on OSF.

Hypothesis 1

H1: Messages in the experimental conditions will evoke higher broad- and narrowcast sharing intentions than messages in the control condition.

Results

✅ These data are consistent with the hypothesis that thinking about the self-relevance of a message increases its perceived self-relevance

mod_h1 = lmer(msg_rel_self ~ 1 + article_cond + (1 + article_cond | SID),
              data = messages_mod,
              control = lmerControl(optimizer = "bobyqa"))

Plot

ggeffects::ggpredict(mod_h1, c("article_cond")) %>%
  data.frame() %>%
  mutate(x = factor(x, levels = c("self", "control", "social"))) %>%
  ggplot(aes(x = x, y = predicted, fill = x)) +
    geom_bar(stat = "identity", position = position_dodge(.9)) +
    geom_errorbar(aes(ymin = conf.low, ymax = conf.high), position = position_dodge(.9), width = 0, size = 1) +
    scale_fill_manual(name = "", values = palette_condition) +
    labs(x = "", y = "mean predicted rating\n") +
    plot_aes +
    theme(legend.position = "none")

Model table

table_model(mod_h1)

term	b [95% CI]	df	t	p
control	52.85 [50.55, 55.15]	396.00	45.13	< .001
self	12.41 [10.01, 14.81]	225.44	10.19	< .001
social	5.12 [2.96, 7.28]	212.74	4.67	< .001

Summary

summary(mod_h1)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: msg_rel_self ~ 1 + article_cond + (1 + article_cond | SID)
##    Data: messages_mod
## Control: lmerControl(optimizer = "bobyqa")
## 
## REML criterion at convergence: 74144.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3486 -0.5966  0.0850  0.5823  3.3875 
## 
## Random effects:
##  Groups   Name               Variance Std.Dev. Corr       
##  SID      (Intercept)        489.1    22.12               
##           article_condself   222.4    14.91    -0.44      
##           article_condsocial 145.9    12.08    -0.34  0.06
##  Residual                    552.3    23.50               
## Number of obs: 7940, groups:  SID, 397
## 
## Fixed effects:
##                    Estimate Std. Error      df t value             Pr(>|t|)    
## (Intercept)          52.849      1.171 396.000  45.132 < 0.0000000000000002 ***
## article_condself     12.408      1.218 225.445  10.189 < 0.0000000000000002 ***
## article_condsocial    5.118      1.096 212.745   4.671           0.00000532 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) artcl_cndsl
## artcl_cndsl -0.352            
## artcl_cndsc -0.287  0.101

Check model

Check model assumptions using the {performance} package

performance::check_model(mod_h1)

Hypothesis 2

H1: Messages in the social condition will be rated as more socially relevant than messages in the control condition.

Results

✅ These data are consistent with the hypothesis that thinking about the social relevance of a message increases its perceived social relevance

mod_h2 = lmer(msg_rel_social ~ 1 + article_cond + (1 + article_cond | SID),
              data = messages_mod,
              control = lmerControl(optimizer = "bobyqa"))

Plot

ggeffects::ggpredict(mod_h2, c("article_cond")) %>%
  data.frame() %>%
  mutate(x = factor(x, levels = c("self", "control", "social"))) %>%
  ggplot(aes(x = x, y = predicted, fill = x)) +
    geom_bar(stat = "identity", position = position_dodge(.9)) +
    geom_errorbar(aes(ymin = conf.low, ymax = conf.high), position = position_dodge(.9), width = 0, size = 1) +
    scale_fill_manual(name = "", values = palette_condition) +
    labs(x = "", y = "mean predicted rating\n") +
    plot_aes +
    theme(legend.position = "none")

Model table

table_model(mod_h2)

term	b [95% CI]	df	t	p
control	58.44 [56.18, 60.69]	396.00	50.88	< .001
self	8.66 [6.61, 10.71]	228.12	8.32	< .001
social	8.90 [6.81, 11.00]	220.99	8.38	< .001

Summary

summary(mod_h2)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: msg_rel_social ~ 1 + article_cond + (1 + article_cond | SID)
##    Data: messages_mod
## Control: lmerControl(optimizer = "bobyqa")
## 
## REML criterion at convergence: 72201.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.8896 -0.4633  0.0913  0.5490  3.4932 
## 
## Random effects:
##  Groups   Name               Variance Std.Dev. Corr       
##  SID      (Intercept)        481.0    21.93               
##           article_condself   158.1    12.57    -0.46      
##           article_condsocial 167.1    12.93    -0.47  0.27
##  Residual                    426.8    20.66               
## Number of obs: 7940, groups:  SID, 397
## 
## Fixed effects:
##                    Estimate Std. Error      df t value             Pr(>|t|)    
## (Intercept)          58.436      1.149 396.000   50.88 < 0.0000000000000002 ***
## article_condself      8.656      1.040 228.124    8.32  0.00000000000000804 ***
## article_condsocial    8.904      1.062 220.989    8.38  0.00000000000000616 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) artcl_cndsl
## artcl_cndsl -0.356            
## artcl_cndsc -0.365  0.130

Check model

Check model assumptions using the {performance} package

performance::check_model(mod_h2)

Hypothesis 3 & 4

H3: Messages in the experimental conditions will evoke higher broad- and narrowcast sharing intentions than messages in the control condition.

H4: We will test whether the experimental conditions differ in their effects on broadcast and narrowcast sharing intentions. We expect that the social relevance manipulation will have a stronger effect on narrowcast than broadcast sharing intentions, whereas the self-relevance manipulation will have a stronger effect on broadcast than narrowcast sharing intentions.

Results

✅ These data are consistent with the hypothesis that the self and social conditions increase sharing intentions

✅ These data are consistent with the hypothesis that the social condition more strongly increases narrowcast than broadcast sharing intentions

❌ These data are inconsistent with the hypothesis that the self condition more strongly increases broadcast than narrowcast sharing intentions

mod_h3_h4 = lmer(msg_share ~ 1 + article_cond*sharing_type + (1 + sharing_type | SID),
              data = messages_mod,
              control = lmerControl(optimizer = "bobyqa"))

Plot

ggeffects::ggpredict(mod_h3_h4, c("article_cond", "sharing_type")) %>%
  data.frame() %>%
  mutate(x = factor(x, levels = c("self", "control", "social")),
         group = ifelse(group == "msg_sharing_broad", "broadcast sharing", "narrowcast sharing")) %>%
  ggplot(aes(x = x, y = predicted, color = x)) +
  geom_pointrange(aes(ymin = conf.low, ymax = conf.high, alpha = group), position = position_dodge(.5), size = 2) +
  coord_flip() +
  scale_color_manual(name = "", values = palette_condition, guide = "none") +
  scale_alpha_manual(name = "", values = c(1, .5)) +
  labs(x = "", y = "\nmean predicted sharing intention rating") +
  scale_y_continuous(limits = c(40, 60)) +
  plot_aes +
  theme(legend.position = "top")

Model table

table_model(mod_h3_h4)

term	b [95% CI]	df	t	p
control	45.04 [42.04, 48.04]	431.34	29.54	< .001
self	5.23 [3.57, 6.89]	7536.07	6.16	< .001
social	3.37 [1.70, 5.05]	7535.21	3.95	< .001
sharing type	0.16 [-1.50, 1.82]	743.49	0.19	.850
self x sharing type	2.08 [-0.18, 4.35]	6961.88	1.81	.070
social x sharing type	3.53 [1.25, 5.80]	6926.66	3.04	< .001

Summary

summary(mod_h3_h4)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: msg_share ~ 1 + article_cond * sharing_type + (1 + sharing_type |  
##     SID)
##    Data: messages_mod
## Control: lmerControl(optimizer = "bobyqa")
## 
## REML criterion at convergence: 71638.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.0176 -0.4887 -0.0143  0.4899  4.3824 
## 
## Random effects:
##  Groups   Name                           Variance Std.Dev. Corr 
##  SID      (Intercept)                    845.7    29.08         
##           sharing_typemsg_sharing_narrow 128.1    11.32    -0.44
##  Residual                                386.8    19.67         
## Number of obs: 7939, groups:  SID, 397
## 
## Fixed effects:
##                                                    Estimate Std. Error
## (Intercept)                                         45.0383     1.5249
## article_condself                                     5.2305     0.8489
## article_condsocial                                   3.3742     0.8546
## sharing_typemsg_sharing_narrow                       0.1607     0.8440
## article_condself:sharing_typemsg_sharing_narrow      2.0847     1.1538
## article_condsocial:sharing_typemsg_sharing_narrow    3.5251     1.1609
##                                                          df t value
## (Intercept)                                        431.3357  29.536
## article_condself                                  7536.0749   6.162
## article_condsocial                                7535.2126   3.948
## sharing_typemsg_sharing_narrow                     743.4852   0.190
## article_condself:sharing_typemsg_sharing_narrow   6961.8849   1.807
## article_condsocial:sharing_typemsg_sharing_narrow 6926.6611   3.036
##                                                               Pr(>|t|)    
## (Intercept)                                       < 0.0000000000000002 ***
## article_condself                                        0.000000000756 ***
## article_condsocial                                      0.000079372297 ***
## sharing_typemsg_sharing_narrow                                  0.8490    
## article_condself:sharing_typemsg_sharing_narrow                 0.0708 .  
## article_condsocial:sharing_typemsg_sharing_narrow               0.0024 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##                 (Intr) artcl_cndsl artcl_cndsc shr___ artcl_cndsl:___
## artcl_cndsl     -0.151                                               
## artcl_cndsc     -0.150  0.075                                        
## shrng_typ__     -0.433  0.272       0.270                            
## artcl_cndsl:___  0.111 -0.686      -0.106      -0.400                
## artcl_cndsc:___  0.110 -0.106      -0.685      -0.398  0.164

Check model

Check model assumptions using the {performance} package

performance::check_model(mod_h3_h4)

Exploratory analyses

Research question 1

We will also explore the degree to which the effects of the experimental manipulations are specific to their relevance target. Does the self condition increase self-relevance more strongly than the social condition?

Results

✅ These data are consistent with the hypothesis that the self condition increases self-relevance more strongly than the social condition

mod_e1 = lmer(msg_rel_self ~ 1 + article_cond + (1 + article_cond | SID),
              data = filter(messages_mod, !article_cond == "control"), 
              control = lmerControl(optimizer = "bobyqa"))

Plot

ggeffects::ggpredict(mod_e1, c("article_cond")) %>%
  data.frame() %>%
  mutate(x = factor(x, levels = c("self", "social"))) %>%
  ggplot(aes(x = x, y = predicted, fill = x)) +
    geom_bar(stat = "identity", position = position_dodge(.9)) +
    geom_errorbar(aes(ymin = conf.low, ymax = conf.high), position = position_dodge(.9), width = 0, size = 1) +
    scale_fill_manual(name = "", values = c(palette_condition[1], palette_condition[3])) +
    labs(x = "", y = "mean predicted rating\n") +
    plot_aes +
    theme(legend.position = "none")

Model table

table_model(mod_e1, reference = "self")

term	b [95% CI]	df	t	p
self	64.93 [61.87, 67.99]	199.00	41.83	< .001
social	-6.57 [-10.96, -2.19]	394.43	-2.95	< .001

Summary

summary(mod_e1)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: msg_rel_self ~ 1 + article_cond + (1 + article_cond | SID)
##    Data: filter(messages_mod, !article_cond == "control")
## Control: lmerControl(optimizer = "bobyqa")
## 
## REML criterion at convergence: 36758.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.5494 -0.5313  0.1243  0.5669  3.2953 
## 
## Random effects:
##  Groups   Name               Variance Std.Dev. Corr 
##  SID      (Intercept)        433.0    20.81         
##           article_condsocial 287.2    16.95    -0.38
##  Residual                    488.8    22.11         
## Number of obs: 3970, groups:  SID, 397
## 
## Fixed effects:
##                    Estimate Std. Error      df t value             Pr(>|t|)    
## (Intercept)          64.927      1.552 199.000  41.827 < 0.0000000000000002 ***
## article_condsocial   -6.573      2.229 394.428  -2.948              0.00339 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## artcl_cndsc -0.696
## optimizer (bobyqa) convergence code: 0 (OK)
## unable to evaluate scaled gradient
## Model failed to converge: degenerate  Hessian with 1 negative eigenvalues

Check model

Check model assumptions using the {performance} package

performance::check_model(mod_e1)

Research question 2

We will also explore the degree to which the effects of the experimental manipulations are specific to their relevance target. Does the social condition increase social relevance more strongly than the self condition?

Results

❌ These data are inconsistent with the hypothesis that the social condition increases social relevance more strongly than the self condition

mod_e2 = lmer(msg_rel_social ~ 1 + article_cond + (1 + article_cond | SID),
              data = filter(messages_mod, !article_cond == "control"), 
              control = lmerControl(optimizer = "bobyqa"))

Plot

ggeffects::ggpredict(mod_e2, c("article_cond")) %>%
  data.frame() %>%
  mutate(x = factor(x, levels = c("self", "social"))) %>%
  ggplot(aes(x = x, y = predicted, fill = x)) +
    geom_bar(stat = "identity", position = position_dodge(.9)) +
    geom_errorbar(aes(ymin = conf.low, ymax = conf.high), position = position_dodge(.9), width = 0, size = 1) +
    scale_fill_manual(name = "", values = c(palette_condition[1], palette_condition[3])) +
    labs(x = "", y = "mean predicted rating\n") +
    plot_aes +
    theme(legend.position = "none")

Model table

table_model(mod_e2, reference = "self")

term	b [95% CI]	df	t	p
self	66.47 [63.51, 69.44]	199.00	44.18	< .001
social	1.48 [-2.60, 5.55]	394.16	0.71	.480

Summary

summary(mod_e2)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: msg_rel_social ~ 1 + article_cond + (1 + article_cond | SID)
##    Data: filter(messages_mod, !article_cond == "control")
## Control: lmerControl(optimizer = "bobyqa")
## 
## REML criterion at convergence: 35666.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.1246 -0.4177  0.1240  0.5330  3.8295 
## 
## Random effects:
##  Groups   Name               Variance Std.Dev. Corr 
##  SID      (Intercept)        416.1    20.40         
##           article_condsocial 246.5    15.70    -0.47
##  Residual                    365.9    19.13         
## Number of obs: 3970, groups:  SID, 397
## 
## Fixed effects:
##                    Estimate Std. Error      df t value            Pr(>|t|)    
## (Intercept)          66.474      1.504 199.000  44.184 <0.0000000000000002 ***
## article_condsocial    1.479      2.073 394.161   0.714               0.476    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## artcl_cndsc -0.726

Check model

Check model assumptions using the {performance} package

performance::check_model(mod_e2)

Package citations

report::cite_packages()

##   - Douglas Bates and Martin Maechler (2021). Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.3-4. https://CRAN.R-project.org/package=Matrix
##   - Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.
##   - Greg Lin (2020). reactable: Interactive Data Tables Based on 'React Table'. R package version 0.2.3. https://CRAN.R-project.org/package=reactable
##   - H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
##   - Hadley Wickham (2019). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.4.0. https://CRAN.R-project.org/package=stringr
##   - Hadley Wickham (2021). forcats: Tools for Working with Categorical Variables (Factors). R package version 0.5.1. https://CRAN.R-project.org/package=forcats
##   - Hadley Wickham and Maximilian Girlich (2022). tidyr: Tidy Messy Data. R package version 1.2.0. https://CRAN.R-project.org/package=tidyr
##   - Hadley Wickham, Jennifer Bryan and Malcolm Barrett (2021). usethis: Automate Package and Project Setup. R package version 2.1.5. https://CRAN.R-project.org/package=usethis
##   - Hadley Wickham, Jim Hester and Jennifer Bryan (2022). readr: Read Rectangular Text Data. R package version 2.1.2. https://CRAN.R-project.org/package=readr
##   - Hadley Wickham, Jim Hester, Winston Chang and Jennifer Bryan (2021). devtools: Tools to Make Developing R Packages Easier. R package version 2.4.3. https://CRAN.R-project.org/package=devtools
##   - Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2022). dplyr: A Grammar of Data Manipulation. R package version 1.0.8. https://CRAN.R-project.org/package=dplyr
##   - Hao Zhu (2021). kableExtra: Construct Complex Table with 'kable' and Pipe Syntax. R package version 1.3.4. https://CRAN.R-project.org/package=kableExtra
##   - Karthik Ram and Hadley Wickham (2018). wesanderson: A Wes Anderson Palette Generator. R package version 0.3.6. https://CRAN.R-project.org/package=wesanderson
##   - Kirill Müller and Hadley Wickham (2021). tibble: Simple Data Frames. R package version 3.1.6. https://CRAN.R-project.org/package=tibble
##   - Kuznetsova A, Brockhoff PB, Christensen RHB (2017). "lmerTest Package:Tests in Linear Mixed Effects Models." _Journal of StatisticalSoftware_, *82*(13), 1-26. doi: 10.18637/jss.v082.i13 (URL:https://doi.org/10.18637/jss.v082.i13).
##   - Lionel Henry and Hadley Wickham (2020). purrr: Functional Programming Tools. R package version 0.3.4. https://CRAN.R-project.org/package=purrr
##   - Lüdecke et al., (2021). performance: An R Package for Assessment, Comparison and Testing of Statistical Models. Journal of Open Source Software, 6(60), 3139. https://doi.org/10.21105/joss.03139
##   - Makowski, D., Ben-Shachar, M.S., Patil, I. & Lüdecke, D. (2020). Automated Results Reporting as a Practical Tool to Improve Reproducibility and Methodological Best Practices Adoption. CRAN. Available from https://github.com/easystats/report. doi: .
##   - Paolo Di Lorenzo (2021). usmap: US Maps Including Alaska and Hawaii. R package version 0.5.2. https://CRAN.R-project.org/package=usmap
##   - R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
##   - Rinker, T. W. & Kurkiewicz, D. (2017). pacman: Package Management for R. version 0.5.0. Buffalo, New York. http://github.com/trinker/pacman
##   - Wickham et al., (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686, https://doi.org/10.21105/joss.01686
##   - Yihui Xie (2021). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.37.

Reproducible report example

Dani cosme

2022-03-17

Prep data

Load packages

Define aesthetics

Define functions

Load data

Tidy data

Sample demographics

Sample size

Geographic distribution

Age

Table

Plot

Gender

Table

Plot

Race and ethnicity

Table

Plot

Household income

Table

Plot

Education

Table

Plot

Data quality checks

Missing data

Outliers

Descriptive statistics

Visualize raw data

Condition effects

Condition effects with data

Correlations between relevance and sharing

Correlations between self and social relevance

Preregistered analyses

Hypothesis 1

Plot

Model table

Summary

Check model

Hypothesis 2

Plot

Model table

Summary

Check model

Hypothesis 3 & 4

Plot

Model table

Summary

Check model

Exploratory analyses

Research question 1

Plot

Model table

Summary

Check model

Research question 2

Plot

Model table

Summary

Check model

Package citations