Session 1: Getting Started
Te Kāhui Roro Reo | New Zealand Institute of Language, Brain and Behaviour
Te Whare Wānanga o Waitaha | University of Canterbury
2025-07-31





Ctrl + Shift + N



Tools > Global Options).

For reasons, see: https://r4ds.hadley.nz/workflow-scripts.html
Why projects?: https://www.tidyverse.org/blog/2017/12/workflow-vs-script/
At the > prompt, type ‘2 + 2’ and press enter.
🥳🥳🥳🥳🥳
File > New File > R Script)
getting-started.R.
There is a script with all of these commands at scripts/basics.R
+’!
? before function name produces help (in output pane)
TRUE or FALSE
&’) or ‘or’ (|)
c stands for “combine”.
<-’
View(toddlers): a spreadsheet-style view of data.
$’ to access columns in a data frame.
install.packages().
install.packages() directly in a script.
library(), or individual functions can be accessed using ::
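As a minimal sketch of the basics listed above (comments, c(), <-, TRUE/FALSE, & and |, and $), the following uses a small made-up data frame. The name toy_toddlers and its columns and values are invented for illustration and are not the workshop's toddlers data.

```r
# Anything after a # is a comment and is ignored by R.

# c() combines values into a vector; <- assigns it to a name.
ages <- c(18, 24, 30)
words <- c(50, 300, 550)

# Comparisons return TRUE or FALSE, one value per element.
older <- ages > 20

# & means 'and', | means 'or'.
older_and_talkative <- ages > 20 & words > 500
older_or_talkative <- ages > 20 | words > 500

# Hypothetical data frame; $ picks out one column.
toy_toddlers <- data.frame(age_months = ages, word_count = words)
mean(toy_toddlers$word_count)
```

In RStudio, View(toy_toddlers) would open the same data in the spreadsheet-style viewer.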
_______________________________________
<Ko te Kāhui Roro Reo te rōpū pai rawe>
---------------------------------------
\
\
|\___/|
==) ^Y^ (==
\ ^ /
)=*=(
/ \
| |
/| | | |\
\| | |_|/\
jgs //_// ___/
\_)
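The two ways of reaching a package's functions mentioned earlier (library() and ::) can be sketched with the tools package, which ships with R, so nothing needs installing. The choice of tools and file_ext() is only an illustration and is not taken from the workshop script.

```r
# Installing is a one-off step, best run in the console rather than
# saved in a script (the package name here is just an example):
# install.packages("dplyr")

# Option 1: call a single function with :: without loading the package.
ext_one <- tools::file_ext("getting-started.R")

# Option 2: load the package once, then use its functions by name.
library(tools)
ext_two <- file_ext("scripts/basics.R")

ext_one
ext_two
```

Using :: makes it obvious where a function comes from, while library() saves typing when many functions from the same package are used.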
tidyverse
tidyverse is a popular collection of packages.
dplyr: functions for data filtering and transformation.
ggplot2: a popular package for data visualisation.
readr: functions for reading and writing data.
stringr: functions for manipulating strings.
tidyverse packages has a different style. (…better)
data in our project.
read_tsv() loads ‘tab separated values’ (similar to a .csv file)
summary() tells us about some of the variables.
hist() is a built-in function for making histograms.
# Chaining functions together with pipes.
wellform_filtered <- wellform |>
  filter(reactionTime < 5) |>
  select(
    workerId, stimulus, word,
    enteredResponse, reactionTime, score.shortv
  ) |>
  rename(
    participant = workerId,
    response = enteredResponse,
    reaction_time = reactionTime,
    phonotactic_score = score.shortv
  )
tidyverse style.
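To see what the pipe in the code above is doing, here is a minimal base R sketch with made-up numbers. The pipe passes the value on its left as the first argument of the function on its right, so a piped chain gives the same result as the nested call. The native pipe |> needs R 4.1 or later.

```r
# Nested calls are read from the inside out.
nested <- sum(sqrt(c(1, 4, 9)))

# The same steps with the native pipe are read from top to bottom.
piped <- c(1, 4, 9) |>
  sqrt() |>
  sum()

# Both versions give the same answer.
identical(nested, piped)
```

The dplyr chain works the same way: each of filter(), select() and rename() receives the data frame produced by the previous step.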
lm() function.
Call:
lm(formula = response ~ phonotactic_score + reaction_time, data = wellform_filtered)

Residuals:
    Min      1Q  Median      3Q     Max
-3.0752 -0.8669  0.1387  1.0105  2.8295

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        6.16334    0.11274   54.67   <2e-16 ***
phonotactic_score  3.53412    0.12287   28.76   <2e-16 ***
reaction_time      0.04016    0.02039    1.97   0.0489 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.217 on 4327 degrees of freedom
Multiple R-squared:  0.1605,  Adjusted R-squared:  0.1602
F-statistic: 413.8 on 2 and 4327 DF,  p-value: < 2.2e-16
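The output above comes from calling summary() on a fitted model. As a hedged sketch of how such a model is fitted and how its coefficient table can be pulled out for further use, the following applies the same formula to simulated stand-in data. The toy data frame, its values and the name toy_fit are all invented here, so the numbers will not match the workshop output.

```r
# Simulated stand-in data (hypothetical; not the workshop's wellform data).
set.seed(1)
toy <- data.frame(
  phonotactic_score = runif(200, min = -1.2, max = -0.6),
  reaction_time = runif(200, min = 0.5, max = 5)
)
toy$response <- 6 + 3.5 * toy$phonotactic_score + rnorm(200)

# Same formula as in the Call: line above, fitted to the toy data.
toy_fit <- lm(response ~ phonotactic_score + reaction_time, data = toy)

# summary() prints the full report; coef() on the summary returns the
# coefficient table as a matrix that can be indexed by name.
coef_table <- coef(summary(toy_fit))
coef_table[, c("Estimate", "Std. Error")]
```

Extracting the table this way is handy when estimates need to go into a plot or a report instead of being copied by hand.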
# Get predictions from model.
# Step 1: Decide what we want predictions for. In this case,
# the full range of phonotactic scores at the mean value for
# reaction time.
to_predict <- data.frame(
  phonotactic_score = seq(-1.2, -0.6, by = 0.01),
  reaction_time = mean(wellform_filtered$reaction_time)
)
# Step 2: Get predictions using the `predict()` function.
model_predictions <- predict(
  wellform_fit,
  newdata = to_predict,
  se.fit = TRUE
)
# Step 3: Add predictions and 95% confidence intervals to the `to_predict` data
# frame. The standard errors are stored in the `se.fit` element.
to_predict$prediction <- model_predictions$fit
to_predict$upper <- model_predictions$fit + 1.96 * model_predictions$se.fit
to_predict$lower <- model_predictions$fit - 1.96 * model_predictions$se.fit
# Step 4: Visualise again.
to_predict |>
  ggplot(
    aes(
      x = phonotactic_score,
      y = prediction,
      ymin = lower,
      ymax = upper
    )
  ) +
  geom_ribbon(alpha = 0.4) +
  geom_line()
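The 1.96 multiplier in Step 3 is the normal approximation to a 95% interval. As a sketch on simulated data (toy, toy_fit and new_x are invented names, not part of the workshop script), the following shows that predict() can also return the interval directly with interval = "confidence", and that this matches a manual calculation using the t distribution with the model's residual degrees of freedom.

```r
# Simulated stand-in data and model (hypothetical).
set.seed(1)
toy <- data.frame(x = runif(100))
toy$y <- 2 + 3 * toy$x + rnorm(100)
toy_fit <- lm(y ~ x, data = toy)
new_x <- data.frame(x = seq(0, 1, by = 0.1))

# Manual interval: fit plus or minus a t multiplier times the standard error.
manual <- predict(toy_fit, newdata = new_x, se.fit = TRUE)
t_mult <- qt(0.975, df = manual$df)
manual_upper <- manual$fit + t_mult * manual$se.fit
manual_lower <- manual$fit - t_mult * manual$se.fit

# Built-in interval: a matrix with columns fit, lwr and upr.
built_in <- predict(toy_fit, newdata = new_x, interval = "confidence")

# The two approaches agree.
all.equal(unname(manual_upper), unname(built_in[, "upr"]))
all.equal(unname(manual_lower), unname(built_in[, "lwr"]))
```

With thousands of residual degrees of freedom, as in the model above, the t multiplier is very close to 1.96, so the two approaches give almost identical ribbons.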
We’ve covered a lot!
Next time: More on data processing.
Comments
Anything after # will be ignored by R.