Reads comma-separated (CSV) files generated by Qualtrics software. The second line of the file, which contains the variable labels, is imported along with the data. Repetitive introductions to matrix questions are automatically removed. Variable labels are stored as attributes.
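To make the role of the label line concrete, here is a minimal base R sketch of the idea. It is not the package's implementation. It assumes the current export layout, in which the first three lines are headers (column names, variable labels, import IDs), and it uses a made-up two-column file. The names `csv_lines`, `path`, and `responses` are illustrative only.

```r
# Hypothetical miniature of a Qualtrics-style export (made-up data):
# line 1 = column names, line 2 = variable labels, line 3 = import IDs.
csv_lines <- c(
  'ResponseId,Q1',
  'Response ID,How satisfied are you?',
  '"{""ImportId"":""_recordId""}","{""ImportId"":""QID1""}"',
  'R_001,4',
  'R_002,5'
)
path <- tempfile(fileext = ".csv")
writeLines(csv_lines, path)

# Read the first two header lines as plain character values.
header <- read.csv(path, header = FALSE, nrows = 2, colClasses = "character")
col_names <- unlist(header[1, ], use.names = FALSE)
labels <- unlist(header[2, ], use.names = FALSE)

# Read the responses, skipping all three header lines.
responses <- read.csv(path, header = FALSE, skip = 3, col.names = col_names)

# Store each label as a "label" attribute on its column.
for (i in seq_along(col_names)) {
  attr(responses[[i]], "label") <- labels[i]
}

attr(responses$Q1, "label")
```

The labels travel with the columns but do not appear when the data frame is printed, which is the behaviour described for `read_survey()` under Value.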

Usage

read_survey(
  file_name,
  strip_html = TRUE,
  import_id = FALSE,
  time_zone = NULL,
  legacy = FALSE,
  add_column_map = TRUE,
  add_var_labels = TRUE,
  col_types = NULL
)

Arguments

file_name

String. A CSV data file.

strip_html

Logical. If TRUE, then remove HTML tags from variable descriptions. Defaults to TRUE.

import_id

Logical. If TRUE, use Qualtrics import IDs instead of question IDs as column names. Defaults to FALSE.

time_zone

String. A local time zone used to determine response date values. Defaults to NULL, which corresponds to UTC. See "Dates and Times" from Qualtrics for more information on format.

legacy

Logical. If TRUE, then import "legacy" format CSV files (as of 2017). Defaults to FALSE.

add_column_map

Logical. If TRUE, then a column map data frame will be added as an attribute to the main response data frame. This column map captures Qualtrics-provided metadata associated with the response download, such as an item description and internal IDs. Defaults to TRUE.

add_var_labels

Logical. If TRUE, then the item description from each variable (equivalent to the one in the column map) will be added as a "label" attribute using sjlabelled::set_label(). Useful for reference as well as cross-compatibility with other stats packages (e.g., Stata, see documentation in sjlabelled). Defaults to TRUE.

col_types

Optional. This argument provides a way to manually override column types that may be incorrectly guessed. Takes a readr::cols() specification. See example below and readr::cols() for formatting details. Defaults to NULL.
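As a rough sketch of how several of these arguments combine, the call below reads the sample file bundled with the package, sets a time zone, and turns off the column map and variable labels. The time zone "America/Chicago" is an arbitrary choice for illustration, and `df_plain` is a hypothetical name.

```r
library(qualtRics)

# Sample file shipped with the package (also used in the Examples section)
file <- system.file("extdata", "sample.csv", package = "qualtRics")

# Set a local time zone for the date columns and skip the extra metadata.
# "America/Chicago" is an arbitrary choice for illustration.
df_plain <- read_survey(
  file,
  time_zone = "America/Chicago",
  add_column_map = FALSE,
  add_var_labels = FALSE
)

# Date columns are parsed as date-times
class(df_plain$StartDate)
```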

Value

A data frame. Variable labels are stored as attributes. They are not printed to the console but are visible in the RStudio viewer.
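Because the labels are not printed, it can help to inspect them directly. The sketch below uses only base R accessors on the bundled sample file. It assumes the default `add_var_labels = TRUE` and `add_column_map = TRUE`. The names `all_labels` and `extra_attrs` are illustrative.

```r
library(qualtRics)

file <- system.file("extdata", "sample.csv", package = "qualtRics")
df <- read_survey(file)

# Label of a single column, stored as a "label" attribute on the vector
attr(df$StartDate, "label", exact = TRUE)

# Labels of all columns as a named character vector (NA where none is set)
all_labels <- vapply(df, function(x) {
  lab <- attr(x, "label", exact = TRUE)
  if (is.null(lab)) NA_character_ else as.character(lab)[1]
}, character(1))
head(all_labels)

# Attributes attached to the data frame itself, beyond the standard ones;
# with add_column_map = TRUE the column map is expected to be among them
extra_attrs <- setdiff(names(attributes(df)), c("names", "row.names", "class"))
extra_attrs
```

`sjlabelled::get_label(df)` is an alternative way to retrieve all labels at once if that package is installed.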

Examples

if (FALSE) {
# Generic use of read_survey()
df <- read_survey("<YOUR-PATH-TO-CSV-FILE>")
}
# Example using current data format
file <- system.file("extdata", "sample.csv", package = "qualtRics")
df <- read_survey(file)
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   .default = col_double(),
#>   StartDate = col_datetime(format = ""),
#>   EndDate = col_datetime(format = ""),
#>   IPAddress = col_character(),
#>   RecordedDate = col_datetime(format = ""),
#>   ResponseId = col_character(),
#>   RecipientLastName = col_logical(),
#>   RecipientFirstName = col_logical(),
#>   RecipientEmail = col_logical(),
#>   ExternalReference = col_logical(),
#>   DistributionChannel = col_character(),
#>   UserLanguage = col_character(),
#>   Q2 = col_character(),
#>   `Q2 - Topics` = col_logical()
#> )
#>  Use `spec()` for the full column specifications.

# Example using legacy data format
file <- system.file("extdata", "sample_legacy.csv", package = "qualtRics")
df <- read_survey(file, legacy = TRUE)
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   V1 = col_character(),
#>   V2 = col_character(),
#>   V3 = col_character(),
#>   V4 = col_logical(),
#>   V5 = col_logical(),
#>   V6 = col_character(),
#>   V7 = col_double(),
#>   V8 = col_datetime(format = ""),
#>   V9 = col_datetime(format = ""),
#>   V10 = col_double(),
#>   Q1 = col_double(),
#>   Q2 = col_character(),
#>   LocationLatitude = col_double(),
#>   LocationLongitude = col_double(),
#>   LocationAccuracy = col_double()
#> )

# Example changing column type
file <- system.file("extdata", "sample.csv", package = "qualtRics")
# Force EndDate to be a string
df <- read_survey(file, col_types = readr::cols(EndDate = readr::col_character()))