This service returns information about samples used in analyses from all datasets. Results may be filtered by dataset ID, sample ID, subject ID, sample metadata, or other provided parameters. By default, this service queries the latest GTEx release.

GTEx Portal API documentation

Usage

get_sample_datasets(
  datasetId = "gtex_v8",
  sampleIds = NULL,
  tissueSampleIds = NULL,
  subjectIds = NULL,
  ageBrackets = NULL,
  sex = NULL,
  pathCategory = NULL,
  tissueSiteDetailIds = NULL,
  aliquotIds = NULL,
  autolysisScores = NULL,
  hardyScales = NULL,
  ischemicTimes = NULL,
  ischemicTimeGroups = NULL,
  rins = NULL,
  uberonIds = NULL,
  dataTypes = NULL,
  sortBy = "sampleId",
  sortDirection = "asc",
  page = 0,
  itemsPerPage = getOption("gtexr.itemsPerPage"),
  .verbose = getOption("gtexr.verbose"),
  .return_raw = FALSE
)

Arguments

datasetId

String. Unique identifier of a dataset. Usually includes a data source and data release. Options: "gtex_v8", "gtex_snrnaseq_pilot".

sampleIds

Character vector. GTEx sample ID(s).

tissueSampleIds

Character vector. Tissue sample ID(s).

subjectIds

Character vector. GTEx subject ID(s).

ageBrackets

Character vector. The age bracket(s) of the donors of interest. Options: "20-29", "30-39", "40-49", "50-59", "60-69", "70-79".

sex

String. Options: "male", "female".

pathCategory

Character vector. Options: "adenoma", "amylacea", "atelectasis", "atherosclerosis", "atherosis", "atrophy", "calcification", "cirrhosis", "clean_specimens", "congestion", "corpora_albicantia", "cyst", "desquamation", "diabetic", "dysplasia", "edema", "emphysema", "esophagitis", "fibrosis", "gastritis", "glomerulosclerosis", "goiter", "gynecomastoid", "hashimoto", "heart_failure_cells", "hemorrhage", "hepatitis", "hyalinization", "hypereosinophilia", "hyperplasia", "hypertrophy", "hypoxic", "infarction", "inflammation", "ischemic_changes", "macrophages", "mastopathy", "metaplasia", "monckeberg", "necrosis", "nephritis", "nephrosclerosis", "no_abnormalities", "nodularity", "pancreatitis", "pigment", "pneumonia", "post_menopausal", "prostatitis", "saponification", "scarring", "sclerotic", "solar_elastosis", "spermatogenesis", "steatosis", "sweat_glands", "tma".

tissueSiteDetailIds

Character vector of IDs for tissues of interest. Can be GTEx specific IDs (e.g. "Whole_Blood"; use get_tissue_site_detail() to see valid values) or Ontology IDs.

aliquotIds

Character vector.

autolysisScores

Character vector. Options: "None", "Mild", "Moderate", "Severe".

hardyScales

Character vector. A list of Hardy Scale(s) of interest. Options: "Ventilator case", "Fast death - violent", "Fast death - natural causes", "Intermediate death", "Slow death".

ischemicTimes

Integer.

ischemicTimeGroups

Character vector. Options: "<= 0", "1 - 300", "301 - 600", "601 - 900", "901 - 1200", "1201 - 1500", "> 1500".

rins

Integer vector.

uberonIds

Character vector of Uberon IDs (e.g. "UBERON:EFO_0000572"; use get_tissue_site_detail() to see valid values).

dataTypes

Character vector. Options: "RNASEQ", "WGS", "WES", "OMNI", "EXCLUDE".

sortBy

String. Options: "sampleId", "ischemicTime", "aliquotId", "tissueSampleId", "hardyScale", "pathologyNotes", "ageBracket", "tissueSiteDetailId", "sex".

sortDirection

String. Options: "asc", "desc". Default = "asc".

page

Integer (default = 0).

itemsPerPage

Integer (default = 250). Set globally to maximum value 100000 with options(list(gtexr.itemsPerPage = 100000)).

.verbose

Logical. If TRUE (default), print paging information. Set to FALSE globally with options(list(gtexr.verbose = FALSE)).

.return_raw

Logical. If TRUE, return the raw API JSON response. Default = FALSE.
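
As a minimal sketch of how several of the arguments above might be combined, the following builds a filtered query as a named list and passes it to get_sample_datasets() with do.call(). The filter values are taken from the options listed above, but the particular combination is illustrative only, and no claim is made about what the API returns for it. The names `query` and `samples` are hypothetical. The call itself needs the gtexr package and network access, so it is guarded to run only in an interactive session.

```r
# Filter values come from the options documented above; the combination
# itself is an illustrative assumption, not a recommended query.
query <- list(
  datasetId = "gtex_v8",
  tissueSiteDetailIds = "Whole_Blood",
  dataTypes = "RNASEQ",
  sex = "female",
  ageBrackets = c("50-59", "60-69"),
  sortBy = "ischemicTime",
  sortDirection = "desc",
  itemsPerPage = 50,
  .verbose = FALSE
)

# The request needs the gtexr package and network access, so it only runs
# in an interactive session where gtexr is installed.
if (interactive() && requireNamespace("gtexr", quietly = TRUE)) {
  samples <- do.call(gtexr::get_sample_datasets, query)
  print(samples)
}
```

Keeping the arguments in a list makes it straightforward to reuse the same query with a different `page` or `itemsPerPage` value.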

Value

A tibble, or a list if .return_raw = TRUE.

Examples

get_sample_datasets()
#> Warning: ! Total number of items (22734) exceeds the selected maximum page size (250).
#>  22484 items were not retrieved.
#>  To retrieve all available items, increase `itemsPerPage`, ensuring you reuse
#>   your original query parameters e.g.
#>   `get_sample_datasets(<your_existing_parameters>, itemsPerPage = 100000)`
#>  Alternatively, adjust global "gtexr.itemsPerPage" setting e.g.
#>   `options(list(gtexr.itemsPerPage = 100000))`
#> 
#> ── Paging info ─────────────────────────────────────────────────────────────────
#>  numberOfPages = 91
#>  page = 0
#>  maxItemsPerPage = 250
#>  totalNumberOfItems = 22734
#> # A tibble: 250 × 20
#>    ischemicTime aliquotId tissueSampleId       tissueSiteDetail         dataType
#>           <int> <chr>     <chr>                <chr>                    <chr>   
#>  1         1188 SM-58Q7G  GTEX-1117F-0003      Whole Blood              WES     
#>  2         1188 SM-5DWSB  GTEX-1117F-0003      Whole Blood              OMNI    
#>  3         1188 SM-6WBT7  GTEX-1117F-0003      Whole Blood              WGS     
#>  4         1193 SM-AHZ7F  GTEX-1117F-0011-R10a Brain - Frontal Cortex … NA      
#>  5         1193 SM-CYKQ8  GTEX-1117F-0011-R10b Brain - Frontal Cortex … NA      
#>  6         1214 SM-5GZZ7  GTEX-1117F-0226      Adipose - Subcutaneous   RNASEQ  
#>  7         1220 SM-5EGHI  GTEX-1117F-0426      Muscle - Skeletal        RNASEQ  
#>  8         1221 SM-5EGHJ  GTEX-1117F-0526      Artery - Tibial          RNASEQ  
#>  9         1243 SM-5N9CS  GTEX-1117F-0626      Artery - Coronary        RNASEQ  
#> 10         1244 SM-5GIEN  GTEX-1117F-0726      Heart - Atrial Appendage RNASEQ  
#> # ℹ 240 more rows
#> # ℹ 15 more variables: ischemicTimeGroup <chr>, freezeType <chr>,
#> #   sampleId <chr>, sampleIdUpper <chr>, ageBracket <chr>, hardyScale <chr>,
#> #   tissueSiteDetailId <chr>, subjectId <chr>, uberonId <chr>, sex <chr>,
#> #   datasetId <chr>, rin <dbl>, pathologyNotesCategories <tibble[,57]>,
#> #   pathologyNotes <chr>, autolysisScore <chr>
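
The paging figures in the output above can be reproduced with simple arithmetic. The sketch below uses the totals reported in that output and then shows one way to raise the global page size temporarily and restore the previous setting afterwards. The variable names are hypothetical, and the snippet does not contact the API.

```r
# Figures reported in the paging info above
total_items <- 22734L
items_per_page <- 250L

# Pages needed to cover every item at this page size
number_of_pages <- ceiling(total_items / items_per_page)

# Items left unretrieved after fetching only the first page
not_retrieved <- total_items - items_per_page

# Temporarily raise the global page size, then restore the previous setting.
# options() returns the old values, so they can be passed back unchanged.
old <- options(gtexr.itemsPerPage = 100000)
pages_at_max <- ceiling(total_items / getOption("gtexr.itemsPerPage"))
options(old)

print(c(number_of_pages, not_retrieved, pages_at_max))
```

Here `number_of_pages` is 91 and `not_retrieved` is 22484, matching the warning and paging info above; at the maximum page size of 100000 a single page covers all items.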