Classcodes are classification schemes based on regular expressions stored in
data frames. They are essential to the package and constitute the third
part of the triad of case data, code data and a classification scheme.
Usage
as.classcodes(x, ...)
# S3 method for class 'classcodes'
as.classcodes(
  x,
  ...,
  regex = attr(x, "regexpr"),
  indices = attr(x, "indices"),
  hierarchy = attr(x, "hierarchy")
)
# S3 method for class 'data.frame'
as.classcodes(
  x,
  ...,
  regex = NULL,
  indices = NULL,
  hierarchy = attr(x, "hierarchy"),
  .name = NULL
)
is.classcodes(x)
Arguments
- x
data frame with columns described in the details section. Alternatively a
classcodes object to be modified.
- ...
arguments passed between methods
- regex, indices
character vector with names of columns in
x containing regular expressions/indices.
- hierarchy
named list of pairwise group names to appear as superior and subordinate for indices. To be used for indexing when the subordinate class is redundant (see the details section of
elixhauser for an example).
- .name
used internally for name dispatch
Value
Object of class classcodes (inheriting from data frame)
with additional attributes:
- code: the coding used (for example "icd10", or "ATC"). NULL for unknown/arbitrary coding.
- regexprs: name of columns with regular expressions (as specified by the regex argument)
- indices: name of columns with (optional) index weights (as specified by the indices argument)
- hierarchy: list as specified by the hierarchy argument.
- name: name as specified by the .name argument.
Details
A classcodes object is a data frame with mandatory columns:
- group: unique and non-missing class names
- At least one column with regular expressions (regex without Perl-like versions) defining class membership. Those columns can have arbitrary names (as specified by the regex argument). Occurrences of non-unique regular expressions will lead to the same class having multiple names. This is accepted but will raise a warning. Classes do not have to be disjoint.
The object can have additional optional columns:
- description: description of each category
- condition: a class might have conditions additional to what is expressed by the regular expressions. If so, these should be specified as quoted expressions that can be evaluated within the data frame used by classify()
- weights for each class used by index(). Could be more than one and could have arbitrary names (as specified by the indices argument).
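The role of a regular expression column can be illustrated with a minimal base R sketch. It does not use the package's own classify() and makes no claim about how that function is implemented; the data frame toy, its two classes, the patterns and the codes are all made up for illustration. It only shows that a code belongs to a class when it matches that class's regular expression, and that one code may belong to several classes.

```r
# Hypothetical toy scheme (not shipped with the package):
# a mandatory `group` column plus one regex column named `icd10`.
toy <- data.frame(
  group = c("hypertension", "heart disease"),
  icd10 = c("I1[0-5]", "I(1[13]|50)"),
  stringsAsFactors = FALSE
)

# Made-up codes to check against the scheme.
codes <- c("I10", "I11", "I50", "J45")

# One logical column per class: TRUE when a code matches that class's regex.
# grepl() uses extended (non-Perl) regular expressions by default.
membership <- vapply(
  toy$icd10,
  function(pattern) grepl(pattern, codes),
  logical(length(codes))
)
dimnames(membership) <- list(codes, toy$group)

membership
```

In this toy scheme the code "I11" matches both patterns, so the two classes overlap, while "J45" matches neither.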
See also
vignette("classcodes")
vignette("Interpret_regular_expressions")
The package has several default classcodes included; see all_classcodes().
Other classcodes:
all_classcodes(),
as.data.frame.classified(),
codebook(),
print.classcodes(),
print.classified(),
set_classcodes(),
summary.classcodes(),
visualize.classcodes()
Examples
# The Elixhauser comorbidity classification is already a classcodes object
is.classcodes(coder::elixhauser)
#> [1] TRUE
# Strip its class attributes to use in examples
df <- as.data.frame(coder::elixhauser)
# Specify which columns store regular expressions and indices
# (assume no hierarchy)
elix <-
  as.classcodes(
    df,
    regex = c("icd10", "icd10_short", "icd9cm", "icd9cm_ahrqweb", "icd9cm_enhanced"),
    indices = c("sum_all", "sum_all_ahrq", "walraven",
                "sid29", "sid30", "ahrq_mort", "ahrq_readm"),
    hierarchy = NULL
  )
elix
#>
#> Classcodes object
#>
#> Regular expressions:
#> icd10, icd10_short, icd9cm, icd9cm_ahrqweb, icd9cm_enhanced
#> Indices:
#> sum_all, sum_all_ahrq, walraven, sid29, sid30, ahrq_mort, ahrq_readm
#>
#> # A tibble: 31 × 13
#> group icd10 icd10_short icd9cm icd9cm_ahrqweb icd9cm_enhanced sum_all
#> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
#> 1 congestive h… I(09… I(09|1[13]… 39891… 39891|4(0(2[0… 39891|4(0(2[01… 1
#> 2 cardiac arrh… I(44… I(4[457-9]… 42(6(… NA 42(6([079|1[02… 1
#> 3 valvular dis… A520… A52|I(0[5-… 0932|… 0932|39([4-6]… 0932|39[4-7]|4… 1
#> 4 pulmonary ci… I(2(… I2[678] 41(6|… 41(6|79) 41(5[01]|6|7[0… 1
#> 5 peripheral v… I7([… I7[01389]|… 44(0|… 44([0-2]|3[1-… 0930|4(373|4([… 1
#> 6 hypertension… I10 I10 401[1… 401[19]|6420 401 1
#> 7 hypertension… I1[1… I1[1-35] 40([2… 40(10|[2-5])|… 40[2-5] 1
#> 8 paralysis G(04… G(04|11|8[… 34(2[… 34[2-4]|438[2… 3(341|4([23]|4… 1
#> 9 other neurol… G(1[… G(1[0-3]|2… 3(3(1… 3(3([0145]|20… 3(3(19|2[01]|3… 1
#> 10 chronic pulm… (I27… I27|(J([46… 49(([… 49|50([0-5]|6… 4(16[89]|90)|5… 1
#> # ℹ 21 more rows
#> # ℹ 6 more variables: sum_all_ahrq <dbl>, walraven <dbl>, sid29 <dbl>,
#> # sid30 <dbl>, ahrq_mort <dbl>, ahrq_readm <dbl>
# Specify hierarchy for patients with different types of cancer and diabetes
# See `?elixhauser` for details
as.classcodes(
  elix,
  hierarchy = list(
    cancer = c("metastatic cancer", "solid tumor"),
    diabetes = c("diabetes complicated", "diabetes uncomplicated")
  )
)
#>
#> Classcodes object
#>
#> Regular expressions:
#> icd10, icd10_short, icd9cm, icd9cm_ahrqweb, icd9cm_enhanced
#> Indices:
#> sum_all, sum_all_ahrq, walraven, sid29, sid30, ahrq_mort, ahrq_readm
#> Hierarchy:
#> c("metastatic cancer", "solid tumor"),
#> c("diabetes complicated", "diabetes uncomplicated")
#>
#> # A tibble: 31 × 13
#> group icd10 icd10_short icd9cm icd9cm_ahrqweb icd9cm_enhanced sum_all
#> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
#> 1 congestive h… I(09… I(09|1[13]… 39891… 39891|4(0(2[0… 39891|4(0(2[01… 1
#> 2 cardiac arrh… I(44… I(4[457-9]… 42(6(… NA 42(6([079|1[02… 1
#> 3 valvular dis… A520… A52|I(0[5-… 0932|… 0932|39([4-6]… 0932|39[4-7]|4… 1
#> 4 pulmonary ci… I(2(… I2[678] 41(6|… 41(6|79) 41(5[01]|6|7[0… 1
#> 5 peripheral v… I7([… I7[01389]|… 44(0|… 44([0-2]|3[1-… 0930|4(373|4([… 1
#> 6 hypertension… I10 I10 401[1… 401[19]|6420 401 1
#> 7 hypertension… I1[1… I1[1-35] 40([2… 40(10|[2-5])|… 40[2-5] 1
#> 8 paralysis G(04… G(04|11|8[… 34(2[… 34[2-4]|438[2… 3(341|4([23]|4… 1
#> 9 other neurol… G(1[… G(1[0-3]|2… 3(3(1… 3(3([0145]|20… 3(3(19|2[01]|3… 1
#> 10 chronic pulm… (I27… I27|(J([46… 49(([… 49|50([0-5]|6… 4(16[89]|90)|5… 1
#> # ℹ 21 more rows
#> # ℹ 6 more variables: sum_all_ahrq <dbl>, walraven <dbl>, sid29 <dbl>,
#> # sid30 <dbl>, ahrq_mort <dbl>, ahrq_readm <dbl>
# Several checks are performed to prevent erroneous classcodes objects
if (FALSE) { # \dontrun{
  as.classcodes(iris)
  as.classcodes(iris, regex = "Species")
} # }
