Removes or flags duplicated records based on species name and coordinates, plus any additional user-defined columns. Two kinds of duplicates, true (specimen) duplicates and repeated records of the same species at the same coordinates, can make up the bulk of records in a biological collection database but are undesirable for many analyses. This function can flag both; identifying specimen duplicates requires enough additional information, such as collector name and collector number, supplied via additions.
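
The idea can be sketched in base R without the package. This is only an illustration of the duplicate test described above, not the function's actual implementation: it assumes that a record counts as a duplicate when all key columns match an earlier row, and that the first occurrence is kept. The data frame recs and its values are made up for the example.

```r
# Hypothetical toy records; column names follow the function defaults.
recs <- data.frame(
  species = c("a", "a", "b", "a"),
  decimalLongitude = c(10, 10, 10, 3),
  decimalLatitude = c(4, 4, 4, 7),
  collector = c("Bonpl", "Humb", "Bonpl", "Bonpl")
)

key_cols <- c("decimalLongitude", "decimalLatitude", "species")

# TRUE = first occurrence (kept), FALSE = repeats an earlier row
flags <- !duplicated(recs[, key_cols])

# Adding a column narrows the test: rows 1 and 2 now differ by collector,
# so neither is treated as a duplicate of the other.
flags_strict <- !duplicated(recs[, c(key_cols, "collector")])

print(flags)
print(flags_strict)
```

With species and coordinates alone, row 2 repeats row 1. Once collector is part of the key, the two rows are treated as distinct records, so only rows that also share the extra fields would be flagged.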

Usage

cc_dupl(
  x,
  lon = "decimalLongitude",
  lat = "decimalLatitude",
  species = "species",
  additions = NULL,
  value = "clean",
  verbose = TRUE
)

Arguments

x

data.frame. Containing geographical coordinates and species names.

lon

character string. The column with the longitude coordinates. Default = “decimalLongitude”.

lat

character string. The column with the latitude coordinates. Default = “decimalLatitude”.

species

character string. The column with the species name. Default = “species”.

additions

a vector of character strings. Additional columns to be included in the test for duplication, for example collector name and collector number, as in the examples below.

value

character string. Defines the output type, either “clean” or “flagged”. See the Value section.

verbose

logical. If TRUE reports the name of the test and the number of records flagged.

Value

Depending on the ‘value’ argument, either a data.frame containing the records considered correct by the test (“clean”) or a logical vector (“flagged”), with TRUE = test passed and FALSE = test failed/potentially problematic. Default = “clean”.
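
The two output modes can be combined so that nothing is dropped until the flags have been inspected. The sketch below assumes the CoordinateCleaner package is installed and that the logical vector is aligned row by row with the input, as the description above implies. The data and the column name not_duplicate are made up for illustration.

```r
library(CoordinateCleaner)

set.seed(1)  # only to make the simulated records repeatable

# Hypothetical records on a coarse grid so that duplicates are likely
x <- data.frame(
  species = sample(letters[1:3], size = 50, replace = TRUE),
  decimalLongitude = sample(0:3, size = 50, replace = TRUE),
  decimalLatitude = sample(0:3, size = 50, replace = TRUE)
)

# One logical value per input row: TRUE = passed, FALSE = flagged
flags <- cc_dupl(x, value = "flagged", verbose = FALSE)

# Keep the flag as a column for inspection rather than dropping rows
x$not_duplicate <- flags

# Subset manually once the flagged rows have been reviewed
kept <- x[flags, ]
```

Storing the flag as a column makes it possible to review the flagged rows, or to combine this test with others, before any record is removed.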

See also

Other Coordinates: cc_aohi(), cc_cap(), cc_cen(), cc_coun(), cc_equ(), cc_gbif(), cc_inst(), cc_iucn(), cc_outl(), cc_sea(), cc_urb(), cc_val(), cc_zero()

Examples


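library(CoordinateCleaner)

# Simulated records: 100 rows, 10 species on a coarse 11 x 11 grid, so some
# species/coordinate combinations repeat by chance. Shorter columns are recycled.
x <- data.frame(species = letters[1:10],
                decimalLongitude = sample(x = 0:10, size = 100, replace = TRUE),
                decimalLatitude = sample(x = 0:10, size = 100, replace = TRUE),
                collector = "Bonpl",
                collector.number = c(1001, 354),
                collection = rep(c("K", "WAG","FR", "P", "S"), 20))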

cc_dupl(x, value = "flagged")
#> Testing duplicates
#> Flagged 3 records.
#>   [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#>  [13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#>  [25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#>  [37]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#>  [49]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#>  [61]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#>  [73]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#>  [85] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
#>  [97]  TRUE FALSE  TRUE  TRUE
cc_dupl(x, additions = c("collector", "collector.number"))
#> Testing duplicates
#> Removed 3 records.
#>     species decimalLongitude decimalLatitude collector collector.number
#> 1         a               10               4     Bonpl             1001
#> 2         b                7               4     Bonpl              354
#> 3         c                1               0     Bonpl             1001
#> 4         d                5               4     Bonpl              354
#> 5         e                5               8     Bonpl             1001
#> 6         f                0               9     Bonpl              354
#> 7         g                4               3     Bonpl             1001
#> 8         h                4              10     Bonpl              354
#> 9         i                5               5     Bonpl             1001
#> 10        j                8               9     Bonpl              354
#> 11        a                0               2     Bonpl             1001
#> 12        b                0               2     Bonpl              354
#> 13        c                8               0     Bonpl             1001
#> 14        d                3               7     Bonpl              354
#> 15        e                1               4     Bonpl             1001
#> 16        f                7               8     Bonpl              354
#> 17        g                6               3     Bonpl             1001
#> 18        h                6               2     Bonpl              354
#> 19        i               10               8     Bonpl             1001
#> 20        j                9               9     Bonpl              354
#> 21        a                6               8     Bonpl             1001
#> 22        b                5               0     Bonpl              354
#> 23        c                4               7     Bonpl             1001
#> 24        d                9               5     Bonpl              354
#> 25        e                8               9     Bonpl             1001
#> 26        f                5               5     Bonpl              354
#> 27        g                7               7     Bonpl             1001
#> 28        h                2               2     Bonpl              354
#> 29        i                9               4     Bonpl             1001
#> 30        j                9               8     Bonpl              354
#> 31        a               10               5     Bonpl             1001
#> 32        b                1               9     Bonpl              354
#> 33        c                9               0     Bonpl             1001
#> 34        d                6               8     Bonpl              354
#> 35        e                6               3     Bonpl             1001
#> 36        f                3               5     Bonpl              354
#> 37        g                6               5     Bonpl             1001
#> 38        h                9               0     Bonpl              354
#> 39        i                1              10     Bonpl             1001
#> 40        j                8               2     Bonpl              354
#> 41        a                9              10     Bonpl             1001
#> 43        c                9               8     Bonpl             1001
#> 44        d                1              10     Bonpl              354
#> 45        e                1               0     Bonpl             1001
#> 46        f                8               9     Bonpl              354
#> 47        g                0               1     Bonpl             1001
#> 48        h                0               7     Bonpl              354
#> 49        i                0               9     Bonpl             1001
#> 50        j                0               9     Bonpl              354
#> 51        a                8               0     Bonpl             1001
#> 52        b                1               3     Bonpl              354
#> 53        c                2               8     Bonpl             1001
#> 54        d                8              10     Bonpl              354
#> 55        e                4               0     Bonpl             1001
#> 56        f                7               1     Bonpl              354
#> 57        g                5               2     Bonpl             1001
#> 58        h               10               1     Bonpl              354
#> 59        i                0               0     Bonpl             1001
#> 60        j                3               7     Bonpl              354
#> 61        a                9               8     Bonpl             1001
#> 62        b                2               6     Bonpl              354
#> 63        c                1               4     Bonpl             1001
#> 64        d                6               1     Bonpl              354
#> 65        e                9               6     Bonpl             1001
#> 66        f                9               9     Bonpl              354
#> 67        g                1               6     Bonpl             1001
#> 68        h                4               9     Bonpl              354
#> 69        i                5               9     Bonpl             1001
#> 70        j                9               1     Bonpl              354
#> 71        a                6              10     Bonpl             1001
#> 72        b                5               3     Bonpl              354
#> 73        c                2               9     Bonpl             1001
#> 74        d                3               1     Bonpl              354
#> 75        e                2               7     Bonpl             1001
#> 76        f                5              10     Bonpl              354
#> 77        g                3               7     Bonpl             1001
#> 78        h                9               3     Bonpl              354
#> 79        i                6               7     Bonpl             1001
#> 80        j               10               8     Bonpl              354
#> 81        a                8               1     Bonpl             1001
#> 82        b                8               3     Bonpl              354
#> 83        c                0               4     Bonpl             1001
#> 84        d               10               7     Bonpl              354
#> 86        f                2               7     Bonpl              354
#> 87        g                9               3     Bonpl             1001
#> 88        h                0               5     Bonpl              354
#> 89        i               10               3     Bonpl             1001
#> 90        j                9               7     Bonpl              354
#> 91        a                4               5     Bonpl             1001
#> 92        b                4               5     Bonpl              354
#> 93        c               10              10     Bonpl             1001
#> 94        d                6              10     Bonpl              354
#> 95        e                6               8     Bonpl             1001
#> 96        f                7               6     Bonpl              354
#> 97        g                4               6     Bonpl             1001
#> 99        i                5               3     Bonpl             1001
#> 100       j                1               3     Bonpl              354
#>     collection
#> 1            K
#> 2          WAG
#> 3           FR
#> 4            P
#> 5            S
#> 6            K
#> 7          WAG
#> 8           FR
#> 9            P
#> 10           S
#> 11           K
#> 12         WAG
#> 13          FR
#> 14           P
#> 15           S
#> 16           K
#> 17         WAG
#> 18          FR
#> 19           P
#> 20           S
#> 21           K
#> 22         WAG
#> 23          FR
#> 24           P
#> 25           S
#> 26           K
#> 27         WAG
#> 28          FR
#> 29           P
#> 30           S
#> 31           K
#> 32         WAG
#> 33          FR
#> 34           P
#> 35           S
#> 36           K
#> 37         WAG
#> 38          FR
#> 39           P
#> 40           S
#> 41           K
#> 43          FR
#> 44           P
#> 45           S
#> 46           K
#> 47         WAG
#> 48          FR
#> 49           P
#> 50           S
#> 51           K
#> 52         WAG
#> 53          FR
#> 54           P
#> 55           S
#> 56           K
#> 57         WAG
#> 58          FR
#> 59           P
#> 60           S
#> 61           K
#> 62         WAG
#> 63          FR
#> 64           P
#> 65           S
#> 66           K
#> 67         WAG
#> 68          FR
#> 69           P
#> 70           S
#> 71           K
#> 72         WAG
#> 73          FR
#> 74           P
#> 75           S
#> 76           K
#> 77         WAG
#> 78          FR
#> 79           P
#> 80           S
#> 81           K
#> 82         WAG
#> 83          FR
#> 84           P
#> 86           K
#> 87         WAG
#> 88          FR
#> 89           P
#> 90           S
#> 91           K
#> 92         WAG
#> 93          FR
#> 94           P
#> 95           S
#> 96           K
#> 97         WAG
#> 99           P
#> 100          S