centroid_group calculates the centroid of all individuals in each spatiotemporal group identified by group_pts. The function expects a data.table with relocation data appended with a group column from group_pts. Relocation data should be in two columns representing the X and Y coordinates, or in a geometry column prepared by the helper function get_geometry().

Usage

centroid_group(
  DT = NULL,
  coords = NULL,
  crs = NULL,
  group = "group",
  geometry = "geometry"
)

Arguments

DT

input data.table with group column generated with group_pts

coords

character vector of X coordinate and Y coordinate column names. Note: the order is assumed to be X followed by Y

crs

numeric or character defining the coordinate reference system to be passed to sf::st_crs. For example, either crs = "EPSG:32736" or crs = 32736. Used only if coords are provided; see details under Interface

group

character string of group column name

geometry

simple feature geometry list column name, generated by get_geometry(). Default 'geometry'; see details under Interface

Value

centroid_group returns the input DT appended with centroid column(s) for each group.

If the crs for coords or st_crs(geometry) for geometry is longitude/latitude (see sf::st_is_longlat()), centroids will be calculated using s2::s2_centroid() through sf::st_centroid(). If the crs for coords or st_crs(geometry) for geometry is projected or NA, the centroids will be calculated as the mean of the coordinates.
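The branch described above can be checked for a given crs with sf::st_is_longlat(). The following is a minimal sketch, not spatsoc's internal code; centroid_method() is a hypothetical helper introduced only for illustration, and EPSG:4326 is used as an assumed example of a longitude/latitude crs.

```r
library(sf)

# Hypothetical helper (not part of spatsoc): reports which of the two
# branches described above would apply for a given crs
centroid_method <- function(crs) {
  # st_is_longlat() returns TRUE for geographic crs, FALSE for projected
  # crs and NA when the crs is missing; isTRUE() treats NA like FALSE
  if (isTRUE(st_is_longlat(st_crs(crs)))) {
    "s2 centroid"
  } else {
    "mean of coordinates"
  }
}

centroid_method(4326)     # geographic (longitude/latitude)
centroid_method(32736)    # projected (UTM zone 36S)
centroid_method(NA_crs_)  # missing crs
```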

If coords are provided, the centroid columns will be named by prefixing the coordinate column names with "centroid_" (e.g. "X" becomes "centroid_X"). If geometry is used, the centroid column will be named "centroid".

If the centroid column(s) already exist in the input, a message is returned and they are overwritten.
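For the projected or NA case, the per-group mean and the "centroid_" naming can be sketched with data.table alone. This is an illustration under stated assumptions, not the package's implementation: the toy table and its values are made up.

```r
library(data.table)

# Toy relocations (made-up values): two individuals in group 1,
# one individual alone in group 2
toy <- data.table(
  ID = c("A", "B", "C"),
  X = c(10, 20, 100),
  Y = c(0, 10, 50),
  group = c(1L, 1L, 2L)
)

coords <- c("X", "Y")

# Output names follow the "centroid_" prefix convention
centroid_cols <- paste0("centroid_", coords)

# Mean of each coordinate within each group, appended by reference
toy[, (centroid_cols) := lapply(.SD, mean), by = group, .SDcols = coords]

print(toy)
```

Group 2 holds a single relocation, so its centroid is simply that relocation's own coordinates.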

See details for appending outputs using modify-by-reference in the FAQ.

Details

The DT must be a data.table. If your data is a data.frame, you can convert it by reference using data.table::setDT() or by reassigning using data.table::data.table().
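The two conversion routes mentioned above might look like this. It is a small sketch with a made-up data.frame; the object names are placeholders.

```r
library(data.table)

# A plain data.frame of relocations (made-up values)
df <- data.frame(ID = c("A", "B"), X = c(1, 2), Y = c(3, 4))

# Option 1: reassign, building a new data.table and leaving df untouched
DT_copy <- data.table(df)

# Option 2: convert by reference, so df itself becomes a data.table
setDT(df)

is.data.table(DT_copy)
is.data.table(df)
```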

The group argument expects the name of a column in DT which corresponds to the group column.

See below under "Interface" for details on providing coordinates and under "Centroid function" for details on the underlying centroid function used.

Interface

Two interfaces are available for providing coordinates:

  1. Provide coords and optionally crs. The coords argument expects the names of the X and Y coordinate columns. The crs argument expects a character string or numeric defining the coordinate reference system to be passed to sf::st_crs. For example, for UTM zone 36S (EPSG 32736), the crs argument is crs = "EPSG:32736" or crs = 32736. See https://spatialreference.org for a list of EPSG codes. For centroid calculations, if crs is NULL, it will be internally set to NA_crs_.

  2. (New!) Provide geometry. The geometry argument allows the user to supply a geometry column that represents the coordinates as a simple feature geometry list column. This interface expects the user to prepare their input DT with get_geometry(). To use this interface, leave the coords and crs arguments NULL, and the default argument for geometry ('geometry') will be used directly.
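To illustrate the crs argument of the first interface, the sketch below uses sf directly to show that the character and numeric forms resolve to the same coordinate reference system, and what a missing crs looks like. It only calls sf; how spatsoc handles the crs internally is not shown here.

```r
library(sf)

# The character and numeric forms describe the same crs (UTM zone 36S)
crs_chr <- st_crs("EPSG:32736")
crs_num <- st_crs(32736)
crs_chr == crs_num

# A missing crs is represented in sf by NA_crs_
is.na(NA_crs_)
```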

Centroid function

The underlying centroid function used depends on the crs of the coordinates or geometry provided.

Note: if the input is length 1, the input is returned.

Examples

# Load data.table
library(data.table)
# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))
# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
#>          ID        X       Y            datetime population
#>      <char>    <num>   <num>              <POSc>      <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1
#>  ---                                                       
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1

# Temporal grouping
group_times(DT, datetime = 'datetime', threshold = '20 minutes')
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12

# Spatial grouping with timegroup
group_pts(DT, threshold = 5, id = 'ID',
          coords = c('X', 'Y'), timegroup = 'timegroup')
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      group
#>      <int>
#>   1:     1
#>   2:     2
#>   3:     3
#>   4:     4
#>   5:     5
#>  ---      
#> 116:   109
#> 117:   110
#> 118:   111
#> 119:   112
#> 120:    59

# Calculate group centroid
centroid_group(DT, coords = c('X', 'Y'), group = 'group')
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      group centroid_X centroid_Y
#>      <int>      <num>      <num>
#>   1:     1   696191.5    5508362
#>   2:     2   696205.2    5508363
#>   3:     3   696745.8    5508225
#>   4:     4   696952.0    5508373
#>   5:     5   696074.7    5508214
#>  ---                            
#> 116:   109   696996.5    5508024
#> 117:   110   697046.4    5507922
#> 118:   111   697037.5    5507924
#> 119:   112   697303.0    5508347
#> 120:    59   696617.8    5508734

# Or, using the new geometry interface
get_geometry(DT, coords = c('X', 'Y'), crs = 32736)
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      group centroid_X centroid_Y                 geometry
#>      <int>      <num>      <num>              <sfc_POINT>
#>   1:     1   696191.5    5508362 POINT (696191.5 5508362)
#>   2:     2   696205.2    5508363 POINT (696205.2 5508363)
#>   3:     3   696745.8    5508225 POINT (696745.8 5508225)
#>   4:     4   696952.0    5508373   POINT (696952 5508373)
#>   5:     5   696074.7    5508214   POINT (696079 5508218)
#>  ---                                                     
#> 116:   109   696996.5    5508024 POINT (696996.5 5508024)
#> 117:   110   697046.4    5507922 POINT (697046.4 5507922)
#> 118:   111   697037.5    5507924 POINT (697037.5 5507924)
#> 119:   112   697303.0    5508347   POINT (697303 5508347)
#> 120:    59   696617.8    5508734 POINT (696616.7 5508736)
group_pts(DT, threshold = 5, id = 'ID', timegroup = 'timegroup')
#> group column will be overwritten by this function
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      centroid_X centroid_Y                 geometry group
#>           <num>      <num>              <sfc_POINT> <int>
#>   1:   696191.5    5508362 POINT (696191.5 5508362)     1
#>   2:   696205.2    5508363 POINT (696205.2 5508363)     2
#>   3:   696745.8    5508225 POINT (696745.8 5508225)     3
#>   4:   696952.0    5508373   POINT (696952 5508373)     4
#>   5:   696074.7    5508214   POINT (696079 5508218)     5
#>  ---                                                     
#> 116:   696996.5    5508024 POINT (696996.5 5508024)   109
#> 117:   697046.4    5507922 POINT (697046.4 5507922)   110
#> 118:   697037.5    5507924 POINT (697037.5 5507924)   111
#> 119:   697303.0    5508347   POINT (697303 5508347)   112
#> 120:   696617.8    5508734 POINT (696616.7 5508736)    59
centroid_group(DT)
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      centroid_X centroid_Y                 geometry group
#>           <num>      <num>              <sfc_POINT> <int>
#>   1:   696191.5    5508362 POINT (696191.5 5508362)     1
#>   2:   696205.2    5508363 POINT (696205.2 5508363)     2
#>   3:   696745.8    5508225 POINT (696745.8 5508225)     3
#>   4:   696952.0    5508373   POINT (696952 5508373)     4
#>   5:   696074.7    5508214   POINT (696079 5508218)     5
#>  ---                                                     
#> 116:   696996.5    5508024 POINT (696996.5 5508024)   109
#> 117:   697046.4    5507922 POINT (697046.4 5507922)   110
#> 118:   697037.5    5507924 POINT (697037.5 5507924)   111
#> 119:   697303.0    5508347   POINT (697303 5508347)   112
#> 120:   696617.8    5508734 POINT (696616.7 5508736)    59
#>                      centroid
#>                   <sfc_POINT>
#>   1: POINT (696191.5 5508362)
#>   2: POINT (696205.2 5508363)
#>   3: POINT (696745.8 5508225)
#>   4:   POINT (696952 5508373)
#>   5: POINT (696074.7 5508214)
#>  ---                         
#> 116: POINT (696996.5 5508024)
#> 117: POINT (697046.4 5507922)
#> 118: POINT (697037.5 5507924)
#> 119:   POINT (697303 5508347)
#> 120: POINT (696617.8 5508734)