centroid_group calculates the centroid of all individuals in each
spatiotemporal group identified by group_pts. The function expects a
data.table with relocation data appended with a group column from
group_pts. Relocation data should be in two columns representing the X and
Y coordinates, or in a geometry column prepared by the helper function
get_geometry().
Usage
centroid_group(
DT = NULL,
coords = NULL,
crs = NULL,
group = "group",
geometry = "geometry"
)
Arguments
- DT
input data.table with group column generated with group_pts
- coords
character vector of X coordinate and Y coordinate column names. Note: the order is assumed X followed by Y column names
- crs
numeric or character defining the coordinate reference system to be passed to sf::st_crs. For example, either crs = "EPSG:32736" or crs = 32736. Used only if coords are provided, see details under Interface
- group
Character string of group column
- geometry
simple feature geometry list column name, generated by get_geometry(). Default 'geometry', see details under Interface
Value
centroid_group returns the input DT appended with
centroid column(s) for each group.
If the crs for coords or st_crs(geometry) for geometry is long lat
(see sf::st_is_longlat()), centroids will be calculated using
s2::s2_centroid() through sf::st_centroid(). If the crs for coords
or st_crs(geometry) for geometry is projected or NA, the centroids will
be calculated using a mean on the coordinates.
If coords are provided, the centroid columns will be named by prefixing
the coordinate column names with "centroid_" (e.g. "X" becomes "centroid_X"). If
geometry is used, the centroid column will be named "centroid".
A message is returned when the centroid column(s) already exist in the input because they will be overwritten.
See details for appending outputs using modify-by-reference in the FAQ.
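As a rough illustration of the projected (or NA crs) case with the coords interface, the sketch below computes a per-group mean of each coordinate and names the new columns with the "centroid_" prefix. It is not spatsoc's internal code: the toy data and the out_names variable are hypothetical, and only data.table is assumed.

```r
library(data.table)

# Hypothetical toy data: three individuals in two groups
DT <- data.table(
  ID = c("A", "B", "C"),
  X = c(10, 20, 100),
  Y = c(0, 10, 50),
  group = c(1L, 1L, 2L)
)

coords <- c("X", "Y")

# Output column names: coordinate names prefixed with "centroid_"
out_names <- paste0("centroid_", coords)

# Mean of each coordinate within each group, appended by reference
DT[, (out_names) := lapply(.SD, mean), by = "group", .SDcols = coords]
```

Every row in a group receives the same centroid values, so a group with a single individual simply gets its own coordinates back.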
Details
The DT must be a data.table. If your data is a
data.frame, you can convert it by reference using
data.table::setDT() or by reassigning using
data.table::data.table().
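The two conversion routes mentioned above might look like the following minimal sketch. The data.frame contents and object names (df, df2, DT2) are hypothetical.

```r
library(data.table)

# Hypothetical relocation data stored as a data.frame
df <- data.frame(ID = c("A", "B"), X = c(1, 2), Y = c(3, 4))

# Option 1: convert by reference, df itself becomes a data.table
setDT(df)

# Option 2: build a new data.table and assign it to a name
df2 <- data.frame(ID = c("A", "B"), X = c(1, 2), Y = c(3, 4))
DT2 <- data.table(df2)

# Both objects are now data.tables
is.data.table(df)
is.data.table(DT2)
```

Converting by reference avoids copying the data, which matters for large relocation datasets.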
The group argument expects the name of the column in
DT that corresponds to the group column.
See below under "Interface" for details on providing coordinates and under "Centroid function" for details on the underlying centroid function used.
Interface
Two interfaces are available for providing coordinates:
- Provide coords and optionally crs. The coords argument expects the names of the X and Y coordinate columns. The crs argument expects a character string or numeric defining the coordinate reference system to be passed to sf::st_crs. For example, for UTM zone 36S (EPSG 32736), the crs argument is crs = "EPSG:32736" or crs = 32736. See https://spatialreference.org for a list of EPSG codes. For centroid calculations, if crs is NULL, it will be internally set to NA_crs_.
- (New!) Provide geometry. The geometry argument allows the user to supply a geometry column that represents the coordinates as a simple feature geometry list column. This interface expects the user to prepare their input DT with get_geometry(). To use this interface, leave the coords and crs arguments NULL, and the default argument for geometry ('geometry') will be used directly.
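To make the two accepted forms of crs concrete, the sketch below checks that the character and numeric forms resolve to the same coordinate reference system when passed to sf::st_crs, and shows the NA crs that the text says is used when crs is NULL. The variable names are illustrative only.

```r
library(sf)

# Character and numeric forms of UTM zone 36S (EPSG 32736)
crs_chr <- st_crs("EPSG:32736")
crs_num <- st_crs(32736)

# Both forms describe the same coordinate reference system
same_crs <- crs_chr == crs_num

# The missing crs object exported by sf
missing_crs <- is.na(NA_crs_)
```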
Centroid function
The underlying centroid function used depends on the crs of the coordinates or geometry provided.
- If the crs is longlat degrees (as determined by sf::st_is_longlat()) and sf::sf_use_s2() is TRUE, the centroid function is sf::st_centroid(), which passes to s2::s2_centroid().
- If the crs is longlat degrees but sf::sf_use_s2() is FALSE, the centroid calculated will be incorrect. See sf::st_centroid().
- If the crs is not longlat degrees (e.g. NULL, NA_crs_, or projected), the centroid function used is mean.
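The branching described above can be approximated directly with sf, as in the sketch below. This is an assumption-laden illustration rather than spatsoc's internal implementation: the longlat points are hypothetical, and combining the points with st_combine() before calling st_centroid() is one plausible way to get a single centroid per set of points.

```r
library(sf)

# Hypothetical points in longitude/latitude (EPSG:4326)
pts_ll <- st_sfc(
  st_point(c(30.0, -1.0)),
  st_point(c(30.2, -1.2)),
  crs = 4326
)

# Two points in projected coordinates (UTM zone 36S, EPSG:32736)
pts_utm <- st_sfc(
  st_point(c(715851.4, 5505340)),
  st_point(c(715822.8, 5505289)),
  crs = 32736
)

# Which branch applies to each set of points?
ll_is_longlat <- isTRUE(st_is_longlat(pts_ll))
utm_is_longlat <- isTRUE(st_is_longlat(pts_utm))

# Longlat branch: centroid through sf::st_centroid
# (computed with s2 only when sf_use_s2() is TRUE)
if (ll_is_longlat && sf_use_s2()) {
  cent_ll <- st_centroid(st_combine(pts_ll))
}

# Projected branch: a plain mean of the coordinates
cent_utm <- colMeans(st_coordinates(pts_utm))
```

Wrapping st_is_longlat() in isTRUE() treats an NA crs the same as a projected one, matching the rule that a missing crs falls back to the mean.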
Note: if the input is length 1, the input is returned.
See also
group_pts
Other Centroid functions:
centroid_dyad(),
centroid_fusion(),
direction_to_centroid(),
distance_to_centroid()
Examples
# Load data.table and spatsoc
library(data.table)
library(spatsoc)
# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))
# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
#> ID X Y datetime population
#> <char> <num> <num> <POSc> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1
# Temporal grouping
group_times(DT, datetime = 'datetime', threshold = '20 minutes')
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1 0 1393
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1 0 1394
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1 0 1440
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1 0 1395
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1 0 1396
# Spatial grouping with timegroup
group_pts(DT, threshold = 5, id = 'ID',
coords = c('X', 'Y'), timegroup = 'timegroup')
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1 0 1393
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1 0 1394
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1 0 1440
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1 0 1395
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1 0 1396
#> group
#> <int>
#> 1: 1
#> 2: 2
#> 3: 3
#> 4: 4
#> 5: 5
#> ---
#> 14293: 13882
#> 14294: 13883
#> 14295: 13884
#> 14296: 13885
#> 14297: 13886
# Calculate group centroid
centroid_group(DT, coords = c('X', 'Y'), group = 'group')
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1 0 1393
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1 0 1394
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1 0 1440
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1 0 1395
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1 0 1396
#> group centroid_X centroid_Y
#> <int> <num> <num>
#> 1: 1 715851.4 5505340
#> 2: 2 715822.8 5505289
#> 3: 3 715872.9 5505252
#> 4: 4 715820.5 5505231
#> 5: 5 715830.6 5505227
#> ---
#> 14293: 13882 700616.5 5509069
#> 14294: 13883 700622.6 5509065
#> 14295: 13884 700657.5 5509277
#> 14296: 13885 700610.3 5509269
#> 14297: 13886 700744.0 5508782
# Or, using the new geometry interface
get_geometry(DT, coords = c('X', 'Y'), crs = 32736)
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1 0 1393
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1 0 1394
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1 0 1440
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1 0 1395
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1 0 1396
#> group centroid_X centroid_Y geometry
#> <int> <num> <num> <sfc_POINT>
#> 1: 1 715851.4 5505340 POINT (715851.4 5505340)
#> 2: 2 715822.8 5505289 POINT (715822.8 5505289)
#> 3: 3 715872.9 5505252 POINT (715872.9 5505252)
#> 4: 4 715820.5 5505231 POINT (715820.5 5505231)
#> 5: 5 715830.6 5505227 POINT (715830.6 5505227)
#> ---
#> 14293: 13882 700616.5 5509069 POINT (700616.5 5509069)
#> 14294: 13883 700622.6 5509065 POINT (700622.6 5509065)
#> 14295: 13884 700657.5 5509277 POINT (700657.5 5509277)
#> 14296: 13885 700610.3 5509269 POINT (700610.3 5509269)
#> 14297: 13886 700744.0 5508782 POINT (700744 5508782)
# Spatial grouping with timegroup, using the geometry column
group_pts(DT, threshold = 5, id = 'ID', timegroup = 'timegroup')
#> group column will be overwritten by this function
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1 0 1393
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1 0 1394
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1 0 1440
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1 0 1395
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1 0 1396
#> centroid_X centroid_Y geometry group
#> <num> <num> <sfc_POINT> <int>
#> 1: 715851.4 5505340 POINT (715851.4 5505340) 1
#> 2: 715822.8 5505289 POINT (715822.8 5505289) 2
#> 3: 715872.9 5505252 POINT (715872.9 5505252) 3
#> 4: 715820.5 5505231 POINT (715820.5 5505231) 4
#> 5: 715830.6 5505227 POINT (715830.6 5505227) 5
#> ---
#> 14293: 700616.5 5509069 POINT (700616.5 5509069) 13882
#> 14294: 700622.6 5509065 POINT (700622.6 5509065) 13883
#> 14295: 700657.5 5509277 POINT (700657.5 5509277) 13884
#> 14296: 700610.3 5509269 POINT (700610.3 5509269) 13885
#> 14297: 700744.0 5508782 POINT (700744 5508782) 13886
# Calculate group centroid from the geometry column
centroid_group(DT)
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1 0 1393
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1 0 1394
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1 0 1440
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1 0 1395
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1 0 1396
#> centroid_X centroid_Y geometry group
#> <num> <num> <sfc_POINT> <int>
#> 1: 715851.4 5505340 POINT (715851.4 5505340) 1
#> 2: 715822.8 5505289 POINT (715822.8 5505289) 2
#> 3: 715872.9 5505252 POINT (715872.9 5505252) 3
#> 4: 715820.5 5505231 POINT (715820.5 5505231) 4
#> 5: 715830.6 5505227 POINT (715830.6 5505227) 5
#> ---
#> 14293: 700616.5 5509069 POINT (700616.5 5509069) 13882
#> 14294: 700622.6 5509065 POINT (700622.6 5509065) 13883
#> 14295: 700657.5 5509277 POINT (700657.5 5509277) 13884
#> 14296: 700610.3 5509269 POINT (700610.3 5509269) 13885
#> 14297: 700744.0 5508782 POINT (700744 5508782) 13886
#> centroid
#> <sfc_POINT>
#> 1: POINT (715851.4 5505340)
#> 2: POINT (715822.8 5505289)
#> 3: POINT (715872.9 5505252)
#> 4: POINT (715820.5 5505231)
#> 5: POINT (715830.6 5505227)
#> ---
#> 14293: POINT (700616.5 5509069)
#> 14294: POINT (700622.6 5509065)
#> 14295: POINT (700657.5 5509277)
#> 14296: POINT (700610.3 5509269)
#> 14297: POINT (700744 5508782)
