centroid_fusion calculates the centroid of each timestep in fusion events.
The function expects an edge-list of fusion events identified by
fusion_id() from edge-lists generated with edge_dist() and a data.table
with relocation data appended with a timegroup column from group_times().
Relocation data should be in two columns representing the X and Y
coordinates, or in a geometry column prepared by the helper function
get_geometry().
Usage
centroid_fusion(
  edges = NULL,
  DT = NULL,
  id = NULL,
  coords = NULL,
  crs = NULL,
  timegroup = "timegroup",
  geometry = "geometry"
)
Arguments
- edges
edge-list generated by
edge_dist() or edge_nn(), with fusionID column generated by fusion_id()
- DT
input data.table with timegroup column generated with
group_times() matching the input data.table used to generate the edge list with edge_nn() or edge_dist()
- id
character string of ID column name
- coords
character vector of X coordinate and Y coordinate column names. Note: the order is assumed X followed by Y column names
- crs
numeric or character defining the coordinate reference system to be passed to sf::st_crs. For example, either
crs = "EPSG:32736" or crs = 32736. Used only if coords are provided, see details under Interface
- timegroup
timegroup field in the DT within which the output will be calculated
- geometry
simple feature geometry list column name, generated by
get_geometry(). Default 'geometry', see details under Interface
Value
centroid_fusion returns the input edges appended with
centroid column(s) for each timestep and fusion id.
If coords are provided, the centroid columns will be named by prefixing
the coordinate column names with "centroid_" (eg. "X" = "centroid_X"). If
geometry is used, the centroid column will be named "centroid".
Note: due to the merge required within this function, the output needs to be
reassigned unlike some other spatsoc functions like dyad_id and
group_pts. See details in
FAQ.
A message is returned when the centroid column(s) already exist in the input because they will be overwritten.
Details
The edges and DT must be data.tables. If your data is a data.frame,
you can convert it by reference using data.table::setDT() or by reassigning
using data.table::data.table().
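The two conversion routes mentioned above can be sketched as follows. The data.frame `df` and its values are hypothetical and only serve as an illustration.

```r
library(data.table)

# Hypothetical relocation data held in a plain data.frame
df <- data.frame(
  ID = c("A", "B"),
  X = c(715851.4, 715822.8),
  Y = c(5505340, 5505289)
)

# Option 1: reassign a converted copy, leaving df itself unchanged
DT_copy <- data.table(df)

# Option 2: convert df in place (by reference), no reassignment needed
setDT(df)

# Both objects are now data.tables
is.data.table(DT_copy)
is.data.table(df)
```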
The edges and DT are internally merged in this function using the columns
timegroup (from group_times()) and ID1 and ID2 (in edges, from
dyad_id) and id (in DT). This function expects a fusionID present,
generated with the fusion_id() function. The timegroup argument expects
the name of a column in edges which corresponds to the timegroup column.
The id and timegroup arguments expect the names of columns in DT which
correspond to the id and timegroup columns.
See below under "Interface" for details on providing coordinates and under "Centroid function" for details on the underlying centroid function used.
Interface
Two interfaces are available for providing coordinates:
- Provide coords and optionally crs. The coords argument expects the names of the X and Y coordinate columns. The crs argument expects a character string or numeric defining the coordinate reference system to be passed to sf::st_crs. For example, for UTM zone 36S (EPSG 32736), the crs argument is crs = "EPSG:32736" or crs = 32736. See https://spatialreference.org for a list of EPSG codes. For centroid calculations, if crs is NULL, it will be internally set to NA_crs_.
- (New!) Provide geometry. The geometry argument allows the user to supply a geometry column that represents the coordinates as a simple feature geometry list column. This interface expects the user to prepare their input DT with get_geometry(). To use this interface, leave the coords and crs arguments NULL, and the default argument for geometry ('geometry') will be used directly.
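As a small illustration of the two accepted forms of the crs argument, the sketch below (assuming the sf package is installed) checks that the character and numeric forms resolve to the same coordinate reference system. The variable names are hypothetical.

```r
library(sf)

# Character and numeric forms of the same EPSG code (UTM zone 36S)
crs_chr <- st_crs("EPSG:32736")
crs_num <- st_crs(32736)

# Both resolve to the same coordinate reference system
crs_chr == crs_num

# NA_crs_ is the missing crs object exported by sf
is.na(NA_crs_)
```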
Centroid function
The underlying centroid function used depends on the crs of the coordinates or geometry provided.
- If the crs is longlat degrees (as determined by sf::st_is_longlat()) and sf::sf_use_s2() is TRUE, the centroid function is sf::st_centroid() which passes to s2::s2_centroid().
- If the crs is longlat degrees but sf::sf_use_s2() is FALSE, the centroid calculated will be incorrect. See sf::st_centroid().
- If the crs is not longlat degrees (eg. NULL, NA_crs_, or projected), the centroid function used is mean.
Note: if the input is length 1, the input is returned.
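For the projected case, the mean-based centroid can be sketched by hand with data.table. This is a minimal illustration under stated assumptions, not the package's internal implementation: the toy `DT` and `edges` tables, their values, and the intermediate object `out` are all hypothetical.

```r
library(data.table)

# Hypothetical projected coordinates for two individuals in one timegroup
DT <- data.table(
  ID = c("B", "G"),
  timegroup = c(1L, 1L),
  X = c(100, 104),
  Y = c(200, 210)
)

# Hypothetical edge-list with a single fusion event between B and G
edges <- data.table(timegroup = 1L, ID1 = "B", ID2 = "G", fusionID = 1L)

# Centroid column names follow the "centroid_" prefix convention
coords <- c("X", "Y")
centroid_cols <- paste0("centroid_", coords)

# Look up the coordinates of each side of the dyad by timegroup and id
out <- merge(edges, DT,
             by.x = c("timegroup", "ID1"), by.y = c("timegroup", "ID"))
out <- merge(out, DT,
             by.x = c("timegroup", "ID2"), by.y = c("timegroup", "ID"),
             suffixes = c("_1", "_2"))

# Mean of the two positions, appended as centroid_X and centroid_Y
out[, (centroid_cols) := list((X_1 + X_2) / 2, (Y_1 + Y_2) / 2)]

print(out)
```

Because the result comes from a merge, it is a new object that has to be assigned, which matches the note above about reassigning the output.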
See also
Other Centroid functions:
centroid_dyad(),
centroid_group(),
direction_to_centroid(),
distance_to_centroid()
Examples
# Load data.table
library(data.table)
# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))
# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
#> ID X Y datetime population
#> <char> <num> <num> <POSc> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1
# Temporal grouping
group_times(DT, datetime = 'datetime', threshold = '20 minutes')
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1 0 1393
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1 0 1394
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1 0 1440
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1 0 1395
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1 0 1396
# Edge-list generation
edges <- edge_dist(
  DT,
  threshold = 100,
  id = 'ID',
  coords = c('X', 'Y'),
  timegroup = 'timegroup',
  returnDist = TRUE,
  fillNA = FALSE
)
# Generate dyad id
dyad_id(edges, id1 = 'ID1', id2 = 'ID2')
#> timegroup ID1 ID2 distance dyadID
#> <int> <char> <char> <num> <char>
#> 1: 1 G B 5.782904 B-G
#> 2: 1 H E 65.061671 E-H
#> 3: 1 B G 5.782904 B-G
#> 4: 1 E H 65.061671 E-H
#> 5: 2 H E 79.659918 E-H
#> ---
#> 17174: 1440 I C 2.831071 C-I
#> 17175: 1440 C F 9.372972 C-F
#> 17176: 1440 I F 7.512922 F-I
#> 17177: 1440 C I 2.831071 C-I
#> 17178: 1440 F I 7.512922 F-I
# Generate fusion id
fusion_id(edges, threshold = 100)
#> timegroup ID1 ID2 distance dyadID fusionID
#> <int> <char> <char> <num> <char> <int>
#> 1: 1 G B 5.782904 B-G 1
#> 2: 1 H E 65.061671 E-H 2
#> 3: 1 B G 5.782904 B-G 1
#> 4: 1 E H 65.061671 E-H 2
#> 5: 2 H E 79.659918 E-H 2
#> ---
#> 17174: 1440 I C 2.831071 C-I 2846
#> 17175: 1440 C F 9.372972 C-F 2845
#> 17176: 1440 I F 7.512922 F-I 2847
#> 17177: 1440 C I 2.831071 C-I 2846
#> 17178: 1440 F I 7.512922 F-I 2847
# Calculate fusion centroid
centroids <- centroid_fusion(
  edges,
  DT,
  id = 'ID',
  coords = c('X', 'Y'),
  timegroup = 'timegroup'
)
print(centroids)
#> timegroup ID1 ID2 distance dyadID fusionID centroid_X centroid_Y
#> <int> <char> <char> <num> <char> <int> <num> <num>
#> 1: 1 G B 5.782904 B-G 1 699637.9 5509637
#> 2: 1 H E 65.061671 E-H 2 701698.0 5504306
#> 3: 1 B G 5.782904 B-G 1 699637.9 5509637
#> 4: 1 E H 65.061671 E-H 2 701698.0 5504306
#> 5: 2 H E 79.659918 E-H 2 701652.4 5504236
#> ---
#> 17174: 1440 I C 2.831071 C-I 2846 702960.6 5509447
#> 17175: 1440 C F 9.372972 C-F 2845 702960.7 5509451
#> 17176: 1440 I F 7.512922 F-I 2847 702959.5 5509452
#> 17177: 1440 C I 2.831071 C-I 2846 702960.6 5509447
#> 17178: 1440 F I 7.512922 F-I 2847 702959.5 5509452
# Or, using the new geometry interface
get_geometry(DT, coords = c('X', 'Y'), crs = 32736)
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 14293: J 700616.5 5509069 2017-02-28 14:00:54 1 0 1393
#> 14294: J 700622.6 5509065 2017-02-28 16:00:11 1 0 1394
#> 14295: J 700657.5 5509277 2017-02-28 18:00:55 1 0 1440
#> 14296: J 700610.3 5509269 2017-02-28 20:00:48 1 0 1395
#> 14297: J 700744.0 5508782 2017-02-28 22:00:39 1 0 1396
#> geometry
#> <sfc_POINT>
#> 1: POINT (715851.4 5505340)
#> 2: POINT (715822.8 5505289)
#> 3: POINT (715872.9 5505252)
#> 4: POINT (715820.5 5505231)
#> 5: POINT (715830.6 5505227)
#> ---
#> 14293: POINT (700616.5 5509069)
#> 14294: POINT (700622.6 5509065)
#> 14295: POINT (700657.5 5509277)
#> 14296: POINT (700610.3 5509269)
#> 14297: POINT (700744 5508782)
edges <- edge_dist(DT, threshold = 100, id = 'ID', timegroup = 'timegroup', returnDist = TRUE)
dyad_id(edges, id1 = 'ID1', id2 = 'ID2')
#> Key: <timegroup, ID1>
#> timegroup ID1 ID2 distance dyadID
#> <int> <char> <char> <num> <char>
#> 1: 1 A <NA> NA <NA>
#> 2: 1 B G 5.782904 B-G
#> 3: 1 C <NA> NA <NA>
#> 4: 1 D <NA> NA <NA>
#> 5: 1 E H 65.061671 E-H
#> ---
#> 22985: 1440 G <NA> NA <NA>
#> 22986: 1440 H <NA> NA <NA>
#> 22987: 1440 I C 2.831071 C-I
#> 22988: 1440 I F 7.512922 F-I
#> 22989: 1440 J <NA> NA <NA>
fusion_id(edges, threshold = 100)
#> Key: <timegroup, ID1>
#> timegroup ID1 ID2 distance dyadID fusionID
#> <int> <char> <char> <num> <char> <int>
#> 1: 1 A <NA> NA <NA> NA
#> 2: 1 B G 5.782904 B-G 1
#> 3: 1 C <NA> NA <NA> NA
#> 4: 1 D <NA> NA <NA> NA
#> 5: 1 E H 65.061671 E-H 2
#> ---
#> 22985: 1440 G <NA> NA <NA> NA
#> 22986: 1440 H <NA> NA <NA> NA
#> 22987: 1440 I C 2.831071 C-I 2846
#> 22988: 1440 I F 7.512922 F-I 2847
#> 22989: 1440 J <NA> NA <NA> NA
centroids <- centroid_fusion(
  edges,
  DT,
  id = 'ID',
  timegroup = 'timegroup'
)
print(centroids)
#> timegroup ID1 ID2 distance dyadID fusionID
#> <int> <char> <char> <num> <char> <int>
#> 1: 1 A <NA> NA <NA> NA
#> 2: 1 B G 5.782904 B-G 1
#> 3: 1 C <NA> NA <NA> NA
#> 4: 1 D <NA> NA <NA> NA
#> 5: 1 E H 65.061671 E-H 2
#> ---
#> 22985: 1440 G <NA> NA <NA> NA
#> 22986: 1440 H <NA> NA <NA> NA
#> 22987: 1440 I C 2.831071 C-I 2846
#> 22988: 1440 I F 7.512922 F-I 2847
#> 22989: 1440 J <NA> NA <NA> NA
#> geometry centroid
#> <sfc_POINT> <sfc_POINT>
#> 1: POINT (715851.4 5505340) POINT EMPTY
#> 2: POINT (699640.2 5509638) POINT (699637.9 5509637)
#> 3: POINT (710205.4 5505888) POINT EMPTY
#> 4: POINT (700875 5490954) POINT EMPTY
#> 5: POINT (701671.9 5504286) POINT (701698 5504306)
#> ---
#> 22985: POINT (698212 5508998) POINT EMPTY
#> 22986: POINT (699368.1 5507901) POINT EMPTY
#> 22987: POINT (702959.5 5509448) POINT (702960.6 5509447)
#> 22988: POINT (702959.5 5509448) POINT (702959.5 5509452)
#> 22989: POINT (700657.5 5509277) POINT EMPTY
