centroid_fusion calculates the centroid (mean location) of each timestep in fusion events. The function accepts an edge list of fusion events, identified by fusion_id from edge lists generated with edge_dist, and a data.table of relocation data with a timegroup column appended by group_times. It is recommended to use the argument fillNA = FALSE for edge_dist when using centroid_fusion to avoid unnecessarily merging additional rows. Relocation data should be in two columns representing the X and Y coordinates.

Usage

centroid_fusion(
  edges = NULL,
  DT = NULL,
  id = NULL,
  coords = NULL,
  timegroup = "timegroup",
  na.rm = FALSE
)

Arguments

edges

edge list generated by edge_dist or edge_nn, with fusionID column generated by fusion_id

DT

input data.table with timegroup column generated with group_times, matching the input data.table used to generate the edge list with edge_nn or edge_dist

id

character string of ID column name

coords

character vector of X coordinate and Y coordinate column names. Note: the order is assumed X followed by Y column names.

timegroup

timegroup field in the DT within which the grouping will be calculated

na.rm

whether NAs should be removed when calculating the mean location; see rowMeans

Value

centroid_fusion returns the input edges appended with centroid columns for the X and Y coordinate columns.

These columns represent the centroid coordinates for each timestep in a fusion event. Their names are the provided coordinate column names prefixed with "centroid_".

Note: due to the merge required within this function, the output needs to be reassigned, unlike some other spatsoc functions such as fusion_id and group_pts.

A message is returned when centroid columns already exist in the input edges, because they will be overwritten.
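The naming rule and the overwrite condition described above can be illustrated with a short sketch. The toy edges table and the use of paste0 and intersect here are assumptions made for illustration, not the package's internal code.

```r
library(data.table)

# Coordinate column names as they would be passed to the coords argument
coords <- c("X", "Y")

# Centroid column names: the coordinate names prefixed with "centroid_"
centroid_cols <- paste0("centroid_", coords)

# Hypothetical edge list that already holds a centroid_X column (made-up values)
edges <- data.table(timegroup = 1L, ID1 = "A", ID2 = "B", centroid_X = 0)

# Columns that would be overwritten, the situation in which a message is expected
existing <- intersect(centroid_cols, colnames(edges))
print(existing)
```

Checking colnames(edges) this way before calling centroid_fusion is one way to anticipate the message.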

Details

The edges and DT must be data.table. If your data is a data.frame, you can convert it by reference using data.table::setDT or by reassigning using data.table::data.table.
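As a brief illustration of the two conversion routes mentioned above, the following sketch uses small made-up data.frames; the object names and values are hypothetical.

```r
library(data.table)

# Hypothetical relocation data stored as a data.frame (made-up values)
df <- data.frame(ID = c("A", "B"), X = c(10, 20), Y = c(5, 15))

# Option 1: convert by reference, so df itself becomes a data.table
setDT(df)
print(is.data.table(df))

# Option 2: reassign, leaving the original data.frame untouched
df2 <- data.frame(ID = "C", X = 30, Y = 25)
DT2 <- data.table(df2)
print(is.data.table(DT2))
```

setDT avoids copying the data, which can matter for large relocation datasets, while reassigning keeps the original object unchanged.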

The edges and DT are internally merged in this function using the columns timegroup (from group_times) and ID1 and ID2 (in edges, from dyad_id) and id (in DT). This function expects a fusionID column to be present in edges, generated with the fusion_id function. The timegroup argument expects the name of a column in edges which corresponds to the timegroup column. The id, coords and timegroup arguments expect the names of columns in DT which correspond to the id, X and Y coordinates and timegroup columns. The na.rm argument is passed to the rowMeans function to control if NA values are removed before calculation.
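To make the merge and rowMeans steps concrete, here is a minimal conceptual sketch on toy data. It is not the package's implementation: the object names, the made-up coordinates, the "_1" and "_2" suffixes, and the assumption that the centroid is the mean of the two individuals in each edge row are all illustrative.

```r
library(data.table)

# Toy relocation data (made-up values): two individuals in one timegroup
DT <- data.table(
  ID = c("A", "B"),
  X = c(0, 10),
  Y = c(0, 20),
  timegroup = c(1L, 1L)
)

# Toy edge list with one dyad; fusionID would normally come from fusion_id
edges <- data.table(timegroup = 1L, ID1 = "A", ID2 = "B", fusionID = 1L)

# Attach the coordinates of ID1, then of ID2, matching on timegroup and id
m <- merge(edges, DT, by.x = c("timegroup", "ID1"), by.y = c("timegroup", "ID"))
m <- merge(m, DT, by.x = c("timegroup", "ID2"), by.y = c("timegroup", "ID"),
           suffixes = c("_1", "_2"))

# Mean location per row; na.rm controls whether missing coordinates are dropped
m[, centroid_X := rowMeans(cbind(X_1, X_2), na.rm = TRUE)]
m[, centroid_Y := rowMeans(cbind(Y_1, Y_2), na.rm = TRUE)]

print(m[, .(timegroup, ID1, ID2, fusionID, centroid_X, centroid_Y)])
```

With na.rm = TRUE, a row in which one individual has a missing coordinate would take the other individual's value as the mean; with na.rm = FALSE the result would be NA.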

Examples

# Load data.table
library(data.table)

# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))

# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
#>            ID        X       Y            datetime population
#>        <char>    <num>   <num>              <POSc>      <int>
#>     1:      A 715851.4 5505340 2016-11-01 00:00:54          1
#>     2:      A 715822.8 5505289 2016-11-01 02:01:22          1
#>     3:      A 715872.9 5505252 2016-11-01 04:01:24          1
#>     4:      A 715820.5 5505231 2016-11-01 06:01:05          1
#>     5:      A 715830.6 5505227 2016-11-01 08:01:11          1
#>    ---                                                       
#> 14293:      J 700616.5 5509069 2017-02-28 14:00:54          1
#> 14294:      J 700622.6 5509065 2017-02-28 16:00:11          1
#> 14295:      J 700657.5 5509277 2017-02-28 18:00:55          1
#> 14296:      J 700610.3 5509269 2017-02-28 20:00:48          1
#> 14297:      J 700744.0 5508782 2017-02-28 22:00:39          1

# Temporal grouping
group_times(DT, datetime = 'datetime', threshold = '20 minutes')
#>            ID        X       Y            datetime population minutes timegroup
#>        <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>     1:      A 715851.4 5505340 2016-11-01 00:00:54          1       0         1
#>     2:      A 715822.8 5505289 2016-11-01 02:01:22          1       0         2
#>     3:      A 715872.9 5505252 2016-11-01 04:01:24          1       0         3
#>     4:      A 715820.5 5505231 2016-11-01 06:01:05          1       0         4
#>     5:      A 715830.6 5505227 2016-11-01 08:01:11          1       0         5
#>    ---                                                                         
#> 14293:      J 700616.5 5509069 2017-02-28 14:00:54          1       0      1393
#> 14294:      J 700622.6 5509065 2017-02-28 16:00:11          1       0      1394
#> 14295:      J 700657.5 5509277 2017-02-28 18:00:55          1       0      1440
#> 14296:      J 700610.3 5509269 2017-02-28 20:00:48          1       0      1395
#> 14297:      J 700744.0 5508782 2017-02-28 22:00:39          1       0      1396

# Edge list generation
edges <- edge_dist(
  DT,
  threshold = 100,
  id = 'ID',
  coords = c('X', 'Y'),
  timegroup = 'timegroup',
  returnDist = TRUE,
  fillNA = FALSE
)

# Generate dyad id
dyad_id(edges, id1 = 'ID1', id2 = 'ID2')
#>        timegroup    ID1    ID2  distance dyadID
#>            <int> <char> <char>     <num> <char>
#>     1:         1      G      B  5.782904    B-G
#>     2:         1      H      E 65.061671    E-H
#>     3:         1      B      G  5.782904    B-G
#>     4:         1      E      H 65.061671    E-H
#>     5:         2      H      E 79.659918    E-H
#>    ---                                         
#> 17174:      1440      I      C  2.831071    C-I
#> 17175:      1440      C      F  9.372972    C-F
#> 17176:      1440      I      F  7.512922    F-I
#> 17177:      1440      C      I  2.831071    C-I
#> 17178:      1440      F      I  7.512922    F-I

# Generate fusion id
fusion_id(edges, threshold = 100)
#>        timegroup    ID1    ID2  distance dyadID fusionID
#>            <int> <char> <char>     <num> <char>    <int>
#>     1:         1      G      B  5.782904    B-G        1
#>     2:         1      H      E 65.061671    E-H        2
#>     3:         1      B      G  5.782904    B-G        1
#>     4:         1      E      H 65.061671    E-H        2
#>     5:         2      H      E 79.659918    E-H        2
#>    ---                                                  
#> 17174:      1440      I      C  2.831071    C-I     2846
#> 17175:      1440      C      F  9.372972    C-F     2845
#> 17176:      1440      I      F  7.512922    F-I     2847
#> 17177:      1440      C      I  2.831071    C-I     2846
#> 17178:      1440      F      I  7.512922    F-I     2847

# Calculate fusion centroid
centroids <- centroid_fusion(
  edges,
  DT,
  id = 'ID',
  coords = c('X', 'Y'),
  timegroup = 'timegroup',
  na.rm = TRUE
)

print(centroids)
#>        timegroup    ID1    ID2  distance dyadID fusionID centroid_X centroid_Y
#>            <int> <char> <char>     <num> <char>    <int>      <num>      <num>
#>     1:         1      G      B  5.782904    B-G        1   699637.9    5509637
#>     2:         1      H      E 65.061671    E-H        2   701698.0    5504306
#>     3:         1      B      G  5.782904    B-G        1   699637.9    5509637
#>     4:         1      E      H 65.061671    E-H        2   701698.0    5504306
#>     5:         2      H      E 79.659918    E-H        2   701652.4    5504236
#>    ---                                                                        
#> 17174:      1440      I      C  2.831071    C-I     2846   702960.6    5509447
#> 17175:      1440      C      F  9.372972    C-F     2845   702960.7    5509451
#> 17176:      1440      I      F  7.512922    F-I     2847   702959.5    5509452
#> 17177:      1440      C      I  2.831071    C-I     2846   702960.6    5509447
#> 17178:      1440      F      I  7.512922    F-I     2847   702959.5    5509452