centroid_dyad calculates the centroid (mean location) of a dyad in each observation identified by edge_nn or edge_dist. The function expects an edge-list generated by edge_nn or edge_dist and a data.table with relocation data appended with a timegroup column from group_times. Relocation data should be in two columns representing the X and Y coordinates, or in a geometry column prepared by the helper function get_geometry().

Usage

centroid_dyad(
  edges = NULL,
  DT = NULL,
  id = NULL,
  coords = NULL,
  crs = NULL,
  timegroup = "timegroup",
  geometry = "geometry"
)

Arguments

edges

edge-list generated by edge_dist or edge_nn, with dyad ID column generated by dyad_id

DT

input data.table with timegroup column generated with group_times. This must be the same data.table that was used to generate the edge list with edge_nn or edge_dist

id

character string of ID column name

coords

character vector of X coordinate and Y coordinate column names. Note: the order is assumed to be the X column name followed by the Y column name

crs

numeric or character defining the coordinate reference system to be passed to sf::st_crs. For example, either crs = "EPSG:32736" or crs = 32736. Used only if coords are provided; see details under Interface

timegroup

character string of timegroup column name, default "timegroup"

geometry

simple feature geometry list column name, generated by get_geometry(). Default 'geometry'; see details under Interface

Value

centroid_dyad returns the input edges appended with centroid column(s) for each timestep and dyad id.

If coords are provided, the centroid columns will be named by prefixing the coordinate column names with "centroid_" (e.g. "X" becomes "centroid_X"). If geometry is used, the centroid column will be named "centroid".

Note: due to the merge required within this function, the output needs to be reassigned, unlike some other spatsoc functions such as dyad_id and group_pts, which modify their input by reference. See details in FAQ.

A message is returned when the centroid column(s) already exist in the input because they will be overwritten.
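
The difference between modifying by reference and returning a merged result can be illustrated with plain data.table, independent of spatsoc. This is only a minimal sketch: the tables edges_toy and locs_toy are hypothetical toy data, and the merge shown is not the merge performed inside centroid_dyad.

```r
library(data.table)

# Hypothetical toy tables, for illustration only
edges_toy <- data.table(ID1 = c("A", "B"), ID2 = c("B", "A"), timegroup = 1L)
locs_toy <- data.table(ID = c("A", "B"), timegroup = 1L, X = c(0, 10))

# `:=` adds a column by reference: edges_toy changes without reassignment
edges_toy[, dyadID := "A-B"]
"dyadID" %in% names(edges_toy)

# merge() returns a new data.table and leaves edges_toy unchanged,
# so the result has to be assigned to be kept
merged <- merge(
  edges_toy,
  locs_toy,
  by.x = c("ID1", "timegroup"),
  by.y = c("ID", "timegroup")
)
"X" %in% names(edges_toy)
"X" %in% names(merged)
```

The same pattern applies to centroid_dyad: the returned table carries the centroid columns, while the input edges does not.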

Details

The edges and DT must be data.tables. If your data is a data.frame, you can convert it by reference using data.table::setDT() or by reassigning using data.table::data.table().
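
As a brief sketch of the two conversion routes mentioned above, using a hypothetical toy data.frame named df:

```r
library(data.table)

# Hypothetical toy data.frame, for illustration only
df <- data.frame(ID = c("A", "B"), X = c(0, 10), Y = c(0, 20))

# Option 1: reassign, leaving the original data.frame untouched
DT_copy <- data.table(df)
is.data.table(DT_copy)
is.data.table(df)

# Option 2: convert by reference, no reassignment needed
setDT(df)
is.data.table(df)
```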

The edges and DT are merged internally using the columns id, dyadID and timegroup. The edges must therefore contain a dyadID column, generated with the dyad_id function. The id and timegroup arguments expect the names of the columns in DT that hold the individual identifiers and the timegroups.

See below under "Interface" for details on providing coordinates and under "Centroid function" for details on the underlying centroid function used.
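
To make the idea of a dyad centroid concrete, the following is a rough sketch in plain data.table. It is not the implementation used by centroid_dyad. It assumes planar coordinates, a simple arithmetic mean, and an edge-list that lists each dyad in both directions (as in the edge_dist output shown in the Examples). The tables locs and edges_toy are hypothetical toy data.

```r
library(data.table)

# Hypothetical toy inputs, for illustration only
locs <- data.table(
  ID = c("A", "B"),
  timegroup = c(1L, 1L),
  X = c(0, 10),
  Y = c(0, 20)
)
edges_toy <- data.table(
  timegroup = c(1L, 1L),
  ID1 = c("A", "B"),
  ID2 = c("B", "A"),
  dyadID = c("A-B", "A-B")
)

# Attach the coordinates of ID1 in each timegroup
m <- merge(
  edges_toy,
  locs,
  by.x = c("ID1", "timegroup"),
  by.y = c("ID", "timegroup")
)

# Each dyad appears once per member, so averaging the ID1 coordinates
# within dyadID and timegroup gives the mean location of the dyad
m[, c("centroid_X", "centroid_Y") := .(mean(X), mean(Y)),
  by = .(dyadID, timegroup)]

# Drop the individual coordinates, keeping only the centroid columns
m[, c("X", "Y") := NULL]
print(m)
```

Both rows of the dyad A-B receive the same centroid, halfway between the two individuals.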

Interface

Two interfaces are available for providing coordinates:

  1. Provide coords and optionally crs. The coords argument expects the names of the X and Y coordinate columns. The crs argument expects a character string or numeric defining the coordinate reference system to be passed to sf::st_crs. For example, for UTM zone 36S (EPSG 32736), the crs argument is crs = "EPSG:32736" or crs = 32736. See https://spatialreference.org for a list of EPSG codes. For centroid calculations, if crs is NULL, it will be internally set to NA_crs_.

  2. (New!) Provide geometry. The geometry argument allows the user to supply a geometry column that represents the coordinates as a simple feature geometry list column. This interface expects the user to prepare their input DT with get_geometry(). To use this interface, leave the coords and crs arguments NULL, and the default argument for geometry ('geometry') will be used directly.
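
As a small illustration of the crs values described above, the numeric and character forms can be compared directly with sf. This sketch only shows how sf interprets the two forms and the missing CRS; it does not call any spatsoc function.

```r
library(sf)

# Numeric and character forms of the same EPSG code
crs_num <- st_crs(32736)
crs_chr <- st_crs("EPSG:32736")

# Both describe the same coordinate reference system
crs_num == crs_chr

# The missing CRS used when no crs is provided
is.na(NA_crs_)
```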

Centroid function

The underlying centroid function used depends on the crs of the coordinates or geometry provided.

Note: if the input is length 1, the input is returned.

Examples

# Load data.table
library(data.table)

# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))

# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
#>            ID        X       Y            datetime population
#>        <char>    <num>   <num>              <POSc>      <int>
#>     1:      A 715851.4 5505340 2016-11-01 00:00:54          1
#>     2:      A 715822.8 5505289 2016-11-01 02:01:22          1
#>     3:      A 715872.9 5505252 2016-11-01 04:01:24          1
#>     4:      A 715820.5 5505231 2016-11-01 06:01:05          1
#>     5:      A 715830.6 5505227 2016-11-01 08:01:11          1
#>    ---                                                       
#> 14293:      J 700616.5 5509069 2017-02-28 14:00:54          1
#> 14294:      J 700622.6 5509065 2017-02-28 16:00:11          1
#> 14295:      J 700657.5 5509277 2017-02-28 18:00:55          1
#> 14296:      J 700610.3 5509269 2017-02-28 20:00:48          1
#> 14297:      J 700744.0 5508782 2017-02-28 22:00:39          1

# Temporal grouping
group_times(DT, datetime = 'datetime', threshold = '20 minutes')
#>            ID        X       Y            datetime population minutes timegroup
#>        <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>     1:      A 715851.4 5505340 2016-11-01 00:00:54          1       0         1
#>     2:      A 715822.8 5505289 2016-11-01 02:01:22          1       0         2
#>     3:      A 715872.9 5505252 2016-11-01 04:01:24          1       0         3
#>     4:      A 715820.5 5505231 2016-11-01 06:01:05          1       0         4
#>     5:      A 715830.6 5505227 2016-11-01 08:01:11          1       0         5
#>    ---                                                                         
#> 14293:      J 700616.5 5509069 2017-02-28 14:00:54          1       0      1393
#> 14294:      J 700622.6 5509065 2017-02-28 16:00:11          1       0      1394
#> 14295:      J 700657.5 5509277 2017-02-28 18:00:55          1       0      1440
#> 14296:      J 700610.3 5509269 2017-02-28 20:00:48          1       0      1395
#> 14297:      J 700744.0 5508782 2017-02-28 22:00:39          1       0      1396

# Edge-list generation
edges <- edge_dist(
    DT,
    threshold = 100,
    id = 'ID',
    coords = c('X', 'Y'),
    timegroup = 'timegroup',
    returnDist = TRUE,
    fillNA = FALSE
  )

# Generate dyad id
dyad_id(edges, id1 = 'ID1', id2 = 'ID2')
#>        timegroup    ID1    ID2  distance dyadID
#>            <int> <char> <char>     <num> <char>
#>     1:         1      G      B  5.782904    B-G
#>     2:         1      H      E 65.061671    E-H
#>     3:         1      B      G  5.782904    B-G
#>     4:         1      E      H 65.061671    E-H
#>     5:         2      H      E 79.659918    E-H
#>    ---                                         
#> 17174:      1440      I      C  2.831071    C-I
#> 17175:      1440      C      F  9.372972    C-F
#> 17176:      1440      I      F  7.512922    F-I
#> 17177:      1440      C      I  2.831071    C-I
#> 17178:      1440      F      I  7.512922    F-I

# Calculate dyad centroid
centroids <- centroid_dyad(
  edges,
  DT,
  id = 'ID',
  coords = c('X', 'Y'),
  timegroup = 'timegroup'
)

print(centroids)
#>        timegroup    ID1    ID2  distance dyadID centroid_X centroid_Y
#>            <int> <char> <char>     <num> <char>      <num>      <num>
#>     1:         1      G      B  5.782904    B-G   699637.9    5509637
#>     2:         1      H      E 65.061671    E-H   701698.0    5504306
#>     3:         1      B      G  5.782904    B-G   699637.9    5509637
#>     4:         1      E      H 65.061671    E-H   701698.0    5504306
#>     5:         2      H      E 79.659918    E-H   701652.4    5504236
#>    ---                                                               
#> 17174:      1440      I      C  2.831071    C-I   702960.6    5509447
#> 17175:      1440      C      F  9.372972    C-F   702960.7    5509451
#> 17176:      1440      I      F  7.512922    F-I   702959.5    5509452
#> 17177:      1440      C      I  2.831071    C-I   702960.6    5509447
#> 17178:      1440      F      I  7.512922    F-I   702959.5    5509452

# Or, using the new geometry interface
get_geometry(DT, coords = c('X', 'Y'), crs = 32736)
#>            ID        X       Y            datetime population minutes timegroup
#>        <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>     1:      A 715851.4 5505340 2016-11-01 00:00:54          1       0         1
#>     2:      A 715822.8 5505289 2016-11-01 02:01:22          1       0         2
#>     3:      A 715872.9 5505252 2016-11-01 04:01:24          1       0         3
#>     4:      A 715820.5 5505231 2016-11-01 06:01:05          1       0         4
#>     5:      A 715830.6 5505227 2016-11-01 08:01:11          1       0         5
#>    ---                                                                         
#> 14293:      J 700616.5 5509069 2017-02-28 14:00:54          1       0      1393
#> 14294:      J 700622.6 5509065 2017-02-28 16:00:11          1       0      1394
#> 14295:      J 700657.5 5509277 2017-02-28 18:00:55          1       0      1440
#> 14296:      J 700610.3 5509269 2017-02-28 20:00:48          1       0      1395
#> 14297:      J 700744.0 5508782 2017-02-28 22:00:39          1       0      1396
#>                        geometry
#>                     <sfc_POINT>
#>     1: POINT (715851.4 5505340)
#>     2: POINT (715822.8 5505289)
#>     3: POINT (715872.9 5505252)
#>     4: POINT (715820.5 5505231)
#>     5: POINT (715830.6 5505227)
#>    ---                         
#> 14293: POINT (700616.5 5509069)
#> 14294: POINT (700622.6 5509065)
#> 14295: POINT (700657.5 5509277)
#> 14296: POINT (700610.3 5509269)
#> 14297:   POINT (700744 5508782)
edges <- edge_dist(DT, threshold = 100, id = 'ID', timegroup = 'timegroup')
dyad_id(edges, id1 = 'ID1', id2 = 'ID2')
#> Key: <timegroup, ID1>
#>        timegroup    ID1    ID2 dyadID
#>            <int> <char> <char> <char>
#>     1:         1      A   <NA>   <NA>
#>     2:         1      B      G    B-G
#>     3:         1      C   <NA>   <NA>
#>     4:         1      D   <NA>   <NA>
#>     5:         1      E      H    E-H
#>    ---                               
#> 22985:      1440      G   <NA>   <NA>
#> 22986:      1440      H   <NA>   <NA>
#> 22987:      1440      I      C    C-I
#> 22988:      1440      I      F    F-I
#> 22989:      1440      J   <NA>   <NA>
centroids <- centroid_dyad(
  edges,
  DT,
  id = 'ID',
  timegroup = 'timegroup'
)
print(centroids)
#>        timegroup    ID1    ID2 dyadID                 geometry
#>            <int> <char> <char> <char>              <sfc_POINT>
#>     1:         1      A   <NA>   <NA> POINT (715851.4 5505340)
#>     2:         1      B      G    B-G POINT (699640.2 5509638)
#>     3:         1      C   <NA>   <NA> POINT (710205.4 5505888)
#>     4:         1      D   <NA>   <NA>   POINT (700875 5490954)
#>     5:         1      E      H    E-H POINT (701671.9 5504286)
#>    ---                                                        
#> 22985:      1440      G   <NA>   <NA>   POINT (698212 5508998)
#> 22986:      1440      H   <NA>   <NA> POINT (699368.1 5507901)
#> 22987:      1440      I      C    C-I POINT (702959.5 5509448)
#> 22988:      1440      I      F    F-I POINT (702959.5 5509448)
#> 22989:      1440      J   <NA>   <NA> POINT (700657.5 5509277)
#>                        centroid
#>                     <sfc_POINT>
#>     1:              POINT EMPTY
#>     2: POINT (699637.9 5509637)
#>     3:              POINT EMPTY
#>     4:              POINT EMPTY
#>     5:   POINT (701698 5504306)
#>    ---                         
#> 22985:              POINT EMPTY
#> 22986:              POINT EMPTY
#> 22987: POINT (702960.6 5509447)
#> 22988: POINT (702959.5 5509452)
#> 22989:              POINT EMPTY