
distance_to_centroid calculates the distance of each relocation to the centroid of the spatiotemporal group identified by group_pts. The function expects a data.table with relocation data appended with a group column from group_pts and centroid columns from centroid_group. Relocation data should be provided in two columns representing the X and Y coordinates, or in a geometry column prepared by the helper function get_geometry().

Usage

distance_to_centroid(
  DT = NULL,
  coords = NULL,
  group = "group",
  crs = NULL,
  return_rank = TRUE,
  ties.method = NULL,
  geometry = "geometry"
)

Arguments

DT

input data.table with centroid columns generated by, e.g., centroid_group

coords

character vector of X coordinate and Y coordinate column names. Note: the order is assumed X followed by Y column names

group

group column name, generated by group_pts, default 'group'

crs

numeric or character defining the coordinate reference system to be passed to sf::st_crs. For example, either crs = "EPSG:32736" or crs = 32736. Used only if coords are provided, see details under Interface

return_rank

logical indicating whether the rank distance should also be returned, default TRUE

ties.method

see ?data.table::frank()

geometry

simple feature geometry list column name, generated by get_geometry(). Default 'geometry', see details under Interface

Value

distance_to_centroid returns the input DT appended with a distance_centroid column indicating the distance to the group centroid and, optionally, a rank_distance_centroid column indicating the within group rank distance to the group centroid (if return_rank = TRUE).

A message is returned when distance_centroid and optional rank_distance_centroid columns already exist in the input DT, because they will be overwritten.

See details for appending outputs using modify-by-reference in the FAQ.

Details

The DT must be a data.table. If your data is a data.frame, you can convert it by reference using data.table::setDT() or by reassigning using data.table::data.table().
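
As a minimal sketch of the two conversion routes mentioned above, the following uses a small, made-up data.frame (the `df` object and its values are hypothetical and only for illustration):

```r
library(data.table)

# Hypothetical relocation data stored as a plain data.frame
df <- data.frame(
  ID = c("A", "B"),
  X = c(696191.5, 696205.2),
  Y = c(5508362, 5508363)
)

# Option 1: reassign, which builds a new data.table and leaves df untouched
DT <- data.table(df)

# Option 2: convert by reference, so df itself becomes a data.table
setDT(df)

# Both objects are now data.tables
is.data.table(DT)
is.data.table(df)
```

Converting by reference avoids copying the data, which may matter for large relocation datasets.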

This function expects a group column generated with the group_pts function and centroid coordinate column(s) generated with the centroid_group function. The group argument expects the name of the column in DT which corresponds to the group column. The return_rank argument controls whether the rank of each individual's distance to the group centroid is also returned. The ties.method argument is passed to data.table::frank, see details at ?data.table::frank().
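
To illustrate what the distance and rank columns represent, here is a rough sketch using data.table only. It is not the package's implementation: it assumes projected coordinates and plain Euclidean distance, and the data and the column names `distance_sketch` and `rank_sketch` are hypothetical.

```r
library(data.table)

# Hypothetical relocations with a group column
DT <- data.table(
  ID = c("A", "B", "C", "D"),
  X = c(0, 4, 2, 10),
  Y = c(0, 0, 3, 10),
  group = c(1L, 1L, 1L, 2L)
)

# Stand-in for the centroid columns: mean X and Y within each group
DT[, c("centroid_X", "centroid_Y") := .(mean(X), mean(Y)), by = group]

# Euclidean distance from each relocation to its group centroid
DT[, distance_sketch := sqrt((X - centroid_X)^2 + (Y - centroid_Y)^2)]

# Within-group rank of that distance; ties.method is passed to frank()
DT[, rank_sketch := frank(distance_sketch, ties.method = "min"), by = group]

print(DT)
```

In this toy data, A and B are equally far from the centroid of group 1, so the choice of ties.method determines whether they share a rank ("min", "max", "average", "dense") or are separated ("first", "random").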

See below under "Interface" for details on providing coordinates and under "Distance function" for details on the underlying distance function used.

Interface

Two interfaces are available for providing coordinates:

  1. Provide coords and crs. The coords argument expects the names of the X and Y coordinate columns. The crs argument expects a character string or numeric defining the coordinate reference system to be passed to sf::st_crs. For example, for UTM zone 36S (EPSG 32736), the crs argument is crs = "EPSG:32736" or crs = 32736. See https://spatialreference.org for a list of EPSG codes.

  2. (New!) Provide geometry. The geometry argument allows the user to supply a geometry column that represents the coordinates as a simple feature geometry list column. This interface expects the user to prepare their input DT with get_geometry(). To use this interface, leave the coords and crs arguments NULL, and the default argument for geometry ('geometry') will be used directly.
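
As a small sketch of the two accepted forms of the crs argument, the following passes each directly to sf::st_crs and compares the EPSG code that sf resolves. It only demonstrates the sf call, not how the function handles crs internally; the object names `crs_chr` and `crs_num` are hypothetical.

```r
library(sf)

# Both forms describe UTM zone 36S
crs_chr <- st_crs("EPSG:32736")
crs_num <- st_crs(32736)

# EPSG code resolved from each form
crs_chr$epsg
crs_num$epsg
```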

Distance function

The underlying distance function used depends on the crs of the coordinates or geometry provided.

Note: regardless of the distance function used, if the coordinates are NA then the result will be NA.
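
The NA behaviour can be illustrated with a base R sketch. It assumes projected coordinates and a plain Euclidean distance, which may not be the distance function actually used for a given crs; all values and the name `dist_sketch` are hypothetical.

```r
# Hypothetical projected coordinates with one missing X value
x <- c(3, NA, 0)
y <- c(4, 1, 0)

# Hypothetical centroid coordinates, one per relocation
centroid_x <- c(0, 0, 0)
centroid_y <- c(0, 0, 0)

# Euclidean distance; arithmetic on NA propagates NA to the result
dist_sketch <- sqrt((x - centroid_x)^2 + (y - centroid_y)^2)

# Only the relocation with the missing coordinate has an NA distance
is.na(dist_sketch)
```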

References

See examples of using distance to group centroid:

Examples

# Load packages
library(data.table)
library(spatsoc)
# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))
# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
#>          ID        X       Y            datetime population
#>      <char>    <num>   <num>              <POSc>      <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1
#>  ---                                                       
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1

# Temporal grouping
group_times(DT, datetime = 'datetime', threshold = '20 minutes')
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12

# Spatial grouping with timegroup
group_pts(DT, threshold = 5, id = 'ID',
          coords = c('X', 'Y'), timegroup = 'timegroup')
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      group
#>      <int>
#>   1:     1
#>   2:     2
#>   3:     3
#>   4:     4
#>   5:     5
#>  ---      
#> 116:   109
#> 117:   110
#> 118:   111
#> 119:   112
#> 120:    59

# Calculate group centroid
centroid_group(DT, coords = c('X', 'Y'), group = 'group')
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      group centroid_X centroid_Y
#>      <int>      <num>      <num>
#>   1:     1   696191.5    5508362
#>   2:     2   696205.2    5508363
#>   3:     3   696745.8    5508225
#>   4:     4   696952.0    5508373
#>   5:     5   696074.7    5508214
#>  ---                            
#> 116:   109   696996.5    5508024
#> 117:   110   697046.4    5507922
#> 118:   111   697037.5    5507924
#> 119:   112   697303.0    5508347
#> 120:    59   696617.8    5508734

# Calculate distance to group centroid
distance_to_centroid(
  DT,
  coords = c('X', 'Y'),
  group = 'group'
)
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      group centroid_X centroid_Y distance_centroid rank_distance_centroid
#>      <int>      <num>      <num>             <num>                  <num>
#>   1:     1   696191.5    5508362          0.000000                      1
#>   2:     2   696205.2    5508363          0.000000                      1
#>   3:     3   696745.8    5508225          0.000000                      1
#>   4:     4   696952.0    5508373          0.000000                      1
#>   5:     5   696074.7    5508214          5.973590                      4
#>  ---                                                                     
#> 116:   109   696996.5    5508024          0.000000                      1
#> 117:   110   697046.4    5507922          0.000000                      1
#> 118:   111   697037.5    5507924          0.000000                      1
#> 119:   112   697303.0    5508347          0.000000                      1
#> 120:    59   696617.8    5508734          2.136197                      1

# Or, using the new geometry interface
get_geometry(DT, coords = c('X', 'Y'), crs = 32736)
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      group centroid_X centroid_Y distance_centroid rank_distance_centroid
#>      <int>      <num>      <num>             <num>                  <num>
#>   1:     1   696191.5    5508362          0.000000                      1
#>   2:     2   696205.2    5508363          0.000000                      1
#>   3:     3   696745.8    5508225          0.000000                      1
#>   4:     4   696952.0    5508373          0.000000                      1
#>   5:     5   696074.7    5508214          5.973590                      4
#>  ---                                                                     
#> 116:   109   696996.5    5508024          0.000000                      1
#> 117:   110   697046.4    5507922          0.000000                      1
#> 118:   111   697037.5    5507924          0.000000                      1
#> 119:   112   697303.0    5508347          0.000000                      1
#> 120:    59   696617.8    5508734          2.136197                      1
#>                      geometry
#>                   <sfc_POINT>
#>   1: POINT (696191.5 5508362)
#>   2: POINT (696205.2 5508363)
#>   3: POINT (696745.8 5508225)
#>   4:   POINT (696952 5508373)
#>   5:   POINT (696079 5508218)
#>  ---                         
#> 116: POINT (696996.5 5508024)
#> 117: POINT (697046.4 5507922)
#> 118: POINT (697037.5 5507924)
#> 119:   POINT (697303 5508347)
#> 120: POINT (696616.7 5508736)

# Spatial grouping using the geometry column
group_pts(DT, threshold = 5, id = 'ID', timegroup = 'timegroup')
#> group column will be overwritten by this function
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      centroid_X centroid_Y distance_centroid rank_distance_centroid
#>           <num>      <num>             <num>                  <num>
#>   1:   696191.5    5508362          0.000000                      1
#>   2:   696205.2    5508363          0.000000                      1
#>   3:   696745.8    5508225          0.000000                      1
#>   4:   696952.0    5508373          0.000000                      1
#>   5:   696074.7    5508214          5.973590                      4
#>  ---                                                               
#> 116:   696996.5    5508024          0.000000                      1
#> 117:   697046.4    5507922          0.000000                      1
#> 118:   697037.5    5507924          0.000000                      1
#> 119:   697303.0    5508347          0.000000                      1
#> 120:   696617.8    5508734          2.136197                      1
#>                      geometry group
#>                   <sfc_POINT> <int>
#>   1: POINT (696191.5 5508362)     1
#>   2: POINT (696205.2 5508363)     2
#>   3: POINT (696745.8 5508225)     3
#>   4:   POINT (696952 5508373)     4
#>   5:   POINT (696079 5508218)     5
#>  ---                               
#> 116: POINT (696996.5 5508024)   109
#> 117: POINT (697046.4 5507922)   110
#> 118: POINT (697037.5 5507924)   111
#> 119:   POINT (697303 5508347)   112
#> 120: POINT (696616.7 5508736)    59

# Calculate group centroid using the geometry column
centroid_group(DT)
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      centroid_X centroid_Y distance_centroid rank_distance_centroid
#>           <num>      <num>             <num>                  <num>
#>   1:   696191.5    5508362          0.000000                      1
#>   2:   696205.2    5508363          0.000000                      1
#>   3:   696745.8    5508225          0.000000                      1
#>   4:   696952.0    5508373          0.000000                      1
#>   5:   696074.7    5508214          5.973590                      4
#>  ---                                                               
#> 116:   696996.5    5508024          0.000000                      1
#> 117:   697046.4    5507922          0.000000                      1
#> 118:   697037.5    5507924          0.000000                      1
#> 119:   697303.0    5508347          0.000000                      1
#> 120:   696617.8    5508734          2.136197                      1
#>                      geometry group                 centroid
#>                   <sfc_POINT> <int>              <sfc_POINT>
#>   1: POINT (696191.5 5508362)     1 POINT (696191.5 5508362)
#>   2: POINT (696205.2 5508363)     2 POINT (696205.2 5508363)
#>   3: POINT (696745.8 5508225)     3 POINT (696745.8 5508225)
#>   4:   POINT (696952 5508373)     4   POINT (696952 5508373)
#>   5:   POINT (696079 5508218)     5 POINT (696074.7 5508214)
#>  ---                                                        
#> 116: POINT (696996.5 5508024)   109 POINT (696996.5 5508024)
#> 117: POINT (697046.4 5507922)   110 POINT (697046.4 5507922)
#> 118: POINT (697037.5 5507924)   111 POINT (697037.5 5507924)
#> 119:   POINT (697303 5508347)   112   POINT (697303 5508347)
#> 120: POINT (696616.7 5508736)    59 POINT (696617.8 5508734)

# Calculate direction to group centroid using the geometry column
direction_to_centroid(DT)
#>          ID        X       Y            datetime population minutes timegroup
#>      <char>    <num>   <num>              <POSc>      <int>   <int>     <int>
#>   1:      A 696191.5 5508362 2017-01-17 00:00:47          1       0         1
#>   2:      A 696205.2 5508363 2017-01-17 02:00:48          1       0         2
#>   3:      A 696745.8 5508225 2017-01-17 04:00:48          1       0         3
#>   4:      A 696952.0 5508373 2017-01-17 06:00:54          1       0         4
#>   5:      A 696079.0 5508218 2017-01-17 08:00:54          1       0         5
#>  ---                                                                         
#> 116:      J 696996.5 5508024 2017-01-17 14:00:42          1       0         8
#> 117:      J 697046.4 5507922 2017-01-17 16:00:47          1       0         9
#> 118:      J 697037.5 5507924 2017-01-17 18:00:54          1       0        10
#> 119:      J 697303.0 5508347 2017-01-17 20:00:24          1       0        11
#> 120:      J 696616.7 5508736 2017-01-17 22:00:42          1       0        12
#>      centroid_X centroid_Y distance_centroid rank_distance_centroid
#>           <num>      <num>             <num>                  <num>
#>   1:   696191.5    5508362          0.000000                      1
#>   2:   696205.2    5508363          0.000000                      1
#>   3:   696745.8    5508225          0.000000                      1
#>   4:   696952.0    5508373          0.000000                      1
#>   5:   696074.7    5508214          5.973590                      4
#>  ---                                                               
#> 116:   696996.5    5508024          0.000000                      1
#> 117:   697046.4    5507922          0.000000                      1
#> 118:   697037.5    5507924          0.000000                      1
#> 119:   697303.0    5508347          0.000000                      1
#> 120:   696617.8    5508734          2.136197                      1
#>                      geometry group                 centroid direction_centroid
#>                   <sfc_POINT> <int>              <sfc_POINT>            <units>
#>   1: POINT (696191.5 5508362)     1 POINT (696191.5 5508362)          NaN [rad]
#>   2: POINT (696205.2 5508363)     2 POINT (696205.2 5508363)          NaN [rad]
#>   3: POINT (696745.8 5508225)     3 POINT (696745.8 5508225)          NaN [rad]
#>   4:   POINT (696952 5508373)     4   POINT (696952 5508373)          NaN [rad]
#>   5:   POINT (696079 5508218)     5 POINT (696074.7 5508214)    -2.374202 [rad]
#>  ---                                                                           
#> 116: POINT (696996.5 5508024)   109 POINT (696996.5 5508024)          NaN [rad]
#> 117: POINT (697046.4 5507922)   110 POINT (697046.4 5507922)          NaN [rad]
#> 118: POINT (697037.5 5507924)   111 POINT (697037.5 5507924)          NaN [rad]
#> 119:   POINT (697303 5508347)   112   POINT (697303 5508347)          NaN [rad]
#> 120: POINT (696616.7 5508736)    59 POINT (696617.8 5508734)     2.581654 [rad]