group_pts groups rows into spatial groups. The function expects a data.table with relocation data, individual identifiers and a threshold argument. The threshold argument specifies the distance between points that defines a group. Relocation data should be in two columns representing the X and Y coordinates.
Arguments
- DT
input data.table
- threshold
distance for grouping points, in the units of the coordinates
- id
character string of ID column name
- coords
character vector of X coordinate and Y coordinate column names. Note: the order is assumed X followed by Y column names.
- timegroup
timegroup field in the DT within which the grouping will be calculated
- splitBy
(optional) character string or vector of grouping column name(s) upon which the grouping will be calculated
Value
group_pts returns the input DT appended with a group column.
This column represents the spatiotemporal group. As with the other grouping functions, the actual value of group is arbitrary and represents the identity of a given group where 1 or more individuals are assigned to a group. If the data were reordered, the value of group may change, but the contents of each group would not.
A message is returned when a column named group already exists in the input DT, because it will be overwritten.
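Because the group values are only identifiers, they are usually summarised rather than read directly. The following is a minimal sketch of counting individuals per group. It assumes a small hypothetical table, DT_out, built by hand to mimic the columns group_pts appends; it is not output from the spatsoc example data.

```r
library(data.table)

# Hypothetical table shaped like group_pts output (values are made up)
DT_out <- data.table(
  ID = c("A", "B", "C", "A", "B"),
  timegroup = c(1L, 1L, 1L, 2L, 2L),
  group = c(1L, 1L, 2L, 3L, 4L)
)

# Number of distinct individuals assigned to each group
group_sizes <- DT_out[, .(n_ind = uniqueN(ID)), by = group]

# Keep only groups that contain at least two individuals
group_sizes[n_ind > 1]
```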
Details
The DT must be a data.table. If your data is a data.frame, you can convert it by reference using data.table::setDT or by reassigning using data.table::data.table.
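The two conversion routes differ in whether the original object is modified. A minimal sketch, assuming a small hypothetical data.frame named df (not part of the spatsoc example data):

```r
library(data.table)

# Hypothetical data.frame with an ID column and X, Y coordinates
df <- data.frame(
  ID = c("A", "B"),
  X = c(715851.4, 715822.8),
  Y = c(5505340, 5505289)
)

# Option 1: reassign, creating a new data.table and leaving df as it was
DT_copy <- data.table(df)

# Option 2: convert by reference, so df itself becomes a data.table
setDT(df)

is.data.table(DT_copy)
is.data.table(df)
```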
The id, coords, timegroup (and optional splitBy) arguments expect the names of columns in DT which correspond to the individual identifier, X and Y coordinates, timegroup (typically generated by group_times) and additional grouping columns.
The threshold must be provided in the units of the coordinates. The threshold must be larger than 0. The coordinates must be planar coordinates (e.g. UTM). In the case of UTM, a threshold = 50 would indicate a 50 m distance threshold.
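To make the units concrete, the sketch below computes a planar distance between two hypothetical UTM positions and compares it with two candidate thresholds. The points p1 and p2 are invented for illustration, and the comparison is only meant to show the units involved, not how group_pts evaluates distances internally.

```r
# Two hypothetical UTM positions in metres, 18 m apart in X and 24 m in Y
p1 <- c(X = 715851, Y = 5505340)
p2 <- c(X = 715869, Y = 5505364)

# Planar Euclidean distance, in the same units as the coordinates
d <- sqrt(sum((p1 - p2)^2))
d

d < 50  # within a 50 m threshold
d < 5   # not within a 5 m threshold
```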
The timegroup argument is required to define the temporal groups within which spatial groups are calculated. The intended framework is to group rows temporally with group_times then spatially with group_pts (or group_lines, group_polys). If you have already calculated temporal groups without group_times, you can pass this column to the timegroup argument. Note that the expectation is that each individual will be observed only once per timegroup. Be careful not to accidentally include huge numbers of rows within timegroups: this can overload your machine, since all pairwise distances are calculated within each timegroup.
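Both expectations can be checked before grouping. The sketch below, using a hypothetical table DT_check with invented values, lists individuals that appear more than once in a timegroup and estimates how many pairwise distances each timegroup implies, taking n rows to give n(n - 1)/2 pairs.

```r
library(data.table)

# Hypothetical relocations: individual A appears twice in timegroup 1
DT_check <- data.table(
  ID = c("A", "A", "B", "A", "B"),
  timegroup = c(1L, 1L, 1L, 2L, 2L)
)

# Individuals observed more than once within a timegroup
dups <- DT_check[, .N, by = .(ID, timegroup)][N > 1]
dups

# Rows per timegroup and the number of pairwise distances that implies
sizes <- DT_check[, .(n_rows = .N), by = timegroup]
sizes[, n_pairs := n_rows * (n_rows - 1) / 2]
sizes
```

An empty dups table and modest n_pairs values suggest the timegroups are safe to pass on to spatial grouping.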
The splitBy argument offers further control over grouping. If within your DT, you have multiple populations, subgroups or other distinct parts, you can provide the name of the column which identifies them to splitBy. The grouping performed by group_pts will only consider rows within each splitBy subgroup.
See also
Other Spatial grouping: group_lines(), group_polys()
Examples
# Load data.table and spatsoc
library(data.table)
library(spatsoc)
# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))
# Select only individuals A, B, C for this example
DT <- DT[ID %in% c('A', 'B', 'C')]
# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
#> ID X Y datetime population
#> <char> <num> <num> <POSc> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1
#> ---
#> 4265: C 702093.6 5510180 2017-02-28 14:00:44 1
#> 4266: C 702086.0 5510183 2017-02-28 16:00:42 1
#> 4267: C 702961.8 5509447 2017-02-28 18:00:53 1
#> 4268: C 703130.4 5509528 2017-02-28 20:00:54 1
#> 4269: C 702872.3 5508531 2017-02-28 22:00:18 1
# Temporal grouping
group_times(DT, datetime = 'datetime', threshold = '20 minutes')
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 4265: C 702093.6 5510180 2017-02-28 14:00:44 1 0 1393
#> 4266: C 702086.0 5510183 2017-02-28 16:00:42 1 0 1394
#> 4267: C 702961.8 5509447 2017-02-28 18:00:53 1 0 1440
#> 4268: C 703130.4 5509528 2017-02-28 20:00:54 1 0 1395
#> 4269: C 702872.3 5508531 2017-02-28 22:00:18 1 0 1396
# Spatial grouping with timegroup
group_pts(DT, threshold = 5, id = 'ID',
          coords = c('X', 'Y'), timegroup = 'timegroup')
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 4265: C 702093.6 5510180 2017-02-28 14:00:44 1 0 1393
#> 4266: C 702086.0 5510183 2017-02-28 16:00:42 1 0 1394
#> 4267: C 702961.8 5509447 2017-02-28 18:00:53 1 0 1440
#> 4268: C 703130.4 5509528 2017-02-28 20:00:54 1 0 1395
#> 4269: C 702872.3 5508531 2017-02-28 22:00:18 1 0 1396
#> group
#> <int>
#> 1: 1
#> 2: 2
#> 3: 3
#> 4: 4
#> 5: 5
#> ---
#> 4265: 4228
#> 4266: 4229
#> 4267: 4230
#> 4268: 4231
#> 4269: 4232
# Spatial grouping with timegroup and splitBy on population
group_pts(DT, threshold = 5, id = 'ID', coords = c('X', 'Y'),
          timegroup = 'timegroup', splitBy = 'population')
#> group column will be overwritten by this function
#> ID X Y datetime population minutes timegroup
#> <char> <num> <num> <POSc> <int> <int> <int>
#> 1: A 715851.4 5505340 2016-11-01 00:00:54 1 0 1
#> 2: A 715822.8 5505289 2016-11-01 02:01:22 1 0 2
#> 3: A 715872.9 5505252 2016-11-01 04:01:24 1 0 3
#> 4: A 715820.5 5505231 2016-11-01 06:01:05 1 0 4
#> 5: A 715830.6 5505227 2016-11-01 08:01:11 1 0 5
#> ---
#> 4265: C 702093.6 5510180 2017-02-28 14:00:44 1 0 1393
#> 4266: C 702086.0 5510183 2017-02-28 16:00:42 1 0 1394
#> 4267: C 702961.8 5509447 2017-02-28 18:00:53 1 0 1440
#> 4268: C 703130.4 5509528 2017-02-28 20:00:54 1 0 1395
#> 4269: C 702872.3 5508531 2017-02-28 22:00:18 1 0 1396
#> group
#> <int>
#> 1: 1
#> 2: 2
#> 3: 3
#> 4: 4
#> 5: 5
#> ---
#> 4265: 4228
#> 4266: 4229
#> 4267: 4230
#> 4268: 4231
#> 4269: 4232