group_lines groups rows into spatial groups by generating LINESTRINGs and grouping based on spatial intersection. The function expects a data.table with relocation data, individual identifiers and a distance threshold. The relocation data is transformed into sf LINESTRINGs using build_lines, and intersecting LINESTRINGs are grouped. The threshold argument specifies the distance criterion for grouping. Relocation data should be in two columns representing the X and Y coordinates.

Usage

group_lines(
  DT = NULL,
  threshold = NULL,
  projection = NULL,
  id = NULL,
  coords = NULL,
  timegroup = NULL,
  sortBy = NULL,
  splitBy = NULL,
  sfLines = NULL
)

Arguments

DT

input data.table

threshold

The width of the buffer around the lines in the units of the projection. Use threshold = 0 to compare intersection without buffering.

projection

numeric or character defining the coordinate reference system to be passed to sf::st_crs. For example, either projection = "EPSG:32736" or projection = 32736.

id

character string of ID column name

coords

character vector of X coordinate and Y coordinate column names. Note: the order is assumed X followed by Y column names.

timegroup

timegroup field in the DT within which the grouping will be calculated

sortBy

Character string of date time column(s) to sort rows by. Must be of type POSIXct.

splitBy

(optional) character string or vector of grouping column name(s) upon which the grouping will be calculated

sfLines

As an alternative to providing a DT, provide a simple feature LINESTRING object generated with the sf package. The id argument is required to provide the identifier matching each LINESTRING. If an sfLines object is provided, groups cannot be calculated by timegroup or splitBy.

Value

group_lines returns the input DT appended with a "group" column.

This column represents the spatial (or, if timegroup was provided, spatiotemporal) group calculated by intersecting lines. As with the other grouping functions, the actual value of group is arbitrary and represents the identity of a given group, where 1 or more individuals are assigned to a group. If the data were reordered, the group values may change, but the contents of each group would not.

A message is returned when a column named "group" already exists in the input DT, because it will be overwritten.

Details

R-spatial evolution

Please note, spatsoc has followed updates from R spatial, GDAL and PROJ for handling projections, see more at https://r-spatial.org/r/2020/03/17/wkt.html.

In addition, group_lines (and build_lines) previously used sp::SpatialLines, rgeos::gIntersects, rgeos::gBuffer but have been updated to use sf::st_as_sf, sf::st_linestring, sf::st_intersects, and sf::st_buffer according to the R-spatial evolution, see more at https://r-spatial.org/r/2022/04/12/evolution.html.

Notes on arguments

The DT must be a data.table. If your data is a data.frame, you can convert it by reference using data.table::setDT.
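As a minimal sketch of that conversion, the block below builds a small, made-up data.frame (the name df and its values are illustrative only) and converts it with data.table::setDT, which modifies the object in place rather than returning a copy.

```r
library(data.table)

# Hypothetical relocation data held in a plain data.frame
df <- data.frame(
  ID = c("A", "A", "B"),
  X = c(715851.4, 715822.8, 715900.0),
  Y = c(5505340, 5505289, 5505300)
)

# setDT converts by reference: no copy is made and no reassignment is needed
setDT(df)

# Confirm the object is now a data.table
is.data.table(df)
```

After this call, df can be passed as the DT argument.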

The id, coords, sortBy (and optional timegroup and splitBy) arguments expect the names of respective columns in DT which correspond to the individual identifier, X and Y coordinates, sorting, timegroup (generated by group_times) and additional grouping columns.

The projection argument expects a numeric or character defining the coordinate reference system. For example, for UTM zone 36S (EPSG 32736), the projection argument is either projection = 'EPSG:32736' or projection = 32736. See details in sf::st_crs() and https://spatialreference.org for a list of EPSG codes.
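To check that the numeric and character forms describe the same coordinate reference system, both can be passed to sf::st_crs and compared. This is a small sketch, and the variable names crs_num and crs_chr are illustrative.

```r
library(sf)

# Resolve the same EPSG code from a number and from a character string
crs_num <- st_crs(32736)
crs_chr <- st_crs("EPSG:32736")

# The two crs objects should compare as equal
crs_num == crs_chr

# The EPSG code recorded on the resolved crs object
crs_num$epsg
```

Printing crs_num also shows the full definition, which is a quick way to confirm the zone and units before choosing a threshold.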

The sortBy argument is used to order the input DT when creating sf LINESTRINGs. It must be a column in the input DT of type POSIXct to ensure the rows are sorted by date time.

The threshold must be provided in the units of the coordinates. The threshold can be equal to 0 if strict overlap is intended, otherwise it should be some value greater than 0. The coordinates must be planar coordinates (e.g.: UTM). In the case of UTM, a threshold = 50 would indicate a 50m distance threshold.
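The difference between strict overlap and a buffered comparison can be illustrated with sf directly. The sketch below is a simplified illustration, not the internal code of group_lines, and how group_lines applies the buffer may differ. The two tracks and their coordinates are made up: parallel lines 30 m apart in a UTM projection.

```r
library(sf)

# Two hypothetical parallel tracks, 30 m apart, in UTM zone 36S (EPSG:32736)
track_a <- st_sfc(
  st_linestring(rbind(c(715800, 5505300), c(715900, 5505300))),
  crs = 32736
)
track_b <- st_sfc(
  st_linestring(rbind(c(715800, 5505330), c(715900, 5505330))),
  crs = 32736
)

# Strict overlap, analogous to threshold = 0: test the bare lines
overlap_strict <- any(st_intersects(track_a, track_b, sparse = FALSE))

# Buffer one track by 50 m (units of the projection) before testing
overlap_buffered <- any(
  st_intersects(st_buffer(track_a, dist = 50), track_b, sparse = FALSE)
)

overlap_strict
overlap_buffered
```

Here the bare lines never touch, so the strict test finds no overlap, while the second track falls inside the 50 m buffer of the first.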

The timegroup argument is optional, but pairing it with group_times is recommended. The intended framework is to group rows temporally with group_times then spatially with group_lines (or group_pts, group_polys). With group_lines, pick a relevant group_times threshold such as '1 day' or '7 days' which is informed by your study species, system or question.

The splitBy argument offers further control when building LINESTRINGs. If your input DT has multiple temporal groups (e.g.: years), you can provide the name of the column which identifies them and build LINESTRINGs for each individual in each year. The grouping performed by group_lines will only consider rows within each splitBy subgroup.
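A sketch of the yearly case, using the example data shipped with spatsoc: the column name yr is hypothetical and is derived here with data.table::year. The other arguments follow the usage shown in this page's examples.

```r
library(data.table)
library(spatsoc)

# Read the example data shipped with spatsoc and keep three individuals
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))
DT <- DT[ID %in% c("A", "B", "C")]
DT[, datetime := as.POSIXct(datetime, tz = "UTC")]

# Hypothetical splitBy column: calendar year of each relocation
DT[, yr := year(datetime)]

# Daily timegroups, then lines built and grouped within each year
group_times(DT, datetime = "datetime", threshold = "1 day")
group_lines(DT, threshold = 50, projection = 32736,
            id = "ID", coords = c("X", "Y"),
            timegroup = "timegroup", sortBy = "datetime",
            splitBy = "yr")
```

With this setup, lines from 2016 are only compared with other lines from 2016, and likewise for 2017.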

See also

build_lines, group_times

Other Spatial grouping: group_polys(), group_pts()

Examples

# Load data.table
library(data.table)

# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))

# Subset only individuals A, B, and C
DT <- DT[ID %in% c('A', 'B', 'C')]

# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]
#>           ID        X       Y            datetime population
#>       <char>    <num>   <num>              <POSc>      <int>
#>    1:      A 715851.4 5505340 2016-11-01 00:00:54          1
#>    2:      A 715822.8 5505289 2016-11-01 02:01:22          1
#>    3:      A 715872.9 5505252 2016-11-01 04:01:24          1
#>    4:      A 715820.5 5505231 2016-11-01 06:01:05          1
#>    5:      A 715830.6 5505227 2016-11-01 08:01:11          1
#>   ---                                                       
#> 4265:      C 702093.6 5510180 2017-02-28 14:00:44          1
#> 4266:      C 702086.0 5510183 2017-02-28 16:00:42          1
#> 4267:      C 702961.8 5509447 2017-02-28 18:00:53          1
#> 4268:      C 703130.4 5509528 2017-02-28 20:00:54          1
#> 4269:      C 702872.3 5508531 2017-02-28 22:00:18          1

# EPSG code for example data
utm <- 32736

group_lines(DT, threshold = 50, projection = utm, sortBy = 'datetime',
            id = 'ID', coords = c('X', 'Y'))
#>           ID        X       Y            datetime population group
#>       <char>    <num>   <num>              <POSc>      <int> <num>
#>    1:      A 715851.4 5505340 2016-11-01 00:00:54          1     1
#>    2:      A 715822.8 5505289 2016-11-01 02:01:22          1     1
#>    3:      A 715872.9 5505252 2016-11-01 04:01:24          1     1
#>    4:      A 715820.5 5505231 2016-11-01 06:01:05          1     1
#>    5:      A 715830.6 5505227 2016-11-01 08:01:11          1     1
#>   ---                                                             
#> 4265:      C 702093.6 5510180 2017-02-28 14:00:44          1     1
#> 4266:      C 702086.0 5510183 2017-02-28 16:00:42          1     1
#> 4267:      C 702961.8 5509447 2017-02-28 18:00:53          1     1
#> 4268:      C 703130.4 5509528 2017-02-28 20:00:54          1     1
#> 4269:      C 702872.3 5508531 2017-02-28 22:00:18          1     1

## Daily movement tracks
# Temporal grouping
group_times(DT, datetime = 'datetime', threshold = '1 day')
#>           ID        X       Y            datetime population group timegroup
#>       <char>    <num>   <num>              <POSc>      <int> <num>     <int>
#>    1:      A 715851.4 5505340 2016-11-01 00:00:54          1     1         1
#>    2:      A 715822.8 5505289 2016-11-01 02:01:22          1     1         1
#>    3:      A 715872.9 5505252 2016-11-01 04:01:24          1     1         1
#>    4:      A 715820.5 5505231 2016-11-01 06:01:05          1     1         1
#>    5:      A 715830.6 5505227 2016-11-01 08:01:11          1     1         1
#>   ---                                                                       
#> 4265:      C 702093.6 5510180 2017-02-28 14:00:44          1     1       120
#> 4266:      C 702086.0 5510183 2017-02-28 16:00:42          1     1       120
#> 4267:      C 702961.8 5509447 2017-02-28 18:00:53          1     1       120
#> 4268:      C 703130.4 5509528 2017-02-28 20:00:54          1     1       120
#> 4269:      C 702872.3 5508531 2017-02-28 22:00:18          1     1       120

# Subset only the first 24 days (timegroups 1 to 24)
DT <- DT[timegroup < 25]

# Spatial grouping
group_lines(DT, threshold = 50, projection = utm,
            id = 'ID', coords = c('X', 'Y'),
            timegroup = 'timegroup', sortBy = 'datetime')
#> group column will be overwritten by this function
#>          ID        X       Y            datetime population timegroup group
#>      <char>    <num>   <num>              <POSc>      <int>     <int> <int>
#>   1:      A 715851.4 5505340 2016-11-01 00:00:54          1         1     1
#>   2:      A 715822.8 5505289 2016-11-01 02:01:22          1         1     1
#>   3:      A 715872.9 5505252 2016-11-01 04:01:24          1         1     1
#>   4:      A 715820.5 5505231 2016-11-01 06:01:05          1         1     1
#>   5:      A 715830.6 5505227 2016-11-01 08:01:11          1         1     1
#>  ---                                                                       
#> 857:      C 710769.9 5507380 2016-11-24 14:00:55          1        24    63
#> 858:      C 710930.9 5507290 2016-11-24 16:00:26          1        24    63
#> 859:      C 711004.1 5507310 2016-11-24 18:00:49          1        24    63
#> 860:      C 711274.1 5507269 2016-11-24 20:00:24          1        24    63
#> 861:      C 711054.3 5506998 2016-11-24 22:00:41          1        24    63

## Daily movement tracks by population
group_lines(DT, threshold = 50, projection = utm,
            id = 'ID', coords = c('X', 'Y'),
            timegroup = 'timegroup', sortBy = 'datetime',
            splitBy = 'population')
#> group column will be overwritten by this function
#>          ID        X       Y            datetime population timegroup group
#>      <char>    <num>   <num>              <POSc>      <int>     <int> <int>
#>   1:      A 715851.4 5505340 2016-11-01 00:00:54          1         1     1
#>   2:      A 715822.8 5505289 2016-11-01 02:01:22          1         1     1
#>   3:      A 715872.9 5505252 2016-11-01 04:01:24          1         1     1
#>   4:      A 715820.5 5505231 2016-11-01 06:01:05          1         1     1
#>   5:      A 715830.6 5505227 2016-11-01 08:01:11          1         1     1
#>  ---                                                                       
#> 857:      C 710769.9 5507380 2016-11-24 14:00:55          1        24    63
#> 858:      C 710930.9 5507290 2016-11-24 16:00:26          1        24    63
#> 859:      C 711004.1 5507310 2016-11-24 18:00:49          1        24    63
#> 860:      C 711274.1 5507269 2016-11-24 20:00:24          1        24    63
#> 861:      C 711054.3 5506998 2016-11-24 22:00:41          1        24    63