Check whether an annotation file contains outlier lines
Source: R/check_annotation_biomartr.R
Some annotation files contain lines longer than 65000 characters.
Such lines cause problems when importing the file into R using import.
To overcome this issue, this function screens a given annotation file
for such outlier lines and, if requested, removes them so that import
can handle the file.
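The screening step can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: the helper name find_long_lines and the demo file are hypothetical, and the 65000-character threshold is taken from the description above.

```r
# Hypothetical helper (not part of biomartr): return the indices of all
# lines in `path` whose character length exceeds `max_chars`.
find_long_lines <- function(path, max_chars = 65000) {
  lines <- readLines(path, warn = FALSE)
  which(nchar(lines, type = "chars") > max_chars)
}

# Small demo file: the second line is deliberately longer than the threshold.
demo_file <- tempfile(fileext = ".gff")
writeLines(
  c("chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=gene1",
    strrep("A", 70000),
    "chr1\tsrc\tgene\t200\t300\t.\t+\t.\tID=gene2"),
  demo_file
)

# Indices of the outlier lines in the demo file
outlier_idx <- find_long_lines(demo_file)
outlier_idx
```

Reading the whole file with readLines keeps the sketch short; for very large annotation files a chunked read over a file connection may be preferable.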
Arguments
- annotation_file
a file path to the annotation file.
- remove_annotation_outliers
shall outlier lines be removed from the input annotation_file?
If yes, the initial annotation_file will be overwritten and the
removed outlier lines will be stored at tempdir
for further exploration.
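The behaviour described for remove_annotation_outliers can be illustrated with a small base R sketch. The helper remove_long_lines, the name of the outlier file, and the returned list are all assumptions made for illustration; the way biomartr actually names and stores the removed lines may differ.

```r
# Hypothetical helper (not part of biomartr): drop over-long lines from
# `path`, overwrite the file, and store the removed lines under tempdir().
remove_long_lines <- function(path, max_chars = 65000) {
  lines <- readLines(path, warn = FALSE)
  is_outlier <- nchar(lines, type = "chars") > max_chars
  # Assumed naming scheme for the file that keeps the removed lines
  outlier_file <- file.path(tempdir(), paste0(basename(path), "_outliers.txt"))
  if (any(is_outlier)) {
    writeLines(lines[is_outlier], outlier_file)  # keep removed lines for inspection
    writeLines(lines[!is_outlier], path)         # overwrite the original file
  }
  invisible(list(n_removed = sum(is_outlier), outlier_file = outlier_file))
}

# Demo: a three-line file whose second line exceeds the threshold
demo_file <- tempfile(fileext = ".gff")
writeLines(c("short line 1", strrep("A", 70000), "short line 2"), demo_file)
res <- remove_long_lines(demo_file)
res$n_removed
```

Because the input file is overwritten in place, it is worth keeping a copy of the original if the removed lines might matter later.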
Examples
if (FALSE) { # \dontrun{
# download an example annotation file from NCBI RefSeq
Ath_path <- biomartr::getGFF(organism = "Arabidopsis thaliana")
# run annotation file check on the downloaded file
biomartr::check_annotation_biomartr(Ath_path)
# several outlier lines were detected, thus we re-run the
# function using 'remove_annotation_outliers = TRUE'
# to remove the outliers and overwrite the file
biomartr::check_annotation_biomartr(Ath_path, remove_annotation_outliers = TRUE)
} # }