Changelog
Source:NEWS.md
biomartr 1.0.7
CRAN release: 2023-12-02
New features
Generalization of Biomart database access #108
- Generalized biomart database interface (now uses https and port 443)
- added cache for biomart database overview
- added more unit tests for listGenomes() and biomart()
Bug fixes
- fixed listGenomes() filter error #107
- Bacteria collection corner case bug fixed #109
biomartr 1.0.6
CRAN release: 2023-10-24
New features
- Some new generalization, including the function biomartr:::supported_biotypes(db = "refseq"), which will simplify a lot of downstream code. (#104)
- Tests are now much quicker to run, because biomartr::is.genome.available (which is used basically everywhere) now reads files with data.table instead of readr. (#104)
Bug fixes
- Fixing bug in is.genome.available() where the skip_bacteria argument was not passed on internally to is.genome.available.refseq.genbank() (#105)
biomartr 1.0.5
CRAN release: 2023-10-04
Package generalization
Over 5000 lines have been edited, most of them removed (#100), to generalize the package and make it safer for future development. This process is still ongoing.
New features
- Ensembl Genomes is no longer a different database compared to ensembl in biomaRt, since this split is artificial. It is advised to use only “ensembl” as db from now on, but “ensemblgenomes” will still work.
- Annotation previously meant GFF only, but it should cover both the GFF and GTF getters, selected with a format specification. This is now fixed and generalized.
- Added a new kingdom for ensembl: protists are now supported with correct collection getters
- The retrieval from the UniProt database is now updated to the new API/FTP path system. Users can now retrieve proteomes using the functions getProteome(db = "uniprot", ...) and getProteomeSet(db = "uniprot", ...) (see #82)
- new function getBioSet: Generic Bio data set extractor
- new function getBio: A wrapper to all bio getters, selected with the ‘type’ argument
- new function getUniProtSTATS(): Retrieve UniProt Database Information File (STATS)
biomartr 1.0.4
CRAN release: 2023-06-20
New Features
- in getSummaryFile() all columns of the assembly_summary.txt are now specified with names and correct data types (#92)
- all get*() functions, getKingdomAssemblySummary(), and is.genome.available.refseq.genbank() receive a new argument skip_bacteria which is set to TRUE by default. This ensures that the huge dataset file for bacteria is not downloaded by default when retrieving summary files from GenBank. Users who wish to retrieve data from particular bacteria can actively set skip_bacteria = FALSE in all get*() functions.
Bug Fixes
- whenever the low-level function getKingdomAssemblySummary() was called by any of the get*() functions, an error occurred stating that dplyr::bind_rows() cannot join column $X35 due to differences in data types. The cause was that in the assembly_summary.txt file for viruses the total gene count was stored as character and not as integer (as is the case for all other assembly_summary.txt files). This has now been resolved by parsing the correct data types with readr. Many thanks to … for pointing this out to me. (#92)
- fixing md5 checks in all get*() functions
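The data type mismatch behind #92 can be reproduced in miniature. The following is a minimal sketch in base R with made-up two-column data; the column names and the helper read_summary() are hypothetical, and it uses read.delim() with explicit column classes rather than the readr column specification the package itself relies on.

```r
# Toy stand-ins for two assembly_summary.txt files (hypothetical columns).
bacteria_txt <- "organism_name\tgene_count\nEscherichia coli\t4300\n"
viral_txt    <- "organism_name\tgene_count\nPhage lambda\tna\n"

# Hypothetical helper: read a tab-separated summary from a string.
read_summary <- function(txt, col_classes = NA) {
  read.delim(text = txt, colClasses = col_classes, stringsAsFactors = FALSE)
}

# With type guessing, the same column ends up with different classes,
# which is what makes a strict row bind fail.
guessed_a <- read_summary(bacteria_txt)
guessed_b <- read_summary(viral_txt)
class(guessed_a$gene_count)
class(guessed_b$gene_count)

# Reading every column with a declared type avoids the mismatch;
# the count is converted once, after combining.
spec     <- c("character", "character")
combined <- rbind(read_summary(bacteria_txt, spec),
                  read_summary(viral_txt, spec))
combined$gene_count <- suppressWarnings(as.integer(combined$gene_count))
```

Declaring the types up front means the result no longer depends on what happens to be stored in any single file.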
biomartr 1.0.3
CRAN release: 2023-05-07
- adding pull request #88 which fixes issues with http to https curl requests (Many thanks to @Roleren)
biomartr 1.0.2
CRAN release: 2022-02-23
New Functions
- New function check_annotation_biomartr() helps to check whether downloaded GFF or GTF files are corrupt.
- New function getCollectionSet() allows users to retrieve a Collection: Genome, Proteome, CDS, RNA, GFF, Repeat Masker, AssemblyStats of multiple species
Example:
# define scientific names of species for which
# collections shall be retrieved
organism_list <- c("Arabidopsis thaliana",
                   "Arabidopsis lyrata",
                   "Capsella rubella")
# download the collections of all listed species from refseq
# and store the corresponding files in 'set_collections'
biomartr::getCollectionSet(db = "refseq",
                           organism = organism_list,
                           path = "set_collections")
New Features
- the getGFF() function receives a new argument remove_annotation_outliers to enable users to remove corrupt lines from a GFF file. Example:
Ath_path <- biomartr::getGFF(organism = "Arabidopsis thaliana", remove_annotation_outliers = TRUE)
- the getGFFSet() function receives a new argument remove_annotation_outliers to enable users to remove corrupt lines from a GFF file
- the getGTF() function receives a new argument remove_annotation_outliers to enable users to remove corrupt lines from a GTF file
- adding a new message system to biomartr::organismBM(), biomartr::organismAttributes(), and biomartr::organismFilters() so that large API queries don’t seem so unresponsive
- getCollection() receives new arguments release, remove_annotation_outliers, and gunzip that will now be passed on to downstream retrieval functions
- the getGTF(), getGenome() and getGenomeSet() functions receive a new argument assembly_type = "toplevel" to enable users to choose between toplevel and primary assembly when using the ensembl database. Setting assembly_type = "primary_assembly" will save a lot of space on hard drives for people using large ensembl genomes.
- all get*() functions with a release argument now check if the ENSEMBL release is >45 (Many thanks to @Roleren #31 #61)
- in all get*() functions, readr::write_tsv(path = ) was exchanged for readr::write_tsv(file = ), since readr versions > 1.4.0 are deprecating the path argument.
- tbl_df() was deprecated in dplyr 1.0.0 in favour of tibble::as_tibble(); organismBM() was adjusted accordingly
- custom_download(), getGENOMEREPORT(), and other download functions now specify withr::local_options(timeout = max(30000000, getOption("timeout"))), which extends the default 60 sec timeout to 30000000 sec
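The timeout change in the last item can be illustrated without any download. This is a minimal base R sketch, not the package's implementation: with_long_timeout() is a hypothetical helper that mimics the effect of withr::local_options() using options() and on.exit(), raising the timeout only for the duration of one call.

```r
# Hypothetical helper: run a task with a raised timeout and restore the
# previous option value when the function exits, even on error.
with_long_timeout <- function(task) {
  old <- options(timeout = max(30000000, getOption("timeout")))
  on.exit(options(old), add = TRUE)
  task()
}

before <- getOption("timeout")
inside <- with_long_timeout(function() getOption("timeout"))
after  <- getOption("timeout")
```

Taking the maximum of the new value and the current option means a user who has already set an even longer timeout is not overridden.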
Bug Fixes
- Fixing bug where genome availability check in getCollection() was only performed in NCBI RefSeq and not in other databases due to a constant used in is.genome.available() rather than a variable (Many thanks to Takahiro Yamada for catching the bug) #53
- fixing an issue that caused the read_cds() function to fail in data.table mode (Many thanks to Clement Kent) #57
- fixing an SSL bug that was found on Ubuntu 20.04 systems #66 (Many thanks to Håkon Tjeldnes)
- fixing global variable issue that caused clean.retrieval() to fail when no documentation file was in a meta.retrieval() folder
- The NCBI recently started adding NA values as FTP file paths in their species summary files for species without reference genomes. As a result meta.retrieval() stopped working, because no FTP paths were found for some species. This issue is now fixed by adding the filter rule !is.na(ftp_path) into all get*() functions (Many thanks to Ashok Kumar Sharma #34 and Dominik Merges #72 for making me aware of this issue)
- Fixing an issue in custom_download() where the method argument was causing issues when downloading from https directed ftp sites (Many thanks to @cmatKhan) #76
- Fixing issue when trying to combine multiple summary-stats files where NA’s were present in the list item that was passed along for combination in meta.retrieval() #73 (Many thanks to Dominik Merges)
- Fixing a bug in download.database.all() where the listed file *-metadata.json was not removed, which corrupted the download process (Many thanks to Jaruwatana Lotharukpong)
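The filter rule !is.na(ftp_path) mentioned above can be sketched on a toy table. The data frame below is made up for illustration (the organism names and example.org paths are placeholders, not real NCBI entries); only the column name ftp_path and the filter expression come from the entry above.

```r
# Toy summary table: one species has no FTP path (NA), as NCBI started
# reporting for species without reference genomes.
summary_tbl <- data.frame(
  organism_name = c("Species A", "Species B", "Species C"),
  ftp_path      = c("ftp://example.org/a", NA, "ftp://example.org/c"),
  stringsAsFactors = FALSE
)

# Keep only rows that actually have a download location.
usable <- summary_tbl[!is.na(summary_tbl$ftp_path), ]
```

Dropping these rows before any path is constructed keeps later download steps from receiving NA in place of a URL.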
biomartr 0.9.2
- minor changes to comply with CRAN policy regarding Internet access failure: instead of warnings or error messages, only gentle messages are allowed to be used
biomartr 0.9.0
CRAN release: 2019-05-21
Please be aware that as of April 2019, ENSEMBLGENOMES was retired. Hence, all biomartr functions were updated and won’t support data retrieval from ENSEMBLGENOMES servers anymore.
New Functions
- New function clean.retrieval() enables formatting and automatic unzipping of meta.retrieval output (find out more here: https://docs.ropensci.org/biomartr/articles/MetaGenome_Retrieval.html#un-zipping-downloaded-files)
- New function getGenomeSet() allows users to easily retrieve genomes of multiple specified species. In addition, the genome summary statistics for all retrieved species will be stored as well to provide users with insights regarding the genome assembly quality of each species. This file can be used as Supplementary Information file in publications to facilitate reproducible research.
- New function getProteomeSet() allows users to easily retrieve proteomes of multiple specified species
- New function getCDSSet() allows users to easily retrieve coding sequences of multiple specified species
- New function getGFFSet() allows users to easily retrieve GFF annotation files of multiple specified species
- New function getRNASet() allows users to easily retrieve RNA sequences of multiple specified species
- New function summary_genome() allows users to retrieve summary statistics for a genome assembly file to assess the influence of genome assembly qualities when performing comparative genomics tasks
- New function summary_cds() allows users to retrieve summary statistics for a coding sequence (CDS) file. We noticed that many CDS files stored in NCBI or ENSEMBL databases contain sequences whose length isn’t divisible by 3 (division into codons). This makes it difficult to divide CDS into codons for e.g. codon alignments or translation into protein sequences. In addition, some CDS files contain a significant number of sequences that do not start with AUG (start codon). This function enables users to quantify how many of these sequences exist in a downloaded CDS file so that the files can be processed according to the analyses at hand.
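The two checks described for summary_cds() can be sketched on a handful of toy sequences. This is not the function's implementation, only an illustration in base R under the assumption that sequences are stored as DNA, so the AUG start codon appears as ATG; the sequences and names are made up.

```r
# Toy CDS set: seq2 and seq4 have lengths that are not multiples of 3,
# seq3 and seq4 do not begin with the ATG start codon.
cds <- c(seq1 = "ATGGCCTAA",
         seq2 = "ATGGCCTA",
         seq3 = "GTGGCCTAA",
         seq4 = "CTGGC")

# Length check: can the sequence be split cleanly into codons?
not_divisible_by_3 <- nchar(cds) %% 3 != 0

# Start codon check on the first three characters.
no_atg_start <- substr(toupper(cds), 1, 3) != "ATG"

summary_counts <- c(total              = length(cds),
                    not_divisible_by_3 = sum(not_divisible_by_3),
                    no_atg_start       = sum(no_atg_start))
```

The logical vectors can also be used to subset cds directly, for instance to keep only sequences that pass both checks before a codon alignment.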
New Features of Existing Functions
- the default value of argument reference in meta.retrieval() changed from reference = TRUE to reference = FALSE. This way all genomes (reference AND non-reference) will be downloaded by default. This is what users seem to prefer.
- getCollection() now also retrieves GTF files when db = 'ensembl'
- getAssemblyStats() now also performs md5 checksum test
- all md5 checksum tests now retrieve the new md5checkfile format from NCBI RefSeq and Genbank
- getGTF(): users can now specify the NCBI Taxonomy ID or Accession ID in addition to the scientific name in argument ‘organism’ to retrieve genome assemblies
- getGFF(): users can now specify the NCBI Taxonomy ID or Accession ID for ENSEMBL in addition to the scientific name in argument ‘organism’ to retrieve genome assemblies
- getMarts() will now throw an error when BioMart servers cannot be reached (#36)
- getGenome() now also stores the genome summary statistics (see ?summary_genome()) for the retrieved species in the documentation folder to provide users with insights regarding the genome assembly quality
- In all get*() functions the default for argument reference has changed from reference = TRUE to reference = FALSE (= new default)
- all get*() functions received a new argument release which allows users to retrieve specific release versions of genomes, proteomes, etc from ENSEMBL and ENSEMBLGENOMES
- all get*() functions received two new arguments clean_retrieval and gunzip which allow users to unzip the downloaded files directly in the get*() function call and rename the file for more convenient downstream analyses
biomartr 0.8.0
CRAN release: 2018-06-27
New Functions
- new function getCollection() for retrieval of a collection: the genome sequence, protein sequences, gff files, etc for a particular species
New Functionality of Existing Functions
- getProteome() can now retrieve proteomes from the UniProt database by specifying getProteome(db = "uniprot").
- is.genome.available() now prints out more useful interactive messages when searching for available organisms
- is.genome.available() can now handle taxids and assembly_accession ids in addition to the scientific name when specifying argument organism
- is.genome.available() can now check for organism availability in the UniProt database
- getGenome(): users can now specify the NCBI Taxonomy ID or Accession ID in addition to the scientific name in argument ‘organism’ to retrieve genome assemblies
- getProteome(): users can now specify the NCBI Taxonomy ID or Accession ID in addition to the scientific name in argument ‘organism’ to retrieve proteomes
- getCDS(): users can now specify the NCBI Taxonomy ID or Accession ID in addition to the scientific name in argument ‘organism’ to retrieve CDS
- getRNA(): users can now specify the NCBI Taxonomy ID or Accession ID in addition to the scientific name in argument ‘organism’ to retrieve RNAs
- is.genome.available(): argument order was changed from is.genome.available(organism, details, db) to is.genome.available(db, organism, details) to be logically more consistent with all get*() functions
- meta.retrieval receives a new argument restart_at_last to indicate whether the download process, when re-running the meta.retrieval function, shall pick up at the last species or whether it should crawl through all existing files to check the md5checksum
- meta.retrieval now generates a csv overview file in the doc folder which stores genome version, date, origin, etc information for all downloaded organisms and can be directly used as Supplementary Data file in publications to increase computational and biological reproducibility of the genomics study
- download.database.all() can now skip already downloaded files and internally removes corrupted files with non-matching md5checksum. Re-downloading of corrupted files can be performed by simply re-running the download.database.all() command
biomartr 0.7.0
CRAN release: 2018-01-03
Function changes
- the function meta.retrieval() will now pick up the download at the organism where it left off and will report which species have already been retrieved
- all get*() functions and the meta.retrieval() function receive a new argument reference which allows users to retrieve not-reference or not-representative genome versions when downloading from NCBI RefSeq or NCBI Genbank
- the argument order in meta.retrieval() changed from meta.retrieval(kingdom, group, db, ...) to meta.retrieval(db, kingdom, group, ...) to make the argument order more consistent with the get*() functions
- the argument order in getGroups() changed from getGroups(kingdom, db) to getGroups(db, kingdom) to make the argument order more consistent with the get*() and meta.retrieval() functions
biomartr 0.5.2
CRAN release: 2017-09-20
Bug fixes
- fixing bug (https://github.com/ropensci/biomartr/issues/6) that caused an incorrect filtering condition when more than one entry for an organism is present in the assembly_summary.txt file at NCBI (Thanks to @kalmeshv)
biomartr 0.5.1
CRAN release: 2017-05-28
Bug fixes
- fixing a bug in exists.ftp.file() and getENSEMBLGENOMES.Seq() that caused bacterial genome, proteome, etc retrieval to fail due to the wrong construction of a query ftp request https://github.com/ropensci/biomartr/issues/7 (Many thanks to @dbsseven)
- fixing a major bug in which organisms having no representative genome would generate NULL paths that subsequently crashed the meta.retrieval() function when it tried to print out the result paths.
New Functions
- new function getRepeatMasker() for retrieval of Repeat Masker output files
- new function getGTF() for genome annotation retrieval from ensembl and ensemblgenomes in gtf format (Thanks for suggesting it Ge Tan)
- new function getRNA() to perform RNA Sequence Retrieval from NCBI and ENSEMBL databases (Thanks for suggesting it @carlo-berg)
- new function read_rm() for importing Repeat Masker output files downloaded with getRepeatMasker()
- new function read_rna() for importing RNA downloaded with getRNA() as Biostrings or data.table object
- new helper function custom_download() that aims to make the download process more robust and stable. In detail, the download process now adapts to the operating system, e.g. using either curl (macOS), wget (Linux), or wininet (Windows)
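The operating system dependent choice described for custom_download() could look roughly like the sketch below. pick_download_method() is a hypothetical helper written for illustration, not the package's internal code; the returned strings are valid values for the method argument of base R's download.file(), and the "auto" fallback for other systems is an assumption.

```r
# Hypothetical helper: map the operating system name reported by
# Sys.info() to a download.file() method, following the pairing above.
pick_download_method <- function(sysname = Sys.info()[["sysname"]]) {
  switch(sysname,
         Darwin  = "curl",     # macOS
         Linux   = "wget",
         Windows = "wininet",
         "auto")               # assumed fallback for any other system
}

method_here <- pick_download_method()

# A download would then pass the result on, for example:
# download.file(url, destfile, method = pick_download_method())
```

Taking sysname as an argument with a default keeps the mapping easy to check on any machine, independent of the platform it runs on.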
Function changes
- function name listDatabases() has been renamed listNCBIDatabases(). In biomartr version 0.6.0 the function name listDatabases() will be deprecated
- meta.retrieval() and meta.retrieval.all() now allow the bulk retrieval of GTF files for db = 'ensembl' and db = 'ensemblgenomes' via type = "gtf". See getGTF() for more details.
- meta.retrieval() and meta.retrieval.all() now allow the bulk retrieval of RNA files via type = "rna". See getRNA() for more details.
- meta.retrieval() and meta.retrieval.all() now allow the bulk retrieval of Repeat Masker output files via type = "rm". See getRepeatMasker() for more details.
- all get*() retrieval functions now skip the download of a particular file if it already exists in the specified file path
- download.database() and download.database.all() now internally perform md5 checksum tests to make sure that the file download was successful
- download.database() and download.database.all() now return the file paths of the downloaded file so that it is easier to use these functions when constructing pipelines, e.g. download.database() %>% ... or download.database.all() %>% ... .
- meta.retrieval() and meta.retrieval.all() now return the file paths of the downloaded file so that it is easier to use these functions when constructing pipelines, e.g. meta.retrieval() %>% ... or meta.retrieval.all() %>% ... .
- getGenome(), getProteome(), getCDS(), getRNA(), getGFF(), and getAssemblyStats() now internally perform md5 checksum tests to make sure that files are retrieved intact.
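The md5 checksum tests mentioned in this list follow a simple pattern: compute the checksum of the local file and compare it with the expected value. Below is a minimal sketch using tools::md5sum() from base R; verify_md5() is a hypothetical helper, and the expected checksum is computed locally here, whereas in practice it would come from the checksum file published alongside the download.

```r
# Hypothetical helper: TRUE only if the file exists and its md5 checksum
# matches the expected value.
verify_md5 <- function(path, expected_md5) {
  file.exists(path) &&
    identical(unname(tools::md5sum(path)), tolower(expected_md5))
}

# Simulate an intact download.
tmp <- tempfile(fileext = ".txt")
writeLines("ACGT", tmp)
expected  <- unname(tools::md5sum(tmp))
intact_ok <- verify_md5(tmp, expected)

# Simulate a corrupted file: the content changes, the expectation does not.
writeLines("ACGTN", tmp)
corrupted_ok <- verify_md5(tmp, expected)

unlink(tmp)
```

A caller can use the same check in both directions: skip the download when an existing file verifies, and remove and re-download it when it does not.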
biomartr 0.4.0
CRAN release: 2017-03-14
Bug fixes
- fixing a major bug https://github.com/ropensci/biomartr/issues/6 that caused the retrieval process in all get*() (genome, proteome, gff, etc.) and meta.retrieval*() functions to error and terminate whenever NCBI or ENSEMBL didn’t store all types of sequences for a particular organism: genome, proteome, cds, etc. This bug affected all attempts to download all proteome sequences, e.g. for bacteria and viruses, because NCBI does not store genome AND proteome information for all bacterial or viral species. This has been fixed now, and function calls such as meta.retrieval(kingdom = "bacteria", db = "genbank", type = "proteome") should work properly (Thanks to @ARamesh123 for making me aware of this bug).
New Functions
- new function getAssemblyStats() allows users to retrieve the genome assembly stats file from NCBI RefSeq or Genbank, e.g. ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/GCF_000001405.36_GRCh38.p10/GCF_000001405.36_GRCh38.p10_assembly_stats.txt
- new function read_assemblystats() allows users to import the genome assembly stats file from NCBI RefSeq or Genbank that was retrieved using the getAssemblyStats() function
Function changes
- meta.retrieval() and meta.retrieval.all() can now also download genome assembly stats for all selected species
- meta.retrieval() receives a new argument group that allows users to retrieve species belonging to a subgroup instead of the entire kingdom. Available groups can be retrieved with getGroups().
- functions getSubgroups() and listSubgroups() have been removed and their initial functionality has been merged and integrated into getGroups() and listGroups()
- listGroups() receives a new argument details that allows users to retrieve the organism names that belong to the corresponding subgroups
- getGroups() is now based on listGroups()
- internal function getGENOMEREPORT() is now exported and available to the user
- all organism*() functions now also support Ensembl Plants, Ensembl Metazoa, Ensembl Protist, and Ensembl Fungi (Thanks for pointing out Alex Gabel)
- getMarts() and getDatasets() now also support Ensembl Plants, Ensembl Metazoa, Ensembl Protist, and Ensembl Fungi (Thanks for pointing out Alex Gabel)
biomartr 0.3.0
CRAN release: 2017-02-09
Bug fixes
- Fixing a bug https://github.com/ropensci/biomartr/issues/2 based on the readr package that affected the getSummaryFile(), getKingdomAssemblySummary(), getMetaGenomeSummary(), getENSEMBL.Seq() and getENSEMBLGENOMES.Seq() functions, causing quoted lines in the assembly_summary.txt to be omitted when reading these files. As a result, e.g. instead of information on 80,000 Bacteria genomes only 40,000 (those without quotations) were read (Thanks to Xin Wu).
biomartr 0.2.1
CRAN release: 2016-12-15
In this version of biomartr the organism*() functions were adapted to the new ENSEMBL 87 release in which organism name specification in the Biomart description column was changed from a scientific name convention to a mix of common name and scientific name convention.
- all organism*() functions have been adapted to the new ENSEMBL 87 release organism name notation that is used in the Biomart description
- fixing error handling bug that caused commands such as download.database(db = "nr.27.tar.gz") to not execute properly
biomartr 0.2.0
CRAN release: 2016-11-22
In this version, biomartr was extended to now retrieve genome, proteome, CDS, GFF and meta-genome data also from ENSEMBL and ENSEMBLGENOMES. Furthermore, all NCBI retrieval functions were updated to the new server folder structure standards of NCBI.
New Functions
- new meta-retrieval function meta.retrieval.all() allows users to download all individual genomes of all kingdoms of life with one command
- new metagenome retrieval function getMetaGenomes() allows users to retrieve metagenome projects from NCBI Genbank
- new metagenome retrieval function getMetaGenomeAnnotations() allows users to retrieve annotation files for genomes belonging to a metagenome project stored at NCBI Genbank
- new retrieval function getGFF() allows users to retrieve annotation (*.gff) files for specific genomes from NCBI and ENSEMBL databases
- new import function read_gff() allowing users to import GFF files downloaded with getGFF()
- new internal functions to check for availability of ENSEMBL or ENSEMBLGENOMES databases
- new database retrieval function download.database.all() allows users to download entire NCBI databases with one command
- new function listMetaGenomes() allowing users to list available metagenomes on NCBI Genbank
- new external helper function getSummaryFile() to retrieve the assembly_summary.txt file from NCBI
- new external helper function getKingdomAssemblySummary() to retrieve the assembly_summary.txt files from NCBI for all kingdoms and combine them into one big data.frame
- new function listKingdoms() allows users to list the number of available species per kingdom of life
- new function listGroups() allows users to list the number of available species per group
- new function listSubgroups() allows users to list the number of available species per subgroup
- new function getGroups() allows users to retrieve available groups for a kingdom of life
- new function getSubgroups() allows users to retrieve available subgroups for a kingdom of life
- new external helper function getMetaGenomeSummary() to retrieve the assembly_summary.txt files from NCBI genbank metagenomes
- new internal helper function getENSEMBL.Seq() acting as main interface function to communicate with the ENSEMBL database API for sequence retrieval
- new internal helper function getENSEMBLGENOMES.Seq() acting as main interface function to communicate with the ENSEMBLGENOMES database API for sequence retrieval
- new internal helper function getENSEMBL.Annotation() acting as main interface function to communicate with the ENSEMBL database API for GFF retrieval
- new internal helper function getENSEMBLGENOMES.Annotation() acting as main interface function to communicate with the ENSEMBLGENOMES database API for GFF retrieval
- new internal helper function get.ensemblgenome.info() to retrieve general organism information from ENSEMBLGENOMES
- new internal helper function get.ensembl.info() to retrieve general organism information from ENSEMBL
- new internal helper function getGENOMEREPORT() to retrieve the genome reports file from ftp://ftp.ncbi.nlm.nih.gov/genomes/GENOME_REPORTS/overview.txt
- new internal helper function connected.to.internet() enabling internet connection check
Function changes
- functions getGenome(), getProteome(), and getCDS() can now retrieve genomes, proteomes or CDS from ENSEMBL and ENSEMBLGENOMES in addition to NCBI
- the functions getGenome(), getProteome(), and getCDS() were completely re-written and now use the assembly_summary.txt files provided by NCBI to retrieve the download path to the corresponding genome. Furthermore, these functions lost the kingdom argument. Users now only need to specify the organism name and not the kingdom anymore. In addition, all get* functions now return the path to the downloaded genome so that this path can be used as input to all read_* functions.
- download_databases() has been renamed to download.databases() to be more consistent with other function notation
- the argument db_format was removed from listDatabases() and download.database() because it was misleading
- the command listDatabases("all") now returns all available NCBI databases that can be retrieved with download.database()
- download.database() now internally checks if the input database specified by the user is actually available on NCBI servers
- the documentary file generated by getGenome(), getProteome(), and getCDS() is now extended to store more details about the downloaded genome
- argument database in is.genome.available() and listGenomes() has been renamed to db to be consistent with all other sequence retrieval functions
- is.genome.available() now also checks availability of organisms in ENSEMBL. See db = "ensembl"
- the argument db_name in listDatabases() has been renamed db to be more consistent with the notation in other functions
- the argument name in download.database() has been renamed db to be more consistent with the notation in other functions
- getKingdoms() now also retrieves kingdom information for ENSEMBL and ENSEMBLGENOMES
- getKingdoms() received a new argument db to specify from which database (e.g. refseq, genbank, ensembl or ensemblgenomes) kingdom information shall be retrieved
- getKingdoms(db = "refseq") received one more member: "viral", allowing the genome retrieval of all viruses
- argument out.folder in meta.retrieval() has been renamed to path to be more consistent with other retrieval functions
- all read_* functions received a new argument obj.type allowing users to choose between storing input genomes as Biostrings object or data.table object
- all read_* functions now have format = "fasta" as default
- the kingdom argument in the listGenomes() function was renamed to type, now allowing users to specify not only kingdoms, but also groups and subgroups. Use: listGenomes(type = "kingdom") or listGenomes(type = "group") or listGenomes(type = "subgroup")
- the listGenomes() function receives a new argument subset to specify a subset of the selected type argument. E.g. subset = "Eukaryota" when specifying type = "kingdom"
biomartr 0.1.0
CRAN release: 2016-08-07
- fixing a parsing error of the file ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/assembly_summary.txt. The problem was that comment lines were introduced and columns couldn’t be parsed correctly anymore. As a result, genomes, proteomes, and CDS files could not be downloaded properly. This has been fixed now.
- genomes, proteomes, and CDS as well as meta-genomes can now be retrieved from RefSeq and Genbank (not only RefSeq); only getCDS() does not have genbank access, because genbank does not provide CDS sequences
- adding new function meta.retrieval() to mass retrieve genomes for entire kingdoms of life
- fixed a major bug in organismBM() causing the function to fail. The failure of this function affected all downstream organism*() functions. The bug is now fixed and everything works properly
- updated Vignettes
biomartr 0.0.3
CRAN release: 2016-03-02
- updating unit tests for new API
- fixing API problems that caused all BioMart related functions to fail
- fixing retrieval problems in getCDS(), getProteome(), and getGenome()
- the listDatabases() function now has a new option db_name = "all" allowing users to list all available databases stored on NCBI