
Package index
as.list(<robotstxt_text>) - Convert robotstxt_text to list
fix_url() - Add http protocol if missing from URL
get_robotstxt() - Download a robots.txt file
rt_last_http, get_robotstxt_http_get() - Storage for HTTP request response objects
get_robotstxts() - Download multiple robots.txt files
guess_domain() - Guess a domain from path
http_domain_changed() - Check if HTTP domain changed
http_subdomain_changed() - Check if HTTP subdomain changed
http_was_redirected() - Check if HTTP redirect occurred
is_suspect_robotstxt() - Check if a file is a valid / parsable robots.txt file
is_valid_robotstxt() - Validate that a file is a valid / parsable robots.txt file
list_merge() - Merge a number of named lists in sequential order
null_to_default() - Return a default value if NULL
parse_robotstxt() - Parse a robots.txt file
paths_allowed() - Check if a bot has permissions to access page(s); see the usage sketch after this index
paths_allowed_worker_spiderbar() - Worker for paths_allowed() that checks permissions via the spiderbar package
%>% - Re-export of the magrittr pipe operator
print(<robotstxt>) - Print robotstxt
print(<robotstxt_text>) - Print robotstxt's text
remove_domain() - Remove domain from path
request_handler_handler() - Handle robotstxt handlers
robotstxt() - Generate a representation of a robots.txt file
rt_cache - Get the robotstxt cache
rt_request_handler(), on_server_error_default, on_client_error_default, on_not_found_default, on_redirect_default, on_domain_change_default, on_sub_domain_change_default, on_file_type_mismatch_default, on_suspect_content_default - Handle robotstxt object retrieved from HTTP request
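
Taken together, the functions above support two levels of use: a one-shot permission check and a reusable robots.txt object. The sketch below illustrates both; the domain and paths are placeholder examples, not part of the index, and a working internet connection is assumed.

```r
# Minimal usage sketch; "wikipedia.org" and the paths are placeholders.
library(robotstxt)

# One-shot check: may a generic bot ("*") fetch these paths?
paths_allowed(
  paths  = c("/", "/images/"),
  domain = "wikipedia.org",
  bot    = "*"
)

# Object interface: robotstxt() downloads and parses the file once,
# after which $check() can be queried repeatedly.
rtxt <- robotstxt(domain = "wikipedia.org")
rtxt$check(paths = c("/", "/images/"), bot = "*")
```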
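
For finer control, the retrieval, validation, and parsing steps can also be run individually. A sketch under the same placeholder assumptions:

```r
library(robotstxt)

txt <- get_robotstxt(domain = "wikipedia.org")  # download the raw robots.txt
is_valid_robotstxt(txt)                         # TRUE if parsable
parsed <- parse_robotstxt(txt)                  # parse into a list of fields
parsed$permissions                              # Allow/Disallow rules
parsed$useragents                               # user agents addressed
```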