Getting started with tidytags
This vignette introduces the initial setup necessary to use tidytags. Specifically, this guide offers help for two key tasks.
- Making sure your TAGS tracker can be accessed
- Getting and storing Twitter API tokens
Before reading through these steps for setting up tidytags, please take a few moments to reflect on ethical considerations related to social media research.
tidytags should be used in strict accordance with Twitter’s developer terms.
Although most Institutional Review Boards (IRBs) do not necessarily consider the Twitter data that tidytags analyzes to be human subjects research, use of the tidytags package still raises ethical considerations that should be discussed.
Even if tidytags use is not for research purposes (or if an IRB determines that a study is not human subjects research), “the release of personally identifiable or sensitive data is potentially harmful,” as noted in the rOpenSci Packages guide. Therefore, although you can collect Twitter data (and you can use tidytags to analyze it), we urge care and thoughtfulness regarding how you analyze the data and communicate the results. In short, please remember that most (if not all) of the data you collect may be about people—and those people may not like the idea of their data being analyzed or included in research.
We recommend the Association of Internet Researchers’ (AoIR) resources related to conducting analyses in ethical ways when working with data about people. AoIR’s ethical guidelines may be especially helpful for navigating tensions related to collecting, analyzing, and sharing social media data.
With these things in mind, let’s get started working through two key tasks.
A core functionality of tidytags is to retrieve tweet data from a Twitter Archiving Google Sheet (TAGS). A TAGS tracker continuously collects tweets from Twitter, based on predefined search criteria and collection frequency.
Here we offer a brief overview of how to set up TAGS, but be sure to read through the information on the TAGS landing page for thorough instructions on getting started with TAGS.
We recommend using TAGS v6.1.
You will be prompted to Make a copy of TAGS, which will then reside in your own Google Drive space. Click the button to do this.
Your TAGS tracker is now ready to use! Just follow the two-step instructions on the TAGS tracker.
tidytags is set up to access a TAGS tracker by using the googlesheets4 package. One requirement for using googlesheets4 is that your TAGS tracker has been “published to the web.” To do this, with the TAGS page open in a web browser, go to File >> Share >> Publish to the web. The Link field should be ‘Entire document’ and the Embed field should be ‘Web page.’ If everything looks right, then click the Publish button.
Next, click the Share button in the top right corner of the Google Sheets window, select Get shareable link, and set the permissions to ‘Anyone with the link can view.’
The input needed for the tidytags::read_tags() function is either the entire URL from the top of the web browser when opened to a TAGS tracker, or a Google Sheet identifier (i.e., the alphanumeric string following https://docs.google.com/spreadsheets/d/ in the TAGS tracker’s URL). Be sure to put quotation marks around the URL or sheet identifier when entering it into read_tags().
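To illustrate the relationship between the two accepted inputs, here is a minimal base R sketch that pulls the sheet identifier out of a full URL. The helper name `extract_sheet_id()` and the example URL are made up for illustration and are not part of tidytags. The sketch also assumes that identifiers consist of letters, digits, hyphens, and underscores.

```r
# Hypothetical helper (not part of tidytags): extract the sheet identifier,
# i.e. the string that follows "/spreadsheets/d/" and ends at the next "/".
# Assumption: identifiers contain only letters, digits, "-" and "_".
extract_sheet_id <- function(url) {
  parts <- regmatches(url, regexec("/spreadsheets/d/([A-Za-z0-9_-]+)", url))[[1]]
  # parts[1] is the full match, parts[2] is the captured identifier
  if (length(parts) < 2) NA_character_ else parts[2]
}

# Made-up URL, for illustration only
tracker_url <- "https://docs.google.com/spreadsheets/d/1AbC_dEf-123/edit#gid=0"
sheet_id <- extract_sheet_id(tracker_url)
sheet_id
```

Either `tracker_url` or `sheet_id`, as a quoted string, is the kind of value the text above describes as input.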
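To verify that this step worked for you, run read_tags() on a TAGS tracker that you know has been published to the web, then inspect the first few rows of the result with head().
What should return is a data frame with one row per archived tweet.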
Then, try to run read_tags() with your own URL or sheet identifier. If that does not work, carefully review the steps above.
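One way to make this check less opaque is to wrap the call so that a failure produces a readable message rather than a raw error. The sketch below is only an illustration. `check_tracker()` is a hypothetical helper, not a tidytags function, and `"YOUR-SHEET-IDENTIFIER"` is a placeholder to be replaced with your own URL or identifier.

```r
# Hypothetical helper (not part of tidytags): try to read a tracker and
# report failure with a readable message instead of stopping with an error.
# `reader` defaults to tidytags::read_tags; it is an argument only so the
# helper can be tried out without network access.
check_tracker <- function(sheet, reader = tidytags::read_tags) {
  tryCatch(
    {
      tags_data <- reader(sheet)
      message("Read ", nrow(tags_data), " rows from the tracker.")
      tags_data
    },
    error = function(e) {
      message("Could not read the tracker: ", conditionMessage(e))
      NULL
    }
  )
}

# Placeholder: replace with your own tracker URL or sheet identifier.
# my_tags <- check_tracker("YOUR-SHEET-IDENTIFIER")
# head(my_tags)
```

If the helper returns `NULL`, the publishing and sharing settings described above are the first things to re-check.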
With a TAGS tracker archive imported into R, tidytags allows you to gather quite a bit more information related to the TAGS-collected tweets with the pull_tweet_data() function. This function builds off the rtweet package (via rtweet::lookup_tweets()) to query the Twitter API. However, to access the Twitter API, whether through rtweet or tidytags, you will need to apply for developer access from Twitter. You do this through Twitter’s developer website.
Once approved for developer access to the Twitter API, be sure to save the keys and tokens granted to you. These will only be shown to you once (although you can easily generate new ones later as needed), so save them in a secure place.
Never share API keys or tokens with anyone; never add these directly to your R code or output.
One option is to save your Twitter API credentials in the .Renviron file, which you can open with the usethis::edit_r_environ() function.
Your saved Twitter API key and tokens should look something like this:
TWITTER_APP = NameOfYourTwitterApp
TWITTER_API_KEY = YourConsumerKey
TWITTER_API_SECRET = YourConsumerSecretKey
TWITTER_ACCESS_TOKEN = YourAccessToken
TWITTER_ACCESS_TOKEN_SECRET = YourAccessTokenSecret
TWITTER_BEARER_TOKEN = YourBearerToken
TWITTER_BEARER = YourBearer
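Once the credentials are stored this way, a quick check can confirm that the current R session sees them. The following is a minimal sketch using base R only. `missing_credentials()` is a hypothetical helper, and the variable names are assumed to match the ones listed above. It reports only the names of unset variables and never prints their values.

```r
# Variable names assumed to match the .Renviron entries listed above
required_vars <- c(
  "TWITTER_APP", "TWITTER_API_KEY", "TWITTER_API_SECRET",
  "TWITTER_ACCESS_TOKEN", "TWITTER_ACCESS_TOKEN_SECRET",
  "TWITTER_BEARER_TOKEN"
)

# Hypothetical helper: return the names of variables that are unset or empty
missing_credentials <- function(vars = required_vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

missing <- missing_credentials()
if (length(missing) > 0) {
  message("Not set: ", paste(missing, collapse = ", "))
} else {
  message("All Twitter credentials are available to this R session.")
}
```

R reads .Renviron at startup, so variables added during a session only become visible after restarting R (or after calling `readRenviron("~/.Renviron")`).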
The rtweet documentation already contains a very thorough vignette, “Authentication with rtweet” (vignette("auth", package = "rtweet")), to guide you through the process of authenticating access to the Twitter API. We recommend the app-based authentication method that uses auth <- rtweet::rtweet_app(), described in the Apps section of the vignette.
The default for the app-based method is to enter the Twitter bearer token (what you saved as TWITTER_BEARER_TOKEN) interactively, when prompted.
Finally, to make sure the authentication works properly, run a small test query against the Twitter API.
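As an alternative to typing the token at the prompt, the bearer token can be read from the environment and passed to rtweet directly. This is a sketch under two assumptions: the token was saved as TWITTER_BEARER_TOKEN as described above, and an rtweet version that provides `rtweet_app()`, `auth_as()`, and `auth_get()` is installed. It is not necessarily the workflow the rtweet vignette prescribes.

```r
bearer <- Sys.getenv("TWITTER_BEARER_TOKEN")

if (!nzchar(bearer)) {
  message("TWITTER_BEARER_TOKEN is not set; add it to .Renviron and restart R.")
} else if (!requireNamespace("rtweet", quietly = TRUE)) {
  message("The rtweet package is not installed.")
} else {
  # App-based authentication, passing the token instead of typing it at the prompt
  auth <- rtweet::rtweet_app(bearer_token = bearer)
  # Use this authentication for the rest of the session
  rtweet::auth_as(auth)
  # Report the class of the authentication object now in use (not the token itself)
  current <- rtweet::auth_get()
  message("rtweet is using an authentication object of class: ", class(current)[1])
}
```

This only confirms that rtweet has an authentication object to work with. Confirming that Twitter accepts the token still requires an actual query, such as a tweet lookup through rtweet::lookup_tweets().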
After completing these two key tasks, you’re now ready to start using tidytags!
Now would be a good time to learn about the full functionality of the package by walking through the “Using tidytags with a conference hashtag” guide (vignette("tidytags-with-conf-hashtags", package = "tidytags")).
tidytags is still a work in progress, so we fully expect that there are still some bugs to work out and functions to document better. If you find an issue, have a question, or think of something that you really wish tidytags would do for you, don’t hesitate to email Bret or reach out on Twitter: @bretsw and @jrosenberg6432.
You can also submit an issue on GitHub.
You may also wish to try some general troubleshooting strategies:
- Try to find out what the specific problem is
- Identify what is not causing the problem
- “Unplug and plug it back in” - restart R, close and reopen R
- Reach out to others! Sharing what is causing an issue can often help to clarify the problem.
  - RStudio Community - https://community.rstudio.com/ (highly recommended!)
  - Twitter hashtag: #rstats
- General strategies on learning more: https://datascienceineducation.com/c17.html