The Google Natural Language API reveals the structure and meaning of text by offering powerful machine learning models in an easy-to-use REST API. You can use it to extract information about people, places, events and much more that are mentioned in text documents, news articles or blog posts. You can also use it to understand sentiment about your product on social media, or to parse intent from customer conversations happening in a call center or a messaging app.

Read more on the Google Natural Language API

The Natural Language API offers several natural language understanding analyses. You can request them individually; by default, all of them are returned. The available analyses are:

  • Entity analysis - Finds named entities (currently proper names and common nouns) in the text along with entity types, salience, mentions for each entity, and other properties. Where possible, it also returns metadata about that entity, such as a Wikipedia URL.
  • Syntax - Analyzes the syntax of the text and provides sentence boundaries and tokenization along with part of speech tags, dependency trees, and other properties.
  • Sentiment - Returns the overall sentiment of the text, represented by a magnitude in [0, +inf) and a score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
  • Content Classification - Analyzes a document and returns a list of content categories that apply to the text found in the document. A complete list of content categories can be found here.
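To illustrate requesting a single analysis rather than the default of all of them, here is a minimal sketch. It assumes authentication is already configured, and that the `nlp_type` argument and its `"analyzeEntities"` value behave as described in the package documentation; the sample sentence is only for illustration.

```r
library(googleLanguageR)

# Assumes authentication is already set up (for example via gl_auth() with a
# service account JSON file). This call contacts the API.
text <- "Norma is a small constellation in the Southern Celestial Hemisphere."

# Request only entity analysis instead of the default full annotation.
# `nlp_type` and the value "analyzeEntities" are assumed from the package docs.
entities_only <- gl_nlp(text, nlp_type = "analyzeEntities")
```

Requesting only the analysis you need should keep the response smaller, which may matter when processing many documents.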

Demo for Entity Analysis

You can pass a character vector of texts, and the API is called once for each element. The return value is a list of responses, each response being a list of tibbles holding the different types of analysis.

library(googleLanguageR)

# random text from Wikipedia
texts <- c("Norma is a small constellation in the Southern Celestial Hemisphere between Ara and Lupus, one of twelve drawn up in the 18th century by French astronomer Nicolas Louis de Lacaille and one of several depicting scientific instruments. Its name refers to a right angle in Latin, and is variously considered to represent a rule, a carpenter's square, a set square or a level. It remains one of the 88 modern constellations. Four of Norma's brighter stars make up a square in the field of faint stars. Gamma2 Normae is the brightest star with an apparent magnitude of 4.0. Mu Normae is one of the most luminous stars known, but is partially obscured by distance and cosmic dust. Four star systems are known to harbour planets. ", 
         "Solomon Wariso (born 11 November 1966 in Portsmouth) is a retired English sprinter who competed primarily in the 200 and 400 metres.[1] He represented his country at two outdoor and three indoor World Championships and is the British record holder in the indoor 4 × 400 metres relay.")
nlp_result <- gl_nlp(texts)

Each text has its own entry in the returned tibbles.
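Before looking at the individual pieces, it can help to inspect the overall shape of the result. The sketch below uses only base R and assumes `nlp_result` from the `gl_nlp(texts)` call above is still in the session.

```r
# Continues from the gl_nlp() call above; nlp_result must already exist.
names(nlp_result)               # which analyses came back
str(nlp_result, max.level = 2)  # compact overview of the nested structure
```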

Sentence structure and sentiment:

Information on what words (tokens) are within each text:

The entities identified within the text, with a Wikipedia URL where one is available:

Sentiment of the entire text:
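A score and magnitude pair could be turned into a rough label along the following lines. `describe_sentiment` is a hypothetical helper, not part of the package, and the cutoffs are illustrative assumptions rather than values defined by the API. The idea it sketches is that a score near zero with a high magnitude suggests mixed rather than neutral text.

```r
# Hypothetical helper: map sentiment score and magnitude to a rough label.
# The cutoffs (0.25 for score, 1 for magnitude) are illustrative assumptions.
describe_sentiment <- function(score, magnitude,
                               score_cutoff = 0.25,
                               magnitude_cutoff = 1) {
  stopifnot(is.numeric(score), is.numeric(magnitude),
            length(score) == length(magnitude))
  # Clearly positive or negative scores win; otherwise magnitude decides
  # between "mixed" (strong but balanced) and "neutral" (little emotion).
  ifelse(score >= score_cutoff, "positive",
         ifelse(score <= -score_cutoff, "negative",
                ifelse(magnitude >= magnitude_cutoff, "mixed", "neutral")))
}

describe_sentiment(c(0.8, -0.6, 0.0, 0.1), c(3.2, 1.5, 4.0, 0.2))
# [1] "positive" "negative" "mixed"    "neutral"
```

Sensible cutoffs depend on the length and domain of your texts, so they are best tuned on your own data.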

The category for the text as defined by the list here.

The language for the text:

nlp_result$language
# [1] "en" "en"

The original passed-in text, to help when working with the output:
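As a sketch, and assuming the element is named `text` in the same way the language is available as `nlp_result$language`, the original input could be retrieved like this:

```r
# Assumption: the returned list stores the input under `text`,
# mirroring the nlp_result$language accessor shown above.
nlp_result$text
```

If that name differs in your installed version, `names(nlp_result)` lists the elements that are actually available.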