XML and Web pages

We will use the XML package to do the parsing.

Install package

install.packages("XML", dependencies = TRUE)
library(XML)

Load XML tree

library(XML)

url <- "http://www.w3schools.com/xml/simple.xml"

# useInternalNodes = TRUE keeps the parsed tree as internal nodes, which XPath queries require
document <- xmlTreeParse(url, useInternalNodes = TRUE)

Get element name

rootNode <- xmlRoot(document)
xmlName(rootNode)

Access the first element.

rootNode[[1]]

Get an element at an exact position (descending through subelements).

rootNode[[1]][[1]]

Extract values from the XML by applying a function to each node. Here that function is xmlValue.
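
A minimal sketch of this step, assuming a small in-memory document (the menu data below is made up for illustration) so that it runs without network access. It uses xmlSApply from the XML package to call xmlValue on every child of the root node.

```r
library(XML)

# Hypothetical stand-in for the remote file; built without whitespace
# between tags so the root has exactly two <food> children
xml_text <- paste0(
  "<menu>",
  "<food><name>Waffles</name><price>5.95</price></food>",
  "<food><name>Toast</name><price>4.50</price></food>",
  "</menu>"
)

document <- xmlTreeParse(xml_text, asText = TRUE, useInternalNodes = TRUE)
rootNode <- xmlRoot(document)

# xmlValue returns the concatenated text of a node and all its descendants,
# so each <food> element collapses into a single string
food_values <- xmlSApply(rootNode, xmlValue)
print(food_values)
```

Because xmlValue concatenates all descendant text, the name and price run together in each result. Selecting individual fields is usually easier with XPath.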

Using XPath.
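
As a sketch of how XPath queries can look with this package, the example below uses xpathSApply on the same kind of in-memory document (again hypothetical data). The variable names food_names and food_prices are chosen only for illustration.

```r
library(XML)

# Hypothetical in-memory document used instead of a remote file
xml_text <- paste0(
  "<menu>",
  "<food><name>Waffles</name><price>5.95</price></food>",
  "<food><name>Toast</name><price>4.50</price></food>",
  "</menu>"
)

document <- xmlTreeParse(xml_text, asText = TRUE, useInternalNodes = TRUE)
rootNode <- xmlRoot(document)

# "//name" matches every <name> element anywhere in the document;
# xmlValue is applied to each matching node
food_names <- xpathSApply(rootNode, "//name", xmlValue)

# Values come back as text, so convert explicitly when numbers are needed
food_prices <- as.numeric(xpathSApply(rootNode, "//price", xmlValue))

print(food_names)
print(food_prices)
```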

Read a table from HTML
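
One way to do this is with htmlParse and readHTMLTable from the XML package. The sketch below parses a small HTML string (made-up data, so no network access is needed) and assumes the first table row holds the column headers.

```r
library(XML)

# Hypothetical HTML page containing a single table
html_text <- paste0(
  "<html><body><table>",
  "<tr><th>name</th><th>price</th></tr>",
  "<tr><td>Waffles</td><td>5.95</td></tr>",
  "<tr><td>Toast</td><td>4.50</td></tr>",
  "</table></body></html>"
)

page <- htmlParse(html_text, asText = TRUE)

# readHTMLTable returns a list with one data frame per <table> in the page;
# header = TRUE treats the first row as column names
tables <- readHTMLTable(page, header = TRUE, stringsAsFactors = FALSE)
menu <- tables[[1]]
print(menu)
```

Cell values are read as text, so numeric columns need an explicit conversion such as as.numeric.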

Get a page using the httr package
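
A possible sketch of this step: download the page with GET, extract the body as text with content, and hand it to htmlParse. The helper name fetch_page is hypothetical and not part of httr.

```r
library(httr)
library(XML)

# Hypothetical helper: download a page and return it as a parsed HTML document
fetch_page <- function(url) {
  response <- GET(url)
  stop_for_status(response)  # raise an error on HTTP 4xx/5xx responses
  page_text <- content(response, as = "text", encoding = "UTF-8")
  htmlParse(page_text, asText = TRUE)
}
```

Calling `page <- fetch_page("http://example.com/")` (a placeholder URL) should return a document that can be queried with `xpathSApply(page, "//title", xmlValue)`.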

Authenticate with the httr package

We can use the authenticate function to access a secured page.
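
A minimal sketch, assuming the page is protected by HTTP basic authentication. The helper name fetch_with_login and the credentials in the usage note are placeholders chosen for illustration.

```r
library(httr)

# Hypothetical helper: request a page and send a user name and password
# using HTTP basic authentication (the default type for authenticate)
fetch_with_login <- function(url, user, password) {
  GET(url, authenticate(user, password))
}
```

For example, `response <- fetch_with_login("http://httpbin.org/basic-auth/user/passwd", "user", "passwd")` targets a public test endpoint (an assumption, not taken from the text above), and `status_code(response)` reports whether the login was accepted.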

The request returns a response object. A status of 200 indicates that the credentials were accepted.

Use the handle function to access several pages during one authenticated session.
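
The sketch below shows one way this can look. A handle is created once for the site and passed to each GET call, so the requests share one connection and its cookies. The helper name fetch_paths and its arguments are hypothetical.

```r
library(httr)

# Hypothetical helper: request several paths on one site through a shared handle
fetch_paths <- function(base_url, paths, user, password) {
  site <- handle(base_url)  # creating the handle does not send a request

  # Credentials are sent with the first request only
  first <- GET(handle = site, path = paths[1],
               config = authenticate(user, password))

  # Later requests reuse the same handle, and with it the session cookies
  rest <- lapply(paths[-1], function(p) GET(handle = site, path = p))

  c(list(first), rest)
}
```

Whether the later requests stay logged in depends on the site: this approach assumes the server tracks the session with cookies. A site that expects credentials on every request would need authenticate passed to each call.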
