Elizabeth Murray
  • home
  • research
    • phylogenomics in Aculeata
    • bee viruses
    • eucharitid ant parasitoids
  • publications
  • teaching
  • blog

Grab & Graph GBIF Biodiversity Data Using R

6/2/2019

R + GBIF = rgbif
Here is a tutorial for the R package 'rgbif'. The package lets you access specimen information in the Global Biodiversity Information Facility (GBIF) database. GBIF has hundreds of millions of species occurrence records from around the globe, open for anyone to use.
How did these records get into GBIF in the first place? The data come from many sources -- various museums, universities, and other institutions. Specimen label data have been recorded and digitized in spreadsheets, and this information is then contributed to GBIF. I wrote a script in R that demonstrates how to import records for a desired taxon found in a region and then uses R functions to display the data.
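To give a feel for the import step, here is a minimal sketch. It is not the script linked in this post. The function name `fetch_us_records` and the default `limit` of 500 are my own placeholders; `name_backbone()` and `occ_search()` are rgbif functions. The sketch assumes rgbif is installed and that you have an internet connection.

```r
# Sketch only: requires the 'rgbif' package and internet access.
# 'fetch_us_records' and the default limit are hypothetical choices.
fetch_us_records <- function(taxon_name, limit = 500) {
  if (!requireNamespace("rgbif", quietly = TRUE)) {
    stop("Install rgbif first: install.packages('rgbif')")
  }
  # Match the name against the GBIF backbone taxonomy to get its taxon key
  backbone <- rgbif::name_backbone(name = taxon_name)
  # Search occurrence records for that taxon, restricted to the United States
  res <- rgbif::occ_search(
    taxonKey = backbone$usageKey,
    country  = "US",
    limit    = limit
  )
  # The occurrence table itself is stored in the 'data' element
  res$data
}

# Example call (not run here because it queries the GBIF API):
# records <- fetch_us_records("Eucharitidae")
# str(records)
```

Wrapping the two calls in a function keeps the taxon name as the only thing you need to change to try a different group.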
[Pictures: Orasema, Lophyrocera, Kapala]

As shown in the tutorial, I accessed GBIF records for all the Eucharitidae collected in the US. Accessing the data is rather straightforward. Manipulating the data for display takes a few more functions. The records at right are grouped by genus, with any NA values renamed as 'no genus ID'. The genera were then sorted by their number of records. I didn't clean the data; cleaning would be a smart step if you're using the records for a project.
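The grouping step can be sketched in base R as follows. The small `records` data frame is invented for illustration and stands in for the `genus` column of a real download; the linked script may do this differently.

```r
# Toy stand-in for the 'genus' column of a GBIF download (invented values)
records <- data.frame(
  genus = c("Orasema", "Kapala", NA, "Orasema", "Lophyrocera",
            NA, "Orasema", "Kapala", NA, "Orasema"),
  stringsAsFactors = FALSE
)

# Rename missing genus values so they show up as their own group
records$genus[is.na(records$genus)] <- "no genus ID"

# Count records per genus, then sort from most to fewest records
genus_counts <- sort(table(records$genus), decreasing = TRUE)
print(genus_counts)
```

Renaming the NA values before calling `table()` matters, because `table()` drops NA by default and those records would otherwise vanish from the counts.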
rgbif R script
[Picture: number of records per genus]
Below I display the number of eucharitid records per year in two different ways.
On the left is a violin plot, which I think is a neat way to look at how data are distributed. The width of the violin at a given height reflects how many years had roughly that number of records -- it is a probability density estimate, not the raw counts. The boxplot within shows the median value as a white dot. I excluded zeros here, so we aren't seeing the years where no Eucharitidae were recorded (this keeps the script slightly simpler).
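Here is a small base R sketch of the numbers behind a violin plot like this one. The `years` vector is invented, and I use `density()` only to illustrate the idea; a plotting package computes a similar estimate internally.

```r
# Toy stand-in for the 'year' column of the records (invented values)
years <- c(1950, 1950, 1975, 1975, 1975, 1988,
           2001, 2001, 2001, 2001, 2010, 2010)

# table() only lists years that actually occur, so years with
# zero records are left out automatically
per_year <- as.vector(table(years))

# The median of the yearly counts, which the inner boxplot marks
med <- median(per_year)

# Kernel density estimate of the yearly counts; a violin's outline
# is this curve drawn on both sides of a central axis
dens <- density(per_year)

print(per_year)
print(med)
```

The `dens$x` values are record counts per year and `dens$y` is the estimated density at each, which is what sets the violin's width.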
On the right is a chronological view of records per year, excluding years with zero records. I've distinguished the years having more than ten records by using a darker bar color.
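A bar chart with two colors can be sketched in base R like this. The `years` vector and the two color names are arbitrary choices for illustration, not the ones used in my figure.

```r
# Toy stand-in for the 'year' column of the records (invented values)
years <- c(rep(1950, 2), rep(1975, 12), rep(1988, 1),
           rep(2001, 15), rep(2010, 4))

# Records per year, in chronological order; zero-record years are absent
per_year <- table(years)

# Darker color for years with more than ten records
bar_cols <- ifelse(as.vector(per_year) > 10, "steelblue4", "lightsteelblue")

barplot(per_year, col = bar_cols,
        xlab = "Year", ylab = "Number of records")
```

Because `barplot()` accepts one color per bar, building the color vector with `ifelse()` is all it takes to highlight the years above the threshold.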
[Pictures: violin plot of records per year (left); bar chart of records per year (right)]

I enjoyed exploring rgbif, and it turned into a good introduction to plotting data from a biodiversity database. I hope you try it out, too! Also, here's another rgbif tutorial I recently found that looks pretty useful; it focuses more on the data manipulation than the data display.


Elizabeth A. Murray, PHYLOGENETICS AND EVOLUTION of Hymenoptera

@PhyloSolving  |  e.murray @ wsu.edu