The two main sources of global species datasets are GBIF and IUCN. GBIF provides point observations of species occurrences, while IUCN provides range maps (geometrical shapes/polygons) of potential species occurrence, based on expert knowledge.
In some situations, we may want to restrict (overlay, clip) the point observations to the area covered by the range maps, for example because certain point observations are obviously wrong, inaccurate, or outliers. We will use the turtle species Graptemys oculifera as a running example, as it exists in both sources.
In [1]:
%matplotlib inline
import logging
root = logging.getLogger()
root.addHandler(logging.StreamHandler())
In [2]:
from iSDM.species import GBIFSpecies
gbif_species = GBIFSpecies(name_species="Graptemys oculifera")
gbif_species.find_species_occurrences().head()
Out[2]:
In [3]:
gbif_species.save_data()
In [4]:
gbif_species.plot_species_occurrence()
Notice the message "Points with NaN coordinates ignored." Some records may be missing the essential latitude/longitude information, and those are filtered out in the process of plotting. To explicitly filter such records out, you can also do:
In [5]:
gbif_species.geometrize(dropna=True) # converts the lat/lon columns into a geometrical Point, for each record
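To make the effect of `dropna=True` concrete, here is a minimal standard-library sketch of the same idea: discard records whose latitude or longitude is missing, then turn the rest into (longitude, latitude) pairs. The record layout, the `lat`/`lon` keys, and both helper functions are hypothetical and only illustrate the filtering; this is not how iSDM implements `geometrize`.

```python
import math

def has_coordinates(record):
    """Return True when both latitude and longitude are present and not NaN."""
    values = (record.get("lat"), record.get("lon"))
    return all(v is not None and not math.isnan(v) for v in values)

def drop_missing_coordinates(records):
    """Keep usable records and convert each one to an (x, y) = (lon, lat) pair."""
    return [(r["lon"], r["lat"]) for r in records if has_coordinates(r)]

# Hypothetical records, not real GBIF data.
records = [
    {"lat": 31.2, "lon": -89.9},
    {"lat": float("nan"), "lon": -90.1},  # NaN latitude: dropped
    {"lat": None, "lon": None},           # missing entirely: dropped
    {"lat": 32.3, "lon": -90.2},
]
points = drop_missing_coordinates(records)
print(points)
```

Note the (lon, lat) ordering: point geometries conventionally store x (longitude) first, which is easy to get backwards when reading from lat/lon columns.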
In [6]:
gbif_species.get_data().head() # notice the last 'geometry' column added now
Out[6]:
In [7]:
gbif_species.get_data().shape # there are 40 records left (containing lat/lon) at this point.
Out[7]:
In [8]:
from iSDM.species import IUCNSpecies
iucn_species = IUCNSpecies(name_species='Graptemys oculifera')
iucn_species.load_shapefile('../data/FW_TURTLES/FW_TURTLES.shp')
There are rangemaps for 181 species. Let's select only those for our turtle species.
In [9]:
iucn_species.find_species_occurrences() # IUCN datasets have a 'geometry' column
Out[9]:
In [10]:
iucn_species.plot_species_occurrence() # the rangemap seems to be around the same area as the point-records
Just as a backup, let's copy the records before filtering.
In [11]:
backup_gbif_species = gbif_species.get_data().copy()
In [12]:
gbif_species.overlay(iucn_species)
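Conceptually, the overlay keeps only the point observations that fall inside the range-map polygon. The sketch below illustrates that idea with a plain ray-casting point-in-polygon test. It is an assumption-laden illustration, not iSDM's implementation: the rectangle, the coordinates, and the function names are all hypothetical, and real range maps can be multi-part polygons with holes, which a geometry library handles far more robustly.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: toggle 'inside' each time a ray going right from (x, y) crosses an edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def clip_points(points, polygon):
    """Return only the points that lie inside the polygon."""
    return [(x, y) for x, y in points if point_in_polygon(x, y, polygon)]

# Hypothetical range polygon (a simple rectangle) and observations as (lon, lat).
range_polygon = [(-91.0, 30.0), (-89.0, 30.0), (-89.0, 33.0), (-91.0, 33.0)]
observations = [(-90.0, 31.5), (-89.5, 32.0), (-93.0, 29.0)]
kept = clip_points(observations, range_polygon)
print(kept)
```

The third observation lies to the south-west of the rectangle and is discarded, which mirrors the kind of outlier an overlay is meant to remove.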
How many records are left after filtering out?
In [13]:
gbif_species.get_data().shape # 39 records, so one unlucky observation falls outside the rangemap
Out[13]:
In [14]:
gbif_species.plot_species_occurrence()
It seems that the bottom-left point from the plot above (In [4]) was removed.
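Rather than judging by eye, the removed record can be identified by comparing the row labels before and after filtering. The sketch below uses small hypothetical DataFrames as stand-ins; in this notebook the corresponding objects would be `backup_gbif_species` and `gbif_species.get_data()`. It assumes the overlay preserves the original index labels, which is an assumption worth verifying before relying on the result.

```python
import pandas as pd

# Hypothetical stand-ins for the data before and after the overlay.
before = pd.DataFrame(
    {"lat": [31.2, 32.3, 29.0], "lon": [-89.9, -90.2, -93.0]},
    index=[101, 102, 103],
)
after = before.drop(index=[103])  # pretend record 103 fell outside the range map

# Rows whose index label is present before but missing after the filter.
removed = before.loc[before.index.difference(after.index)]
print(removed)
```

If the index is not preserved, a comparison on a stable record identifier column would be the safer choice.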
It's easy to plot both filtered datasets (IUCN and GBIF) on a single map. They are both geopandas data structures, so they have a geometry column, which is all we need. Geometries can be plotted directly with .plot().
In [15]:
gbif_species_geometry = gbif_species.get_data().geometry
iucn_species_geometry = iucn_species.get_data().geometry
In [62]:
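import pandas as pd
from geopandas import GeoSeries
import matplotlib.pyplot as plt
# Series.append was removed in pandas 2.0, so concatenate the two geometry columns instead
combined_geometries = GeoSeries(pd.concat([gbif_species_geometry, iucn_species_geometry]))
combined_geometries.plot(figsize=(15, 15)) # figsize is passed to plot() so that no separate empty figure is created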
Out[62]:
In [56]:
gbif_species_geometry.head()
Out[56]:
In [57]:
iucn_species_geometry.head()
Out[57]:
Remember we backed up the GBIF data before overlaying, so let's also map it together with the rangemap, to see exactly what got left out.
In [63]:
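import pandas as pd
# Series.append was removed in pandas 2.0, so concatenate the backup points and the rangemap instead
combined_geometries = GeoSeries(pd.concat([backup_gbif_species.geometry, iucn_species_geometry]))
combined_geometries.plot(figsize=(15, 15))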
Out[63]: