Large marine predator aerial survey data for Hauraki Gulf, New Zealand

Latest version published by Southwestern Pacific Ocean Biogeographic Information System (OBIS) Node on 20 June 2024

Download the latest version of this resource data as a Darwin Core Archive (DwC-A) or the resource metadata as EML or RTF:

  • Data as a DwC-A file: 12,263 records in English (468 KB); update frequency: not planned
  • Metadata as an EML file: in English (18 KB)
  • Metadata as an RTF file: in English (17 KB)


Large marine predators, such as cetaceans and sharks, play a crucial role in maintaining biodiversity patterns and ecosystem health. Despite the recognised importance of these animals and their over-representation as threatened species, distribution data at appropriate temporal and spatial scales is often lacking or insufficient for effective conservation.

Here, we present sightings of marine megafauna recorded during replicate systematic aerial surveys undertaken in the Hauraki Gulf, Aotearoa New Zealand, over a full year. Using flexible machine learning models (Boosted Regression Tree models), we use these sightings data to investigate relationships between large marine predator occurrence (Bryde’s whales, common and bottlenose dolphins, bronze whalers, pelagic and immature hammerhead sharks) and spatially explicit environmental and biotic variables, to predict species richness of large marine predators, and to investigate their fine-scale spatiotemporal distribution patterns. All models were considered informative (all AUC > 0.78), and temporally dynamic variables, such as the distribution of prey, were important in predicting the occurrence of the study species and species groups.

Data Records

The data in this occurrence resource has been published as a Darwin Core Archive (DwC-A), which is a standardized format for sharing biodiversity data as a set of one or more data tables. The core data table contains 12,263 records.
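
As a minimal sketch of how the downloaded archive might be read, the Python below opens the zip and loads one data table into a list of dictionaries. The core file name `occurrence.txt` and the tab delimiter are assumptions (common defaults for IPT archives, not confirmed for this resource); the archive's `meta.xml` states the actual file name and field separator.

```python
import csv
import io
import zipfile


def read_core_table(archive, core_name="occurrence.txt", delimiter="\t"):
    """Return the rows of one data table in a Darwin Core Archive as dicts.

    `archive` is a path or file-like object for the downloaded zip.
    `core_name` and `delimiter` are assumptions; check meta.xml inside
    the archive for the real core file name and field separator.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(core_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text, delimiter=delimiter,
                                    quoting=csv.QUOTE_NONE)
            return list(reader)
```

If the assumptions hold, calling `read_core_table` on the downloaded zip should return one dictionary per record, keyed by the Darwin Core column headers.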

This IPT archives the data and thus serves as the data repository. The data and resource metadata are available for download in the downloads section. The versions table lists other versions of the resource that have been made publicly available and allows tracking changes made to the resource over time.


The table below shows only published versions of the resource that are publicly accessible.

How to cite

Researchers should cite this work as follows:

Constantine R, Stephenson F, Hamilton O, Torres L, Kozmian-Ledward L, Pinkerton M (2024). Large marine predator aerial survey data for Hauraki Gulf, New Zealand. Version 1.2. Southwestern Pacific Ocean Biogeographic Information System (OBIS) Node. Occurrence dataset. https://nzobisipt.niwa.co.nz/resource?r=hauraki_predators&v=1.2


Researchers should respect the following rights statement:

The publisher and rights holder of this work is Southwestern Pacific Ocean Biogeographic Information System (OBIS) Node. This work is licensed under a Creative Commons Attribution (CC-BY 4.0) License.

GBIF Registration

This resource has not been registered with GBIF.


Occurrence; Observation


Rochelle Constantine
  • Metadata Provider
  • Originator
  • Point Of Contact
University of Auckland
Fabrice Stephenson
  • Originator
University of Waikato
Olivia Hamilton
  • Originator
University of Auckland
Leigh Torres
  • Originator
Oregon State University
Lily Kozmian-Ledward
  • Originator
University of Auckland
Matt Pinkerton
  • Originator
Private Bag 14-901
6241 Wellington

Geographic Coverage

Hauraki Gulf, New Zealand

Bounding Coordinates South West [-36.825, 174.736], North East [-36.034, 175.747]
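
A small sketch of how these bounding coordinates could be used to screen records: the function below tests whether a point falls inside the box. It assumes the coordinates are listed as [latitude, longitude] in decimal degrees, which the values suggest but the page does not state.

```python
# Bounding coordinates as listed above, assumed to be (latitude, longitude)
# in decimal degrees.
SOUTH_WEST = (-36.825, 174.736)
NORTH_EAST = (-36.034, 175.747)


def in_bounding_box(lat, lon, south_west=SOUTH_WEST, north_east=NORTH_EAST):
    """Return True if the point lies inside or on the edge of the box."""
    return (south_west[0] <= lat <= north_east[0]
            and south_west[1] <= lon <= north_east[1])
```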

Temporal Coverage

Start Date / End Date 2013-11-15 / 2014-10-10

Sampling Methods

Twenty-two double-observer line-transect aerial surveys were conducted, typically twice a month, from November 2013 to October 2014, following MacKenzie and Clement (2014). Surveys followed a systematic grid of eight parallel transects spaced 10 km apart and orientated offshore at 62° from north. This design made it possible to sample across a wide range of environmental conditions. The survey followed a simple, unstratified design because it targeted a wide range of species with different or unknown distributions (Dawson et al. 2008). To ensure equal coverage probability (i.e., a random sample of the total habitat area in the survey area; Buckland et al. 2004), the start point of each survey was randomly chosen using the striplet method (Fewster 2011). Surveys commenced when the Beaufort sea state (BSS) was ≤3 across the study area, there was no rain or fog, and there was sufficient light, i.e., one hour after sunrise and before sunset.

Study Extent: The Hauraki Gulf (36° 10′–37° 10′ S; 174° 40′–175° 30′ E) is a large (~4,000 km²), semi-enclosed embayment situated on the northeast coast of the North Island – Te Ika a Māui, New Zealand.

Method step description:

  1. All 22 surveys were conducted in a Cessna 207 fixed-wing aircraft, flying at 500 feet (152.4 m) and 100 knots (185.2 km/h). The aircraft accommodated two observers on each side who operated independently with no communication during the survey. Observers logged all sightings of large marine predator species, i.e., cetaceans, pinnipeds, sharks, rays, and oceanic fish observed during the flights. Observers also recorded all sightings of potential prey patches, i.e., schooling fish such as kahawai (Arripis trutta), pilchards (Sardinops sagax), jack mackerel (Trachurus spp.), saury (Scomberesox saurus) and zooplankton aggregations.
  2. Due to the double-observer design, records partly consisted of duplicate sightings, i.e., two records of the same animal or group made by observers seated on the same side of the plane. Duplicate sightings were reconciled post-hoc by comparing all records of the same species made by observers seated on the same side of the plane during the same survey. For cetaceans, sightings made within seven seconds and within 10° of each other were considered a duplicate sighting. Sharks and rays were generally encountered at low densities allowing for easy identification of duplicate records. However, we applied a narrower time frame of five seconds to classify records of sharks and prey to compensate for the reduction in the information available to identify duplicates.
  3. Sightings of sharks exceeding an estimated total length of 2.5 m (Last and Stevens 2009) that share a similar ecological role were combined to form “pelagic sharks” as they typically inhabit similar habitats (pers. comm. Clinton Duffy, Department of Conservation). These included rarely sighted species, i.e., shortfin mako (Isurus oxyrinchus), blue (Prionace glauca), and adult smooth hammerhead (Sphyrna zygaena) sharks. Juvenile hammerhead shark (total length < 2.5 m; Last and Stevens 2009) distribution was modelled separately as they exhibit different habitat preferences to adults.
  4. Binomial BRT models, used here to estimate the distributions of the large marine predators, require locations of both presences and absences (Manly et al. 2007). Presence records consisted of sightings recorded during aerial surveys. Given that our sightings only reflect a portion of the animals’ habitat use (i.e., when the animal is at, or close to, the surface of the water), presence records will likely underestimate the true distribution. For the same reason, absence information collected from our aerial surveys should not be considered true absences. Instead, pseudo-absences (artificial absence points) were generated (Lobo and Tognelli 2011; Phillips et al. 2009).
  5. When data are collected during systematic surveys (as was the case here), pseudo-absences are distributed within the absence zones, i.e., where no sightings were made (Derville et al. 2016). To capture habitat use at a fine temporal (one-day) scale (Derville et al. 2016), 22 unique sets of pseudo-absences were generated for each species, corresponding to each survey. Pseudo-absences were distributed in the on-effort portions of transects, i.e., where observers were in search mode and within the viewing strip as per distance sampling methods (Hamilton et al. 2018; 420 m transect width for sharks and dolphins; 570 m for Bryde’s whale).
  6. To ensure that pseudo-absence points reflected available but unused habitat in a non-biased manner, sightings collected over the entire period were pooled to create a kernel density map for each taxon using the Spatial Analyst kernel density tool in ArcMap (2019, v10.2). Density contours were overlaid on survey transects and cropped to the width and length of transects. An exclusion zone of two times the transect width (880 m, 1140 m for Bryde’s whales) was created around presence points to prevent environmental overlap between presences and pseudo-absences; this distance approximates the 1 km spatial resolution of the study (Torres et al. 2008). We generated stratified random pseudo-absence points with numbers inversely related to the density contours. Pseudo-absences were spaced at least two times the transect width apart to avoid serial autocorrelation (Derville et al. 2016). To weight pseudo-absences according to the stratified design, we created a standardised, inverted kernel density map and assigned each pseudo-absence point a weight equal to the kernel density estimate (KDE) of the cell within which it fell. The process produced approximately 30 times more pseudo-absences than presence points for sub-sampling during the modelling process.
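
The duplicate-matching thresholds in step 2 can be illustrated with a short Python sketch. This is not the authors' code: the record fields (`observer`, `time_s`, `angle_deg`, `is_cetacean`) are hypothetical, "within" is taken to be inclusive, and sharks and prey are assumed to be matched on time alone, since the text mentions only a five-second window for them.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from step 2 of the method description.
CETACEAN_WINDOW_S = 7.0      # seconds
SHARK_PREY_WINDOW_S = 5.0    # seconds
ANGLE_TOLERANCE_DEG = 10.0   # degrees


@dataclass
class Sighting:
    # All field names are hypothetical; the published data may differ.
    survey: int                        # survey number
    side: str                          # side of the aircraft
    observer: str                      # observer label
    taxon: str                         # species or species group
    is_cetacean: bool
    time_s: float                      # seconds since the survey started
    angle_deg: Optional[float] = None  # sighting angle, if recorded


def is_duplicate(a: Sighting, b: Sighting) -> bool:
    """Apply the step-2 thresholds to one pair of records."""
    # Only records of the same taxon, on the same side, in the same survey
    # are compared.
    if (a.survey, a.side, a.taxon) != (b.survey, b.side, b.taxon):
        return False
    # Duplicates arise between the two observers seated on one side.
    if a.observer == b.observer:
        return False
    dt = abs(a.time_s - b.time_s)
    if a.is_cetacean:
        if a.angle_deg is None or b.angle_deg is None:
            return False
        angle_gap = abs(a.angle_deg - b.angle_deg)
        return dt <= CETACEAN_WINDOW_S and angle_gap <= ANGLE_TOLERANCE_DEG
    # Assumption: sharks and prey are matched on the time window only.
    return dt <= SHARK_PREY_WINDOW_S
```

Under these assumptions, two dolphin records made by different observers on the same side, 5 seconds and 5° apart, would be flagged as one sighting, whereas records 10 seconds apart would be kept separate.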

Bibliographic Citations

  1. Constantine, Rochelle et al. (2023). Large marine predator aerial survey data for Hauraki Gulf, New Zealand [Dataset]. Dryad. https://doi.org/10.5061/dryad.wm37pvmrn
  2. Constantine, R., Stephenson, F., Hamilton, O., Torres, L., Kozmian-Ledward, L., & Pinkerton, M. (2023). Large marine predator aerial survey data for Hauraki Gulf, New Zealand [Data set]. Zenodo. https://doi.org/10.5061/dryad.wm37pvmrn

Additional Metadata

marine, harvested by iOBIS