Multivariate Active Learning

Developing algorithms and robotic systems for intelligent environmental monitoring—learning correlations between multiple quantities of interest to maximise information gain with constrained sampling budgets, and deploying these methods on autonomous ground platforms for agricultural applications.

Role: Co-supervisor & Integration Lead
Timeline: 2023 – 2025
Partners: Vonwiller Foundation
Team: 5+ Engineers
Adaptive Sampling

The Challenge

Environmental monitoring often requires measuring multiple quantities simultaneously—soil nitrogen, phosphorus, moisture, pH levels, or pasture height, for example. Traditional approaches treat each quantity independently, but in reality these variables are often correlated. Measuring one can provide information about others, and prior data sources (such as satellite imagery or elevation maps) may contain useful signals if their relationships to the quantities of interest can be learned.

The challenge was threefold: develop algorithms that exploit inter-variable correlations to select sampling locations that maximise total information gain; test hypotheses about these correlations in real time from sparse, non-collocated data; and deploy these methods on autonomous robotic platforms operating under practical constraints such as time, energy, and robot capabilities.

MQGP learning and estimating the correlations between multiple quantities on nation-scale environmental data.

Technical Approach

This project developed a series of methods for multivariate active learning and adaptive sampling, culminating in deployment on an autonomous ground robot.

Multi-Task Gaussian Processes (MTGP) — The foundational work introduced a GP-based framework that models correlations between a quantity of interest and prior data sources. By replacing the standard signal variance with a learned covariance matrix between quantities, the method can calculate hypothesis quality scores (coefficients of determination) from sparse, non-collocated samples. Good hypotheses improve prediction accuracy by factors of 1.4-3.4 within the first 7 samples; poor hypotheses are quickly identified and disregarded with no adverse effect.
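
As a rough illustration of the covariance-matrix idea, the sketch below scales a shared spatial kernel by a matrix B of covariances between quantities, so that samples of one quantity inform predictions of another even when they are not collocated. It is a minimal sketch, not the published implementation: the function names, the squared-exponential kernel, the fixed B, and all data values are assumptions made for illustration, and the score is computed as the squared correlation implied by B, which is one common reading of a coefficient of determination. In practice B would be learned from data rather than fixed.

```python
import numpy as np

def rbf(xa, xb, lengthscale=1.0):
    """Unit-variance squared-exponential kernel between two sets of 1-D locations."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def multi_quantity_cov(xa, qa, xb, qb, B, lengthscale=1.0):
    """Covariance between samples at xa (quantity indices qa) and xb (indices qb).

    B[i, j] takes the place of the scalar signal variance: it scales the
    spatial kernel by the covariance between quantities i and j.
    """
    return B[np.ix_(qa, qb)] * rbf(xa, xb, lengthscale)

def r_squared(B, i, j):
    """Squared correlation between quantities i and j implied by B."""
    return B[i, j] ** 2 / (B[i, i] * B[j, j])

# Hypothetical values: quantity 0 is the quantity of interest,
# quantity 1 is a prior data source sampled at different locations.
B = np.array([[1.0, 0.8],
              [0.8, 1.0]])
x = np.array([0.0, 1.0, 2.0, 0.5, 1.5])   # sample locations
q = np.array([0, 0, 0, 1, 1])             # which quantity each sample measures
y = np.array([0.1, 0.9, 0.2, 0.5, 0.6])   # made-up measurements

# Joint covariance over all samples, plus a small noise term.
K = multi_quantity_cov(x, q, x, q, B) + 1e-2 * np.eye(len(x))

# Posterior mean of quantity 0 at a location where only quantity 1 was sampled.
x_star, q_star = np.array([1.5]), np.array([0])
k_star = multi_quantity_cov(x_star, q_star, x, q, B)
mean_star = k_star @ np.linalg.solve(K, y)

score = r_squared(B, 0, 1)
```

Setting the off-diagonal entries of B to zero decouples the two quantities, which is the sense in which a poor hypothesis can be disregarded without affecting predictions.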

Multi-Kernel GP (MKGP) — Building on the MTGP work, this extension allows different kernel types and hyperparameters to be assigned to each quantity, rather than assuming a single shared kernel. This better captures the distinct spatial characteristics of different environmental variables. The method uses restricted maximum likelihood estimation (RMLE) with free-form parameterisation of the cross-covariance matrix to jointly learn spatial models and inter-variable correlations.
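
One common way to realise a free-form parameterisation of a covariance matrix between quantities is through its Cholesky factor, which keeps the matrix symmetric and positive semi-definite for any parameter values an optimiser proposes. The sketch below shows only that piece, under the assumption that this is the construction intended; the function names and parameter values are hypothetical, and the per-quantity kernels and the likelihood optimisation are not shown.

```python
import numpy as np

def free_form_cov(theta, n_quantities):
    """Map an unconstrained parameter vector to a covariance matrix B = L L^T.

    theta holds the n(n+1)/2 entries of the lower-triangular factor L,
    so B is symmetric positive semi-definite for any real-valued theta.
    """
    L = np.zeros((n_quantities, n_quantities))
    L[np.tril_indices(n_quantities)] = theta
    return L @ L.T

def correlations(B):
    """Correlation matrix implied by a covariance matrix B."""
    s = np.sqrt(np.diag(B))
    return B / np.outer(s, s)

# Hypothetical parameters for two quantities: L = [[1.0, 0.0], [0.6, 0.8]].
theta = np.array([1.0, 0.6, 0.8])
B = free_form_cov(theta, 2)
corr = correlations(B)
```

With these example values the implied correlation between the two quantities is 0.6. An optimiser can adjust theta freely alongside the kernel hyperparameters without ever producing an invalid covariance matrix.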

Multivariate Adaptive Sampling (MVAS) — A sample selection method that considers information gain across all quantities of interest, explicitly leverages prior data based on learned correlations, and incorporates travel cost into the objective function. In experiments on the Jura heavy-metals dataset, MVAS achieved target accuracy with 158 samples compared to 200+ for coverage or random sampling, while reducing travel costs by 41% when the cost term was included.
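
The trade-off between information gain and travel cost can be sketched as a simple scoring rule over candidate locations. This is an illustrative simplification rather than the published objective: it assumes information gain is approximated by the predictive variance summed over all quantities, that travel cost is straight-line distance from the robot, and that a single weight (named `cost_weight` here, standing in for the travel cost term) balances the two. All names and numbers are hypothetical.

```python
import numpy as np

def select_next_sample(candidates, variances, robot_xy, cost_weight=0.0):
    """Return the index of the candidate with the best information/cost trade-off.

    candidates : (n, 2) array of candidate sample locations
    variances  : (n, q) array of predictive variance per candidate and quantity
    robot_xy   : (2,) array with the robot's current position
    cost_weight: penalty per unit of travel distance (0 ignores travel cost)
    """
    info = variances.sum(axis=1)                             # proxy for total information gain
    travel = np.linalg.norm(candidates - robot_xy, axis=1)   # straight-line distance
    return int(np.argmax(info - cost_weight * travel))

# Two hypothetical candidates: a nearby one and a slightly more informative distant one.
candidates = np.array([[1.0, 0.0], [10.0, 0.0]])
variances = np.array([[0.5, 0.4], [0.6, 0.5]])
robot_xy = np.array([0.0, 0.0])

without_cost = select_next_sample(candidates, variances, robot_xy)
with_cost = select_next_sample(candidates, variances, robot_xy, cost_weight=0.1)
```

Without the cost term the distant, more uncertain location wins; with it, the nearby location is preferred because the extra information does not justify the longer drive.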

Robotic System Integration — The algorithms were deployed on the Swagbot platform, a 4-wheel independent drive-and-steer robot equipped with LIDAR, RGB-D camera, GNSS, and IMU. The system performs real-time belief map generation, informative sample selection, and autonomous navigation, with computation times of approximately 0.3-0.4 seconds per sample—negligible compared to travel times of 20-60 seconds.

My Contribution

My specific contributions included:

  • Leading robotic system integration, including the guidance, navigation, and control stack on the Swagbot platform
  • Connecting the algorithmic work to practical agricultural monitoring applications
  • Supporting field validation experiments on University of Sydney farm properties
  • Providing guidance on the research direction and key problems faced by active sampling methods
  • Providing guidance on Gaussian Process methodology and adaptive sampling strategies

Applications

The methods were validated across multiple application domains:

Soil Nutrient Monitoring — Mapping the spatial distribution of heavy metals (cadmium, lead, copper) in the Jura dataset, demonstrating that MKGP achieves mean prediction errors of 13–15% compared to 14–17% for single-output methods, with substantially lower maximum errors and variance.

Pasture Height Mapping — Field deployment on a grazing livestock property, where the robot autonomously collected 30 samples within a 50×50m area using RGB-D camera measurements. The system successfully mapped pasture heights and evaluated hypotheses about correlations with terrain elevation (finding a weak negative correlation of −0.3).

Continental-Scale Environmental Data — Testing on NASA Earth Observations data across Australia, where the method identified strong correlations between vegetation, rainfall, elevation, and ground temperature, improving prediction accuracy by up to 75% in early samples.

Overview of the MKGP/MVAS system, with results on synthetic nitrogen-phosphorus data with and without considering travel cost Λ.

Results & Impact

This research produced three publications: an IEEE ICRA paper (2024) establishing the hypothesis-testing framework, an ACRA paper (2024) demonstrating field deployment on a robotic platform, and an IEEE Robotics and Automation Letters paper (2025, IF 5.3) presenting the full MKGP/MVAS system.

Key quantitative results include:

  • MVAS requires 21% fewer samples than coverage or random sampling to achieve equivalent accuracy
  • Travel costs reduced by 41% when explicitly optimised
  • Correlation estimates converge correctly within 5-7 samples for both linear and nonlinear relationships
  • Real-time computation suitable for in-situ decision making (~0.4s per sample)

The algorithms are actively being used in current agricultural industry projects and have broader applications in mining exploration, environmental monitoring, and any domain requiring efficient multi-quantity spatial mapping.