Introducing a new benchmark dataset: Semcity Toulouse
A benchmark for building instance segmentation in satellite images
Reliably solving Earth monitoring tasks requires automated and efficient machine learning methods for large-scale scene analysis and interpretation. A typical bottleneck of supervised learning approaches is the availability of accurately (manually) labeled training data, which is particularly important for training state-of-the-art (deep) learning methods. We present SemCity Toulouse, a publicly available, very high resolution, multi-spectral benchmark data set for training and evaluating sophisticated machine learning models. The benchmark acts as a test bed for single building instance segmentation, which has rarely been considered before in densely built urban areas. Additional information is provided in the form of a multi-class semantic segmentation annotation covering the same area plus an adjacent area three times larger. The data set addresses interested researchers from various communities, such as photogrammetry and remote sensing, but also computer vision and machine learning.
Further details and data download at: http://rs.ipb.uni-bonn.de/data/
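To illustrate the building instance segmentation task the benchmark targets, the sketch below matches predicted building masks to ground-truth masks by intersection-over-union (IoU). This is a minimal, hypothetical evaluation routine under a common 0.5 IoU matching threshold; it is not the benchmark's official protocol, and all function names are illustrative.

```python
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two boolean instance masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / union if union > 0 else 0.0

def match_instances(preds, gts, thresh=0.5):
    """Greedily match predicted masks to ground-truth masks at IoU >= thresh.

    Returns (true positives, false positives, false negatives).
    Illustrative sketch only, not the benchmark's evaluation code.
    """
    unmatched_gt = list(range(len(gts)))
    tp = 0
    for p in preds:
        ious = [mask_iou(p, gts[g]) for g in unmatched_gt]
        if ious and max(ious) >= thresh:
            # Consume the best-matching ground-truth instance.
            unmatched_gt.pop(int(np.argmax(ious)))
            tp += 1
    return tp, len(preds) - tp, len(unmatched_gt)

# Toy example: two ground-truth buildings, one of which is detected.
gt = [np.array([[1, 1], [0, 0]], bool), np.array([[0, 0], [1, 1]], bool)]
pred = [np.array([[1, 1], [0, 0]], bool)]
tp, fp, fn = match_instances(pred, gt)  # 1 true positive, 1 missed building
```

From such per-image counts one can aggregate precision and recall over the whole test area; benchmark evaluations typically sweep the IoU threshold as well.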
WG II/6: Large-scale Machine Learning for Geospatial Data Analysis
ISPRS Working Group II/6 aims to promote large-scale machine learning methods for analyzing geo-referenced data. Nowadays, a multitude of different sensors provides an ever increasing amount of observations at varying temporal and spatial resolutions, calling for processing pipelines able to handle such large amounts of data. For instance, imagery (and point clouds) can be obtained from overhead or terrestrial sensors for 3D modelling, for semantic interpretation, or for monitoring scenes at large scale. Data can either be acquired through dedicated campaigns, such as aerial/satellite imaging and mobile mapping, or be collected from crowd-sourced, publicly available data sets such as OpenStreetMap. An important aspect is the combination of multiple complementary views (e.g., street view panoramas and aerial images) from different sensors, and of acquisitions made at different times. Multi-modal, multi-temporal, and multi-scale image analysis are therefore of particular scientific relevance.
Instead of hierarchical, rule-based methods tailored to a particular scene layout and task, machine learning enables modeling relevant object patterns directly from labeled training data. The larger the labeled set, the more accurately the data can be represented by powerful machine learning models. However, one of the main bottlenecks is generating and gathering enough training data to achieve sufficient generalization accuracy. In practice, ground truth is often labeled manually, which is costly and consequently a limiting factor. Research in weakly supervised learning, transfer learning, and self-taught learning aims at significantly reducing manual labeling effort in order to build models that are relevant to practical applications.
To facilitate scientific progress, this WG fosters collaboration between the Photogrammetry & Remote Sensing and the Computer Vision & Machine Learning communities. Workshops at both ISPRS events and CV & ML conferences shall raise mutual awareness and promote knowledge exchange. A benchmark challenge shall provide a new, large-scale data set with ground truth and a sound evaluation tool, publicly available as a test bed that makes different methods better comparable. Participants submitting to this challenge will be encouraged to publish their methods and to publicly release source code to further accelerate scientific progress.
Working Group Officers:
- Jan Dirk Wegner (Photogrammetry & Remote Sensing), +41 44 633 68 08
- University of Bonn, +49 228 73 2716
- Swiss Data Science Center, +41 44 632 80 45
- Univ. Paris Est, +33 1 43 98 84 36
- European Space Imaging, +49 89 130142 0 / +49 89 130142 22
Terms of Reference:
- Large-scale image classification
- Machine learning, deep learning
- Pixel-wise semantic segmentation at large scale
- Supervised, weakly supervised, transfer, and human-in-the-loop learning
- Multi-view, multi-temporal, multi-modal image interpretation
- Change detection and environmental / urban monitoring