Towards general-purpose representation learning of polygonal geometries

Author(s)
Gengchen Mai, Chiyu Jiang, Weiwei Sun, Rui Zhu, Yao Xuan, Ling Cai, Krzysztof Janowicz, Stefano Ermon, Ni Lao
Abstract

Neural network representation learning for spatial data (e.g., points, polylines, polygons, and networks) is a common need for geographic artificial intelligence (GeoAI) problems. In recent years, many advancements have been made in representation learning for points, polylines, and networks, whereas little progress has been made for polygons, especially complex polygonal geometries. In this work, we focus on developing a general-purpose polygon encoding model, which can encode a polygonal geometry (with or without holes, single or multipolygons) into an embedding space. The resulting embeddings can be leveraged directly (or fine-tuned) for downstream tasks such as shape classification, spatial relation prediction, building pattern classification, cartographic building generalization, and so on. To achieve model generalizability guarantees, we identify a few desirable properties that the encoder should satisfy: loop origin invariance, trivial vertex invariance, part permutation invariance, and topology awareness. We explore two different designs for the encoder: one derives all representations in the spatial domain and can naturally capture local structures of polygons; the other leverages spectral domain representations and can easily capture global structures of polygons. For the spatial domain approach, we propose ResNet1D, a 1D CNN-based polygon encoder, which uses circular padding to achieve loop origin invariance on simple polygons. For the spectral domain approach, we develop NUFTspec based on Non-Uniform Fourier Transformation (NUFT), which naturally satisfies all the desired properties. We conduct experiments on two different tasks: 1) polygon shape classification based on the commonly used MNIST dataset; 2) polygon-based spatial relation prediction based on two new datasets (DBSR-46K and DBSR-cplx46K) constructed from OpenStreetMap and DBpedia. Our results show that NUFTspec and ResNet1D outperform multiple existing baselines by significant margins. While ResNet1D suffers from model performance degradation after shape-invariant geometry modifications, NUFTspec is very robust to these modifications due to the nature of the NUFT representation. NUFTspec is able to jointly consider all parts of a multipolygon and their spatial relations during prediction, while ResNet1D can recognize the shape details that are sometimes important for classification. This result points to a promising research direction of combining spatial and spectral representations.
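
The loop origin invariance mentioned in the abstract can be illustrated with a small sketch. This is not the authors' ResNet1D; it is a toy single-layer encoder written for illustration, assuming a simple polygon given as an open vertex ring (the first vertex is not repeated at the end). The names `circular_conv1d` and `encode`, the kernel shape, and the use of global max pooling are all assumptions made here, not details taken from the paper.

```python
import numpy as np

def circular_conv1d(x, kernel):
    """1D convolution over a vertex sequence with circular (wrap) padding.

    x:      (n_vertices, c_in) array of vertex features, e.g. xy coordinates
    kernel: (k, c_in, c_out) array with odd window size k (hypothetical weights)
    """
    k = kernel.shape[0]
    pad = k // 2
    # Wrap the sequence so the window slides across the ring's start/end seam.
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="wrap")
    n = x.shape[0]
    return np.stack(
        [np.einsum("kc,kco->o", xp[i:i + k], kernel) for i in range(n)]
    )

def encode(x, kernel):
    """Toy encoder: circular conv, ReLU, then global max pooling over vertices."""
    feat = np.maximum(circular_conv1d(x, kernel), 0.0)
    return feat.max(axis=0)

# A unit square as an open ring of four vertices.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 2, 8))  # random stand-in for learned weights

emb = encode(square, kernel)
# Same square, but the ring now starts at a different vertex.
emb_shifted = encode(np.roll(square, 2, axis=0), kernel)
print(np.allclose(emb, emb_shifted))
```

With circular padding, changing the starting vertex only rotates the per-vertex feature rows, and the order-independent pooling step then yields the same embedding. Zero padding would instead treat the start and end of the ring as boundaries, so the result would depend on where the ring is cut.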

Organization(s)
Institut für Geographie und Regionalforschung
External organization(s)
University of Bristol, University of California, Santa Barbara, University of Georgia, Stanford University, University of California, Berkeley, University of British Columbia (UBC), Google
Journal
Geoinformatica
Volume
27
Pages
289-340
Number of pages
52
ISSN
1384-6175
DOI
https://doi.org/10.1007/s10707-022-00481-2
Publication date
10-2022
Peer-reviewed
Yes
ÖFOS 2012
507003 Geoinformatics, 102018 Artificial neural networks
Keywords
ASJC Scopus subject areas
Geography, Planning and Development, Information systems
Link to portal
https://ucrisportal.univie.ac.at/de/publications/8a6dd4f2-5da3-411a-8ecc-dff88a18f2b3