Deep Floor Plan Recognition

Deep Floor Plan Recognition Using a Multi-Task Network with Room-Boundary-Guided Attention

Introduction

In this article, we analyse a new approach to recognising diverse floor plan elements in floor layouts, such as doors, windows, and different types of rooms, in addition to walls.

This was made possible by modelling a hierarchy of floor plan elements and designing a deep multi-task neural network that learns to predict room-boundary elements and rooms with their types. A room-boundary-guided attention mechanism was formulated in the spatial contextual module to take the boundary elements into account and enhance the room-type predictions.

The multi-task network is able to recognise walls of non-uniform thickness (see boxes 2, 4, 5), walls that meet at irregular junctions (see boxes 1, 2), and curved walls (see box 3) in the layout.

Floor Plan Recognition

Recognising floor plan elements in a layout requires learning the semantic information in the floor plans.

Traditionally, the problem is solved with low-level image processing methods that exploit heuristics to locate the graphical notations in the floor plans. These methods, however, lack the generality to handle diverse conditions. Recent work on the problem has begun to explore deep learning approaches.

Floor plan elements organized in a hierarchy

This article presents a new method for floor plan recognition, with a focus on recognising diverse floor plan elements. These elements are inter-related graphical elements with structural semantics in the floor plans.

A hierarchy of labels for the elements is first modelled, and a deep multi-task neural network is designed. The spatial contextual module explores the spatial relations between the elements via the room-boundary-guided attention mechanism, avoiding feature blurring and maximising the network's learning.

Methodology

Network Architecture

A shared VGG encoder was adopted to extract features from the input floor plan image. There are two main tasks for the network: one for predicting the room-boundary pixels with three labels and the other for predicting the room-type pixels with eight labels. 

Overall network architecture

VGG encoder and decoder architecture

The network first learns the shared features common to both tasks and then uses two separate VGG decoders to perform the two tasks.
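The shared-encoder, two-decoder design described above can be sketched as follows. This is a minimal illustrative skeleton, not the authors' exact configuration: the layer counts, channel sizes, and module names are all assumptions, and the real network uses full VGG encoder/decoder stacks plus the spatial contextual module discussed below.

```python
# Sketch of a shared encoder feeding two task-specific decoders:
# one head for room-boundary labels, one for room-type labels.
# All sizes/names are illustrative assumptions.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True))

class MultiTaskFloorPlanNet(nn.Module):
    def __init__(self, n_boundary=3, n_room=8):
        super().__init__()
        # Shared VGG-style encoder (heavily simplified).
        self.encoder = nn.Sequential(
            conv_block(3, 64), nn.MaxPool2d(2),
            conv_block(64, 128), nn.MaxPool2d(2))
        # Two separate decoders: room boundary and room type.
        self.boundary_decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            conv_block(128, 64),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            nn.Conv2d(64, n_boundary, 1))
        self.room_decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            conv_block(128, 64),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            nn.Conv2d(64, n_room, 1))

    def forward(self, x):
        shared = self.encoder(x)  # features common to both tasks
        return self.boundary_decoder(shared), self.room_decoder(shared)
```

Both heads produce per-pixel logits at the input resolution, so each task is trained as dense segmentation over its own label set.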

To further maximise the learning, the spatial contextual module was designed to process and pass the room-boundary features from the top decoder to the bottom decoder to maximise the feature integration for room-type predictions.

Spatial contextual module with the room-boundary-guided attention mechanism, which leverages the room-boundary features to learn attention weights for room-type prediction.
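One way to picture the room-boundary-guided attention is as a small module that turns the boundary-branch features into per-pixel weights and uses them to modulate the room-type features before decoding. The sketch below conveys that idea only; the layer shapes and the residual fusion are my assumptions, not the paper's exact spatial contextual module.

```python
# Sketch of boundary-guided attention: learn an attention map from the
# boundary features, then reweight the room-type features with it.
# Layer sizes and the fusion scheme are illustrative assumptions.
import torch
import torch.nn as nn

class BoundaryGuidedAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Collapse boundary features into a single-channel weight map in [0, 1].
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 1), nn.Sigmoid())
        self.fuse = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, room_feat, boundary_feat):
        a = self.attn(boundary_feat)      # (N, 1, H, W) attention weights
        attended = room_feat * a          # emphasise pixels near room boundaries
        return self.fuse(attended + room_feat)  # residual-style integration
```

Because the weights come from the boundary branch, the room-type decoder is pushed to respect wall, door, and window locations when labelling room interiors.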

Results

Comparing the results, we can see that Raster-to-Vector tends to perform worse on room-boundary predictions, e.g., even missing some room regions entirely.

The results of the proposed method are closer to the ground truths, even without post-processing. The method also outperforms the others in terms of overall accuracy and the Fβ metric.
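For concreteness, a per-class pixel-level Fβ score on predicted versus ground-truth label maps can be computed as below. This is the generic Fβ definition; it is not necessarily the exact evaluation protocol used in the paper.

```python
# Generic per-class F_beta over two integer label maps (pred vs. ground truth).
import numpy as np

def f_beta(pred, gt, cls, beta=1.0):
    """F_beta score for one class, treating `cls` pixels as positives."""
    tp = np.sum((pred == cls) & (gt == cls))
    fp = np.sum((pred == cls) & (gt != cls))
    fn = np.sum((pred != cls) & (gt == cls))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With beta = 1 this reduces to the familiar F1 score; larger beta weights recall more heavily than precision.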

This shows that the multi-task scheme with the shared features and the spatial contextual module helps to improve the floor plan recognition performance.

Visual comparison of floor plan recognition results produced by the proposed method and by others on the R2V dataset

Source: Zhiliang Zeng, Xianzhi Li, Ying Kin Yu, Chi-Wing Fu (2019). Deep Floor Plan Recognition Using a Multi-Task Network with Room-Boundary-Guided Attention
