Deep Floor Plan Recognition: Using a Multi-Task Network with Room-Boundary-Guided Attention

A New Way to Deconstruct Floor Plans

In this article, we analyse a new approach to floor plan recognition. Besides walls and rooms, it recognises diverse floor plan elements, such as doors, windows, and different types of rooms.

This is made possible by modelling a hierarchy of floor plan elements and designing a deep multi-task neural network that learns to predict room-boundary elements and rooms with types. A room-boundary-guided attention mechanism in the spatial contextual module takes the boundary elements into account to enhance the room-type predictions.

Floor Plan Recognition Traditionally

Recognising floor plan elements in a layout requires learning the semantic information in the floor plan.

Traditionally, the problem has been solved with low-level image processing methods that exploit heuristics to locate the graphical notations in the floor plans. These methods, however, lack the generality to handle diverse conditions. Recent methods have begun to explore deep learning approaches.

This article presents a new method for floor plan recognition, with a focus on recognising diverse floor plan elements. These elements are inter-related graphical elements with structural semantics in the floor plans.

A hierarchy of labels for the elements is first modelled, and a deep multi-task neural network is designed on top of it. The spatial contextual module explores the spatial relations between the elements via the room-boundary-guided attention mechanism, which avoids feature blurring and improves what the network learns.
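
One way to picture the label hierarchy is as a two-level mapping from each prediction task to the labels it owns. The sketch below is only an illustration: the article does not enumerate the labels, so the boundary names are taken from the elements mentioned in the introduction, the room-type names are placeholders, and the helper functions `task_for_label` and `label_ids` are hypothetical.

```python
from typing import Dict, List

# Two-level hierarchy: each task owns a group of per-pixel labels.
# ASSUMPTION: the label names are illustrative, not the paper's exact label set.
FLOOR_PLAN_HIERARCHY: Dict[str, List[str]] = {
    "room-boundary": ["wall", "door", "window"],
    "room-type": ["living room", "bedroom", "washroom", "balcony"],
}


def task_for_label(label: str) -> str:
    """Return the prediction task (hierarchy branch) that a label belongs to."""
    for task, labels in FLOOR_PLAN_HIERARCHY.items():
        if label in labels:
            return task
    raise KeyError(f"unknown floor plan label: {label}")


def label_ids(task: str) -> Dict[str, int]:
    """Map each label of one task to an integer class id for that task's output."""
    return {label: index for index, label in enumerate(FLOOR_PLAN_HIERARCHY[task])}
```

Keeping the two label groups separate mirrors the multi-task design: each task predicts only within its own branch of the hierarchy.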

Methods and Implications

Network Architecture

A shared VGG encoder extracts features from the input floor plan image. The network has two main tasks: one predicts the room-boundary pixels with three labels, and the other predicts the room-type pixels with eight labels.

The network first learns the shared features common to both tasks, then uses two separate VGG decoders to perform the two tasks.

To strengthen the learning further, the spatial contextual module processes the room-boundary features from the top decoder and passes them to the bottom decoder, maximising the feature integration for room-type predictions.
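
The data flow described above can be sketched in plain Python. This is a wiring diagram in code, not the authors' implementation: the VGG encoder, both decoders, and the spatial contextual module are replaced by arbitrary toy arithmetic, and every function name here is hypothetical. Only the label counts (three and eight) come from the text.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]       # H x W, one toy feature per pixel
ScoreMap = List[List[List[float]]]   # H x W x num_labels

NUM_BOUNDARY_LABELS = 3    # label counts taken from the text
NUM_ROOM_TYPE_LABELS = 8


def shared_encoder(image: FeatureMap) -> FeatureMap:
    """Toy stand-in for the shared VGG encoder: scale grey values to [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def decode(features: FeatureMap, num_labels: int) -> ScoreMap:
    """Toy stand-in for a VGG decoder: one score per label for every pixel."""
    return [
        [[value * (label + 1) for label in range(num_labels)] for value in row]
        for row in features
    ]


def spatial_context(boundary: ScoreMap, features: FeatureMap) -> FeatureMap:
    """Toy stand-in for the spatial contextual module.

    ASSUMPTION: arbitrary arithmetic that lets boundary scores modulate
    the features used for room-type prediction.
    """
    return [
        [value / (1.0 + max(scores)) for value, scores in zip(f_row, b_row)]
        for f_row, b_row in zip(features, boundary)
    ]


def forward(image: FeatureMap) -> Tuple[ScoreMap, ScoreMap]:
    shared = shared_encoder(image)                     # computed once, used by both tasks
    boundary = decode(shared, NUM_BOUNDARY_LABELS)     # top decoder
    guided = spatial_context(boundary, shared)         # boundary info flows to the bottom branch
    room_type = decode(guided, NUM_ROOM_TYPE_LABELS)   # bottom decoder
    return boundary, room_type
```

The point of the sketch is the ordering: the shared features are computed once, the boundary branch runs first, and its output feeds the room-type branch.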

Room-Boundary-Guided Attention

The spatial contextual module uses the room-boundary-guided attention mechanism, which leverages the room-boundary features to learn the attention weights for room-type prediction.
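
As a rough illustration of the idea, the sketch below derives a per-pixel attention weight from the room-boundary features and uses it to scale the room-type features. It is a simplification under stated assumptions: in the actual network the weights are learned through convolutional layers, whereas here a fixed sigmoid stands in for that learned mapping, and the function name `boundary_guided_attention` is hypothetical.

```python
import math
from typing import List

FeatureMap = List[List[float]]  # H x W, one toy feature per pixel


def sigmoid(x: float) -> float:
    """Squash a value into (0, 1) so it can act as an attention weight."""
    return 1.0 / (1.0 + math.exp(-x))


def boundary_guided_attention(
    boundary_features: FeatureMap, room_features: FeatureMap
) -> FeatureMap:
    """Scale each room-type feature by a weight derived from the boundary feature.

    ASSUMPTION: a fixed sigmoid replaces the learned layers of the real module.
    """
    if len(boundary_features) != len(room_features):
        raise ValueError("feature maps must have the same height")
    attention = [[sigmoid(b) for b in row] for row in boundary_features]
    return [
        [a * f for a, f in zip(a_row, f_row)]
        for a_row, f_row in zip(attention, room_features)
    ]
```

Because the weights are computed from the boundary branch alone, the room-type features are reweighted by where the boundaries are, which is the sense in which the attention is "room-boundary-guided".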

Results of Different Methods

Comparing the results, we can see that the Raster-to-Vector method tends to perform worse on room-boundary predictions, e.g., it even misses some room regions.

The results of the proposed method are closer to the ground truths, even without post-processing. It also outperforms the other methods in terms of overall accuracy and the Fβ metric.


This shows that the multi-task scheme with the shared features and the spatial contextual module helps to improve the floor plan recognition performance.
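
For readers unfamiliar with the two metrics mentioned above, the sketch below uses their standard definitions: overall accuracy is the fraction of pixels labelled correctly, and Fβ combines precision P and recall R as (1 + β²)·P·R / (β²·P + R). The article does not state which β was used, so `beta` is left as a parameter with an assumed default of 1, and the function names are hypothetical.

```python
from typing import Sequence


def overall_accuracy(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of pixels whose predicted label matches the ground truth."""
    if len(predicted) != len(truth) or len(truth) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score; beta > 1 favours recall, beta < 1 favours precision.

    ASSUMPTION: the default beta of 1 is illustrative, not the paper's setting.
    """
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0.0:
        return 0.0  # both precision and recall are zero
    return (1.0 + beta_sq) * precision * recall / denominator
```

For example, `overall_accuracy([0, 1, 2, 2], [0, 1, 1, 2])` gives 0.75, since three of the four pixels match.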


Source: Zhiliang Zeng, Xianzhi Li, Ying Kin Yu, Chi-Wing Fu (2019). Deep Floor Plan Recognition Using a Multi-Task Network with Room-Boundary-Guided Attention


