Deep Floor Plan Recognition with Multi-Task Network

A New Way to Deconstruct Floor Plans

In this article, we analyse a new approach to recognising diverse floor plan elements: not only the walls and rooms in a floor layout, but also doors, windows, and different types of rooms.

A flowchart illustrates Deep Floor Plan Recognition: an input floor plan is processed by a shared VGG encoder to extract features, which are then used for room-boundary and room-type predictions on floor plan diagrams at the output.

This was made possible by modelling a hierarchy of floor plan elements and designing a deep multi-task neural network that learns to predict room-boundary elements and rooms with types. A room-boundary-guided attention mechanism was formulated in the spatial contextual module so that the boundary elements are taken into account to enhance the room-type predictions.

Floor Plan Recognition Traditionally

Recognising floor plan elements in a layout requires learning the semantic information in the floor plans.

Traditionally, the problem has been solved with low-level image processing methods that exploit heuristics to locate the graphical notations in the floor plans. Such methods, however, lack the generality to handle diverse conditions. Recent methods have begun to explore deep learning approaches.

Two pairs of floor plans are shown: (a) and (c) depict original layouts with labeled rooms; (b) and (d) present Deep Floor Plan Recognition results, using color-coding, red rectangles, and numbers to highlight specific areas.

This article presents a new method for floor plan recognition, with a focus on recognising diverse floor plan elements. These elements are inter-related graphical elements with structural semantics in the floor plans.

A hierarchy of labels for the elements was first modelled, and a deep multi-task neural network was then designed around it. The spatial contextual module explores the spatial relations between the elements via the room-boundary-guided attention mechanism, which avoids feature blurring and helps the network learn more effectively.
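To make the idea of a label hierarchy concrete, here is a minimal sketch of how one grid of element labels could be split into one target grid per task. The two top-level groups follow the article, but the room-type names, the `background` label, and the helper names `task_of` and `split_targets` are assumptions made for this illustration, not the authors' code.

```python
# Illustrative hierarchy: the two top-level groups follow the article,
# but the room-type names below are assumptions chosen for the example.
HIERARCHY = {
    "room_boundary": ["wall", "door", "window"],
    "room_type": ["bedroom", "living_room", "washroom"],
}
BACKGROUND = "background"  # hypothetical label for pixels outside a task


def task_of(label):
    """Return the task that owns a label, or None if the label is unknown."""
    for task, labels in HIERARCHY.items():
        if label in labels:
            return task
    return None


def split_targets(label_grid):
    """Split one grid of element labels into one target grid per task."""
    targets = {task: [] for task in HIERARCHY}
    for row in label_grid:
        for task in HIERARCHY:
            # Keep labels that belong to this task, mask everything else.
            targets[task].append(
                [label if task_of(label) == task else BACKGROUND for label in row]
            )
    return targets


grid = [["wall", "bedroom"], ["door", "washroom"]]
targets = split_targets(grid)
```

Each task then sees only its own labels, which is one simple way to derive two supervision signals from a single annotated floor plan.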

Diagram showing a neural network with branching paths, each processing tensors of varying sizes for Deep Floor Plan Recognition, merging at shared features (16x16x512), and ending with element-wise addition and 1x1 convolution. Arrows indicate data flow.

Methods and Implications

Network Architecture
A shared VGG encoder was adopted to extract features from the input floor plan image. There are two main tasks for the network: one for predicting the room-boundary pixels with three labels and the other for predicting the room-type pixels with eight labels.

The network first learns the shared features common to both tasks, and then uses two separate VGG decoders to perform the two tasks.

To further improve learning, the spatial contextual module processes the room-boundary features and passes them from the top decoder to the bottom decoder, maximising the feature integration for room-type predictions.
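The shapes involved in the architecture above can be checked with a little arithmetic. The sketch below assumes a standard VGG-16-style encoder (five stages, each ending in 2x2 pooling), a 512x512 input, and decoders that restore the input resolution; none of these are stated in the article, but they are consistent with the 16x16x512 shared features shown in the diagram. The function names are hypothetical.

```python
# Assumed VGG-16-style encoder: five stages, each ending in 2x2 max pooling.
# The channel widths follow the standard VGG-16 layout.
VGG16_STAGE_CHANNELS = [64, 128, 256, 512, 512]


def encoder_shapes(height, width, stage_channels=VGG16_STAGE_CHANNELS):
    """Return the (H, W, C) feature-map shape after each pooled stage."""
    shapes = []
    for channels in stage_channels:
        height //= 2  # each pooling step halves the spatial size
        width //= 2
        shapes.append((height, width, channels))
    return shapes


def task_output_shapes(height, width, boundary_labels=3, room_labels=8):
    """Per-pixel prediction shapes, assuming decoders restore input size."""
    return {
        "room_boundary": (height, width, boundary_labels),
        "room_type": (height, width, room_labels),
    }


shapes = encoder_shapes(512, 512)   # 512x512 input is an assumption
shared = shapes[-1]                 # features shared by both decoders
outputs = task_output_shapes(512, 512)
```

Under these assumptions the last encoder stage yields a 16x16x512 tensor, and each decoder produces one score per label for every pixel.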

Room Boundary Guided Attention

Flowchart showing a neural network architecture for Deep Floor Plan Recognition, where room-boundary and room-type features are combined with attention weights, passed through direction-aware kernels, upsampled, and output as spatial contextual features.

The spatial contextual module uses the room-boundary-guided attention mechanism, which leverages the room-boundary features to learn the attention weights for room-type prediction.
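The core idea, weights derived from boundary features scaling the room-type features at the same pixel, can be sketched in a few lines. This is a simplified illustration only: in the network the attention weights are learned, whereas here a plain sigmoid over a single boundary score per pixel stands in for them, and the direction-aware kernels from the diagram are omitted. The function name `boundary_guided_attention` is hypothetical.

```python
import math


def sigmoid(x):
    """Squash a raw score into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


def boundary_guided_attention(boundary_scores, room_features):
    """Scale each room-type feature vector by a boundary-derived weight.

    boundary_scores: H x W grid of raw scores (assumed stand-in for the
                     learned room-boundary features).
    room_features:   H x W grid of feature vectors from the room-type branch.
    """
    attended = []
    for score_row, feature_row in zip(boundary_scores, room_features):
        out_row = []
        for score, feature in zip(score_row, feature_row):
            weight = sigmoid(score)  # attention weight for this pixel
            out_row.append([weight * value for value in feature])
        attended.append(out_row)
    return attended


scores = [[0.0, 10.0]]
features = [[[2.0, 4.0], [2.0, 4.0]]]
result = boundary_guided_attention(scores, features)
```

A score of 0 halves the feature vector, while a large positive score leaves it almost unchanged, so pixels the boundary branch responds to strongly pass more of their room-type features through.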

Results of Different Methods

Comparing the results, we can see that Raster-to-Vector tends to perform worse on room-boundary predictions; for example, it even misses some room regions.

The results of the proposed method are closer to the ground truths, even without post-processing. The method also outperforms the others in terms of overall accuracy and the Fβ metrics.
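For readers unfamiliar with these metrics, the sketch below computes overall pixel accuracy and a per-label Fβ score from two small label grids. It uses the standard Fβ definition, (1 + β²)·P·R / (β²·P + R); the value of β used in the evaluation is not given in this article, so it is left as a parameter, and the toy grids are invented for the example.

```python
def _pixel_pairs(predicted, truth):
    """Flatten two label grids into a list of (predicted, truth) pairs."""
    return [
        (p, t)
        for p_row, t_row in zip(predicted, truth)
        for p, t in zip(p_row, t_row)
    ]


def overall_accuracy(predicted, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    pairs = _pixel_pairs(predicted, truth)
    return sum(p == t for p, t in pairs) / len(pairs)


def f_beta(predicted, truth, label, beta=1.0):
    """Standard F-beta score for one label, from pixel-level counts."""
    pairs = _pixel_pairs(predicted, truth)
    tp = sum(p == label and t == label for p, t in pairs)
    fp = sum(p == label and t != label for p, t in pairs)
    fn = sum(p != label and t == label for p, t in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy 2x2 example (invented data): one wall pixel is mislabelled as room.
predicted = [["wall", "room"], ["room", "room"]]
truth = [["wall", "wall"], ["room", "room"]]
accuracy = overall_accuracy(predicted, truth)
wall_f1 = f_beta(predicted, truth, "wall")
room_f1 = f_beta(predicted, truth, "room")
```

In this toy case three of four pixels are correct, and the wall label is penalised through recall while the room label is penalised through precision.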

A comparison of five algorithms for generating floor plans, including Deep Floor Plan Recognition, each represented in columns with five rows of images; the first column shows the original input and the second displays the ground truth.

 

This shows that the multi-task scheme with the shared features and the spatial contextual module helps to improve the floor plan recognition performance.

References

Source: Zhiliang Zeng, Xianzhi Li, Ying Kin Yu, Chi-Wing Fu (2019). Deep Floor Plan Recognition Using a Multi-Task Network with Room-Boundary-Guided Attention
