Abstract:
Urban building segmentation from remotely sensed imagery is challenging because building features vary widely. Furthermore, very high spatial resolution imagery captures many details of urban buildings, such as styles, small gaps between buildings, and building shadows. Hence, achieving satisfactory accuracy in detecting and extracting urban features from such highly detailed images remains difficult. Deep learning semantic segmentation using baseline networks works well for building extraction; however, the ability of these networks to extract buildings in shadowed areas, buildings with unclear features, and buildings separated by narrow gaps in dense building zones is still limited. In this article, we propose a parallel cross-learning-based pixel transferred deconvolutional network (PCL–PTD net) and use it to segment urban buildings from aerial photographs. The proposed method is evaluated and compared with traditional baseline networks. PCL–PTD net is composed of a parallel network, cross-learning functions, residual units in the encoder, and PTD units in the decoder. The network is
applied to three datasets (the Inria aerial dataset, the International Society for Photogrammetry and Remote Sensing Potsdam dataset, and a UAV building dataset) to evaluate its accuracy and robustness. The results show that PCL–PTD net improves the learning capacity of the supervised learning model in differentiating buildings in dense areas and extracting buildings covered by shadows. Compared
with the baseline networks, the proposed network shows superior performance to all eight networks (SegNet, U-Net, pyramid scene parsing network, PixelDCL, DeepLabV3+, U-Net++, context feature enhancement network, and improved