Multi-Level Attribute-Guided-Based Adaptive Multi-Dilated Convolutional Network for Image Aesthetic Assessment.
Sumei Li, Mingxuan Xie, Wei Xiang
Abstract
Image aesthetic assessment (IAA) is crucial for both scientific research and practical applications, and numerous studies have achieved promising performance. However, existing methods still exhibit two major limitations: they neglect the hierarchical interactions between attribute features and aesthetic features, and they distort the original aspect ratio during image preprocessing, which causes a loss of aesthetic information. To address these issues, we propose a Multi-level Attribute-Guided Adaptive Multi-Dilated Convolutional Network (MAADN), which leverages multi-level attribute features to guide aesthetic assessment and reduces the negative impact of image preprocessing through adaptive dilated convolution. Specifically, we employ a dual-branch architecture: one branch extracts multi-level attribute features, while the other learns aesthetic features under the guidance of these attributes. We further design an Attention-based Attribute-Guided Aesthetic Module (AGAM), which uses visual attention mechanisms to strengthen the guidance provided by the attributes. Additionally, we design an Adaptive Multi-Dilate Rate Convolution Module (AMDM) in which the network itself generates the weights used to fuse dilated convolution features computed at different dilation rates, rather than simply calculating those weights from the aspect ratio. This approach effectively alleviates the negative effects of image preprocessing while maintaining training flexibility. Extensive experimental results demonstrate that the proposed model outperforms current state-of-the-art approaches. Furthermore, visual analysis confirms that MAADN precisely localizes aesthetically critical regions.
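To make the idea of adaptively fused dilated convolutions concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not state how the network produces the fusion weights, so the gate used here (global average pooling of each branch, a linear map, then a softmax) is an assumption, as are all names (`dilated_conv2d`, `adaptive_fusion`, `gate_weights`, `gate_bias`) and the toy values.

```python
import math

def dilated_conv2d(image, kernel, rate):
    """3x3 convolution with dilation `rate` and zero padding (output keeps the input size)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # Kernel taps are spaced `rate` pixels apart.
                    sy, sx = y + (ky - 1) * rate, x + (kx - 1) * rate
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += kernel[ky][kx] * image[sy][sx]
            out[y][x] = acc
    return out

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_fusion(image, kernel, rates, gate_weights, gate_bias):
    """Fuse dilated-conv branches using weights predicted from the features.

    Assumption: the gate is global average pooling + linear layer + softmax.
    In a trained model, gate_weights and gate_bias would be learned.
    """
    h, w = len(image), len(image[0])
    branches = [dilated_conv2d(image, kernel, r) for r in rates]
    # One scalar descriptor per branch (global average pooling).
    pooled = [sum(map(sum, b)) / (h * w) for b in branches]
    scores = [sum(gw * p for gw, p in zip(row, pooled)) + bias
              for row, bias in zip(gate_weights, gate_bias)]
    weights = softmax(scores)
    fused = [[sum(wt * b[y][x] for wt, b in zip(weights, branches))
              for x in range(w)] for y in range(h)]
    return fused, weights

# Toy usage on a non-square 6x8 "image" with an averaging kernel.
image = [[float((x + 2 * y) % 5) for x in range(8)] for y in range(6)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
rates = [1, 2, 3]
gate_weights = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
gate_bias = [0.0, 0.0, 0.0]
fused, weights = adaptive_fusion(image, kernel, rates, gate_weights, gate_bias)
```

The contrast with an aspect-ratio rule is that `weights` here depends on the feature content through the gate, so it can be trained end to end; a fixed rule would instead compute the weights directly from the image's width and height.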