The complex underwater environment subjects light to scattering and wavelength-dependent attenuation, so underwater images exhibit color deviation and low contrast, which hinders downstream underwater tasks. Deep learning methods now make extensive use of multi-scale features to improve underwater image quality, but most of them do not take channel differences into account when propagating features. To this end, we propose a cross aggregation transformer (CAT), which uses three stages of projection-crossing aggregation to adaptively select beneficial channels. This paper also designs a dynamic supplement underwater image enhancement network, which consists of a shallow network and an enhancement network. Through an encoder/decoder structure, the enhancement network restores the original appearance of the underwater image, while the shallow network extracts shallow features at different scales. Both networks focus on under-enhanced regions and supplement details in real time through the residual supplement module (RSM). Experimental results demonstrate that CAT and RSM effectively improve network performance and enable the network to outperform other advanced methods on various datasets.
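The abstract names the core idea of CAT — adaptively weighting channels so that only beneficial ones propagate — without detailing the mechanism. Below is a minimal NumPy sketch of channel selection under stated assumptions: the function name `cross_channel_select`, the correlation-based score, and the softmax gating are all illustrative, not the paper's actual CAT design.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_channel_select(feats_a, feats_b):
    """Illustrative channel selection: weight each channel of feats_b by
    how strongly it agrees with feats_a, so 'beneficial' channels dominate.
    feats_a, feats_b: (N, C) arrays of flattened spatial features."""
    score = (feats_a * feats_b).mean(axis=0)   # (C,) per-channel affinity
    w = softmax(score)                         # channel weights, sum to 1
    return feats_b * w[None, :]                # reweighted features
```

The point of the sketch is only that gating is computed *across* two feature streams rather than within one, which is what distinguishes cross aggregation from per-branch channel attention.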
Silhouette extraction of foreground objects appears frequently in real-world applications such as advanced driver assistance systems, intelligent monitoring systems, and movie production. Many solutions have been developed to extract silhouettes from RGB images using only color information. Since these color-based silhouette extraction methods still have difficulty separating overlapping foreground objects and eliminating over-segmentation, this paper proposes a novel object segmentation method that uses both color and depth information in RGB-D images. First, we remove the ground plane using the normal map of the depth image. Second, to separate foreground objects at different distances completely and correctly, deep residual networks (ResNets) and Otsu's multithresholding method are combined to divide the depth image into multiple layers, each containing only one foreground object or several objects at the same distance. Finally, the outline of each foreground object is extracted directly from its depth layer and refined with color information. Experimental results demonstrate that our method performs better than those using color or depth information alone, and extracts more types of objects than neural networks.
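The depth-layering step can be illustrated with a small sketch: a standard single-level Otsu threshold (maximizing between-class variance), applied recursively to approximate multithresholding. This is a generic reconstruction, not the paper's exact pipeline; `split_depth_layers` and its largest-spread splitting heuristic are assumptions for illustration.

```python
import numpy as np

def otsu_threshold(values, bins=64):
    """Single Otsu threshold: maximize between-class variance of a 1-D sample."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    best_t, best_var = centers[0], -1.0
    for k in range(1, bins):
        w0, w1 = p[:k].sum(), p[k:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (p[:k] * centers[:k]).sum() / w0   # mean of class below cut
        m1 = (p[k:] * centers[k:]).sum() / w1   # mean of class above cut
        var = w0 * w1 * (m0 - m1) ** 2          # between-class variance
        if var > best_var:
            best_var, best_t = var, centers[k]
    return best_t

def split_depth_layers(depth, levels=2):
    """Recursive Otsu: `levels` thresholds give `levels`+1 depth layers."""
    thresholds = []
    segments = [depth.ravel()]
    for _ in range(levels):
        # split the segment with the largest spread (a simple heuristic)
        i = max(range(len(segments)),
                key=lambda j: segments[j].std() if segments[j].size > 1 else 0.0)
        seg = segments.pop(i)
        t = otsu_threshold(seg)
        thresholds.append(t)
        segments += [seg[seg <= t], seg[seg > t]]
    return np.digitize(depth, sorted(thresholds))  # layer index per pixel
```

Each resulting integer label is one depth layer, from which a per-object outline can then be extracted and refined with color.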
Place recognition plays an essential role in the field of autonomous driving and robot navigation. Point-cloud-based methods mainly focus on extracting global descriptors from local features of point clouds. Despite having achieved promising results, existing solutions neglect the following aspects, which may cause performance degradation: (1) the huge size difference between objects in outdoor scenes; (2) moving objects that are unrelated to place recognition; (3) long-range contextual information. We illustrate that these aspects make it challenging to extract discriminative global descriptors. To mitigate these problems, we propose a novel method named TransLoc3D, which utilizes adaptive receptive fields with a pointwise reweighting scheme to handle objects of different sizes while suppressing noise, and an external transformer to capture long-range feature dependencies. As opposed to existing architectures that adopt fixed and limited receptive fields, our method benefits from size-adaptive receptive fields as well as global contextual information, and outperforms the current state of the art with significant improvements on popular datasets.
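A hedged sketch of pointwise reweighting over branches with different receptive fields, in the spirit of selective-kernel attention: each point gets its own softmax weights over the branches, so small objects can favor small receptive fields and large objects large ones. The abstract does not specify TransLoc3D's exact formulation; `select_scale` and the gating matrix `W` are illustrative stand-ins for the learned gating.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def select_scale(branches, W):
    """branches: list of K arrays, each (N, C), from K receptive-field sizes.
    W: (C, K) illustrative gating weights (learned in a real network)."""
    s = sum(branches)                  # (N, C) fused descriptor
    logits = s @ W                     # (N, K) per-point branch scores
    a = softmax(logits, axis=1)        # pointwise reweighting over branches
    return sum(a[:, k:k + 1] * branches[k] for k in range(len(branches)))
```

With zero gating weights the scheme degrades gracefully to a uniform average of branches, which makes the mechanism easy to sanity-check.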
Starting from vector multipliers, the inner product, norm, and distance, as well as the addition of two vectors of different dimensions, are proposed, which makes the space a topological vector space, called the Euclidean space of different dimensions (ESDD). An equivalence relation is obtained via the distance. As quotient spaces of ESDDs with respect to this equivalence, dimension-free Euclidean spaces (DFESs) and dimension-free manifolds (DFMs) are obtained, which have bundled vector spaces as their tangent spaces at each point. Using the natural projection from an ESDD to a DFES, a fiber bundle structure is obtained that has the ESDD as its total space and the DFES as its base space. Classical objects in differential geometry, such as smooth functions, (co-)vector fields, and tensor fields, are extended to DFMs with the help of projections among Euclidean spaces of different dimensions. Then dimension-varying dynamic systems (DVDSs) and dimension-varying control systems (DVCSs) are presented, which have DFMs as their state spaces. The realization, which is a lifting of DVDSs or DVCSs from DFMs into ESDDs, and the projection of DVDSs or DVCSs from ESDDs onto DFMs are investigated.
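One standard way to realize such cross-dimensional operations (used in Cheng-style dimension-free frameworks, which this abstract appears to follow) is to lift both vectors to the least-common-multiple dimension via a Kronecker product with a ones vector, then operate there. The sketch below assumes exactly that construction; the `1/t` normalization of the inner product is chosen so that lifting a pair of equal-dimension vectors rescales their inner product consistently (to the weighted inner product on the smaller space), but both the normalization and the function names are assumptions, not the paper's definitions.

```python
import numpy as np
from math import lcm

def lift(x, t):
    """Lift x in R^m to R^t (m must divide t) via Kronecker product with 1s."""
    x = np.asarray(x, dtype=float)
    m = len(x)
    assert t % m == 0, "target dimension must be a multiple of len(x)"
    return np.kron(x, np.ones(t // m))

def dd_add(x, y):
    """Addition of vectors of different dimensions: lift both to R^lcm(m,n)."""
    t = lcm(len(x), len(y))
    return lift(x, t) + lift(y, t)

def dd_inner(x, y):
    """Cross-dimensional inner product: lifted dot product, normalized by t."""
    t = lcm(len(x), len(y))
    return float(lift(x, t) @ lift(y, t)) / t
```

The induced distance `sqrt(dd_inner(x - y-lifted, ...))` then vanishes exactly when two vectors lift to the same element, which is the equivalence relation the quotient construction uses.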