Blind Quality Assessment of 3D Dense Point Clouds with Structure Guided Resampling

Wei Zhou, Qi Yang, Qiuping Jiang, Guangtao Zhai, and Weisi Lin

This work was supported in part by NSFC under Grant 61901236 and the Natural Science Foundation of Zhejiang LR22F020002. (Corresponding author: Qiuping Jiang.)
W. Zhou is with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada (e-mail: wei.zhou@uwaterloo.ca).
Q. Yang is with the Tencent MediaLab, Shanghai 200030, China (e-mail: chinoyang@tencent.com).
Q. Jiang is with the School of Information Science and Engineering, Ningbo University, Ningbo 315211, China (e-mail: jiangqiuping@nbu.edu.cn).
G. Zhai is with the Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: zhaiguangtao@sjtu.edu.cn).
W. Lin is with the School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798 (e-mail: wslin@ntu.edu.sg).

Abstract

Objective quality assessment of 3D point clouds is essential for the development of immersive multimedia systems in real-world applications. Despite the success of perceptual quality evaluation for 2D images and videos, blind/no-reference metrics are still scarce for 3D point clouds with large-scale irregularly distributed 3D points. Therefore, in this paper, we propose an objective point cloud quality index with Structure Guided Resampling (SGR) to automatically evaluate the perceptually visual quality of 3D dense point clouds. The proposed SGR is a general-purpose blind quality assessment method without the assistance of any reference information. Specifically, considering that the human visual system (HVS) is highly sensitive to structure information, we first exploit the unique normal vectors of point clouds to execute regional pre-processing which consists of keypoint resampling and local region construction. Then, we extract three groups of quality-related features, including: 1) geometry density features; 2) color naturalness features; 3) angular consistency features. Both the cognitive peculiarities of the human brain and naturalness regularity are involved in the designed quality-aware features that can capture the most vital aspects of distorted 3D point clouds. Extensive experiments on several publicly available subjective point cloud quality databases validate that our proposed SGR can compete with state-of-the-art full-reference, reduced-reference, and no-reference quality assessment algorithms.

Index Terms: 3D point clouds, blind/no-reference, perceptual quality assessment, structure information, naturalness regularity, human visual system.

I Introduction

With the modern advances in 3D data capture devices and rendering technologies, 3D point clouds have become one of the most important multimedia representations for providing 6 degrees of freedom [38]. By definition, a 3D point cloud contains a huge number of scattered points with certain attributes. Each 3D point carries geometry and attribute information, which can represent actual objects or environments more vividly than traditional images and videos. To be specific, geometry stands for the 3D coordinates of the point, while the attribute is composed of additional descriptors such as RGB color, surface normal, reflectance, and opacity. Owing to this rich representation, many real-world applications of 3D point clouds have emerged, including VR/AR/XR [40], autonomous driving [47], and scene understanding [7], among many others. Typically, 3D point clouds can be divided into two types: relatively small objects and large-scale scenes. In this work, we focus on the first type, i.e., 3D dense point clouds.

Fig. 1: Comparison between a 2D image and a 3D point cloud. (a) 2D image, (b) 3D point cloud.

Although the data format of 3D point clouds is very different from that of conventional images and videos, a variety of distortions are likewise inevitably introduced along the processing chain of multimedia communication systems [36]. For example, the acquisition sensors may produce noise artifacts in captured 3D point clouds. Moreover, because of the large-scale volumetric content of 3D point clouds, applying downsampling and compression operations [30] can also lead to perceptual quality degradation. Consequently, how to effectively assess the perceptual visual quality of 3D point clouds is a significant and challenging problem, whose solution can benefit other relevant processing stages, e.g., tuning the acquisition parameters for the optimal capture of 3D point clouds. Regarding the quality assessment of 3D point clouds, a subjective test usually serves as the most accurate and reliable method [48, 3]. However, since organizing subjective experiments is time-consuming and labor-intensive, developing efficient objective quality metrics is a promising alternative.

In general, the characteristics of point clouds make the design of objective quality assessment models more challenging than for traditional 2D images or videos. Fig. 1 illustrates one example of each data format: the 2D image shows a man riding a horse, and the 3D point cloud shows a cake as an object. Ideally, both data formats can represent a scene or an object in the real world. However, a 2D image is usually displayed on a regular grid, while a 3D point cloud is unstructured and composed of many irregularly distributed points. Thus, new challenges arise for the objective quality assessment of 3D point clouds.

Fig. 2: Typical distortions in 3D point clouds. (a) A reference point cloud, (b-d) examples of distorted point clouds.

According to the availability of original reference 3D point clouds, the objective quality assessment methods of 3D point clouds can be divided into three categories, consisting of full-reference (FR), reduced-reference (RR), and no-reference (NR) models. That is, the FR and RR models can access full and part data of reference point clouds, respectively. In contrast, the blind/NR models evaluate the perceptual quality of distorted 3D point clouds without any reference information. Among the NR models, general-purpose ones can evaluate distortion-generic 3D point clouds, which do not need to know about the distortion type and are more practical in real-world scenarios. Existing general-purpose NR quality assessment methods for 3D point clouds have been proposed mainly on the basis of deep neural networks (DNNs) [18, 45, 19]. By leveraging the remarkable learning ability of DNNs, perceptual quality can be directly predicted from the point data. However, the learning process often lacks the design philosophies of domain knowledge that reflect the human visual system (HVS) characteristics, causing unsatisfactory interpretability. In [50], a statistics-based quality assessment model for point clouds and meshes was proposed. To the best of our knowledge, so far there has been no exploration of explainable general-purpose NR quality assessment methods specifically designed for 3D point clouds based on domain knowledge.

As illustrated in Fig. 2, our main motivation comes from observing the typical artifact appearances in various distorted 3D point clouds. From (b) to (d), the perceptual quality degradations are related to geometry density, color difference, and point orientation, respectively. For example, compared with the original reference point cloud in Fig. 2(a), the points are incomplete in Fig. 2(b) and have incorrect colors in Fig. 2(c). Moreover, in Fig. 2(d), the orientations of the points are destroyed, as can be clearly observed in the object parts inside the bounding boxes. Therefore, we aim to open the black box of point cloud quality prediction by quantifying the distortions from the aspects of density, color, and orientation, leading to the general-purpose NR point cloud quality index with Structure Guided Resampling (SGR).

In our proposed SGR, to extract efficient HVS-related features from scattered points, regional pre-processing is first conducted, which involves keypoint resampling and local region construction. In this pre-processing step, considering the huge amount of point cloud data and the crucial structure information, we first resample the test point cloud via normal vectors to obtain a series of keypoints, and then construct local regions centered at the keypoints. Afterwards, inspired by the combined impacts of 3D geometry and associated attributes, three groups of quality-aware features, including 1) geometry density features, 2) color naturalness features, and 3) angular consistency features, are employed to predict the perceptual quality of distorted 3D point clouds. Besides, we also consider the fundamental theory of natural scene statistics (NSS) [28, 34] in the extracted features. Finally, the quality-related features are fused into the overall SGR via a quality regression module. The effectiveness of our proposed SGR is verified on four subjective point cloud quality databases. The experiments demonstrate that the proposed SGR presents significant performance improvement compared with other NR metrics, and even outperforms most FR metrics. Besides, the proposed SGR shows stable performance across typical point cloud distortion types, such as Downsampling, Gaussian Noise, G-PCC, and V-PCC.

The main contributions of this paper are summarized as follows:

We propose a general-purpose blind objective quality assessment algorithm for 3D point clouds based on the cognitive characteristics of the human brain and the regularity of NSS.
According to the structural dependence of the HVS, the regional pre-processing is proposed in the SGR framework, which exploits the unique normal vectors of 3D point clouds.
The integrated influence of 3D geometry and associated attributes information can be reflected by the proposed quality-aware features. Experimental results on existing subjective point cloud quality databases demonstrate the competitive performance of the proposed model.

Fig. 3: Framework of the proposed SGR method which consists of regional pre-processing, quality-aware feature extraction, and quality regression. The regional pre-processing involves keypoint resampling and local region construction. For final visual score prediction, geometry density, color naturalness and angular consistency features are used to quantify the perceptual quality of 3D point clouds.

The rest of this paper is organized as follows. Section II reviews related work on objective quality assessment of 3D point clouds. In Section III, we introduce the technical details of our proposed NR quality assessment method. Section IV presents the validation results of the proposed model. Finally, we conclude the paper in Section V.

II Related Work

To assess the perceptual quality of distorted 3D point clouds, comparing them with the corresponding original reference point clouds is the most intuitive way when the pristine versions are available. Therefore, many FR quality assessment methods for 3D point clouds have emerged, which can be generally classified into point-based and projection-based metrics.

The point-based models directly use the 3D points in the reference and distorted contents. Among them, the earliest point-based metric is point-to-point (p2po) [20], where the mean squared error (MSE) or the Hausdorff (HF) distance is often adopted for geometric peak signal-to-noise ratio (PSNR) computation. An alternative method is point-to-plane (p2pl) [37], which computes the PSNR between a test point and the corresponding normal vector in the reference point cloud. Beyond the point-based geometry distortion measures, color information can also be used to compute the PSNR, such as the YUV channels or the luminance Y component [21]. However, neither the p2po nor the p2pl quality metric can precisely evaluate the perceptual quality of 3D point clouds under structural loss. Thus, the angular similarity (AS), also known as plane-to-plane, was proposed [1]. Moreover, on the basis of graph signal processing, Yang et al. [46] proposed the graph similarity index (GraphSIM) and the extended multiscale GraphSIM (MS-GraphSIM) [49]. In addition, based on the mesh structural distortion measure (MSDM) [16], Meynet et al. [22] exploited local curvature statistics to develop the FR quality assessment method called PC-MSDM for the perceptual quality evaluation of 3D point clouds. They further proposed the point cloud quality metric (PCQM) [23], which utilizes an optimally weighted linear combination of geometric curvature and color features. Furthermore, similar to the well-known structural similarity index (SSIM) [41], Alexiou et al. [2] tried to capture the local changes of test point clouds, leading to the quality degradation measurement named PointSSIM.

Apart from the previously mentioned point-based FR methods, another mainstream FR framework, namely projection-based models, projects 3D point clouds onto multiple 2D images from various views and then conducts 2D image quality assessment. For example, the SSIM [41] and its variant multiscale SSIM (MS-SSIM) [43], as well as the visual information fidelity in the pixel domain (VIFP) [32], are often employed to predict the visual quality of the projected 2D images. Several projection methods can be adopted, among which the six perpendicular projections onto the faces of a cube are a typical choice [44]. Usually, different weights are allocated to the features extracted from the individual projection planes.

Since the full information of the associated reference point clouds is not always available, Viola et al. [39] proposed an RR quality metric for point cloud contents (i.e., PCMRR), where a small set of features is extracted from the references and then delivered to the receiver side for evaluating the quality degradation of distorted point clouds. Additionally, some works also use DNNs to explore the challenging NR quality metrics of 3D point clouds. For example, Liu et al. [18] designed a point cloud quality assessment network (PQA-Net) which consists of multi-view-based joint feature extraction and fusion, distortion type identification, and final quality prediction. More recently, adversarial domain adaptation has been used for the development of NR quality assessment methods for point clouds [45].

However, the existing NR models are generally data-driven and uninterpretable; little consideration is given to designing domain-knowledge-oriented perceptual features that reveal the characteristics of the HVS. Therefore, in this work, we aim to bridge the explicit gap between HVS perception and NR point cloud quality assessment. Specifically, considering the structure information of scattered points and the great success of NSS regularity in 2D image quality prediction, we are the first to propose a blind/NR quality assessment method specifically designed for 3D point clouds based on well-defined quality-aware features upon regional pre-processing and the cognitive properties of the human brain.

Fig. 4: Illustration of keypoint resampling for 3D point clouds under various distortion levels. (a)(c)(e) Distorted 3D point clouds, (b)(d)(f) the corresponding keypoints resampled from (a)(c)(e).

III Proposed Point Cloud Quality Index

The framework of our proposed SGR method is shown in Fig. 3, which is a general-purpose blind quality assessment model. Since the HVS is more sensitive to structures, we first resample keypoints from the input test 3D point cloud based on generated normal vectors. Then, we construct local regions for the resampled points and extract distortion-related features. Finally, the quality regression module is used to map the extracted features onto visual scores. We will introduce the technical details of SGR in the following subsections.

III-A Regional Pre-processing

Keypoint resampling. Suppose that a distorted 3D point cloud $P$ contains $M$ points, each with three geometry coordinates ($x, y, z$) and three color attributes ($r, g, b$). We can then denote the 3D point cloud data as follows:

P = [p_{1}, p_{2}, \dots, p_{M}]^{T} \in R^{M \times 6},

(1)

where $p_{m} = (x_{m}, y_{m}, z_{m}, r_{m}, g_{m}, b_{m})$ is a point in the point cloud. In this representation, the geometry coordinates and color attributes can be separated as $g_{m} = (x_{m}, y_{m}, z_{m})$ and $c_{m} = (r_{m}, g_{m}, b_{m})$ , respectively. In addition, another attribute called normal vectors can be estimated from the point cloud.

Generally, each 3D point cloud has numerous points. For example, the 3D point cloud in Fig. 1 (b) has $M = 2{,}486{,}566$ points. Moreover, these 3D points are irregular and unstructured. Nevertheless, the HVS tends to perceive the structures of objects or environments. For traditional images and videos, such structure information has been widely used to design many objective quality measures [41, 43, 42]. Inspired by GraphSIM [46], we opt to extract the keypoints by graph-based resampling to realize the structure extraction of 3D point clouds. Specifically, considering that the resampled points should lie in high-frequency parts such as edges and contours, GraphSIM uses a Haar-like high-pass filter to extract the point cloud skeleton based on the original geometry coordinates. However, this method is too sensitive to outliers, which may lead to unstable results. Therefore, we first compute the normal vectors and treat them as the graph signal to guide keypoint resampling.

Fig. 5: Structure guided local region construction of a distorted 3D point cloud after keypoint resampling.

As defined, a graph filter represents a signal system that inputs a graph signal and outputs a tuned graph signal [8]. Mathematically, a linear and shift-invariant graph filter is a polynomial conversion of the graph shift operator $A \in R^{M \times M}$ computed by:

h(A) = \sum_{k = 0}^{K - 1} h_{k} A^{k} = h_{0} I + h_{1} A + \dots + h_{K - 1} A^{K - 1},

(2)

where $h_{k}$ denotes the $k$-th filter coefficient, $K$ is the length of the graph filter, and $I$ represents the identity matrix. With the predefined graph filter, for an input graph signal $Φ \in R^{M \times 1}$, the filtered output graph signal is a matrix-vector product, which can be rewritten via eigendecomposition as:

h (A) Φ = V h (Λ) V^{- 1} Φ,

(3)

where the eigenvectors of $A$ constitute the columns of matrix V. The eigenvalue matrix $Λ \in R^{M \times M}$ is the diagonal matrix containing the corresponding eigenvalues of $A$ . Here, the eigenvalues are frequencies on the formed graph, which can be used to sort the 3D points.
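The polynomial form in Eq. (2) can be sketched in a few lines. The snippet below is only an illustration: the 3-node path graph, the row-normalized shift operator, and the Haar-like coefficients $h_{0} = 1$, $h_{1} = -1$ are assumptions chosen for readability, not the configuration used in SGR. It evaluates $h(A)Φ$ by accumulating $h_{k} A^{k} Φ$ term by term, so no eigendecomposition is required.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matvec(a: Matrix, x: Sequence[float]) -> List[float]:
    """Plain matrix-vector product A @ x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def graph_filter(a: Matrix, coeffs: Sequence[float],
                 signal: Sequence[float]) -> List[float]:
    """Compute h(A) @ signal with h(A) = sum_k coeffs[k] * A^k, as in Eq. (2)."""
    out = [0.0] * len(signal)
    power = list(signal)  # holds A^k @ signal, starting with k = 0
    for h_k in coeffs:
        out = [o + h_k * p for o, p in zip(out, power)]
        power = matvec(a, power)
    return out


# Hypothetical toy graph: a 3-node path with a row-normalized adjacency matrix.
A = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]

# Haar-like high-pass filter h(A) = I - A (an assumption for illustration).
smooth = graph_filter(A, [1.0, -1.0], [2.0, 2.0, 2.0])
spiky = graph_filter(A, [1.0, -1.0], [0.0, 4.0, 0.0])
```

A constant signal is suppressed by this filter, while a signal that differs sharply from its neighbors produces a large response, which is the behavior a high-pass graph filter is expected to show on edges and contours.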

Given a distorted 3D point cloud P, from the above analysis, we resample it to extract graph keypoints by:

\hat{P} = h_{H} (P, Φ, K)_{Θ}, \quad Θ \ll M,

(4)

where $\hat{P} \in R^{Θ \times 6}$, $h_{H} (\cdot)$ is a high-pass graph filter of length $K$, and the graph signal $Φ$ is the normal vector. Moreover, $Θ$ represents the number of keypoints after the resampling operation. Similar to [49], we set $K$ and $Θ$ as $4$ and $M / 10{,}000$, respectively.

As shown in Fig. 4, we give examples of the keypoint resampling results for 3D point clouds distorted at various Octree-based compression levels. From (a) and (c) to (e), the distortion level increases, which can be easily observed from the appearances of the distorted point clouds. We can see that the keypoints generally represent the structures of point clouds. In addition, as the distortion level increases, the number of resampled points decreases. Therefore, keypoint resampling can be a promising quality indicator for subsequent local region construction and quality-aware feature extraction. This mainly benefits from the structure information carried by the normal attribute.

Local region construction. After the process of keypoint resampling, we exploit the generated keypoints to construct local regions. For each keypoint $\hat{p}_{θ}$ in $\hat{P}$, we cluster its neighbors based on the Euclidean distance of the corresponding geometry coordinates as follows:

\tilde{P} \subset \hat{P}, \quad \| \tilde{g} - \hat{g}_{θ} \|_{2}^{2} \leq α,

(5)

where $\tilde{P}$ denotes the group of clustered neighbors of the keypoint $\hat{p}_{θ}$, and $\tilde{g}$ and $\hat{g}_{θ}$ are the geometry components of $\tilde{P}$ and $\hat{p}_{θ}$, respectively. In addition, $α$ is the threshold used to cluster neighbors, which is set to $1 / 20$ of the minimum range among the $x$, $y$, and $z$ coordinates.
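As a rough sketch of the clustering rule in Eq. (5), the snippet below derives $α$ from the smallest coordinate range and keeps every candidate point whose squared Euclidean distance to a keypoint is at most $α$. The function names, the toy coordinates, and the choice of candidate set are hypothetical and only serve the illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def alpha_from_extent(points: List[Point], divisor: float = 20.0) -> float:
    """Threshold alpha: 1/20 of the smallest range among the x, y, z coordinates."""
    ranges = [max(p[d] for p in points) - min(p[d] for p in points)
              for d in range(3)]
    return min(ranges) / divisor


def local_region(points: List[Point], keypoint: Point,
                 alpha: float) -> List[Point]:
    """Keep the points whose squared distance to the keypoint is <= alpha (Eq. 5)."""
    return [p for p in points
            if sum((a - b) ** 2 for a, b in zip(p, keypoint)) <= alpha]


# Hypothetical toy geometry: three points near the origin and two far away.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
         (40.0, 20.0, 20.0), (20.0, 40.0, 0.5)]
alpha = alpha_from_extent(cloud)          # smallest range is 20 along z
region = local_region(cloud, cloud[0], alpha)
```

Because the threshold is tied to the extent of the cloud rather than to a fixed metric distance, the same rule can be applied to point clouds with very different coordinate scales.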

In Fig. 5, we show the structure guided local region construction of a distorted 3D point cloud after keypoint resampling. Here, we take Fig. 4 (a) as an example and show three of the constructed regions. It can be seen that these local regions have different shapes; they are subsequently used for quality-aware feature extraction.

III-B Geometry Density

Based on the constructed local region, we exploit the geometry components to derive geometry density features. To be specific, since the geometry information can be directly obtained from the local region, we separately compute the mean and standard deviation values of all three coordinates to serve as the geometry density features. For example, the mean and standard deviation of geometry information along the $x$-coordinate are calculated by:

μ_{g}^{x} = \frac{\sum_{i = 1}^{I} \tilde{g}_{i}^{x}}{I},

(6)

δ_{g}^{x} = \sqrt{\frac{\sum_{i = 1}^{I} (\tilde{g}_{i}^{x} - μ_{g}^{x})^{2}}{I}},

(7)

where $\tilde{g}_{i}^{x}$ and $I$ represent the $i$-th geometry value along the $x$-coordinate and the total number of graph points in the local region, respectively. Similarly, we compute the mean and standard deviation of the geometry information along the $y$-coordinate and the $z$-coordinate. With all three coordinates, we obtain 6-dimensional geometry density features by averaging the extracted features across the local regions.
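Eqs. (6) and (7), together with the averaging across regions, can be sketched as follows. The feature ordering (mean and standard deviation for $x$, then $y$, then $z$) and the two toy regions are assumptions made for the example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def region_stats(region: List[Point]) -> List[float]:
    """Per-axis mean (Eq. 6) and population standard deviation (Eq. 7)."""
    feats: List[float] = []
    for axis in range(3):
        vals = [p[axis] for p in region]
        mu = sum(vals) / len(vals)
        sigma = math.sqrt(sum((v - mu) ** 2 for v in vals) / len(vals))
        feats.extend([mu, sigma])
    return feats


def geometry_density(regions: List[List[Point]]) -> List[float]:
    """Average the 6 per-region statistics over all local regions."""
    per_region = [region_stats(r) for r in regions]
    return [sum(col) / len(per_region) for col in zip(*per_region)]


# Two hypothetical local regions with two points each.
regions = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(1.0, 1.0, 1.0), (1.0, 3.0, 1.0)],
]
features = geometry_density(regions)
```

Note that the standard deviation divides by $I$ rather than $I - 1$, matching Eq. (7).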

Since the standard deviation is derived from the mean value, we illustrate only the mean value changes for two 3D point clouds at different distortion degrees, as shown in Fig. 6. Here, the point clouds are distorted with light and heavy V-PCC compression artifacts. As can be seen from this figure, the histogram comparisons of the distorted 3D point clouds demonstrate that the mean values of all three coordinates can effectively reveal the quality variation. The effectiveness of these features is mainly due to their ability to reflect the density of 3D point clouds.

Fig. 6: Mean value changes for 3D point clouds at different distortion degrees. (a) Lightly compressed 3D point cloud, (b) heavily compressed 3D point cloud, (c) the mean value changes of (a,b).

III-C Color Naturalness

As for 3D point clouds, color information is a significant attribute. Thus, we further extract color naturalness features from the constructed local region. By taking the human perception into account, we first convert the RGB components to YUV space. Then, to normalize the distributions, we apply the zero-phase component analysis followed by local mean subtraction and divisive normalization [28] to the separate YUV channels of distorted local regions as:

\bar{ρ}_{z} (y, u, v) = \frac{\tilde{ρ}_{z} (y, u, v) - μ (y, u, v)}{δ (y, u, v) + C},

(8)

where $\tilde{ρ}_{z} (y, u, v)$ is the YUV information of local regions after the whitening filter of zero-phase component analysis [9, 52], and $\bar{ρ}_{z} (y, u, v)$ represents the final normalized YUV channels. $C = 1$ is a small constant that avoids instabilities when the denominator tends to $0$. The local mean and standard deviation of YUV channels can be computed by:

μ (y, u, v) = \sum_{l = - L}^{L} \sum_{t = - T}^{T} ω_{l, t} \tilde{ρ}_{z} (y, u, v),

(9)

δ (y, u, v) = \sqrt{\sum_{l = - L}^{L} \sum_{t = - T}^{T} ω_{l, t} (\tilde{ρ}_{z} (y, u, v) - μ (y, u, v))^{2}},

(10)

where $ω = \{ω_{l, t} \mid l = - L, \dots, L,\ t = - T, \dots, T\}$ indicates a circularly symmetric Gaussian weighting function. Motivated by [24], we set $L = T = 3$.

In Fig. 7, we show an example of statistical probability distribution for normalized coefficients, where the 3D point cloud is distorted by downscaling artifacts. On the basis of its Gaussian-like appearance, we use the generalized Gaussian distribution (GGD) [31] to capture the regularity of NSS, which is given by:

f (x; λ, ε^{2}) = \frac{λ}{2 η Γ (1 / λ)} e^{- {(\frac{| x |}{η})}^{λ}},

(11)

where

η = ε \sqrt{\frac{Γ (1 / λ)}{Γ (3 / λ)}},

(12)

and $Γ (\cdot)$ is the gamma function as follows:

Γ (a) = \int_{0}^{+ \infty} x^{a - 1} e^{- x} d x, a > 0.

(13)

In the GGD, $λ$ controls the shape of the statistical probability distribution and $ε^{2}$ reflects the variance. For each color channel of the input distorted point cloud, we estimate the two parameters $(λ, ε^{2})$ from the GGD fit by the moment matching-based approach [31]. Additionally, we use four scales: the original scale and its reductions by factors of 2, 4, and 8. In total, with three YUV channels, two parameters, and four scales, we have 24-dimensional color naturalness features, which are averaged over all the local regions.
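One common way to realize moment matching for a zero-mean GGD is to match the ratio $E[x^{2}] / (E[|x|])^{2}$ against $Γ(1/λ) Γ(3/λ) / Γ(2/λ)^{2}$ over a grid of candidate shapes. The sketch below follows that idea; the grid range $[0.2, 10]$ with step $0.001$ and the function name `fit_ggd` are assumptions for illustration and are not taken from the paper. The input is assumed to be non-degenerate (not all zeros).

```python
import math
from typing import List, Tuple


def fit_ggd(samples: List[float]) -> Tuple[float, float]:
    """Return (shape, variance) of a zero-mean GGD via moment matching."""
    n = len(samples)
    second = sum(x * x for x in samples) / n        # estimate of E[x^2]
    mean_abs = sum(abs(x) for x in samples) / n     # estimate of E[|x|]
    target = second / (mean_abs ** 2)

    best_shape, best_gap = 0.2, float("inf")
    for step in range(200, 10001):                  # shapes 0.200 ... 10.000
        shape = step / 1000.0
        ratio = (math.gamma(1.0 / shape) * math.gamma(3.0 / shape)
                 / math.gamma(2.0 / shape) ** 2)
        gap = abs(ratio - target)
        if gap < best_gap:
            best_shape, best_gap = shape, gap
    return best_shape, second


# Deterministic Laplacian-like samples built from exponential quantiles;
# a Laplacian corresponds to shape 1 and, at unit scale, variance 2.
n = 10000
magnitudes = [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]
samples = [m if i % 2 == 0 else -m for i, m in enumerate(magnitudes)]
shape, variance = fit_ggd(samples)
```

For these samples the estimated shape should land close to 1 and the variance close to 2, while Gaussian-distributed coefficients would give a shape near 2.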

Fig. 7: An example of statistical probability distribution for normalized coefficients. (a) A distorted 3D point cloud, (b) the corresponding statistical probability distribution for normalized coefficients of (a).

III-D Angular Consistency

Another important attribute is the normal, which indicates the orientation of points. Apart from their use in regional pre-processing, normal vectors have also been verified to be correlated with visual quality [37, 1, 39]. Therefore, we propose angular consistency features based on the normal information. Specifically, for each 3D point in the distorted point cloud, we first estimate its normal vector $\vec{χ}_{i}$ by fitting a local plane to multiple neighboring points. It should be noted that the number of neighboring points for normal vector estimation is validated in the experiments. Then, we adopt the $k$-nearest neighbor algorithm to choose the set of neighbors for each normal vector. Denoting the normal vector of each neighbor point by $\vec{χ}_{j}$, we compute the cosine similarity between $\vec{χ}_{i}$ and $\vec{χ}_{j}$ by:

Ω = \cos (ζ) = \frac{\vec{χ}_{i} \cdot \vec{χ}_{j}}{\| \vec{χ}_{i} \| \, \| \vec{χ}_{j} \|},

(14)

where $Ω \in [- 1, 1]$ and $ζ$ is the angle between the two normal vectors. Based on this, the inverse cosine of the absolute value of the obtained $Ω$ is calculated as:

ζ^{'} = arccos (| Ω |),

(15)

where $ζ^{'} \in [0, π / 2]$ and thus the ultimate angular similarity can be computed in the range of $[0, 1]$ as follows:

ψ = 1 - \frac{2 ζ^{'}}{π} .

(16)
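Eqs. (14) to (16) chain together into a single mapping from two normal vectors to a similarity in $[0, 1]$. The following sketch implements that chain; the function name and the clamping of $Ω$ against floating-point round-off are additions made for the example.

```python
import math
from typing import Sequence


def angular_similarity(n_i: Sequence[float], n_j: Sequence[float]) -> float:
    """Map two normal vectors to a similarity in [0, 1] via Eqs. (14)-(16)."""
    dot = sum(a * b for a, b in zip(n_i, n_j))
    norm = (math.sqrt(sum(a * a for a in n_i))
            * math.sqrt(sum(b * b for b in n_j)))
    omega = max(-1.0, min(1.0, dot / norm))   # Eq. (14), clamped for safety
    zeta = math.acos(abs(omega))              # Eq. (15), in [0, pi/2]
    return 1.0 - 2.0 * zeta / math.pi         # Eq. (16)


parallel = angular_similarity((0.0, 0.0, 1.0), (0.0, 0.0, 2.0))
flipped = angular_similarity((0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
orthogonal = angular_similarity((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
diagonal = angular_similarity((1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
```

Taking the absolute value of $Ω$ makes the measure insensitive to the sign ambiguity of estimated normals, so parallel and anti-parallel normals both score 1, while orthogonal normals score 0.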

Fig. 8: Demonstration of two scales for distorted 3D point clouds. (a) A distorted 3D point cloud, (b) the downsampled scale of (a).

From the above-mentioned computation, we obtain the angular similarity matrix for every distorted point cloud. On one hand, we conduct an average operation for the point number dimension. On the other hand, 5 kinds of statistics are calculated for the $k$ -nearest neighbor dimension, where the first two statistical features are the mean $μ$ and standard deviation $δ$ . Furthermore, the skewness and kurtosis are computed as:

S = E [{(\frac{ψ_{k} - μ}{δ})}^{3}],

(17)

K = E [{(\frac{ψ_{k} - μ}{δ})}^{4}],

(18)

where $E [\cdot]$ denotes the expectation operator, and $ψ_{k}$ is the angular similarity value. In addition to these moments, as suggested in [12], the entropy is also obtained by:

H = - \sum_{k} p (ψ_{k}) \log p (ψ_{k}),

(19)

where $p (ψ_{k})$ is the probability of $ψ_{k}$ . These statistics of angular similarity matrix can be regarded as the angular consistency of normal vectors.
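The five statistics (mean, standard deviation, skewness from Eq. (17), kurtosis from Eq. (18), and entropy from Eq. (19)) can be sketched for one list of angular similarity values as below. The probability $p(ψ_{k})$ is estimated here with a 10-bin histogram over $[0, 1]$ and the natural logarithm is used; both choices, along with the function name, are assumptions, since the text does not fix them.

```python
import math
from collections import Counter
from typing import List, Tuple


def consistency_stats(psi: List[float],
                      bins: int = 10) -> Tuple[float, float, float, float, float]:
    """Mean, std, skewness (Eq. 17), kurtosis (Eq. 18), entropy (Eq. 19)."""
    n = len(psi)
    mu = sum(psi) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in psi) / n)
    if sigma > 0.0:
        skew = sum(((v - mu) / sigma) ** 3 for v in psi) / n
        kurt = sum(((v - mu) / sigma) ** 4 for v in psi) / n
    else:
        skew = kurt = 0.0  # degenerate case: all values identical
    # Histogram estimate of p(psi_k); values equal to 1.0 fall in the last bin.
    counts = Counter(min(int(v * bins), bins - 1) for v in psi)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return mu, sigma, skew, kurt, entropy


# Hypothetical similarity values for one point's k nearest neighbors.
mu, sigma, skew, kurt, entropy = consistency_stats([0.0, 1.0, 0.0, 1.0])
```

In this toy input the values split evenly between the two extremes, so the distribution is symmetric (zero skewness) and the entropy equals $\log 2$.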

To extract the angular consistency features, since HVS perception is hierarchical, we employ multiple scales for better feature representations. Fig. 8 shows two scales of a PCL-compressed point cloud, where (a) and (b) have $132{,}518$ and $66{,}259$ points, respectively. With five statistics at each of the two scales, we obtain 10-dimensional angular consistency features, which are used for quantifying the visual distortions of 3D point clouds.

Databases	Reference Number	Distortion Number	Distortion Types	MOS Range
Waterloo	20	660	Downsampling, Gaussian noise, G-PCC, V-PCC	[1, 100]
M-PCCD	8	232	Octree-Lifting, Octree-RAHT, TriSoup-Lifting, TriSoup-RAHT, V-PCC	[1, 5]
SJTU	6	144	Octree-based compression, Color noise, Downscaling, Geometry Gaussian noise	[1, 10]
IRPC	6	54	PCL, G-PCC, V-PCC	[1, 5]

TABLE I: Detailed Information of Subjective Point Cloud Quality Databases.

Point Cloud Contents	Method Types	Method Names	SROCC	KROCC	PLCC
Banana	FR	$PSNR_{MSE,p2po}$ [20]	0.64	0.47	0.73
		$PSNR_{HF,p2po}$ [20]	0.07	0.02	0.35
		$PSNR_{MSE,p2pl}$ [37]	0.54	0.40	0.57
		$PSNR_{HF,p2pl}$ [37]	0.08	0.04	0.35
		$PSNR_{Y}$ [21]	0.62	0.47	0.72
		$AS_{Mean}$ [1]	0.39	0.30	0.40
		$AS_{RMS}$ [1]	0.34	0.26	0.39
		$AS_{MSE}$ [1]	0.34	0.26	0.39
		$SSIM_{projected}$ [41]	0.77	0.60	0.72
		$MS\text{-}SSIM_{projected}$ [43]	0.82	0.65	0.80
		$VIFP_{projected}$ [32]	0.82	0.64	0.81
		GraphSIM [46]	0.46	0.37	0.53
		PCQM [23]	0.74	0.56	0.64
		PointSSIM [2]	0.18	0.12	0.10
	RR	PCMRR [39]	0.47	0.38	0.42
	NR	PQA-Net [18]	0.52	0.39	0.53
	NR	Proposed SGR (SVR)	0.68	0.51	0.70
		Proposed SGR (RFR)	0.86	0.68	0.83
Cauliflower	FR	$PSNR_{MSE,p2po}$ [20]	0.49	0.35	0.55
		$PSNR_{HF,p2po}$ [20]	0.16	0.11	0.09
		$PSNR_{MSE,p2pl}$ [37]	0.30	0.23	0.40
		$PSNR_{HF,p2pl}$ [37]	0.25	0.18	0.33
		$PSNR_{Y}$ [21]	0.54	0.40	0.56
		$AS_{Mean}$ [1]	0.24	0.18	0.37
		$AS_{RMS}$ [1]	0.24	0.18	0.37
		$AS_{MSE}$ [1]	0.24	0.18	0.37
		$SSIM_{projected}$ [41]	0.77	0.61	0.80
		$MS\text{-}SSIM_{projected}$ [43]	0.74	0.58	0.76
		$VIFP_{projected}$ [32]	0.75	0.59	0.80
		GraphSIM [46]	0.59	0.43	0.61
		PCQM [23]	0.69	0.52	0.67
		PointSSIM [2]	0.21	0.19	0.36
	RR	PCMRR [39]	0.29	0.20	0.30
	NR	PQA-Net [18]	0.69	0.52	0.70
	NR	Proposed SGR (SVR)	0.68	0.50	0.70
		Proposed SGR (RFR)	0.70	0.52	0.72
Mushroom	FR	$PSNR_{MSE,p2po}$ [20]	0.63	0.48	0.66
		$PSNR_{HF,p2po}$ [20]	0.26	0.20	0.48
		$PSNR_{MSE,p2pl}$ [37]	0.47	0.37	0.55
		$PSNR_{HF,p2pl}$ [37]	0.22	0.16	0.45
		$PSNR_{Y}$ [21]	0.60	0.44	0.79
		$AS_{Mean}$ [1]	0.26	0.18	0.39
		$AS_{RMS}$ [1]	0.26	0.18	0.39
		$AS_{MSE}$ [1]	0.26	0.18	0.39
		$SSIM_{projected}$ [41]	0.73	0.57	0.85
		$MS\text{-}SSIM_{projected}$ [43]	0.89	0.73	0.88
		$VIFP_{projected}$ [32]	0.90	0.76	0.90
		GraphSIM [46]	0.65	0.50	0.68
		PCQM [23]	0.76	0.60	0.71
		PointSSIM [2]	0.33	0.26	0.36
	RR	PCMRR [39]	0.18	0.14	0.19
	NR	PQA-Net [18]	0.71	0.56	0.77
	NR	Proposed SGR (SVR)	0.80	0.66	0.85
		Proposed SGR (RFR)	0.82	0.65	0.86
Overall	FR	$PSNR_{MSE,p2po}$ [20]	0.48	0.33	0.50
		$PSNR_{HF,p2po}$ [20]	0.16	0.11	0.34
		$PSNR_{MSE,p2pl}$ [37]	0.36	0.25	0.37
		$PSNR_{HF,p2pl}$ [37]	0.20	0.14	0.27
		$PSNR_{Y}$ [21]	0.53	0.37	0.56
		$AS_{Mean}$ [1]	0.24	0.16	0.24
		$AS_{RMS}$ [1]	0.21	0.14	0.23
		$AS_{MSE}$ [1]	0.21	0.14	0.23
		$SSIM_{projected}$ [41]	0.58	0.42	0.59
		$MS\text{-}SSIM_{projected}$ [43]	0.60	0.44	0.62
		$VIFP_{projected}$ [32]	0.82	0.63	0.82
		GraphSIM [46]	0.46	0.32	0.47
		PCQM [23]	0.71	0.52	0.65
		PointSSIM [2]	0.18	0.14	0.14
	RR	PCMRR [39]	0.26	0.19	0.29
	NR	PQA-Net [18]	0.69	0.51	0.70
	NR	Proposed SGR (SVR)	0.73	0.55	0.74
		Proposed SGR (RFR)	0.73	0.55	0.75

TABLE II: Performance Evaluation of Different Point Cloud Contents.

Distortion Types	Method Types	Method Names	SROCC	KROCC	PLCC
Downsampling	FR	$PSNR_{MSE,p2po}$ [20]	0.86	0.67	0.96
		$PSNR_{HF,p2po}$ [20]	0.85	0.64	0.97
		$PSNR_{MSE,p2pl}$ [37]	0.76	0.52	0.85
		$PSNR_{HF,p2pl}$ [37]	0.84	0.61	0.97
		$PSNR_{Y}$ [21]	0.69	0.55	0.70
		$AS_{Mean}$ [1]	0.77	0.52	0.96
		$AS_{RMS}$ [1]	0.77	0.52	0.96
		$AS_{MSE}$ [1]	0.77	0.52	0.96
		$SSIM_{projected}$ [41]	0.87	0.73	0.91
		$MS\text{-}SSIM_{projected}$ [43]	0.83	0.64	0.89
		$VIFP_{projected}$ [32]	0.91	0.76	0.98
		GraphSIM [46]	0.79	0.64	0.97
		PCQM [23]	0.89	0.73	0.85
		PointSSIM [2]	0.91	0.76	0.97
	RR	PCMRR [39]	0.88	0.70	0.89
	NR	PQA-Net [18]	0.80	0.64	0.97
	NR	Proposed SGR (SVR)	0.82	0.61	0.95
		Proposed SGR (RFR)	0.84	0.67	0.97
Gaussian Noise	FR	$P S N R_{M S E, p 2 p o}$ [20]	0.63	0.44	0.67
		$P S N R_{H F, p 2 p o}$ [20]	0.63	0.45	0.67
		$P S N R_{M S E, p 2 p l}$ [37]	0.62	0.43	0.67
		$P S N R_{H F, p 2 p l}$ [37]	0.63	0.45	0.67
		$P S N R_{Y}$ [21]	0.84	0.68	0.90
		$A S_{M e a n}$ [1]	0.68	0.49	0.71
		$A S_{R M S}$ [1]	0.67	0.49	0.71
		$A S_{M S E}$ [1]	0.67	0.49	0.70
		$S S I M_{p r o j e c t e d}$ [41]	0.66	0.48	0.84
		$M S$ - $S S I M_{p r o j e c t e d}$ [43]	0.66	0.51	0.85
		$V I F P_{p r o j e c t e d}$ [32]	0.81	0.66	0.86
		$G r a p h S I M$ [46]	0.71	0.57	0.72
		$P C Q M$ [23]	0.87	0.70	0.89
		$P o i n t S S I M$ [2]	0.63	0.46	0.70
	RR	$P C M R R$ [39]	0.88	0.73	0.89
	NR	$P Q A$ - $N e t$ [18]	0.64	0.44	0.75
	NR	Proposed SGR (SVR)	0.91	0.74	0.94
		Proposed SGR (RFR)	0.85	0.68	0.92
G-PCC	FR	$P S N R_{M S E, p 2 p o}$ [20]	0.39	0.29	0.41
		$P S N R_{H F, p 2 p o}$ [20]	0.42	0.30	0.40
		$P S N R_{M S E, p 2 p l}$ [37]	0.42	0.30	0.42
		$P S N R_{H F, p 2 p l}$ [37]	0.34	0.23	0.34
		$P S N R_{Y}$ [21]	0.73	0.55	0.75
		$A S_{M e a n}$ [1]	0.03	0.08	0.13
		$A S_{R M S}$ [1]	0.03	0.03	0.13
		$A S_{M S E}$ [1]	0.03	0.03	0.13
		$S S I M_{p r o j e c t e d}$ [41]	0.63	0.48	0.63
		$M S$ - $S S I M_{p r o j e c t e d}$ [43]	0.66	0.51	0.67
		$V I F P_{p r o j e c t e d}$ [32]	0.83	0.65	0.84
		$G r a p h S I M$ [46]	0.61	0.44	0.70
		$P C Q M$ [23]	0.85	0.69	0.72
		$P o i n t S S I M$ [2]	0.77	0.60	0.78
	RR	$P C M R R$ [39]	0.19	0.13	0.20
	NR	$P Q A$ - $N e t$ [18]	0.67	0.51	0.68
	NR	Proposed SGR (SVR)	0.70	0.52	0.73
		Proposed SGR (RFR)	0.69	0.50	0.72
V-PCC	FR	$P S N R_{M S E, p 2 p o}$ [20]	0.42	0.30	0.48
		$P S N R_{H F, p 2 p o}$ [20]	0.27	0.19	0.35
		$P S N R_{M S E, p 2 p l}$ [37]	0.46	0.32	0.48
		$P S N R_{H F, p 2 p l}$ [37]	0.43	0.32	0.52
		$P S N R_{Y}$ [21]	0.32	0.25	0.48
		$A S_{M e a n}$ [1]	0.53	0.35	0.66
		$A S_{R M S}$ [1]	0.49	0.32	0.60
		$A S_{M S E}$ [1]	0.49	0.32	0.60
		$S S I M_{p r o j e c t e d}$ [41]	0.35	0.29	0.44
		$M S$ - $S S I M_{p r o j e c t e d}$ [43]	0.38	0.34	0.50
		$V I F P_{p r o j e c t e d}$ [32]	0.90	0.76	0.91
		$G r a p h S I M$ [46]	0.32	0.27	0.61
		$P C Q M$ [23]	0.59	0.45	0.59
		$P o i n t S S I M$ [2]	0.48	0.34	0.51
	RR	$P C M R R$ [39]	0.55	0.42	0.50
	NR	$P Q A$ - $N e t$ [18]	0.45	0.30	0.60
	NR	Proposed SGR (SVR)	0.75	0.57	0.79
		Proposed SGR (RFR)	0.72	0.53	0.76

TABLE III: Performance Evaluation of Various Distortion Types.

III-E Quality Regression

After the regional pre-processing and quality-aware feature extraction, a quality regression module is applied to map the extracted features onto final quality scores. Many regressors can be used for this purpose, among which we select the well-known support vector regression (SVR) and random forest regression (RFR) due to their capacity and popularity in quality assessment problems [14, 33, 11, 51]. In particular, they have also been widely used with spatial-domain and transform-domain NSS in the 2D image quality evaluation task [25, 29, 26, 13].

In our framework, given a test 3D point cloud, all features from the geometry density, color naturalness, and angular consistency aspects are extracted and concatenated into a single feature vector. Before predicting the quality of the test 3D point cloud, we train the regressor on labeled point clouds. With the trained regressor, we can then predict the perceptual quality of any input 3D point cloud. In our experiments, LIBSVM [6] is used to implement SVR with a radial basis function kernel, and RFR is implemented following [5, 10]. Additionally, we follow the common regressor parameter settings used in training mainstream quality assessment models.
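For reference, an SVR with a radial basis function kernel predicts a score from the concatenated feature vector in the textbook form sketched below. The notation is introduced here for illustration only and is not taken from the paper: $\mathbf{x}$ is the feature vector of the test point cloud, $\mathbf{x}_{i}$ are the $N$ training feature vectors, $\alpha_{i}$, $\alpha_{i}^{*}$ and $b$ are the learned coefficients and bias, and $\lambda > 0$ is the kernel width parameter. Sign conventions for the coefficients vary between references, and the SVR variant and hyperparameter values used by the authors are not stated.

```latex
\hat{q}(\mathbf{x}) = \sum_{i=1}^{N} \left( \alpha_{i} - \alpha_{i}^{*} \right) K(\mathbf{x}_{i}, \mathbf{x}) + b,
\qquad
K(\mathbf{x}_{i}, \mathbf{x}) = \exp\left( -\lambda \, \lVert \mathbf{x}_{i} - \mathbf{x} \rVert^{2} \right)
```

Only training samples with a nonzero coefficient difference (the support vectors) contribute to the sum.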

IV Validation of Proposed Method

IV-A Evaluation Databases and Criteria

We conduct experiments to validate the proposed SGR on four publicly available subjective point cloud quality databases, namely the Waterloo [35, 17], M-PCCD [4], SJTU [44], and IRPC [15] databases. Detailed information about these databases is given in TABLE I. Specifically, following the recent NR quality assessment work [18], we first use the Waterloo database to compare the proposed SGR with a variety of state-of-the-art FR, RR and NR quality assessment methods for 3D point clouds. This database contains 20 original reference point cloud contents with different geometry and texture complexities. Four distortion types are used to generate the corresponding quality-degraded point clouds: 60 downsampling distorted point clouds, 180 distorted point clouds with Gaussian noise, 320 G-PCC (T) compressed point clouds, and 180 V-PCC compressed point clouds. Moreover, the adopted SJTU database consists of 6 pristine point cloud contents, i.e., longdress, loot, redandblack, shiva, soldier, and statue. To produce distorted point clouds, Octree-based compression, color noise, downsampling, and geometry Gaussian noise are applied to the original point clouds. As for the M-PCCD and IRPC databases, the numbers of original point cloud contents are 8 and 6, respectively. By involving several distortion types, 232 and 54 distorted 3D point clouds are generated in these two databases, respectively.

Apart from the reference and distorted point clouds, each database also provides subjective quality labels in the form of mean opinion scores (MOS), which are rated by viewers under several display modes, such as the direct point cloud format and the converted video format. Details can be found in [35, 17, 4, 44, 15]. Note that only the 3D point clouds in the IRPC database have normal vectors.

Fig. 9: Examples of perceptual quality prediction. (a-b) Original reference point clouds with different contents, and the corresponding distortion degree increases from left to right. (c) SGR(SVR)=74.4780 / SGR(RFR)=61.9948, (d) SGR(SVR)=73.5027 / SGR(RFR)=68.9962, (e) SGR(SVR)=49.7316 / SGR(RFR)=46.3371, (f) SGR(SVR)=43.2950 / SGR(RFR)=44.6210, (g) SGR(SVR)=38.7809 / SGR(RFR)=36.8399, (h) SGR(SVR)=36.6015 / SGR(RFR)=35.2080. Higher MOS or SGR represents better visual quality for point clouds.

To validate the proposed SGR and compare it with other state-of-the-art methods, we adopt three commonly used evaluation criteria in the IQA field: the Spearman Rank-Order Correlation Coefficient (SROCC), the Kendall Rank-Order Correlation Coefficient (KROCC), and the Pearson Linear Correlation Coefficient (PLCC). Among these criteria, SROCC is usually used to measure prediction monotonicity, while KROCC evaluates the ordinal association between two measured quantities. Moreover, PLCC is used to evaluate prediction accuracy. Higher correlation coefficients indicate better performance for objective quality models.
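The three criteria can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the evaluation code used in the paper: the function names and the sample scores are hypothetical, SROCC is computed as the Pearson correlation of average ranks, and KROCC is computed as Kendall's tau-a (no tie correction), which may differ from the variant used by the authors when ties are present.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def plcc(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(x), average_ranks(y))


def krocc(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical predicted scores and MOS values, for illustration only.
predicted = [1.2, 2.3, 3.1, 4.8, 5.0, 6.7]
mos = [1.0, 2.0, 3.0, 4.0, 6.0, 5.0]
print(srocc(predicted, mos), krocc(predicted, mos), plcc(predicted, mos))
```

In this toy example only the last two items are ranked in the opposite order, so both rank correlations are high but below 1.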

Besides, before computing the PLCC for different objective quality assessment approaches, a five-parameter logistic nonlinear fitting function [27] is adopted to map the predicted quality scores into a common scale as:

$$g(\gamma) = \beta_{1} \left( \frac{1}{2} - \frac{1}{1 + e^{\beta_{2} (\gamma - \beta_{3})}} \right) + \beta_{4} \gamma + \beta_{5}, \tag{20}$$

where $\beta_{1}, \ldots, \beta_{5}$ are the five parameters to be fitted, $\gamma$ represents the raw objective score produced by an objective quality model, and $g(\gamma)$ denotes the regressed score after the nonlinear mapping.
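As a small illustration of Eq. (20), the mapping can be written as a plain Python function. The function name and the parameter values below are hypothetical and chosen only to show the shape of the curve. In practice the five parameters are usually estimated by nonlinear least squares against the MOS values, which is not shown here.

```python
import math


def logistic5(score, b1, b2, b3, b4, b5):
    """Five-parameter logistic mapping g(score) of Eq. (20)."""
    return b1 * (0.5 - 1.0 / (1.0 + math.exp(b2 * (score - b3)))) + b4 * score + b5


# Hypothetical parameter values, chosen only to illustrate the shape of g.
params = dict(b1=10.0, b2=0.5, b3=3.0, b4=1.0, b5=2.0)

# At score == b3 the logistic term vanishes, so g reduces to b4 * b3 + b5.
print(logistic5(3.0, **params))  # 5.0
```

The logistic term saturates for scores far from $\beta_{3}$, while the linear term $\beta_{4} \gamma + \beta_{5}$ keeps the mapping from flattening completely.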

IV-B Performance of Objective Models

To validate the performance of the proposed SGR, we compare it with state-of-the-art FR, RR and NR models on the largest subjective point cloud quality database, i.e., the Waterloo database [35, 17]. The comparison includes 14 FR methods, covering both point-based and projection-based approaches. Among them, the point-based metrics are $PSNR_{MSE,p2po}$, $PSNR_{HF,p2po}$, $PSNR_{MSE,p2pl}$, $PSNR_{HF,p2pl}$, $PSNR_{Y}$, $AS_{Mean}$, $AS_{RMS}$, $AS_{MSE}$, $GraphSIM$, $PCQM$, and $PointSSIM$. The projection-based FR metrics are $SSIM_{projected}$, $MS$-$SSIM_{projected}$, and $VIFP_{projected}$. Representative RR and NR methods are also taken into consideration, such as $PCMRR$ and $PQA$-$Net$.

Following [18], we conduct the comparison for various visual contents and distortion types. The evaluation results for different point cloud contents are reported in TABLE II. To save space, we show several contents and the overall performance. We can observe that the proposed SGR achieves competitive results for all the tested visual contents. Besides, with both the SVR and RFR regressors, our framework delivers promising mapping results, demonstrating that the proposed SGR does not rely on a specific regressor. More importantly, the proposed SGR is superior to $PQA$-$Net$ in overall performance. It should be noted that $PQA$-$Net$ is a deep learning-based NR quality assessment model for 3D point clouds. In addition, beyond the results for individual visual contents, we show the performance comparison for various distortion types in TABLE III. Again, the proposed SGR attains a higher SROCC than $PQA$-$Net$ for all distortion types, especially for Gaussian noise and V-PCC compression. Therefore, both tables demonstrate the superiority of our SGR algorithm.

IV-C Visualization Comparisons

In addition to quantitative performance results, we compare the visualized 3D point clouds with the predicted scores by our proposed SGR. Both SVR and RFR predicted quality results are computed.

We show some examples of perceptual quality prediction in Fig. 9. Each row represents one visual content. The original reference contents are banana and mushroom, which contain 807,184 and 1,144,603 points, respectively. By introducing Gaussian noise, each reference point cloud is distorted to various degrees. From the trends of the SGR (SVR) and SGR (RFR) values, we can see that the proposed SGR method successfully distinguishes distorted point clouds of different qualities. Therefore, the proposed model can effectively evaluate the perceptual quality of distorted 3D point clouds. In addition, for point clouds with relatively lower quality, i.e., figures (e-f) and (g-h), the prediction results from the two regressors are more consistent. This may be because it is harder to predict the perceptual quality of high-quality point clouds than that of low-quality ones.

IV-D Validity on Other Subjective Databases

Apart from the largest Waterloo database, we further test the performance of the proposed SGR on other subjective databases, including M-PCCD [4], SJTU [44], and IRPC [15]. Note that only the distorted 3D point clouds with individual/single distortion types are tested.

In the experiments, each database is randomly divided into training and testing sets with an $80\%$-$20\%$ split. TABLE IV provides the comparison results, where the state-of-the-art NR model $PQA$-$Net$ is compared with the proposed SGR. From this table, we can see that the proposed SGR performs better than $PQA$-$Net$ on all three databases, which demonstrates the general effectiveness of our SGR.
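A random 80%-20% partition of a database can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the fixed seed, and the sample count of 100 are hypothetical, and the split is done over individual distorted point clouds, since the text does not specify whether the authors split by sample or by reference content.

```python
import random


def split_80_20(num_samples, seed=0):
    """Randomly partition sample indices into 80% training and 20% testing."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    num_train = int(round(0.8 * num_samples))
    return indices[:num_train], indices[num_train:]


# Hypothetical database size, for illustration only.
train_idx, test_idx = split_80_20(100, seed=0)
print(len(train_idx), len(test_idx))  # 80 20
```

The regressor would then be trained on the features and MOS values indexed by `train_idx` and evaluated on those indexed by `test_idx`.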

Databases	M-PCCD		SJTU		IRPC
Methods	SROCC	PLCC	SROCC	PLCC	SROCC	PLCC
$P Q A$ - $N e t$	0.60	0.65	0.82	0.85	0.40	0.58
Proposed SGR (SVR)	0.80	0.87	0.89	0.93	0.67	0.89
Proposed SGR (RFR)	0.91	0.92	0.84	0.89	0.68	0.89

TABLE IV: Performance Comparison on Other Subjective Databases.

Methods	SROCC	KROCC	PLCC
Geometry Density	0.58	0.41	0.70
Color Naturalness	0.54	0.40	0.72
Geometry Density + Color Naturalness	0.78	0.61	0.85
Angular Consistency	0.72	0.55	0.84
Proposed SGR (SVR)	0.89	0.72	0.93

TABLE V: Performance Results of Individual Component on SJTU database.

Methods	SROCC	KROCC	PLCC
Color	0.85	0.69	0.91
Geometry	0.86	0.69	0.91
Proposed SGR (SVR)	0.89	0.72	0.93

TABLE VI: Performance Results of Keypoint Resampling on SJTU database.

Fig. 10: Keypoint resampling comparisons. (a) An example of distorted point clouds, (b-d) the corresponding keypoints resampled from (a) by using geometry, color, and normal vectors, respectively.

Numbers	SROCC	KROCC	PLCC
6	0.79	0.61	0.85
11	0.87	0.71	0.91
12	0.89	0.72	0.93
13	0.87	0.70	0.91
15	0.85	0.68	0.90
18	0.83	0.65	0.89

TABLE VII: Performance Results of Normal Number on SJTU database.

IV-E Performance of Individual Component

The proposed SGR is composed of three groups of quality-aware features: geometry density, color naturalness, and angular consistency. It is therefore interesting to validate the performance of each individual feature component. TABLE V shows the results of this ablation study. It can be seen that combining the two components from regional pre-processing, i.e., geometry density and color naturalness, boosts the performance over either component alone. Moreover, further integrating angular consistency leads to the best results of the proposed SGR framework.

Additionally, since the keypoint resampling is based on the unique normal vectors, we also test resampling based on other information, namely the color attributes and the geometry information. The performance results are listed in TABLE VI, where the proposed method outperforms the others. One possible explanation is that the normal vectors can represent the structures of point clouds. We also compare the keypoint resampling results in Fig. 10, taking a point cloud distorted by severe Gaussian noise as an example. Compared with the keypoints resampled from geometry or color signals, the keypoints obtained with the unique normal vectors are sampled more uniformly and cover a larger range of visual content. Besides, since the quality-aware features extracted after regional pre-processing are based on geometry and color aspects, our method could achieve the disentanglement of the known information to some extent. All these results validate the superiority of the proposed structure guided resampling. Furthermore, we explore the number of normals used in our experiments. TABLE VII shows that using 12 normals delivers the best performance, and that the results degrade with either more or fewer normals. Therefore, we set this parameter to 12 in the proposed method.

V Conclusion

In this paper, we develop a general and efficient blind/no-reference quality assessment method for 3D point clouds based on structure guided resampling. Inspired by the human perception of typical visual distortions, the proposed SGR model operates on distorted 3D geometry and the associated attribute information without any access to the reference. The framework consists of regional pre-processing, quality-related feature extraction, and quality regression. In the experiments, we demonstrate that the proposed SGR algorithm correlates well with human quality ratings on several subject-rated point cloud quality databases. Further, we also show the effectiveness of each constituent component of the proposed method.

In the future, we plan to explore more powerful distortion-aware features to improve the quality assessment model. Moreover, applying the proposed method to the automatic optimization of existing 3D point cloud processing algorithms could be another direction.

References

[1] E. Alexiou and T. Ebrahimi (2018) Point cloud quality assessment metric based on angular similarity. In International Conference on Multimedia and Expo (ICME), pp. 1–6.
[2] E. Alexiou and T. Ebrahimi (2020) Towards a point cloud structural similarity metric. In International Conference on Multimedia & Expo Workshops (ICMEW), pp. 1–6.
[3] E. Alexiou, E. Upenik, and T. Ebrahimi (2017) Towards subjective quality assessment of point cloud imaging in augmented reality. In 2017 IEEE 19th International Workshop on Multimedia Signal Processing (MMSP), pp. 1–6.
[4] E. Alexiou, I. Viola, T. M. Borges, T. A. Fonseca, R. L. De Queiroz, and T. Ebrahimi (2019) A comprehensive study of the rate-distortion performance in MPEG point cloud compression. APSIPA Transactions on Signal and Information Processing 8.
[5] L. Breiman (2001) Random forests. Machine learning 45 (1), pp. 5–32.
[6] C. Chang and C. Lin (2011) LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST) 2 (3), pp. 1–27.
[7] J. Chen, Z. Kira, and Y. K. Cho (2019) Deep learning approach to point cloud scene understanding for automated scan to 3D reconstruction. J. Comput. Civ. Eng 33 (4), pp. 04019027.
[8] S. Chen, D. Tian, C. Feng, A. Vetro, and J. Kovačević (2017) Fast resampling of three-dimensional point clouds via graphs. IEEE Transactions on Signal Processing 66 (3), pp. 666–681.
[9] Z. Chen, W. Zhou, and W. Li (2017) Blind stereoscopic video quality assessment: From depth perception to overall experience. IEEE Transactions on Image Processing 27 (2), pp. 721–734.
[10] A. Criminisi, J. Shotton, and E. Konukoglu (2011) Decision forests for classification, regression, density estimation, manifold learning and semi-supervised learning [internet]. Microsoft Research.
[11] S. V. R. Dendi and S. S. Channappayya (2020) No-reference video quality assessment using natural spatiotemporal scene statistics. IEEE Transactions on Image Processing 29, pp. 5612–5624.
[12] Y. Fang, K. Ma, Z. Wang, W. Lin, Z. Fang, and G. Zhai (2014) No-reference quality assessment of contrast-distorted images based on natural scene statistics. IEEE Signal Processing Letters 22 (7), pp. 838–842.
[13] P. G. Freitas, W. Y. Akamine, and M. C. Farias (2018) No-reference image quality assessment using orthogonal color planes patterns. IEEE Transactions on Multimedia 20 (12), pp. 3353–3360.
[14] K. Gu, G. Zhai, X. Yang, and W. Zhang (2014) Using free energy principle for blind image quality assessment. IEEE Transactions on Multimedia 17 (1), pp. 50–63.
[15] A. Javaheri, C. Brites, F. Pereira, and J. Ascenso (2020) Point cloud rendering after coding: Impacts on subjective and objective quality. IEEE Transactions on Multimedia 23, pp. 4049–4064.
[16] G. Lavoué, E. D. Gelasca, F. Dupont, A. Baskurt, and T. Ebrahimi (2006) Perceptually driven 3d distance metrics with application to watermarking. In Applications of Digital Image Processing XXIX, Vol. 6312, pp. 150–161.
[17] Q. Liu, H. Su, Z. Duanmu, W. Liu, and Z. Wang (2022) Perceptual quality assessment of colored 3D point clouds. IEEE Transactions on Visualization and Computer Graphics.
[18] Q. Liu, H. Yuan, H. Su, H. Liu, Y. Wang, H. Yang, and J. Hou (2021) PQA-Net: Deep no reference point cloud quality assessment via multi-view projection. IEEE Transactions on Circuits and Systems for Video Technology 31 (12), pp. 4645–4660.
[19] Y. Liu, Q. Yang, Y. Xu, and L. Yang (2020) Point cloud quality assessment: Dataset construction and learning-based no-reference approach. arXiv preprint arXiv:2012.11895.
[20] R. Mekuria, Z. Li, C. Tulvan, and P. Chou (2016) Evaluation criteria for PCC (point cloud compression).
[21] R. Mekuria, S. Laserre, and C. Tulvan (2017) Performance assessment of point cloud compression. In Visual Communications and Image Processing (VCIP), pp. 1–4.
[22] G. Meynet, J. Digne, and G. Lavoué (2019) PC-MSDM: A quality metric for 3D point clouds. In 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), pp. 1–3.
[23] G. Meynet, Y. Nehmé, J. Digne, and G. Lavoué (2020) PCQM: A full-reference quality metric for colored 3D point clouds. In International Conference on Quality of Multimedia Experience (QoMEX), pp. 1–6.
[24] A. Mittal, A. K. Moorthy, and A. C. Bovik (2012) No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing 21 (12), pp. 4695–4708.
[25] A. K. Moorthy and A. C. Bovik (2011) Blind image quality assessment: From natural scene statistics to perceptual quality. IEEE Transactions on Image Processing 20 (12), pp. 3350–3364.
[26] S. Pei and L. Chen (2015) Image quality assessment using human visual DOG model fused with random forest. IEEE Transactions on Image Processing 24 (11), pp. 3282–3292.
[27] A. M. Rohaly, P. J. Corriveau, J. M. Libert, A. A. Webster, V. Baroncini, J. Beerends, J. Blin, L. Contin, T. Hamada, D. Harrison, et al. (2000) Video quality experts group: current results and future directions. In Visual Communications and Image Processing 2000, Vol. 4067, pp. 742–753.
[28] D. L. Ruderman (1994) The statistics of natural images. Network: Computation in Neural Systems 5 (4), pp. 517.
[29] M. A. Saad, A. C. Bovik, and C. Charrier (2012) Blind image quality assessment: A natural scene statistics approach in the DCT domain. IEEE Transactions on Image Processing 21 (8), pp. 3339–3352.
[30] S. Schwarz, M. Preda, V. Baroncini, M. Budagavi, P. Cesar, P. A. Chou, R. A. Cohen, M. Krivokuća, S. Lasserre, Z. Li, et al. (2018) Emerging MPEG standards for point cloud compression. IEEE Journal on Emerging and Selected Topics in Circuits and Systems 9 (1), pp. 133–148.
[31] K. Sharifi and A. Leon-Garcia (1995) Estimation of shape parameter for generalized gaussian distributions in subband decompositions of video. IEEE Transactions on Circuits and Systems for Video Technology 5 (1), pp. 52–56.
[32] H. R. Sheikh and A. C. Bovik (2006) Image information and visual quality. IEEE Transactions on Image Processing 15 (2), pp. 430–444.
[33] L. Shi, W. Zhou, Z. Chen, and J. Zhang (2019) No-reference light field image quality assessment based on spatial-angular measurement. IEEE Transactions on Circuits and Systems for Video Technology 30 (11), pp. 4114–4128.
[34] A. Srivastava, A. B. Lee, E. P. Simoncelli, and S. Zhu (2003) On advances in statistical modeling of natural images. Journal of Mathematical Imaging and Vision 18 (1), pp. 17–33.
[35] H. Su, Z. Duanmu, W. Liu, Q. Liu, and Z. Wang (2019) Perceptual quality assessment of 3D point clouds. In 2019 IEEE International Conference on Image Processing (ICIP), pp. 3182–3186.
[36] S. Sun, T. Yu, J. Xu, W. Zhou, and Z. Chen (2022) GraphIQA: Learning distortion graph representations for blind image quality assessment. IEEE Transactions on Multimedia.
[37] D. Tian, H. Ochimizu, C. Feng, R. Cohen, and A. Vetro (2017) Geometric distortion metrics for point cloud compression. In International Conference on Image Processing (ICIP), pp. 3460–3464.
[38] J. van der Hooft, T. Wauters, F. De Turck, C. Timmerer, and H. Hellwagner (2019) Towards 6dof http adaptive streaming through point cloud compression. In Proceedings of the 27th ACM International Conference on Multimedia, pp. 2405–2413.
[39] I. Viola and P. Cesar (2020) A reduced reference metric for visual quality evaluation of point cloud contents. IEEE Signal Processing Letters 27, pp. 1660–1664.
[40] Y. Wang, S. Zhang, B. Wan, W. He, and X. Bai (2018) Point cloud and visual feature-based tracking method for an augmented reality-aided mechanical assembly system. The International Journal of Advanced Manufacturing Technology 99 (9), pp. 2341–2352.
[41] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli (2004) Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13 (4), pp. 600–612.
[42] Z. Wang and Q. Li (2010) Information content weighting for perceptual image quality assessment. IEEE Transactions on Image Processing 20 (5), pp. 1185–1198.
[43] Z. Wang, E. P. Simoncelli, and A. C. Bovik (2003) Multiscale structural similarity for image quality assessment. In Asilomar Conference on Signals, Systems & Computers, Vol. 2, pp. 1398–1402.
[44] Q. Yang, H. Chen, Z. Ma, Y. Xu, R. Tang, and J. Sun (2020) Predicting the perceptual quality of point cloud: A 3D-to-2D projection-based exploration. IEEE Transactions on Multimedia 23, pp. 3877–3891.
[45] Q. Yang, Y. Liu, S. Chen, Y. Xu, and J. Sun (2022) No-reference point cloud quality assessment via domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 21179–21188.
[46] Q. Yang, Z. Ma, Y. Xu, Z. Li, and J. Sun (2020) Inferring point cloud quality via graph similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[47] X. Yue, B. Wu, S. A. Seshia, K. Keutzer, and A. L. Sangiovanni-Vincentelli (2018) A lidar point cloud generator: from a virtual world to autonomous driving. In Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, pp. 458–464.
[48] J. Zhang, W. Huang, X. Zhu, and J. Hwang (2014) A subjective quality evaluation for 3D point cloud models. In 2014 International Conference on Audio, Language and Image Processing, pp. 827–831.
[49] Y. Zhang, Q. Yang, and Y. Xu (2021) MS-GraphSIM: Inferring point cloud quality via multiscale graph similarity. In Proceedings of the 29th ACM International Conference on Multimedia, pp. 1230–1238.
[50] Z. Zhang, W. Sun, X. Min, T. Wang, W. Lu, and G. Zhai (2022) No-reference quality assessment for 3D colored point cloud and mesh models. IEEE Transactions on Circuits and Systems for Video Technology.
[51] W. Zhou, L. Shi, Z. Chen, and J. Zhang (2020) Tensor oriented no-reference light field image quality assessment. IEEE Transactions on Image Processing 29, pp. 4070–4084.
[52] W. Zhou, J. Xu, Q. Jiang, and Z. Chen (2021) No-reference quality assessment for 360-degree images by analysis of multifrequency information and local-global naturalness. IEEE Transactions on Circuits and Systems for Video Technology 32 (4), pp. 1778–1791.