Faithful visualization and dimensionality reduction on graphics processing unit

  • Safa Najim

Abstract

Information visualization is the process of transforming data, information and knowledge into a geometric representation in order to reveal otherwise unseen information. Dimensionality reduction (DR) is one of the strategies used to visualize high-dimensional data sets, by projecting them onto a low-dimensional space where they can be visualized directly. The problem with DR is that the straightforward relationship between the original high-dimensional data and the low-dimensional space is lost, so the colours of the visualization carry no meaning. A new nonlinear DR method, called faithful stochastic proximity embedding (FSPE), is proposed in this thesis to visualize more complex data sets. The proposed method relies on the low-dimensional space rather than the high-dimensional data to overcome the main shortcomings of DR: it removes false neighbour points and preserves the neighbourhood relations to the true neighbours. The visualization produced by the proposed method displays faithful, useful and meaningful colours, in which the objects of the image can be easily distinguished. The experiments indicate that FSPE is more accurate than many DR methods because it prevents false neighbourhood errors from occurring in the results as far as possible. In addition, we demonstrate that FSPE plays an important role in enhancing the low-dimensional spaces produced by other DR methods. Choosing the least efficient points to update the rest of the points helped to improve the visualization. The results show that the proposed method plays an important role in increasing the trustworthiness of the visualization by recovering most of the local neighbourhood points that were missed during the projection process. The sequential dimensionality reduction (SDR) method is the second method proposed in this thesis. 
It redefines the problem of DR as a sequence of multiple DR problems, each of which reduces the dimensionality by a small amount. It preserves the relations among neighbouring points in the low-dimensional space. The results show the accuracy of the proposed SDR, which leads to a better visualization with fewer false colours than the direct projection of the DR method; these results are confirmed by comparing our method with 21 other methods. Although many evaluation metrics exist, our proposed point-wise correlation metric is the better choice. In this metric, we evaluate the efficiency of each point in the visualization to generate a grey-scale efficiency image. This type of image gives more detail than a single summary value: the user can recognize the location of both the false and the true points. We compared the results of our proposed methods (FSPE and SDR) with those of many other DR methods in four scenarios: (1) unfolding a curved cylinder data set; (2) projecting a human face data set into two dimensions; (3) classifying connected networks; and (4) visualizing a remote sensing imagery data set. The results show that our methods produce good visualizations by preserving the corresponding colour distances between the visualization and the original data sets. The proposed methods are implemented on the graphics processing unit (GPU) to visualize different data sets. The benefit of a parallel implementation is to obtain the results in as short a time as possible. The results show that the compute unified device architecture (CUDA) implementations of FSPE and SDR are faster than their sequential counterparts on the central processing unit (CPU) in performing floating-point operations, especially for large data sets. The GPU is also better suited to implementing the evaluation metrics because they involve a large amount of computation. 
We show that achieving this massive speed-up requires the computation to have a parallel structure suitable for running on a GPU.
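
To make the idea of a per-point evaluation concrete, the following is a minimal sketch only, not the thesis's definition of the point-wise correlation metric, which the abstract does not state. It assumes that the score of a point is the Pearson correlation between its distances to all other points in the original space and in the projected space, and that scores are mapped to grey levels by clamping to [0, 1] and scaling to 0..255. The function names (`pointwise_correlation`, `to_grey`) are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(u, v):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((x - mean_u) * (y - mean_v) for x, y in zip(u, v))
    norm_u = math.sqrt(sum((x - mean_u) ** 2 for x in u))
    norm_v = math.sqrt(sum((y - mean_v) ** 2 for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return cov / (norm_u * norm_v)


def pointwise_correlation(high, low):
    """One score per point (ASSUMED definition, see lead-in).

    high: list of points in the original space
    low:  list of the same points after projection, in the same order
    """
    n = len(high)
    scores = []
    for i in range(n):
        # Distances from point i to every other point, in both spaces.
        d_high = [euclidean(high[i], high[j]) for j in range(n) if j != i]
        d_low = [euclidean(low[i], low[j]) for j in range(n) if j != i]
        scores.append(pearson(d_high, d_low))
    return scores


def to_grey(scores):
    """Map scores to grey levels 0..255 (ASSUMED mapping: clamp to [0, 1])."""
    return [round(255 * max(0.0, min(1.0, s))) for s in scores]
```

Under these assumptions, a projection that preserves every pairwise distance gives scores close to 1 (bright pixels), while points whose neighbourhoods were distorted receive lower scores (darker pixels). Each point's score is independent of the others, which is the kind of parallel structure that maps naturally onto one GPU thread per point.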

Details

Original language: English
Awarding institution
Supervisor / Supervisors / Advisor
  • Ik Soo Lim (Supervisor)
Award date: 1 May 2014