An optimized unsupervised manifold learning algorithm for manycore architectures

Research output: Contribution to journal › Article › peer-review

Standard

An optimized unsupervised manifold learning algorithm for manycore architectures. / Baldassin, Alexandro; Weng, Ying; Pedronette, Daniel Carlos Guimarães et al.
In: Information Sciences, Vol. 496, 09.2019, p. 410-430.

Harvard

Baldassin, A, Weng, Y, Pedronette, DCG & Almeida, J 2019, 'An optimized unsupervised manifold learning algorithm for manycore architectures', Information Sciences, vol. 496, pp. 410-430. https://doi.org/10.1016/j.ins.2018.06.023

APA

Baldassin, A., Weng, Y., Pedronette, D. C. G., & Almeida, J. (2019). An optimized unsupervised manifold learning algorithm for manycore architectures. Information Sciences, 496, 410-430. https://doi.org/10.1016/j.ins.2018.06.023

CBE

Baldassin A, Weng Y, Pedronette DCG, Almeida J. 2019. An optimized unsupervised manifold learning algorithm for manycore architectures. Information Sciences. 496:410-430. https://doi.org/10.1016/j.ins.2018.06.023

Vancouver

Baldassin A, Weng Y, Pedronette DCG, Almeida J. An optimized unsupervised manifold learning algorithm for manycore architectures. Information Sciences. 2019 Sept;496:410-430. Epub 2018 Jun 21. doi: 10.1016/j.ins.2018.06.023

Author

Baldassin, Alexandro ; Weng, Ying ; Pedronette, Daniel Carlos Guimarães et al. / An optimized unsupervised manifold learning algorithm for manycore architectures. In: Information Sciences. 2019 ; Vol. 496. pp. 410-430.

RIS

TY - JOUR

T1 - An optimized unsupervised manifold learning algorithm for manycore architectures

AU - Baldassin, Alexandro

AU - Weng, Ying

AU - Pedronette, Daniel Carlos Guimarães

AU - Almeida, Jurandy

PY - 2019/9

Y1 - 2019/9

N2 - Multimedia data, such as images and videos, has become very popular in people’s daily life as a result of the widespread use of mobile devices. The ever-increasing amount of such data, along with the necessity for real-time retrieval, has led to the development of new methods that can process it in a timely fashion with acceptable accuracy. In this paper, we study the performance of ReckNN, an unsupervised manifold learning algorithm based on the reciprocal neighbourhood and the authority of ranked lists. Most of the related work in this field does not fully investigate optimization strategies, an aspect that is becoming more important with the high availability of manycore machines. In order to address that issue, we fully investigate optimization opportunities in this article and make the following three main contributions. Firstly, we develop an efficient and scalable method for storing and accessing the distances between objects (e.g., videos or images) based on dictionaries. Secondly, we employ memoization to speed up the computation of authority scores, leading to a significant performance gain even on single-core architectures. Lastly, we devise and implement several parallelization strategies and show that they are scalable on a 72-core Intel machine. The experimental results with MPEG-7, Corel5k and MediaEval benchmarks show that the optimized ReckNN delivers both efficiency and scalability, highlighting the importance of the proposed optimizations for manycore machines.

KW - Multimedia retrieval

KW - Unsupervised learning

KW - Efficiency

KW - Scalability

KW - Parallelism

U2 - 10.1016/j.ins.2018.06.023

DO - 10.1016/j.ins.2018.06.023

M3 - Article

VL - 496

SP - 410

EP - 430

JO - Information Sciences

JF - Information Sciences

SN - 0020-0255

ER -