Efficient exact K-nearest neighbor graph construction for billion-scale datasets using GPUs with tensor cores | Proceedings of the 36th ACM International Conference on Supercomputing (2024)

Advanced Search
Browse
About
- Sign in
- Register

Advanced Search

ics

research-article

Authors:
Zhuoran Ji The University of Hong Kong, Hong Kong, China

The University of Hong Kong, Hong Kong, China
View Profile

,
Cho-Li Wang The University of Hong Kong, Hong Kong, China

The University of Hong Kong, Hong Kong, China
View Profile

ICS '22: Proceedings of the 36th ACM International Conference on SupercomputingJune 2022Article No.: 10Pages 1–12https://doi.org/10.1145/3524059.3532368

Published:28 June 2022Publication History

2citation
404
Downloads

Metrics

Total Citations2Total Downloads404

Last 12 Months197

Last 6 weeks20

Get Citation Alerts
New Citation Alert added!
This alert has been successfully added and will be sent to:
You will be notified whenever a record that you have chosen has been cited.
To manage your alert preferences, click on the button below.
Manage my Alerts
See Also
Machine Learning Basics with the K-Nearest Neighbors Algorithm K-Nearest Neighbor(KNN) Algorithm - GeeksforGeeks The k-Nearest Neighbors (kNN) Algorithm in Python – Real Python KNN Algorithm – K-Nearest Neighbors Classifiers and Model Example
New Citation Alert!
Please log in to your account
Publisher Site

Get Access

ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing

Efficient exact K-nearest neighbor graph construction for billion-scale datasets using GPUs with tensor cores

Pages 1–12

PreviousChapterNextChapter

ABSTRACT

Approximate nearest neighbor search plays a fundamental role in many areas, and the k-nearest neighbor graph (KNNG) becomes a promising solution, especially in high-dimensional space. The advantages of KNNG come at the expense of high construction time, which is in quadratic time complexity in the number of points. Many GPUs have adopted specialized hardware units for matrix multiplication, providing an even higher arithmetic throughput. This paper presents flyKNNG, a GPU KNNG construction algorithm for billion-scale datasets. It deploys the distance matrix calculation to matrix multiplication units and adopts on-the-fly top-k selection to avoid transferring the exa-scale distance matrix to/from device memory. flyKNNG co-designs the two key algorithms to optimize the overall performance: the distance matrix calculation algorithm considers the data communication costs and pruning strategy of top-k selection; the top-k selection algorithm is also specially designed for on-the-fly usage, which impacts the data reuse and instruction-level parallelism of the distance matrix calculation as little as possible. Moreover, our top-k selection algorithm is optimized for the special data layout adopted by most matrix multiplication units. Experiments show that flyKNNG achieves 4.67X speedup compared with CUML/FAISS, one of the state-of-the-art approaches.

References

David Adedayo Adeniyi, Zhaoqiang Wei, and Y Yongquan. 2016. Automated web usage data mining and recommendation system using K-Nearest Neighbor (KNN) classification method. Applied Computing and Informatics 12, 1 (2016), 90--108.Google ScholarCross Ref
Tolu Alabi, Jeffrey D Blanchard, Bradley Gordon, and Russel Steinbach. 2012. Fast k-selection algorithms for graphics processing units. Journal of Experimental Algorithmics (JEA) 17 (2012), 4--1.Google Scholar
Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro, and Fausto Rabitti. 2016. YFCC100M-HNfc6: a large-scale deep features benchmark for similarity search. In International Conference on Similarity Search and Applications. Springer, 196--209.Google ScholarCross Ref
AMD. 2020. Introducing AMD CDNA architecture. https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdfGoogle Scholar
Artem Babenko and Victor Lempitsky. 2016. Efficient indexing of billion-scale datasets of deep descriptors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2055--2063.Google Scholar
Ricardo J Barrientos, José I Gómez, Christian Tenllado, Manuel Prieto Matias, and Mauricio Marin. 2011. kNN query processing in metric spaces using GPUs. In European Conference on Parallel Processing. Springer, 380--392.Google ScholarCross Ref
Vishwanath Bijalwan, Vinay Kumar, Pinki Kumari, and Jordan Pascual. 2014. KNN based machine learning approach for text and document mining. International Journal of Database Theory and Application 7, 1 (2014), 61--70.Google ScholarCross Ref
Deng Cai. 2019. A revisit of hashing algorithms for approximate nearest neighbor search. IEEE Transactions on Knowledge and Data Engineering (2019).Google Scholar
Jieyang Chen, Nan Xiong, Xin Liang, Dingwen Tao, Sihuan Li, Kaiming Ouyang, Kai Zhao, Nathan DeBardeleben, Qiang Guan, and Zizhong Chen. 2019. TSM2: optimizing tall-and-skinny matrix-matrix multiplication on GPUs. In Proceedings of the ACM International Conference on Supercomputing. 106--116.Google ScholarDigital Library
Jack Choquette and Wish Gandhi. 2020. Nvidia A100 GPU: Performance & innovation for GPU computing. In 2020 IEEE Hot Chips 32 Symposium (HCS). IEEE Computer Society, 1--43.Google ScholarCross Ref
Paul Covington, Jay Adams, and Emre Sargin. 2016. Deep neural networks for youtube recommendations. In Proceedings of the 10th ACM conference on recommender systems. 191--198.Google ScholarDigital Library
Wei Dong, Charikar Moses, and Kai Li. 2011. Efficient k-nearest neighbor graph construction for generic similarity measures. In Proceedings of the 20th international conference on World wide web. 577--586.Google ScholarDigital Library
Carlos Eiras-Franco, David Martínez-Rego, Leslie Kanthan, César Piñeiro, Antonio Bahamonde, Bertha Guijarro-Berdiñas, and Amparo Alonso-Betanzos. 2020. Fast Distributed k NN Graph Construction Using Auto-tuned Locality-sensitive Hashing. ACM Transactions on Intelligent Systems and Technology (TIST) 11, 6 (2020), 1--18.Google ScholarDigital Library
Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai. 2017. Fast approximate nearest neighbor search with the navigating spreading-out graph. arXiv preprint arXiv:1707.00143 (2017).Google Scholar
Fabian Groh, Lukas Ruppert, Patrick Wieschollek, and Hendrik Lensch. 2019. Ggnn: Graph-based gpu nearest neighbor search. arXiv preprint arXiv:1912.01059 (2019).Google Scholar
Sadegh Bafandeh Imandoust and Mohammad Bolandraftar. 2013. Application of k-nearest neighbor (knn) approach for predicting economic events: Theoretical background. International Journal of Engineering Research and Applications 3, 5 (2013), 605--610.Google Scholar
Herve Jegou, Matthijs Douze, and Cordelia Schmid. 2010. Product quantization for nearest neighbor search. IEEE transactions on pattern analysis and machine intelligence 33, 1 (2010), 117--128.Google Scholar
Hervé Jégou, Romain Tavenard, Matthijs Douze, and Laurent Amsaleg. 2011. Searching in one billion vectors: re-rank with source coding. In 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 861--864.Google ScholarCross Ref
Zhuoran Ji and Cho-Li Wang. 2021. Accelerating DBSCAN Algorithm with AI Chips for Large Datasets. In 50th International Conference on Parallel Processing. 1--11.Google Scholar
Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2019. Billion-scale similarity search with gpus. IEEE Transactions on Big Data (2019).Google ScholarCross Ref
Dhiraj Kalamkar, Dheevatsa Mudigere, Naveen Mellempudi, Dipankar Das, Kunal Banerjee, Sasikanth Avancha, Dharma Teja Vooturi, Nataraj Jammalamadaka, Jianyu Huang, Hector Yuen, et al. 2019. A study of BFLOAT16 for deep learning training. arXiv preprint arXiv:1905.12322 (2019).Google Scholar
Ivan Komarov, Ali Dashti, and Roshan M D'Souza. 2014. Fast k-NNG construction with GPU-based quick multi-select. PloS one 9, 5 (2014), e92409.Google ScholarCross Ref
Quansheng Kuang and Lei Zhao. 2009. A practical GPU based kNN algorithm. In Proceedings. The 2009 International Symposium on Computer Science and Computational Technology (ISCSCI 2009). Citeseer, 151.Google Scholar
Wen Li, Ying Zhang, Yifang Sun, Wei Wang, Mingjie Li, Wenjie Zhang, and Xuemin Lin. 2019. Approximate nearest neighbor search on high dimensional data---experiments, analyses, and improvement. IEEE Transactions on Knowledge and Data Engineering 32, 8 (2019), 1475--1488.Google ScholarCross Ref
Heng Liao, Jiajin Tu, Jing Xia, and Xiping Zhou. 2019. Davinci: A scalable architecture for neural network computing. In 2019 IEEE Hot Chips 31 Symposium (HCS). IEEE Computer Society, 1--44.Google ScholarCross Ref
Bruno Meyer, Aurora Pozo, and Wagner M Nunan Zola. 2021. Warp-centric K-Nearest Neighbor Graphs construction on GPU. In 50th International Conference on Parallel Processing Workshop. 1--10.Google ScholarDigital Library
NVIDIA. 2021. CUDA Basic Linear Algebra Subroutine library. https://docs.nvidia.com/cuda/cublas/index.htmlGoogle Scholar
NVIDIA. 2021. CUDA Toolkit Documentation. https://docs.nvidia.com/cuda/index.htmlGoogle Scholar
NVIDIA. 2021. GPU Machine Learning Algorithms. https://rapids.aiGoogle Scholar
NVIDIA. 2021. NVIDIA Ampere GA102 GPU Architecture. https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdfGoogle Scholar
NVIDIA. 2021. NVIDIA Nsight Compute Kernel Profiling Guide. https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.htmlGoogle Scholar
Md Aamir Raihan, Negar Goli, and Tor M Aamodt. 2019. Modeling deep learning accelerator enabled gpus. In 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). IEEE, 79--92.Google ScholarCross Ref
Tal Ridnik, Emanuel Ben-Baruch, Asaf Noy, and Lihi Zelnik-Manor. 2021. Imagenet-21k pretraining for the masses. arXiv preprint arXiv:2104.10972 (2021).Google Scholar
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. 2015. Imagenet large scale visual recognition challenge. International journal of computer vision 115, 3 (2015), 211--252.Google Scholar
SCI-Compiler. 2018. Ping Pong Buffer. http://www.scicompiler.doud/userguide/PingPongBuffer.htmlGoogle Scholar
Anil Shanbhag, Holger Pirk, and Samuel Madden. 2018. Efficient top-k query processing on massively parallel hardware. In Proceedings of the 2018 International Conference on Management of Data. 1557--1570.Google ScholarDigital Library
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1--9.Google ScholarCross Ref
Xiaoxin Tang, Zhiyi Huang, David Eyers, Steven Mills, and Minyi Guo. 2015. Efficient selection algorithm for fast k-nn search on gpus. In 2015 IEEE International Parallel and Distributed Processing Symposium. IEEE, 397--406.Google ScholarDigital Library
Vasily Volkov. 2016. Understanding latency hiding on GPUs. University of California, Berkeley.Google Scholar
Hui Wang, Wan-Lei Zhao, and Xiangxiang Zeng. 2021. Large-Scale Approximate k-NN Graph Construction on GPU. arXiv preprint arXiv:2103.15386 (2021).Google Scholar
Mengzhao Wang, Xiaoliang Xu, Qiang Yue, and Yuxiang Wang. 2021. A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search. arXiv preprint arXiv:2101.12631 (2021).Google Scholar
Liu Yingfan, Cheng Hong, and Cui Jiangtao. 2021. Revisiting k-Nearest Neighbor Graph Construction on High-Dimensional Data: Experiments and Analyses. arXiv preprint arXiv:2112.02234 (2021).Google Scholar

Cited By

View all

Index Terms

Efficient exact K-nearest neighbor graph construction for billion-scale datasets using GPUs with tensor cores
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel algorithms
      1. Massively parallel algorithms
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Top-k retrieval in databases

Recommendations

Confirmation Sampling for Exact Nearest Neighbor Search
Similarity Search and Applications
Abstract
Locality-sensitive hashing (LSH), introduced by Indyk and Motwani in STOC ’98, has been an extremely influential framework for nearest neighbor search in high-dimensional data sets. While theoretical work has focused on the approximate nearest ...
Read More
K-Nearest Neighbor Finding Using MaxNearestDist
Similarity searching often reduces to finding the k nearest neighbors to a query object. Finding the k nearest neighbors is achieved by applying either a depth- first or a best-first algorithm to the search hierarchy containing the data. These ...
Read More
Approximate Direct and Reverse Nearest Neighbor Queries, and the k-nearest Neighbor Graph
SISAP '09: Proceedings of the 2009 Second International Workshop on Similarity Search and Applications
Retrieving the \emph{k-nearest neighbors} of a query object is a basic primitive in similarity searching. A related, far less explored primitive is to obtain the dataset elements which would have the query object within their own \emph{k}-nearest ...
Read More
See Also
Guide to K-Nearest Neighbors Algorithm in Machine Learning

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Information
Contributors

Published in
ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing
June 2022
514 pages
ISBN:9781450392815
DOI:10.1145/3524059
General Chairs:
Lawrence Rauchwerger
University of Illinois at Urbana-Champaign
,
Kirk Cameron
Virginia Tech
,
Program Chairs:
Dimitrios S. Nikolopoulos
Virginia Tech
,
Dionisios Pnevmatikatos
National Technical University of Athens
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [emailprotected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 28 June 2022
Permissions
Request permissions about this article.
Request Permissions
Check for updates
Author Tags
KNNG construction
parallel computing
top-k selection
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate629of2,180submissions,29%
Funding Sources
Other Metrics
View Article Metrics

Bibliometrics
Citations2

Article Metrics
- 2
  Total Citations
  View Citations
- 404
  Total Downloads
- Downloads (Last 12 months)197
- Downloads (Last 6 weeks)20
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Digital Edition

View this article in digital edition.

View Digital Edition

Figures
Other

Close Figure Viewer

Browse AllReturnChange zoom level

Caption

View Table of Contents

Export Citations

Your Search Results Download Request

We are preparing your search results for download ...

We will inform you here when the file is ready.

Download now!

Your Search Results Download Request

Your file of search results citations is now ready.

Download now!

Your Search Results Download Request

Your search export query has expired. Please try again.

Efficient exact K-nearest neighbor graph construction for billion-scale datasets using GPUs with tensor cores | Proceedings of the 36th ACM International Conference on Supercomputing (2024)

New Citation Alert added!

New Citation Alert!

ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing

ABSTRACT

References

Cited By

Index Terms

Recommendations

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Export Citations