Learnable indexes and hashes
January 12, 2018 — February 18, 2020
approximation
feature construction
geometry
high d
language
linear algebra
machine learning
metrics
networks
neural nets
statistics
topology
See Dr. Wu-Jun Li’s excellent literature review and practicalities supporting their own papers. Kevin Zakka’s kNN classification using Neighbourhood Components Analysis is an illustrated guide to a dimensionality-reduction method I had not heard of before. It looks handy for nearest-neighbour search, which I suppose is the entry-level use case here (Dwibedi et al. 2019).
1 Learnable hashes for similarity search
2 Learnable indexes for arbitrary search
Hip for a while there. The tl;dr is that learned Bloom filters etc. can outperform hand-designed ones (Beutel et al. 2017; Kraska et al. 2017).
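To make the learned-index idea concrete, here is a minimal sketch, assuming a single least-squares line stands in for the model. Kraska et al. (2017) describe a hierarchy of models; this toy version keeps only the core trick: predict a position in the sorted array, then correct the guess with a bounded local search. The names `LearnedIndex` and `fit_linear` are mine, not from the paper.

```python
import bisect


def fit_linear(keys):
    """Least-squares fit of position ~ slope * key + intercept (keys sorted, non-empty)."""
    n = len(keys)
    mean_k = sum(keys) / n
    mean_p = (n - 1) / 2
    var_k = sum((k - mean_k) ** 2 for k in keys)
    cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
    slope = cov / var_k if var_k else 0.0
    return slope, mean_p - slope * mean_k


class LearnedIndex:
    """Toy learned index: a linear model plus a worst-case error bound."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.slope, self.intercept = fit_linear(self.keys)
        # Largest prediction error over the stored keys; this bounds the search window.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        # Binary search only inside the window the model's error bound allows.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None
```

The better the model fits the key distribution, the smaller `max_err` gets and the cheaper each lookup becomes. With a poor fit, the window widens and the structure degrades towards plain binary search.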
See also Mitzenmacher’s clarifications on false positive rate in such filters.
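A learned Bloom filter can be sketched as a classifier plus a small backup filter, as I understand the construction in Kraska et al. (2017). Keys the model scores below the threshold go into an ordinary Bloom filter, so there are no false negatives. Everything named here (`BloomFilter`, `LearnedBloomFilter`, the `score` callable and its threshold) is a hypothetical stand-in; a real system would train `score` rather than hand-write it.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, n_items, fp_rate):
        n_items = max(1, n_items)
        # Standard sizing formulas for bit count m and hash count k.
        self.m = max(8, math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        digest = hashlib.sha256(repr(item).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(
            self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item)
        )


class LearnedBloomFilter:
    """Model first, backup Bloom filter for keys the model misses."""

    def __init__(self, keys, score, threshold, backup_fp_rate=0.01):
        self.score = score          # assumed: callable returning a number per item
        self.threshold = threshold
        missed = [k for k in keys if score(k) < threshold]
        self.backup = BloomFilter(len(missed), backup_fp_rate)
        for k in missed:
            self.backup.add(k)

    def __contains__(self, item):
        return self.score(item) >= self.threshold or item in self.backup


# Toy data: a dense block of keys plus scattered outliers.
keys = list(range(1000)) + [5000 + 37 * i for i in range(200)]
# Hand-written stand-in for a trained model; deliberately over-accepts 1000..1199.
lbf = LearnedBloomFilter(keys, score=lambda x: 1.0 if 0 <= x < 1200 else 0.0,
                         threshold=0.5)
```

The toy model accepts every query in 1000..1199, none of which are keys, so queries drawn from that range are always false positives, while queries far from the keys only hit the backup filter's error rate. That dependence on the query distribution is, as far as I can tell, the crux of Mitzenmacher's caveat about false positive rates.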
3 Token embedding indices
LLM style. See transformers for now.
4 References
Beutel, Kraska, Chi, et al. 2017. “A Machine Learning Approach to Databases Indexes.” In Advances In Neural Information Processing Systems.
Cao, Long, Wang, et al. 2017. “HashNet: Deep Learning to Hash by Continuation.” arXiv:1702.00758 [Cs].
Chiu, Prayoonwong, and Liao. 2020. “Learning to Index for Nearest Neighbor Search.” IEEE Transactions on Pattern Analysis and Machine Intelligence.
Dwibedi, Aytar, Tompson, et al. 2019. “Temporal Cycle-Consistency Learning.”
Gordo, Almazan, Revaud, et al. 2016. “End-to-End Learning of Deep Visual Representations for Image Retrieval.” arXiv:1610.07940 [Cs].
Graves, Wayne, Reynolds, et al. 2016. “Hybrid Computing Using a Neural Network with Dynamic External Memory.” Nature.
Kraska, Beutel, Chi, et al. 2017. “The Case for Learned Index Structures.” arXiv:1712.01208 [Cs].
Lai, Pan, Liu, et al. 2015. “Simultaneous Feature Learning and Hash Coding with Deep Neural Networks.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Lin, Yang, Hsiao, et al. 2015. “Deep Learning of Binary Hash Codes for Fast Image Retrieval.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops.
Liong, Lu, Wang, et al. 2015. “Deep Hashing for Compact Binary Codes Learning.” In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Li, Wang, and Kang. 2015. “Feature Learning Based Deep Supervised Hashing with Pairwise Labels.” arXiv Preprint arXiv:1511.03855.
Mitra and Craswell. 2017. “Neural Models for Information Retrieval.” arXiv:1705.01509 [Cs].
Nagathan, Mungara, and Manimozhi. 2014. “Content-Based Image Retrieval System Using Feed-Forward Backpropagation Neural Network.” International Journal of Computer Science and Network Security (IJCSNS).
Schüle, Simonis, Heyenbrock, et al. 2019. “In-Database Machine Learning: Gradient Descent and Tensor Algebra for Main Memory Database Systems.”
Wang, Zhang, Song, et al. 2018. “A Survey on Learning to Hash.” IEEE Transactions on Pattern Analysis and Machine Intelligence.
Xia, Pan, Lai, et al. 2014. “Supervised Hashing for Image Retrieval via Image Representation Learning.” In AAAI.
Xu, Wang, Tian, et al. 2015. “Convolutional Neural Networks for Text Hashing.” In IJCAI.
Yang, Lin, and Chen. 2018. “Supervised Learning of Semantics-Preserving Hash via Deep Convolutional Neural Networks.” IEEE Transactions on Pattern Analysis and Machine Intelligence.
Zhang, Lin, Zhang, et al. 2015. “Bit-Scalable Deep Hashing with Regularized Similarity Learning for Image Retrieval and Person Re-Identification.” IEEE Transactions on Image Processing.
Zoran, Lakshminarayanan, and Blundell. 2017. “Learning Deep Nearest Neighbor Representations Using Differentiable Boundary Trees.” arXiv:1702.08833 [Cs].