Low Latency, High Throughput Similarity Search with an In-Memory Associative Processor
Recent advances in deep learning make it possible to represent the context of data such as images, video, text, music, and even bio/chemo-informatics records as feature vectors (fingerprints), enabling search by context using similarity search over queries rather than exact search. Searching product reviews by context, zero-shot face recognition, video action detection, voice detection, finding similar molecules for drug discovery, and analyzing financial trading behaviors are a few examples of this category of applications.
Unfortunately, implementing similarity search over billions of records at low latency and high throughput (tens of thousands of queries per second) is limited by the high computational complexity: each query must be compared against every stored vector.
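To make the cost concrete, here is a minimal brute-force similarity-search sketch in NumPy. The database size, dimension, and function names are illustrative assumptions, not details of the platform described in this presentation; the point is that every query touches all N stored vectors, which is what makes billion-scale search expensive on conventional hardware.

```python
import numpy as np

# Hypothetical brute-force search over a small database of feature vectors.
# Sizes (10,000 vectors, 128 dimensions) are illustrative only.
rng = np.random.default_rng(0)
db = rng.standard_normal((10_000, 128)).astype(np.float32)
db /= np.linalg.norm(db, axis=1, keepdims=True)  # normalize for cosine similarity

def top_k(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k most similar database vectors (cosine similarity)."""
    q = query / np.linalg.norm(query)
    scores = db @ q                  # one dot product per stored vector: O(N * d) per query
    return np.argsort(-scores)[:k]

# A slightly perturbed copy of row 42 should be retrieved with row 42 ranked first.
query = db[42] + 0.01 * rng.standard_normal(128).astype(np.float32)
print(top_k(query))
```

Scaling this loop to a billion vectors is what motivates specialized in-memory hardware: the dot products themselves are simple, but the data movement dominates.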
This presentation details a fully programmable in-memory platform based on an associative computing in-memory chip. This associative processing unit (APU) can store feature vectors internally at massive scale and compute similarity search for a query vector at a throughput greater than 100K queries per second, at very low latency, over a database of more than 1B vectors.
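Associative in-memory architectures are commonly discussed in terms of bit-parallel operations on binary fingerprints compared by Hamming distance. The sketch below illustrates that general workload in NumPy; it is an assumption about the technique, not a description of the APU's actual internals, and all names and sizes are hypothetical.

```python
import numpy as np

# Hypothetical binary-fingerprint search: XOR plus popcount gives the Hamming
# distance, an operation that maps naturally to bit-parallel in-memory compute.
rng = np.random.default_rng(1)
fingerprints = rng.integers(0, 256, size=(100_000, 32), dtype=np.uint8)  # 256-bit codes

def nearest_hamming(query: np.ndarray) -> int:
    """Index of the stored fingerprint with the smallest Hamming distance to query."""
    xor = np.bitwise_xor(fingerprints, query)        # differing bits, byte by byte
    dists = np.unpackbits(xor, axis=1).sum(axis=1)   # popcount per row
    return int(np.argmin(dists))

# Flipping a single bit of row 7's fingerprint should still retrieve row 7.
query = fingerprints[7].copy()
query[0] ^= 0b00000001
print(nearest_hamming(query))
```

On an associative processor, the XOR-and-count step can run across all stored rows simultaneously inside the memory array, rather than streaming each row to a CPU as this NumPy version does.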
The in-memory associative chip can be used in embedded solutions such as edge applications (security cameras, robots, cars) as well as in data centers for big-data similarity search applications.
His specialties are Computational Memory, Associative Processing, Parallel Algorithms, and Machine Learning.