SEMINAR: HydraDB: In-Memory Key-Value Store at IBM

In this talk, Dr. Yandong Wang will discuss experiences and lessons learned from building HydraDB, a general-purpose in-memory key-value middleware. HydraDB combines a collection of state-of-the-art techniques, including continuous fault tolerance, Remote Direct Memory Access (RDMA), and multicore awareness, to deliver reliable, high-throughput, low-latency access for cluster computing applications. HydraDB's distinguishing feature is its design commitment to fully exploit the RDMA protocol to optimize many aspects of a general-purpose key-value store, including latency-critical operations, read enhancement, and data replication for high availability. At the same time, HydraDB strives to use multicore systems efficiently so that data manipulation on the servers does not limit RDMA performance. Dr. Wang has also introduced C-Hint, an efficient and reliable cache management system. Many teams in his organization have adopted HydraDB to improve the execution of their cluster computing frameworks, including Hadoop, Spark, Sensemaking analytics, and Call Record processing. In addition, the team's performance evaluation with a variety of YCSB workloads has demonstrated that HydraDB can outperform several existing in-memory key-value stores on the market by an order of magnitude.

Dr. Yandong Wang
Research Staff Member, IBM T.J. Watson Research Center

Dr. Yandong Wang is a Research Staff Member at IBM T.J. Watson Research Center. Prior to joining IBM in 2014, he received his Ph.D. from Auburn University. He also received a master's degree in computer science from Rochester Institute of Technology and a bachelor's degree in transportation engineering from Tongji University. His research interests span high-performance storage systems, distributed computing, and the design of big data analytics frameworks. The results of his research have been published in many well-respected computer science journals and conferences, including IEEE TPDS, ACM SoCC, IEEE/ACM Supercomputing, IEEE IPDPS, and ACM SIGMETRICS. He is currently collaborating with many system designers and researchers to design next-generation in-memory storage systems that address big data challenges.

Wednesday, February 18, 2015, 4:00 pm - 5:00 pm
3129 Shelby Center