In-Memory Computing Summit

Tracks:

Tales from the Trenches
Architecture
Hardware
Streaming Data
New Capabilities


Schedule Day 1

May 23, 2016

Registration

Breakfast

Registration continues

The In-Memory Computing Landscape: Leading the Fast Data Revolution

Abe Kleinfeld

President & CEO, GridGain

9:00 am - 9:30 am

Keynote

In-Memory: The Foundation of the Internet of Things

Jason Stamper

Analyst, Data Management and Analytics, 451 Research

9:35 am - 10:05 am

Keynote

More Memory for In-Memory? Easy!

Benzi Galili

Chief Operating Officer, ScaleMP

10:10 am - 10:30 am

Keynote

Break

10:30 am - 11:00 am

Demystifying In-Memory Data Grid and NoSQL DB. Are they one and the same?

Pandurang Pradeep Naik, Wipro

Pandurang Pradeep Naik

Principal Consultant, Wipro Limited

11:00 am - 11:50 am

Architecture

The Truth: How to Test Your Distributed System for the Real World

Noah Arliss, Workday

Noah Arliss

Senior Development Manager, Workday

11:00 am - 11:50 am

Tales from the Trenches

Break

Implementing User-Defined Data Structures in In-Memory Data Grids

Dr. William Bain, ScaleOut Software

Dr. William L. Bain

Founder & CEO, ScaleOut Software, Inc.

11:55 am - 12:45 pm

Architecture

In-memory data grids (IMDGs) are widely used as distributed, key-value stores for serialized objects, providing fast data access, location transparency, scalability, and high availability. With its support for built-in data structures, such as hashed sets and lists, Redis has demonstrated the value of enhancing standard create/read/update/delete (CRUD) APIs to provide extended functionality and performance gains. This talk describes new techniques which can be used to generalize this concept and enable the straightforward creation of arbitrary, user-defined data structures both within single objects and sharded across the IMDG.

A key challenge for IMDGs is to minimize network traffic when accessing and updating stored data. Standard CRUD APIs place the burden of implementing data structures on the client and require that full objects move between client and server on every operation. In contrast, implementing data structures within the server streamlines communication, since only incremental changes to stored objects or requested subsets of this data need to be transferred. However, building extended data structures within IMDG servers creates several challenges, including how to extend this mechanism, how to efficiently implement data-parallel operations spanning multiple shards, and how to protect the IMDG from errors in user-defined extensions.

This talk will describe two techniques which enable IMDGs to be extended to implement user-defined data structures. One technique, called single method invocation (SMI), allows users to define a class which implements a user-defined data structure stored as an IMDG object and then remotely execute a set of class methods within the IMDG. This enables IMDG clients to pass parameters to the IMDG and receive a result from method execution.
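
The SMI idea described above can be sketched roughly as follows. This is a minimal, single-process illustration, not ScaleOut Software's API: `ToyGrid`, `invoke`, and `HashedSet` are hypothetical names, and `copy.deepcopy` merely stands in for the serialization boundary between client and grid.

```python
import copy


class ToyGrid:
    """Hypothetical stand-in for an IMDG server: a dict of stored objects."""

    def __init__(self):
        self._store = {}

    def put(self, key, obj):
        # deepcopy simulates serializing the object into the grid.
        self._store[key] = copy.deepcopy(obj)

    def get(self, key):
        # CRUD-style read: the whole object travels back to the client.
        return copy.deepcopy(self._store[key])

    def invoke(self, key, method_name, *args):
        # SMI-style call: only the method name, arguments, and result
        # cross the client/server boundary; the object stays in the grid.
        obj = self._store[key]
        return getattr(obj, method_name)(*args)


class HashedSet:
    """User-defined data structure stored as a single grid object."""

    def __init__(self):
        self.items = set()

    def add(self, item):
        # Returns True only when the item was not already present.
        is_new = item not in self.items
        self.items.add(item)
        return is_new

    def contains(self, item):
        return item in self.items

    def size(self):
        return len(self.items)


grid = ToyGrid()
grid.put("visitors", HashedSet())
added_first = grid.invoke("visitors", "add", "alice")
added_again = grid.invoke("visitors", "add", "alice")
count = grid.invoke("visitors", "size")
```

With plain CRUD calls, the same `add` would need a `get`, a client-side update, and a `put` of the full object; here only the small argument and the boolean result are exchanged.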

A second technique, called parallel method invocation (PMI), extends this approach to execute a method in parallel on multiple objects sharded across IMDG servers. PMI also provides an efficient mechanism for combining the results of method execution and returning a single result to the invoking client. In contrast to client-based techniques, this combining mechanism is integrated into the IMDG and completes in O(logN) time, where N is the number of IMDG servers.
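
One way to picture the O(logN) combining step is a pairwise merge performed in rounds. The sketch below is an assumption-laden illustration rather than the mechanism built into any particular IMDG: `parallel_invoke` is a hypothetical helper, the shards are plain Python sets, and the per-shard method runs sequentially instead of on separate servers.

```python
def parallel_invoke(shards, method, combine):
    """Run `method` on every shard, then merge the results pairwise in rounds.

    Returns the combined result and the number of merge rounds used.
    """
    if not shards:
        raise ValueError("at least one shard is required")

    # One partial result per shard (a real grid would compute these in parallel).
    results = [method(shard) for shard in shards]

    rounds = 0
    while len(results) > 1:
        merged = []
        # Combine neighbours two at a time; each round halves the result count.
        for i in range(0, len(results) - 1, 2):
            merged.append(combine(results[i], results[i + 1]))
        if len(results) % 2 == 1:
            merged.append(results[-1])  # odd one out advances unchanged
        results = merged
        rounds += 1
    return results[0], rounds


# Five shards of a hashed set that spans the grid.
shards = [{"a", "b"}, {"c"}, {"d", "e", "f"}, set(), {"g"}]
total, rounds = parallel_invoke(shards, len, lambda x, y: x + y)
```

For five shards the merge finishes in three rounds, matching the ceiling of log2(5); a client that gathered every partial result itself would instead perform a number of steps proportional to the shard count.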

The talk will describe how user-defined data structures can be implemented within the IMDG to run in a separate process (e.g., a JVM) to ensure that execution errors do not impair the stability of the IMDG. It will examine the associated performance trade-offs and techniques that can be used to minimize overhead.

Lastly, the talk will describe how popular Redis data structures, such as hashed sets, can be implemented as a user-defined data structure using SMI and then extended using both SMI and PMI to build a scalable hashed set that spans multiple shards. It will also examine other examples of user-defined data structures that can be built using these techniques.

The audience will learn (1) how to extend an IMDG to incorporate user-defined data structures, (2) the trade-offs between an extensible mechanism and the use of built-in data structures, such as in Redis, and (3) examples of using this mechanism in various applications.

Case Study - GridGain for Sberbank Architecture in Financial Services

Dmitriy Setrakyan, GridGain - Mikhail Khasin, Sberbank

Dmitriy Setrakyan, Founder and Chief Product Officer, GridGain

Mikhail Khasin, Senior Managing Director & Head of Core Banking Transformation Program, Sberbank

11:55 am - 12:45 pm

Tales from the Trenches

Lunch

12:45 pm - 1:45 pm

How to Use JCache to Speed Up Your Application

Greg Luck, Hazelcast

Greg Luck

CEO, Hazelcast Inc.

1:45 pm - 2:35 pm

Architecture

Target's First Foray into an In-Memory Data Grid (and the Trips, Stumbles, and Falls that came with)

Jim Beyers, Hitendra Pratap Singh, Aaron Riekenberg, Target Corporation

Jim Beyers - Director Engineering, Aaron Riekenberg - Lead Engineer

Hitendra Pratap Singh - Lead Engineer

1:45 pm - 2:35 pm

Tales from the Trenches

Break

Apache Ignite 2.0 - Towards Convergent Data Platform

Nikita Ivanov, GridGain

Nikita Ivanov

Founder & CTO, GridGain Systems

2:40 pm - 3:30 pm

Architecture

In-memory Stream Processing @ Integral Ad Science

Alexey Kharlamov, Integral Ad Science

Alexey Kharlamov

VP of Technology, Integral Ad Science Inc

2:40 pm - 3:30 pm

Tales from the Trenches

Break

3:30 pm - 3:50 pm

Extreme Transaction Processing In A Memory-Oriented World

Girish Mutreja, Neeve Research

Girish Mutreja

Founder & CEO, Neeve Research

3:50 pm - 4:40 pm

Architecture

Modern transactional systems need to be fast, always available, and able to scale constantly to meet the ever-changing needs of the business. It is becoming increasingly commonplace for next-generation e-commerce systems to demand double- or single-digit millisecond response times, for financial trading systems to incur maximum latencies on the order of microseconds, and for gaming and analytic engines to consume hundreds of thousands of transactions a second. It is a common and tempting mistake to believe that we can meet the extreme needs of such systems by just replacing traditional disk-based storage systems with in-memory data grids while keeping traditional application architectures. Such an approach will take us only so far, after which the system's demands will once again overtake its capabilities. To truly meet the extreme needs of these systems and continue to scale as demand scales, we need to think differently about how such systems are architected and employ modern techniques to unlock the full potential of memory-oriented computing. This talk explains why and how.

Join Girish Mutreja, CEO of Neeve Research and author of the X Platform, as he discusses the above and provides a unique perspective on what’s different about memory-oriented TP applications and how application architectures, particularly those of mission-critical applications, need to adapt to the new world of memory-oriented computing. Girish will outline the key architectural elements of TP applications and explain how they need to function in the world of memory-oriented computing. He will delve into why such systems need to be architected as a marriage between messaging and data storage; why message routing and data gravity are of critical importance to these systems; how structured, in-memory state lends itself to extreme agility; how fault tolerance, load balancing, transaction processing, and threading need to function in such systems; and why architectural precepts such as transaction pipelining and agent-oriented design are critical to reliability, performance, and scalability. Girish will illustrate how these concepts have enabled enterprises such as MGM Resorts to transition to game-changing, memory-oriented architectures by leveraging the X Platform.

Decision Making with MLlib, Spark and Spark Streaming

Girish Kathalagiri, Staff Engineer, Samsung SDS Research America

Girish Kathalagiri

Staff Engineer, Samsung SDS Research America

3:50 pm - 4:40 pm

Streaming Data

Break

The Illusion of Statelessness

Aleksandar Seovic, Architect, Oracle Coherence, Oracle

Aleksandar Seovic

Architect, Oracle Coherence, Oracle

4:45 pm - 5:35 pm

Architecture

Making IMC Enterprise Grade

Steve Wilkes, CTO, Co-Founder, Striim

Steve Wilkes

CTO, Co-Founder, Striim

4:45 pm - 5:35 pm

Streaming Data

Networking Reception

Schedule Day 1

May 23, 2016

Registration

7:30 am - 8:00 am
Breakfast

Registration continues

8:00 am - 9:00 am
The In-Memory Computing Landscape: Leading the Fast Data Revolution

Abe Kleinfeld, President & CEO, GridGain

9:00 am - 9:30 am
In-Memory: The Foundation of the Internet of Things

Jason Stamper, Analyst, Data Management and Analytics, 451 Research

9:35 am - 10:05 am
More Memory for In-Memory? Easy!

Benzi Galili, Chief Operating Officer, ScaleMP

10:10 am - 10:30 am
Break

10:30 am - 11:00 am
Demystifying In-Memory Data Grid and NoSQL DB. Are they one and the same?

Pandurang Pradeep Naik, Wipro

11:00 am - 11:50 am
The Truth: How to Test Your Distributed System for the Real World

Noah Arliss, Workday

11:00 am - 11:50 am
Break

11:50 am - 11:55 am
Implementing User-Defined Data Structures in In-Memory Data Grids

Dr. William Bain, ScaleOut Software

11:55 am - 12:45 pm
Case Study - GridGain for Sberbank Architecture in Financial Services

Dmitriy Setrakyan, GridGain - Mikhail Khasin, Sberbank

11:55 am - 12:45 pm
Lunch

12:45 pm - 1:45 pm
How to Use JCache to Speed Up Your Application

Greg Luck, Hazelcast

1:45 pm - 2:35 pm
Target's First Foray into an In-Memory Data Grid (and the Trips, Stumbles, and Falls that came with)

Jim Beyers, Hitendra Pratap Singh, Aaron Riekenberg, Target Corporation

1:45 pm - 2:35 pm
Break

2:35 pm - 2:40 pm
Apache Ignite 2.0 - Towards Convergent Data Platform

Nikita Ivanov, GridGain

2:40 pm - 3:30 pm
In-memory Stream Processing @ Integral Ad Science

Alexey Kharlamov, Integral Ad Science

2:40 pm - 3:30 pm
Break

3:30 pm - 3:50 pm
Extreme Transaction Processing In A Memory-Oriented World

Girish Mutreja, Neeve Research

3:50 pm - 4:40 pm
Decision Making with MLlib, Spark and Spark Streaming

Girish Kathalagiri, Staff Engineer, Samsung SDS Research America

3:50 pm - 4:40 pm
Break

4:40 pm - 4:45 pm
The Illusion of Statelessness

Aleksandar Seovic, Architect, Oracle Coherence, Oracle

4:45 pm - 5:35 pm
Making IMC Enterprise Grade

Steve Wilkes, CTO, Co-Founder, Striim

4:45 pm - 5:35 pm
Networking Reception

5:35 pm - 7:00 pm



Schedule Day 2

May 24, 2016

Registration, Breakfast

8:00 am - 9:00 am

In Memory Computing for Financial Services: Past, Present and Future

Robert Barr

VP Data Grid Engineering Lead, Barclays

9:00 am - 9:20 am

Keynote

NVDIMM - Changes are Here So What's Next?

Arthur Sainio

Co-Chair SNIA NVDIMM Special Interest Group and Director, Marketing, SMART Modular Technologies

9:25 am - 9:40 am

Keynote

Innovation Presentations

9:40 am - 10:40 am

Unveiling the X Platform
Girish Mutreja – CEO, Founder, Neeve Research
Neeve Research offers the X Platform, a revolutionary memory-oriented transaction processing platform for extreme enterprise applications. The platform uniquely integrates structured in-memory state, advanced messaging, multi-agency and decoupled enterprise data management to enable a true no-compromise extreme TP platform. The true innovation of the platform lies in its ability to provide a no-compromise blend of extreme performance, reliability, scalability and developmental agility. It is extremely fast, it is extremely easy to use, it can be used to build a wide variety of applications and the applications built using it exhibit zero data loss and scale linearly. After almost a decade of hard engineering and close-quarters field hardening with an exclusive set of Fortune 300 companies, Neeve is opening the platform for wider use. Listen as Girish Mutreja unveils the X Platform and shows how easy it is to build an application that performs at 100s of thousands of transactions per second or sub-100 microsecond latencies with zero garbage and zero data loss.

Tap Into Your Enterprise - Why Database Change and IMC Are an Ideal Match
Steve Wilkes – CTO, Co-Founder, Striim
In-memory computing is all about now. It’s the art of collecting and processing data as quickly as it is created in order to provide instant actionable insights. Databases, however, are all about the past. They are a record of what happened, not what is happening right now.
In this presentation, you will learn how to turn your enterprise databases, and the applications they support, into real-time sources of what’s currently happening throughout the business. By utilizing database change, and in-memory processing and analytics, you can tap into your enterprise activity and make decisions while the data is still relevant.

Capture Perishable Insights Before the Moment is Lost
Chris Villinger – VP Business Development & Marketing, ScaleOut Software
In today’s competitive business environment, companies need to capture perishable opportunities before the moment is lost. Business intelligence is not enough. Live systems need to analyze data in flight to create operational intelligence. With its ability to store and analyze fast-changing data in milliseconds, in-memory computing technology provides the secret sauce that enables operational intelligence at scale for these systems.
ScaleOut Software’s in-memory computing technology integrates scalable, in-memory data storage and data-parallel computing to deliver on the promise of operational intelligence. This enables financial systems to react more quickly to market price changes, IoT applications to track the behavior of millions of devices, healthcare systems to analyze real-time telemetry from pacemakers, and e-commerce sites to make context-aware recommendations to online shoppers – just to name a few applications. We are only now beginning to tap the power of this technology to enhance the value of live systems.

Lambda-B-Gone: The In-memory Case Study for Faster, Smarter and Simpler Answers
Dennis Duckworth – Director of Product Marketing, VoltDB
Simplicity, accuracy, and speed are three things everyone wants from their data architecture. A content delivery network based in LA was looking to achieve these goals and developed a framework that handled batch and stream processing with open source software. The objective was to manage the real-time aggregation of over 32 TB of daily web server log data. The problem? Everything. Listen as Dennis Duckworth explains how VoltDB reduced the number of environments, used 1/10th the CPU cycles, and achieved 100% billing accuracy on 32 TB of daily web server data.

Comparing Software Defined Memory Options
Benzi Galili – Chief Operating Officer, ScaleMP
Software defined memory (SDM) has evolved in the past decade from DRAM accessible over custom fabrics to DRAM accessible over standard fabrics (SDM-F) and now memory over storage (SDM-S). This presentation will include a comparison of SDM-F and SDM-S, the advantages of each, key drivers of behavior and performance, and fit to specific use cases.

PipelineDB: The Streaming-SQL Database
Derek Nelson – CEO and Co-Founder, PipelineDB
PipelineDB is an open-source relational database that runs SQL queries continuously on streaming data, incrementally storing results in tables. Our talk will include an overview of PipelineDB's architecture, the use cases for continuous SQL queries on streams, and user case studies, and will outline how PipelineDB can be used to easily build scalable and highly available streaming and realtime analytics applications using only SQL with no external dependencies.
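
To make the notion of a continuous query concrete, here is a small sketch of an aggregate that is updated as each event arrives instead of being recomputed over stored raw data. It is a toy analogue in Python, not PipelineDB's SQL interface: `ContinuousView`, `insert`, and `select` are hypothetical names, and the view behaves roughly like a `COUNT(*)` and `SUM(value)` grouped by key.

```python
from collections import defaultdict


class ContinuousView:
    """Toy analogue of a grouped COUNT/SUM query evaluated over a stream."""

    def __init__(self):
        # One result row per group key, updated in place.
        self.rows = defaultdict(lambda: {"count": 0, "total": 0.0})

    def insert(self, key, value):
        # Each event updates the stored aggregate; the raw event is not kept.
        row = self.rows[key]
        row["count"] += 1
        row["total"] += value

    def select(self):
        # Reading the view is a cheap lookup of already-computed results.
        return {key: dict(row) for key, row in sorted(self.rows.items())}


view = ContinuousView()
for url, latency_ms in [("/home", 12.0), ("/cart", 30.0), ("/home", 18.0)]:
    view.insert(url, latency_ms)
snapshot = view.select()
```

Because only the aggregates are retained, storage grows with the number of groups rather than with the number of events.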

Break

Science Without the Fiction: How Understanding Persistent Memory Market Trends can Help You Make the Best Design Choices

Gordon Patrick, Micron

Gordon Patrick

Director of Enterprise Computing Memory, Micron Technology Inc.

11:00 am - 11:50 am

Hardware

Shared In-Memory RDDs - Missing Link in Spark

Nikita Ivanov, GridGain

Nikita Ivanov

Founder & CTO, GridGain Systems, Apache Ignite

11:00 am - 11:50 am

Streaming Data

Break

What Non-Volatile Memory Means for the Future of Database Management Systems

Andy Pavlo, Carnegie Mellon University

Andy Pavlo

Asst. Professor, Carnegie Mellon University

11:55 am - 12:45 pm

Hardware

Test Driving Streaming and CEP on Apache Ignite

Matt Coventon, Innovative Software Engineering

Matt Coventon

Big Data Services Lead/Senior Software Engineer, Innovative Software Engineering

11:55 am - 12:45 pm

Streaming Data

Lunch

12:45 pm - 1:45 pm

The Benefits of Memory and Storage Convergence to In-Memory Computing

Amit Golander, Plexistor

Amit Golander

CTO, Plexistor

1:45 pm - 2:35 pm

Hardware

Propelling IoT Innovation with Predictive Analytics

Nikita Shamgunov, MemSQL

Nikita Shamgunov

CTO and co-founder, MemSQL

1:45 pm - 2:35 pm

New Capabilities

Break (Passport program giveaway raffle)

Introduce Non-volatile Generic Object Programming Model for In-Memory Computing (Spark) Performance Improvement

Yanping Wang, Apache Mnemonic

Apache Mnemonic (Incubating) Project Lead, Apache Software Foundation

2:40 pm - 3:30 pm

Hardware

In-Memory Computing frameworks such as Spark are gaining tremendous popularity for Big Data processing, as their in-memory primitives make it possible to eliminate the disk I/O bottleneck. Logically, the more memory they have available, the better the performance they can achieve. However, unpredictable GC activity from on-heap memory management, the high cost of serialization/de-serialization (SerDe), and bursts of temporary object creation/destruction greatly impact their performance and scale-out ability. For example, in Spark, when the volume of the datasets is much larger than the system memory, SerDe has a significant impact on almost every in-memory computing step, such as caching, checkpointing, shuffling/dispatching, data loading, and storing.

As advanced server platforms rapidly gain significantly more non-volatile memory, such as Intel 3D XPoint technology-powered NVMe and fast SSD array storage, how best to use the various hybrid memory-like resources, from DRAM to NVMe/SSD, determines the performance and scalability of Big Data applications.

In this presentation, we will first introduce our non-volatile generic Java object programming model for In-Memory Computing. This programming model defines in-memory non-volatile objects that can be operated on directly in memory-like resources. We then discuss our structured data in-memory persistence library, which can be used to load/store non-volatile generic Java objects from/to underlying heterogeneous memory-like resources, such as DRAM, NVMe, and even SSD.

We then present a non-volatile computing case using Spark. We will show that this model (1) lazily loads data to minimize memory footprint, (2) naturally fits both non-volatile RDDs and off-heap RDDs, (3) uses non-volatile/off-heap RDDs to transform Spark datasets, and (4) avoids memory caching by using in-place non-volatile datasets.
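
The lazy, in-place access pattern in points (1) and (4) can be illustrated with a memory-mapped file of fixed-width records. This is only an analogy written in Python with the standard `mmap` and `struct` modules; it is not the Apache Mnemonic API, and the record layout, `write_records`, and `MappedRecords` are assumptions made for the example.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-width record layout: int64 id followed by float64 score.
RECORD = struct.Struct("<qd")


def write_records(path, records):
    """Persist records once in their final binary layout."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))


class MappedRecords:
    """Reads individual records in place from a memory-mapped file."""

    def __init__(self, path):
        self._file = open(path, "rb")
        # Pages are brought in by the OS on first touch, not up front.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self):
        return len(self._map) // RECORD.size

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Decode only the requested record, directly from the mapping.
        return RECORD.unpack_from(self._map, index * RECORD.size)

    def close(self):
        self._map.close()
        self._file.close()


path = os.path.join(tempfile.mkdtemp(), "records.bin")
write_records(path, [(1, 0.5), (2, 1.5), (3, 2.5)])
records = MappedRecords(path)
record_count = len(records)
third = records[2]
records.close()
```

Only the record that is asked for gets decoded, so the dataset is never materialized as a collection of heap objects and no separate cached copy is needed.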

Finally, we will show that a performance boost of up to 2X can be achieved on Spark ML tests after applying this non-volatile computing approach, which removes SerDe and hot-data caching and dramatically reduces GC pause time.