Tracks: Tales from the Trenches | Architecture | Hardware | Streaming Data | New Capabilities
Schedule Day 1
Organizations increasingly require real-time, highly scalable computing platforms in industries such as financial services, telecommunications, retail, SaaS, and IoT. This has spurred rapid advancements in chips, servers, storage and software for in-memory computing. We will review the state of in-memory computing today and offer some thoughts on where it is headed tomorrow as companies strive to create real-time, massively scalable Fast Data solutions.
In his keynote, Jason will look at some of the challenges that in-memory approaches to data and data processing are helping to overcome today. But he will also gaze into his crystal ball to ask what the future holds for in-memory computing. Specifically, how will in-memory approaches change in the next three to five years as they are increasingly relied upon to support the emerging Internet of Things? Jason will also briefly look at some of the other nascent technologies that are likely to be used in parallel with in-memory computing, and he'll wrap up by asking what kind of role in-memory is likely to play in related areas such as cloud computing and edge analytics.
In-memory computing is a reality. So are the limits of memory capacity. Data sizes constantly increase while application developers and IT staff push for in-memory efficiencies; the conclusion is inevitable: we need to be able to access more memory than the DRAM capacity the server provides. ScaleMP's Software Defined Memory (SDM) technology makes more system memory available per server, far beyond the hardware limits, by utilizing memory from other nodes (over fabric) or from locally installed non-volatile memory (NVM) such as NAND Flash or 3D XPoint, transparently and without any changes to the operating system or applications. We shall present the benefits of SDM, discuss the relevant use cases, and share performance data.
With the tremendous growth in big data, low latency and high throughput are key requirements for many big data applications. The in-memory technology market is growing rapidly: traditional database vendors are extending their platforms to support in-memory capabilities, while others offer in-memory data grid and NoSQL solutions for high performance and scalability. In this talk, we will share our point of view on in-memory data grid and NoSQL technology. It is all about how to build an architecture that meets low-latency and high-throughput requirements. We will share our thoughts and experiences from implementing use cases that demand low latency and high throughput with inherent scale-out features.
You will learn how in-memory data grids and NoSQL are used to meet low-latency and high-throughput needs, and how to choose the in-memory technology that is a good fit for your use case.
Let's face it: distributed computing is hard. The truth is that most systems and vendor solutions work great under regular conditions; what separates them is what happens when things go wrong. If you're building a mission-critical distributed system, you need to take the time to build infrastructure to test for failure. In this talk we'll outline how we think about testing a distributed system, and share some real-world experience in ferreting out issues before they become problems in production. We'll provide a hands-on overview of our test framework and show you how you too can be prepared.
In-memory data grids (IMDGs) are widely used as distributed, key-value stores for serialized objects, providing fast data access, location transparency, scalability, and high availability. With its support for built-in data structures, such as hashed sets and lists, Redis has demonstrated the value of enhancing standard create/read/update/delete (CRUD) APIs to provide extended functionality and performance gains. This talk describes new techniques which can be used to generalize this concept and enable the straightforward creation of arbitrary, user-defined data structures both within single objects and sharded across the IMDG.
A key challenge for IMDGs is to minimize network traffic when accessing and updating stored data. Standard CRUD APIs place the burden of implementing data structures on the client and require that full objects move between client and server on every operation. In contrast, implementing data structures within the server streamlines communication since only incremental changes to stored objects or requested subsets of this data need to be transferred. However, building extended data structures within IMDG servers creates several challenges, including how to extend this mechanism, how to efficiently implement data-parallel operations spanning multiple shards, and how to protect the IMDG from errors in user-defined extensions.
This talk will describe two techniques which enable IMDGs to be extended to implement user-defined data structures. One technique, called single method invocation (SMI), allows users to define a class which implements a user-defined data structure stored as an IMDG object and then remotely execute a set of class methods within the IMDG. This enables IMDG clients to pass parameters to the IMDG and receive a result from method execution.
A second technique, called parallel method invocation (PMI), extends this approach to execute a method in parallel on multiple objects sharded across IMDG servers. PMI also provides an efficient mechanism for combining the results of method execution and returning a single result to the invoking client. In contrast to client-based techniques, this combining mechanism is integrated into the IMDG and completes in O(logN) time, where N is the number of IMDG servers.
The talk will describe how user-defined data structures can be implemented within the IMDG to run in a separate process (e.g., a JVM) to ensure that execution errors do not impair the stability of the IMDG. It will examine the associated performance trade-offs and techniques that can be used to minimize overhead.
Lastly, the talk will describe how popular Redis data structures, such as hashed sets, can be implemented as a user-defined data structure using SMI and then extended using both SMI and PMI to build a scalable hashed set that spans multiple shards. It will also examine other examples of user-defined data structures that can be built using these techniques.
The audience will learn (1) how to extend an IMDG to incorporate user-defined data structures, (2) the trade-offs between an extensible mechanism and the use of built-in data structures, such as in Redis, and (3) examples of using this mechanism in various applications.
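To make the SMI idea concrete, here is a minimal, single-process Java sketch. The MiniGrid class and its invoke call are hypothetical stand-ins for a real IMDG API, meant only to show how shipping a method name and arguments avoids moving the whole serialized structure:

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.concurrent.ConcurrentHashMap;

// Single-process stand-in for an IMDG hosting user-defined data structures
// (hypothetical API, illustration only). With SMI, the client ships a method
// name plus arguments and receives a result; the data structure itself never
// leaves the server that owns it.
class MiniGrid {
    private final Map<String, Object> store = new ConcurrentHashMap<>();

    void put(String key, Object structure) { store.put(key, structure); }

    // "Single method invocation": dispatch by method name on the stored object.
    Object invoke(String key, String method, Object... args) throws Exception {
        Object target = store.get(key);
        for (Method m : target.getClass().getMethods())
            if (m.getName().equals(method) && m.getParameterCount() == args.length)
                return m.invoke(target, args);
        throw new NoSuchMethodException(method);
    }
}

public class SmiSketch {
    public static void main(String[] args) throws Exception {
        MiniGrid grid = new MiniGrid();
        grid.put("orders", new PriorityQueue<Integer>());
        grid.invoke("orders", "offer", 42);          // only the argument travels
        grid.invoke("orders", "offer", 7);
        Object head = grid.invoke("orders", "poll"); // only the result returns
        System.out.println("head = " + head);        // prints 7 (smallest element)
    }
}
```

In a real IMDG the invoke call would serialize the method name and arguments to the server owning the key; the reflection-based dispatch here simply stands in for that remote execution step.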
This presentation will be shared between Dmitriy Setrakyan, Founder and Chief Product Officer at GridGain, and Mikhail Khasin, Senior Managing Director and Head of the Core Banking Transformation Program at Sberbank, one of the largest banks in Eastern Europe. In this session we will cover how GridGain and Apache Ignite helped Sberbank achieve scalability and performance based on distributed in-memory caching and colocated data processing. We will closely dissect the data grid architecture and how it was applied. We will also go over a few gotchas to watch out for, and recommend several best-practice approaches, including colocated data, colocated compute, fault tolerance, distributed queries, deadlock-free transactions, and more. If you are interested in real-time scalable in-memory architectures, then this session is for you.
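To illustrate the colocated-compute pattern at the heart of this architecture, here is a minimal Apache Ignite sketch; the cache name, key and update logic are invented, and real code would add transactions and error handling:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class ColocatedComputeSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Integer, Double> balances = ignite.getOrCreateCache("balances");
            balances.put(1, 100.0);

            // Run the closure on the node that owns key 1; the value is read
            // from local memory there, so no balance crosses the network.
            ignite.compute().affinityRun("balances", 1, () -> {
                IgniteCache<Integer, Double> local = Ignition.localIgnite().cache("balances");
                local.put(1, local.get(1) + 50.0);  // colocated read-modify-write
            });
        }
    }
}
```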
Caching is a frequently used and misused technique for speeding up performance, off-loading non-scalable or expensive infrastructure, scaling systems and coping with large processing peaks. In this talk Greg introduces you to the theory of caching and highlights key things to keep in mind when you apply caching. Then we take a comprehensive look at how the JCache standard standardises Java usage of caching.
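For a taste of the standard, here is a minimal JCache (JSR-107) sketch. It assumes some provider (Hazelcast, for instance) is on the classpath, and the cache name, types and expiry policy are invented for illustration:

```java
import javax.cache.Cache;
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.configuration.MutableConfiguration;
import javax.cache.expiry.CreatedExpiryPolicy;
import javax.cache.expiry.Duration;

public class JCacheSketch {
    public static void main(String[] args) {
        // Provider-neutral bootstrap: picks up whichever JCache provider
        // (Hazelcast, Ehcache, ...) is on the classpath.
        CacheManager manager = Caching.getCachingProvider().getCacheManager();

        MutableConfiguration<String, String> config =
            new MutableConfiguration<String, String>()
                .setTypes(String.class, String.class)
                .setExpiryPolicyFactory(
                    CreatedExpiryPolicy.factoryOf(Duration.ONE_MINUTE));

        Cache<String, String> cache = manager.createCache("quotes", config);
        cache.put("AAPL", "153.20");  // entry expires a minute after creation
        System.out.println(cache.get("AAPL"));
    }
}
```

The point of the standard is that this code is identical regardless of which vendor's cache sits underneath it.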
2015 saw the birth of a new product team at Target: Available to Promise (ATP). ATP is a new way to talk to guests about product availability and delivery times on Target.com, and it is used as the reservation engine for orders. In order to provide the guest experience we seek to deliver on Target.com, the team chose to make use of an in-memory data grid to serve up these features. This talk will cover what the team did right, what they did wrong, and what they were able to achieve (metrics).
Did:
- Implemented OSS Hazelcast
- Needed consistency and transactions (while using a NoSQL, non-transactional datastore)
- Used transactions backed by RESTful APIs (using MapStore with write-behind; see the sketch after these lists)
- Needed to consolidate distributed systems into a unified response (with transactions)
- Used microservices to facilitate the domains
- Went through many serialization iterations
- Iterated to the right balance of local caches and memory grid (used topics for transport, making use of the Camel Hazelcast component)
- Made use of the executor service as well (so we had a mix of entry processors (EP) and executors)
Learned:
- How to support transactions fast! (EP)
- How we bundled Hazelcast (client-server)
- Serialization is important!
- Gotchas: cache refresh timing; EP code is bundled in the server
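The write-behind pattern referenced above can be sketched as follows against the Hazelcast 3.x API. This is a minimal illustration, not Target's actual code; the map name, delay and REST-backed store are invented:

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.MapStoreConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.core.MapStore;

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical store that persists entries through a REST API instead of a
// database. With write-behind, puts return immediately and Hazelcast calls
// store()/storeAll() asynchronously after the configured delay.
class RestMapStore implements MapStore<String, String> {
    @Override public void store(String key, String value) {
        // e.g., POST to a (hypothetical) reservation service here
        System.out.println("POST /reservations/" + key);
    }
    @Override public void storeAll(Map<String, String> entries) { entries.forEach(this::store); }
    @Override public void delete(String key) { /* DELETE /reservations/{key} */ }
    @Override public void deleteAll(Collection<String> keys) { keys.forEach(this::delete); }
    @Override public String load(String key) { return null; }  // GET on cache miss
    @Override public Map<String, String> loadAll(Collection<String> keys) { return new HashMap<>(); }
    @Override public Iterable<String> loadAllKeys() { return null; }  // null = no eager preload
}

public class WriteBehindSketch {
    public static void main(String[] args) {
        Config cfg = new Config();
        cfg.getMapConfig("reservations").setMapStoreConfig(
            new MapStoreConfig()
                .setImplementation(new RestMapStore())
                .setWriteDelaySeconds(5));  // a delay > 0 turns write-through into write-behind

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(cfg);
        IMap<String, String> map = hz.getMap("reservations");
        map.put("guest-1", "cart-data");  // returns fast; the REST write happens within ~5s
    }
}
```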
In this presentation I'll look at the roadmap for Apache Ignite 2.0 as it moves toward becoming one of the first convergent data platforms, combining a cross-channel tiered storage model (DRAM, Flash, HDD) and multi-paradigm access patterns (K/V, SQL, MapReduce, MPP) in one highly integrated and easy-to-use data platform.
Integral Ad Science employs stream processing systems to extract value from data. We rely on Apache Storm to collect, aggregate and make decisions in real time. Such systems frequently use external storage for persistent state, both to hold real-time data views and to provide failure-recovery capabilities.
However, in heavily loaded systems, disk- and SSD-based storage easily becomes a performance bottleneck and complicates software evolution. Given typical data consistency and performance requirements, external state and reliance on the wall clock become a taxing and hardly maintainable choice.
In this talk, we will discuss how we handled these challenges when building a 1.5M msg/sec global processing system with Apache Storm and Apache Kafka. We will review the benefits of volatile in-memory state and inspect technology-agnostic patterns reemerging in multiple applications, including stream rewind, derived logical time and synchronization, and precision/performance trade-offs.
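To make the volatile-state idea concrete, here is a minimal Apache Storm bolt sketch that keeps its aggregates purely on-heap; the field names and counting logic are invented, not IAS's actual topology:

```java
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

import java.util.HashMap;
import java.util.Map;

// Minimal sketch of "volatile in-memory state": the bolt keeps its aggregates
// in a plain on-heap map, with no external store. After a failure, the state
// is rebuilt by rewinding Kafka offsets and replaying the stream, rather than
// being recovered from a database.
public class CountBolt extends BaseBasicBolt {
    private transient Map<String, Long> counts;

    @Override public void prepare(Map stormConf, TopologyContext context) {
        counts = new HashMap<>();  // recreated from scratch on every (re)start
    }

    @Override public void execute(Tuple tuple, BasicOutputCollector collector) {
        String key = tuple.getStringByField("adId");
        long n = counts.merge(key, 1L, Long::sum);
        collector.emit(new Values(key, n));
    }

    @Override public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("adId", "count"));
    }
}
```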
Modern transactional systems need to be fast, always available, and able to scale constantly to meet the ever-changing needs of the business. It is becoming increasingly commonplace for next-generation e-commerce systems to demand double- or single-digit millisecond response times, for financial trading systems to incur maximum latencies on the order of microseconds, and for gaming and analytics engines to consume hundreds of thousands of transactions a second. It is a common and tempting mistake to believe that we can meet the extreme needs of such systems by just replacing traditional disk-based storage systems with in-memory data grids while keeping traditional application architectures. Such an approach will take us only so far, after which the system's demands will once again overtake its capabilities. To truly meet the extreme needs of these systems and continue to scale as demand scales, we need to think differently about how such systems are architected and employ modern techniques to unlock the full potential of memory-oriented computing. This talk explains why and how.
Join Girish Mutreja, CEO of Neeve Research and author of the X Platform, as he discusses the above and provides a unique perspective on what's different about memory-oriented TP applications and how application architectures, particularly for mission-critical applications, need to adapt to the new world of memory-oriented computing. Girish will outline the key architectural elements of TP applications and explain how they need to function in the world of memory-oriented computing. He will delve into why such systems need to be architected as a marriage between messaging and data storage; why message routing and data gravity are of critical importance to these systems; how structured, in-memory state lends itself to extreme agility; how fault tolerance, load balancing, transaction processing and threading need to function in such systems; and why architectural precepts such as transaction pipelining and agent-oriented design are critical to reliability, performance and scalability. Girish will illustrate how these concepts have enabled enterprises such as MGM Resorts to transition to game-changing, memory-oriented architectures by leveraging the X Platform.
Online decision making requires interacting with an ever-changing environment, and the underlying machine learning models need to adapt to it. We discuss a class of algorithms and provide details of how the computation is parallelized using the Spark framework. Our implementation follows the architectural style of the Lambda Architecture: a batch layer to process bulk data and create models, a speed layer to process incremental data and create updates to models, and a serving layer to respond to decision requests in near real time. The batch layer is implemented as a Spark application, the speed layer is a Spark Streaming application, and the serving layer is implemented using the Play Framework. Spark's MLlib and low-level API are used for training and creating models in both the batch and speed layers.
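A minimal Java sketch of the batch layer in such a design might look like the following; the data, parameters, and the choice of k-means are invented for illustration (the talk's actual models and layers differ):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.clustering.KMeans;
import org.apache.spark.mllib.clustering.KMeansModel;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

import java.util.Arrays;

// Batch-layer sketch in the style the talk describes: train a model over bulk
// data with MLlib. A Spark Streaming job would apply incremental updates, and
// a Play Framework endpoint would serve model.predict() in near real time.
public class BatchLayerSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("batch-layer").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<Vector> points = sc.parallelize(Arrays.asList(
                Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
                Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)));

            // Bulk training pass; the resulting model is handed to the serving layer.
            KMeansModel model = KMeans.train(points.rdd(), 2, 20);
            System.out.println("cluster of (9,9): "
                + model.predict(Vectors.dense(9.0, 9.0)));
        }
    }
}
```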
While everyone is talking about 'stateless' services as a way to achieve scalability and high availability, the truth is that they are about as real as unicorns.
Building applications and services that way simply pushes the problem further down the stack, which only makes it worse and more difficult to solve (although, on the upside, it might make it somebody else’s problem). This is painfully obvious when building microservices, where each service must truly own its state.
The reality is that you don't need 'stateless' services to either scale out or be fault tolerant -- what you really need is a scalable, fault tolerant state management solution that you can build your services around.
In this talk we will discuss how some of the popular microservices frameworks are tackling this problem, and will look at technologies available today that make it possible to build scalable, highly available systems without 'stateless' service layers, whether you are building microservices or good ol' monoliths.
Do you need to move enterprise database information into a Data Lake in real time, and keep it current? Or maybe you need to track real-time customer actions in order to engage them while they are still accessible. Perhaps you have been tasked with ingesting and processing large amounts of IoT data.
Whatever the use case, you have found yourself embarking down the path of in-memory computing, and more specifically, stream processing and analytics. Your first thought may be to look towards open-source technology to achieve your objectives. But you quickly realize that there are a lot of options, and a lot of pieces that you would need to wire together to make this happen.
More importantly, you realize that, as part of your mission-critical business systems, this in-memory technology needs to be enterprise grade. It needs to be scalable, reliable, secure, and integrate easily with your existing systems.
In this presentation, Steve will discuss the architectural decisions that must be made to harden your IMC implementation for the enterprise. You will learn ways to approach scalability and reliability. This will cover partitioning of streaming data, design considerations for optimal data enrichment and processing capabilities, and failover and recovery strategies. Learn why it is crucial to secure the components of your in-memory architecture, and apply encryption correctly. Different approaches to integrating with the many sources of enterprise data in a streaming fashion – including enterprise databases, log files and IoT data – will also be shared.
- Registration | 7:30 am - 8:00 am
- Breakfast (registration continues) | 8:00 am - 9:00 am
- The In-Memory Computing Landscape: Leading the Fast Data Revolution | Abe Kleinfeld, President & CEO, GridGain | 9:00 am - 9:30 am
- In-Memory: The Foundation of the Internet of Things | Jason Stamper, Analyst, Data Management and Analytics, 451 Research | 9:35 am - 10:05 am
- More Memory for In-Memory? Easy! | Benzi Galili, Chief Operating Officer, ScaleMP | 10:10 am - 10:30 am
- Break | 10:30 am - 11:00 am
- Demystifying In-Memory Data Grid and NoSQL DB. Are They One and the Same? | Pandurang Pradeep Naik, Wipro | 11:00 am - 11:50 am
- The Truth: How to Test Your Distributed System for the Real World | Noah Arliss, Workday | 11:00 am - 11:50 am
- Break | 11:50 am - 11:55 am
- Implementing User-Defined Data Structures in In-Memory Data Grids | Dr. William Bain, ScaleOut Software | 11:55 am - 12:45 pm
- Case Study - GridGain for Sberbank Architecture in Financial Services | Dmitriy Setrakyan, GridGain; Mikhail Khasin, Sberbank | 11:55 am - 12:45 pm
- Lunch | 12:45 pm - 1:45 pm
- How to Use JCache to Speed Up Your Application | Greg Luck, Hazelcast | 1:45 pm - 2:35 pm
- Target's First Foray into an In-Memory Data Grid (and the Trips, Stumbles, and Falls that came with) | Jim Beyers, Hitendra Pratap Singh, Aaron Riekenberg, Target Corporation | 1:45 pm - 2:35 pm
- Break | 2:35 pm - 2:40 pm
- Apache Ignite 2.0 - Towards Convergent Data Platform | Nikita Ivanov, GridGain | 2:40 pm - 3:30 pm
- In-Memory Stream Processing @ Integral Ad Science | Alexey Kharlamov, Integral Ad Science | 2:40 pm - 3:30 pm
- Break | 3:30 pm - 3:50 pm
- Extreme Transaction Processing in a Memory-Oriented World | Girish Mutreja, Neeve Research | 3:50 pm - 4:40 pm
- Decision Making with MLlib, Spark and Spark Streaming | Girish Kathalagiri, Staff Engineer, Samsung SDS Research America | 3:50 pm - 4:40 pm
- Break | 4:40 pm - 4:45 pm
- The Illusion of Statelessness | Aleksandar Seovic, Architect, Oracle Coherence, Oracle | 4:45 pm - 5:35 pm
- Making IMC Enterprise Grade | Steve Wilkes, CTO, Co-Founder, Striim | 4:45 pm - 5:35 pm
- Networking Reception | 5:35 pm - 7:00 pm
Tracks: Tales from the Trenches | Architecture | Hardware | Streaming Data | New Capabilities
Schedule Day 2
Long gone are the halcyon days of multi-year big-budget waterfall-style projects serviced by traditional relational databases, overnight batch processing and monthly reports.
Today's world is about immediate and global access to vast oceans of information, insights derived from the analysis of floods of data and the ability to quickly and effectively move to market and deliver value to our customers.
Financial services operate in this increasingly complex and demanding environment with escalating demands on security, performance and scalability.
Regulatory requirements abound from a proliferation of financial regulatory authorities such as the Financial Conduct Authority (FCA), Prudential Regulation Authority (PRA) and Bank of England in the UK, the Securities and Exchange Commission (SEC) and Federal Reserve (Fed) in the US, and dozens more in the countries around the world in which we operate, spawning regulatory jargon like SOX, MAS, SCAP, and Basel III.
In this talk we'll look at the evolution of in-memory computing in financial services and how it adds to, and helps to address, the challenges faced by large-scale banking enterprises. We'll also look a little ahead at emerging technologies and discuss the opportunities and challenges that they present.
Non-Volatile DIMMs, or NVDIMMs, have emerged as a go-to technology for boosting performance for next generation storage platforms. The standardization efforts around NVDIMMs have paved the way to simple, plug-n-play adoption. This session will highlight the state of NVDIMMs today and give a glimpse into the future - what customers, storage developers, and the industry would like to see to fully unlock the potential of NVDIMMs.
Unveiling the X Platform
Girish Mutreja – CEO, Founder, Neeve Research
Neeve Research offers the X Platform, a revolutionary memory-oriented transaction processing platform for extreme enterprise applications. The platform uniquely integrates structured in-memory state, advanced messaging, multi-agency and decoupled enterprise data management to enable a true no-compromise extreme TP platform. The true innovation of the platform lies in its ability to provide a no-compromise blend of extreme performance, reliability, scalability and developmental agility. It is extremely fast, it is extremely easy to use, it can be used to build a wide variety of applications and the applications built using it exhibit zero data loss and scale linearly. After almost a decade of hard engineering and close-quarters field hardening with an exclusive set of Fortune 300 companies, Neeve is opening the platform for wider use. Listen as Girish Mutreja unveils the X Platform and shows how easy it is to build an application that performs at 100s of thousands of transactions per second or sub-100 microsecond latencies with zero garbage and zero data loss.
Tap Into Your Enterprise - Why Database Change and IMC Are an Ideal Match
Steve Wilkes – CTO, Co-Founder, Striim
In-memory computing is all about now. It’s the art of collecting and processing data as quickly as it is created in order to provide instant actionable insights. Databases, however, are all about the past. They are a record of what happened, not what is happening right now.
In this presentation, you will learn how to turn your enterprise databases, and the applications they support, into real-time sources of what’s currently happening throughout the business. By utilizing database change, and in-memory processing and analytics, you can tap into your enterprise activity and make decisions while the data is still relevant.
Capture Perishable Insights Before the Moment is Lost
Chris Villinger – VP Business Development & Marketing, ScaleOut Software
In today’s competitive business environment, companies need to capture perishable opportunities before the moment is lost. Business intelligence is not enough. Live systems need to analyze data in flight to create operational intelligence. With its ability to store and analyze fast-changing data in milliseconds, in-memory computing technology provides the secret sauce that enables operational intelligence at scale for these systems.
ScaleOut Software’s in-memory computing technology integrates scalable, in-memory data storage and data-parallel computing to deliver on the promise of operational intelligence. This enables financial systems to react more quickly to market price changes, IoT applications to track the behavior of millions of devices, healthcare systems to analyze real-time telemetry from pacemakers, and e-commerce sites to make context-aware recommendations to online shoppers – just to name a few applications. We are only now beginning to tap the power of this technology to enhance the value of live systems.
Lambda-B-Gone: The In-memory Case Study for Faster, Smarter and Simpler Answers
Dennis Duckworth – Director of Product Marketing, VoltDB
Simplicity, accuracy, and speed are three things everyone wants from their data architecture. A content delivery network based in LA was looking to achieve these goals and developed a framework that handled batch and stream processing with open source software. The objective was to manage the real-time aggregation of over 32 TB of daily web server log data. The problem? Everything.
Listen as Dennis Duckworth explains how VoltDB reduced the number of environments, used 1/10th the CPU cycles, and achieved 100% billing accuracy on 32 TB of daily web server data.
Comparing Software Defined Memory Options
Benzi Galili – Chief Operating Officer, ScaleMP
Software defined memory (SDM) has evolved in the past decade from DRAM accessible over custom fabrics to DRAM accessible over standard fabrics (SDM-F) and now memory over storage (SDM-S). This presentation will include a comparison of SDM-F and SDM-S, the advantages of each, key drivers of behavior and performance, and fit to specific use cases.
PipelineDB: The Streaming-SQL Database
Derek Nelson – CEO and Co-Founder, PipelineDB
PipelineDB is an open-source relational database that runs SQL queries continuously on streaming data, incrementally storing results in tables. Our talk will include an overview of PipelineDB's architecture, the use cases for continuous SQL queries on streams, and user case studies, and will outline how PipelineDB can be used to easily build scalable and highly available streaming and real-time analytics applications using only SQL, with no external dependencies.
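Because PipelineDB is PostgreSQL-compatible, a plain JDBC client can exercise continuous SQL. In the sketch below, the connection details, stream and view names are invented; the DDL follows PipelineDB's documented CREATE STREAM / CREATE CONTINUOUS VIEW form:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Minimal JDBC sketch against PipelineDB: define a stream and a continuous
// view, push events, and read the incrementally maintained aggregate.
public class PipelineDbSketch {
    public static void main(String[] args) throws Exception {
        try (Connection db = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/pipeline", "user", "pass");
             Statement st = db.createStatement()) {

            st.execute("CREATE STREAM page_views (url text)");
            // The view is updated incrementally as events arrive on the stream;
            // only the aggregate, not the raw events, is stored.
            st.execute("CREATE CONTINUOUS VIEW view_counts AS "
                     + "SELECT url, count(*) FROM page_views GROUP BY url");

            st.execute("INSERT INTO page_views (url) VALUES ('/home'), ('/home'), ('/docs')");

            try (ResultSet rs = st.executeQuery("SELECT url, count FROM view_counts")) {
                while (rs.next())
                    System.out.println(rs.getString("url") + " -> " + rs.getLong("count"));
            }
        }
    }
}
```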
With persistent memory solutions quickly moving from concept designs to mass-production reality, IT architects are faced with significant questions: How do I get the most value out of my system? How will the broader market adopt and implement today’s NVDIMM portfolio? What applications gain the most benefit from today’s solutions? What are the current challenges for adoption? How should I plan to ensure I keep up with industry trends?
Gordon Patrick, director of Micron’s enterprise computing memory business, will provide a view of how current products are driving new opportunities in persistent memory and provide insight on important industry trends affecting tomorrow’s persistent memory platforms.
Three key audience takeaways:
- What are the clearest routes to add value to your systems through today’s persistent memory solutions?
- What key design elements should be considered given the broader shift to non-volatile memory systems?
- What changes are needed to truly extract the value from today’s persistent memory technology?
In this talk I'll present the SharedRDD, a high-performance in-memory caching layer for Spark jobs. We'll work through (1) the design and architecture of this component, (2) its configuration, and (3) actual Java and Scala usage examples.
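A minimal Java usage sketch along those lines, using the ignite-spark module's shared-RDD API (cache name, config path and data are invented, and details may differ from the talk's own examples):

```java
import org.apache.ignite.spark.JavaIgniteContext;
import org.apache.ignite.spark.JavaIgniteRDD;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;
import java.util.Arrays;

// Sketch of the shared-RDD idea: the RDD is a live view over an Ignite cache,
// so state written by one Spark job survives the job and is visible to others.
public class SharedRddSketch {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
            new SparkConf().setAppName("shared-rdd").setMaster("local[*]"));

        JavaIgniteContext<Integer, Integer> ic =
            new JavaIgniteContext<>(sc, "config/ignite.xml");

        JavaIgniteRDD<Integer, Integer> shared = ic.fromCache("sharedNumbers");

        // Write pairs into the cache-backed RDD; a separate job (or a later
        // run) can read the same data without recomputing it.
        shared.savePairs(sc.parallelize(Arrays.asList(1, 2, 3, 4))
                           .mapToPair(i -> new Tuple2<>(i, i * i)));

        System.out.println("pairs stored: " + shared.count());
        sc.stop();
    }
}
```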
The advent of non-volatile memory (NVM) will fundamentally change the dichotomy between memory and durable storage in database management systems (DBMSs). These new NVM devices are almost as fast as DRAM, but all writes to it are potentially persistent even after power loss. Existing DBMSs are unable to take full advantage of this technology because their internal architectures are predicated on the assumption that memory is volatile. That means when NVM finally arrives, just like when you finally passed that kidney stone after three weeks, everyone will be relieved but the transition will be painful. Many of the components of legacy DBMSs will become unnecessary and will degrade the performance of data intensive applications.
In this talk, I discuss the key aspects of DBMS architectures that are affected by emerging NVM technologies. I then describe how to adapt in-memory DBMS architectures for NVM. I will conclude with a discussion of a new DBMS that we have been developing at Carnegie Mellon that is specifically designed to leverage the persistence properties of NVM in its architecture, such as in its recovery and concurrency control mechanisms. Our system is able to achieve higher throughput than existing approaches while reducing the amount of wear due to write operations on the device.
Apache Ignite is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time. But, did you know it provides streaming and complex event processing (CEP)? In this hands-on demonstration we will take Apache Ignite's Streaming and CEP features for a test drive. We will start with an example streaming use case then demonstrate how to implement each component in Apache Ignite. Finally we will show how to connect a dashboard application to Apache Ignite to display the results.
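As a preview of the kind of example the demonstration walks through, here is a minimal hand-rolled sketch combining an IgniteDataStreamer for ingestion with a continuous query as a simple CEP rule; the cache name, threshold and data are invented:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.ContinuousQuery;

public class IgniteCepSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Integer, Double> readings = ignite.getOrCreateCache("readings");

            // CEP side: a continuous query pushes matching updates to a listener.
            ContinuousQuery<Integer, Double> qry = new ContinuousQuery<>();
            qry.setLocalListener(events -> events.forEach(e -> {
                if (e.getValue() > 100.0)  // simple "threshold exceeded" rule
                    System.out.println("ALERT sensor " + e.getKey() + " = " + e.getValue());
            }));
            readings.query(qry);

            // Ingestion side: the data streamer batches puts for throughput.
            try (IgniteDataStreamer<Integer, Double> streamer =
                     ignite.dataStreamer("readings")) {
                for (int sensor = 0; sensor < 1000; sensor++)
                    streamer.addData(sensor, Math.random() * 200.0);
            }
        }
    }
}
```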
In-memory compute gave up on storage and moved the active working set to memory. This brings tremendous performance gains, but it also consumes expensive DRAM resources, puts data at risk, and suffers from slow recovery times when power failures occur.
In this talk we will present the convergence of memory and storage and how it can address these deficiencies. We will show examples in which software-defined memory (SDM) has enabled running working sets much larger than the DRAM budget, provided last-transaction safety, and allowed immediate recovery from power failure.
As the dangers of global climate change multiply, utility companies seek methods to reduce carbon emissions, such as integrating renewable and sustainable energy sources like wind, solar, and hydroelectric power. Renewable energy not only has the power to improve climate conditions, it also encourages economic growth. By combining advances in sensor technology with machine learning algorithms and environmental data, utility companies can monitor energy sources in real time to make faster decisions and speed innovation.
In this session, Nikita Shamgunov, CTO and co-founder of MemSQL, will conduct a live demonstration based on real-time data from 2 million sensors on 197,000 wind turbines installed on wind farms around the world. This Internet of Things (IoT) simulation explores the ways utility companies can integrate new data pipelines into established infrastructure. Attendees will learn how to deploy this breakthrough technology composed of Apache Kafka, a real-time message queue; Streamliner, an integrated Apache Spark solution; MemSQL Ops, a cluster management and monitoring interface; and a set of simulated data producers written in Python. By applying machine learning to analyze millions of data points in real time, the data pipeline predicts and visualizes health of wind farms at global scale. This architecture propels innovation in the energy industry and is replicable across other IoT applications including smart cities, connected cars, and digital healthcare.
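For flavor, here is a minimal Java stand-in for the simulated producers feeding the pipeline's Kafka entry point (the talk's actual producers are written in Python; the topic name and message format here are invented):

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

// Simulated turbine telemetry pushed into Kafka; Streamliner/Spark would then
// load the topic into MemSQL for real-time scoring and visualization.
public class TurbineProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int turbine = 0; turbine < 10; turbine++) {
                String reading = String.format(
                    "{\"turbine\":%d,\"rpm\":%.1f,\"vibration\":%.3f}",
                    turbine, 1000 + Math.random() * 500, Math.random());
                producer.send(new ProducerRecord<>("wind-telemetry",
                                                   Integer.toString(turbine), reading));
            }
        }
    }
}
```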
In-memory computing frameworks such as Spark are gaining tremendous popularity for Big Data processing, as their in-memory primitives make it possible to eliminate the disk I/O bottleneck. Logically, the more memory they have available, the better the performance they can achieve. However, unpredicted GC activity from on-heap memory management, the high cost of serialization/deserialization (SerDe), and bursts of temporary object creation/destruction greatly impact their performance and scale-out ability. In Spark, for example, when the volume of a dataset is much larger than the system memory, SerDe significantly impacts almost every in-memory computing step, such as caching, checkpointing, shuffling/dispatching, and data loading and storing.
With fast-growing advanced server platforms featuring significantly increased non-volatile memory, such as NVMe powered by Intel 3D XPoint technology and fast SSD array storage, how best to use the various hybrid memory-like resources, from DRAM to NVMe/SSD, determines Big Data application performance and scalability.
In this presentation, we will first introduce our non-volatile generic Java object programming model for in-memory computing. This programming model defines in-memory non-volatile objects that can be operated on directly in memory-like resources. We then discuss our structured-data in-memory persistence library, which can be used to load/store non-volatile generic Java objects from/to underlying heterogeneous memory-like resources such as DRAM, NVMe, and even SSD.
We then present a non-volatile computing case using Spark. We will show that this model can (1) lazily load data to minimize memory footprint, (2) naturally fit both non-volatile RDDs and off-heap RDDs, (3) use non-volatile/off-heap RDDs to transform Spark datasets, and (4) avoid memory caching by using in-place non-volatile datasets.
Finally, we will show that up to a 2X performance boost can be achieved on Spark ML tests after applying this non-volatile computing approach, which removes SerDe, caches hot data, and dramatically reduces GC pause time.
Today, many companies are faced with a huge quantity of data and a wide variety of tools with which to process it. This potentially allows for great opportunities to satisfy customers’ needs and bring user experience to the next level. However, in order to achieve this and provide a competitive solution, sophisticated and complex data processing is needed. Such processing can rarely be done with one tool or framework -- a number of tools are often involved, each having prowess in a particular field of the processing pipeline.
In this session, we will see the latest endeavors of Apache Ignite to integrate with other big data platforms and provide its in-memory computing strengths for data processing pipelines. In particular, we will take a closer look at how it can be integrated and used with Apache Kafka and/or Flume, and outline several usage scenarios.
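One way such an integration can be hand-rolled is sketched below: a Kafka consumer drains a topic into an Ignite cache through a data streamer. (Ignite also ships an ignite-kafka module that packages this pattern; the topic, cache and group names here are invented.)

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.util.Collections;
import java.util.Properties;

// Sketch of "Ignite as a data hub": drain a Kafka topic into an Ignite cache
// via a data streamer, where colocated compute and SQL can then use the data.
public class KafkaToIgniteSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "ignite-loader");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (Ignite ignite = Ignition.start();
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {

            ignite.getOrCreateCache("events");
            consumer.subscribe(Collections.singletonList("events-topic"));

            try (IgniteDataStreamer<String, String> streamer = ignite.dataStreamer("events")) {
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(1000);
                    for (ConsumerRecord<String, String> r : records)
                        streamer.addData(r.key(), r.value()); // batched into the grid
                }
            }
        }
    }
}
```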
Much industry focus is on All-Flash Arrays with traditional databases, but new databases using native direct-attached Flash have proven reliable, performant, and popular for operational use cases. Today, these operational databases store account information for banking and retail applications, real-time routing information for telecoms, and user profiles for advertising; they also support machine learning for applications in the financial industry, such as fraud detection. While proprietary PCIe and “wide SATA” had previously been popular, NVMe has finally come into operational use. Aerospike will discuss the benefits of NVMe for these use cases (including specific configurations and performance numbers), as well as the architectural implications of low-latency Flash and Storage Class Memory.
Stibo Systems recently released the in-memory component of our Master Data Management (MDM) platform, which delivers significant speed-ups in most parts of the system. Our MDM platform provides high-volume data management with many concurrent users. This in-memory component was built in-house, and this talk covers how and why we did it (a toy MVCC sketch follows the list below), including:
- MVCC (Multi Version Concurrency Control) aware map, off heap and compact.
- Lock-free MVCC aware indexing.
- Wait-free MVCC aware querying that goes directly on the metal.
- Clustering and MVCC with recovery support.
- Why we built our own in-memory technology, how we integrated it into our existing 200+ man-year system, and the speed-ups we gained.
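The sketch promised above: a toy, on-heap MVCC map in Java (purely illustrative, and far simpler than Stibo's off-heap, lock-free implementation) showing the core idea that readers pin a snapshot timestamp and never block writers:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative MVCC map: each key holds a chain of versions keyed by a
// logical timestamp. Readers pick the newest version at or below their
// snapshot, so reads never block writes and writes never block reads.
public class MvccMap<K, V> {
    private final AtomicLong clock = new AtomicLong();
    private final Map<K, NavigableMap<Long, V>> versions = new ConcurrentHashMap<>();

    /** Writers stamp a new version; old versions stay visible to old readers. */
    public void put(K key, V value) {
        long ts = clock.incrementAndGet();
        versions.computeIfAbsent(key, k -> new ConcurrentSkipListMap<>()).put(ts, value);
    }

    /** A snapshot is just a timestamp taken at transaction start. */
    public long snapshot() { return clock.get(); }

    /** Readers see the latest version written at or before their snapshot. */
    public V get(K key, long snapshot) {
        NavigableMap<Long, V> chain = versions.get(key);
        if (chain == null) return null;
        Map.Entry<Long, V> e = chain.floorEntry(snapshot);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        MvccMap<String, String> map = new MvccMap<>();
        map.put("sku-1", "v1");
        long snap = map.snapshot();        // reader starts here
        map.put("sku-1", "v2");            // concurrent writer
        System.out.println(map.get("sku-1", snap));            // v1: stable view
        System.out.println(map.get("sku-1", map.snapshot()));  // v2: latest
    }
}
```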
This will help you, as a developer, navigate the landscape of in-memory products and identify the trade-offs involved, helping you choose the right path.
This talk will describe the future memory and storage architecture created by the convergence of in-memory computing and emerging persistent memory technologies. The audience will learn:
- The new memory and storage architecture created by these technologies;
- The new operating system file system and memory management architectures under development by the major OS vendors;
- The new APIs for in-memory computing with persistent memory;
- Opportunities for software innovation based on this disruptive shift in cloud architecture.
Speedment SQL Reflector is a software solution that allows applications to get automatically updated data in real time. The SQL Reflector loads data from your existing SQL database and feeds it into an in-memory data grid, e.g., GridGain. When started, the SQL Reflector loads your selected existing relational data into your map cluster. Any subsequent changes made to the relational database (regardless of how: via your application, a script, SQL commands or even stored procedures) are then continuously fed to your GridGain nodes. Even SQL transactions are preserved, so that your maps will always reflect a valid state of the underlying SQL database.
- Registration, Breakfast | 8:00 am - 9:00 am
- In Memory Computing for Financial Services: Past, Present and Future | Robert Barr, VP Data Grid Engineering Lead, Barclays | 9:00 am - 9:20 am
- NVDIMM - Changes are Here So What's Next? | Arthur Sainio, Co-Chair SNIA NVDIMM Special Interest Group and Director, Marketing, SMART Modular Technologies | 9:25 am - 9:40 am
- Innovation Presentations | 9:40 am - 10:40 am
- Break | 10:40 am - 11:00 am
- Science Without the Fiction: How Understanding Persistent Memory Market Trends Can Help You Make the Best Design Choices | Gordon Patrick, Micron | 11:00 am - 11:50 am
- Shared In-Memory RDDs - Missing Link in Spark | Nikita Ivanov, GridGain | 11:00 am - 11:50 am
- Break | 11:50 am - 11:55 am
- What Non-Volatile Memory Means for the Future of Database Management Systems | Andy Pavlo, Carnegie Mellon University | 11:55 am - 12:45 pm
- Test Driving Streaming and CEP on Apache Ignite | Matt Coventon, Innovative Software Engineering | 11:55 am - 12:45 pm
- Lunch | 12:45 pm - 1:45 pm
- The Benefits of Memory and Storage Convergence to In-Memory Computing | Amit Golander, Plexistor | 1:45 pm - 2:35 pm
- Propelling IoT Innovation with Predictive Analytics | Nikita Shamgunov, MemSQL | 1:45 pm - 2:35 pm
- Break (Passport program giveaway raffle) | 2:35 pm - 2:40 pm
- Introduce Non-volatile Generic Object Programming Model for In-Memory Computing (Spark) Performance Improvement | Yanping Wang, Apache Mnemonic | 2:40 pm - 3:30 pm
- Apache Ignite as a Data Processing Hub | Roman Shtykh, CyberAgent | 2:40 pm - 3:30 pm
- Break | 3:30 pm - 3:50 pm
- NVMe, Storage Class Memory and Operational Databases: Real-World Results | Brian Bulkowski, Aerospike | 3:50 pm - 4:40 pm
- Using Lock-free and Wait-free In-memory Algorithms to Turbo-charge High Volume Data Management | Henning Andersen, Stibo Systems | 3:50 pm - 4:40 pm
- Break | 4:40 pm - 4:45 pm
- The In-Place Working Storage Tier | Ken Gibson, Intel | 4:45 pm - 5:35 pm
- Work with Multiple Hot Terabytes in JVMs | Per Minborg, Speedment | 4:45 pm - 5:35 pm