Welcome!

Cognitive Computing Authors: Elizabeth White, Liz McMillan, Yeshim Deniz, Kevin Benedict, Progress Blog

Blog Feed Post

In-Memory Data Grid: Explained…

In-memory processing has been a pretty hot topic lately. Many companies that historically would not have considered using in-memory technology because it was cost prohibitive are now changing their core systems’ architectures to take advantage of the low-latency transaction processing that in-memory technology offers. This is a consequence of the fact that the price of RAM is dropping significantly and rapidly and as a result, it has become economical to load the entire operational dataset into memory with performance improvements of over 1000x faster. In-Memory Compute and Data Grids provide the core capabilities of an in-memory architecture.

The goal of In-Memory Data Grids (IMDG) is to provide extremely high availability of data by keeping it in memory and in highly distributed (i.e. parallelized) fashion. By loading Terabytes of data into memory IMDGs are able to work with most of the Big Data processing requirements today.

At a very high level IMDG is a distributed object store similar in interface to a typical concurrent hash map. You store objects with keys. Unlike traditional systems where keys and values are often limited to byte arrays or strings – with IMDGs you can use any domain object as either value or key. This gives tremendous flexibility by allowing to keep exactly the same object your business logic is dealing with in the Data Grid without the extra step of marshaling and de-marshaling alternative technologies would require. It also simplifies the usage of data grid as you can in most cases interface with distributed data store as with a simple hash map. Being able to work with domain objects directly is one of the main differences between IMDGs and In-Memory Databases (IMDB). With the latter, users still need to perform Object-To-Relational Mapping which typically adds significant performance overhead.

There are also some other features in IMDGs that distinguish them from other products, such as NoSql databases, IMDBs, or NewSql databases. One of the main differences would be truly scalable Data Partitioning across cluster. Essentially IMDGs in their purest form can be viewed as distributed hash maps with every key cached on a particular cluster node – the bigger the cluster, the more data you can cache. The trick to this architecture is to make sure that you collocate your processing with the cluster nodes where data is cached to make sure that all cache operations become local and that there is no (or minimal) data movement within the cluster. In fact, when using well designed IMDGs, there should be absolutely no data movement on stable topologies – the only time when some of the data is moved is when new nodes join in or some existing nodes leave, hence causing some data repartitioning within the cluster.

The picture below shows a classic IMDG with a key set of {k1, k2, k3} where each key belongs to a different node. The external database component is optional. If present, then IMDGs will usually automatically read data from the database or write data to it.

Another distinguishing characteristic of IMDGs is Transactional ACID support. Generally a 2-phase-commit (2PC) protocol is used to ensure data consistency within cluster. Different IMDGs will have different underlying locking mechanisms, but usually more advanced implementations will provide concurrent locking mechanisms (like MVCC – multi-version concurrency control) and reduce network chattiness to a minimum, hence guaranteeing transactional ACID consistency with very high performance.

Data consistency is one of the main differences between IMDGs and NoSQL databases. NoSQL databases are usually designed on top of Eventual Consistency (EC) approach where data is allowed to be inconsistent for a period of time as long as it will become consistent *eventually*. Generally, the writes on EC-based systems are somewhat fast, but reads are slow (or to be more precise, as fast as writes are). Latest IMDGs with an *optimized* 2PC should at least match if not outperform EC-based systems on writes, and be significantly faster on reads. It is interesting to note that the industry has made a full circle moving from a then-slow 2PC approach to the EC approach, and now from EC to an *optimized* 2PC which often is significantly faster.

Different products provide different 2PC optimizations, but generally the purpose of all optimizations is to increase concurrency, minimize network overhead, and reduce the number of locks a transaction requires to complete. As an example, Google’s distributed global database, Spanner, is based on a transactional 2PC approach simply because 2PC provided a faster and more straightforward way to guarantee data consistency and high throughput compared to MapReduce or EC.

Even though IMDGs usually share some common basic functionality, there are many features and implementation details that are different between vendors. When evaluating an IMDG product pay attention to eviction policies, (pre)loading techniques, concurrent repartitioning, memory overhead, etc… Also pay attention to the ability to query data at runtime. Some IMDGs, such as GridGain for example, allow users to query in-memory data using standard SQL, including support for distributed joins, which is pretty rare.

The typical use for IMDGs is to partition data across the cluster and then send collocated computations to the nodes where the data is. Since computations are usually part of Compute Grids and have to be properly deployed, load-balanced, failed-over, or scheduled, the integration between Compute Grids and IMDGs is very important. It is especially beneficial if both In-Memory Compute and Data Grids are part of the same product and utlize the same APIs which removes the need of integration and usually renders utmost performant and reliable systems.

IMDGs (together with Compute Grids) are used throughout a wide spectrum of industries in applications as diverse as Risk Analytics, Trading Systems, Bio Informatics, eCommerce, or Online Gaming. Essentially every project that struggles with scalability and performance can benefit from In-Memory Processing and IMDG architecture.

Read the original blog entry...

More Stories By Thomas Krafft

Over 15 years of experience in marketing and demand creation, with strategies driving over $500 million in revenue for a variety of companies in several high-growth and competitive markets, including consumer software and web services, ecommerce, demand creation through web and search, big data, and now healthcare.

@ThingsExpo Stories
SYS-CON Events announced today that N3N will exhibit at SYS-CON's @ThingsExpo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. N3N’s solutions increase the effectiveness of operations and control centers, increase the value of IoT investments, and facilitate real-time operational decision making. N3N enables operations teams with a four dimensional digital “big board” that consolidates real-time live video feeds alongside IoT sensor data a...
Internet of @ThingsExpo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devic...
Mobile device usage has increased exponentially during the past several years, as consumers rely on handhelds for everything from news and weather to banking and purchases. What can we expect in the next few years? The way in which we interact with our devices will fundamentally change, as businesses leverage Artificial Intelligence. We already see this taking shape as businesses leverage AI for cost savings and customer responsiveness. This trend will continue, as AI is used for more sophistica...
SYS-CON Events announced today that SourceForge has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. SourceForge is the largest, most trusted destination for Open Source Software development, collaboration, discovery and download on the web serving over 32 million viewers, 150 million downloads and over 460,000 active development projects each and every month.
SYS-CON Events announced today that Daiya Industry will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Daiya Industry specializes in orthotic support systems and assistive devices with pneumatic artificial muscles in order to contribute to an extended healthy life expectancy. For more information, please visit https://www.daiyak...
SYS-CON Events announced today that Nihon Micron will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Nihon Micron Co., Ltd. strives for technological innovation to establish high-density, high-precision processing technology for providing printed circuit board and metal mount RFID tags used for communication devices. For more inf...
SYS-CON Events announced today that Massive Networks, that helps your business operate seamlessly with fast, reliable, and secure internet and network solutions, has been named "Exhibitor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. As a premier telecommunications provider, Massive Networks is headquartered out of Louisville, Colorado. With years of experience under their belt, their team of...
SYS-CON Events announced today that Suzuki Inc. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Suzuki Inc. is a semiconductor-related business, including sales of consuming parts, parts repair, and maintenance for semiconductor manufacturing machines, etc. It is also a health care business providing experimental research for...
SYS-CON Events announced today that mruby Forum will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. mruby is the lightweight implementation of the Ruby language. We introduce mruby and the mruby IoT framework that enhances development productivity. For more information, visit http://forum.mruby.org/.
Elon Musk is among the notable industry figures who worries about the power of AI to destroy rather than help society. Mark Zuckerberg, on the other hand, embraces all that is going on. AI is most powerful when deployed across the vast networks being built for Internets of Things in the manufacturing, transportation and logistics, retail, healthcare, government and other sectors. Is AI transforming IoT for the good or the bad? Do we need to worry about its potential destructive power? Or will we...
SYS-CON Events announced today that B2Cloud will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. B2Cloud specializes in IoT devices for preventive and predictive maintenance in any kind of equipment retrieving data like Energy consumption, working time, temperature, humidity, pressure, etc.
In his session at @ThingsExpo, Greg Gorman is the Director, IoT Developer Ecosystem, Watson IoT, will provide a short tutorial on Node-RED, a Node.js-based programming tool for wiring together hardware devices, APIs and online services in new and interesting ways. It provides a browser-based editor that makes it easy to wire together flows using a wide range of nodes in the palette that can be deployed to its runtime in a single-click. There is a large library of contributed nodes that help so...
What is the best strategy for selecting the right offshore company for your business? In his session at 21st Cloud Expo, Alan Winters, U.S. Head of Business Development at MobiDev, will discuss the things to look for - positive and negative - in evaluating your options. He will also discuss how to maximize productivity with your offshore developers. Before you start your search, clearly understand your business needs and how that impacts software choices.
SYS-CON Events announced today that SIGMA Corporation will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. uLaser flow inspection device from the Japanese top share to Global Standard! Then, make the best use of data to flip to next page. For more information, visit http://www.sigma-k.co.jp/en/.
SYS-CON Events announced today that NetApp has been named “Bronze Sponsor” of SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. NetApp is the data authority for hybrid cloud. NetApp provides a full range of hybrid cloud data services that simplify management of applications and data across cloud and on-premises environments to accelerate digital transformation. Together with their partners, NetApp em...
Agile has finally jumped the technology shark, expanding outside the software world. Enterprises are now increasingly adopting Agile practices across their organizations in order to successfully navigate the disruptive waters that threaten to drown them. In our quest for establishing change as a core competency in our organizations, this business-centric notion of Agile is an essential component of Agile Digital Transformation. In the years since the publication of the Agile Manifesto, the conn...
There is huge complexity in implementing a successful digital business that requires efficient on-premise and cloud back-end infrastructure, IT and Internet of Things (IoT) data, analytics, Machine Learning, Artificial Intelligence (AI) and Digital Applications. In the data center alone, there are physical and virtual infrastructures, multiple operating systems, multiple applications and new and emerging business and technological paradigms such as cloud computing and XaaS. And then there are pe...
Real IoT production deployments running at scale are collecting sensor data from hundreds / thousands / millions of devices. The goal is to take business-critical actions on the real-time data and find insights from stored datasets. In his session at @ThingsExpo, John Walicki, Watson IoT Developer Advocate at IBM Cloud, will provide a fast-paced developer journey that follows the IoT sensor data from generation, to edge gateway, to edge analytics, to encryption, to the IBM Bluemix cloud, to Wa...
SYS-CON Events announced today that MIRAI Inc. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. MIRAI Inc. are IT consultants from the public sector whose mission is to solve social issues by technology and innovation and to create a meaningful future for people.
While some developers care passionately about how data centers and clouds are architected, for most, it is only the end result that matters. To the majority of companies, technology exists to solve a business problem, and only delivers value when it is solving that problem. 2017 brings the mainstream adoption of containers for production workloads. In his session at 21st Cloud Expo, Ben McCormack, VP of Operations at Evernote, will discuss how data centers of the future will be managed, how th...