IDG Contributor Network: Ensuring big data and fast data performance with in-memory computing
In-memory computing (IMC) technologies have been available for years. However, until recently, the cost of memory made IMC impractical for all but the most performance-critical, high-value applications.
Over the last few years, though, with memory prices falling and demand for high performance increasing in just about every area of computing, I’ve watched IMC discussions go from causing glazed eyes, to generating mild interest, to eliciting genuine excitement: “Please! I need to understand how this technology can help me!”
Why all the excitement? Because companies that understand the technology also understand that if they don’t incorporate it into their architectures, they won’t be able to deliver the applications and the performance their customers demand today and will need tomorrow. In-memory data grids and in-memory databases, both key elements of an in-memory computing platform, have gained recognition and mindshare as more and more companies have deployed them successfully.
Consider the challenges caused by the explosion in data being collected and processed as part of the digital transformation. As you go through your day, almost everything you do intersects with some form of data production, collection, or processing: text messaging, emailing, social media interaction, event planning, research, digital payments, video streaming, interacting with a digital voice assistant, and more. Every department in your company relies on increasingly sophisticated, web-scale applications (such as ERP, CRM, and HRM), which themselves make ever more sophisticated demands for data and analytics.
Now add in the growing range of consumer IoT applications: smart refrigerators, watches and security systems—with nonstop monitoring and data collection—and connected vehicles with constant data exchange related to traffic and road conditions, power consumption and the health of the car. Industrial IoT is potentially even bigger. I recently read that to improve braking efficiency, a train manufacturer is putting 400 sensors in each train, with plans to increase that number to 4,000 over the next five years. And data from all of these applications must be collected and often analyzed in real time.
That’s where in-memory computing comes in. An IMC platform offers a way to transact and analyze data that resides entirely in RAM, rather than continually retrieving data from disk-based databases into RAM before processing it. In addition, in-memory computing solutions are built on distributed architectures, so they can use parallel processing across the cluster to further outpace single-node, disk-based database alternatives. These benefits can be gained simply by inserting an in-memory computing layer between the existing application and database layers. Taken together, the performance gains can reach 1,000x or more.
Also, because in-memory computing solutions are distributed systems, it is easy to increase the RAM pool and the processing power of the system by adding nodes to the cluster. The system automatically recognizes each new node and rebalances data across the cluster.
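To make that concrete, here is a minimal sketch of the pattern in Java, using Apache Ignite (one open-source IMC platform) purely as an illustration; the cache name and sample data are invented. Running the same program on additional machines simply starts more nodes, and the grid rebalances the cached data across them.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class CacheLayerSketch {
    public static void main(String[] args) {
        // Each run of this program on another machine starts one more node;
        // the grid detects it and rebalances the cached data automatically.
        try (Ignite ignite = Ignition.start()) {
            // A distributed key-value cache that sits between the application and
            // the existing database, keeping the hot working set entirely in RAM.
            IgniteCache<Long, String> orders = ignite.getOrCreateCache("orders");

            orders.put(1L, "order 1: 3 items, 42.00 USD"); // write to the in-memory layer
            String hot = orders.get(1L);                   // read served from RAM, not disk

            System.out.println(hot);
        }
    }
}
```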
Today, IMC use cases continue to expand. Companies are accelerating their operational and customer-facing applications by deploying in-memory data grids between the application and database layers of their systems to cache the data and enable distributed parallel processing across the cluster nodes. Some are using IMC technology for event and stream processing to rapidly ingest, analyze, and filter data on the fly before sending the data elsewhere.
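For the ingestion side, a sketch along the same lines might look like the following. It again assumes Apache Ignite as the example platform; the cache name, sensor IDs, and filter threshold are made up for illustration.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class IngestSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache("readings");

            // The streamer batches entries and routes each one to the node that
            // owns its key, so ingestion throughput grows with the cluster.
            try (IgniteDataStreamer<Long, Double> streamer = ignite.dataStreamer("readings")) {
                for (long sensorId = 0; sensorId < 1_000; sensorId++) {
                    double reading = Math.random() * 100;  // stand-in for a live sensor feed
                    if (reading > 5.0)                     // filter on the fly before storing
                        streamer.addData(sensorId, reading);
                }
            }
        }
    }
}
```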
Many large analytic databases and data warehouses are using IMC technology to accelerate complicated queries on large data sets. And companies are beginning to deploy hybrid transactional/analytical processing (HTAP) models which allow them to transact and run queries on the same operational data set, reducing the complexity and cost of their computing infrastructure in use cases such as IoT.
The importance of IMC will continue to increase over the coming years as ongoing development delivers new capabilities and technologies, including:
First-class support for distributed SQL
Strong support for SQL will extend the life of this industry standard and eliminate the need for SQL professionals to learn proprietary query languages; the queries they need can often be written in a single line of standard SQL. Leading in-memory data grids already include ANSI SQL-99 support.
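As a rough illustration of what standard SQL against a data grid looks like, the sketch below assumes Apache Ignite's SQL API as one example of such support; the table, schema choice, and values are invented.

```java
import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class DistributedSqlSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // Any cache handle can be used to submit SQL; the tables themselves
            // are partitioned across whatever nodes make up the grid.
            IgniteCache<Object, Object> cache = ignite.getOrCreateCache("sqlDummy");

            cache.query(new SqlFieldsQuery(
                "CREATE TABLE IF NOT EXISTS trade (id BIGINT PRIMARY KEY, symbol VARCHAR, qty INT)")
                .setSchema("PUBLIC")).getAll();

            cache.query(new SqlFieldsQuery(
                "INSERT INTO trade (id, symbol, qty) VALUES (?, ?, ?)")
                .setArgs(1L, "ACME", 500).setSchema("PUBLIC")).getAll();

            // The "single line of SQL": a standard aggregate, executed in parallel
            // on the nodes that hold the data.
            List<List<?>> rows = cache.query(new SqlFieldsQuery(
                "SELECT symbol, SUM(qty) FROM trade GROUP BY symbol").setSchema("PUBLIC")).getAll();

            rows.forEach(System.out::println);
        }
    }
}
```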
Non-volatile memory (NVM)
NVM retains data through a power loss, eliminating the need for software-based fault tolerance. A decade from now, NVM will likely be the predominant storage model for computing, enabling large-scale, in-memory systems that use hard disks or flash drives only for archival purposes.
Hybrid storage models for large datasets
By supporting a universal interface to all storage media—RAM, flash, disk, and NVM—IMC platforms will give businesses the flexibility to easily adjust storage strategy and processing performance to meet budget requirements without changing data-access mechanisms.
IMC as a system of record
IMC platforms will increasingly be used by businesses as authoritative data sources for business-critical records. This will be driven in part by IMC support for highly efficient HTAP on the same operational database, as well as by the introduction of disk-based persistence layers for high availability and disaster recovery.
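As a sketch of what such a persistence layer looks like from a developer's point of view, the snippet below enables native disk persistence in Apache Ignite, again used only as an illustrative platform; the cache name and record are invented, and the configuration is deliberately simplified.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PersistenceSketch {
    public static void main(String[] args) {
        // Keep the working set in RAM, but write changes through to disk so the
        // cluster can act as the system of record and survive restarts.
        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration().setDataStorageConfiguration(storage);

        try (Ignite ignite = Ignition.start(cfg)) {
            ignite.cluster().active(true); // persistent clusters are activated explicitly

            IgniteCache<Long, String> customers = ignite.getOrCreateCache("customers");
            customers.put(42L, "Jane Doe"); // durable write: held in memory and persisted to disk
        }
    }
}
```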
Artificial intelligence
Machine learning on small, dense data sets is easily accomplished today, but machine learning on large, sparse data sets requires a data management system that can store terabytes of data and perform fast parallel computations on it: a perfect IMC use case.
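A minimal sketch of that pattern, assuming Apache Ignite's compute API purely as an example, is shown below: each node computes a partial statistic over the data it holds locally, and only the small partial results travel over the network. The cache name and toy data are invented for illustration.

```java
import javax.cache.Cache;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CachePeekMode;
import org.apache.ignite.lang.IgniteCallable;

public class ParallelStatsSketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Long, Double> features = ignite.getOrCreateCache("features");
            for (long i = 0; i < 10_000; i++)
                features.put(i, Math.random()); // toy stand-in for a much larger data set

            // Each node sums only the entries it owns (PRIMARY copies), so the heavy
            // work happens in parallel next to the data; only tiny partial results
            // travel back to the caller.
            IgniteCallable<double[]> partialSum = () -> {
                Ignite local = Ignition.localIgnite();
                double sum = 0;
                long count = 0;
                for (Cache.Entry<Long, Double> e :
                        local.<Long, Double>cache("features").localEntries(CachePeekMode.PRIMARY)) {
                    sum += e.getValue();
                    count++;
                }
                return new double[] {sum, count};
            };

            double sum = 0, count = 0;
            for (double[] p : ignite.compute().broadcast(partialSum)) {
                sum += p[0];
                count += p[1];
            }
            System.out.println("mean feature value = " + (count == 0 ? 0 : sum / count));
        }
    }
}
```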
All the new developments around in-memory computing shouldn’t fool you into thinking it’s unproven. It’s a mature, mainstream technology that has been used for more than a decade in applications including fraud detection, high-speed trading, and high-performance computing. But it’s now more affordable, and vendors are making their IMC platforms easier to use and applicable to more use cases. The sooner you begin exploring IMC, the sooner your company can benefit from it.
This article is published as part of the IDG Contributor Network.