IDG Contributor Network: Five core attributes of a streaming data platform
As your data-driven organization considers incorporating new data sources like mobile apps, websites that serve a global audience, or sensor information from the internet of things, technologists will have questions about the required attributes of a streaming data platform.
Five core attributes are necessary to implement an integrated streaming platform, one that supports both the acquisition of streaming data and the analytics that make streaming applications possible:
Low latency: Streaming data platforms need to match the pace of the data sources they acquire data from. A key capability is matching the speed of data acquisition to the requirements of the near real-time analytics needed to disrupt particular business models or markets. The value of real-time streaming analytics diminishes when you have to wait for the data to land in a data warehouse or a Hadoop-based data lake. For location-based services and predictive maintenance applications in particular, the delay between when data is created and when it lands in a data management environment can mean, at best, a missed customer opportunity and, at worst, a stranded multi-million dollar asset critical to your business operations.
Scalable: Streaming data platforms do more than connect a couple of data sources behind the corporate firewall. They need to keep up with the projected growth of connected devices and the internet of things. This means they will need to stream data from a large number of sources, potentially millions or even billions, both internal and external.
Diverse: Streaming data platforms will need to support more than “new era” data sources such as mobile devices, cloud sources, and the internet of things. They will also be required to support “legacy” platforms such as relational databases, data warehouses, and operational applications like ERP, CRM, and SCM. These are the platforms that hold the information needed to place data from streaming devices, mobile apps, and browser clicks into context and so provide value-added insights.
Centralized: One of the core tenets of a streaming data platform is to make streaming architectures simpler to understand and easier to implement. With a centralized architecture, a streaming data platform can not only reduce the number of potential connections between streaming data sources and streaming data destinations, but also provide a centralized repository of technical and business metadata to enable common data formats and transformations.
Durable: The ability to land data in a data warehouse or Hadoop-based data lake environment is a key component of a streaming data platform. It allows not only for the “in-flight” acquisition and analysis of streaming data, but also for historical analysis that can be used to develop pattern-based policy rules or advanced analytical clustering for streaming data analysis and processing.
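To make the “Centralized” point about reducing potential connections concrete, here is a minimal back-of-the-envelope sketch. It assumes the worst case, in which every source may need to feed every destination; the function names and the sample counts are hypothetical and chosen only for illustration.

```python
def point_to_point_connections(sources: int, destinations: int) -> int:
    """Potential links when every source is wired directly to every destination."""
    return sources * destinations


def centralized_connections(sources: int, destinations: int) -> int:
    """Links when each source and each destination connects once to a central platform."""
    return sources + destinations


if __name__ == "__main__":
    # Hypothetical source/destination counts, for illustration only.
    for sources, destinations in [(5, 3), (100, 10), (1_000_000, 20)]:
        direct = point_to_point_connections(sources, destinations)
        hub = centralized_connections(sources, destinations)
        print(
            f"{sources} sources, {destinations} destinations: "
            f"{direct} direct links vs {hub} through a central platform"
        )
```

Under that assumption, direct wiring grows with the product of sources and destinations, while a centralized platform grows with their sum. The gap widens quickly as the number of sources approaches the scale described under “Scalable.”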
With these five core attributes as the foundation for your streaming data platform, you can start the technology journey toward building a robust and complete platform that will enable the streaming applications that your data-driven organization will be built upon.
This article is published as part of the IDG Contributor Network.
Source: InfoWorld Big Data