As vendors and users testified at last month's In-Memory Computing
Summit, the relatively low cost of flash memory is driving databases and
apps toward leveraging Fast Data – mobile and sensor cloud data – using
systems whose storage is predominantly or even entirely composed of
main and flash memory. One use case cited by a presenter employed one
terabyte of main memory and one petabyte of flash.
What is driving this shift in databases and the applications that use them?
Increasingly, enterprises are realizing that "almost-real-time"
handling of massive streams of data from cars, cell phones, GPS and the
like is the new frontier -- not only of analytics but also of
operational systems that handle the Internet of Things (IoT). As one
participant noted, this kind of real-time data-chewing allows your car
not only to warn of traffic ahead but also to detect another car parked
around the corner in a dangerous position.
It gives each customer product "thing" in the Internet of Things
adaptability as well as intelligence. On the analytics side, real-time
monitoring allows more rapid feedback and adjustment in the sales
process.
Handling this sensor-driven Web quickly is a task dubbed Fast Data by
summit participants. And it is a greater technical challenge than
handling Big Data, mainly because of the need for "almost-real-time"
operational processing.
The decreasing cost of flash memory now makes it possible to handle
Fast Data without breaking the bank, though, and new databases designed
to take advantage of "flash-only" storage (from vendors such as Redis
Labs) are arriving.
Increasingly, Fast Data implementations are showing up not only at
public cloud providers but also within forward-looking enterprises.
So what are the emerging best practices of these flash-memory
database architectures? Space doesn’t permit a full discussion of the
smart implementation techniques that are sprouting, but here are five
good rules of thumb:
Treat Flash as an Extension of Main Memory
This is what I call the "flat-memory" approach. Vendors must do much
of the heavy lifting here, ensuring that processors treat flash as just
a slower version of main memory (as Intel is apparently doing) and that
flash modules provide new interfaces optimized for random rather than
sequential disk accesses. The user should look for the vendors who do
this best, and design Fast-Data-using apps to view all storage as flat
and equally available.
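To make the "flat" view concrete, one common mechanism is to map a flash region directly into the application's address space so reads and writes look like ordinary memory operations. The sketch below is illustrative only, not any vendor's implementation; an ordinary temporary file stands in for the flash device.

```python
import mmap
import os
import tempfile

# Stand-in for a flash device exposed through the filesystem; a real
# deployment would map the flash module's block device instead.
fd, path = tempfile.mkstemp()
os.ftruncate(fd, 4096)  # size the "flash" region

# Map the region so the app addresses it like ordinary memory:
# byte-granular random access, no explicit block I/O in the data path.
flash = mmap.mmap(fd, 4096)

flash[0:5] = b"hello"        # write as if to a RAM buffer
record = bytes(flash[0:5])   # read back the same way

flash.close()
os.close(fd)
os.unlink(path)
```

From the application's point of view, the mapped region and regular RAM structures are addressed the same way; only the latency differs.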
Implement Variable Tiering for Flash
That is, allow flash to sometimes be used for storage, and sometimes
for processing, depending on the needs of the application. Vendors such
as Redis Labs are in the forefront of providing this.
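One way to picture variable tiering is a cache-style policy that keeps hot keys in main memory and demotes cold ones to flash, promoting them back on access. This is a hedged sketch of the idea, not Redis Labs' actual mechanism; the `flash_tier` dict simply stands in for a flash-resident store, and the tiny RAM capacity is for illustration.

```python
from collections import OrderedDict

RAM_CAPACITY = 2          # hot keys kept in main memory (tiny, for illustration)
ram = OrderedDict()       # recency order tracks which keys are hot
flash_tier = {}           # stand-in for a flash-resident store

def put(key, value):
    """Write to RAM; demote the coldest key to flash when RAM is full."""
    ram[key] = value
    ram.move_to_end(key)
    if len(ram) > RAM_CAPACITY:
        cold_key, cold_val = ram.popitem(last=False)
        flash_tier[cold_key] = cold_val

def get(key):
    """Read from RAM if hot; otherwise promote the key back from flash."""
    if key in ram:
        ram.move_to_end(key)
        return ram[key]
    value = flash_tier.pop(key)
    put(key, value)       # promotion may in turn demote another key
    return value

put("a", 1); put("b", 2); put("c", 3)   # "a" is demoted to flash
hot = list(ram)                          # the two most recent keys
promoted = get("a")                      # brings "a" back, demotes "b"
```

The point of "variable" tiering is that the same flash can serve as overflow storage, as here, or as working space for processing, depending on what the application needs at the moment.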
Understand the Tradeoff Between Data Correctness and Processing Speed
Specifically, vendors vary in how well they can write to storage
without risking data loss, and in how far they will trade some data
correctness for processing speed.
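The tradeoff shows up concretely in whether each write is forced to stable storage before the call returns. A minimal sketch, with arbitrary record sizes and temporary files standing in for the database's storage:

```python
import os
import tempfile
import time

def write_records(path, records, durable):
    """Append records, optionally forcing each one to stable storage."""
    with open(path, "ab") as f:
        for rec in records:
            f.write(rec)
            if durable:
                f.flush()
                os.fsync(f.fileno())  # survives a crash, but far slower

records = [b"x" * 64] * 200

fd, fast_path = tempfile.mkstemp(); os.close(fd)
fd, safe_path = tempfile.mkstemp(); os.close(fd)

t0 = time.perf_counter()
write_records(fast_path, records, durable=False)  # speed, some loss risk
fast_secs = time.perf_counter() - t0

t0 = time.perf_counter()
write_records(safe_path, records, durable=True)   # correctness, slower
safe_secs = time.perf_counter() - t0
```

Both paths end up with identical data on disk; the difference is what a crash mid-run would leave behind, and how long the writer waits.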
Mesh New Database Architecture with Existing Big Data Architectures
Summit participants agreed that the new database architectures simply
could not handle the full scope of present-day Big Data analytics – or,
to put it another way, Big Data can’t do Fast Data’s job, and vice
versa. Frameworks and brokers that parcel out application data between
operational and analytical databases are today’s main candidates for
good approaches to this problem.
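Such a broker can be as simple as a dispatcher that writes each event through to the operational store immediately and batches it for bulk loading into the analytical store. A hedged sketch, with plain in-memory lists standing in for the two databases and a deliberately tiny batch size:

```python
BATCH_SIZE = 3

operational_db = []   # stand-in for the Fast Data / operational store
analytical_db = []    # stand-in for the Big Data / analytical store
pending = []          # events awaiting the next analytical batch load

def ingest(event):
    """Route one event: write-through operationally, batch analytically."""
    operational_db.append(event)     # low-latency operational path
    pending.append(event)
    if len(pending) >= BATCH_SIZE:   # amortized bulk load for analytics
        analytical_db.extend(pending)
        pending.clear()

for evt in ({"sensor": i, "speed": 40 + i} for i in range(5)):
    ingest(evt)

# All 5 events reach the operational store immediately; only the first
# full batch of 3 has been flushed to the analytical store so far.
```

The design choice is the usual one: the operational side sees every event at once, while the analytical side accepts some staleness in exchange for efficient bulk loads.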
Accept and Plan for Multiplication of Vendor Databases
Get familiar with columnar database technology. Real-world use cases
are already adding NoSQL/Hadoop-based databases and columnar databases
to handle the operational and analytical sides of Fast Data.
Bottom Line on Fast Data
Vendors are moving exceptionally rapidly to provide the basic
technology for the new flash-memory Fast Data architectures. The
benefits in real-time analytics and leveraging the Internet of Things
are clearly strategic, even before all of the potentialities of the
sensor-driven Web have been fully comprehended. So what are you waiting
for?