What The 'Internet of Things' Means For Big Data Storage

The rise of connected devices, from refrigerators to thermostats and even medical devices, has created an interesting data dilemma: How should users collect, monitor and store the volume of data their devices now create? Here, Joel Berman, Acronis fellow and longtime IT professional, explains the data storage implications of the "Internet of Things," why a one-size-fits-all backup policy doesn't work, and how to manage the stream of data from connected devices.

So, what does "The Internet of Things" trend mean to you? 

The “Internet of Things,” which I define as any endpoint that we are now putting online other than the traditional connected devices such as tablets, smartphones, PCs and laptops, is advancing in both form and sophistication. Back in 2003, Hitachi unveiled chips so small that they could be sprinkled into food and used to track what people eat. Today, we have everything from heart monitors to traffic cameras to smart refrigerators that can be used to collect and monitor data.

What are the storage implications of creating so much data?

How can we possibly track all of the data that we are creating? It's already getting to the point where we are wasting resources by collecting too much data. Take traffic cameras, for example. What’s the sense in saving every single piece of footage captured from a traffic camera? When you think of the millions of traffic cameras all around the world, that’s an incredible amount of data we don’t necessarily need.

However, when this data is aggregated and used for analysis, it becomes more meaningful and easier to store. Consumers and businesses alike have to ask themselves which data is most useful as a fine-grained element, and which data is useful only when it is aggregated and statistically interesting. Take a heart monitor, for example. A patient has a sensor that measures each heartbeat. The individual reading captured when a patient’s heart skips a beat would be important to the patient and their doctor. However, a pharmaceutical company monitoring the data from heart monitors would more likely want to see it in aggregate form, to track the success of certain heart medications across a wide population.
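The distinction between fine-grained and aggregated data can be illustrated with a rough Python sketch. Everything here is a hypothetical assumption for illustration, not something from the interview: the function name `summarize_beats`, the 1.5x threshold, and the sample readings. It is not a clinical method. The idea is to keep individual readings only when they look unusual, and to reduce everything else to a small summary.

```python
from statistics import mean, median

def summarize_beats(intervals_ms, skip_factor=1.5):
    """Split beat-to-beat intervals into flagged readings and an aggregate summary.

    intervals_ms: time between consecutive heartbeats, in milliseconds.
    skip_factor:  hypothetical threshold; an interval this many times longer
                  than the median is treated as a possible skipped beat.
    """
    baseline = median(intervals_ms)
    # Fine-grained data: keep the individual reading (with its position)
    # only when the gap is unusually long.
    flagged = [(i, ms) for i, ms in enumerate(intervals_ms)
               if ms > skip_factor * baseline]
    # Aggregate data: a few numbers that stand in for the whole series.
    avg = mean(intervals_ms)
    summary = {
        "count": len(intervals_ms),
        "mean_interval_ms": avg,
        "mean_bpm": 60000 / avg,
    }
    return flagged, summary

# Made-up sample: one interval is roughly double the others.
intervals = [800, 810, 790, 1600, 805, 795]
flagged, summary = summarize_beats(intervals)
print(flagged)
print(summary)
```

In this sketch, the patient and doctor would care about the `flagged` list, while a company studying a wide population would need only the `summary` from each device.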

How should companies prioritize the data they collect from connected devices?

Some companies delete the data they gather after a week, while others follow a more systematic process, such as keeping only the data collected every other week in a given year. The right retention policy depends on how you define which data is worth keeping and how quickly you need to access it. A company studying global warming might need to store daily temperature data for hundreds of years, while an electric company might only care about temperature updates once per hour for ten years or less. The electric company just needs to know how much power to generate based on the local weather forecast and data on how much energy users consume at various temperatures.
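A retention policy of this kind can be sketched as two settings: how often to keep a reading, and how long to keep it. The Python below is a minimal illustration under stated assumptions. The names `RetentionPolicy`, `apply_policy`, `CLIMATE` and `UTILITY` are hypothetical, and the 300-year figure simply stands in for "hundreds of years."

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class RetentionPolicy:
    resolution: timedelta  # keep at most one reading per this interval
    keep_for: timedelta    # discard readings older than this

def apply_policy(readings, policy, now):
    """Thin (timestamp, value) readings to the policy's resolution and age limit."""
    cutoff = now - policy.keep_for
    kept, last_kept = [], None
    for ts, value in sorted(readings):
        if ts < cutoff:
            continue  # too old to keep under this policy
        if last_kept is None or ts - last_kept >= policy.resolution:
            kept.append((ts, value))
            last_kept = ts
    return kept

# Hypothetical policies loosely modeled on the two examples in the text.
CLIMATE = RetentionPolicy(resolution=timedelta(days=1),
                          keep_for=timedelta(days=365 * 300))
UTILITY = RetentionPolicy(resolution=timedelta(hours=1),
                          keep_for=timedelta(days=365 * 10))

# Made-up data: one temperature reading every 15 minutes for three hours.
start = datetime(2024, 1, 1)
readings = [(start + timedelta(minutes=15 * i), 20.0 + 0.1 * i) for i in range(12)]
now = datetime(2024, 1, 2)

print(len(apply_policy(readings, UTILITY, now)))  # hourly readings survive
print(len(apply_policy(readings, CLIMATE, now)))  # one reading per day survives
```

The same stream of sensor data produces very different storage footprints depending on which policy is applied, which is why a one-size-fits-all approach tends not to fit.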

With the "Internet of Things," there's a fusion of many elements, and the relevant information depends upon how those elements work together to complete specific tasks. 
