Data Management

06:45 PM
Becca Lipman
Commentary

So Much Data, So Little Time

How fast does data need to be used in order to be beneficial?

In a microsecond economy, most data is useful only in the first few milliseconds, or, to an extent, the first few hours after it is created. But the way the industry is collecting data, or more accurately, hoarding it, you'd think its value lasts a lifetime.

Yes, storage costs are going down and selecting data to delete is no easy task, especially for the unstructured and unclassified sets. And fear of deleting something that could one day be useful is always going to be a concern. But does this give firms the go-ahead to be data hoarders?

"There's debate about the industry collecting data over time, and how much of that long-term tail is useful," explains Dane Atkinson, CEO of SumAll, a NYC-based data analytics startup. "People think of data as an endless repository, but most of the data's value only lasts for the five seconds after it's created."

Atkinson says that in the long tail, what matters is how much data can be economized given the cost of storage. Storage costs may be dropping, but hosting fees aren't negligible, and firms still have to pay to keep their history.

He says firms imagine they will eventually be in inflection mode, when someone or some tool will come along and leverage all the data to generate deep insights, ones that ultimately justify the warehouse costs.

"They want infinite data sets so that one day you can ask any question. But it's impractical and rarely used correctly." He argues it's better to sit down and come up with specific questions based on what you need to know later (rather than "just in case"), and then to narrow down the data sets to relevant indexes that tally the transactions.

How does a company reverse course on data hoarding? Given data's steep drop in value, Atkinson suggests adjusting the granularity as a good way to start. While some firms may see the value in storing transactions in a minute-by-minute, hourly, or daily log, other data sets may be most sensibly rolled up into a weekly or monthly metric, especially when looking at a timeframe of ten years or more.
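The roll-up Atkinson describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record shape (a timestamp paired with an amount), the function name roll_up_monthly, and the sample values are all hypothetical and do not come from SumAll or any firm mentioned here.

```python
from collections import defaultdict
from datetime import datetime


def roll_up_monthly(transactions):
    """Collapse (timestamp, amount) records into a per-month count and total.

    Hypothetical helper: the record shape is an assumption for illustration.
    """
    summary = defaultdict(lambda: {"count": 0, "total": 0.0})
    for timestamp, amount in transactions:
        key = timestamp.strftime("%Y-%m")  # one bucket per calendar month
        summary[key]["count"] += 1
        summary[key]["total"] += amount
    return dict(summary)


# Made-up sample records, not real data.
sample = [
    (datetime(2014, 6, 30, 9, 15), 120.0),
    (datetime(2014, 7, 1, 10, 0), 80.0),
    (datetime(2014, 7, 24, 16, 45), 50.0),
]
monthly = roll_up_monthly(sample)
```

Once the monthly figures exist, the individual records behind them become candidates for cheaper storage or deletion, which is the trade-off the article is weighing.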

Of course, that's rarely the reality. "Every company I talk to has stored everything they possibly can," says Atkinson. "But most data, more than 50 percent, is like driving a car off the lot. The value drops significantly seconds later."

Leveraging tiered storage models can also help, he suggests, and files should be archived as far offline and at as large a scale as possible. "Way too many people keep data on expensive, highly granular tiers on aspirations that one day they will use it. And at the end of the day, once you get down to the results people want to see, it tends to be small files."
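A tiering rule of this kind can be as simple as assigning data to a storage class by age. The sketch below assumes three tiers and uses made-up cut-offs (30 days and 365 days); the tier names, thresholds, and the choose_tier function are hypothetical and would need to reflect a firm's own access patterns and retention obligations.

```python
from datetime import date

# Hypothetical thresholds: (maximum age in days, tier name).
TIERS = [(30, "hot"), (365, "warm")]


def choose_tier(created, today):
    """Pick a storage tier from the age of the data (illustrative only)."""
    age_days = (today - created).days
    for max_age, tier in TIERS:
        if age_days <= max_age:
            return tier
    # Anything older than the last threshold goes to the cheapest tier.
    return "offline-archive"


recent = choose_tier(date(2014, 7, 20), date(2014, 7, 29))
older = choose_tier(date(2014, 1, 1), date(2014, 7, 29))
oldest = choose_tier(date(2010, 1, 1), date(2014, 7, 29))
```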

For example, an average customer that has 120 gigabytes per year will really want to use only 2-3 megabytes.

Despite the realities of the way firms are actually using their data, or how they will leverage data for discovery purposes in the future, the industry has shown no inclination to slow down.

"You're living in the fantasy if you think you're going to leverage it."

Becca Lipman is Senior Editor for Wall Street & Technology. She writes in-depth news articles with a focus on big data and compliance in the capital markets. She regularly meets with information technology leaders and innovators and writes about cloud computing, datacenters, ...
Comments
ANON1244563612401,
User Rank: Apprentice
7/29/2014 | 1:19:01 PM
Re: Data - to save or not to save
We live in a data-driven universe - not just a data-driven business world. But without waxing too philosophical, Becca, you should also talk to a professional who does not share Atkinson's point of view. This is a very BIG issue and I hope you can chat with a data scientist from, say, eBay - or someone from here at Teradata - or SAS. I think Oliver Ratzeberger would offer a meaningful, informed view that your readers would like to read about. - Mike O'Sullivan - a PR scientist
Greg MacSweeney,
User Rank: Author
7/25/2014 | 7:12:18 AM
Re: Data
Getting rid of data is hard, as you point out. Just think of a person's email. Most email gets archived and users are very reluctant to delete messages (in bulk) just in case they need to refer to the email sometime in the future. But, realistically, how often do you look for or need an email from 6 months ago, let alone 1 or 2 years ago? Probably .0001% of emails are needed after 3 months, but we keep them all (and regulated companies, such as banks, are required to keep everything).
Becca L,
User Rank: Author
7/24/2014 | 11:40:06 AM
Re: Data
When fintech execs talk about big data solutions, and the sheer amount of information they are collecting (because we can!), all I think of is big data problems in the long-term, but few are thinking along these lines today.

Yes, storing data is getting cheaper, but the volume is high and growing, and these costs are not exactly negligible. I have no doubt Atkinson is right about the declining value of data over time, or how much of it will actually be made useful in the future. It seems crazy to me that so little attention has been given to weeding out useless old information. Just think of all the social media information being captured: the tweets and Facebook posts that hold no value whatsoever.

Of course, weeding out and trashing data takes guts - you never know what will be useful down the road and, heaven forbid, you might delete something that has a ripple effect and impacts the company's compliance. Lose-lose, really.
Byurcan,
User Rank: Author
7/24/2014 | 9:52:27 AM
Data
The ever-increasing hunger for data will only continue; eventually firms will be so overburdened with storing and trying to mine data they will be able to do little else.