Financial services firms must get a handle on "big data" -- a term for extremely large datasets -- to meet growing regulatory reporting demands and to uncover market opportunities.
Why It's Important: The term "big data," used to describe extremely large datasets, has been around for years and is common in academic circles, as well as in pharmaceuticals and scientific research, where users must crunch massive amounts of data to reach conclusions. Big data often refers to data stores that consume terabytes, petabytes or even exabytes (1 million terabytes) of storage. Now financial services firms are looking to get a handle on big data as: a) the amount of data continues to grow exponentially; b) regulatory requirements are forcing firms to take a proactive view of risk; and c) more and more users inside financial firms require access to ever-larger datasets to help uncover market opportunities, customer trends and product development possibilities, and to support regulatory reporting and risk management.
Where the Industry Is Now: Data management has been a challenge for capital markets firms for decades. The industry has collectively spent billions of dollars over the years trying to create complete and accurate datasets, the so-called "golden copy." Data management has improved, especially in certain asset classes, but data sharing across an institution is still a challenge, as each business unit often prefers to rely on its own set of data for its calculations -- making enterprisewide data analysis tremendously difficult.
Today there are even more types of unstructured data that need to be collected, analyzed and stored. For instance, traders are interested in news about particular companies or industries, and many advanced trading operations have developed tools that can analyze news in real time to help make trading decisions. Now, with more news coming in via video, audio or even through Twitter, some firms are trying to devise ways to analyze all of this data for a variety of needs. Firms are also analyzing data in documents, click data from websites, data from surveys, web search traffic data and more.
On the risk management and regulatory sides of the business, users need to look at data from across the business and sometimes need to compare it to data from the markets or other data sources. As terabytes grow into petabytes, existing relational database structures are having a difficult time keeping up.
Focus In 2012: One thing is certain: New regulations focused on risk management and transparency are driving the need to manage big data. "Risk is taking center stage," says Senthil Kumar, group VP, financial services, at Oracle. "Financial institutions need to get all of this disparate information in the right context. Risk management is driving a lot of investment, and banks are making changes in their data processes in areas of compliance, risk and performance."
Meanwhile, the hunger for data will likely continue to grow, comments James Austin, CEO of Vertex Analytics, a data-collection management provider. "Big data in financial services in 2012 -- or 2015, or 2020 for that matter -- is going to be an important topic," Austin says. "There aren't going to be any firms that want less data in the future; they are all going to want more data. And regulations are going to play a large part in big data."
Industry Leaders: Most large Wall Street banks are looking for better ways to handle large datasets. Bank of America Merrill Lynch, for example, is using big-data techniques to manage petabytes of data for regulatory compliance and advanced analytics. The bank is using Hadoop, an open source framework that supports data-intensive distributed computing, allowing data to be crunched across a distributed network of computers.
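For readers unfamiliar with how frameworks such as Hadoop "crunch" data across many machines, the sketch below simulates the MapReduce programming model in a single Python process: records are mapped to key/value pairs, shuffled by key, then reduced per group. The trade-record format and symbols are hypothetical illustrations, not drawn from any bank's actual systems; real Hadoop additionally distributes these phases over a cluster and handles storage, scheduling and failure recovery.

```python
from collections import defaultdict

def map_phase(records):
    # Emit (symbol, volume) pairs from raw trade records.
    # The "SYMBOL,volume" record format here is a hypothetical example.
    for record in records:
        symbol, volume = record.split(",")
        yield symbol, int(volume)

def shuffle(pairs):
    # Group all values by key -- the step Hadoop performs between
    # its map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Aggregate each key's values; here, total traded volume per symbol.
    return {symbol: sum(volumes) for symbol, volumes in groups.items()}

trades = ["IBM,100", "ORCL,250", "IBM,50"]
totals = reduce_phase(shuffle(map_phase(trades)))
print(totals)  # {'IBM': 150, 'ORCL': 250}
```

Because the map and reduce functions operate independently on their inputs, the framework can run them in parallel on whichever machines hold the data -- which is what makes the model attractive for petabyte-scale workloads.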
Technology Providers: Most of the large database players also provide technology that could help with big data, including IBM, Oracle, Sybase, Tableau Software and Teradata. Smaller players include Vertex and Attivio, a provider of unified information access.
Price Tag: Hadoop, an open source product, is a favorite among developers working with big data. Because it is open source, the software itself costs little, though hardware and support are extra. Larger database providers offer a variety of products at different price levels based on application, dataset size, processing power and other variables.