Financial institutions are among the largest commercial consumers of high-performance computing (HPC), and for good reason. HPC provides the foundation for the analytics financial institutions depend on, including trading (high-frequency and algorithmic), risk management, pricing and valuation of securities and derivatives, and business and economic analytics such as modeling and simulation.
The pressure to provide accurate, reliable financial analytics will push more financial institutions toward HPC, and their veteran technologists will look for more efficient technologies for extracting deeper insights. For capital markets, delivering real-time data and analysis -- faster and at lower cost -- to inform decisions is critical.
That is certainly true when competitors are gathering ever more data and applying technology in increasingly creative ways -- as in 2010, when UBS Investment Research used satellite surveillance of 100 Walmart parking lots, counting the cars parked in those lots month after month. Based on the traffic surge the satellite images revealed over the previous year, analysts got a different, and more accurate, picture of the company’s quarterly earnings than traditional methods provided.
Thanks to new technologies, “the times, they are a-changin’” with HPC, and big data is leading the charge. Indeed, the evolution of big data has affected almost every industry, and it's no surprise that the financial community has embraced it so avidly.
Financial services companies understand that more data improves their ability to understand and predict an enterprise’s financial health and growth -- or lack thereof. Big-data implementations promise to manage huge volumes of disparate data at the right speed, and within the appropriate time frame, to allow real-time analysis and reaction.
Certainly retailers, marketers, and financial institutions will continue to benefit significantly from investments in big-data analysis. But what lies ahead for big data? It is anyone’s guess, but a few key developments are worth watching and adopting as they continue to evolve.
Are social media a fad?
They may turn out to be, but they aren't dying anytime soon. Right now businesses are smitten with using big data to capture social media analytics. Much like the satellite images of the Walmart parking lots, institutions gather “images” using HPC to analyze social media data and predict trends and business preferences and, for that matter, a business’s effectiveness in using social media to its competitive advantage.
Social media provide insights into consumer habits and behaviors, which businesses are using to move products or services, and they are turning to HPC for help with all this data. The largest big-data implementations are directly or indirectly applied to social sites such as Facebook or Twitter and to social behaviors on traditional sites such as eBay or Amazon. Every day, consumers reveal the details of their lives, providing an abundance of behavioral data; coupled with the ubiquity of mobile devices, this offers an unprecedented look into their daily habits.
Big-data analysis of consumer sharing allows retailers to offer the products and services consumers are most likely to buy, at the precise moment when those consumers are most receptive, and to prompt them on the endpoint devices they are most likely using at the time. While this data is certainly useful to enterprises, the challenge always lies in the analytics.
The data-ingestion rate challenge
As more mobile and sensor sources of data emerge, so too will the requirements for rapid ingestion of huge amounts of diverse data types. Storage and processing are relatively inexpensive. Wired and wireless global networks are getting faster but will struggle to provide the capacity needed to feed some of the planned big-data implementations under development.
To keep an edge, it is strategically beneficial to analyze data as it is ingested, so that both processes run concurrently. This works well with large batch jobs, but in a high-throughput computing environment, where large and small jobs converge, the smaller jobs get lost in the shuffle. The challenge is ensuring that nodes are dedicated automatically to these smaller jobs (which can number 100,000 or more) without disrupting the larger jobs or requiring individual scheduling.
Recognizing this problem, my firm, Adaptive Computing, developed Nitro, which integrates with Moab HPC Suite. Nitro enables automatic high-speed throughput on short computing jobs by allowing the scheduler to incur the scheduling overhead only once for a large batch of jobs. In essence, it identifies a subset of compute nodes in an HPC cluster and transforms them into a cell of nodes where tasks, or short-lived jobs, can be assigned. Nitro works in unison with Moab to facilitate parallel job scheduling with larger, longer-running jobs.
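Nitro's internals are not detailed here, but the underlying idea -- paying the scheduling overhead once for a large batch of short jobs rather than once per job -- can be illustrated with a small Python sketch. The task function, batch size, and worker count below are all illustrative assumptions, not Nitro's actual API:

```python
from concurrent.futures import ThreadPoolExecutor

def short_task(x):
    # Stand-in for a short-lived compute job (e.g., a single pricing calculation).
    return x * x

def run_batch(batch):
    # One scheduled unit executes many short tasks, so scheduling
    # overhead is incurred once per batch instead of once per task.
    return [short_task(x) for x in batch]

def batched(tasks, size):
    # Chunk the task list into fixed-size batches.
    for i in range(0, len(tasks), size):
        yield tasks[i:i + size]

tasks = list(range(10_000))  # e.g., 10,000 short jobs
with ThreadPoolExecutor(max_workers=4) as pool:
    results = []
    # The pool schedules 20 batches of 500 tasks, not 10,000 individual jobs.
    for batch_result in pool.map(run_batch, batched(tasks, 500)):
        results.extend(batch_result)
```

The same amortization principle applies at cluster scale: dedicating a cell of nodes to batches of short jobs keeps them from competing with long-running work for scheduler attention.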
The emerging data-as-a-service (DaaS)
While some institutions will look to ingest and process data themselves to gain first-mover advantage, others may choose to wait until more DaaS offerings appear in the marketplace. The need for wider varieties and sources of data exists. If the needs are near-term, DaaS might be an option. In the coming months, more companies will offer secure, readily available data in vertical markets including medical/healthcare, financial, environmental, logistics, and social behaviors.
Collateral tools and technologies will determine adoption velocities
The more data financial entities can collect, the more they can analyze to predict stock performance. The better their capabilities to analyze the data, the more precise their predictions will be. It goes without saying that without faster and better analysis and visualization tools, financial institutions will be unable to comprehend the relationships or patterns within the data, defeating the point of collecting it at all.
Many big-data calculations require process decomposition and distribution, marshaling, recovery, and auditing. Unfortunately, current tools from the Apache Foundation and other sources fall far short. Extract, transform, and load (ETL) technologies will need to evolve to perform their services across a broader set of inputs and outputs, providing a dynamic capability for extending to as-yet-unknown forms.
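One way an ETL pipeline can stay open to as-yet-unknown input formats is a registry of pluggable extractors chosen at runtime. The following Python sketch is a hypothetical illustration of that idea (the registry, the `etl` function, and the sample records are all invented for this example, not any particular product's API):

```python
import json

# Registry of extractors, keyed by input format; new formats can be
# registered at runtime without changing the pipeline itself.
extractors = {}

def extractor(fmt):
    def register(fn):
        extractors[fmt] = fn
        return fn
    return register

@extractor("csv")
def extract_csv(text):
    # Naive CSV parsing: first line is the header, rest are rows.
    header, *rows = [line.split(",") for line in text.strip().splitlines()]
    return [dict(zip(header, row)) for row in rows]

@extractor("json")
def extract_json(text):
    return json.loads(text)

def etl(fmt, raw, transform, load):
    records = extractors[fmt](raw)          # extract: format chosen at runtime
    return load(transform(r) for r in records)  # transform, then load

raw = "symbol,price\nAAPL,101.5\nMSFT,98.2"
out = etl("csv", raw,
          transform=lambda r: {**r, "price": float(r["price"])},
          load=list)
```

Adding support for a new source then means registering one extractor function, leaving the transform and load stages untouched -- the kind of dynamic extensibility the paragraph above argues ETL tooling will need.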
Certainly the future is bright for big data, even with the inevitable hurdles ahead. Of course, what big data will look like five years from now is somewhat unknown, given its continuing, unpredictable shifts. All we can do is design for change, not change the design.

Alan Nugent is a leading independent member of the Adaptive Computing Board of Directors. He is a senior executive with nearly 30 years of experience in managing and engineering the strategic direction for technology in global enterprises ranging in size from start-ups to $20 ...