Why It's Important: High-speed trading shops and developers of trading algorithms are sifting through big data to find alphas in equity markets and other asset classes. They're using open-source technologies such as Hadoop to chug through the data. Rather than purchase their own servers, firms can rent 300 virtual machines without incurring capital expenditures. Thus, many firms are turning to Amazon's EC2 cloud to access hundreds of virtual machines for processing analytics, backtesting and storage of market data. The availability of elastic computing as a utility through Amazon EC2 and other cloud providers enables firms to rent processing power by the hour.
Where the Industry Is Now: Quantitative firms are generating tons of data from testing their strategies across an array of processors. Tradeworx, a firm engaged in high-frequency trading strategies, leverages the Amazon cloud for testing and running its own strategies. "We use the cloud for storage, for security and backup," said Manoj Narang, CEO at Tradeworx, speaking on a September Wall Street & Technology webcast. "We use it for automation, and we use it for speed, which is the ability to parallelize computations."
Facing a similar big data issue, Deep Value, a developer of algorithms, backtests its algorithms "on a multitude of orders across many months of historic data," says CEO Harish Devarajan. To gain an edge, it also must simulate how the algos would have worked across hundreds of days of trading. The next phase is to ask "what if" questions of the data from hundreds of machines, which actually creates a new problem: "storing an ocean of data," Devarajan says.
Focus in 2013: Financial firms are all storing the same market data and time series data in the cloud, which is wasting money, contends Adam Honore, research director at Aite Group, a research and consulting firm. Once firms have run their tests and simulations, historical data storage could be a shared expense, Honore says. "All the time series [data] could get stored for everybody," he says.
Cloud providers like Amazon could create an ecosystem around the providers of time series data. For instance, if a provider such as Tick Data is in the cloud, Amazon has no mechanism to tell other financial firms that Tick Data is there storing the same data, Honore says. "At some point, firms like Deep Value and HFT shops and big banks need to say, 'This is stupid. We're all storing this stuff independently.'" Because high-frequency trading firms are highly competitive and regard data as their secret sauce, Honore says that individual transactions and client data should be stored separately. But, he adds, "I don't necessarily agree that storing that data in the cloud is any less secure than storing it locally."
Though industry consortiums are time-consuming to establish, Honore notes, one focused on sharing data could be useful. "Consortiums may decide they should all be drawing from the same pool [of historical data]," he says. Meanwhile, Deep Value is taking action. The algo developer's utility computing bills are getting so large that it's considering a hybrid approach. "We're buying machines," says Paul Haefele, Deep Value's managing director of technology, noting that it's better to use the cloud for volume spikes. Even so, for privacy, regulatory and confidentiality reasons, financial firms will likely want a hybrid approach where they own and operate their data centers, Haefele says.
Technology Providers: NYSE Technologies launched the Capital Markets Community Platform in June 2011, from its Mahwah, N.J., data center. In September, Amazon Web Services announced a partnership with Nasdaq OMX to offer FinQloud, a service that provides infrastructure and storage for financial data in the cloud to meet regulatory compliance requirements. Amazon Web Services, Google and IBM operate global data centers with server farms, offering services in the public cloud. Data center operators Savvis and Equinix are offering private cloud-based services, tailored to the needs of financial services firms.
Price Tag: A firm that's storing a big chunk of historical data for backtesting is paying roughly $3,000 a month for storage alone, or $4,000 to $6,000 per month for both processing and storage, Aite Group's Honore says. Deep Value pays Amazon $60,000 a year to store 60 terabytes of data, compared with paying $200 for a 3-TB hard drive. "Sometimes you don't need to use 300 machines," Haefele says. "It might be cheaper to use real machines than the virtual machines you get in the cloud."
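
To put the Deep Value figures on a common footing, the short Python sketch below works out the cost per terabyte on each side. It uses only the numbers quoted above; the function names are illustrative. It assumes the 60 terabytes would sit on bare 3-TB drives with no redundancy, servers, power or staff, so the drive total is a floor on what owning the hardware would cost, not a full estimate.

```python
import math

# Figures quoted in the article
CLOUD_COST_PER_YEAR = 60_000   # dollars Deep Value pays Amazon per year
CLOUD_CAPACITY_TB = 60         # terabytes stored in the cloud
DRIVE_PRICE = 200              # dollars for one hard drive
DRIVE_CAPACITY_TB = 3          # terabytes per hard drive


def cloud_cost_per_tb_year():
    """Annual cloud storage cost per terabyte."""
    return CLOUD_COST_PER_YEAR / CLOUD_CAPACITY_TB


def drives_needed(capacity_tb=CLOUD_CAPACITY_TB):
    """Number of drives needed to hold capacity_tb (assumption: no redundancy)."""
    return math.ceil(capacity_tb / DRIVE_CAPACITY_TB)


def raw_drive_cost(capacity_tb=CLOUD_CAPACITY_TB):
    """One-time cost of bare drives only (assumption: no servers, power or staff)."""
    return drives_needed(capacity_tb) * DRIVE_PRICE


print(cloud_cost_per_tb_year())   # 1000.0 dollars per TB per year
print(drives_needed())            # 20 drives
print(raw_drive_cost())           # 4000 dollars, one time
```

On these numbers, a year of cloud storage costs 15 times the one-time price of the bare drives. That gap is the context for Haefele's remark, though the true cost of owning hardware would narrow it.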