Once upon a time, market data was simple. You bought data from an aggregator such as Reuters, Bloomberg or Thomson, and you accessed it through a feed or terminal. You paid a fee and you received clean, fast and reliable data.
With the advent of electronic trading, direct market access and algorithms, however, fast is no longer good enough. Firms need blazing fast -- no, strike that -- they need faster-than-light-speed fast.
The problem with getting data that fast is that the act of data aggregation slows it down. Sending data from the exchange to an aggregator to the firm takes time. Now, we are not talking seconds -- we are talking about milliseconds. But in an era of 10-millisecond execution, even 50-millisecond delays can be costly.
To minimize data latency, firms are beginning to skip the aggregator and move to direct feeds. A direct feed is just that -- buying the data directly from the exchange. OK, so how hard is that? Instead of connecting to the aggregator, I connect with the exchange, right?
Well, not so fast. Now, all of the complexity that aggregators abstracted needs to be addressed by each firm. The first level of complexity is that data comes from many sources. That means a firm needs to connect to multiple exchanges (and ECNs). And while some exchanges have consolidated, many have done so in name only. So the NYSE feed is different from the Arca feed, and the Nasdaq feed (until recently) wasn't really one feed; it was three feeds (SuperMontage, INET and BRUT). In addition, there is Bloomberg's TradeBook, BATS, OnTrade (Citigroup's acquisition of NexTrade's ECN) and DirectEdge (Knight's acquisition of the Attain ECN). Also, there are five or six regional exchanges as well as options, futures and a plethora of global exchanges, too.
Further complicating matters, each exchange has its own data format and symbology (naming conventions), which means that each data feed must be normalized, renamed and consolidated before the data actually can be used. And that is only the beginning. The high-speed data space has been segmented into five or six technology categories, each filled by a number of technology providers that rarely have best-of-breed solutions across segments.
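To make the normalization problem concrete, here is a minimal sketch of what mapping two venue-specific feeds onto one internal schema might look like. All field names, symbologies and sample records below are invented for illustration; real exchange feeds are binary protocols with far richer message types.

```python
# Hypothetical sketch: normalizing two exchange feeds into one internal format.
# Each venue reports the same instrument under its own symbol and field layout,
# so every tick must be renamed and restructured before it can be used.

# Invented mapping from (venue, venue symbol) to an internal symbol.
SYMBOL_MAP = {
    ("NYSE", "IBM"): "IBM.US",
    ("ARCA", "IBM.P"): "IBM.US",
}

def normalize(venue, raw):
    """Map a venue-specific tick into a single internal schema."""
    if venue == "NYSE":
        symbol, price, size = raw["sym"], raw["px"], raw["qty"]
    elif venue == "ARCA":
        symbol, price, size = raw["ticker"], raw["last"], raw["size"]
    else:
        raise ValueError(f"no handler for venue {venue}")
    return {
        "symbol": SYMBOL_MAP[(venue, symbol)],
        "price": float(price),
        "size": int(size),
        "venue": venue,
    }

ticks = [
    normalize("NYSE", {"sym": "IBM", "px": "98.50", "qty": "300"}),
    normalize("ARCA", {"ticker": "IBM.P", "last": "98.51", "size": "200"}),
]
# Both ticks now carry one symbol ("IBM.US") and one schema, so downstream
# components can treat the two venues as a single consolidated stream.
```

The point of the sketch is that every new venue adds another branch -- another format, another symbol map -- which is exactly the work the aggregators used to do for you.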
First, ticker plants are needed to take data from the exchange and prepare it for delivery. Feed handlers are then needed to normalize the data and consolidate multiple feeds into a single stream. Downstream of the feed handler sits the stream processor, which looks for certain data and separates the wheat from the chaff. The desired data is then sent to an event processor that analyzes the data more closely for specific conditions that trigger "events," such as the creation of a quote, an order or an order cancellation. You also need a high-speed messaging bus so that once the data is delivered it gets transported and processed quickly. Then there is storage. Once data is captured, you'll want to store it for backtesting to determine whether the conditions you are searching for and the events you want to create will actually be effective.
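The stream-processor and event-processor stages described above can be sketched as two chained filters: one that keeps only the instruments you care about, and one that emits an "event" when a condition fires. The watchlist, threshold and record layout here are all invented for illustration.

```python
# Hypothetical sketch of the stream-processor / event-processor stages.

WATCHLIST = {"IBM.US"}

def stream_filter(ticks):
    """Stream processor: separate the wheat from the chaff --
    keep only instruments on the watchlist."""
    for tick in ticks:
        if tick["symbol"] in WATCHLIST:
            yield tick

def event_processor(ticks, threshold=100.00):
    """Event processor: trigger an event when a specific
    condition (here, a price crossing a threshold) is met."""
    for tick in ticks:
        if tick["price"] >= threshold:
            yield {"event": "PRICE_ALERT",
                   "symbol": tick["symbol"],
                   "price": tick["price"]}

raw_feed = [
    {"symbol": "IBM.US", "price": 99.95},   # on watchlist, below threshold
    {"symbol": "MSFT.US", "price": 30.10},  # filtered out: not on watchlist
    {"symbol": "IBM.US", "price": 100.05},  # triggers the event
]

events = list(event_processor(stream_filter(raw_feed)))
# events == [{"event": "PRICE_ALERT", "symbol": "IBM.US", "price": 100.05}]
```

In a production system each stage would be a separate high-throughput component connected by the messaging bus, with the captured stream also written to a time-series store for backtesting; the generator chain above only illustrates the division of labor between the stages.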
So for a simple, albeit not light-speed fast, aggregated data feed, you now need ticker plants, feed handlers, normalizers, stream processors, event processors, messaging buses and time-series databases. Oy. And by the way, none of this is cheap. Including buying the data and transporting it (which I didn't even mention), you are looking at each feed costing upwards of six figures. Multiply that by a dozen and, along with the expanding number of markets, suddenly you are talking real money.
So the next time your management starts talking about direct feeds, be prepared -- you are in for an expensive journey.
Larry Tabb, Special Contributing Editor

Larry Tabb is the founder and CEO of TABB Group, the financial markets' research and strategic advisory firm focused exclusively on capital markets, founded in 2003 and based on the interview-based research methodology of "first-person knowledge" he developed.