September 19, 2013

There is a growing consensus that the limits of value generation from the low-latency segment of high-frequency trading (HFT) are being reached and that the next wave of algorithmic trading will be based on the combination of real-time processing with unstructured “big data” sources.

While the “race to zero” in the low-latency segment of HFT is not entirely over – witness the recent investments in low-latency microwave networks by McKay Brothers and others – profitability has diminished and the market is consolidating for both HFT firms and their vendors.

Right now, everyone is struggling to position for the next wave by examining the technical architecture differences between the solutions for “big data” (e.g. Hadoop) and those for HFT (e.g. in-memory processing).


In my view, the technology itself is unlikely to be the greatest obstacle to the successful integration of “big data” into HFT. Three fundamental cultural differences are going to be harder to resolve:

1. Data quality. Unstructured data is messy, inconsistent and incomplete. It is best used in situations where the consequences of making mistakes are limited and there is ample opportunity to feed the lessons from those mistakes back into the algorithms. I recently attended a presentation by a data scientist from Spotify, the music delivery application. Their recommendation algorithm is kept as simple as possible, given the massive size of the data set, but they are constantly adjusting it and are more concerned with getting it better than getting it right.

2. Data timeliness. The big data ecosystem revolves around Hadoop, which is inherently a batch processing construct. While this is changing in Hadoop 2.0, the reality is still far from an online, real-time view of the world. (Spotify updates their music recommendations daily based on overnight batch processing. They plan to improve this soon – but to hours or minutes, not microseconds.)

3. Data architecture. Although it is a difficult challenge that is often honored in the breach, capital markets players understand and respect the value of data architecture. The concepts of master data management, data integrity, and golden sources are well-established. The absence of a predefined schema in the non-relational databases often associated with “big data” is just one indication that the “big data” world is not culturally ready for the merger.

Not listed above are the more generalized fears and uncertainties that accompany any innovation wave, especially one that threatens to challenge the status quo. Examples of these are also evident in recent postings in this publication and elsewhere. My advice is to face the cultural challenges head-on by recognizing them and especially by understanding how they play out in your own firm.
