Data Management

01:44 PM
50%
50%

Hadoop 2.0: The Capital Markets Dragon Slayer?

Hadoop 2.0 might help capital markets firms catch up with other industries when it comes to big data adoption.

While industries such as healthcare, e-commerce and even retail financial Services are being rapidly transformed by the technology ecosystem of "big data," leading capital markets firms have been in the uncharacteristic position of lagging the innovation curve.

Jennifer L. Costley, Ashokan Advisors
Jennifer L. Costley, Ashokan Advisors

Capital Markets technologists give many reasons for the lack of viable implementations: data sets that are not sufficiently "big," difficulty integrating another data solution into an already crowded portfolio, and even a desire to keep their activities confidential to preserve their market advantage. But the biggest reason for the failure of these firms to aggressively engage with big data has been limitations in the architecture of the core big data technology.

Apache Hadoop has been optimized for Map/Reduce, an offline ("batch") processing architecture that does not readily support use cases demanding real-time or near real-time results, such as Value-at-Risk (VaR) and algorithmic trading. Although other options beyond Map/Reduce are available, they are layered onto a core technology in which Hadoop closely couples resource management with data processing.

[For more on how Wall Street organizations are approaching big data challenges, read: Financial Firms Adopt Big Data As Defense Against Cyber Threats.]

As a result, it is not possible to run multiple applications simultaneously (e.g., SQL, in-memory analytics, and Map/Reduce) with any level of control over the prioritization of these applications, and hence a guarantee of timely results.

Spinning The YARN

That is, until now. The introduction of Hadoop 2.0 and, in particular, the new YARN resource manager component promises to provide the capital markets with a loosely-coupled architecture that will allow all of the necessary processing modes -- batch, interactive, online and streaming -- to run simultaneously on Hadoop with defined quality of service via resource allocation.

To understand how YARN works, it is important to first understand the current Hadoop architecture. The current Hadoop Map/Reduce System is composed of the JobTracker, which is the master scheduler, and TaskTrackers associated with each of the nodes (instances) of the application. The JobTracker is responsible for all resource management tasks including managing the TaskTrackers, tracking resource consumption/availability, and job life-cycle management. JobTracker views the cluster as composed of nodes managed by individual TaskTrackers with distinct "map" slots and "reduce" slots. These slots are cannot be reassigned. It is this locked-in hierarchy that prevents the optimal execution of non-Map/Reduce workloads on Hadoop.

YARN breaks this limitation by splitting up the two major responsibilities of the JobTracker, resource management and job scheduling/monitoring, into separate functions -- a global ResourceManager and per-application ApplicationMaster. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. The ApplicationMaster negotiates resources from the ResourceManager and works with the NodeManager(s) to execute and monitor the component tasks.

This separation of responsibilities allows the arbitration of resources among the competing applications and provides the flexibility necessary for more optimal resource management and service level guarantees. This in turn should allow more business-driven capital markets use cases to be realized in the big data paradigm.

Whether or not Hadoop 2.0 proves to be the ultimate big data roadblock "dragon slayer" for capital markets, it will surely be a major breakthrough in eliminating some of the most critical obstacles to success.

About The Author: Jennifer L. Costley, Ph.D. is a scientifically-trained technologist with broad multidisciplinary experience in enterprise architecture, software development, line management and infrastructure operations, primarily (although not exclusively) in capital markets. She is also a non-profit board leader recognized for talent in building strong governance and process. Her current focus is in helping companies, organizations and individuals with opportunities related to data, analysis and sustainability. She can be reached at www.ashokanadvisors.com.

Jennifer L. Costley, Ph.D. is a scientifically-trained technologist with broad multidisciplinary experience in enterprise architecture, software development, line management and infrastructure operations, primarily (although not exclusively) in capital markets. She is also a ... View Full Bio
Comment  | 
Print  | 
More Insights
More Commentary
Why Settle for Less in the Front Office?
Recent research shows that sell-side firms are less than satisfied with their order management system (OMS) technology. Many front offices, however, continue to make do with their current solutions. Are they selling themselves short?
BYOD Policy: Don't Reinvent the Wheel
Financial firms still feel overwhelmed by BYOD risks and challenges. But these can be addressed by a good policy, and the guidelines are already out there.
The BYOD Challenge
Having a policy in place to manage mobile devices used by employees for work purposes is necessary in this current day.
Getting Onboarding Right in the Age of the Customer
Disparate “Frankenstein” systems slow down the onboarding process and impede customer service, says Pegasystems.
Performance Monitoring Key to Smooth Infrastructure Modernization
As banks consider how to shift infrastructure and storage solutions, they can’t afford to lose visibility into performance.
Register for Wall Street & Technology Newsletters
White Papers
Current Issue
Wall Street & Technology - July 2014
In addition to regular audits, the SEC will start to scrutinize the cyber-security preparedness of market participants.
Video
5 Things to Look For Before Accepting Terms & Conditions
5 Things to Look For Before Accepting Terms & Conditions
Is your corporate data at risk? Before uploading sensitive information to cloud services be sure to review these terms.