Wall Street & Technology: Blog
June 22, 2007

IBM Previews Ultra-Powerful Stream Processing System

For the past four years, a team of 70 engineers at IBM’s T. J. Watson Research Center in Hawthorne, N.Y., has been working on an ultra-powerful, large-scale (as in petabytes of data) stream processing system. Currently running on 800 x86 computers with embedded Cell processors, it can analyze massive volumes of market data and news in real time (as well as medical, seismic, astronomical or any other type of data). At the SIFMA show this week, IBM talked about this “mature prototype,” called System S. Wall Street firms will one day be able to use it to create a no-holds-barred environment in which their quants can roam free, testing ideas, finding correlations and refining algorithms against a huge pipeline of streaming data and seeing instant results. A government agency is already using System S, and IBM has filed 400 patents for it. IBM talked the project up at SIFMA because it is interested in working with capital markets firms on pilots to see what System S could do for them.

“Some of our most sophisticated clients on Wall Street are jumping all over this,” says Kevin Pleiter, director of financial services at IBM. “The power of this is it’s able to correlate events from disparate data sources.” For instance, a market data event such as a plunge in the price of certain stocks might trigger an algorithmic trading program to buy some of the stock. But if the price drop were caused by a calamity such as a terrorist attack, such a purchase would be unwise. While watching the market data, System S could also be taking in video feeds from television networks, analyzing the news, and sending a recommendation to put the trading system into crisis mode.
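The crisis-mode scenario can be sketched in a few lines of Python. This is only an illustration of the idea of correlating two event streams, not System S code: the `PriceTick` and `NewsSignal` types, the `decide` function, the `crisis_score` field and all thresholds are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class PriceTick:
    timestamp: float  # seconds since some arbitrary start
    symbol: str
    price: float


@dataclass
class NewsSignal:
    timestamp: float
    crisis_score: float  # 0.0 to 1.0, hypothetical output of a news analyzer


def decide(ticks, news, drop_threshold=0.05, crisis_threshold=0.8, window=300.0):
    """Return 'buy', 'crisis_mode' or 'hold' for a series of ticks.

    All thresholds are illustrative assumptions, not values from IBM.
    """
    if len(ticks) < 2:
        return "hold"
    first, last = ticks[0], ticks[-1]
    drop = (first.price - last.price) / first.price
    if drop < drop_threshold:
        return "hold"  # no plunge, nothing to react to
    # The price plunged: look for a crisis signal shortly before the last tick.
    for signal in news:
        age = last.timestamp - signal.timestamp
        if 0 <= age <= window and signal.crisis_score >= crisis_threshold:
            return "crisis_mode"
    return "buy"
```

A plunge with no alarming news yields `"buy"`, while the same plunge accompanied by a high-scoring news signal inside the time window yields `"crisis_mode"`.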

In another example, a buy-side firm looking at a company could correlate its road show, analyst call, and fundamentals such as earnings per share with other data sources. “If the CEO is saying that orders are strong, imagine being able to correlate that with satellite imagery that tells you whether or not the parking lot is full and whether or not trucks are going to and from the distribution facility,” says Pleiter. “If the parking lot is half empty, the system would recognize that this guy is trying to talk his stock up.”

Pleiter acknowledges that several complex event processing products exist on the market today. “But today it’s an environment where people are taking in structured data, putting it into a fixed format, and events are triggered off that stream,” he says. “System S takes this four generations forward, to a highly distributed, highly scalable stream processing technology that can take in any type of structured or unstructured data without requiring reformatting and allowing anything to be done with it.” For instance, video streams from CNN, Al Jazeera, and BBC News could be analyzed alongside market data feeds from Reuters, Thomson and Bloomberg as well as archived phone calls, emails, HTML pages, research reports, purchase orders, invoices, satellite images and more. System S is said to have parsers and semantic annotation to help analyze each of these streams.

IBM already had many of the pieces required to do this. It has had an enterprise platform for managing structured and unstructured information together for several years, and a couple of years ago it introduced UIMA, an architecture for mixing and matching various types of search and text analytics technology that works with most of the best-in-class search products. It has video parsing and searching. It has speech recognition software.

System S contains a brand-new technology layer that Pleiter describes as “an artificial intelligence-like scheduling technology.” “This is intelligent scheduling, looking at the information streams and steering the hardware when major changes occur, because something important must have happened,” he says.
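IBM has not described how this scheduler works, so the following Python sketch only illustrates the general idea of steering hardware toward a stream whose volume suddenly changes. The `allocate_nodes` function, the `spike_factor` and `boost` parameters, and the stream names are all hypothetical.

```python
def allocate_nodes(baseline, current, total_nodes, spike_factor=2.0, boost=3):
    """Split total_nodes among streams, favoring streams whose rate spiked.

    baseline and current map stream name -> events per second.
    A stream counts as spiking when its current rate is at least
    spike_factor times its baseline (an assumed rule, not IBM's).
    """
    weights = {}
    for name, rate in current.items():
        spiking = rate >= spike_factor * baseline[name]
        weights[name] = boost if spiking else 1
    total_weight = sum(weights.values())
    # Integer share for each stream, proportional to its weight.
    shares = {name: total_nodes * w // total_weight for name, w in weights.items()}
    # Hand any leftover nodes to the highest-weighted streams first.
    leftover = total_nodes - sum(shares.values())
    for name in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        shares[name] += 1
    return shares
```

With three streams and ten nodes, a news stream that jumps to four times its usual rate would receive six nodes under these assumed weights, leaving two each for the others.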

In the Watson lab, the System S computers are connected with 20-gigabit InfiniBand, but researchers are experimenting with optical switches and optical networking, aiming for a super-fast 100-gigabit network.

The System S user environment comes in three versions. There’s a simple user interface that lets users query the system the way they would a database, using predefined SQL calls. There’s an intermediate interface that’s similar to using Excel macros. For power users, there’s an Eclipse-based development environment for writing custom applications.

The system is meant to be flexible in its use of hardware. “The conceptual picture is that the design of the software control programs is specifically intended to allow aggregation and exploitation of all kinds of hardware,” says Nagui Halim, director of high performance stream processing at IBM. “My thinking was customers get big installations of hardware, they make changes, they buy specialized accelerators. We allow the segmentation of specialized functions to accelerators.” So far the Watson lab is using embedded Cell processors on IBM BladeCenters running Linux, and it is testing FPGAs. System S can run on a system as small as a laptop and scale up to a 100,000-node cluster.

Posted by Penny Crosman at 03:30 PM


