The push for transparency, real-time information and improved risk management in financial services has led to the adoption of new computing infrastructure, meaning that running a Google-style search to check credit risk exposure may be just around the corner.

A development in the way that data is stored and retrieved promises to give banks previously unheard-of insight. The finance industry is under pressure from both regulators and customers to improve transparency, with the consequences of non-compliance ranging from increased capital charges to being barred from certain lines of business. At the same time, the continuing growth in electronic transactions across both retail and wholesale businesses means that banks must run ever larger volumes of calculations at ever higher speeds.

“Over the past five years, the importance of scale-out architecture to running our business has increased substantially,” says Don Duet, global co-chief operating officer in the technology division at investment bank Goldman Sachs. “The amount of computing that is required to run a large global market making firm and the amount required to run risk management at scale, given the size of positions and complexity of our business, means that we have seen an exponential curve in the amount of computing needed to run and execute the business on a daily basis.”

Running in real time

The push for real-time information and improved risk management is challenging for relational databases, the workhorses of data analysis. The process of extracting, transforming and loading data into relational databases is slow and requires the data to be structured. That means the user must predetermine which characteristics of the subject will be of interest in the future, based on the queries that will be demanded of it, and then place numerical values on those characteristics so the information becomes quantifiable.

If circumstances change and new queries are demanded, the effectiveness of the database becomes questionable. For example, the mandated use of central counterparties for clearing over-the-counter derivatives means that variation margins must now be calculated intraday rather than on a quarterly basis. A system set up to gather data on a bank’s positions and calculate that figure every three months may well be unable to cope with the new demands.
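
As a hypothetical illustration of that constraint, consider a relational table modelled only around the quarterly figure. The sketch below uses invented table and column names; it can answer the question it was designed for, but not the new intraday one.

```python
# A hypothetical illustration of the schema problem described above: a table
# designed around quarterly margin figures has no column with which to answer
# an intraday question. Table and column names are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE variation_margin (
                    counterparty TEXT,
                    quarter      TEXT,   -- e.g. '2013-Q1'
                    margin       REAL)""")
conn.execute("INSERT INTO variation_margin VALUES ('Bank A', '2013-Q1', 1.2e6)")

# The query the schema was designed for works as expected.
print(conn.execute(
    "SELECT margin FROM variation_margin WHERE quarter = '2013-Q1'").fetchall())

# The new intraday question was never modelled, so the schema cannot answer it.
try:
    conn.execute(
        "SELECT margin FROM variation_margin WHERE valuation_time > '14:00'")
except sqlite3.OperationalError as err:
    print("Schema cannot answer this:", err)   # no such column: valuation_time
```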

However, the systems used to give search engines rapid access to the content of millions of documents are being tested by banks as a solution to challenges of this sort. The online and social media community – Google and Facebook, for example – has faced a far larger challenge in storing and retrieving data that has not been structured.

“The circles we were ending up in were often the same circles as people building scalable computing architecture for search or social architectures,” says Mr Duet. “The focus from a design and management perspective was leading us to spend as much time with dot-com firms as with classic infrastructure firms.”

Goldman Sachs joined the Open Compute Project, whose stated goal is to build “one of the most efficient computing infrastructures at the lowest possible cost”.

Paul Davis, a global banking industry leader at IBM, says: “Large institutions always had the capability to do this if they wanted to spend the money; what has made this more attractive is that they are able to do this cost-effectively using commoditised hardware.”

Source code

Google operates using a system called MapReduce, which breaks information down into equal-sized chunks, stores them across multiple commoditised servers and processes them in parallel. That has two advantages. First, searches can be carried out in parallel, with all of the chunks of data processed at the same time rather than sequentially, making the process of running a check very fast. Second, because the stored data is not forced into a predefined structure, searches can remain open-ended, with more in-depth searches applied once an initial search has delivered results.
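
The pattern itself is simple to sketch. The minimal example below is purely illustrative, with an invented toy corpus rather than anything Google-specific: each chunk is ‘mapped’ in a separate process, then the partial results are ‘reduced’ into a single answer.

```python
# A minimal, illustrative sketch of the map/reduce pattern described above.
# The corpus, chunking and word counts are invented; this is not Google's code.
from concurrent.futures import ProcessPoolExecutor
from collections import Counter
from functools import reduce

def map_chunk(chunk):
    """Map step: build word counts for one chunk of documents."""
    counts = Counter()
    for doc in chunk:
        counts.update(doc.lower().split())
    return counts

def merge_counts(left, right):
    """Reduce step: merge the partial counts produced by each chunk."""
    left.update(right)
    return left

if __name__ == "__main__":
    docs = ["credit risk exposure report", "market risk summary",
            "counterparty exposure intraday", "credit default swap position"]
    chunks = [docs[0:2], docs[2:4]]             # two equal-sized chunks, two 'servers'
    with ProcessPoolExecutor() as pool:
        partials = pool.map(map_chunk, chunks)  # chunks are processed in parallel
    totals = reduce(merge_counts, partials, Counter())
    print(totals.most_common(3))                # e.g. [('risk', 2), ('credit', 2), ...]
```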

Open-source software group Apache released its own system, Apache Hadoop, on December 27, 2011, inspired by a 2004 white paper in which Google explained the MapReduce design. It has been picked up by Yahoo! and Facebook, among others, and as an open-source platform it is allowing firms to test the potential that big data technology holds for them. Like MapReduce, it stores data on clusters of low-cost servers, delivering a cheap method of rapidly searching through unstructured data.
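
Expressed as a Hadoop job, the same word-count idea stays compact. The sketch below uses mrjob, one open-source Python wrapper around Hadoop Streaming, purely to illustrate how such jobs are written; it is not drawn from any of the banks’ systems.

```python
# An illustrative Hadoop job using the open-source mrjob library: the same
# word-count pattern as above, expressed as mapper and reducer methods.
from mrjob.job import MRJob

class MRWordCount(MRJob):
    def mapper(self, _, line):
        # Emit (word, 1) for every word in the input line.
        for word in line.lower().split():
            yield word, 1

    def reducer(self, word, counts):
        # Sum the counts emitted for each word across all mappers.
        yield word, sum(counts)

if __name__ == "__main__":
    MRWordCount.run()
```

The same script can be run locally for testing or submitted to a Hadoop cluster, where the framework distributes the work across the nodes.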

“It allows you to get this weapons-grade analytical capability at a low-cost entry point,” says Mr Duet. Few banks are beyond the proof-of-concept (PoC) stage yet; most big players are currently exploring the technology’s limits and strengths to identify where it can be used.

In-house action

Alasdair Anderson, global head of IT Architecture at HSBC, says: “The concept we are exploring with big data is the ability to query the entire enterprise in a single environment. We have been kicking the tyres. Could this technology be used productively by our offshore development teams? Or does big data require a team of Silicon Valley geniuses to make it work? The answer to the offshore question was a resounding 'yes'.”

To test it, the bank set up a 'hack-a-thon' competition; 18 teams were given 24 hours over a weekend on a big Hadoop cluster with terabytes of cleansed data sets, to see what they could do with it. At the end of the 24 hours, 17 teams had delivered solutions. The winning team created a solution that used predictive analytics to determine how customers might make choices about their investment portfolios.

Internally, the bank is now examining where the technology can best be used. Its properties mean that the first insight gleaned from a set of data is far from the last, which should allow banks to identify more and more use cases as they experiment with the systems and apply them to different challenges.

“You can continually advance the measurement and quality of information once the data sets are in a file system,” says Mr Anderson. “It is not the case that you analyse it once, then it is structured and you never change it again. You have always got capacity to work on it. The team that developed the predictive analysis were using statistical methods to gain insights; they wanted to take the results of what the customer sees on a screen, the choice they make, and then model it, in a way similar to machine learning, to predict likely customer choices in the future.”
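
In outline, that kind of model is straightforward. The sketch below is purely hypothetical, with invented features and data, and with the scikit-learn library standing in for whatever tooling the team actually used: past on-screen choices are used to estimate the probability of a customer’s next portfolio decision.

```python
# A hypothetical sketch of predicting a customer's next portfolio choice from
# past on-screen behaviour. Features, data and model choice are invented for
# illustration; this is not HSBC's model.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [age, current equity weighting, viewed funds page, past switches]
X = np.array([[34, 0.6, 1, 2],
              [52, 0.3, 0, 0],
              [45, 0.5, 1, 1],
              [29, 0.8, 1, 3],
              [61, 0.2, 0, 0]])
y = np.array([1, 0, 1, 1, 0])   # 1 = chose to rebalance towards equities

model = LogisticRegression().fit(X, y)

new_customer = np.array([[40, 0.4, 1, 1]])
print(model.predict_proba(new_customer))   # estimated probability of each choice
```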

Big data in action

Goldman Sachs has been applying big data approaches and technology to help its security model evolve in the face of rapidly changing threats. “How do we continuously improve data management, ensuring that the assets of the firm are protected and that we protect ourselves against external threats?” asks Mr Duet. “We have built a meaningful-sized Hadoop cluster, where we can bring in security content from web logs to firewall data to packets on the network and then run different types of analytics and queries to look for unusual behaviours.”

Known internally as the Hunter Programme, it makes use of the flexible querying that big data technology provides to help the firm understand whether it is under attack.
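
The ‘unusual behaviour’ queries Mr Duet describes can take many forms. The fragment below is a deliberately simplified, hypothetical example with an invented log format and threshold; a real pipeline would run this kind of count across a Hadoop cluster rather than in a single process.

```python
# A hypothetical, much-simplified example of the kind of query described above:
# counting web-log requests per source address and flagging unusually busy ones.
# The log format and the threshold are invented for illustration.
from collections import Counter

def flag_unusual_sources(log_lines, threshold=100):
    """Count requests per source IP and flag any source above the threshold."""
    counts = Counter(line.split()[0] for line in log_lines if line.strip())
    return {ip: n for ip, n in counts.items() if n > threshold}

if __name__ == "__main__":
    sample = (["10.0.0.1 GET /login", "10.0.0.2 GET /positions"] * 5
              + ["203.0.113.9 GET /admin"] * 150)
    print(flag_unusual_sources(sample))   # {'203.0.113.9': 150}
```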

Credit Suisse is also applying big data models to security. Ed Dabagian-Paul, lead architect for enabling technologies at Credit Suisse, says that security is effectively a cost of doing business, and so the ability to run it on low-cost hardware and open-source software is appealing. “It is nothing the industry hasn’t been doing for years, but we are now able to do it at a lower cost point and across types of data that we weren’t able to run it across before,” he says.

The bank is using the commercial product Splunk, which has a scale-out architecture similar to Hadoop’s, to analyse security and other logs. It has also run a PoC with ETH Zurich (Swiss Federal Institute of Technology), looking at database queries that could not be parsed, to check whether they represented attempts to access restricted information.
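
The idea behind that PoC can be pictured with a toy example. The sketch below, with an invented log format, error string and threshold, treats repeated parse failures from a single client as a possible sign of probing.

```python
# A hypothetical sketch of the idea behind the PoC described above: repeated
# queries that the database cannot parse may signal probing for restricted data.
# The log format, error string and threshold are invented for illustration.
from collections import Counter

def suspicious_clients(error_log_lines, threshold=5):
    """Count parse failures per client and flag clients at or above the threshold."""
    failures = Counter()
    for line in error_log_lines:
        client, _, message = line.partition(" ")
        if "syntax error" in message.lower():
            failures[client] += 1
    return {client: n for client, n in failures.items() if n >= threshold}

if __name__ == "__main__":
    log = ["app01 syntax error near 'UNION'"] * 7 + ["app02 query completed"] * 20
    print(suspicious_clients(log))   # {'app01': 7}
```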

IBM’s Mr Davis says: “The traditional use cases that we see fall into three areas: risk, fraud and customer data. We saw the most initial interest around risk management and compliance. Banks are using this for financial risk analysis, providing insight into credit and market risk, running Monte Carlo simulations [computational algorithms that rely on repeated random sampling to compute their results; a simple illustration follows this quote]. It is also used for risk assessment on small and medium[-sized] businesses, where firms are pulling in information from news feeds to supplement structured data, and also on larger institutions that are operating as a bank’s counterparty, where a lot of information about them is released in unstructured or semi-structured formats such as press releases.

“Second, it is used for anti-fraud, anti-money laundering and market surveillance. It is amazing what people will put on Facebook or Twitter that can be correlated with other data. Once you have established something suspicious you can deep dive into the data to see if there is something to substantiate the suspicions. Finally, we have seen a lot of interest around customer and marketing insights; for example, BBVA is looking at sentiment analysis, but banks are also looking at contact-centre analytics, taking voice, e-mails, surveys and customer data that you can factor into your segmentation and next-best-action activities.”
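
As a simple illustration of the Monte Carlo simulations Mr Davis mentions, the sketch below estimates a one-day value-at-risk for a single position by repeated random sampling; the position size, return distribution and parameters are all invented for illustration.

```python
# An illustrative Monte Carlo calculation: estimating one-day 99% value-at-risk
# for a single position by repeated random sampling. The normal-return
# assumption, position size and volatility are invented for illustration.
import random

def monte_carlo_var(position_value, mu, sigma, confidence=0.99, trials=100_000):
    """Simulate many one-day P&L outcomes and read off the loss at the percentile."""
    pnl = sorted(position_value * random.gauss(mu, sigma) for _ in range(trials))
    return -pnl[int((1 - confidence) * trials)]

if __name__ == "__main__":
    # A 10m position with zero expected daily return and 2% daily volatility.
    print(round(monte_carlo_var(10_000_000, 0.0, 0.02)))   # roughly 465,000
```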

Still at the PoC stage with other banks are use cases such as the examination of derivatives contracts held in PDF format to determine a firm’s position in credit default swaps.

No substitute

This does not mean that banks are looking to replace old systems wholesale. Typically, once a query has been run across unstructured data, the banks will have quantifiable data that can be analysed using traditional databases.

“It can be hard to prove the return on investment [ROI] in some of these technologies,” says Neil Palmer, partner in the advanced technology practice at SunGard Consulting Services. “The ability to discover unseen patterns is not a [good] ROI to get you on board in the first place. The other challenge we are seeing is that you need more tools that operate at the Hadoop layer. As patterns emerge, that will drive the development of other applications to deal with them.”

Mr Dabagian-Paul at Credit Suisse says that some aspects of Hadoop are not as resilient as the capabilities a traditional database provides, which will require either Hadoop to mature or banks to change how they do business, with a bit of both most likely. The existing support for data analysis in the enterprise will also need to change if firms are to make the best use of big data technologies.

“The cost structure of storing all of this data on commodity hardware without licence fees is very interesting but most banks have enterprise-wide agreements with their database vendors, so bringing in one Hadoop cluster will either not change or even increase their overall spend slightly,” he says. “There needs to be a tipping point where enough data is running through Hadoop and not relational databases and I think we are a while off from that.”

Philosophical shift

From a hardware perspective, banks have not been geared towards the stripped-down servers used by social media and search firms to support big data technology, and Mr Anderson believes this may slow adoption.

"There is a dichotomy in the data centre,” he says. “You might have 1000 servers on the floor, then you implement virtualisation and you have 4000 servers, but that means your diversity and complexity increases and that is taking you further away from where you want to get to. Big data takes those 1000 computers and makes them one. So that is a real sticking point for everyone within the enterprise corporate world; data centres are not set up to do this. So at the moment adoption is challenging."

However, the model supports moving risk mitigation into the software layer of the technology stack, where traditionally much of it has been handled in hardware. A data centre would be built with two power supplies and a significant supporting infrastructure; a big scale-up server would have all the bells and whistles attached to make sure its risk of failure was as close to zero as possible. But that was the old world, argues Mr Duet.

"The new world is to take as much of that risk mitigation and put it into software,” he says. “Your application is smart enough to know it needs to be backing itself up persistently. Or, rather than spending $20m on building in critical power and electrical resiliency, you distribute data though your cloud so that information is distributed into multiple places by default, so failure of any one place can be easily recovered. There is a secular shift and Hadoop is a great example of that. With MapReduce you write every piece of data to multiple places so if you lost any one node, that data is replicated in enough places that you never lose the actual data."
