Case Study: Developing an Automated Sentiment Analysis Service

Through this case study, discover:

how we built a consistent and individualized Natural Language Processing (NLP) pipeline,
how we extracted the financial news sentiment for thousands of organizations over a period of more than 15 years,
how sentiment analysis of financial news opens up numerous applications, including research and trading, trend monitoring, the development of investment strategies, portfolio and risk monitoring, and market analysis.

Context & Challenges

Our client is Julius Bär, the leading Swiss wealth management group. Julius Bär has been managing client assets for over 130 years and is headquartered in Zurich with a presence in over 25 countries and around 60 locations.

To react quickly, especially to important events in the financial environment, banks monitor financial news globally. At Julius Bär, financial news is available in-house in several languages. The quality and completeness of this monitoring are crucial, for example to check the validity of current investment recommendations or to identify relevant risks at an early stage as part of ongoing risk monitoring.

In most cases, this involves accessing a large number of different news streams, combining them, and manually "distilling" the essential information relevant to decision-making. Because so much of the work is manual, this process is time- and personnel-intensive as well as highly repetitive. This applies even more to the manual condensation of information, for example to the level of individual actors, which is necessary to evaluate news over time.

Thanks to rapid advances in modern NLP and artificial intelligence (AI), it is now possible to automate the extraction and summarization of high-quality information. The backbone of such analysis is formed by very large deep-learning language models such as GPT-3, which can automatically summarize, classify, translate, and even autonomously generate texts. They are based on the transformer architecture, which lets them model context across an entire text and thereby approximate human patterns of language use and reasoning.

By using these transformer models, sentiment assessment as well as data compilation and pre-processing can be almost completely automated, while humans can play to their strengths in interpreting trends and, if necessary, deriving actions. Transformer models using financial news are therefore applied in many different contexts.

To get a clear idea of the type of situation that can influence investment strategies, check out the Wirecard scandal.

One of the key challenges in this case was that the financial news data was multilingual and that predicting an overall sentiment score for every news article was not sufficient. Instead, the sentiment scores had to be aggregated for each individual company.
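To illustrate the difference between article-level and company-level scores, the following minimal sketch averages mention-level scores per company and month. The `Mention` schema, the company names, the score range of -1 to +1, and the choice of a monthly mean are all assumptions made for illustration; the case study does not specify how the scores were aggregated.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class Mention:
    """One sentiment-bearing mention of a company in a news article (hypothetical schema)."""
    company: str
    published: date
    score: float  # assumed range: -1.0 (negative) to +1.0 (positive)


def monthly_company_sentiment(mentions):
    """Average the mention-level scores per (company, year, month)."""
    buckets = defaultdict(list)
    for m in mentions:
        buckets[(m.company, m.published.year, m.published.month)].append(m.score)
    return {key: mean(scores) for key, scores in sorted(buckets.items())}


# Invented example data: two companies, several articles.
mentions = [
    Mention("Alpha AG", date(2020, 6, 18), -0.75),
    Mention("Alpha AG", date(2020, 6, 25), -0.25),
    Mention("Beta SA", date(2020, 6, 18), 0.5),
    Mention("Alpha AG", date(2020, 7, 2), -0.25),
]

monthly = monthly_company_sentiment(mentions)
for (company, year, month), score in monthly.items():
    print(f"{company} {year}-{month:02d}: {score:+.2f}")
```

Keying the result by company and period yields one time series per organization, which is the shape needed for tracking sentiment over time.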

While investigating the project's feasibility, it became clear that off-the-shelf technologies or solutions could not be used in this context. The main challenges were:

to design and implement a multi-step NLP pipeline deriving financial news sentiments for individual companies;
to efficiently annotate high-quality ground truth for training and evaluating multiple NLP pipeline components;
to create visualizations tracking the sentiments of many companies simultaneously over time to spot negative events.
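The idea of a multi-step pipeline can be sketched as a chain of small functions that each enrich a shared document object. Everything in this sketch is hypothetical: the `Doc` structure, the step names, the keyword heuristics, and the company names are placeholders standing in for real models, not the components built in the project.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Doc:
    """State handed from step to step (hypothetical structure)."""
    text: str
    language: str = "unknown"
    organizations: List[str] = field(default_factory=list)
    sentiment_by_org: Dict[str, float] = field(default_factory=dict)


Step = Callable[[Doc], Doc]


def detect_language(doc: Doc) -> Doc:
    # Placeholder heuristic; a real step would call a language-identification model.
    doc.language = "de" if " und " in f" {doc.text} " else "en"
    return doc


def find_organizations(doc: Doc) -> Doc:
    # Placeholder lookup; a real step would run named entity recognition.
    known = ("Alpha AG", "Beta SA")
    doc.organizations = [name for name in known if name in doc.text]
    return doc


def score_sentiment(doc: Doc) -> Doc:
    # Placeholder lexicon; a real step would use a trained classifier.
    positive, negative = {"growth", "profit"}, {"fraud", "scandal"}
    words = re.findall(r"[a-z]+", doc.text.lower())
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    score = (pos - neg) / max(1, pos + neg)
    doc.sentiment_by_org = {org: score for org in doc.organizations}
    return doc


def run_pipeline(doc: Doc, steps: List[Step]) -> Doc:
    """Apply each step in order, passing the enriched document along."""
    for step in steps:
        doc = step(doc)
    return doc


result = run_pipeline(
    Doc("Alpha AG faces a fraud scandal."),
    [detect_language, find_organizations, score_sentiment],
)
print(result.language, result.sentiment_by_org)
```

Because every step has the same signature, individual components can be trained, evaluated, and replaced independently.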

Our Approach

To address the main challenges described above and develop a sustainable solution, we paid particular attention to:

Translating the complex problem into attainable NLP steps by discussing it with the business teams, analyzing the existing data, and drawing on our NLP expertise.
Supporting the tool selection for each step of the project (PDF parsing, machine translation, …). We considered task performance quality, latency, and integration into a consistent NLP pipeline. For more details, see the Technologies & Tools section of this article.
Making design choices together with our client’s teams. Every decision was supported by experiments and iterations on client documents and a subset of the data in order to define the best solution in the end.
Solving the hard problem of attributing sentiments to individual companies by focusing on linguistic parsing rules.
Strongly integrating customer feedback to develop intuitive visualizations.

Key Benefits

We were able to extract the financial news sentiment for thousands of organizations over a period of more than 15 years, making it possible to identify key negative and positive trends for every single organization.
The special feature of this approach was the reliable, cross-document assignment of sentiments to the organizations concerned. Until now, there was no ready-made standard solution for this.
During development, decomposing the complex NLP task into smaller subtasks proved crucial for the success of the project. On the one hand, efficient annotation of training data enabled fast model optimization. On the other hand, it enabled the identification of more than 11,000 polarity phrases in a short time, which could later be used to determine organization-related sentiments based on syntactic distance.
We delivered a complex multi-step NLP pipeline, from PDF documents to a scalable monitoring solution for organization-level sentiments, in only six months.
Because the NLP pipeline is consistent and documented, the developed ML models can now easily be put into production, either as part of an existing application or as a service. Our recommendation is to run it as a RESTful service, delivering an intuitive, easy-to-use data product.
Furthermore, functional extensions open up additional applications. For example, further sources (such as web providers of financial news) can easily be connected and integrated into the existing solution. This would make it possible to implement a daily news sentiment dashboard, including graphical analysis of sentiment development over time and a comprehensive free-text search function, for example for certain key events or for organizations.
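The organization-level attribution based on syntactic distance mentioned above can be illustrated with a toy example. This is a minimal sketch under strong assumptions: the dependency parse is written by hand, the two polarity phrases and their weights are invented, and assigning each phrase to the nearest organization in the tree is only one plausible reading of "syntactic distance", not necessarily the rule set used in the project.

```python
from collections import defaultdict

# Hand-annotated toy parse (hypothetical): heads[i] is the index of token i's
# syntactic head, and -1 marks the root of the sentence.
tokens = ["Alpha", "reports", "record", "profit", "while",
          "Beta", "faces", "fraud", "allegations"]
heads = [1, -1, 3, 1, 6, 6, 1, 8, 6]

organizations = {0: "Alpha", 5: "Beta"}   # token index -> organization
polarity_phrases = {3: 1.0, 7: -1.0}      # token index -> assumed polarity


def path_to_root(index, heads):
    """Return the token indices from `index` up to the root, inclusive."""
    path = [index]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def tree_distance(a, b, heads):
    """Number of dependency edges on the path between tokens a and b."""
    steps_from_b = {node: steps for steps, node in enumerate(path_to_root(b, heads))}
    for steps_from_a, node in enumerate(path_to_root(a, heads)):
        if node in steps_from_b:  # first shared ancestor
            return steps_from_a + steps_from_b[node]
    raise ValueError("tokens do not belong to the same tree")


def attribute_sentiment(organizations, polarity_phrases, heads):
    """Assign each polarity phrase to the syntactically closest organization."""
    totals = defaultdict(float)
    for phrase_index, polarity in polarity_phrases.items():
        nearest = min(organizations,
                      key=lambda org_index: tree_distance(phrase_index, org_index, heads))
        totals[organizations[nearest]] += polarity
    return dict(totals)


org_sentiment = attribute_sentiment(organizations, polarity_phrases, heads)
print(org_sentiment)
```

In this sentence a single document-level score would cancel out to neutral, whereas the tree-based attribution gives the positive phrase to one organization and the negative phrase to the other.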

Team involved in this project

One NLP expert and one Data Scientist collaborated with our client’s business and tech teams for six months on this NLP project (automated sentiment analysis service).

Technologies & Tools

Selecting the right tools for such a project is no easy task. It requires an NLP expert who can assess the plethora of open-source and commercial tools and technologies available on the market. Below are some insights into how our experts helped our client decide which tools to use for their sentiment analysis service:

For language detection, fastText shows only a small performance gain over langdetect but is much faster.

The spaCy and Flair English models perform similarly, with spaCy offering slightly higher inference speed on CPU. However, spaCy's pipeline concept is much better suited to building a custom end-to-end NLP pipeline into which any custom component or NLP step can be integrated. It also offers more integrations with other NLP tools and has more detailed documentation. spaCy was therefore selected over Flair as the NER and pipeline tool.

scikit-learn character-level TF-IDF vectorization combined with logistic regression is the fastest option and reaches higher F1 scores on the in-house financial news documents than FinBERT. A custom fine-tuned BERT performs much better, but its inference is too slow. DistilBERT is a good compromise between speed and performance.
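For readers unfamiliar with the baseline mentioned above, the following sketch shows what a character-level TF-IDF plus logistic regression classifier looks like in scikit-learn. The training sentences, the labels, and the hyperparameters (`char_wb` analyzer, n-gram range 2 to 5) are assumptions chosen for illustration; the configuration and data used in the project are not described in this case study.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set; a real model needs far more annotated examples.
train_texts = [
    "Profit rises sharply after record quarter",
    "Strong growth lifts shares to new high",
    "Company raises outlook on solid demand",
    "Fraud allegations trigger share collapse",
    "Heavy losses force insolvency filing",
    "Regulator opens probe into accounting scandal",
]
train_labels = ["positive", "positive", "positive",
                "negative", "negative", "negative"]

model = make_pipeline(
    # Character n-grams within word boundaries; the ranges are illustrative.
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5), sublinear_tf=True),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_labels)

predictions = model.predict([
    "Shares collapse amid fraud probe",
    "Record profit and strong growth",
])
print(list(predictions))
```

Character n-grams make the features tolerant of inflected or compound word forms, and a linear model on sparse features is cheap at inference time, which fits the speed advantage reported above.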

Thanks to our Authors

Christoph H.

Peter Neckel
