Blue Swan: alternative data analysis

The Challenge
Our customer collects more than 100 alternative data sources that may have predictive value, with the aim of selling this data to hedge funds, asset managers, and investment funds.
Non-market data can often give an edge in trading because it offers novel, uncorrelated alpha. However, mining such datasets is generally very time- and resource-consuming.
Before going all-in on an alpha hypothesis, it is vital to test its usefulness quantitatively. An additional challenge lies in the nature of alternative data: features must first be built from these unstructured data sources.
The Solution
Combining business analytics and data science skills, we created numerous custom variables, both expert-defined and mathematically derived, and modelled them with ensembles of algorithms.
We then reverse-engineered the models and studied their most important features. In this way, we identified the data sources that have predictive value.
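As an illustration only, one common way to attribute a model's predictive value to whole data sources is grouped permutation importance: shuffle all features built from one source together and measure how much accuracy drops. The sketch below assumes this technique; the case study does not say which method was actually used. The names toy_model, source_columns, "sentiment" and "web_traffic" are hypothetical, and the data is synthetic.

```python
import random

def accuracy(predict, rows, labels):
    """Share of rows for which predict(row) equals the label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)

def source_importance(predict, rows, labels, source_columns, n_repeats=20, seed=0):
    """Mean accuracy drop when all columns of one data source are shuffled together."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importance = {}
    for source, cols in source_columns.items():
        drops = []
        for _ in range(n_repeats):
            # Draw one row permutation and apply it to every column of this source,
            # so relationships between features of the same source are preserved.
            order = list(range(len(rows)))
            rng.shuffle(order)
            shuffled = []
            for i, row in enumerate(rows):
                new_row = list(row)
                for c in cols:
                    new_row[c] = rows[order[i]][c]
                shuffled.append(new_row)
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importance[source] = sum(drops) / n_repeats
    return importance

# Synthetic data: three features, of which only the first two drive the label.
data_rng = random.Random(42)
rows = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]
labels = [1 if r[0] + 0.5 * r[1] > 0 else 0 for r in rows]

def toy_model(row):
    """Hypothetical stand-in for a trained ensemble; it ignores column 2."""
    return 1 if row[0] + 0.5 * row[1] > 0 else 0

# Hypothetical mapping from data source to the feature columns built from it.
source_columns = {"sentiment": [0, 1], "web_traffic": [2]}
scores = source_importance(toy_model, rows, labels, source_columns)
```

In this toy setup the source whose features the model ignores scores zero, while the source that drives the predictions shows a large accuracy drop. Shuffling per source rather than per feature keeps the result aligned with the business question of which data source is worth acquiring.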
Next, based on these forecasts, we created a dynamic portfolio optimisation system in order to understand feature importance across the whole market.
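To make the idea of forecast-driven, dynamic rebalancing concrete, here is a minimal sketch under strong assumptions. It uses a simple long-only rule (weight proportional to forecast divided by variance, which is mean-variance optimisation with uncorrelated assets), not the actual system, whose method is not described here. The function names, asset tickers, and all numbers are hypothetical.

```python
def rebalance(forecasts, variances):
    """Long-only weights proportional to forecast / variance, normalised to sum to 1.

    Assets with non-positive forecasts get zero weight. If no forecast is
    positive, every weight is zero, i.e. the portfolio stays in cash.
    """
    raw = {a: max(forecasts[a], 0.0) / variances[a] for a in forecasts}
    total = sum(raw.values())
    if total == 0:
        return {a: 0.0 for a in forecasts}
    return {a: w / total for a, w in raw.items()}

def run_backtest(forecast_path, return_path, variances):
    """Re-optimise weights each period from that period's forecasts, then realise returns."""
    portfolio_returns = []
    for forecasts, realised in zip(forecast_path, return_path):
        weights = rebalance(forecasts, variances)
        portfolio_returns.append(sum(weights[a] * realised[a] for a in weights))
    return portfolio_returns

# Hypothetical per-asset return variances, forecasts, and realised returns.
variances = {"BTC": 0.04, "ETH": 0.09}
forecast_path = [
    {"BTC": 0.02, "ETH": 0.0},     # only BTC looks attractive
    {"BTC": -0.01, "ETH": -0.02},  # nothing looks attractive: stay in cash
    {"BTC": 0.04, "ETH": 0.09},    # equal forecast-to-variance ratios
]
return_path = [
    {"BTC": 0.03, "ETH": -0.01},
    {"BTC": -0.05, "ETH": -0.04},
    {"BTC": 0.02, "ETH": 0.04},
]
portfolio_returns = run_backtest(forecast_path, return_path, variances)
```

Because the weights are recomputed every period from the latest forecasts, any change in a forecast feeds directly into the allocation. Comparing portfolio returns with and without a given data source's features is one way such a system can show that source's contribution at portfolio level.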
The Result
Before
- No transparent, tangible method to tell in advance whether a new data source has predictive value, how much, and how it works in combination with other data sources
- Average Sharpe ratio of 1.1 across various benchmark strategies on the cryptocurrency market
After
- A transparent value model was developed that shows how each data source, alone and in combination, affects trading decisions (5-10% predictive accuracy boost)
- The portfolio optimisation strategy based on alternative data achieved a Sharpe ratio of around 2.0
- Improved value proposition for customers, who can acquire trusted and verified alpha features and trading signals
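For readers unfamiliar with the metric, the Sharpe ratios quoted above measure return per unit of risk. The sketch below shows the standard annualised calculation from a series of periodic returns. The function name, the sample returns, and the choice of 365 periods per year (reflecting that cryptocurrency markets trade every day) are assumptions for illustration, not details taken from the project.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=365):
    """Annualised Sharpe ratio: mean excess return over its standard deviation.

    risk_free_rate is an annual rate; it is spread evenly across periods.
    """
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    sd = statistics.stdev(excess)  # sample standard deviation
    if sd == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return statistics.mean(excess) / sd * math.sqrt(periods_per_year)

# Hypothetical daily returns of a strategy.
daily_returns = [0.01, -0.005, 0.007, 0.002, -0.003, 0.004]
ratio = sharpe_ratio(daily_returns)
```

A ratio computed on so few observations is very noisy; in practice it would be estimated over a long backtest and compared against benchmark strategies measured over the same period.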