ExtractAlpha Transcripts Model US
The model is generated using our tailor-made NLP engine which is trained recursively using a corpus of over 200,000 transcripts going back to 1999. We utilize the NLP engine to produce both Bag-of-words and word-embedding features, on which we train a machine learning model to predict future stock returns.
Using this signal, a portfolio that goes long the highest-ranking decile of liquid US stocks and short the bottom decile generates economically significant returns with low turnover. The signal has non-trivial factor and industry exposures.