The competition consisted of two main phases. In the initial phase, teams were tasked with developing bottom-up stock-picking strategies for portfolio construction using AI and machine learning. They worked with a dataset of approximately 1,000 large-cap U.S. stocks, each with over 140 signals per month, spanning 23 years of financial market data. The goal was to build and back-test monthly rebalanced portfolios to identify high-performing strategies, establishing a foundation for the final round, which took place on October 23 and 24, 2024.
In the final challenge, the top 10 teams, selected by a review committee, faced a new, more complex task. Building on their previous work, they were required to design an institutional-grade investment product under real-world constraints: long-only positions, specific limits on portfolio turnover, volatility, and risk, and a prohibition on leverage. The focus also shifted to a larger universe of mid- to small-cap U.S. stocks (2,500 to 4,000 per month) and an extended data range from January 2000 to August 2024. Teams were also required to hold a minimum of 50 stocks and a maximum of 100.
The objective for each team was to work within these constraints while crafting a robust solution for optimized portfolio construction. Teams were encouraged to incorporate any additional data they believed could enhance their models’ performance. Their key tasks included:
1. Identifying effective machine learning methods for portfolio construction.
2. Recognizing key financial factors that could improve stock selection.
3. Optimizing stock selection and allocation to maximize performance within the set parameters.
4. Running back-tests to evaluate the historical performance of their portfolios.
The following are the teams’ summaries of their approaches to solving this complex challenge.
Our team’s model consisted of two main parts: a return prediction model and a portfolio optimization model. We first applied a discrete wavelet transform to the given features and the target return variable. We then trained an XGBoost model to predict the decomposed returns out of sample from the decomposed features, and performed an inverse discrete wavelet transform to reconstruct the return time series. This yielded monthly out-of-sample (OOS) predictions for 14 years, from 2010 to 2024. Using these predicted returns, we rebalanced our portfolio every month by finding the set of weights that maximized expected return while respecting the given constraints (such as long-only positions and holding between 50 and 100 stocks). This was done with the open-source CVXPY optimization package. Over the tested OOS period, our model achieved a 55% hit ratio (directional accuracy) in its predictions and delivered an average annual return of 36% in excess of the risk-free rate, with positive portfolio returns in 67% of months.
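A minimal sketch of this pipeline, assuming PyWavelets (pywt) for the transforms; y, X, and mu are placeholder arrays for the return series, feature matrix, and the current month's predicted cross-section, and the walk-forward train/test splitting the team used is omitted for brevity:

```python
import numpy as np
import pywt
import xgboost as xgb
import cvxpy as cp

# 1) Decompose the target return series into wavelet coefficient bands and train
#    one XGBoost model per band. (The team decomposed the features the same way;
#    truncating X to each band's length here is a simplification.)
y_coeffs = pywt.wavedec(y, "db4", level=2)      # [approx, detail_2, detail_1]
pred_coeffs = []
for band in y_coeffs:
    model = xgb.XGBRegressor(n_estimators=200, max_depth=3)
    model.fit(X[: len(band)], band)
    pred_coeffs.append(model.predict(X[: len(band)]))

# 2) The inverse transform reconstructs the predicted return series.
y_hat = pywt.waverec(pred_coeffs, "db4")

# 3) Monthly rebalance. The 50-100 stock count is a non-convex cardinality
#    constraint, so pre-filter to the top 100 predicted names and optimize within.
#    mu: this month's cross-section of predicted returns (N,), built as in y_hat.
top = np.argsort(mu)[-100:]
w = cp.Variable(len(top))
problem = cp.Problem(cp.Maximize(mu[top] @ w),
                     [cp.sum(w) == 1, w >= 0])  # fully invested, long-only
problem.solve()
```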
Our team implemented an AI system of five analysts and one portfolio manager, mirroring a traditional portfolio management team. Each analyst focuses on a different part of the market (fundamentals, sector, options, market sentiment, momentum indicators) and provides a prediction of the upcoming month’s stock excess return over the risk-free rate. The analyst models share an RNN-MHAM-LSTM structure, except for the sentiment model, which uses two transformers. When we originally tested each analyst independently, results were below par: they took on more risk yet only marginally beat the benchmark, the S&P 500. To account for this deficit, the portfolio manager’s gating network consolidates the predictions and outputs its own forecast after evaluating the trustworthiness and bias of each analyst given the current market conditions, which it learns over time.
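One way to read the gating step is as a mixture-of-experts layer. The PyTorch sketch below is a minimal interpretation, with layer sizes chosen arbitrarily rather than taken from the team's architecture:

```python
import torch
import torch.nn as nn

class GatingNetwork(nn.Module):
    """Weighs the five analysts' forecasts by the current market conditions."""

    def __init__(self, n_market_features: int, n_analysts: int = 5):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(n_market_features, 32),
            nn.ReLU(),
            nn.Linear(32, n_analysts),
            nn.Softmax(dim=-1),   # trust weights sum to 1 across analysts
        )

    def forward(self, market_state, analyst_preds):
        # market_state: (batch, n_market_features); analyst_preds: (batch, n_analysts)
        weights = self.gate(market_state)             # learned trust per analyst
        return (weights * analyst_preds).sum(dim=-1)  # consolidated excess-return forecast
```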
To act on the portfolio manager’s final predictions, a selector script uses five ratios (Sharpe, Sortino, Treynor, Calmar, Information) and Jensen’s alpha to pick the optimal stocks for a 50-stock, long-only ideal portfolio for the upcoming month. With monthly turnover constraints in mind, a separate program creates and rebalances the actual portfolio with guidance from the ideal portfolio. Under conservative liquidity and trading-fee assumptions (15 bps turnover fee), QUARCC achieves an 18.81% CAGR with a beta of 1.02 and a 95% VaR of -17.88% over the 14 years (January 2010 to August 2024), generating an annualized alpha of 5.12%.
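As an illustration of the selector idea, the sketch below scores candidates on two of the five ratios and ranks them; the remaining ratios and Jensen's alpha would follow the same pattern, each requiring a benchmark or beta estimate, and the equal-weighting of ranks is an assumption:

```python
import pandas as pd

def sharpe(r: pd.Series) -> float:
    return r.mean() / r.std()

def sortino(r: pd.Series) -> float:
    return r.mean() / r[r < 0].std()   # penalize downside deviation only

def pick_ideal_portfolio(pred_excess: pd.DataFrame, n: int = 50) -> pd.Index:
    """pred_excess: rows = months, columns = tickers (predicted excess returns)."""
    scores = pd.DataFrame({
        "sharpe": pred_excess.apply(sharpe),
        "sortino": pred_excess.apply(sortino),
        # Treynor, Calmar, Information ratio, and Jensen's alpha would be added
        # here the same way, each needing a benchmark/beta estimate.
    })
    rank = scores.rank().mean(axis=1)   # average rank across the ratios
    return rank.nlargest(n).index       # tickers for the 50-stock ideal portfolio
```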
Our long-only portfolio of 50 small/mid-cap stocks achieved an annualized return of 28% with a Sharpe ratio of 0.95 from January 2010 to August 2024. After experimenting with linear models, multilayer perceptrons (MLPs), MLP-Mixers, long short-term memory networks (LSTMs), and Transformers, we achieved the best performance with a convolutional neural network with a bidirectional LSTM and an attention mechanism (CNN-BiLSTM-AM). The CNN and attention mechanism effectively selected significant features, while the BiLSTM learned temporal relationships between them.
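A compact PyTorch rendering of such a CNN-BiLSTM-AM; the layer widths and kernel size are illustrative, not the team's tuned hyperparameters:

```python
import torch
import torch.nn as nn

class CNNBiLSTMAM(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.conv = nn.Conv1d(n_features, 32, kernel_size=3, padding=1)  # feature extraction
        self.lstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)   # additive attention over time steps
        self.head = nn.Linear(2 * hidden, 1)   # next-month return forecast

    def forward(self, x):
        # x: (batch, time, features)
        z = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)  # (batch, time, 32)
        h, _ = self.lstm(z)                                           # (batch, time, 2*hidden)
        a = torch.softmax(self.attn(h), dim=1)                        # attention weights over time
        ctx = (a * h).sum(dim=1)                                      # attention-weighted context
        return self.head(ctx).squeeze(-1)
```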
Future work could explore optimizing portfolio construction and incorporating alternative datasets. We experimented with reinforcement learning for portfolio optimization, but the model struggled to converge due to the high dimensionality of the state space. Additionally, we developed proofs of concept for earnings-call sentiment analysis, satellite image classification/segmentation, and lawsuit tracking, which could be valuable as future signal inputs.
We developed an All-Weather Fundamental Surprise Enhanced Momentum Strategy that uses a market-regime detection algorithm to switch between two strategies: an aggressive one that aims to capture the most alpha during bull markets, and a more defensive one that focuses on low risk during bear markets. For both strategies, we added EPS surprise estimates as an extra layer to our multi-step screening method. This approach delivered an annualized return of 23.2%, an annualized alpha of 6.9%, and a Sharpe ratio of 0.84.
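The write-up does not specify the regime detector, so the snippet below is only a plausible stand-in: a trailing moving-average rule that labels each month bull or bear and routes capital to the corresponding sleeve:

```python
import pandas as pd

def detect_regime(index_prices: pd.Series, window: int = 10) -> pd.Series:
    """Label each month 'bull' if the index sits above its trailing moving average."""
    ma = index_prices.rolling(window).mean()
    return (index_prices > ma).map({True: "bull", False: "bear"})

def allocate(regime: str, aggressive_sleeve, defensive_sleeve):
    """Route this month's capital to the sleeve matching the detected regime."""
    return aggressive_sleeve() if regime == "bull" else defensive_sleeve()
```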
Our team’s strategy used a neural network, acting as a universal function approximator, to gauge the quality of each stock’s underlying company and generate a “Quality Score”. We then scraped SEC insider-trading data for each company to develop a second signal our model could use to predict returns. These two factors were optimally blended with a medium-term absolute momentum factor to reduce periods of drawdown. Each month, our model considered the stocks with the highest blended scores and used a convex optimization model to assign the optimal weights when constructing the portfolio, while staying within constraints on volatility, turnover, and position limits.
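A hedged sketch of the blend-and-optimize step: the z-score normalization, the 0.5/0.3/0.2 blend weights, and the specific constraint levels (5% position cap, 30% turnover, a volatility target) are assumptions, with quality, insider, momentum, w_prev, Sigma, and target_vol as placeholder inputs:

```python
import pandas as pd
import cvxpy as cp

def zscore(s: pd.Series) -> pd.Series:
    return (s - s.mean()) / s.std()

# Cross-sectional blend for one month; higher score = more attractive.
score = 0.5 * zscore(quality) + 0.3 * zscore(insider) + 0.2 * zscore(momentum)

# Convex allocation under volatility, turnover, and position-limit constraints.
n = len(score)
w = cp.Variable(n)
constraints = [
    cp.sum(w) == 1, w >= 0, w <= 0.05,            # long-only, 5% position cap
    cp.norm(w - w_prev, 1) <= 0.30,               # monthly turnover budget
    cp.quad_form(w, Sigma) <= target_vol ** 2,    # portfolio variance cap
]
cp.Problem(cp.Maximize(score.values @ w), constraints).solve()
```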
Our strategy outperformed the market over the 2010-2024 test period, with an average annualized portfolio return of 18.54%, an annualized Sharpe ratio of 1.25, and an information ratio of 0.72. The team was pleased with our annualized alpha of 3.18%, and the strategy performed well over several sub-timeframes. Further work includes analyzing SEC sentiment data to create an improved insider-trading signal, normalizing the momentum factor by sector, and active factor investing.
Our two-stage strategy begins with sector selection, where we classify sectors into two groups based on their predicted returns and sector-specific volatility factors. First, we rank each sector by monthly return and group them as ranks 1-5 and ranks 6-11. Our model then predicts this binary grouping; we tested multiple classifiers over historical data and ultimately selected AdaBoost based on its performance from 2000 to 2010. Using an expanding-window approach, the model produced the predicted groupings that informed our strategy. Notably, Real Estate, Materials, and Consumer Staples consistently ranked as top sectors, likely due to factors such as falling interest rates, rising gold prices, and the defensive appeal of Consumer Staples during volatile periods. Tech’s lower ranking might reflect our dual volatility filters, which adjust for trailing VIX and sector-specific volatility, since Tech tends to perform well in higher-volatility phases.
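A minimal sketch of the expanding-window classification loop, assuming scikit-learn's AdaBoostClassifier and a one-row-per-sector-month data layout; feature construction is omitted:

```python
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier

def expanding_window_predictions(X: pd.DataFrame, y: pd.Series, dates: pd.Series,
                                 first_test_year: int = 2010) -> dict:
    """y: 1 if the sector ranked 1-5 that month, else 0."""
    preds = {}
    for year in range(first_test_year, 2025):
        train = dates.dt.year < year       # all history up to the test year
        test = dates.dt.year == year
        clf = AdaBoostClassifier(n_estimators=100)
        clf.fit(X[train], y[train])
        preds[year] = clf.predict(X[test])
    return preds
```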
In the stock selection stage, our aim was to forecast returns for small- and mid-cap stocks based on fundamentals such as growth potential and financial health. We selected 29 features tied to solid fundamentals, trained our model on them, and let our algorithm suggest supplemental features, prioritizing stocks with high R&D spending and efficient asset use. Each month, we construct a starting equal-weighted portfolio of 100 stocks, constrained by liquidity (a minimum of $2M in monthly volume) but allowed to double down on stocks that keep appearing in the strongest positions. Our strategy delivered annualized returns of 20.11% with a volatility of 27.96%, outpacing the S&P 500’s 12.52% return, albeit with a more volatile profile. In future iterations, we aim to specialize model predictions by sector, refine our model selection annually to ensure the best fit with current data, and implement an adaptive turnover limit to control transaction costs.
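As a sketch of the monthly construction step described above, with the "double down" rule rendered as a simple weight bump and renormalization (an assumption):

```python
import pandas as pd

def build_portfolio(scores: pd.Series, dollar_volume: pd.Series,
                    persistent: set, n: int = 100) -> pd.Series:
    """scores/dollar_volume: indexed by ticker for one month; persistent: repeat leaders."""
    liquid = scores[dollar_volume >= 2_000_000]     # min $2M monthly volume
    top = liquid.nlargest(n)
    w = pd.Series(1.0 / n, index=top.index)         # equal-weight starting portfolio
    bump = w.index.isin(persistent)
    w[bump] *= 2.0                                  # double down on persistent names
    return w / w.sum()                              # renormalize to fully invested
```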
Our strategy used a genetic algorithm to select the 25 most promising features for predicting future stock returns. We also leveraged an LLM to diversify the portfolio by prioritizing stocks from underweighted sectors during monthly rebalancing. We used a 10-year rolling window for training the genetic algorithm (the initial training period was 2000-2010) and then fed the 25 selected features into a 3-layer neural network, which forecast individual monthly stock returns for the next 2 years. We repeated this process in 2-year increments: after the first trading period (2010-2012), the genetic algorithm was run again to select the most promising features for the next 2 years (2012-2014) based on the previous 10 years (2002-2012), and so on. We started with an equally weighted portfolio and thereafter put more weight on stocks with higher market caps, with the goal of reducing portfolio volatility when selecting new stocks to invest in.
We achieved promising results with this approach, including a Sharpe ratio of 2, which we chose to set aside because that configuration did not adhere to the 1% tracking-error constraint. After trying to minimize the volatility of our portfolio, we were unable to devise a sound portfolio logic that met the constraints, so the only Sharpe ratio we could officially report was 0.68. We nevertheless believe this approach to portfolio construction is valid and could yield positive results in real markets, given enough time for tuning. Future work should focus on running the algorithm with smaller windows, say 1 year, with more generations and more individuals per generation for better feature selection. We also suggest integrating LSTMs to assess 12 months of macroeconomic data, Transformer encoders to examine all 150 features with a 12-month lookback each, and cross-asset attention networks to study how the bottom-up and top-down approaches interact.
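A toy version of the genetic feature-selection loop; the fitness function (e.g., validation accuracy of the downstream network) is left abstract, and the population size, generation count, and union-then-trim crossover are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask(n_features: int = 150, k: int = 25) -> np.ndarray:
    """Boolean mask selecting exactly k of the n_features columns."""
    mask = np.zeros(n_features, dtype=bool)
    mask[rng.choice(n_features, k, replace=False)] = True
    return mask

def evolve(fitness, n_features=150, k=25, pop_size=50, generations=30):
    """fitness(mask) -> float, e.g., validation score of the 3-layer net on those features."""
    pop = [random_mask(n_features, k) for _ in range(pop_size)]
    for _ in range(generations):
        scores = np.array([fitness(m) for m in pop])
        parents = [pop[i] for i in np.argsort(scores)[-pop_size // 2:]]  # keep best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.choice(len(parents), 2, replace=False)
            union = np.flatnonzero(parents[a] | parents[b])   # crossover: union of genes
            child = np.zeros(n_features, dtype=bool)
            child[rng.choice(union, k, replace=False)] = True # random trim back to k
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)   # best 25-feature mask found
```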
We applied amortized causal discovery models to select the features that truly matter for stock return prediction, focusing on genuine predictors while adjusting for confounders and eliminating spurious correlations, issues that conventional methods often overlook. Additionally, we created indices and signals, including sentiment and risk scores, derived from companies’ annual SEC filings (10-Ks) by querying an LLM (Llama-3.2-3B in our case) with a chain-of-thought zero-shot approach. This provided valuable insight into each company’s financial health.
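A hedged sketch of the zero-shot chain-of-thought query using the Hugging Face transformers pipeline; the prompt wording and output format are assumptions, and the checkpoint name should be adjusted to the exact Llama-3.2-3B variant used:

```python
from transformers import pipeline

# Checkpoint name is an assumption; substitute the variant actually used.
generator = pipeline("text-generation", model="meta-llama/Llama-3.2-3B-Instruct")

PROMPT = (
    "You are a financial analyst. Read the 10-K excerpt below, reason step by step "
    "about the company's financial health, then output two integers from 1-10: "
    "SENTIMENT and RISK.\n\nExcerpt:\n{excerpt}\n\nLet's think step by step."
)

def score_filing(excerpt: str) -> str:
    """Return the raw generation; SENTIMENT/RISK scores are parsed downstream."""
    out = generator(PROMPT.format(excerpt=excerpt), max_new_tokens=256)
    return out[0]["generated_text"]
```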
We defined our stock selection criteria by combining features from the causal DAG with our custom risk metrics. We then used the Black-Litterman model to compute a covariance matrix that incorporated these return predictions. This matrix was key to asset allocation, which we carried out through min-max robust portfolio optimization, with market risk levels managed by a Hidden Markov model that captured volatility regimes.
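For reference, the standard Black-Litterman posterior with the model's forecasts treated as absolute views, plus a common two-state Gaussian HMM for the volatility regimes (hmmlearn); tau, the view uncertainty Omega, and the HMM setup are conventional assumptions rather than the team's exact choices, and market_returns is a placeholder array:

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def black_litterman(Sigma, pi, q, tau=0.05):
    """Sigma: (N, N) prior covariance; pi: (N,) equilibrium returns; q: (N,) model views."""
    P = np.eye(len(pi))                              # one absolute view per asset
    Omega = np.diag(np.diag(tau * P @ Sigma @ P.T))  # diagonal view uncertainty
    inv_tS = np.linalg.inv(tau * Sigma)
    inv_Om = np.linalg.inv(Omega)
    M = np.linalg.inv(inv_tS + P.T @ inv_Om @ P)     # posterior covariance of the mean
    mu_bl = M @ (inv_tS @ pi + P.T @ inv_Om @ q)     # posterior expected returns
    Sigma_bl = Sigma + M                             # covariance fed to the optimizer
    return mu_bl, Sigma_bl

# Volatility regimes via a two-state Gaussian HMM, one common modeling choice.
hmm = GaussianHMM(n_components=2).fit(market_returns.reshape(-1, 1))
regimes = hmm.predict(market_returns.reshape(-1, 1))  # 0/1 regime label per period
```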