If you create a Machine Learning model to predict the price of a stock:
![Graphical representation of a decision tree model used to predict stock prices, illustrating how the feature space is divided to reach more accurate predictions.](https://images.datons.ai/STOCK-ANALYSIS-2/P_tree.png)
How can you evaluate its performance if you apply it to your investment strategy?
![Illustrative scheme of applying a Machine Learning model in an investment strategy, showing the decision-making process based on price predictions.](https://images.datons.ai/STOCK-ANALYSIS-2/P_ML_Strategy.png)
Data
We start with the stock data of NVIDIA (ticker `NVDA`).
Check out this tutorial to learn how to preprocess the daily return of a stock.
import pandas as pd
df = pd.read_csv('data.csv', index_col='Date', parse_dates=True)
![Capture of NVIDIA's stock data, showing a dataset ready for analysis and application of Machine Learning models, focusing on data preparation and cleaning.](https://images.datons.ai/STOCK-ANALYSIS-2/D_raw.png)
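For readers who have not followed the preprocessing tutorial, here is a minimal sketch of how a `Change Tomorrow` column could be derived from closing prices. The `Close` column name and the sample prices are assumptions made for illustration; the actual `data.csv` may have been prepared differently.

```python
import pandas as pd

# Hypothetical closing prices; the tutorial itself loads real data from data.csv
prices = pd.DataFrame(
    {"Close": [100.0, 102.0, 101.0, 104.03]},
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# Daily percentage change, shifted by -1 so that each row holds
# the change that will happen on the following day
prices["Change Tomorrow"] = prices["Close"].pct_change().shift(-1) * 100

# The last row has no "tomorrow" yet, so it is dropped
prices = prices.dropna(subset=["Change Tomorrow"])
print(prices)
```

Shifting the target by one day is what keeps each row's features limited to information known before the move being predicted.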
Questions
- How is a Machine Learning model implemented to predict the change in closing price?
- What is the role of the `min_samples_leaf` parameter in the `DecisionTreeRegressor` algorithm?
- How do we measure the model’s error and what does it tell us about its performance?
- How do we introduce a Machine Learning model into an investment strategy?
- How do we evaluate the performance of the Machine Learning investment strategy?
Methodology
Feature selection
We want to predict the percentage change in tomorrow’s closing price. This will be the target variable, and the remaining columns will be the explanatory variables.
target = 'Change Tomorrow'
y = df[target]
X = df.drop(columns=target)
Machine learning model
We will use the `DecisionTreeRegressor` algorithm to predict the change in the closing price.
As a minimum, we want there to be 10 samples at the end of each branch of the tree, `min_samples_leaf=10`.
from sklearn.tree import DecisionTreeRegressor
model = DecisionTreeRegressor(min_samples_leaf=10)
model.fit(X, y)
![Graphical representation of a decision tree model used to predict stock prices, illustrating how the feature space is divided to reach more accurate predictions.](https://images.datons.ai/STOCK-ANALYSIS-2/P_tree.png)
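To see what `min_samples_leaf=10` does in practice, the sketch below fits two trees on synthetic data (not the NVIDIA dataset) and counts how many samples end up in each leaf. The names `X_demo`, `y_demo`, `tree`, and `unconstrained` are made up for this illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data standing in for the real features and target
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 2 * X_demo[:, 0] + rng.normal(scale=0.5, size=200)

# Tree constrained as in the tutorial
tree = DecisionTreeRegressor(min_samples_leaf=10, random_state=0)
tree.fit(X_demo, y_demo)

# apply() returns the index of the leaf each sample falls into
leaf_ids = tree.apply(X_demo)
_, leaf_sizes = np.unique(leaf_ids, return_counts=True)
smallest_leaf = leaf_sizes.min()

# Unconstrained tree for comparison: it keeps splitting until leaves are pure
unconstrained = DecisionTreeRegressor(random_state=0).fit(X_demo, y_demo)

print("Smallest leaf (constrained):", smallest_leaf)
print("Leaves (constrained):", tree.get_n_leaves())
print("Leaves (unconstrained):", unconstrained.get_n_leaves())
```

The constrained tree ends up with far fewer leaves, which is why this parameter is commonly used to limit how closely the tree memorizes the training data.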
Model evaluation
Using the split conditions the tree has learned, we calculate the model’s predictions:
y_pred = model.predict(X)
And compare them with the actual values, thus obtaining the error.
error = y - y_pred
To get a single evaluation metric for the model, we calculate the root mean square error (RMSE). If the errors are roughly normally distributed around zero, about 68% of the predictions fall within one RMSE of the actual value.
error2 = error ** 2
MSE = error2.mean()
RMSE = MSE ** 0.5
In our case, the model’s prediction of tomorrow’s percentage change typically deviates from the actual value by about 2.99 percentage points. Note that this error is measured on the same data used to fit the model.
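The 68% reading of the RMSE depends on the errors being approximately normal. The sketch below checks this on simulated errors, not on the tutorial's data; the variable names and the error scale of 3.0 are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated "actual" values and predictions with normal errors (std = 3)
y_true = rng.normal(loc=0.0, scale=3.0, size=10_000)
y_hat = y_true + rng.normal(loc=0.0, scale=3.0, size=10_000)

error = y_true - y_hat

# Same steps as in the tutorial: square, average, square root
rmse = np.sqrt(np.mean(error ** 2))

# Mean absolute error, for comparison with the RMSE
mae = np.mean(np.abs(error))

# Share of predictions whose error is within one RMSE
share_within_one_rmse = np.mean(np.abs(error) <= rmse)

print(f"RMSE: {rmse:.2f}, MAE: {mae:.2f}")
print(f"Share within one RMSE: {share_within_one_rmse:.1%}")
```

The RMSE is larger than the mean absolute error because squaring gives more weight to large misses, so it is not quite an "average deviation". With heavy-tailed errors, which are common in stock returns, the share within one RMSE can differ from 68%.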
Is this an acceptable value for our investment strategy? How could we improve it?
I read your comments to design the next tutorial.
Now let’s continue with the implementation of the investment strategy using the `backtesting.py` library.
Create investment strategy
To implement the investment strategy, we create a class that inherits the functionalities of `backtesting.Strategy`.
For this, the class requires two methods: `init` and `next`.
- `init`: initializes the strategy with the previously calculated model.
- `next`: calculates the prediction for tomorrow and decides whether to buy, sell, or do nothing.
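from backtesting import Strategy

class MLStrategy(Strategy):
    def init(self):
        # Reuse the decision tree fitted earlier
        self.model = model

    def next(self):
        # Features of the most recent bar, kept as a one-row DataFrame
        X_today = self.data.df.iloc[[-1]]
        # predict() returns a one-element array; take the scalar
        y_tomorrow = self.model.predict(X_today)[0]
        # Trade only when the predicted change exceeds one RMSE
        if y_tomorrow > RMSE:
            self.buy()
        elif y_tomorrow < -RMSE:
            self.sell()
        else:
            pass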
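The decision rule inside `next` can be isolated and tested without running a backtest. The helper below is a hypothetical function written for this illustration (it is not part of `backtesting.py`); it applies the same thresholds, using the RMSE of 2.99 reported above.

```python
def trading_signal(predicted_change: float, threshold: float) -> str:
    """Map a predicted percentage change to an action.

    Mirrors the rule in MLStrategy.next: act only when the prediction
    is larger in magnitude than the threshold (the RMSE in the tutorial).
    """
    if predicted_change > threshold:
        return "buy"
    if predicted_change < -threshold:
        return "sell"
    return "hold"


# Three hypothetical predictions, in percent
signals = [trading_signal(p, threshold=2.99) for p in (4.1, -3.5, 0.7)]
print(signals)
```

Using the RMSE as the threshold means the strategy stays out of the market whenever the predicted move is smaller than the model's typical error.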
Backtest with trading conditions
Finally, we simulate the investment strategy (aka backtest) with the following conditions to evaluate its performance.
from backtesting import Backtest
bt = Backtest(
    X, MLStrategy, cash=1_000, commission=.002,
    exclusive_orders=True, trade_on_close=True
)
results = bt.run()
In the backtest report, we observe that, after 1,533 days, we obtain a Final Equity of $11,019.19.
![Summary of the results of a backtest applying a Machine Learning-based investment strategy, highlighting the final equity and total return obtained during the test period.](https://images.datons.ai/STOCK-ANALYSIS-2/D_backtest.png)
However, simply buying and holding the stock, without a Machine Learning model, would have been easier and more profitable: a Return of 1,372.08% (vs. 1,001.92% for the strategy).
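The strategy's return follows directly from the initial cash and the final equity. A quick check of the arithmetic, using only the figures reported above:

```python
initial_cash = 1_000.00      # cash passed to Backtest
final_equity = 11_019.19     # Final Equity from the backtest report

# Total return of the strategy, in percent
strategy_return_pct = (final_equity / initial_cash - 1) * 100

# Buy & hold return as reported in the backtest
buy_and_hold_return_pct = 1_372.08

# How many percentage points the strategy gave up versus buy & hold
gap = buy_and_hold_return_pct - strategy_return_pct

print(f"Strategy return: {strategy_return_pct:.2f}%")
print(f"Gap vs. buy & hold: {gap:.2f} percentage points")
```

In `backtesting.py`, these figures should be available in the object returned by `bt.run()`, under labels such as `Equity Final [$]`, `Return [%]`, and `Buy & Hold Return [%]`; check the printed report for the exact names in your version.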
How could we improve the Machine Learning investment strategy? I read your comments.
Visualize backtest simulation
Finally, we visualize the backtest simulation to better understand the performance of the investment strategy.
bt.plot()
In addition to performance metrics, we observe one that is crucial for evaluating the investment strategy: the Drawdown.
This metric tells us how far the equity falls from a previous peak, that is, how much loss we would have had to endure without closing the position.
![Interactive chart generated by the backtesting.py library, offering a detailed visualization of the investment strategy's performance over time, including key metrics such as drawdown.](https://images.datons.ai/STOCK-ANALYSIS-2/P_bokeh_plot.png)
In other words, Drawdown measures the risk of the investment strategy.
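As a minimal sketch of how drawdown is usually computed, the example below uses a made-up equity curve (not the backtest's actual equity) and measures each point's distance from the running peak.

```python
import pandas as pd

# Hypothetical equity curve, in dollars
equity = pd.Series([1000.0, 1200.0, 900.0, 1100.0, 1500.0, 1350.0])

# Highest equity reached so far at each point in time
running_peak = equity.cummax()

# Drawdown: percentage distance below the running peak (0 at new highs)
drawdown_pct = (equity / running_peak - 1) * 100

# Maximum drawdown: the deepest fall from a peak over the whole period
max_drawdown_pct = drawdown_pct.min()

print(drawdown_pct.round(2).tolist())
print(f"Max drawdown: {max_drawdown_pct:.2f}%")
```

Here the deepest fall is from 1,200 to 900, a drawdown of 25%, even though the curve later reaches a new high.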
If you want to delve into integrating Machine Learning models into investment strategies, I invite you to check out this course.
Conclusions
- Machine Learning Model: `DecisionTreeRegressor` is a tree algorithm that selects the most significant historical patterns to predict changes in prices.
- Parameter `min_samples_leaf`: `min_samples_leaf=10` limits overfitting by requiring a minimum number of samples in each leaf of the tree, which helps the model generalize.
- Error Measurement: `RMSE` quantifies the deviation of predictions from actual values; with roughly normal errors, about 68% of predictions fall within one RMSE.
- Introduction in Investment Strategy: a `Strategy` subclass with `init` and `next` integrates the model’s predictions into trading decisions.
- Performance Evaluation: `Backtest` allows us to simulate the investment strategy with customized trading conditions.