Preprocess and analyze stock returns with Python

Learn to statistically analyze the stock performance of a company with Python.

Who were the lucky ones who bought NVIDIA shares the day before they rose more than 20% in a single day?

Daily return histogram of NVIDIA shares, showing a distribution that includes a peak of more than 20% increase in a single day.
F1. Daily Return Histogram of NVIDIA

In this tutorial, we will calculate the stock returns of NVIDIA to analyze their statistical behavior.

Data

Thanks to the yfinance library, we can download the stock data of NVIDIA with its ticker NVDA.

import yfinance as yf

df = yf.download('NVDA')
Raw data downloaded from NVIDIA's stock using the yfinance library, showing opening, closing, high, and low prices.
F2. Raw data of NVIDIA via yfinance.

Questions

  1. How to download stock data of a specific company using Python?
  2. What command allows filtering data by specific dates?
  3. How is the daily return of a stock calculated?
  4. What is the method to visualize the distribution of daily return?
  5. How to interpret the distribution of daily return?
  6. How is the cumulative return of an investment calculated?

Methodology

Filtering dates of interest

We use loc to filter the table from the start of the 2020s to today.

df = df.loc['2020-01-01':]
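To see what this slice does without downloading anything, here is a minimal sketch on a small synthetic table. The dates and prices are made up for illustration, and it assumes the index is a sorted DatetimeIndex, which is what yf.download normally returns.

```python
import pandas as pd

# Hypothetical prices straddling the turn of the decade (illustrative values only)
idx = pd.to_datetime(['2019-12-30', '2019-12-31', '2020-01-02', '2020-01-03'])
demo = pd.DataFrame({'Adj Close': [58.0, 58.6, 59.7, 58.8]}, index=idx)

# Label-based slice: keep every row dated 2020-01-01 or later.
# The start date does not need to exist in the index.
demo_2020s = demo.loc['2020-01-01':]

print(demo_2020s)
```

Only the two rows from 2020 survive the filter, even though 2020-01-01 itself was not a trading day in this toy table.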

Let’s visualize the evolution of the closing prices on the stock exchange.

df['Adj Close'].plot()
Evolution of the adjusted closing prices of NVIDIA's shares since 2020, highlighting trends and volatilities.
F3. Adjusted Closing Prices of NVIDIA

Daily return

We use the pct_change function to calculate the daily return, which is nothing more than the percentage variation of the closing price compared to the previous day.

df['Return Daily'] = df['Adj Close'].pct_change()

The first day has no return, as there is no previous day to compare it with.
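As a quick check of what pct_change computes, the sketch below applies it to three made-up prices and compares the result with the formula written out by hand, price today divided by price yesterday, minus one. The numbers are illustrative and not NVIDIA data.

```python
import pandas as pd

# Hypothetical closing prices for three consecutive days
prices = pd.Series([100.0, 110.0, 99.0], name='Adj Close')

# Built-in percentage change between consecutive rows
daily = prices.pct_change()

# Same calculation spelled out: p_t / p_(t-1) - 1
manual = prices / prices.shift(1) - 1

print(pd.DataFrame({'pct_change': daily, 'manual': manual}))
```

The first row is NaN in both columns because there is no previous price, and the remaining rows show a 10% rise followed by a 10% fall.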

Data snapshot showing the calculation of the daily return of NVIDIA's shares, indicating day-to-day percentage variations.
F4. Daily Return Calculated

Distribution of daily return

The distribution of daily returns shows that, on its worst day, the stock fell almost 20%.

On the positive side, the maximum daily return exceeded 20%.

Blessed are those who bought the day before the 20% rise.

df['Return Daily'].plot.hist(bins=50)
Daily return histogram of NVIDIA shares, showing a distribution that includes a peak of more than 20% increase in a single day.
F1. Daily Return Histogram of NVIDIA
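The histogram shows how large the extremes are, but not when they happened. One way to find those dates is idxmax and idxmin, sketched here on a tiny made-up series that stands in for df['Return Daily']. The dates and values are hypothetical.

```python
import pandas as pd

# Hypothetical daily returns indexed by date (illustrative values only)
idx = pd.to_datetime(['2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07'])
demo_returns = pd.Series([-0.006, -0.19, 0.24, 0.025], index=idx)

# Dates of the best and worst single-day returns
best_day = demo_returns.idxmax()
worst_day = demo_returns.idxmin()

print('Best day: ', best_day.date(), f'{demo_returns.max():.1%}')
print('Worst day:', worst_day.date(), f'{demo_returns.min():.1%}')
```

On the real table, the same two calls applied to df['Return Daily'] would return the dates of the largest rise and the largest fall.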

If daily returns followed a normal distribution, about 68% of them would fall within one standard deviation of the mean: 0.00312 \(\pm\) 0.0342 (\(\mu \pm \sigma\)). That is, between -3.1% and 3.7%.

df['Return Daily'].describe()
Descriptive statistics of the daily return of NVIDIA, including mean, standard deviation, and percentiles, to analyze the distribution of returns.
F5. Daily Return Statistics
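The one-standard-deviation range can be derived directly from the output of describe. The sketch below does so on synthetic returns generated with the mean and standard deviation quoted above. This is an assumption made so the example runs offline; on real data you would use df['Return Daily'] instead. It also measures what share of days actually falls inside the band, which is a simple way to check how well the normal approximation holds.

```python
import numpy as np
import pandas as pd

# Synthetic returns drawn from a normal distribution
# (stand-in for df['Return Daily']; not real NVIDIA data)
rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(loc=0.00312, scale=0.0342, size=10_000))

# Mean and standard deviation as reported by describe()
stats = returns.describe()
lower = stats['mean'] - stats['std']
upper = stats['mean'] + stats['std']

# Share of days whose return falls inside the one-standard-deviation band
share_inside = returns.between(lower, upper).mean()

print(f'Range: {lower:.1%} to {upper:.1%}')
print(f'Share of days inside the range: {share_inside:.1%}')
```

For normally distributed data the share comes out close to 68%. On real stock returns it may differ, so it is worth running the same check on the actual column.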

Cumulative return

Lastly, let’s calculate how much money we would have for each dollar invested if we had bought the share at the start of the decade.

df['Return Cumulative'] = (df['Return Daily']
 .fillna(0)
 .add(1)
 .cumprod()
)

Wow, every dollar invested in NVIDIA’s stock at the start of the decade would have turned into around 15 dollars.

Graph of the cumulative return of an investment in NVIDIA's shares from the start of 2020, showing exponential growth.
F6. Cumulative Return of NVIDIA
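To make the chain fillna, add, cumprod concrete, here is a small worked example on four made-up prices. It also checks that the cumulative return is the same as dividing each price by the first one, which is why the result reads as "dollars today per dollar invested on day one".

```python
import pandas as pd

# Hypothetical closing prices (illustrative values only)
prices = pd.Series([100.0, 110.0, 104.5, 125.4])

# Daily returns: NaN, +10%, -5%, +20%
daily = prices.pct_change()

# Growth of one dollar: treat the first day as 0% return,
# turn each return into a growth factor (1 + r), then compound
growth = daily.fillna(0).add(1).cumprod()

# Equivalent shortcut: each price relative to the first price
ratio = prices / prices.iloc[0]

print(pd.DataFrame({'growth': growth, 'ratio': ratio}))
```

Both columns end at about 1.254: one dollar becomes roughly 1.25 dollars after the three daily moves.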


Conclusions

  1. Stock Data Download: yf.download allows downloading the stock data of a company using its ticker.
  2. Date Filtering: loc['YYYY-MM-DD':] filters the data for a specific date range.
  3. Daily Return Calculation: pct_change calculates the percentage variation between consecutive rows.
  4. Visualization of the Return Distribution: plot.hist displays the distribution of daily returns, representing the stock’s volatility.
  5. Interpretation of the Return Distribution: describe returns summary statistics (mean, standard deviation, percentiles) that let us estimate the range of returns we can expect with a certain level of confidence.
  6. Cumulative Return: cumprod calculates the cumulative product to take into account the effect of reinvesting returns.
