Visually comparing contracts from Iberian gas market MIBGAS

Discover how to analyze and visualize MIBGAS contract prices using Pandas and Matplotlib. This tutorial guides you through data filtering, date handling, and interpolation techniques.

Line graph showing the increase in future gas prices compared to current prices.
F1. Visualizing contango: future prices exceed current prices

If I have a factory that consumes natural gas, it’s not the same to buy it for consumption the next day as for the next year. Prices vary depending on the delivery date, and it’s important to have an overview of how they behave to make purchasing decisions.

In Figure 1, we see that, in general, future prices are higher than current prices. This is known as contango.

As they say in Argentina: it takes two to tango.

All very clear, but… how do we process the raw data provided by MIBGAS for a contract comparison visualization?

Data

Each row represents a trading date for a specific contract depending on when the gas will be delivered.

  • GDAES_D+1: Next day
  • GMAES: Next month
  • GQES_Q+1: Next quarter
  • GYES_Y+1: Next year

In this tutorial, we work with 2024 data in CSV format, obtained from MIBGAS.

import pandas as pd
df = pd.read_csv('data/MIBGAS_Data_2024.csv', sep=';', skiprows=1)
Preview of raw MIBGAS data in CSV format with multiple columns.
F2. Initial MIBGAS data before processing

Questions

  1. Why filter and rename the columns of the DataFrame?
  2. How do we restructure the contract categories into columns of the DataFrame?
  3. What function is used to convert text to dates?
  4. How to visualize data directly from the DataFrame with a function?
  5. What smart technique is applied to fill missing data in time series?
  6. Why is it important for the time series to be sorted by date?

Methodology

Select and rename columns

We will compare the different gas prices by contract type and trading date.

df = df.filter(regex='Trading|Product|MIBGAS')
df.columns = ['date_trading', 'contract', 'price']
Pandas DataFrame showing specific columns: date, contract, and price
F3. Filtered DataFrame

Restructure contracts into columns

To simplify working with the DataFrame, we will restructure it so that each column represents a contract and each row a trading day.

df = df.pivot(index='date_trading', columns='contract', values='price')
Pivoted DataFrame with contracts in columns for simplified analysis.
F4. Contracts reorganized as columns

Format date column

By default, dates are in string format. We will convert it to datetime so that visualization and analysis functions take into account the temporal nature of the data.

df.index = pd.to_datetime(df.index, dayfirst=True)

Sort rows by date

It is very important that the data is sorted by date so that visualization is done in the correct order.

df.sort_index(inplace=True)

Comparative column visualization

Since the pandas DataFrame is connected to the matplotlib library, we can visualize the data directly using the plot function.

df.plot();
Smoothed line graph of gas contract prices, result of linear interpolation.
F5. Comparative contracts after interpolation

We observe many irregular jumps due to missing data on some days, probably due to holidays or weekends.

Linear interpolation

We will use linear interpolation to fill these gaps, making the visualization clearer.

df.interpolate(method='linear', inplace=True)

Now we can compare the different gas contracts over time.

Line graph showing the increase in future gas prices compared to current prices.
F6. Visualizing contango: future prices exceed current prices

If we observe that lighter colors indicate a more distant horizon and are mostly positioned at the top of the graph, we can conclude that, in the long term, gas prices are higher.

Conclusions

Thanks to this tutorial, you now know how to:

  1. Filter & Rename Columns: Simplify the DataFrame to work only with the variables you’re interested in and rename them for easier handling.
  2. Restructure DataFrames: The pivot function allows you to restructure categories from one column into independent columns, simplifying data visualization and analysis.
  3. Transform Text to Dates: Using pd.to_datetime to convert strings to datetime objects is essential for efficiently manipulating time series.
  4. Data Visualization: The integration of Pandas with Matplotlib enables direct and effective visualizations from the DataFrame.
  5. Linear Interpolation: Linear interpolation is a key technique for dealing with missing data, allowing for a continuous representation of time series.
  6. Sort Time Series: It’s crucial to sort data by date to ensure temporal coherence in analysis and visualization.

Take a step forward and learn to develop algorithms and applications with our digital courses in Udemy.

Great! You’ve successfully signed up.

Welcome back! You've successfully signed in.

You've successfully subscribed to datons | Practical Python for Consulting & Training.

Success! Check your email for magic link to sign-in.

Success! Your billing info has been updated.

Your billing was not updated.