I came across this Bloomberg article about the Santa Claus Rally, which, according to Investopedia, “refers to the sustained increases found in the stock market during the last five trading days of December through the first two trading days of January”. I have long been curious about market anomalies, so let’s find out more about this one this Christmas.

First attempt: S&P 500 for the 7 trading days

Based on the description, let’s do a quick sanity check. Let’s pull S&P 500 daily history, calculate average daily returns over the 7 trading days (last 5 days and first 2 days around year end), and calculate average daily returns for all other 7-trading-day windows in the history. Then let’s compare the distributions.
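To make the 7-day window concrete before touching real data, here is a minimal sketch on a toy calendar. It assumes plain business days with no exchange holidays (so Christmas Day and New Year's Day show up as "trading days", which a real exchange calendar would exclude), and it uses `head`/`tail` on a per-year group rather than the `nth(slice(...))` call used in the analysis itself.

```python
import pandas as pd

# Toy calendar: plain business days, no exchange holidays (an assumption for illustration).
days = pd.bdate_range("2020-01-01", "2021-12-31").to_series()

by_year = days.groupby(days.index.year)
last_5 = by_year.tail(5)   # last five business days of each year
first_2 = by_year.head(2)  # first two business days of each year

# The window around the 2020/2021 year end: last 5 days of 2020 plus first 2 days of 2021.
santa_window = pd.concat(
    [last_5[last_5.index.year == 2020], first_2[first_2.index.year == 2021]]
)
santa_dates = [d.strftime("%Y-%m-%d") for d in santa_window.index]
print(santa_dates)
```

With real price history the index already contains only actual trading days, so the same grouping picks the right dates without a holiday calendar.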

import yfinance as yf
import pandas as pd
import plotly.express as px


spx = yf.Ticker("^SPX")
df_spx = spx.history(period="max")

# Pre-process to calculate daily returns
df = df_spx.rename(columns={"Close": "close"})[["close"]].assign(ret=lambda x: x["close"].pct_change())

# Label Santa Claus rally periods
# Yahoo Finance S&P 500 history starts from 1927-12-30, so we start from the Christmas/New Year period of 1928.
last_n_trading_days, first_n_trading_days = 5, 2
year_end_days = (
    df.index.to_series()
    .groupby(pd.Grouper(freq="YE"), as_index=True)
    .nth(slice(-last_n_trading_days, None))["1928-01-01":"2023-12-31"]
    .index
)
year_start_days = (
    df.index.to_series()
    .groupby(pd.Grouper(freq="YE"), as_index=True)
    .nth(slice(0, first_n_trading_days))["1929-01-01":"2024-01-31"]
    .index
)
df["santa_period"] = 0
df.loc[year_end_days, "santa_period"] = df.loc[year_end_days].index.year
df.loc[year_start_days, "santa_period"] = df.loc[year_start_days].index.year - 1

# Calculate average daily returns for Santa Claus Rally periods
santa_rets = (
    df[df["santa_period"] != 0]
    .groupby("santa_period")
    .agg(ret=pd.NamedAgg(column="ret", aggfunc=lambda x: (1+x).prod()**(1/len(x))-1))["ret"]
    .dropna()
)

# Calculate average daily returns for rolling 7 trading days, and exclude the ones that overlap completely with Santa period
n_days = last_n_trading_days + first_n_trading_days
period_end_mask = (df["santa_period"] != 0).rolling(n_days).sum() == n_days
other_rets = (df["close"] / df.shift(n_days)["close"]) ** (1/n_days) - 1
other_rets = other_rets[~period_end_mask].dropna()

df_plot = pd.concat({
    "santa": santa_rets.reset_index(drop=True), 
    "other": other_rets.reset_index(drop=True),
}).reset_index(level=0).rename(columns={"level_0": "category", 0: "ret"})

fig = px.histogram(
    df_plot, 
    x="ret", 
    color="category", 
    marginal="box",
    hover_data=df_plot.columns
)
fig.show()

The result looks promising: we can see the returns of Santa Claus periods seem to have a higher / more positive median.
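The two groups in the histogram are computed in slightly different ways: the Santa periods compound daily returns, while the rolling windows use a ratio of closing prices. A small sketch with made-up prices suggests the two routes give the same geometric average daily return, so the groups should be comparable.

```python
import pandas as pd

# Made-up closing prices: one starting close followed by seven trading days.
close = pd.Series([100.0, 101.0, 99.0, 102.0, 103.0, 104.0, 103.0, 106.0])
ret = close.pct_change()
n_days = 7

# Route 1: compound the seven daily returns, then take the n-th root.
from_returns = (1 + ret.iloc[1:]).prod() ** (1 / n_days) - 1

# Route 2: ratio of the last close to the close n_days earlier.
from_prices = (close.iloc[-1] / close.shift(n_days).iloc[-1]) ** (1 / n_days) - 1

print(from_returns, from_prices)
```

Both values equal `1.06 ** (1/7) - 1`, since the compounded return over the window is just the ratio of the last close to the first.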

Let’s test whether the difference is statistically significant. Returns are not normally distributed, so let’s use the Mann-Whitney U test. Note, however, that it requires the samples to be independent, and there may be autocorrelation in our samples (the rolling 7-day windows overlap one another), which would violate that assumption.

from scipy.stats import mannwhitneyu


print(santa_rets.mean())
print(other_rets.mean())

u_stat, p_val = mannwhitneyu(santa_rets, other_rets, alternative='greater')
print(p_val)
0.0022794465565883853
0.0002409215972112875
7.25243117763683e-07

The p-value is small enough to suggest that 7-day returns during the Santa Claus period are statistically significantly higher than those of other 7-day periods in history.
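The Mann-Whitney U test assumes independent samples, and overlapping rolling windows are one obvious way to break that. The sketch below uses synthetic i.i.d. daily returns (not market data) and an arithmetic rolling mean as a simplification of the geometric average used in the analysis. It is only meant to illustrate how much correlation the overlap alone can introduce.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
daily = pd.Series(rng.normal(0.0, 0.01, size=5000))  # synthetic i.i.d. daily returns

n_days = 7
rolling_mean = daily.rolling(n_days).mean().dropna()  # overlapping windows, one per day
non_overlapping = rolling_mean.iloc[::n_days]         # every 7th window shares no days

overlap_autocorr = rolling_mean.autocorr(lag=1)
disjoint_autocorr = non_overlapping.autocorr(lag=1)
print(round(overlap_autocorr, 3), round(disjoint_autocorr, 3))
```

Adjacent overlapping windows share six of their seven days, so their averages are strongly correlated even when the underlying daily returns are not. Sampling only non-overlapping windows for the "other" group would be one way to reduce this, at the cost of a smaller sample.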

Is that it? Time for (one of) my favorite comics:

what if

Are the 5-day and 2-day periods arbitrary? What if we try other combinations?

Second attempt: S&P 500 for various days before/ after

Let’s refactor our code a little bit to make it easier.

import seaborn as sns

from typing import Tuple
from itertools import product


def label_santa_periods(df: pd.DataFrame, last_n_trading_days: int = 5, first_n_trading_days: int = 2) -> pd.DataFrame:
    # Yahoo Finance S&P 500 history starts from 1927-12-30, so we start from the Christmas/New Year period of 1928.
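    # Guard the zero case: slice(-0, None) equals slice(0, None) and would select every trading day of the year.
    year_end_slice = slice(-last_n_trading_days, None) if last_n_trading_days > 0 else slice(0, 0)
    year_end_days = df.index.to_series().groupby(pd.Grouper(freq="YE"), as_index=True).nth(year_end_slice)["1928-01-01":"2023-12-31"].index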
    year_start_days = df.index.to_series().groupby(pd.Grouper(freq="YE"), as_index=True).nth(slice(0, first_n_trading_days))["1929-01-01":"2024-01-31"].index
    df["santa_period"] = 0
    df.loc[year_end_days, "santa_period"] = df.loc[year_end_days].index.year
    df.loc[year_start_days, "santa_period"] = df.loc[year_start_days].index.year - 1
    return df

def calculate_santa_period_returns(df: pd.DataFrame) -> pd.Series:
    assert "ret" in df.columns
    assert "santa_period" in df.columns
    ret = df[df["santa_period"] != 0].groupby("santa_period").agg(ret=pd.NamedAgg(column="ret", aggfunc=lambda x: (1+x).prod()**(1/len(x))-1))["ret"]
    ret = ret.dropna()
    return ret

def calculate_other_period_returns(df: pd.DataFrame, n_days: int) -> pd.Series:
    # n_days should be equal to last_n_trading_days + first_n_trading_days
    assert "ret" in df.columns
    assert "santa_period" in df.columns
    # calculate rolling returns and exclude the ones that overlap completely with Santa period
    df_period_end_mask = (df["santa_period"] != 0).rolling(n_days).sum() == n_days
    ret = (df["close"] / df.shift(n_days)["close"]) ** (1/n_days) - 1
    ret = ret[~df_period_end_mask]
    ret = ret.dropna()
    return ret

def calculate_return_stats(santa_returns: pd.Series, other_returns: pd.Series) -> Tuple[float, float, float]:
    santa_return_mean = santa_returns.mean()
    other_return_mean = other_returns.mean()
    
    # Perform Mann-Whitney U test
    u_stat, p_val = mannwhitneyu(santa_returns, other_returns, alternative='greater')

    return santa_return_mean, other_return_mean, p_val

results = []
for last_n, first_n in product(range(10), range(10)):
    if last_n + first_n == 0:
        continue
        
    df_ = label_santa_periods(df, last_n_trading_days=last_n, first_n_trading_days=first_n)
    df_santa = calculate_santa_period_returns(df_)
    df_other = calculate_other_period_returns(df_, last_n + first_n)
    santa_return_mean, other_return_mean, p_val = calculate_return_stats(df_santa, df_other)
    results.append({
        "last_n_trading_days": last_n,
        "first_n_trading_days": first_n,
        "santa_return_mean": santa_return_mean,
        "other_return_mean": other_return_mean,
        "p_val": p_val,
    })

df_results = pd.DataFrame.from_records(results)
df_results["diff"] = df_results["santa_return_mean"] - df_results["other_return_mean"]

df_pivot = pd.pivot(df_results[df_results["p_val"] < 0.01], index="last_n_trading_days", columns="first_n_trading_days", values="diff")
df_pivot = df_pivot.reindex(index=range(10), columns=range(10))

cm = sns.light_palette("green", as_cmap=True)
(df_pivot * 100).style.background_gradient(cmap=cm, axis=None)

We run the analysis over a 10x10 grid: 0-9 trading days before year-end and 0-9 trading days after year-end. We then set the p-value threshold to 0.01 and ignore combinations where the Santa period is not significantly greater at the 99% confidence level (shown as nan). All the numbers are differences in mean daily return, in percentage points (scaled by 100) for easier reading.

first_n_trading_days 0 1 2 3 4 5 6 7 8 9
last_n_trading_days                    
0 nan nan nan nan nan nan nan nan nan nan
1 nan nan 0.211150 0.170398 0.132726 0.105281 nan nan nan nan
2 0.313475 0.222950 0.271739 0.227015 0.186110 0.154915 0.140181 0.122966 0.104687 0.095749
3 0.245671 0.194335 0.239194 0.207457 0.175224 0.149310 0.136818 0.121663 0.105178 0.096944
4 0.197345 0.165893 0.207950 0.185207 0.159753 0.138475 0.128297 0.115293 0.100723 0.093473
5 0.193772 0.168128 0.203852 0.184442 0.161855 0.142520 0.132888 0.120596 0.106739 0.099582
6 0.148660 0.133155 0.168740 0.155359 0.137931 0.122521 0.115366 0.105366 0.093577 0.087780
7 0.118427 0.108583 0.142934 0.133491 0.119679 0.107065 0.101649 0.093350 0.083156 0.078385
8 0.106123 0.098782 0.130648 0.123155 0.111338 0.100346 0.095792 0.088433 0.079183 0.074928
9 0.096526 0.090805 0.120496 0.114530 0.104299 0.094607 0.090733 0.084149 0.075709 0.071888
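The nan cells in the table come from filtering on the p-value before pivoting and then reindexing back to the full grid. A minimal sketch with made-up numbers (the `toy` values below are illustrative only, not results from the analysis) shows the mechanics:

```python
import pandas as pd

# Made-up results for a 2x2 corner of the grid; not real output.
toy = pd.DataFrame({
    "last_n_trading_days":  [1, 1, 2, 2],
    "first_n_trading_days": [0, 1, 0, 1],
    "diff":                 [0.0010, 0.0020, 0.0030, 0.0025],
    "p_val":                [0.200, 0.004, 0.001, 0.050],
})

# Keep only the significant combinations, then lay them out as a grid.
significant = toy[toy["p_val"] < 0.01]
grid = pd.pivot(
    significant,
    index="last_n_trading_days",
    columns="first_n_trading_days",
    values="diff",
)
# Restore the full grid shape; combinations that were filtered out become NaN.
grid = grid.reindex(index=range(3), columns=range(2))
print(grid * 100)
```

Only the two combinations with `p_val` below 0.01 keep a value; everything else, including the all-zero row that was never tested, shows up as NaN.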

Some observations on the above table:

  • While the original 5-days-before and 2-days-after rule of thumb is indeed one of the overperformers, the best mean overperformance is observed with the last 2 trading days of each year (2x0).
  • Overall, the shorter the period, the stronger the overperformance (which makes sense: the longer the period, the more noise there is).
  • Looking along the diagonals, the overperformance appears to come more from the year end than from the beginning of the year.

what if

These observations remind me of the turn-of-the-month effect. Is the Santa Claus Rally just an instance of that effect?

Third attempt: S&P 500 Santa vs turn-of-the-month

Let’s do some further refactoring for analyzing this.

from typing import Iterable


def label_month_end_periods(df: pd.DataFrame, last_n_trading_days: int = 5, first_n_trading_days: int = 2) -> pd.DataFrame:
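    # Guard the zero case: slice(-0, None) equals slice(0, None) and would select every trading day of the month.
    month_end_slice = slice(-last_n_trading_days, None) if last_n_trading_days > 0 else slice(0, 0)
    month_end_days = df.index.to_series().groupby(pd.Grouper(freq="ME")).nth(month_end_slice)["1928-01-01":"2024-12-01"].index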
    month_start_days = df.index.to_series().groupby(pd.Grouper(freq="ME")).nth(slice(0, first_n_trading_days))["1928-02-01":"2024-12-31"].index
    df["month_end_year"] = 0
    df["month_end_month"] = 0
    df.loc[month_end_days, "month_end_year"] = df.loc[month_end_days].index.year
    df.loc[month_end_days, "month_end_month"] = df.loc[month_end_days].index.month
    df.loc[month_start_days, "month_end_year"] = (df.loc[month_start_days].index - pd.DateOffset(months=1)).year
    df.loc[month_start_days, "month_end_month"] = (df.loc[month_start_days].index - pd.DateOffset(months=1)).month
    return df

def calculate_month_end_period_returns(df: pd.DataFrame, months: Iterable[int]) -> pd.Series:
    assert "ret" in df.columns
    assert "month_end_year" in df.columns
    assert "month_end_month" in df.columns
    ret = df[(df["month_end_year"] != 0) & (df["month_end_month"].isin(months))].groupby(["month_end_year", "month_end_month"]).agg(ret=pd.NamedAgg(column="ret", aggfunc=lambda x: (1+x).prod()**(1/len(x))-1))["ret"]
    ret = ret.dropna()
    return ret

results = []
for last_n, first_n in product(range(10), range(10)):
    if last_n + first_n == 0:
        continue
        
    df_ = label_month_end_periods(df, last_n_trading_days=last_n, first_n_trading_days=first_n)
    df_santa = calculate_month_end_period_returns(df_, [12])
    df_other = calculate_month_end_period_returns(df_, range(1, 12))
    santa_return_mean, other_return_mean, p_val = calculate_return_stats(df_santa, df_other)
    results.append({
        "last_n_trading_days": last_n,
        "first_n_trading_days": first_n,
        "santa_return_mean": santa_return_mean,
        "other_return_mean": other_return_mean,
        "p_val": p_val,
    })

df_results = pd.DataFrame.from_records(results)
df_results["diff"] = df_results["santa_return_mean"] - df_results["other_return_mean"]
df_pivot = pd.pivot(df_results[df_results["p_val"] < 0.01], index="last_n_trading_days", columns="first_n_trading_days", values="diff")
df_pivot = df_pivot.reindex(index=range(10), columns=range(10))
(df_pivot * 100).style.background_gradient(cmap=cm, axis=None)

We now run the analysis again for a 10x10 grid, but this time we compare the year-end period against the other month-end periods (i.e. the last x1 trading days of a month and the first x2 trading days of the next month).

first_n_trading_days 0 1 2 3 4 5 6 7 8 9
last_n_trading_days                    
0 nan nan 0.060410 nan nan nan nan nan nan 0.065573
1 nan nan nan nan nan nan nan nan nan nan
2 0.274768 nan 0.199429 0.148638 0.115169 0.101307 nan nan nan nan
3 0.229832 nan 0.187151 0.147004 0.118593 0.106050 0.100184 nan nan nan
4 0.193283 nan 0.169750 0.137892 0.114148 0.103513 0.098466 0.090506 0.078573 nan
5 0.216654 0.167831 0.189821 0.159397 0.135869 0.124149 0.117681 0.108793 0.096369 0.094067
6 0.172165 0.136702 0.159782 0.136056 0.117190 0.108237 0.103635 0.096502 0.085830 0.084389
7 0.150142 0.121790 0.143952 0.124201 0.108177 0.100715 0.097060 0.090913 0.081333 0.080046
8 0.138836 0.114926 0.135542 0.118338 0.104118 0.097561 0.094402 0.088833 0.079857 0.078607
9 0.131269 0.110419 0.129588 0.114365 0.101563 0.095671 0.092848 0.087675 0.079022 0.078149

We can still see the overperformance with 2x0 (i.e. the last 2 trading days of December perform better on average than the last 2 trading days of the other months). The original 5x2 configuration remains significant.
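One detail of `label_month_end_periods` is worth spelling out: the first trading days of a month are attributed to the previous month's turn-of-the-month period by subtracting `pd.DateOffset(months=1)`, so early-January days land in December of the prior year. Here is a small sketch with hand-picked dates (the `is_month_start` flags are labelled by hand for this toy example):

```python
import pandas as pd

dates = pd.DatetimeIndex(
    ["2023-12-28", "2023-12-29", "2024-01-02", "2024-01-03", "2024-02-01"]
)
# Hand-labelled for this toy example: True for days counted as "first n days of the month".
is_month_start = [False, False, True, True, True]

shifted = dates - pd.DateOffset(months=1)

# Month-start days take the (year, month) of the previous month; month-end days keep their own.
periods = [
    (s.year, s.month) if start else (d.year, d.month)
    for d, s, start in zip(dates, shifted, is_month_start)
]
print(periods)
```

The late-December and early-January dates all end up in the (2023, 12) period, which is what lets the month-12 group stand in for the Santa Claus window.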

what if

Now that we seem to have found an anomaly, does it apply to markets other than S&P 500?

Fourth attempt: Other indices

I will not post the code, which is very similar to above with minor changes in order to retrieve the data for other markets. Here is a summary:

NASDAQ 100 (^NDX)

No Santa Claus Rally effect is observed in the 10x10 grid. This is a bit surprising, but perhaps it shows that investors in growth/tech stocks, and their styles, are vastly different from those in the S&P 500.

UK FTSE 100 (^FTSE)

Overperformance is generally observed for the last 5+ trading days and decays in the new year.

first_n_trading_days 0 1 2 3 4 5 6 7 8 9
last_n_trading_days                    
0 nan nan nan nan nan nan nan nan nan nan
1 nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan
3 nan nan nan 0.179974 0.148520 nan nan nan nan nan
4 nan 0.184101 0.159588 0.180822 0.155016 0.139759 nan nan nan nan
5 0.213257 0.212895 0.187713 0.206205 0.182043 0.166095 0.132799 nan nan nan
6 0.207382 0.203049 0.182193 0.202309 0.182406 0.168328 0.137736 0.106047 nan nan
7 0.213919 0.204609 0.185808 0.206336 0.189194 0.176149 0.147406 0.116834 0.105737 nan
8 0.207518 0.196220 0.180064 0.201777 0.187663 0.176116 0.149510 0.120381 0.109588 0.096694
9 0.169017 0.160836 0.149326 0.174126 0.164323 0.155628 0.132233 0.105600 0.096155 nan

Asian indices

We can observe the rally in Hong Kong (Hang Seng Index, ^HSI) and Japan (Nikkei 225, ^N225). In Hong Kong, the overperformance is limited to periods starting 3+ days before year end AND ending exactly 3 days after year end, whereas in Japan it is almost entirely limited to periods starting 3+ days before year end and ending 1 day after year end.

What surprised me was that I did not find any significant rally in the 10x10 grid for South Korea (KOSPI, ^KS11), even though South Korea is one of the few East Asian countries that celebrate Christmas as a national holiday.
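For anyone who wants to reproduce the other-index runs, here is a hedged sketch of what the "minor changes" could look like. `INDEX_TICKERS` and `load_daily_returns` are hypothetical names introduced here, not the code actually used for this post. The helper simply repeats the S&P 500 pre-processing for an arbitrary Yahoo Finance ticker.

```python
import pandas as pd

# Yahoo Finance tickers named in this post.
INDEX_TICKERS = {
    "NASDAQ 100": "^NDX",
    "FTSE 100": "^FTSE",
    "Hang Seng": "^HSI",
    "Nikkei 225": "^N225",
    "KOSPI": "^KS11",
}


def load_daily_returns(ticker: str) -> pd.DataFrame:
    """Hypothetical helper: same pre-processing as for the S&P 500, for any ticker."""
    import yfinance as yf  # imported lazily; calling this function needs network access

    history = yf.Ticker(ticker).history(period="max")
    return history.rename(columns={"Close": "close"})[["close"]].assign(
        ret=lambda x: x["close"].pct_change()
    )
```

A call such as `load_daily_returns(INDEX_TICKERS["FTSE 100"])` would then feed the same labelling and testing functions. One thing to watch: the labelling functions hard-code date ranges that match the S&P 500 history, so they would presumably need adjusting for indices with shorter histories.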

Parting words

It is interesting to see a well-known anomaly persist in the market. Obviously, with such a marginal overperformance, and given that the rally happens only once a year, this is more of a fun exercise than a real strategy. Perhaps buying some ETFs could be a good Christmas gift?

Wish y’all a Merry Christmas and Happy New Year!