Calculating Percentile Rank of Stocks Data using Python & Pandas


Many times when we work with data, we want to compare certain datapoints and put them into some kind of ranking order. Say, for example, that we want to rank our investments according to the stocks with the highest dividends. Or, say, we want to rank our investments with the lowest PE Ratios.

Using Python and the Pandas library, we can accomplish this quickly and easily. Let’s try some examples.

Calculating Percentile Rank of Stocks Data using Python & Pandas

Here we create a DataFrame of our Data of Very Large Cap stocks:

# Let's Create a Pandas DataFrame of 10 Large Cap Stocks

# import the Pandas library
import pandas as pd

# Our Stocks Data with Symbols, Dividends and PE Ratios
stocks = {'Symbol': ['AAPL','ABBV','V','JNJ','PG','KO','LLY',
                    'MA','MCD','MRK'],
'Dividend Yield': ['0.66','3.84','0.75','2.66','2.74','2.96','1.09',
                   '0.63','2.01','2.79'],
'PE Ratio': ['29.1','73.4','27.5','24.1','20.1','21.5','50.2',
            '27.4','38.2','14.6']}

# Create the DataFrame 
stock_df = pd.DataFrame(stocks)

Let’s see what the data looks like:

#print the dataframe
print(stock_df)

output:

  Symbol Dividend Yield PE Ratio
0   AAPL           0.66     29.1
1   ABBV           3.84     73.4
2      V           0.75     27.5
3    JNJ           2.66     24.1
4     PG           2.74     20.1
5     KO           2.96     21.5
6    LLY           1.09     50.2
7     MA           0.63     27.4
8    MCD           2.01     38.2
9    MRK           2.79     14.6

Now let’s go about ranking the stocks by their dividend yield from highest to lowest.

First, we need to create a new column to hold each of our stock’s rankings. We will be ranking the stocks by percentage rank and since, we want them to be highest to lowest, we use ascending=True in our rank function.

Second, we will sort the new rankings column from highest to lowest using the sort_values method in pandas.

#1 Creating new column for rankings
stock_df['Dividend_Yld_Rank'] = stock_df['Dividend Yield'].rank(ascending=True, pct=True)

#2 Sorting the rankings 
stock_df = stock_df.sort_values(['Dividend_Yld_Rank'], ascending=False)

print(stock_df)

output:

 
  Symbol Dividend Yield PE Ratio  Dividend_Yld_Rank
1   ABBV           3.84     73.4                1.0
5     KO           2.96     21.5                0.9
9    MRK           2.79     14.6                0.8
4     PG           2.74     20.1                0.7
3    JNJ           2.66     24.1                0.6
8    MCD           2.01     38.2                0.5
6    LLY           1.09     50.2                0.4
2      V           0.75     27.5                0.3
0   AAPL           0.66     29.1                0.2
7     MA           0.63     27.4                0.1

You can see from the data that ABBV, KO and MRK are the three highest dividends in our data. The data is sorted by the last column on the right (Dividend_Yld_Rank) as Percentiles. The top stock is 1.0 or at the 100th percentile, followed by the next stock in the 90th percentile, the next in the 80th percentile and so forth.

Next, let us look at ranking the stocks by the PE Ratio column. Usually, we look to buy stocks with lower PE Ratios as we are paying less for its earnings stream. This means we need to change an option in the rank function to ascending=False.

We will create a new column and this time we want to sort them from lowest to highest. The lowest PE Ratio will be the highest rank for us in this example.

#1 Creating new column for PE Ratio rankings
stock_df['PE_Ratio_Rank'] = stock_df['PE Ratio'].rank(ascending=False, pct=True)

#2 Sorting the PE Ratio rankings 
stock_df = stock_df.sort_values(['PE_Ratio_Rank'], ascending=False)

print(stock_df)

output:

  Symbol Dividend Yield PE Ratio  PE_Ratio_Rank
9    MRK           2.79     14.6            1.0
4     PG           2.74     20.1            0.9
5     KO           2.96     21.5            0.8
3    JNJ           2.66     24.1            0.7
7     MA           0.63     27.4            0.6
2      V           0.75     27.5            0.5
0   AAPL           0.66     29.1            0.4
8    MCD           2.01     38.2            0.3
6    LLY           1.09     50.2            0.2
1   ABBV           3.84     73.4            0.1

Now you can see that the top ranked stock by the lowest PE Ratio is MRK with a PE of 14.6. Next up is PG, followed by KO and then JNJ to round out the top four lowest PE stocks. At the bottom of the data we see ABBV with the highest PE Ratio of 73.4.

This was a quick and simple example of ranking a group of stocks. There are many other parameters you can employ when using pandas rank function. To go more in depth, please check out the pandas documentation.


Article by Zachary Storella – See more programming posts on our Python Page