Python: How to Split or Extract Text String by Dash or Hyphen


Sometimes when you’re programming, you will come across a situation where you want to extract part of a text string but not all of it. Maybe it is in file names, column names or other text fields. It could be a number or just part of a date. Let’s see how to accomplish this with a few examples using the built-in Python split method.

In this example, I want to get the symbol of the stock from the string: AAPL-EarningsPerShare.

Here, I create a variable that holds the ‘AAPL’ part of the string by using the built-in split method. I start with the string ‘AAPL-EarningsPerShare’, split it by the dash (‘-’), and take the first part of the resulting list with [0]. If I wanted the second part, I would use [1] at the end of the expression instead.

Symbol = 'AAPL-EarningsPerShare'.split('-')[0]

Then display the result by using the print function.

print(Symbol)

output:

AAPL

Let’s get the second part of the string:

Symbol_Part2 = 'AAPL-EarningsPerShare'.split('-')[1]
print(Symbol_Part2)

output:

EarningsPerShare
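The [1] index assumes the string contains a dash; on a string without one, such as 'AAPL', the expression 'AAPL'.split('-')[1] raises an IndexError. As a minimal sketch of two ways to handle less tidy input (the strings below are made-up examples, not from a real data set), the optional second argument of split limits how many splits are made, and the partition method returns three parts even when the dash is missing:

```python
# Hypothetical label that contains more than one dash
label = 'AAPL-Earnings-Per-Share'

# A maxsplit of 1 splits only at the first dash
symbol, rest = label.split('-', 1)
print(symbol)
print(rest)

# partition returns (before, separator, after); when there is
# no dash, the separator and the after part are empty strings
before, sep, after = 'AAPL'.partition('-')
print(before)
```

output:

AAPL
Earnings-Per-Share
AAPL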

In this next example, we have a date like this: November_27_1942

Let’s get all three sections of this text string. This time we need to split the string by an underscore (‘_’) instead of a dash, and we pick each part by putting the index [0], [1] or [2] at the end.

Month = 'November_27_1942'.split('_')[0]
Day = 'November_27_1942'.split('_')[1]
Year = 'November_27_1942'.split('_')[2]

print(Month)
print(Day)
print(Year)

output:

November
27
1942

As the output shows, all three sections are separated out. The split method can be helpful when dealing with groups of similar files or text fields. See the Python documentation.
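Since split returns a list, the three parts of the date can also be captured with a single call rather than three. Here is a short sketch of that, along with the same idea applied to a group of similar names. The list file_names is an invented example, not taken from real files:

```python
# Split once and unpack the resulting list into three variables
month, day, year = 'November_27_1942'.split('_')
print(month, day, year)

# Hypothetical file names that share a Symbol-Metric pattern
file_names = ['AAPL-EarningsPerShare', 'MSFT-EarningsPerShare', 'GOOG-Revenue']

# Take the part before the dash from every name
symbols = [name.split('-')[0] for name in file_names]
print(symbols)
```

output:

November 27 1942
['AAPL', 'MSFT', 'GOOG']

Note that unpacking this way raises a ValueError if the number of parts does not match the number of variables, so it fits best when every string follows the same pattern.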


Article by Zachary Storella