Get the first element after split

str.split finds all the occurrences of a separator and returns a list.

If you only need the first element then this is inefficient in two ways:

  • It keeps searching for the separator beyond the first occurrence
  • It returns all the other elements in the list, which are not needed
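
To make the two points above concrete, here is a minimal sketch. The variable names (text, parts, first_index) are illustrative only and are not part of the notebook.

```python
text = 'some_group_1'

# str.split scans the whole string and builds every piece,
# even though only the first one is used afterwards.
parts = text.split('_')
print(parts)       # ['some', 'group', '1']
print(parts[0])    # some

# str.find stops at the first occurrence; slicing up to that index
# yields the first piece without building the others.
first_index = text.find('_')
print(text[:first_index])  # some
```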

In [1]:
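def get_first_split(input_string, split_by=' '):
    """Return the text before the first occurrence of split_by, stripped."""
    try:
        input_string_stripped = input_string.strip()
        # find() stops at the first occurrence of the separator
        split_by_index = input_string_stripped.find(split_by)
        if split_by_index == -1:
            # Separator not found: return the whole stripped string
            return input_string_stripped
        return input_string_stripped[:split_by_index].strip()
    except AttributeError:
        # Input without strip()/find() (e.g. None) yields an empty string
        return ''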

In [2]:
get_first_split(' this is  _ a test', '_')


Out[2]:
'this is'

In [3]:
get_first_split('some_group_1', '_')


Out[3]:
'some'
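
For comparison, the same result can probably be obtained with the built-in str.partition, which also stops searching at the first occurrence of the separator. This is an alternative sketch, not the approach used in this notebook: the name get_first_partition is hypothetical, and because partition still builds the tail string, no claim is made here about its relative speed.

```python
def get_first_partition(input_string, split_by=' '):
    """Hypothetical variant of get_first_split built on str.partition."""
    try:
        # partition returns (head, separator, tail); if the separator is
        # absent, head is the whole string and the other two are empty.
        head, _sep, _tail = input_string.strip().partition(split_by)
    except AttributeError:
        # Mirror get_first_split: input without strip() gives ''
        return ''
    return head.strip()


print(get_first_partition(' this is  _ a test', '_'))  # this is
print(get_first_partition('some_group_1', '_'))        # some
```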

Performance compared to str.split


In [8]:
x = 'XQWRQW' * 100000
x += ' ' + ('XWQQWFW' * 100000)

In [9]:
len(x)


Out[9]:
1300001

In [46]:
%%timeit
x.split()[0]


1000 loops, best of 3: 1.43 ms per loop

In [12]:
%%timeit
x.split(maxsplit=1)[0]


1000 loops, best of 3: 700 µs per loop

In [47]:
%%timeit
get_first_split(x)


10000 loops, best of 3: 53.4 µs per loop
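
The %%timeit cells above only run inside IPython. A rough equivalent for a plain Python script, using the standard timeit module, might look like the sketch below. The loop and repeat counts (20 and 3) and the names candidates and results are assumptions chosen for illustration, and the timings will differ from machine to machine.

```python
import timeit


def get_first_split(input_string, split_by=' '):
    # Same logic as the function defined in In [1]
    try:
        input_string_stripped = input_string.strip()
        split_by_index = input_string_stripped.find(split_by)
        if split_by_index == -1:
            return input_string_stripped
        return input_string_stripped[:split_by_index].strip()
    except AttributeError:
        return ''


# Same test string as in In [8]
x = 'XQWRQW' * 100000
x += ' ' + ('XWQQWFW' * 100000)

candidates = {
    'x.split()[0]': lambda: x.split()[0],
    'x.split(maxsplit=1)[0]': lambda: x.split(maxsplit=1)[0],
    'get_first_split(x)': lambda: get_first_split(x),
}

number = 20  # calls per timing run (assumed value)
results = {}
for label, func in candidates.items():
    # Take the best of 3 runs, as %%timeit reports the best run
    best = min(timeit.repeat(func, number=number, repeat=3))
    results[label] = best / number
    print(f'{label}: {results[label] * 1e6:.1f} µs per call')
```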