This notebook was prepared by [Donne Martin](http://donnemartin.com). Source and license info is on [GitHub](https://github.com/donnemartin/interactive-coding-challenges).

Challenge Notebook

Problem: Compress a string such that 'AAABCCDDDD' becomes 'A3B1C2D4'. Only compress the string if it saves space.

Constraints

  • Can we assume the string is ASCII?
    • Yes
    • Note: Unicode strings could require special handling depending on your language
  • Can you use additional data structures?
    • Yes
  • Is this case sensitive?
    • Yes

Test Cases

  • None -> None
  • '' -> ''
  • 'AABBCC' -> 'AABBCC'
  • 'AAABCCDDDD' -> 'A3B1C2D4'

Algorithm

Refer to the Solution Notebook. If you are stuck and need a hint, the solution notebook's algorithm discussion might be a good place to start.

Code


In [151]:
compressed = []

def compre_string(string):
    if string: == None or len(string) == 0:
        return string
    for x in string:
        pair = []
        if string.count(x) > 1:
            pair.append(x)
            pair.append(str(string.count(x)))
            if pair[0] in "".join(compressed):
                pass
            else: 
                compressed.append("".join(pair))
        else:
            pair.append(x)
            pair.append(str(string.count(x)))
            if pair[0] in "".join(compressed):
                pass
            else: 
                compressed.append("".join(pair))
    if len("".join(compressed)) < len(string):
        return "".join(compressed)
    else:
        return string
compre_string('AAABBCCDDEEEF')


Out[151]:
'AAABBCCDDEEEF'

In [4]:
def compress_string(string):
    """
    Returns a compressed string
    
    takes a string and checks if it can be compressed. Returns compressed string if possible,
    otherwise returns input string
    
    Parameters
    ------------
    Input: 
    string: string
    
    Output:
    compressed: string
    a string of compressed characters
    
    string: string
    the original input string if compression isnt possible
    
    """
    if string: 
        compressed = []
        num = 0
        prev = string[0]
        for x in string:
            if x == prev:
                num += 1 # adds 1 to count if character matches previous iteration
            else: 
                compressed.append(prev) # appends new character to list
                compressed.append(str(num)) # appends a count of 1 next to newly added character
                num = 1
                prev = x
        compressed.append(x)
        compressed.append(str(num))
        if len("".join(compressed)) < len(string):
            return "".join(compressed)
        else:
            return string
    else:
        return string

Unit Test

The following unit test is expected to fail until you solve the challenge.


In [5]:
# %load test_compress.py
from nose.tools import assert_equal


class TestCompress(object):

    def test_compress(self, func):
        assert_equal(func(None), None)
        assert_equal(func(''), '')
        assert_equal(func('AABBCC'), 'AABBCC')
        assert_equal(func('AAABCCDDDD'), 'A3B1C2D4')
        print('Success: test_compress')


def main():
    test = TestCompress()
    test.test_compress(compress_string)


if __name__ == '__main__':
    main()


Success: test_compress

Solution Notebook

Review the Solution Notebook for a discussion on algorithms and code solutions.