This notebook was prepared by [Donne Martin](http://donnemartin.com). Source and license info is on [GitHub](https://github.com/donnemartin/interactive-coding-challenges).

Solution Notebook

Problem: Remove duplicates from a linked list

Constraints

  • Is this a singly or doubly linked list?
    • Singly
  • Can you insert None values in the list?
    • No
  • Can you use additional data structures?
    • Implement both solutions
  • Can we assume we already have a linked list class that can be used for this problem?
    • Yes

Test Cases

  • Empty linked list -> []
  • One element linked list -> [element]
  • General case with no duplicates
  • General case with duplicates

Algorithm: Hash Map Lookup

  • For each node
    • If the node's value is in the hash map
      • Delete the node
    • Else
      • Add node's value to the hash map

Complexity:

  • Time: O(n)
  • Space: O(n)

Note:

  • Deletion requires two pointers, one to the previous node and one to the current node

Algorithm: In-Place

  • For each node
    • Compare node with every other node
      • If the node's value is in the hash map
        • Delete the node
      • Else
        • Add node's value to the hash map

Complexity:

  • Time: O(n^2)
  • Space: O(1)

Note:

  • Deletion requires two pointers, one to the previous node and one to the current node
  • We'll need to use a 'runner' to check every other node and compare it to the current node

Code


In [1]:
%run ../linked_list/linked_list.py

In [2]:
class MyLinkedList(LinkedList):

    def remove_dupes(self):
        seen_data = set()
        curr = self.head
        prev = None
        while curr is not None:
            if curr.data in seen_data:
                prev.next = curr.next
            else:
                seen_data.add(curr.data)
                prev = curr
            curr = curr.next

    def remove_dupes_in_place(self):
        curr = self.head
        while curr is not None:
            runner = curr
            while runner.next is not None:
                if runner.next.data == curr.data:
                    runner.next = runner.next.next
                else:
                    runner = runner.next
            curr = curr.next

Unit Test


In [3]:
%%writefile test_remove_duplicates.py
from nose.tools import assert_equal


class TestRemoveDupes(object):

    def test_remove_dupes(self, linked_list):
        print('Test: Empty list')
        linked_list.remove_dupes()
        assert_equal(linked_list.get_all_data(), [])

        print('Test: One element list')
        linked_list.insert_to_front(2)
        linked_list.remove_dupes()
        assert_equal(linked_list.get_all_data(), [2])

        print('Test: General case, duplicates')
        linked_list.insert_to_front(1)
        linked_list.insert_to_front(3)
        linked_list.insert_to_front(1)
        linked_list.insert_to_front(1)
        linked_list.remove_dupes()
        assert_equal(linked_list.get_all_data(), [1, 3, 2])

        print('Test: General case, no duplicates')
        linked_list.remove_dupes()
        assert_equal(linked_list.get_all_data(), [1, 3, 2])

        print('Success: test_remove_dupes\n')


def main():
    test = TestRemoveDupes()
    linked_list = MyLinkedList(None)
    test.test_remove_dupes(linked_list)


if __name__ == '__main__':
    main()


Overwriting test_remove_duplicates.py

In [4]:
run -i test_remove_duplicates.py


Test: Empty list
Test: One element list
Test: General case, duplicates
Test: General case, no duplicates
Success: test_remove_dupes