Problem 34

https://projecteuler.net/problem=34

145 is a curious number, as 1! + 4! + 5! = 1 + 24 + 120 = 145.

Find the sum of all numbers which are equal to the sum of the factorial of their digits.

Note: as 1! = 1 and 2! = 2 are not sums they are not included.

Brute force

We can brute force this by iterating numbers 10 and above, breaking them down into their component digits, and summing up their factorials. The challenge here is when to stop. My hunch is that we can stop once the sum of the factorials is greater than the original number.

Hmm... Okay, my hunch is wrong. 9! is 362880. Therefore 1! + 9! > 19. If we had stopped here we would never find 145.


In [247]:
from scipy.special import factorial
factorial(9)


Out[247]:
array(362880.0)

In [248]:
def fac(n):
    return int(factorial(n))
fac(3)


Out[248]:
6

In [249]:
import numpy as np
N = 100000
for i in range(10, N+1):
    digits = list(''+str(i))
    factorials = list(map(lambda x: fac(int(x)), digits))
    summation = np.sum(factorials)
    #print('{} -> {} -> sum {}'.format(i, factorials, summation))
    if i == summation:
        print(i)


145
40585

Brute force in the manner above is too slow. We can do better. For example we can save the result of the summations.

I think there's a better way...

Take 2

Firstly we only need factorials for digits 0 to 9. It's much faster to lookup an array than to execute scipy's factorial function every time.


In [ ]:
from scipy.special import factorial
def create_fac():
    factorials = [int(factorial(i)) for i in range(10)]
    def f(x):
        return factorials[x]
    return f
fac = create_fac()

In [ ]:
time fac(9)

In [ ]:
time factorial(9)

In [ ]:
import numpy as np
def dfs(chain, max_length):
    concat_value = 0
    for i in chain:
        concat_value *= 10
        concat_value += i
    fac_sum = sum(map(lambda x: fac(x), chain))
    #print(concat_value, fac_sum)
    if fac_sum == concat_value and len(chain) >= 2:
        yield concat_value
    if len(chain) < max_length:
        if len(chain) == 0:
            lo = 1
        else:
            lo = 0
        for i in range(lo, 10):
            for x in dfs(chain + [i], max_length):
                yield x

In [ ]:
print(sum(map(lambda x: fac(x), [1,4,5])))

In [ ]:
list(dfs([],5))

I have no idea where take 2 is going.

Thinking in terms of multi sets

Observe that the sum of factorials for 145 is the same as the sum of factorials for any permutation of those digits. 541, 154, etc.

Let the notation {1145} denote a multi set of digits 1, 1, 4, and 5. For clarity, we will always list the smaller digits first. Also, since it is a multi set, repeated digits are allowed. For example, the repeated 1s in {1145}.

Let the function F(x) map a multi set x into a sum of factorials, as per the definition in problem 34.

Let the function M(y) map a number y into a multi set corresponding to its digits. For example M(123) == M(321).

Using this notation we can say the following about 145.

F(M(145)) == 145

Generalizing, we can rephrase problem 34 as finding all values y such that F(M(y)) == y.

Observe that M(123) is the same as M(321). That is, a single multi set maps from multiple numbers. A many to one mapping. Another way to put this is to say that a multi set maps to one or more numbers. Given this, and that we are searching for factorial sums over multi sets, it makes more sense to search the space of multi sets than to search the space of numbers.

Observe that given a multi set, there is exactly one factorial sum. Then we can evaluate if that sum is a curious number by computing its multi set. Therefore we can rephrase the problem in terms of multi sets as follows. This form is better suited for searching the space of multi sets.

M(F(x)) == x

The next piece of the puzzle is how to navigate the space of multi sets in an efficient manner. Also, how would we know when to stop?

Vector representation for multi sets

Since the only set members in our multi sets are digits, we can represent our multi sets as vectors of digit counts. For example the multi set {1145} would be a vector [0,2,0,0,1,1,0,0,0,0].

Note: The count at the 0-th position of the vector is also important because 0! == 1.

Multi set equality using this notation is when two vectors have the same count at each digit position. In Python you can directly use the == operator.

Code


In [2]:
# Define `fac(n)` function that returns the factorial of `n`.
# For values of n in 0..9. We precalculate the factorials
# since those are the only factorials we will need to solve
# this problem.
from scipy.special import factorial
def create_fac():
    factorials = [int(factorial(i)) for i in range(10)]
    def f(x):
        return factorials[x]
    return f
fac = create_fac()

In [3]:
# Define F(v) where v is a multiset in vector form.
def F(v):
    assert len(v) == 10
    s = 0
    for i in range(10):
        s += fac(i) * v[i]
    return s

In [4]:
# The following assertion should pass.
assert F([0,1,0,0,1,1,0,0,0,0]) == 145
assert F([1,0,0,0,1,2,0,0,1,0]) == 40585

In [5]:
# Vector equality works in python
assert [0,1,0,0,1,1,0,0,0,0] == [0,1,0,0,1,1,0,0,0,0]
assert [1,1,0,0,0,0,0,0,0,0] != [0,0,0,0,0,0,0,0,1,1]

In [6]:
# Define M(k) that converts a number k into a multiset vector.
def M(k):
    v = [0] * 10
    while k > 0:
        r = k % 10
        k = k // 10
        v[r] += 1
    return v

In [7]:
# The following assertions about M(k) should pass.
assert M(145) == [0,1,0,0,1,1,0,0,0,0]
assert M(541) == [0,1,0,0,1,1,0,0,0,0]
assert M(5141) == [0,2,0,0,1,1,0,0,0,0]
assert M(40585) == [1,0,0,0,1,2,0,0,1,0]

In [8]:
# Assertions to test relationships mentioned in the design.
assert F(M(145)) == 145
assert M(F([0,1,0,0,1,1,0,0,0,0])) == [0,1,0,0,1,1,0,0,0,0]

A list of factorials


In [9]:
for i in range(10):
    print('{}! = {}'.format(i, fac(i)))


0! = 1
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120
6! = 720
7! = 5040
8! = 40320
9! = 362880

Chaining factorials

Seed with a number in the first cell below. Then keep refreshing the second cell below. Seems like there is a loop. ..., 169, 363601, 1454, 169, ...


In [251]:
v = M(170)

In [264]:
print('F(v) = {}, M(F(v)) = {}'.format(F(v), M(F(v))))
v = M(F(v))


F(v) = 169, M(F(v)) = [0, 1, 0, 0, 0, 0, 1, 0, 0, 1]

Chaining factorials 2

Rather than refreshing, let's just print the chains.


In [265]:
for number in range(160,169):
    first_number = number
    known = set()
    chain = list()
    while number not in known:
        vector = M(number)
        factorial_sum = F(vector)
        #print('Number: {} -> {} -> {}'.format(number, vector, factorial_sum))
        known.add(number)
        chain.append(number)
        number = factorial_sum
    repeated_number = number
    #print('Number: {}'.format(number))
    print('Chain {} -> {}: {}'.format(first_number, repeated_number, chain))


Chain 160 -> 169: [160, 722, 5044, 169, 363601, 1454]
Chain 161 -> 169: [161, 722, 5044, 169, 363601, 1454]
Chain 162 -> 169: [162, 723, 5048, 40465, 889, 443520, 177, 10081, 40324, 57, 5160, 842, 40346, 775, 10200, 6, 720, 5043, 151, 122, 5, 120, 4, 24, 26, 722, 5044, 169, 363601, 1454]
Chain 163 -> 169: [163, 727, 10082, 40325, 153, 127, 5043, 151, 122, 5, 120, 4, 24, 26, 722, 5044, 169, 363601, 1454]
Chain 164 -> 169: [164, 745, 5184, 40465, 889, 443520, 177, 10081, 40324, 57, 5160, 842, 40346, 775, 10200, 6, 720, 5043, 151, 122, 5, 120, 4, 24, 26, 722, 5044, 169, 363601, 1454]
Chain 165 -> 169: [165, 841, 40345, 175, 5161, 842, 40346, 775, 10200, 6, 720, 5043, 151, 122, 5, 120, 4, 24, 26, 722, 5044, 169, 363601, 1454]
Chain 166 -> 169: [166, 1441, 50, 121, 4, 24, 26, 722, 5044, 169, 363601, 1454]
Chain 167 -> 169: [167, 5761, 5881, 80761, 46082, 41067, 5786, 46200, 748, 45384, 40494, 362953, 363734, 5802, 40443, 79, 367920, 368649, 404670, 5810, 40442, 75, 5160, 842, 40346, 775, 10200, 6, 720, 5043, 151, 122, 5, 120, 4, 24, 26, 722, 5044, 169, 363601, 1454]
Chain 168 -> 169: [168, 41041, 51, 121, 4, 24, 26, 722, 5044, 169, 363601, 1454]

Let's brute force again

The first cell initializes the search space. known_number records the numbers we have found so far. lo and hi are updated by the second cell after searching each region.


In [226]:
known_numbers = list()
lo = 10
hi = 100000

In [246]:
for number in known_numbers:
    print('Curious number: {}'.format(number))
for number in range(lo, hi):
    if F(M(number)) == number:
        print('Curious number: {}'.format(number))
        known_numbers.append(number)
print('hi={}'.format(hi))
print('sum of known numbers: {}'.format(sum(known_numbers)))
lo = hi
hi += 100000


Curious number: 145
Curious number: 40585
hi=4400000
sum of known numbers: 40730

Is 40730 it?

Yes! There were only two numbers all along. 145 and 40585. /facepalm

These numbers are called factorian.

http://mathworld.wolfram.com/Factorion.html

There are only 4 of them. 1! and 2! are excluded by the specification of this problem. So the answer was the sum of the other two: 145 and 40585.