Sometimes computing the likelihood is not as fast as we would like. Theano provides handy profiling tools, and pymc3 wraps them in model.profile, which returns a ProfileStats object. Here we'll profile the likelihood and its gradient for the stochastic volatility example.
First we build the model.
In [1]:
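import numpy as np
from pymc3 import *
from pymc3.math import exp
from pymc3.distributions.timeseries import *

# Use the last n daily returns from the S&P 500 data set.
n = 400
returns = np.genfromtxt(get_data('SP500.csv'))[-n:]

with Model() as model:
    # Priors on the volatility step size and the Student-t degrees of freedom.
    sigma = Exponential('sigma', 1. / .02, testval=.1)
    nu = Exponential('nu', 1. / 10)
    # Latent log-volatility follows a Gaussian random walk.
    s = GaussianRandomWalk('s', sigma ** -2, shape=n)
    # Observed returns, with precision driven by the latent volatility.
    r = StudentT('r', nu, lam=exp(-2 * s), observed=returns)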
Then call profile and summarize it.
In [2]:
model.profile(model.logpt).summary()
Function profiling
==================
Message: /home/jovyan/pymc3/pymc3/model.py:605
Time in 1000 calls to Function.__call__: 1.775136e-01s
Time in Function.fn.__call__: 1.416550e-01s (79.800%)
Time in thunks: 8.041668e-02s (45.302%)
Total compile time: 1.353232e+00s
Number of Apply nodes: 20
Theano Optimizer time: 6.614311e-01s
Theano validate time: 4.212379e-03s
Theano Linker time (includes C, CUDA code generation/compiling): 6.327283e-01s
Import time 3.420997e-02s
Node make_thunk time 6.312668e-01s
Node Elemwise{Composite{(Switch(Identity(GT(i0, i1)), (i2 - (i3 * i0)), i4) + i5 + Switch(Identity(GT(i6, i1)), (i7 - (i8 * i6)), i4) + i9 + i10 + i11)}}[(0, 0)](Elemwise{exp,no_inplace}.0, TensorConstant{0}, TensorConstant{3.9120230674743652}, TensorConstant{50.0}, TensorConstant{-inf}, sigma_log_, Elemwise{exp,no_inplace}.0, TensorConstant{-2.3025850929940455}, TensorConstant{0.1}, nu_log_, Sum{acc_dtype=float64}.0, Sum{acc_dtype=float64}.0) time 5.813510e-01s
Node InplaceDimShuffle{x}(sigma) time 5.518913e-03s
Node Elemwise{Composite{Switch(Identity((GT(Composite{exp((i0 * i1))}(i0, i1), i2) * i3 * GT(inv(sqrt(Composite{exp((i0 * i1))}(i0, i1))), i2))), (((i4 + (i5 * log(((i6 * Composite{exp((i0 * i1))}(i0, i1)) / i7)))) - i8) - (i5 * i9 * log1p(((Composite{exp((i0 * i1))}(i0, i1) * i10) / i7)))), i11)}}(TensorConstant{(1,) of -2.0}, s, TensorConstant{(1,) of 0}, Elemwise{gt,no_inplace}.0, Elemwise{Composite{scalar_gammaln((i0 * i1))}}.0, TensorConstant{(1,) of 0.5}, TensorConstant{(1,) of 0...8309886184}, InplaceDimShuffle{x}.0, Elemwise{Composite{scalar_gammaln((i0 * i1))}}.0, Elemwise{add,no_inplace}.0, TensorConstant{[ 4.05769..48400e-06]}, TensorConstant{(1,) of -inf}) time 5.294800e-03s
Node Elemwise{Composite{inv(sqr(i0))}}(InplaceDimShuffle{x}.0) time 4.826069e-03s
Node Elemwise{Composite{log((i0 * i1))}}(TensorConstant{(1,) of 0...9154943092}, Elemwise{Composite{inv(sqr(i0))}}.0) time 4.477978e-03s
Time in all call to theano.grad() 0.000000e+00s
Time since theano import 10.620s
Class
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
71.7% 71.7% 0.058s 4.81e-06s C 12000 12 theano.tensor.elemwise.Elemwise
7.4% 79.1% 0.006s 2.98e-06s C 2000 2 theano.tensor.elemwise.Sum
7.2% 86.3% 0.006s 2.90e-06s C 2000 2 theano.tensor.subtensor.Subtensor
7.0% 93.4% 0.006s 2.83e-06s C 2000 2 theano.tensor.elemwise.DimShuffle
6.6% 100.0% 0.005s 2.66e-06s C 2000 2 theano.compile.ops.ViewOp
... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
Ops
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
30.5% 30.5% 0.025s 2.45e-05s C 1000 1 Elemwise{Composite{Switch(Identity((GT(Composite{exp((i0 * i1))}(i0, i1), i2) * i3 * GT(inv(sqrt(Composite{exp((i0 * i1))}(i0, i1))), i2))), (((i4 + (i5 * log(((i6 * Composite{exp((i0 * i1))}(i0, i1)) / i7)))) - i8) - (i5 * i9 * log1p(((Composite{exp((i0 * i1))}(i0, i1) * i10) / i7)))), i11)}}
7.5% 38.0% 0.006s 3.02e-06s C 2000 2 Elemwise{exp,no_inplace}
7.4% 45.4% 0.006s 2.98e-06s C 2000 2 Sum{acc_dtype=float64}
7.0% 52.5% 0.006s 2.83e-06s C 2000 2 InplaceDimShuffle{x}
7.0% 59.4% 0.006s 2.81e-06s C 2000 2 Elemwise{Composite{scalar_gammaln((i0 * i1))}}
6.6% 66.1% 0.005s 2.66e-06s C 2000 2 ViewOp
5.1% 71.2% 0.004s 4.14e-06s C 1000 1 Elemwise{Composite{Switch(i0, (i1 * ((-(i2 * sqr((i3 - i4)))) + i5)), i6)}}
3.9% 75.1% 0.003s 3.16e-06s C 1000 1 Elemwise{Composite{(Switch(Identity(GT(i0, i1)), (i2 - (i3 * i0)), i4) + i5 + Switch(Identity(GT(i6, i1)), (i7 - (i8 * i6)), i4) + i9 + i10 + i11)}}[(0, 0)]
3.8% 78.9% 0.003s 3.02e-06s C 1000 1 Subtensor{int64::}
3.6% 82.5% 0.003s 2.93e-06s C 1000 1 Elemwise{gt,no_inplace}
3.6% 86.2% 0.003s 2.92e-06s C 1000 1 Elemwise{Composite{log((i0 * i1))}}
3.6% 89.8% 0.003s 2.89e-06s C 1000 1 Elemwise{Composite{Identity(GT(inv(sqrt(i0)), i1))}}
3.5% 93.2% 0.003s 2.78e-06s C 1000 1 Subtensor{:int64:}
3.4% 96.7% 0.003s 2.76e-06s C 1000 1 Elemwise{add,no_inplace}
3.3% 100.0% 0.003s 2.69e-06s C 1000 1 Elemwise{Composite{inv(sqr(i0))}}
... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
Apply
------
<% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
30.5% 30.5% 0.025s 2.45e-05s 1000 16 Elemwise{Composite{Switch(Identity((GT(Composite{exp((i0 * i1))}(i0, i1), i2) * i3 * GT(inv(sqrt(Composite{exp((i0 * i1))}(i0, i1))), i2))), (((i4 + (i5 * log(((i6 * Composite{exp((i0 * i1))}(i0, i1)) / i7)))) - i8) - (i5 * i9 * log1p(((Composite{exp((i0 * i1))}(i0, i1) * i10) / i7)))), i11)}}(TensorConstant{(1,) of -2.0}, s, TensorConstant{(1,) of 0}, Elemwise{gt,no_inplace}.0, Elemwise{Composite{scalar_gammaln((i0 * i1))}}.0, TensorConstant{(1,) o
5.1% 35.6% 0.004s 4.14e-06s 1000 15 Elemwise{Composite{Switch(i0, (i1 * ((-(i2 * sqr((i3 - i4)))) + i5)), i6)}}(Elemwise{Composite{Identity(GT(inv(sqrt(i0)), i1))}}.0, TensorConstant{(1,) of 0.5}, Elemwise{Composite{inv(sqr(i0))}}.0, Subtensor{int64::}.0, Subtensor{:int64:}.0, Elemwise{Composite{log((i0 * i1))}}.0, TensorConstant{(1,) of -inf})
3.9% 39.6% 0.003s 3.16e-06s 1000 19 Elemwise{Composite{(Switch(Identity(GT(i0, i1)), (i2 - (i3 * i0)), i4) + i5 + Switch(Identity(GT(i6, i1)), (i7 - (i8 * i6)), i4) + i9 + i10 + i11)}}[(0, 0)](Elemwise{exp,no_inplace}.0, TensorConstant{0}, TensorConstant{3.9120230674743652}, TensorConstant{50.0}, TensorConstant{-inf}, sigma_log_, Elemwise{exp,no_inplace}.0, TensorConstant{-2.3025850929940455}, TensorConstant{0.1}, nu_log_, Sum{acc_dtype=float64}.0, Sum{acc_dtype=float64}.0)
3.9% 43.5% 0.003s 3.14e-06s 1000 0 Elemwise{exp,no_inplace}(sigma_log_)
3.8% 47.2% 0.003s 3.03e-06s 1000 17 Sum{acc_dtype=float64}(Elemwise{Composite{Switch(i0, (i1 * ((-(i2 * sqr((i3 - i4)))) + i5)), i6)}}.0)
3.8% 51.0% 0.003s 3.02e-06s 1000 3 Subtensor{int64::}(s, Constant{1})
3.7% 54.7% 0.003s 2.99e-06s 1000 14 Elemwise{Composite{scalar_gammaln((i0 * i1))}}(TensorConstant{(1,) of 0.5}, Elemwise{add,no_inplace}.0)
3.6% 58.4% 0.003s 2.93e-06s 1000 18 Sum{acc_dtype=float64}(Elemwise{Composite{Switch(Identity((GT(Composite{exp((i0 * i1))}(i0, i1), i2) * i3 * GT(inv(sqrt(Composite{exp((i0 * i1))}(i0, i1))), i2))), (((i4 + (i5 * log(((i6 * Composite{exp((i0 * i1))}(i0, i1)) / i7)))) - i8) - (i5 * i9 * log1p(((Composite{exp((i0 * i1))}(i0, i1) * i10) / i7)))), i11)}}.0)
3.6% 62.0% 0.003s 2.93e-06s 1000 11 Elemwise{gt,no_inplace}(InplaceDimShuffle{x}.0, TensorConstant{(1,) of 0})
3.6% 65.6% 0.003s 2.92e-06s 1000 12 Elemwise{Composite{log((i0 * i1))}}(TensorConstant{(1,) of 0...9154943092}, Elemwise{Composite{inv(sqr(i0))}}.0)
3.6% 69.2% 0.003s 2.91e-06s 1000 1 Elemwise{exp,no_inplace}(nu_log_)
3.6% 72.8% 0.003s 2.89e-06s 1000 13 Elemwise{Composite{Identity(GT(inv(sqrt(i0)), i1))}}(Elemwise{Composite{inv(sqr(i0))}}.0, TensorConstant{(1,) of 0})
3.6% 76.4% 0.003s 2.86e-06s 1000 6 InplaceDimShuffle{x}(sigma)
3.5% 79.9% 0.003s 2.81e-06s 1000 7 InplaceDimShuffle{x}(nu)
3.5% 83.3% 0.003s 2.78e-06s 1000 2 Subtensor{:int64:}(s, Constant{-1})
3.4% 86.8% 0.003s 2.76e-06s 1000 9 Elemwise{add,no_inplace}(TensorConstant{(1,) of 1.0}, InplaceDimShuffle{x}.0)
3.3% 90.1% 0.003s 2.69e-06s 1000 8 Elemwise{Composite{inv(sqr(i0))}}(InplaceDimShuffle{x}.0)
3.3% 93.4% 0.003s 2.68e-06s 1000 5 ViewOp(Elemwise{exp,no_inplace}.0)
3.3% 96.7% 0.003s 2.65e-06s 1000 4 ViewOp(Elemwise{exp,no_inplace}.0)
3.3% 100.0% 0.003s 2.63e-06s 1000 10 Elemwise{Composite{scalar_gammaln((i0 * i1))}}(TensorConstant{(1,) of 0.5}, InplaceDimShuffle{x}.0)
... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
Here are tips to potentially make your code run faster
(if you think of new ones, suggest them on the mailing list).
Test them first, as they are not guaranteed to always provide a speedup.
- Try the Theano flag floatX=float32
We don't know if amdlibm will accelerate this scalar op. scalar_gammaln
We don't know if amdlibm will accelerate this scalar op. scalar_gammaln
- Try installing amdlibm and set the Theano flag lib.amdlibm=True. This speeds up only some Elemwise operation.
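The first tip in the output above suggests the Theano flag floatX=float32. One common way to set a Theano flag is through the THEANO_FLAGS environment variable, which Theano reads at import time. The following is a minimal sketch under that assumption: it must run before theano or pymc3 is first imported in the process, and it keeps any other flags that are already set. Whether float32 actually speeds up this model, and whether the reduced precision is acceptable, has to be checked by profiling again.

```python
import os

# Theano reads THEANO_FLAGS when it is first imported, so this has to run
# before the first `import theano` or `import pymc3` in the process.
existing = os.environ.get("THEANO_FLAGS", "")

# Keep any flags that are already set, but drop an earlier floatX setting.
flags = [f for f in existing.split(",") if f and not f.startswith("floatX=")]
flags.append("floatX=float32")

os.environ["THEANO_FLAGS"] = ",".join(flags)
print(os.environ["THEANO_FLAGS"])
```

After this, rebuilding the model and calling model.profile again shows whether the change made any difference.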
In [3]:
model.profile(gradient(model.logpt, model.vars)).summary()
Function profiling
==================
Message: /home/jovyan/pymc3/pymc3/model.py:605
Time in 1000 calls to Function.__call__: 3.743136e-01s
Time in Function.fn.__call__: 3.272467e-01s (87.426%)
Time in thunks: 1.778915e-01s (47.525%)
Total compile time: 1.396206e+00s
Number of Apply nodes: 47
Theano Optimizer time: 6.084559e-01s
Theano validate time: 1.443505e-02s
Theano Linker time (includes C, CUDA code generation/compiling): 7.295318e-01s
Import time 8.256626e-02s
Node make_thunk time 7.264183e-01s
Node Elemwise{Composite{Switch(i0, (i1 * (i2 + ((i3 * i4 * i5 * i6) / i7))), i8)}}[(0, 6)](Elemwise{Composite{Identity((GT(i0, i1) * i2 * GT(inv(sqrt(i0)), i1)))}}.0, TensorConstant{(1,) of -2.0}, TensorConstant{(1,) of 0.5}, TensorConstant{(1,) of -0.5}, InplaceDimShuffle{x}.0, TensorConstant{[ 4.05769..48400e-06]}, Elemwise{Composite{exp((i0 * i1))}}.0, Elemwise{Add}[(0, 1)].0, TensorConstant{(1,) of 0}) time 5.903370e-01s
Node Join(TensorConstant{0}, Rebroadcast{1}.0, Rebroadcast{1}.0, IncSubtensor{InplaceInc;:int64:}.0) time 1.472402e-02s
Node Elemwise{Composite{Switch(i0, ((i1 * i2 * i3 * i4) / i5), i6)}}(Elemwise{Composite{Identity((GT(i0, i1) * i2 * GT(inv(sqrt(i0)), i1)))}}.0, TensorConstant{(1,) of 0.5}, InplaceDimShuffle{x}.0, Elemwise{Composite{exp((i0 * i1))}}.0, TensorConstant{[ 4.05769..48400e-06]}, Elemwise{Add}[(0, 1)].0, TensorConstant{(1,) of 0}) time 1.461983e-02s
Node Elemwise{Composite{Switch(i0, (i1 * i2 * i3), i4)}}(Elemwise{Composite{Identity(GT(inv(sqrt(i0)), i1))}}.0, TensorConstant{(1,) of -1.0}, InplaceDimShuffle{x}.0, Elemwise{sub,no_inplace}.0, TensorConstant{(1,) of 0}) time 7.709503e-03s
Node Elemwise{Composite{(i0 + Switch(Identity(GT(i1, i2)), (i3 * i1), i2) + (((i4 * i5 * psi((i4 * i6))) + (i7 * (i8 / i9)) + (i4 * i10 * psi((i4 * i9))) + (i4 * i11) + (i12 / i9)) * i1))}}[(0, 5)](TensorConstant{1.0}, Elemwise{exp,no_inplace}.0, TensorConstant{0}, TensorConstant{-0.1}, TensorConstant{0.5}, Sum{acc_dtype=float64}.0, Elemwise{add,no_inplace}.0, TensorConstant{3.141592653589793}, Sum{acc_dtype=float64}.0, nu, Sum{acc_dtype=float64}.0, Sum{acc_dtype=float64}.0, Sum{acc_dtype=float64}.0) time 7.651806e-03s
Time in all call to theano.grad() 7.784910e-01s
Time since theano import 13.326s
Class
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
54.0% 54.0% 0.096s 4.00e-06s C 24000 24 theano.tensor.elemwise.Elemwise
12.9% 66.9% 0.023s 3.29e-06s C 7000 7 theano.tensor.elemwise.Sum
6.4% 73.3% 0.011s 5.70e-06s C 2000 2 theano.tensor.subtensor.IncSubtensor
4.9% 78.2% 0.009s 2.88e-06s C 3000 3 theano.tensor.elemwise.DimShuffle
3.8% 81.9% 0.007s 6.69e-06s C 1000 1 theano.tensor.basic.Join
3.7% 85.6% 0.007s 3.28e-06s C 2000 2 theano.tensor.subtensor.Subtensor
3.4% 89.0% 0.006s 3.04e-06s C 2000 2 theano.tensor.basic.Reshape
3.2% 92.3% 0.006s 5.70e-06s C 1000 1 theano.tensor.basic.Alloc
3.1% 95.4% 0.006s 2.76e-06s C 2000 2 theano.compile.ops.ViewOp
3.0% 98.3% 0.005s 2.66e-06s C 2000 2 theano.compile.ops.Rebroadcast
1.7% 100.0% 0.003s 2.95e-06s C 1000 1 theano.compile.ops.Shape_i
... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
Ops
---
<% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
12.9% 12.9% 0.023s 3.29e-06s C 7000 7 Sum{acc_dtype=float64}
7.6% 20.6% 0.014s 3.40e-06s C 4000 4 Elemwise{Switch}
4.9% 25.4% 0.009s 2.88e-06s C 3000 3 InplaceDimShuffle{x}
4.8% 30.2% 0.009s 8.55e-06s C 1000 1 Elemwise{Composite{Switch(i0, (-log1p((i1 / i2))), i3)}}
4.0% 34.3% 0.007s 7.18e-06s C 1000 1 IncSubtensor{InplaceInc;int64::}
4.0% 38.2% 0.007s 3.53e-06s C 2000 2 Elemwise{exp,no_inplace}
3.8% 42.0% 0.007s 6.69e-06s C 1000 1 Join
3.6% 45.6% 0.006s 6.32e-06s C 1000 1 Elemwise{Composite{exp((i0 * i1))}}
3.4% 49.0% 0.006s 3.04e-06s C 2000 2 Reshape{1}
3.2% 52.2% 0.006s 5.70e-06s C 1000 1 Alloc
3.1% 55.3% 0.006s 2.76e-06s C 2000 2 ViewOp
3.0% 58.3% 0.005s 2.66e-06s C 2000 2 Rebroadcast{1}
2.9% 61.2% 0.005s 5.22e-06s C 1000 1 Elemwise{Composite{Switch(i0, (i1 * (i2 + ((i3 * i4 * i5 * i6) / i7))), i8)}}[(0, 6)]
2.9% 64.1% 0.005s 5.19e-06s C 1000 1 Elemwise{Composite{Switch(i0, ((i1 * i2 * i3 * i4) / i5), i6)}}
2.9% 67.0% 0.005s 5.14e-06s C 1000 1 Elemwise{Composite{Identity((GT(i0, i1) * i2 * GT(inv(sqrt(i0)), i1)))}}
2.4% 69.4% 0.004s 4.31e-06s C 1000 1 Elemwise{Composite{(i0 + Switch(Identity(GT(i1, i2)), (i3 * i1), i2) + (((i4 * i5 * psi((i4 * i6))) + (i7 * (i8 / i9)) + (i4 * i10 * psi((i4 * i9))) + (i4 * i11) + (i12 / i9)) * i1))}}[(0, 5)]
2.4% 71.8% 0.004s 4.23e-06s C 1000 1 IncSubtensor{InplaceInc;:int64:}
2.2% 74.0% 0.004s 3.96e-06s C 1000 1 Elemwise{Composite{Switch(i0, (i1 * i2 * i3), i4)}}
2.2% 76.3% 0.004s 3.94e-06s C 1000 1 Elemwise{sub,no_inplace}
2.2% 78.4% 0.004s 3.83e-06s C 1000 1 Elemwise{Composite{Switch(i0, (i1 * sqr(i2)), i3)}}
... (remaining 12 Ops account for 21.59%(0.04s) of the runtime)
Apply
------
<% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
4.8% 4.8% 0.009s 8.55e-06s 1000 22 Elemwise{Composite{Switch(i0, (-log1p((i1 / i2))), i3)}}(Elemwise{Composite{Identity((GT(i0, i1) * i2 * GT(inv(sqrt(i0)), i1)))}}.0, Elemwise{mul,no_inplace}.0, InplaceDimShuffle{x}.0, TensorConstant{(1,) of 0})
4.0% 8.8% 0.007s 7.18e-06s 1000 39 IncSubtensor{InplaceInc;int64::}(Elemwise{Composite{Switch(i0, (i1 * (i2 + ((i3 * i4 * i5 * i6) / i7))), i8)}}[(0, 6)].0, Elemwise{Composite{Switch(i0, (i1 * i2 * i3), i4)}}.0, Constant{1})
3.8% 12.6% 0.007s 6.69e-06s 1000 46 Join(TensorConstant{0}, Rebroadcast{1}.0, Rebroadcast{1}.0, IncSubtensor{InplaceInc;:int64:}.0)
3.6% 16.2% 0.006s 6.32e-06s 1000 5 Elemwise{Composite{exp((i0 * i1))}}(TensorConstant{(1,) of -2.0}, s)
3.2% 19.4% 0.006s 5.70e-06s 1000 29 Alloc(Elemwise{switch,no_inplace}.0, Elemwise{Composite{(i0 - Switch(LT(i1, i0), i2, i0))}}[(0, 0)].0)
2.9% 22.3% 0.005s 5.22e-06s 1000 36 Elemwise{Composite{Switch(i0, (i1 * (i2 + ((i3 * i4 * i5 * i6) / i7))), i8)}}[(0, 6)](Elemwise{Composite{Identity((GT(i0, i1) * i2 * GT(inv(sqrt(i0)), i1)))}}.0, TensorConstant{(1,) of -2.0}, TensorConstant{(1,) of 0.5}, TensorConstant{(1,) of -0.5}, InplaceDimShuffle{x}.0, TensorConstant{[ 4.05769..48400e-06]}, Elemwise{Composite{exp((i0 * i1))}}.0, Elemwise{Add}[(0, 1)].0, TensorConstant{(1,) of 0})
2.9% 25.2% 0.005s 5.19e-06s 1000 34 Elemwise{Composite{Switch(i0, ((i1 * i2 * i3 * i4) / i5), i6)}}(Elemwise{Composite{Identity((GT(i0, i1) * i2 * GT(inv(sqrt(i0)), i1)))}}.0, TensorConstant{(1,) of 0.5}, InplaceDimShuffle{x}.0, Elemwise{Composite{exp((i0 * i1))}}.0, TensorConstant{[ 4.05769..48400e-06]}, Elemwise{Add}[(0, 1)].0, TensorConstant{(1,) of 0})
2.9% 28.1% 0.005s 5.14e-06s 1000 18 Elemwise{Composite{Identity((GT(i0, i1) * i2 * GT(inv(sqrt(i0)), i1)))}}(Elemwise{Composite{exp((i0 * i1))}}.0, TensorConstant{(1,) of 0}, Elemwise{gt,no_inplace}.0)
2.4% 30.5% 0.004s 4.31e-06s 1000 40 Elemwise{Composite{(i0 + Switch(Identity(GT(i1, i2)), (i3 * i1), i2) + (((i4 * i5 * psi((i4 * i6))) + (i7 * (i8 / i9)) + (i4 * i10 * psi((i4 * i9))) + (i4 * i11) + (i12 / i9)) * i1))}}[(0, 5)](TensorConstant{1.0}, Elemwise{exp,no_inplace}.0, TensorConstant{0}, TensorConstant{-0.1}, TensorConstant{0.5}, Sum{acc_dtype=float64}.0, Elemwise{add,no_inplace}.0, TensorConstant{3.141592653589793}, Sum{acc_dtype=float64}.0, nu, Sum{acc_dtype=float64}.0, Sum{
2.4% 32.9% 0.004s 4.23e-06s 1000 42 IncSubtensor{InplaceInc;:int64:}(IncSubtensor{InplaceInc;int64::}.0, Elemwise{Composite{Switch(i0, (i1 * i2), i3)}}[(0, 2)].0, Constant{-1})
2.2% 35.1% 0.004s 3.96e-06s 1000 19 Elemwise{Composite{Switch(i0, (i1 * i2 * i3), i4)}}(Elemwise{Composite{Identity(GT(inv(sqrt(i0)), i1))}}.0, TensorConstant{(1,) of -1.0}, InplaceDimShuffle{x}.0, Elemwise{sub,no_inplace}.0, TensorConstant{(1,) of 0})
2.2% 37.3% 0.004s 3.94e-06s 1000 7 Elemwise{sub,no_inplace}(Subtensor{int64::}.0, Subtensor{:int64:}.0)
2.2% 39.5% 0.004s 3.88e-06s 1000 3 Elemwise{exp,no_inplace}(sigma_log_)
2.2% 41.7% 0.004s 3.83e-06s 1000 20 Elemwise{Composite{Switch(i0, (i1 * sqr(i2)), i3)}}(Elemwise{Composite{Identity(GT(inv(sqrt(i0)), i1))}}.0, TensorConstant{(1,) of 0.5}, Elemwise{sub,no_inplace}.0, TensorConstant{(1,) of 0})
2.1% 43.7% 0.004s 3.66e-06s 1000 10 Elemwise{mul,no_inplace}(Elemwise{Composite{exp((i0 * i1))}}.0, TensorConstant{[ 4.05769..48400e-06]})
2.0% 45.8% 0.004s 3.62e-06s 1000 38 Elemwise{Composite{(i0 + Switch(Identity(GT(i1, i2)), (i3 * i1), i2) + (i4 * (((i5 * i6 * Composite{inv(Composite{(sqr(i0) * i0)}(i0))}(i7)) / i8) - (i9 * Composite{inv(Composite{(sqr(i0) * i0)}(i0))}(i7))) * i1))}}[(0, 6)](TensorConstant{1.0}, Elemwise{exp,no_inplace}.0, TensorConstant{0}, TensorConstant{-50.0}, TensorConstant{-2.0}, TensorConstant{0.5}, Sum{acc_dtype=float64}.0, sigma, Elemwise{Composite{inv(sqr(i0))}}.0, Sum{acc_dtype=float64}.0)
2.0% 47.7% 0.003s 3.48e-06s 1000 25 Elemwise{Switch}(Elemwise{Composite{Identity((GT(i0, i1) * i2 * GT(inv(sqrt(i0)), i1)))}}.0, TensorConstant{(1,) of 1.0}, TensorConstant{(1,) of 0.0})
1.9% 49.7% 0.003s 3.45e-06s 1000 24 Elemwise{switch,no_inplace}(Elemwise{Composite{Identity((GT(i0, i1) * i2 * GT(inv(sqrt(i0)), i1)))}}.0, TensorConstant{(1,) of -0..9154943092}, TensorConstant{(1,) of 0})
1.9% 51.6% 0.003s 3.44e-06s 1000 2 Subtensor{int64::}(s, Constant{1})
1.9% 53.5% 0.003s 3.44e-06s 1000 35 Sum{acc_dtype=float64}(Alloc.0)
... (remaining 27 Apply instances account for 46.46%(0.08s) of the runtime)
Here are tips to potentially make your code run faster
(if you think of new ones, suggest them on the mailing list).
Test them first, as they are not guaranteed to always provide a speedup.
- Try the Theano flag floatX=float32
We don't know if amdlibm will accelerate this scalar op. psi
We don't know if amdlibm will accelerate this scalar op. psi
- Try installing amdlibm and set the Theano flag lib.amdlibm=True. This speeds up only some Elemwise operation.
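The header lines of the two summaries are enough for a rough comparison. The sketch below copies the "Time in 1000 calls to Function.__call__" and "Time in thunks" values from the outputs above and converts them to per-call figures. The dictionary layout and the helper name per_call_us are illustrative only and are not part of pymc3 or Theano.

```python
# Header values copied from the two summaries above (seconds per 1000 calls).
N_CALLS = 1000
profiles = {
    "logp": {"call": 1.775136e-01, "thunks": 8.041668e-02},
    "gradient": {"call": 3.743136e-01, "thunks": 1.778915e-01},
}


def per_call_us(total_seconds, n_calls=N_CALLS):
    """Convert a total time in seconds to microseconds per call."""
    return total_seconds / n_calls * 1e6


for name, t in profiles.items():
    # Share of the wall time that is not spent inside the compiled ops.
    outside_thunks = 1.0 - t["thunks"] / t["call"]
    print(f"{name}: {per_call_us(t['call']):.1f} us per call, "
          f"{outside_thunks:.1%} spent outside thunks")

ratio = profiles["gradient"]["call"] / profiles["logp"]["call"]
print(f"gradient / logp wall time: {ratio:.2f}x")
```

On these numbers the gradient costs roughly twice as much per call as the log-probability, and in both cases more than half of the wall time is spent outside the thunks. That is consistent with the Apply tables, where most nodes take about 3e-06 s each regardless of what they compute, which suggests that per-node overhead rather than arithmetic is what dominates for a model of this size.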
Content source: dolittle007/dolittle007.github.io