In [1]:
%autosave 10


Autosaving every 10 seconds

Background

  • Data munging is not the only major problem, but it is a big one.
    • Sources, formats, and cleaning up missing data.
  • Performance is a problem too.
  • Organisational problems
    • Teams are silos. People who need answers can't ask questions. People who can give answers can't express them.
  • Continuum Analytics vision
    • Simple, interactive, collaborative, but still scalable performance

Notes

  • pandas offers access to free financial sources, but beware! Not clean, not reliable, but good for playing around.
  • Volatility clustering: if you plot the log difference in close prices (log returns), you notice that volatility comes in clusters; it isn't randomly distributed.
  • Investors want volatility; it offers chances for short-term trading profits.
  • Do you have to shift Returns by 1 (back 1) before multiplying? No.
  • This model doesn't take portfolio rebalancing (?) into account.
    • Rebalancing means restoring e.g. 70% in X, 30% in Y balance of your portfolio
  • Should use discounting rather than a simple sum of Earnings.
  • Err on the side of readability. Don't put too many operations onto one line, particularly in pandas.
    • Unless performance, when measured, is an issue.
  • VSTOXX vs EUROSTOXX
    • EUROSTOXX is mean reverting; the standard theory of stocks applies.
    • VSTOXX is kind of like an interest rate. It is quoted in percentage points and is an aggregate of the implied volatility of puts and calls.
  • Log returns help compare two different time series in a mathematical way. This seems to be a common pattern.
  • Good link: http://scipy-lectures.github.io/advanced/mathematical_optimization/
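
The log-return and volatility notes above can be sketched as follows. This is a minimal illustration on synthetic prices, not the data or code used in the session; the series `close`, the 21-day window, and the 252-day annualisation factor are all assumptions. Because the synthetic returns are independent draws, they will not show real volatility clustering; with actual market data, the rolling volatility is where the clusters become visible.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes (an assumption for illustration, not real market data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=500)
close = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(dates)))),
    index=dates,
    name="close",
)

# Log return: log(P_t / P_{t-1}), i.e. the first difference of log prices.
log_ret = np.log(close / close.shift(1)).dropna()

# Rolling standard deviation of log returns, annualised with sqrt(252).
# Plotting this series for real data is one way to see volatility clusters.
rolling_vol = log_ret.rolling(window=21).std() * np.sqrt(252)

# Log returns are additive over time: their sum equals the log of the
# total price ratio, which is what makes them convenient for comparisons.
total_log_return = log_ret.sum()
```

Since log returns are unit-free, the same two lines can be applied to two different series (for example an index and a volatility index) before comparing or correlating them.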

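The point about discounting instead of summing Earnings can be illustrated with a small sketch. The function name `present_value`, the 5% rate, the continuous-compounding convention, and the earnings figures are all hypothetical choices made here for illustration; the notes do not say which discounting scheme was intended.

```python
import numpy as np

def present_value(cash_flows, annual_rate, periods_per_year=252):
    """Discount per-period cash flows to time 0 (continuous compounding)."""
    cash_flows = np.asarray(cash_flows, dtype=float)
    # Time of each cash flow in years; the first one arrives after one period.
    t = np.arange(1, len(cash_flows) + 1) / periods_per_year
    return float(np.sum(cash_flows * np.exp(-annual_rate * t)))

# Hypothetical yearly earnings of a strategy.
earnings = [10.0, -4.0, 6.0, 8.0]

simple_sum = sum(earnings)
discounted = present_value(earnings, annual_rate=0.05, periods_per_year=1)
```

With a positive rate, later earnings count for less than earlier ones, so two strategies with the same simple sum can rank differently once their earnings are discounted.
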
High frequency trading data

  • High frequency data is not well covered by textbooks; the data size alone changes the game.
  • Worse, heterogeneous time intervals! Tick data comes when it comes, not at fixed intervals.
  • !!AI Can you use numexpr with df.apply(...) to run an optimized function?
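
One common way to deal with irregular tick arrival is to resample onto a fixed grid. The sketch below assumes synthetic ticks with random gaps and a one-minute bar size; both are illustrative choices, and whether the session used this approach is not recorded in the notes.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic ticks: gaps between ticks vary randomly from 1 ms up to 5 s.
gaps_ms = rng.integers(1, 5000, size=1000)
timestamps = pd.Timestamp("2020-01-02 09:00:00") + pd.to_timedelta(
    np.cumsum(gaps_ms), unit="ms"
)
ticks = pd.Series(
    100 + np.cumsum(rng.normal(0, 0.01, size=1000)),
    index=timestamps,
    name="price",
)

# Aggregate irregular ticks into regular one-minute open/high/low/close bars.
bars = ticks.resample("1min").ohlc()

# Alternatively, keep the last observed price per minute and forward-fill
# any minute in which no tick arrived.
last_price = ticks.resample("1min").last().ffill()
```

After resampling, the series has a fixed frequency, so the usual fixed-interval tools (log returns, rolling windows) apply again, at the cost of discarding intra-bar detail.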

Why Python?

  • Nothing compares to Python's sheer breadth.
    • What, in Ruby, comes close to NumPy, SciPy, and Pandas?
  • R?
    • Systems development, actual production code, web development, ..., Python can do it.
  • Performance?
    • Python has overcome this stigma.
    • Python is less a glue between system components or libraries, and more a glue between high performance methods.
      • LLVM, multi-core, GPUs, clusters.
