Context of financial industry

Cost, resources, regulations

  • Increasing regulatory requirements.
  • Require cost efficiency and short time to market; contradictory.
    • Outsourcing, cheaper in short term, has higher end-to-end long term cost.
      • Need better QA, more management.

Homogeneous -> heterogeneous tech stacks

  • Used to run everything on one stack (OpenVMS, COBOL) for 10-15 years.
  • Now very heterogeneous (LAMP).
  • Financial companies keep adding very niche, massive services, and interconnect with interfaces.
    • Unwieldy, massive, extremely proprietary.
  • Can Python become a universal end-to-end tool? Deutsche Börse Group have tried.

What is an exchange now?

  • Used to be large, spacious buildings full of people
  • Now large data centres, multiple to offer failover.
  • Spectacular amount of cooling. Data centre techs like talking about cooling.
  • Deutsche Börse is unique: a vertically integrated chain of many services, unlike LSE or NASDAQ.
    • Many integrated services, many direct connections to other countries.
    • More than 420 participants, > 8000 traders, > 30 markets.
  • Round-trip processing time reduced from 4 ms to 200 microseconds. New system is Linux-based and distributed.
    • Why does it need to be fast? To ameliorate arbitrage.
      • If you don't execute fast enough then the fastest competitors can effectively execute arbitrage amongst exchanges.

History of Python

  • In April 2010 the SEC proposed that issuers of asset-backed securities document their algorithms in executable Python.
    • Caused consternation amongst big companies: why Python, and they can't do open source.
  • 2010-today: Python for analysis and simulation of Eurex pricing, very successful.

Current stack (T7)

  • RHEL with MRG (?) kernel
  • Boost (C++)
  • IBM WLLM (?)
  • Python
    • Spreadsheet-based data driven script generator (D2SG)
      • Specify model and data in spreadsheet
      • Auto-generate test cases, then execution.
      • Uses PyUno, macro language module.
    • Distributed automated performance tester (AUTOPET)
  • MySQL, JBOSS, Apache ActiveMQ
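
The D2SG idea above (specify model and data in a spreadsheet, auto-generate test cases, then execute) can be sketched roughly as follows. This is a minimal illustration, not the actual tool: the column names, the `validate_order` function under test, and the inline CSV standing in for a spreadsheet export are all hypothetical, and the real system reads spreadsheets via PyUno rather than `csv`.

```python
import csv
import io

# Hypothetical spreadsheet export: one row per scenario (all columns are assumptions).
SHEET = """case,side,quantity,price,expected_status
small_buy,BUY,10,101.5,ACCEPTED
zero_qty,BUY,0,101.5,REJECTED
negative_price,SELL,5,-1,REJECTED
"""


def validate_order(side, quantity, price):
    """Toy system under test: accepts only well-formed orders."""
    ok = side in {"BUY", "SELL"} and quantity > 0 and price > 0
    return "ACCEPTED" if ok else "REJECTED"


def generate_cases(sheet_text):
    """Turn each data row into a (name, test_function) pair."""
    cases = []
    for row in csv.DictReader(io.StringIO(sheet_text)):
        def test(row=row):  # default argument binds the current row to this test
            actual = validate_order(
                row["side"], int(row["quantity"]), float(row["price"])
            )
            return actual == row["expected_status"]
        cases.append((row["case"], test))
    return cases


def run_cases(cases):
    """Execute every generated case and collect pass/fail by case name."""
    return {name: test() for name, test in cases}


results = run_cases(generate_cases(SHEET))
```

Adding a scenario then means adding a row to the sheet rather than writing a new test by hand, which is the point of a data-driven generator.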

Pricing

  • Very complex, unmodelable in spreadsheets
    • e.g. Market makers get discount for providing volume that allows exchanges to function, risky.
  • Use Python to run simulations to determine pricing
    • Can't use averages, summary statistics. Too stochastic.
    • pandas + HDF5 makes short work of massive data sets that serve as basis of simulations.
      • e.g. Monte Carlo
    • Can execute 10 million+ row data set simulations in seconds.
    • Python also gives very concise, human-readable descriptions of the contracts governing pricing agreements, which cannot be expressed in spreadsheets.
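
A small sketch of why averages fail for volume-discount pricing, and why simulation is used instead. Everything here is an assumption for illustration: the tier thresholds, the fee rates and the lognormal volume distribution are invented, and the standard library stands in for the pandas + HDF5 stack that would hold real trade data.

```python
import random
import statistics

# Hypothetical tiered fee schedule: (minimum monthly volume, fee per contract).
# Reaching a tier lowers the fee on ALL contracts, mimicking a volume discount.
FEE_TIERS = [(0, 0.30), (100_000, 0.20), (250_000, 0.10)]


def fee_per_contract(volume):
    """Return the rate of the highest tier whose threshold the volume reaches."""
    rate = FEE_TIERS[0][1]
    for threshold, tier_rate in FEE_TIERS:
        if volume >= threshold:
            rate = tier_rate
    return rate


def monthly_fee(volume):
    """Total fee for one month at the given volume."""
    return volume * fee_per_contract(volume)


def simulate_fees(n_paths, seed=42):
    """Monte Carlo: draw synthetic monthly volumes and price each one."""
    rng = random.Random(seed)
    # Assumed volume distribution (lognormal, parameters chosen arbitrarily).
    volumes = [rng.lognormvariate(11.5, 0.8) for _ in range(n_paths)]
    fees = [monthly_fee(v) for v in volumes]
    return volumes, fees


volumes, fees = simulate_fees(20_000)
fee_at_mean_volume = monthly_fee(statistics.fmean(volumes))  # the "spreadsheet" shortcut
mean_simulated_fee = statistics.fmean(fees)                  # the simulated answer
```

Because the fee schedule is non-linear, the fee charged at the average volume is not the average fee across simulated volumes, so the two figures disagree. The fee schedule itself is also a short, readable data structure, in the spirit of the contract descriptions mentioned above.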

VSTOXX tutorial

  • Help people new to the VSTOXX index develop their own stochastic models and conduct their own research
  • pandas / numpy / scipy / HDF5
  • http://eurexchange.com/vstoxx
  • JavaScript visualisation of example generated model.
  • (!!AI I think this is the same tutorial we did the first day)
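
As a starting point for the kind of stochastic model mentioned above, here is a minimal sketch of a mean-reverting square-root diffusion, a common choice for volatility indices. It is an assumption that this model suits the reader's purpose; the parameter values are invented rather than calibrated to VSTOXX data, and the plain standard library is used instead of numpy so the example is self-contained.

```python
import math
import random


def simulate_srd_path(v0, kappa, theta, sigma, horizon, steps, rng):
    """Euler scheme (full truncation) for dv = kappa*(theta - v)*dt + sigma*sqrt(v)*dZ."""
    dt = horizon / steps
    v = v0
    path = [v0]
    for _ in range(steps):
        v_pos = max(v, 0.0)  # truncate so the square root stays real
        shock = rng.gauss(0.0, 1.0)
        v = v + kappa * (theta - v_pos) * dt + sigma * math.sqrt(v_pos * dt) * shock
        path.append(max(v, 0.0))
    return path


def expected_level(v0, kappa, theta, horizon):
    """Analytic mean of the continuous-time process at the horizon."""
    return theta + (v0 - theta) * math.exp(-kappa * horizon)


# Hypothetical parameters: index starts at 20 and reverts towards 25.
rng = random.Random(7)
params = dict(v0=20.0, kappa=2.0, theta=25.0, sigma=1.0, horizon=1.0, steps=50)
finals = [simulate_srd_path(rng=rng, **params)[-1] for _ in range(2000)]
mc_mean = sum(finals) / len(finals)
analytic_mean = expected_level(20.0, 2.0, 25.0, 1.0)
```

Comparing `mc_mean` with `analytic_mean` is a quick sanity check on the simulation before using the paths for anything else, such as pricing payoffs on the index.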

StatistiX

  • Currently share SQL data with customers. Interface is phone calls or email (!)
  • Want to move to an on-demand Python-based interface to exchange data.
  • Excited about IPython Notebook as primary customer interface.
  • Also DevOps (configuration management), to improve documentation of system architecture.
  • Python is increasingly used not just for small websites but for large deployments, across many industries.
  • Java/C++ decreasing in mind-share, Python is creeping up.
  • Ohloh statistics: Java/C++ stagnating, Python is steadily increasing in monthly contributors and contributions.
  • Python is glue to extremely high performance software libraries (LLVM, Numba, Cython, pandas, HDF5, scipy, scikit).

Why should "software factory" businesses care?

  • Python is multi-paradigm. Teach once, apply everywhere.
  • Rapid prototyping, test and fail fast. Reduce time to market.
  • Avoid vendor lock in, because increasingly vendors lag too far behind industry cutting edge.
  • Open source: "invest in people, not in licenses".

End goal

  • IT value chain:
    • Business -> IT -> IT Factory -> IT Operations
  • End-to-end Python development framework
  • Python is currently good at the last two stages of the chain, not good at the first two.
  • Get everyone in business to use Python, integrate the whole business under one active, supported framework.
  • But not looking for a Holy Grail: no plan to throw everything away and start from scratch.
  • Python is a swiss army knife, fun, vibrant, and offers end-to-end integration.

Questions

  • How can you hope to convert people from spreadsheets?
    • No Holy Grail. Financial people are very wedded to spreadsheets and VBA.
    • People, when they encounter problems difficult to solve in spreadsheets/VBA, will move themselves to Python.
  • Do you use Python for high performance tasks?
    • No, not for low-latency environment. There they use C to achieve sub-millisecond round trip times.
    • But not all tasks are low latency.
    • For end-to-end IT projects, a substantial proportion of effort is spent in prototyping / initial feasibility stages.
    • Python helps to reduce upfront costs on large IT projects.
    • And if there are specific latency/speed/data issues, then tweak that small fraction.
    • What takes 3-4 lines to prototype in Python can take massive amounts of time in C++, and you never shed this initial burden or share it easily.
  • Largest obstacles to Python are human and cultural. What are your experiences?
    • No, haven't encountered this.
    • Don't force people to use Python, just show them how much easier it is.
