Monday, 8 December 2014

Why you need two systems for running automated trading strategies

Running a fully automated trading strategy requires very little time. Apart from the 6 months or so of flat out coding you need to do first of course. Before doing this coding there is a chicken or the egg question to resolve. Do you write backtesting code and then some extra bits to make it trade live, or do you write live trading code which you then try and backtest?

Some background

If you haven't had the pleasure of writing an automated trading strategy, perhaps because you use prebaked software like "Me Too! Trader" or an online platform such as, you may wonder what on earth I am talking about. 

The issue is that there are two completely different user requirements for what is usually one piece of software. The first user is a researcher. They want something highly flexible that they can use to test different, and novel, trading strategies; and to simulate their profitability by "backtesting". Any component needs to be interactive, dynamic and easy to modify.

The next user - let's call them the implementor - does not rate flexibility; indeed it may be viewed as potentially dangerous. They want something that is ultra robust and can run with minimal human intervention. Every component must be unit tested to the eyeballs; modifications should be minimal and rigorously tested. Interaction is strongly discouraged and should be limited to reading diagnostic output. The code needs to be stuffed full of fail safes, "what ifs?", and corner case catchers.

Ultimately you won't benefit from a systematic trading strategy unless both users are happy. You will end up with a product which is either untested with market data and which may not be profitable (unhappy researcher), or with one which should be profitable but is so badly implemented it will either crash daily or produce fat finger class errors and buy 10e6 too many contracts (unhappy implementor). 
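To make the implementor's "fail safe" idea concrete, here is a minimal sketch of a pre-trade size check of the kind that would stop a fat finger order. The names `OrderRejected`, `check_order_size` and the `max_contracts` limit are assumptions made up for illustration, not part of any real system described here.

```python
class OrderRejected(Exception):
    """Raised when an order fails a pre-trade sanity check."""


def check_order_size(contracts, max_contracts):
    """Return the order size unchanged if it passes the checks, else raise.

    max_contracts is a hypothetical per-order limit chosen by the implementor.
    """
    # Corner case catcher: reject floats, strings, None and booleans outright.
    if not isinstance(contracts, int) or isinstance(contracts, bool):
        raise OrderRejected(f"order size must be an integer, got {contracts!r}")
    # Fat finger catcher: reject anything larger than the limit, long or short.
    if abs(contracts) > max_contracts:
        raise OrderRejected(
            f"order for {contracts} contracts exceeds limit of {max_contracts}"
        )
    return contracts
```

A researcher would find a check like this an irritation in a sandbox; in production it is the difference between a rejected order and a very bad day.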

Weirdly of course if, like myself, you're trading with your own money these users are the same person!

Ideas have to be tested

In the vast majority of cases the backtest code comes first, for the same reason that when it comes to building a new car you don't just weld together a bunch of panels and see what they look like; you get out your little clay model (or in this less romantic world, your CAD package). Pretty much every design discipline uses a 'sandbox' environment to develop ideas. Important fact: The people playing in the sandpit aren't usually professionally trained programmers (including yours truly).

Either in a greenfield corporate context, or if you are developing your own stuff, the first thing you will do is write some code that turns prices or other data into positions; and then a little routine to pretend you were actually trading live in the past to see how much money you did, or didn't make.
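As a rough illustration of those two pieces, the sketch below turns prices into positions with a toy rule and then pretends to have traded them in the past. The rule, the function names and the price series are all invented for the example, and costs and slippage are ignored.

```python
def positions_from_prices(prices, lookback):
    """Toy rule: long 1 unit if the price is above its trailing mean,
    short 1 if below, flat if equal or if there is not enough history yet."""
    positions = []
    for t in range(len(prices)):
        if t + 1 < lookback:
            positions.append(0)
            continue
        window = prices[t + 1 - lookback : t + 1]
        mean = sum(window) / lookback
        if prices[t] > mean:
            positions.append(1)
        elif prices[t] < mean:
            positions.append(-1)
        else:
            positions.append(0)
    return positions


def simulate_pnl(prices, positions):
    """Pretend we held positions[t] from time t to t+1 and booked the price change."""
    return [
        positions[t] * (prices[t + 1] - prices[t]) for t in range(len(prices) - 1)
    ]


prices = [100.0, 101.0, 103.0, 102.0, 99.0, 100.0]  # made-up data
positions = positions_from_prices(prices, lookback=3)
pnl = simulate_pnl(prices, positions)
total = sum(pnl)  # how much money we did, or didn't, make
```

Note that the position at time t only uses prices up to time t, so the simulation does not peek into the future.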

If you're sensible then you might even have some of your core mathematical routines tidied up and unit tested so they are properly reusable. You can try and modularise the code as much as possible, so running a different trading rule just involves repointing one line of code. You could get quite fancy and have code that is flexible and is configurable by file or arguments, rather than  "configuration" by script. Your simulation of backtested performance can get quite sophisticated.
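One way to get "configurable by file or arguments" is a small registry of rules selected by name. This is only a sketch: the rule names, the JSON layout and `run_from_config` are assumptions for illustration.

```python
import json


def momentum_rule(prices, lookback):
    """Toy rule: +1 if the price rose over the lookback, -1 if it fell, else 0."""
    change = prices[-1] - prices[-1 - lookback]
    return (change > 0) - (change < 0)


def mean_reversion_rule(prices, lookback):
    """Toy rule: the opposite sign of the momentum rule."""
    return -momentum_rule(prices, lookback)


# Registry of available rules; the names are hypothetical.
RULES = {
    "momentum": momentum_rule,
    "mean_reversion": mean_reversion_rule,
}


def run_from_config(config_text, prices):
    """Pick a rule and its parameters from a JSON config rather than from a script."""
    config = json.loads(config_text)
    rule = RULES[config["rule"]]
    return rule(prices, **config.get("params", {}))


# In practice config_text would be read from a file or built from arguments.
config_text = '{"rule": "momentum", "params": {"lookback": 2}}'
position = run_from_config(config_text, [100.0, 101.0, 103.0])
```

Running a different trading rule then means changing one string in the config, not editing the script.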

At some point though you're going to want to run real money on this.



Productionization - bringing in the grownups

This simulation code isn't normally up to the job of running with real money. In theory all you need to do is write a script that runs the simulation every day / hour / minute and then another piece of code that turns the output of that into actual real live trades.
I suspect most people who are running their own money go down this path. However I would estimate that only 10% of my own code base (of which more in a moment) is needed to run a simulation. What that means in practice is you start with code that isn't sufficiently robust to run in a fully automated way (because it's missing most of the other 90%) and if you're lucky you end up with a vast jerry built structure of things tacked on when you realise you needed them.
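The "in theory" piece of code that turns simulation output into trades can be sketched in a few lines, which is part of why this path is so tempting. The function name and instrument codes below are made up for the example.

```python
def required_trades(target_positions, current_positions):
    """Return the non-zero trades (in contracts) needed per instrument.

    Positive means buy, negative means sell. An instrument missing from either
    dictionary is treated as a zero position.
    """
    instruments = set(target_positions) | set(current_positions)
    trades = {}
    for code in sorted(instruments):
        trade = target_positions.get(code, 0) - current_positions.get(code, 0)
        if trade != 0:
            trades[code] = trade
    return trades


# Target positions from the latest simulation run versus what is actually held.
trades = required_trades({"CORN": 3, "GOLD": -1}, {"CORN": 1, "BUND": 2})
```

Everything this sketch leaves out (stale or missing prices, partial fills, position limits, reconciling what the broker says you hold) belongs to the part of the code base that a simulation never needs.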

If you are trading your own money and not interested in the machinations of corporate fund management politics you'll probably want to skip ahead to 'Two systems'.
Alternatively what tends to happen next in a corporate context is some proper programmers get brought in to productionize the system. The simulation code is normally treated as a specification document, and a seriously incomplete and badly written one at that, rather than as a prototype. The rest of the spec, which is the stuff you need to do the 90%, then has to be written by the implementor.

The result is a robust trading system, but one on which it's now impossible to do any research. The reason is that it's very hard to unpick the 10% of code that can be mucked about with, muck about with it and then re-run it to see what will happen.

When lunatics run the asylum: need for innovation

What usually happens next is that the research user comes up with some clever idea that the solid monolithic tank like existing production code isn't capable of doing. Given that most quant finance businesses have an oversupply of clever people with clever ideas, and an under supply of people who can actually make things work properly, they will then be faced with a choice. Either wait for many months for some programming talent to become available, or try and twist the arm of management to let them implement the simulation system with real money.

Most quant finance businesses are run by quants. (You might think this is the natural order of things. But being very clever and insightful AND being a great business person are skills rarely found in the same person. Perhaps it is sometimes better to have the business run by a glorified COO whilst you stick to what you're good at, which is usually the cool and fun stuff. Tech company bosses also take note.) This means that the simulation system ends up being used to trade real money, despite this being an insane idea. By the way, having quants in charge is also why there is an under supply of builders, AKA programmers, versus architects, AKA researchers, in these businesses. That and separate reporting / manpower budget lines for CTOs.

Anyway the bottom line is that rather than modify the existing production code to do the new new thing the programmers then often have to work with a hacked up backtest pretending to be a swan like production system. But because this is actually running real money it's treated more as a prototype than a badly written spec. This means a lot of crud gets ported across into the production system, and the process of productionizing takes a lot longer.

Eventually we end up with a robust system again. Until that is some bright spark has another clever idea, and the cycle begins again.

Two systems: An aside on testing and matching

One way - indeed the best way - of dealing with this is to keep your two code bases completely separate. Once your bright idea is fully developed then you show the programmers your code. After laughing hard at your pathetic attempt they then incorporate it into the production system. You then continue to use your simulation code.

You can also do this as an individual, although you probably won't laugh at your own code, not if you've just written it anyway. As an individual programmer and trader having to maintain two systems is also a serious time overhead, but ultimately worth it.

Back in the corporate world an obvious problem with this is you still have the bottleneck of needing enough programmers to keep up with the flow of wonderful ideas. However at least they aren't wasting their time trying to deal with hurriedly rewriting cruddy simulation systems that are already running real money before they blow up.

A slight problem with this is that you have created two ways to do something. Corporate types running systematic fund businesses have an unhealthy obsession with things being 'right'. You have to prove that the position coming from your production system is 'right'. If you have a simulation the most obvious way of doing this is to run that and crosscheck them. In this way the simulation becomes a glorified integration test of the production code. 

This is a recipe for tens of thousands of person hours of wasted time and effort trying to work out why the two are slightly different. This is completely stupid. For starters there is no 'right'. All trading rules are guesses anyway. Under this logic a trading rule that did exactly what it was 'supposed' to do, but lost a billion dollars would be better than one which was a bit wayward but which made the same amount in profit.

Second of all this is a very stupid way of testing anything. You should have a spec as to what the trading system should do. In case it isn't obvious, I don't think the simulation code should be the spec. At best it's a starting point for writing the spec. But should you reproduce a bug in the simulation if it isn't what was intended? No. You should find out what's intended, write it down, and that is what you should implement. You then write tests to check the production code meets the spec. And mostly they should be unit tests. This is very obvious indeed to anyone working in any other kind of industry where you build stuff after prototyping it.
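As a sketch of what testing against a spec rather than against the simulation might look like, suppose the written spec says "positions are whole contracts and never exceed a maximum size". Both the spec line and the `capped_position` function are hypothetical, invented for this example.

```python
import unittest


def capped_position(raw_position, max_position):
    """Round a raw position to whole contracts and cap its absolute size."""
    rounded = round(raw_position)
    return max(-max_position, min(max_position, rounded))


class TestCappedPositionMeetsSpec(unittest.TestCase):
    """Each test checks one statement from the (hypothetical) written spec."""

    def test_position_within_cap_is_rounded_to_whole_contracts(self):
        self.assertEqual(capped_position(2.4, max_position=5), 2)

    def test_long_position_is_capped(self):
        self.assertEqual(capped_position(12.0, max_position=5), 5)

    def test_short_position_is_capped(self):
        self.assertEqual(capped_position(-12.0, max_position=5), -5)
```

Saved in a file, these can be run with `python -m unittest`. Nothing here asks what the simulation would have done; the tests only ask whether the production code does what was written down.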

Production first?!?!

When it came to writing my own trading system about a year ago I did something radical. Since I knew exactly what I wanted to implement, I just sat down and wrote the production code. Of course I was in the unusual position of having already designed enough trading systems to know what I wanted to do, albeit in a corporate context, although I had never written an end to end production system before.

I don't think writing production first is unattainable even if you don't know exactly what you're going to do. If you have the pleasure of working in a greenfield setting you have two main jobs to do. The first is to write a production system, and the second is to come up with some new and profitable ideas. Don't wait until you've come up with ideas to hire your proper programmers, hire them now. Get them to code up a simple trading rule in a robust production system. Meanwhile you can do your clever stuff. Occasionally they will come and confront you with questions, and hopefully this will force you to direct your cleverness in the direction of clarifying what your investment process might end up being.

Similarly if you are writing your own stuff then it might be worth coding the simplest possible production system first before you do your research. You could even do both jobs in parallel. It's quite nice being able to shift to doing some hard core econometrics when you've been coding up corner cases for trading algorithms, and some mindless script writing can be just the ticket when you are stuck for inspiration and the great trading ideas just aren't coming.

If your code is modular enough you should be able to subsequently write the simulation code from production rather than vice versa. The simulation code will just be some scaffolding around the core trading rule part of your production code (the 10% bit, remember?). 

With my own system I did get round to doing this, but only after I'd been trading for 6 months. But to be honest I don't really run my simulation code that much, and I certainly don't check it against my production code. It's only used for what it should be used for - a sandbox for playing in. If I come up with any new ideas then I'll then have to go and implement them in the production code. So I am firmly in the two system world, although I approached it from the opposite direction to the one we normally see.



Nirvana

No, not the early 90's grunge band, but the idea of some perfect system existing that can do both. A giant uber-system which can meet both requirements. I don't think such a nirvana is attainable, for a couple of reasons. Firstly the work involved is substantial - I would estimate at least four times as much as developing a separate production and simulation system.

Secondly, in a corporate context, there is usually too big a disparity in the needs of different users particularly on the research side. Often there is a temptation to over specify the flashy aspects of the project, such as the user interface having lots of interactive graphics. This often happens because the senior managers with the authority to order such large IT projects haven't done much coding for a while and need more of a point and click interface.

Small steps

I don't believe in the fairy story of 'one system to rule them all'. Instead I believe that two systems probably works best, but with some sensible code reuse where it makes sense. Here are some of the small steps you can take.

As I've already mentioned your core utilities, like calculate a moving average*, should be shared, and tested to death, so you can trust them. 

* Okay bad example, since I get pandas to do this for me. But you get the idea.
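Sticking with the admittedly bad example, a shared utility of this kind might look something like the pure Python sketch below. The function name and the choice to return `None` until a full window is available are assumptions made for the illustration; a real system would more likely lean on pandas, as noted above.

```python
def moving_average(values, window):
    """Simple moving average; entries without a full window of history are None."""
    if window < 1:
        raise ValueError("window must be at least 1")
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            result.append(sum(values[i + 1 - window : i + 1]) / window)
    return result


def test_moving_average():
    """The sort of small checks that let both systems trust the utility."""
    assert moving_average([1.0, 2.0, 3.0, 4.0], 2) == [None, 1.5, 2.5, 3.5]
    assert moving_average([1.0, 2.0], 3) == [None, None]
    assert moving_average([], 3) == []
```

Because both production and simulation import the same function, a bug fixed here is fixed in both places.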

You can't possibly reuse code unless you have good modularity. The wrapper around the 10% of my production code that is reusable for simulations looks like this:
# Pseudo code: the starred arguments stand in for whatever the live system needs
data1 = get_live_data_to_do_step_one(*args_for_live_data)
config1 = get_live_config_to_do_step_one(*args_for_live_config)
diag = diagnostic(*where_live_diagnostics_are_written)

# Step one only sees plain data, config and a diagnostics object
output1 = do_step_one(data=data1, config=config1, diag=diag)

data2 = get_live_data_to_do_step_two(*args_for_live_data)
config2 = get_live_config_to_do_step_two(*args_for_live_config)
# Step two consumes the output of step one plus its own live data
output2 = do_step_two(somedata=output1, moredata=data2, config=config2, diag=diag)

Hopefully I don't need to spell out how the simulation code is different, or how it would be hard to replace step one with a different step one in a research context if the code wasn't broken down like this.
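For readers who would like it spelled out anyway, here is one possible sketch of the two wrappers around a shared step. Everything in it is hypothetical: `do_step_one` is a stand-in for a real core step, and `run_live` and `run_simulation` are illustrative names, not the code described above.

```python
def do_step_one(data, config, diag):
    """Shared core step (hypothetical): scale the latest price change."""
    change = data[-1] - data[-2]
    diag(f"price change {change}")
    return config["scalar"] * change


def run_live(get_live_data, get_live_config, live_log):
    """Production wrapper: fetch the latest data once, log diagnostics, run the step."""
    return do_step_one(
        data=get_live_data(), config=get_live_config(), diag=live_log.append
    )


def run_simulation(history, config):
    """Research wrapper: replay history, feeding the same step one period at a time."""

    def ignore(message):
        """Simulations here simply discard diagnostics."""

    return [
        do_step_one(data=history[: t + 1], config=config, diag=ignore)
        for t in range(1, len(history))
    ]


simulated = run_simulation([100.0, 101.0, 103.0], {"scalar": 2.0})
```

Swapping in a different step one for research means passing a different function to the same scaffolding, which is only possible because the step never fetches its own data or config.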

Try and separate out the parts that do all the corner case and type testing from the actual algorithm. The latter part you will want to play with and look at. This does however mean you can't have a simple 'doughnut' model of production and simulation code, where there is just a different 'scaffolding' around a core position generation function (which I realise is what my pseudo code implies...). It needs to be more dynamic than that.
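A minimal sketch of that separation, with made-up names: `raw_forecast` is the algorithm a researcher would want to play with, and `checked_forecast` is a production entry point that holds the type and corner case checks.

```python
import math


def raw_forecast(prices, lookback):
    """The algorithm itself: short, readable and easy to experiment with."""
    return prices[-1] - prices[-1 - lookback]


def checked_forecast(prices, lookback):
    """Production entry point: all type and corner case checks live here."""
    if not isinstance(lookback, int) or isinstance(lookback, bool) or lookback < 1:
        raise ValueError("lookback must be a positive integer")
    if len(prices) <= lookback:
        raise ValueError("not enough price history for this lookback")
    if any(not math.isfinite(p) for p in prices[-1 - lookback :]):
        raise ValueError("recent prices contain NaN or infinite values")
    return raw_forecast(prices, lookback)
```

The research code can call `raw_forecast` directly and look at what comes out, whilst production only ever goes through the checked version.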

Don't make stuff reusable for the sake of it. For example I toyed with creating a fancy accounting object which could analyse either live or simulated profitability. But ultimately I didn't think it was worth it, just because it would have been cool. Instead I wrote a lot of small routines that did various small analyses, that I could stick together in different ways for each task.
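The "lots of small routines" approach might look something like the sketch below. These three functions are illustrative inventions, not the actual routines mentioned above; each takes a plain list of per-period profits, so it does not care whether the numbers came from live trading or a simulation.

```python
def cumulative(pnl):
    """Running total of per-period profit and loss."""
    total, curve = 0.0, []
    for x in pnl:
        total += x
        curve.append(total)
    return curve


def max_drawdown(pnl):
    """Largest peak-to-trough fall in cumulative P&L (zero or positive)."""
    peak, worst = 0.0, 0.0
    for level in cumulative(pnl):
        peak = max(peak, level)
        worst = max(worst, peak - level)
    return worst


def hit_rate(pnl):
    """Fraction of non-zero periods that made money."""
    active = [x for x in pnl if x != 0]
    if not active:
        return 0.0
    return sum(1 for x in active if x > 0) / len(active)


pnl = [1.0, -2.0, 0.0, 3.0]  # made-up numbers
report = {"drawdown": max_drawdown(pnl), "hit_rate": hit_rate(pnl)}
```

Each task then sticks together whichever of these it needs, without a grand accounting object in the middle.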

As well as code reuse you can also have data reuse. It doesn't make any sense to have two databases of price data, one for simulation and one for live data. If there are certain prebaked calculations that you always do, such as working out price volatility, then you should have your production system work them out as often as it needs to and dump the results where the rest of your system, including your simulation code, can get it.
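As a sketch of this kind of data reuse, the example below has a production-side function write a volatility estimate to a table that anything else can read. The table layout, function names and volatility definition (standard deviation of price changes) are assumptions for illustration, and an in-memory SQLite database stands in for whatever shared store you actually use.

```python
import sqlite3
import statistics


def price_volatility(prices):
    """Standard deviation of price changes (one of many possible definitions)."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    return statistics.stdev(changes)


def store_volatility(conn, instrument, vol):
    """Called by the production system whenever it recalculates the estimate."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS volatility (instrument TEXT PRIMARY KEY, vol REAL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO volatility (instrument, vol) VALUES (?, ?)",
        (instrument, vol),
    )
    conn.commit()


def load_volatility(conn, instrument):
    """Called by anything else, including simulation code, that needs the value."""
    row = conn.execute(
        "SELECT vol FROM volatility WHERE instrument = ?", (instrument,)
    ).fetchone()
    if row is None:
        raise KeyError(instrument)
    return row[0]


conn = sqlite3.connect(":memory:")  # a real setup would point at a shared database
store_volatility(conn, "CORN", price_volatility([100.0, 101.0, 103.0, 102.0]))
vol = load_volatility(conn, "CORN")
```

The simulation never recomputes the number; it just reads what production last wrote.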

Go forth and code

That's it then. Hopefully I've convinced you that the two system model makes sense. Now if you will excuse me I'm going to go and hack some back testing code ....


  1. The dichotomy between strategy research/testing and production can be challenging. It really depends on how well this was implemented. It is possible to have flexibility and safety / performance, though getting something baked-in is more likely than not with the typical trading infrastructure team.

    I've been on both sides of this (HFT - intra day trading sys developer & quant/trader / strategy researcher). With the exception of ultra-HFT, one way to deal with this at the cost of a small amount of latency is to:

    - create a separate service for execution that interacts with exchanges at low-latency and is rock-solid
    - create a separate service to run the strategies

    The strategies give high-level orders (execute some size with some execution algo) to the execution service. The execution service breaks that down into fine-grained limit orders with the market place, closely observing impact, momentum, etc.

    The execution service is also responsible for enforcing money-management and risk limits, and for rejecting fat finger (suspicious) prices or sizes - so all sensitive controls are in this module.

    The strategy service can have a lower bar in terms of how thoroughly it is unit tested. It can be very flexible and change more often than the execution / connectivity codebase.

    The above would not be suitable for ultra-HFT (where the game is at the metal); however, for anything else, the 1-5 microseconds of additional overhead is nothing.

    Out of curiosity, what asset class are you trading and what is motivating your signals?

  2. I'm trading just over 50 futures markets, across all asset classes. I trade pretty slowly; average holding period in the several weeks range; with a combination of simple technical systems. That means that latency isn't really an issue at all.

    I run something similar in that I have a bunch of processes / services set up, one of which is akin to the execution service, another the strategy generation. The execution service is a bit dumber than the one you describe. But it does have the fail safes built in, so if the strategy generation goes haywire it should catch it.

    I guess the service idea is an extension of modularity; you don't just have separate modules but separate processes.

    It's a fairly common setup probably in many firms as well. At my last place we did try and build something which meant you could drop a lightly worked up position generation engine into a rock solid wrapper that did all the other services for you.

    However we still ran into the institutional 'everything must be tested to death' barrier, which meant even the 'quick and easy' position generation had to be tested to death and we didn't really gain the flexibility we'd hoped for; although there were advantages to enforcing a common interface and general way of doing things.

    It was also hard to specify an interface which made sense for both a research and a production 'wrapper', so in the end we still had a two system problem.