Leaking the Future

In a prior article I mentioned a convention of using Excel in a way that ensures a programmer can implement what the spreadsheet represents.

Since a spreadsheet is a program, obviously being interpreted (the XLSX file is being processed by Excel for instance) it doesn’t make sense that a spreadsheet can be created that can’t have its computation implemented in a language like Python.

So, how does a spreadsheet do what can’t be done in software? How is that even possible?

The Setup

I encountered this problem originally working with an equity trader that sought automation for their trading algorithm.

Trading algorithms work fundamentally in two modes:

  1. Backtesting by feeding historical data and simulating results
  2. Operational by feeding live market feeds (FIX, or whatever platform they are using)

If the software is written properly, the exact same code is used in both modes. This means that the backtesting is accomplished by feeding historical data into the same code that the operational feed drives.

The backtest software takes the signals from the algorithm (long/short/no action) and maintains a fake portfolio and P&L. The operational software takes those same signals and places orders. The fills are then used for a real portfolio.

This approach of using the same algo for both test and production is needed to ensure that no assumptions or implementation features are different. If the test and production versions aren’t the same, whatever test is testing isn’t what production is doing.

Of course, it’s not possible to build the software without being told the algorithm.

My customer had used Excel to model the day’s ticks and to show his signal in response to those ticks. So, I had a full requirement, executable no less. Exactly and precisely what is needed for a developer to write code with no possible misinterpretation of the customer’s wishes.

Weren’t we both surprised when I notified the customer that his algorithm could not be written and was in fact impossible to implement.

The Problem

A quick reminder. When building a spreadsheet with intent to be used as a requirement for software, there are two rules needed to maintain computability:

  1. All rows above the current row on a working sheet may be referenced freely (this represents reference to prior computation, inputs, and results)
  2. On the current row, a formula in a cell may only refer to columns that are left of it (this represents intermediate values already prepared)

My customer’s model (I’m under NDA, I’m never going to divulge the actual model so please don’t ask) was built using certain Excel formulas that allowed him to perform statistical and mathematical trends. I can’t say what those formulas were, so I’m “inventing” a trading algo that’s so pathetic and stupid nobody will be dumb enough to try it. This algo is purely to explain my two Excel rules above and not to trade equities.

Imagine this subset of ticks (these aren’t real ticks for any real symbol):

Again: totally fake data, made it up just now …

My strategy is based on how a tick compares to the range of the day (the difference between the high and low). So, for each tick (after the first) I’ll calculate the delta between price(n) and price(n-1) over the range. Here’s how that looks in Excel … note that the cell E3 has the focus to show the formula …

The math is right — that’s not the problem. My first rule was broken, though. And it’s not possible for cell E3 to be calculated by a Python program reading data from an exchange.

Take a moment, and try to identify why I had to notify the customer that his algo as defined couldn’t be implemented.

The Bug

And pretend the theme to Jeopardy has completed. Your answer?

If you said “It’s not possible to know the day’s trading range until after the day has completed” you are 100% correct.

In order for my formula to work, the software has to know what is going to happen in the future.

If I knew that, I wouldn’t bother with a trading algo. I’d just trade based on certain pricing knowledge.

The bug is not a bug in Excel. Excel has no problem finding min or max of any range. The problem is that the model implemented in Excel was computationally impossible. The bug is what I call Leaking the Future.

This is legal if the intent is to explain behavior. Doing an after the fact study of what happened, it’s reasonable to consider the entire historical information for that period.

This is always a mistake if the goal is to define a method of doing something that is to be automated. In particular, it’s a mistake for any simulation or requirement. This was a requirement (and the Excel sheet was doing a simulation).

The give-away of the error was the use of the range B2:B4 on row 3. B4 is not above row 3, nor is it to the left of any cell on row 3. It breaks the two rules.

Pain and Anguish

My customer fixed the spreadsheet, to follow the rules that allowed a simulation to be safe and computationally possible. Once the sheet was fixed it no longer produced successful trading results.

There’s a lesson there as well: it’s much easier to successfully trade if you know the future. That’s why I call this bug leaking — the future isn’t known, but in simulation, the future may not be the future and unsafe data practices can lead to awful assumptions.

This is also why professional model-makers require their predictive models to work when given past data to predict (and test the prediction) against actual results. If a model that’s supposed to predict the future given knowledge of the present can’t predict the present given knowledge of the past the model doesn’t work.

Conclusion

When building executable requirements in Excel, or when building predictive models in Excel, keep the two rules in mind or it’s very easy to fool yourself. If the model isn’t then implemented by a programmer in another language (if the Excel is the delivery!) then the bug may very well not be caught at all.

Keep the Light,
Otter
Brian Jones

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s