Interim report about AI development and code performance optimisation


The PokerBot is dead, long live the BitSurfer! - No, seriously, what happened?

The PokerBot and the Poker Advisor were two very interesting projects, and I learned a lot from both of them! The Poker Advisor picks up the cards from the screen, automatically feeds them into an AI and predicts the outcome of the hand. The PokerBot, on the other hand, was planned to play by itself. Everything was set up, but online poker pretty much got banned in Switzerland.

So it was time to move on. But what could have been the next big challenge?

I stumbled upon Coinbase, a crypto broker, where I was mainly interested in Bitcoin.

They conveniently offer an API to which you can connect your preferred trading software, such as MetaTrader or similar. But no.

The plan is of course not to hook up some preferred trading software. Rather, my idea was to build my own interface from scratch, tailored to my needs. It was incredibly hard, with a very steep learning curve, but it was certainly worth the hassle.

The first alpha release is ready. The interface is really ugly, but form follows function. My idea of well-designed software is not primarily a fancy user interface. Rather, I just want my software to do its thing. Unnoticed. In the background. Nothing to install, no sitting in front of the screen telling the software what to do next. Just execute and watch, fascinated.

The interface basically consists of a visualization of the market data as well as the normalized input data for the AI.

What have I learned? First, I had to connect to Coinbase. I learned a lot about APIs and it was fortunately quite easy with the available packages.

Getting into the performance aspect of programming

I had to learn a lot about structuring data for AI, and with that came a lot of knowledge about code performance, to the point where it gets utterly insane. The main issue was that I am still working on my 7-year-old machine from my apprenticeship. It has aged well, but you really feel the age. From my next computer, which is imminent, I can expect roughly tenfold performance for my use cases! The processor alone is around 400% faster when running many threads (which I do).

I was in the following situation:

My code takes roughly a week to process 8 years of data (one data point every 2 s, so roughly 126'000'000 price points). Meanwhile, the next generation of processors and graphics cards is just a couple of months away and very promising.
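A quick sanity check of that number: the count of price points follows directly from the time span and the 2-second spacing. This is a minimal sketch in Python (the project itself is written in C#); the function name is made up for illustration and the 365.25-day year is an assumption.

```python
# Assumption: an average year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def price_points(years: float, interval_s: float) -> int:
    """Number of data points when one point is recorded every interval_s seconds."""
    return int(years * SECONDS_PER_YEAR / interval_s)


points = price_points(years=8, interval_s=2)
print(f"{points:,}")  # prints 126,230,400
```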

The amount of data to process is immense, so I got into code performance optimization. I already had some experience with multi-threaded workers that do several tasks at once. That alone gave me a little less than 3.5x performance because my CPU has 4 cores. It now took two days to process 8 years of data. That was still too much. If I backtest, I want to see the results. Fast.
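The step from a week to two days is simply 7 days divided by 3.5. If Amdahl's law is a fair model for this workload (an assumption, nothing was measured that way), the 3.5x on 4 cores also hints at how much of the work runs in parallel. A rough Python sketch with hypothetical helper names:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup when only part of the work can run in parallel."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def parallel_fraction(speedup: float, cores: int) -> float:
    """Amdahl's law solved for the parallel fraction, given an observed speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


runtime_days = 7 / 3.5            # one week single-threaded -> 2.0 days
p = parallel_fraction(3.5, 4)     # roughly 0.95 of the work is parallel
print(runtime_days, round(p, 3))
```

Under that model, the remaining serial part of about 5% limits how much extra cores alone can help, which is one reason to look at the code itself next.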

I had to go deeper into performance optimization and developed my own worker abstraction based on tasks in C#. Basically, it implements a way of manufacturing which I learned during my internship at INA Lahr. It's like having a machine (or worker) with an input queue and an output queue. The queue acts as a buffer and helps spread the work evenly across all CPU cores. It is like a modular, scalable worker pipeline in which the workers can pick items from a queue or bag and process them.

That also gave a great performance boost. I could now process all data in about a night, but I decided to rev it up further just to see how far it could go. I subscribed to a Visual Studio extension, "dotTrace", which is great for finding bottlenecks inside the code. You can record a trace and afterwards see, function by function, how long each one takes to execute. I found one function which I did not expect at all - the output string builder was throttling the whole pipeline. More about that can be found in the Stack Overflow question.
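The machine-with-two-queues idea can be sketched roughly like this. It is not the actual C# implementation, just a small Python illustration; the `Stage` class, `run_pipeline` and the two toy processing steps are made up for this example. Note that Python threads only show the structure here, they do not give real CPU parallelism the way C# tasks do.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a worker to shut down


class Stage:
    """One 'machine': takes items from an input queue, processes them
    and puts the results on an output queue."""

    def __init__(self, func, in_q, out_q, workers=1):
        self.func, self.in_q, self.out_q = func, in_q, out_q
        self.threads = [threading.Thread(target=self._run) for _ in range(workers)]

    def _run(self):
        while True:
            item = self.in_q.get()
            if item is _STOP:
                break
            self.out_q.put(self.func(item))

    def start(self):
        for t in self.threads:
            t.start()

    def stop(self):
        # One sentinel per worker, then wait until all workers have finished.
        for _ in self.threads:
            self.in_q.put(_STOP)
        for t in self.threads:
            t.join()


def run_pipeline(prices):
    # Bounded queues act as buffers between the stages.
    raw, scaled, out = queue.Queue(maxsize=100), queue.Queue(maxsize=100), queue.Queue()
    stages = [
        Stage(lambda p: p / 100.0, raw, scaled, workers=2),  # toy "normalize" step
        Stage(lambda x: x * 2, scaled, out, workers=2),      # toy "evaluate" step
    ]
    for s in stages:
        s.start()
    for p in prices:
        raw.put(p)
    for s in stages:  # stop upstream first so downstream drains everything
        s.stop()
    # With several workers per stage the output order is not guaranteed.
    return sorted(out.get() for _ in range(len(prices)))


print(run_pipeline([100, 200, 300]))  # prints [2.0, 4.0, 6.0]
```

Each stage can get more workers without touching the others, which is what makes this kind of pipeline modular and scalable.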
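To show what such a function-by-function trace looks like without dotTrace, here is a small sketch that uses Python's built-in `cProfile` as a stand-in. The functions `process`, `format_output` and `pipeline` are invented placeholders, not the real backtester code.

```python
import cProfile
import io
import pstats


def process(rows):
    # Placeholder for the actual number crunching.
    return [r * 1.01 for r in rows]


def format_output(rows):
    # Placeholder for the output formatting step.
    text = ""
    for r in rows:
        text += f"{r:.2f};"
    return text


def pipeline(rows):
    return format_output(process(rows))


def profile(func, *args):
    """Run func under the profiler and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(10)  # top 10 by cumulative time
    return buf.getvalue()


report = profile(pipeline, list(range(10000)))
print(report)
```

The report lists each function with its call count and time, so the most expensive one is the first candidate to look at.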

All in all the process is quite simple:

  1. Identify the biggest performance burden
  2. Eliminate it
  3. Repeat until you reach a reasonable performance.

In the example above, this would be the CLR worker process with the ID 10584. This sounds a little confusing at this stage, but JetBrains shows the call tree on the right-hand side, where we can see which function uses:

Conclusions and Takeaways

I know there is even more performance to be had, but for now I am quite happy with the outcome.

Right now, 2 years fly by in under 4 minutes, and I added a throttle in case the actions should be watched more carefully.
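Put into numbers, 2 years of 2-second data in 4 minutes means more than 130'000 price points per second. A throttle can be as simple as an optional pause after every tick. The following Python sketch is only an illustration of that idea, not the real backtester; `replay`, `points_per_second` and their parameters are hypothetical, and the 365.25-day year is an assumption.

```python
import time


def points_per_second(years, interval_s, runtime_s):
    """Throughput of a replay, assuming an average year of 365.25 days."""
    points = years * 365.25 * 24 * 60 * 60 / interval_s
    return points / runtime_s


def replay(ticks, on_tick, delay_s=0.0, sleep=time.sleep):
    """Feed historical ticks to on_tick; delay_s > 0 slows the replay down."""
    for tick in ticks:
        on_tick(tick)
        if delay_s > 0:
            sleep(delay_s)  # the throttle: pause so the action can be watched


print(points_per_second(years=2, interval_s=2, runtime_s=240))  # prints 131490.0

seen = []
replay([101.5, 101.7, 101.6], seen.append, delay_s=0.01)
print(seen)
```

With `delay_s=0.0` the replay runs at full speed; any positive value slows it down tick by tick.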

screenshot of backtester

The equity line and the statistics are saved as well. It is possible to simulate / run as many accounts and strategies as one likes with only a minor performance decrease.

I have already tried different strategies with indicators and so on, without very promising results. In the end, I found a profitable strategy which does none of those things. It does not follow the trend, nor does it bet against it. It does not try to predict anything, neither the perfect entry point nor the perfect exit. It's quite simple, but I do not want to go into more detail about the strategy. Think outside the box and you can find something.

The equity line is beautifully straight and follows its trend line almost perfectly.

equity line going up

That's a whopping 20% annual return. To be fair, I would have made way more with a buy-and-hold strategy, but on the other hand I am not predicting anything and there are none of those insane spikes (and falls).

That is now something I can build upon to see what's possible and what is not: a full market simulation sandbox, a test environment with almost limitless possibilities.

That is it for today, and I hope you enjoyed the insight into some of my ongoing projects.