Interim report on AI development and code performance optimization
The poker bot is dead, long live the BitSurfer! - No seriously, what happened?
The PokerBot and the PokerAdvisor were two very interesting projects, and I learned a lot from both! The PokerAdvisor picks up the cards from the screen, automatically feeds them into an AI, and predicts the outcome of the hand. The PokerBot, in turn, was meant to play on its own - everything was prepared, but then online poker was effectively banned in Switzerland.
So it was time to move on. But what would the next big challenge be?
I stumbled upon Coinbase, a crypto broker, where I was mostly interested in Bitcoin.
Handily, they offer an API through which you can connect your favorite trading software, such as MetaTrader. But no.
The plan, of course, was not to hook up some off-the-shelf trading software. Rather, my idea was to build my own interface from scratch, with my needs in mind. The learning curve was incredibly steep, but it was certainly worth the effort.
The first alpha version is ready. The interface is very ugly, but form follows function. My idea of well-designed software is not primarily a fancy user interface. Rather, I want my software to simply do its thing. Absolutely. In the background. Nothing to install, no sitting in front of the screen telling the software what to do next. Just run it and watch in fascination.
The interface essentially consists of a visualization of the market data as well as the normalized input data for the AI.
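The post doesn't say which normalization scheme the AI inputs use, so here is a hedged sketch of one plausible option: min-max scaling a window of prices to [0, 1], so the network sees comparable inputs regardless of the absolute price level. The method name and the sample prices are purely illustrative.

```csharp
using System;
using System.Globalization;
using System.Linq;

class Program
{
    // Scale a window of prices into [0, 1] (illustrative; not the project's actual code).
    static double[] MinMaxNormalize(double[] window)
    {
        double min = window.Min(), max = window.Max();
        double range = max - min;
        if (range == 0) return window.Select(_ => 0.5).ToArray(); // flat window: map to midpoint
        return window.Select(p => (p - min) / range).ToArray();
    }

    static void Main()
    {
        var prices = new[] { 19000.0, 19500.0, 20000.0, 19250.0 };
        Console.WriteLine(string.Join(", ",
            MinMaxNormalize(prices).Select(v => v.ToString("0.00", CultureInfo.InvariantCulture))));
        // 0.00, 0.50, 1.00, 0.25
    }
}
```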
What did I learn? First, I had to connect to Coinbase. I learned a lot about APIs, and fortunately, with the packages available, it was quite easy.
Introduction to the performance aspect of programming
I had to learn a lot about structuring data for AI, and that basically meant learning a lot about code performance - to the point where it gets completely insane. A big factor was that I was still working on the 7-year-old machine I kept from my training. It has aged well, but you can really feel the years. From my next machine, which is imminent, I can expect about 10x the performance for my use cases! The processor alone is about 400% faster on workloads that use many threads (which mine do).
I found myself in the following situation:
My code takes about a week to process 8 years of data (one data point every 2 seconds, i.e. 8 × 365 × 86,400 s ÷ 2 ≈ 126,000,000 price points). Meanwhile, the next generation of processors and graphics cards is only a few months away and looks very promising.
The amount of data to be processed is immense, and so I got into code performance optimization. I already had some experience with multi-threaded workers that handle multiple tasks simultaneously, which already gave me a little less than 3.5x the performance, since my CPU has 4 cores. Processing 8 years' worth of data now took two days. That was still too much. When I run backtests, I want to see the results. And I want them fast.
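Why a little less than 3.5x on 4 cores rather than a clean 4x? Amdahl's law gives a quick back-of-the-envelope answer: any serial fraction of the work caps the speedup. The parallel fraction of 95% below is an assumed, illustrative value, not a measurement from the project.

```csharp
using System;
using System.Globalization;

class Program
{
    // Amdahl's law: speedup = 1 / ((1 - p) + p / n),
    // where p is the parallelizable fraction and n the core count.
    static double Amdahl(double p, int n) => 1.0 / ((1.0 - p) + p / n);

    static void Main()
    {
        double speedup = Amdahl(p: 0.95, n: 4);
        Console.WriteLine(speedup.ToString("0.00", CultureInfo.InvariantCulture));
        // ≈ 3.48 - even a 5% serial fraction keeps 4 cores well below 4x
    }
}
```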
I had to dig deeper into performance optimization and developed my own worker abstraction based on Task in C#. Basically, it implements a manufacturing principle I learned during my internship at INA Lahr: a machine (or worker) with an input queue and an output queue. The queue acts as a buffer and helps distribute the work evenly across CPU cores. The result is a modular, scalable worker pipeline in which workers take items from a queue or bag and process them.
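A stage of such a pipeline can be sketched with `Task` and `BlockingCollection<T>` from the standard library. This is a minimal, hedged sketch of the idea, not the project's actual abstraction; the `Stage` class and its member names are made up for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// One "machine" with an input queue and an output queue (illustrative names).
class Stage<TIn, TOut>
{
    // Bounded queues act as buffers between stages and apply back-pressure.
    public BlockingCollection<TIn> Input { get; } = new BlockingCollection<TIn>(boundedCapacity: 1000);
    public BlockingCollection<TOut> Output { get; } = new BlockingCollection<TOut>(boundedCapacity: 1000);

    readonly Func<TIn, TOut> _work;
    public Stage(Func<TIn, TOut> work) => _work = work;

    // Spin up one Task per worker; each pulls items until the input is marked
    // complete, then the stage marks its own output complete.
    public Task Run(int workers) =>
        Task.WhenAll(Enumerable.Range(0, workers).Select(_ => Task.Run(() =>
        {
            foreach (var item in Input.GetConsumingEnumerable())
                Output.Add(_work(item));
        }))).ContinueWith(_ => Output.CompleteAdding());
}

class Program
{
    static void Main()
    {
        var stage = new Stage<int, int>(x => x * x); // toy workload: square each item
        var runner = stage.Run(workers: 4);

        for (int i = 1; i <= 5; i++) stage.Input.Add(i);
        stage.Input.CompleteAdding();

        long sum = 0;
        foreach (var result in stage.Output.GetConsumingEnumerable()) sum += result;
        runner.Wait();
        Console.WriteLine(sum); // 1 + 4 + 9 + 16 + 25 = 55
    }
}
```

Chaining several such stages, with one stage's output queue feeding the next stage's input queue, gives exactly the modular production line described above.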
This also gave a big performance boost. I was now able to process all the data in about a night, but I decided to push it even further, just to see how far it could go. I subscribed to dotTrace, a JetBrains profiler that integrates with Visual Studio and is great at finding bottlenecks in code. You record a trace and then see, function by function, how long each one takes to execute. I found a bottleneck I wasn't expecting at all - the output string builder throttled the entire pipeline. You can read more about this in the Stack Overflow question.
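To illustrate the kind of string-building bottleneck a profiler surfaces (this is a generic sketch, not the actual code from the trace): naive string concatenation in a loop is O(n²), because every `+` allocates and copies a new string, while a pre-sized `StringBuilder` appends in amortized constant time.

```csharp
using System;
using System.Diagnostics;
using System.Text;

class Program
{
    static void Main()
    {
        const int rows = 20000;

        var sw = Stopwatch.StartNew();
        string slow = "";
        for (int i = 0; i < rows; i++)
            slow += i + ";" + (i * 2) + "\n"; // copies the whole buffer on every iteration
        sw.Stop();
        Console.WriteLine($"concat:  {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        var fast = new StringBuilder(rows * 12); // pre-size to avoid repeated regrowth
        for (int i = 0; i < rows; i++)
            fast.Append(i).Append(';').Append(i * 2).Append('\n');
        sw.Stop();
        Console.WriteLine($"builder: {sw.ElapsedMilliseconds} ms");

        Console.WriteLine(slow == fast.ToString()); // True: same output, very different cost
    }
}
```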
All in all, the process is quite simple:
- Identify the biggest performance hotspot
- Eliminate it
- Repeat until performance is reasonable.
In the example above, this would be the CLR worker process with ID 10584. That sounds a bit confusing at this point, but dotTrace shows the call tree on the right, where we can see which function the time is spent in:
Conclusions and takeaways
I know there is more performance to be gained in this area, but for now I'm pretty happy with the result.
Right now, 2 years of data fly by in less than 4 minutes, and I can shift down a gear whenever I want to watch the actions more closely.
The equity line as well as the statistics are saved too. It is possible to simulate and run as many accounts and strategies as you want, with only a small drop in performance.
I have already tried various strategies with indicators and the like - without promising results. In the end, I found a winning strategy that does none of these things. It neither follows the trend nor bets against it. It doesn't try to predict anything - neither the perfect entry point nor the perfect exit. It's quite simple, but I won't go into the details of the strategy. Think outside the box and you may find something.
That's a whopping 20% annual return. To be fair, I would have made more with a simple buy-and-hold strategy, but on the other hand, I'm not predicting anything, and I avoid those insane spikes (and tumbles).
Now that's something I can build on and see what's possible and what's not. A complete market simulation in the sandbox. A test environment with almost unlimited possibilities.
That's it for today, and I hope you enjoyed the glimpse into some of my ongoing projects.