Martin Dachselt, the CEO and Managing Director of esports data company Bayes Esports, writes for Esports Insider to discuss the journey of esports data, from its origins to its incredible rise in prominence.
Ever since gaming first graced our screens, games have provided players with information based on their performance.
Whether it is the basics, like telling players how many lives they have left, or deep insights into MMO boss fights, data has always been around in gaming, and in esports. In this article, we will dive deep into the history of esports data to show how it has evolved over almost a decade, and how it has become a driving force for the entire esports industry.
The early beginnings
Back in 2014, when the esports industry was just starting to take the shape we know today, official server-based game data was rare, so match data was mostly gathered through manual scouting or scraping. At that time, rights holders such as game developers and tournament organisers were simply not using data and statistics in the way they are used today. Overall production values were lower, and even major tournaments such as the now world-famous Katowice Major (won by Virtus.pro in 2014) were not taking advantage of the immense amounts of data CS:GO could provide.
Although Pinnacle, the first company to offer esports betting, began taking esports bets in 2010, the market was very slim. The company had not yet realised the full potential of using data sets to improve its betting services.
What the esports scene did have, however, were community apps such as LOLSUMO, a League of Legends companion app developed by DOJO Madness, a company that later evolved into Bayes Esports. Apps like this one enabled users to connect their in-game accounts in games such as League of Legends to get live tips and information about their matches, helping them improve in the early days of the game. These apps were effectively digital coaches, built on making sense of the casual game data that was available before the use of data became the industry standard.
Such companion apps still exist today, and new ones regularly enter the market with new approaches and new technologies. They are still focused on casual players who want to improve their skills and review their games.
Push for commercialisation
After a few years of experimentation with different ideas, apps and platforms, the esports industry slowly but steadily matured in how it used and made sense of data. From 2016 onwards, tournament organisers and game publishers started to recognise the value that data brings to everyone involved. The question then became how to actually collect and analyse data efficiently so that it could be used by other parties.
Over time, third parties started using optical character recognition to scrape data from esports broadcasts, such as live streams, to obtain something close to real-time data. This was, however, done without the consent of rights holders.
Another issue was that the broadcasts themselves are never truly live; the action audiences see on stream is always delayed by at least a couple of seconds, and sometimes even minutes. Scraping streams was therefore far from an ideal approach to data collection, since the resulting data was typically late and unreliable.
To make sense of these large data sets, which had never been seen before in traditional sports, even the most experienced sports data science companies turned to those specialising in esports. This led to some interesting partnerships, for example our venture with Sportradar, during which we created the Bayes Esports brand.
The power of official data
As the data industry matured, official data available through game clients remained in use through various apps, along with scraped data. However, data companies soon realised that the best way to improve the quality of their offerings was to partner with game developers and tournament operators themselves, creating official live data sets.
This development allowed teams and the broader community to access historical data, replays and scrim data in one place, letting them properly analyse their performance and improve their level of play. Since the data is official and comes right from the source (Riot Games’ client), it has an abundance of advantages over simpler and cheaper stream scraping techniques, making it a superior way of approaching data.
For example, official live data feeds prediction models that forecast the results of esports games far more efficiently than humans can, which helps betting companies immensely.
However, despite the fact that the official live data approach is clearly a better alternative than using replays of previous streams, the latter, although risky, is still used by a large number of companies because of its lower price.
Push for data protection
The industry today is trying to return to its roots, however strange that might sound. With the market flooded by dozens of companies scraping broadcasts to create data and sell it to clients, major rights holders aim to promote official live data as the only right way of doing things, similar to the early days when you could only get live data from in-game clients.
A recent court ruling in Germany found that scraped data should never be called or advertised as ‘live’, which reflects the fact that these grey market practices are hurting the broader esports industry. In fact, scraping itself may be illegal. Legal rights holders and official data distributors are now fighting to protect esports data for the overall development of the industry. Bayes Esports has made a large move in this field, with a successful court ruling against the framing of such grey market practices. Whilst it does use more resources, official live data will always create more value for the end customer.
UPDATE 08/12/2023: The wording in the last paragraph of this article has been updated to clarify that the outcome of the German court ruling pertained to the advertisement of live data.