wotreplays.org: A little bit more on heat maps
As you may have noticed, I’ve been adding these to a few places, but a few interesting things have come up – as always, really – that mean things are going to have to change, again.
So what is a heat map anyway? Simply put, it’s a way to measure the “weight” of a given point, and it’s expressed by a gradient color. The gradient usually runs from dark blue to cyan, to green, to yellow, to red and then to white. The higher in the gradient, the heavier the point is. And yes, it’s a very simplified explanation, but it’ll suffice for now.
You could, for example, plot a heatmap of you walking down the street and ringing doorbells. Every time you ring a doorbell, that position’s weight is increased by 1. Eventually the heatmap will show whose doorbell you’ve been ringing most – unless you ring them all.
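The doorbell example can be sketched in a few lines of Python. This is only an illustration: the positions and the function names `ring_counts` and `normalized` are made up for the example, and scaling to the 0..1 range is just one common way to prepare weights for a color gradient.

```python
from collections import Counter

def ring_counts(rings):
    """Add 1 to a position's weight every time its doorbell is rung."""
    weights = Counter()
    for position in rings:
        weights[position] += 1
    return weights

def normalized(weights):
    """Scale weights to 0..1 so they can be looked up in a color gradient."""
    peak = max(weights.values())
    return {position: count / peak for position, count in weights.items()}

# Hypothetical walk down the street: the house at (2, 0) gets rung three times.
rings = [(1, 0), (2, 0), (2, 0), (3, 0), (2, 0)]
weights = ring_counts(rings)
print(weights.most_common(1))
print(normalized(weights))
```

If every doorbell is rung equally often, every normalized weight ends up at 1.0, which is the "one big white blob" case.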
See the doorbell analogy above? Well, the game maps are only a finite size, which means that eventually the heatmap will be one big bright white blob because every position on the map has been “visited” a nearly equal number of times. Or perhaps not equal, but equal enough as far as the math behind it goes. This is already apparent on some of the heatmaps, where you have to fiddle with the detail level in order to actually see any difference. Strictly speaking, that setting isn’t a detail level either: it determines the radius of the point being drawn, which in turn determines which other points it may or may not include.
The additional problem
When a player dies, you still receive position updates for them, even though they haven’t moved. Or they may have moved if someone pushed their wreck a little bit, but that aside. The spotting mechanic also comes into play – you can see this on many heatmaps currently, where one side of the map has a decidedly larger sampling of data than the other, and this is mostly due to the uploaded replays having been recorded by someone who started off on that side. On top of that, un-spotted tanks don’t get position updates until they’re seen, which skews the data even more.
What really gets it going is spotting ranges. When a tank is within 50 meters, you receive a position update every game tick (0.1 seconds). The further out they go, the longer the time between updates. (Side note: this is what explains invisible tanks.) The recorder of the replay is obviously always within 50 meters, so they generate a position update every 0.1 seconds, or in heatmap speak, 10 data points per second. A spotted enemy player that is between 270 and 445 meters out will only generate a single data point per second – so if I were to plot that on a heat map, the enemy’s data points would probably not even show due to being drowned out by the recorder’s data points.
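To put numbers on the drowning-out effect, here is a rough sketch using only the two update intervals quoted above. The 60-second window and the `data_points` helper are assumptions made for the example, and it pretends both tanks are visible the whole time.

```python
# Update intervals in seconds, taken from the two figures quoted above.
RECORDER_INTERVAL = 0.1  # within 50 m: one update per game tick
DISTANT_INTERVAL = 1.0   # 270-445 m: one update per second

def data_points(seconds_visible, interval):
    """Number of position updates received over a stretch of time."""
    return round(seconds_visible / interval)

# Assumed scenario: both tanks are tracked for the same 60 seconds.
recorder = data_points(60, RECORDER_INTERVAL)
enemy = data_points(60, DISTANT_INTERVAL)
share = enemy / (recorder + enemy)
print(recorder, enemy, f"{share:.1%}")  # 600 60 9.1%
```

Even with identical time on the map, the distant enemy contributes under a tenth of the samples.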
The solution seems rather obvious: normalize the data points. That’s all fine and dandy, but you’d end up with a problem of a rather massive scale: it means replaying every battle, start to finish. Every time a position update comes in for any player who is not the recorder of the replay, you would have to find out how far away they were from the recorder, and in turn how much the data point should be worth. Assuming we use a value of 1 for the recorder, anyone at 270+ meters would have a value of 10 for their data point. That would work if it weren’t for the fact that the update times aren’t fixed and can in fact vary between maps; some maps have more frequent update ticks than others, or so it’s rumoured.
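As a sketch of what that weighting would look like if the update interval were known for every data point (which, as noted, it isn't without replaying the battle, and the intervals may differ per map): each point is worth its update interval divided by the recorder's interval. The names `point_weight` and `weighted_total` are hypothetical.

```python
BASE_INTERVAL = 0.1  # the recorder's update interval in seconds, worth a weight of 1

def point_weight(update_interval):
    """Weight of one data point relative to a recorder data point."""
    return update_interval / BASE_INTERVAL

def weighted_total(seconds_visible, update_interval):
    """Total weight a tank contributes while visible at a fixed update interval."""
    updates = round(seconds_visible / update_interval)
    return updates * point_weight(update_interval)

# A tank at 270+ meters updates once per second, so each point counts for 10.
print(point_weight(1.0))
# Over the same 60 seconds both tanks now contribute the same total weight.
print(weighted_total(60, 0.1), weighted_total(60, 1.0))
```

This assumes a tank stays in one distance band for the whole window; in a real replay the interval would have to be worked out again for every single update.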
It also doesn’t solve the problem that the friendly team always gets more position updates by virtue of being within radio range, which again skews the data set towards the friendly team’s side. Given that the data is extracted from uploaded replays, there isn’t a single map where there is a 50/50 distribution between starting points.
To make matters worse, the way I store the data amplifies this problem. To keep the data set small enough to be easily downloaded and displayed, all coordinates are converted to a sub-cell of the map. Every map has 100×100 sub-cells on it, which means that for most maps every sub-cell is exactly 10×10 meters in size – and when recording heatmap data, all coordinates falling within a certain sub-cell are counted towards that sub-cell.
As you can see, that means that at match start there may be only 2 sub-cells with tanks in them, and their values will skyrocket right away. After that, any time there’s more than one tank in a sub-cell, that cell’s value can go up like a rocket if it’s 2 friendlies, and even faster if one of them happens to be the recorder and there are 3 tanks with him.
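A minimal sketch of that sub-cell conversion, assuming a 1000×1000 meter map and coordinates measured in meters from one corner of the map (the real replay coordinate system may well differ). The names `to_cell` and `accumulate` are made up for the example.

```python
from collections import Counter

GRID = 100  # sub-cells per axis, as described above

def to_cell(x, y, map_size=1000.0):
    """Convert a position in meters to a (column, row) sub-cell index."""
    scale = GRID / map_size
    # Clamp so that positions on the map edge still land in a valid cell.
    col = max(0, min(int(x * scale), GRID - 1))
    row = max(0, min(int(y * scale), GRID - 1))
    return col, row

def accumulate(positions, map_size=1000.0):
    """Count every position update towards the sub-cell it falls in."""
    cells = Counter()
    for x, y in positions:
        cells[to_cell(x, y, map_size)] += 1
    return cells

# Two tanks a few meters apart end up in the same 10x10 meter sub-cell.
updates = [(12.0, 15.0), (18.0, 11.0), (25.0, 15.0)]
print(accumulate(updates))
```

Because nearby tanks share a cell, their update counts stack on top of each other.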
The solution currently seems to be to forget about “proper” heatmaps altogether, and switch to plotting out points and doing some juju with opacity based on how many tanks have visited said point. We’ll see how that one ends up going.
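One way this could look, purely as a sketch and not a finished design: count each tank at most once per point, then map the number of distinct visitors to an opacity. The linear mapping, the 0.1 floor and the names `visitors_per_cell` and `opacity` are all assumptions.

```python
from collections import defaultdict

def visitors_per_cell(updates):
    """Collect the distinct tanks seen in each cell from (tank_id, cell) pairs."""
    seen = defaultdict(set)
    for tank_id, cell in updates:
        seen[cell].add(tank_id)
    return seen

def opacity(visitor_count, max_visitors, floor=0.1):
    """Linear opacity between floor and 1.0; cells nobody visited stay invisible."""
    if visitor_count <= 0 or max_visitors <= 0:
        return 0.0
    fraction = min(visitor_count, max_visitors) / max_visitors
    return floor + (1.0 - floor) * fraction

# Tank "a" sends three updates from the same cell but only counts once there.
updates = [("a", (1, 1)), ("a", (1, 1)), ("a", (1, 1)), ("b", (1, 1)), ("b", (2, 1))]
seen = visitors_per_cell(updates)
busiest = max(len(tanks) for tanks in seen.values())
for cell, tanks in sorted(seen.items()):
    print(cell, len(tanks), opacity(len(tanks), busiest))
```

Counting visitors instead of updates means a tank that reports ten times a second no longer outweighs one that reports once a second.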