Finding Twitch Streamers in a PUBG Match
We recently released a new feature that displays videos from Twitch streamers who participated in a PUBG match alongside the replays on minmax.gg/chickendinner. To build this we needed to detect Twitch streamers from their in-game account names, which turned out to be a pretty interesting problem to solve.
Probability of Playing with a Streamer
Before we decided to develop this feature, we wanted to make sure that the chance of a match containing a player who is streaming on Twitch is high enough to make our efforts worthwhile. We can make a very rough estimate by comparing the number of active players with the number of active live streamers.
At the time of writing, there are about 2100 streamers broadcasting the game on Twitch, and about 700 thousand current PUBG players on Steam. This means that roughly 0.3% of the current players (2100 out of 700,000) are streaming at this moment.
A match is usually played with 100 players, so the chance of at least one of them being a streamer is the complement of the probability that none of them is. If we treat every player as an independent draw, the probability that a player isn't a streamer is 99.7%, and the chance that none of the 100 players are streamers is 0.997 to the power of 100, which is about 0.74. In other words, in any given match we have roughly a 26% chance of finding at least one player who is streaming on Twitch.
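The estimate above is easy to reproduce. The snippet below is only a sketch of the arithmetic: the function name and parameters are ours, and it assumes that every player in a match is streaming independently with the same probability.

```python
def chance_of_streamer(streamers: int, players: int, match_size: int = 100) -> float:
    """Probability that at least one player in a match is streaming."""
    # Share of the player base that is live on Twitch right now.
    p_streamer = streamers / players
    # Complement of "nobody in the match is a streamer".
    return 1 - (1 - p_streamer) ** match_size


if __name__ == "__main__":
    # Figures quoted in the text: 2100 streamers, 700 thousand players.
    print(f"{chance_of_streamer(2100, 700_000):.1%}")
```

With the numbers quoted in the text this comes out at roughly one in four.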
It is worth noting that we can assume a high bias towards the North American region on Twitch compared to the overall PUBG player base, so if you are looking at NA matches this percentage will probably be much higher.
Now that we know that there's a good chance that we'll find a streamer in any given match, how do we actually figure out that a player is streaming on Twitch?
You can connect your Twitch account in the game for added bonuses, but this data is unfortunately not available from the PUBG API. We need another way to map a player's name to a streamer on Twitch.
Let's consider a hypothetical streamer with the account name Mitch. Mitch is a regular PUBG player, but his Twitch channel isn't yet receiving the viewership that he dreams of having. Clever as he is, he decides to set his in-game name to TwitchMitch. Now everyone knows that Mitch is, in fact, streaming on Twitch.
This naming convention happens to be very common, with similar variations such as TTVMitch, or perhaps Mitch_TV. These names can be programmatically detected and stripped down to the account name on Twitch, giving us a video to display in the match replay.
The approach above gives us a pretty good starting point, but it won't be nearly enough. Most of the well-known streamers don't have names that follow this pattern. In order to register them, we need a custom mapping from their PUBG account to the name of their Twitch channel.
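As an illustration, the detection could look something like the sketch below. The exact list of prefixes and suffixes is an assumption based on the three examples in the text (Twitch, TTV and _TV), not the rules we actually run, and the result is only a candidate that still has to be checked against Twitch.

```python
import re
from typing import Optional

# Assumed markers, taken from the examples TwitchMitch, TTVMitch and Mitch_TV.
_PREFIX = re.compile(r"^(?:twitch|ttv)[_\-]*(?=.)", re.IGNORECASE)
_SUFFIX = re.compile(r"(?<=.)[_\-]*(?:twitch|ttv|tv)$", re.IGNORECASE)


def twitch_candidate(player_name: str) -> Optional[str]:
    """Return a possible Twitch channel name, or None if no marker is found."""
    for pattern in (_PREFIX, _SUFFIX):
        stripped, count = pattern.subn("", player_name)
        if count and stripped:
            # Lowercased because the candidate is meant to be looked up as a channel name.
            return stripped.lower()
    return None


if __name__ == "__main__":
    for name in ("TwitchMitch", "TTVMitch", "Mitch_TV", "Mitch"):
        print(name, "->", twitch_candidate(name))
```

A short suffix like TV will also hit names that have nothing to do with Twitch, so a real implementation would want to confirm that the candidate channel exists and is currently streaming PUBG.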
To tackle this, we started out by manually looking at the top streamers on Twitch, and registering their in-game account names that we saw on screen. If you are lucky, you happen to catch a moment where the player is waiting in the lobby, where there will be multiple places to spot the name:
You are however much more likely to find the player currently playing a match. If they are playing with teammates, you'll have their name visible in the bottom left corner at all times:
You will have to deduce which one of the names is theirs though, as it won't always be in the same position. An easy way to do this is to look at the mini map in the bottom right corner, which is centered on their player marker and shows its number and color.
The most consistent approach would be to look at the center bottom of the screen, where you'll find this:
This text will always include the account name, the current version of the game, the last 6 characters of the match id and the server region. Note that it will rarely be as clean as above, since it tends to blend together with whatever is happening behind it on the screen. The small text is also prone to becoming too blurry to read when the video is full of compression artifacts. But sooner or later you'll catch a good enough view to be able to read it.
You might already be thinking what we also realised at this point: this seems like a perfect task to be handled automatically by a computer!
In ideal conditions, an OCR algorithm should be able to pick up what the text at the bottom says. After trying a couple of different alternatives, we eventually judged Google's Cloud Vision API to produce the best results. We could set up a script that does the following:
- Fetches a PUBG livestream and crops out the bottom part from the screenshot (the Twitch API provides a full size preview image of all streams so we don't have to do any of the screenshotting ourselves).
- Sends the screenshot to the Cloud Vision API and parses out the player name from the result.
- Verifies that the player exists through the PUBG API, then saves the Twitch to PUBG account mapping to our database.
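The three steps above can be sketched as a small loop. This is only an outline of the control flow: read_player_name and player_exists are hypothetical stand-ins for the Cloud Vision and PUBG API calls, passed in as plain functions so that no real API signatures are implied.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def collect_mappings(
    streams: Iterable[Tuple[str, bytes]],
    read_player_name: Callable[[bytes], Optional[str]],
    player_exists: Callable[[str], bool],
) -> Dict[str, str]:
    """Map Twitch channels to PUBG account names.

    streams yields (twitch_channel, cropped_screenshot) pairs.
    read_player_name is a hypothetical OCR wrapper that returns None
    when the text is illegible.
    player_exists is a hypothetical lookup against the PUBG API.
    """
    mappings: Dict[str, str] = {}
    for channel, screenshot in streams:
        name = read_player_name(screenshot)
        # Keep only names that were readable and belong to a real player.
        if name and player_exists(name):
            mappings[channel] = name
    return mappings
```

Keeping the API calls behind plain functions also makes it easy to try the loop with fake data before spending money on real OCR requests.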
This would all work great except for one little detail: Google charges $1.50 for every 1000 Cloud Vision API requests. At any given time there are between 2 and 3 thousand active streamers, so just running this script once would cost us up to $4.50. Considering that we are likely to get a lot of screenshots with illegible text, we would need to run the script multiple times to start gathering any meaningful amount of data. In other words, this approach would quickly become a very expensive endeavour.
Fortunately, we can be a little bit smarter with how we set up our Cloud Vision calls. We can make use of the fact that Google charges for individual requests, regardless of the size of the image. This means we can stitch together multiple images into a grid, like this:
We then send this image grid to the API and map the resulting text to the streamer that corresponds to the region where the text was detected.
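Mapping a piece of detected text back to a streamer is mostly integer division. The sketch below assumes that all tiles have the same size, that they are laid out row by row, and that the OCR result gives each text block as a list of (x, y) corner points. The function and parameter names are ours.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def streamer_for_box(
    box: Sequence[Point],
    tile_size: Tuple[int, int],
    columns: int,
    streamers: List[str],
) -> Optional[str]:
    """Return the streamer whose tile contains the center of a text box."""
    tile_w, tile_h = tile_size
    # Use the center of the box so slightly overflowing text still lands in its tile.
    cx = sum(x for x, _ in box) / len(box)
    cy = sum(y for _, y in box) / len(box)
    col = int(cx // tile_w)
    row = int(cy // tile_h)
    if row < 0 or not 0 <= col < columns:
        return None
    index = row * columns + col  # tiles are assumed to be laid out row by row
    return streamers[index] if index < len(streamers) else None


if __name__ == "__main__":
    names = ["a", "b", "c", "d", "e"]
    box = [(650, 65), (900, 65), (900, 85), (650, 85)]
    print(streamer_for_box(box, (640, 60), 2, names))
```

In the usage example the box center falls in the second column of the second row, which is the fourth tile in a grid that is two tiles wide.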
There is a limit to how much we can send to Google at once, so we'll stick to 300 stitched-together images per request. This means we can now run the same script 300 times for the same cost as before, which makes it a very viable and effective approach.
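To put numbers on the saving, here is a small cost calculation using the figures from the text: $1.50 per 1000 requests, up to 3000 streams and 300 images per request. The function itself is just an illustration.

```python
import math


def cost_per_run(
    streams: int,
    images_per_request: int = 1,
    price_per_1000_requests: float = 1.5,
) -> float:
    """Cost in dollars of sending one screenshot of every stream through OCR."""
    # Each request is billed the same regardless of how many tiles it holds.
    requests = math.ceil(streams / images_per_request)
    return requests * price_per_1000_requests / 1000


if __name__ == "__main__":
    print(f"one image per request:  ${cost_per_run(3000):.3f}")
    print(f"300 images per request: ${cost_per_run(3000, images_per_request=300):.3f}")
```

For 3000 streams the grid turns 3000 requests into 10, so a full run costs a cent or two instead of several dollars.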
Filling the Last Gaps
We've now been running this script for a couple of weeks, and so far we've gathered 25,000 Twitch to PUBG account connections. There are, however, some streamers that the script will never be able to map, for example those who have their own overlay blocking the bottom text.
These are streamers that we would still need to register manually. Instead of doing this ourselves, we decided to add a feature that allows our visitors to submit any streamers that they happen to notice are missing from our database:
We're excited to continue experimenting with ways to gather these connections, and we're even more excited to find new ways to utilize the information that we already have. You can look forward to more from us in the future!