Modeling matchups mathematically

Nintendude · Oct 2, 2009

Let me start by saying I hate matchup "ratios." In the traditional sense, they are completely meaningless to me. One interpretation of them (apparently the way Street Fighter does it) is that, for example, a 7-3 matchup means that out of 10 matches one character wins 7 on average and the other wins 3. Well, I think this is fundamentally flawed. Taking 3 matches out of 10 is actually pretty decent, so how can this be considered a horrible matchup ratio? To use a more extreme example, why should 9-1 exist? If a character is losing so badly, why should he be able to win 1 out of 10 games? It may be possible, but the ratio makes no attempt to illustrate it.

Now here's my approach to it:

Imagine an infinitely long stock match. Let's say that for a Fox vs. Falcon match, on average, for every 6 stocks Fox takes, Falcon takes 4 stocks.

This simply translates to 6-4 in ratio form. Simple.

Now let's consider Fox vs. Sheik. Let's say, for the purposes of this model, that for every 6 stocks Fox takes, Sheik takes 5. Let's scale this to a ratio in the 1-10 form:

6*10/(6+5) = 5.45
5*10/(6+5) = 4.55

This roughly translates to a 55-45 matchup (or 5.45-4.55), which can then be rounded to 5-5.
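
The scaling step above can be written as a small function. This is only a sketch of the arithmetic in the post; the name `scale_to_ten` is made up for illustration.

```python
def scale_to_ten(stocks_a, stocks_b):
    """Scale a stock ratio so that the two sides add up to 10."""
    total = stocks_a + stocks_b
    if total <= 0:
        raise ValueError("stock counts must add up to a positive number")
    return 10 * stocks_a / total, 10 * stocks_b / total


# Fox takes 6 stocks for every 5 Sheik takes.
fox, sheik = scale_to_ten(6, 5)
print(f"{fox:.2f}-{sheik:.2f}")  # 5.45-4.55
```

Rounding each side to a whole number then gives the 5-5 from the post.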

I'd say the biggest fundamental difference between this interpretation and the traditional "Street Fighter" interpretation is that you are ignoring the fact that matches are finite and end after 4 stocks. Well, to me this is actually a more meaningful approach. A ratio of 10-0 can mean either that a character is getting ***** repeatedly or that the character is barely losing repeatedly. These are two completely different situations being represented by the same ratio, and it makes no sense. If you instead say that on average a certain character takes 4 stocks while losing 1, you can easily see that either character has a chance of winning, despite it being very uphill for the worse character.
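
To see roughly how a stock ratio relates to the chance of winning a finite 4-stock match, here is a rough sketch. It assumes every stock is won independently with a fixed probability equal to the character's share of stocks. That assumption is not part of the model above and is almost certainly too simple for real play; the function name `match_win_probability` is made up for illustration.

```python
from math import comb


def match_win_probability(p_stock, stocks=4):
    """Chance of taking `stocks` stocks before the opponent does.

    ASSUMPTION: each stock is an independent event won with
    probability p_stock (not something the original model claims).
    """
    q = 1.0 - p_stock
    # The winner takes the last stock; before that the loser took k stocks
    # (k = 0 .. stocks-1), arranged in comb(stocks - 1 + k, k) possible orders.
    return sum(
        comb(stocks - 1 + k, k) * p_stock**stocks * q**k
        for k in range(stocks)
    )


for a, b in [(6, 4), (7, 3), (4, 1)]:
    share = a / (a + b)
    print(f"{a}-{b} in stocks -> {match_win_probability(share):.3f} chance per match")
```

Under this assumption a small edge in stocks turns into a larger edge in matches, which is one way to see why the two interpretations of the same numbers are not interchangeable.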

Also, while I'm not knowledgeable about Street Fighter, I'm pretty sure the game is a traditional fighter in the sense that there aren't stocks - just individual games/matches. Basically, it seems to me like Street Fighter's games are sort of an equivalent to Smash's stocks. So, why are we carrying over Street Fighter's ratio interpretation to games rather than stocks?

If you wish you may skip this next section, cause it is really complex and strays from the overall point I'm trying to make.
------------------------------------------------------------------------------------------------

Now, how do we deal with counterpicks? This is really difficult but I have an idea. Let's examine Pikachu vs. Fox in SSB64 (cause SSB64 is simpler and doesn't use Dave's Stupid Rule).

For this example I'll say that the first stage in a best of 3 is random between Hyrule and Dreamland. Let's say that we determine the stock ratios to be 5-5 on Hyrule and 7-4 (Pika takes 7 for every 4 Fox takes) on Dreamland.

In a best of 3 set, Hyrule and Dreamland have equal chances of being chosen for the first match. If Hyrule gets chosen, both characters have an equal chance of winning. If Fox wins, Pika will counterpick Dreamland (or another stage that Fox sucks about as much on), and if Pika wins, Fox will pick Hyrule again. The same logic then decides the stage for the third match.

Sample space:

H = Hyrule, D = Dreamland
(first stage, second stage, third stage)
HDD
HDH
DHD
DHH
HH
DD

Compute probabilities of each element in the sample space (based on chances of the characters winning on each particular stage). The probability of HH, for example, is .5*.5 = .25. I derived those numbers from H having a 50% chance of showing up in the first match and then a 50% chance of Fox losing, resulting in H being picked again.

With the probabilities of each one, you can then do a weighted average with the stock ratios determined up above to yield a composite stock ratio. Then use the scaling formula to convert it to a 1-10 ratio, round, and you got your matchup result.
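
The enumeration can also be done by brute force. The sketch below walks through every best of 3 set under the rules described above (random first stage, loser counterpicks, Pika always picks Dreamland, Fox always picks Hyrule). Two things are assumptions: the chance of Pika winning a game on Dreamland is set to his stock share, 7/11, purely as a placeholder, and a set stops as soon as someone has 2 wins. Because of that stopping rule, a strict enumeration yields sequences such as HD and HHD, so it may not line up one-to-one with the hand-written list above. All names in the code are made up for illustration.

```python
from collections import defaultdict

# Chance that Pikachu wins one game on each stage.
# H = 0.5 follows from the 5-5 ratio; D = 7/11 is an ASSUMPTION
# (stock share used as a stand-in for the game win chance).
P_PIKA_WINS = {"H": 0.5, "D": 7 / 11}
# The loser of a game picks the next stage.
COUNTERPICK = {"pika": "D", "fox": "H"}


def enumerate_sets():
    """Return {stage sequence: probability} for every possible best of 3 set."""
    results = defaultdict(float)

    def play(stage, history, prob, pika_wins, fox_wins):
        history += stage
        p = P_PIKA_WINS[stage]
        for winner, chance in (("pika", p), ("fox", 1 - p)):
            pw = pika_wins + (winner == "pika")
            fw = fox_wins + (winner == "fox")
            if pw == 2 or fw == 2:
                results[history] += prob * chance  # set is over
            else:
                loser = "fox" if winner == "pika" else "pika"
                play(COUNTERPICK[loser], history, prob * chance, pw, fw)

    for first_stage in "HD":  # first stage is random, 50/50
        play(first_stage, "", 0.5, 0, 0)
    return dict(results)


sets = enumerate_sets()
games_on = {"H": 0.0, "D": 0.0}
for sequence, prob in sets.items():
    for stage in sequence:
        games_on[stage] += prob  # expected number of games on each stage

total_games = sum(games_on.values())
for stage, games in games_on.items():
    print(stage, "is played in", round(games / total_games, 3), "of all games")
```

The .25 figure for HH in the post corresponds here to the sets HH and HHD combined. The per-stage fractions printed at the end are the weights that would go into the weighted average of the stock ratios.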

As an example, consider this:
There are 2 stages, and on stage 1 the stock ratio is 6-4 and on stage 2 the stock ratio is 2-1. Stage 1 has a 70% chance of being played while stage 2 has a 30% chance.

First weight the stock ratios:
.7*6 + .3*2 = 4.8
.7*4 + .3*1 = 3.1

This means the average stock ratio is 4.8 to 3.1. Now scale it as a 1-10 ratio:
4.8*10/(4.8+3.1) = 6.08
3.1*10/(4.8+3.1) = 3.92

Round it and you get that this is a 6-4 matchup.
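
The same worked example as a short script, in case anyone wants to plug in other numbers. This is just a sketch of the arithmetic above; `composite_ratio` and the stage labels are made up for illustration.

```python
def composite_ratio(stage_ratios, stage_weights):
    """Weight per-stage stock ratios by how often each stage is played."""
    total_weight = sum(stage_weights.values())
    a = sum(stage_weights[s] * stage_ratios[s][0] for s in stage_ratios) / total_weight
    b = sum(stage_weights[s] * stage_ratios[s][1] for s in stage_ratios) / total_weight
    return a, b


ratios = {"stage 1": (6, 4), "stage 2": (2, 1)}
weights = {"stage 1": 0.7, "stage 2": 0.3}

a, b = composite_ratio(ratios, weights)
scale = 10 / (a + b)  # same 1-10 scaling formula as before
print(f"{a:.1f} to {b:.1f} -> {a * scale:.2f}-{b * scale:.2f}")  # 4.8 to 3.1 -> 6.08-3.92
```

One thing to watch: the result depends on how each stage's ratio is written. 2-1 and 4-2 describe the same matchup but would pull the average differently, so it may be worth scaling every stage's ratio to the same total before weighting.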

There are obviously issues with this approach, which are mainly due to preferences that players have for certain stages. For example, Peach in Melee has many viable counterpicking options, and it depends on the player's choice as well as what the opponent strikes. This makes it nearly impossible to calculate stage probabilities absolutely, but you can still approximate it (and also approximate certain stages as equal).

Also, this problem explodes enormously the more factors you consider. Here I only considered 2 stages and it is already very complicated. It's a start at quantifying counterpicks though.
----------------------------------------------------------------------------------------------

Start reading again if you skipped the counterpicks part.

One thing that this mathematical model makes possible is using actual data to determine matchup ratios. If you want to figure out Peach vs. Jiggly, take a look at matches between Mango, Armada, PC (when he went Jiggly vs. Mango), etc. Count total stocks that Jiggly lost in these sets and total stocks Peach lost in these sets. Then use the formula up above and get a ratio. If you are ambitious, you can break it up by stage and also compute the fractions of stage usage and use the formulas up above.
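
A minimal sketch of what the counting could look like. The records below are HYPOTHETICAL placeholder numbers, not real tournament results, and the stage labels and function names are made up for illustration.

```python
from collections import defaultdict

# HYPOTHETICAL data for illustration only:
# (stage, stocks Peach took, stocks Jiggly took) for each game.
games = [
    ("stage_1", 4, 3),
    ("stage_1", 2, 4),
    ("stage_2", 4, 2),
]


def ratio_from_games(records):
    """Add up the stocks each side took and scale the totals to a 1-10 ratio."""
    taken_a = sum(a for _, a, _ in records)
    taken_b = sum(b for _, _, b in records)
    total = taken_a + taken_b
    if total == 0:
        raise ValueError("no stocks recorded")
    return 10 * taken_a / total, 10 * taken_b / total


def ratio_by_stage(records):
    """Same calculation, broken up by stage."""
    by_stage = defaultdict(list)
    for record in records:
        by_stage[record[0]].append(record)
    return {stage: ratio_from_games(rs) for stage, rs in by_stage.items()}


overall = ratio_from_games(games)
print(f"overall: {overall[0]:.2f}-{overall[1]:.2f}")
for stage, (a, b) in ratio_by_stage(games).items():
    print(f"{stage}: {a:.2f}-{b:.2f}")
```

The per-stage results, together with how often each stage shows up in the data, are the inputs for the weighted average in the counterpick section.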

The biggest question in terms of using data is how do you know what valid data is? Like, I'm sure m2k would beat nearly anyone on any stage regardless of characters. Also, how old can data be while still being valid data? Yes, this is a serious issue to work out but that is not the focus of what I'm presenting here.

There are obviously other issues with this approach, but I believe it offers something much more concrete than has ever been used in the past. It allows for actual data input into equations to spit out a matchup ratio, rather than trying to arbitrarily weigh pros and cons of a matchup. Even without inputting data, it is much easier to conceptualize a character averaging 7 stocks to an opponent's 3 than to say a character can win 2 out of 10 matches.

HawaiianJigglyPuff · Oct 2, 2009

Nintendude1189 said:
6*10/(6+5) = 5.45
5*10/(6+5) = 4.55

I got lost how you got from here...

Nintendude1189 said:
This roughly translates to a 55-54 matchup, or 5-5 if you round to whole numbers.

...to here

so i didn't read the rest. Is this a mistake? If someone has a slight advantage, why would we round it to 5-5?

Nintendude · Oct 2, 2009

Well, first of all, I made a mistake. That was supposed to be 55-45, but 55-45 is just 5.45-4.55 multiplied by 10, and 5.45-4.55 rounds to 5-5.

HawaiianJigglyPuff · Oct 2, 2009

This seems so much more complicated, but is definitely going to be a better system. I'm just gonna actually have to think from now on. O_O

Fly_Amanita · Oct 2, 2009

I like the idea, but it would be really difficult to implement. We could probably cut the sizes of the sample spaces down a lot by assuming that neither player would make idiotic counterpicks (e.g., Bowser wouldn't take Sheik to FD), but they'd still be large and we'd need ratios for each match-up on each stage where we could expect to see that match-up take place. Plus, we'd need to know how often high level players will ban or counterpick so-and-so levels.

Actually, we could probably get decently accurate results if we assume that character A will always ban stage C against character B and B will then always counterpick stage D or something, but we'd need to at least come to some agreement on what the best and worst stages are for every match-up.

tubes · Oct 2, 2009

One thing I notice is that comparing rounds in SF to stocks in Smash doesn't really work. In SF both characters have full health at the start of each round. In Smash you keep your damage.

otg · Oct 2, 2009

We should just do a matchup chart like they have it in the 64 boards. Life would be easier and would be more general.

Strong Badam · Oct 2, 2009

here's an idea...
how about we not give a **** about match-up charts and just play

Comrade · Oct 2, 2009

The math is a little flawed in that it doesn't really represent much, but I understand what you're trying to do.

I think... So you're turning everything into a 1-10 ratio? So it's A/(A+B) = X/10, with A being character one and B being character two? Would it not just be easier to make it A/B and leave it as a simple fraction?

Either way, this is much more efficient and logical than the Street Fighter way. I fully support this!

pockyD · Oct 2, 2009

this is why whenever people mention "matchup ratio", my brain shuts off

your way makes a lot more sense than the current magical way, but i'd still contend you'd have a hard time finding any matchup that's worse than 2:1, and therefore you're going to have precision issues as all matchups end up looking the same unless you go to 2-3 decimal points, which then implies accuracy that isn't really present

also it's important to qualify the other factors, in particular the stage(s) involved when analyzing a matchup

Comrade · Oct 2, 2009

pockyD said:
this is why whenever people mention "matchup ratio", my brain shuts off

You should see a doctor, that's serious!

INSANE CARZY GUY · Oct 2, 2009

I understand math; in fact I can do it inhumanly well sometimes, and it looks like I'm freaking out. I like your idea but there are a FEW flaws, like how do we say Mango = Armada? Same for anyone else. Sometimes the better person barely loses and it's a REALLY uphill battle. Plus there's the fact of mindgames and SDs in many matches; in normal matches Jigglypuffs don't trick people into rolling into Rest.

I like the stage idea but most people will walk away just by seeing the DHD because they might be a little confused or a lot.

I think there needs to be a mix in your ideas and their ideas or a way to take out the mindgame/SD problem

Vulcan55 · Oct 2, 2009

Good players don't SD that often, and in the long run SDs don't matter very much, I'm sure.

HawaiianJigglyPuff · Oct 2, 2009

otg said:
We should just do a matchup chart like they have it in the 64 boards. Life would be easier and would be more general.

QFT

it's way simpler and just makes more sense

Stratford · Oct 2, 2009

Wanting matchup ratio to mean "character X takes x stocks for every y stocks character Y takes" does make more sense than "character X wins x matches for every y matches character Y wins."

However, plugging in data from real matches to get matchup ratios doesn't exactly work because you have no way to quantify player skill, other than "Mang0 too ****ing good."

LLDL · Oct 2, 2009

Strong_Bad said:
here's an idea...
how about we not give a **** about match-up charts and just play

lol exactly, just pick your favorite character, get good, and go to tourneys. I've never looked at the matchup chart, and have only glanced at the tier list, which I forgot already.

Stratford · Oct 2, 2009

I heard Fox isn't bad.

N64 · Oct 2, 2009

To Nintendude: In general I like the idea. PockyD's concerns are relatively valid, but I'm sure an agreeable balance can be found.

To SB/ChainAce: Yeah, but honestly I don't see this chart as being for us. At least, it's not going to help our play at all. Mostly it's for those new to Smash, or for others looking for a secondary, etc. Let's say I do badly against Ganons: I look at this chart and oh, character X does well against Ganons, so let me pick up that character. I still have to actually learn the character and matchup, yes. But it's better than somewhat blindly picking a character and playing the matchup enough to realize that they also do badly against Ganons.

Nintendude · Oct 3, 2009

PockyD definitely brings up some valid flaws with the system, and I'm totally up for ironing out some details to make it work better. I'm glad to see that people agree with the concept though, and at the very least I think trying to debate matchups as stock ratios is better than magically creating numbers like 7-3.

victra♥ · Oct 3, 2009

I gotta say, this is pretty bad ***.

It's very interesting, and it would be neat to see whether this chart differs in any major way from the current matchup chart.

Rappster · Oct 3, 2009

to determine stock ratios, may i suggest playing several 15 minute timed matches?

Smoke and smash · Oct 3, 2009

I want to see matchup chart ratios for every single stage. I think matchups need to be more in depth; instead of going in Smash 64's matchup chart direction, I'd like to get more specific and precise stats.

Fortress | Sveet · Oct 3, 2009

i love the theory behind this, it's the *right* way to do the chart, but how do we find the non-weighted match-up ratios on each stage? is that what we should be debating?

would it be better to rank each character individually on each stage in different categories (projectile effectiveness, combo effectiveness, range advantage, etc.) and then use that as a tool when determining character match-ups?

like DL64 for marth: platforms are really bad for him and the stage size reduces his range advantage, but the stage boundaries are great for his recovery and don't hurt his edge guarding.

when playing against fox on that stage, fox's speed advantage combined with the platforms and stage size gives him stage advantage in that match-up and only slightly decreases his killing potential. then we could debate about a stock ratio, say 7-5 fox or something. then do the same for every other stage, weigh them and do the math in the first post.

GawdImFoxy · Oct 3, 2009

Here's my point of view.

It makes sense, but it's like Fly said--there are basically infinite variables that could alter the numbers in one character's favor. I notice that sometimes I can't just pick up on someone's patterns, so it's hard for me to beat them, but someone else may pick up on them from the start while I can consistently beat that person.

There are FAR too many variables to consider, so I think it's best that we just stick to the system we have now--using tournament results, and concrete statistics. It's nothing like Street Fighter, where every stage is essentially the same, and every character only has so many options. In Melee, there are significantly more options, and significantly different stages.

Though I do respect the ambition, don't get me wrong. I think it takes a lot of motivation to decide something like this. Keep up the good work.

Fortress | Sveet · Oct 3, 2009

gawd, the current system is basically "magic". people just say a ratio and give a few reasons why. when someone gives an opinion that's not what everyone else says, they get ostracized and flamed. this system gives us a way to break the elements down and make a list with math instead. lots of work, but overall a much more realistic system.

Nintendude · Oct 3, 2009

GawdImFoxy said:
It makes sense, but it's like Fly said--there are basically infinite variables that could alter the numbers in one character's favor. I notice that sometimes I can't just pick up on someone's patterns, so it's hard for me to beat them, but someone else may pick up on them from the start while I can consistently beat that person.

There are FAR too many variables to consider, so I think it's best that we just stick to the system we have now--using tournament results, and concrete statistics. It's nothing like Street Fighter, where every stage is essentially the same, and every character only has so many options. In Melee, there are significantly more options, and significantly different stages.

I am well aware of all the variables that go into competitive play, which is why stock ratios are approximate averages. To be fair, unless you go and get data from matches, stock ratios are gonna be nearly as magical as the traditional ratios that people have been using. This is definitely at least an improvement though, as stock ratios make more sense than winning x out of 10 matches imo.

The main reason I came up with this is because the old interpretation of the 1-10 ratio is severely flawed. I actually do not think that this (or any 1-10 scale) is a practical approach to fully completing a matchup chart, and I honestly would rather see the neutral, advantage, large advantage system used in the SSB64 chart.

Rappster · Oct 4, 2009

Stratford said:
Wanting matchup ratio to mean "character X takes x stocks for every y stocks character Y takes" does make more sense than "character X wins x matches for every y matches character Y wins."

However, plugging in data from real matches to get matchup ratios doesn't exactly work because you have no way to quantify player skill, other than "Mang0 too ****ing good."

get two ppl who agree to be of equal skill, and have them play some marathon time matches.

Fortress | Sveet · Oct 5, 2009

Way too many factors. Even if they were exactly even at the start, one of them will learn faster than the other and take the lead, even if just slightly, and skew the results.

Rappster · Oct 5, 2009

which is why repetition would be crucial. if you have enough people playing like this, it would eventually even out

Nintendude · Oct 6, 2009

In terms of collecting data I think the best approach is just to take a really large sample among top players. Of course not all the players will be considered at the top of the metagame but if you choose carefully it would be pretty close and having a large sample helps balance out the other factors.

I'm actually really curious to see how this would turn out for Fox/Falco. There is ample data on every commonly used stage for that matchup.
