Empirical Matchups

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
UPDATED 5-27-2009 WITH THE COMPETITION (Midwest Circuit East #3)

These are matchup numbers based on results in the top 8 at tournaments with more than 25 entrants. We include more people for events with more than 50 people.



Yeah, not gonna worry about overall stats unless I get NSF funding or something. :p

TheKiest and I have started collecting data on how often a character beats another character in tournament matches in order to create an empirical matchup table. You probably notice two things about the charts above: the weird matchup numbers and the gigantic holes in the data. In order to fix these, WE NEED YOUR HELP. Our data currently come from only two central Ohio tournaments, so we have big blank spots because not every character gets played around here. Our data is also extremely skewed by a few very good players (e.g. Diddy's 100:0 matchup on Falco is mainly because ChamP loses to AlphaZealot). If you're willing to take note of who is beating whom, read on.

The basic idea is that if we record how often Marth beats Kirby, we can come up with a matchup ratio without resorting to judgment calls. There is one unavoidable judgment, however: which games we should include.

There are two ways to do this - one is to record every game played in a tournament, from top to bottom. On the plus side, this generates large amounts of data very quickly. Its major drawback, however, is that it necessarily includes a lot of incompetent players. While mitigated by the simple fact that better players will play more matches in a tournament, this flaw means that rather than revealing matchups based on near-optimal play, data collected by this method will tell us more about the community which plays a particular character and their overall ability. In practice, it's also a bit difficult to record every single match; Kiest and I have managed to get every match in pools recorded at a few ~30 person tournaments, but unless the TO governs with an iron fist, it's pretty tough to keep track of when matches finish. I tried doing this and it is not worth the trouble.

The other way to look at it is Ankoku's method, which only includes games played for 8th place and up. I think it's safe to say that people who make it to the Winner's Semifinals are reasonably competent, so data taken from this group of players will be closer to whatever figure would be generated by optimal play. The data is also easier to collect, since all you have to do is track down the top 8 players after the bracket and ask them how the sets that knocked them out went. The major drawback here, though, is that instead of ~200 data points for a 32 man bracket, you're looking at ~50 data points. This may seem like a lot until you realize that there are a little more than 1,300 matchups, due to the size of the Brawl roster. Restricting the sample to people who place highly in tournaments will also tend to exclude lower tier characters from consideration, since there simply won't be enough games with them to collect data.
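For anyone who wants to see where a figure like "a little more than 1,300" can come from, here is a rough sketch. The roster size of 37 is my assumption (the post does not state how many rows the chart has), and chart_cells is just a made-up helper name.

```python
def chart_cells(roster_size, include_mirrors=False):
    """Count the cells in a square matchup chart.

    Every ordered pair (row character, column character) gets a cell.
    Mirror matches (a character against itself) are optional.
    """
    if include_mirrors:
        return roster_size * roster_size
    return roster_size * (roster_size - 1)


ROSTER_SIZE = 37  # assumption, not a number taken from the thread

print(chart_cells(ROSTER_SIZE))        # cells without mirror matches
print(chart_cells(ROSTER_SIZE, True))  # cells with mirror matches
```

Either way, ~50 games from one top 8 spread over that many cells leaves most of the chart empty, which is the point of the paragraph above.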

Bottom line is that if tournament organizers record which character won which matchups when updating their brackets, they can get the information without making a huge time investment. Collecting only 8th place and up is even easier. It is slightly more time consuming, though not by much, to keep two sets of records (one overall, and one for 8th+). So, TOs, if you're willing to mark 2-6 tallies per set, we can collect a lot of good data. I've made some tables which TOs can use to record their data - you can get them as a spreadsheet from Google Docs, or I've made 200 dpi PNGs which you can print instead. Kiest is graciously hosting some on the OUGA website, and I also have them in .xls and .pdf; if you'd like them in one of these formats, you can e-mail my gmail account, han.138

Once you've collected some data from a tournament, post your results here or PM or e-mail me. You can also catch me on AIM (ahab117).

Google Docs
http://spreadsheets.google.com/pub?key=rRRq5qX9VjqXIifNTJmmZPg

Color PNGs
http://i105.photobucket.com/albums/m225/Schwaumlaut/Color-TO-sheets1.png
http://i105.photobucket.com/albums/m225/Schwaumlaut/Color-TO-sheets2.png
http://i105.photobucket.com/albums/m225/Schwaumlaut/Color-TO-sheets3.png
http://i105.photobucket.com/albums/m225/Schwaumlaut/Color-TO-sheets4.png

Grayscale PNGs
http://i105.photobucket.com/albums/m225/Schwaumlaut/Grayscale-TO-sheets1.png
http://i105.photobucket.com/albums/m225/Schwaumlaut/Grayscale-TO-sheets2.png
http://i105.photobucket.com/albums/m225/Schwaumlaut/Grayscale-TO-sheets3.png
http://i105.photobucket.com/albums/m225/Schwaumlaut/Grayscale-TO-sheets4.png

.xls spreadsheets
http://www.originalupholdersofthegamingarts.com/downloads/TOsheets.xls

Color PDF
http://www.originalupholdersofthegamingarts.com/downloads/ColorTOSheets.pdf

Grayscale PDF
http://www.originalupholdersofthegamingarts.com/downloads/GrayscaleTOsheets.pdf

Again, for this to work, we'll need as much data as we can get. Right now, we're only looking at data from the midwest, so the help we can get from the wider smash community will be crucial. Kiest said he'd put some of the TO files up on the OUGA website, as well, so keep an eye out for that. Thanks for your interest!
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
If you want to help out, there are a few other things you should be aware of: we're trying to minimize the number of judgment calls human beings make so that we can minimize biases. It's not possible to completely eliminate bias from any experimental design, but this leads me to another important point: while it is possible to correct for biased results with statistical methods, they only work if data has been collected rigorously and consistently. This means that there are a few things that all info you send in should include:

First, the number of matches you should look at. Since larger tournaments draw more skilled players, we'll look at them more closely. We're interested in players who place above a certain threshold in the bracket.

Top 8 out of 25~49 (7th+)
Top 16 out of 50~99 (A four way tie for 13th place means 16 people placing 13th and up, right?)
Top 24 out of 100~149 (17th+)
Top 32 out of 150+ (25th+)
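If you would rather not look the thresholds up by hand, the table above can be written as a small lookup. This is only a sketch of the rule as listed; the function name placement_cutoff is made up for illustration.

```python
def placement_cutoff(entrants):
    """Return how many top placers to report for a tournament.

    Returns None for tournaments under 25 entrants, which are not counted.
    """
    if entrants < 25:
        return None
    if entrants < 50:
        return 8
    if entrants < 100:
        return 16
    if entrants < 150:
        return 24
    return 32


for size in (13, 32, 64, 120, 200):
    print(size, placement_cutoff(size))
```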

This means that we don't care about tournaments smaller than 25 people. Sorry. Conversely, this means that if you're sending in data from a tournament, DON'T CHERRY PICK. SEND IN EVERYONE WHO PLACED HIGHLY ENOUGH.

Once you've figured out who you should pay attention to, the simplest method for finding all their sets without double-counting is to ask each person about the set which knocked them out of the tournament. The info you're looking for is which character won each game of the set, and which character lost each game of the set. Next, you'll want to report those results. Here's a possible format for that (let's say Mario beats Luigi twice in Grand Finals, the Luigi player wins one then switches to Peach and wins again, then the Mario player switches to Meta Knight and wins the tournament):

Grand Finals
Mario beat Luigi 2-1
Peach beat Mario 1-0
MK beat Peach 1-0

People sometimes include the specific players, which is cool, but not necessary.
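For what it's worth, reports in the format above are easy to turn into per-character game counts. The following is a rough sketch, not the spreadsheet actually used for the charts; it assumes every result line looks exactly like "X beat Y W-L" and skips anything else (such as the "Grand Finals" heading). The names LINE and tally are made up.

```python
import re
from collections import Counter

# Matches lines such as "Mario beat Luigi 2-1".
LINE = re.compile(r"^(?P<winner>.+?) beat (?P<loser>.+?) (?P<w>\d+)-(?P<l>\d+)$")


def tally(report_lines):
    """Return a Counter mapping (character, opponent) to games won."""
    wins = Counter()
    for line in report_lines:
        m = LINE.match(line.strip())
        if m is None:
            continue  # not a result line, e.g. a round heading
        a, b = m["winner"], m["loser"]
        wins[(a, b)] += int(m["w"])  # games the set winner took
        wins[(b, a)] += int(m["l"])  # games the set loser took
    return wins


report = [
    "Grand Finals",
    "Mario beat Luigi 2-1",
    "Peach beat Mario 1-0",
    "MK beat Peach 1-0",
]
for pair, games in sorted(tally(report).items()):
    print(pair, games)
```

Counting both sides of the score matters because the charts are per game, so the single game Luigi took off Mario still counts for Luigi.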

Anyway, this should provide a simple framework for investigating. Thanks!
 

ndayday

stuck on a whole different plaaaanet
BRoomer
Joined
Jun 12, 2008
Messages
19,614
Location
MI
This is awesome, really.

Now you just need to get honest and plentiful reports from tournys. Good luck, I like this idea.
 

TheKiest

Smash Champion
Joined
Mar 10, 2008
Messages
2,531
Location
Worthington, Ohio
Always glad to help Schwaa!

I would re-edit your first post to change the size of the words: Top 8 and OVERALL
This will make it easier to sort through.
 

Col. Stauffenberg

Smash Lord
Joined
Jun 14, 2008
Messages
1,989
Location
San Diego <3
Not a bad idea, but... you can't take the top 8 from a regional tourney at the same value as the top 8 from some 13-person gamestop deal or somethin. >>
 

B!squick

Smash Master
Joined
Jan 4, 2009
Messages
4,629
Location
The Sunny South
He's talking about data from a regional top 8 =/= a top 8 from a smaller tourny, such as a GameStop one (they do have those).

Anyway, I like this. Let's hope TOs start keeping better track of everything. Only top 8's would be lame for characters like, say, mine, lol.

Also, you might want to format everything to make it pretty and easier to read. I can help with this if you like. :)
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
Not a bad idea, but... you can't take the top 8 from a regional tourney at the same value as the top 8 from some 13-person gamestop deal or somethin. >>
This is a pretty good point. I think we'll only look at 25+ tournaments.

Top 8 out of 25~49
Top 16 out of 50~99 (A four way tie for 13th place means 16 people placing 13th and up, right?)
Top 24 out of 100~149 (17th+)
Top 32 out of 150+ (25th+)

It'll be pretty tough to keep track of it for a 150+ person tournament, even if you only look at high placing players, but it should be doable.
 

Scott!

Smash Lord
Joined
Apr 25, 2008
Messages
1,575
Location
The Forest Temple
I kinda wish someone started this a while ago, because there's tons of possible data that could have been. It's a fine idea, though.
 

TheKiest

Smash Champion
Joined
Mar 10, 2008
Messages
2,531
Location
Worthington, Ohio
I think it's better that we waited a year after the game came out, so that people could discover Brawl's metagame.

Awesome Avatar btw Scott.
 

Scott!

Smash Lord
Joined
Apr 25, 2008
Messages
1,575
Location
The Forest Temple
I think it's better that we waited a year after the game came out, so that people could discover Brawl's metagame.

Awesome Avatar btw Scott.
That's true. Matches from a year ago probably wouldn't reflect the metagame very well. Perhaps, and this is probably far too ambitious, the stats could cover a rolling window of time, like the last year or so, so that the old metagame gets phased out regularly. Not sure how this could be implemented, but if new characters rise up, they might have trouble fitting where they belong because old matches are still filling the stats.
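One possible way to do the rolling window, sketched very loosely: keep a date with every recorded game and drop anything older than the window before counting. The record layout (a dict with "date", "winner", and "loser" keys) and the name recent_results are assumptions for illustration, not how the data is actually stored.

```python
from datetime import date, timedelta


def recent_results(results, today, window_days=365):
    """Keep only the games played within the last window_days days."""
    cutoff = today - timedelta(days=window_days)
    return [r for r in results if r["date"] >= cutoff]


# Hypothetical records, just to show the filtering.
games = [
    {"date": date(2008, 4, 1), "winner": "Mario", "loser": "Luigi"},
    {"date": date(2009, 3, 1), "winner": "Peach", "loser": "Mario"},
]
for game in recent_results(games, today=date(2009, 5, 27)):
    print(game["date"], game["winner"], "beat", game["loser"])
```

The catch is that a shorter window also means fewer games per matchup, so the holes in the chart would get bigger.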

Also, thanks! :D
 

Zhamy

Smash Champion
Joined
Apr 22, 2008
Messages
2,088
Location
NorCal
Only top 8's would be lame for characters like, say, mine, lol.
This is the other thing to keep in mind.

If you're only taking Top 8, some characters never get to that point in tournaments, meaning you'll have very skewed data (if any) for low (and even mid) tier characters.

Also, is this being calculated per game or per set?
 
Joined
Jan 11, 2006
Messages
1,427
Location
Ohio / Michigan
How much data do you have as of right now?

I will say as of right now it does look a bit complicated and messy. You guys (or us) need to fix that so it's easy to read and understand for others.

Just trying to help mainly.
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
We only have W&B II and HOMB right now, plus some NM tourney (I think it's from El Paso?). The NM one isn't in the posted data yet.

If anyone wants to play graphic designer, that's cool, but while I do want to put this together, I'm not super interested in copying my spreadsheet's output cell by cell into an image. Screenshotting and copying that four times is enough of a PITA.
 

B!squick

Smash Master
Joined
Jan 4, 2009
Messages
4,629
Location
The Sunny South
My spreadsheet software allows me to save docs in html format. :D

What kind of look do you want or do you really care?
 

adumbrodeus

Smash Legend
Joined
Aug 21, 2007
Messages
11,321
Location
Tri-state area
Interesting, just understand that this is more of a "what happened" chart, because unfortunately differences in player skill inform it too much for it to actually be an accurate match-up chart. In other words, the problem of induction keeps it from being truly accurate, and thus we need deduction.


Add that to the fact that the top of the metagame is only reflected in a very select few players (only people like M2K could lay claim to even the possibility), if at all, and we've got a problem with actually calling match-ups true ratios based on this.


Still, knowing what generally occurs can be very helpful, but maybe moreso in telling players of certain characters what they need to concentrate on to win.
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
Interesting, just understand that this is more of a "what happened" chart, because unfortunately differences in player skill inform it too much for it to actually be an accurate match-up chart. In other words, the problem of induction keeps it from being truly accurate, and thus we need deduction.


Add that to the fact that the top of the metagame is only reflected in a very select few players (only people like M2K could lay claim to even the possibility), if at all, and we've got a problem with actually calling match-ups true ratios based on this.


Still, knowing what generally occurs can be very helpful, but maybe moreso in telling players of certain characters what they need to concentrate on to win.
Definitely true. I don't think anyone is mistaking what these charts mean yet, but it's worth noting this.

I'm personally interested in whether or not real life reflects the theorycraft matchups already circulating on each character board, and if not (as I suspect), why that is.
 

adumbrodeus

Smash Legend
Joined
Aug 21, 2007
Messages
11,321
Location
Tri-state area
Definitely true. I don't think anyone is mistaking what these charts mean yet, but it's worth noting this.

I'm personally interested in whether or not real life reflects the theorycraft matchups already circulating on each character board, and if not (as I suspect), why that is.
I expect not, and that is for two reasons.

1. The theoretical model that most match-up discussions use is fundamentally flawed because it doesn't incorporate mindgame potential. I doubt that's the only thing, but I am unsure as of yet what the other issues with the match-up model are.

2. Uneven distribution of high level players in the metagame.


Unless those issues are addressed (good luck in the second one) empirical data will never match the true match-ups.
 

Phantomwake

Smash Journeyman
Joined
Mar 22, 2008
Messages
227
Location
Boston
You should try and get a hold of some of Ankoku's data to see if it is helpful because this seems like a very interesting chart
 

TK Wolf

Smash Ace
Joined
Sep 1, 2007
Messages
792
Location
Bellevue, WA
Before I can get started on a better chart, what the heck do the colors and numbers represent? Key please.
1 - 0.61 = green
0.6 - 0.4 = yellow
0.39 - 0 = red/orange

The numbers are the percentage of the time that the character on the left has beaten the character above. Or maybe it's the other way around....
Also, I think you'd need the total wins/losses to properly maintain the chart....
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
You only need total wins, because total number of matches is Character A wins + Character B wins. We don't allow draws by SBR rules, which is convenient for this.

If A is on the left and B is on the top, the formula for each number is (Character A wins)/(Character A wins + Character B wins). I wrote a spreadsheet which does the number crunching for me and spits out the colors; I would have had more colors, but Excel only allows 3 conditional formatting rules.
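Outside of Excel, the same number crunching is only a few lines. This is a sketch, not the actual spreadsheet: the color bands are taken from the key posted earlier in the thread (green from 0.61 up, yellow from 0.4 to 0.6, red/orange below that), and applying them to the ratio rounded to two decimals is my assumption. The function names are made up.

```python
def matchup_ratio(a_wins, b_wins):
    """Share of games won by A (left) against B (top), or None if no games."""
    total = a_wins + b_wins  # no draws, so this is also the sample size
    if total == 0:
        return None
    return a_wins / total


def chart_color(ratio):
    """Map a ratio to one of the three color bands from the key."""
    if ratio is None:
        return "blank"
    shown = round(ratio, 2)  # the chart displays two decimals
    if shown >= 0.61:
        return "green"
    if shown >= 0.40:
        return "yellow"
    return "red/orange"


for a_wins, b_wins in [(5, 2), (1, 1), (0, 3), (0, 0)]:
    ratio = matchup_ratio(a_wins, b_wins)
    print(a_wins, b_wins, ratio, chart_color(ratio))
```

Returning None for an empty matchup keeps "no data" separate from a real 0, which matters with this many blank cells.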

Read this chart pretty much identically to the other chart in Tactical made from board-stated matchups.
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
Okay, updated. I made some changes to the presentation, so please tell me if you think it's an improvement or not.
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
Hmmm, I think I'm going to toss a counter into the next revision, so everyone can see what our sample size for each match is.

Awwwww, yeah, my new spreadsheet is awesome.

Also, PLEASE SEND ME MORE DATA. I'm going to start bugging people in Tourney Results, but it's way more efficient for you to send me info than for me to track you down.
 

UltiMario

Out of Obscurity
Joined
Sep 23, 2007
Messages
10,438
Location
Maryland
NNID
UltiMario
3DS FC
1719-3180-2455
Great idea, but the chart DOES feel a bit big.
That's just me though.
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
I think I could help with this. There are tonsss of YouTube tourney matches, so I'll post the results of as many of those as possible.
This is a great idea. I'd use the Tourney Results forum as a good place to start looking; it might also be a good idea to only look at stuff in the last month or so.

As well as a 32 person one with all of Colorado and some of NM (my crew) who showed up.

Top 5 were the only players/matches worth mentioning, imo. Others were poorly represented characters/lots of suicides/Colorado is kind of not that good at Brawl :X

1 Diddy Kong (me) NM
2 MK (Fluxus) CO
3 ICs/Wario (Goyf) NM
4 Ness/Wario (Timotee) CO
5 DK/Kirby (GoldenGlove) NM



Diddy > MK 5-2 (once in winner's bracket, then again in GFs)
Diddy > ICs 3-1 (winner's finals)

MK > DK 3-0 (DK got knocked into loser's, then lost one more round and started using kirby against MK)
MK = Kirby 1-1 (2nd and 3rd matches of the loser's semis set)
MK > ICs 2-0 (loser's finals, Goyf then switched to Wario)
MK > Wario 1-0 (3rd match)
MK > Ness 2-0 (winner's bracket, at some point)

ICs > Ness 1-0 (Tim then switched to Wario)
ICs > Wario 1-0
Here's an example report that I was sent - it contains pretty much everything I'm looking for, except that it omits total number of entrants and doesn't have the results for one 5th place and both 7th place finishes. I understand the desire to avoid including unskilled players, but we need to stick to a consistent methodology if we want to be able to interpret results later. Consistent biases can be corrected for with a variety of statistical techniques, but without consistent methodology, all bets are off.

Bottom line is that we're trying to minimize human judgment calls and opinion, so please don't omit stuff unless the protocol calls for it. I'll update the second post with a detailed protocol.

Okay, the second post has a simple "how to gather data" guide thing.
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
Great idea, but the chart DOES feel a bit big.
That's just me though.
Yeah, unfortunately the chart has to have space for a little more than 1,300 possible combinations. It could be a lot smaller if we, say, ignored low tiers, but I'm trying to err on the side of inclusion.
 

TK Wolf

Smash Ace
Joined
Sep 1, 2007
Messages
792
Location
Bellevue, WA
Yeah, unfortunately the chart has to have space for a little more than 1,300 possible combinations. It could be a lot smaller if we, say, ignored low tiers, but I'm trying to err on the side of inclusion.
If you rename the characters on the top/bottom rows using 2-3 letters and keep the percents down to double-digits, then squeeze the rows together, you'll save A LOT of visual space. :)
BW CF DDK DK MK D3 IC ZSS Fox GW, etc

Edit: You could also use an icon.
 

Schwaumlaut

Smash Apprentice
Joined
Jan 8, 2009
Messages
135
Actually, in the next update you'll also be able to see our sample size for each matchup. I think this is pretty important information, but it takes up space, too.
 

B!squick

Smash Master
Joined
Jan 4, 2009
Messages
4,629
Location
The Sunny South
Okay. To get your chart on the interwebs you screen shotted it I think you said? I made one myself and I can't for the life of me figure out how to get it on the net. I figured I could just make a free web page and post there or something and I even found an article that explained how to upload html files on GeoCities. But it costs money to make a GeoCities site, so I'm fresh out of ideas. D:

Oh yeah, your chart seems to be getting better, good work. :)
 