Elo Ratings in Competitive Melee

hectohertz

Smash Ace
Joined
Aug 20, 2006
Messages
800
Location
Brooklyn, NY
this is a terrific idea, this is made of win
ranking gives newer players something to gain from playing tourneys, allows them to feel progress as they improve, lets players have more even matches, etc etc


if you need another programmer, holla
 

Lukahn

Smash Apprentice
Joined
May 5, 2010
Messages
148
Location
Dijon, France
What I think is good, for noobs like me, is that this rating is really dynamic. That's what makes it motivating: we can really measure our progress through time.
 

LLDL

Smash Hero
Joined
Apr 27, 2007
Messages
7,128
Wow, that program is impressive. Still, who is going to keep track of the main file that will be used? Values will be different for everyone that uses them, no?
 

Dimitris

Smash Ace
Joined
Apr 13, 2010
Messages
571
Try to keep in mind guys that it's a "rating" not a "ranking." Chess players can get very irritated over this lol.
But can you explain why you can't rank people with these "ratings"? If it's such an accurate system of measuring someone's level, why can't you just order people according to their points...
I've been trying to get my head around this, but I really can't understand the clear difference between "rating" and "ranking".

Also if you can't compare your rating with other people's rating, then what the **** is the point of it all?
x: "Yo, noob, I'm 2700!!" noob: "I'm not that bad I'm.." x: "Shut it noob, it's not a ranking." ? Sound like fun.
 

Madtsunami

Smash Cadet
Joined
Feb 17, 2010
Messages
33
"Often people who are not familiar with the nature and limitations of statistical methods tend to expect too much of the rating system. Ratings provide merely a comparison of performances, no more and no less. The measurement of the performance of an individual is always made relative to the performance of his competitors and both the performance of the player and of his opponents are subject to much the same random fluctuations. The measurement of the rating of an individual might well be compared with the measurement of the position of a cork bobbing up and down on the surface of agitated water with a yard stick tied to a rope and which is swaying in the wind. -- Dr. Arpad Elo, Chess Life (1962)."

tl;dr it's not a ranking of actual skill just a rate of results
 

Dimitris

Smash Ace
Joined
Apr 13, 2010
Messages
571
Personally, I don't need a 4 digit number and algebra to see if I'm doing better than before or not.
Sorry for the negativity though. I'll just leave this. Cool to see people making an effort.
 

Divinokage

Smash Legend
Joined
Aug 6, 2006
Messages
16,250
Location
Montreal, Quebec
It's a clear MEASUREMENT, it's to say how you actually compare to everyone else. A system like this lowers bias and it doesn't leave you open to say like oh maybe I'm worse or better than I think I am. A lot of people THINK they are doing better, but it's just BS confidence that you make yourself believe.
 

0Room

Smash Lord
Joined
Aug 21, 2008
Messages
1,953
Location
Boone, NC
I appreciate you doing that Cobalt, computer programming and I do NOT get along :/
Looking forward to using this, I'll let you know what happens.
 

Cobalt

Smash Journeyman
Joined
Aug 22, 2007
Messages
448
Location
Pittsburgh, PA
Ok I found a bug.

I am going through the Springfield tournament monthlies, starting with January. I added every singles entrant from January, February, and March. Then went through and started putting results in. Names that I had already entered have been disappearing from the list. Like, I put that LinkMO beat Jake. Then did the rest of the matches for round one. Then when I went to put that Jake lost again in R2, Jake's name had disappeared off the list.
Thanks, I'll look into this.

First, a couple things to try. When you sort the list by name, is everything in the correct order? Just some names disappear? Similarly, sort by Elo and see what happens. Can you add Jake again and have his name appear, or does the program refuse to add him again?

If I haven't posted a solution by the time you see this, send me in a PM as much of the following as you can:

1) A screenshot of the program before and after you notice the issue
2) The results or Tio files you're inputting data from
3) The contents of players.elo. Just open it in Notepad, select everything, and copy/paste it.

Oh, and for anyone putting in results manually, for accurate bracket ratings, put in the entire winner's bracket first, in order. Then the entire loser's bracket, in order. Followed by the first set of grand finals, then the second set of grand finals. This is the order in which matches for all players are played, and since ratings change after each match, the matches should be ordered correctly to ensure accurate results.
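To illustrate why the entry order matters, here is a minimal sketch of a standard Elo update applied one match at a time. The K value of 32, the 400-point scale, and the function names are assumptions for illustration, not necessarily what the program uses internally.

```python
# Minimal sketch of standard Elo updating applied match by match.
# K = 32 and the 400-point scale are assumed values, not the program's settings.

def expected_score(rating_a, rating_b):
    """Expected score (between 0 and 1) for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def apply_matches(ratings, matches, k=32):
    """Update ratings in place, processing (winner, loser) pairs in order."""
    for winner, loser in matches:
        exp_win = expected_score(ratings[winner], ratings[loser])
        delta = k * (1.0 - exp_win)  # points moved from loser to winner
        ratings[winner] += delta
        ratings[loser] -= delta
    return ratings


# Hypothetical three-player example; everyone starts at 1000.
matches = [("A", "B"), ("B", "C"), ("A", "C")]
in_order = apply_matches({"A": 1000.0, "B": 1000.0, "C": 1000.0}, matches)
reversed_order = apply_matches(
    {"A": 1000.0, "B": 1000.0, "C": 1000.0}, matches[::-1]
)
print(in_order)
print(reversed_order)
```

The same three results give slightly different final ratings depending on the order they are processed in, because each update uses the ratings left behind by the previous match.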
 

Bones0

Smash Legend
Joined
Aug 31, 2005
Messages
11,153
Location
Jarrettsville, MD
Tbh, it doesn't seem worth it to ever have the option to enter matches individually. When would it be used, for like MMs or just two players wanting to do a ranking match? I'm not sure that would even be fair because players could potentially boost their rating by playing MMs/pride matches against weaker players.
 

Cobalt

Smash Journeyman
Joined
Aug 22, 2007
Messages
448
Location
Pittsburgh, PA
It's entirely possible that after some tournaments, the Tio file used for the tournament might be lost, but a bracket image had been posted in the results thread or something. And with both names and handles necessary to match players individually (since there are players with the same name or handle), without the Tio file, creating a new Tio file representing the same players would be very hard. Similarly, some tournaments might not use Tio, instead favoring other tournament software. The option for MMs to count for rating is entirely up to whoever ends up running the actual project. I know some state power rankings at least used to count MMs for a portion of what tournament matches counted, so the idea is similar.

The single match option is there since it has its niche uses that are necessary to the success of the system in cases where Tio files are unavailable. (And also because it was the first thing I implemented even before the GUI, and adding a GUI to it was simple so why not lol)
 

Nintendude

Smash Hero
Joined
Feb 23, 2006
Messages
5,024
Location
San Francisco
Wow, that program is impressive. Still, who is going to keep track of the main file that will be used? Values will be different for everyone that uses them, no?
Perhaps in the future there will be a way to use the program as a web applet. That's really the ideal way to do it imo, because then you can move towards implementing some of the features FIDE has, and everything can be centralized. First the program itself needs to be debugged though.
 

Cobalt

Smash Journeyman
Joined
Aug 22, 2007
Messages
448
Location
Pittsburgh, PA
Yeah, a web applet would be ideal but that's nowhere near my area of expertise. Once I've got EloCalc (or whatever I end up calling it) debugged to the point where it can be used (by, say, having just one person on the committee do the actual updating and then sending out the updated players.elo file), then if it takes off enough to warrant a web app, I can give it to someone else who can/wants to make one out of it.
 

GreatRit

Smash Cadet
Joined
May 31, 2010
Messages
56
I have a 1961 ELO blitz rating at chess.com...lol
My ELO here is probably like 0.
 

Zoler

Smash Ace
Joined
Aug 30, 2009
Messages
991
Location
Sweden
ok, so I made a tio file with just a bracket, no pools, but it's not working. Did I miss something?

Nothing happens when I browse the tio file and press update.
 

CloneHat

Smash Champion
Joined
Jan 18, 2009
Messages
2,131
Location
Montreal, Quebec
It's a rating in the sense that you don't look at a player and see that they're 54th in the world (we most likely won't have every player in the world), but you see him/her as having a skill level in the 2200's.
 

Cobalt

Smash Journeyman
Joined
Aug 22, 2007
Messages
448
Location
Pittsburgh, PA
ok, so I made a tio file with just a bracket, no pools, but it's not working. Did I miss something?

Nothing happens when I browse the tio file and press update.
Okay I've replicated this one. Working on a fix now. Try a tournament without byes, I think something might have gotten screwed up with how they're handled.

edit: Okay, that was an easy fix. Yeah, things were screwing up if player 1 of a match in Tio had a bye. I'll post a link to the updated version in the release post. Try it and see if it works.
 

Cobalt

Smash Journeyman
Joined
Aug 22, 2007
Messages
448
Location
Pittsburgh, PA
No, the Elo system doesn't impose a cap. However players' ratings can approach a limit based on how many players there are, since one player can only get so far ahead of anyone else before he stops gaining points from people.
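A rough sketch of why that happens, assuming the standard Elo expected-score formula with a 400-point scale and K = 32 (both assumed values): the points gained for a win shrink toward zero as the gap to the opponent grows.

```python
# Sketch: points gained for a win as the rating gap grows.
# Standard Elo formula; K = 32 and the 400-point scale are assumptions.

def expected_score(rating_a, rating_b):
    """Expected score (between 0 and 1) for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def gain_for_win(rating_a, rating_b, k=32):
    """Points player A gains by beating player B."""
    return k * (1.0 - expected_score(rating_a, rating_b))


# The favorite sits 0, 200, 400, and 800 points above a 1000-rated opponent.
for gap in (0, 200, 400, 800):
    print(gap, round(gain_for_win(1000 + gap, 1000), 2))
```

There is no hard cap, but once the gap is several hundred points each win is worth a fraction of a point while a single loss costs almost the full K.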
 

AlphaZealot

Former Smashboards Owner
Administrator
Premium
BRoomer
Joined
Jul 6, 2003
Messages
12,731
Location
Bellevue, Washington
On the Brawl side of things UTDZac has been doing this for about a year I believe. I think his system is simpler, involving a tio plug in- so if you have the tio bracket you can send it and it should automatically update results. Not 100% positive on this though. If it isn't the case, it should eventually be, in order to make it easier to enter results in.


That's not true at all. I believe chess is very inviting to weaker players because of how they use their ratings system. The ratings system is more than just seeing how you stack up. It's a great tool for allowing amateurs to play people at their own level, it removes bias towards good players in tournaments (this is a big turn-off to me), and it gives everyone something to work for. You know those 95% of players who never have any shot at prize money? And are just entering to try and get more experienced? Well now there's something real for them to play for.
Chess tournaments, though, are typically divided into several divisions (usually 3 at smaller events, many more at larger events), where you are separated from other entrants based on your Elo rating. The top division, with the highest-rated players, has the most money on the line, but even the lowest division, populated with low-rated players, has a fair sum on the line. I doubt this is something that would fly with the top players in this community. For example, the 2011 US Open Chess Championship had this distribution:
Prizes[Projected]
Top Places: $8000-4000-2000-1500-1000-800-600-500. Clear winner - $200 bonus.
If tie for first, top two on tiebreak play speed game [W - 5 min, B - 3 min and draw odds] for bonus and title.
Class Prizes:
Top Master: $2500-1200-800-500.
Top Expert: $2500-1200-800-500.
Class A: $2500-1200-800-500.
Class B: $2500-1200-800-500.
Class C: $2000-1000-600-400.
Class D: $1500-700-500-300.
Class E & Below: $1500-700-500-300.
Unrated: $800-400-200.

That would mean a Class A who might not even place in the money, would likely have won a Class D/C/possibly B event, but he was not allowed to enter at those levels due to his rating. Due to the general selfishness of the top players in the Smash community, along with years of precedent, I find it unlikely they would ever accept such a system that spread the wealth to lesser players. Then again, I have noticed a recent rise of AM brackets in the last 6 months.
 

Cobalt

Smash Journeyman
Joined
Aug 22, 2007
Messages
448
Location
Pittsburgh, PA
On the Brawl side of things UTDZac has been doing this for about a year I believe. I think his system is simpler, involving a tio plug in- so if you have the tio bracket you can send it and it should automatically update results. Not 100% positive on this though. If it isn't the case, it should eventually be, in order to make it easier to enter results in.
If this is the case we should probably just do that; it's a vetted solution that's easy to use. Until then I'll keep working on my thing for the experience if nothing else :)
 

Nintendude

Smash Hero
Joined
Feb 23, 2006
Messages
5,024
Location
San Francisco
I feel like in order for an international / universal rating system to exist there has to be a better way to do it than sending Tio files to a single guy. There really has to be some sort of web database that the system works with. I'm sure Cobalt's program can later be integrated into a system like that but we need to find someone capable of developing the web portion of it. I don't think it's particularly difficult or time consuming for someone who just knows what to do.

I would agree that running amateur events as it is done in chess would not work for the Smash community, but I don't consider that to be the purpose of this anyway.
 

Zivilyn Bane

Smash Master
Joined
Nov 18, 2004
Messages
3,119
Location
Springfield, MO
That would mean a Class A who might not even place in the money, would likely have won a Class D/C/possibly B event, but he was not allowed to enter at those levels due to his rating. Due to the general selfishness of the top players in the Smash community, along with years of precedent, I find it unlikely they would ever accept such a system that spread the wealth to lesser players. Then again, I have noticed a recent rise of AM brackets in the last 6 months.
That's true but all it would take is a simple explanation of why the lesser players get money. The fact is, a large chess tournament like that could have 2000 participants, each paying up to a $250 entry fee for just the one event. That's a prize pot of up to $500,000. Players in the E, D, and C categories are going to make up 50% of that. These players ONLY GO because they have a legitimate shot at winning money. B and A players will go for the same reason, but are usually more serious players that not only want the shot to win some money, but also the opportunity to get better. Same goes for Expert for the most part. Even Masters couldn't possibly win a tournament that large if it was just one huge open.

If Chess did like melee and only had one big open event, they'd have tournaments with MAYBE 200 people. This is because only 10 people are going to win money and nobody wants to pay a $250 entry fee if they have no shot.

I've actually been working on a project to help increase melee attendance which involves implementing a more professional and serious amateurs bracket. It's been in the works for a while and I've kind of gotten lazy with it, but if smash is really not dying I'll post it one of these days.
 

Rubyiris

Smash Hero
Joined
Apr 19, 2007
Messages
6,033
Location
Tucson, AZ.
got bored, did an ELO chart for AZ's last three tournaments. The rankings ended up being VERY contrary to our current PR.

AZ PR:

1. Axe
2. GamerGuitarist7
3. Tai
4. Rubyiris
5. Okami
6. Silly Kyle
7. Angel
8. Falcoty
9. Nicknyte
10. Shiv

ELO Rating: (Note, I used bracket matches only. I avoided pools/swiss rounds.)

1. Axe 1278
2. GG7 1124
3. Silly Kyle 1060
4. Forward 1057
5. Rubyiris 1051
6. Tai 1028
7. Okami 1011
8. Angel 1000
9. AZenith 988
10. Frosty 976

With the ELO, there are players who are ranked far below their actual skill level due to difficult brackets, or upsets such as my beating Axe skewing other players' records.
 

Nintendude

Smash Hero
Joined
Feb 23, 2006
Messages
5,024
Location
San Francisco
If everyone started at the same rating then of course it will be really wrong after only 3 tournaments. I'm not sure what you are proving besides what we already know about how Elo works.
 

Nintendude

Smash Hero
Joined
Feb 23, 2006
Messages
5,024
Location
San Francisco
What would you have done differently then?
First see if there's a way you can estimate the rankings before calculating ratings after 3 tournaments. Ideally you'd just use old tournament results but one thing you can try is say Axe is rated 2000, the lowest ranked guy is rated 1200, and equally space everyone else's initial ratings (based on your power rankings) in that range. Players not on the power rankings get an initial rating of 1000. THEN run the updater on your 3 tournaments. You'll find that people losing to Axe will barely have a dent in their rating, because the system acknowledges the high probability of Axe winning. People who pull off upsets will be rewarded due to the rating differential.
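The seeding idea above could be sketched like this. The function names are made up for illustration; the 2000/1200/1000 numbers are the ones suggested above, and the list is the AZ power ranking posted earlier in the thread.

```python
# Sketch: evenly spaced initial ratings from a power ranking.
# seed_ratings and initial_rating are hypothetical helper names.

def seed_ratings(power_ranking, top=2000.0, bottom=1200.0):
    """Map a best-to-worst list of players to evenly spaced ratings."""
    n = len(power_ranking)
    if n == 1:
        return {power_ranking[0]: top}
    step = (top - bottom) / (n - 1) if n > 1 else 0.0
    return {name: top - i * step for i, name in enumerate(power_ranking)}


def initial_rating(player, seeds, default=1000.0):
    """Players not on the power ranking start at the default rating."""
    return seeds.get(player, default)


az_pr = ["Axe", "GamerGuitarist7", "Tai", "Rubyiris", "Okami",
         "Silly Kyle", "Angel", "Falcoty", "Nicknyte", "Shiv"]
seeds = seed_ratings(az_pr)
for name in az_pr:
    print(name, round(seeds[name]))
print("Forward", initial_rating("Forward", seeds))
```

With ten ranked players the spacing works out to about 89 points per spot, and anyone off the list (Forward, for example) starts at 1000.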

The only way to get good results without initial rating estimates is to have a sample much larger than 3. I have no way of estimating the size of a "large sample" but I'd guess you'd need at least 30 tournaments (for a local scene, 10 might be sufficient). This is the reason that I propose a 1-year "incubation period" before ratings are ever used to seed tournaments.

I'm really curious what your result will be if you try this method. Let me know what happens.
 

Rubyiris

Smash Hero
Joined
Apr 19, 2007
Messages
6,033
Location
Tucson, AZ.
I've got about a year's worth of results to work with. I was just trying to only use THIS season's results.

Also, should I start everyone out at 1000, starting with the earliest tournament?

We've got everything in AZ recorded back to 4/2010
 

Nintendude

Smash Hero
Joined
Feb 23, 2006
Messages
5,024
Location
San Francisco
Try different things and see what happens. If you've got over 1 year of data, try starting everyone at 1000 and then compare the result with giving players initial ratings, as I described above. Again, I'm really curious about the outcome.
 

Rubyiris

Smash Hero
Joined
Apr 19, 2007
Messages
6,033
Location
Tucson, AZ.
ok.

I originally only went with the last 3 tournaments since we decided to give everybody a clean slate for 2011.

I'll get to work on the new ELO file right away.

Also let's say a player didn't enter any tournaments in 2010, but started to enter them in 2011, where should I start them at?
 

Mahie

Smash Lord
Joined
Aug 18, 2007
Messages
1,067
Location
Lille, France
I can't seem to be able to use it. It says the .tio file is invalid. Am I doing something incorrectly?
 

Zivilyn Bane

Smash Master
Joined
Nov 18, 2004
Messages
3,119
Location
Springfield, MO
But can you explain why you can't rank people with these "ratings"? If it's such an accurate system of measuring someone's level, why can't you just order people according to their points...
I've been trying to get my head around this, but I really can't understand the clear difference between "rating" and "ranking".

Also if you can't compare your rating with other people's rating, then what the **** is the point of it all?
x: "Yo, noob, I'm 2700!!" noob: "I'm not that bad I'm.." x: "Shut it noob, it's not a ranking." ? Sound like fun.
I think you're misunderstanding me. The elo number itself, for example, is 2100. That is a rating, not a ranking. Because he's not "ranked" 2100, that would mean there are 2099 people ranked or rated higher. The number 2100 is a rating of skill level. That might coincide with the fact that that player is ranked 1st. So it wouldn't make sense to say "My elo ranking is 2100." It would make sense to say "My elo rating is 2100." Get it?
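Put another way, the ranking is something you derive from the ratings by sorting them. A small sketch, using the top five ratings from the AZ list posted earlier (the function name is just for illustration):

```python
# Sketch: a ranking is the ordering you get by sorting ratings.
# Ratings are the top five from the AZ Elo list posted earlier.

ratings = {
    "Axe": 1278,
    "GG7": 1124,
    "Silly Kyle": 1060,
    "Forward": 1057,
    "Rubyiris": 1051,
}


def rankings(ratings):
    """Return {player: place}, with place 1 for the highest rating."""
    ordered = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
    return {name: place for place, (name, _) in enumerate(ordered, start=1)}


for name, place in rankings(ratings).items():
    print(place, name, ratings[name])
```

So Axe's rating is 1278 and his ranking is 1st. The rating says how strong his results are; the ranking only says who is ahead of whom.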

EDIT: Cobalt, I'll PM you when I get off work today in about 10 hours.
 

PEEF!

Smash Hero
Joined
Jun 25, 2008
Messages
5,201
I think you're misunderstanding me. The elo number itself, for example, is 2100. That is a rating, not a ranking. Because he's not "ranked" 2100, that would mean there are 2099 people ranked or rated higher. The number 2100 is a rating of skill level. That might coincide with the fact that that player is ranked 1st. So it wouldn't make sense to say "My elo ranking is 2100." It would make sense to say "My elo rating is 2100." Get it?

EDIT: Cobalt, I'll PM you when I get off work today in about 10 hours.
I understand. It's like in Starcraft ladder when you present yourself as a "3200 Master level Zerg". So I'd introduce myself as a 1550 level IC or something like that.
 

AlphaZealot

Former Smashboards Owner
Administrator
Premium
BRoomer
Joined
Jul 6, 2003
Messages
12,731
Location
Bellevue, Washington
If you are going to use the ELO system accurately you must start everyone with the same ranking for whatever their first tournament in the system is.

In Chess, you technically have a provisional ranking until you attend (IIRC) 5 tournaments that qualify for the rankings. After which you have an official ranking (rating).

Also, don't be surprised if the elo system is different than the PR system. PRs are typically very biased, and they also weight someone's past accomplishments far more heavily than their present accomplishments. For example, a player who used to be very good, but no longer attends many events, will likely still be ranked highly even though they have likely dropped off significantly. It varies from PR panel to PR panel, so there isn't a concrete way to see how biased the rankings are, but I would easily trust an elo rating more than a PR panel. For example: why should beating Axe (something others could not do) not be weighted heavily in that person's favor?
 

Zivilyn Bane

Smash Master
Joined
Nov 18, 2004
Messages
3,119
Location
Springfield, MO
If you are going to use the ELO system accurately you must start everyone with the same ranking for whatever their first tournament in the system is.

In Chess, you technically have a provisional ranking until you attend (IIRC) 5 tournaments that qualify for the rankings. After which you have an official ranking (rating).

Also, don't be surprised if the elo system is different than the PR system. PRs are typically very biased, and they also weight someone's past accomplishments far more heavily than their present accomplishments. For example, a player who used to be very good, but no longer attends many events, will likely still be ranked highly even though they have likely dropped off significantly. It varies from PR panel to PR panel, so there isn't a concrete way to see how biased the rankings are, but I would easily trust an elo rating more than a PR panel. For example: why should beating Axe (something others could not do) not be weighted heavily in that person's favor?
Last time I checked, Chess used a higher K value for "provisionally" rated players with fewer than 25 rated games. This allows their rating to fluctuate more quickly without drastically affecting the ratings of opponents they beat. It ONLY works, though, once a huge majority of tournament participants are already rated.
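A rough sketch of the idea, where each player's change is scaled by their own K. The 25-game threshold comes from the paragraph above; the K values of 64 and 32 and the function names are assumptions for illustration, not the actual chess federation numbers.

```python
# Sketch: a larger K for provisional players, with each player using
# their own K. K values of 64 and 32 are assumed, not official figures.

def expected_score(rating_a, rating_b):
    """Expected score (between 0 and 1) for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def k_factor(games_played, provisional_games=25,
             k_provisional=64, k_established=32):
    """Provisional players (few rated games) get the larger K."""
    if games_played < provisional_games:
        return k_provisional
    return k_established


def update(rating_a, games_a, rating_b, games_b, score_a):
    """Return new ratings for A and B; score_a is 1 for a win, 0 for a loss."""
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k_factor(games_a) * (score_a - exp_a)
    new_b = rating_b + k_factor(games_b) * (exp_a - score_a)
    return new_a, new_b


# A newcomer with 3 rated games beats an established player with 100.
print(update(1000.0, 3, 1000.0, 100, 1.0))
```

Here the newcomer moves twice as far as the established player for the same result, which is the "fluctuate more quickly without drastically affecting opponents" behavior described above.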

For example, if you start everyone off at 1,000 even though 90% of the entrants are already rated, this group of new players could hurt the accuracy of the ratings. Let's say one of the new players is actually very good but has yet to play in a rated event. He might go in there and get first place and beat several players rated over 1500. Well, it's not fair for the 1500s to lose so many points just because they lost to a new guy. So the way chess does it is they take the new player's rating from a simpler equation that uses just their first tournament, and then calculate the changes for opponents with established ratings based on that provisional rating (aka performance rating).
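As a rough illustration of a "simpler equation", here is the commonly cited linear approximation for a performance rating: average opponent rating plus 400 times (wins minus losses) divided by games played. This is an assumption chosen for illustration and is not necessarily the exact formula any chess federation uses.

```python
# Sketch: linear approximation of a performance rating.
# Assumed formula: average opponent rating + 400 * (wins - losses) / games.

def performance_rating(opponent_ratings, wins, losses):
    """Estimate a rating from one event's results against rated opponents."""
    games = len(opponent_ratings)
    if games == 0:
        raise ValueError("need at least one rated opponent")
    average = sum(opponent_ratings) / games
    return average + 400.0 * (wins - losses) / games


# Hypothetical newcomer: 3 wins and 1 loss against four 1500-rated players.
print(performance_rating([1500, 1500, 1500, 1500], wins=3, losses=1))
```

The newcomer comes out well above 1000, so the established players he beat would be treated as having lost to a strong player rather than to a 1000.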

It doesn't work however if everyone at the tournament is unrated, which will be the case for melee for a long time until everyone gets one established. That's why for melee it makes sense to start everyone off at the same number, like 1000 (at least until the vast majority of current tournament players are rated, if that ever happens.) Here are some very good links for you guys to check out:

http://main.uschess.org/content/view/7875/400/
http://math.bu.edu/people/mg/ratings/approx/approx.html
 

MTKO

Smash Journeyman
Joined
Feb 18, 2008
Messages
294
Location
Hampden, Maine
If you are going to use the ELO system accurately you must start everyone with the same ranking for whatever their first tournament in the system is.

In Chess, you technically have a provisional ranking until you attend (IIRC) 5 tournaments that qualify for the rankings. After which you have an official ranking (rating).

Also, don't be surprised if the elo system is different than the PR system. PRs are typically very biased, and they also weight someone's past accomplishments far more heavily than their present accomplishments. For example, a player who used to be very good, but no longer attends many events, will likely still be ranked highly even though they have likely dropped off significantly. It varies from PR panel to PR panel, so there isn't a concrete way to see how biased the rankings are, but I would easily trust an elo rating more than a PR panel. For example: why should beating Axe (something others could not do) not be weighted heavily in that person's favor?
Maybe if a person does not attend events for a certain period of time, then their rating can be reset or lowered a significant amount.
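One way that could look, purely as a sketch: after a grace period, take a fixed number of points off for every further 30 days of inactivity, without ever pushing a rating below a floor. Every number here (180-day grace period, 10 points per 30 days, floor of 1000) and the function name are made up for illustration.

```python
# Sketch: lowering a rating after a long stretch of inactivity.
# All parameter values are hypothetical, chosen only for illustration.
from datetime import date


def decayed_rating(rating, last_played, today, grace_days=180,
                   penalty_per_30_days=10.0, floor=1000.0):
    """Return the rating after applying an inactivity penalty."""
    idle_days = (today - last_played).days - grace_days
    if idle_days <= 0:
        return rating  # still inside the grace period
    penalty = penalty_per_30_days * (idle_days // 30)
    # Never decay below the floor, and never raise a rating already below it.
    return max(min(rating, floor), rating - penalty)


print(decayed_rating(1500.0, date(2010, 1, 1), date(2010, 3, 1)))
print(decayed_rating(1500.0, date(2010, 1, 1), date(2011, 1, 1)))
```

A full reset would just be the extreme version of this, where the penalty is large enough to drop the player straight to the floor.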

Also since skill level varies greatly from region to region, would it be fair for everyone to start at the same rating? For example: say group A has a much lower skill level than group B. Both groups host tournaments frequently in their areas, but are too far apart to attend or frequently attend each other's events. They end up playing the same large group of people each time at these tournaments and some of them end up all being rated. Now Group A and Group B are finally able to attend a tournament together, and the seeds are based on the ratings. Even if some of the players from group A have really high ratings that are in the range of some of the top players from group B, group B is far more skilled. So would it be better to only apply the rating system for tournaments that have smashers from several regions or larger tournaments?

I'm also wondering what people's thoughts are on what K value to use and how it should be changed depending on the player's rating.
 