The TGL Power Ranking System - programming update 4/15/2010

Dr. Tuen · Feb 28, 2010

The TGL Ranking System

Table of Contents
Introduction
Motivation
Theory: Placement System
Theory: Gain-Loss System
Automation
Beta Test Results
Future Work
Closing Remarks

INTRODUCTION
The TGL Power Ranking System is a new and innovative method for ranking individuals in the Super Smash Brothers community. Using a system which accounts for each win and loss in tournament play, the TGL Power Ranking System has the potential to exceed the ranking accuracy currently seen in panel-based systems. While this accuracy has not yet been achieved, future work and constant dedication will steer this project toward the future of smash power rankings.

It should be noted here that the drawbacks of any given system, including the TGL PR System, are not overlooked. Instead, they are analyzed and solutions for future numerical power ranking systems are considered.

MOTIVATION
The original motivation for this project came from the lack of an Oregonian Power Ranking. General lack of interest made the formation of a panel difficult. Under the idea that numerical power ranking systems could be run by one person, various ideas were put forth and tested lightly. Eventually a gain-loss system was implemented, and the system has been steadily improving ever since.

The biggest issue being addressed here is the accuracy of panel based power rankings. On the regional level, it is difficult for a panel to discern, with high accuracy, the places of the players ranking 3rd and below. On higher scales, the error may even extend to the 2nd and 1st place positions. This error occurs because while the first few places may be obvious, the places below that become very dependent on several factors which must be accounted for simultaneously for accurate placement. These factors include: tournament attendance, average placement, highest placement, lowest placement, general reputation, and performance against other ranked players.

The benefits of a gain-loss system include the following: removing the reputation factor, tabulating all wins and losses, accounting for new players, and carrying over interactions between tournaments.

The drawbacks include the following: high work load, complicated factors, and difficulties in accounting for a lack of tournament attendance.

THEORY: Placement System
This section of the overview covers the theory behind the two numerical systems that have been considered. The first of the two is the Placement System. The placement system is an easy numerical system to put into place, and it follows these principles:
**Players who place in a tournament receive points
**Players who place under a certain threshold do not receive any points
**Points are carried over between tournaments
**Players are ranked based on their accumulated points over the tabulation period

Benefits
**Easy to implement
**Accounts for player attendance

Drawbacks
**The difference between players placing 3rd and below can become very small, making the results somewhat unclear in some situations
**Players who create an early lead are not punished for huge losses
**Catching up to top players, even through beating them directly, is very difficult

Overall, the placement system does not yield very good results because the mobility of players below 3rd place is low. It is difficult to account for players who truly improve their average performance without resetting the entire power ranking.
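To make the placement principles concrete, here is a minimal sketch in Python (not the MATLAB program described later). The point table, the 4th-place threshold, and the player names are all hypothetical; the placement system as described here does not specify actual values.

```python
from collections import defaultdict

# Hypothetical point table: placement -> points. Anything below the
# threshold (here, worse than 4th place) earns nothing.
POINTS_BY_PLACE = {1: 10, 2: 7, 3: 5, 4: 3}


def tabulate_placements(tournaments):
    """Accumulate placement points across tournaments.

    `tournaments` is a list of dicts mapping player name -> final placement.
    Returns (player, total_points) pairs sorted from highest to lowest total.
    """
    totals = defaultdict(int)
    for results in tournaments:
        for player, place in results.items():
            # Points carry over between tournaments by simple addition.
            totals[player] += POINTS_BY_PLACE.get(place, 0)
    # Sort by points (descending), then by name to break ties predictably.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Two made-up events with four made-up players.
season = [
    {"alice": 1, "bob": 2, "carol": 3, "dave": 5},
    {"bob": 1, "carol": 2, "dave": 3, "alice": 9},
]
ranking = tabulate_placements(season)
print(ranking)
```

Note that "alice" bombs out of the second event at 9th and loses nothing for it, which is exactly the "not punished for huge losses" drawback listed above.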

THEORY: Gain-Loss System
The gain-loss system is the primary system used in the TGL Power Ranking System. This system acts on the idea that every match is important. Every single interaction in every single tournament is analyzed and taken into consideration when an event is tabulated.

Terms
WinnerScore = the winner’s starting score
LoserScore = the loser’s starting score
NewWinnerScore = the winner’s new score after a match
NewLoserScore = the loser’s new score after a match
NPI = New Player Index

These terms are useful for understanding the gain-loss equations, which are the backbone of the TGL power ranking system. Every match results in a new score for both players: the winner gains points and the loser loses points. The new scores are calculated as follows:

NewWinnerScore = WinnerScore + 100*(LoserNPI/20)*(LoserScore/WinnerScore)

NewLoserScore = LoserScore - 50*(WinnerNPI/20)*(LoserScore/WinnerScore)

There are a number of effects that must be described for these equations to make sense.

Primary Score Interaction
The ratio on the far right of each equation (LoserScore/WinnerScore) is the primary score interaction in these computations.

From the Winner’s Perspective
If the win is expected (LoserScore < WinnerScore), then the key ratio will be less than one and the winner will gain very few points. If the win is unexpected (WinnerScore < LoserScore), then the winner will gain many points because the key ratio will be greater than one.

From the Loser’s Perspective
If the loss is expected (LoserScore < WinnerScore), then the key ratio will be less than one and the point loss will be minimal. If the loss is unexpected (WinnerScore < LoserScore), then the loser will sustain a large loss in points because the key ratio will be greater than one.
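The two cases can be checked with a short Python sketch. It assumes the right-hand sides of the gain-loss equations are the points added to the winner and subtracted from the loser; the function name and the example scores are made up for illustration.

```python
def gain_loss_update(winner_score, loser_score, winner_npi=20, loser_npi=20):
    """Return (new_winner_score, new_loser_score) after one match.

    Assumption: the 100*... term is the winner's gain and the 50*... term
    is the loser's loss, each applied to that player's starting score.
    """
    ratio = loser_score / winner_score          # the "key ratio"
    gain = 100 * (loser_npi / 20) * ratio       # winner's point gain
    loss = 50 * (winner_npi / 20) * ratio       # loser's point loss
    return winner_score + gain, loser_score - loss


# Expected result: a 2000-point player beats a 1000-point player.
expected = gain_loss_update(2000, 1000)
# Upset: the 1000-point player beats the 2000-point player.
upset = gain_loss_update(1000, 2000)

print(expected)  # ratio 0.5 -> small exchange
print(upset)     # ratio 2.0 -> large exchange
```

With these numbers the expected win moves the winner by 50 points and the loser by 25, while the upset moves them by 200 and 100.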

New Player Index
This is a number that ranges from 10 to 20. When a new player enters their first tournament which is analyzed by this PR system, they are assigned an NPI of 10. This index increases by 1 for every match they play, up to a maximum of 20.

This index helps reduce any new player effects. The index is applied to the new player’s opponent, which can cut that player’s score change in half. Here is why this works:

Player A is new (NPI of 10)
Player B is a veteran (NPI of 20)
Player A is BETTER than player B

Without the NPI, Player A will beat Player B and strip hundreds of points from that player. As this occurs, Player B falls in the PR and Player A rises to his or her true score. When the tournament is over, Player A is at his or her appropriate score, and Player B’s score is lower than Player B’s true score.

With the NPI, Player A will beat Player B and Player B’s score loss will be reduced by Player A’s low NPI. Since Player B’s NPI is 20, Player A’s score gain is unaffected. This allows Player A to rise to his or her appropriate score without negatively affecting Player B. This is the positive effect of the New Player Index.
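Here is a small Python sketch of the Player A / Player B scenario. The scores (1000 and 1500) are invented for the example, and the helper names are hypothetical; only the 10-to-20 range and the 50*(WinnerNPI/20) factor come from the description above.

```python
def npi_after(matches_played, start=10, cap=20):
    """NPI starts at 10 and rises by 1 per match until it reaches 20."""
    return min(cap, start + matches_played)


def loser_point_loss(winner_score, loser_score, winner_npi):
    """Points taken from the loser, scaled by the WINNER's NPI."""
    return 50 * (winner_npi / 20) * (loser_score / winner_score)


# Player A (brand new, 1000 points) beats Player B (veteran, 1500 points).
loss_with_npi = loser_point_loss(1000, 1500, winner_npi=npi_after(0))
# The same match if Player A were treated as a fully established player.
loss_without_npi = loser_point_loss(1000, 1500, winner_npi=20)

print(loss_with_npi, loss_without_npi)
```

Player B loses 37.5 points instead of 75, so the veteran's score is shielded while the newcomer's rating is still settling.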

Benefits
**Accounts for all player interactions on all skill levels, yielding accuracy throughout the rankings
**Can properly account for new players entering a previously established power ranking
**Allows for players to overtake first place, should their performance warrant such movement
**Scores tend to be far separated from each other, so scores and placements are clear

Drawbacks
**Attendance can affect placement
**The work load associated with tabulating each match is tremendous (hundreds of computations per tournament)
**Inconsistent gain-loss distribution

Automation
This addresses one of the main drawbacks of the gain-loss power ranking format. A program has been written that has a number of capabilities. First, a couple of explanations:
**The code is written in MATLAB. Actual coders will probably have a good laugh at that, but it’s true. When I asked several coders at Oregon State for assistance, all of my requests were turned down, so I learned how to program using the most common language for Chemical Engineers (the area I have a degree in).
**TIO files can be converted to text files for easy handling
**Inside the TIO file, each player is paired with an ID number, which is used for all events in the tournament
**An Excel file is used for score archiving. A score progression from event to event can be viewed after the PR calculations are finished.

The program includes a number of features:
**Events of any size can be tabulated
**Byes are properly accounted for, and have no effect on player score
**Grand Finals and Double Grand Finals are properly tabulated
**Names are checked individually to account for multiple pseudonyms and misspellings
**Names, Scores, ID numbers, and NPI are all properly paired before the tournament is analyzed
**New scores and names are recorded automatically
**New players are automatically given a new row in the Archive file, a score of 1000, and an NPI of 10
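As a rough illustration of two of these features (skipping byes and registering new players), here is a Python sketch. It is not the MATLAB program: the in-memory archive dictionary, the "BYE" placeholder name, and the function name are all assumptions made for the example. Only the starting score of 1000 and starting NPI of 10 come from the list above.

```python
NEW_PLAYER_SCORE = 1000
NEW_PLAYER_NPI = 10
BYE = "BYE"  # hypothetical placeholder name for an empty bracket slot


def prepare_event(archive, bracket):
    """Filter out byes and make sure every real entrant has an archive row.

    `archive` maps player name -> {"score": ..., "npi": ...} and is updated
    in place. `bracket` is a list of (winner, loser) name pairs.
    Returns the list of matches that should actually be tabulated.
    """
    playable = []
    for winner, loser in bracket:
        if BYE in (winner, loser):
            continue  # a bye has no effect on anyone's score
        for name in (winner, loser):
            # New players get a fresh row; existing rows are left untouched.
            archive.setdefault(
                name, {"score": NEW_PLAYER_SCORE, "npi": NEW_PLAYER_NPI}
            )
        playable.append((winner, loser))
    return playable


archive = {"vet": {"score": 1500, "npi": 20}}
bracket = [("vet", BYE), ("rookie", "vet")]
playable = prepare_event(archive, bracket)
print(playable, archive)
```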

Drawbacks
**NAMES. The program operator must have a good social knowledge of the players in question to cover the possibility of multiple names, misspellings, and other mistakes. The largest source of errors is through extra pseudonyms created for fun or via a name change the player undergoes at one point or another. If this is not properly accounted for, the player is given a new score and the interactions from there on out are inaccurate.

Beta Test Results
The finished program underwent a beta test using pertinent tournaments from the North West Region (Oregon, Washington and Idaho). Below is a list of tournaments included:

TP3 Pools http://allisbrawl.com/ttournament.aspx?id=3482
TP3 Pro http://allisbrawl.com/ttournament.aspx?id=3482
GC November http://allisbrawl.com/ttournament.aspx?id=7726
GC December http://allisbrawl.com/ttournament.aspx?id=7916
GO 3.0 http://allisbrawl.com/ttournament.aspx?id=6534
GC January http://allisbrawl.com/ttournament.aspx?id=7917
PRI Smash II http://allisbrawl.com/ttournament.aspx?id=7931
TP4 http://allisbrawl.com/ttournament.aspx?id=7727

Using these tournaments, the following results were obtained:

1 felix 2714
2 jem 2256
3 nerd 1982
4 carlos 1940
5 pwneroni 1901
6 bladewise 1863
7 valdens 1845
8 gage 1729
9 zeionut 1704
10 itakio 1660
11 sagemoon 1624
12 chip 1616
13 weruop 1571
14 c!z 1556
15 eggz 1538
16 t1mmy 1537
17 mr.b0jangle 1498
18 uchiha78 1412
19 tuen 1383
20 dr.mario12 1373

Oregonian players occupy positions 5 (pwneroni), 8 (gage), 16 (t1mmy), and 19 (tuen). All other players are Washington players.

Looking through the tournaments, there are player-based arguments that can justify the scores of each player. To go through these separately would be lengthy and tedious. Feel free to inspect the performance of any individual player.

Noted Criticism
These numbers are not without error. Here are some of the issues that need to be worked out as this project moves forward:

Tournament Inclusion
The addition of PRI II (an Oregonian tournament) without including IES (a Washington tournament) has been noted. This prompted a discussion about tournament addition under context. It has been suggested that regional tournaments with a certain degree of region interaction are the only tournaments which warrant tabulation on a regional scale. Other arguments ask for inclusion of many region specific tournaments while keeping a balance between those regions in doing so.

Attendance
In this Power Ranking, C!Z is an anomaly. He has not attended an event since one of the first tournaments in the list and is still ranked 14th. Without a method for decaying a player’s score based on attendance, this kind of error can occur. Carlos has been noted as another attendance anomaly, having only attended two of the events listed. His performance at one of them was outstanding (a run which includes TWO defeats of the current #1 ranked player), but in some opinions his attendance record still calls for some kind of minor decay to his score.

Regional attendance with respect to regional tournament inclusion creates another layer of complication. If a player can only attend events in their state due to financial reasons, what is to be done about the out-of-state regionals they miss? Do players who still attend their in-state events (regionals and state specific) still experience attendance decay?

The automation for this kind of decay is not a problem. But as discussed here, the appropriate implementation is.

Gain-Loss Distribution
This is an issue because the benefit a player receives for beating someone of near-equal score is not well balanced against the benefit of beating someone much better than them. This will be covered in more detail in “Future Work”.

Future Work
This section details the work that is currently being done on this project. The project will attempt to move forward whenever possible, seeking higher accuracy and a wider spectrum of application within the Smash Community.

Gain-Loss Distribution
This is a primary concern with this project. Two players who are of exactly equal score (this can happen if two new players play each other) experience high levels of fluctuation because their score exchange is not too different from a score exchange between two players whose scores differ by a factor of 1.5.

This occurs because the two are related linearly. As the score ratio (LoserScore/WinnerScore) increases (e.g. ratios over 1 imply an upset, the winner was worse than the loser; and ratios under 1 imply the winner was better than the loser), the points gained increase linearly as well. This creates very little difference between beating someone of equal score to you (exchange about 100 points) and someone who is 1.5 times your score (exchange about 150 points).

The other issue is the fact that point gain is UNBOUNDED. Upsets, while great, should not yield hundreds and hundreds of points. Statistically speaking, all players have some minute chance of beating the best in the world. If a player does, it may be an uncharacteristically good performance. If that player’s average tournament performance changes and that player starts beating all the best players consistently, then the number one spot is not unearned. Otherwise, it is a single non-repeating occurrence which can throw off future tournament data.

To fix this, the ratio can be put through the Logistic Function. Here is more information on this curve:

http://en.wikipedia.org/wiki/Logistic_function

This function can be adjusted so that different values occur at different places. For its initial tests, it will be set so that players of equal score (key ratio is one) exchange 30 points, and players who have a difference of 1.5 times (key ratio is 1.5) exchange 90 points. Score exchanges are capped at 100 points.

Note, these large score changes are for unexpected wins. Expected wins and expected losses yield very few points of gain and very few points of loss.
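One way to see how the curve could be tuned is to solve for the two logistic parameters from the two target points. The Python sketch below assumes the standard form cap / (1 + exp(-k*(ratio - r0))) with the cap fixed at 100; the names `k`, `r0`, `fit_logistic`, and `exchange` are made up here, and the fitted values are only what those two targets imply, not necessarily the constants used in the actual program.

```python
import math

CAP = 100.0  # maximum points exchanged in one match


def fit_logistic(r1, p1, r2, p2, cap=CAP):
    """Solve for steepness k and midpoint r0 so the curve passes through
    (r1, p1) and (r2, p2). Uses the logit: ln(p / (cap - p)) = k * (r - r0).
    """
    y1 = math.log(p1 / (cap - p1))
    y2 = math.log(p2 / (cap - p2))
    k = (y2 - y1) / (r2 - r1)
    r0 = r1 - y1 / k
    return k, r0


def exchange(ratio, k, r0, cap=CAP):
    """Points exchanged for a given key ratio (LoserScore/WinnerScore)."""
    return cap / (1.0 + math.exp(-k * (ratio - r0)))


# Targets from the text: ratio 1.0 -> 30 points, ratio 1.5 -> 90 points.
k, r0 = fit_logistic(1.0, 30.0, 1.5, 90.0)

for ratio in (0.5, 1.0, 1.5, 3.0):
    print(ratio, round(exchange(ratio, k, r0), 1))
```

Under these assumptions k comes out near 6.1 and r0 near 1.14. An expected win at a ratio of 0.5 is worth only about 2 points, and no upset, however large, can exceed 100.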

Tournament Categorization
In the ELO ranking system, which this is loosely based off of, tournaments are ranked based on the average score of the players attending. THIS IS NOT A FUNCTION OF ATTENDANCE. Here’s an example. Sakura Con holds a tournament every year, and gets a very large turnout (80+ I think). Under this system’s ranking, their scores (if estimated properly) would be around 500-700 each. The average score would be fairly low. If you compare this to PRI II, an Oregon tournament held last January, which had 30-someodd people attend, you’d get an average score of approximately 1200.

This tournament ranking would yield a small effect on the overall score change (5%), but it could be useful for improving accuracy.

Performance Rating
This is another ELO idea which allows statistically anomalous performances to be taken into account. Say, for instance, that a player ranked 17th won a single NorthWest regional. After this regional, this person’s placements never rose above 10th. That one win is an outlier overall.

This information can be used to make that one anomalous performance affect the PR slightly less (while still benefitting the one-time winner). It should be noted that when a player’s average performance changes (in this case, getting 2nd and 3rd consistently after that one win), then scores will change appropriately.

The calculations necessary for this factor to be appropriately implemented are not yet fully understood.

Attendance Correction
The basic idea for an attendance correction is to detract points for missing a certain number of events in a row. However, with the analysis of multiple regions under consideration, doing this without taking points in an unreasonable fashion is troublesome. No ideas put forth so far seem reasonable.
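Purely to make the basic idea concrete, here is a Python sketch of a flat penalty after a streak of missed events. Every number in it (a grace period of 2 events, 25 points per extra missed event, a floor at the 1000-point starting score) is a hypothetical placeholder, and it does nothing to solve the multi-region problem described above.

```python
def decayed_score(score, missed_in_a_row, grace=2, penalty=25, floor=1000):
    """Subtract a flat penalty for each consecutive missed event beyond
    the grace period, never pushing a score below `floor`.

    Scores already at or below the floor are left alone, so decay cannot
    punish players who were never above the starting score.
    """
    if score <= floor:
        return score
    extra_missed = max(0, missed_in_a_row - grace)
    return max(floor, score - penalty * extra_missed)


print(decayed_score(1556, 2))   # within the grace period: unchanged
print(decayed_score(1556, 6))   # four events past the grace period
print(decayed_score(1050, 6))   # decay stops at the floor
```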

Program Translation
The tournament tabulation program, the “TGL-9000”, is written in MATLAB. At some point, far in the future, this will be converted to Java, or something common like that, to facilitate widespread usage.

Closing Remarks
This project has been in the works for nearly three quarters of a year and is still going strong. The hope is that once some of these new ideas are in place, the inaccuracies of the system will fall below those of a panel-based power ranking. In truth, the system is tending towards displaying widely accurate results with blatantly obvious inaccuracies. In a hybrid system, panel and numerical, the panel could easily pick out the anomaly and have it removed.

With more hard work and support from the community, this could become the future of smash power rankings. Whether or not that actually happens is of no consequence to me; I am working through this project for the sake of my own curiosity. Those who believe in the system will follow it, and those who do not believe in the system will not. Either way, I intend to work on this until it reaches the efficiency and accessibility which will allow any player to tabulate interactions in their own regions.

A clear picture of who is ranked where can motivate players, invigorate a region, and create exciting rivalries. That clarity is all that this project seeks.

Until the next update, thanks for reading!

Best Regards,

Tuen
Head of the TGL Power Ranking Project

----

The original project page can be viewed here:
http://allisbrawl.com/group.aspx?id=9557

====

Updates:

4/15/2010
Programming
It looks like there are MATLAB clones out there available for download. I hear that none of them have 100% compatibility with MATLAB's ".m" files, but I don't think I used very complicated functions to begin with. After the equations settle out, this may be an accessible alternative to converting my near 1000 lines of code.

Efficiency
It may become useful to add this into the code: check for names that match the archive EXACTLY (100%). None of these names would have to be double checked and re-typed by the user. Depending on how accurately the TO inputs names, this could save the number cruncher a lot of time.

Drawback: when the scale of this project increases, the number of randoms willing to enter under their real name (say, eric) increases, and so do the chances of two different players sharing a name. It has also been shown that for fun and jokes, people like to enter tournaments under famous smash names (M2K, Larry, Ally, etc), which could be an issue for this lazy method.

Solution: Do a quick run through of the bracket (in TIO) before running it. If you have a good knowledge of your players, the fixes are easy.
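Here is a rough Python sketch of what that check could look like. The function name, the idea of a manually maintained "flagged" list for common or famous names, and the sample entrants are all assumptions for illustration; this is not taken from the MATLAB program.

```python
def split_by_exact_match(entrants, archive_names, flagged=()):
    """Split entrant names into those that can skip the manual name check
    and those the operator still has to review.

    A name is auto-accepted only if it matches an archive name exactly
    (case-sensitive) and is not on the flagged list of risky names.
    """
    archive_set = set(archive_names)
    flagged_set = set(flagged)
    accepted, needs_review = [], []
    for name in entrants:
        if name in archive_set and name not in flagged_set:
            accepted.append(name)
        else:
            needs_review.append(name)
    return accepted, needs_review


archive_names = ["felix", "jem", "eric"]
entrants = ["felix", "Jem", "eric", "m2k"]
accepted, needs_review = split_by_exact_match(
    entrants, archive_names, flagged={"eric", "m2k"}
)
print(accepted)      # exact, unflagged matches
print(needs_review)  # misspellings, flagged names, and unknown players
```

The flagged list is one possible way to keep the lazy method from auto-accepting a joke entry or a second "eric"; the quick bracket run-through is still the real safeguard.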

---

4/2/2010
The variable gain-loss parameter has been programmed into the automated tournament ranking system. With this adjustment in place, the TGL system becomes more and more like the ELO system, with some key differences. However, because of this, the system has inherited ELO's slow score convergence trait.

Scores in ELO can take YEARS to converge to their proper values. The smash community needs something that can adjust to players' changing skill levels in a matter of MONTHS. Adjustments will be made to the key equations until something suitable is set.

Once this is complete, an update for the Pac Northwest will be tabulated. After which, discussion for this project's expansion can begin.

Dr. Tuen · Mar 4, 2010

For those interested, the same list of tournaments has been re-done with the Logistic function described in "Future Work". The results and any pending discussion can be found here:

http://allisbrawl.com/forum/topic.aspx?pid=1395533#p1395533

The Greater Leon · Mar 4, 2010

looks good man

gimme like a day to read it all though lol

Dr. Tuen · Mar 4, 2010

After 8 months of work, I'd hope it wasn't a light read. I think this'll be worthwhile for the community though. :-D

-Tuen

Also, any input is appreciated!

Overswarm · Mar 9, 2010

This is a ladder system, pretty much.

I've been working on different ways to do stuff like this myself!

Some things you need to add:

Tournament Removal

This is pretty standard for most ladders in this fashion. Tournaments either are fit into a "season", or after X amount of months previous tournaments are removed. Ankoku uses a 6 month time span and it seems to be fairly accurate and allows for minor fluctuations over time. This also has the added benefit of having several months with nationals in it for each 6 month time span; something shorter could see huge spikes and drop offs in score simply because of how many good players attend.

Score decay!

This is the most important part of "competition" in a ladder. If I get first 5 tournaments in a row and then don't attend any more, it becomes incredibly difficult for others to catch up. A simple score decay system that has a player lose some of their score after two or three weeks of not attending a tournament will help keep things accurate.

Blizzard, with their warcraft 3 ladder, took a no-nonsense approach and made top players play every day to keep their score. You wouldn't have to be quite so Draconian, but could make it work.

Before they implemented this, they had one guy reach the top 10 on their ladder then quit Warcraft 3. He was still on the top 10 for months, and it was an inaccurate representation of the metagame at the time.

"Top Player" bug

Mew2king often attends my tournaments. If I get 2nd place and M2K gets 1st, my score is inflated drastically. I don't think this is particularly accounted for. Doing any sort of ranking system involving more than one state would find this to be a pretty serious issue. Low competition that is very top heavy for one or two placements isn't uncommon in many regions, and I'm sure m2k's point total would be astronomical and would inflate the score of those below him.

All in all, this is pretty fantastic and I can't wait to see what comes of it.

If you want me to do testing in my own region, I can do so.

Rappster · Mar 9, 2010

what's wrong with elo rating?
or more accurately, glicko (the stuff chess federations use)

Dr. Tuen · Mar 9, 2010

@OS

Tournament Removal:
Definitely! I agree with pretty much everything you've said here. Right now I'm sticking to the set of tournaments described above to work out the bugs in the equations (direct comparison demands that the data set does not change), but when that's done I'll expand it to a set time scale.

Score Decay:
Here's something I'd love to discuss at length. This system makes a lot of sense on a small scale, because everyone should get to 70% or more of their local tournaments (if they are vying for a position in the top 10 in their region). But what happens on a regional, and even national scale? Say I go to every event in OR, WA plus R3 and TP5 between now and when I crunch national numbers... what about all the East Coast tournaments I missed? On large scales, I see score decay as being something that will become difficult to properly implement, even though it's pretty much the best idea ever

**We're already seeing the drawbacks of not using score decay. C!Z has pretty much displayed the exact property of shooting to a high score and then dropping attendance and staying there.

"Top Player" Bug
I believe the system fixes this on its own. I'm going through a few re-adjustments of the equations, but it's already getting to the point where a good player will suddenly gain fewer and fewer points for beating people he's better than. E.G. instead of gaining reasonable points for a win (15-25 points on the logistic scale), they start gaining .3 points for every win.

The other thing that's going to assist this is the adjustable base gain-loss factor. This will tabulate two values: N & m. N is the effective games and m is the number of games you played in a tournament. It'll help account for player variability (in one tournament I may beat the top 5 in my region, but from then on out I keep doing my #7 type quality performance) and it will help with the top player effect, since their N+m will be HUGE (and base gain-loss is 800/(N+m)).
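A quick Python sketch of how that factor shrinks as a player's game count grows. How N ("effective games") gets computed isn't covered here, so the values of N below are just plugged in by hand, and the function name is made up.

```python
def base_gain_loss(effective_games, games_this_event):
    """Base gain-loss factor 800/(N+m).

    N = effective games (assumed given; its calculation is not shown),
    m = games played in the current tournament.
    """
    return 800 / (effective_games + games_this_event)


newcomer = base_gain_loss(effective_games=3, games_this_event=5)
top_player = base_gain_loss(effective_games=45, games_this_event=5)

print(newcomer, top_player)
```

With those made-up inputs the newcomer's base factor is 100 while the established player's is 16, so a single odd result moves the established score far less.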

===

Can you run MATLAB code? If you can, I can send this to you and you can give it a shot.

Best regards,

Tuen

Dr. Tuen · Mar 9, 2010

Rappster said:
what's wrong with elo rating?
or more accurately, glicko (the stuff chess federations use)

Using this directly is not very good for the smash community. Their system moves very *very* slowly. We're talking about 200 point gains over the course of a year. This community needs rankings that are more volatile to allow for up and coming players to claim their rightful high ranks in a reasonable amount of time.

The current system is very heavily based on ELO though. Many of its factors are very well done and they are mirrored here in some of the most recent and future work.

Some of the factors are taken from multiple sources:

The New Player Ranking was something I saw in a forum for World of Warcraft ratings of sorts.

The logistic curve is definitely ELO

The player performance (N&m) is from another chess rating system, whose name currently escapes me.

Anyways, hours of research have gone into this. Now that I have the programming tools to make this work, the equations will undergo many tests until they seem stable and capable of handling all the common player bugs (things that cause score anomalies). Then the project will expand :-p.

Best regards,

Tuen

Overswarm · Mar 9, 2010

Score Decay:
Here's something I'd love to discuss at length. This system makes a lot of sense on a small scale, because everyone should get to 70% or more of their local tournaments (if they are vying for a position in the top 10 in their region). But what happens on a regional, and even national scale? Say I go to every event in OR, WA plus R3 and TP5 between now and when I crunch national numbers... what about all the East Coast tournaments I missed? On large scales, I see score decay as being something that will become difficult to properly implement, even though it's pretty much the best idea ever

It depends, more or less, what you're attempting to accomplish.

If you're trying to accomplish a ranking system for the entirety of smashboards, you're going to have some serious region issues naturally... people will miss tournaments. With MLG coming up, you might be able to make MLG a subset and rank people in that fashion within MLG, but ALL tournaments? That's stretching things.

It might be worthwhile to separate the US into several smaller regions (my own personal region would probably be Ohio, Kentucky, Indiana) and then do them each separately and simply have people visually compare them... but I think that's the opposite direction you are shooting for.

With your current level of research into this, you're really close to making a great ladder system for a specific state or region of players that routinely play together... very few holes in it.

What you will have trouble doing is accurately ranking people when they don't play each other; the "WC / EC" thing. Score Decay and natural consistency should solve most of the problems, but you're right in that score decay for not attending national tournaments would really mess things up on a national level..... I'm not sure how you can really fix this without dividing the US up into regions.

Troubling..... and I'm rambling too.

"Top Player" Bug
I believe the system fixes this on its own. I'm going through a few re-adjustments of the equations, but it's already getting to the point where a good player will suddenly gain fewer and fewer points for beating people he's better than. E.G. instead of gaining reasonable points for a win (15-25 points on the logistic scale), they start gaining .3 points for every win.

This could work if implemented properly. I'd be worried about going to a tournament only to get .3 points for winning but other people skyrocketing for beating me though.

What I'm more talking about is the non-uber player there; if m2k is only getting .3 points for beating people he's way better than, doesn't the person getting 2nd end up getting a huge boost to their score simply for having M2K in attendance? I may have misread.

The other thing that's going to assist this is the adjustable base gain-loss factor. This will tabulate two values: N & m. N is the effective games and m is the number of games you played in a tournament. It'll help account for player variability (in one tournament I may beat the top 5 in my region, but from then on out I keep doing my #7 type quality performance) and it will help with the top player effect, since their N+m will be HUGE (and base gain-loss is 800/(N+m)).

Excellent!

I have no idea how to run MATLAB code whatsoever, but if you tell me what I'd need I could probably find a way to learn.

Dr. Tuen · Mar 9, 2010

Overswarm said:
This could work if implemented properly. I'd be worried about going to a tournament only to get .3 points for winning but other people skyrocketing for beating me though.

What I'm more talking about is the non-uber player there; if m2k is only getting .3 points for beating people he's way better than, doesn't the person getting 2nd end up getting a huge boost to their score simply for having M2K in attendance? I may have misread.

As for the question about being #2 in a tournament M2K dominates, the answer is "nope" :-p. You only take points away from wins. So if nobody wins against him (and you have to win a set), then nobody takes home an obscene amount of points.

Also, the system helps with people that beat M2K, or other high level players. I did surprisingly well against Larry (Falco is my best match up :-D), and if that were a win, under the first set of equations, that would yield a near unbounded number of points. It'd be ridiculous. And I'd end up passing that on to whoever knocked me out of bracket.

The new system can actually cap your point gain. I don't care if you are a random from Madagascar beating M2K and Larry and ADHD in one go, you'll have capped gain and finish somewhere with 1300 points, whereas those top players would still have 2700 points.

The adjusting base gain-loss factor helps protect those top players from upset victories too. However, if the average performance of a player warrants a score drop, then your score will drop.

===

We can talk programming sometime soon. For now, I have to take data for my students (I'm a graduate TA at Oregon State).

Best Regards,

Tuen

Overswarm · Mar 9, 2010

I'm working on SAP support and have a degree in English and Secondary EDU. EDU people are best, whether assistants or instructors.

Dr. Tuen · Mar 9, 2010

Overswarm said:
I'm working on SAP support and have a degree in English and Secondary EDU. EDU people are best, whether assistants or instructors.

EDU as in Education? I've been doing educational research since last summer!

Overswarm · Mar 9, 2010

Yeah; I graduated last May. Little to no teaching jobs here though, so it's back to computers.

Dr. Tuen · Mar 9, 2010

Overswarm said:
Yeah; I graduated last May. Little to no teaching jobs here though, so it's back to computers.

The job market is why I'm trying to stick to grad school. Gotta try and wait it out. -.-;

Grad school usually lends time for smash and side projects like this! So it's nice.

Dr. Tuen · Apr 2, 2010

Small update:

The variable gain-loss parameter is being tested. It looks like it works, but the results are weird and the interactions are hard to understand thus far.

The biggest issue is the convergence of scores. This system is becoming more and more like the ELO chess ranking system, which is fine, since that is a very robust system. However, that system refreshes once every YEAR or so, and scores converge slowly. The smash community is more volatile, and needs a system that converges in MONTHS.

Factors will be adjusted until something appropriate for this application is set. Once that is finished, an update for the Pac Northwest will be done, and discussion for project expansion can begin.

-Tuen

hungrybox · Apr 11, 2010

good work guys.

you should try and implement this into the melee competitive scene.

Dr. Tuen · Apr 13, 2010

The automatic updater can read any TIO file you can make. Well, except for Round Robin (which I believe is in there... somewhere). So this could be used for the Melee community too after the base equations get sorted out (I'm estimating 2-3 weeks on that, tops), and after this program gets rewritten in something more widely applicable than MATLAB.

==

Also, I commonly post ideas when I think of them. And I just thought of one.

Future work idea:
Allow the updater to skip the name check if the name matches 100%. This could reduce the work load on the person inputting the data if the TO has impeccable spelling.

==

-Tuen

Overswarm · Apr 13, 2010

Keep up the work Tuen; this will be incredibly useful if completed!

Dr. Tuen · Apr 15, 2010

Updates:

4/15/2010
Programming (Overswarm, I think you'd be interested in this, as you expressed interest in crunching numbers too)

It looks like there are MATLAB clones out there available for download. I hear that none of them have 100% compatibility with MATLAB's ".m" files, but I don't think I used very complicated functions to begin with. After the equations settle out, this may be an accessible alternative to converting my nearly 1000 lines of code.

Efficiency
It may become useful to add this to the code: check for names that match the archive EXACTLY (100%). None of these names would have to be double-checked and re-typed by the user. Depending on how accurately the TO inputs names, this could save the number cruncher a lot of time.

Drawback: when the scale of this project increases, the number of randoms entering under their real name (say, eric) increases, and so do the chances of two names colliding. It has also been shown that, for fun and jokes, people like to enter tournaments under famous smash names (M2K, Larry, Ally, etc.), which could be an issue for this lazy method.

Solution: Do a quick run through of the bracket (in TIO) before running it. If you have a good knowledge of your players, the fixes are easy.
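A rough sketch of the exact-match shortcut, in Python rather than the MATLAB the updater is written in. The function name, the sample names, and the watchlist idea are all hypothetical; the watchlist is just one way to force a manual look at famous tags that jokers like to borrow.

```python
def split_names(bracket_names, archive_names, watchlist=()):
    """Split bracket entries into auto-accepted names and names needing manual review.

    A name is auto-accepted only if it matches an archive entry exactly
    (case-sensitive, after trimming surrounding whitespace) and is not on
    the watchlist. Everything else goes to the person inputting the data.
    """
    archive = set(archive_names)
    flagged = set(watchlist)
    accepted, review = [], []
    for raw in bracket_names:
        name = raw.strip()
        if name in archive and name not in flagged:
            accepted.append(name)
        else:
            review.append(name)
    return accepted, review


# Hypothetical bracket and archive, for illustration only.
accepted, review = split_names(
    ["Tuen", "tuen ", "M2K", "eric"],
    archive_names=["Tuen", "M2K", "Overswarm"],
    watchlist=["M2K"],
)
print(accepted)  # ['Tuen']
print(review)    # ['tuen', 'M2K', 'eric']
```

This does nothing about two different people who really do share a name, so the quick run through of the bracket is still needed.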

Overswarm · Apr 21, 2010

You wouldn't be able to stop faulty reports, save for manually. If people want to corrupt data, they'll corrupt it.

Dr. Tuen · Apr 23, 2010

Hey all!

It looks like I'm really good at solving problems when I don't have the time to implement them... Anyways, there is a solution to the variable K issue (see the TGL Project boards). It's called "Quantized K".

So what we had before was "Continuous K". The basic idea is that when players are worse (and susceptible to large fluctuations in skill... e.g. suddenly improving) they are assigned high K values (this is the max gain-loss value). And when players get better, they are assigned lower and lower K values based on past performances. This makes it hard to climb to the very top. It also makes it hard to knock off #1 via just one upset (say I beat Viviff. yaaaay. If it never happens again, I should NOT overtake him in the rankings).

But the problem before was that these values were continuous. The formulas could work out a K value of 200 for a bad player. If said player was just new (say, Carlos; see the first Variable K test) and did very well at their first recorded event, then every win was worth 200 points. Yep. Then, when the next event came around, that same person was ranked so high that they got a very, VERY low K value (~5 points), so that score was protected.

===

Anyways. Quantized K.

This just means that if your score is below some value (1300?) then your K value is 100. If it's between 1300 and 1500, it's 75. If it's between 1500 and 1700, it's 50, and if it's above 2000, it's 25. Or something like that, I'll work it out.

That way, we get all the positive effects without the wild score fluctuations and score protection.
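As a sketch only, the lookup could be a small step function. The cutoffs and K values below are the tentative ones from this post, not final numbers. Two things are assumptions: the post leaves 1700 to 2000 unspecified, so this version gives every score of 1700 or more a K of 25, and a score sitting exactly on a cutoff is placed in the higher band. The names are made up for the example, and it is in Python rather than MATLAB.

```python
import bisect

# Lower edges of the upper bands, and the K for each band.
# ASSUMPTION: scores of 1700 and up all get K = 25 (the 1700-2000 range is
# not specified in the post), and a score equal to a cutoff uses the higher band.
BAND_EDGES = [1300, 1500, 1700]
BAND_K = [100, 75, 50, 25]


def quantized_k(score):
    """Return the maximum gain-loss value (K) for a player with this score."""
    return BAND_K[bisect.bisect_right(BAND_EDGES, score)]


if __name__ == "__main__":
    for s in (1000, 1300, 1450, 1600, 2100):
        print(s, quantized_k(s))
```

Because K only changes when a score crosses a cutoff, a new player's gain per win is bounded at 100 here, and K drops one band at a time as the score rises.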

It should work.

You know. When I get the time. I've been up for 36 hours, so I'm going to go take a nap...

I just wanted everyone to know that I'm still thinking about this project.

Best Regards,

Tuen

Dr. Tuen · Apr 23, 2010

@OS yeah, that'll be annoying. I know enough about the PNW to eliminate that kind of error though. I'd just need a team of people to help me? Or just get to know everyone in the smash community :-)

Overswarm · Apr 23, 2010

I'd be up to help, and could get others to do so as well.

Dr. Tuen · Apr 23, 2010

@OS

I'm currently on a bit of a Hiatus, as I catch up with my research work. Though the Quantized K idea really really makes me want to spend time on this... I have mass transfer homework and a take home midterm to worry about for Monday.

I digress.

I think that the Quantized K will be the last change to the system before it's essentially complete. Any score decay will be added on after the fact (either with a separate program or manually). But the base program will be done.

After that, I plan on looking at converting my MATLAB code to one of those clones (which should require minimal effort). I'll download these clones and work with them until I can make a tutorial on how to operate the TGL 9000 (the name of the program).

When that's done, it should be ready for controlled distribution. At that step of the project, you will be one of the first people I contact. Then we can get this thread pumped full of interesting results for reliability analysis. I'm also curious to see how differing inflation rates affect the drift between regions. The drift may be correctable later on when trying to use this system for larger and more integrated events... but that's all speculation that will be researched when the time comes.

-Tuen

Overswarm · Apr 23, 2010

I'll be happy to help you out. Post in this thread / PM me and I'll get started with whatever you want to test.

Dr. Tuen · Apr 25, 2010

Version 1.00 should be out sometime this week. (MATLAB Version 1.00). When I convert it to a MATLAB clone, that'll be Version 1.50.

Anyways, a more detailed update to come later. I have to get sleep so that I get my ChE 520 take-home midterm done tomorrow. Whee!

SN Viper · Sep 4, 2010

This is beautiful. How can I get a copy of this program, or how can I take full advantage of this and the work you have put in? I am very interested in this.

SOVAman · Sep 5, 2010

didn't read much but, this looks cool.

I wonder if regions will actually use this though, once it is "perfect"

TheTantalus · Sep 6, 2010

Tuen,

This is good work. Once I get the calculations down I'll run it side by side with our panel this season. Our panel really hates statistics and numbers and would rather judge the skill of players individually.

I was thinking of doing something similar that wasn't as complicated as a ladder build. I'm sure there is a routine in visual basic somewhere that can run that ladder program for you in essence.

Dr. Tuen · Sep 14, 2010

Woah, this thread got necro'd something fierce.

Unfortunately my project hit a big snag and I lost a bit of ambition. No updates have occurred in over 4 months.

However, I do plan to revive this thing. I am getting Microsoft Visual Basic for free from the University and I know someone who knows how to program with his eyes closed. Now... he's just going to teach me what to do so that I can do it correctly myself... but the end result will be a distributable program for everyone to use.

The other update I intend on doing is one that may allow this project to go higher scale (multiple regions, national). The issue of region drift must be addressed somehow. I have some other ideas floating in my head [such as a stabilization score for everyone based on the consistency of their performance], but that is an avenue that has yet to be researched.

I hope to breathe life back into this soon.
