Brawl CPUs Learn?

t3h n00b · Feb 22, 2009

CPUs' actions aren't random, but they have certain probabilities, I believe. Now sometimes, the probability of an action happening is 100% (often for recoveries, the strange habit of airdodging after getting hit...), but I think in the CPU Ganon example, there could be, say, a 60% chance of him starting with a side-B. Or 100%. But you wouldn't know for sure unless you tried a lot of times.

ibd · Feb 22, 2009

As I pointed out... it's quite possible that the Wii only saves the 10, 100 or whatever "best" moves it has learnt. This would mean that the replay size remains constant, because regardless of how many moves the AI has learnt or forgotten, it will only save the 100 most recent best ones.

luvs2pluck · Feb 22, 2009

I would just like to point out a big flaw with the "replays just save what you do, then run the CPU with no data so it acts like it's a real match" idea.

So say you save a replay where the CPU kills you with Game & Watch by pulling a level 9 hammer. You go and watch the replay, and according to this idea, the CPU would still use a hammer at the same time, but it would most likely be a different hammer, since the game would simply generate a random hammer, correct?

Of course, if you play a CPU Game & Watch and he pulls hammers and you save the replay, he will always pull out the same hammers as he did in the actual match when you watch the replay. Therefore, I feel the idea that no data is saved for the CPU is ridiculous, unless I'm missing something.

Also, I feel like most of the people arguing here have not had much experience playing CPUs and are just speaking from a logical viewpoint. For all these people, I think you should go play a few matches and use G&W. Spam uair every time you knock a CPU into the air. Do this for 4 or 5 matches, then put the CPU as G&W and see what happens; then come back and argue over it.

P.S. People need to stop bringing up how the CPUs edgehog and RAR; they've been doing it since day 1.

XienZo · Feb 22, 2009

t3h n00b said:
CPUs' actions aren't random, but they have certain probabilities, I believe. Now sometimes, the probability of an action happening is 100% (often for recoveries, the strange habit of airdodging after getting hit...), but I think in the CPU Ganon example, there could be, say, a 60% chance of him starting with a side-B. Or 100%. But you wouldn't know for sure unless you tried a lot of times.

Exactly, so there are (at least) two different possibilities, but according to the theory, the file is identical, and there is the problem.

Levitas · Feb 22, 2009

XienZo said:
That's basically the gist of it. However, I want to point out that it would only record the "learned" behavior, and as an example, would switch off from a minute of unrecorded "standard" behavior to a few seconds of "learned" behavior, so it would only have to save maybe 15-20 seconds, not 3 minutes, of actions. Furthermore, I believe that the chains of "learned" behavior are quite small (jump, Falcon Punch), so the overall file size difference would be rather small.

However, if you find that the file size stays EXACTLY the same regardless of any "learned" behaviors, then, my theory would be incorrect.

Also, I just realized an issue with the current "replays don't save CPU data" theory. Wouldn't that mean that if 2 separate matches were recorded where the human player did absolutely nothing, the two replays would be exactly identical files and the CPUs should perform the exact same moves in the two original matches? (Well, until they trip.) That means that if you were playing a CPU Ganon and you were a G&W (doing nothing) on FD, and the Ganon started with a side-B, then every single time you played a CPU Ganon as a G&W (doing nothing) on FD (with the same spawn positions), the Ganon should use side-B.

Random seed. If three conditions are met (1. same starting point, 2. no human inputs, 3. same random seed), the match will play out the same.

This is extremely improbable, though.

Edit: @luvs2pluck, that's true. I don't play against CPUs ever. However, people have been saying this stuff since early in Melee's days, and it wasn't true then, and we have a lot more means to disprove it now.

Frown · Feb 22, 2009

I taught a CPU Dedede to infinite grab me. =D

Dev2000 · Feb 22, 2009

I saw a computer Zap Jump once :\ I was like WTF

M3WTW0 J0SH · Feb 22, 2009

I wouldn't think they'd learn. Wouldn't that require some degree of artificial intelligence or something?

I could be wrong.

Crackhead-Tudor · Feb 22, 2009

They start doing things that the player frequently does, I think. All I can play are CPUs, and I've been trying to get them to learn some things, and my Falco/DDD now chaingrab relentlessly. I think they adapt slightly, but not enough that they will ever be perfect.

XienZo · Feb 22, 2009

Levitas said:
Random seed. If three conditions are met (1. same starting point, 2. no human inputs, 3. same random seed), the match will play out the same.

This is extremely improbable, though.

I suppose it'd save the random seed as well.

Anyhow, I suppose the best way to test this is to obtain one of the "learned" behavior replays and then use the max-tripping code and see what happens.
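A minimal sketch of the "replay = random seed + human inputs" idea being argued over, not Brawl's actual replay code. The toy `simulate` function, the `CPU_MOVES` list, and the dictionary layout of `replay` are all made up for illustration; the only point is that if the CPU's choices depend on nothing but the seed and the recorded inputs, playback reproduces the match without storing a single CPU action.

```python
import random

# Hypothetical move list for a toy CPU; not taken from the game.
CPU_MOVES = ["side-B", "jab", "shield", "jump", "grab"]

def simulate(seed, human_inputs):
    """Re-run a toy match: the CPU reacts using only the seed and the human inputs."""
    rng = random.Random(seed)  # private generator, fully determined by the seed
    log = []
    for frame_input in human_inputs:
        # Toy rule: shield whenever the human attacks, otherwise pick at random.
        cpu_move = "shield" if frame_input == "attack" else rng.choice(CPU_MOVES)
        log.append((frame_input, cpu_move))
    return log

# The "replay file" holds no CPU actions at all, only the seed and the inputs.
replay = {"seed": 12345, "inputs": ["none", "attack", "none", "none", "jump"]}
live_match = simulate(replay["seed"], replay["inputs"])
playback = simulate(replay["seed"], replay["inputs"])
print(live_match == playback)  # prints True
```

Under this assumption the replay size depends only on the match length, which is why the file-size argument keeps coming up.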

t3h n00b · Feb 22, 2009

I've read most of this thread, but don't fully understand the conversation. Are you guys saying that CPUs learning could extend to "learning" probability-based actions, like tripping more/less frequently or a certain number on Judgment Hammer?

XienZo · Feb 22, 2009

t3h n00b said:
I've read most of this thread, but don't fully understand the conversation. Are you guys saying that CPUs learning could extend to "learning" probability-based actions, like tripping more/less frequently or a certain number on Judgment Hammer?

No, it's more of a side-argument, where they say that because of how replays work, CPU learning wouldn't work. We're saying that maybe their current theories of how replays work are incorrect, pointing out probability events. However, random seed basically settles it.

KO M · Feb 22, 2009

OK, I think we can safely say CPUs can learn and do learn, in a way.
But they cannot learn on their own.

By fighting humans, they can copy behavior and build their own style of play... but they are still basically choosing a random style.

This is what happened when I tested this.

Fight #1: I used Marth, CPU was Falco lvl 9. The Falco was very aggressive and was always staying on my ***, and tried spiking me a couple of times; it was very odd. When I got an early kill on it, it was almost as if it was thinking "How could that happen?", because it stayed there for about 10 seconds not doing anything. (It could also be that it was searching its data banks for a countermeasure.)
Bottom line: in the 1st fight Falco was aggressive.

#2: I used Sheik, CPU Falco. The Falco this time was very campy... After I ledge gimped it... it started PLANKING.
It kept jumping from the edge to double laser me. I went over to stop it, and it stage spiked me.
Bottom line: Falco was planking, very campy.

#3: I used Snake, CPU Falco. He started off by chain grabbing me... (he saw I was Snake... it knows matchups). Spiked me; 1st stock goes to the CPU. On the 2nd stock I kept him at bay with grenades and kept him off the stage. When I planted a mine and threw him into it, he DI'd out of the way, came back, SH lasered me, and came in with a boost smash (up smash). Now the Falco is still at 3 stocks, 180%.
I get him with an uptilt and he loses his stock. He now starts wavebouncing like crazy and SH DL'ing.
He gets me to 30%, grabs me, and starts chain grabbing me to the spike, 2-stocking me.
Bottom line: he played like a good Falco, like a human.

#4: I used ROB, CPU Falco. I have never played against a ROB with my Falco on this Wii, so when the match starts he really isn't sure what to do. He's never had this kind of matchup. I 3-stocked him and didn't get hit once.
I camped and ledge stalled him. He tried SH DL and he even once tried a boost smash. When I went under the stage, he tried following, maybe to stage spike me. (I've done that a lot to Pits when they go under.)
Bottom line: he was like a player who was not sure of the matchup.

CPUs can learn; it's just not really our kind of learning. It is basically taking data from us, and using it to see if it will pay off.
If I'm Falco and Marth a lot, and I do a lot of matchups with Snake and win those, the CPU version will know the matchup and beat the hell out of a Snake too.

t3h n00b · Feb 22, 2009

That seems kind of extreme, but pretty cool. Do you CG with Falco or have an opponent that does it a lot?

KO M · Feb 22, 2009

I chain grab a lot

Sigrid Fiinikkusu · Feb 22, 2009

lol my CPUs are finally learning to edgehog me ever since I started doing it to them.

bobson · Feb 23, 2009

Levitas said:
So you're proposing that there's a hidden level along with the displayed level.

That's not learning, that's setting a level.

Also note that the more complicated your explanation as to how it could work becomes, the more likely you're not correct.

Bad choice of words on my part; I meant it could save your AI's particular configuration at that point along with the replay.

I don't see any other explanation for the crouch taunting, chaingrabbing and glidetossing shenanigans that the AI does now which they didn't do before.

ibd · Feb 23, 2009

M3WTW0 J0SH said:
I wouldn't think they'd learn. Wouldn't that require some degree of artificial intelligence or something?

I could be wrong.

Hehe, what you are playing against IS an artificial intelligence, albeit a rather simple one. What you mean is that it would require some kind of system that analyzes your play and extracts "good" move sequences from it. I think it might have that.

luvs2pluck said:
Of course, if you play a CPU Game & Watch and he pulls hammers and you save the replay, he will always pull out the same hammers as he did in the actual match when you watch the replay. Therefore, I feel the idea that no data is saved for the CPU is ridiculous, unless I'm missing something.

You are. There is no such thing as "randomness" for a computer. If you start with the same number (random seed), then apply the same operations to it, you will always get the same sequence of "random" numbers.
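To make the seed point concrete, here is a small sketch using Python's standard `random` module. The function name `judgment_rolls` and the seed value 2009 are made up for illustration, and this is of course not the generator Brawl uses; it only shows that a pseudo-random generator started from the same number produces the same "hammers" every time.

```python
import random

def judgment_rolls(seed, count=5):
    """Return `count` hammer numbers (1-9) drawn from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, so nothing else disturbs its state
    return [rng.randint(1, 9) for _ in range(count)]

first = judgment_rolls(seed=2009)   # the "live match"
second = judgment_rolls(seed=2009)  # the "replay"
print(first == second)  # prints True: same seed, same operations, same sequence
```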

KO M said:
CPUs can learn; it's just not really our kind of learning. It is basically taking data from us, and using it to see if it will pay off.

Exactly.

Levitas · Feb 23, 2009

The problem is that we can actually disprove this and you guys are ignoring that evidence.

If this is true, your replays should be bigger for a same-length replay on a character that your CPU plays often than for one it never plays. Which they aren't.

bobson: I suppose if it started doing a uair 5 times to up b to glide attack chain as metaknight often, you'd claim that that's a specific setting?

RyanPF · Feb 23, 2009

I think everyone is missing the point.

Quite obviously, spiritual beings are taking control of CPUs and making them do weird stuff for the lulz. They're in our heads.

KO M · Feb 23, 2009

No, it's just that CPUs learn in the Der der dUR kind of way. I think this topic has come to a close.

Veril · Feb 23, 2009

There seems to be a lot of anecdotal evidence for it. A shame there isn't any hard proof.

I've noticed that Jigglypuff CPUs n-air out of shield and WOP more often since I've mained her. I also saw her waveland into grabs a number of times... That was surprising.

Levitas · Feb 23, 2009

Veril said:
There seems to be a lot of anecdotal evidence for it. A shame there isn't any hard proof.

I've noticed that Jigglypuff CPUs n-air out of shield and WOP more often since I've mained her. I also saw her waveland into grabs a number of times... That was surprising.

I think this is the key, right here. It's that we're learning, not the CPUs.

bobson · Feb 23, 2009

Levitas said:
The problem is that we can actually disprove this and you guys are ignoring that evidence.

So Sakurai and co. found out how to glidetoss and DACUS in testing, decided to leave them in, and then programmed the AI to start using them only after a certain amount of time had passed?
I don't think we're the ones ignoring evidence here.

Levitas said:
bobson: I suppose if it started doing a uair 5 times to up b to glide attack chain as metaknight often, you'd claim that that's a specific setting?

If it never did that ever until I started using it often against it, I would claim the CPU learned it from me.

Levitas · Feb 23, 2009

Guess what? Whether or not Sakurai intended glidetoss and DACUS, they are valid techniques that can be done with correct inputs. The fact remains that if there's no mechanism for CPUs to learn, then they can't.

As in, they can't.

If you're gonna support a point, make sure that it's actually possible for it to be correct. If it's already been proven that CPUs can't learn, you need to prove my premises false before it's even possible for the conclusion to be false. Meaning I don't have to look at any other data no matter how much it supports a conclusion contradictory to my claim.

ColinJF · Feb 23, 2009

There is one mechanism that can account for "learning" while being consistent with all the evidence, although it is highly unlikely. Consider the following.

How is the random seed for each match determined? One possibility is that it's based on what transpired in the previous games since the console was turned on. Suppose that

1) The random seed for a given game is precisely the number of times a human player has used Meta Knight's up air since the console turned on (which is stored in some hypothetical value in memory).
2) The AI is highly chaotic and different initial states cause the AI player to behave potentially quite differently.
3) One particular initial random seed value causes, by sheer fluke, the AI Meta Knight player to use up air a lot.

Then we would indeed observe using up air lots as Meta Knight to cause AI Meta Knight to go wild with up air in one particular match, while being consistent with the basic premise of the replay argument.

Of course, the seeding scheme I've outlined here is obviously not used in the game, but we can only rule out the seeding being based on some element of previous games by actually looking at what it is based on in the real game.
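The three premises can be played with in a toy model. Everything here is hypothetical: the `MOVES` list, the `cpu_moves` function, and the scan over 200 seeds exist only to show how "seed = number of human up airs" plus a chaotic AI could, by fluke, look like learning. It says nothing about how the real game seeds its matches.

```python
import random

# Hypothetical move list for a toy AI Meta Knight.
MOVES = ["up air", "down air", "tornado", "shuttle loop", "grab"]

def cpu_moves(human_up_air_count, n_moves=20):
    """Toy AI: the match seed is the human's up-air count since power-on (premise 1)."""
    rng = random.Random(human_up_air_count)
    # Different seeds give unrelated move sequences (premise 2, "chaotic" AI).
    return [rng.choice(MOVES) for _ in range(n_moves)]

def up_air_share(human_up_air_count):
    """Fraction of the toy AI's moves that are up airs for this seed."""
    moves = cpu_moves(human_up_air_count)
    return moves.count("up air") / len(moves)

# Scan seeds for the one where the toy AI happens to favour up air most (premise 3).
fluke_seed = max(range(200), key=up_air_share)
print(fluke_seed, up_air_share(fluke_seed))
```

A player who happened to land on `fluke_seed` up airs would see the AI "copy" them in that one match, even though nothing was learned or stored.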

t3h n00b · Feb 23, 2009

Levitas said:
Guess what? Whether or not Sakurai intended glidetoss and DACUS, they are valid techniques that can be done with correct inputs. The fact remains that if there's no mechanism for CPUs to learn, then they can't.

As in, they can't.

If you're gonna support a point, make sure that it's actually possible for it to be correct. If it's already been proven that CPUs can't learn, you need to prove my premises false before it's even possible for the conclusion to be false. Meaning I don't have to look at any other data no matter how much it supports a conclusion contradictory to my claim.

You can't prove that something can't happen without enormous amounts of trials, and that only proves that something hasn't yet happened. Besides, there are numerous videos showing CPUs doing actions that are very possible to be done, but not habitually done by CPUs before a human player used those techniques multiple times. Try it yourself. Do you have a character that you never use or play against frequently? Play someone that's really good with that character, and you will notice that the level 9 CPU is more "creative", if not more skilled, than before you played against that person (multiple times would make this more noticeable).

Oshtoby · Feb 23, 2009

Levitas said:
Guess what? Whether or not Sakurai intended glidetoss and DACUS, they are valid techniques that can be done with correct inputs. The fact remains that if there's no mechanism for CPUs to learn, then they can't.

As in, they can't.

If you're gonna support a point, make sure that it's actually possible for it to be correct. If it's already been proven that CPUs can't learn, you need to prove my premises false before it's even possible for the conclusion to be false. Meaning I don't have to look at any other data no matter how much it supports a conclusion contradictory to my claim.

"NAH NAH NAH BOO BOO, I CAN'T HEAR YOU."

This is all I heard when I read your post. I've already made my point, so I can't really offer any more to the table, but I just want to say that saying "NUH-UH, YOU CAN'T SAY ANYTHING 'TIL I AM PROVEN COMPLETELY 100% FALSE" is stifling to the debate.

Arguing is a lost art no one seems to understand anymore. While both sides' mouths must be wide open, both sides' ears must be, too. From what I understand, the believers have posted videos of near-proofs. The non-believers have brought nothing to the table but "logic," which boils down to nothing but thoughts and beliefs.

Now a clear way to test this would be to set up a match between two level nine CPUs on a fresh, clean SSBB save file, and record it on a PC. Then, afterward, train ONE and ONLY ONE of those two CPUs with all sorts of tactics. Then, redo the initial match. If the Trained CPU clearly does better this time around, we know that it has successfully learned. Since this is an experiment, we should throw some scientific method into the mix, and do this several times to see if it was a fluke.

ColinJF · Feb 23, 2009

Before you respond to Levitas, try actually understanding the argument we're dealing with here.

As far as I can tell there are only two flaws in Levitas's premises, namely (1) the random seed could be chosen in a way dependent on what has occurred in previous games, which would constitute, to me, a sort of "learning", and (2) replays could be padded with a ton of NULLs if not all the "learning data" is filled up.

(1) is nontrivial to falsify.

(2) is trivial to falsify, and I plan on looking into it later.
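A rough sketch of how (2) could be checked once the replay files are on a PC. The layout of Brawl's replay files is not known here, so the assumptions are loud ones: that padding, if any, would be a run of 0x00 bytes at the end of the file, and that the two byte strings below are invented stand-ins rather than real replay data.

```python
def trailing_nulls(data: bytes) -> int:
    """Count the run of 0x00 bytes at the end of `data`."""
    return len(data) - len(data.rstrip(b"\x00"))

def compare_replays(a: bytes, b: bytes) -> dict:
    """Report whether two files have equal size and how much non-padding each holds."""
    return {
        "same_size": len(a) == len(b),
        "payload_a": len(a) - trailing_nulls(a),
        "payload_b": len(b) - trailing_nulls(b),
    }

# Two made-up 16-byte "replays": equal size on disk, different amounts of real data.
plain = b"INPUTS" + b"\x00" * 10
learned = b"INPUTS+LEARN" + b"\x00" * 4
print(compare_replays(plain, learned))
# prints {'same_size': True, 'payload_a': 6, 'payload_b': 12}
```

If real replays of equal length showed equal payloads as well as equal sizes, the padding explanation would be ruled out; real files would be read with `open(path, "rb").read()`.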

Oshtoby · Feb 23, 2009

ColinJF said:
Before you respond to Levitas, try actually understanding the argument we're dealing with here.

As far as I can tell there are only two flaws in Levitas's premises, namely (1) the random seed could be chosen in a way dependent on what has occurred in previous games, which would constitute, to me, a sort of "learning", and (2) replays could be padded with a ton of NULLs if not all the "learning data" is filled up.

(1) is nontrivial to falsify.

(2) is trivial to falsify, and I plan on looking into it later.

The argument, to my understanding (correct me if I'm wrong), is that we're trying to prove/disprove that a CPU will start using tidbits of your tactics from recorded replays in any and all fights, not just against you.

XienZo · Feb 23, 2009

Levitas said:
The problem is that we can actually disprove this and you guys are ignoring that evidence.

I know full well you "can". But no one has a conclusive, scientific-method-worthy process for proving or disproving it. See, the hard part about this is that you have to remove ALL OTHER VARIABLES, regardless of whether you're proving or disproving it.

If this is true, your replays should be bigger for a same-length replay on a character that your CPU plays often than for one it never plays. Which they aren't.

That my CPU plays more often? It's more like "your replays should be bigger for a same-length replay on a character where the CPU uses seemingly "learned" behavior than for one where it doesn't." Simply "plays often" is too general to cover only "learned" behavior. You can't just play a CPU Falco 20 times and test it. Your Falco needs to be doing something we would claim to be "learned", or you're not testing anything.

Levitas · Feb 23, 2009

Colin's got the only relevant response I've heard since this whole thing started.

Logic is pretty cool. It means I'm done arguing when I've proven something.

Unfortunately, the seed could be determined by previous games and then subsequently determine values that fit some pattern.

We don't even need to see a CPU play necessarily to prove that they don't learn.

t3h n00b · Feb 23, 2009

Levitas said:
Colin's got the only relevant response I've heard since this whole thing started.

Logic is pretty cool. It means I'm done arguing when I've proven something.

Unfortunately, the seed could be determined by previous games and then subsequently determine values that fit some pattern.

We don't even need to see a CPU play necessarily to prove that they don't learn.

Guys, CPUs learn. They CG if people playing on the save file CG. They sometimes crouch taunt. They Falcon Punch all the time after you go into With Anyone on wifi. Make all the logical statements you want, if you ever get around to your "experiments", you will see this.

Oshtoby · Feb 23, 2009

t3h n00b said:
if you ever get around to your "experiments", you will see this.

In the process. Trying to export my current SSBB save to my computer first.

t3h n00b · Feb 23, 2009

Check this out. It may save you some trouble. http://allisbrawl.com/blogpost.aspx?id=8848

Levitas · Feb 23, 2009

Which would fit under Colin's 2nd option, and therefore is easy to verify.

Oshtoby · Feb 23, 2009

t3h n00b said:
Check this out. It may save you some trouble. http://allisbrawl.com/blogpost.aspx?id=8848

This has been posted before in this very thread, but some people count that as speculation, and not REAL proof. The proof I would offer would not be concrete, but a little harder than that.

XienZo · Feb 23, 2009

t3h n00b said:
Check this out. It may save you some trouble. http://allisbrawl.com/blogpost.aspx?id=8848

I've seen that, but it could still be taken as random lucky stuff. What we need is a well-structured experiment:

1. Choose two characters and a stage. Also, decide what techniques will be "taught"

2. Save a few replays of a CPU's behavior before "learning" (control)

3. Start to perform the process of "teaching" the CPU by using only the specific techniques (successfully), and save them as replays, and watch them often.

4. Save replays of CPU's behavior after "learning"

5. Compare how much more often the CPU used the techniques after the "learning" than before.

6. If the last 4 steps are negative, scrap the whole thing and pick someone else, and w/e you do, don't tell anyone.
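Step 5 can be sketched as a simple permutation test on per-replay counts, which asks how often a before/after gap this large would show up if the labels were shuffled at random. The counts below are invented for illustration, and the function names, the 10,000 shuffles, and the fixed shuffle seed are arbitrary choices, not part of the procedure above.

```python
import random

def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)

def permutation_p_value(before, after, trials=10_000, seed=0):
    """One-sided p-value for mean(after) > mean(before), by shuffling the labels."""
    rng = random.Random(seed)  # fixed seed so the test itself is repeatable
    observed = mean(after) - mean(before)
    pooled = list(before) + list(after)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        # Treat the first len(before) shuffled values as "before", the rest as "after".
        diff = mean(pooled[len(before):]) - mean(pooled[:len(before)])
        if diff >= observed:
            hits += 1
    return hits / trials

# Made-up counts of chain grabs per replay, for illustration only.
before = [0, 1, 0, 0, 1]
after = [3, 2, 4, 1, 3]
print(mean(after) - mean(before), permutation_p_value(before, after))
```

A small p-value would mean the increase is unlikely to be "random lucky stuff"; it would still say nothing about why the CPU changed, which is what the control replays in step 2 are for.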

t3h n00b · Feb 23, 2009

Oshtoby said:
This has been posted before in this very thread, but some people count that as speculation, and not REAL proof. The proof I would offer would not be concrete, but a little harder than that.

Ok I didn't know it was posted before, but you can't get much more proof than that imo. But I'll take that back if you get something good ^_^

-Syn- · Feb 24, 2009

I actually noticed my CPU learning before I even heard this argument at all online. I play 1v1 with my friend quite often and we almost always use the same characters. The matchup is usually Ike vs. Wolf. No other human player at my house (I also only play him 1v1 online and never do "With Anyone" matches because of the lag and the lame rules) ever uses Wolf, and I'm pretty much the only one who uses Ike. One day while playing my Ike and randoming lvl. 9 CPUs, which I always 6-1 or 6-2 stock, I randomed a Wolf CPU. I immediately noticed a similarity in its play style and situational moves to that of my friend's. Believe me, the furthest thing from my mind was CPUs with the capability to learn until I played this one that mimicked parts of my friend's metagame. Btw: my theory for the other CPUs sucking is that most of my other friends who play are pretty terrible lol and I only play Ike and MK.
