
Pokemon META Prediction:

META-potential Stats

Intro
Pokemon data is an unusual topic for an educational report, but since this is not a formal
one, I decided to use it. I chose it because no other dataset on kaggle.com interested me as
much, and Pokemon is one of my favorite game series. As in any other game, players in the
community work out how to train and optimize their Pokemon for competitive battling, which is
another source of fun in the series. As someone interested in competitive games in general, and
a big fan of Pokemon, I expect to enjoy this project a lot.

When I saw the data on the website, I immediately thought of a way to use AI to predict
something with it: whether a particular Pokemon has a set of stats worthy of the metagame
(Most Effective Tactic Available), assuming we do not already know whether it is. I believe
this will be helpful when the newly announced Generation 8 comes out later in the year.

Sample data
The following data was retrieved from a combination of kaggle.com and
pokemon.neoseeker.com, because the Kaggle data did not include a label indicating whether each
Pokemon is in the current metagame.

(The columns, in order, are: Form, Tier, Type1, Type2, HP, Attack, Defense, Special Atk, Special Def, Speed, Total)
Before introducing each Pokemon with its tier, I need to explain the tier list a bit more.
Pokemon fall into many tiers, such as Ubers, Over Used (OU), Never Used (NU), and more, but what
is usually considered the metagame is OU, as those Pokemon are well balanced and not too
overpowered. In this report I also count Ubers as metagame, since Pokemon can move between
these tiers (for example, current Ubers such as Mega Kangaskhan were once OU) when their
abilities and movesets change. Those changes do not matter much here, since we use only the
stats. The only exception is a form change tied to an ability, such as Aegislash, whose stats
change when it changes form.
These are other samples from each generation. As these samples show, not all "legendary"
and "mythical" Pokemon are metagame.
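
The labeling rule described above can be sketched in a few lines of Python. This is only an illustration, not the code used for this report: the class ids follow the result tables (1 = Ubers, 2 = OU, and 0 is assumed to mean non-metagame), while the tier strings and the helper names `tier_to_label` and `is_metagame` are hypothetical.

```python
# Hypothetical mapping from a tier name to a class id.
# Class ids follow the result tables: 1 = Ubers, 2 = OU.
# Class 0 is assumed to cover every other tier (non-metagame).
TIER_TO_LABEL = {"Uber": 1, "OU": 2}


def tier_to_label(tier: str) -> int:
    """Return 1 for Ubers, 2 for OU, and 0 for any other tier."""
    return TIER_TO_LABEL.get(tier.strip(), 0)


def is_metagame(tier: str) -> bool:
    """In this report both Ubers and OU count as metagame."""
    return tier_to_label(tier) != 0


if __name__ == "__main__":
    for tier in ["Uber", "OU", "NU"]:
        print(tier, tier_to_label(tier), is_metagame(tier))
```

Keeping Ubers and OU as separate classes, rather than one "metagame" class, makes it possible to see later how well each tier is predicted on its own.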

Preparation and Conversion


Right after downloading the data, I noticed that the Kaggle dataset provides a lot of
information, such as weight, height, body shape, and Pokedex entries (a description of each
Pokemon). Most of it is not necessary for training the AI. Moreover, some Pokemon forms do not
matter for training, since their stats are exactly the same.
Once I finished the elimination, I started converting the data into numbers using
the following method:
Process
As soon as I finished the conversion, I started rewriting my code in Colaboratory right
away. With many ideas coming to mind while coding, I decided to run the experiment in a few
different ways.
First, I realized that the data was being split randomly into a training set and a test
set, so I also tried a different split: using only Pokemon from Generations 1 through 6 to
train the AI, then letting it predict Generation 7. This is useful in practice as well, because
the same method could be used to predict newer generations.
Second, after experimenting with the layers, I found that the dimensions
(30, 300, 400, 500, 20) give more balanced results than (10, 1000, 1000, 1000, 10), while the
latter is more accurate on the "non-metagame" category. The first set of dimensions, however,
predicts the "Ubers" more accurately.
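
A generation-based split like the one just described could look roughly like the sketch below. It assumes each row carries a `generation` field, which is not among the columns listed earlier, so that field name and the helper `split_by_generation` are hypothetical. The four example rows are placeholders that contain only a name and a generation.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def split_by_generation(
    rows: List[Row], last_train_gen: int = 6
) -> Tuple[List[Row], List[Row]]:
    """Put generations up to last_train_gen in train, later ones in test."""
    train = [r for r in rows if int(r["generation"]) <= last_train_gen]
    test = [r for r in rows if int(r["generation"]) > last_train_gen]
    return train, test


if __name__ == "__main__":
    # Placeholder rows; the real data would also hold the stats and the tier.
    rows = [
        {"name": "Bulbasaur", "generation": 1},
        {"name": "Greninja", "generation": 6},
        {"name": "Mimikyu", "generation": 7},
        {"name": "Toxapex", "generation": 7},
    ]
    train, test = split_by_generation(rows)
    print([r["name"] for r in train])
    print([r["name"] for r in test])
```

Unlike a random split, this keeps every Generation 7 Pokemon out of training, which is closer to the situation of predicting a generation that has not been released yet.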
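
For readers who want to try the two layer settings, here is a minimal sketch. It assumes the tuples are the hidden layer sizes of scikit-learn's `MLPClassifier`; the report does not name the library or the other hyperparameters, so `max_iter`, the random seed, and the helper `build_model` are all assumptions. The data in the demo is random placeholder data, not Pokemon stats.

```python
import numpy as np
from sklearn.metrics import classification_report
from sklearn.neural_network import MLPClassifier

# The two layer settings compared in this report.
BALANCED_LAYERS = (30, 300, 400, 500, 20)
WIDE_LAYERS = (10, 1000, 1000, 1000, 10)


def build_model(hidden_layers, max_iter=500, seed=0):
    """Build an MLP whose hidden layers have the given sizes (assumed setup)."""
    return MLPClassifier(
        hidden_layer_sizes=hidden_layers, max_iter=max_iter, random_state=seed
    )


if __name__ == "__main__":
    # Random placeholder features and labels, only to show the workflow.
    rng = np.random.default_rng(0)
    X = rng.integers(20, 160, size=(80, 6)).astype(float)
    y = (X.sum(axis=1) > 540).astype(int)
    X_train, X_test = X[:60], X[60:]
    y_train, y_test = y[:60], y[60:]

    model = build_model(BALANCED_LAYERS)
    model.fit(X_train, y_train)
    print(classification_report(y_test, model.predict(X_test), zero_division=0))
```

Swapping `BALANCED_LAYERS` for `WIDE_LAYERS` trains the larger network, which has far more weights and therefore takes noticeably longer.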

AI Accuracy
My initial hypothesis was that the AI would not predict accurately, since I already knew
that tiers depend on factors other than stats. Surprisingly, the AI was quite accurate on the
Ubers and the non-metagame Pokemon but did very poorly on the OUs. My speculation is that the
OUs are too balanced and depend more on their movesets and abilities than on their stats, which
makes them hard to tell apart from Pokemon with ordinary stats. The Ubers, by contrast, have
high stats as well as good combinations of abilities and movesets.
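
The gap between the averaged scores in the tables below reflects the same effect. The small sketch here recomputes the macro and weighted recall from the per-class values of the first table (random Gen 1-7 split, layers (30, 300, 400, 500, 20)). The helper names are hypothetical; the formulas are the usual unweighted mean and support-weighted mean.

```python
def macro_average(scores):
    """Unweighted mean: every class counts equally."""
    return sum(scores) / len(scores)


def weighted_average(scores, supports):
    """Mean weighted by the number of samples (support) in each class."""
    total = sum(supports)
    return sum(s * n for s, n in zip(scores, supports)) / total


if __name__ == "__main__":
    # Per-class recall and support for classes 0, 1 (Ubers), 2 (OU).
    recall = [0.95, 0.75, 0.20]
    support = [209, 20, 10]
    print(round(macro_average(recall), 2))              # 0.63
    print(round(weighted_average(recall, support), 2))  # 0.9
```

Because class 0 makes up 209 of the 239 test samples, the weighted average stays near 0.90 even though OU recall is only 0.20, while the macro average, which gives OU the same weight as the other classes, drops to 0.63.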

Predicting random Gen 1-7
Layers: (30, 300, 400, 500, 20)

               precision   recall   f1-score   support

0                   0.94     0.95       0.95       209
1 (Ubers)           0.88     0.75       0.81        20
2 (OU)              0.18     0.20       0.19        10

micro avg           0.90     0.90       0.90       239
macro avg           0.67     0.63       0.65       239
weighted avg        0.91     0.90       0.90       239

// Doesn't take a lot of time

Predicting random Gen 1-7
Layers: (10, 1000, 1000, 1000, 10)

               precision   recall   f1-score   support

0                   0.96     0.92       0.94       211
1 (Ubers)           0.59     0.93       0.72        14
2 (OU)              0.27     0.29       0.28        14

micro avg           0.88     0.88       0.88       239
macro avg           0.61     0.71       0.65       239
weighted avg        0.90     0.88       0.89       239

// Takes a lot of time

Use Gen 1-6 to train, use Gen 7 to test
Layers: (30, 300, 400, 500, 20)

               precision   recall   f1-score   support

0                   0.87     0.89       0.88        93
1 (Ubers)           1.00     0.33       0.50         9
2 (OU)              0.13     0.18       0.15        11

micro avg           0.78     0.78       0.78       113
macro avg           0.67     0.47       0.51       113
weighted avg        0.81     0.78       0.78       113

Use Gen 1-6 to train, use Gen 7 to test
Layers: (10, 1000, 1000, 1000, 10)

               precision   recall   f1-score   support

0                   0.90     0.95       0.92        93
1 (Ubers)           0.50     0.44       0.47         9
2 (OU)              0.43     0.27       0.33        11

micro avg           0.84     0.84       0.84       113
macro avg           0.61     0.55       0.58       113
weighted avg        0.82     0.84       0.83       113

Reference
- CSV data: https://www.kaggle.com/mylesoneill/pokemon-sun-and-moon-gen-7-stats#pokemon.csv
- Pokemon Tier list: https://pokemon.neoseeker.com/wiki/Uber#Uber
