Professional Documents
Culture Documents
Embeddings
Charles Ollion - Olivier Grisel
1/7
Outline
Recommender Systems
2/7
Outline
Recommender Systems
Embeddings
3/7
Outline
Recommender Systems
Embeddings
Recommender Systems
5/7
Recommender Systems
Recommend contents and products
Recommender Systems
Recommend contents and products
Recommender Systems
Recommend contents and products
Recommender Systems
Recommend contents and products
RecSys 101
Content-based vs Collaborative Filtering (CF)
RecSys 101
Content-based vs Collaborative Filtering (CF)
RecSys 101
Content-based vs Collaborative Filtering (CF)
T 2 2 2
L(U , V ) = ||ri,j − u ⋅ v j || + λ(||U || + ||V || )
i 2 2 2
∑
(i,j)∈D
Use T
V to find missing entries in sparse rating data matrix
Use U T V to find missing entries in sparse rating data matrix R
19 / 7
Embeddings
20 / 7
Symbolic variable
Text: characters, words, bigrams...
Symbolic variable
Text: characters, words, bigrams...
Symbolic variable
Text: characters, words, bigrams...
Symbolic variable
Text: characters, words, bigrams...
Notation:
Symbol s in vocabulary V
24 / 7
One-hot representation
|V |
onehot('salad') = [0, 0, 1, . . . , 0] ∈ {0, 1}
One-hot representation
|V |
onehot('salad') = [0, 0, 1, . . . , 0] ∈ {0, 1}
euclidean distance = √‾
euclidean distance = √2
‾
26 / 7
Embedding
d
embed d ing('salad') = [3.28, −0.45, . . . 7.11] ∈ ℝ
27 / 7
Embedding
d
embed d ing('salad') = [3.28, −0.45, . . . 7.11] ∈ ℝ
Embedding
d
embed d ing('salad') = [3.28, −0.45, . . . 7.11] ∈ ℝ
2
||x − y|| = 2 ⋅ (1 − cosine(x, y))
2
34 / 7
2
||x − y|| = 2 ⋅ (1 − cosine(x, y))
2
or alternatively:
2
||x − y||
2
cosine(x, y) = 1 −
2
35 / 7
2
||x − y|| = 2 ⋅ (1 − cosine(x, y))
2
or alternatively:
2
||x − y||
2
cosine(x, y) = 1 −
2
Visualizing Embeddings
Visualizing requires a projection in 2 or 3 dimensions
Objective: visualize which embedded symbols are similar
37 / 7
Visualizing Embeddings
Visualizing requires a projection in 2 or 3 dimensions
Objective: visualize which embedded symbols are similar
PCA
Visualizing Embeddings
Visualizing requires a projection in 2 or 3 dimensions
Objective: visualize which embedded symbols are similar
PCA
t-SNE
Visualizing data using t-SNE, L van der Maaten, G Hinton, The Journal of Machine
Learning Research, 2008
39 / 7
t-Distributed Stochastic
Neighbor Embedding
Unsupervised, low-dimension, non-linear projection
Optimized to keep relative distance between nearest neighbors
40 / 7
t-Distributed Stochastic
Neighbor Embedding
Unsupervised, low-dimension, non-linear projection
Optimized to keep relative distance between nearest neighbors
Visualizing Mnist
43 / 7
Architecture and
Regularization
44 / 7
user
item
45 / 7
user
item
46 / 7
item
47 / 7
user
item-
48 / 7
user
item-
49 / 7
Regularization
Size of the embeddings
56 / 7
Regularization
Size of the embeddings
Regularization
Size of the embeddings
L2 penalty on embeddings
58 / 7
Regularization
Size of the embeddings
L2 penalty on embeddings
Dropout
Dropout
Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava et al.,
Journal of Machine Learning Research 2014
Journal of Machine Learning Research 2014
60 / 7
Dropout
Interpretation
Ensemble interpretation
Dropout
Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava et al.,
Journal of Machine Learning Research 2014
Journal of Machine Learning Research 2014
62 / 7
Overfitting Noise
63 / 7
A bit of Dropout
64 / 7
model = Sequential()
model.add(Dense(hidden_size, input_shape, activation='relu'))
model.add(Dropout(p=0.5))
model.add(Dense(hidden_size, activation='relu'))
model.add(Dropout(p=0.5))
model.add(Dense(output_size, activation='softmax'))
66 / 7
Ethical Considerations of
Recommender Systems
67 / 7
Ethical Considerations
Amplification of existing discriminatory and
unfair behaviors / bias
Ethical Considerations
Amplification of existing discriminatory and
unfair behaviors / bias
Call to action
Designing Ethical Recommender Systems
Call to action
Designing Ethical Recommender Systems
Transparency
Call to action
Designing Ethical Recommender Systems
Transparency