Azamat Berdyshev
University of Toronto
October 1, 2017
Outline
$$
\begin{aligned}
H(X) &= -\sum_{x \in \mathcal{X}} P(x) \log P(x) \\
     &= -\mathbb{E}[\log P(X)]
\end{aligned}
$$
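As a small numerical illustration of the entropy definition above, the following sketch evaluates the sum for a discrete distribution given as a list of probabilities. The function name `entropy`, the `base` parameter, and the two coin distributions are assumptions made for this example, not part of the slides.

```python
import math


def entropy(p, base=2.0):
    """Shannon entropy H(X) = -sum_x P(x) log P(x) of a discrete distribution."""
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with P(x) = 0 are skipped, following the convention 0 log 0 = 0.
    return -sum(px * math.log(px, base) for px in p if px > 0)


# Hypothetical examples: a fair coin and a heavily biased coin.
fair_coin = entropy([0.5, 0.5])
biased_coin = entropy([0.9, 0.1])
```

With base 2 the result is in bits: the fair coin attains the maximum for two outcomes, and the biased coin falls below it.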
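$$
\begin{aligned}
H(Y|X) &= \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} P_{XY}(x, y) \log \frac{P_X(x)}{P_{XY}(x, y)} \\
       &= \sum_{x \in \mathcal{X}} P_X(x) \, H(Y|X = x)
\end{aligned}
$$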
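The double sum above, with the ratio $P_X(x) / P_{XY}(x, y)$ inside the logarithm, is the entropy of $Y$ given $X$. The sketch below evaluates it directly from a joint table. The table layout `p_xy[x][y]` and the example distribution are assumptions made for illustration.

```python
import math


def conditional_entropy(p_xy, base=2.0):
    """H(Y|X) = sum_{x,y} P_XY(x,y) log(P_X(x) / P_XY(x,y)) for a table p_xy[x][y]."""
    total = 0.0
    for row in p_xy:
        p_x = sum(row)  # marginal P_X(x) is the row sum
        for p in row:
            if p > 0:  # zero-probability cells contribute nothing
                total += p * math.log(p_x / p, base)
    return total


# Hypothetical joint distribution: X is a fair bit; Y is always 0 when X = 0
# and a fair coin when X = 1.
joint = [[0.5, 0.0],
         [0.25, 0.25]]
h_y_given_x = conditional_entropy(joint)
```

In this example $H(Y|X=0) = 0$ and $H(Y|X=1) = 1$ bit, so the weighted form $\sum_x P_X(x) H(Y|X=x)$ gives $0.5 \cdot 0 + 0.5 \cdot 1 = 0.5$ bits, matching the double sum.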
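◮ Consider the information channel: $X \xrightarrow{f(x)} T \xrightarrow{g(t)} Y$
$$
\begin{aligned}
\underset{P_{T|X}(t|x)}{\text{minimize}} \quad & I(X; T) \\
\text{subject to} \quad & I(T; Y) > \epsilon
\end{aligned}
\tag{1}
$$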
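To make the two quantities in problem (1) concrete, the sketch below computes $I(X;T)$ and $I(T;Y)$ for a given encoder $P_{T|X}$. It assumes that $T$ is generated from $X$ alone, so that $P_{TY}(t, y) = \sum_x P_{T|X}(t|x) P_{XY}(x, y)$. This assumption, the function names, and the toy distribution are not taken from the slides. The code only evaluates the objective and the constraint; it does not solve the optimization.

```python
import math


def mutual_information(joint, base=2.0):
    """I(A;B) for a joint table joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (p_a[i] * p_b[j]), base)
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )


def bottleneck_terms(p_xy, p_t_given_x):
    """Return (I(X;T), I(T;Y)) for an encoder table p_t_given_x[x][t]."""
    n_x, n_y, n_t = len(p_xy), len(p_xy[0]), len(p_t_given_x[0])
    p_x = [sum(row) for row in p_xy]
    # P_XT(x,t) = P_X(x) * P_{T|X}(t|x)
    p_xt = [[p_x[x] * p_t_given_x[x][t] for t in range(n_t)] for x in range(n_x)]
    # P_TY(t,y) = sum_x P_{T|X}(t|x) * P_XY(x,y)  (assumes T depends on X only)
    p_ty = [[sum(p_t_given_x[x][t] * p_xy[x][y] for x in range(n_x))
             for y in range(n_y)] for t in range(n_t)]
    return mutual_information(p_xt), mutual_information(p_ty)


# Hypothetical toy problem: X uniform on {0,1,2,3}, Y = X // 2.
p_xy = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]
identity = [[1.0 if t == x else 0.0 for t in range(4)] for x in range(4)]
merged = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

ixt_id, ity_id = bottleneck_terms(p_xy, identity)
ixt_mg, ity_mg = bottleneck_terms(p_xy, merged)
```

Here the merged encoder halves $I(X;T)$ from 2 bits to 1 bit while keeping $I(T;Y)$ at 1 bit, which is the kind of trade-off the constrained problem is after.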
◮ problem:
$$\text{minimize} \quad f(x) + \lambda \sum_{i=1}^{N} \|x_i\|_2$$
i.e., like lasso, but whole groups of variables are set to zero or kept together
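The group-sparsity effect of the penalty can be seen in its proximal operator, often called block soft thresholding, which shrinks a whole group toward zero and sets it exactly to zero once its Euclidean norm falls below the threshold. The slides do not state this operator here, so the sketch below is an illustration under that assumption; the function names and the parameter `kappa` are hypothetical.

```python
import math


def group_penalty(groups, lam):
    """lam * sum_i ||x_i||_2 for a list of groups, each a list of floats."""
    return lam * sum(math.sqrt(sum(c * c for c in g)) for g in groups)


def block_soft_threshold(v, kappa):
    """Proximal operator of kappa * ||v||_2: scale v by max(0, 1 - kappa / ||v||_2)."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm <= kappa:
        # The whole group is zeroed together, never coordinate by coordinate.
        return [0.0] * len(v)
    scale = 1.0 - kappa / norm
    return [scale * c for c in v]


large_group = block_soft_threshold([3.0, 4.0], kappa=1.0)  # norm 5, shrunk by factor 0.8
small_group = block_soft_threshold([0.3, 0.4], kappa=1.0)  # norm 0.5, zeroed entirely
penalty = group_penalty([[3.0, 4.0], [0.0, 0.0]], lam=2.0)
```

Unlike the scalar soft threshold used for lasso, the decision to zero is made per group, based on the group's norm.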
◮ problem:
$$\text{minimize} \quad f(x) + \sum_{i=1}^{N} \lambda_i \|x_{g_i}\|_2$$
where $g_i \subseteq [n]$ and $\mathcal{G} = \{g_1, \ldots, g_N\}$
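A short sketch of how the penalty with index sets can be evaluated when groups are allowed to share variables. Indices are 0-based here, whereas the slide's $[n]$ is 1-based; the function name and the example groups and weights are assumptions made for illustration.

```python
import math


def overlapping_group_penalty(x, groups, weights):
    """sum_i lambda_i * ||x_{g_i}||_2, where each g_i is a list of indices into x."""
    if len(groups) != len(weights):
        raise ValueError("need one weight per group")
    total = 0.0
    for g, lam in zip(groups, weights):
        total += lam * math.sqrt(sum(x[j] ** 2 for j in g))
    return total


# Hypothetical example: index 1 belongs to both groups.
x = [3.0, 4.0, 0.0, 0.0]
groups = [[0, 1], [1, 2, 3]]
weights = [1.0, 0.5]
penalty = overlapping_group_penalty(x, groups, weights)
```

The first group contributes $1.0 \cdot 5$ and the second $0.5 \cdot 4$, so the shared variable is counted in both terms.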
hierarchical selection:
[Figure: tree of variables; surviving node labels are 2, 3 on one level and 4, 5, 6 on the next]
AA = A*A';
L = chol(eye(m) + AA);
xt = xt + xx - xz;
yt = yt + yx - yz;
end
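The fragment above forms `AA = A*A'` and factors `eye(m) + AA` with `chol`. A minimal pure-Python sketch of a factor-once, solve-many pattern built on such a factor follows. Whether the original code reuses `L` this way is an assumption, since the surrounding loop is not shown, and all function names are hypothetical. MATLAB's `chol` returns the upper-triangular factor by default, whereas this sketch stores the lower-triangular one.

```python
import math


def gram_plus_identity(A):
    """Return I + A A^T for a matrix A given as a list of rows."""
    m = len(A)
    return [[(1.0 if i == j else 0.0) + sum(a * b for a, b in zip(A[i], A[j]))
             for j in range(m)] for i in range(m)]


def cholesky_lower(M):
    """Lower-triangular L with L L^T = M, for symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_with_factor(L, b):
    """Solve (L L^T) z = b by forward substitution, then backward substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):  # forward: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):  # backward: L^T z = y
        z[i] = (y[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z


# Hypothetical data: factor once, then reuse the factor for each right-hand side.
A = [[1.0, 2.0], [3.0, 4.0]]
M = gram_plus_identity(A)          # [[6, 11], [11, 26]]
L = cholesky_lower(M)
z = solve_with_factor(L, [-5.0, -15.0])
```

Since `eye(m) + AA` is positive definite for any `A`, the factorization always exists, and each later solve costs only two triangular substitutions.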
Applications in Deep Learning