
Information Bottleneck Method

Azamat Berdyshev

University of Toronto

October 1, 2017
Outline

Some Information Theory basics

Information Bottleneck Method

Applications in Deep Learning



Entropy

Let X ∈ 𝒳 be a discrete r.v. distributed as X ∼ P. Then

  H(X) = − ∑_{x∈𝒳} P(x) log P(x)
       = − E[log P(X)]

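Not part of the original slides: a minimal MATLAB/Octave sketch of the definition, using a hypothetical pmf p.

% Entropy of a discrete pmf, straight from the definition (in nats).
p = [0.5 0.25 0.125 0.125];              % hypothetical example distribution
H = -sum(p(p > 0) .* log(p(p > 0)));     % drop zero-probability outcomes (0·log 0 = 0)
% Using log2 instead of log gives entropy in bits; for this p the answer is 1.75 bits.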


Conditional Entropy

Let (X, Y) ∈ 𝒳 × 𝒴 be a pair of discrete r.v.s jointly distributed as (X, Y) ∼ P_XY. Then

  H(X|Y) = ∑_{x∈𝒳} ∑_{y∈𝒴} P_XY(x, y) log ( P_Y(y) / P_XY(x, y) )
         = ∑_{y∈𝒴} P_Y(y) H(X | Y = y)

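As a rough sketch (again not from the slides), H(X|Y) can be computed from any joint pmf via the mixture form above; Pxy below is a hypothetical example.

% Conditional entropy H(X|Y) from a joint pmf Pxy (rows: x, columns: y), in nats.
Pxy  = [0.25 0.25; 0.10 0.40];             % hypothetical joint distribution
Py   = sum(Pxy, 1);                        % marginal P_Y(y)
Px_y = Pxy ./ Py;                          % P_X|Y(x|y), columns sum to 1
HX_y = -sum(Px_y .* log(Px_y + eps), 1);   % H(X | Y = y) for each y (eps guards log 0)
HXgY = sum(Py .* HX_y);                    % H(X|Y) = ∑_y P_Y(y) H(X | Y = y)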


Mutual Information

  I(X; Y) = H(X) − H(X|Y)
          = ∑_{x∈𝒳} ∑_{y∈𝒴} P_XY(x, y) log ( P_XY(x, y) / (P_X(x) P_Y(y)) )

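A small consistency check (hypothetical joint pmf, not from the slides): both expressions above give the same number.

% Mutual information I(X;Y) from a joint pmf, two equivalent ways (in nats).
Pxy = [0.25 0.25; 0.10 0.40];              % hypothetical joint distribution
Px  = sum(Pxy, 2);  Py = sum(Pxy, 1);      % marginals P_X, P_Y
R   = Pxy ./ (Px * Py);                    % ratio P_XY(x,y) / (P_X(x) P_Y(y))
I1  = sum(Pxy(:) .* log(R(:)));            % double-sum form
HX   = -sum(Px .* log(Px));                % H(X)
HXgY = -sum(sum(Pxy .* log(Pxy ./ Py)));   % H(X|Y) = -∑ P_XY log P_X|Y
I2   = HX - HXgY;                          % entropy-difference form; I2 matches I1 up to rounding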


Data Processing Inequality

◮ Let X → Y → Z be a Markov chain; then

  I(X; Y) ≥ I(X; Z)

◮ Reparametrization invariance trick: for any invertible φ, ψ

  I(X; Y) = I(φ(X); ψ(Y))

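Both claims are easy to sanity-check numerically. The sketch below (all distributions randomly generated, hence hypothetical) builds a Markov chain X → Y → Z and compares I(X;Y) with I(X;Z).

% Numerical check of the data processing inequality on a random chain X -> Y -> Z.
nx = 4; ny = 3; nz = 5;
Px   = rand(nx,1);  Px   = Px / sum(Px);          % p(x)
Py_x = rand(nx,ny); Py_x = Py_x ./ sum(Py_x, 2);  % channel p(y|x)
Pz_y = rand(ny,nz); Pz_y = Pz_y ./ sum(Pz_y, 2);  % channel p(z|y)
Pxy = Px .* Py_x;                                 % joint p(x,y)
Pxz = Pxy * Pz_y;                                 % joint p(x,z): Z depends on X only through Y
mi  = @(J) sum(sum(J .* log(J ./ (sum(J,2) * sum(J,1)))));   % mutual information of a joint pmf
I_XY = mi(Pxy);
I_XZ = mi(Pxz);
% I_XY >= I_XZ holds for every draw, as the DPI requires.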


Outline

Some Information Theory basics

Information Bottleneck Method

Applications in Deep Learning



Information Bottleneck Problem
(N. Tishby, F. Pereira, W. Bialek, 1999)

◮ Consider the information channel  X −f(x)→ T −g(t)→ Y

  minimize_{P_T|X(t|x)}   I(X; T)
  subject to              I(T; Y) ≥ ε                      (1)

◮ Let λ be the Lagrange multiplier; then

  min_{P_T|X(t|x)}   I(X; T) − λ I(T; Y)                   (2)

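Problem (2) is usually solved with the alternating self-consistent updates derived in Tishby, Pereira & Bialek (1999): p(t|x) ∝ p(t) exp(−β D_KL(P_Y|X=x ‖ P_Y|T=t)), with p(t) and p(y|t) refit in between. The block below is a minimal MATLAB/Octave sketch of those iterations, not the authors' code; the joint pmf Pxy, the cardinality K of T, and the trade-off parameter beta (playing the role of λ above) are hypothetical placeholders.

% Iterative IB: alternate the self-consistent updates until p(t|x) stabilizes.
Pxy = rand(8, 5); Pxy = Pxy / sum(Pxy(:));                % hypothetical joint p(x,y)
Px  = sum(Pxy, 2);                                        % p(x)
Py_x = Pxy ./ Px;                                         % p(y|x), rows sum to 1
K = 3; beta = 5;                                          % |T| and trade-off parameter
Pt_x = rand(numel(Px), K); Pt_x = Pt_x ./ sum(Pt_x, 2);   % random initial p(t|x)
for it = 1:300
    Pt   = Pt_x' * Px;                                    % p(t)   = ∑_x p(x) p(t|x)
    Px_t = (Pt_x .* Px) ./ (Pt' + eps);                   % p(x|t) by Bayes' rule
    Py_t = Px_t' * Py_x;                                  % p(y|t) = ∑_x p(y|x) p(x|t)
    D = zeros(numel(Px), K);                              % D(x,t) = KL( p(y|x) || p(y|t) )
    for t = 1:K
        D(:,t) = sum(Py_x .* log((Py_x + eps) ./ (Py_t(t,:) + eps)), 2);
    end
    Pt_x = Pt' .* exp(-beta * D);                         % p(t|x) ∝ p(t) exp(-beta * D(x,t))
    Pt_x = Pt_x ./ sum(Pt_x, 2);                          % normalize over t
end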


Outline

Some Information Theory basics

Information Bottleneck Method

Applications in Deep Learning



[Figures drawn with tikz]


Group lasso
(e.g., Yuan & Lin; Meier, van de Geer, Bühlmann; Jacob, Obozinski, Vert)

◮ problem:

  minimize   f(x) + λ ∑_{i=1}^N ‖x_i‖_2

  i.e., like the lasso, but entire groups of variables are set to zero (or kept) together

◮ also called ℓ_{1,2} mixed-norm regularization; its proximal operator is the block soft-thresholding sketched below

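In proximal and ADMM solvers the group-lasso penalty enters only through its proximal operator, which is block soft-thresholding: each group is shrunk toward zero and set exactly to zero once its norm falls below the threshold. A minimal MATLAB/Octave sketch (the groups and test vector are hypothetical):

% Block soft-thresholding: prox of lambda * ∑_i ||x_i||_2 over non-overlapping groups.
prox_group = @(v, lam) v * max(0, 1 - lam / max(norm(v), realmin));  % prox for one group

x = randn(6,1); lambda = 0.5;
groups = {1:2, 3:4, 5:6};                  % hypothetical partition of the variables
z = zeros(size(x));
for i = 1:numel(groups)
    z(groups{i}) = prox_group(x(groups{i}), lambda);   % shrink each block; zero it if ||x_i||_2 <= lambda
end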


Structured group lasso
(Jacob, Obozinski, Vert; Bach et al.; Zhao, Rocha, Yu; . . . )

◮ problem:

  minimize   f(x) + ∑_{i=1}^N λ_i ‖x_{g_i}‖_2

  where g_i ⊆ [n] and G = {g_1, . . . , g_N}

◮ like group lasso, but the groups can overlap arbitrarily

◮ particular choices of groups can impose ‘structured’ sparsity

◮ e.g., topic models, selecting interaction terms for (graphical) models, tree structure of gene networks, fMRI data

◮ generalizes to the composite absolute penalties family, as in the sketch below:

  r(x) = ‖( ‖x_{g_1}‖_{p_1} , . . . , ‖x_{g_N}‖_{p_N} )‖_{p_0}

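To make the notation concrete, the sketch below (not from the slides) evaluates r(x) for a hypothetical x, using the overlapping groups of the next slide, inner norms p_i = 2, and outer norm p_0 = 1, which recovers the overlapping group-lasso penalty.

% Composite absolute penalty r(x) = ||( ||x_g1||_p1, ..., ||x_gN||_pN )||_p0.
x  = randn(6,1);                        % hypothetical point
G  = {4, 5, 6, [2 4], [3 5 6], 1:6};    % overlapping groups g_i as index vectors
p  = 2 * ones(1, numel(G));             % inner norms p_i (all 2 here)
p0 = 1;                                 % outer norm p_0 (a plain sum of group norms)
v  = zeros(numel(G), 1);
for i = 1:numel(G)
    v(i) = norm(x(G{i}), p(i));         % ||x_{g_i}||_{p_i}
end
r = norm(v, p0);                        % value of the penalty at x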


Structured group lasso
(Jacob, Obozinski, Vert; Bach et al.; Zhao, Rocha, Yu; . . . )

hierarchical selection:

[Tree diagram: node 1 at the root with children 2 and 3; node 2 has child 4, node 3 has children 5 and 6]

◮ G = {{4}, {5}, {6}, {2, 4}, {3, 5, 6}, {1, 2, 3, 4, 5, 6}}


◮ nonzero variables form a rooted, connected subtree (illustrated in the sketch below)
  – if a node is selected, so are its ancestors
  – if a node is not selected, neither are its descendants

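A tiny MATLAB/Octave sketch of why the subtree property holds (the choice of which groups the penalty zeroes out is hypothetical): because each group is a node together with all of its descendants, zeroing any collection of groups always leaves a support that is a rooted, connected subtree.

% Each group g_i = a node of the tree above plus all of its descendants.
G = {4, 5, 6, [2 4], [3 5 6], 1:6};
x = ones(6,1);                  % pretend every variable starts out nonzero
zeroed_groups = [2 4];          % hypothetical: the penalty zeroes groups {5} and {2,4}
for i = zeroed_groups
    x(G{i}) = 0;                % zeroing {2,4} removes node 2 and its descendant 4
end
support = find(x)'              % remaining support {1, 3, 6}: a rooted, connected subtree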


Sample ADMM implementation: lasso

% ADMM for the lasso in graph form: minimize (1/2)||y - b||^2 + ||x||_1 subject to y = A*x
% (the l1 weight is 1 here; scale the threshold in prox_g for other weights).
% Assumes A (m x n), b, rho > 0 and MAX_ITER are defined, and that xz, xt, yz, yt
% have been initialized (e.g., to zero vectors of lengths n, n, m, m).

prox_f = @(v,rho) (rho/(1 + rho))*(v - b) + b;              % prox of (1/2)||y - b||^2
prox_g = @(v,rho) (max(0, v - 1/rho) - max(0, -v - 1/rho)); % soft thresholding, prox of ||x||_1

AA = A*A';
L  = chol(eye(m) + AA);        % factor I + A*A' once, reuse in every iteration

for iter = 1:MAX_ITER
    % proximal steps on the two objective terms
    xx = prox_g(xz - xt, rho);
    yx = prox_f(yz - yt, rho);

    % project (xx + xt, yx + yt) onto the constraint set {(x, y) : y = A*x}
    yz = L \ (L' \ (A*(xx + xt) + AA*(yx + yt)));
    xz = xx + xt + A'*(yx + yt - yz);

    % dual variable updates
    xt = xt + xx - xz;
    yt = yt + yx - yz;
end
