BaumWelch
BaumWelch module
- class markovianSBM.BaumWelch.BaumWelch[source]
Bases:
objectClass which performs the Baum-Welch algorithm and that uses it to solve the link prediction and the collaborative filtering problems.
- BaumWelch(m, n, delta, nbite, eps=0.01, K=None)[source]
BaumWelch algorithm that iteratively perform the forward, backward and update steps.
- Parameters
m – We observe fully the graph until time m
n – We do not observe reliably the connection involving nodes between time m and n
delta – We observe the connections between nodes n and n+delta
nbite – Number of iterations for the Baum Welch algorithm
eps – Parameter use to initialize the matrix of emission probabilities O
K – Number of communities
Using this function with a specified K can be typically interesting when we try to estimate the number of clusters with the procedure describes in the paper.
- RES(ini, alpha, beta, O, observed_links, observed_nodes, m, n)[source]
Solve robustly the collaborative filtering problem when we observe fully the graph at time m and we want to predict the community of node n when we observe only a subset of the edges that connects (or not) n with the nodes 1,…,m.
- Parameters
ini – initial distribution of the Markov chain given by the Baum-Welch algorithm
alpha – Matrix of size $K imes n+delta$ given by the Baum-Welch algorithm
beta – Matrix of size $K imes n+delta$ given by the Baum-Welch algorithm
observed_links – A vector with binary variables of length less than m. For any i, observed_links[i] is 1 if and only if we observe an edge between nodes n and observed_nodes[i] (and 0 otherwise).
observed_nodes – A vector with the same length as the vector “observed_links”. It contains the nodes for which we observe the connection (or not) with node n
m – We observe fully the graph until time m
n – Node that we want to learn the community
Our implementation do not respect striclty the formula of the paper. We get rid of quantities that do not depend on the cluster k of node n. This allows to avoid underflow issues.
- adjacency_BaumWelch(m, n, delta)[source]
Builds an adjacency matrix from the attribute self.X removing the nodes between m+1 and n-1 (that we consider to be not reliable).
- Parameters
m – We observe fully the graph until time m
n – We do not observe reliably how nodes between m+1 and n-1 are connected
delta – We observe reliably how nodes between n and n+delta are connected
- backward(P, O, m, n, delta, Y, K=None)[source]
Backward step of the BaumWelch algorithm. We observe the connections between nodes 0,…,m,n,…,n+delta. However, we consider that the edges involving nodes between m+1 and n-1 are not reliable and we do not take into account the estimated clusters for the nodes between m+1 and n-1.
- Parameters
P – Transition matrix of the Markov chain
O – matrix of emission probabilities
m – We observe fully the graph until time m
n – We do not observe reliably the connection involving nodes between time m and n
delta – We observe the connections between nodes n and n+delta
Y – vector of length n+delta+1 of the estimated communities
K – Number of communities
Using this function with a specified K can be typically interesting when we try to estimate the number of clusters with the procedure describes in the paper.
- collaborative_filtering_optimalMAP(alpha, beta, observed_links, observed_nodes, m, n)[source]
Solve robustly the collaborative filtering problem using the optimal approach when we observe fully the graph at time m and we want to predict the community of node n when we observe only a subset of the edges that connects (or not) n with the nodes 1,…,m.
- Parameters
alpha – Matrix of size $K imes n+delta$ given by the Baum-Welch algorithm
beta – Matrix of size $K imes n+delta$ given by the Baum-Welch algorithm
observed_links – A vector with binary variables of length less than m. For any i, observed_links[i] is 1 if and only if we observe an edge between nodes n and observed_nodes[i] (and 0 otherwise).
observed_nodes – A vector with the same length as the vector “observed_links”. It contains the nodes for which we observe the connection (or not) with node n
m – We observe fully the graph until time m
n – Node that we want to learn the community
- collaborative_filtering_pluginMAP(alpha, beta, observed_links, observed_nodes, m, n)[source]
Solve the collaborative filtering problem using the plugin approach when we observe fully the graph at time m and we want to predict the community of node n when we observe only a subset of the edges that connects (or not) n with the nodes 1,…,m.
- Parameters
alpha – Matrix of size $K imes n+delta$ given by the Baum-Welch algorithm
beta – Matrix of size $K imes n+delta$ given by the Baum-Welch algorithm
observed_links – A vector with binary variables of length less than m. For any i, observed_links[i] is 1 if and only if we observe an edge between nodes n and observed_nodes[i] (and 0 otherwise).
observed_nodes – A vector with the same length as the vector “observed_links”. It contains the nodes for which we observe the connection (or not) with node n
m – We observe fully the graph until time m
n – Node that we want to learn the community
- collaborative_filtering_robustMAP(ini, alpha, beta, O, observed_links, observed_nodes, m, n, Pbaum=None)[source]
Solve robustly the collaborative filtering problem when we observe fully the graph at time m and we want to predict the community of node n when we observe only a subset of the edges that connects (or not) n with the nodes 1,…,m.
- Parameters
ini – initial distribution of the Markov chain given by the Baum-Welch algorithm
alpha – Matrix of size $K imes n+delta$ given by the Baum-Welch algorithm
beta – Matrix of size $K imes n+delta$ given by the Baum-Welch algorithm
observed_links – A vector with binary variables of length less than m. For any i, observed_links[i] is 1 if and only if we observe an edge between nodes n and observed_nodes[i] (and 0 otherwise).
observed_nodes – A vector with the same length as the vector “observed_links”. It contains the nodes for which we observe the connection (or not) with node n
m – We observe fully the graph until time m
n – Node that we want to learn the community
Our implementation do not respect striclty the formula of the paper. We get rid of quantities that do not depend on the cluster k of node n. This allows to avoid underflow issues.
- forward(ini, P, O, m, n, delta, Y, K=None)[source]
Forward step of the BaumWelch algorithm. We observe the connections between nodes 0,…,m,n,…,n+delta. However, we consider that the edges involving nodes between m+1 and n-1 are not reliable and we do not take into account the estimated clusters for the nodes between m+1 and n-1.
- Parameters
ini – initial distribution of the Markov chain
P – Transition matrix of the Markov chain
O – matrix of emission probabilities
m – We observe fully the graph until time m
n – We do not observe reliably the connection involving nodes between time m and n
delta – We observe the connections between nodes n and n+delta
Y – vector of length n+delta+1 of the estimated communities
K – Number of communities
Using this function with a specified K can be typically interesting when we try to estimate the number of clusters with the procedure describes in the paper.
- update(alpha, beta, P, O, m, n, delta, Y, K=None)[source]
Update step of the BaumWelch algorithm: the parameters of the HMM are updated and returned.
- Parameters
alpha – Matrix of size $K imes n+delta$ giving the output of the “forward” method
beta – Matrix of size $K imes n+delta$ giving the output of the “backward” method
P – Transition matrix of the Markov chain
O – matrix of emission probabilities
m – We observe fully the graph until time m
n – We do not observe reliably the connection involving nodes between time m and n
delta – We observe the connections between nodes n and n+delta
Y – vector of length n+delta+1 of the estimated communities
K – Number of communities
Using this function with a specified K can be typically interesting when we try to estimate the number of clusters with the procedure describes in the paper.