Poolers
Return a pooling operator initialized with filtered **kwargs. |
|
The Adaptive Structure Aware Pooling operator from the paper "ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations" (Ranjan et al., AAAI 2020). |
|
The asymmetric Cheeger cut pooling layer from the paper "Total Variation Graph Neural Networks" (Hansen & Bianchi, ICML 2023). |
|
The BN-Pool operator from the paper "BN-Pool: Bayesian Nonparametric Graph Pooling" (Castellana & Bianchi, 2025). |
|
The differentiable pooling operator from the paper "Hierarchical Graph Representation Learning with Differentiable Pooling" (Ying et al., NeurIPS 2018). |
|
The DMoN pooling operator from the paper "Graph Clustering with Graph Neural Networks" (Tsitsulin et al., JMLR 2023). |
|
The edge pooling operator from the papers "Towards Graph Pooling by Edge Contraction" (Diehl et al. 2019) and "Edge Contraction Pooling for Graph Neural Networks" (Diehl, 2019). |
|
The EigenPooling operator from "Graph Convolutional Networks with EigenPooling" (Ma et al., KDD 2019). |
|
The Graclus pooling operator inspired by the paper "Weighted Graph Cuts without Eigenvectors: A Multilevel Approach" (Dhillon et al., TPAMI 2007). |
|
The high-order pooling operator from the paper "Higher-order clustering and pooling for Graph Neural Networks" (Duval & Malliaros, CIKM 2022). |
|
The LaPool pooling operator from the paper "Towards Interpretable Sparse Graph Representation Learning with Laplacian Pooling" (Noutahi et al., 2019). |
|
The Just Balance pooling operator from the paper "Simplifying Clustering with Graph Neural Networks" (Bianchi et al., NLDL 2023). |
|
The Maximal \(k\)-Independent Set (\(k\)-MIS) pooling operator from the paper "Generalizing Downsampling from Regular Data to Graphs" (Bacciu et al., AAAI 2023). |
|
The MaxCut pooling operator from the paper "MaxCutPool: differentiable feature-aware Maxcut for pooling in graph neural networks" (Abate & Bianchi, ICLR 2025). |
|
The MinCut pooling operator from the paper "Spectral Clustering in Graph Neural Networks for Graph Pooling" (Bianchi et al., ICML 2020). |
|
The pooling operator from the paper "Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling" (Bianchi et al., TNNLS 2020). |
|
The Non-negative Matrix Factorization pooling as proposed in the paper "A Non-Negative Factorization approach to node pooling in Graph Convolutional Neural Networks" (Bacciu and Di Sotto, AIIA 2019). |
|
Identity pooling operator that performs no actual pooling. |
|
The path integral based pooling operator from the paper "Path Integral Based Convolution and Pooling for Graph Neural Networks" (Ma et al., NeurIPS 2020). |
|
The self-attention pooling operator from the paper "Self-Attention Graph Pooling" (Lee et al., ICML 2019). |
|
The SEPPooling operator from the paper "Structural Entropy Guided Graph Hierarchical Pooling" (Wu et al., ICML 2022). |
|
The \(\mathrm{top}_k\) pooling operator from the papers "Graph U-Nets" (Gao & Ji, ICML 2019), "Towards Sparse Hierarchical Graph Classifiers" (Cangea et al., 2018), and "Understanding Attention and Generalization in Graph Neural Networks" (Knyazev et al., NeurIPS 2019). |
- get_pooler(pooler_name: str, **kwargs)[source]¶
Return a pooling operator initialized with filtered **kwargs.
- Parameters:
pooler_name (str) – Name of the pooler.
**kwargs – Additional keyword arguments to be passed to the pooler constructor; irrelevant ones are discarded.
- Returns:
A pooling layer instance corresponding to pooler_name.
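The filtering of irrelevant keyword arguments can be pictured with a small sketch. It assumes the filtering works by matching keyword names against the constructor signature, which may differ from what the library does internally; `ToyPooler`, `REGISTRY`, and `get_pooler_sketch` are hypothetical names introduced only for illustration.

```python
import inspect


class ToyPooler:
    """Hypothetical stand-in for a pooling layer (not part of the library)."""

    def __init__(self, in_channels, ratio=0.5):
        self.in_channels = in_channels
        self.ratio = ratio


# Hypothetical name-to-class registry.
REGISTRY = {"toy": ToyPooler}


def get_pooler_sketch(pooler_name, **kwargs):
    """Instantiate a pooler, keeping only kwargs its constructor accepts."""
    cls = REGISTRY[pooler_name]
    accepted = inspect.signature(cls).parameters
    filtered = {k: v for k, v in kwargs.items() if k in accepted}
    return cls(**filtered)


# `k` is not a ToyPooler argument, so it is silently dropped.
pooler = get_pooler_sketch("toy", in_channels=16, ratio=0.25, k=8)
print(pooler.in_channels, pooler.ratio)
```

With this design, one configuration dictionary can be shared across several poolers whose constructors take different arguments.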
- class ASAPooling(in_channels: int, ratio: float | int = 0.5, GNN: Module | None = None, dropout: float = 0.0, negative_slope: float = 0.2, add_self_loops: bool = False, nonlinearity: str | Callable = 'sigmoid', lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: str = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False, **kwargs)[source]¶
The Adaptive Structure Aware Pooling operator from the paper “ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations” (Ranjan et al., AAAI 2020).
The \(\texttt{select}\) operator is implemented by passing a special score to TopkSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with SparseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
- Parameters:
in_channels (int) – Size of each input sample.
ratio (float or int) – Graph pooling ratio, which is used to compute \(k = \lceil \mathrm{ratio} \cdot N \rceil\), or the value of \(k\) itself, depending on whether the type of ratio is float or int. (default: 0.5)
GNN (Module, optional) – A graph neural network layer for using intra-cluster properties. Especially helpful for graphs with higher degree of neighborhood (one of GraphConv, GCNConv or any GNN which supports the edge_weight parameter). (default: None)
dropout (float, optional) – Dropout probability of the normalized attention coefficients, which exposes each node to a stochastically sampled neighborhood during training. (default: 0.0)
negative_slope (float, optional) – LeakyReLU angle of the negative slope. (default: 0.2)
nonlinearity (str or callable, optional) – The non-linearity to use when computing the score. (default: "sigmoid")
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")
lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
**kwargs (optional) – Additional parameters for initializing the graph neural network layer.
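As a minimal sketch of the rule stated for `ratio`, the helper below derives \(k\) from either a float or an int. The function name `num_selected_nodes` is hypothetical, and any clamping of \(k\) to the number of nodes that the library may apply is not modeled here.

```python
import math


def num_selected_nodes(ratio, num_nodes):
    """Return k from the pooling ratio, following the rule stated for `ratio`."""
    if isinstance(ratio, int):
        # An integer ratio is taken as the value of k itself.
        return ratio
    # A float ratio yields k = ceil(ratio * N).
    return math.ceil(ratio * num_nodes)


k_from_float = num_selected_nodes(0.5, 7)  # ceil(3.5)
k_from_int = num_selected_nodes(3, 7)      # taken as-is
print(k_from_float, k_from_int)
```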
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
The forward pass of the pooling operator.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. It can either be a SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch, or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)
edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput
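To make the two choices for \(\mathbf{S}_\text{inv}\) concrete, here is a minimal pure-Python sketch for a hard (one-hot) assignment matrix. It assumes \(\mathbf{S}\) has shape \([N, K]\) with no empty cluster, in which case \(\mathbf{S}^\top \mathbf{S}\) is diagonal and the pseudoinverse has a closed form. The helper names are hypothetical and the library's own tensor layout may differ.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def pinv_hard_assignment(s):
    """Pseudoinverse of a one-hot assignment matrix with no empty cluster.

    For such S, S^T S is diagonal with the cluster sizes, so
    S^+ = (S^T S)^{-1} S^T reduces to dividing each row of S^T by its sum.
    """
    return [[v / sum(row) for v in row] for row in transpose(s)]


# 4 nodes, 2 clusters: nodes 0-2 in cluster 0, node 3 in cluster 1.
S = [[1, 0], [1, 0], [1, 0], [0, 1]]

S_inv_transpose = transpose(S)        # the "transpose" option
S_inv_pinv = pinv_hard_assignment(S)  # the "inverse" option

print(S_inv_transpose)
print(matmul(S_inv_pinv, S))  # approximately the 2 x 2 identity
```

In this toy case the transpose sums over the members of a cluster, while the pseudoinverse averages over them, which is the practical difference between the two options.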
- class AsymCheegerCutPooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, totvar_coeff: float = 1.0, balance_coeff: float = 1.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]¶
The asymmetric Cheeger cut pooling layer from the paper “Total Variation Graph Neural Networks” (Hansen & Bianchi, ICML 2023).
The \(\texttt{select}\) operator is implemented with MLPSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with DenseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
This layer optimizes two auxiliary losses:
the total variation loss (totvar_loss),
the asymmetric norm loss (asym_norm_loss).
- Parameters:
in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.
k (int) – Number of clusters or supernodes in the pooled graph.
act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator.
dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)
totvar_coeff (float) – Coefficient for graph total variation loss term. (default: 1.0)
balance_coeff (float) – Coefficient for asymmetric norm loss term. (default: 1.0)
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)
cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).
adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default: None)
edge_weight (Tensor, optional) – Edge weights associated with adj when sparse connectivity is provided. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with True on real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default: None)
batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default: None)
batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when lifting=True. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput
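The relation between the sparse inputs (node features plus a batch vector) and the dense padded inputs (a \([B, N, F]\) tensor plus a mask) can be sketched in pure Python. This is only an illustration with nested lists, similar in spirit to PyTorch Geometric's `to_dense_batch` utility; the function name `to_dense_padded` is hypothetical and the library's internal conversion may differ.

```python
def to_dense_padded(x, batch):
    """Convert features [N, F] and a batch vector into padded [B, N_max, F] plus a mask."""
    num_graphs = max(batch) + 1
    counts = [batch.count(g) for g in range(num_graphs)]
    n_max = max(counts)
    feat_dim = len(x[0])
    # Padded slots are filled with zeros and flagged False in the mask.
    dense = [[[0.0] * feat_dim for _ in range(n_max)] for _ in range(num_graphs)]
    mask = [[False] * n_max for _ in range(num_graphs)]
    cursor = [0] * num_graphs  # next free slot in each graph
    for feats, g in zip(x, batch):
        dense[g][cursor[g]] = list(feats)
        mask[g][cursor[g]] = True
        cursor[g] += 1
    return dense, mask


# Three nodes: the first two belong to graph 0, the last one to graph 1.
x = [[1.0], [2.0], [3.0]]
batch = [0, 0, 1]
dense, mask = to_dense_padded(x, batch)
print(dense)
print(mask)
```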
- compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None) dict[source]¶
Computes the auxiliary loss terms for unbatched (sparse) mode.
This method is used when batched=False and operates on sparse adjacency matrices without requiring padding or densification.
- Parameters:
- Returns:
- A dictionary with the different terms of the auxiliary loss:
'total_variation_loss': The sparse total variation loss.
'balance_loss': The unbatched asymmetric norm loss.
- Return type:
dict
- class BNPool(in_channels: int | List[int], k: int, alpha_DP=1.0, K_var=1.0, K_mu=10.0, K_init=1.0, eta=1.0, train_K=True, act: str | None = None, dropout: float = 0.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False, num_neg_samples: int | None = None)[source]¶
The BN-Pool operator from the paper “BN-Pool: Bayesian Nonparametric Graph Pooling” (Castellana & Bianchi, 2025).
BN-Pool implements a Bayesian nonparametric approach to graph pooling using a Dirichlet Process with stick-breaking construction for cluster assignment. The method learns both the number of clusters and their assignments through variational inference.
The \(\texttt{select}\) operator is implemented with DPSelect to perform variational inference of the stick-breaking process.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with DenseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
The method uses a truncated stick-breaking representation of the Dirichlet Process:
\[v_{ik} \sim \text{Beta}(\alpha_{ik}, \beta_{ik}), \quad i = 1, \ldots, N, \quad k = 1, \ldots, K-1\]
\[\pi_{ik} = v_{ik} \prod_{j=1}^{k-1} (1 - v_{ij})\]
where \(\pi_{ik}\) represents the probability of assigning node \(i\) to cluster \(k\). The coefficients \(\alpha_{ik}\) and \(\beta_{ik}\) are computed by an MLP from node features \(\mathbf{x}_i\).
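A minimal sketch of the stick-breaking map for a single node, turning the fractions \(v_{i1}, \ldots, v_{i,K-1}\) into assignment probabilities. The assumption here, not stated in the text above, is the usual truncation in which the last cluster receives whatever remains of the stick (equivalent to \(v_{iK} = 1\)); the function name is hypothetical.

```python
def stick_breaking(v):
    """Map K-1 stick fractions v_1..v_{K-1} to K assignment probabilities.

    pi_k = v_k * prod_{j<k} (1 - v_j) for k < K; the last cluster takes the
    remaining stick (assumed truncation, equivalent to v_K = 1).
    """
    pi, remaining = [], 1.0
    for v_k in v:
        pi.append(v_k * remaining)  # break off a fraction of what is left
        remaining *= 1.0 - v_k      # shrink the remaining stick
    pi.append(remaining)
    return pi


pi = stick_breaking([0.5, 0.5, 0.5])
print(pi)       # [0.5, 0.25, 0.125, 0.125]
print(sum(pi))  # 1.0
```

Because each step only splits the remaining stick, the probabilities always sum to one, and later clusters receive mass only if earlier fractions leave some over.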
The cluster connectivity is modeled through a learnable matrix \(\mathbf{K} \in \mathbb{R}^{K \times K}\) and the reconstructed adjacency matrix is computed as:
\[\mathbf{A}_{\text{rec}} = \mathbf{S} \mathbf{K} \mathbf{S}^{\top}\]
where \(S_{ik} = \pi_{ik}\).
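The product \(\mathbf{S} \mathbf{K} \mathbf{S}^{\top}\) can be checked on a toy example. The sketch below uses nested lists, with a 3-node soft assignment \(\mathbf{S}\) of shape \([N, K] = [3, 2]\) and an arbitrary symmetric \(\mathbf{K}\); the numbers are made up for illustration and the helper names are hypothetical.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Soft assignments of 3 nodes to 2 clusters (each row sums to 1).
S = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
# Illustrative cluster connectivity: positive within, negative across clusters.
K = [[2.0, -1.0], [-1.0, 2.0]]

A_rec = matmul(matmul(S, K), transpose(S))
for row in A_rec:
    print(row)
```

The result has one row and one column per input node, and it is symmetric whenever \(\mathbf{K}\) is symmetric.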
This layer optimizes three auxiliary losses:
Reconstruction loss (weighted_bce_reconstruction_loss()): Binary cross-entropy loss between the true and reconstructed adjacency matrix \(\mathbf{A}_{\text{rec}}\).
KL divergence loss (kl_loss()): KL divergence between the prior and posterior variational approximation of the stick-breaking variables.
Cluster connectivity prior loss (cluster_connectivity_prior_loss()): Prior regularization on the cluster connectivity matrix \(\mathbf{K}\).
- Parameters:
in_channels (Union[int, List[int]]) – The number of input node feature channels. If a list is provided, it specifies the architecture of the MLP in DPSelect.
k (int) – The maximum number of clusters \(K\) to be used in the pooling mechanism. The actual number of active clusters is learned through the stick-breaking process.
alpha_DP (float, optional) – Prior concentration parameter \(\alpha\) of the Dirichlet Process. Controls the expected number of clusters. Higher values encourage more clusters. (default: 1.0)
K_var (float, optional) – Variance \(\sigma^2\) of the Gaussian prior on the cluster connectivity matrix \(\mathbf{K}\). (default: 1.0)
K_mu (float, optional) – Mean parameter for the cluster connectivity prior. The prior mean matrix is constructed as \(\mathbf{K}_{\mu} = \mu \mathbf{I} - \mu (\mathbf{1}\mathbf{1}^{\top} - \mathbf{I})\). (default: 10.0)
K_init (float, optional) – Initial value for the cluster connectivity matrix \(\mathbf{K}\). (default: 1.0)
eta (float, optional) – Weights the KL divergence loss term. (default: 1.0)
train_K (bool, optional) – If True, the cluster connectivity matrix \(\mathbf{K}\) is learnable. If False, \(\mathbf{K}\) is fixed to its initial value. (default: True)
act (str, optional) – Activation function for the MLP in DPSelect. (default: None)
dropout (float, optional) – Dropout rate in the MLP of DPSelect. (default: 0.0)
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
adj_transpose (bool, optional) – If True, the preprocessing step in tgp.src.DenseSRCPooling and the tgp.connect.DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the tgp.select.SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the tgp.select.SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)
batched (bool, optional) – If True, uses the batched dense representation of the input. If False, uses an unbatched representation without padding. (default: True)
sparse_output (bool, optional) – If True, returns block-diagonal sparse outputs. If False, returns batched dense outputs. (default: False)
num_neg_samples (int, optional) – Cap on the number of negative edges sampled per graph in the unbatched (sparse-loss) path. If None, defaults to matching the number of positive edges. (default: None)
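The prior mean matrix given for `K_mu`, \(\mathbf{K}_{\mu} = \mu \mathbf{I} - \mu (\mathbf{1}\mathbf{1}^{\top} - \mathbf{I})\), simply places \(\mu\) on the diagonal and \(-\mu\) everywhere else. A small sketch with nested lists (the function name is hypothetical):

```python
def prior_mean_matrix(mu, k):
    """K_mu = mu * I - mu * (1 1^T - I): mu on the diagonal, -mu elsewhere."""
    return [[mu if i == j else -mu for j in range(k)] for i in range(k)]


K_mu = prior_mean_matrix(10.0, 3)
for row in K_mu:
    print(row)
```

Read this way, the prior favours strong connectivity within a cluster and weak connectivity between different clusters.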
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, mask: Tensor | None = None, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).
adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default: None)
edge_weight (Tensor, optional) – Edge weights associated with adj when sparse connectivity is provided. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default: None)
batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when lifting=True. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\), where True marks real (non-padded) nodes. Only used when inputs are already dense/padded. (default: None)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput
- compute_sparse_loss(adj: Tensor | SparseTensor, batch: Tensor | None, so: SelectOutput) dict[source]¶
Compute BNPool auxiliary losses in unbatched sparse mode.
- class DiffPool(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, link_loss_coeff: float = 1.0, ent_loss_coeff: float = 1.0, normalize_loss: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]¶
The differentiable pooling operator from the paper “Hierarchical Graph Representation Learning with Differentiable Pooling” (Ying et al., NeurIPS 2018).
The \(\texttt{select}\) operator is implemented with MLPSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with DenseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
This layer optimizes two auxiliary losses:
the link prediction loss (link_pred_loss),
the entropy regularization loss (entropy_loss).
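The two auxiliary losses can be sketched for a single graph following the definitions in the DiffPool paper: the link prediction loss is the Frobenius norm \(\lVert \mathbf{A} - \mathbf{S}\mathbf{S}^\top \rVert_F\), and the entropy loss is the mean row entropy of \(\mathbf{S}\). The function names, the `eps` guard, and the single-graph setting are assumptions of this sketch, not a description of the library's implementation.

```python
import math


def link_pred_loss(adj, s, normalize=False):
    """Frobenius norm of A - S S^T, optionally divided by the number of entries of A."""
    n, k = len(adj), len(s[0])
    sq = 0.0
    for i in range(n):
        for j in range(n):
            recon = sum(s[i][c] * s[j][c] for c in range(k))  # (S S^T)[i][j]
            sq += (adj[i][j] - recon) ** 2
    loss = math.sqrt(sq)
    return loss / (n * n) if normalize else loss


def entropy_loss(s, eps=1e-15):
    """Mean entropy of the rows of S; eps guards against log(0)."""
    return sum(-sum(p * math.log(p + eps) for p in row) for row in s) / len(s)


adj = [[0.0, 1.0], [1.0, 0.0]]
hard = [[1.0, 0.0], [0.0, 1.0]]  # each node in its own cluster
soft = [[0.5, 0.5], [0.5, 0.5]]  # maximally uncertain assignment

print(link_pred_loss(adj, hard))        # 2.0
print(link_pred_loss(adj, hard, True))  # 0.5
print(entropy_loss(hard), entropy_loss(soft))
```

The entropy term is close to zero for the hard assignment and equals \(\log 2\) for the uniform one, which is why minimizing it pushes assignments toward one-hot rows.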
- Parameters:
in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.
k (int) – Number of clusters or supernodes in the pooled graph.
act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator.
dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)
link_loss_coeff (float, optional) – Coefficient for the link prediction loss. (default: 1.0)
ent_loss_coeff (float, optional) – Coefficient for the entropy regularization loss. (default: 1.0)
normalize_loss (bool, optional) – If set to False, the link prediction loss is not divided by adj.numel(). (default: False)
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)
cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).
adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default: None)
edge_weight (Tensor, optional) – Edge weights associated with adj when sparse connectivity is provided. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with True on real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default: None)
batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default: None)
batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when lifting=True. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput
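The `edge_weight_norm` option is described as dividing the edge weights by the maximum absolute value per graph. A minimal sketch of that preprocessing step on a sparse batch follows; it assumes each edge belongs to the graph of its source node, and the function name is hypothetical.

```python
def normalize_edge_weights(edge_index, edge_weight, batch):
    """Divide each edge weight by the max absolute weight within its graph."""
    # Assumption: an edge belongs to the graph of its source node.
    graph_of_edge = [batch[u] for u in edge_index[0]]
    max_abs = {}
    for g, w in zip(graph_of_edge, edge_weight):
        max_abs[g] = max(max_abs.get(g, 0.0), abs(w))
    # Leave weights untouched if a graph has only zero weights.
    return [
        w / max_abs[g] if max_abs[g] > 0 else w
        for g, w in zip(graph_of_edge, edge_weight)
    ]


# Two graphs with two nodes each; edges are stored in both directions.
edge_index = [[0, 1, 2, 3], [1, 0, 3, 2]]
edge_weight = [2.0, -4.0, 0.5, 0.25]
batch = [0, 0, 1, 1]

normalized = normalize_edge_weights(edge_index, edge_weight, batch)
print(normalized)  # [0.5, -1.0, 1.0, 0.5]
```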
- compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None) dict[source]¶
Computes the auxiliary loss terms for unbatched (sparse) mode.
This method is used when batched=False and operates on sparse adjacency matrices without requiring padding or densification.
- Parameters:
- Returns:
- A dictionary with the different terms of the auxiliary loss:
'link_loss': The sparse link prediction loss.
'entropy_loss': The unbatched entropy loss.
- Return type:
dict
- class DMoNPooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, spectral_loss_coeff: float = 1.0, cluster_loss_coeff: float = 1.0, ortho_loss_coeff: float = 0.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]¶
The DMoN pooling operator from the paper “Graph Clustering with Graph Neural Networks” (Tsitsulin et al., JMLR 2023).
The \(\texttt{select}\) operator is implemented with MLPSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with DenseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
This layer optimizes three auxiliary losses:
the spectral loss (spectral_loss),
the cluster loss (cluster_loss),
the orthogonality loss (orthogonality_loss).
- Parameters:
in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.
k (int) – Number of clusters or supernodes in the pooled graph.
act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator.
dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)
spectral_loss_coeff (float, optional) – Coefficient for the spectral loss. (default: 1.0)
cluster_loss_coeff (float, optional) – Coefficient for the cluster loss. (default: 1.0)
ortho_loss_coeff (float, optional) – Coefficient for the orthogonality loss. This loss does not appear in the original paper. (default: 0.0)
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)
cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).
adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (
edge_index,torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default:None)edge_weight (Tensor, optional) – Edge weights associated with
adjwhen sparse connectivity is provided. (default:None)so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default:
None)mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with
Trueon real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default:None)batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default:
None)batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when
lifting=True. (default:None)lifting (bool, optional) – If set to
True, the \(\texttt{lift}\) operation is performed. (default:False)
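The relation between the sparse inputs (`x` plus `batch`) and the dense padded tensors (\(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\) plus `mask`) can be illustrated with a minimal plain-Python sketch. The helper `to_dense_padded` is hypothetical and only mirrors the shapes described above; it is not the library's conversion routine.

```python
def to_dense_padded(x, batch):
    """Pack node features [N_total, F] into [B, N_max, F] plus a boolean mask [B, N_max].

    Hypothetical helper for illustration only.
    """
    num_graphs = max(batch) + 1
    counts = [batch.count(g) for g in range(num_graphs)]
    n_max = max(counts)
    feat_dim = len(x[0])
    # Padded slots hold zeros and are marked False in the mask.
    dense = [[[0.0] * feat_dim for _ in range(n_max)] for _ in range(num_graphs)]
    mask = [[False] * n_max for _ in range(num_graphs)]
    position = [0] * num_graphs  # next free slot in each graph
    for features, g in zip(x, batch):
        i = position[g]
        dense[g][i] = list(features)
        mask[g][i] = True
        position[g] += 1
    return dense, mask


# Two graphs: the first has 3 nodes, the second has 2 nodes (F = 1).
x = [[1.0], [2.0], [3.0], [4.0], [5.0]]
batch = [0, 0, 0, 1, 1]
dense, mask = to_dense_padded(x, batch)
```

In this example the second graph receives one padded row, which is why its last mask entry is `False`.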
- Returns:
The output of the pooling operator.
- Return type:
- compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None) dict[source]¶
Computes the auxiliary loss terms for unbatched (sparse) mode.
This method is used when
batched=False and operates on sparse adjacency matrices without requiring padding or densification.
- Parameters:
- Returns:
- A dictionary with the different terms of the auxiliary loss:
'spectral_loss': The sparse spectral loss.
'cluster_loss': The unbatched cluster loss.
'ortho_loss': The unbatched orthogonality loss.
- Return type:
- class EdgeContractionPooling(in_channels: int, edge_score_method: Callable | None = None, dropout: float | None = 0.0, add_to_edge_score: float = 0.5, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False)[source]¶
The edge pooling operator from the papers “Towards Graph Pooling by Edge Contraction” (Diehl et al. 2019) and “Edge Contraction Pooling for Graph Neural Networks” (Diehl, 2019). This implementation is based on the paper “Revisiting Edge Pooling in Graph Neural Networks” (Landolfi, 2022).
The \(\texttt{select}\) operator is implemented with
EdgeContractionSelect.The \(\texttt{reduce}\) operator is implemented with
BaseReduce.The \(\texttt{connect}\) operator is implemented with
SparseConnect.The \(\texttt{lift}\) operator is implemented with
BaseLift.
To duplicate the configuration of the paper “Towards Graph Pooling by Edge Contraction” (Diehl et al. 2019), use either
compute_edge_score_softmax() or compute_edge_score_tanh(), and set add_to_edge_score to 0.0. To duplicate the configuration of the paper “Edge Contraction Pooling for Graph Neural Networks” (Diehl, 2019), set dropout to 0.2.
- Parameters:
in_channels (int) – Size of each input sample.
edge_score_method (callable, optional) – The function to apply to compute the edge score from raw edge scores. By default, this is the softmax over all incoming edges for each node. This function takes in a
raw_edge_score tensor of shape [num_edges], an edge_index tensor and the number of nodes num_nodes, and produces a new tensor of the same size as raw_edge_score describing normalized edge scores. Included functions are compute_edge_score_softmax(), compute_edge_score_tanh(), and compute_edge_score_sigmoid(). (default: compute_edge_score_softmax())
dropout (float, optional) – The probability with which to drop edge scores during training. (default:
0.0)add_to_edge_score (float, optional) – A value to be added to each computed edge score. Adding this greatly helps with unpooling stability. (default:
0.5)lift (LiftType, optional) –
Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) –
The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class
ConnectionTypeadmitted bycoalesce(e.g.,'sum','mean','max'). (default:"sum")lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class
ReduceTypeadmitted byscatter(e.g.,'sum','mean','max'). (default:"sum")remove_self_loops (bool, optional) – If
True, the self-loops will be removed from the adjacency matrix. (default:True)degree_norm (bool, optional) – If
True, the adjacency matrix will be symmetrically normalized. (default:False)edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default:
False)
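The default edge scoring described for edge_score_method (a softmax over the incoming edges of each node, followed by adding add_to_edge_score) can be sketched in plain Python. The function below is a hypothetical stand-in that follows the argument order given above (raw scores, edge_index, num_nodes); it is not the library's compute_edge_score_softmax(), and it assumes one raw score per edge.

```python
import math


def edge_score_softmax(raw_edge_score, edge_index, num_nodes):
    """Softmax of raw scores over the edges that share a target node.

    Hypothetical illustration; `edge_index` is [sources, targets].
    """
    targets = edge_index[1]
    # Subtract the per-target maximum for numerical stability.
    max_per_target = [-math.inf] * num_nodes
    for score, t in zip(raw_edge_score, targets):
        max_per_target[t] = max(max_per_target[t], score)
    exp = [math.exp(s - max_per_target[t]) for s, t in zip(raw_edge_score, targets)]
    denom = [0.0] * num_nodes
    for e, t in zip(exp, targets):
        denom[t] += e
    return [e / denom[t] for e, t in zip(exp, targets)]


# Edges 0->2, 1->2 and 2->0; the two edges entering node 2 share one softmax.
edge_index = [[0, 1, 2], [2, 2, 0]]
raw = [0.0, 0.0, 5.0]
add_to_edge_score = 0.5
scores = [s + add_to_edge_score for s in edge_score_softmax(raw, edge_index, num_nodes=3)]
```

Because the softmax is taken per target node, the single edge entering node 0 gets a normalized score of 1 regardless of its raw value, and the offset shifts every score by the same amount.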
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. It can either be a
SparseTensorof (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch or aTensorof shape \([2, E]\), where \(E\) is the number of edges in the batch. IfliftingisFalse, it cannot beNone. (default:None)edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default:
None)so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default:
None)batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default:
None)lifting (bool, optional) – If set to
True, the \(\texttt{lift}\) operation is performed. (default:False)
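To make the difference between the "transpose" and "inverse" choices for \(\mathbf{S}_\text{inv}\) concrete, here is a small plain-Python sketch with a hard assignment matrix. It assumes, for illustration, that lifting multiplies the pooled features by \(\mathbf{S}_\text{inv}^\top\) and that pooling used a sum reduction; the helper names are hypothetical and this is not the library's BaseLift code.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def pinv_hard_assignment(s):
    """Pseudoinverse of a hard assignment matrix S (N x K, exactly one 1 per row).

    With no empty cluster, S^+ = (S^T S)^{-1} S^T and S^T S is the diagonal
    matrix of cluster sizes, so each row of S^T is divided by its cluster size.
    """
    sizes = [sum(col) for col in zip(*s)]
    return [[v / size for v in row] for row, size in zip(transpose(s), sizes)]


# 4 nodes, 2 supernodes: nodes 0,1 -> supernode 0; nodes 2,3 -> supernode 1.
S = [[1, 0], [1, 0], [0, 1], [0, 1]]
X = [[1.0], [3.0], [10.0], [20.0]]

# Reduce with a sum: X_pool = S^T X.
X_pool = matmul(transpose(S), X)

# "transpose": S_inv = S^T, so the lifted features are S X_pool.
lifted_transpose = matmul(transpose(transpose(S)), X_pool)

# "inverse": S_inv = S^+, so the lifted features are (S^+)^T X_pool.
lifted_inverse = matmul(transpose(pinv_hard_assignment(S)), X_pool)
```

Under these assumptions the transpose copies each supernode's summed feature back to its members, while the pseudoinverse rescales by the cluster size and returns the cluster mean.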
- Returns:
The output of the pooling operator.
- Return type:
- class EigenPooling(k: int, num_modes: int = 5, normalized: bool = True, cached: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = False, sparse_output: bool = False, cache_preprocessing: bool = False)[source]¶
The EigenPooling operator from “Graph Convolutional Networks with EigenPooling” (Ma et al., KDD 2019).
The \(\texttt{select}\) operator is implemented with
EigenPoolSelect.The \(\texttt{reduce}\) operator is implemented with
EigenPoolReduce.The \(\texttt{connect}\) operator is implemented with
EigenPoolConnect.The \(\texttt{lift}\) operator is implemented with
EigenPoolLift.
Let:
\(\mathbf{X} \in \mathbb{R}^{N \times F}\) be node features;
\(\mathbf{S} \in \{0,1\}^{N \times K}\) be the hard assignment matrix produced by
EigenPoolSelect;\(\boldsymbol{\Omega} := \mathbf{S}\) (same matrix, connectivity notation);
\(\mathbf{A}_{\text{ext}} \in \mathbb{R}^{N \times N}\) be the input (possibly block-diagonal) adjacency used by the connector;
\(H\) be the number of eigenvector modes.
EigenPooling first partitions nodes into \(K\) clusters via spectral clustering, then builds a multi-mode pooling matrix \(\boldsymbol{\Theta} \in \mathbb{R}^{N \times (K\cdot H)}\) from Laplacian eigenvectors of each cluster-induced subgraph. Features are pooled as:
\[\mathbf{X}_{\text{pool,raw}} = \boldsymbol{\Theta}^{\top}\mathbf{X},\]then reshaped from \([H\!\cdot\!K, F]\) to \([K, H\!\cdot\!F]\).
Connectivity is coarsened as:
\[\mathbf{A}_{\text{coar}} = \boldsymbol{\Omega}^{\top}\mathbf{A}_{\text{ext}}\boldsymbol{\Omega}.\]Notes
This implementation supports sparse inputs and multi-graph batches via
edge_index+batch.Dense padded batched inputs (\([B, N, N]\)) are not supported.
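The coarsening rule \(\mathbf{A}_{\text{coar}} = \boldsymbol{\Omega}^{\top}\mathbf{A}_{\text{ext}}\boldsymbol{\Omega}\) can be checked by hand on a tiny graph. The sketch below uses plain Python lists and a hand-picked hard assignment; it only illustrates the matrix product and does not run the spectral clustering or eigenvector steps of the layer.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Path graph 0 - 1 - 2 - 3 (unweighted, undirected).
A_ext = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Hand-picked hard assignment: nodes {0, 1} -> cluster 0, nodes {2, 3} -> cluster 1.
Omega = [[1, 0], [1, 0], [0, 1], [0, 1]]

A_coar = matmul(matmul(transpose(Omega), A_ext), Omega)

# Dropping the diagonal mimics what remove_self_loops=True is described to do.
A_no_loops = [[0 if i == j else v for j, v in enumerate(row)] for i, row in enumerate(A_coar)]
```

Each diagonal entry counts the intra-cluster edges twice (once per direction), and each off-diagonal entry counts the edges between the two clusters.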
- Parameters:
k (int) – Number of clusters (supernodes) in the pooled graph.
num_modes (int, optional) – Number of eigenvector modes \(H\). (default:
5)normalized (bool, optional) – If
True, use the normalized Laplacian. (default:True)cached (bool, optional) – If
True, cacheSelectOutput. (default:False)remove_self_loops (bool, optional) – Whether to remove self-loops after coarsening. (default:
True)degree_norm (bool, optional) – If
True, symmetrically normalize pooled adjacency. (default:True)edge_weight_norm (bool, optional) – Whether to normalize pooled edge weights. (default:
False)adj_transpose (bool, optional) – Passed to the connector for adjacency post-processing. (default:
True)lift (LiftType, optional) – Kept for API compatibility. EigenPooling always uses eigenvector-based lifting and ignores this option. (default:
"precomputed")s_inv_op (SinvType, optional) – Operation used to compute \(\mathbf{S}_\text{inv}\) in
SelectOutput. (default:"transpose")batched (bool, optional) – Kept for API compatibility. Dense batched mode is unsupported and this option is ignored. Use sparse inputs with
batchinstead. (default:False)sparse_output (bool, optional) – If
True, return sparse pooled connectivity. (default:False)cache_preprocessing (bool, optional) – Passed to
DenseSRCPooling; has no practical effect for this sparse-oriented path. (default:False)
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput | Tensor[source]¶
Forward pass.
- Parameters:
x (Tensor) – Node features \(\mathbf{X} \in \mathbb{R}^{N \times F}\). During lifting, accepts pooled features \(\mathbf{X}_{\text{pool}} \in \mathbb{R}^{K \times (H\cdot F)}\).
adj (Adj, optional) – Sparse graph connectivity (edge index,
SparseTensor, or torch COO tensor). Internally interpreted as \(\mathbf{A}_{\text{ext}}\); required whenlifting=False. (default:None)edge_weight (Tensor, optional) – Edge weights associated with
adj. (default:None)so (SelectOutput, optional) – Pre-computed selection output. (default:
None)mask (Tensor, optional) – Unused input-node validity mask. (default:
None)batch (Tensor, optional) – Batch vector for sparse multi-graph inputs. (default:
None)batch_pooled (Tensor, optional) – Batch vector for pooled nodes, used during lifting. (default:
None)lifting (bool, optional) – If
True, apply \(\texttt{lift}\) instead of pooling. (default:False)
- Returns:
Pooled output if
lifting=False, otherwise lifted features.- Return type:
- class GraclusPooling(lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', cached: bool = False, remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False)[source]¶
The Graclus pooling operator inspired by the paper “Weighted Graph Cuts without Eigenvectors: A Multilevel Approach” (Dhillon et al., TPAMI 2007).
The \(\texttt{select}\) operator is implemented with
GraclusSelect.The \(\texttt{reduce}\) operator is implemented with
BaseReduce.The \(\texttt{connect}\) operator is implemented with
SparseConnect.The \(\texttt{lift}\) operator is implemented with
BaseLift.
- Parameters:
lift (LiftType, optional) –
Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) –
The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class
ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")
cached (bool, optional) – If set to
True, the output of the \(\texttt{select}\) operation will be cached, so that it does not need to be recomputed. (default: False)
remove_self_loops (bool, optional) – If
True, the self-loops will be removed from the adjacency matrix. (default:True)degree_norm (bool, optional) – If
True, the adjacency matrix will be symmetrically normalized. (default:False)edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default:
False)
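The edge_weight_norm option is described as dividing the edge weights by the maximum absolute value per graph. A minimal plain-Python sketch of that rule follows; the function name is hypothetical, and it assumes that an edge belongs to the graph of its source node, as given by the batch vector.

```python
def normalize_edge_weights(edge_index, edge_weight, batch):
    """Divide each edge weight by the largest absolute weight in its graph.

    Hypothetical illustration; `edge_index` is [sources, targets].
    """
    graph_of_edge = [batch[src] for src in edge_index[0]]
    max_abs = {}
    for w, g in zip(edge_weight, graph_of_edge):
        max_abs[g] = max(max_abs.get(g, 0.0), abs(w))
    # Leave weights untouched if a graph only has zero weights.
    return [w / max_abs[g] if max_abs[g] > 0 else w for w, g in zip(edge_weight, graph_of_edge)]


# Two graphs in one batch: nodes 0,1 form graph 0 and nodes 2,3 form graph 1.
edge_index = [[0, 1, 2, 3], [1, 0, 3, 2]]
edge_weight = [2.0, 4.0, -10.0, 5.0]
batch = [0, 0, 1, 1]
normalized = normalize_edge_weights(edge_index, edge_weight, batch)
```

After normalization every graph has weights in \([-1, 1]\), and the sign of each weight is preserved.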
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. It can either be a
torch_sparse.SparseTensorof (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch or aTensorof shape \([2, E]\), where \(E\) is the number of edges in the batch. IfliftingisFalse, it cannot beNone. (default:None)edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges. (default:
None)so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default:
None)batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default:
None)lifting (bool, optional) – If set to
True, the \(\texttt{lift}\) operation is performed. (default:False)
- Returns:
The output of the pooling operator.
- Return type:
- class HOSCPooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, mu: float = 0.1, alpha: float = 0.5, hosc_ortho: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]¶
The high-order pooling operator from the paper “Higher-order clustering and pooling for Graph Neural Networks” (Duval & Malliaros, CIKM 2022).
The \(\texttt{select}\) operator is implemented with
MLPSelect.The \(\texttt{reduce}\) operator is implemented with
BaseReduce.The \(\texttt{connect}\) operator is implemented with
DenseConnect.The \(\texttt{lift}\) operator is implemented with
BaseLift.
This layer optimizes a combination of the following auxiliary losses:
the mincut loss (
mincut_loss),the orthogonality loss (
orthogonality_loss),the hosc orthogonality loss (
hosc_orthogonality_loss),
- Parameters:
in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.
k (int) – Number of clusters or supernodes in the pooled graph.
act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator.
dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default:
0.0)mu (float, optional) – A scalar that controls the importance given to regularization loss. (default:
0.1)alpha (float, optional) – A scalar in [0,1] controlling the importance granted to higher-order information in the loss function. (default:
0.5)
hosc_ortho (bool, optional) – If True, uses the hosc_orthogonality_loss instead of the orthogonality_loss. (default:
False)remove_self_loops (bool, optional) – If
True, the self-loops will be removed from the adjacency matrix. (default:True)degree_norm (bool, optional) – If
True, the adjacency matrix will be symmetrically normalized. (default:True)edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default:
False)adj_transpose (bool, optional) – If
True, the preprocessing step inDenseSRCPoolingand theDenseConnectoperation returns transposed adjacency matrices, so that they could be passed “as is” to the dense message-passing layers. (default:True)cache_preprocessing (bool, optional) – If
True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default:False)lift (LiftType, optional) –
Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) –
The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).
adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (
edge_index,torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default:None)edge_weight (Tensor, optional) – Edge weights associated with
adjwhen sparse connectivity is provided. (default:None)so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default:
None)mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with
Trueon real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default:None)batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default:
None)batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when
lifting=True. (default:None)lifting (bool, optional) – If set to
True, the \(\texttt{lift}\) operation is performed. (default:False)
- Returns:
The output of the pooling operator.
- Return type:
- compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None, adj_pool: Tensor | SparseTensor | None = None, edge_weight_pool: Tensor | None = None, batch_pooled: Tensor | None = None) dict[source]¶
Computes the auxiliary loss terms for unbatched (sparse) mode.
This method is used when
batched=False and operates on sparse adjacency matrices.
- Parameters:
edge_index (Adj) – Graph connectivity in sparse format.
edge_weight (Tensor, optional) – Edge weights of shape \((E,)\).
S (Tensor) – The dense assignment matrix of shape \((N, K)\).
batch (Tensor, optional) – Batch vector of shape \((N,)\).
adj_pool (Adj, optional) – The postprocessed pooled adjacency. When
self.sparse_output=True, an edge_index of shape \((2, E_\text{pool})\) over the block-diagonal supernode graph. When self.sparse_output=False, a dense tensor of shape \((B, K, K)\). Required when alpha < 1.
edge_weight_pool (Tensor, optional) – Edge weights of the postprocessed pooled adjacency, of shape \((E_\text{pool},)\). Required when
self.sparse_output=Trueandalpha < 1.batch_pooled (Tensor, optional) – Batch vector for the pooled supernodes. Required when
self.sparse_output=True,alpha < 1and the input contains multiple graphs.
- Returns:
A dictionary with
'hosc_loss' and 'ortho_loss'.
- Return type:
- class LaPooling(shortest_path_reg: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', lift_red_op: str = 'sum', batched: bool = True, sparse_output: bool = False)[source]¶
The LaPool pooling operator from the paper “Towards Interpretable Sparse Graph Representation Learning with Laplacian Pooling” (Noutahi et al., 2019).
The \(\texttt{select}\) operator is implemented with
LaPoolSelect.The \(\texttt{reduce}\) operator is implemented with
BaseReduce.The \(\texttt{connect}\) operator is implemented with
DenseConnect.The \(\texttt{lift}\) operator is implemented with
BaseLift.
- Parameters:
shortest_path_reg (bool, optional) – If
True, applies the shortest path regularization to the selection matrix (this can be expensive since it runs on CPU). (default:False)remove_self_loops (bool, optional) – Whether to remove self-loops from the graph after coarsening. (default:
True)degree_norm (bool, optional) – If
True, normalize the pooled adjacency matrix by the nodes’ degree. (default:True)edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default:
False)lift (LiftType, optional) –
Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) –
The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class
ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")
lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class
ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")
batched (bool, optional) – If
True, uses the batched dense path. (default:True)sparse_output (bool, optional) – If
True, returns block-diagonal sparse outputs. IfFalse, returns batched dense outputs. (default:False)
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, mask: Tensor | None = None, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\) (unbatched) or \([B, N, F]\) (batched), where \(N\) is the number of nodes, \(B\) is the batch size, and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. For unbatched mode: It can either be a
torch_sparse.SparseTensorof (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch or aTensorof shape \([2, E]\), where \(E\) is the number of edges in the batch. For batched mode: it can be either sparse connectivity (edge_index,torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor of shape \([B, N, N]\), or an already dense tensor of shape \([B, N, N]\). IfliftingisFalse, it cannot beNone. (default:None)edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges (unbatched mode only). (default:
None)so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default:
None)batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default:
None)batch_pooled (torch.Tensor, optional) – The batch vector for the pooled nodes. Required when lifting with dense \([N, K]\) SelectOutput on multi-graph batches. Pass out.batch from the pooling call. (default:
None)lifting (bool, optional) – If set to
True, the \(\texttt{lift}\) operation is performed. (default:False)mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\), where
Truemarks real (non-padded) nodes. Only used when inputs are already dense/padded. (default:None)
- Returns:
The output of the pooling operator.
- Return type:
- class JustBalancePooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, normalize_loss: bool = True, loss_coeff: float = 1.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]¶
The Just Balance pooling operator from the paper “Simplifying Clustering with Graph Neural Networks” (Bianchi et al., NLDL 2023).
The \(\texttt{select}\) operator is implemented with
MLPSelect.The \(\texttt{reduce}\) operator is implemented with
BaseReduce.The \(\texttt{connect}\) operator is implemented with
DenseConnect.The \(\texttt{lift}\) operator is implemented with
BaseLift.
This layer optimizes an auxiliary balance loss (
just_balance_loss()).
- Parameters:
in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.
k (int) – Number of clusters or supernodes in the pooled graph.
act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator.
dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default:
0.0)normalize_loss (bool, optional) – If set to
True, the loss is normalized by the number of nodes (default:True)loss_coeff (float, optional) – Coefficient for the loss (default:
1.0)remove_self_loops (bool, optional) – If
True, the self-loops will be removed from the adjacency matrix. (default:True)degree_norm (bool, optional) – If
True, the adjacency matrix will be symmetrically normalized. (default:True)edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default:
False)adj_transpose (bool, optional) – If
True, the preprocessing step inDenseSRCPoolingand theDenseConnectoperation returns transposed adjacency matrices, so that they could be passed “as is” to the dense message-passing layers. (default:True)cache_preprocessing (bool, optional) – If
True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default:False)lift (LiftType, optional) –
Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) –
The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).
adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (
edge_index,torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default:None)edge_weight (Tensor, optional) – Edge weights associated with
adjwhen sparse connectivity is provided. (default:None)so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default:
None)mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with
Trueon real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default:None)batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default:
None)batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when
lifting=True. (default:None)lifting (bool, optional) – If set to
True, the \(\texttt{lift}\) operation is performed. (default:False)
- Returns:
The output of the pooling operator.
- Return type:
- compute_sparse_loss(S: Tensor, batch: Tensor | None) dict[source]¶
Computes the auxiliary loss term for unbatched (sparse) mode.
This method is used when batched=False. The balance loss does not require adjacency; only the assignment matrix and batch vector are used.
- static data_transforms()[source]¶
Transforms the adjacency matrix \(\mathbf{A}\) as follows:
\[\mathbf{A} \to \mathbf{I} - \delta \mathbf{L}\]
where \(\mathbf{L}\) is the normalized Laplacian of the graph and \(\delta\) is a scaling factor. By default, \(\delta\) is set to \(0.85\).
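As a rough illustration of the transformation above, the following sketch builds \(\mathbf{I} - \delta \mathbf{L}\) for a small dense adjacency matrix in plain PyTorch. The helper name laplacian_transform, the dense input, and the zero-degree guard are assumptions made for this example; it is not the library's data_transforms() implementation, which may operate on sparse graph data.

```python
import torch


def laplacian_transform(adj: torch.Tensor, delta: float = 0.85) -> torch.Tensor:
    """Return I - delta * L, with L = I - D^{-1/2} A D^{-1/2} (hypothetical helper)."""
    deg = adj.sum(dim=1)
    # Isolated nodes have degree 0; give them a zero scaling factor instead of inf.
    d_inv_sqrt = torch.where(
        deg > 0, deg.clamp(min=1e-12).pow(-0.5), torch.zeros_like(deg)
    )
    # Symmetric normalization D^{-1/2} A D^{-1/2}.
    norm_adj = d_inv_sqrt.unsqueeze(1) * adj * d_inv_sqrt.unsqueeze(0)
    eye = torch.eye(adj.size(0), dtype=adj.dtype)
    laplacian = eye - norm_adj
    return eye - delta * laplacian


# Path graph on three nodes: 0 - 1 - 2
adj = torch.tensor([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
out = laplacian_transform(adj)
```

Since \(\mathbf{I} - \delta \mathbf{L} = (1 - \delta)\mathbf{I} + \delta \mathbf{D}^{-1/2} \mathbf{A} \mathbf{D}^{-1/2}\), the result keeps the sparsity pattern of \(\mathbf{A}\) and adds a self-loop of weight \(1 - \delta\) to every node.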
- class KMISPooling(in_channels: int | None = None, order_k: int = 1, scorer: str = 'linear', score_heuristic: str | None = 'greedy', force_undirected: bool = False, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', reduce_red_op: str | None = 'sum', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False, cached: bool = False)[source]¶
The Maximal \(k\)-Independent Set (\(k\)-MIS) pooling operator from the paper “Generalizing Downsampling from Regular Data to Graphs” (Bacciu et al., AAAI 2023).
The \(k\)-MIS pooling method selects a subset of nodes based on their scores and a maximal \(k\)-independent set strategy. It first scores each node with the method given by the scorer argument and then selects a maximal \(k\)-independent set of nodes. The selected nodes are pooled using the specified aggregation functions, and the pooled node features can be lifted back using different choices of \(\mathbf{S}_\text{inv}\).
The \(\texttt{select}\) operator is implemented with KMISSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with SparseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
- Parameters:
in_channels (int, optional) – Size of each input sample. Ignored if scorer is not "linear". (default: None)
order_k (int) – The order \(k\) of the independent set. (default: 1)
scorer (str or Callable) – A function that computes a score for each node. Nodes with a higher score have a higher chance of being selected for pooling. It can be one of:
"linear" (default): Uses a sigmoid-activated linear layer to compute the scores. in_channels and score_passthrough must be set when using this option.
"random": Assigns a random score in \([0, 1]\) to each node.
"constant": Assigns a constant score of \(1\) to each node.
"canonical": Assigns the score \(-i\) to the \(i\)-th node.
"first" (or "last"): Uses the first (or last) feature dimension of \(\mathbf{X}\) as the node scores.
"degree": Uses the degree of each node as the score.
A custom function: Accepts the arguments (x, edge_index, edge_weight, batch) and must return a one-dimensional Tensor.
score_heuristic (str, optional) – Heuristic to increase the total score of selected nodes. Given an initial score vector \(\mathbf{s} \in \mathbb{R}^n\), options include:
None: No heuristic applied.
"greedy" (default): Computes the updated score \(\mathbf{s}'\) as
\[\mathbf{s}' = \mathbf{s} \oslash (\mathbf{A} + \mathbf{I})^k \mathbf{1}\]
where \(\oslash\) is element-wise division.
"w-greedy": Computes the updated score \(\mathbf{s}'\) as
\[\mathbf{s}' = \mathbf{s} \oslash (\mathbf{A} + \mathbf{I})^k \mathbf{s}\]
force_undirected (bool, optional) – Whether to force the input graph to be undirected. (default: False)
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Uses the \(\mathbf{S}_\text{inv}\) already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
reduce_red_op (ReduceType, optional) – If None, node features are taken by indexing the MIS nodes (no reduction). Otherwise the reducer is used; the reduce step always computes \(\mathbf{S}^\top \mathbf{X}\). (default: "sum")
connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")
lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")
remove_self_loops (bool, optional) – Whether to remove self-loops from the graph after coarsening. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
cached (bool, optional) – If set to True, the outputs of the \(\texttt{select}\) and \(\texttt{connect}\) operations will be cached, so that they do not need to be recomputed. If True, the scorer cannot be "linear". (default: False)
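The two score_heuristic formulas can be checked on a toy graph with a few lines of dense PyTorch. This is only a sketch of the equations as written above: the helper name rescale_scores and the dense matrix power are assumptions for illustration, and the library may compute the same quantity differently (for example, with sparse propagation).

```python
import torch


def rescale_scores(score, adj, k=1, weighted=False):
    """Apply s' = s / ((A + I)^k b), with b = 1 ("greedy") or b = s ("w-greedy")."""
    n = adj.size(0)
    # (A + I)^k aggregates over the k-hop neighbourhood of each node, itself included.
    prop = torch.linalg.matrix_power(adj + torch.eye(n, dtype=adj.dtype), k)
    base = score if weighted else torch.ones_like(score)
    return score / (prop @ base)


# Path graph on three nodes: 0 - 1 - 2
adj = torch.tensor([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
score = torch.tensor([1.0, 2.0, 3.0])

greedy = rescale_scores(score, adj, k=1)                    # s / [2, 3, 2]
w_greedy = rescale_scores(score, adj, k=1, weighted=True)   # s / [3, 6, 5]
```

In both variants a node in a crowded neighbourhood is penalized, since selecting it excludes more candidates from the independent set.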
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. It can either be a torch_sparse.SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch, or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)
edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput
- class MaxCutPooling(in_channels: int, ratio: float | int = 0.5, assign_all_nodes: bool = True, max_iter: int = 5, loss_coeff: float = 1.0, mp_units: list = [32, 32, 32, 32], mp_act: str = 'tanh', mlp_units: list = [16, 16], mlp_act: str = 'relu', act: str = 'tanh', delta: float = 2.0, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = True)[source]¶
The MaxCut pooling operator from the paper “MaxCutPool: differentiable feature-aware Maxcut for pooling in graph neural networks” (Abate & Bianchi, ICLR 2025).
This pooling layer uses a differentiable MaxCut objective to learn node assignments. It is particularly effective for heterophilic graphs and provides robust pooling through graph topology-aware scoring.
The \(\texttt{select}\) operator is implemented with MaxCutSelect, which computes MaxCut-aware node scores and performs top-k selection.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with SparseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
This layer provides one auxiliary loss:
the MaxCut loss (maxcut_loss).
- Parameters:
in_channels (int) – Size of each input sample.
ratio (Union[float, int]) – Graph pooling ratio for top-k selection. (default: 0.5)
assign_all_nodes (bool, optional) – Whether to create assignment matrices that map all nodes to the closest supernode (True) or perform standard top-k selection (False). (default: True)
max_iter (int, optional) – Maximum distance for the closest node assignment. (default: 5)
loss_coeff (float, optional) – Coefficient for the MaxCut auxiliary loss. (default: 1.0)
mp_units (list, optional) – List of hidden units for message passing layers. (default: [32, 32, 32, 32])
mp_act (str, optional) – Activation function for message passing layers. (default: "tanh")
mlp_units (list, optional) – List of hidden units for MLP layers. (default: [16, 16])
mlp_act (str, optional) – Activation function for MLP layers. (default: "relu")
act (str, optional) – Activation function for the final score. (default: "tanh")
delta (float, optional) – Delta parameter for propagation matrix computation. (default: 2.0)
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Uses the \(\mathbf{S}_\text{inv}\) already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")
lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: True)
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass of the MaxCut pooling operator.
- Parameters:
x (Tensor) – Node features of shape \((N, F)\).
adj (Adj, optional) – Graph connectivity. Can be an edge_index tensor of shape \((2, E)\) or a SparseTensor. (default: None)
edge_weight (Tensor, optional) – Edge weights of shape \((E,)\). (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (Tensor, optional) – Batch assignments of shape \((N,)\). (default: None)
lifting (bool, optional) – If True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput
- class MinCutPooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, cut_loss_coeff: float = 1.0, ortho_loss_coeff: float = 1.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]¶
The MinCut pooling operator from the paper “Spectral Clustering in Graph Neural Networks for Graph Pooling” (Bianchi et al., ICML 2020).
The \(\texttt{select}\) operator is implemented with MLPSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with DenseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
This layer optimizes two auxiliary losses:
the mincut loss (mincut_loss() for batched, sparse_mincut_loss() for unbatched),
the orthogonality loss (orthogonality_loss() for batched, unbatched_orthogonality_loss() for unbatched).
- Parameters:
in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.
k (int) – Number of clusters or supernodes in the pooled graph.
act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator. (default: None)
dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)
cut_loss_coeff (float, optional) – Coefficient for the MinCut loss. (default: 1.0)
ortho_loss_coeff (float, optional) – Coefficient for the orthogonality loss. (default: 1.0)
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
adj_transpose (bool, optional) – If True, the preprocessing step in tgp.src.DenseSRCPooling and the tgp.connect.DenseConnect operation return transposed adjacency matrices, so that they can be passed "as is" to the dense message-passing layers. (default: True)
cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Uses the \(\mathbf{S}_\text{inv}\) already stored in the "s_inv" attribute of the tgp.select.SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the tgp.select.SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
batched (bool, optional) – If True, uses the batched dense path, which converts sparse inputs to dense padded tensors. If False, uses the unbatched path, which operates on sparse adjacency matrices without padding and is more memory-efficient for graphs of varying sizes. (default: True)
sparse_output (bool, optional) – If True, returns block-diagonal sparse outputs. If False, returns batched dense outputs. (default: False)
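To make the two auxiliary losses concrete, here is a minimal single-graph sketch of the objectives as defined in the MinCut pooling paper: the cut term \(-\mathrm{Tr}(\mathbf{S}^\top \mathbf{A} \mathbf{S}) / \mathrm{Tr}(\mathbf{S}^\top \mathbf{D} \mathbf{S})\) and the orthogonality term \(\lVert \mathbf{S}^\top \mathbf{S} / \lVert \mathbf{S}^\top \mathbf{S} \rVert_F - \mathbf{I}_K / \sqrt{K} \rVert_F\). The function name mincut_losses_sketch and the dense, unbatched inputs are assumptions; the library's mincut_loss() and orthogonality_loss() may differ in batching, masking, and reduction.

```python
import torch


def mincut_losses_sketch(adj: torch.Tensor, s: torch.Tensor):
    """Cut and orthogonality terms for one graph (adj: [N, N], s: [N, K])."""
    deg = torch.diag(adj.sum(dim=1))
    # Cut term: ratio between intra-cluster edge weight and cluster volume.
    cut = -torch.trace(s.T @ adj @ s) / torch.trace(s.T @ deg @ s)
    # Orthogonality term: pushes clusters to be disjoint and similar in size.
    k = s.size(1)
    ss = s.T @ s
    ortho = torch.linalg.norm(ss / torch.linalg.norm(ss) - torch.eye(k) / k**0.5)
    return cut, ortho


# Two disconnected edges: {0, 1} and {2, 3}
adj = torch.tensor(
    [[0.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0]]
)
s_good = torch.tensor([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
s_uniform = torch.full((4, 2), 0.5)

cut_good, ortho_good = mincut_losses_sketch(adj, s_good)
cut_uniform, ortho_uniform = mincut_losses_sketch(adj, s_uniform)
```

On this toy graph both assignments reach the same cut value, but only the uniform one has a non-zero orthogonality term, which is why the two losses are optimized together.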
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – Node feature tensor. For batched mode: \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\). For unbatched mode: \(\mathbf{X} \in \mathbb{R}^{N \times F}\), where \(N\) is the total number of nodes across all graphs.
adj (Adj, optional) – The connectivity matrix. For batched mode: either sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor of shape \([B, N, N]\), or an already dense tensor of shape \([B, N, N]\). For unbatched mode: a sparse connectivity matrix in one of the formats supported by Adj (edge_index, SparseTensor, etc.). (default: None)
edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges (unbatched mode only). (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with True on real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default: None)
batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)
batch_pooled (Tensor, optional) – The batch vector for the pooled nodes. Required when lifting with a dense \([N, K]\) SelectOutput on multi-graph batches. Pass out.batch from the pooling call. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput
- compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None) dict[source]¶
Computes the auxiliary loss terms for unbatched (sparse) mode.
This method is used when batched=False and operates on sparse adjacency matrices without requiring padding or densification.
- Parameters:
- Returns:
A dictionary with the different terms of the auxiliary loss:
'cut_loss': The sparse mincut loss weighted by cut_loss_coeff.
'ortho_loss': The unbatched orthogonality loss weighted by ortho_loss_coeff.
- Return type:
dict
- class NDPPooling(lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', lift_red_op: str = 'sum', cached: bool = False)[source]¶
The pooling operator from the paper “Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling” (Bianchi et al., TNNLS 2020).
The \(\texttt{select}\) operator is implemented with NDPSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with KronConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
- Parameters:
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Uses the \(\mathbf{S}_\text{inv}\) already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")
cached (bool, optional) – If set to True, the outputs of the \(\texttt{select}\) and \(\texttt{connect}\) operations will be cached, so that they do not need to be recomputed. (default: False)
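The practical difference between the "transpose" and "inverse" options can be seen on a hard assignment of three nodes to two supernodes. The sketch below uses plain PyTorch and assumes, for illustration only, that \(\mathbf{S}\) has shape \([N, K]\), that \(\mathbf{S}_\text{inv}\) has shape \([K, N]\), and that lifting computes \(\mathbf{S}_\text{inv}^\top \mathbf{X}_\text{pool}\); it does not call BaseLift, whose conventions may differ.

```python
import torch

# Hard assignment of 3 nodes to 2 supernodes: nodes 0 and 1 -> 0, node 2 -> 1.
s = torch.tensor([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Reduce step with sum aggregation: X_pool = S^T X.
x_pool = s.T @ x

# "transpose": S_inv = S^T, so each node receives its supernode's summed features.
s_inv_transpose = s.T
x_lift_transpose = s_inv_transpose.T @ x_pool

# "inverse": S_inv = S^+, which here rescales by the supernode size,
# so each node receives the mean of its supernode's original features.
s_inv_pinv = torch.linalg.pinv(s)
x_lift_pinv = s_inv_pinv.T @ x_pool
```

Under these assumptions the transpose is cheaper but keeps the summed magnitude, while the pseudoinverse gives the least-squares reconstruction of the original features.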
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. It can either be a torch_sparse.SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch, or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)
edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput
- class NMFPooling(k: int, cached: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = False, sparse_output: bool = False, cache_preprocessing: bool = False)[source]¶
The Non-negative Matrix Factorization pooling as proposed in the paper “A Non-Negative Factorization approach to node pooling in Graph Convolutional Neural Networks” (Bacciu and Di Sotto, AIIA 2019).
NMF pooling performs a non-negative matrix factorization of the adjacency matrix
\[\mathbf{A} \approx \mathbf{W} \mathbf{H}\]
where \(\mathbf{H}\) is the soft cluster assignment matrix and \(\mathbf{W}\) is the cluster centroid matrix.
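For intuition about the factorization \(\mathbf{A} \approx \mathbf{W} \mathbf{H}\), the sketch below runs the classic multiplicative-update rules for the Frobenius objective on a tiny adjacency matrix. The function nmf_sketch, its iteration count, and the toy graph (two 2-cliques with self-loops) are assumptions chosen for this example; NMFSelect may rely on a different solver and initialization.

```python
import torch


def nmf_sketch(adj, k, n_iter=200, eps=1e-9, seed=0):
    """Factor adj ([N, N], non-negative) as W @ H with W: [N, k], H: [k, N]."""
    gen = torch.Generator().manual_seed(seed)
    n = adj.size(0)
    w = torch.rand(n, k, generator=gen)
    h = torch.rand(k, n, generator=gen)
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of W and H non-negative.
        h = h * (w.T @ adj) / (w.T @ w @ h + eps)
        w = w * (adj @ h.T) / (w @ h @ h.T + eps)
    return w, h


# Two 2-cliques with self-loops; this matrix has an exact rank-2 factorization.
adj = torch.tensor(
    [[1.0, 1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0],
     [0.0, 0.0, 1.0, 1.0]]
)
w0, h0 = nmf_sketch(adj, k=2, n_iter=0)   # random initialization only
w, h = nmf_sketch(adj, k=2)               # after the updates
err_init = torch.linalg.norm(adj - w0 @ h0)
err_final = torch.linalg.norm(adj - w @ h)
```

Column \(j\) of \(\mathbf{H}\) can then be read as the soft membership of node \(j\) in each of the \(k\) clusters.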
The \(\texttt{select}\) operator is implemented with NMFSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with DenseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
Notes
This implementation supports sparse inputs and multi-graph batches via edge_index + batch.
Dense padded batched inputs (\([B, N, N]\)) are not supported.
- Parameters:
k (int) – Number of clusters or supernodes in the pooled graph.
cached (bool, optional) – If set to True, the outputs of the \(\texttt{select}\) and \(\texttt{connect}\) operations will be cached, so that they do not need to be recomputed. (default: False)
cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)
remove_self_loops (bool, optional) – Whether to remove self-loops from the graph after coarsening. (default: True)
degree_norm (bool, optional) – If True, normalize the pooled adjacency matrix by the nodes' degree. (default: True)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed "as is" to the dense message-passing layers. (default: True)
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Uses the \(\mathbf{S}_\text{inv}\) already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
batched (bool, optional) – Kept for API compatibility. Dense padded batched mode is unsupported and this option is ignored. (default: False)
sparse_output (bool, optional) – If True, return sparse pooled connectivity. (default: False)
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput | Tensor[source]¶
Forward pass.
- Parameters:
x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{N \times F}\).
adj (Adj, optional) – The connectivity matrix, given as sparse connectivity in one of the formats supported by Adj. If lifting is False, it cannot be None. (default: None)
edge_weight (Tensor, optional) – Edge weights for sparse inputs. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
mask (Tensor, optional) – Unused input-node validity mask. (default: None)
batch (Tensor, optional) – Batch vector \(\mathbf{b} \in \{0,\ldots,B-1\}^{N}\) for sparse inputs. (default: None)
batch_pooled (Tensor, optional) – Batch vector for pooled nodes. Required when lifting from dense \([N, K]\) assignments on multi-graph batches. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput or Tensor
- class NoPool[source]¶
Identity pooling operator that performs no actual pooling. It creates a consistent SelectOutput and PoolingOutput structure, but each node maps to itself and all features and edges are preserved unchanged.
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. (default: None)
edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
PoolingOutput or Tensor
- class PANPooling(in_channels: int, ratio: float = 0.5, min_score: float | None = None, multiplier: float = 1.0, nonlinearity: str | Callable = 'tanh', lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: str = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = False, degree_norm: bool = False, edge_weight_norm: bool = False)[source]¶
The path integral based pooling operator from the paper “Path Integral Based Convolution and Pooling for Graph Neural Networks” (Ma et al., NeurIPS 2020).
PAN pooling performs top-\(k\) pooling where global node importance is measured based on node features \(\mathbf{X}\) and the MET matrix \(\mathbf{M}\):
\[{\rm score} = \beta_1 \mathbf{X} \cdot \mathbf{p} + \beta_2 {\rm deg}(\mathbf{M})\]
The MET matrix must be computed by the PANConv layer.
The \(\texttt{select}\) operator is implemented with TopkSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with SparseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
- Parameters:
in_channels (int) – Size of each input sample.
ratio (float) – Graph pooling ratio, which is used to compute \(k = \lceil \mathrm{ratio} \cdot N \rceil\). This value is ignored if min_score is not None. (default: 0.5)
min_score (float, optional) – Minimal node score \(\tilde{\alpha}\) which is used to compute indices of pooled nodes \(\mathbf{i} = \mathbf{s}_i > \tilde{\alpha}\). When this value is not None, the ratio argument is ignored. (default: None)
multiplier (float, optional) – Coefficient by which features get multiplied after pooling. This can be useful for large graphs and when min_score is used. (default: 1.0)
nonlinearity (str or callable, optional) – The non-linearity to use when computing the score. (default: "tanh")
lift (LiftType, optional) – Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Uses the \(\mathbf{S}_\text{inv}\) already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) – The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")
lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: False)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
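The interplay between ratio and min_score described above can be sketched for a single graph as follows. The helper select_nodes_sketch is hypothetical and ignores batching and score normalization, which TopkSelect may handle differently; it only illustrates \(k = \lceil \mathrm{ratio} \cdot N \rceil\) and the thresholding rule \(\mathbf{s}_i > \tilde{\alpha}\).

```python
import math

import torch


def select_nodes_sketch(score, ratio=0.5, min_score=None):
    """Return the indices of the nodes kept by top-k or threshold selection."""
    if min_score is not None:
        # Threshold mode: ratio is ignored, keep every node scoring above min_score.
        return (score > min_score).nonzero(as_tuple=True)[0]
    # Ratio mode: keep the k = ceil(ratio * N) highest-scoring nodes.
    k = math.ceil(ratio * score.numel())
    return torch.topk(score, k).indices


score = torch.tensor([0.1, 0.9, 0.4, 0.7, 0.3])
kept_by_ratio = select_nodes_sketch(score, ratio=0.5)            # k = ceil(2.5) = 3
kept_by_threshold = select_nodes_sketch(score, min_score=0.35)
```

With ratio the pooled graph size is fixed relative to \(N\), whereas with min_score it depends on how many nodes clear the threshold.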
- forward(x: Tensor, adj: SparseTensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (SparseTensor) – The MET matrix \(\mathbf{M}\) from the PANConv layer. It has a (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch.
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
The output of the pooling operator.
- Return type:
- class SAGPooling(in_channels: int, ratio: float | int = 0.5, GNN: Module | None = None, min_score: float | None = None, multiplier: float = 1.0, nonlinearity: str | Callable = 'tanh', lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: str = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False, **kwargs)[source]¶
The self-attention pooling operator from the paper “Self-Attention Graph Pooling” (Lee et al., ICML 2019).
It computes the attention scores \(\mathbf{a}\) used by the top-\(k\) selector as:
\[\mathbf{a} = \textrm{GNN}(\mathbf{X}, \mathbf{A})\]
The \(\texttt{select}\) operator is implemented with TopkSelect.
The \(\texttt{reduce}\) operator is implemented with BaseReduce.
The \(\texttt{connect}\) operator is implemented with SparseConnect.
The \(\texttt{lift}\) operator is implemented with BaseLift.
- Parameters:
in_channels (int) – Size of each input sample.
ratio (float or int) – Graph pooling ratio, which is used to compute \(k = \lceil \mathrm{ratio} \cdot N \rceil\), or the value of \(k\) itself, depending on whether the type of ratio is float or int. This value is ignored if min_score is not None. (default: 0.5)
GNN (Module, optional) – A graph neural network layer for calculating projection scores (one of GraphConv, GCNConv, GATConv or SAGEConv). (default: GraphConv)
min_score (float, optional) – Minimal node score \(\tilde{\alpha}\) which is used to compute indices of pooled nodes \(\mathbf{i} = \mathbf{s}_i > \tilde{\alpha}\). When this value is not None, the ratio argument is ignored. (default: None)
multiplier (float, optional) – Coefficient by which features get multiplied after pooling. This can be useful for large graphs and when min_score is used. (default: 1)
nonlinearity (str or callable, optional) – The non-linearity to use when computing the score. (default: "tanh")
lift (LiftType, optional) –
Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) –
The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")
lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
**kwargs (any, optional) – Additional parameters for initializing the graph neural network layer.
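The interaction between ratio and min_score described above can be sketched in a few lines of plain Python. This is only an illustration for a single graph, not the TopkSelect implementation: select_topk is a hypothetical helper, scores are a plain list, and the real operator may handle ties and empty selections differently.

```python
import math


def select_topk(scores, ratio=0.5, min_score=None):
    """Return the indices of the nodes kept by a top-k style selector.

    Hypothetical helper for one graph; the library works on batched tensors.
    """
    n = len(scores)

    # When min_score is given, ratio is ignored and a threshold is applied.
    if min_score is not None:
        return [i for i, s in enumerate(scores) if s > min_score]

    # A float ratio yields k = ceil(ratio * N); an int ratio is k itself.
    if isinstance(ratio, float):
        k = math.ceil(ratio * n)
    else:
        k = min(int(ratio), n)

    # Rank nodes by decreasing score and keep the first k.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


scores = [0.1, 0.9, -0.3, 0.5, 0.7]

print(select_topk(scores, ratio=0.5))       # [1, 4, 3]  (k = ceil(2.5) = 3)
print(select_topk(scores, ratio=2))         # [1, 4]     (k = 2)
print(select_topk(scores, min_score=0.6))   # [1, 4]     (scores above 0.6)
```

With a float ratio the number of kept nodes scales with the graph size, while an int ratio or a min_score threshold decouples it from \(N\).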
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, attn: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. It can either be a SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)
edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)
attn (Tensor, optional) – Optional node-level matrix to use for computing attention scores instead of using the node feature matrix x. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
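To make the role of the batch vector concrete, the following sketch applies a ratio-based selection separately to each graph in a batch, so that \(k\) is computed from the size of each graph rather than from the total node count. It is a simplified pure-Python illustration under the assumption that selection is done graph by graph; topk_per_graph is a hypothetical name and not part of the library.

```python
import math
from collections import defaultdict


def topk_per_graph(scores, batch, ratio=0.5):
    """Keep the ceil(ratio * N_g) best-scoring nodes of every graph g."""
    # Group node indices by the graph they belong to.
    nodes_of = defaultdict(list)
    for node, g in enumerate(batch):
        nodes_of[g].append(node)

    kept = []
    for g in sorted(nodes_of):
        nodes = nodes_of[g]
        k = math.ceil(ratio * len(nodes))
        ranked = sorted(nodes, key=lambda i: scores[i], reverse=True)
        kept.extend(ranked[:k])
    return kept


scores = [0.2, 0.8, 0.5, 0.9, 0.1]
batch = [0, 0, 0, 1, 1]  # three nodes in graph 0, two nodes in graph 1

kept = topk_per_graph(scores, batch, ratio=0.5)
print(kept)  # [1, 2, 3]
```

Graph 0 keeps two of its three nodes and graph 1 keeps one of its two, even though node 3 has the highest score in the whole batch.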
- Returns:
The output of the pooling operator.
- Return type:
- class SEPPooling(lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', cached: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False)[source]¶
The SEPPooling operator from the paper “Structural Entropy Guided Graph Hierarchical Pooling” (Wu et al., ICML 2022).
SEP performs graph pooling by optimizing cluster assignments globally with the goal of minimizing structural entropy. SEP internally builds a coding tree. In standard pooling mode (forward()), only the first partition above the original nodes is exposed, i.e., node-to-depth-1 clusters.
Note
A single call to forward() only returns the finest pooled partition (the bottom non-leaf level of the SEP tree). This corresponds to using only a depth-2 tree view (nodes -> first supernodes -> root). To use deeper SEP hierarchies (depth > 2) as intended by the original method, use pre-coarsening via multi_level_precoarsening() (or PreCoarsening with repeated "sep" levels).
Example
Standard one-level forward (returns only depth-1 assignments):
    pool = SEPPooling()
    out = pool(
        x=x,
        adj=edge_index,
        edge_weight=edge_weight,
        batch=batch,
    )
    # out.so maps original nodes -> first-level SEP clusters only.
Multi-level SEP pre-coarsening (returns hierarchy levels):
    pool = SEPPooling()
    levels = pool.multi_level_precoarsening(
        levels=3,
        edge_index=edge_index,
        edge_weight=edge_weight,
        batch=batch,
        num_nodes=x.size(0),
    )
    # levels[0].so: nodes -> level-1
    # levels[1].so: level-1 -> level-2
    # levels[2].so: level-2 -> level-3
Equivalent transform-level usage:
    from tgp.data.transforms import PreCoarsening

    transform = PreCoarsening(poolers=["sep", "sep", "sep"])
    data = transform(data)
    # data.pooled_data contains 3 pooled levels in order.
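Because each level of a multi-level hierarchy only maps the previous level to the next one, recovering the cluster of an original node at a deeper level means chaining the per-level assignments. The sketch below shows that composition with plain integer lists standing in for the per-level select outputs; the list representation and the helper compose_assignments are assumptions for illustration and do not reflect how SelectOutput stores assignments.

```python
def compose_assignments(levels):
    """Chain per-level cluster vectors into one map from original nodes.

    levels[0][n] is the level-1 cluster of node n, levels[1][c] is the
    level-2 cluster of level-1 cluster c, and so on.
    """
    current = list(levels[0])
    for next_level in levels[1:]:
        current = [next_level[c] for c in current]
    return current


level1 = [0, 0, 1, 1, 2, 2]  # 6 nodes       -> 3 level-1 clusters
level2 = [0, 0, 1]           # 3 clusters    -> 2 level-2 clusters
level3 = [0, 0]              # 2 clusters    -> 1 level-3 cluster

print(compose_assignments([level1, level2]))          # [0, 0, 0, 0, 1, 1]
print(compose_assignments([level1, level2, level3]))  # [0, 0, 0, 0, 0, 0]
```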
- Parameters:
cached (bool, optional) – If True, cache SelectOutput. (default: False)
remove_self_loops (bool, optional) – Whether to remove self-loops after coarsening. (default: True)
degree_norm (bool, optional) – If True, symmetrically normalize pooled adjacency. (default: True)
edge_weight_norm (bool, optional) – Whether to normalize pooled edge weights. (default: False)
lift (LiftType, optional) – Operation used by BaseLift to compute \(\mathbf{S}_\text{inv}\) during lifting. (default: "precomputed")
s_inv_op (SinvType, optional) – Operation used to compute \(\mathbf{S}_\text{inv}\) in SelectOutput. (default: "transpose")
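The degree_norm option refers to symmetric normalization of the pooled adjacency matrix. A common reading of that term is \(\mathbf{D}^{-1/2} \mathbf{A} \mathbf{D}^{-1/2}\), with \(\mathbf{D}\) the diagonal degree matrix; the sketch below implements that formula on a dense nested-list matrix. Both the exact formula and the treatment of isolated nodes are assumptions here, and symmetric_normalize is a hypothetical helper rather than the library routine.

```python
import math


def symmetric_normalize(adj):
    """Return D^{-1/2} A D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    degree = [sum(row) for row in adj]

    # Isolated nodes (degree 0) get a zero scaling factor to avoid
    # dividing by zero; their rows and columns stay all-zero.
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]

    return [
        [inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
        for i in range(n)
    ]


# Path graph 0 - 1 - 2 plus an isolated node 3.
adj = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

norm = symmetric_normalize(adj)
for row in norm:
    print([round(v, 4) for v in row])
```

Each edge weight is rescaled by the degrees of both endpoints, so the result stays symmetric whenever the input is symmetric.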
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput | Tensor[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. It can either be a torch_sparse.SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)
edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
- Returns:
Pooled output if lifting=False, otherwise lifted features.
- Return type:
- class TopkPooling(in_channels: int, ratio: int | float = 0.5, min_score: float | None = None, multiplier: float = 1.0, nonlinearity: str | Callable = 'tanh', lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: str = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False)[source]¶
The \(\mathrm{top}_k\) pooling operator from the papers “Graph U-Nets” (Gao & Ji, ICML 2019), “Towards Sparse Hierarchical Graph Classifiers” (Cangea et al., 2018), and “Understanding Attention and Generalization in Graph Neural Networks” (Knyazev et al., NeurIPS 2019).
- Parameters:
in_channels (int) – Size of each input sample.
ratio (float or int) – The graph pooling ratio, which is used to compute \(k = \lceil \mathrm{ratio} \cdot N \rceil\), or the value of \(k\) itself, depending on whether the type of ratio is float or int. This value is ignored if min_score is not None. (default: 0.5)
min_score (float, optional) – Minimal node score \(\tilde{\alpha}\) which is used to compute indices of pooled nodes \(\mathbf{i} = \mathbf{s}_i > \tilde{\alpha}\). When this value is not None, the ratio argument is ignored. (default: None)
multiplier (float, optional) – Coefficient by which features get multiplied after pooling. This can be useful for large graphs and when min_score is used. (default: 1)
nonlinearity (str or callable, optional) – The non-linearity to use when computing the score. (default: "tanh")
lift (LiftType, optional) –
Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.
"precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.
"transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
s_inv_op (SinvType, optional) –
The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:
"transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).
"inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")
lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")
remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)
degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)
edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
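The effect of connect_red_op and remove_self_loops can be pictured as follows: every original edge is relabelled with the supernodes of its endpoints, and edges that end up on the same supernode pair are merged with the chosen reduction. The sketch uses a dense cluster vector and plain Python containers purely for illustration; coarsen_edges is a hypothetical helper and does not reproduce SparseConnect, which works on the sparse selection produced by the top-\(k\) selector.

```python
from collections import defaultdict


def coarsen_edges(edges, weights, cluster, reduce="sum", remove_self_loops=False):
    """Relabel edges by supernode and merge duplicates with a reduction."""
    reducers = {
        "sum": sum,
        "mean": lambda ws: sum(ws) / len(ws),
        "max": max,
        "min": min,
    }
    aggregate = reducers[reduce]

    # Collect the weights of all edges that fall on the same supernode pair.
    buckets = defaultdict(list)
    for (u, v), w in zip(edges, weights):
        i, j = cluster[u], cluster[v]
        if remove_self_loops and i == j:
            continue
        buckets[(i, j)].append(w)

    return {pair: aggregate(ws) for pair, ws in sorted(buckets.items())}


edges = [(0, 2), (1, 2), (1, 3), (0, 1)]
weights = [1.0, 2.0, 4.0, 5.0]
cluster = [0, 0, 1, 1]  # nodes 0-1 -> supernode 0, nodes 2-3 -> supernode 1

pooled_sum = coarsen_edges(edges, weights, cluster, reduce="sum")
pooled_max = coarsen_edges(
    edges, weights, cluster, reduce="max", remove_self_loops=True
)
print(pooled_sum)  # {(0, 0): 5.0, (0, 1): 7.0}
print(pooled_max)  # {(0, 1): 4.0}
```

The edge between nodes 0 and 1 collapses into a self-loop on supernode 0, which is why it disappears when remove_self_loops is enabled.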
- forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, attn: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]¶
Forward pass.
- Parameters:
x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.
adj (Adj, optional) – The connectivity matrix. It can either be a SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)
edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default: None)
so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)
batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)
attn (Tensor, optional) – Optional node-level matrix to use for computing attention scores instead of using the node feature matrix x. (default: None)
lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)
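The difference between the "transpose" and "inverse" choices for \(\mathbf{S}_\text{inv}\) is easiest to see on a hard one-hot assignment, where \(\mathbf{S}^+ = (\mathbf{S}^\top \mathbf{S})^{-1} \mathbf{S}^\top\) simply rescales \(\mathbf{S}^\top\) by the inverse cluster sizes. The sketch below assumes a sum reduction \(\mathbf{X}_\text{pool} = \mathbf{S}^\top \mathbf{X}\), a lift of the form \(\mathbf{S}_\text{inv}^\top \mathbf{X}_\text{pool}\), and non-empty clusters; these conventions, the cluster-vector representation, and the helper names are assumptions for illustration, not the BaseLift implementation.

```python
def reduce_sum(x, cluster, num_clusters):
    """Pool node features by summing them within each cluster (S^T X)."""
    pooled = [[0.0] * len(x[0]) for _ in range(num_clusters)]
    for features, c in zip(x, cluster):
        for f, value in enumerate(features):
            pooled[c][f] += value
    return pooled


def lift(x_pool, cluster, mode="transpose"):
    """Map pooled features back to the original nodes.

    "transpose": every node copies the feature of its cluster.
    "inverse":   the copy is divided by the cluster size, which is what
                 the pseudoinverse of a one-hot S amounts to.
    """
    sizes = [0] * len(x_pool)
    for c in cluster:
        sizes[c] += 1

    lifted = []
    for c in cluster:
        scale = 1.0 if mode == "transpose" else 1.0 / sizes[c]
        lifted.append([scale * value for value in x_pool[c]])
    return lifted


x = [[1.0], [3.0], [10.0]]
cluster = [0, 0, 1]  # nodes 0-1 share a cluster, node 2 is alone

x_pool = reduce_sum(x, cluster, num_clusters=2)
print(x_pool)                              # [[4.0], [10.0]]
print(lift(x_pool, cluster, "transpose"))  # [[4.0], [4.0], [10.0]]
print(lift(x_pool, cluster, "inverse"))    # [[2.0], [2.0], [10.0]]
```

Under these assumptions, lifting with the transpose hands each node the cluster sum, whereas lifting with the pseudoinverse hands it the cluster mean.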
- Returns:
The output of the pooling operator.
- Return type: