Poolers

get_pooler

Return a pooling operator initialized with filtered **kwargs.

ASAPooling

The Adaptive Structure Aware Pooling operator from the paper "ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations" (Ranjan et al., AAAI 2020).

AsymCheegerCutPooling

The asymmetric Cheeger cut pooling layer from the paper "Total Variation Graph Neural Networks" (Hansen & Bianchi, ICML 2023).

BNPool

The BN-Pool operator from the paper "BN-Pool: Bayesian Nonparametric Graph Pooling" (Castellana & Bianchi, 2025).

DiffPool

The differentiable pooling operator from the paper "Hierarchical Graph Representation Learning with Differentiable Pooling" (Ying et al., NeurIPS 2018).

DMoNPooling

The DMoN pooling operator from the paper "Graph Clustering with Graph Neural Networks" (Tsitsulin et al., JMLR 2023).

EdgeContractionPooling

The edge pooling operator from the papers "Towards Graph Pooling by Edge Contraction" (Diehl et al., 2019) and "Edge Contraction Pooling for Graph Neural Networks" (Diehl, 2019).

EigenPooling

The EigenPooling operator from "Graph Convolutional Networks with EigenPooling" (Ma et al., KDD 2019).

GraclusPooling

The Graclus pooling operator inspired by the paper "Weighted Graph Cuts without Eigenvectors: A Multilevel Approach" (Dhillon et al., TPAMI 2007).

HOSCPooling

The high-order pooling operator from the paper "Higher-order clustering and pooling for Graph Neural Networks" (Duval & Malliaros, CIKM 2022).

LaPooling

The LaPool pooling operator from the paper "Towards Interpretable Sparse Graph Representation Learning with Laplacian Pooling" (Noutahi et al., 2019).

JustBalancePooling

The Just Balance pooling operator from the paper "Simplifying Clustering with Graph Neural Networks" (Bianchi et al., NLDL 2023).

KMISPooling

The Maximal \(k\)-Independent Set (\(k\)-MIS) pooling operator from the paper "Generalizing Downsampling from Regular Data to Graphs" (Bacciu et al., AAAI 2023).

MaxCutPooling

The MaxCut pooling operator from the paper "MaxCutPool: differentiable feature-aware Maxcut for pooling in graph neural networks" (Abate & Bianchi, ICLR 2025).

MinCutPooling

The MinCut pooling operator from the paper "Spectral Clustering in Graph Neural Networks for Graph Pooling" (Bianchi et al., ICML 2020).

NDPPooling

The pooling operator from the paper "Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling" (Bianchi et al., TNNLS 2020).

NMFPooling

The Non-negative Matrix Factorization pooling as proposed in the paper "A Non-Negative Factorization approach to node pooling in Graph Convolutional Neural Networks" (Bacciu and Di Sotto, AIIA 2019).

NoPool

Identity pooling operator that performs no actual pooling.

PANPooling

The path integral based pooling operator from the paper "Path Integral Based Convolution and Pooling for Graph Neural Networks" (Ma et al., NeurIPS 2020).

SAGPooling

The self-attention pooling operator from the paper "Self-Attention Graph Pooling" (Lee et al., ICML 2019).

SEPPooling

The SEPPooling operator from the paper "Structural Entropy Guided Graph Hierarchical Pooling" (Wu et al., ICML 2022).

TopkPooling

The \(\mathrm{top}_k\) pooling operator from the papers "Graph U-Nets" (Gao & Ji, ICML 2019), "Towards Sparse Hierarchical Graph Classifiers" (Cangea et al., 2018), and "Understanding Attention and Generalization in Graph Neural Networks" (Knyazev et al., NeurIPS 2019).

get_pooler(pooler_name: str, **kwargs)[source]

Return a pooling operator initialized with filtered **kwargs.

Parameters:
  • pooler_name (str) – Name of the pooler.

  • **kwargs – Additional keyword arguments to be passed to the pooler constructor; irrelevant ones are discarded.

Returns:

A pooling layer instance corresponding to pooler_name.
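How the filtering is done is not spelled out here. One common way to discard irrelevant keyword arguments is to compare them against the constructor signature. The sketch below assumes that approach; `filter_kwargs` and `ToyPooler` are hypothetical names used only for illustration and are not part of the library.

```python
import inspect


def filter_kwargs(cls, kwargs):
    """Keep only the keyword arguments that cls.__init__ accepts by name.

    Hypothetical helper for illustration; not the library's implementation.
    """
    params = inspect.signature(cls.__init__).parameters
    accepted = {
        name
        for name, p in params.items()
        if name != "self"
        and p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
    }
    return {k: v for k, v in kwargs.items() if k in accepted}


class ToyPooler:
    """Stand-in for a pooling layer with two constructor arguments."""

    def __init__(self, in_channels, ratio=0.5):
        self.in_channels = in_channels
        self.ratio = ratio


# 'k' is not a ToyPooler argument, so it is dropped before construction.
config = {"in_channels": 16, "ratio": 0.25, "k": 10}
pooler = ToyPooler(**filter_kwargs(ToyPooler, config))
```

With this pattern, a single configuration dictionary can be shared across poolers that take different constructor arguments.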

class ASAPooling(in_channels: int, ratio: float | int = 0.5, GNN: Module | None = None, dropout: float = 0.0, negative_slope: float = 0.2, add_self_loops: bool = False, nonlinearity: str | Callable = 'sigmoid', lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: str = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False, **kwargs)[source]

The Adaptive Structure Aware Pooling operator from the paper “ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations” (Ranjan et al., AAAI 2020).

  • The \(\texttt{select}\) operator is implemented by passing a special score to TopkSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with SparseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

Parameters:
  • in_channels (int) – Size of each input sample.

  • ratio (float or int) – Graph pooling ratio, which is used to compute \(k = \lceil \mathrm{ratio} \cdot N \rceil\), or the value of \(k\) itself, depending on whether the type of ratio is float or int. (default: 0.5)

  • GNN (Module, optional) – A graph neural network layer for using intra-cluster properties, especially helpful for graphs with a higher degree of neighborhood. It can be GraphConv, GCNConv, or any GNN layer that supports the edge_weight parameter. (default: None)

  • dropout (float, optional) – Dropout probability of the normalized attention coefficients, which exposes each node to a stochastically sampled neighborhood during training. (default: 0.0)

  • negative_slope (float, optional) – LeakyReLU angle of the negative slope. (default: 0.2)

  • nonlinearity (str or callable, optional) – The non-linearity to use when computing the score. (default: "sigmoid")

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • **kwargs (optional) – Additional parameters for initializing the graph neural network layer.
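The two readings of ratio can be illustrated with a few lines of plain Python. This is a minimal sketch of the rule \(k = \lceil \mathrm{ratio} \cdot N \rceil\) stated above; the helper name `num_selected_nodes` and the clamping of \(k\) to \(N\) are assumptions of the sketch, not documented behavior.

```python
import math


def num_selected_nodes(ratio, num_nodes):
    """Return k for a graph with num_nodes nodes.

    A float ratio is read as a fraction of the nodes, an int as k itself.
    Clamping k to num_nodes is an assumption of this sketch.
    """
    if isinstance(ratio, int):
        k = ratio
    else:
        k = math.ceil(ratio * num_nodes)
    return min(k, num_nodes)


k_from_fraction = num_selected_nodes(0.5, 7)  # ceil(0.5 * 7)
k_from_count = num_selected_nodes(3, 7)       # k given directly
```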

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

The forward pass of the pooling operator.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. It can either be a SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch, or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

class AsymCheegerCutPooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, totvar_coeff: float = 1.0, balance_coeff: float = 1.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]

The asymmetric Cheeger cut pooling layer from the paper “Total Variation Graph Neural Networks” (Hansen & Bianchi, ICML 2023).

  • The \(\texttt{select}\) operator is implemented with MLPSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with DenseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

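This layer optimizes two auxiliary losses: a graph total variation loss, weighted by totvar_coeff, and an asymmetric norm (balance) loss, weighted by balance_coeff.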

Parameters:
  • in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.

  • k (int) – Number of clusters or supernodes in the pooled graph.

  • act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator. (default: None)

  • dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)

  • totvar_coeff (float) – Coefficient for graph total variation loss term. (default: 1.0)

  • balance_coeff (float) – Coefficient for asymmetric norm loss term. (default: 1.0)

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)

  • cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
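The difference between the "transpose" and "inverse" options can be seen on a small hard assignment matrix. The sketch below is plain Python on nested lists and only illustrates the linear algebra; it is not the library's implementation. It assumes a one-hot \(\mathbf{S}\) with no empty cluster, for which \(\mathbf{S}^\top\mathbf{S}\) is diagonal and the pseudoinverse has a closed form. The helper names are hypothetical.

```python
def transpose(S):
    """Return S^T for a matrix stored as a list of rows."""
    return [list(col) for col in zip(*S)]


def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def pinv_hard_assignment(S):
    """Pseudoinverse of a one-hot assignment matrix with no empty cluster.

    For such S, S^T S is diagonal with the cluster sizes, so
    S^+ = (S^T S)^{-1} S^T is S^T with each row divided by its cluster size.
    """
    return [[v / sum(row) for v in row] for row in transpose(S)]


# Four nodes, two supernodes: nodes 0-2 go to cluster 0, node 3 to cluster 1.
S = [[1, 0], [1, 0], [1, 0], [0, 1]]

S_inv_transpose = transpose(S)
S_inv_pinv = pinv_hard_assignment(S)

# S^T S keeps the cluster sizes on the diagonal, S^+ S is the identity.
gram_transpose = matmul(S_inv_transpose, S)
gram_pinv = matmul(S_inv_pinv, S)
```

In this example the transpose leaves the cluster sizes in the product, while the pseudoinverse normalizes them away, which is the practical difference between the two options.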

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).

  • adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default: None)

  • edge_weight (Tensor, optional) – Edge weights associated with adj when sparse connectivity is provided. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with True on real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default: None)

  • batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default: None)

  • batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when lifting=True. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput
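The relation between a sparse batch vector and the dense padded layout \([B, N, F]\) with its validity mask can be sketched in plain Python. This is an illustration of the padding convention only; `to_padded` is a hypothetical helper, and in practice a utility such as `torch_geometric.utils.to_dense_batch` performs this kind of conversion on tensors.

```python
def to_padded(x, batch):
    """Pack per-node rows into a [B, N_max, F] nested list plus a [B, N_max] mask.

    x is a list of feature rows and batch[i] is the graph index of node i.
    Padded slots hold zeros and are marked False in the mask.
    """
    num_graphs = max(batch) + 1
    num_feats = len(x[0])
    groups = [[] for _ in range(num_graphs)]
    for row, b in zip(x, batch):
        groups[b].append(list(row))
    n_max = max(len(g) for g in groups)
    dense = [
        g + [[0.0] * num_feats for _ in range(n_max - len(g))] for g in groups
    ]
    mask = [[i < len(g) for i in range(n_max)] for g in groups]
    return dense, mask


# Three nodes in total: two belong to graph 0, one to graph 1.
x = [[1.0], [2.0], [3.0]]
batch = [0, 0, 1]
dense_x, mask = to_padded(x, batch)
```

Here graph 1 has a single real node, so its second slot is padding and the corresponding mask entry is False.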

compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None) dict[source]

Computes the auxiliary loss terms for unbatched (sparse) mode.

This method is used when batched=False and operates on sparse adjacency matrices without requiring padding or densification.

Parameters:
  • edge_index (Adj) – Graph connectivity in sparse format.

  • edge_weight (Tensor, optional) – Edge weights of shape \((E,)\).

  • S (Tensor) – The dense assignment matrix of shape \((N, K)\).

  • batch (Tensor, optional) – Batch vector of shape \((N,)\).

Returns:

A dictionary with the different terms of the auxiliary loss:
  • 'total_variation_loss': The sparse total variation loss.

  • 'balance_loss': The unbatched asymmetric norm loss.

Return type:

dict

class BNPool(in_channels: int | List[int], k: int, alpha_DP=1.0, K_var=1.0, K_mu=10.0, K_init=1.0, eta=1.0, train_K=True, act: str | None = None, dropout: float = 0.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False, num_neg_samples: int | None = None)[source]

The BN-Pool operator from the paper “BN-Pool: Bayesian Nonparametric Graph Pooling” (Castellana & Bianchi, 2025).

BN-Pool implements a Bayesian nonparametric approach to graph pooling using a Dirichlet Process with stick-breaking construction for cluster assignment. The method learns both the number of clusters and their assignments through variational inference.

  • The \(\texttt{select}\) operator is implemented with DPSelect to perform variational inference of the stick-breaking process.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with DenseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

The method uses a truncated stick-breaking representation of the Dirichlet Process:

\[v_{ik} \sim \text{Beta}(\alpha_{ik}, \beta_{ik}), \quad i = 1, \ldots, N, \quad k = 1, \ldots, K-1\]
\[\pi_{ik} = v_{ik} \prod_{j=1}^{k-1} (1 - v_{ij})\]

where \(\pi_{ik}\) represents the probability of assigning node \(i\) to cluster \(k\). The coefficients \(\alpha_{ik}\) and \(\beta_{ik}\) are computed by an MLP from node features \(\mathbf{x}_i\).
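The stick-breaking product can be traced for a single node with a short, library-independent sketch. It takes the \(K-1\) fractions \(v_{i1}, \ldots, v_{i,K-1}\) as given (in the layer they are random variables with Beta distributions) and assumes the usual truncation convention in which the last cluster receives the leftover stick; that convention and the function name are assumptions of the sketch.

```python
def stick_breaking(v):
    """Map K-1 stick fractions v_1..v_{K-1} to K assignment probabilities.

    pi_k = v_k * prod_{j<k} (1 - v_j) for k < K. The last cluster receives
    the leftover stick, which is an assumption of this sketch.
    """
    pi, remaining = [], 1.0
    for v_k in v:
        pi.append(v_k * remaining)
        remaining *= 1.0 - v_k
    pi.append(remaining)
    return pi


# Breaking off half of the remaining stick three times gives K = 4 clusters.
pi = stick_breaking([0.5, 0.5, 0.5])
```

Because each step only splits what is left of the stick, the resulting probabilities sum to one by construction.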

The cluster connectivity is modeled through a learnable matrix \(\mathbf{K} \in \mathbb{R}^{K \times K}\), and the reconstructed adjacency matrix is computed as:

\[\mathbf{A}_{\text{rec}} = \mathbf{S} \mathbf{K} \mathbf{S}^{\top}\]

where \(S_{ik} = \pi_{ik}\).
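A small numeric instance of \(\mathbf{S}\mathbf{K}\mathbf{S}^{\top}\) helps to read the formula. The sketch below uses nested lists, a hard assignment of three nodes to two clusters, and a \(\mathbf{K}\) with 10 on the diagonal and -10 elsewhere, chosen here only for illustration; none of it is the library's code.

```python
def transpose(A):
    """Return A^T for a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Nodes 0 and 1 belong to cluster 0, node 2 to cluster 1.
S = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Illustrative connectivity: strong within-cluster, weak between-cluster scores.
K = [[10.0, -10.0], [-10.0, 10.0]]

# N x N matrix of edge scores between the original nodes.
A_rec = matmul(matmul(S, K), transpose(S))
```

Entries of A_rec are large and positive for node pairs in the same cluster and large and negative otherwise, so the result has one row and column per original node rather than per cluster.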

This layer optimizes three auxiliary losses:

  • Reconstruction loss (weighted_bce_reconstruction_loss()): Binary cross-entropy loss between the true and reconstructed adjacency matrix \(\mathbf{A}_{\text{rec}}\).

  • KL divergence loss (kl_loss()): KL divergence between the prior and posterior variational approximation of the stick-breaking variables.

  • Cluster connectivity prior loss (cluster_connectivity_prior_loss()): Prior regularization on the cluster connectivity matrix \(\mathbf{K}\).

Parameters:
  • in_channels (Union[int, List[int]]) – The number of input node feature channels. If a list is provided, it specifies the architecture of the MLP in DPSelect.

  • k (int) – The maximum number of clusters \(K\) to be used in the pooling mechanism. The actual number of active clusters is learned through the stick-breaking process.

  • alpha_DP (float, optional) – Prior concentration parameter \(\alpha\) of the Dirichlet Process. Controls the expected number of clusters. Higher values encourage more clusters. (default: 1.0)

  • K_var (float, optional) – Variance \(\sigma^2\) of the Gaussian prior on the cluster connectivity matrix \(\mathbf{K}\). (default: 1.0)

  • K_mu (float, optional) – Mean parameter for the cluster connectivity prior. The prior mean matrix is constructed as \(\mathbf{K}_{\mu} = \mu \mathbf{I} - \mu (\mathbf{1}\mathbf{1}^{\top} - \mathbf{I})\). (default: 10.0)

  • K_init (float, optional) – Initial value for the cluster connectivity matrix \(\mathbf{K}\). (default: 1.0)

  • eta (float, optional) – Weights the KL divergence loss term. (default: 1.0)

  • train_K (bool, optional) – If True, the cluster connectivity matrix \(\mathbf{K}\) is learnable. If False, \(\mathbf{K}\) is fixed to its initial value. (default: True)

  • act (str, optional) – Activation function for the MLP in DPSelect. (default: None)

  • dropout (float, optional) – Dropout rate in the MLP of DPSelect. (default: 0.0)

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • adj_transpose (bool, optional) – If True, the preprocessing step in tgp.src.DenseSRCPooling and the tgp.connect.DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the tgp.select.SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the tgp.select.SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)

  • batched (bool, optional) – If True, uses the batched dense representation of the input. If False, uses an unbatched representation without padding. (default: True)

  • sparse_output (bool, optional) – If True, returns block-diagonal sparse outputs. If False, returns batched dense outputs. (default: False)

  • num_neg_samples (int, optional) – Cap on the number of negative edges sampled per graph in the unbatched (sparse-loss) path. If None, defaults to matching the number of positive edges. (default: None)

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, mask: Tensor | None = None, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).

  • adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default: None)

  • edge_weight (Tensor, optional) – Edge weights associated with adj when sparse connectivity is provided. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default: None)

  • batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when lifting=True. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

  • mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\), where True marks real (non-padded) nodes. Only used when inputs are already dense/padded. (default: None)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

compute_sparse_loss(adj: Tensor | SparseTensor, batch: Tensor | None, so: SelectOutput) dict[source]

Compute BNPool auxiliary losses in unbatched sparse mode.

get_rec_adj(S)[source]

Return the reconstructed dense adjacency logits from assignments.

get_sparse_rec_loss(node_assignment, adj, batch, batch_size)[source]

Compute sparse reconstruction loss using sampled positive/negative edges.

Score candidate edges from assignment probabilities and matrix \(K\).

class DiffPool(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, link_loss_coeff: float = 1.0, ent_loss_coeff: float = 1.0, normalize_loss: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]

The differentiable pooling operator from the paper “Hierarchical Graph Representation Learning with Differentiable Pooling” (Ying et al., NeurIPS 2018).

  • The \(\texttt{select}\) operator is implemented with MLPSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with DenseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

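This layer optimizes two auxiliary losses: a link prediction loss, weighted by link_loss_coeff, and an entropy regularization loss, weighted by ent_loss_coeff.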

Parameters:
  • in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.

  • k (int) – Number of clusters or supernodes in the pooled graph.

  • act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator. (default: None)

  • dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)

  • link_loss_coeff (float, optional) – Coefficient for the link prediction loss. (default: 1.0)

  • ent_loss_coeff (float, optional) – Coefficient for the entropy regularization loss. (default: 1.0)

  • normalize_loss (bool, optional) – If True, the link prediction loss is divided by adj.numel(); if False, it is left unnormalized. (default: False)

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)

  • cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).

  • adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default: None)

  • edge_weight (Tensor, optional) – Edge weights associated with adj when sparse connectivity is provided. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with True on real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default: None)

  • batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default: None)

  • batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when lifting=True. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None) dict[source]

Computes the auxiliary loss terms for unbatched (sparse) mode.

This method is used when batched=False and operates on sparse adjacency matrices without requiring padding or densification.

Parameters:
  • edge_index (Adj) – Graph connectivity in sparse format.

  • edge_weight (Tensor, optional) – Edge weights of shape \((E,)\).

  • S (Tensor) – The dense assignment matrix of shape \((N, K)\).

  • batch (Tensor, optional) – Batch vector of shape \((N,)\).

Returns:

A dictionary with the different terms of the auxiliary loss:
  • 'link_loss': The sparse link prediction loss.

  • 'entropy_loss': The unbatched entropy loss.

Return type:

dict
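For orientation, the two loss terms can be written out densely for a single small graph. The sketch follows the definitions in the DiffPool paper (Frobenius norm of \(\mathbf{A} - \mathbf{S}\mathbf{S}^\top\) and mean row entropy of \(\mathbf{S}\)); it is a dense, pure-Python illustration, not the sparse routine documented above, and the function names and the `eps` constant are assumptions.

```python
import math


def link_loss(A, S, normalize=False):
    """Frobenius norm of A - S S^T, optionally divided by the number of entries of A."""
    n = len(A)
    sq = 0.0
    for i in range(n):
        for j in range(n):
            recon = sum(s_ik * s_jk for s_ik, s_jk in zip(S[i], S[j]))
            sq += (A[i][j] - recon) ** 2
    loss = math.sqrt(sq)
    return loss / (n * n) if normalize else loss


def entropy_loss(S, eps=1e-15):
    """Mean row entropy of the soft assignment matrix S."""
    return sum(-sum(p * math.log(p + eps) for p in row) for row in S) / len(S)


# Two clusters that match the block structure of A exactly.
A = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
S_hard = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
S_uniform = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

perfect_link = link_loss(A, S_hard)
uniform_entropy = entropy_loss(S_uniform)
```

A hard assignment that matches the graph's block structure drives both terms to (numerically) zero, while a uniform assignment has the maximum entropy of \(\log 2\) for two clusters.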

class DMoNPooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, spectral_loss_coeff: float = 1.0, cluster_loss_coeff: float = 1.0, ortho_loss_coeff: float = 0.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]

The DMoN pooling operator from the paper “Graph Clustering with Graph Neural Networks” (Tsitsulin et al., JMLR 2023).

  • The \(\texttt{select}\) operator is implemented with MLPSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with DenseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

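This layer optimizes two auxiliary losses, a spectral loss and a cluster loss, weighted by spectral_loss_coeff and cluster_loss_coeff. An additional orthogonality loss is weighted by ortho_loss_coeff and contributes nothing at its default value of 0.0.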

Parameters:
  • in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.

  • k (int) – Number of clusters or supernodes in the pooled graph.

  • act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator. (default: None)

  • dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)

  • spectral_loss_coeff (float, optional) – Coefficient for the spectral loss. (default: 1.0)

  • cluster_loss_coeff (float, optional) – Coefficient for the cluster loss. (default: 1.0)

  • ortho_loss_coeff (float, optional) – Coefficient for the orthogonality loss. This loss does not appear in the original paper. (default: 0.0)

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)

  • cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).
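As a rough illustration of the two s_inv_op choices, the sketch below builds both candidates for \(\mathbf{S}_\text{inv}\) with NumPy. The assignment matrix is a made-up example and the code does not call the library. It only shows that \(\mathbf{S}^+\mathbf{S} = \mathbf{I}\) when \(\mathbf{S}\) has full column rank, whereas \(\mathbf{S}^\top\mathbf{S}\) is generally not the identity (for a hard assignment it holds the cluster sizes on its diagonal).

```python
import numpy as np

# Hypothetical hard assignment of N=4 nodes to K=2 supernodes
# (nodes 0-2 -> supernode 0, node 3 -> supernode 1).
S = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])

s_inv_transpose = S.T            # "transpose": S^T, shape (K, N)
s_inv_pinv = np.linalg.pinv(S)   # "inverse": Moore-Penrose pseudoinverse, shape (K, N)

# S^T S holds the cluster sizes on its diagonal; S^+ S is the identity.
gram_transpose = s_inv_transpose @ S
gram_pinv = s_inv_pinv @ S
print(gram_transpose)
print(np.round(gram_pinv, 6))
```

The pseudoinverse rescales each supernode by its size, which may matter when lifted features should stay on the same scale as the original ones.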

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).

  • adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default: None)

  • edge_weight (Tensor, optional) – Edge weights associated with adj when sparse connectivity is provided. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with True on real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default: None)

  • batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default: None)

  • batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when lifting=True. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput
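To make the dense padded layout and the mask concrete, here is a minimal NumPy sketch of converting a block-diagonal edge list plus a batch vector into a \([B, N, N]\) adjacency and a \([B, N]\) mask. The helper to_dense_padded is hypothetical and is not the library's conversion routine; it assumes unweighted edges and that the nodes of each graph are stored contiguously.

```python
import numpy as np

def to_dense_padded(edge_index, batch, num_graphs):
    """Hypothetical helper: edge list + batch vector -> (B, N, N) adjacency and (B, N) mask."""
    counts = np.bincount(batch, minlength=num_graphs)         # nodes per graph
    offsets = np.concatenate([[0], np.cumsum(counts)[:-1]])   # first node id of each graph
    n_max = int(counts.max())                                 # padded size N
    adj = np.zeros((num_graphs, n_max, n_max))
    mask = np.arange(n_max)[None, :] < counts[:, None]        # True on real nodes
    src, dst = edge_index
    g = batch[src]                                            # graph index of each edge
    adj[g, src - offsets[g], dst - offsets[g]] = 1.0
    return adj, mask

# Two graphs: a 3-node path (nodes 0-2) and a single edge (nodes 3-4).
edge_index = np.array([[0, 1, 1, 2, 3, 4],
                       [1, 0, 2, 1, 4, 3]])
batch = np.array([0, 0, 0, 1, 1])
adj, mask = to_dense_padded(edge_index, batch, num_graphs=2)
print(adj.shape, mask)
```

The second graph has only two nodes, so its third row and column are padding and the corresponding mask entry is False.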

compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None) dict[source]

Computes the auxiliary loss terms for unbatched (sparse) mode.

This method is used when batched=False and operates on sparse adjacency matrices without requiring padding or densification.

Parameters:
  • edge_index (Adj) – Graph connectivity in sparse format.

  • edge_weight (Tensor, optional) – Edge weights of shape \((E,)\).

  • S (Tensor) – The dense assignment matrix of shape \((N, K)\).

  • batch (Tensor, optional) – Batch vector of shape \((N,)\).

Returns:

A dictionary with the different terms of the auxiliary loss:
  • 'spectral_loss': The sparse spectral loss.

  • 'cluster_loss': The unbatched cluster loss.

  • 'ortho_loss': The unbatched orthogonality loss.

Return type:

dict
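For intuition about the spectral and cluster terms, the sketch below follows the formulation in the DMoN paper for a single graph, using a dense adjacency in NumPy. The function name dmon_losses_dense is an assumption made for illustration; the library's sparse implementation, its normalization, and its batching may differ, and the orthogonality term is omitted.

```python
import numpy as np

def dmon_losses_dense(A, S):
    """Illustrative dense computation for one graph (not the library code).

    A: (N, N) symmetric adjacency, S: (N, K) cluster assignment.
    """
    N, K = S.shape
    d = A.sum(axis=1)        # node degrees
    two_m = d.sum()          # 2 * number of edges for an unweighted graph
    # Modularity matrix B = A - d d^T / (2m); spectral loss = -Tr(S^T B S) / (2m).
    B = A - np.outer(d, d) / two_m
    spectral_loss = -np.trace(S.T @ B @ S) / two_m
    # Collapse regularizer: sqrt(K)/N * ||sum_i S_i|| - 1, zero for balanced clusters.
    cluster_loss = np.sqrt(K) / N * np.linalg.norm(S.sum(axis=0)) - 1.0
    return spectral_loss, cluster_loss

# Two disconnected edges (0-1 and 2-3) with the matching two-cluster assignment.
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
S = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
spectral_loss, cluster_loss = dmon_losses_dense(A, S)
print(spectral_loss, cluster_loss)
```

In this toy case the assignment matches the two connected components, so the modularity is high (negative spectral loss) and the balanced cluster sizes give a cluster loss of zero.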

class EdgeContractionPooling(in_channels: int, edge_score_method: Callable | None = None, dropout: float | None = 0.0, add_to_edge_score: float = 0.5, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False)[source]

The edge pooling operator from the papers “Towards Graph Pooling by Edge Contraction” (Diehl et al., 2019) and “Edge Contraction Pooling for Graph Neural Networks” (Diehl, 2019). This implementation is based on the paper “Revisiting Edge Pooling in Graph Neural Networks” (Landolfi, 2022).

  • The \(\texttt{select}\) operator is implemented with EdgeContractionSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with SparseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

To reproduce the configuration of the paper “Towards Graph Pooling by Edge Contraction” (Diehl et al., 2019), use either compute_edge_score_softmax() or compute_edge_score_tanh(), and set add_to_edge_score to 0.0. To reproduce the configuration of the paper “Edge Contraction Pooling for Graph Neural Networks” (Diehl, 2019), set dropout to 0.2.

Parameters:
  • in_channels (int) – Size of each input sample.

  • edge_score_method (callable, optional) – The function used to compute the edge scores from the raw edge scores. By default, this is the softmax over all incoming edges for each node. The function takes a raw_edge_score tensor of shape [num_edges], an edge_index tensor, and the number of nodes num_nodes, and returns a tensor of the same size as raw_edge_score containing the normalized edge scores. Included functions are compute_edge_score_softmax(), compute_edge_score_tanh(), and compute_edge_score_sigmoid(). (default: compute_edge_score_softmax())

  • dropout (float, optional) – The probability with which to drop edge scores during training. (default: 0.0)

  • add_to_edge_score (float, optional) – A value to be added to each computed edge score. Adding this greatly helps with unpooling stability. (default: 0.5)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
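The default edge scoring can be pictured as follows: a minimal NumPy sketch of a softmax over the incoming edges of each node, followed by the add_to_edge_score offset. It assumes one raw score per edge and uses a hypothetical function name, edge_score_softmax_sketch; it is not the library's compute_edge_score_softmax().

```python
import numpy as np

def edge_score_softmax_sketch(raw_edge_score, edge_index, num_nodes):
    """Softmax of raw scores over the incoming edges of each target node (illustrative)."""
    target = edge_index[1]
    # Subtract the per-target maximum for numerical stability.
    max_per_node = np.full(num_nodes, -np.inf)
    np.maximum.at(max_per_node, target, raw_edge_score)
    exp = np.exp(raw_edge_score - max_per_node[target])
    # Sum the exponentials of all edges that share a target node.
    denom = np.zeros(num_nodes)
    np.add.at(denom, target, exp)
    return exp / denom[target]

# Edges 0->1, 2->1, 0->2, 1->2 with one raw score each.
edge_index = np.array([[0, 2, 0, 1],
                       [1, 1, 2, 2]])
raw = np.array([1.0, 1.0, 0.0, 2.0])
scores = edge_score_softmax_sketch(raw, edge_index, num_nodes=3)
final_scores = scores + 0.5   # effect of add_to_edge_score=0.5
print(scores, final_scores)
```

The scores of the edges entering the same node sum to one before the offset is added.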

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. It can either be a SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

class EigenPooling(k: int, num_modes: int = 5, normalized: bool = True, cached: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = False, sparse_output: bool = False, cache_preprocessing: bool = False)[source]

The EigenPooling operator from “Graph Convolutional Networks with EigenPooling” (Ma et al., KDD 2019).

Let:

  • \(\mathbf{X} \in \mathbb{R}^{N \times F}\) be node features;

  • \(\mathbf{S} \in \{0,1\}^{N \times K}\) be the hard assignment matrix produced by EigenPoolSelect;

  • \(\boldsymbol{\Omega} := \mathbf{S}\) (same matrix, connectivity notation);

  • \(\mathbf{A}_{\text{ext}} \in \mathbb{R}^{N \times N}\) be the input (possibly block-diagonal) adjacency used by the connector;

  • \(H\) be the number of eigenvector modes.

EigenPooling first partitions nodes into \(K\) clusters via spectral clustering, then builds a multi-mode pooling matrix \(\boldsymbol{\Theta} \in \mathbb{R}^{N \times (K\cdot H)}\) from Laplacian eigenvectors of each cluster-induced subgraph. Features are pooled as:

\[\mathbf{X}_{\text{pool,raw}} = \boldsymbol{\Theta}^{\top}\mathbf{X},\]

then reshaped from \([H\!\cdot\!K, F]\) to \([K, H\!\cdot\!F]\).

Connectivity is coarsened as:

\[\mathbf{A}_{\text{coar}} = \boldsymbol{\Omega}^{\top}\mathbf{A}_{\text{ext}}\boldsymbol{\Omega}.\]
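The two formulas above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the library's implementation: the cluster assignment is given instead of being produced by spectral clustering, the unnormalized Laplacian is used, the columns of \(\boldsymbol{\Theta}\) are grouped by mode, and clusters with fewer than \(H\) nodes are zero-padded. The function name eigenpool_sketch is hypothetical.

```python
import numpy as np

def eigenpool_sketch(X, A, cluster, K, H):
    """Illustrative EigenPooling-style pooling for one graph (not the library code)."""
    N, F = X.shape
    S = np.zeros((N, K))
    S[np.arange(N), cluster] = 1.0            # hard assignment, Omega := S
    # theta[h][:, k] is the h-th Laplacian eigenvector of cluster k, zero-padded to N nodes.
    theta = np.zeros((H, N, K))
    for k in range(K):
        idx = np.where(cluster == k)[0]
        A_k = A[np.ix_(idx, idx)]             # cluster-induced subgraph
        L_k = np.diag(A_k.sum(axis=1)) - A_k  # unnormalized Laplacian (assumption)
        _, U = np.linalg.eigh(L_k)            # eigenvectors, ascending eigenvalues
        for h in range(min(H, len(idx))):
            theta[h, idx, k] = U[:, h]
    # Theta^T X, kept as (H, K, F), then rearranged to (K, H * F).
    x_raw = np.einsum("hnk,nf->hkf", theta, X)
    x_pool = x_raw.transpose(1, 0, 2).reshape(K, H * F)
    a_coar = S.T @ A @ S                      # Omega^T A_ext Omega
    return x_pool, a_coar

# Path graph 0-1-2-3 split into clusters {0, 1} and {2, 3}.
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
X = np.arange(12, dtype=float).reshape(4, 3)
cluster = np.array([0, 0, 1, 1])
x_pool, a_coar = eigenpool_sketch(X, A, cluster, K=2, H=2)
print(x_pool.shape)
print(a_coar)
```

Each supernode ends up with \(H \cdot F\) features, one block of \(F\) features per eigenvector mode. Eigenvector signs are arbitrary, so individual pooled values may flip sign between runs or libraries.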

Notes

  • This implementation supports sparse inputs and multi-graph batches via edge_index + batch.

  • Dense padded batched inputs (\([B, N, N]\)) are not supported.

Parameters:
  • k (int) – Number of clusters (supernodes) in the pooled graph.

  • num_modes (int, optional) – Number of eigenvector modes \(H\). (default: 5)

  • normalized (bool, optional) – If True, use the normalized Laplacian. (default: True)

  • cached (bool, optional) – If True, cache SelectOutput. (default: False)

  • remove_self_loops (bool, optional) – Whether to remove self-loops after coarsening. (default: True)

  • degree_norm (bool, optional) – If True, symmetrically normalize pooled adjacency. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize pooled edge weights. (default: False)

  • adj_transpose (bool, optional) – Passed to the connector for adjacency post-processing. (default: True)

  • lift (LiftType, optional) – Kept for API compatibility. EigenPooling always uses eigenvector-based lifting and ignores this option. (default: "precomputed")

  • s_inv_op (SinvType, optional) – Operation used to compute \(\mathbf{S}_\text{inv}\) in SelectOutput. (default: "transpose")

  • batched (bool, optional) – Kept for API compatibility. Dense batched mode is unsupported and this option is ignored. Use sparse inputs with batch instead. (default: False)

  • sparse_output (bool, optional) – If True, return sparse pooled connectivity. (default: False)

  • cache_preprocessing (bool, optional) – Passed to DenseSRCPooling; has no practical effect for this sparse-oriented path. (default: False)

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput | Tensor[source]

Forward pass.

Parameters:
  • x (Tensor) – Node features \(\mathbf{X} \in \mathbb{R}^{N \times F}\). During lifting, accepts pooled features \(\mathbf{X}_{\text{pool}} \in \mathbb{R}^{K \times (H\cdot F)}\).

  • adj (Adj, optional) – Sparse graph connectivity (edge index, SparseTensor, or torch COO tensor). Internally interpreted as \(\mathbf{A}_{\text{ext}}\); required when lifting=False. (default: None)

  • edge_weight (Tensor, optional) – Edge weights associated with adj. (default: None)

  • so (SelectOutput, optional) – Pre-computed selection output. (default: None)

  • mask (Tensor, optional) – Unused input-node validity mask. (default: None)

  • batch (Tensor, optional) – Batch vector for sparse multi-graph inputs. (default: None)

  • batch_pooled (Tensor, optional) – Batch vector for pooled nodes, used during lifting. (default: None)

  • lifting (bool, optional) – If True, apply \(\texttt{lift}\) instead of pooling. (default: False)

Returns:

Pooled output if lifting=False, otherwise lifted features.

Return type:

PoolingOutput or Tensor

precoarsening(edge_index: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, *, batch: Tensor | None = None, num_nodes: int | None = None, **kwargs) PoolingOutput[source]

Precompute pooling outputs with a fixed assignment width k.

class GraclusPooling(lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', cached: bool = False, remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False)[source]

The Graclus pooling operator inspired by the paper “Weighted Graph Cuts without Eigenvectors: A Multilevel Approach” (Dhillon et al., TPAMI 2007).

  • The \(\texttt{select}\) operator is implemented with GraclusSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with SparseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

Parameters:
  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")

  • cached (bool, optional) – If set to True, the output of the \(\texttt{select}\) operation will be cached, so that it does not need to be recomputed. (default: False)

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
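The two normalization options can be illustrated with a short NumPy sketch. It assumes the usual symmetric normalization \(\mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}\) for degree_norm and a per-graph division by the largest absolute weight for edge_weight_norm. Both helper names are hypothetical, the first works on a dense adjacency for readability, and neither is the library's own routine.

```python
import numpy as np

def sym_degree_norm(A):
    """D^{-1/2} A D^{-1/2}; nodes with degree 0 keep all-zero rows and columns."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    d_inv_sqrt[d > 0] = d[d > 0] ** -0.5
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def max_abs_edge_weight_norm(edge_weight, edge_batch, num_graphs):
    """Divide each edge weight by the largest absolute weight in its graph.

    Assumes every graph has at least one nonzero edge weight.
    """
    max_abs = np.zeros(num_graphs)
    np.maximum.at(max_abs, edge_batch, np.abs(edge_weight))
    return edge_weight / max_abs[edge_batch]

# 3-node path graph: degrees are [1, 2, 1].
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
A_norm = sym_degree_norm(A)

# Three edges: the first two belong to graph 0, the last to graph 1.
edge_weight = np.array([2.0, -4.0, 1.0])
edge_batch = np.array([0, 0, 1])
w_norm = max_abs_edge_weight_norm(edge_weight, edge_batch, num_graphs=2)
print(A_norm, w_norm)
```

After the second normalization, all edge weights of a graph lie in \([-1, 1]\).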

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. It can either be a torch_sparse.SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

class HOSCPooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, mu: float = 0.1, alpha: float = 0.5, hosc_ortho: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]

The high-order pooling operator from the paper “Higher-order clustering and pooling for Graph Neural Networks” (Duval & Malliaros, CIKM 2022).

  • The \(\texttt{select}\) operator is implemented with MLPSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with DenseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

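This layer optimizes a combination of two auxiliary losses, the HOSC clustering loss ('hosc_loss') and an orthogonality loss ('ortho_loss'). See mu, alpha, and hosc_ortho below.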

Parameters:
  • in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.

  • k (int) – Number of clusters or supernodes in the pooled graph.

  • act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator. (default: None)

  • dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)

  • mu (float, optional) – A scalar that controls the importance given to the regularization loss. (default: 0.1)

  • alpha (float, optional) – A scalar in [0,1] controlling the importance granted to higher-order information in the loss function. (default: 0.5)

  • hosc_ortho (bool, optional) – Selects which orthogonality loss is used: hosc_orthogonality_loss or orthogonality_loss. (default: False)

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)

  • cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).

  • adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default: None)

  • edge_weight (Tensor, optional) – Edge weights associated with adj when sparse connectivity is provided. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with True on real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default: None)

  • batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default: None)

  • batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when lifting=True. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None, adj_pool: Tensor | SparseTensor | None = None, edge_weight_pool: Tensor | None = None, batch_pooled: Tensor | None = None) dict[source]

Computes the auxiliary loss terms for unbatched (sparse) mode.

This method is used when batched=False and operates on sparse adjacency matrices.

Parameters:
  • edge_index (Adj) – Graph connectivity in sparse format.

  • edge_weight (Tensor, optional) – Edge weights of shape \((E,)\).

  • S (Tensor) – The dense assignment matrix of shape \((N, K)\).

  • batch (Tensor, optional) – Batch vector of shape \((N,)\).

  • adj_pool (Adj, optional) – The postprocessed pooled adjacency. When self.sparse_output=True, an edge_index of shape \((2, E_\text{pool})\) over the block-diagonal supernode graph. When self.sparse_output=False, a dense tensor of shape \((B, K, K)\). Required when alpha < 1.

  • edge_weight_pool (Tensor, optional) – Edge weights of the postprocessed pooled adjacency, of shape \((E_\text{pool},)\). Required when self.sparse_output=True and alpha < 1.

  • batch_pooled (Tensor, optional) – Batch vector for the pooled supernodes. Required when self.sparse_output=True, alpha < 1 and the input contains multiple graphs.

Returns:

A dictionary with 'hosc_loss' and 'ortho_loss'.

Return type:

dict

class LaPooling(shortest_path_reg: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', lift_red_op: str = 'sum', batched: bool = True, sparse_output: bool = False)[source]

The LaPool pooling operator from the paper “Towards Interpretable Sparse Graph Representation Learning with Laplacian Pooling” (Noutahi et al., 2019).

  • The \(\texttt{select}\) operator is implemented with LaPoolSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with DenseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

Parameters:
  • shortest_path_reg (bool, optional) – If True, applies the shortest path regularization to the selection matrix (this can be expensive since it runs on CPU). (default: False)

  • remove_self_loops (bool, optional) – Whether to remove self-loops from the graph after coarsening. (default: True)

  • degree_norm (bool, optional) – If True, normalize the pooled adjacency matrix by the nodes’ degree. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce (e.g., 'sum', 'mean', 'max'). (default: "sum")

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter (e.g., 'sum', 'mean', 'max'). (default: "sum")

  • batched (bool, optional) – If True, uses the batched dense path. (default: True)

  • sparse_output (bool, optional) – If True, returns block-diagonal sparse outputs. If False, returns batched dense outputs. (default: False)

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, mask: Tensor | None = None, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\) (unbatched) or \([B, N, F]\) (batched), where \(N\) is the number of nodes, \(B\) is the batch size, and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. For unbatched mode: It can either be a torch_sparse.SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. For batched mode: it can be either sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor of shape \([B, N, N]\), or an already dense tensor of shape \([B, N, N]\). If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges (unbatched mode only). (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • batch_pooled (torch.Tensor, optional) – The batch vector for the pooled nodes. Required when lifting with dense \([N, K]\) SelectOutput on multi-graph batches. Pass out.batch from the pooling call. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

  • mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\), where True marks real (non-padded) nodes. Only used when inputs are already dense/padded. (default: None)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

class JustBalancePooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, normalize_loss: bool = True, loss_coeff: float = 1.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]

The Just Balance pooling operator from the paper “Simplifying Clustering with Graph Neural Networks” (Bianchi et al., NLDL 2023).

  • The \(\texttt{select}\) operator is implemented with MLPSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with DenseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

This layer optimizes an auxiliary balance loss (just_balance_loss()).

Parameters:
  • in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.

  • k (int) – Number of clusters or supernodes in the pooled graph.

  • act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator. (default: None)

  • dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)

  • normalize_loss (bool, optional) – If set to True, the loss is normalized by the number of nodes. (default: True)

  • loss_coeff (float, optional) – Coefficient for the loss. (default: 1.0)

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)

  • cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).

  • adj (Adj, optional) – The connectivity matrix. In batched mode, this accepts sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\), or an already dense adjacency tensor with the same shape. (default: None)

  • edge_weight (Tensor, optional) – Edge weights associated with adj when sparse connectivity is provided. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with True on real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default: None)

  • batch (Tensor, optional) – Batch assignment vector for input nodes. Required in sparse mode and optional in dense mode. (default: None)

  • batch_pooled (Tensor, optional) – Optional precomputed batch assignment for pooled nodes, used when lifting=True. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

compute_sparse_loss(S: Tensor, batch: Tensor | None) dict[source]

Computes the auxiliary loss term for unbatched (sparse) mode.

This method is used when batched=False. The balance loss does not require adjacency; only the assignment matrix and batch vector are used.

Parameters:
  • S (Tensor) – The dense assignment matrix of shape \((N, K)\).

  • batch (Tensor, optional) – Batch vector of shape \((N,)\).

Returns:

A dictionary with 'balance_loss'.

Return type:

dict

static data_transforms()[source]

Transforms the adjacency matrix \(\mathbf{A}\) by applying the following transformation:

\[\mathbf{A} \to \mathbf{I} - \delta \mathbf{L}\]

where \(\mathbf{L}\) is the normalized Laplacian of the graph and \(\delta\) is a scaling factor. By default, \(\delta\) is set to \(0.85\).
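The transformation above can be sketched on a small dense adjacency matrix. The example assumes the symmetrically normalized Laplacian \(\mathbf{L} = \mathbf{I} - \mathbf{D}^{-1/2} \mathbf{A} \mathbf{D}^{-1/2}\); the function name rescaled_propagation is hypothetical and is not part of the library.

```python
def rescaled_propagation(adj, delta=0.85):
    """Return I - delta * L for a dense adjacency given as a list of lists.

    L is assumed to be the symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2}.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [d ** -0.5 if d > 0 else 0.0 for d in deg]
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            identity = 1.0 if i == j else 0.0
            a_norm = inv_sqrt[i] * adj[i][j] * inv_sqrt[j]
            laplacian = identity - a_norm
            row.append(identity - delta * laplacian)
        out.append(row)
    return out

# Two nodes joined by a single undirected edge.
two_node_adj = [[0.0, 1.0], [1.0, 0.0]]
transformed = rescaled_propagation(two_node_adj)
```

With \(\delta = 0.85\), this two-node graph gets \(0.15\) on the diagonal and \(0.85\) off the diagonal, i.e. \((1 - \delta)\mathbf{I} + \delta \mathbf{D}^{-1/2} \mathbf{A} \mathbf{D}^{-1/2}\).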

class KMISPooling(in_channels: int | None = None, order_k: int = 1, scorer: str = 'linear', score_heuristic: str | None = 'greedy', force_undirected: bool = False, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', reduce_red_op: str | None = 'sum', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False, cached: bool = False)[source]

The Maximal \(k\)-Independent Set (\(k\)-MIS) pooling operator from the paper “Generalizing Downsampling from Regular Data to Graphs” (Bacciu et al., AAAI 2023).

The \(k\)-MIS pooling method selects a subset of nodes based on their scores and a maximal \(k\)-independent set strategy. It first scores the nodes using the method given by scorer and then selects a maximal \(k\)-independent set guided by those scores. Node features are then pooled using the specified aggregation functions, and the pooled features can be lifted back using one of the available strategies for computing \(\mathbf{S}_\text{inv}\).

  • The \(\texttt{select}\) operator is implemented with KMISSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with SparseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

Parameters:
  • in_channels (int, optional) – Size of each input sample. Ignored if scorer is not "linear". (default: None)

  • order_k (int) – The \(k\)-th order for the independent set. (default: 1)

  • scorer (str or Callable) –

    A function that computes a score for each node. Nodes with higher score have a higher chance of being selected for pooling. It can be one of:

    • "linear" (default): Uses a sigmoid-activated linear layer to compute the scores. in_channels must be set when using this option.

    • "random": Assigns a random score in \([0, 1]\) to each node.

    • "constant": Assigns a constant score of \(1\) to each node.

    • "canonical": Assigns the score \(-i\) to the \(i\)-th node.

    • "first" (or "last"): Uses the first (or last) feature dimension of \(\mathbf{X}\) as the node scores.

    • "degree": Uses the degree of each node as the score.

    • A custom function: Accepts the arguments (x, edge_index, edge_weight, batch) and must return a one-dimensional Tensor.

  • score_heuristic (str, optional) –

    Heuristic to increase the total score of selected nodes. Given an initial score vector \(\mathbf{s} \in \mathbb{R}^n\), options include:

    • None: No heuristic applied.

    • "greedy" (default): Computes the updated score \(\mathbf{s}'\) as

      \[\mathbf{s}' = \mathbf{s} \oslash (\mathbf{A} + \mathbf{I})^k \mathbf{1}\]

      where \(\oslash\) is element-wise division.

    • "w-greedy": Computes the updated score \(\mathbf{s}'\) as

      \[\mathbf{s}' = \mathbf{s} \oslash (\mathbf{A} + \mathbf{I})^k \mathbf{s}\]

  • force_undirected (bool, optional) – Whether to force the input graph to be undirected. (default: False)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • reduce_red_op (ReduceType, optional) – If None, node features are taken by indexing the MIS nodes (no reduction). Otherwise the reducer is used; the reduce step always computes \(\mathbf{S}^\top \mathbf{X}\). (default: "sum")

  • connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • remove_self_loops (bool, optional) – Whether to remove self-loops from the graph after coarsening. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • cached (bool, optional) – If set to True, the output of the \(\texttt{select}\) and \(\texttt{connect}\) operations will be cached, so that they do not need to be recomputed. If True, the scorer cannot be "linear". (default: False)
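The two score_heuristic formulas listed above can be worked through on a small dense graph. The sketch below is plain Python and only mirrors the formulas as written; the function name rescale_scores and the dense list-of-lists adjacency are assumptions made for this example, not the library's implementation.

```python
def rescale_scores(scores, adj, k=1, weighted=False):
    """Compute s' = s / ((A + I)^k v) element-wise.

    v is the all-ones vector for "greedy" and s itself for "w-greedy".
    """
    n = len(scores)
    denom = list(scores) if weighted else [1.0] * n
    for _ in range(k):
        # One multiplication by (A + I): keep the own value, add the neighbours.
        denom = [denom[i] + sum(adj[i][j] * denom[j] for j in range(n))
                 for i in range(n)]
    return [s / d for s, d in zip(scores, denom)]

# Path graph 0 - 1 - 2.
path_adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

greedy_k1 = rescale_scores([1.0, 1.0, 1.0], path_adj, k=1)
greedy_k2 = rescale_scores([1.0, 1.0, 1.0], path_adj, k=2)
w_greedy_k1 = rescale_scores([1.0, 2.0, 1.0], path_adj, k=1, weighted=True)
```

On the path graph with unit scores, the middle node is divided by the larger neighbourhood count (3 versus 2 for \(k = 1\)), so the endpoints end up with the higher rescaled score.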

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. It can be either a torch_sparse.SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch, or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

class MaxCutPooling(in_channels: int, ratio: float | int = 0.5, assign_all_nodes: bool = True, max_iter: int = 5, loss_coeff: float = 1.0, mp_units: list = [32, 32, 32, 32], mp_act: str = 'tanh', mlp_units: list = [16, 16], mlp_act: str = 'relu', act: str = 'tanh', delta: float = 2.0, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = True)[source]

The MaxCut pooling operator from the paper “MaxCutPool: differentiable feature-aware Maxcut for pooling in graph neural networks” (Abate & Bianchi, ICLR 2025).

This pooling layer uses a differentiable MaxCut objective to learn node assignments. It is particularly effective for heterophilic graphs and provides robust pooling through graph topology-aware scoring.

  • The \(\texttt{select}\) operator is implemented with MaxCutSelect, which computes MaxCut-aware node scores and performs top-k selection.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with SparseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

This layer provides one auxiliary loss, the MaxCut loss, weighted by loss_coeff.

Parameters:
  • in_channels (int) – Size of each input sample.

  • ratio (Union[float, int]) – Graph pooling ratio for top-k selection. (default: 0.5)

  • assign_all_nodes (bool, optional) – Whether to create assignment matrices that map all nodes to the closest supernode (True) or perform standard top-k selection (False). (default: True)

  • max_iter (int, optional) – Maximum distance for the closest node assignment. (default: 5)

  • loss_coeff (float, optional) – Coefficient for the MaxCut auxiliary loss. (default: 1.0)

  • mp_units (list, optional) – List of hidden units for message passing layers. (default: [32, 32, 32, 32])

  • mp_act (str, optional) – Activation function for message passing layers. (default: "tanh")

  • mlp_units (list, optional) – List of hidden units for MLP layers. (default: [16, 16])

  • mlp_act (str, optional) – Activation function for MLP layers. (default: "relu")

  • act (str, optional) – Activation function for the final score. (default: "tanh")

  • delta (float, optional) – Delta parameter for propagation matrix computation. (default: 2.0)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: True)
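The degree_norm and edge_weight_norm options above describe two simple rescalings. A minimal plain-Python sketch of each follows; the helper names and the per-edge graph index edge_graph (in practice derivable from the batch vector of the source nodes) are assumptions made for this example, not library functions.

```python
def normalize_edge_weights(edge_weight, edge_graph):
    """Divide each edge weight by the maximum absolute weight within its graph."""
    max_abs = {}
    for w, g in zip(edge_weight, edge_graph):
        max_abs[g] = max(max_abs.get(g, 0.0), abs(w))
    return [w / max_abs[g] if max_abs[g] > 0 else w
            for w, g in zip(edge_weight, edge_graph)]

def sym_normalize(adj):
    """Return D^{-1/2} A D^{-1/2} for a dense adjacency given as a list of lists."""
    deg = [sum(row) for row in adj]
    inv_sqrt = [d ** -0.5 if d > 0 else 0.0 for d in deg]
    n = len(adj)
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]

# Two graphs in one batch: the first two edges belong to graph 0, the rest to graph 1.
weights = [2.0, -4.0, 1.0, 0.5]
edge_graph = [0, 0, 1, 1]
normalized_weights = normalize_edge_weights(weights, edge_graph)

# Path graph 0 - 1 - 2 with degrees 1, 2, 1.
path_adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
normalized_adj = sym_normalize(path_adj)
```

After edge-weight normalization every graph has a largest absolute weight of 1, independently of the other graphs in the batch.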

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass of the MaxCut pooling operator.

Parameters:
  • x (Tensor) – Node features of shape \((N, F)\).

  • adj (Adj, optional) – Graph connectivity. Can be edge_index tensor of shape \((2, E)\) or SparseTensor. (default: None)

  • edge_weight (Tensor, optional) – Edge weights of shape \((E,)\). (default: None)

  • so (SelectOutput, optional) – The output of the select operator. (default: None)

  • batch (Tensor, optional) – Batch assignments of shape \((N,)\). (default: None)

  • lifting (bool, optional) – If True, perform lift operation. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput
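The interplay of ratio, assign_all_nodes, and max_iter described in the parameter list can be pictured as top-k selection followed by a bounded breadth-first assignment. The sketch below is a loose illustration only: the ceiling rounding, the tie-breaking order, and both function names are assumptions, and the actual MaxCutSelect logic may differ.

```python
import math
from collections import deque

def topk_nodes(scores, ratio):
    """Indices of the ceil(ratio * N) highest-scoring nodes (rounding is assumed)."""
    k = max(1, math.ceil(ratio * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])

def assign_to_closest(neighbors, supernodes, max_iter):
    """Multi-source BFS: map each node to the first supernode reaching it
    within max_iter hops. Nodes farther away are left unassigned."""
    assignment = {s: s for s in supernodes}
    frontier = deque((s, 0) for s in supernodes)
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_iter:
            continue
        for nb in neighbors[node]:
            if nb not in assignment:
                assignment[nb] = assignment[node]
                frontier.append((nb, dist + 1))
    return assignment

# Path graph 0 - 1 - 2 - 3 - 4 with high scores at both ends.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = [0.9, 0.1, 0.2, 0.1, 0.8]

selected = topk_nodes(scores, ratio=0.4)
full_assignment = assign_to_closest(neighbors, selected, max_iter=5)
short_assignment = assign_to_closest(neighbors, selected, max_iter=1)
```

With max_iter=1 the middle node is two hops from either supernode and stays unassigned in this sketch; how the library treats such nodes is not covered here.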

property has_loss: bool

Returns True if this pooler computes auxiliary losses.

class MinCutPooling(in_channels: int | List[int], k: int, act: str | None = None, dropout: float = 0.0, cut_loss_coeff: float = 1.0, ortho_loss_coeff: float = 1.0, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = True, sparse_output: bool = False, cache_preprocessing: bool = False)[source]

The MinCut pooling operator from the paper “Spectral Clustering in Graph Neural Networks for Graph Pooling” (Bianchi et al., ICML 2020).

  • The \(\texttt{select}\) operator is implemented with MLPSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with DenseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

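This layer optimizes two auxiliary losses: the MinCut loss, weighted by cut_loss_coeff, and the orthogonality loss, weighted by ortho_loss_coeff.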

Parameters:
  • in_channels (int, list of int) – Number of hidden units for each hidden layer in the MLP of the \(\texttt{select}\) operator. The first integer must match the size of the node features.

  • k (int) – Number of clusters or supernodes in the pooler graph.

  • act (str or Callable, optional) – Activation function in the hidden layers of the MLP of the \(\texttt{select}\) operator. (default: None)

  • dropout (float, optional) – Dropout probability in the MLP of the \(\texttt{select}\) operator. (default: 0.0)

  • cut_loss_coeff (float, optional) – Coefficient for the MinCut loss. (default: 1.0)

  • ortho_loss_coeff (float, optional) – Coefficient for the orthogonality loss. (default: 1.0)

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • adj_transpose (bool, optional) – If True, the preprocessing step in tgp.src.DenseSRCPooling and the tgp.connect.DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)

  • cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the tgp.select.SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the tgp.select.SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • batched (bool, optional) – If True, uses the batched dense path which converts sparse inputs to dense padded tensors. If False, uses the unbatched path which operates on sparse adjacency matrices without padding, providing better memory efficiency for graphs with varying sizes. (default: True)

  • sparse_output (bool, optional) – If True, returns block-diagonal sparse outputs. If False, returns batched dense outputs. (default: False)

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – Node feature tensor. For batched mode: \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\). For unbatched mode: \(\mathbf{X} \in \mathbb{R}^{N \times F}\), where \(N\) is the total number of nodes across all graphs.

  • adj (Adj, optional) – The connectivity matrix. For batched mode: it can be either sparse connectivity (edge_index, torch_sparse.SparseTensor, or torch COO), which is internally converted to a dense padded tensor of shape \([B, N, N]\), or an already dense tensor of shape \([B, N, N]\). For unbatched mode: Sparse connectivity matrix in one of the formats supported by Adj (edge_index, SparseTensor, etc.). (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges (unbatched mode only). (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • mask (Tensor, optional) – Input-node validity mask \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) with True on real (non-padded) nodes in each graph. Only used when inputs are already dense/padded. (default: None)

  • batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • batch_pooled (Tensor, optional) – The batch vector for the pooled nodes. Required when lifting with dense \([N, K]\) SelectOutput on multi-graph batches. Pass out.batch from the pooling call. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput
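The batched-mode shapes above (a padded \([B, N, N]\) adjacency plus a \([B, N]\) validity mask) can be pictured with a small conversion from edge_index and batch. This plain-Python sketch assumes nodes are ordered by graph and uses a hypothetical helper name; it only illustrates the layout and is not the conversion the library performs internally.

```python
def to_dense_batch_sketch(edge_index, batch):
    """Build a padded dense adjacency [B, N_max, N_max] and a mask [B, N_max]."""
    num_graphs = max(batch) + 1
    sizes = [0] * num_graphs
    for g in batch:
        sizes[g] += 1
    n_max = max(sizes)
    # Index of the first node of each graph (nodes assumed sorted by graph).
    offsets = [0] * num_graphs
    for g in range(1, num_graphs):
        offsets[g] = offsets[g - 1] + sizes[g - 1]
    adj = [[[0.0] * n_max for _ in range(n_max)] for _ in range(num_graphs)]
    mask = [[i < sizes[g] for i in range(n_max)] for g in range(num_graphs)]
    for src, dst in zip(edge_index[0], edge_index[1]):
        g = batch[src]
        adj[g][src - offsets[g]][dst - offsets[g]] += 1.0
    return adj, mask

# Graph 0: path 0 - 1 - 2. Graph 1: single edge 3 - 4.
batch = [0, 0, 0, 1, 1]
edge_index = [[0, 1, 1, 2, 3, 4],
              [1, 0, 2, 1, 4, 3]]
dense_adj, node_mask = to_dense_batch_sketch(edge_index, batch)
```

The second graph has only two nodes, so its third row and column are zero padding and its mask entry for that slot is False.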

compute_sparse_loss(edge_index: Tensor | SparseTensor, edge_weight: Tensor | None, S: Tensor, batch: Tensor | None) dict[source]

Computes the auxiliary loss terms for unbatched (sparse) mode.

This method is used when batched=False and operates on sparse adjacency matrices without requiring padding or densification.

Parameters:
  • edge_index (Adj) – Graph connectivity in sparse format.

  • edge_weight (Tensor, optional) – Edge weights of shape \((E,)\).

  • S (Tensor) – The dense assignment matrix of shape \((N, K)\).

  • batch (Tensor, optional) – Batch vector of shape \((N,)\).

Returns:

A dictionary with the different terms of the auxiliary loss:
  • 'cut_loss': The sparse mincut loss weighted by cut_loss_coeff.

  • 'ortho_loss': The unbatched orthogonality loss weighted by ortho_loss_coeff.

Return type:

dict
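For intuition about the two terms, the sketch below evaluates the losses in the form given in the cited paper, \(-\mathrm{Tr}(\mathbf{S}^\top \mathbf{A} \mathbf{S}) / \mathrm{Tr}(\mathbf{S}^\top \mathbf{D} \mathbf{S})\) and \(\lVert \mathbf{S}^\top \mathbf{S} / \lVert \mathbf{S}^\top \mathbf{S} \rVert_F - \mathbf{I}_K / \sqrt{K} \rVert_F\), on a tiny dense graph. It uses dense plain-Python matrices rather than the sparse path of compute_sparse_loss, omits the loss coefficients, and all helper names are assumptions made for this example.

```python
import math

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def mincut_losses_sketch(adj, S):
    """Return (cut_loss, ortho_loss) for dense adjacency adj and N x K assignments S."""
    n, k = len(S), len(S[0])
    deg = [[sum(adj[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
    St = transpose(S)
    cut_loss = -trace(matmul(matmul(St, adj), S)) / trace(matmul(matmul(St, deg), S))
    StS = matmul(St, S)
    fro = math.sqrt(sum(v * v for row in StS for v in row))
    diff = [[StS[i][j] / fro - (1.0 / math.sqrt(k) if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    ortho_loss = math.sqrt(sum(v * v for row in diff for v in row))
    return cut_loss, ortho_loss

# Two disconnected edges, 0 - 1 and 2 - 3, clustered perfectly.
adj = [[0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 0.0]]
S = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
cut_loss, ortho_loss = mincut_losses_sketch(adj, S)
```

For this ideal partition the cut term reaches its minimum of \(-1\) and the orthogonality term vanishes, since the clusters are disjoint and equally sized.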

class NDPPooling(lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', lift_red_op: str = 'sum', cached: bool = False)[source]

The pooling operator from the paper “Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling” (Bianchi et al., TNNLS 2020).

  • The \(\texttt{select}\) operator is implemented with NDPSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with KronConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

Parameters:
  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • cached (bool, optional) – If set to True, the output of the \(\texttt{select}\) and \(\texttt{connect}\) operations will be cached, so that they do not need to be recomputed. (default: False)
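The difference between the "transpose" and "inverse" choices for \(\mathbf{S}_\text{inv}\) is easy to see on a hard (one-hot) assignment matrix. For such a matrix with no empty cluster, \(\mathbf{S}^\top \mathbf{S}\) is the diagonal matrix of cluster sizes, so \(\mathbf{S}^+ = (\mathbf{S}^\top \mathbf{S})^{-1} \mathbf{S}^\top\) is \(\mathbf{S}^\top\) with each row divided by its cluster size. The helper names below are hypothetical, and the shortcut applies only to this one-hot case.

```python
def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def s_inv_transpose(S):
    """S_inv = S^T."""
    return [list(col) for col in zip(*S)]

def s_inv_pinv_one_hot(S):
    """Pseudoinverse of a one-hot N x K matrix without empty clusters:
    rows of S^T divided by the corresponding cluster size."""
    St = [list(col) for col in zip(*S)]
    return [[v / sum(row) for v in row] for row in St]

# Nodes 0 and 1 belong to cluster 0, node 2 to cluster 1.
S = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
S_t = s_inv_transpose(S)
S_pinv = s_inv_pinv_one_hot(S)

# Defining properties of the pseudoinverse in this full-column-rank case.
left_identity = matmul(S_pinv, S)              # S^+ S = I_K
reconstructed = matmul(matmul(S, S_pinv), S)   # S S^+ S = S
```

Multiplying by \(\mathbf{S}^\top\) sums over the nodes of a cluster, whereas multiplying by \(\mathbf{S}^+\) averages over them in this one-hot setting.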

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. It can be either a torch_sparse.SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch, or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

class NMFPooling(k: int, cached: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False, adj_transpose: bool = True, lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', batched: bool = False, sparse_output: bool = False, cache_preprocessing: bool = False)[source]

The Non-negative Matrix Factorization pooling as proposed in the paper “A Non-Negative Factorization approach to node pooling in Graph Convolutional Neural Networks” (Bacciu and Di Sotto, AIIA 2019).

NMF pooling performs a non-negative matrix factorization of the adjacency matrix:

\[\mathbf{A} \approx \mathbf{W} \mathbf{H}\]

where \(\mathbf{H}\) is the soft cluster assignment matrix and \(\mathbf{W}\) is the cluster centroid matrix.
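To make the factorization concrete, the sketch below approximates a small adjacency matrix with the classical Lee and Seung multiplicative updates for the Frobenius objective. This solver is swapped in purely for illustration; which solver NMFSelect uses is not stated here, and the function names, initialization, and iteration count are all assumptions.

```python
import random

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf_sketch(A, k, num_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative N x N matrix A into W (N x k) and H (k x N)."""
    rng = random.Random(seed)
    n = len(A)
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(num_iter):
        # H <- H * (W^T A) / (W^T W H), element-wise.
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # W <- W * (A H^T) / (W H H^T), element-wise.
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H

def frobenius_error(A, W, H):
    WH = matmul(W, H)
    n = len(A)
    return sum((A[i][j] - WH[i][j]) ** 2 for i in range(n) for j in range(n)) ** 0.5

# Two 2-node blocks with self-loops: a non-negative matrix of rank 2.
A = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]

W_init, H_init = nmf_sketch(A, k=2, num_iter=0)
W, H = nmf_sketch(A, k=2)
err_init = frobenius_error(A, W_init, H_init)
err_final = frobenius_error(A, W, H)
```

Each column of \(\mathbf{H}\) can then be read as the soft membership of one node over the \(k\) clusters, matching the role of \(\mathbf{H}\) described above.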

  • The \(\texttt{select}\) operator is implemented with NMFSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with DenseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

Notes

  • This implementation supports sparse inputs and multi-graph batches via edge_index + batch.

  • Dense padded batched inputs (\([B, N, N]\)) are not supported.

Parameters:
  • k (int) – Number of clusters or supernodes in the pooler graph.

  • cached (bool, optional) – If set to True, the output of the \(\texttt{select}\) and \(\texttt{connect}\) operations will be cached, so that they do not need to be recomputed. (default: False)

  • cache_preprocessing (bool, optional) – If True, caches the dense adjacency produced during preprocessing. This should only be enabled when the same graph is reused across iterations. (default: False)

  • remove_self_loops (bool, optional) – Whether to remove self-loops from the graph after coarsening. (default: True)

  • degree_norm (bool, optional) – If True, normalize the pooled adjacency matrix by the nodes’ degree. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • adj_transpose (bool, optional) – If True, the preprocessing step in DenseSRCPooling and the DenseConnect operation return transposed adjacency matrices, so that they can be passed “as is” to the dense message-passing layers. (default: True)

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • batched (bool, optional) – Kept for API compatibility. Dense padded batched mode is unsupported and this option is ignored. (default: False)

  • sparse_output (bool, optional) – If True, return sparse pooled connectivity. (default: False)

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, mask: Tensor | None = None, batch: Tensor | None = None, batch_pooled: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput | Tensor[source]

Forward pass.

Parameters:
  • x (Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{N \times F}\).

  • adj (Adj, optional) – The connectivity matrix. Sparse connectivity in one of the formats supported by Adj. If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – Edge weights for sparse inputs. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • mask (Tensor, optional) – Unused input-node validity mask. (default: None)

  • batch (Tensor, optional) – Batch vector \(\mathbf{b} \in \{0,\ldots,B-1\}^{N}\) for sparse inputs. (default: None)

  • batch_pooled (Tensor, optional) – Batch vector for pooled nodes. Required when lifting from dense \([N, K]\) assignments on multi-graph batches. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput or Tensor

precoarsening(edge_index: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, *, batch: Tensor | None = None, num_nodes: int | None = None, **kwargs) PoolingOutput[source]

Precompute pooling outputs while forcing a fixed cluster count k.

class NoPool[source]

Identity pooling operator that performs no actual pooling. This pooler creates a consistent SelectOutput and PoolingOutput structure, but each node maps to itself and all features and edges are preserved unchanged.

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput or Tensor

precoarsening(edge_index: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, *, batch: Tensor | None = None, num_nodes: int | None = None, **select_kwargs) PoolingOutput[source]

Precoarsening for NoPool - returns identity mapping with features.

class PANPooling(in_channels: int, ratio: float = 0.5, min_score: float | None = None, multiplier: float = 1.0, nonlinearity: str | Callable = 'tanh', lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: str = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = False, degree_norm: bool = False, edge_weight_norm: bool = False)[source]

The path integral based pooling operator from the paper “Path Integral Based Convolution and Pooling for Graph Neural Networks” (Ma et al., NeurIPS 2020).

PAN pooling performs top-\(k\) pooling where global node importance is measured based on node features \(\mathbf{X}\) and the MET matrix \(\mathbf{M}\):

\[{\rm score} = \beta_1 \mathbf{X} \cdot \mathbf{p} + \beta_2 {\rm deg}(\mathbf{M})\]

The MET matrix must be computed by the PANConv layer.
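The score formula above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: `pan_scores` is a hypothetical helper, the MET matrix is passed as a dictionary of `(row, col) -> weight` entries, the degree of a node is taken to be the sum of the MET weights in its column, and the nonlinearity is left out.

```python
def pan_scores(x, p, met_edges, beta1, beta2):
    """score_i = beta1 * <x_i, p> + beta2 * deg_i(M)."""
    n = len(x)
    deg = [0.0] * n
    for (_row, col), weight in met_edges.items():
        # Assumption: deg(M) is the per-column sum of MET weights.
        deg[col] += weight
    return [
        beta1 * sum(a * b for a, b in zip(x[i], p)) + beta2 * deg[i]
        for i in range(n)
    ]


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
p = [0.5, -0.5]
met = {(0, 1): 1.0, (1, 0): 1.0, (1, 2): 2.0, (2, 1): 2.0, (2, 3): 0.5, (3, 2): 0.5}

scores = pan_scores(x, p, met, beta1=1.0, beta2=0.1)
# Nodes ranked from most to least important.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

The first entries of `ranking` are the nodes a top-\(k\) selector would keep.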

  • The \(\texttt{select}\) operator is implemented with TopkSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with SparseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.

Parameters:
  • in_channels (int) – Size of each input sample.

  • ratio (float) – Graph pooling ratio, which is used to compute \(k = \lceil \mathrm{ratio} \cdot N \rceil\). This value is ignored if min_score is not None. (default: 0.5)

  • min_score (float, optional) – Minimal node score \(\tilde{\alpha}\) which is used to compute indices of pooled nodes \(\mathbf{i} = \mathbf{s}_i > \tilde{\alpha}\). When this value is not None, the ratio argument is ignored. (default: None)

  • multiplier (float, optional) – Coefficient by which features get multiplied after pooling. This can be useful for large graphs and when min_score is used. (default: 1.0)

  • nonlinearity (str or callable, optional) – The non-linearity to use when computing the score. (default: "tanh")

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: False)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
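To make the difference between the "transpose" and "inverse" options concrete, here is a small plain-Python sketch for a hard (0/1) assignment matrix \(\mathbf{S} \in \{0,1\}^{N \times K}\) with no empty cluster. In that case \(\mathbf{S}^\top \mathbf{S}\) is the diagonal matrix of cluster sizes, so \(\mathbf{S}^+\) is \(\mathbf{S}^\top\) with each row divided by its cluster size. The helper names are hypothetical and the code does not use the library.

```python
def assignment_matrix(cluster, num_clusters):
    """Dense N x K hard assignment matrix with S[i][cluster[i]] = 1."""
    return [
        [1.0 if cluster[i] == k else 0.0 for k in range(num_clusters)]
        for i in range(len(cluster))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def pinv_hard_assignment(s):
    """Moore-Penrose pseudoinverse of a 0/1 assignment with non-empty clusters.

    S^T S = diag(cluster sizes), hence S^+ = diag(1 / size) S^T.
    """
    return [[v / sum(row) for v in row] for row in transpose(s)]


# Nodes 0 and 1 fall in cluster 0, node 2 in cluster 1.
s = assignment_matrix([0, 0, 1], num_clusters=2)
s_inv_transpose = transpose(s)          # "transpose"
s_inv_inverse = pinv_hard_assignment(s)  # "inverse"
```

When every supernode holds a single node with unit weight, the two options coincide; they differ as soon as a supernode groups several nodes or the entries of \(\mathbf{S}\) are weighted.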

forward(x: Tensor, adj: SparseTensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (SparseTensor) – The MET matrix \(\mathbf{M}\) from the PANConv layer. It has a (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch.

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

class SAGPooling(in_channels: int, ratio: float | int = 0.5, GNN: Module | None = None, min_score: float | None = None, multiplier: float = 1.0, nonlinearity: str | Callable = 'tanh', lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: str = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False, **kwargs)[source]

The self-attention pooling operator from the paper “Self-Attention Graph Pooling” (Lee et al., ICML 2019).

The attention scores \(\mathbf{a}\) used by the top-\(k\) selector are computed as:

\[\mathbf{a} = \textrm{GNN}(\mathbf{X}, \mathbf{A})\]

  • The \(\texttt{select}\) operator is implemented with TopkSelect.

  • The \(\texttt{reduce}\) operator is implemented with BaseReduce.

  • The \(\texttt{connect}\) operator is implemented with SparseConnect.

  • The \(\texttt{lift}\) operator is implemented with BaseLift.
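As an illustration of \(\mathbf{a} = \textrm{GNN}(\mathbf{X}, \mathbf{A})\), the sketch below uses a single-output layer in the style of GraphConv, \(a_i = \mathbf{w}_\text{root}^\top \mathbf{x}_i + \mathbf{w}_\text{rel}^\top \sum_{j \to i} \mathbf{x}_j\), written in plain Python. The weights are hand-picked rather than learned, the bias is omitted, and `graph_conv_scores` is a hypothetical helper, so this only shows the shape of the computation.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def graph_conv_scores(x, edge_index, w_root, w_rel):
    """One scalar score per node from its own and its neighbours' features."""
    n, f = len(x), len(x[0])
    agg = [[0.0] * f for _ in range(n)]
    for src, dst in zip(edge_index[0], edge_index[1]):
        for c in range(f):
            agg[dst][c] += x[src][c]  # sum features of incoming neighbours
    return [dot(w_root, x[i]) + dot(w_rel, agg[i]) for i in range(n)]


x = [[1.0], [2.0], [3.0]]
edge_index = [[0, 1, 1, 2], [1, 0, 2, 1]]  # undirected path 0 - 1 - 2
a = graph_conv_scores(x, edge_index, w_root=[1.0], w_rel=[0.5])
```

Unlike a pure projection score, these values depend on the graph structure: node 1 scores as high as node 2 because it has two neighbours.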

Parameters:
  • in_channels (int) – Size of each input sample.

  • ratio (float or int) – Graph pooling ratio, which is used to compute \(k = \lceil \mathrm{ratio} \cdot N \rceil\), or the value of \(k\) itself, depending on whether the type of ratio is float or int. This value is ignored if min_score is not None. (default: 0.5)

  • GNN (Module, optional) – A graph neural network layer for calculating projection scores (one of GraphConv, GCNConv, GATConv or SAGEConv). (default: GraphConv)

  • min_score (float, optional) – Minimal node score \(\tilde{\alpha}\) which is used to compute indices of pooled nodes \(\mathbf{i} = \mathbf{s}_i > \tilde{\alpha}\). When this value is not None, the ratio argument is ignored. (default: None)

  • multiplier (float, optional) – Coefficient by which features get multiplied after pooling. This can be useful for large graphs and when min_score is used. (default: 1.0)

  • nonlinearity (str or callable, optional) – The non-linearity to use when computing the score. (default: "tanh")

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)

  • **kwargs (any, optional) – Additional parameters for initializing the graph neural network layer.
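The interplay between ratio and min_score can be sketched as below. The code is plain Python and makes two assumptions that the text above does not state: selection is done independently within each graph of the batch, and no fallback is applied when a graph has no node above min_score. `select_nodes` is a hypothetical helper.

```python
import math


def select_nodes(scores, batch, ratio=0.5, min_score=None):
    """Return the indices of kept nodes, graph by graph."""
    graphs = {}
    for node, graph_id in enumerate(batch):
        graphs.setdefault(graph_id, []).append(node)

    kept = []
    for graph_id in sorted(graphs):
        nodes = graphs[graph_id]
        if min_score is not None:
            # ratio is ignored as soon as min_score is given.
            kept += [i for i in nodes if scores[i] > min_score]
            continue
        if isinstance(ratio, int):
            k = min(ratio, len(nodes))         # int: ratio is k itself
        else:
            k = math.ceil(ratio * len(nodes))  # float: k = ceil(ratio * N)
        ranked = sorted(nodes, key=lambda i: scores[i], reverse=True)
        kept += ranked[:k]
    return kept


scores = [0.9, 0.1, 0.5, 0.3, 0.8]
batch = [0, 0, 0, 1, 1]

by_ratio = select_nodes(scores, batch, ratio=0.5)
by_count = select_nodes(scores, batch, ratio=1)
by_threshold = select_nodes(scores, batch, min_score=0.25)
```

With a threshold, the number of kept nodes varies from graph to graph, which is why the multiplier argument can help rescale the pooled features.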

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, attn: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. It can either be a SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch, or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • attn (Tensor, optional) – Optional node-level matrix to use for computing attention scores instead of using the node feature matrix x. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput

class SEPPooling(lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: Literal['sum', 'mean', 'min', 'max', 'mul'] = 'sum', lift_red_op: str = 'sum', cached: bool = False, remove_self_loops: bool = True, degree_norm: bool = True, edge_weight_norm: bool = False)[source]

The SEPPooling operator from the paper “Structural Entropy Guided Graph Hierarchical Pooling” (Wu et al., ICML 2022).

SEP performs graph pooling by optimizing cluster assignments globally with the goal of minimizing structural entropy. SEP internally builds a coding tree. In standard pooling mode (forward()), only the first partition above the original nodes is exposed, i.e., node-to-depth-1 clusters.

Note

A single call to forward() only returns the finest pooled partition (the bottom non-leaf level of the SEP tree). This corresponds to using only a depth-2 tree view (nodes -> first supernodes -> root). To use deeper SEP hierarchies (depth > 2) as intended by the original method, use pre-coarsening via multi_level_precoarsening() (or PreCoarsening with repeated "sep" levels).

Example

Standard one-level forward (returns only depth-1 assignments):

pool = SEPPooling()
out = pool(
    x=x,
    adj=edge_index,
    edge_weight=edge_weight,
    batch=batch,
)
# out.so maps original nodes -> first-level SEP clusters only.

Multi-level SEP pre-coarsening (returns hierarchy levels):

pool = SEPPooling()
levels = pool.multi_level_precoarsening(
    levels=3,
    edge_index=edge_index,
    edge_weight=edge_weight,
    batch=batch,
    num_nodes=x.size(0),
)
# levels[0].so: nodes -> level-1
# levels[1].so: level-1 -> level-2
# levels[2].so: level-2 -> level-3

Equivalent transform-level usage:

from tgp.data.transforms import PreCoarsening

transform = PreCoarsening(poolers=["sep", "sep", "sep"])
data = transform(data)
# data.pooled_data contains 3 pooled levels in order.
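Since each level maps the clusters of the previous level to coarser ones, the levels can be chained to relate original nodes directly to a deeper level. The sketch below shows this composition on hand-written cluster index vectors in plain Python; how such vectors are read out of the SelectOutput objects is not shown and would depend on the library, and `compose` is a hypothetical helper.

```python
def compose(fine_to_mid, mid_to_coarse):
    """Chain two hard assignments: fine node -> mid cluster -> coarse cluster."""
    return [mid_to_coarse[c] for c in fine_to_mid]


level1 = [0, 0, 1, 1, 2, 2]  # 6 nodes -> 3 level-1 clusters
level2 = [0, 0, 1]           # 3 level-1 clusters -> 2 level-2 clusters
level3 = [0, 0]              # 2 level-2 clusters -> 1 level-3 cluster

nodes_to_level2 = compose(level1, level2)
nodes_to_level3 = compose(nodes_to_level2, level3)
```

In matrix form this is the product of the per-level assignment matrices, \(\mathbf{S}_1 \mathbf{S}_2 \mathbf{S}_3\).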
Parameters:
  • cached (bool, optional) – If True, cache SelectOutput. (default: False)

  • remove_self_loops (bool, optional) – Whether to remove self-loops after coarsening. (default: True)

  • degree_norm (bool, optional) – If True, symmetrically normalize pooled adjacency. (default: True)

  • edge_weight_norm (bool, optional) – Whether to normalize pooled edge weights. (default: False)

  • lift (LiftType, optional) – Operation used by BaseLift to compute \(\mathbf{S}_\text{inv}\) during lifting. (default: "precomputed")

  • s_inv_op (SinvType, optional) – Operation used to compute \(\mathbf{S}_\text{inv}\) in SelectOutput. (default: "transpose")

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput | Tensor[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. It can either be a torch_sparse.SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch, or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) or \([E, 1]\) containing the weights of the edges. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

Pooled output if lifting=False, otherwise lifted features.

Return type:

PoolingOutput or Tensor

multi_level_precoarsening(levels: int, edge_index: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, *, batch: Tensor | None = None, num_nodes: int | None = None, **kwargs) list[PoolingOutput][source]

Compute multiple SEP pre-coarsening levels from a single tree hierarchy.

class TopkPooling(in_channels: int, ratio: int | float = 0.5, min_score: float | None = None, multiplier: float = 1.0, nonlinearity: str | Callable = 'tanh', lift: Literal['transpose', 'inverse', 'precomputed'] = 'precomputed', s_inv_op: Literal['transpose', 'inverse'] = 'transpose', connect_red_op: str = 'sum', lift_red_op: str = 'sum', remove_self_loops: bool = True, degree_norm: bool = False, edge_weight_norm: bool = False)[source]

The \(\mathrm{top}_k\) pooling operator from the papers “Graph U-Nets” (Gao & Ji, ICML 2019), “Towards Sparse Hierarchical Graph Classifiers” (Cangea et al., 2018), and “Understanding Attention and Generalization in Graph Neural Networks” (Knyazev et al., NeurIPS 2019).
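For orientation, the Graph U-Nets formulation scores nodes by projecting features on a learnable vector, \(\mathbf{y} = \mathbf{X}\mathbf{p} / \|\mathbf{p}\|\), keeps the \(k\) highest-scoring nodes, and gates their features with \(\tanh(\mathbf{y})\). The plain-Python sketch below follows that formulation with a fixed \(\mathbf{p}\); whether this class matches it term for term is an assumption, and `topk_pool` is a hypothetical helper.

```python
import math


def topk_pool(x, p, ratio):
    """Project, keep the top-k nodes, and gate their features with tanh."""
    norm = math.sqrt(sum(v * v for v in p))
    y = [sum(a * b for a, b in zip(row, p)) / norm for row in x]
    k = math.ceil(ratio * len(x))
    idx = sorted(range(len(x)), key=lambda i: y[i], reverse=True)[:k]
    gated = [[v * math.tanh(y[i]) for v in x[i]] for i in idx]
    return idx, gated


x = [[3.0, 4.0], [0.0, 0.0], [-3.0, -4.0], [0.6, 0.8]]
p = [3.0, 4.0]
idx, x_pooled = topk_pool(x, p, ratio=0.5)
```

The gating step is what lets gradients reach \(\mathbf{p}\), since the index selection itself is not differentiable.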

Parameters:
  • in_channels (int) – Size of each input sample.

  • ratio (float or int) – The graph pooling ratio, which is used to compute \(k = \lceil \mathrm{ratio} \cdot N \rceil\), or the value of \(k\) itself, depending on whether the type of ratio is float or int. This value is ignored if min_score is not None. (default: 0.5)

  • min_score (float, optional) – Minimal node score \(\tilde{\alpha}\) which is used to compute indices of pooled nodes \(\mathbf{i} = \mathbf{s}_i > \tilde{\alpha}\). When this value is not None, the ratio argument is ignored. (default: None)

  • multiplier (float, optional) – Coefficient by which features get multiplied after pooling. This can be useful for large graphs and when min_score is used. (default: 1.0)

  • nonlinearity (str or callable, optional) – The non-linearity to use when computing the score. (default: "tanh")

  • lift (LiftType, optional) –

    Defines how to compute the matrix \(\mathbf{S}_\text{inv}\) to lift the pooled node features.

    • "precomputed" (default): Use as \(\mathbf{S}_\text{inv}\) what is already stored in the "s_inv" attribute of the SelectOutput.

    • "transpose": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Recomputes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • s_inv_op (SinvType, optional) –

    The operation used to compute \(\mathbf{S}_\text{inv}\) from the select matrix \(\mathbf{S}\). \(\mathbf{S}_\text{inv}\) is stored in the "s_inv" attribute of the SelectOutput. It can be one of:

    • "transpose" (default): Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^\top\), the transpose of \(\mathbf{S}\).

    • "inverse": Computes \(\mathbf{S}_\text{inv}\) as \(\mathbf{S}^+\), the Moore-Penrose pseudoinverse of \(\mathbf{S}\).

  • connect_red_op (ConnectionType, optional) – The aggregation function to be applied to all edges connecting nodes assigned to supernodes \(i\) and \(j\). Can be any string of class ConnectionType admitted by coalesce, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • lift_red_op (ReduceType, optional) – The aggregation function to be applied to the lifted node features. Can be any string of class ReduceType admitted by scatter, e.g., 'sum', 'mean', 'max'. (default: "sum")

  • remove_self_loops (bool, optional) – If True, the self-loops will be removed from the adjacency matrix. (default: True)

  • degree_norm (bool, optional) – If True, the adjacency matrix will be symmetrically normalized. (default: False)

  • edge_weight_norm (bool, optional) – Whether to normalize the edge weights by dividing by the maximum absolute value per graph. (default: False)
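The two normalization options can be sketched in plain Python as follows. The first helper computes the edge weights of \(\mathbf{D}^{-1/2} \mathbf{A} \mathbf{D}^{-1/2}\), the second divides each weight by the largest absolute weight in its graph. Both are hypothetical helpers written for this example; they assume strictly positive degrees and at least one nonzero weight per graph, and they take the graph of an edge from its source node.

```python
import math


def symmetric_normalize(edge_index, edge_weight, num_nodes):
    """Weights of D^{-1/2} A D^{-1/2}, with D the weighted degree matrix."""
    deg = [0.0] * num_nodes
    for src, w in zip(edge_index[0], edge_weight):
        deg[src] += w
    return [
        w / math.sqrt(deg[s] * deg[d])
        for s, d, w in zip(edge_index[0], edge_index[1], edge_weight)
    ]


def max_abs_normalize(edge_index, edge_weight, batch):
    """Divide each weight by the largest absolute weight in its graph."""
    max_abs = {}
    for src, w in zip(edge_index[0], edge_weight):
        g = batch[src]
        max_abs[g] = max(max_abs.get(g, 0.0), abs(w))
    return [w / max_abs[batch[s]] for s, w in zip(edge_index[0], edge_weight)]


edge_index = [[0, 1, 1, 2], [1, 0, 2, 1]]  # undirected path 0 - 1 - 2
edge_weight = [2.0, 2.0, 8.0, 8.0]
batch = [0, 0, 0]

sym = symmetric_normalize(edge_index, edge_weight, num_nodes=3)
scaled = max_abs_normalize(edge_index, edge_weight, batch)
```

Symmetric normalization rescales each edge by the degrees of both endpoints, while max-abs scaling only bounds the weights to \([-1, 1]\) within each graph.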

forward(x: Tensor, adj: Tensor | SparseTensor | None = None, edge_weight: Tensor | None = None, so: SelectOutput | None = None, batch: Tensor | None = None, attn: Tensor | None = None, lifting: bool = False, **kwargs) PoolingOutput[source]

Forward pass.

Parameters:
  • x (Tensor) – The node feature matrix of shape \([N, F]\), where \(N\) is the number of nodes in the batch and \(F\) is the number of node features.

  • adj (Adj, optional) – The connectivity matrix. It can either be a SparseTensor of (sparse) shape \([N, N]\), where \(N\) is the number of nodes in the batch, or a Tensor of shape \([2, E]\), where \(E\) is the number of edges in the batch. If lifting is False, it cannot be None. (default: None)

  • edge_weight (Tensor, optional) – A vector of shape \([E]\) containing the weights of the edges. (default: None)

  • so (SelectOutput, optional) – The output of the \(\texttt{select}\) operator. (default: None)

  • batch (Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which indicates to which graph in the batch each node belongs. (default: None)

  • attn (Tensor, optional) – Optional node-level matrix to use for computing attention scores instead of using the node feature matrix x. (default: None)

  • lifting (bool, optional) – If set to True, the \(\texttt{lift}\) operation is performed. (default: False)

Returns:

The output of the pooling operator.

Return type:

PoolingOutput