# GraphormerLayer

Bases: `Module`

Graphormer layer with dense multi-head attention, as introduced in *Do Transformers Really Perform Bad for Graph Representation?*

Parameters:
• feat_size (int) – Feature size.

• hidden_size (int) – Hidden size of feedforward layers.

• num_heads (int) – Number of attention heads. `feat_size` must be divisible by `num_heads`.

• attn_bias_type (str, optional) – The type of attention bias used for modifying attention. Selected from "add" or "mul". Default: "add". (A sketch of the two bias types follows this parameter list.)

• "add" is for additive attention bias.

• "mul" is for multiplicative attention bias.

• norm_first (bool, optional) – If True, layer normalization is performed before the attention and feedforward operations; otherwise, it is applied afterwards (see the pre-/post-LN sketch after the examples). Default: False.

• dropout (float, optional) – Dropout probability. Default: 0.1.

• attn_dropout (float, optional) – Attention dropout probability. Default: 0.1.

• activation (callable activation layer, optional) – Activation function. Default: nn.ReLU().
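
The two bias types differ only in how the bias enters the raw attention scores before softmax. Below is a minimal sketch of that idea, assuming a hypothetical `apply_bias` helper and a score/bias layout of (batch_size, num_heads, N, N); it is an illustration, not DGL's implementation.

```
import torch as th

def apply_bias(scores, bias, attn_bias_type="add"):
    # Hypothetical helper: combine raw attention scores with a bias.
    # scores, bias: (batch_size, num_heads, N, N)
    if attn_bias_type == "add":
        # Additive bias shifts the pre-softmax logits, e.g. to inject
        # structural signals such as shortest-path distances.
        return scores + bias
    # Multiplicative bias rescales the logits instead.
    return scores * bias

scores = th.rand(2, 8, 4, 4)
bias = th.rand(2, 8, 4, 4)
attn = th.softmax(apply_bias(scores, bias, "add"), dim=-1)
```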

Examples

```
>>> import torch as th
>>> from dgl.nn import GraphormerLayer
>>> batch_size = 16
>>> num_nodes = 100
>>> feat_size = 512
>>> num_heads = 8
>>> nfeat = th.rand(batch_size, num_nodes, feat_size)
>>> bias = th.rand(batch_size, num_nodes, num_nodes, num_heads)
>>> net = GraphormerLayer(
...     feat_size=feat_size,
...     hidden_size=2048,
...     num_heads=num_heads,
... )
>>> out = net(nfeat, bias)
```
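
The `norm_first` option corresponds to the usual pre-LN vs. post-LN transformer orderings. Here is a minimal sketch of the two residual patterns with a generic sublayer (an illustration, not DGL's code):

```
import torch as th
import torch.nn as nn

def residual_block(h, sublayer, norm, norm_first=False):
    if norm_first:
        # Pre-LN (norm_first=True): normalize before the sublayer;
        # often stabilizes training of deep stacks.
        return h + sublayer(norm(h))
    # Post-LN (norm_first=False, the default): normalize after the
    # residual addition, as in the original transformer.
    return norm(h + sublayer(h))

ffn = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))
norm = nn.LayerNorm(512)
h = th.rand(16, 100, 512)
out = residual_block(h, ffn, norm, norm_first=True)
```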

`forward(nfeat, attn_bias=None, attn_mask=None)` – Forward computation.

Parameters:
• nfeat (torch.Tensor) – A 3D input tensor. Shape: (batch_size, N, `feat_size`), where N is the maximum number of nodes.

• attn_bias (torch.Tensor, optional) – The attention bias used for attention modification. Shape: (batch_size, N, N, `num_heads`).

• attn_mask (torch.Tensor, optional) – The attention mask used to avoid computation on invalid positions, where invalid positions are indicated by True values. Shape: (batch_size, N, N). Note: for rows corresponding to nonexistent nodes, make sure at least one entry is set to False; otherwise softmax yields NaNs. A mask-construction sketch follows the return description.

Returns:

y β The output tensor. Shape: (batch_size, N, `feat_size`)

Return type:

torch.Tensor
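
For a batch of graphs with varying node counts, `attn_mask` can be built from per-graph sizes. A minimal sketch, assuming a hypothetical `num_nodes_per_graph` tensor; the diagonal is left unmasked so that even rows of padded nodes keep one False entry and softmax stays NaN-free.

```
import torch as th

batch_size, N = 2, 5
num_nodes_per_graph = th.tensor([5, 3])  # hypothetical per-graph sizes

# valid[b, i] is True when node i exists in graph b.
valid = th.arange(N).unsqueeze(0) < num_nodes_per_graph.unsqueeze(1)

# True marks invalid positions (the node pair involves a padded node).
attn_mask = ~(valid.unsqueeze(1) & valid.unsqueeze(2))  # (batch_size, N, N)

# Unmask the diagonal: every row keeps at least one False entry,
# so softmax over padded rows cannot produce NaNs.
idx = th.arange(N)
attn_mask[:, idx, idx] = False
```

The mask can then be passed alongside the bias, e.g. `out = net(nfeat, bias, attn_mask)`.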