dgl.distributed.sample_neighbors

dgl.distributed.sample_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False)

Sample from the neighbors of the given nodes from a distributed graph.

For each node in nodes, a number of inbound edges (or outbound edges when edge_dir == 'out'), determined by fanout, are chosen at random. The returned graph contains all the nodes of the original graph, but only the sampled edges.

Node/edge features are not preserved. The original IDs of the sampled edges are stored as the dgl.EID feature in the returned graph.

For heterogeneous graphs, nodes is a dictionary whose keys are node types and whose values are the node IDs of that type.
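The sampling behavior described above can be illustrated with a small, self-contained sketch. This is a toy model in plain Python, not DGL's implementation: the helper sample_in_edges and the example edge list are hypothetical, and an edge's position in the list stands in for the original edge ID that DGL stores under dgl.EID.

```python
import random

def sample_in_edges(edges, seeds, fanout, rng):
    """Toy model: pick up to `fanout` inbound edges for each seed node.

    `edges` is a list of (src, dst) pairs; an edge's ID is its list index.
    Returns (src, dst, eid) triples for the sampled edges.
    """
    sampled = []
    for node in seeds:
        # Collect the IDs of all edges that point into `node`.
        inbound = [eid for eid, (_, dst) in enumerate(edges) if dst == node]
        if fanout == -1 or fanout >= len(inbound):
            chosen = inbound  # keep every inbound edge
        else:
            chosen = rng.sample(inbound, fanout)  # uniform, without replacement
        sampled.extend((edges[eid][0], edges[eid][1], eid) for eid in chosen)
    return sampled

# Hypothetical graph: node 3 has three inbound edges, node 4 has two.
edges = [(0, 3), (1, 3), (2, 3), (0, 4), (3, 4)]
result = sample_in_edges(edges, seeds=[3, 4], fanout=2, rng=random.Random(0))
for src, dst, eid in result:
    print(f"edge {eid}: {src} -> {dst}")
```

Each returned triple keeps the original edge ID, which is how a caller could look up edge features in the original graph, since features are not copied to the sampled graph.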

Parameters
  • g (DistGraph) – The distributed graph.

  • nodes (tensor or dict) – Node IDs to sample neighbors from. If it’s a dict, it should contain only one key-value pair to make this API consistent with dgl.sampling.sample_neighbors.

  • fanout (int) –

    The number of edges to be sampled for each node.

    If -1 is given, all of the neighbors will be selected.

  • edge_dir (str, optional) –

    Determines whether to sample inbound or outbound edges.

    Can be either 'in' for inbound edges or 'out' for outbound edges.

  • prob (str, optional) –

    Feature name used as the (unnormalized) probabilities associated with each neighboring edge of a node. The feature must have only one element for each edge.

    The features must be non-negative floats, and the sum of the features of inbound/outbound edges for every node must be positive (though they don’t have to sum up to one). Otherwise, the result will be undefined.

  • replace (bool, optional) –

    If True, sample with replacement.

    When sampling with replacement, the sampled subgraph could have parallel edges.

    When sampling without replacement, if fanout exceeds the number of neighbors, all the neighbors are sampled. If fanout == -1, all neighbors are collected.
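The interaction of fanout, prob, and replace for a single node can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions, not the library's algorithm: pick_edges is a hypothetical helper, the weights play the role of the unnormalized prob feature, and the without-replacement branch uses simple sequential weighted draws.

```python
import random

def pick_edges(edge_ids, weights, fanout, replace, rng):
    """Toy model of the fanout / prob / replace rules for one node."""
    edge_ids = list(edge_ids)
    if not edge_ids:
        return []
    if weights is None:
        weights = [1.0] * len(edge_ids)  # no prob feature: uniform sampling
    if fanout == -1:
        return edge_ids  # -1 selects all neighbors
    if replace:
        # Independent draws: the same edge may appear more than once,
        # which corresponds to parallel edges in the sampled graph.
        return rng.choices(edge_ids, weights=weights, k=fanout)
    if fanout >= len(edge_ids):
        return edge_ids  # not enough neighbors: take them all
    # Weighted draws without replacement: pick one edge, remove it, repeat.
    pool = list(zip(edge_ids, weights))
    picked = []
    for _ in range(fanout):
        pool_weights = [w for _, w in pool]
        index = rng.choices(range(len(pool)), weights=pool_weights, k=1)[0]
        picked.append(pool.pop(index)[0])
    return picked

rng = random.Random(0)
ids = [10, 11, 12, 13]
w = [0.5, 2.0, 0.0, 1.0]  # unnormalized; edge 12 has zero weight

all_edges = pick_edges(ids, w, -1, False, rng)
with_replacement = pick_edges(ids, w, 6, True, rng)
without_replacement = pick_edges(ids, w, 2, False, rng)
print(all_edges, with_replacement, without_replacement)
```

The weights do not need to sum to one because each draw normalizes them implicitly. In this sketch an edge with zero weight is never drawn, and the positive-sum requirement matters because a draw over weights that are all zero has no valid outcome.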

Returns

A sampled subgraph containing only the sampled neighboring edges. The subgraph resides on the CPU.

Return type

DGLGraph