🆕 dgl.graphbolt
dgl.graphbolt is a data loading framework for GNNs that provides well-defined APIs for each stage of the data pipeline, along with multiple standard implementations.
Dataset
A dataset is a collection of graph structure data, feature data and tasks.
An abstract dataset which provides abstraction for accessing the data required for training.
An on-disk dataset which reads graph topology, feature data and Train/Validation/Test set from disk.
A utility class to download built-in dataset from AWS S3 and load it as
A Graphbolt dataset for legacy DGLDataset.
An abstract task which consists of meta information and Train/Validation/Test Set.
Graph
A graph is a collection of nodes and edges. It can be a homogeneous graph or a heterogeneous graph.
Class for sampling graph.
A sampling graph in CSC format.
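To make "CSC format" concrete, here is a minimal pure-Python sketch. It does not use the graphbolt API; the toy edge list and the helper names `build_csc` and `in_neighbors` are hypothetical. It assumes the usual convention that columns correspond to destination nodes, so each slice of `indices` holds the source nodes of the edges pointing into one destination.

```python
# Hypothetical toy graph: 4 nodes, 5 directed edges given as (src, dst).
edges = [(1, 0), (2, 0), (0, 1), (3, 2), (0, 3)]
num_nodes = 4


def build_csc(edges, num_nodes):
    """Return (indptr, indices) such that indices[indptr[v]:indptr[v + 1]]
    lists the source nodes of all edges pointing into node v."""
    in_degree = [0] * num_nodes
    for _, dst in edges:
        in_degree[dst] += 1

    # indptr is the running sum of in-degrees, starting at 0.
    indptr = [0]
    for degree in in_degree:
        indptr.append(indptr[-1] + degree)

    # Fill each destination's slice in edge order.
    indices = [0] * len(edges)
    next_slot = indptr[:-1].copy()
    for src, dst in edges:
        indices[next_slot[dst]] = src
        next_slot[dst] += 1
    return indptr, indices


def in_neighbors(indptr, indices, node):
    """Source nodes of the edges entering `node`."""
    return indices[indptr[node]:indptr[node + 1]]


indptr, indices = build_csc(edges, num_nodes)
print(indptr)                             # [0, 2, 3, 4, 5]
print(indices)                            # [1, 2, 0, 3, 0]
print(in_neighbors(indptr, indices, 0))   # [1, 2]
```

Looking up the in-neighbors of a node is a single slice, which is why this layout suits samplers that start from seed nodes and walk along inbound edges.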
Feature and FeatureStore
A feature is a collection of data (tensor, array). A feature store is a collection of features.
A wrapper of feature data for access.
A store to manage multiple features for access.
A basic feature store to manage multiple features for access.
A wrapper of PyTorch-based feature.
A store to manage multiple PyTorch-based features for access.
GPU cached feature wrapping a fallback feature.
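The idea of a cached feature wrapping a fallback feature can be sketched without any GPU. The class below is a hypothetical illustration, not the graphbolt implementation: the name `CachedFeature`, the `read` method, and the least-recently-used eviction policy are all assumptions made for this example, and a plain dictionary stands in for the slower fallback storage.

```python
from collections import OrderedDict


class CachedFeature:
    """Hypothetical small cache in front of a slower fallback feature."""

    def __init__(self, fallback, capacity):
        self._fallback = fallback      # any mapping: id -> feature row
        self._capacity = capacity      # maximum number of cached rows
        self._cache = OrderedDict()    # insertion order tracks recency
        self.hits = 0
        self.misses = 0

    def read(self, ids):
        """Return the rows for `ids`, serving repeated ids from the cache."""
        rows = []
        for i in ids:
            if i in self._cache:
                self.hits += 1
                self._cache.move_to_end(i)   # mark as most recently used
                row = self._cache[i]
            else:
                self.misses += 1
                row = self._fallback[i]      # slow path
                self._cache[i] = row
                if len(self._cache) > self._capacity:
                    self._cache.popitem(last=False)  # evict the oldest entry
            rows.append(row)
        return rows


fallback = {0: [0.0], 1: [1.0], 2: [2.0]}
feature = CachedFeature(fallback, capacity=2)
rows = feature.read([0, 1, 0, 2, 1])
print(feature.hits, feature.misses)   # 1 4
```

Callers see the same rows whether or not the cache is present; only the number of trips to the fallback changes.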
DataLoader
A dataloader iterates over a dataset and generates mini-batches.
Multiprocessing DataLoader.
ItemSet
An item set is an iterable collection of items.
A wrapper of iterable data or tuple of iterable data.
Dictionary wrapper of ItemSet.
ItemSampler
An item sampler is for sampling items from an item set.
A sampler to iterate over input items and create subsets.
A sampler to iterate over input items and create subsets in a distributed manner.
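As a rough illustration of "iterate over input items and create subsets", the sketch below batches an iterable in plain Python. It is not the graphbolt API: the function names, the `shuffle` and `drop_last` options, and the strided split used for the distributed case are assumptions chosen for this example, and the library may partition items across workers differently.

```python
import random


def iterate_batches(items, batch_size, shuffle=False, drop_last=False, seed=None):
    """Yield consecutive subsets (mini-batches) of `items`."""
    order = list(items)
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        batch = order[start:start + batch_size]
        if drop_last and len(batch) < batch_size:
            break  # skip the final, incomplete batch
        yield batch


def shard(items, rank, world_size):
    """Hypothetical strided split: each worker keeps every world_size-th item."""
    return list(items)[rank::world_size]


print(list(iterate_batches(range(10), 4)))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(list(iterate_batches(range(10), 4, drop_last=True)))
# [[0, 1, 2, 3], [4, 5, 6, 7]]
print(shard(range(10), rank=1, world_size=3))
# [1, 4, 7]
```

In a distributed setting each worker would first take its shard and then batch it locally, so that no item is processed by two workers in the same epoch.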
MiniBatch
A mini-batch is a collection of sampled subgraphs and their corresponding features. It is the basic unit for training a GNN model.
A composite data class for data structure in the graphbolt.
A mini-batch transformer used to manipulate mini-batch.
NegativeSampler
A negative sampler is for sampling negative items from mini-batches.
A negative sampler used to generate negative samples and return a mix of positive and negative samples.
Sample negative destination nodes for each source node based on a uniform distribution.
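Uniform negative sampling can be sketched in a few lines of plain Python. This is a conceptual illustration only: `sample_uniform_negatives`, `mix_with_labels`, and the `negative_ratio` parameter are hypothetical names, and the sketch does not check whether a drawn pair happens to be a real edge.

```python
import random


def sample_uniform_negatives(positive_edges, num_nodes, negative_ratio, seed=None):
    """For every positive (src, dst) pair, draw `negative_ratio` destinations
    uniformly at random from all nodes and pair them with the same source."""
    rng = random.Random(seed)
    negatives = []
    for src, _ in positive_edges:
        for _ in range(negative_ratio):
            negatives.append((src, rng.randrange(num_nodes)))
    return negatives


def mix_with_labels(positive_edges, negatives):
    """Concatenate positives and negatives, labelling them 1 and 0."""
    pairs = list(positive_edges) + list(negatives)
    labels = [1] * len(positive_edges) + [0] * len(negatives)
    return pairs, labels


positives = [(0, 1), (2, 3)]
negatives = sample_uniform_negatives(positives, num_nodes=5, negative_ratio=2, seed=0)
pairs, labels = mix_with_labels(positives, negatives)
print(labels)   # [1, 1, 0, 0, 0, 0]
```

Because destinations are drawn from all nodes, a sampled "negative" can occasionally coincide with a true edge; on large sparse graphs this is usually rare enough to tolerate.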
SubgraphSampler
A subgraph sampler is for sampling subgraphs from a graph.
A subgraph sampler used to sample a subgraph from a given set of nodes from a larger graph.
An abstract class for sampled subgraph.
Sample neighbor edges from a graph and return a subgraph.
Sample layer neighbor edges from a graph and return a subgraph.
Sampled subgraph of CSCSamplingGraph.
Sample the subgraph induced on the inbound edges of the given nodes.
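Neighbor sampling over a CSC structure can be illustrated as follows. The function `sample_neighbors` here is a hypothetical pure-Python helper, not the graphbolt method, and the convention that a negative `fanout` keeps every inbound edge is an assumption made for this sketch. With that setting, the result contains all inbound edges of the seed nodes, which is the edge set of the subgraph induced on those inbound edges.

```python
import random

# CSC arrays of a hypothetical toy graph with edges
# 1->0, 2->0, 0->1, 3->2, 0->3 (columns are destination nodes).
indptr = [0, 2, 3, 4, 5]
indices = [1, 2, 0, 3, 0]


def sample_neighbors(indptr, indices, seeds, fanout, seed=None):
    """Pick at most `fanout` inbound edges for every seed node.

    Returns a list of (src, dst) pairs. A negative fanout keeps all
    inbound edges of each seed.
    """
    rng = random.Random(seed)
    sampled = []
    for dst in seeds:
        neighbors = indices[indptr[dst]:indptr[dst + 1]]
        if fanout < 0 or len(neighbors) <= fanout:
            picked = list(neighbors)
        else:
            picked = rng.sample(neighbors, fanout)  # without replacement
        sampled.extend((src, dst) for src in picked)
    return sampled


one_hop = sample_neighbors(indptr, indices, seeds=[0, 2], fanout=1, seed=0)
all_inbound = sample_neighbors(indptr, indices, seeds=[0], fanout=-1)
print(all_inbound)   # [(1, 0), (2, 0)]
```

Multi-layer sampling repeats this step, using the source nodes picked in one layer as the seeds of the next.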
FeatureFetcher
A feature fetcher is for fetching features from a feature store.
A feature fetcher used to fetch features for node/edge in graphbolt.
CopyTo
This datapipe is for copying data to a device.
DataPipe that transfers each element yielded from the previous DataPipe to the given device.
Utilities
Create a FusedCSCSamplingGraph object from a CSC representation.
Load a FusedCSCSamplingGraph object from shared memory.
Convert a DGLGraph to FusedCSCSamplingGraph.
Convert canonical etype from string to tuple.
Convert canonical etype from tuple to string.
Tests if each element of elements is in test_elements.
Set the random seed of Graphbolt.
This function finds the reverse edges of the given edges and returns the composition of them.
Exclude seed edges with or without their reverse edges from the sampled subgraphs in the minibatch.
Relabel the row (source) IDs in the csc formats into a contiguous range from 0 and return the original row node IDs per type.
Compact a list of nodes tensor.
Compact csc formats and return unique nodes (per type).
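The compaction utilities above share one idea: replace arbitrary node IDs with a contiguous range starting at 0 while remembering the original IDs. The sketch below shows that idea on plain Python lists. `compact_nodes` is a hypothetical helper written for this illustration, and assigning new IDs in first-seen order is an assumption; the library may order the unique nodes differently.

```python
def compact_nodes(node_lists):
    """Relabel node IDs across several lists into the range 0..N-1.

    Returns (unique_nodes, compacted), where unique_nodes[new_id] is the
    original ID and `compacted` mirrors `node_lists` with the new IDs.
    """
    mapping = {}        # original ID -> compact ID
    unique_nodes = []   # compact ID -> original ID
    compacted = []
    for nodes in node_lists:
        relabeled = []
        for node in nodes:
            if node not in mapping:
                mapping[node] = len(unique_nodes)
                unique_nodes.append(node)
            relabeled.append(mapping[node])
        compacted.append(relabeled)
    return unique_nodes, compacted


unique_nodes, compacted = compact_nodes([[10, 42, 10], [7, 42]])
print(unique_nodes)   # [10, 42, 7]
print(compacted)      # [[0, 1, 0], [2, 1]]
```

The compact IDs can index directly into a small feature tensor fetched for the mini-batch, and `unique_nodes` tells the feature fetcher which original rows to read.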