CSVDataset

class dgl.data.CSVDataset(data_path, force_reload=False, verbose=True, ndata_parser=None, edata_parser=None, gdata_parser=None, transform=None)

Bases: DGLDataset

Dataset class that loads and parses graph data from CSV files.

This class requires the following additional packages:

  • pyyaml >= 5.4.1

  • pandas >= 1.1.5

  • pydantic >= 1.9.0

The parsed graph and feature data are cached for faster reloading. If the source CSV files have been modified, specify force_reload=True to re-parse them.

Parameters:
  • data_path (str) – Directory that contains 'meta.yaml' and the CSV files.

  • force_reload (bool, optional) – Whether to re-parse the source CSV files instead of loading the cached data. Default: False.

  • verbose (bool, optional) – Whether to print out progress information. Default: True.

  • ndata_parser (dict[str, callable] or callable, optional) – Callable that takes the pandas.DataFrame created from a node CSV file, parses the node data, and returns a dictionary of parsed data. If a dictionary is given, each key is a node type and each value is the callable used to parse data of that node type. If a single callable is given, it is used to parse data of all node types. Default: None, in which case a default parser is applied that loads the data directly and tries to convert lists into arrays.

  • edata_parser (dict[(str, str, str), callable] or callable, optional) – Callable that takes the pandas.DataFrame created from an edge CSV file, parses the edge data, and returns a dictionary of parsed data. If a dictionary is given, each key is an edge type and each value is the callable used to parse data of that edge type. If a single callable is given, it is used to parse data of all edge types. Default: None, in which case a default parser is applied that loads the data directly and tries to convert lists into arrays.

  • gdata_parser (callable, optional) – Callable that takes the pandas.DataFrame created from the graph CSV file, parses the graph data, and returns a dictionary of parsed data. Default: None, in which case a default parser is applied that loads the data directly and tries to convert lists into arrays.

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object is transformed before every access. Default: None.
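The parser contract described above (a pandas.DataFrame in, a dictionary of parsed data out) can be sketched as follows. This is only an illustration, not the library's default parser: the function name parse_node_data, the column names, and the assumption that the 'feat' column stores comma-separated floats are all hypothetical.

```python
import numpy as np
import pandas as pd


def parse_node_data(df: pd.DataFrame) -> dict:
    """Turn a node table into a dict mapping column name to NumPy array."""
    parsed = {}
    for column in df.columns:
        values = df[column].to_numpy()
        if column == "feat":
            # Hypothetical column: each cell holds a comma-separated list of floats.
            values = np.array(
                [[float(x) for x in cell.split(",")] for cell in values],
                dtype=np.float32,
            )
        parsed[column] = values
    return parsed


# Stand-in for the table that would be read from a node CSV file.
nodes = pd.DataFrame(
    {"node_id": [0, 1], "label": [1, 0], "feat": ["0.1,0.2", "0.3,0.4"]}
)
parsed = parse_node_data(nodes)
print(parsed["feat"].shape)
```

Such a callable could then be passed as ndata_parser=parse_node_data to apply it to every node type, or as a dictionary such as {"user": parse_node_data} (with "user" being a made-up node type) to apply it to one type only.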

graphs

Graphs of the dataset

Type:

dgl.DGLGraph

data

Any available graph-level data, such as graph-level features and labels.

Type:

dict

Examples

Please refer to 4.6 Loading data from CSV files.
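As a rough sketch of what a minimal data_path directory might look like, the snippet below writes a 'meta.yaml' and two CSV files using only the standard library. The key names in 'meta.yaml' (dataset_name, node_data, edge_data, file_name) and the column names (node_id, src_id, dst_id) are assumed to follow the layout in the user guide section referenced above; the dataset name, feature columns, and helper function names are made up for illustration.

```python
import csv
import os
import tempfile


def write_minimal_dataset(root):
    """Write meta.yaml, nodes.csv and edges.csv for one small graph."""
    # meta.yaml lists the CSV files; the key names are assumed from the user guide.
    with open(os.path.join(root, "meta.yaml"), "w") as f:
        f.write(
            "dataset_name: toy_dataset\n"
            "edge_data:\n"
            "- file_name: edges.csv\n"
            "node_data:\n"
            "- file_name: nodes.csv\n"
        )
    with open(os.path.join(root, "nodes.csv"), "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["node_id", "label", "feat"])
        # A list-valued feature is stored as one comma-separated string per cell.
        writer.writerow([0, 1, "0.1,0.2"])
        writer.writerow([1, 0, "0.3,0.4"])
    with open(os.path.join(root, "edges.csv"), "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["src_id", "dst_id", "weight"])
        writer.writerow([0, 1, 0.5])
        writer.writerow([1, 0, 0.7])
    return root


def load_dataset(root):
    """Load the directory with CSVDataset (requires DGL and its extra packages)."""
    # Imported lazily so the file-writing part runs without DGL installed.
    from dgl.data import CSVDataset

    # force_reload=True re-parses the CSV files rather than using the cache.
    return CSVDataset(root, force_reload=True)


data_dir = write_minimal_dataset(tempfile.mkdtemp())
print(sorted(os.listdir(data_dir)))
```

With DGL installed, calling load_dataset(data_dir) should return a dataset whose length and items can be inspected with len(dataset) and dataset[0].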

__getitem__(i)

Gets the data object at index i.

__len__()

The number of examples in the dataset.