BGSDataset
- class dgl.data.BGSDataset(print_every=10000, insert_reverse=True, raw_dir=None, force_reload=False, verbose=True, transform=None)
Bases: RDFGraphDataset
BGS dataset for the node classification task.
BGS namespace convention: http://data.bgs.ac.uk/(ref|id)/<Major Concept>/<Sub Concept>/INSTANCE. All literal nodes and the relations connecting them are ignored in the output graph, as is the relation that marks whether a term is CURRENT or DEPRECATED.
BGS dataset statistics:
Nodes: 94806
Edges: 672884 (including reverse edges)
Target Category: Lexicon/NamedRockUnit
Number of Classes: 2
Label Split:
Train: 117
Test: 29
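The namespace convention described above can be illustrated with a small parser. This is a minimal sketch in plain Python, not the preprocessing code DGL itself uses: the names `BGS_URI`, `BGSEntity` and `parse_bgs_uri` are hypothetical, and the instance identifier `EXAMPLE` is made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Pattern mirroring the documented convention:
# http://data.bgs.ac.uk/(ref|id)/<Major Concept>/<Sub Concept>/INSTANCE
BGS_URI = re.compile(
    r"^http://data\.bgs\.ac\.uk/(?P<kind>ref|id)/"
    r"(?P<major>[^/]+)/(?P<sub>[^/]+)/(?P<instance>.+)$"
)


class BGSEntity(NamedTuple):
    """Hypothetical container for the parts of a BGS URI."""
    kind: str           # "ref" or "id"
    major_concept: str  # e.g. "Lexicon"
    sub_concept: str    # e.g. "NamedRockUnit"
    instance: str       # trailing instance identifier


def parse_bgs_uri(uri: str) -> Optional[BGSEntity]:
    """Split a URI into its parts, or return None if it does not match."""
    m = BGS_URI.match(uri)
    if m is None:
        return None
    return BGSEntity(m["kind"], m["major"], m["sub"], m["instance"])


# Made-up instance identifier, used only to exercise the parser.
entity = parse_bgs_uri("http://data.bgs.ac.uk/id/Lexicon/NamedRockUnit/EXAMPLE")
```

Under this reading, the target category `Lexicon/NamedRockUnit` corresponds to the major concept and sub concept joined with a slash.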
- Parameters:
print_every (int) – Log preprocessing progress once every print_every tuples. Default: 10000.
insert_reverse (bool) – If True, add reverse edges and reverse relations to the final graph. Default: True.
raw_dir (str) – Directory into which the raw data is downloaded, or which already contains the input data directory. Default: ~/.dgl/
force_reload (bool) – Whether to reload the dataset. Default: False.
verbose (bool) – Whether to print out progress information. Default: True.
transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
Examples
>>> import dgl
>>> dataset = dgl.data.rdf.BGSDataset()
>>> graph = dataset[0]
>>> category = dataset.predict_category
>>> num_classes = dataset.num_classes
>>>
>>> train_mask = graph.nodes[category].data['train_mask']
>>> test_mask = graph.nodes[category].data['test_mask']
>>> label = graph.nodes[category].data['label']