BitcoinOTCDataset

class dgl.data.BitcoinOTCDataset(raw_dir=None, force_reload=False, verbose=False, transform=None)[source]

Bases: DGLBuiltinDataset

BitcoinOTC dataset for fraud detection

This is a who-trusts-whom network of people who trade using Bitcoin on a platform called Bitcoin OTC. Since Bitcoin users are anonymous, there is a need to maintain a record of users’ reputation to prevent transactions with fraudulent and risky users.

Official website: https://snap.stanford.edu/data/soc-sign-bitcoin-otc.html

Bitcoin OTC dataset statistics:

  • Nodes: 5,881

  • Edges: 35,592

  • Range of edge weights: -10 to +10

  • Percentage of positive edges: 89%

Parameters:
  • raw_dir (str) – Directory where the input data is downloaded to and stored. Default: ~/.dgl/

  • force_reload (bool) – Whether to reload the dataset. Default: False

  • verbose (bool) – Whether to print out progress information. Default: False

  • transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.

graphs

A list of DGLGraph objects

Type:

list

is_temporal

Indicates whether the graphs are temporal graphs

Type:

bool

Raises:

UserWarning – If the raw data has been changed on the remote server by the author.

Examples

>>> from dgl.data import BitcoinOTCDataset
>>> dataset = BitcoinOTCDataset()
>>> len(dataset)
136
>>> for g in dataset:
...     # get edge feature
...     edge_weights = g.edata['h']
...     # your code here
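The transform argument can be any callable that takes a DGLGraph and returns one. Because the transform runs before every access, it should avoid modifying the stored graph in place. The following is a minimal sketch, not part of DGL: scale_edge_weights and load_scaled_dataset are hypothetical helpers, and the sketch assumes that local_var() returns a graph whose feature reassignment leaves the original untouched. It rescales edata['h'] from the documented range of -10 to +10 down to -1 to +1.

```python
def scale_edge_weights(g, max_abs=10.0):
    """Hypothetical transform: rescale edata['h'] from [-10, 10] to [-1, 1]."""
    # Work on a local view so repeated accesses do not rescale
    # the stored graph again and again.
    g = g.local_var()
    # True division turns integer weights into floating-point values.
    g.edata['h'] = g.edata['h'] / max_abs
    return g


def load_scaled_dataset():
    """Hypothetical helper: build the dataset with the transform attached."""
    # Imported lazily so the transform above can be defined without DGL.
    from dgl.data import BitcoinOTCDataset
    return BitcoinOTCDataset(transform=scale_edge_weights)
```

With this in place, each graph returned by load_scaled_dataset() should carry rescaled weights in edata['h'], while the graphs held in the dataset's graphs attribute keep their original values.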
__getitem__(item)[source]

Get graph by index

Parameters:

item (int) – Item index

Returns:

The graph contains:

  • edata['h'] : edge weights

Return type:

dgl.DGLGraph
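As an illustration of working with the returned graph, the sketch below fetches one graph by index and reports the share of positive trust ratings among its edges. The names positive_fraction and summarize_graph are hypothetical and not part of DGL, and the tensor-to-list conversion assumes the PyTorch backend.

```python
def positive_fraction(weights):
    """Fraction of strictly positive entries in a flat iterable of weights."""
    values = [float(w) for w in weights]
    if not values:
        return 0.0
    return sum(1 for v in values if v > 0) / len(values)


def summarize_graph(index=0):
    """Hypothetical helper: (number of edges, positive share) for one graph."""
    # Imported lazily so positive_fraction can be used without DGL.
    from dgl.data import BitcoinOTCDataset
    dataset = BitcoinOTCDataset()
    g = dataset[index]  # dgl.DGLGraph with edge weights in edata['h']
    # Assumes the PyTorch backend: flatten the tensor and convert to a list.
    weights = g.edata['h'].flatten().tolist()
    return g.num_edges(), positive_fraction(weights)
```

The share computed for a single graph need not match the dataset-wide figure of 89% positive edges quoted above, since that statistic describes the whole network.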

__len__()[source]

Number of graphs in the dataset.

Return type:

int