Model Zoo

Chemistry

Utils

chem.load_pretrained(model_name[, log]) – Load a pretrained model

Property Prediction

Currently supported model architectures:

  • GCNClassifier
  • GATClassifier
  • MPNN
  • SchNet
  • MGCN
class dgl.model_zoo.chem.GCNClassifier(in_feats, gcn_hidden_feats, n_tasks, classifier_hidden_feats=128, dropout=0.0)

GCN-based predictor for multitask prediction on molecular graphs. Each task is assumed to be a binary classification.

Parameters:
  • in_feats (int) – Number of input atom features
  • gcn_hidden_feats (list of int) – gcn_hidden_feats[i] gives the number of output atom features of the (i+1)-th GCN layer
  • n_tasks (int) – Number of prediction tasks
  • classifier_hidden_feats (int) – Number of molecular graph features in hidden layers of the MLP Classifier
  • dropout (float) – Dropout probability. Defaults to 0, i.e. no dropout is performed.
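
To make the role of gcn_hidden_feats concrete, here is a minimal sketch in plain Python of how the list could translate into per-layer feature sizes. It assumes the GCN layers are stacked so that each layer consumes the previous layer's output; the helper gcn_layer_dims and the sizes used are hypothetical and not part of DGL.

```python
def gcn_layer_dims(in_feats, gcn_hidden_feats):
    """Return the (input, output) feature sizes of each stacked GCN layer.

    Hypothetical helper for illustration only; it is not a DGL function.
    """
    dims = []
    prev = in_feats
    for out_feats in gcn_hidden_feats:
        dims.append((prev, out_feats))
        prev = out_feats  # the next layer consumes this layer's output
    return dims


# Hypothetical sizes: 74 input atom features and two GCN layers of 64 units.
print(gcn_layer_dims(74, [64, 64]))  # [(74, 64), (64, 64)]
```

The length of gcn_hidden_feats therefore fixes the number of GCN layers, and its last entry is the atom feature size handed to the rest of the model.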
forward(bg, feats)

Multi-task prediction for a batch of molecules

Parameters:
  • bg (BatchedDGLGraph) – A batch of B DGLGraphs, so that multiple molecules are processed in parallel
  • feats (FloatTensor of shape (N, M0)) – Initial features for all N atoms in the batch of molecules, each of size M0
Returns:

Soft prediction for all tasks on the batch of molecules

Return type:

FloatTensor of shape (B, n_tasks)
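
The shapes above mix two different counts: N is the total number of atoms across the batch, while B is the number of molecules. The following sketch in plain Python shows how per-molecule atom features could be stacked into an (N, M0) table and then reduced to one row per molecule. The helper names are hypothetical, and the sum readout is chosen only for illustration; the readout actually used by the classifier is not described here.

```python
def batch_atom_features(molecule_feats):
    """Stack atom feature rows of all molecules and track their owners.

    Hypothetical helper for illustration only; it is not a DGL function.
    """
    feats = []      # N rows in total, each of length M0
    graph_ids = []  # graph_ids[k] is the molecule index of row k
    for mol_idx, atoms in enumerate(molecule_feats):
        for atom in atoms:
            feats.append(list(atom))
            graph_ids.append(mol_idx)
    return feats, graph_ids


def sum_readout(feats, graph_ids, num_graphs):
    """Sum atom rows per molecule, giving one row per molecule (B rows)."""
    width = len(feats[0])
    out = [[0.0] * width for _ in range(num_graphs)]
    for row, gid in zip(feats, graph_ids):
        for j, value in enumerate(row):
            out[gid][j] += value
    return out


mols = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # molecule 0: 3 atoms
    [[2.0, 2.0], [0.5, 0.5]],              # molecule 1: 2 atoms
]
feats, graph_ids = batch_atom_features(mols)
print(len(feats), len(feats[0]))  # N = 5 atoms, M0 = 2 features
pooled = sum_readout(feats, graph_ids, num_graphs=len(mols))
print(len(pooled))                # B = 2 molecules
```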

class dgl.model_zoo.chem.GATClassifier(in_feats, gat_hidden_feats, num_heads, n_tasks, classifier_hidden_feats=128, dropout=0)

GAT-based predictor for multitask prediction on molecular graphs. Each task is assumed to be a binary classification.

Parameters:
  • in_feats (int) – Number of input atom features
forward(bg, feats)

Multi-task prediction for a batch of molecules

Parameters:
  • bg (BatchedDGLGraph) – A batch of B DGLGraphs, so that multiple molecules are processed in parallel
  • feats (FloatTensor of shape (N, M0)) – Initial features for all N atoms in the batch of molecules, each of size M0
Returns:

Soft prediction for all tasks on the batch of molecules

Return type:

FloatTensor of shape (B, n_tasks)
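
Since each task is a binary classification, the (B, n_tasks) output usually has to be turned into 0/1 decisions per task. The sketch below does this in plain Python under a loud assumption: that the soft predictions are raw per-task logits to which a sigmoid has not yet been applied. The documentation above does not confirm this, so check the model output before relying on it. The helper names and the 0.5 threshold are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def to_binary_labels(scores, threshold=0.5):
    """Map a B x n_tasks table of scores to 0/1 labels per task.

    Assumes the scores are logits (an assumption, not a documented fact).
    """
    return [[1 if sigmoid(s) >= threshold else 0 for s in row] for row in scores]


# Hypothetical scores for B = 2 molecules and n_tasks = 3.
scores = [[2.0, -1.0, 0.0], [-3.0, 0.5, 4.0]]
labels = to_binary_labels(scores)
print(labels)  # [[1, 0, 1], [0, 1, 1]]
```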

class dgl.model_zoo.chem.MPNNModel(node_input_dim=15, edge_input_dim=5, output_dim=12, node_hidden_dim=64, edge_hidden_dim=128, num_step_message_passing=6, num_step_set2set=6, num_layer_set2set=3)

MPNN from Neural Message Passing for Quantum Chemistry

Parameters:
  • node_input_dim (int) – Dimension of the input node features. Defaults to 15.
  • edge_input_dim (int) – Dimension of the input edge features. Defaults to 5.
  • output_dim (int) – Dimension of the prediction. Defaults to 12.
  • node_hidden_dim (int) – Dimension of the node features in hidden layers. Defaults to 64.
  • edge_hidden_dim (int) – Dimension of the edge features in hidden layers. Defaults to 128.
  • num_step_message_passing (int) – Number of message passing steps. Defaults to 6.
  • num_step_set2set (int) – Number of set2set steps. Defaults to 6.
  • num_layer_set2set (int) – Number of set2set layers. Defaults to 3.
forward(g, n_feat, e_feat)

Predict molecule labels

Parameters:
  • g (DGLGraph) – Input DGLGraph for molecule(s)
  • n_feat (tensor of dtype float32 and shape (B1, D1)) – Node features. B1 for number of nodes and D1 for the node feature size.
  • e_feat (tensor of dtype float32 and shape (B2, D2)) – Edge features. B2 for number of edges and D2 for the edge feature size.
Returns:

res – Predicted labels

class dgl.model_zoo.chem.SchNet(dim=64, cutoff=5.0, output_dim=1, width=1, n_conv=3, norm=False, atom_ref=None, pre_train=None)

SchNet: A continuous-filter convolutional neural network for modeling quantum interactions (NIPS 2017)

Parameters:
  • dim (int) – Size of the atom embeddings. Defaults to 64.
  • cutoff (float) – Radius cutoff for the RBF. Defaults to 5.0.
  • output_dim (int) – Number of target properties to predict. Defaults to 1.
  • width (int) – Width of the RBF. Defaults to 1.
  • n_conv (int) – Number of conv (interaction) layers. Defaults to 3.
  • norm (bool) – Whether to normalize the output atom representations. Defaults to False.
  • atom_ref (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Default to be None.
  • pre_train (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Default to be None.
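
To give a feel for what cutoff and width control, here is a minimal sketch in plain Python of a radial basis function (RBF) expansion of an interatomic distance. It assumes Gaussian basis functions centered on a regular grid from 0 to cutoff with spacing width; the exact formula used inside the model is not given in this documentation, and the helper rbf_expand is hypothetical.

```python
import math


def rbf_expand(distance, cutoff=5.0, width=1.0):
    """Expand a scalar distance into Gaussian RBF features.

    Assumed form for illustration: centers at 0, width, 2*width, ..., cutoff
    and features exp(-(distance - center)^2 / width).
    """
    num_centers = int(math.ceil(cutoff / width)) + 1
    centers = [i * width for i in range(num_centers)]
    return [math.exp(-((distance - c) ** 2) / width) for c in centers]


features = rbf_expand(2.0)
print(len(features))  # 6 centers: 0, 1, 2, 3, 4, 5
print(features[2])    # 1.0, because the distance sits exactly on the center at 2
```

Under this assumption, a smaller width yields more centers and a finer distance resolution, while cutoff bounds the range of distances the basis covers.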
forward(g, atom_types, edge_distances)

Predict molecule labels

Parameters:
  • g (DGLGraph) – Input DGLGraph for molecule(s)
  • atom_types (int64 tensor of shape (B1)) – Types for atoms in the graph(s), B1 for the number of atoms.
  • edge_distances (float32 tensor of shape (B2, 1)) – Edge distances, B2 for the number of edges.
Returns:

prediction – Model prediction for the batch of graphs, B for the number of graphs, output_dim for the prediction size.

Return type:

float32 tensor of shape (B, output_dim)

class dgl.model_zoo.chem.MGCNModel(dim=128, width=1, cutoff=5.0, edge_dim=128, output_dim=1, n_conv=3, norm=False, atom_ref=None, pre_train=None)

Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective

Parameters:
  • dim (int) – Size of the embeddings. Defaults to 128.
  • width (int) – Width of the RBF layer. Defaults to 1.
  • cutoff (float) – The maximum distance between nodes. Defaults to 5.0.
  • edge_dim (int) – Size of the edge embeddings. Defaults to 128.
  • output_dim (int) – Number of target properties to predict. Defaults to 1.
  • n_conv (int) – Number of convolutional layers. Defaults to 3.
  • norm (bool) – Whether to perform normalization. Defaults to False.
  • atom_ref (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Default to be None.
  • pre_train (Atom embeddings or None) – If None, random representation initialization will be used. Otherwise, they will be used to initialize atom representations. Default to be None.
forward(g, atom_types, edge_distances)

Predict molecule labels

Parameters:
  • g (DGLGraph) – Input DGLGraph for molecule(s)
  • atom_types (int64 tensor of shape (B1)) – Types for atoms in the graph(s), B1 for the number of atoms.
  • edge_distances (float32 tensor of shape (B2, 1)) – Edge distances, B2 for the number of edges.
Returns:

prediction – Model prediction for the batch of graphs, B for the number of graphs, output_dim for the prediction size.

Return type:

float32 tensor of shape (B, output_dim)

Generative Models

Currently supported model architectures:

  • DGMG
  • JTNN
class dgl.model_zoo.chem.DGMG(atom_types, bond_types, node_hidden_size, num_prop_rounds, dropout)

DGMG model

Learning Deep Generative Models of Graphs

Users only need to initialize an instance of this class.

Parameters:
  • atom_types (list) – E.g. ['C', 'N']
  • bond_types (list) – E.g. [Chem.rdchem.BondType.SINGLE, Chem.rdchem.BondType.DOUBLE, Chem.rdchem.BondType.TRIPLE, Chem.rdchem.BondType.AROMATIC]
  • node_hidden_size (int) – Size of atom representation
  • num_prop_rounds (int) – Number of message passing rounds performed each time messages are propagated
  • dropout (float) – Probability for dropout
forward(actions=None, rdkit_mol=False, compute_log_prob=False, max_num_steps=400)
Parameters:
  • actions (list of 2-tuples or None) – If actions is not None, a molecule is generated according to the given actions. Otherwise, a molecule is generated from sampled actions.
  • rdkit_mol (bool) – Whether to maintain a Chem.rdchem.Mol object. This adds computational cost, but is necessary if the generated molecule itself is needed.
  • compute_log_prob (bool) – Whether to compute the log likelihood
  • max_num_steps (int) – Maximum number of steps allowed. This only takes effect during inference and guarantees that generation terminates.
Returns:

  • torch.tensor holding a single float, optional – The log likelihood of the actions taken
  • str, optional – The generated molecule in the form of SMILES
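
The purpose of max_num_steps can be illustrated with a small sketch in plain Python: a generation loop that samples one action per step and is cut off after a fixed number of steps, so it terminates even if the sampler never chooses to stop. The function generate, the sample_action callback and the action names are hypothetical and do not reflect DGMG's actual action encoding.

```python
def generate(sample_action, max_num_steps=400):
    """Collect sampled actions until "stop" is sampled or the step cap is hit.

    sample_action is a hypothetical callback that receives the number of
    actions taken so far and returns the next action.
    """
    actions = []
    for _ in range(max_num_steps):
        action = sample_action(len(actions))
        if action == "stop":
            break
        actions.append(action)
    return actions


# A sampler that never stops is still cut off at max_num_steps.
capped = generate(lambda step: "add_atom", max_num_steps=10)
print(len(capped))  # 10

# A sampler that stops on its own finishes early.
early = generate(lambda step: "stop" if step >= 3 else "add_atom")
print(len(early))   # 3
```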

class dgl.model_zoo.chem.DGLJTNNVAE(hidden_size, latent_size, depth, vocab=None, vocab_file=None)

Junction Tree Variational Autoencoder for Molecular Graph Generation

forward(mol_batch, beta=0, e1=None, e2=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.