models

Functions to build BPNet-style models.

bpreveal.models.soloModel(inputLength, outputLength, numFilters, numLayers, inputFilterWidth, outputFilterWidth, headList, modelName)

Generate a model using the classic BPNet architecture.

Parameters:
  • inputLength (int) – The length of the one-hot encoded DNA sequence.

  • outputLength (int) – The length of the predicted profile.

  • numFilters (int) – The number of convolutional filters used at each layer.

  • numLayers (int) – The number of dilated convolutions.

  • inputFilterWidth (int) – The width of the first convolutional layer, the one looking for motifs.

  • outputFilterWidth (int) – The width of the profile head convolutional filter at the very bottom of the network.

  • headList (list[dict]) – Taken directly from a <head-list> in the configuration JSON.

  • modelName (str) – The name you want this model to have when saved.

Returns:

A TF model.

Return type:

tf_keras.models.Model

Input to this model is a (batch x inputLength x NUM_BASES) tensor of one-hot encoded DNA. Output is a list in which all profile predictions come first, followed by all count predictions: (profilePreds, profilePreds, profilePreds, …, countPreds, countPreds, countPreds, …). profilePreds is a tensor of shape (batch x numTasks x outputLength), containing the logits of the profile values for each task. countPreds is a tensor of shape (batch x numTasks) containing the log counts for each task.
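The flat output list can be awkward to index by hand. The sketch below shows one way to regroup it into per-head pairs. It assumes one profilePreds and one countPreds entry per head, in the same head order in both halves of the list; the helper name split_outputs is hypothetical and is not part of bpreveal.

```python
def split_outputs(outputs, num_heads):
    """Regroup a flat [profile..., counts...] list into (profile, counts) pairs.

    ASSUMPTION: the list holds num_heads profile entries followed by
    num_heads count entries, with matching head order in both halves.
    """
    if len(outputs) != 2 * num_heads:
        raise ValueError(
            f"expected {2 * num_heads} outputs, got {len(outputs)}")
    profiles = outputs[:num_heads]   # first half: profile logits per head
    counts = outputs[num_heads:]     # second half: log counts per head
    return list(zip(profiles, counts))


# Placeholder strings stand in for the real tensors.
example_outputs = ["profile0", "profile1", "counts0", "counts1"]
pairs = split_outputs(example_outputs, num_heads=2)
```

Here pairs[0] holds the profile and count predictions for the first head, and so on.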

It is an error to call this function with an inconsistent network structure, such as an inputLength that is too long for the requested outputLength and layer configuration.
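To check the lengths before building a model, the arithmetic can be sketched as follows. This is not taken from the bpreveal source: it assumes unpadded ("valid") convolutions, dilated layers with kernel width 3, and a dilation that doubles at each layer starting at 2, which is the classic BPNet layout. The function name required_input_length is hypothetical.

```python
def required_input_length(output_length, num_layers,
                          input_filter_width, output_filter_width):
    """Estimate the input length a BPNet-style network needs.

    ASSUMPTIONS (not confirmed by this documentation):
      * every convolution is unpadded, so it trims (effective width - 1) bases;
      * dilated layers use kernel width 3 with dilation 2**i for i = 1..num_layers.
    """
    length = output_length
    # Profile head convolution at the bottom of the network.
    length += output_filter_width - 1
    # Each dilated layer trims `dilation` bases from both ends.
    for layer in range(1, num_layers + 1):
        dilation = 2 ** layer
        length += 2 * dilation
    # First (motif-scanning) convolution.
    length += input_filter_width - 1
    return length


needed = required_input_length(output_length=1000, num_layers=8,
                               input_filter_width=21, output_filter_width=75)
print(needed)  # 2114
```

Under these assumptions, passing an inputLength different from the computed value would be an inconsistent structure.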

bpreveal.models.transformationModel(soloModelIn, profileArchitectureSpecification, countsArchitectureSpecification, headList)

Construct a simple model used to regress out the solo model from experimental data.

Given a solo model (typically representing bias), generate a simple network that can be used to transform the solo model’s output into the experimental data. That is, experimental = f(bias), where f is a simple function such as y = mx + b. When you train the model returned by this function, you are training the parameters of that function (m and b in this example). Note that this function sets the solo model to non-trainable: the goal is not to improve the bias model, but to transform the solo model’s output so that it looks like the experimental data.

Parameters:
  • soloModelIn (tf_keras.models.Model) – A Keras model that you’d like to transform.

  • profileArchitectureSpecification (dict) – Straight from the config JSON.

  • countsArchitectureSpecification (dict) – Straight from the config JSON.

  • headList (list[dict]) – Also from the config JSON.

Returns:

A Keras model with the same output shape as the solo model.

Return type:

tf_keras.models.Model
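The experimental = f(bias) idea can be illustrated without any neural network. The sketch below holds the bias values fixed and fits only m and b by ordinary least squares. It is a stand-in for illustration: the real transformation model is trained by Keras, and the function name and toy data here are hypothetical.

```python
def fit_linear_transform(bias, experimental):
    """Fit experimental ~= m * bias + b by ordinary least squares.

    The bias values are treated as fixed inputs (analogous to a
    non-trainable solo model); only m and b are estimated.
    """
    n = len(bias)
    mean_x = sum(bias) / n
    mean_y = sum(experimental) / n
    var_x = sum((x - mean_x) ** 2 for x in bias)
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(bias, experimental))
    m = cov_xy / var_x
    b = mean_y - m * mean_x
    return m, b


# Toy data generated from y = 2x + 1, standing in for solo-model output
# and the experimental signal.
bias_output = [0.0, 1.0, 2.0, 3.0]
experimental = [1.0, 3.0, 5.0, 7.0]
m, b = fit_linear_transform(bias_output, experimental)
print(m, b)  # 2.0 1.0
```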

bpreveal.models.combinedModel(inputLength, outputLength, numFilters, numLayers, inputFilterWidth, outputFilterWidth, headList, biasModel)

Build a combined model.

This builds a standard BPNet model, but then adds in the bias at the very end:

    ,-----------------SEQUENCE------------------,
    V                                           ,
Cropdown step                                   V
_____________                           _________________
| SOLO MODEL|                           | RESIDUAL MODEL|
|___________|                           |_______________|
     |                                          |
_____V_______          _______                  |
| TRANSFORM |--------> | ADD |<-----------------'
|___________|          |_____|
                          |
                     _____V_____
                     |COMBINED |
                     |_________|

Since you’ll usually want to isolate the bias-free model (AKA residual model), that is returned separately.

Parameters:
  • inputLength (int) – The length of the one-hot encoded DNA sequence (which must be the same for the bias model and the residual model).

  • outputLength (int) – The length of the predicted profile.

  • numFilters (int) – The number of convolutional filters used at each layer in the residual model.

  • numLayers (int) – The number of dilated convolutions in the residual model.

  • inputFilterWidth (int) – The width of the first convolutional layer in the residual model, the one looking for motifs.

  • outputFilterWidth (int) – The width of the profile head convolutional filter in the residual model at the very bottom of the network.

  • headList (list[dict]) – Taken straight from the configuration JSON.

  • biasModel (tf_keras.models.Model) – A Keras model that goes from sequence to transformed bias. This is the model that is saved when you generate the transformation model; internally, it comprises both the solo model and a transformation.

Returns:

Three Keras models.

  • The first is the combined output, i.e., the COMBINED node in the graph above. Input to this model is a (batch x inputLength x NUM_BASES) tensor of one-hot encoded DNA. Output is a list in which all profile predictions come first, followed by all count predictions: (profilePreds, profilePreds, profilePreds, …, countPreds, countPreds, countPreds, …). profilePreds is a tensor of shape (batch x numTasks x outputLength), containing the logits of the profile values for each task. countPreds is a tensor of shape (batch x numTasks) containing the log counts for each task.

  • The second is the bias-free model, RESIDUAL MODEL in the graph above. It has the same input and output shapes as the COMBINED model.

  • The final model is the solo model, just in case you need it.

Return type:

tuple[tf_keras.models.Model, tf_keras.models.Model, tf_keras.models.Model]

It is an error to call this function with an inconsistent network structure, such as an inputLength that is too long for the requested outputLength and layer configuration.
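The ADD node in the diagram for combinedModel can be pictured with plain numbers. The sketch below assumes profile logits are summed elementwise. How the count outputs are combined is not stated here, so the log-sum-exp shown (adding counts in linear space, then taking the log) is only one plausible choice and is labeled as an assumption. Both function names are hypothetical.

```python
import math


def combine_profile_logits(transformed_bias_logits, residual_logits):
    """Elementwise sum of two logit vectors (ASSUMED behavior of ADD)."""
    return [b + r for b, r in zip(transformed_bias_logits, residual_logits)]


def combine_log_counts(bias_log_counts, residual_log_counts):
    """Combine two log-count values by adding counts in linear space.

    ASSUMPTION: log(exp(bias) + exp(residual)); the documentation above
    does not say how counts are merged. Computed stably by factoring
    out the larger term.
    """
    larger = max(bias_log_counts, residual_log_counts)
    return larger + math.log(math.exp(bias_log_counts - larger)
                             + math.exp(residual_log_counts - larger))


combined_logits = combine_profile_logits([0.5, -1.0], [1.0, 2.0])
combined_counts = combine_log_counts(math.log(10.0), math.log(30.0))
```

With these toy inputs, combined_logits is [1.5, 1.0] and combined_counts equals log(40) up to floating-point error.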