Workflows

Without bias removal

For cases where you don’t have a bias track you want to regress out, here are the steps:

  1. Prepare bigwig tracks and select the regions you are interested in. There are some utilities for this in BPReveal (prepareBed), but you are mostly on your own for this stage.

  2. Prepare training data files with prepareTrainingData. These hdf5 files contain the sequences and experimental profiles for all the heads in your model.

  3. Train a solo model with trainSoloModel.

  4. Measure the performance of your model with metrics.

  5. Make predictions from the model with makePredictions.

  6. Generate importance scores, either one-dimensionally with interpretFlat, or by making two-dimensional PISA plots with interpretPisa.

  7. Run MoDISco to extract motifs (using shapToNumpy and the external tfmodisco-lite package).

  8. Use the motif mapping tools motifSeqletCutoffs, motifScan and motifAddQuantiles to map the discovered motifs back to the genome.
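Before starting a run, it can be useful to confirm that every tool named in the steps above is available. The following is a minimal sketch, not part of BPReveal: it assumes the tools are installed as executables on your PATH under the names used above, which may not hold for every installation. It only looks the names up and does not run any of them.

```python
import shutil

# Tool names as they appear in the steps above, in the order they are used.
# Step 7 also relies on the external tfmodisco-lite package, which is not
# checked here.
SOLO_WORKFLOW = [
    "prepareBed",
    "prepareTrainingData",
    "trainSoloModel",
    "metrics",
    "makePredictions",
    "interpretFlat",
    "interpretPisa",
    "shapToNumpy",
    "motifSeqletCutoffs",
    "motifScan",
    "motifAddQuantiles",
]


def missing_tools(tools):
    """Return the tools that cannot be found on PATH, keeping their order."""
    # Assumption: each tool is an executable on PATH with exactly this name.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    for number, tool in enumerate(SOLO_WORKFLOW, start=1):
        status = "missing" if missing_tools([tool]) else "found"
        print(f"{number:2d}. {tool}: {status}")
```

If any tool is reported as missing, check that the environment in which BPReveal is installed is active before moving on to step 1.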

With bias removal

If you do have strong experimental biases, you will need to regress them out. In that case, the workflow is the following:

  1. Prepare bigwig tracks and select the regions you are interested in with prepareBed.

  2. Prepare a data file containing bias using prepareTrainingData. The bias regions may be uninteresting regions in the genome, or you may train on your regions of interest but use an experimental control for the data.

  3. Train a bias (AKA solo) model with trainSoloModel.

  4. Train a transformation model to match the bias model to the experimental data, using trainTransformationModel.

  5. Train a residual model to explain non-bias parts of the experimental data, using trainCombinedModel.

  6. Measure the performance of the full model with metrics.

  7. Make predictions from the full model and residual model with makePredictions.

  8. Generate importance scores from the residual model with interpretFlat.

  9. Run MoDISco, using scores generated by shapToNumpy.

  10. Map the discovered motifs back to the genome using motifSeqletCutoffs, motifScan and motifAddQuantiles.
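Because each stage in this workflow builds on an earlier one, retraining a model means rerunning everything that depends on it. The sketch below writes the steps as a small dependency graph and uses it to list a valid run order and the steps affected by a change. The edges are one reading of the list above, not something taken from BPReveal itself, and the helper names (run_order, downstream_of) are hypothetical. It uses graphlib from the Python standard library, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each key is a step from the list above; its value is the set of steps
# whose output it needs. These edges are an assumption drawn from the
# order of the list, not from BPReveal's documentation.
BIAS_WORKFLOW = {
    "prepareBed": set(),
    "prepareTrainingData": {"prepareBed"},
    "trainSoloModel": {"prepareTrainingData"},
    "trainTransformationModel": {"trainSoloModel"},
    "trainCombinedModel": {"trainTransformationModel"},
    "metrics": {"trainCombinedModel"},
    "makePredictions": {"trainCombinedModel"},
    "interpretFlat": {"trainCombinedModel"},
    "shapToNumpy": {"interpretFlat"},
    "MoDISco": {"shapToNumpy"},
    "motifSeqletCutoffs": {"MoDISco"},
    "motifScan": {"motifSeqletCutoffs"},
    "motifAddQuantiles": {"motifScan"},
}


def run_order(workflow):
    """Return one order in which every step follows its prerequisites."""
    return list(TopologicalSorter(workflow).static_order())


def downstream_of(workflow, step):
    """Return every step that directly or indirectly depends on `step`."""
    affected = set()
    changed = True
    while changed:  # repeat until no new dependent step is found
        changed = False
        for name, needs in workflow.items():
            if name not in affected and (step in needs or needs & affected):
                affected.add(name)
                changed = True
    return affected


if __name__ == "__main__":
    print(" -> ".join(run_order(BIAS_WORKFLOW)))
    print(sorted(downstream_of(BIAS_WORKFLOW, "trainSoloModel")))
```

For example, under these assumed edges, retraining the bias model with trainSoloModel marks the transformation model, the combined model, and every later step as needing a rerun, while prepareBed and prepareTrainingData are unaffected.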