Workflows
=========

Without bias removal
--------------------

If you don't have a bias track that you want to regress out, follow these
steps:

1. Prepare bigwig tracks and select the regions you are interested in.
   There are some utilities for this in BPReveal (:py:mod:`prepareBed<bpreveal.prepareBed>`),
   but you are mostly on your own for this stage.
2. Prepare training data files with :py:mod:`prepareTrainingData<bpreveal.prepareTrainingData>`.
   These hdf5 files contain the sequences and experimental profiles for
   all the heads in your model.
3. Train a solo model with :py:mod:`trainSoloModel<bpreveal.trainSoloModel>`.
4. Measure the performance of your model with :py:mod:`metrics<bpreveal.metrics>`.
5. Make predictions from the model with :py:mod:`makePredictions<bpreveal.makePredictions>`.
6. Generate importance scores, either one-dimensionally, with
   :py:mod:`interpretFlat<bpreveal.interpretFlat>`, or by making two-dimensional PISA plots with
   :py:mod:`interpretPisa<bpreveal.interpretPisa>`.
7. Run MoDISco to extract motifs (using :py:mod:`shapToNumpy<bpreveal.shapToNumpy>` and the
   external ``tfmodisco-lite`` package).
8. Use the motif mapping tools :py:mod:`motifSeqletCutoffs<bpreveal.motifSeqletCutoffs>`,
   :py:mod:`motifScan<bpreveal.motifScan>` and :py:mod:`motifAddQuantiles<bpreveal.motifAddQuantiles>` to map the
   discovered motifs back to the genome.
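
Before starting a long run, it can help to confirm that every tool named in
the steps above is available in your environment. The following is a minimal
sketch, not part of BPReveal: the ``SOLO_WORKFLOW`` list and the
``missing_modules`` helper are hypothetical names introduced here, and only
the module names themselves come from the list above. Note that looking up a
dotted name imports its parent package, so the check imports ``bpreveal``
itself if it is installed.

.. code-block:: python

    import importlib.util

    # Module names copied from the workflow steps above, in order of use.
    # The list name is hypothetical; BPReveal does not define it.
    SOLO_WORKFLOW = [
        "bpreveal.prepareBed",
        "bpreveal.prepareTrainingData",
        "bpreveal.trainSoloModel",
        "bpreveal.metrics",
        "bpreveal.makePredictions",
        "bpreveal.interpretFlat",
        "bpreveal.interpretPisa",
        "bpreveal.shapToNumpy",
        "bpreveal.motifSeqletCutoffs",
        "bpreveal.motifScan",
        "bpreveal.motifAddQuantiles",
    ]


    def missing_modules(names):
        """Return the entries of ``names`` that cannot be located."""
        missing = []
        for name in names:
            try:
                spec = importlib.util.find_spec(name)
            except ImportError:
                # Raised when the parent package itself is not installed.
                spec = None
            if spec is None:
                missing.append(name)
        return missing


    if __name__ == "__main__":
        absent = missing_modules(SOLO_WORKFLOW)
        if absent:
            print("Not found:", ", ".join(absent))
        else:
            print("All workflow modules were found.")

This only checks that the modules can be located. It says nothing about how
each tool is invoked or configured, which is covered in the documentation for
each module.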

With bias removal
-----------------

If you do have strong experimental biases, you will need to regress them out.
In that case, the workflow is as follows:

1.  Prepare bigwig tracks and select the regions you are interested in with
    :py:mod:`prepareBed<bpreveal.prepareBed>`.
2.  Prepare a data file containing bias using
    :py:mod:`prepareTrainingData<bpreveal.prepareTrainingData>`. The bias
    regions may be uninteresting regions in the genome, or you may train on
    your regions of interest but use an experimental control for the data.
3.  Train a bias (AKA solo) model with
    :py:mod:`trainSoloModel<bpreveal.trainSoloModel>`.
4.  Train a transformation model to match the bias model to the experimental
    data, using
    :py:mod:`trainTransformationModel<bpreveal.trainTransformationModel>`.
5.  Train a residual model to explain non-bias parts of the experimental data,
    using :py:mod:`trainCombinedModel<bpreveal.trainCombinedModel>`.
6.  Measure the performance of the full model with
    :py:mod:`metrics<bpreveal.metrics>`.
7.  Make predictions from the full model and residual model with
    :py:mod:`makePredictions<bpreveal.makePredictions>`.
8.  Generate importance scores from the residual model with
    :py:mod:`interpretFlat<bpreveal.interpretFlat>`.
9.  Run MoDISco, using scores generated by
    :py:mod:`shapToNumpy<bpreveal.shapToNumpy>`.
10. Map the discovered motifs back to the genome using
    :py:mod:`motifSeqletCutoffs<bpreveal.motifSeqletCutoffs>`,
    :py:mod:`motifScan<bpreveal.motifScan>` and
    :py:mod:`motifAddQuantiles<bpreveal.motifAddQuantiles>`.
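
The steps above form a chain in which each model is built on the previous
one, and several later steps only need the trained combined model. One way to
keep the order straight is to write the dependencies down and let Python sort
them. This is a sketch under stated assumptions: the edges in
``BIAS_WORKFLOW`` are one reading of the numbered list, not something
BPReveal defines or enforces, and the names ``BIAS_WORKFLOW`` and
``run_order`` are hypothetical. It uses ``graphlib`` from the standard
library (Python 3.9 or later).

.. code-block:: python

    from graphlib import TopologicalSorter

    # Each key is a step; its value is the set of steps that must finish first.
    # These edges are an interpretation of the list above, not a BPReveal API.
    BIAS_WORKFLOW = {
        "prepareTrainingData": {"prepareBed"},
        "trainSoloModel": {"prepareTrainingData"},
        "trainTransformationModel": {"trainSoloModel"},
        "trainCombinedModel": {"trainTransformationModel"},
        "metrics": {"trainCombinedModel"},
        "makePredictions": {"trainCombinedModel"},
        "interpretFlat": {"trainCombinedModel"},
        "shapToNumpy": {"interpretFlat"},
        "MoDISco": {"shapToNumpy"},
        "motifSeqletCutoffs": {"MoDISco"},
        "motifScan": {"motifSeqletCutoffs"},
        "motifAddQuantiles": {"motifScan"},
    }


    def run_order(dependencies):
        """Return one ordering in which every step follows its prerequisites."""
        return list(TopologicalSorter(dependencies).static_order())


    order = run_order(BIAS_WORKFLOW)
    print(" -> ".join(order))

Steps that share a prerequisite, such as ``metrics``, ``makePredictions`` and
``interpretFlat``, may appear in any relative order in the result. The only
guarantee is that no step is listed before the steps it depends on.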

..
    Copyright 2022, 2023, 2024 Charles McAnany. This file is part of BPReveal. BPReveal is free software: You can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version. BPReveal is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with BPReveal. If not, see <https://www.gnu.org/licenses/>.  # noqa  # pylint: disable=line-too-long