tools.plots

bpreveal.tools.plots.getCoordinateTicks(start, end, numTicks, zeroOrigin)

Given a start and end coordinate, return a list of ticks and tick labels to use as x-ticks for plotting. The ticks and labels:
  1. Include exactly the start and stop coordinates.
  2. Contain approximately numTicks positions and labels.
  3. Try to fall on easy multiples: 1, 2, and 5 times powers of ten.
  4. Are formatted to reduce redundant label noise by omitting repeated initial digits.

Parameters:
  • start (int) –

  • end (int) –

  • numTicks (int) –

  • zeroOrigin (bool) –

Return type:

tuple[list[float], list[str]]
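The tick-selection idea described for getCoordinateTicks (a step that is 1, 2, or 5 times a power of ten, with the start and end always included) can be sketched as follows. This is a minimal illustration and not the BPReveal implementation: nice_step and sketch_ticks are hypothetical names, the sketch assumes end > start, and it ignores zeroOrigin and the label shortening described in item 4.

```python
import math


def nice_step(span, num_ticks):
    """Pick a step of 1, 2, or 5 times a power of ten near span / num_ticks.

    Hypothetical helper for illustration only.
    """
    raw = span / max(num_ticks, 1)
    power = 10 ** math.floor(math.log10(raw))
    # 10 * power is included so that a raw step just under the next
    # power of ten can round up to it.
    candidates = [m * power for m in (1, 2, 5, 10)]
    return min(candidates, key=lambda s: abs(s - raw))


def sketch_ticks(start, end, num_ticks):
    """Return tick positions: start, the round values in between, then end."""
    step = nice_step(end - start, num_ticks)
    # First multiple of step at or after start.
    pos = math.ceil(start / step) * step
    inner = []
    while pos < end:
        if pos > start:
            inner.append(pos)
        pos += step
    return [start] + inner + [end]


ticks = sketch_ticks(1234500, 1235500, 5)
```

Because the endpoints are always kept, the first and last gaps can be shorter than the step, and the total tick count is only approximately num_ticks.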

A convenience function to plot an array of sequence data (like a pwm) on a matplotlib axes object.

Arguments: values is an (N,4) array of sequence data. This could be, for example, a pwm or a one-hot encoded sequence.

width is the width of the total logo, useful for aligning axis labels.

ax is a matplotlib axes object on which the logo will be drawn.

Colors, if provided, can have several meanings:
  1. Give an explicit rgba color for each base.

    colors should be an array of shape (N, 4, 4), where the first dimension is the sequence position, the second is the base (A, C, G, T, in that order), and the third gives an rgba color to use for that base at that position.

  2. Give a color for each base type. In this case, colors will be a dict of tuples:

    {"A": (220, 38, 127), "C": (120, 94, 240), "G": (254, 97, 0), "T": (255, 176, 0)}
    This will make each instance of a particular base have the same color.

  3. Give a matplotlib colormap and a min and max value. Each base will be colored based on its magnitude. For example, to highlight bases with large negative values, you might specify ('Blues_r', -2, 0) to draw all bases with negative scores as blue, with less negative scores drawn lighter. Bases with scores outside the limits you provide will be clipped to the limit values.

  4. The string 'seq' means A will be drawn green, C will be blue, G will be orange, and T will be red. These colors are drawn from a colorblind-aware palette.
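The clipping behavior in option 3 can be illustrated with a small sketch. It shows one plausible way to turn a score into a colormap position, under the assumption that scores are clipped to the given limits and then scaled linearly to the range 0 to 1. The name clip_and_normalize is hypothetical and is not part of bpreveal.

```python
def clip_and_normalize(score, vmin, vmax):
    """Clip score to [vmin, vmax], then scale it linearly to [0, 1].

    Hypothetical helper: the result is the kind of value that could be
    passed to a matplotlib colormap to look up a color.
    """
    clipped = min(max(score, vmin), vmax)
    return (clipped - vmin) / (vmax - vmin)


# Limits taken from the ('Blues_r', -2, 0) example in the text.
positions = [clip_and_normalize(s, -2, 0) for s in (-3, -1, 0.5)]
```

With the limits -2 and 0, a score of -3 is clipped to the bottom of the range and a score of 0.5 to the top. In matplotlib's reversed Blues_r map the bottom of the range is the darkest blue, which is consistent with the most negative bases being drawn darkest.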

Parameters:
  • values (ndarray[Any, dtype[float32]]) –

  • width (float) –

  • spaceBetweenLetters (float) –

Return type:

None

bpreveal.tools.plots.plotPisaWithFiles(pisaDats, cutMiddle, cutLengthX, cutLengthY, receptiveField, genomeWindowStart, genomeWindowChrom, genomeFastaFname, importanceBwFname, motifScanBedFname, profileDats, nameColors, fig, bbox, colorSpan=1.0, boxHeight=0.1, fontsize=5, mini=False)

Given the names of files, make a pisa plot.

Parameters:
  • pisaDats (str | ndarray[Any, dtype[float16]]) – Either a string naming an hdf5 file or an array from loadPisa.

  • cutMiddle (int) – The midpoint of the pisa plot, relative to the start of the profile.

  • cutLengthX (int) – How wide should the X axis be? If 99 or less, a sequence will be plotted.

  • cutLengthY (int) – How tall should the plot be?

  • receptiveField (int) – What is the model’s receptive field?

  • genomeWindowStart (int) – Where in the genome does pisaDats start? This is used to generate the x axis.

  • genomeWindowChrom (str) – What chromosome is the sequence on?

  • genomeFastaFname (str) – Name of the fasta file containing the genome.

  • importanceBwFname (str) – The bigwig of importance scores from interpretFlat.

  • motifScanBedFname (str) – The bed file containing mapped motifs.

  • profileDats (str) – The bigwig file containing the predicted profile.

  • nameColors (dict[str, tuple[float, float, float]]) – A dict containing the color to be used for each motif name. If a motif is encountered that is not in this dict, then it is added with a color taken from the IBM palette.

  • fig (Figure) – The matplotlib figure to put this plot on.

  • bbox (tuple[float, float, float, float]) – The bounding box to use for drawing the figure. Lets you put multiple pisa plots on a single matplotlib Figure.

  • colorSpan (float) – What are the maximum and minimum values in the color map?

  • boxHeight (float) – How tall should the boxes containing motif names be?

  • fontsize (int) – How big should the font be?

  • mini (bool) –

Returns:

Same as plotPisa
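The nameColors behavior described above (a motif name that is missing from the dict is added with a palette color) might look roughly like the sketch below. Everything here is an assumption made for illustration: color_for and PALETTE are hypothetical names, the palette reuses the four RGB triples shown in the colors example earlier on this page, and scaling them to the 0 to 1 range is a guess based on the float type of nameColors. The order in which the library picks palette colors is not documented here.

```python
# Hypothetical palette, copied from the per-base colors example above.
PALETTE = [(220, 38, 127), (120, 94, 240), (254, 97, 0), (255, 176, 0)]


def color_for(name, name_colors):
    """Return the color for a motif name, adding one if it is missing.

    name_colors is modified in place, so a later call with the same
    name returns the same color.
    """
    if name not in name_colors:
        # Cycle through the palette based on how many names exist so far.
        r, g, b = PALETTE[len(name_colors) % len(PALETTE)]
        name_colors[name] = (r / 255, g / 255, b / 255)
    return name_colors[name]


name_colors = {"Sox2": (0.0, 0.0, 0.0)}
oct4_color = color_for("Oct4", name_colors)
```

Updating the dict in place means the same dict can be passed to several plots so that a given motif keeps one color across all of them.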

bpreveal.tools.plots.plotPisa(pisaDats, cutMiddle, cutLengthX, cutLengthY, receptiveField, genomeWindowStart, seq, impScores, annotations, profile, nameColors, fig, bbox, colorSpan=1.0, boxHeight=0.1, fontsize=5)

Given the actual vectors to show, make a pretty pisa plot.

Parameters:
  • pisaDats (str | ndarray[Any, dtype[float16]]) – Either a string naming an hdf5 file, or an array from loadPisa.

  • cutMiddle (int) – Where should the midpoint of the plot be, relative to the pisaDats array?

  • cutLengthX (int) – How wide should the plot be?

  • cutLengthY (int) – How tall should the plot be?

  • receptiveField (int) – What is the model’s receptive field?

  • genomeWindowStart (int) – Where in the genome does this sequence start?

  • seq (str) – The sequence of the region.

  • impScores (ndarray[Any, dtype[float16]]) – The importance scores.

  • annotations (tuple[tuple[int, int], str, tuple[float, float, float]]) – A list of annotations, containing ((start, stop), name, color).

  • profile (ndarray[Any, dtype[float32]]) – A vector containing profile information.

  • nameColors (dict[str, tuple[float, float, float]]) – A dict mapping motif name to color. This is ignored.

  • fig (Figure) – The matplotlib Figure onto which the plot should be drawn.

  • bbox (tuple[float, float, float, float]) – The bounding box on the figure that will be used.

  • colorSpan (float) – The limit of the color scale.

  • boxHeight (float) – How tall should the motif name boxes be?

  • fontsize (int) – How large should the font be?

Returns:

(axPisa, axSeq, axProfile, nameColors, axCbar)

bpreveal.tools.plots.plotMiniPisa(pisaDats, cutMiddle, cutLengthX, cutLengthY, receptiveField, genomeWindowStart, seq, impScores, annotations, profile, nameColors, fig, bbox, colorSpan=1.0, boxHeight=0.1, fontsize=5)

Given the actual vectors to show, make a pretty pisa plot.

Parameters:
  • pisaDats (str | ndarray[Any, dtype[float16]]) – Either a string naming an hdf5 file, or an array from loadPisa.

  • cutMiddle (int) – Where should the midpoint of the plot be, relative to the pisaDats array?

  • cutLengthX (int) – How wide should the plot be?

  • cutLengthY (int) – How tall should the plot be?

  • receptiveField (int) – What is the model’s receptive field?

  • genomeWindowStart (int) – Where in the genome does this sequence start?

  • seq (str) – The sequence of the region.

  • impScores (ndarray[Any, dtype[float16]]) – The importance scores.

  • annotations (tuple[tuple[int, int], str, tuple[float, float, float]]) – A list of annotations, containing ((start, stop), name, color).

  • profile (ndarray[Any, dtype[float32]]) – A vector containing profile information.

  • nameColors (dict[str, tuple[float, float, float]]) – A dict mapping motif name to color. Used to build the legend.

  • fig (Figure) – The matplotlib Figure onto which the plot should be drawn.

  • bbox (tuple[float, float, float, float]) – The bounding box on the figure that will be used.

  • colorSpan (float) – The limit of the color scale.

  • boxHeight (float) – How tall should the motif name boxes be?

  • fontsize (int) – How large should the font be?

Returns:

(axPisa, axSeq, axProfile, nameColors, axCbar)