internal.plotUtils

A bunch of helper functions for making plots.

Plot an array of sequence data (like a pwm).

Parameters:
  • values (ndarray[tuple[int, ...], dtype[float32]]) – An (N,NUM_BASES) array of sequence data. This could be, for example, a pwm or a one-hot encoded sequence.

  • width (float) – The width of the total logo, useful for aligning axis labels.

  • ax (Axes) – A matplotlib axes object on which the logo will be drawn.

  • colors (DNA_COLOR_SPEC_T | list[DNA_COLOR_SPEC_T]) – The colors to use for shading the sequence. See below for details.

  • spaceBetweenLetters (float) – How much should the letters be squished? This is given as a fraction of the total letter width. For example, to have a gap of 2 pixels between letters that are 10 pixels wide, set spaceBetweenLetters=0.2.

  • origin (tuple[float, float]) – Where, in the coordinates of the axis, should the logo start? Default: draw the logo starting at (0,0).

Return type:

None

Colors, if provided, can have several meanings:
  1. Give a color for each base type by RGB value. In this case, colors will be a dict of tuples: {"A": (.8, .3, .2), "C": (.5, .3, .9), "G": (1., .4, .0), "T": (1., .7, 0.)} This will make each instance of a particular base have the same color.

  2. Give a color for each base by color-spec. This would be something like: {"A": {"wong": 3}, "C": {"wong": 5}, "G": {"wong": 4}, "T": {"wong": 6}} You can get the default BPReveal color map at dnaWong.

  3. Give a list of colors for each base. This will be a list of length values.shape[0] and each entry should be a dictionary in either format 1 or 2 above. This gives each base its own color palette, useful for shading bases by some profile.
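
As an illustration of format 3, the sketch below builds one format-1 dictionary per position and fades each base color toward white where a profile is low. The helper name shadeByProfile and the fade-to-white rule are assumptions made for this example; they are not part of BPReveal.

```python
def shadeByProfile(profile, baseColors):
    """Build one format-1 color dict per position, faded by the profile.

    profile: list of non-negative floats, one per position.
    baseColors: a format-1 dict mapping each base to an (r, g, b) tuple.
    """
    peak = max(profile) or 1.0  # Avoid dividing by zero for an all-zero profile.
    perPosition = []
    for value in profile:
        weight = value / peak
        # Blend each channel toward 1.0 (white) as the profile value drops.
        shaded = {
            base: tuple(1.0 - weight * (1.0 - channel) for channel in rgb)
            for base, rgb in baseColors.items()
        }
        perPosition.append(shaded)
    return perPosition


baseColors = {"A": (.8, .3, .2), "C": (.5, .3, .9),
              "G": (1., .4, .0), "T": (1., .7, 0.)}
# One dictionary per row of values, so len(colors) == values.shape[0].
colors = shadeByProfile([0.0, 0.5, 1.0], baseColors)
```

The resulting list has one entry per position, which is the shape that format 3 calls for.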

bpreveal.internal.plotUtils.getCoordinateTicks(start, end, numTicks, zeroOrigin)

Given a start and end coordinate, return x-ticks that should be used for plotting.

Parameters:
  • start (int) – The genomic coordinate where your ticks start, inclusive.

  • end (int) – The genomic coordinate where your ticks end, inclusive.

  • numTicks (int) – The approximate number of ticks you want.

  • zeroOrigin (bool) – If True, the x coordinates of the ticks start at zero, even though the labels start at start. Otherwise, the ticks are positioned at coordinates start to end, so your axes limits should correspond to genomic coordinates.

Returns:

Two lists. The first is the x-coordinate of the ticks, and the second is the string labels that should be used at each tick.

Return type:

tuple[list[float], list[str]]

Given a start and end coordinate, return a list of ticks and tick labels that:
  1. Include exactly the start and end coordinates.

  2. Contain approximately numTicks positions and labels.

  3. Try to fall on easy multiples: 1, 2, and 5 times powers of ten.

  4. Are formatted to reduce redundant label noise by omitting repeated initial digits.
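
A rough sketch of the first three properties follows. It is not the BPReveal implementation: the names niceStep and sketchCoordinateTicks are hypothetical, the label format (comma-separated thousands) is assumed, and the label-shortening step is left out.

```python
import math


def niceStep(span, numTicks):
    """Return a step of 1, 2, or 5 times a power of ten near span / numTicks."""
    rawStep = max(span / max(numTicks, 1), 1)
    magnitude = 10 ** int(math.floor(math.log10(rawStep)))
    candidates = [mult * magnitude for mult in (1, 2, 5, 10)]
    # Pick the candidate closest to the raw step.
    return min(candidates, key=lambda step: abs(step - rawStep))


def sketchCoordinateTicks(start, end, numTicks, zeroOrigin):
    """Hypothetical stand-in for getCoordinateTicks; assumes end > start."""
    step = niceStep(end - start, numTicks)
    # First multiple of step at or after start, using integer ceiling division.
    pos = -(-start // step) * step
    positions = [start]
    while pos < end:
        if pos > start:
            positions.append(pos)
        pos += step
    positions.append(end)
    # With zeroOrigin, tick positions are shifted so that start sits at x = 0.
    offset = start if zeroOrigin else 0
    tickPositions = [float(p - offset) for p in positions]
    tickLabels = [f"{p:,}" for p in positions]
    return tickPositions, tickLabels


ticks, labels = sketchCoordinateTicks(1000, 2000, 5, zeroOrigin=True)
```

With zeroOrigin=True the positions run from 0.0 to 1000.0 while the labels still read "1,000" through "2,000", which is the distinction the zeroOrigin parameter describes.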

bpreveal.internal.plotUtils.replaceThousands(labelList)

If every label ends with ,000, replace the last four characters with k.

Parameters:

labelList (list[str])

Return type:

list[str]
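
The rule described above can be sketched in a few lines. The function name sketchReplaceThousands is hypothetical, and returning an unchanged copy when the condition fails is an assumption.

```python
def sketchReplaceThousands(labelList):
    """Swap a trailing ',000' for 'k', but only if every label has one."""
    if labelList and all(label.endswith(",000") for label in labelList):
        # ',000' is four characters long, so drop four and append 'k'.
        return [label[:-4] + "k" for label in labelList]
    return list(labelList)


shortened = sketchReplaceThousands(["12,000", "13,000", "14,000"])
unchanged = sketchReplaceThousands(["12,000", "12,500", "13,000"])
```

The second call leaves the labels alone because "12,500" does not end in ,000.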

bpreveal.internal.plotUtils.massageTickLabels(labelList)

Remove identical leading digits from labels.

Parameters:

labelList (list[str])

Return type:

list[str]
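
The documentation does not say exactly which digits are dropped or which labels stay whole, so the following is only one plausible reading: strip the leading comma-separated groups that all labels share, and keep the first label in full. The name sketchMassageLabels and both of those choices are assumptions.

```python
def sketchMassageLabels(labelList):
    """Drop leading thousands groups shared by every label, except on the first."""
    if not labelList:
        return []
    groups = [label.split(",") for label in labelList]
    # Only shorten when every label has the same number of groups.
    if len({len(g) for g in groups}) != 1:
        return list(labelList)
    shared = 0
    # Count identical leading groups, always leaving at least one group behind.
    while shared < len(groups[0]) - 1 and len({g[shared] for g in groups}) == 1:
        shared += 1
    trimmed = [",".join(g[shared:]) for g in groups]
    # Keep the first label whole so the full coordinate is still readable.
    return [labelList[0]] + trimmed[1:]


massaged = sketchMassageLabels(["1,234,000", "1,234,500", "1,235,000"])
```

Here only the leading "1" is common to all three labels, so the later labels lose that group and nothing else.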

bpreveal.internal.plotUtils.buildConfig(oldConfig)

Read in a config and add any missing data.

Parameters:

oldConfig (dict) – The original configuration dictionary. All entries from this original dict are copied, so you can mutate the returned dict without messing with the original data.

Return type:

dict

This loads in profile and pisa data from files and expands the color specs.

The returned config will have the following structure:

{
    "pisa": {
        "values": <numpy array of pisa values, from loadPisa>,
        "color-map": <Colormap, default: bpreveal.colors.pisaClip>,
        "rasterize": <boolean, default True>
    },
    "coordinates": {
        "sequence": <string>,
        "midpoint-offset": <integer>,
        "input-slice-width": <integer>,
        "output-slice-width": <integer>,
        "genome-window-start": <integer>,
        "genome-window-chrom": <string>
    },
    "importance": {
        "values": <numpy array>,
        "show-sequence": <boolean>,
        "color": [<list of DNA_COLOR_SPEC_T>]
    },
    "predictions": {
        "values": <numpy array>,
        "show-sequence": <boolean>,
        "color": [<list of DNA_COLOR_SPEC_T>]
    },
    "annotations": {
        "name-colors": <dict[str, COLOR_SPEC_T]>,
        "custom": [<list of {"start": <integer>, "end": <integer>,
                             "name": <string>, "color": <COLOR_SPEC_T>,
                             "shape": "box"}>]
    },
    "figure": {
        "grid-mode": <string, default "on">,
        "diagonal-mode": <string, default "edge">,
        "bottom": <fraction>,
        "left": <fraction>,
        "width": <fraction>,
        "height": <fraction>,
        "annotation-height": <fraction, default 0.13>,
        "tick-font-size": <integer, default FONT_SIZE_TICKS>,
        "label-font-size": <integer, default FONT_SIZE_LABELS>,
        "miniature": <boolean, default False>,
        "color-span": <fraction>
    },
    "use-annotation-colors": <boolean, default False>,
    "min-value": <number>
}

Note that "grid-mode" and "diagonal-mode" will only be present if the configuration is for a PISA plot, and "use-annotation-colors" and "min-value" will only be present if the configuration is for a PISA graph.

The test to see if a configuration is for a graph is whether it contains a "min-value" key. If it does, it’s a graph config. If not, it’s a plot config.
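
The test described above amounts to a single key lookup. The helper name isGraphConfig and the two toy configurations are made up for this example.

```python
def isGraphConfig(config):
    """Return True for a PISA graph config, False for a PISA plot config."""
    return "min-value" in config


# Toy configurations with only the keys needed to tell the two kinds apart.
plotConfig = {"figure": {"grid-mode": "on", "diagonal-mode": "edge"}}
graphConfig = {"figure": {}, "use-annotation-colors": False, "min-value": 0.1}

kinds = [isGraphConfig(plotConfig), isGraphConfig(graphConfig)]
```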

Note

The returned configuration is NOT valid JSON, because it contains numpy arrays.

bpreveal.internal.plotUtils.loadFromBigwig(bwFname, start, chrom, length)

Read in the given region from a bigwig file.

Parameters:
  • bwFname (str)

  • start (int)

  • chrom (str)

  • length (int)

Return type:

ndarray[tuple[int, ...], dtype[float32]]

bpreveal.internal.plotUtils.normalizeProfileSection(oldConfig, newConfig, group)

Take a raw config dict and populate any defaults and load up bigwig data.

Parameters:
  • oldConfig (dict)

  • newConfig (dict)

  • group (str)

Return type:

None

bpreveal.internal.plotUtils.normalizeProfileColor(colorSpec, numItems)

Take the config color spec and expand it to a list of DNA_COLOR_SPEC_T.

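Parameters:
  • colorSpec – The color spec from the config: either a single DNA_COLOR_SPEC_T or a list of them.

  • numItems – The number of positions in the profile.
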
Returns:

A list of a DNA_COLOR_SPEC_T for each position in the profile. If only one colorSpec was provided, then it will be replicated numItems times.

Return type:

list[DNA_COLOR_SPEC_T]
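
The replication behavior can be sketched as below. The name sketchNormalizeProfileColor is hypothetical, and the handling of a list input (pass it through after a length check) is an assumption rather than documented behavior.

```python
def sketchNormalizeProfileColor(colorSpec, numItems):
    """Expand a color spec into a list with one entry per profile position."""
    if isinstance(colorSpec, list):
        # Assumed: a list must already have one spec per position.
        if len(colorSpec) != numItems:
            raise ValueError("Expected one color spec per position.")
        return list(colorSpec)
    # A single spec is repeated for every position.
    return [colorSpec] * numItems


singleSpec = {"A": {"wong": 3}, "C": {"wong": 5},
              "G": {"wong": 4}, "T": {"wong": 6}}
expanded = sketchNormalizeProfileColor(singleSpec, 4)
```

Every entry of expanded refers to the same dictionary object, so copy the entries first if you plan to edit them individually.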

bpreveal.internal.plotUtils.loadSequence(genomeFastaFname, genomeWindowStart, genomeWindowChrom, length)

Read a sequence in from a fasta. Uppercases the resulting string.

Parameters:
  • genomeFastaFname (str)

  • genomeWindowStart (int)

  • genomeWindowChrom (str)

  • length (int)

Return type:

str

bpreveal.internal.plotUtils.loadPisaAnnotations(bedFname, nameColors, start, chrom, length)

Load in a bed full of annotations and prepare boxes for ones in this region.

Parameters:
  • bedFname (str) – The name of the bed file.

  • nameColors (dict[str, COLOR_SPEC_T]) – A dict mapping names onto colorSpecs. Used to determine the colors that will be drawn for the annotations. If a name already exists in nameColors, use it. If a new name is found in the bed, add it to nameColors, drawing from the tolLight palette.

  • start (int) – Genomic start coordinate.

  • chrom (str) – Chromosome that the region is on.

  • length (int) – The length of the region being plotted.

Returns:

A list of dicts, structured like {"start": 1234, "end": 1244, "name": "Abf1", "color": {"tol": 0}}.

Return type:

list[dict]
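
To make the filtering and color assignment concrete, here is a sketch that works on BED lines held in memory instead of a file. Everything about it is an assumption for illustration: the name sketchLoadAnnotations, the overlap test, keeping genomic coordinates in the output, and handing out {"tol": n} indices in order of first appearance (mirroring the example return value, not the actual tolLight palette).

```python
def sketchLoadAnnotations(bedLines, nameColors, start, chrom, length):
    """Return annotation dicts for BED intervals overlapping the window."""
    end = start + length
    annotations = []
    for line in bedLines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # The name column is needed to choose a color.
        bedChrom, name = fields[0], fields[3]
        bedStart, bedEnd = int(fields[1]), int(fields[2])
        # Skip intervals on other chromosomes or outside the window.
        if bedChrom != chrom or bedEnd <= start or bedStart >= end:
            continue
        if name not in nameColors:
            # New names get the next palette index; nameColors is updated in place.
            nameColors[name] = {"tol": len(nameColors)}
        annotations.append({"start": bedStart, "end": bedEnd,
                            "name": name, "color": nameColors[name]})
    return annotations


bedLines = ["chr1\t1234\t1244\tAbf1",
            "chr1\t5000\t5010\tReb1",
            "chr2\t1234\t1244\tAbf1"]
nameColors = {}
found = sketchLoadAnnotations(bedLines, nameColors, 1000, "chr1", 1000)
```

Only the first interval lies inside chr1:1000-2000, so found holds a single Abf1 entry and nameColors gains one new name.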

bpreveal.internal.plotUtils.addResizeCallbacks(ax, which, numYTicks, numXTicks, fontSizeTicks)

Given an axes, add callbacks so that when it gets resized, the ticks update.

Parameters:
  • ax (Axes) – The axes that will be interactively resized.

  • which (Literal['both', 'x', 'y']) – Either "x", "y", or "both", indicating which axis the callback should be applied to.

  • numYTicks (int) – How many ticks should be generated on the Y axis?

  • numXTicks (int) – How many ticks should be generated on the X axis?

  • fontSizeTicks (int) – What font size should be used for the tick labels?

Return type:

None

bpreveal.internal.plotUtils.addAnnotations(axAnnot, annotations, boxHeight, genomeStartX, genomeEndX, fontSize, mini)

Apply the given annotations to the drawing area given by axAnnot.

Parameters:
  • axAnnot (Axes) – The matplotlib axes to draw upon. This will usually overlap with some other component of the figure.

  • annotations (list[dict]) – A list of dicts of the form given by loadPisaAnnotations().

  • boxHeight (float) – How tall, as a fraction of the height of axAnnot, should the boxes be?

  • genomeStartX (int) – Where does the x-axis of the annotation axis start, in genomic coordinates?

  • genomeEndX (int) – Where does the annotation axis end, in genomic coordinates?

  • fontSize (int) – How big do you want the text in the boxes?

  • mini (bool) – If True, then don’t write the names of the annotations in the boxes.

Returns:

A dict of the names that were actually plotted, mapping each name to its colorSpec.

Return type:

dict[str, COLOR_SPEC_T]

bpreveal.internal.plotUtils.addPisaPlot(shearMat, colorSpan, axPisa, diagMode, gridMode, fontSizeTicks, fontSizeAxLabel, genomeWindowStart, mini, cmap=<matplotlib.colors.ListedColormap object>, rasterize=True)

Plot the pisa data on an axes.

Parameters:
  • shearMat (ndarray[tuple[int, ...], dtype[float16]]) – The PISA data to actually plot, already sheared and cropped.

  • colorSpan (float) – The span of the color bar. Note that the colorSpan parameter is specified in logit space (which is where PISA data are calculated) but the colorbar is shown in dB, which is a more intuitive unit. The color bar will therefore NOT stop where your colorSpan does.

  • axPisa (Axes) – The axes to draw on.

  • diagMode (Literal['on', 'off', 'edge']) – How should the diagonal be drawn? “on” and “off” are self-explanatory and “edge” means that there will be thick ticks drawn on the borders to indicate where the diagonal is, but the middle of the plot will not have a diagonal line.

  • gridMode (Literal['on', 'off']) – Should a grid be drawn? Options are “on” or “off”.

  • fontSizeTicks (int) – How big should the text be on the ticks, in points?

  • fontSizeAxLabel (int) – How big should the font size be for labels, in points?

  • genomeWindowStart (int) – Where does the x-axis of shearMat start, in genomic coordinates?

  • mini (bool) – If True, then draw a plot that works better as a half-page visual. The axes are simplified and the annotation text is moved to a separate legend.

  • cmap (Colormap) – The color map to use. Defaults to pisaClip.

  • rasterize (bool) – Should the boxes be rendered down to a pixel-based image? Rasterizing can make large pisa plots much easier to work with downstream, but it makes them uneditable.

Returns:

A mappable that can be used to generate the color bar.

Return type:

ScalarMappable

bpreveal.internal.plotUtils.addPisaGraph(similarityMat, minValue, colorSpan, colorBlocks, genomeStart, lineWidth, trim, ax, cmap=<matplotlib.colors.ListedColormap object>, rasterize=True)

Draw a graph representation of a PISA matrix.

Parameters:
  • similarityMat (ndarray[tuple[int, ...], dtype[float16]]) – The PISA array, already sheared. It should be square.

  • minValue (float) – PISA values less than this will not be plotted at all.

  • colorSpan (float) – Values higher than this will be clipped.

  • colorBlocks (list[tuple[int, int, COLOR_SPEC_T]]) – Regions of the plot that override the color of the lines. These are tuples of (start, end, (r, g, b)). If the origin of a line overlaps with a ColorBlock, then its color is set to the rgb color in the block.

  • genomeStart (int) – The genomic coordinate of the left side of similarityMat.

  • lineWidth (float) – The thickness of the drawn lines. For large figures, thicker lines avoid Moiré patterns.

  • trim (int) – The similarity matrix may be bigger than the area you want to show on the plot. This allows lines to go off the edge of the page and makes it clear that a motif’s effect extends to bases in the output that are not seen in the figure. trim bases on each side will not be shown on the x-axis, but the information about them contained in similarityMat will be used to draw lines that go off the edge.

  • ax (Axes) – The axes to draw on. The xlim and ylim will be clobbered by this function.

  • cmap (Colormap) – (Optional) The colormap to use for the graph. Note that colorBlocks overrides the colormap wherever they occur.

  • rasterize (bool) – Should the lines be rendered down to a pixel-based image? Rasterizing can make large pisa graphs much easier to work with downstream, but it makes them uneditable.

Return type:

ScalarMappable
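
The colorBlocks override can be pictured as a lookup on the origin of each line. In this sketch the name pickLineColor is hypothetical, blocks are treated as half-open intervals [start, end), and the first matching block wins; none of those details are confirmed by the documentation.

```python
def pickLineColor(origin, colorBlocks, cmapColor):
    """Return the block color if origin falls in a block, else the colormap color."""
    for blockStart, blockEnd, blockColor in colorBlocks:
        # Assumed half-open interval: start is inside the block, end is not.
        if blockStart <= origin < blockEnd:
            return blockColor
    return cmapColor


colorBlocks = [(100, 110, (1.0, 0.0, 0.0)), (150, 160, (0.0, 0.0, 1.0))]
fallback = (0.5, 0.5, 0.5)  # Stand-in for whatever the colormap would give.
chosen = [pickLineColor(pos, colorBlocks, fallback) for pos in (105, 130, 150)]
```

A line whose origin is at 130 lies outside both blocks, so it keeps the colormap color.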

bpreveal.internal.plotUtils.addCbar(pisaCax, axCbar, fontSizeTicks, fontSizeAxLabel, mini)

Add a color bar to the given axes.

Parameters:
  • pisaCax (ScalarMappable) – The mappable generated by the PISA plotting/graphing function.

  • axCbar (Axes) – The axes on which the color bar will be drawn.

  • fontSizeTicks (int) – How big should the tick labels be, in points?

  • fontSizeAxLabel (int) – How big should the label at the bottom be?

  • mini (bool) – If True, squish the label a bit for printing in a smaller space.

Return type:

None

bpreveal.internal.plotUtils.addLegend(usedNames, axLegend, fontSize)

Add a legend to map the annotations to colors.

Parameters:
  • usedNames (dict[str, COLOR_SPEC_T]) – The names that are present in this view. Comes from addAnnotations().

  • axLegend (Axes) – The axes to draw the legend on.

  • fontSize (int) – How big do you want the text, in points?

Return type:

None

bpreveal.internal.plotUtils.getPisaAxes(fig, left, bottom, width, height, mini)

Generate the various axes that will be needed for a PISA plot.

Parameters:
  • fig (Figure) – The figure to draw the axes on.

  • left (float) – The left edge, as a fraction of the figure width, for the plots.

  • bottom (float) – The bottom, as a fraction of figure height, of the plots.

  • width (float) – The width, as a fraction of the figure width, for the plots.

  • height (float) – The height, as a fraction of the figure height, for the plots.

  • mini (bool) – Should the axes be arranged for smaller display? If so, returns an additional axes object for the legend.

Returns:

A tuple of axes, in order Pisa, importance, predictions, cbar, annotations, legend. Legend will be None if mini is False.

Return type:

tuple[Axes, Axes, Axes, Axes, Axes, Axes | None]

bpreveal.internal.plotUtils.getPisaGraphAxes(fig, left, bottom, width, height, mini)

Get axes appropriate for drawing a PISA graph.

Parameters:
  • fig (Figure) – The figure to draw the axes on.

  • left (float) – The left edge, as a fraction of the figure width, for the plots.

  • bottom (float) – The bottom, as a fraction of figure height, of the plots.

  • width (float) – The width, as a fraction of the figure width, for the plots.

  • height (float) – The height, as a fraction of the figure height, for the plots.

  • mini (bool) – Should the axes be arranged for small display spaces?

Returns:

A tuple of axes, in order Graph, importance, predictions, annotations, colorbar.

Return type:

tuple[Axes, Axes, Axes, Axes, Axes, Axes | None]

bpreveal.internal.plotUtils.addVerticalProfilePlot(profile, axProfile, colors, sequence, genomeWindowStart, fontSizeTicks, fontSizeAxLabel, mini)

Plot a profile on a vertical axes.

Parameters:
  • profile (ndarray[tuple[int, ...], dtype[float32]]) – The values that should be plotted. This will be an ndarray.

  • axProfile (Axes) – The axes to draw the profile on.

  • colors (list[DNA_COLOR_SPEC_T]) – A DNA_COLOR_SPEC_T for each base.

  • sequence (str) – The underlying DNA sequence. Used to determine the colors to use.

  • genomeWindowStart (int) – Where in genomic coordinates does the sequence start? Since axProfile has a sharey relationship with axPisa, this puts the profile in the right place.

  • fontSizeTicks (int) – How big do you want the tick labels, in points?

  • fontSizeAxLabel (int) – How big do you want the word “Profile” on your axes?

  • mini (bool) – If True, then all ticks are removed and the axis is not labeled.

Return type:

None

bpreveal.internal.plotUtils.addHorizontalProfilePlot(values, colors, sequence, genomeStartX, genomeEndX, axSeq, axGraph, fontSizeTicks, fontSizeAxLabel, showSequence, labelXAxis, yAxisLabel, mini)

Draw a profile on a horizontal axes.

Parameters:
  • values (ndarray[tuple[int, ...], dtype[float32]]) – The values to plot.

  • colors (list[DNA_COLOR_SPEC_T]) – A list of DNA_COLOR_SPEC_T, one for each base.

  • sequence (str) – The sequence of the region. Used to determine the color for each base, and of course to set the sequence if showSequence is True.

  • genomeStartX (int) – Where, in genomic coordinates, does the x-axis start?

  • genomeEndX (int) – Where, in genomic coordinates, does the x-axis end?

  • axSeq (Axes) – The axes where the plot will be drawn.

  • axGraph (Axes | None) – This is the axes from the PISA graph or plot. If labelXAxis is True, then the x-ticks on axGraph will be turned off. If axGraph is None, it is left unchanged.

  • fontSizeTicks (int) – How big should the tick text be, in points?

  • fontSizeAxLabel (int) – How big should the labels be, in points?

  • showSequence (bool) – Should the DNA sequence be drawn, or just a bar plot?

  • labelXAxis (bool) – If True, then put ticks and tick labels on the x-axis, and also remove any labels from axGraph, if axGraph is not None.

  • yAxisLabel (str) – Text to display on the left side of the axis.

  • mini (bool) – If True, use fewer x-ticks and don’t show a label on the x-axis. Note that even if labelXAxis is True, the string "Input base coordinate" will not be shown if mini is True.

Return type:

None