internal.constants

Types that are used throughout BPReveal.

bpreveal.internal.constants.ONEHOT_T

Data type for elements of a one-hot encoded sequence.

bpreveal.internal.constants.ONEHOT_AR_T

Data type for an array of one-hot encoded sequences.

alias of ndarray[Any, dtype[uint8]]
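As a rough illustration of what such an array looks like, the sketch below one-hot encodes a short sequence into a uint8 array. The helper name `oneHotEncodeSketch` is hypothetical, and the A, C, G, T channel order and the all-zero encoding of unknown bases are assumptions, not something this page specifies.

```python
import numpy as np

# Assumption: channels are ordered A, C, G, T.
BASES = "ACGT"


def oneHotEncodeSketch(sequence: str) -> np.ndarray:
    """Hypothetical helper: returns an array of shape (len(sequence), 4)."""
    ret = np.zeros((len(sequence), 4), dtype=np.uint8)
    for i, base in enumerate(sequence.upper()):
        idx = BASES.find(base)
        if idx >= 0:
            ret[i, idx] = 1
        # Assumption: bases outside ACGT (such as N) stay all-zero.
    return ret


encoded = oneHotEncodeSketch("ACGTN")
```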

bpreveal.internal.constants.PRED_T

Data type for coverage.

bpreveal.internal.constants.PRED_AR_T

Data type for an array of predictions.

alias of ndarray[Any, dtype[float32]]

bpreveal.internal.constants.LOGIT_T

Data type for logits from the model.

bpreveal.internal.constants.LOGIT_AR_T

Data type for an array of logits.

alias of ndarray[Any, dtype[float32]]

bpreveal.internal.constants.LOGCOUNT_T

Data type for logcount values.

bpreveal.internal.constants.IMPORTANCE_T

Store importance scores with 16 bits of precision.

Since importance scores (particularly PISA values) take up a lot of space, I use a small floating point type and compression to reduce the amount of data.

bpreveal.internal.constants.IMPORTANCE_AR_T

Data type for an array of importance values.

alias of ndarray[Any, dtype[float16]]
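To give a feel for the trade-off, the following sketch compares the storage of the same values as float32 and float16. The values are synthetic and only illustrate the size and rounding behavior of the two dtypes; they are not real importance scores.

```python
import numpy as np

# Synthetic stand-in for importance scores.
scores32 = np.linspace(-1.0, 1.0, 1000, dtype=np.float32)
scores16 = scores32.astype(np.float16)

# float16 uses two bytes per value instead of four.
bytes32 = scores32.nbytes
bytes16 = scores16.nbytes

# The price is a small rounding error on each value.
maxError = float(np.max(np.abs(scores32 - scores16.astype(np.float32))))
```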

bpreveal.internal.constants.MODEL_ONEHOT_T

Inside the models, we use floating point numbers to represent one-hot sequences.

For reasons I don’t understand, setting this to uint8 DESTROYS PISA values.

bpreveal.internal.constants.MOTIF_FLOAT_T

The type used to represent cwms and pwms, and also the type used by the jaccard code.

If you change this, be sure to change libJaccard.c and libJaccard.pyf (and run make) so that the jaccard library uses the correct data type.

bpreveal.internal.constants.MOTIF_FLOAT_AR_T

An array of motif data.

alias of ndarray[Any, dtype[float32]]

bpreveal.internal.constants.COLOR_SPEC_T: TypeAlias

A COLOR_SPEC_T is anything that parseSpec can turn into an rgb or rgba tuple.

It may be one of the following things:

  1. {"rgb": (0.1, 0.2, 0.3)}

    giving an rgb triple.

  2. {"rgba": (0.1, 0.2, 0.3, 0.5)}

    giving an rgb triple with an alpha value.

  3. {"tol": 1}

    giving a numbered color from the tol palette. Valid numbers are 0 to 7.

  4. {"tol-light": 1}

    giving a numbered color from the tolLight palette. Valid numbers are 0 to 6.

  5. {"wong": 1}

    giving a numbered color from the wong palette. Valid numbers are 0 to 7.

  6. {"ibm": 1}

    giving a numbered color from the ibm palette. Valid numbers are 0 to 4.

  7. (0.1, 0.2, 0.3)

    giving an rgb triple.

  8. (0.1, 0.2, 0.3, 0.5)

    giving an rgb triple with an alpha value.
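The forms listed above can be checked mechanically. The sketch below is a hypothetical validator, not BPReveal's parseSpec: it only tests whether a value has one of the listed shapes and, for palette specs, whether the index is in the stated range. It does not resolve palette colors. The requirement that channel values lie between 0 and 1 is an assumption.

```python
# Largest valid index for each palette, taken from the list above.
PALETTE_MAX_INDEX = {"tol": 7, "tol-light": 6, "wong": 7, "ibm": 4}


def _isChannels(value, count: int) -> bool:
    """True if value is a sequence of `count` numbers in [0, 1] (assumed range)."""
    return (isinstance(value, (tuple, list))
            and len(value) == count
            and all(isinstance(x, (int, float)) and 0.0 <= x <= 1.0
                    for x in value))


def isValidColorSpecSketch(spec) -> bool:
    """Hypothetical check that spec matches one of the eight listed forms."""
    if isinstance(spec, dict):
        if len(spec) != 1:
            return False
        key, value = next(iter(spec.items()))
        if key == "rgb":
            return _isChannels(value, 3)
        if key == "rgba":
            return _isChannels(value, 4)
        if key in PALETTE_MAX_INDEX:
            return isinstance(value, int) and 0 <= value <= PALETTE_MAX_INDEX[key]
        return False
    # Bare tuples: three channels for rgb, four for rgba.
    return _isChannels(spec, 3) or _isChannels(spec, 4)
```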

class bpreveal.internal.constants.DNA_COLOR_SPEC_T

A type that assigns a color to each of the four bases.

It is a dictionary mapping the bases onto colorSpecs, like this:

{"A": {"wong": 3}, "C": {"wong": 5},
 "G": {"wong": 4}, "T": {"wong": 6}}

Type:

{Literal["A"]: COLOR_SPEC_T, Literal["C"]: COLOR_SPEC_T, Literal["G"]: COLOR_SPEC_T, Literal["T"]: COLOR_SPEC_T}

bpreveal.internal.constants.RGB_T: TypeAlias = tuple[float, float, float] | tuple[float, float, float, float]

An rgb triple or an rgba 4-tuple.

class bpreveal.internal.constants.ANNOTATION_T

Represents a genomic region of interest. Includes color and a name.

start

The start point of the annotation, in genomic coordinates.

end

The end point of the annotation, in genomic coordinates.

name

The string giving the name of the annotation.

color

A COLOR_SPEC_T giving the color to use when drawing the annotation.

bottom

As a fraction of the window height for annotations, where should this one’s box start? (Optional, default: 0)

top

As a fraction of the window height for annotations, where should this one’s box end? (Optional, default: 1.0)
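As an illustration of the fields above, the sketch below builds an annotation as a plain dictionary and fills in the documented defaults for bottom and top. The helper name `makeAnnotationSketch`, the coordinates, and the annotation name are all made up for the example, and representing the annotation as a plain dict is an assumption.

```python
def makeAnnotationSketch(start: int, end: int, name: str, color,
                         bottom: float = 0.0, top: float = 1.0) -> dict:
    """Hypothetical helper that assembles the six documented fields."""
    return {
        "start": start,    # Genomic coordinate where the annotation begins.
        "end": end,        # Genomic coordinate where the annotation ends.
        "name": name,
        "color": color,    # Any COLOR_SPEC_T, for example {"wong": 3}.
        "bottom": bottom,  # Fraction of the annotation window height.
        "top": top,
    }


enhancer = makeAnnotationSketch(1000, 1250, "enhancer", {"wong": 3})
```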

bpreveal.internal.constants.H5_CHUNK_SIZE: int = 128

When saving large hdf5 files, store the data in compressed chunks.

This constant sets the number of entries in each chunk that gets compressed. For good performance, whenever you read a compressed hdf5 file, it really helps if you read out whole chunks at a time and buffer them. See shapToBigwig for an example of a chunked reader.
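The idea of reading whole chunks can be sketched without any hdf5 machinery. The generator below is not the reader from shapToBigwig; it is a minimal illustration that walks any sliceable object in chunk-aligned blocks. It assumes chunks run along the first axis and start at multiples of the chunk size, and it uses a plain list as a stand-in for a dataset.

```python
H5_CHUNK_SIZE = 128  # Value documented above.


def iterChunksSketch(dataset, chunkSize: int = H5_CHUNK_SIZE):
    """Yield (start, block) pairs, where each block spans one whole chunk."""
    numEntries = len(dataset)
    for start in range(0, numEntries, chunkSize):
        stop = min(start + chunkSize, numEntries)
        # One slice per chunk, so each compressed chunk is decoded only once.
        yield start, dataset[start:stop]


data = list(range(300))  # Stand-in for a chunked dataset.
blocks = list(iterChunksSketch(data))
```

A caller would then index into the buffered block rather than into the dataset itself.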

bpreveal.internal.constants.QUEUE_TIMEOUT: int = 240

How long should a queue wait before crashing?

In parallel code, if something goes wrong, a queue could stay stuck forever. Python’s queues have a nifty timeout parameter so that they’ll crash if they wait too long. If a queue has been blocking for longer than this timeout, have the program crash.

This is measured in seconds.
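A minimal sketch of the pattern, using the standard library's `queue.Queue`: a blocking `get` with a timeout raises `queue.Empty` instead of hanging forever. The wrapper name `getOrCrashSketch` is hypothetical, and converting the exception into a `RuntimeError` is just one way to make the failure loud.

```python
import queue

QUEUE_TIMEOUT = 240  # Seconds, as documented above.


def getOrCrashSketch(q: queue.Queue, timeout: float = QUEUE_TIMEOUT):
    """Hypothetical wrapper: fetch an item, or fail if the queue stays empty."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty as e:
        raise RuntimeError(
            f"Queue blocked for more than {timeout} seconds.") from e


q = queue.Queue()
q.put("batch")
first = getOrCrashSketch(q)
```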

bpreveal.internal.constants.GLOBAL_TENSORFLOW_LOADED: bool = False

Has Tensorflow been loaded in this process?

This gets set to True if you use any of the tensorflow-importing functions in this file. If you import tensorflow in a parent process, child processes will not be able to use tensorflow, because tensorflow is dumb like that. Tools like the easy functions and the threaded batcher check to see if Tensorflow has been loaded in the parent process before they spawn children.

bpreveal.internal.constants.GENOME_NUCLEOTIDE_FREQUENCY: dict[str, list[float]] = {'danRer11': [0.316952, 0.183272, 0.183253, 0.31652], 'dm6': [0.290034, 0.210142, 0.209919, 0.289903], 'hg38': [0.295182, 0.203906, 0.204783, 0.296127], 'mm10': [0.291497, 0.208327, 0.208343, 0.291831], 'sacCer3': [0.309806, 0.190882, 0.190596, 0.308714]}

The frequency of A, C, G, and T (in that order) in common reference genomes.

bpreveal.internal.constants.setTensorflowLoaded()

Call this when you first load tensorflow.

bpreveal.internal.constants.getTensorflowLoaded()

Returns True if this process has ever loaded tensorflow.

Return type:

bool
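The flag, setter and getter described above amount to a process-wide marker that is checked before spawning children. The sketch below shows that pattern in isolation. All names are hypothetical, the state lives in a small dictionary rather than a module-level bool, and raising an error in the check is an assumption about what a caller might do.

```python
# Hypothetical stand-in for the module-level flag; starts out False.
_STATE = {"tensorflowLoaded": False}


def setLoadedSketch() -> None:
    """Record that tensorflow has been imported in this process."""
    _STATE["tensorflowLoaded"] = True


def getLoadedSketch() -> bool:
    """Report whether this process has ever imported tensorflow."""
    return _STATE["tensorflowLoaded"]


def checkBeforeSpawnSketch() -> None:
    """Refuse to spawn children once the parent has loaded tensorflow."""
    if getLoadedSketch():
        raise RuntimeError(
            "Tensorflow is already loaded in the parent process.")


checkBeforeSpawnSketch()  # Passes: nothing has been loaded yet.
loadedBefore = getLoadedSketch()
setLoadedSketch()
loadedAfter = getLoadedSketch()
```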