Modules

emase Package

AlignmentMatrixFactory Module

class emase.AlignmentMatrixFactory.AlignmentMatrixFactory(alnfile)[source]
cleanup()[source]
prepare(haplotypes, loci, delim='_', outdir=None)[source]
produce(h5file, title='Alignments', index_dtype='uint32', data_dtype=<type 'float'>, complib='zlib', incidence_only=True)[source]

Sparse3DMatrix Module

class emase.Sparse3DMatrix.Sparse3DMatrix(other=None, h5file=None, datanode='/', shape=None, dtype=<type 'float'>)[source]

3-dim sparse matrix designed for “pooled” RNA-seq alignments

add(addend_mat, axis=1)[source]

In-place addition

Parameters:
  • addend_mat – A matrix to be added to the Sparse3DMatrix object
  • axis – The dimension along which addend_mat is added
Returns:

Nothing (as it performs in-place operations)

add_value(lid, hid, rid, value)[source]
combine(other)[source]
copy()[source]
finalize()[source]
get_cross_section(index, axis=0)[source]
multiply(multiplier, axis=None)[source]

In-place multiplication

Parameters:
  • multiplier – A matrix or vector to be multiplied
  • axis – The dimension along which ‘multiplier’ is multiplied
Returns:

Nothing (as it performs in-place operations)

reset()[source]
save(h5file, title=None, index_dtype='uint32', data_dtype=<type 'float'>, incidence_only=True, complib='zlib')[source]
set_value(lid, hid, rid, value)[source]
sum(axis=2)[source]
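
To illustrate the idea behind set_value, add_value and sum, here is a minimal toy sketch of a 3-dim sparse store. It is not the emase implementation: the class name ToySparse3D, the dictionary-of-keys storage, and the axis order (0 = locus, 1 = haplotype, 2 = read, guessed from the lid, hid, rid argument order) are all assumptions made for illustration.

```python
from collections import defaultdict


class ToySparse3D:
    """Toy dictionary-of-keys store (hypothetical; NOT the emase class)."""

    def __init__(self):
        # Only nonzero cells are stored, keyed by (lid, hid, rid).
        self.cells = defaultdict(float)

    def set_value(self, lid, hid, rid, value):
        # Overwrite the cell.
        self.cells[(lid, hid, rid)] = value

    def add_value(self, lid, hid, rid, value):
        # Accumulate into the cell (a missing cell starts at 0.0).
        self.cells[(lid, hid, rid)] += value

    def sum(self, axis=2):
        # Collapse one axis; assumed order: 0=locus, 1=haplotype, 2=read.
        out = defaultdict(float)
        for key, value in self.cells.items():
            reduced = tuple(k for i, k in enumerate(key) if i != axis)
            out[reduced] += value
        return dict(out)


m = ToySparse3D()
m.set_value(0, 0, 0, 1.0)  # read 0 aligns to locus 0, haplotype 0
m.set_value(0, 1, 0, 1.0)  # read 0 also aligns to haplotype 1 of locus 0
m.add_value(0, 0, 1, 1.0)  # read 1 aligns only to locus 0, haplotype 0
per_locus_haplotype = m.sum(axis=2)  # collapse the read axis
```

Summing over the read axis in this toy gives the number of alignments per (locus, haplotype) pair.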

AlignmentPropertyMatrix Module

class emase.AlignmentPropertyMatrix.AlignmentPropertyMatrix(other=None, h5file=None, datanode='/', metanode='/', shallow=False, shape=None, dtype=<type 'float'>, haplotype_names=None, locus_names=None, read_names=None, grpfile=None)[source]

Bases: emase.Sparse3DMatrix.Sparse3DMatrix

Axis

alias of Enum

bundle(reset=False, shallow=False)[source]

Returns an AlignmentPropertyMatrix object in which loci are bundled using the grouping information.

Parameters:
  • reset – whether to reset the values at the loci
  • shallow – whether to copy all the meta data
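
As a rough picture of what bundling loci by group means, the sketch below sums per-isoform values into per-gene values. The function bundle_loci, the plain-dict inputs, and the sample names are hypothetical; the actual bundle method operates on the sparse matrix and on groups loaded through load_groups.

```python
from collections import defaultdict


def bundle_loci(values, groups):
    """Sum per-locus values into per-group values (illustrative only).

    values: dict mapping locus name -> value
    groups: dict mapping group name -> list of member locus names
    """
    # Invert the grouping so each locus can look up its group.
    locus_to_group = {
        locus: group for group, members in groups.items() for locus in members
    }
    bundled = defaultdict(float)
    for locus, value in values.items():
        bundled[locus_to_group[locus]] += value
    return dict(bundled)


isoform_counts = {"tx1": 3.0, "tx2": 1.0, "tx3": 5.0}
gene_groups = {"geneA": ["tx1", "tx2"], "geneB": ["tx3"]}
gene_counts = bundle_loci(isoform_counts, gene_groups)
```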
combine(other, shallow=False)[source]
copy(shallow=False)[source]
count_alignments()[source]
count_unique_reads(ignore_haplotype=False)[source]
get_read_data(rid)[source]
get_reads_aligned_to_locus(lid, hid=None)[source]
get_unique_reads(ignore_haplotype=False, shallow=False)[source]

Pull out alignments of uniquely-aligning reads

Parameters:
  • ignore_haplotype – whether to regard an allelic multiread as a uniquely-aligning read
  • shallow – whether to copy only the sparse 3D matrix
Returns:

a new AlignmentPropertyMatrix object containing only the alignments of the uniquely-aligning reads
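
The effect of ignore_haplotype in get_unique_reads can be sketched as follows. This is an illustration, not the library code: unique_read_mask is a hypothetical helper, and alignments are assumed to be (lid, hid, rid) triples. A read that hits several haplotypes of a single locus (an allelic multiread) counts as unique only when ignore_haplotype is true.

```python
from collections import defaultdict


def unique_read_mask(alignments, num_reads, ignore_haplotype=False):
    """Return one bool per read: True if the read aligns to a single target."""
    targets = defaultdict(set)
    for lid, hid, rid in alignments:
        # With ignore_haplotype, a target is a locus; otherwise it is a
        # (locus, haplotype) pair.
        targets[rid].add(lid if ignore_haplotype else (lid, hid))
    return [len(targets[rid]) == 1 for rid in range(num_reads)]


alignments = [
    (0, 0, 0), (0, 1, 0),  # read 0: one locus, two haplotypes
    (1, 0, 1),             # read 1: one locus, one haplotype
    (0, 0, 2), (2, 0, 2),  # read 2: two different loci
]
strict = unique_read_mask(alignments, num_reads=3)
relaxed = unique_read_mask(alignments, num_reads=3, ignore_haplotype=True)
```

A boolean mask of this kind has the shape that pull_alignments_from expects for its reads_to_use argument, although there it must be a numpy array of dtype=bool.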

load_groups(grpfile)
normalize_reads(axis, grouping_mat=None)[source]

Read-wise normalization

Parameters:
  • axis – The dimension along which we want to normalize values
  • grouping_mat – An incidence matrix that specifies which isoforms are from the same gene
Returns:

Nothing (as the method performs in-place operations)

Return type:

None
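
A minimal sketch of read-wise normalization, assuming the simplest case in which each read's values are rescaled to sum to 1 across all of its alignments. The helper normalize_per_read and the dict keyed by (lid, hid, rid) are hypothetical; the actual method also supports other axes and a grouping matrix, which this toy does not model.

```python
from collections import defaultdict


def normalize_per_read(cells):
    """Rescale values in place so that each read's values sum to 1."""
    totals = defaultdict(float)
    for (_lid, _hid, rid), value in cells.items():
        totals[rid] += value
    # Values are updated in place; no keys are added or removed.
    for key in cells:
        total = totals[key[2]]
        if total > 0:
            cells[key] /= total


cells = {(0, 0, 0): 1.0, (0, 1, 0): 3.0, (1, 0, 1): 2.0}
normalize_per_read(cells)  # returns None, mirroring the in-place convention
```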

print_read(rid)[source]

Prints the nonzero rows for the requested read

pull_alignments_from(reads_to_use, shallow=False)[source]

Pull out alignments of certain reads

Parameters:
  • reads_to_use – numpy array of dtype=bool specifying which reads to use
  • shallow – whether to copy only the sparse 3D matrix
Returns:

a new AlignmentPropertyMatrix object containing only the alignments of the selected reads

report_alignment_counts(filename)[source]
save(h5file, title=None, index_dtype='uint32', data_dtype=<type 'float'>, incidence_only=True, complib='zlib', shallow=False)[source]
sum(axis)[source]
emase.AlignmentPropertyMatrix.enum(**enums)[source]

EMfactory Module

class emase.EMfactory.EMfactory(alignments)[source]

A class that coordinates Expectation-Maximization

export_posterior_probability(filename, title='Posterior Probability')[source]

Writes the posterior probability of read origin

Parameters:
  • filename – File name for output
  • title – The title of the posterior probability matrix
Returns:

Nothing; the method writes a file in EMASE format (PyTables)

get_allelic_expression(at_group_level=False)[source]
prepare(pseudocount=0.0, lenfile=None, read_length=100)[source]

Initializes the probability of read origin according to the alignment profile

Parameters:
  • pseudocount – Uniform prior for allele specificity estimation
Returns:

Nothing (as it performs in-place operations)

report_depths(filename, tpm=True, grp_wise=False, reorder='as-is', notes=None)[source]

Exports expected depths

Parameters:
  • filename – File name for output
  • grp_wise – whether the report is at isoform level or gene level
  • reorder – sort order of the report: ‘decreasing’, ‘increasing’, or ‘as-is’
Returns:

Nothing; the method writes a file
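
The tpm and reorder options can be pictured with the sketch below. It assumes that tpm=True means depths are rescaled so they sum to one million (the usual transcripts-per-million convention) and that reorder sorts rows by value. Both helpers, to_tpm and reorder_rows, are hypothetical and say nothing about how report_depths is actually implemented.

```python
def to_tpm(depths):
    """Rescale depths so they sum to one million (assumed TPM convention)."""
    total = sum(depths.values())
    if total == 0:
        return {name: 0.0 for name in depths}
    return {name: value / total * 1e6 for name, value in depths.items()}


def reorder_rows(values, reorder="as-is"):
    """Return (name, value) rows in the requested order."""
    if reorder == "as-is":
        return list(values.items())
    if reorder not in ("decreasing", "increasing"):
        raise ValueError("reorder must be 'as-is', 'decreasing' or 'increasing'")
    return sorted(
        values.items(),
        key=lambda row: row[1],
        reverse=(reorder == "decreasing"),
    )


depths = {"tx1": 1.0, "tx2": 3.0}
tpm = to_tpm(depths)
rows = reorder_rows(tpm, reorder="decreasing")
```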

report_read_counts(filename, grp_wise=False, reorder='as-is', notes=None)[source]

Exports expected read counts

Parameters:
  • filename – File name for output
  • grp_wise – whether the report is at isoform level or gene level
  • reorder – sort order of the report: ‘decreasing’, ‘increasing’, or ‘as-is’
Returns:

Nothing; the method writes a file

reset(pseudocount=0.0)[source]

Initializes the probability of read origin according to the alignment profile

Parameters:
  • pseudocount – Uniform prior for allele specificity estimation
Returns:

Nothing (as it performs in-place operations)

run(model, tol=0.001, max_iters=999, verbose=True)[source]

Runs EM iterations

Parameters:
  • model – Normalization model (1: Gene->Allele->Isoform, 2: Gene->Isoform->Allele, 3: Gene->Isoform*Allele, 4: Gene*Isoform*Allele)
  • tol – Tolerance for termination
  • max_iters – Maximum number of iterations until termination
  • verbose – Display information on how EM is running
Returns:

Nothing (as it performs in-place operations)
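
How tol and max_iters interact can be seen in a generic EM loop for assigning multi-mapping reads to targets. This is a textbook-style sketch, not the EMfactory code: run_em, the flat target list, the uniform starting point, and the sum-of-absolute-changes stopping rule are all assumptions, and none of the four hierarchical normalization models is reproduced here.

```python
def run_em(compat, num_targets, tol=0.001, max_iters=999):
    """Toy EM: estimate target proportions from read compatibility lists.

    compat: one list of compatible target indices per read.
    Returns (proportions, number_of_iterations_run).
    """
    if not any(compat):
        raise ValueError("at least one read with an alignment is required")
    theta = [1.0 / num_targets] * num_targets  # uniform starting point
    iteration = 0
    for iteration in range(1, max_iters + 1):
        # E-step: split each read across its targets in proportion to theta.
        counts = [0.0] * num_targets
        for targets in compat:
            denom = sum(theta[t] for t in targets)
            for t in targets:
                counts[t] += theta[t] / denom
        # M-step: re-estimate proportions from the expected counts.
        total = sum(counts)
        new_theta = [c / total for c in counts]
        change = sum(abs(a - b) for a, b in zip(new_theta, theta))
        theta = new_theta
        if change < tol:  # assumed convergence criterion
            break
    return theta, iteration


# Two reads unique to target 0, one shared read, one read unique to target 1.
reads = [[0], [0], [0, 1], [1]]
theta, n_iters = run_em(reads, num_targets=2)
```

In this toy the shared read is gradually shifted toward target 0, which has more uniquely-aligning reads, and the loop stops well before max_iters once the change drops below tol.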

update_allelic_expression(model=3)[source]

A single EM step: updates the probability at read level and then re-estimates allele-specific expression

Parameters:
  • model – Normalization model (1: Gene->Allele->Isoform, 2: Gene->Isoform->Allele, 3: Gene->Isoform*Allele, 4: Gene*Isoform*Allele)
Returns:

Nothing (as it performs in-place operations)

update_probability_at_read_level(model=3)[source]

Updates the probability of read origin at read level

Parameters:
  • model – Normalization model (1: Gene->Allele->Isoform, 2: Gene->Isoform->Allele, 3: Gene->Isoform*Allele, 4: Gene*Isoform*Allele)
Returns:

Nothing (as it performs in-place operations)