Classes

This module stores the classes (Hit, Organism, Scaffold, Serializer, Session, Subject) used in cblaster.

class cblaster.classes.Hit(query, subject, identity, coverage, evalue, bitscore)

A BLAST hit identified during a cblaster search.

This class is first instantiated when parsing BLAST results, and is then updated with genomic coordinates after querying either the Identical Protein Groups (IPG) resource on NCBI, or a local JSON database.

query

Name of query sequence.

Type:str
subject

Name of subject sequence.

Type:str
identity

Percentage identity (%) of hit.

Type:float
coverage

Query coverage (%) of hit.

Type:float
evalue

E-value of hit.

Type:float
bitscore

Bitscore of hit.

Type:float
start

Start of subject sequence on corresponding scaffold.

Type:int
end

End of subject sequence on corresponding scaffold.

Type:int
strand

Orientation of subject sequence (‘+’ or ‘-’).

Type:str
copy(**kwargs)

Creates a copy of this Hit with any additional keyword arguments.

classmethod from_dict(d)

Loads class from dict.

to_dict()

Serialises class to dict.

values(decimals=4)

Formats hit attributes for printing.

Parameters:decimals (int) – Total decimal places to show in score values.
Returns:List of formatted attribute strings.
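
As a rough illustration of what values() might produce, the sketch below uses a hypothetical stand-in class (HitSketch, not part of cblaster) and assumes that decimals is applied as a number of significant digits to each score. The real method may round or order the attributes differently.

```python
from dataclasses import dataclass


@dataclass
class HitSketch:
    """Hypothetical stand-in for cblaster.classes.Hit (name and layout assumed)."""

    query: str
    subject: str
    identity: float
    coverage: float
    evalue: float
    bitscore: float

    def values(self, decimals=4):
        """Return the attributes as strings, formatting each score to
        `decimals` significant digits (an assumption about the real method)."""
        scores = (self.identity, self.coverage, self.evalue, self.bitscore)
        return [self.query, self.subject] + [f"{s:.{decimals}g}" for s in scores]


# Placeholder names and scores, for illustration only.
hit = HitSketch("query_1", "subject_1", 98.76543, 100.0, 1e-50, 512.0)
print(hit.values())
```

The general-purpose "g" format keeps very small e-values readable (1e-50 stays 1e-50 instead of collapsing to 0.0), which is why it is used in this sketch.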
class cblaster.classes.Organism(name, strain, scaffolds=None)

A unique organism containing hits found in a cblaster search.

Every strain (or lack thereof) is a unique Organism, and will be reported separately in cblaster results.

name

Organism name, typically the genus and species epithet.

Type:str
strain

Strain name of this organism, e.g. CBS 536.65.

Type:str
scaffolds

Scaffold objects belonging to this organism.

Type:dict
classmethod from_dict(d)

Loads class from dict.

full_name

The full name (including strain) of the organism. Note: if the strain is already present in name, only name is returned.
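
The naming rule for full_name can be sketched as a plain function. This is an assumption-laden illustration, not the cblaster source: the function name, the single-space separator, and the handling of an empty strain are all hypothetical.

```python
def full_name(name, strain):
    """Join name and strain unless the strain already appears in the name.

    Hypothetical helper mirroring the documented behaviour of
    Organism.full_name; an empty or missing strain returns just the name.
    """
    if not strain or strain in name:
        return name
    return f"{name} {strain}"


# "Genus species" is a placeholder organism name.
print(full_name("Genus species", "CBS 536.65"))
print(full_name("Genus species CBS 536.65", "CBS 536.65"))
```

Both calls print the same string, because the second name already contains the strain.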

to_dict()

Serialises class to dict.

total_hit_clusters

Counts the total number of hit clusters in this Organism.

class cblaster.classes.Scaffold(accession, clusters=None, subjects=None)

A genomic scaffold containing hits found in a cblaster search.

accession

Name of this scaffold, typically NCBI accession.

Type:str
hits

Hit objects located on this scaffold.

Type:list
clusters

Clusters of hits identified on this scaffold.

Type:list
classmethod from_dict(d)

Loads class from dict.

to_dict()

Serialises class to dict.

class cblaster.classes.Serializer

JSON serialisation mixin class.

Classes that inherit from this class should implement to_dict and from_dict methods.
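
One way such a mixin could work is sketched below. SerializerSketch and Point are hypothetical names; the sketch assumes that to_json returns a string when no file handle is given and that from_json reads from an open handle, which may differ from the actual cblaster implementation.

```python
import io
import json


class SerializerSketch:
    """Hypothetical JSON mixin: subclasses supply to_dict and from_dict."""

    def to_dict(self):
        raise NotImplementedError

    @classmethod
    def from_dict(cls, d):
        raise NotImplementedError

    def to_json(self, fp=None, **kwargs):
        """Write JSON to fp, or return it as a string if fp is None."""
        d = self.to_dict()
        if fp is None:
            return json.dumps(d, **kwargs)
        json.dump(d, fp, **kwargs)

    @classmethod
    def from_json(cls, js):
        """Build an instance from an open JSON file handle."""
        return cls.from_dict(json.load(js))


class Point(SerializerSketch):
    """Toy subclass used only to exercise the mixin."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def to_dict(self):
        return {"x": self.x, "y": self.y}

    @classmethod
    def from_dict(cls, d):
        return cls(d["x"], d["y"])


# Round trip through an in-memory handle instead of a file on disk.
buffer = io.StringIO()
Point(1, 2).to_json(buffer)
buffer.seek(0)
restored = Point.from_json(buffer)
print(restored.to_dict())
```

Keeping the JSON logic in the mixin means each subclass only has to describe its own dictionary layout.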

classmethod from_dict(d)

Loads class from dict.

classmethod from_json(js)

Instantiates class from JSON handle.

to_dict()

Serialises class to dict.

to_json(fp=None, **kwargs)

Serialises class to JSON.

class cblaster.classes.Session(queries, params, organisms=None)

Stores the state of a cblaster search.

This class stores query proteins, search parameters, Organism objects created during searches, as well as methods for generating summary tables. It can also be dumped to/loaded from JSON for re-filtering, plotting, etc.

>>> s = Session(queries=["query_1"], params={})
>>> with open("session.json", "w") as fp:
...     s.to_json(fp)
>>> with open("session.json") as fp:
...     s2 = Session.from_json(fp)
>>> s == s2
True
queries

Names of query sequences.

Type:list
params

Search parameters.

Type:dict
organisms

Organism objects created in a search.

Type:list
format(form, fp=None, **kwargs)

Generates a summary table.

Parameters:
  • form (str) – Type of table to generate (‘summary’ or ‘binary’).
  • fp (file handle) – File handle to write to.
  • human (bool) – Use human-readable format.
  • headers (bool) – Show table headers.
Raises:

ValueError – form not ‘binary’ or ‘summary’

Returns:

Summary table.

classmethod from_dict(d)

Loads class from dict.

to_dict()

Serialises class to dict.

class cblaster.classes.Subject(hits=None, ipg=None, start=None, end=None, strand=None)

A sequence representing one or more BLAST hits.

This class is instantiated during the contextual lookup stage. It allows a single subject sequence to hit more than one query sequence while each subject is still stored only once.
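
The non-redundancy idea can be illustrated by grouping hits under their subject name, so that a subject matched by several queries is stored once. The function and the tuple layout below are hypothetical simplifications; cblaster's own lookup works with Hit and Subject objects.

```python
from collections import defaultdict


def group_hits_by_subject(hits):
    """Collect the query names that hit each subject.

    `hits` is an iterable of (query, subject) pairs, a simplified
    stand-in for Hit objects. Each subject appears once in the result.
    """
    grouped = defaultdict(list)
    for query, subject in hits:
        grouped[subject].append(query)
    return dict(grouped)


# Placeholder names: subject_1 is hit by two different queries.
hits = [
    ("query_1", "subject_1"),
    ("query_2", "subject_1"),
    ("query_2", "subject_2"),
]
subjects = group_hits_by_subject(hits)
print(subjects)
```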

hits

Hit objects referencing this subject sequence.

Type:list
ipg

NCBI Identical Protein Group (IPG) id.

Type:int
start

Start of sequence on parent scaffold.

Type:int
end

End of sequence on parent scaffold.

Type:int
strand

Strandedness of the sequence (‘+’ or ‘-’).

Type:str
classmethod from_dict(d)

Loads class from dict.

to_dict()

Serialises class to dict.