Classes
This module defines the classes (Hit, Organism, Scaffold, Serializer, Session, Subject) used in cblaster.
-
class cblaster.classes.Hit(query, subject, identity, coverage, evalue, bitscore)
A BLAST hit identified during a cblaster search.
This class is first instantiated when parsing BLAST results, and is then updated with genomic coordinates after querying either the Identical Protein Groups (IPG) resource on NCBI, or a local JSON database.
-
query
Name of query sequence.
Type: str
-
subject
Name of subject sequence.
Type: str
-
identity
Percentage identity (%) of hit.
Type: float
-
coverage
Query coverage (%) of hit.
Type: float
-
evalue
E-value of hit.
Type: float
-
bitscore
Bitscore of hit.
Type: float
-
start
Start of subject sequence on corresponding scaffold.
Type: int
-
end
End of subject sequence on corresponding scaffold.
Type: int
-
strand
Orientation of subject sequence (‘+’ or ‘-’).
Type: str
-
copy(**kwargs)
Creates a copy of this Hit with any additional args.
-
classmethod from_dict(d)
Loads class from dict.
-
to_dict()
Serialises class to dict.
-
values(decimals=4)
Formats hit attributes for printing.
Parameters: decimals (int) – Total decimal places to show in score values.
Returns: List of formatted attribute strings.
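The Hit interface described above can be illustrated with a small stand-in. This is only a sketch and not cblaster's implementation: the class name HitSketch, the use of a dataclass, the rounding rule in values() and the sample names are all assumptions made for the example.

```python
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass
class HitSketch:
    """Hypothetical stand-in for cblaster.classes.Hit (not the real class)."""

    query: str
    subject: str
    identity: float
    coverage: float
    evalue: float
    bitscore: float
    # Coordinates are unknown when BLAST results are parsed and are
    # filled in later, after the genomic context lookup.
    start: Optional[int] = None
    end: Optional[int] = None
    strand: Optional[str] = None  # "+" or "-"

    def copy(self, **kwargs):
        # Return a new object; keyword arguments override copied fields.
        return replace(self, **kwargs)

    def to_dict(self):
        return asdict(self)

    @classmethod
    def from_dict(cls, d):
        return cls(**d)

    def values(self, decimals=4):
        # Assumption: scores are rounded to `decimals` places, then stringified.
        scores = [self.identity, self.coverage, self.evalue, self.bitscore]
        return [self.query, self.subject] + [str(round(s, decimals)) for s in scores]


# Made-up sample values, for illustration only.
hit = HitSketch("query_1", "subject_1", 85.123456, 99.5, 1e-50, 320.0)
placed = hit.copy(start=100, end=400, strand="+")
restored = HitSketch.from_dict(placed.to_dict())
```

The original object is left untouched by copy(), which mirrors the two-stage life cycle described for Hit: created from BLAST output first, given coordinates later.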
-
-
class cblaster.classes.Organism(name, strain, scaffolds=None)
A unique organism containing hits found in a cblaster search.
Every strain (or lack thereof) is a unique Organism, and will be reported separately in cblaster results.
-
name
Organism name, typically the genus and species epithet.
Type: str
-
strain
Strain name of this organism, e.g. CBS 536.65.
Type: str
-
scaffolds
Scaffold objects belonging to this organism.
Type: dict
-
classmethod from_dict(d)
Loads class from dict.
-
full_name
The full name (including strain) of the organism. Note: if the strain is already present in the name, only the name is returned.
-
to_dict()
Serialises class to dict.
-
total_hit_clusters
Counts the total number of hit clusters in this Organism.
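The full_name rule and the cluster count can be sketched as two properties. This is an illustration under stated assumptions, not the library's code: OrganismSketch is a hypothetical name, the name and strain are joined with a single space, and scaffolds is simplified to a dict mapping an accession to a plain list of clusters rather than to Scaffold objects.

```python
class OrganismSketch:
    """Hypothetical stand-in for cblaster.classes.Organism."""

    def __init__(self, name, strain, scaffolds=None):
        self.name = name
        self.strain = strain
        # Simplification: accession -> list of clusters.
        self.scaffolds = scaffolds if scaffolds is not None else {}

    @property
    def full_name(self):
        # No strain, or strain already embedded in the name: return the name.
        if not self.strain or self.strain in self.name:
            return self.name
        return f"{self.name} {self.strain}"

    @property
    def total_hit_clusters(self):
        # Sum the clusters found on every scaffold of this organism.
        return sum(len(clusters) for clusters in self.scaffolds.values())


# Placeholder organism name and scaffold accessions, for illustration only.
org = OrganismSketch(
    "Genus species",
    "CBS 536.65",
    {"scaffold_1": ["cluster_a", "cluster_b"], "scaffold_2": ["cluster_c"]},
)
```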
-
-
class cblaster.classes.Scaffold(accession, clusters=None, subjects=None)
A genomic scaffold containing hits found in a cblaster search.
-
accession
Name of this scaffold, typically NCBI accession.
Type: str
-
hits
Hit objects located on this scaffold.
Type: list
-
clusters
Clusters of hits identified on this scaffold.
Type: list
-
classmethod from_dict(d)
Loads class from dict.
-
to_dict()
Serialises class to dict.
-
-
class cblaster.classes.Serializer
JSON serialisation mixin class.
Classes that inherit from this class should implement to_dict and from_dict methods.
-
classmethod from_dict(d)
Loads class from dict.
-
classmethod from_json(js)
Instantiates class from JSON handle.
-
to_dict()
Serialises class to dict.
-
to_json(fp=None, **kwargs)
Serialises class to JSON.
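The mixin pattern described for Serializer, where subclasses supply to_dict and from_dict and inherit the JSON methods, might look roughly like the sketch below. The names SerializerSketch and Point are hypothetical, and two behaviours are assumptions rather than documented facts: to_json returns a string when no handle is given, and from_json accepts a string as well as a file handle.

```python
import io
import json


class SerializerSketch:
    """Hypothetical mixin: subclasses implement to_dict and from_dict."""

    def to_dict(self):
        raise NotImplementedError

    @classmethod
    def from_dict(cls, d):
        raise NotImplementedError

    def to_json(self, fp=None, **kwargs):
        # Write to the handle if given; otherwise return a JSON string.
        if fp is not None:
            json.dump(self.to_dict(), fp, **kwargs)
            return None
        return json.dumps(self.to_dict(), **kwargs)

    @classmethod
    def from_json(cls, js):
        # Assumption: accept either a JSON string or an open file handle.
        data = json.loads(js) if isinstance(js, str) else json.load(js)
        return cls.from_dict(data)


class Point(SerializerSketch):
    """Toy subclass used only to exercise the mixin."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def to_dict(self):
        return {"x": self.x, "y": self.y}

    @classmethod
    def from_dict(cls, d):
        return cls(d["x"], d["y"])


# Round trip through an in-memory handle instead of a file on disk.
buffer = io.StringIO()
Point(1, 2).to_json(buffer)
buffer.seek(0)
loaded = Point.from_json(buffer)
```

Keeping the JSON logic in one place means each data class only has to describe its own dict layout.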
-
-
class cblaster.classes.Session(queries, params, organisms=None)
Stores the state of a cblaster search.
This class stores the query proteins, search parameters and Organism objects created during searches, and provides methods for generating summary tables. It can also be dumped to, and loaded from, JSON for re-filtering, plotting, etc.
>>> s = Session()
>>> with open("session.json", "w") as fp:
...     s.to_json(fp)
>>> with open("session.json") as fp:
...     s2 = Session.from_json(fp)
>>> s == s2
True
-
queries
Names of query sequences.
Type: list
-
params
Search parameters.
Type: dict
-
organisms
Organism objects created in a search.
Type: list
-
format(form, fp=None, **kwargs)
Generates a summary table.
Parameters:
- form (str) – Type of table to generate (‘summary’ or ‘binary’).
- fp (file handle) – File handle to write to.
- human (bool) – Use human-readable format.
- headers (bool) – Show table headers.
Raises: ValueError – form not ‘binary’ or ‘summary’.
Returns: Summary table.
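The contract of format (validate form, build the table, optionally write it to a handle, return it) can be sketched as a small dispatcher. Everything below is hypothetical: the function name format_table, the rows argument and both table layouts are invented for illustration and do not reflect cblaster's actual output.

```python
import io


def _summary(rows, headers):
    # Invented layout: one line per organism with its cluster count.
    lines = ["organism\tclusters"] if headers else []
    lines += [f"{name}\t{count}" for name, count in rows]
    return "\n".join(lines)


def _binary(rows, headers):
    # Invented layout: 1 if the organism has any cluster, else 0.
    lines = ["organism\tpresent"] if headers else []
    lines += [f"{name}\t{int(count > 0)}" for name, count in rows]
    return "\n".join(lines)


BUILDERS = {"summary": _summary, "binary": _binary}


def format_table(form, rows, fp=None, headers=True):
    """Build a table of the requested form, write it to fp if given, return it."""
    if form not in BUILDERS:
        raise ValueError("form must be 'summary' or 'binary'")
    table = BUILDERS[form](rows, headers)
    if fp is not None:
        print(table, file=fp)
    return table


# Made-up rows of (organism name, cluster count).
rows = [("organism_a", 2), ("organism_b", 0)]
handle = io.StringIO()
summary = format_table("summary", rows, fp=handle)
```

Looking the builder up in a dict keeps the ValueError check and the list of supported forms in a single place.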
-
classmethod from_dict(d)
Loads class from dict.
-
to_dict()
Serialises class to dict.
-
-
class cblaster.classes.Subject(hits=None, ipg=None, start=None, end=None, strand=None)
A sequence representing one or more BLAST hits.
This class is instantiated during the contextual lookup stage. It allows a single subject sequence to be hit by more than one query sequence while keeping the set of subjects non-redundant.
-
hits
Hit objects referencing this subject sequence.
Type: list
-
ipg
NCBI Identical Protein Group (IPG) id.
Type: int
-
start
Start of sequence on parent scaffold.
Type: int
-
end
End of sequence on parent scaffold.
Type: int
-
strand
Strandedness of the sequence (‘+’ or ‘-’).
Type: str
-
classmethod from_dict(d)
Loads class from dict.
-
to_dict()
Serialises class to dict.
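The non-redundancy idea behind Subject, one object per subject sequence even when several queries hit it, can be shown with a simple grouping step. This is a sketch of the concept only: group_hits_by_subject is a hypothetical helper, hits are reduced to (query, subject) name pairs, and the names are made up.

```python
from collections import defaultdict


def group_hits_by_subject(hits):
    """Map each subject name to the list of queries that hit it."""
    grouped = defaultdict(list)
    for query, subject in hits:
        grouped[subject].append(query)
    return dict(grouped)


# subject_1 is hit by two different queries but appears only once as a key.
hits = [
    ("query_1", "subject_1"),
    ("query_2", "subject_1"),
    ("query_3", "subject_2"),
]
subjects = group_hits_by_subject(hits)
```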
-