Authors: Hannah Kullik & Martin Urban
- Description
- Dev-Environment
- Pre-commit hook
- Linting
- Type annotation
- Imports
- Exception handling
- Communication
- Threading
- Database
- Terminology
- Code formatting
- Code documentation
Python is the main programming language of the PySSA project. This document describes the rules which must be followed if the source code gets extended.
The conda environment used for the development must be created through an environment.yaml file from one of the authors. This ensures that the development environment is reproducible.
Before any changes are committed, a custom pre-commit hook runs automatically. You must resolve all reported issues before committing any changes!
You can choose to run the pre-commit hook configuration by yourself beforehand by running the command:
pre-commit run --all-files
All available pre-commit hooks are listed here: https://pre-commit.com/hooks.html. Adding a pre-commit hook requires the approval of one of the authors.
To be able to run the pre-commit hooks on Windows, it may be necessary to update the SSL library. To do this, download this OpenSSL Installer.
You have to run ruff over your code to check for static errors. The configuration to use is defined in the pyproject.toml.
Python is a dynamically typed language, but in this project Python is used as a statically typed language. The decision emphasizes robust and less error-prone code. Therefore, you have to use Python's type annotation feature.
Annotate variables using Python builtins where possible.
i: int = 0
Annotate variables using pyssa types where data structures of pyssa are used.
protein_pairs_for_analysis: list['protein_pair.ProteinPair'] = []
Annotate variables using library types where data types of libraries are used.
import numpy as np
distances_of_amino_acid_pairs: np.ndarray = np.array([])  # np.array([]) creates an empty 1-D array
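The same rule applies to function signatures. A minimal sketch, using only Python builtins; the function name and its argument are hypothetical and not part of the PySSA codebase:

```python
def calculate_mean_distance(the_distances: list[float]) -> float:
    """Returns the arithmetic mean of the given distances (hypothetical helper)."""
    tmp_sum: float = sum(the_distances)
    return tmp_sum / len(the_distances)


# Both the argument and the return value are annotated.
distances: list[float] = [1.0, 2.0, 3.0]
mean_distance: float = calculate_mean_distance(distances)
```

With annotations on every argument and return value, a static checker can flag a call such as calculate_mean_distance("abc") before the code is run.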
If a function/method has a return value that will not be used, the function call needs to be wrapped inside the rvoid function. The rvoid function is the only function which gets imported as a function and not as a module:
from pyssa.util.void import rvoid
from pyssa.util import main_window_util
# rvoid indicates that there is a return value but it is not used
rvoid(main_window_util.setup_app_settings(self.app_settings))
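The implementation of rvoid is not shown in this document. A minimal sketch of what such a helper could look like, assuming it only accepts a return value and discards it; setup_app_settings below is a hypothetical stand-in, not the PySSA function:

```python
from typing import Any


def rvoid(a_return_value: Any) -> None:
    """Accepts any return value and discards it (assumed behaviour, not the PySSA source)."""
    return None


def setup_app_settings(the_app_settings: dict) -> dict:
    """Hypothetical stand-in for a function whose return value is not needed."""
    the_app_settings["is_setup"] = True
    return the_app_settings


app_settings: dict = {}
rvoid(setup_app_settings(app_settings))  # the returned dict is intentionally ignored
```

The wrapper does not change behaviour; it only makes the decision to ignore a return value visible to the reader.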
- Package: snake_case
- Module: snake_case
- Class: PascalCase
- Method: snake_case
  - private: _ prefix (single underscore)
    def _create_directory_structure(self) -> None:
- Function: snake_case
- Variable: snake_case
  - argument: a/an_var_name, if no specific variable is meant.
    def export_protein_as_pdb_file(a_filepath: str) -> None:
  - argument: the_var_name, if a specific variable is meant.
    def load_settings(the_app_settings: 'settings.Settings') -> None:
  - method/function scope: tmp_ prefix
    ... tmp_destination_filepath: str = "/home/rhel_user/scratch/log.txt" ...
- Global variable: g_ prefix + snake_case
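The naming rules above can be seen together in one short sketch. The class, its methods and the global variable are hypothetical and only illustrate the conventions:

```python
g_default_dir_name: str = "scratch"  # global variable: g_ prefix + snake_case


class ProteinExporter:  # class: PascalCase
    """Hypothetical class that only illustrates the naming rules."""

    def __init__(self, the_workspace_path: str) -> None:  # specific argument: the_ prefix
        self.workspace_path: str = the_workspace_path

    def build_export_filepath(self, a_filename: str) -> str:  # unspecific argument: a_ prefix
        """Builds the filepath for an exported file."""
        tmp_dir_path: str = self._create_dir_path()  # local variable: tmp_ prefix
        return f"{tmp_dir_path}/{a_filename}"

    def _create_dir_path(self) -> str:  # private method: single underscore prefix
        """Joins the workspace path and the global default directory name."""
        return f"{self.workspace_path}/{g_default_dir_name}"


export_filepath: str = ProteinExporter("/home/rhel_user").build_export_filepath("protein.pdb")
```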
Never use wildcard imports. Always import the module, not the class or function itself.
from pymol import cmd # Correct: Module is imported
from pymol import * # Wrong! Wildcard import
from os.path import exists # Wrong! Function/Class import
Use the conventional abbreviations for common Python libraries.
import numpy as np
import pandas as pd
Always check for None:
def copy_fasta_file(a_source_filepath, a_destination_filepath):
    if a_source_filepath is None:
        logger.error(f"The argument 'a_source_filepath' is illegal: {a_source_filepath}!")
        raise exception.IllegalArgumentError("An argument is illegal.")
    if a_destination_filepath is None:
        logger.error(f"The argument 'a_destination_filepath' is illegal: {a_destination_filepath}!")
        raise exception.IllegalArgumentError("An argument is illegal.")
Raise IllegalArgumentError if the unmodified argument is not usable for the function/method:
import os
import pathlib
def copy_fasta_file(a_source_filepath: pathlib.Path, a_destination_filepath: pathlib.Path):
    ...
    if not os.path.exists(a_source_filepath):  # argument is unmodified
        raise exception.IllegalArgumentError("An argument is illegal.")
Raise a custom exception if a modified version of the argument is not usable for the function/method:
import os
import pathlib
def copy_fasta_file(a_source_filepath: pathlib.Path, a_destination_filepath: pathlib.Path):
    ...
    if not os.path.exists(a_source_filepath.parent):  # .parent is a modified version of the argument
        raise exception.DirectoryNotFoundError("")
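The three checks described above can be combined in one function. This is a minimal sketch: the two exception classes are hypothetical stand-ins for the project's own exceptions, and check_fasta_filepath is an illustrative name, not a PySSA function.

```python
import logging
import pathlib

logger = logging.getLogger(__name__)


class IllegalArgumentError(Exception):
    """Hypothetical stand-in for the project's exception of the same name."""


class DirectoryNotFoundError(Exception):
    """Hypothetical stand-in for the project's exception of the same name."""


def check_fasta_filepath(a_source_filepath: pathlib.Path) -> None:
    """Raises an exception if the given filepath cannot be used."""
    # The argument must not be None.
    if a_source_filepath is None:
        logger.error(f"The argument 'a_source_filepath' is illegal: {a_source_filepath}!")
        raise IllegalArgumentError("An argument is illegal.")
    # A modified version of the argument (.parent) is unusable: custom exception.
    if not a_source_filepath.parent.exists():
        raise DirectoryNotFoundError(f"Directory not found: {a_source_filepath.parent}")
    # The unmodified argument is unusable: IllegalArgumentError.
    if not a_source_filepath.exists():
        raise IllegalArgumentError("An argument is illegal.")
```

The function returns None when every check passes, so a caller only has to handle the two exception types.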
Always wrap cmd commands of the PyMOL API into a try-except block.
import pymol
from pymol import cmd
try:
    cmd.scene(f"{tmp_protein_pair.protein_1.get_molecule_object()}"
              f"{tmp_protein_pair.protein_2.get_molecule_object()}",
              action="recall")
except pymol.CmdException:
    logger.error("...")
    raise ...
The communication between any QMainWindow and QDialog is done with signals and slots. This ensures that no unauthorized memory access violations occur.
- Define a custom pyqtSignal in the QDialog class:
...
class DialogAddModel(Qt.QtWidgets.QDialog):
    """Class for a dialog to add proteins to a project."""
    """
    A pyqtSignal that is used to hand over the protein structure information.
    """
    return_value = pyqtSignal(tuple)  # this is a custom PyQt signal
...
- Emit the signal where communication should occur.
...
def add_model(self) -> None:
    """Emits a custom pyqtSignal and closes the dialog."""
    self.return_value.emit((self.ui.txt_add_protein.text(), True))
    self.close()
...
- Connect the signal in the QMainWindow with the QDialog object and the slot function:
...
def add_existing_protein(self) -> None:
    """Opens a dialog to add an existing protein structure to the project."""
    self.tmp_dialog = dialog_add_model.DialogAddModel()
    self.tmp_dialog.return_value.connect(self.post_add_existing_protein)  # here is the connection
    self.tmp_dialog.show()
...
- Make sure that the slot function takes the value of the signal as a function argument:
...
def post_add_existing_protein(self, return_value: tuple):  # in this case the value is a tuple
    ...
Within PySSA, the custom Task class is used whenever multithreading is necessary for the presenter. The Task class is in the pyssa.internal.thread.tasks module. Do NOT use the _Action class directly; only use the Task class!
...
def opens_project(self):
    """Initiates the task to open an existing project."""
    self._active_task = tasks.LegacyTask(self.__async_open_project, post_func=self.__await_open_project)
    self._active_task.start()

def __async_open_project(self) -> tuple:
    """Runs in the separate QThread and does CPU-bound work."""
    tmp_project_path = pathlib.Path(f"{self._workspace_path}/{self._view.ui.txt_open_selected_project.text()}")
    return ("result", project.Project.deserialize_project(tmp_project_path, self._application_settings))

def __await_open_project(self, a_result: tuple):
    """Runs after the QThread finished."""
    ...
The Task class gets an "async" function and optionally an "await" function. The name of the function that runs in the QThread must start with __async (double underscore). The name of the function that runs after the QThread has finished must start with __await.
This design decision is based on intuition: the __async function runs asynchronously in the QThread, and the __await function waits for the QThread (the __async function) to finish.
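The pairing of an "async" function with an "await" function can be sketched without Qt. The SketchTask class below is a hypothetical stand-in built on the standard threading module, not the PySSA Task class. One difference to keep in mind: here the "await" function is called from the worker thread itself, whereas in a Qt application it would normally be triggered through a signal.

```python
import threading
from typing import Callable, Optional


class SketchTask:
    """Hypothetical stand-in for the Task class, built on threading instead of QThread."""

    def __init__(self, an_async_func: Callable[[], tuple],
                 post_func: Optional[Callable[[tuple], None]] = None) -> None:
        self._async_func = an_async_func
        self._post_func = post_func
        self._thread = threading.Thread(target=self._run)

    def _run(self) -> None:
        tmp_result: tuple = self._async_func()  # runs in the separate thread
        if self._post_func is not None:
            self._post_func(tmp_result)  # called only after the async function returned

    def start(self) -> None:
        self._thread.start()

    def wait(self) -> None:
        """Blocks until the thread has finished (hypothetical helper for this sketch)."""
        self._thread.join()


class SketchPresenter:
    """Hypothetical presenter that follows the __async/__await naming rule."""

    def __init__(self) -> None:
        self.result: Optional[tuple] = None
        self._active_task: Optional[SketchTask] = None

    def open_project(self) -> None:
        self._active_task = SketchTask(self.__async_open_project, post_func=self.__await_open_project)
        self._active_task.start()

    def wait_for_task(self) -> None:
        if self._active_task is not None:
            self._active_task.wait()

    def __async_open_project(self) -> tuple:
        return ("result", "project data")

    def __await_open_project(self, a_result: tuple) -> None:
        self.result = a_result
```

Calling open_project() followed by wait_for_task() leaves the tuple returned by the __async function in the result attribute of the presenter.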
PySSA uses a SQLite database for every single project. The interaction is managed through the DatabaseManager class.
The interaction with the manager from a controller is done through the DatabaseThread class. The DatabaseThread has a queue which accepts objects of the type DatabaseOperation.
To run an INSERT statement from a controller, you have to create a DatabaseOperation object with the SQLQueryType (in this case INSERT_...) and put it into the queue of the DatabaseThread. The following example shows the same procedure for a DELETE statement:
def _delete_protein(self):
    """Deletes an existing protein from the project."""
    tmp_protein: "protein.Protein" = self._view.ui.proteins_tree_view.currentIndex().data(enums.ModelEnum.OBJECT_ROLE)
    # Below is the creation of the DatabaseOperation object
    tmp_database_operation = database_operation.DatabaseOperation(enums.SQLQueryType.DELETE_EXISTING_PROTEIN,
                                                                  (0, tmp_protein.get_id()))
    # Here the DatabaseOperation object will be put into the queue of the DatabaseThread
    self._database_thread.put_database_operation_into_queue(tmp_database_operation)
    # -- The rest of the function
    self._interface_manager.get_current_project().delete_specific_protein(tmp_protein.get_molecule_object())
    self._interface_manager.refresh_protein_model()
    self._interface_manager.refresh_main_view()
Every SQL statement has to be implemented in the DatabaseManager class!
For the DatabaseThread class to work properly, every SQL statement of the database manager has to be wrapped in a wrapper function, and this function has to be mapped to an appropriate SQLQueryType enum value.
An example of a wrapper function:
@staticmethod
def __wrapper_delete_existing_protein(the_db_manager, the_buffered_data: tuple):
    # It is important to unpack the first element of the tuple with an _ !
    _, tmp_protein_id = the_buffered_data
    the_db_manager.delete_existing_protein(tmp_protein_id)
An example of the mapping process:
def _setup_operations_mapping(self):
    self._operations_mapping = {
        enums.SQLQueryType.INSERT_NEW_PROTEIN: self.__wrapper_insert_new_protein,
        enums.SQLQueryType.DELETE_EXISTING_PROTEIN: self.__wrapper_delete_existing_protein,
    }
And the SQLQueryType enum class:
class SQLQueryType(enum.Enum):
    """An enum for all possible sql queries for the database thread."""
    INSERT_NEW_PROTEIN = 'insert_new_protein'
    DELETE_EXISTING_PROTEIN = 'delete_existing_protein'
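How the queue, the wrapper functions and the mapping work together can be sketched with the standard library alone. Everything below is a simplified, hypothetical stand-in: SketchDatabaseThread is not the PySSA DatabaseThread, the Protein table layout and the CLOSE sentinel are assumptions, and the queue takes plain tuples instead of DatabaseOperation objects.

```python
import enum
import queue
import sqlite3
import threading


class SQLQueryType(enum.Enum):
    INSERT_NEW_PROTEIN = "insert_new_protein"
    DELETE_EXISTING_PROTEIN = "delete_existing_protein"
    CLOSE = "close"  # hypothetical sentinel that stops the worker


class SketchDatabaseThread:
    """Hypothetical, simplified stand-in for the DatabaseThread class."""

    def __init__(self, a_database_filepath: str) -> None:
        self._database_filepath = a_database_filepath
        self._queue: queue.Queue = queue.Queue()
        # Maps every query type to the wrapper function that executes it.
        self._operations_mapping = {
            SQLQueryType.INSERT_NEW_PROTEIN: self._wrapper_insert_new_protein,
            SQLQueryType.DELETE_EXISTING_PROTEIN: self._wrapper_delete_existing_protein,
        }
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def put_database_operation_into_queue(self, a_query_type: SQLQueryType,
                                          the_buffered_data: tuple = ()) -> None:
        self._queue.put((a_query_type, the_buffered_data))

    def join(self) -> None:
        self._thread.join()

    def _run(self) -> None:
        # The connection is created in the worker thread because sqlite3
        # connections may, by default, only be used in their creating thread.
        tmp_connection = sqlite3.connect(self._database_filepath)
        tmp_connection.execute("CREATE TABLE IF NOT EXISTS Protein (id INTEGER PRIMARY KEY, name TEXT)")
        while True:
            tmp_query_type, tmp_data = self._queue.get()
            if tmp_query_type is SQLQueryType.CLOSE:
                break
            self._operations_mapping[tmp_query_type](tmp_connection, tmp_data)
            tmp_connection.commit()
        tmp_connection.close()

    @staticmethod
    def _wrapper_insert_new_protein(the_connection: sqlite3.Connection, the_buffered_data: tuple) -> None:
        _, tmp_protein_id, tmp_name = the_buffered_data
        the_connection.execute("INSERT INTO Protein (id, name) VALUES (?, ?)", (tmp_protein_id, tmp_name))

    @staticmethod
    def _wrapper_delete_existing_protein(the_connection: sqlite3.Connection, the_buffered_data: tuple) -> None:
        _, tmp_protein_id = the_buffered_data
        the_connection.execute("DELETE FROM Protein WHERE id = ?", (tmp_protein_id,))
```

Because all statements pass through one queue and one worker thread, they are executed in the order in which the controller submitted them.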
- Always use path if a directory path is meant.
- Always use dir if a directory name is meant.
- Always use filepath if an absolute path to a file is meant.
- Always use file if a name of a file is meant.
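The four terms can be illustrated with one filepath. A small sketch using pathlib; PurePosixPath is used here only so that the example behaves the same on every operating system:

```python
import pathlib

# filepath: absolute path to a file
tmp_filepath: pathlib.PurePosixPath = pathlib.PurePosixPath("/home/rhel_user/scratch/log.txt")
# file: name of the file
tmp_file: str = tmp_filepath.name
# path: directory path
tmp_path: pathlib.PurePosixPath = tmp_filepath.parent
# dir: directory name
tmp_dir: str = tmp_path.name
```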
- Add a # TODO if there is a task which needs to be done.
- Add a # fixme if there is an important note which needs to be quickly found.
The overall code formatting is done with the auto-formatter black. This is done automatically when the pre-commit hooks are run.
Always wrap argument checks into an editor-fold (Ctrl+Alt+T) and insert a line break before and after the ending of the editor-fold. Example:
# <editor-fold desc="Checks">
if the_fasta_path is None:
    logger.error("The argument filename is illegal.")
    raise exception.IllegalArgumentError("")

# </editor-fold>
The documentation for the pyssa codebase is built with sphinx. To generate the new documentation, run the following commands from the codebase directory (PySSA/docs/codebase):
sphinx-apidoc -f -o .\source\ ..\..\pyssa\
sphinx-build -M html source/ build/