Bone morphogenetic protein 2 (BMP2) is a highly interesting therapeutic growth factor due to its strong osteogenic/osteoinductive potential. However, its pronounced aggregation tendency makes recombinant production of soluble protein troublesome and complex. While prokaryotic expression systems can provide BMP2 in large amounts, the typically insoluble protein requires complex denaturation-renaturation procedures with medically hazardous reagents to obtain natively folded homodimeric BMP2. Based on a detailed aggregation analysis of wildtype BMP2, we designed a hydrophilic variant of BMP2 that additionally contains an improved heparin binding site (BMP2-2Hep-7M). Stepwise optimization of BMP2-2Hep-7M expression and purification enabled production of soluble dimeric BMP2-2Hep-7M in high yield in E. coli. This was achieved by a) increasing protein hydrophilicity through seven point mutations within aggregation hot spots of wildtype BMP2 and adding a longer N-terminus that results in higher affinity for heparin, b) employing the E. coli strain SHuffle® T7, which enables the structurally essential disulfide-bond formation in BMP2 in the cytoplasm, and c) using soluble expression conditions specific to the BMP2 variant together with L-arginine as a solubility enhancer. The BMP2 variant BMP2-2Hep-7M shows a strongly attenuated, although not completely eliminated, aggregation tendency.
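As a rough illustration of how point mutations can shift overall hydrophilicity, the sketch below computes the grand average of hydropathy (GRAVY) on the Kyte-Doolittle scale before and after substituting residues. It is not the aggregation analysis used in this work: the helper names `gravy` and `apply_point_mutations` are hypothetical, and the sequence is a toy example, not BMP2 or BMP2-2Hep-7M.

```python
# Kyte-Doolittle hydropathy scale; higher values are more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the mean Kyte-Doolittle hydropathy of a one-letter sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def apply_point_mutations(sequence, mutations):
    """Apply (1-based position, new residue) substitutions to a sequence."""
    residues = list(sequence.upper())
    for position, new_residue in mutations:
        residues[position - 1] = new_residue.upper()
    return "".join(residues)


# Toy sequence for illustration only (NOT a BMP2 fragment).
toy_wildtype = "AVLIFGSK"
# Hypothetical substitutions replacing hydrophobic residues with charged ones.
toy_mutant = apply_point_mutations(toy_wildtype, [(2, "K"), (4, "E")])

gravy_wildtype = gravy(toy_wildtype)
gravy_mutant = gravy(toy_mutant)
```

A lower GRAVY value for the mutant indicates a more hydrophilic sequence on average. A global average says nothing about local aggregation hot spots, which require a position-resolved analysis.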
Computational methods for the accurate prediction of protein folding from amino acid sequences have been researched for decades. In recent years, the field has been significantly advanced by deep learning-based approaches such as AlphaFold, RoseTTAFold, and ColabFold. Although the scientific community can use these in various, mostly free and open ways, they are not yet widely used by bench scientists in relevant fields such as protein biochemistry or molecular biology, who are often unfamiliar with software tools such as scripting notebooks, command-line interfaces, or cloud computing. In addition, visual inspection functionalities such as protein structure displays, structure alignments, and specific protein hotspot analyses are required as a second step to interpret the predicted structures and apply them in ongoing research studies.
PySSA (Python rich client for visual protein Sequence to Structure Analysis) is an open Graphical User Interface (GUI) application combining the protein sequence to structure prediction capabilities of ColabFold with the open-source variant of the molecular structure visualisation and analysis system PyMOL to make both available to the scientific end-user. PySSA enables the creation of managed and shareable projects with defined protein structure prediction and corresponding alignment workflows that can be conveniently performed by scientists without specialised computer skills or programming knowledge on their local computers. Thus, PySSA can help make protein structure prediction more accessible for end-users in protein chemistry and molecular biology as well as be used for educational purposes. It is openly available on GitHub, alongside a custom graphical installer executable for the Windows operating system: https://github.com/urban233/PySSA/wiki/Installation-for-Windows-Operating-System.
To demonstrate the capabilities of PySSA, its usage in a protein mutation study on the protein drug Bone Morphogenetic Protein 2 (BMP2) is described: the structure prediction results indicate that the previously reported BMP2-2Hep-7M mutant, which is intended to be less prone to aggregation, does not exhibit significant spatial rearrangements of amino acid residues interacting with the receptor.
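One way to quantify whether residues have rearranged between two predicted structures is to compare per-residue Cα displacements. The following is a minimal sketch under stated assumptions: both structures are already superimposed (for example by a structure alignment in PyMOL), residues are paired one-to-one, and coordinates are given in Å. The function names and the cutoff are hypothetical and do not represent PySSA's API; the coordinates are invented toy values.

```python
import math


def per_residue_displacement(reference, other):
    """Distance between paired C-alpha atoms of two superimposed structures."""
    if len(reference) != len(other):
        raise ValueError("both structures need the same number of residues")
    return [math.dist(a, b) for a, b in zip(reference, other)]


def rmsd(reference, other):
    """Root-mean-square deviation over all paired C-alpha atoms."""
    distances = per_residue_displacement(reference, other)
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def moved_residues(reference, other, cutoff):
    """1-based indices of residues displaced by more than `cutoff`."""
    distances = per_residue_displacement(reference, other)
    return [i + 1 for i, d in enumerate(distances) if d > cutoff]


# Toy C-alpha coordinates (invented values, not taken from any BMP2 model).
toy_wildtype = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_mutant = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 3.0, 4.0)]

displacements = per_residue_displacement(toy_wildtype, toy_mutant)
overall_rmsd = rmsd(toy_wildtype, toy_mutant)
# The 1.0 Å cutoff is an arbitrary illustrative threshold.
flagged = moved_residues(toy_wildtype, toy_mutant, cutoff=1.0)
```

Restricting the coordinate lists to the residues that contact the receptor gives a focused view of the binding interface, where small displacements are consistent with an unchanged interaction geometry.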