Validation information in the Protein Data Bank: What is it and why should you care?

Smart, Oliver S.; Horský, Vladimír; Gore, Swanand; Svobodová, Radka; Horská, Veronika; Kleywegt, Gerard J.; Velankar, Sameer

Authors	SMART Oliver S., HORSKÝ Vladimír, GORE Swanand, SVOBODOVÁ VAŘEKOVÁ Radka, BENDOVÁ Veronika, KLEYWEGT Gerard J., VELANKAR Sameer
Year of publication	2018
Type	Appeared in Conference without Proceedings
MU Faculty or unit	Faculty of Science
Description	The widespread availability of biomacromolecular structural data has accelerated research across the life sciences. For example, computer-assisted studies of ligands bound to the active sites of proteins and nucleic acids have become possible, which in turn has aided structure-guided drug discovery and design. Published structures are stored in many databases that have emerged over time, the largest being the Protein Data Bank (PDB). Concerns about the quality of available structures have gone hand in hand with the broad production and use of structures. The curators of the PDB have responded by developing the PDB validation pipeline. Here, we present the available validation metrics and show how their values can be combined into a single score that can be used to rank macromolecular structures and their domains in search results. A major challenge in crystallographic experiments is interpreting electron density at binding sites correctly. Incorrect resolution of this ambiguity is one reason why the quality of ligands in complexes in the PDB is a matter of concern. It is therefore no surprise that several ligand validation methods are part of the PDB validation pipeline. Here, we describe these methods. Furthermore, we discuss how the currently used LLDF metric can give misleading results.
Related projects:	Matematické statistické modelování 2