Phenix/DivCon: Frequently Asked Questions

Q: Does Phenix/DivCon support multiple QM regions?

A: Yes. Use the qblib_region_selection= command line argument to request multiple regions. DivCon will automatically determine whether the regions overlap (according to the core/buffer size settings) and how to divide the system among the processors requested. See the tutorial for more information.

Q: Can I treat alternative atom conformations with Phenix/DivCon?

A: Yes and no. Alternative atom conformations are currently supported outside the QM region, but within the QM region only the highest-occupancy conformer of each atom is considered. This limitation stems from the QM calculation itself, which needs a single, well-defined set of atomic positions; solutions are being explored.

Q: Is the method able to address metal-containing systems and other exotic elements?

A: Yes, absolutely. The default semiempirical quantum mechanics Hamiltonian, PM6, has been parameterized for 70 elements of the periodic table. This coverage should address the vast majority of structures encountered in biological problems. The only limitation is that the structure must be closed shell, which in practice means that it has an even number of electrons. To enforce this requirement, the automatic perception algorithm employed in DivCon will assume oxidation states that lead to closed-shell structures.
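The even-electron requirement can be checked by hand before a run. The following is a minimal sketch, assuming a simple parity test on the total electron count; it is not DivCon's perception algorithm, and the element table and function names are illustrative only. An even count is the condition named above, but on its own it does not prove that a system is a closed-shell singlet.

```python
# Atomic numbers for a handful of elements; extend the table as needed.
ATOMIC_NUMBER = {
    "H": 1, "C": 6, "N": 7, "O": 8, "S": 16,
    "Fe": 26, "Cu": 29, "Zn": 30,
}


def electron_count(elements, net_charge=0):
    """Total electrons: the sum of atomic numbers minus the net charge."""
    return sum(ATOMIC_NUMBER[symbol] for symbol in elements) - net_charge


def has_even_electron_count(elements, net_charge=0):
    """Parity test only; hypothetical helper, not part of DivCon."""
    return electron_count(elements, net_charge) % 2 == 0


# Water has 1 + 1 + 8 = 10 electrons.
print(electron_count(["H", "H", "O"]))

# A bare iron ion: Fe(III) has 23 electrons (odd), Fe(II) has 24 (even).
print(has_even_electron_count(["Fe"], net_charge=3))
print(has_even_electron_count(["Fe"], net_charge=2))
```

For a metal site, the same count taken over the whole QM region (all atoms, total formal charge) shows which oxidation state assignments are compatible with an even electron count.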

Q: What about covalently-bound ligands? Can Phenix/DivCon really handle them without special parameters?

A: Yes, certainly. This too is a strength of Phenix/DivCon. When both the ligand and the active site (with the bound residue) are included in the main QM region, the ligand, the residue, and the corresponding bond will adopt a proper, lower-energy conformation. As is often observed, the influence of this bond cascades through the ligand, so getting it right affects the entire structure. Further, since QM does not depend on static restraints, if there is any question whether the ligand is actually covalently bound to the protein, you may observe instabilities in the bond (such as the ligand moving away from the protein) that can inform this important decision. Likewise, exploring protonation in such a case with the significantly more accurate functional available in Phenix/DivCon will be crucial.

Q: Do I need to protonate the structure? How can I use DivCon to help in the protonation process?

A: Yes, you will need to protonate the structure prior to refinement. In cases where there are truncated residues or other “open valence” situations, DivCon will attempt to transparently protonate the structure. However, protons are extremely important, especially in biological structures, so the successful practitioner should make sure his or her work is “complete” by double checking key protonation states. Either Phenix tools or MOE can be used for this process. Finally, if there are questionable protonation states, Phenix/DivCon can be used to explore the various states and provide evidence from the experimental data for which one is correct. This has been illustrated in this case study.

Q: I have run protonation, but the protonation states are wrong. What happened?

A: Protonation with advanced, perception-based methods such as those in MOE presupposes that the heavy atom positions are fairly correct. If the input structure is so poor that the perception algorithm in MOE (or a similar package) fails to determine the structure correctly, then it will sometimes add or delete protons incorrectly. The suggested way to address this problem is to double check your work. You are free to run the entire preparation, look at the structure, delete extraneous protons, and then run the refinement. Tutorial #4 is provided specifically to illustrate this problem. If you are unsure of the correct protonation state, use this case study as a guide: run multiple refinements in parallel, then use the resulting data, such as ligand strain and crystallographic metrics, to decide which is correct.

Q: How complete is the functional? Can it really account for electrostatics, hydrogen bonding, polarization and other quantum effects?

A: Yes! Classical or molecular mechanics (MM) has its strengths, but in many cases, these force fields are playing “catch up” to quantum methods. Semiempirical quantum mechanics is able to capture these influences very well even for the relatively fast calculations that are employed during the Phenix/DivCon refinement. Granted, to get a more complete picture of dispersion and hydrogen bonding, higher level ab initio and DFT methods are required; however, these calculations are also much, much more expensive than semiempirical QM methods. For this reason, we believe that we have struck the proper balance between speed and accuracy. Ultimately, compared with stereochemical restraint methods utilized in conventional refinement, the higher level methods found in Phenix/DivCon are far superior.

Q: What about symmetry – has this been added yet?

A: Not quite – but it’s on the way. This functionality is a “critical path item” and it will be added in the next few months. For the time being, you should focus on structures which do not require symmetry within the active site or the QM region. Outside of the QM region, Phenix will perform as it always does with the use of standard, stereochemical restraints.

Q: When should I employ quantum mechanics during the refinement? Should I use QM from the beginning or should I use it after I’m pretty sure of the structure?

A: The answer to this question will vary from problem to problem, but as a general rule, we recommend using QM as early as possible. One of the biggest benefits of the method is that it requires fewer a priori assumptions than conventional methods. Conventional assumptions (e.g. stereochemical restraints) can quickly become biases. Granted, you need to know enough to make fairly accurate protonation predictions, but even with this requirement, Phenix/DivCon can help. The only real requirement for a starting geometry is that it isn’t so “broken” that it causes convergence trouble with the QM calculation. Problems such as these can be handled with the macro_cycle_to_skip= command line option, which allows the standard stereochemical restraints to minimize the structure prior to QM refinement.

Q: Does Phenix/DivCon do anything with the starting geometry, such as re-docking the ligand and other model-building exercises, prior to QM refinement?

A: No, this functionality is beyond the scope of the method. There are several tools that can be used to (re)place the ligand, such as AFIT. A suggested method is to dock the ligand repeatedly using a density-aware docker, and then refine each binding mode in parallel using Phenix/DivCon. Along with each of the new models, Phenix/DivCon also provides information such as the final strain of the ligand and, optionally, the protein:ligand pairwise interaction energy decomposition (PWD). This information can help inform your decision about which of the final poses is correct.
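Once several binding modes have been refined in parallel, the comparison can be organized as a small filter. The following is a minimal sketch, assuming each run has been summarized by hand into two lower-is-better numbers: the ligand strain and a crystallographic fit metric (R-free is used here as an assumed choice). The record type, the function names, and the values are hypothetical placeholders, not Phenix/DivCon output, and the filter narrows the choice without making it for you.

```python
from typing import NamedTuple


class RefinedPose(NamedTuple):
    name: str
    ligand_strain: float  # lower is better; units as reported by the run
    r_free: float         # lower is better


def dominates(a, b):
    """True if a is no worse than b on both metrics and better on at least one."""
    no_worse = a.ligand_strain <= b.ligand_strain and a.r_free <= b.r_free
    better = a.ligand_strain < b.ligand_strain or a.r_free < b.r_free
    return no_worse and better


def non_dominated(poses):
    """Keep the poses that no other pose beats on both metrics at once."""
    return [p for p in poses if not any(dominates(q, p) for q in poses)]


# Placeholder values for illustration only, not results from a real refinement.
poses = [
    RefinedPose("pose_a", ligand_strain=4.0, r_free=0.21),
    RefinedPose("pose_b", ligand_strain=9.0, r_free=0.24),
    RefinedPose("pose_c", ligand_strain=6.0, r_free=0.20),
]

for pose in non_dominated(poses):
    print(pose.name)
```

Here pose_b is discarded because pose_a is better on both metrics, while pose_a and pose_c each win on one metric and both survive. Choosing between the survivors still calls for inspecting the density and, where available, the PWD results.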

Q: What does Phenix/DivCon do about bond making/breaking during the run?

A: Quantum mechanics does not include any sort of topological constraints, so it is entirely possible (though unlikely) that bonds will be broken. More often than not, this is actually indicative of a problem in the starting structure and therefore something you want to address prior to publication or dissemination. If you do see bond making/breaking, go through and double check your structure.

Q: So QM does not include any sort of topological restraints. But if I break a bond “by hand” prior to refinement, shouldn’t quantum mechanics be “smart” enough to reform the bond?

A: Probably not, though this is less a problem with quantum mechanics per se than a result of the realities of simulation: we need to limit the number of unknowns, or degrees of freedom, in order to better guarantee success. When a structure is first read into DivCon, it goes through a perception process to determine formal charge, hybridization, and so on. Because of this perception algorithm you do not need to provide that information yourself, which makes the software easier to use. The one drawback is that we assume the input chemistry is correct and the unknown is the conformation. It is therefore an unfair expectation that a broken bond will reform. The benefits of this perception far outweigh this drawback since, most of the time, you begin with a ligand structure that is reasonable. So, if you are unsure about the topological connectivity of a structure, you should run multiple, parallel refinements beginning with the various topological permutations and observe which final, refined structure best matches the data with a reasonable strain. Incidentally, if you use the macro_cycle_to_skip= command line option, then the CIF-provided topological restraints will be used in the first macrocycle. This will reform any broken bonds that the CIF defines. This means that if you set out to determine which topological configuration is correct, you should make sure that you aren’t providing a CIF that artificially reforms any bonds!

Q: Given all of this information, what structures will the method particularly excel at?

A: The short answer is that Phenix/DivCon will run on just about any structure you may want to treat, and it will generally perform significantly better than conventional refinement methods. The following structural characteristics are the most amenable to QM-based methods, and running Phenix/DivCon on these structure types will yield the “biggest bang for your buck”:

  • Structures where the active site has a significant influence on the structure (because this influence is not captured in the CIF and the rudimentary conventional functional).
  • Cases where there are controversial binding modes and you need data to decide which is correct. In this case, use AFIT or manual docking/placement approaches and refine each example in parallel, and then use QM indicators such as strain and PWD to help determine which binding mode is correct.
  • Controversial protonation cases where there is some question about the protonation state of the ligand, active site, or both. This is analogous to the HIS example on our website. Again, start with several potential protonation states, and refine each example in parallel and use the indicators to decide which protonation pattern is correct.
  • Covalently bound ligand cases where the ligand-residue bond is unclear or has incorrect/missing restraints. The bond requires restraints in conventional refinement that are not required in QM refinement.
  • Closed-shell, metal-containing structures. The PM6 Hamiltonian supports 70 elements, and it should be able to cover all of the major metals of interest.

Q: How much memory should I have available for the QM calculation?

A: DivCon can run in very tight memory situations, but some quantities that could be stored and reused are then recalculated at each step, so more memory means a faster calculation. As a rule of thumb, allow at least 1/4 MB per QM-region atom when running on 4 processor cores. DivCon automatically determines available memory based either on the amount of free memory on the machine or on the queuing system limits (if the queuing system properly sets the per-process memory limit in the shell). So if the QM region (which includes both the main and buffer regions) has a couple thousand atoms or so, the calculation should fit within a GB of memory. If less memory is requested, the calculation may still proceed without difficulty; however, the time for the calculation will increase. Problems with memory are reported in phenix.log as a warning. In the QuantumBio laboratory, the default job configuration requests 1.25 GB of memory.
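The rule of thumb above reduces to simple arithmetic. The following is a minimal sketch of that estimate; the function name is hypothetical, and the 1/4 MB figure is the guideline quoted for 4 processor cores, so treat the result as a rough lower bound rather than a measured requirement.

```python
MB_PER_QM_ATOM = 0.25  # rule of thumb quoted for runs on 4 processor cores


def estimate_qm_memory_mb(n_qm_atoms):
    """Rough minimum memory in MB; n_qm_atoms counts main plus buffer atoms."""
    if n_qm_atoms < 0:
        raise ValueError("atom count must be non-negative")
    return MB_PER_QM_ATOM * n_qm_atoms


# A QM region of 2000 atoms (main and buffer regions together).
print(estimate_qm_memory_mb(2000))  # 500.0
```

At 500 MB, a 2000-atom QM region sits comfortably within a 1 GB allocation, which matches the guidance in the answer.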

Q: How many processors can be allocated to the QM calculation?

A: The answer to this question ultimately depends upon the number of residues within the QM region. The more residues, the greater the available parallelism. A detailed analysis of the general “rules of thumb” is available here. In the QuantumBio laboratory, the default job configuration will use 4 CPU-cores.
