All examples can be downloaded from this link and are provided as supplementary information for the paper: ‘Direct space approach in action: Challenging structure solution of microcrystalline materials using the EXPO software’, Computational Materials Science 210 (2022) 111465.
Example 1: structure solution of diltiazem hydrochloride (CSD reference code: CEYHUJ01)
- Folder examples/Example 1 – diltiazem hydrochloride/test1 contains the following files:
diltia.exp: EXPO input file for structure solution by direct space method starting from a model with an open ring at bond C4-C5.
diltiazem_nw_break.mol: starting model.
pd_0029.xy: powder diffraction data file (available to the public at http://www.powderdata.net).
The input file diltia.exp is reported below:
    %Structure diltia
    %Job CSD refcode: CEYHUJ01
    %Data
    Cell 42.190 9.075 6.037 90 90 90
    SpaceGroup p 21 21 21
    Pattern pd_0029.xy
    Wavelength 1.54056
    %fragment diltiazem_nw_break.mol
    %fragment atoms Cl
    deletehydro
    %sannel
    nrun 100
    niter 5000
- Folder examples/Example 1 – diltiazem hydrochloride/test2 contains the following files:
diltia.exp: EXPO input file for structure solution by direct space method starting from a model with an open ring at bond C4-C5. A restraint is applied to the C4-C5 bond.
diltiazem_nw_break.mol: starting model.
pd_0029.xy: powder diffraction data file.
The input file diltia.exp is reported below:
    %Structure diltia
    %Job CSD refcode: CEYHUJ01
    %Data
    Cell 42.190 9.075 6.037 90 90 90
    SpaceGroup p 21 21 21
    Pattern pd_0029.xy
    Wavelength 1.54056
    %fragment diltiazem_nw_break.mol
    %frag atoms Cl
    deletehydro
    %sannel
    nrun 100
    niter 1000
    rest C4 C5
- Folder examples/Example 1 – diltiazem hydrochloride/test3/conformer_test contains the files for structure solution by direct space method applied to 100 conformers generated with the RDKit conformer generator:
scanconformers: bash shell script that automatically generates an .exp file for each conformer, runs the structure solution and creates the result files.
diltia_templ: .exp input file used as a template by the script scanconformers.
molecules.txt: file used by the script scanconformers and containing the list of conformers to test for structure solution.
diltia1.mol, diltia2.mol, …: all conformer files in MOL format. They were generated using the Python script rdk_confgen.py available in the folder examples/Example 1 – diltiazem hydrochloride/test3/conformer_generation. If RDKit and Open Babel are installed, you can generate the conformers by running the following commands:
    python3 rdk_confgen.py --input diltia_nw.mol --numconf 100
    obabel gen_confs.sdf -O diltia1.mol -m
Open Babel is used to split the multi-molecule SDF file generated by rdk_confgen.py into single files: diltia1.mol, diltia2.mol, diltia3.mol, … The script rdk_confgen.py is a modified version of the Python script available at: https://github.com/iwatobipen/rdk_confgen.
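To make the splitting step more concrete, the following is a minimal Python sketch of what it does conceptually. It is not Open Babel's implementation and is not part of the distributed examples. It assumes only the standard SDF layout, in which each record ends with a `$$$$` line and the MOL part of a record ends at the `M  END` line. The function names `split_sdf` and `conformer_filenames` and the tiny two-record demo string are hypothetical.

```python
def split_sdf(sdf_text):
    """Split multi-molecule SDF text into a list of MOL blocks.

    Each SDF record is terminated by a '$$$$' line; the MOL part of a
    record ends at the 'M  END' line, and any data items that follow
    it are skipped.
    """
    mol_blocks = []
    current = []
    in_mol = True  # False while skipping data items after 'M  END'
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            # Record separator: start collecting the next molecule.
            current, in_mol = [], True
            continue
        if not in_mol:
            continue
        current.append(line)
        if line.startswith("M  END"):
            mol_blocks.append("\n".join(current) + "\n")
            in_mol = False
    return mol_blocks


def conformer_filenames(n, prefix="diltia"):
    """Return the file names prefix1.mol ... prefixN.mol."""
    return [f"{prefix}{i}.mol" for i in range(1, n + 1)]


# Hypothetical two-record SDF text with empty molecules, for illustration only.
demo = (
    "conf_a\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
    "conf_b\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
)
blocks = split_sdf(demo)
names = conformer_filenames(len(blocks))
```

Each entry of `blocks` could then be written to the matching name in `names`. For real work the obabel command above remains the intended route.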
The file scanconformers is a bash script and is reported here:
    #!/bin/bash

    filein=molecules.txt
    template=diltia_templ

    rm results.dat summary.dat *.expo *.bin *_best*.cif

    c=`wc -l $filein | awk '{print $1}'`
    for i in `seq 1 $c`; do
      a=`cat $filein | head -$i | tail -1`
      file=${a%.*}
      echo $file $template
      sed "s%STRNAME%$file%" $template > "$file.exp"

      #expo "$file.exp" "$file.out" --nogui
      mpirun -np 20 -oversubscribe $HOME/expo/expo_mpi "$file.exp" "$file.out"
      echo "============================================================" >> results.dat
      echo "============================================================" >> results.dat
      echo "$file.out" >> results.dat
      grep @@ "$file.out" | cut -c3- >> results.dat
      grep @@best "$file.out" | cut -c3- >> summary.dat
    done
To use this file, make it executable (chmod +x scanconformers) and run it (./scanconformers). Edit the mpirun line of the script to change the EXPO installation folder and the number of CPU cores used. We ran EXPO in parallel on 20 CPU cores; if the serial version of the program is used, replace the mpirun line with the commented-out expo line that precedes it.
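For readers less familiar with sed, the template step of the script can be sketched in Python as follows. This is only an illustration of the substitution, not a replacement for scanconformers. The function name `render_input` is hypothetical, and so is the two-line template used in the demo, since the real content of diltia_templ is not reproduced here. Only the placeholder STRNAME and the removal of the file extension are taken from the script.

```python
from pathlib import PurePath


def render_input(template_text, conformer_file, placeholder="STRNAME"):
    """Return (name, exp_text) for one conformer file.

    'name' is the file name without its last extension, as in the
    shell expansion ${a%.*}. Like sed "s%STRNAME%name%" without the
    g flag, only the first placeholder on each line is replaced.
    """
    name = PurePath(conformer_file).stem
    lines = [
        line.replace(placeholder, name, 1)
        for line in template_text.splitlines()
    ]
    return name, "\n".join(lines) + "\n"


# Hypothetical template content, for illustration only.
demo_template = "%structure STRNAME\n%fragment STRNAME.mol\n"
demo_name, demo_exp = render_input(demo_template, "diltia7.mol")
```

In this sketch `demo_exp` is the text that would be written to diltia7.exp before EXPO is run on it.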
The two grep commands at the end of the loop filter the output file: records tagged with ‘@@’ are appended to results.dat and records tagged with ‘@@best’ are appended to summary.dat. When the script has finished, results.dat contains a report that can be consulted to quickly identify the most reasonable structure. In addition, the following files are generated for each conformer:
- output file (.out) with general information about the procedure;
- CIF files with the solutions found during SA;
- project file (.expo) that can be used for a visual inspection of the models. Load the .expo file in EXPO via File > Old Project and access the list of solutions by selecting Solve > Simulated Annealing and clicking on the button.
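The filtering done by the grep and cut commands of scanconformers can be mimicked in Python as sketched below. This is an assumption-laden illustration: the function name `tagged_records` is hypothetical, and the demo lines are invented placeholders that do not reproduce the actual content of an EXPO output file. Only the tags ‘@@’ and ‘@@best’ and the removal of the first two characters (cut -c3-) come from the script.

```python
def tagged_records(lines, tag="@@"):
    """Mimic `grep TAG file | cut -c3-`.

    Keep every line that contains the tag and drop its first two
    characters, as cut -c3- does.
    """
    return [line.rstrip("\n")[2:] for line in lines if tag in line]


# Invented placeholder lines, NOT real EXPO output.
demo_lines = [
    "general information\n",
    "@@ record one\n",
    "@@best record two\n",
    "more information\n",
]
results = tagged_records(demo_lines)            # would go to results.dat
summary = tagged_records(demo_lines, "@@best")  # would go to summary.dat
```

As in the script, every ‘@@best’ record also matches ‘@@’, so it ends up in both lists.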
Example 2: structure solution of tetracycline hydrochloride (CSD reference code: XAYCAB)
Folder examples/Example 2 – tetracycline hydrochloride contains the files for structure solution by direct space method applied to 300 conformers generated with the RDKit conformer generator:
scanconformers: bash shell script that automatically generates an .exp file for each conformer, runs the structure solution and creates a result file.
aldx_templ: .exp input file used as a template by the script scanconformers.
molecules.txt: file used by the script scanconformers and containing the list of conformers to test for structure solution.
aldx1.mol, aldx2.mol, …: all conformer files in MOL format. They were generated using the python script rdk_confgen.py available in the folder /examples/Example 2 – tetracycline hydrochloride/conformer_test.
aldx.dat: powder diffraction data file.
For more information about the meaning and use of these files, see Example 1 (examples/Example 1 – diltiazem hydrochloride/test3/conformer_test).
Example 3: structure solution of selexipag form I (CSD reference code: VOHVIA)
Folder examples/Example 3 – selexipag contains the following files:
selexipag.exp: EXPO input file for structure solution by direct space method.
selexipag_mopac.mol: starting model.
VOHVIA.cif: CIF file containing powder diffraction data.
The input file selexipag.exp is reported below:
    %structure selexipag
    %job CSD refcode: VOHVIA
    %data
    pattern VOHVIA.cif
    synchrotron
    cell 37.96347 6.110426 22.47454 90 98.3273 90
    space P21/c
    %fragment selexipag_mopac.mol
    %fragment selexipag_mopac.mol
    deletehydro
    %sannel
    nrun 100
    niter 2000
    resm 2.5
Example 4: structure solution of ALPO-M (CSD reference code: RAMXUZ)
Folder examples/Example 4 – ALPO-M contains the following files:
alpo.exp: EXPO input file for structure solution by direct space method.
alpo.xy: synchrotron powder diffraction data file.
The input file alpo.exp is reported below:
    %structure alpo
    %job AlPO-M L.B. McCusker - (dati SLS) - (CSD refcode: RAMXUZ)
    %data
    pattern alpo.xy
    cell 9.7493 29.1668 9.3528 90. 90. 90.
    space p b c a
    wave 0.9187
    synchrotron
    %frag tetra P O
    %frag tetra P O
    %frag atoms Al2O
    %frag smiles C(CO)N
    %frag smiles C(CO)N
    deletehydro
    %sannel
    nrun 100
    niter 3000
    doc
    nodoc C* N* O10 O11
    po 0 1 0
Example 5: structure solution of calcium glycinate trihydrate (CSD reference code: ZINXAX)
Folder examples/Example 5 – calcium glycinate trihydrate contains the following files:
cagly.exp: EXPO input file for structure solution by direct space method.
sk3493Isup2.rtv: file containing powder diffraction data.
The input file cagly.exp is reported below:
    %job calcium glycinate trihydrate (CSD refcode: ZINXAX)
    %structure cagly
    %data
    pattern sk3493Isup2.rtv
    cell 9.6572 9.6878 5.7627 90.588 76.997 97.467
    space P-1
    wave 1.540560
    %frag smiles C(C(=O)[O-])N
    %frag smiles C(C(=O)[O-])N
    %frag atoms CaO3
    deletehydro
    %sannel
    nrun 100
    niter 20000
    refinetf molecules
Example 6: structure solution of LaTi2Al9O19 (ICSD 262089)
Folder examples/Example 6 – LaTi2Al9O19 contains the following files:
LaTi.exp: EXPO input file for structure solution by direct space method.
hw5019Isup2.rtv: file containing powder diffraction data.
The input file LaTi.exp is reported below:
    %job La Ti2 Al9 O19 (ICSD 262089)
    %structure LaTi
    %data
    pattern hw5019Isup2.rtv
    cell 22.59355 10.99919 9.72968 90 98.5634 90
    space C2/c
    wave 1.54059
    %fragment atoms La
    %fragment octa Ti O
    %fragment octa Ti O
    %fragment tetra Al O
    %fragment tetra Al O
    %fragment octa Al O
    %fragment octa Al O
    %fragment octa Al O
    %fragment octa Al O
    %fragment octa Al O
    %fragment octa Al O
    %fragment octa Al O
    %fragment octa Al O
    %sannel
    nrun 100
    doc