Examples of structure solution by direct space method

All examples can be downloaded from this link and are provided as supplementary information for the paper: ‘Direct space approach in action: Challenging structure solution of microcrystalline materials using the EXPO software’, Computational Materials Science 210 (2022) 111465.

Example 1: structure solution of diltiazem hydrochloride (CSD reference code: CEYHUJ01)

  • Folder examples/Example 1 – diltiazem hydrochloride/test1 contains the following files:
    diltia.exp: EXPO input file for structure solution by direct space method starting from a model with an open ring at bond C4-C5.
    diltiazem_nw_break.mol: starting model.
    pd_0029.xy: powder diffraction data file (available to the public at http://www.powderdata.net).

The input file diltia.exp is reported here:

%Structure diltia
%Job CSD refcode: CEYHUJ01

%Data
Cell   42.190  9.075  6.037  90  90  90
SpaceGroup p 21 21 21
Pattern    pd_0029.xy
Wavelength 1.54056

%fragment diltiazem_nw_break.mol
%fragment atoms Cl
deletehydro

%sannel
nrun 100
niter 5000

  • Folder examples/Example 1 – diltiazem hydrochloride/test2 contains the following files:
    diltia.exp: EXPO input file for structure solution by direct space method starting from a model with an open ring at bond C4-C5. A restraint is applied to the bond C4-C5.
    diltiazem_nw_break.mol: starting model.
    pd_0029.xy: powder diffraction data file.

The input file diltia.exp is reported here:

%Structure diltia
%Job CSD refcode: CEYHUJ01

%Data
Cell 42.190 9.075 6.037 90 90 90
SpaceGroup p 21 21 21
Pattern pd_0029.xy
Wavelength 1.54056

%fragment diltiazem_nw_break.mol
%frag atoms Cl
deletehydro

%sannel
nrun 100
niter 1000
rest C4 C5

  • Folder examples/Example 1 – diltiazem hydrochloride/test3/conformer_test contains files for the structure solution by direct space method applied to 100 conformers generated with the RDKit conformer generator:
    scanconformers: bash shell script that automatically generates an .exp file for each conformer, runs the structure solution and creates the result files.
    diltia_templ: .exp input file used as a template by the script scanconformers.
    molecules.txt: file read by the script scanconformers, containing the list of conformers to test for structure solution.
    diltia1.mol, diltia2.mol, …: all conformer files in MOL format. They were generated using the Python script rdk_confgen.py available in the folder examples/Example 1 – diltiazem hydrochloride/test3/conformer_generation. If RDKit and Open Babel are installed, you can generate the conformers by running the following commands:

    python3 rdk_confgen.py --input diltia_nw.mol --numconf 100 
    
    obabel gen_confs.sdf -O diltia1.mol -m

Open Babel is used to split the multi-molecule SDF file generated by rdk_confgen.py into single files: diltia1.mol, diltia2.mol, diltia3.mol, … The script rdk_confgen.py is a modified version of the Python script available at: https://github.com/iwatobipen/rdk_confgen.
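The splitting step can also be approximated with the Python standard library alone, because the records of an SDF file are separated by lines containing `$$$$`. The following is a minimal sketch and not part of the distributed examples: the function names `split_sdf` and `write_conformers` are hypothetical, and the output naming (`diltia1.mol`, `diltia2.mol`, …) is chosen only to mirror the file names listed above.

```python
from pathlib import Path


def split_sdf(sdf_text):
    """Return the molfile block of every record in a multi-molecule SDF string."""
    records = []
    current = []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":  # SDF record delimiter
            if any(ln.strip() for ln in current):
                records.append(current)
            current = []
        else:
            current.append(line)
    if any(ln.strip() for ln in current):  # last record without a delimiter
        records.append(current)

    mol_blocks = []
    for record in records:
        # Keep everything up to and including "M  END"; data items that
        # follow it belong to the SDF record, not to the MOL file.
        kept = []
        for line in record:
            kept.append(line)
            if line.startswith("M  END"):
                break
        mol_blocks.append("\n".join(kept) + "\n")
    return mol_blocks


def write_conformers(sdf_path, prefix="diltia"):
    """Write prefix1.mol, prefix2.mol, ... next to the SDF file (hypothetical helper)."""
    sdf_path = Path(sdf_path)
    written = []
    for n, block in enumerate(split_sdf(sdf_path.read_text()), start=1):
        out = sdf_path.with_name(f"{prefix}{n}.mol")
        out.write_text(block)
        written.append(out)
    return written
```

Calling `write_conformers("gen_confs.sdf")` would then write one MOL file per conformer in the same folder as the SDF file.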

The file scanconformers is a bash script and is reported here:

#!/bin/bash
 
filein=molecules.txt
template=diltia_templ
 
rm results.dat summary.dat *.expo *.bin *_best*.cif
c=`wc -l $filein | awk '{print $1}'`
 
for i in `seq 1 $c`;
do
a=`cat $filein | head -$i | tail -1`
file=${a%.*}
echo $file $template
sed "s%STRNAME%$file%" $template > "$file.exp"
#expo "$file.exp" "$file.out" --nogui
mpirun -np 20 -oversubscribe $HOME/expo/expo_mpi "$file.exp" "$file.out"
echo "============================================================" >> results.dat
echo "============================================================" >> results.dat
echo "$file.out" >> results.dat
grep @@  "$file.out" | cut -c3-  >> results.dat
grep @@best  "$file.out" | cut -c3-  >> summary.dat
done

To use this script, make it executable (chmod +x scanconformers) and run it (./scanconformers). Modify line 16 of the script to change the EXPO installation folder and the number of CPU cores used. We ran EXPO in parallel on 20 CPU cores; if the serial version of the program is used, replace line 16 with the commented line 15.
In lines 20-21 (the two grep commands), the records tagged with ‘@@’ in the output file are filtered and redirected to the files results.dat and summary.dat. When the script has finished, results.dat contains a report that can be consulted to quickly identify the most reasonable structure. In addition, the following files are generated for each conformer:

  • output file (.out) with general information about the procedure;
  • CIF files with the solutions found during SA;
  • project file (.expo) that can be used for a visual inspection of the models. Load the .expo file in EXPO via File > Old Project and access the list of solutions by selecting Solve > Simulated Annealing and clicking on the button.
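The filtering done by the two grep commands of the script can be sketched in Python as follows. This is only an illustration of the `grep … | cut -c3-` logic: the function names are hypothetical, the 60-character separator is an approximation of the one written by the script, and the actual layout of the ‘@@’ records in the EXPO output is not assumed.

```python
from pathlib import Path


def tagged_records(out_text, tag="@@"):
    """Mimic `grep TAG file | cut -c3-`: keep the lines containing the tag
    and drop their first two characters."""
    return [line[2:] for line in out_text.splitlines() if tag in line]


def collect_reports(out_files):
    """Build the line lists for results.dat and summary.dat (hypothetical helper)."""
    results, summary = [], []
    for path in out_files:
        text = Path(path).read_text()
        # Two separator lines and the file name precede each block of records
        results += ["=" * 60, "=" * 60, str(path)]
        results += tagged_records(text, "@@")
        summary += tagged_records(text, "@@best")
    return results, summary
```

For example, `collect_reports(sorted(Path(".").glob("diltia*.out")))` would gather the records of all conformer runs found in the current folder.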

Example 2: structure solution of tetracycline hydrochloride (CSD reference code: XAYCAB)

Folder examples/Example 2 – tetracycline hydrochloride contains files for the structure solution by direct space method applied to 300 conformers generated with the RDKit conformer generator:
scanconformers: bash shell script that automatically generates an .exp file for each conformer, runs the structure solution and creates a result file.
aldx_templ: .exp input file used as a template by the script scanconformers.
molecules.txt: file read by the script scanconformers, containing the list of conformers to test for structure solution.
aldx1.mol, aldx2.mol, …: all conformer files in MOL format. They were generated using the Python script rdk_confgen.py available in the folder examples/Example 2 – tetracycline hydrochloride/conformer_test.
aldx.dat: powder diffraction data file.

For more information about the meaning and the use of these files, see Example 1 (examples/Example 1 – diltiazem hydrochloride/test3/conformer_test).
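The way the script scanconformers derives one input file per conformer from the template can be sketched in Python as below. This is a minimal illustration under stated assumptions: the placeholder STRNAME is the one used by the sed command of the script shown in Example 1, whereas the function names and the template text in the usage lines are hypothetical, since the content of aldx_templ is not reported here.

```python
def conformer_stem(filename):
    """Strip the last extension, like the shell expansion ${a%.*}."""
    return filename.strip().rsplit(".", 1)[0]


def render_template(template_text, stem, placeholder="STRNAME"):
    """Mimic sed "s%STRNAME%stem%": without the g flag, sed replaces only
    the first match on each line."""
    return "\n".join(
        line.replace(placeholder, stem, 1) for line in template_text.split("\n")
    )


def render_all(template_text, molecule_list_text):
    """Map each conformer listed in molecules.txt to the text of its .exp file."""
    inputs = {}
    for name in molecule_list_text.splitlines():
        if not name.strip():  # unlike the bash loop, blank lines are skipped
            continue
        stem = conformer_stem(name)
        inputs[stem + ".exp"] = render_template(template_text, stem)
    return inputs


# Usage with a made-up two-line template (NOT the real aldx_templ)
template = "%structure STRNAME\n%fragment STRNAME.mol\n"
exp_files = render_all(template, "aldx1.mol\naldx2.mol\n")
print(exp_files["aldx1.exp"])
```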

Example 3: structure solution of selexipag form I (CSD reference code: VOHVIA)

Folder examples/Example 3 – selexipag contains the following files:
selexipag.exp: EXPO input file for structure solution by direct space method.
selexipag_mopac.mol: starting model.
VOHVIA.cif: CIF file containing powder diffraction data.

The input file selexipag.exp is reported here:

%structure selexipag
%job CSD refcode: VOHVIA

%data
pattern VOHVIA.cif
synchrotron
cell 37.96347 6.110426 22.47454 90 98.3273 90
space P21/c

%fragment selexipag_mopac.mol
%fragment selexipag_mopac.mol
deletehydro

%sannel
nrun 100
niter 2000
resm 2.5

Example 4: structure solution of ALPO-M (CSD reference code: RAMXUZ)

Folder examples/Example 4 – ALPO-M contains the following files:
alpo.exp: EXPO input file for structure solution by direct space method.
alpo.xy: synchrotron powder diffraction data file.

The input file alpo.exp is reported here:

%structure alpo
%job AlPO-M L.B. McCusker - (dati SLS) - (CSD refcode: RAMXUZ)

%data
pattern alpo.xy
cell 9.7493 29.1668 9.3528 90. 90. 90.
space p b c a
wave 0.9187
synchrotron

%frag tetra P O
%frag tetra P O
%frag atoms Al2O
%frag smiles C(CO)N
%frag smiles C(CO)N
deletehydro

%sannel 
nrun 100
niter 3000
doc 
nodoc C* N* O10 O11
po 0 1 0

Example 5: structure solution of calcium glycinate trihydrate (CSD reference code: ZINXAX)

Folder examples/Example 5 – calcium glycinate trihydrate contains the following files:
cagly.exp: EXPO input file for structure solution by direct space method.
sk3493Isup2.rtv: file containing powder diffraction data.

The input file cagly.exp is reported here:

%job calcium glycinate trihydrate (CSD refcode: ZINXAX)
%structure cagly

%data
pattern sk3493Isup2.rtv
cell 9.6572 9.6878 5.7627 90.588 76.997 97.467
space P-1
wave 1.540560

%frag smiles C(C(=O)[O-])N
%frag smiles C(C(=O)[O-])N
%frag atoms CaO3
deletehydro

%sannel
nrun 100
niter 20000
refinetf molecules

Example 6: structure solution of LaTi2Al9O19 (ICSD 262089)

Folder examples/Example 6 – LaTi2Al9O19 contains the following files:
LaTi.exp: EXPO input file for structure solution by direct space method.
hw5019Isup2.rtv: file containing powder diffraction data.

The input file LaTi.exp is reported here:

%job La Ti2 Al9 O19 (ICSD 262089)
%structure LaTi

%data
pattern hw5019Isup2.rtv
cell 22.59355 10.99919 9.72968 90 98.5634 90
space C2/c
wave 1.54059

%fragment atoms La
%fragment octa Ti O
%fragment octa Ti O
%fragment tetra Al O
%fragment tetra Al O
%fragment octa Al O
%fragment octa Al O
%fragment octa Al O
%fragment octa Al O
%fragment octa Al O
%fragment octa Al O
%fragment octa Al O
%fragment octa Al O

%sannel
nrun 100
doc
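Input files with many repeated %fragment lines, such as LaTi.exp, can be written by a short script instead of by hand. The sketch below is not an EXPO API: the function names are hypothetical, and it only emits directives that already appear in the listing above, with each fragment given as a (specification, count) pair.

```python
def fragment_lines(fragments):
    """Expand (specification, count) pairs into repeated %fragment lines."""
    lines = []
    for spec, count in fragments:
        lines.extend([f"%fragment {spec}"] * count)
    return lines


def build_input(job, structure, data, fragments, sannel):
    """Assemble the text of an input file from its blocks (hypothetical helper)."""
    parts = [f"%job {job}", f"%structure {structure}", "", "%data"]
    parts += data
    parts += [""] + fragment_lines(fragments)
    parts += ["", "%sannel"] + sannel
    return "\n".join(parts) + "\n"


# Values copied from the LaTi.exp listing
lati = build_input(
    job="La Ti2 Al9 O19 (ICSD 262089)",
    structure="LaTi",
    data=[
        "pattern hw5019Isup2.rtv",
        "cell 22.59355 10.99919 9.72968 90 98.5634 90",
        "space C2/c",
        "wave 1.54059",
    ],
    fragments=[
        ("atoms La", 1),
        ("octa Ti O", 2),
        ("tetra Al O", 2),
        ("octa Al O", 8),
    ],
    sannel=["nrun 100", "doc"],
)
print(lati)
```

Keeping the fragment counts in one list makes it easier to check them against the expected content of the asymmetric unit before starting a run.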