doped.complexes module

Code for generating and analysing defect complexes.

doped.complexes.are_equivalent_molecules(molecule_1: Molecule, molecule_2: Molecule, tol: float = 0.01) → bool

Determine whether two Molecule objects are equivalent using the Kabsch algorithm, which minimizes the root-mean-square deviation (RMSD) between two molecules that are topologically similar (same atom types and geometry). The implementation in the BruteForceOrderMatcher class is used, which makes the comparison invariant to permutations of the atom ordering in the molecule definitions.

Uses caching to speed up the comparison.

Parameters:
  • molecule_1 (Molecule) – The first pymatgen Molecule object to compare.

  • molecule_2 (Molecule) – The second pymatgen Molecule object to compare.

  • tol (float) – The tolerance for the Kabsch algorithm. Default is 0.01 Å.

Returns:

True if the two Molecule objects are equivalent, False otherwise.

Return type:

bool
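The idea behind this comparison can be sketched with NumPy alone. This is an illustrative re-implementation under stated assumptions, not the code used by doped or pymatgen: the helper names kabsch_rmsd and equivalent are hypothetical, molecules are passed as plain species lists and coordinate arrays, and every species-preserving permutation is tried by brute force.

```python
from itertools import permutations

import numpy as np


def kabsch_rmsd(p: np.ndarray, q: np.ndarray) -> float:
    """RMSD between two N x 3 point sets after optimal rigid superposition."""
    p = p - p.mean(axis=0)  # remove the translation
    q = q - q.mean(axis=0)
    u, _, vt = np.linalg.svd(p.T @ q)  # SVD of the 3 x 3 covariance matrix
    d = np.sign(np.linalg.det(u @ vt))  # +1 or -1; prevents a reflection
    rot = u @ np.diag([1.0, 1.0, d]) @ vt  # optimal proper rotation (row vectors)
    return float(np.sqrt(((p @ rot - q) ** 2).sum() / len(p)))


def equivalent(species_1, coords_1, species_2, coords_2, tol=0.01):
    """True if some species-preserving atom reordering gives RMSD < tol."""
    if sorted(species_1) != sorted(species_2):
        return False
    p = np.asarray(coords_1, dtype=float)
    q = np.asarray(coords_2, dtype=float)
    for perm in permutations(range(len(species_2))):
        same_species = all(species_1[i] == species_2[j] for i, j in enumerate(perm))
        if same_species and kabsch_rmsd(p, q[list(perm)]) < tol:
            return True
    return False
```

The brute-force loop scales factorially with the number of atoms, which is acceptable for the handful of sites in a typical defect complex.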

doped.complexes.classify_vacancy_geometry(bulk_supercell: Structure, vacancy_supercell: Structure, site_tol: float = 0.5, abs_tol: bool = False, verbose: bool = False, use_oxi_states: bool = False) → str

Classify the geometry of a given vacancy in a supercell, as either a ‘simple’ point vacancy, a ‘split’ vacancy, or a ‘non-trivial’ vacancy.

Split vacancy geometries are those where 2 vacancies and 1 interstitial are found in the defect structure, as determined by site-matching between the defect and bulk structures with a distance tolerance (site_tol). A bulk site with no defect-structure site of matching species within the distance tolerance is counted as a vacancy; conversely, a defect-structure site with no matching bulk site within the tolerance is counted as an interstitial. This corresponds to the 2 V_X + X_i definition of split vacancy geometries discussed in https://doi.org/10.1088/2515-7655/ade916

On the other hand, a simple (point) vacancy corresponds to cases where 1 site from the bulk structure cannot be matched to the defect structure while all defect structure sites can be matched to bulk sites, and ‘non-trivial’ vacancies refer to all other cases (multiple off-site atoms, which don’t match the split vacancy classification).

Inspired by the vacancy geometry classification used in Kumagai et al. Phys Rev Mater 2021. See https://doi.org/10.1088/2515-7655/ade916 for further details.

Parameters:
  • bulk_supercell (Structure) – The bulk supercell structure to compare against for site-matching.

  • vacancy_supercell (Structure) – The defect supercell containing the vacancy to be classified.

  • site_tol (float) – The (fractional) tolerance for matching sites between the defect and bulk structures. If abs_tol is False (default), then this value multiplied by the shortest bond length in the bulk structure will be used as the distance threshold for matching, otherwise the value is used directly (as a length in Å). Default is 0.5 (i.e. half the shortest bond length in the bulk structure).

  • abs_tol (bool) – Whether to use site_tol as an absolute distance tolerance (in Å) instead of a fractional tolerance (in terms of the shortest bond length in the structure). Default is False.

  • verbose (bool) – Whether to print additional information about the classification for non-trivial vacancies. Default is False.

  • use_oxi_states (bool) – Whether to use the oxidation states of the sites in the bulk and defect structures when considering matching sites (such that e.g. Fe3+ and Fe2+ would be considered different species). Default is False.

Returns:

The classification of the vacancy geometry, which can be one of “Simple Vacancy”, “Split Vacancy”, or “Non-Trivial”.

Return type:

str
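The counting rule described for classify_vacancy_geometry can be sketched in plain Python. This is a simplified illustration, not doped's implementation: the helpers min_image_dist, count_unmatched and classify are hypothetical, sites are (species, Cartesian coordinates) tuples, and the cell is assumed cubic so that the minimum-image distance is easy to compute.

```python
import math


def min_image_dist(a, b, cell):
    """Minimum-image distance between two points in a cubic cell of edge `cell`."""
    return math.sqrt(sum(((x - y + cell / 2) % cell - cell / 2) ** 2 for x, y in zip(a, b)))


def count_unmatched(sites, reference, cell, tol):
    """Number of `sites` with no same-species `reference` site within `tol`."""
    unmatched = 0
    for species, xyz in sites:
        matched = any(
            species == ref_species and min_image_dist(xyz, ref_xyz, cell) < tol
            for ref_species, ref_xyz in reference
        )
        unmatched += not matched
    return unmatched


def classify(bulk, defect, cell, tol):
    n_vac = count_unmatched(bulk, defect, cell, tol)  # bulk sites missing from the defect cell
    n_int = count_unmatched(defect, bulk, cell, tol)  # defect sites with no bulk counterpart
    if (n_vac, n_int) == (1, 0):
        return "Simple Vacancy"
    if (n_vac, n_int) == (2, 1):
        return "Split Vacancy"
    return "Non-Trivial"
```

With the default fractional tolerance, tol would be 0.5 times the shortest bulk bond length; with abs_tol=True it would be site_tol itself in Å.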

doped.complexes.generate_complex_from_defect_sites(bulk_supercell: Structure, vacancy_sites: Iterable[PeriodicSite] | PeriodicSite | None = None, interstitial_sites: Iterable[PeriodicSite] | PeriodicSite | None = None, substitution_sites: Iterable[PeriodicSite] | PeriodicSite | None = None) → Structure

Generate the supercell containing a defect complex, given the bulk supercell and the sites of the defects to be included in the complex.

The coordinates of the input defect sites should correspond to the input bulk supercell. For substitutions, the closest site in the bulk supercell to the supplied site(s) will be removed, and replaced with the input substitution_sites.

Parameters:
  • bulk_supercell (Structure) – The bulk supercell structure in which to generate the defect complex.

  • vacancy_sites (Iterable[PeriodicSite] | PeriodicSite | None) – The site(s) of vacancies to include in the defect complex supercell. Default is None.

  • interstitial_sites (Iterable[PeriodicSite] | PeriodicSite | None) – The site(s) of interstitials to include in the defect complex supercell. Default is None.

  • substitution_sites (Iterable[PeriodicSite] | PeriodicSite | None) – The site(s) of substitutions to include in the defect complex supercell. Default is None.

Returns:

The defect complex supercell structure.

Return type:

Structure
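As a rough picture of how such a complex supercell is assembled, the sketch below edits a plain list of (species, fractional coordinates) tuples. It is not doped's code: the helper names are hypothetical, removing the closest bulk site for each vacancy is an assumption (the documentation only states this for substitutions), and comparing fractional separations is only meaningful for a cubic cell.

```python
def frac_dist2(a, b):
    """Squared minimum-image separation in fractional coordinates (cubic cell assumed)."""
    return sum(((x - y + 0.5) % 1.0 - 0.5) ** 2 for x, y in zip(a, b))


def closest_index(sites, target):
    """Index of the site closest to the fractional coordinates `target`."""
    return min(range(len(sites)), key=lambda i: frac_dist2(sites[i][1], target))


def build_complex(bulk, vacancies=(), interstitials=(), substitutions=()):
    sites = list(bulk)  # leave the bulk site list untouched
    for _, frac in vacancies:
        sites.pop(closest_index(sites, frac))  # assumption: remove the closest bulk site
    for species, frac in substitutions:
        sites[closest_index(sites, frac)] = (species, frac)  # swap in the substituting species
    sites.extend(interstitials)  # interstitials are simply added
    return sites
```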

doped.complexes.get_complex_defect_multiplicity(bulk_supercell: Structure, vacancy_sites: Iterable | PeriodicSite | None = None, interstitial_sites: Iterable | PeriodicSite | None = None, substitution_sites: Iterable | PeriodicSite | None = None, primitive_structure: Structure | None = None, symprec: float = 0.01, dist_tol_factor: float = 1.0, primitive_cell_multiplicity: bool = True, **kwargs) → int

Get the multiplicity of a given complex defect configuration (as given by the combination of input constituent point defect sites).

The complex defect multiplicity is given by the number of distinct symmetry-equivalent complex defect site configurations in the unit cell. The returned multiplicity value corresponds to the primitive cell site multiplicity by default (primitive_cell_multiplicity = True).

The input sites should correspond to the input bulk supercell.

The multiplicity is calculated by generating all possible symmetry-equivalent complex defect site configurations in the primitive unit cell; see get_equivalent_complex_defect_sites_in_primitive() for details.

Parameters:
  • bulk_supercell (Structure) – The bulk supercell structure to which the input sites correspond.

  • vacancy_sites (Iterable[PeriodicSite] | PeriodicSite | None) – The site(s) of vacancies in the defect complex. Default is None.

  • interstitial_sites (Iterable[PeriodicSite] | PeriodicSite | None) – The site(s) of interstitials in the defect complex. Default is None.

  • substitution_sites (Iterable[PeriodicSite] | PeriodicSite | None) – The site(s) of substitutions in the defect complex. Default is None.

  • primitive_structure (Structure | None) – The primitive unit cell structure, in which to get equivalent complex defect sites. If None (default), the primitive structure will be determined from the bulk supercell.

  • symprec (float) – Symmetry precision for determining the primitive structure (if not provided), supercell symmetry operations and equivalent defect sites in the primitive unit cell. Defaults to 0.01. Note that this should match the value used for determining the point defect multiplicities (e.g. with the Defect.get_multiplicity() methods) for appropriate comparisons – the same default of 0.01 is used in all relevant doped functions. If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor) values.

  • dist_tol_factor (float) – Distance tolerance for clustering generated sites (to ensure they are truly distinct), searching for equivalent sites (by inter-defect distances) and matching equivalent complex defect geometries (as Molecules), as a multiplicative factor of symprec. Default is 1.0 (i.e. dist_tol = symprec, in Å). Note that this should match the value used for determining the point defect multiplicities (e.g. with the Defect.get_multiplicity() methods) for appropriate comparisons – the same default of 0.01 Å is used in all relevant doped functions. If fixed_symprec_and_dist_tol_factor is False (default), this value will also be automatically adjusted if necessary (up to 10x, down to 0.1x; after symprec adjustments) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled dist_tol_factor (and symprec) values.

  • primitive_cell_multiplicity (bool) – Whether to return the site multiplicity in the primitive unit cell (True) or the bulk supercell (False). Default is True.

  • **kwargs – Additional keyword arguments to pass to get_all_equiv_sites (via get_equiv_frac_coords_in_primitive()), such as fixed_symprec_and_dist_tol_factor and verbose.

Returns:

The multiplicity of the input complex defect configuration, in the primitive unit cell if primitive_cell_multiplicity = True (default), or in the bulk supercell if primitive_cell_multiplicity = False.

Return type:

int
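To make the notion of complex multiplicity concrete, the toy example below counts the distinct orientations of a pair of identical defects in a simple cubic lattice with one site per primitive cell. It is only an illustration under these assumptions and does not reproduce doped's algorithm; cubic_point_group, apply_op and pair_multiplicity are hypothetical helpers, and the pair is described by its integer separation vector.

```python
from itertools import permutations, product


def cubic_point_group():
    """The 48 cubic point-group operations, written as signed axis permutations."""
    return [
        (perm, signs)
        for perm in permutations(range(3))
        for signs in product((1, -1), repeat=3)
    ]


def apply_op(op, vec):
    perm, signs = op
    return tuple(s * vec[p] for p, s in zip(perm, signs))


def pair_multiplicity(separation):
    """Distinct symmetry-equivalent pair orientations per primitive cell."""
    images = set()
    for op in cubic_point_group():
        v = apply_op(op, separation)
        # v and -v describe the same pair of identical defects (related by a
        # lattice translation), so store one canonical representative of the two.
        images.add(max(v, tuple(-x for x in v)))
    return len(images)


print(pair_multiplicity((1, 0, 0)))  # 3: nearest-neighbour pairs along x, y or z
```

Merging v with -v is what prevents double counting; for a pair of different defect species the two directions would be distinct and the merge would not apply.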

doped.complexes.get_equivalent_complex_defect_sites_in_primitive(bulk_supercell: Structure, vacancy_sites: Iterable[PeriodicSite] | PeriodicSite | None = None, interstitial_sites: Iterable[PeriodicSite] | PeriodicSite | None = None, substitution_sites: Iterable[PeriodicSite] | PeriodicSite | None = None, primitive_structure: Structure | None = None, symprec: float = 0.01, dist_tol_factor: float = 1.0, return_molecules: bool = False, **kwargs) → list[list[PeriodicSite]] | list[Molecule]

Generate all equivalent complex defect site configurations in the primitive unit cell, for the input constituent point defect sites of the complex.

The input sites should correspond to the input bulk supercell.

The approach followed in this function is:

1. Generate all symmetry-equivalent sites in the primitive unit cell, for the input constituent point defect sites of the complex.

2. Choose one constituent point defect as the ‘anchor’ site, based on estimated computational efficiency.

3. Generate a ‘template’ complex defect molecule, using the pymatgen Molecule class.

4. From the sets of symmetry-equivalent defect sites in the primitive unit cell, generate all possible combinations of candidate point defect sites that have inter-defect distances matching the input complex defect sites (± 2 × symprec × dist_tol_factor).

5. From these candidate site combinations, generate complex defect molecules (as pymatgen Molecules), and reduce to only those which are symmetry-equivalent to the input complex defect, and are distinct (i.e. not identical or periodic images; to avoid potential double counting).
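The distance-matching stage (step 4) can be sketched as follows. This is a simplified stand-in rather than doped's implementation: matching_combinations is a hypothetical helper, sites are Cartesian tuples in a cubic cell, and the final symmetry-equivalence check on the resulting molecules (step 5) is left out.

```python
import math
from itertools import product


def min_image_dist(a, b, cell):
    """Minimum-image distance between two points in a cubic cell of edge `cell`."""
    return math.sqrt(sum(((x - y + cell / 2) % cell - cell / 2) ** 2 for x, y in zip(a, b)))


def matching_combinations(template, candidates, cell, symprec=0.01, dist_tol_factor=1.0):
    """Pick one site per constituent so that all pair distances match the template.

    template:   one site per constituent defect of the input complex
    candidates: for each constituent, its symmetry-equivalent sites
    """
    tol = 2 * symprec * dist_tol_factor
    n = len(template)
    target = {
        (i, j): min_image_dist(template[i], template[j], cell)
        for i in range(n)
        for j in range(i + 1, n)
    }
    return [
        combo
        for combo in product(*candidates)
        if all(abs(min_image_dist(combo[i], combo[j], cell) - d) <= tol for (i, j), d in target.items())
    ]
```

Restricting the anchor's candidate list to a single site, as in step 2, keeps the number of combinations manageable.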

Parameters:
  • bulk_supercell (Structure) – The bulk supercell structure to which the input sites correspond.

  • vacancy_sites (Iterable[PeriodicSite] | PeriodicSite | None) – The site(s) of vacancies in the defect complex. Default is None.

  • interstitial_sites (Iterable[PeriodicSite] | PeriodicSite | None) – The site(s) of interstitials in the defect complex. Default is None.

  • substitution_sites (Iterable[PeriodicSite] | PeriodicSite | None) – The site(s) of substitutions in the defect complex. Default is None.

  • primitive_structure (Structure | None) – The primitive unit cell structure, in which to get equivalent complex defect sites. If None (default), the primitive structure will be determined from the bulk supercell.

  • symprec (float) – Symmetry precision for determining the primitive structure (if not provided), supercell symmetry operations and equivalent defect sites in the primitive unit cell. Defaults to 0.01. Note that this should match the value used for determining the point defect multiplicities (e.g. with the Defect.get_multiplicity() methods) for appropriate comparisons – the same default of 0.01 is used in all relevant doped functions. If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor) values.

  • dist_tol_factor (float) – Distance tolerance for clustering generated sites (to ensure they are truly distinct), searching for equivalent sites (by inter-defect distances) and matching equivalent complex defect geometries (as Molecules), as a multiplicative factor of symprec. Default is 1.0 (i.e. dist_tol = symprec, in Å). Note that this should match the value used for determining the point defect multiplicities (e.g. with the Defect.get_multiplicity() methods) for appropriate comparisons – the same default of 0.01 Å is used in all relevant doped functions. If fixed_symprec_and_dist_tol_factor is False (default), this value will also be automatically adjusted if necessary (up to 10x, down to 0.1x; after symprec adjustments) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled dist_tol_factor (and symprec) values.

  • return_molecules (bool) – Whether to return the equivalent complex defect molecules as Molecule objects, or as lists of PeriodicSite objects. Default is False (return lists of PeriodicSites).

  • **kwargs – Additional keyword arguments to pass to get_all_equiv_sites (via get_equiv_frac_coords_in_primitive()), such as fixed_symprec_and_dist_tol_factor and verbose.

Returns:

List of equivalent complex defect sites as Molecules or lists of PeriodicSites, depending on the value of return_molecules.

Return type:

list[list[PeriodicSite]] | list[Molecule]

doped.complexes.get_es_energy(structure: Structure, oxi_states: dict | None = None) → float

Calculate the electrostatic (Madelung) energy of a structure using Ewald summation.

The oxidation states of the structure should be set already (via site species), or they should be provided with the oxi_states argument.

Parameters:
  • structure (Structure) – Structure object for which to calculate the energy.

  • oxi_states (dict, optional) – Dictionary of oxidation states for the input structure. If None (default), the oxidation states of the structure are used. An error will be raised if the oxidation states are not set and are not provided.

Returns:

The electrostatic energy of the structure.

Return type:

float
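For readers unfamiliar with Ewald summation, the following standard-library sketch evaluates the electrostatic energy of point charges in a cubic cell. It is a textbook formulation in reduced units (charges in e, energy in e²/length), not the routine that doped calls; ewald_energy and its cut-off parameters are hypothetical, and the cell is assumed to be charge neutral.

```python
import cmath
import math
from itertools import product


def ewald_energy(charges, frac_coords, a, alpha=None, n_real=3, n_recip=6):
    """Ewald energy of point charges in a cubic cell of edge `a`."""
    alpha = alpha or 3.0 / a  # splitting parameter; converged for the default cut-offs
    sites = [(q, [a * f for f in frac]) for q, frac in zip(charges, frac_coords)]

    # Real-space sum over periodic images, skipping each ion's interaction with itself.
    e_real = 0.0
    for shift in product(range(-n_real, n_real + 1), repeat=3):
        for i, (qi, ri) in enumerate(sites):
            for j, (qj, rj) in enumerate(sites):
                if i == j and shift == (0, 0, 0):
                    continue
                r = math.dist(ri, [x + a * n for x, n in zip(rj, shift)])
                e_real += 0.5 * qi * qj * math.erfc(alpha * r) / r

    # Reciprocal-space sum over k = 2*pi*m/a, excluding k = 0.
    e_recip = 0.0
    for m in product(range(-n_recip, n_recip + 1), repeat=3):
        if m == (0, 0, 0):
            continue
        k = [2 * math.pi * mi / a for mi in m]
        k2 = sum(ki * ki for ki in k)
        s_k = sum(q * cmath.exp(1j * sum(ki * x for ki, x in zip(k, r))) for q, r in sites)
        e_recip += (2 * math.pi / a**3) * math.exp(-k2 / (4 * alpha**2)) / k2 * abs(s_k) ** 2

    # Self-interaction correction for the Gaussian screening charges.
    e_self = -alpha / math.sqrt(math.pi) * sum(q * q for q in charges)
    return e_real + e_recip + e_self


# Rock-salt conventional cell (4 cations, 4 anions), cell edge a = 1.
cations = [(0, 0, 0), (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5)]
anions = [(0.5, 0, 0), (0, 0.5, 0), (0, 0, 0.5), (0.5, 0.5, 0.5)]
energy = ewald_energy([1] * 4 + [-1] * 4, cations + anions, a=1.0)
madelung = -energy / 4 * 0.5  # per ion pair, scaled by the nearest-neighbour distance
print(madelung)  # close to the rock-salt Madelung constant, about 1.7476
```

Recovering the known Madelung constant of the rock-salt structure is a convenient sanity check for any Ewald routine.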

doped.complexes.get_split_vacancies(defect_gen: DefectsGenerator, elements: Iterable[str] | str | None = None, bulk_oxi_states: Structure | Composition | dict[str, int] | None = None, relative_electrostatic_energy_tol: float = 1.1, split_vac_dist_tol: float = 5, verbose: bool = True, **kwargs)

TODO. Elements can be element strings, “all” or None (cations only).

Todo: Note that this function requires the bulk oxidation states to be set (in the DefectsGenerator._bulk_oxi_states attribute), which is done automatically in DefectsGenerator initialisation when oxidation states can be successfully guessed. Additionally, this function assumes single oxidation states for each element in the bulk structure (i.e. does not account for any mixed valence).

doped.complexes.get_split_vacancies_by_geometry(bulk_supercell: Structure, interstitial_sites: Iterable[PeriodicSite] | PeriodicSite, split_vac_dist_tol: float = 5.0, all_species: bool = False, prune_symmetry_equivalent: bool = True, show_pbar: bool = False, **kwargs) → dict[frozenset[float] | int, dict]

Generate inequivalent split vacancy configurations (i.e. vacancy - interstitial - vacancy complexes) for the given interstitial sites with a maximum vacancy-interstitial distance of split_vac_dist_tol in Å.

Split vacancies are generated by finding all possible vacancy-interstitial-vacancy complexes with vacancy-interstitial distances less than split_vac_dist_tol Å using the set of input interstitial sites, and removing any symmetry-equivalent duplicates (based on the VIV distances, rounded to 0.01 Å, and V-I-V bond angle, rounded to 0.1°).

See https://doi.org/10.1088/2515-7655/ade916 for further details.

Parameters:
  • bulk_supercell (Structure) – The bulk supercell structure in which to generate split vacancies.

  • interstitial_sites (Iterable[PeriodicSite] | PeriodicSite) – The set of interstitial sites to consider for split vacancy generation.

  • split_vac_dist_tol (float) – The maximum distance between vacancy and interstitial sites to allow for candidate split vacancies (which are vacancy-interstitial-vacancy complexes). Default is 5.0 Å.

  • all_species (bool) – Whether to consider all species for vacancy generation, rather than just vacancies for species matching the interstitial species. Default is False.

  • prune_symmetry_equivalent (bool) – Whether to prune symmetry-equivalent split vacancies based on the VIV distances and bond angle as mentioned above. In rare / low-symmetry cases, this could lead to unintentional reduction of similar but not fully symmetry-equivalent candidates. Default is True.

  • show_pbar (bool) – Whether to show a progress bar during generation. Default is False.

  • **kwargs – Additional keyword arguments to use for vacancy generation, such as symprec.

Returns:

A dictionary of candidate split vacancies, with the vacancy-interstitial distances as keys and a dictionary of the defect information as values.

Return type:

dict
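The geometric enumeration and pruning can be pictured with the sketch below. It is an illustration rather than doped's implementation: viv_key and candidate_split_vacancies are hypothetical helpers, periodic images are ignored (sites are assumed to be given as the Cartesian images nearest the interstitial), and the dictionary key combines the rounded distances with the rounded V-I-V angle.

```python
import math
from itertools import combinations


def viv_key(vac_1, interstitial, vac_2):
    """Rounded V-I distances (to 0.01) and V-I-V angle (to 0.1 degrees)."""
    d1 = math.dist(vac_1, interstitial)
    d2 = math.dist(vac_2, interstitial)
    dot = sum((a - i) * (b - i) for a, b, i in zip(vac_1, vac_2, interstitial))
    cos_angle = max(-1.0, min(1.0, dot / (d1 * d2)))  # clamp against rounding error
    angle = math.degrees(math.acos(cos_angle))
    return frozenset((round(d1, 2), round(d2, 2))), round(angle, 1)


def candidate_split_vacancies(lattice_sites, interstitial, dist_tol=5.0):
    """One representative V-I-V triple per distinct (distances, angle) key."""
    near = [s for s in lattice_sites if math.dist(s, interstitial) < dist_tol]
    unique = {}
    for vac_1, vac_2 in combinations(near, 2):
        unique.setdefault(viv_key(vac_1, interstitial, vac_2), (vac_1, interstitial, vac_2))
    return unique
```

For a body-centred interstitial in a cube of lattice sites, the 28 possible vacancy pairs reduce to 3 keys, with V-I-V angles of 70.5°, 109.5° and 180°.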

doped.complexes.get_split_vacancies_from_database(*args, verbose: bool | str = False)

TODO.

doped.complexes.get_split_vacancies_from_electrostatics(bulk_supercell: Structure, interstitial_sites: Iterable[PeriodicSite] | PeriodicSite, bulk_oxi_states: Structure | Composition | dict[str, int] | None = None, relative_electrostatic_energy_tol: float = 1.1, split_vac_dist_tol: float = 5, verbose: bool = True, ndigits: int = 2, prune_symmetry_equivalent: bool = True, **kwargs)

Generate inequivalent split vacancy configurations (i.e. vacancy - interstitial - vacancy complexes) for the given interstitial sites, using geometric and electrostatic analyses.

Candidate split vacancies are generated by finding all possible vacancy-interstitial-vacancy complexes with vacancy-interstitial distances less than split_vac_dist_tol Å using the set of input interstitial sites, and removing any symmetry-equivalent duplicates (based on the VIV distances, rounded to 0.01 Å, and V-I-V bond angle, rounded to 0.1°). The electrostatic formation energies (i.e. electrostatic energy of the split vacancy supercell minus that of the bulk supercell) are then calculated, and only those within relative_electrostatic_energy_tol (default = 1.1) times the lowest point vacancy electrostatic formation energy are returned.

See https://doi.org/10.1088/2515-7655/ade916 for further details.

Parameters:
  • bulk_supercell (Structure) – The bulk supercell structure in which to generate split vacancies.

  • interstitial_sites (Iterable[PeriodicSite] | PeriodicSite) – The set of interstitial sites to consider for split vacancy generation.

  • bulk_oxi_states (Structure | Composition | dict[str, int] | None) – The oxidation states of elements to use for electrostatic energy evaluations. If not provided, oxidation states will be taken from the bulk supercell, or otherwise guessed. Default is None.

  • relative_electrostatic_energy_tol (float) – Relative tolerance for selecting candidate split vacancy configurations. Split vacancies with electrostatic formation energies (i.e. electrostatic energy of the split vacancy supercell minus that of the bulk supercell) less than relative_electrostatic_energy_tol times the lowest point vacancy electrostatic formation energy are returned. Default is 1.1.

  • split_vac_dist_tol (float) – The maximum distance between vacancy and interstitial sites to allow for candidate split vacancies (which are vacancy-interstitial-vacancy complexes). Default is 5.0 Å.

  • verbose (bool) – Whether to print verbose information about the generation process. Default is True.

  • ndigits (int) – The number of decimal places to round the electrostatic formation energies to, which is used to avoid degenerate configurations. Default is 2.

  • prune_symmetry_equivalent (bool) – Whether to prune symmetry-equivalent split vacancies based on the VIV distances and bond angle (see get_split_vacancies_by_geometry). In rare / low-symmetry cases, this could lead to unintentional reduction of similar but not fully symmetry-equivalent candidates. Default is True.

  • **kwargs – Additional keyword arguments to use for vacancy generation, such as symprec.

Returns:

A dictionary with information about candidate split vacancies, including:

  • dictionary of split vacancy electrostatic formation energies as keys and the constituent vacancy and interstitial sites as a dictionary for values,

  • the minimum electrostatic formation energies for split vacancies, point vacancies, interstitials and isolated vacancy-interstitial-vacancy combinations,

  • the number of split vacancies within the relative_electrostatic_energy_tol tolerance, or with lower electrostatic formation energy than the lowest point vacancy electrostatic formation energy,

  • the input structure (bulk supercell) with oxidation states added,

  • the absolute electrostatic energy of the input structure.

Return type:

dict
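The energy-based selection amounts to a threshold filter followed by rounding, as in the sketch below. This is not doped's code: select_candidates and the label-to-energy input format are hypothetical, and positive electrostatic formation energies are assumed so that multiplying by the relative tolerance widens the acceptance window.

```python
def select_candidates(split_vacancy_energies, point_vacancy_energies, relative_tol=1.1, ndigits=2):
    """Keep split vacancies below relative_tol x the lowest point vacancy energy.

    Energies are rounded to `ndigits` decimal places and used as dictionary
    keys, so configurations that are degenerate after rounding appear once.
    """
    threshold = relative_tol * min(point_vacancy_energies)
    selected = {}
    for label, energy in split_vacancy_energies.items():
        rounded = round(energy, ndigits)
        if rounded < threshold and rounded not in selected:
            selected[rounded] = label
    return selected
```

With the default tolerance of 1.1, a split vacancy is retained if its electrostatic formation energy is at most about 10% above that of the most favourable point vacancy.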

doped.complexes.molecule_from_sites(sites: list[PeriodicSite], anchor_idx: int = 0) → Molecule

Generate a Molecule from a list of PeriodicSite objects, accounting for periodic boundary conditions.

sites[anchor_idx] is taken as the ‘anchor’ site, from which the closest periodic image of each other site is used in constructing the molecule.

Parameters:
  • sites (list[PeriodicSite]) – The list of PeriodicSite objects to generate a Molecule from.

  • anchor_idx (int) – The index of the anchor site in the list of PeriodicSite objects. Default is 0.

Returns:

The Molecule object generated from the list of PeriodicSite objects.

Return type:

Molecule
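The anchor-based unwrapping can be illustrated with fractional coordinates alone. This sketch is not doped's implementation: nearest_images is a hypothetical helper, and wrapping each fractional component independently only yields the closest periodic image for orthogonal (e.g. cubic) cells.

```python
def nearest_images(frac_coords, anchor_idx=0):
    """Shift every site to the periodic image closest to the anchor site."""
    anchor = frac_coords[anchor_idx]
    return [
        # wrap the offset from the anchor into [-0.5, 0.5) along each axis
        tuple(a + ((x - a + 0.5) % 1.0 - 0.5) for x, a in zip(frac, anchor))
        for frac in frac_coords
    ]
```

Two sites at x = 0.95 and x = 0.05, for example, are 0.1 apart across the cell boundary; with the first as anchor, the second is moved to x = 1.05 so that the resulting molecule is compact rather than spanning the cell.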