doped.utils package

Submodules

doped.utils.displacements module

Code to analyse site displacements around a defect.

doped.utils.displacements.calc_site_displacements(defect_entry, vector_to_project_on: list | None = None, relative_to_defect: bool | None = False) → dict[source]

Calculates the site displacements in the defect supercell, relative to the bulk supercell. The signed displacements are stored in the calculation_metadata of the DefectEntry object under the “site_displacements” key.

Parameters:

defect_entry (DefectEntry) – DefectEntry object
vector_to_project_on (list) – Direction to project the site displacements along (e.g. [0, 0, 1]). Defaults to None.
relative_to_defect (bool) – Whether to calculate the signed displacements along the line from the defect site to that atom. Negative values indicate the atom moves towards the defect (compressive strain), positive values indicate the atom moves away from the defect. Defaults to False. If True, the relative displacements are stored in the Displacement wrt defect key of the returned dictionary.

Returns:

Dictionary with site displacements (compared to pristine supercell).
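
The sign convention used for relative_to_defect can be illustrated with a minimal sketch in plain Python. It assumes Cartesian coordinates in Å and ignores periodic boundary conditions; signed_displacement_wrt_defect is a hypothetical helper written for illustration, not the doped implementation.

```python
import math


def signed_displacement_wrt_defect(defect_pos, bulk_pos, relaxed_pos):
    """Project an atom's displacement onto the defect -> atom direction.

    Hypothetical helper (not part of doped): Cartesian coordinates in Å,
    no periodic boundary conditions.
    """
    # Unit vector pointing from the defect site to the atom's bulk position
    direction = [b - d for b, d in zip(bulk_pos, defect_pos)]
    norm = math.sqrt(sum(x * x for x in direction))
    unit = [x / norm for x in direction]
    # Displacement of the atom between the bulk and defect supercells
    disp = [r - b for r, b in zip(relaxed_pos, bulk_pos)]
    # Negative: atom moves towards the defect; positive: away from it
    return sum(u * x for u, x in zip(unit, disp))


towards = signed_displacement_wrt_defect([0, 0, 0], [2, 0, 0], [1.9, 0, 0])
away = signed_displacement_wrt_defect([0, 0, 0], [2, 0, 0], [2.1, 0, 0])
```

Here towards is negative (the atom relaxes inwards) and away is positive, matching the convention described for relative_to_defect.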

doped.utils.displacements.plot_site_displacements(defect_entry, separated_by_direction: bool | None = False, relative_to_defect: bool | None = False, vector_to_project_on: list | None = None, use_plotly: bool | None = True, style_file: str | None = '')[source]

Plots site displacements around a defect.

Parameters:

defect_entry – DefectEntry object
separated_by_direction – Whether to plot site displacements separated by direction (x, y, z). Default is False.
relative_to_defect (bool) – Whether to plot the signed displacements along the line from the defect site to that atom. Negative values indicate the atom moves towards the defect (compressive strain), positive values indicate the atom moves away from the defect (tensile strain). Uses the relaxed defect position as reference.
vector_to_project_on – Direction to project the site displacements along (e.g. [0, 0, 1]). Defaults to None (i.e. the displacements are calculated in the cartesian basis x, y, z).
use_plotly – Whether to use Plotly for plotting. Default is True.
style_file – Path to matplotlib style file. Default is “”, which will use the doped default style.

Returns:

Plotly or matplotlib figure.

doped.utils.legacy_corrections module

Functions for computing legacy finite-size charge corrections (Makov-Payne, Murphy-Hine, Lany-Zunger) for defect formation energies.

Mostly adapted from the deprecated AIDE package developed by the dynamic duo Adam Jackson and Alex Ganose (https://github.com/SMTG-Bham/aide).

doped.utils.legacy_corrections.get_murphy_image_charge_correction(lattice, dielectric_matrix, conv=0.3, factor=30, verbose=False)[source]

Calculates the anisotropic image charge correction by Sam Murphy in eV.

This is a rewrite of the code ‘madelung.pl’ written by Sam Murphy (see [1]). The default convergence parameter of conv = 0.3 seems to work perfectly well. However, it may be worth testing convergence of defect energies with respect to the factor (i.e. cut-off radius).

References

[1] S. T. Murphy and N. D. H. Hine, Phys. Rev. B 87, 094111 (2013).

Parameters:

lattice (list) – The defect cell lattice as a 3x3 matrix.
dielectric_matrix (list) – The dielectric tensor as 3x3 matrix.
conv (float) – A value between 0.1 and 0.9 which adjusts how much real space vs reciprocal space contribution there is.
factor – The cut-off radius, defined as a multiple of the longest cell parameter.
verbose (bool) – If True details of the correction will be printed.

Returns:

The image charge correction as a {charge: correction} dictionary.

doped.utils.legacy_corrections.lany_zunger_corrected_defect_dict(defect_dict: dict)[source]

Convert an input parsed defect dictionary (presumably created using DefectParser) with Freysoldt/Kumagai charge corrections to the same parsed defect dictionary but with the Lany-Zunger charge correction (same potential alignment plus 0.65 * Makov-Payne image charge correction).

Parameters:

defect_dict (dict) – Dictionary of parsed defect calculations (presumably created using DefectParser; see tutorials). Must have ‘freysoldt_meta’ in defect.calculation_metadata for each charged defect (from DefectParser.load_FNV_data()).

Returns:

Parsed defect dictionary with Lany-Zunger charge corrections.
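
The arithmetic of the Lany-Zunger correction described above can be sketched as follows. The helper name and the example values are hypothetical and purely illustrative; only the 0.65 scaling of the Makov-Payne image charge term comes from the description above.

```python
LZ_SCALING = 0.65  # fraction of the Makov-Payne image charge term (from the text above)


def lany_zunger_correction(potential_alignment, makov_payne_image_charge):
    """Hypothetical helper: total Lany-Zunger correction, all terms in eV."""
    return potential_alignment + LZ_SCALING * makov_payne_image_charge


# Illustrative (made-up) values in eV
correction = lany_zunger_correction(
    potential_alignment=0.05, makov_payne_image_charge=0.40
)
```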

doped.utils.parsing module

Helper functions for parsing VASP supercell defect calculations.

doped.utils.parsing.check_atom_mapping_far_from_defect(bulk, defect, defect_coords)[source]: Check the displacement of atoms far from the determined defect site, and warn the user if they are large (often indicates a mismatch between the bulk and defect supercell definitions).

doped.utils.parsing.defect_charge_from_vasprun(defect_vr: Vasprun, charge_state: int | None) → int[source]

Determine the defect charge state from the defect vasprun, and compare to the manually-set charge state if provided.

Parameters:

defect_vr (Vasprun) – Defect pymatgen Vasprun object.
charge_state (int) – Manually-set charge state for the defect, to check if it matches the auto-determined charge state.

Returns:

The auto-determined defect charge state.

Return type:

int

doped.utils.parsing.find_archived_fname(fname, raise_error=True)[source]: Find a suitable filename, taking account of possible use of compression software.
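
The idea of looking for compressed variants of a file can be sketched with the standard library. This is a hypothetical stand-in, not the doped implementation; in particular, the list of suffixes tried is an assumption.

```python
import tempfile
from pathlib import Path


def find_possibly_compressed(fname, suffixes=(".gz", ".xz", ".bz2")):
    """Hypothetical helper: return fname, or a compressed variant if only that exists."""
    path = Path(fname)
    if path.exists():
        return path
    for suffix in suffixes:  # assumed set of compression suffixes
        candidate = Path(str(path) + suffix)
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"No (compressed) file found for {fname}")


# Usage: only OUTCAR.gz exists, so asking for OUTCAR finds the archive
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "OUTCAR.gz").touch()
    found = find_possibly_compressed(Path(tmp) / "OUTCAR").name
```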

doped.utils.parsing.find_nearest_coords(bulk_coords, target_coords, bulk_lattice_matrix, defect_type='substitution', searched_structure='bulk', unique_tolerance=1)[source]: Find the nearest coords in bulk_coords to target_coords.

doped.utils.parsing.get_coords_and_idx_of_species(structure, species_name)[source]: Get arrays of the coordinates and indices of the given species in the structure.

doped.utils.parsing.get_defect_site_idxs_and_unrelaxed_structure(bulk, defect, defect_type, composition_diff, unique_tolerance=1)[source]

Get the defect site and unrelaxed structure, where ‘unrelaxed structure’ corresponds to the pristine defect supercell structure for vacancies / substitutions, and the pristine bulk structure with the final relaxed interstitial site for interstitials.

Initially contributed by Dr. Alex Ganose (@ Imperial Chemistry) and refactored for extrinsic species and code efficiency/robustness improvements.

Returns:

bulk_site_idx – Index of the site in the bulk structure that corresponds to the defect site in the defect structure.
defect_site_idx – Index of the defect site in the defect structure.
unrelaxed_defect_structure – Pristine defect supercell structure for vacancies/substitutions (i.e. pristine bulk with unrelaxed vacancy/substitution), or the pristine bulk structure with the final relaxed interstitial site for interstitials.

doped.utils.parsing.get_defect_type_and_composition_diff(bulk, defect)[source]

Get the difference in composition between a bulk structure and a defect structure.

Contributed by Dr. Alex Ganose (@ Imperial Chemistry) and refactored for extrinsic species and code efficiency/robustness improvements.

doped.utils.parsing.get_interstitial_site_and_orientational_degeneracy(interstitial_defect_entry: DefectEntry, dist_tol: float = 0.15) → int[source]

Get the combined site and orientational degeneracy of an interstitial defect entry.

The standard approach of using _get_equiv_sites() for interstitial site multiplicity and then point_symmetry_from_defect_entry() & get_orientational_degeneracy for symmetry/orientational degeneracy is preferred (as used in the DefectParser code), but alternatively this function can be used to compute the product of the site and orientational degeneracies.

This is done by determining the number of equivalent sites in the bulk supercell for the given interstitial site (from defect_supercell_site), which gives the combined site and orientational degeneracy if there was no relaxation of the bulk lattice atoms. This matches the true combined degeneracy in most cases, except for split-interstitial type defects etc, where this would give an artificially high degeneracy (as, for example, the interstitial site is automatically assigned to one of the split-interstitial atoms and not the midpoint, giving a doubled degeneracy as it considers the two split-interstitial sites as two separate (degenerate) interstitial sites, instead of one). This is counteracted by dividing by the number of sites which are present in the defect supercell (within a distance tolerance of dist_tol in Å) with the same species, ensuring none of the predicted different equivalent sites are actually included in the defect structure.

Parameters:

interstitial_defect_entry – DefectEntry object for the interstitial defect.
dist_tol – distance tolerance in Å for determining equivalent sites.

Returns:

combined site and orientational degeneracy of the interstitial defect entry (int).

doped.utils.parsing.get_locpot(locpot_path: str | Path)[source]: Read the LOCPOT(.gz) file as a pymatgen Locpot object.

doped.utils.parsing.get_magnetization_from_vasprun(vasprun: Vasprun) → int | float[source]

Determine the magnetization (number of spin-up vs spin-down electrons) from a Vasprun object.

Parameters:

vasprun (Vasprun) – The Vasprun object from which to extract the total magnetization.

Returns:

The total magnetization of the system.

Return type:

int or float

doped.utils.parsing.get_nelect_from_vasprun(vasprun: Vasprun) → int | float[source]

Determine the number of electrons (NELECT) from a Vasprun object.

Parameters:

vasprun (Vasprun) – The Vasprun object from which to extract NELECT.

Returns:

The number of electrons in the system.

Return type:

int or float

doped.utils.parsing.get_neutral_nelect_from_vasprun(vasprun: Vasprun, skip_potcar_init: bool = False) → int | float[source]

Determine the number of electrons (NELECT) from a Vasprun object, corresponding to a neutral charge state for the structure.

Parameters:

vasprun (Vasprun) – The Vasprun object from which to extract NELECT.
skip_potcar_init (bool) – Whether to skip the initialisation of the POTCAR statistics (i.e. the auto-charge determination) and instead try to reverse engineer NELECT using the DefectDictSet.

Returns:

The number of electrons in the system for a neutral charge state.

Return type:

int or float
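
The relationship between these electron counts and the defect charge state can be sketched as below. This is an illustration with a hypothetical helper, assuming the usual convention that removing electrons relative to the neutral count gives a positive charge state; it is not the doped implementation.

```python
def charge_from_nelect(nelect, neutral_nelect):
    """Hypothetical helper: charge state implied by NELECT and the neutral NELECT."""
    charge = neutral_nelect - nelect  # fewer electrons -> positive charge
    if abs(charge - round(charge)) > 1e-6:
        raise ValueError(f"Non-integer charge state: {charge}")
    return int(round(charge))


q_plus_two = charge_from_nelect(nelect=574.0, neutral_nelect=576.0)
q_minus_one = charge_from_nelect(nelect=577, neutral_nelect=576)
```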

doped.utils.parsing.get_orientational_degeneracy(defect_entry: DefectEntry | None = None, relaxed_point_group: str | None = None, bulk_site_point_group: str | None = None, bulk_symm_ops: list | None = None, defect_symm_ops: list | None = None, symprec: float = 0.1) → float[source]

Get the orientational degeneracy factor for a given relaxed DefectEntry, by supplying either the DefectEntry object or the bulk-site & relaxed defect point group symbols (e.g. “Td”, “C3v” etc).

If a DefectEntry is supplied (and the point group symbols are not), this is computed by determining the relaxed defect point symmetry and the (unrelaxed) bulk site symmetry, and then getting the ratio of their point group orders (equivalent to the ratio of partition functions or number of symmetry operations (i.e. degeneracy)).

For interstitials, the bulk site symmetry corresponds to the point symmetry of the interstitial site with no relaxation of the host structure, while for vacancies/substitutions it is simply the symmetry of their corresponding bulk site. This corresponds to the point symmetry of DefectEntry.defect, or calculation_metadata["bulk_site"]/["unrelaxed_defect_structure"].

Note: This tries to use the defect_entry.defect_supercell to determine the relaxed site symmetry. However, it should be noted that this is not guaranteed to work in all cases; namely for non-diagonal supercell expansions, or sometimes for non-scalar supercell expansion matrices (e.g. a 2x1x2 expansion), particularly with high-symmetry materials, which can mess up the periodicity of the cell. doped tries to automatically check if this is the case, and will warn you if so.

This can also be checked by using this function on your doped generated defects:

from doped.generation import get_defect_name_from_entry
for defect_name, defect_entry in defect_gen.items():
    print(defect_name, get_defect_name_from_entry(defect_entry, relaxed=False),
          get_defect_name_from_entry(defect_entry), "\n")

And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry (and orientational degeneracy) - otherwise periodicity-breaking prevents this.

If periodicity-breaking prevents auto-symmetry determination, you can manually determine the relaxed defect and bulk-site point symmetries, and/or orientational degeneracy, from visualising the structures (e.g. using VESTA), and then set the corresponding values in the calculation_metadata['relaxed point symmetry']/['bulk site symmetry'] and/or degeneracy_factors['orientational degeneracy'] attributes (get_orientational_degeneracy can be used to obtain the orientational degeneracy factor for given defect/bulk-site point symmetries). Note that the bulk-site point symmetry corresponds to that of DefectEntry.defect, or equivalently calculation_metadata["bulk_site"]/["unrelaxed_defect_structure"], which for vacancies/substitutions is the symmetry of the corresponding bulk site, while for interstitials it is the point symmetry of the final relaxed interstitial site when placed in the (unrelaxed) bulk structure. The degeneracy factor is used in the calculation of defect/carrier concentrations and Fermi level behaviour (see e.g. https://doi.org/10.1039/D2FD00043A & https://doi.org/10.1039/D3CS00432E).

Parameters:

defect_entry (DefectEntry) – DefectEntry object. (Default = None)
relaxed_point_group (str) – Point group symmetry (e.g. “Td”, “C3v” etc) of the relaxed defect structure, if already calculated / manually determined. Default is None (automatically calculated by doped).
bulk_site_point_group (str) – Point group symmetry (e.g. “Td”, “C3v” etc) of the defect site in the bulk, if already calculated / manually determined. For vacancies/substitutions, this should match the site symmetry label from doped when generating the defect, while for interstitials it should be the point symmetry of the final relaxed interstitial site, when placed in the bulk structure. Default is None (automatically calculated by doped).
bulk_symm_ops (list) – List of symmetry operations of the defect_entry.bulk_supercell structure (used in determining the unrelaxed bulk site symmetry), to avoid re-calculating. Default is None (recalculates).
defect_symm_ops (list) – List of symmetry operations of the defect_entry.defect_supercell structure (used in determining the relaxed point symmetry), to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib to use when determining point symmetries and thus orientational degeneracies. Default is 0.1 which matches that used by the Materials Project and is larger than the pymatgen default of 0.01 to account for residual structural noise in relaxed defect supercells. You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.).

Returns:

orientational degeneracy factor for the defect.

Return type:

float
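
The ratio of point group orders described above can be sketched for the case where both point group symbols are already known. The helper and the small lookup table are hypothetical and cover only a few common point groups; the orders themselves are the standard group orders.

```python
# Orders (number of symmetry operations) of a few common point groups
POINT_GROUP_ORDER = {
    "C1": 1, "Cs": 2, "C2": 2, "C2v": 4, "C3v": 6, "D2d": 8, "Td": 24, "Oh": 48,
}


def orientational_degeneracy(relaxed_point_group, bulk_site_point_group):
    """Hypothetical helper: bulk-site group order / relaxed defect group order."""
    return (
        POINT_GROUP_ORDER[bulk_site_point_group]
        / POINT_GROUP_ORDER[relaxed_point_group]
    )


# A defect on a Td site that relaxes to C3v has 24 / 6 = 4 equivalent orientations
td_to_c3v = orientational_degeneracy("C3v", "Td")
```

A defect that keeps the full site symmetry (e.g. Td to Td) gives a factor of 1 under this scheme.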

doped.utils.parsing.get_outcar(outcar_path: str | Path)[source]: Read the OUTCAR(.gz) file as a pymatgen Outcar object.

doped.utils.parsing.get_procar(procar_path: str | Path)[source]

Read the PROCAR(.gz) file as an easyunfold Procar object (if easyunfold is installed), else a pymatgen Procar object (which doesn’t support SOC).

If easyunfold is installed, the Procar will be parsed with easyunfold and then the proj_data attribute will be converted to a data attribute (to be compatible with pydefect, which uses the pymatgen format).

doped.utils.parsing.get_site_mapping_indices(structure_a: Structure, structure_b: Structure, threshold=2.0)[source]

Reset the position of a partially relaxed structure to its unrelaxed positions.

The template structure may have a different species ordering to the input_structure.

NOTE: This assumes that both structures have the same lattice definitions (i.e. that they match, and aren’t rigidly translated/rotated with respect to each other), which is mostly the case unless we have a mismatching defect/bulk supercell (in which case the check_atom_mapping_far_from_defect warning should be thrown anyway during parsing). Currently this function is only used for analysing site displacements in the displacements module so this is fine (user will already have been warned at this point if there is a possible mismatch).

doped.utils.parsing.get_vasprun(vasprun_path: str | Path, **kwargs)[source]: Read the vasprun.xml(.gz) file as a pymatgen Vasprun object.

doped.utils.parsing.parse_projected_eigen_no_mag(elem)[source]

Parse the projected eigenvalues from a Vasprun object (used during initialisation), but excluding the projected magnetisation for efficiency.

This is a modified version of _parse_projected_eigen from pymatgen.io.vasp.outputs.Vasprun, which skips parsing of the projected magnetisation in order to expedite parsing in doped, as well as some small adjustments to maximise efficiency.

doped.utils.parsing.reorder_s1_like_s2(s1_structure: Structure, s2_structure: Structure, threshold=5.0)[source]

Reorder the atoms of a (relaxed) structure, s1, to match the ordering of the atoms in s2_structure.

The s1/s2 structures may have different species orderings.

Previously used to ensure correct site matching when pulling site potentials for the eFNV Kumagai correction, though no longer used for this purpose. If threshold is set to a low value, it will raise a warning if there is a large site displacement detected.

NOTE: This assumes that both structures have the same lattice definitions (i.e. that they match, and aren’t rigidly translated/rotated with respect to each other), which is mostly the case unless we have a mismatching defect/bulk supercell (in which case the check_atom_mapping_far_from_defect warning should be thrown anyway during parsing). Currently this function is no longer used, but if it is reintroduced at any point, this point should be noted!

doped.utils.parsing.simple_spin_degeneracy_from_charge(structure, charge_state: int = 0) → int[source]

Get the defect spin degeneracy from the supercell and charge state, assuming either simple singlet (S=0) or doublet (S=1/2) behaviour.

Even-electron defects are assumed to have a singlet ground state, and odd-electron defects are assumed to have a doublet ground state.
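
The even/odd electron rule can be sketched without any structure object. This is an illustration under the assumption that the electron parity is taken from the summed atomic numbers minus the charge state; the helper is hypothetical and takes a plain list of atomic numbers rather than a pymatgen Structure.

```python
def simple_spin_degeneracy(atomic_numbers, charge_state=0):
    """Hypothetical helper: 1 (singlet) for even electron counts, 2 (doublet) for odd."""
    total_electrons = sum(atomic_numbers) - charge_state
    return 1 if total_electrons % 2 == 0 else 2


cd_te = [48, 52]  # atomic numbers of one Cd and one Te atom
neutral = simple_spin_degeneracy(cd_te, charge_state=0)
singly_charged = simple_spin_degeneracy(cd_te, charge_state=+1)
```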

doped.utils.parsing.suppress_logging(level=50)[source]: Context manager to catch and suppress logging messages.
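
A context manager of this kind can be sketched with the standard logging module. This is a hypothetical stand-in rather than the doped source; it assumes that restoring logging.NOTSET on exit is acceptable (i.e. nothing was disabled beforehand). The default level of 50 corresponds to logging.CRITICAL.

```python
import contextlib
import logging


@contextlib.contextmanager
def quiet_logging(level=logging.CRITICAL):
    """Hypothetical stand-in: silence messages at or below `level` inside the block."""
    logging.disable(level)
    try:
        yield
    finally:
        logging.disable(logging.NOTSET)  # re-enable logging on exit


with quiet_logging():
    logging.warning("This message is suppressed")
    disabled_inside = logging.root.manager.disable
disabled_after = logging.root.manager.disable
```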

doped.utils.plotting module

Code to analyse VASP defect calculations.

These functions are built from a combination of useful modules from pymatgen and AIDE (by Adam Jackson and Alex Ganose), alongside substantial modification, in an effort to make an efficient, user-friendly package for managing and analysing defect calculations, with publication-quality outputs.

doped.utils.plotting.format_defect_name(defect_species: str, include_site_info_in_name: bool, wout_charge: bool = False) → str | None[source]

Format defect name for plot titles.

(i.e. from Cd_i_C3v_0 to $Cd_{i}^{0}$ or $Cd_{i_{C3v}}^{0}$). Note this assumes “V_” means vacancy not Vanadium.

Parameters:

defect_species (str) – Name of defect including charge state (e.g. Cd_i_C3v_0)
include_site_info_in_name (bool) – Whether to include site info in name (e.g. $Cd_{i}^{0}$ or $Cd_{i_{C3v}}^{0}$).
wout_charge (bool, optional) – Whether the charge state is included in the defect_species name. Defaults to False.

Returns:

formatted defect name

Return type:

str

doped.utils.supercells module

Utility code and functions for generating defect supercells.

doped.utils.supercells.cell_metric(cell_matrix: ndarray, target: str = 'SC') → float[source]

Calculates the deviation of the given cell matrix from an ideal simple cubic (if target = “SC”) or face-centred cubic (if target = “FCC”) matrix, by evaluating the root mean square (RMS) difference of the vector lengths from that of the idealised values (i.e. the corresponding SC/FCC lattice vector lengths for the given cell volume).

For target = “SC”, the idealised lattice vector length is the effective cubic length (i.e. the cube root of the volume), while for “FCC” it is 2^(1/6) (~1.12) times the effective cubic length. This is a fixed version of the cell metric function in ASE (get_deviation_from_optimal_cell_shape), described in https://wiki.fysik.dtu.dk/ase/tutorials/defects/defects.html which currently does not account for rotated matrices (e.g. a cubic cell with target = “SC”, which should have a perfect score of 0, will have a bad score if its lattice vectors are rotated away from x, y and z, or even if they are just swapped as z, x, y). e.g. with ASE, [[1, 0, 0], [0, 1, 0], [0, 0, 1]] and [[0, 0, 1], [1, 0, 0], [0, 1, 0]] give scores of 0 and 1, but with this function they both give perfect scores of 0 as desired.

Parameters:

cell_matrix (np.ndarray) – Cell matrix for which to calculate the cell metric.
target (str) – Target cell shape, for which to calculate the normalised deviation score from. Either “SC” for simple cubic or “FCC” for face-centred cubic. Default = “SC”

Returns:

Cell metric (0 is perfect score)

Return type:

float
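
A length-based metric of this kind can be sketched in plain Python to show why it is insensitive to permuting or rotating the lattice vectors. This is an illustration, not the doped implementation: the helper names are hypothetical, and normalising each deviation by the ideal length before taking the RMS is an assumption.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def length_based_cell_metric(cell, target="SC"):
    """Hypothetical sketch: RMS relative deviation of vector lengths from the ideal."""
    eff_cubic_length = abs(det3(cell)) ** (1 / 3)  # cube root of the cell volume
    scale = 2 ** (1 / 6) if target.upper() == "FCC" else 1.0
    ideal = eff_cubic_length * scale
    lengths = [math.sqrt(sum(x * x for x in row)) for row in cell]
    return math.sqrt(sum(((l - ideal) / ideal) ** 2 for l in lengths) / 3)


cubic = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
permuted = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
elongated = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
fcc_primitive = [[0, 0.5, 0.5], [0.5, 0, 0.5], [0.5, 0.5, 0]]
```

Because only vector lengths and the cell volume enter, cubic and permuted both score 0 for target "SC", while elongated scores above 0. The FCC primitive cell scores (numerically) 0 for target "FCC".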

doped.utils.supercells.find_ideal_supercell(cell: ndarray, target_size: int, limit: int = 2, clean: bool = True, return_min_dist: bool = False, verbose: bool = False) → ndarray | tuple[source]

Given an input cell matrix (e.g. Structure.lattice.matrix or Atoms.cell) and chosen target_size (size of supercell in number of cells), finds an ideal supercell matrix (P) that yields the largest minimum image distance (i.e. minimum distance between periodic images of sites in a lattice), while also being as close to cubic as possible.

Supercell matrices are searched for by first identifying the ideal (fractional) transformation matrix (P) that would yield a perfectly cubic supercell with volume equal to target_size, and then scanning over all matrices where the elements are within +/-limit of the ideal P matrix elements (rounded to the nearest integer). For relatively small target_sizes (<100) and/or cells with mostly similar lattice vector lengths, the default limit of +/-2 performs very well. For larger target_sizes, cells with very different lattice vector lengths, and/or cases where small differences in minimum image distance are very important, a larger limit may be required (though typically only improves the minimum image distance by 1-6%).

This is also known as the Shortest Vector Problem (SVP) (https://wikipedia.org/wiki/Lattice_problem#Shortest_vector_problem_(SVP)). It has no known analytical solution and requires enumeration-type approaches, so it can be slow in certain cases.

Parameters:

cell (np.ndarray) – Unit cell matrix for which to find a supercell.
target_size (int) – Target supercell size (in number of cells).
limit (int) – Supercell matrices are searched for by first identifying the ideal (fractional) transformation matrix (P) that would yield a perfectly SC/FCC supercell with volume equal to target_size, and then scanning over all matrices where the elements are within +/-limit of the ideal P matrix elements (rounded to the nearest integer). (Default = 2)
clean (bool) – Whether to return the supercell matrix which gives the ‘cleanest’ supercell (according to _lattice_matrix_sorting_func; most symmetric, with mostly positive diagonals and c >= b >= a). (Default = True)
return_min_dist (bool) – Whether to return the minimum image distance (in Å) as a second return value. (Default = False)
verbose (bool) – Whether to print out extra information. (Default = False)

Returns:

Supercell matrix (P), plus the minimum image distance (in Å) as a float if return_min_dist is True.

Return type:

np.ndarray

doped.utils.supercells.find_optimal_cell_shape(cell: ndarray, target_size: int, target_shape: str = 'SC', limit: int = 2, return_score: bool = False, verbose: bool = False) → ndarray | tuple[source]

Find the transformation matrix that produces a supercell corresponding to target_size unit cells that most closely approximates the shape defined by target_shape.

This is an updated version of ASE’s find_optimal_cell_shape() function, but fixed to be rotationally-invariant (explained below), with significant efficiency improvements, and then secondarily sorted by the (fixed) cell metric (in doped), and then by some other criteria to give the cleanest output.

Finds the optimal supercell transformation matrix by calculating the deviation of the possible supercell matrices from an ideal simple cubic (if target = “SC”) or face-centred cubic (if target = “FCC”) matrix - and then taking that with the best (lowest) score, by evaluating the root mean square (RMS) difference of the vector lengths from that of the idealised values (i.e. the corresponding SC/FCC lattice vector lengths for the given cell volume).

For target = “SC”, the idealised lattice vector length is the effective cubic length (i.e. the cube root of the volume), while for “FCC” it is 2^(1/6) (~1.12) times the effective cubic length. The get_deviation_from_optimal_cell_shape function in ASE - described in https://wiki.fysik.dtu.dk/ase/tutorials/defects/defects.html - currently does not account for rotated matrices (e.g. a cubic cell with target = “SC”, which should have a perfect score of 0, will have a bad score if its lattice vectors are rotated away from x, y and z, or even if they are just swapped as z, x, y). e.g. with ASE, [[1, 0, 0], [0, 1, 0], [0, 0, 1]] and [[0, 0, 1], [1, 0, 0], [0, 1, 0]] give scores of 0 and 1, but with this function they both give perfect scores of 0 as desired.

Parameters:

cell (np.ndarray) – Unit cell matrix for which to find a supercell transformation.
target_size (int) – Target supercell size (in number of cells).
target_shape (str) – Target cell shape, for which to calculate the normalised deviation score from. Either “SC” for simple cubic or “FCC” for face-centred cubic. Default = “SC”
limit (int) – Supercell matrices are searched for by first identifying the ideal (fractional) transformation matrix (P) that would yield a perfectly SC/FCC supercell with volume equal to target_size, and then scanning over all matrices where the elements are within +/-limit of the ideal P matrix elements (rounded to the nearest integer). (Default = 2)
return_score (bool) – Whether to return the cell metric score as a second return value. (Default = False)
verbose (bool) – Whether to print out extra information. (Default = False)

Returns:

Supercell matrix (P), plus the cell metric (0 is perfect score) as a float if return_score is True.

Return type:

np.ndarray

doped.utils.supercells.get_min_image_distance(structure: Structure) → float[source]

Get the minimum image distance (i.e. minimum distance between periodic images of sites in a lattice) for the input structure.

This is also known as the Shortest Vector Problem (SVP), and has no known analytical solution, requiring enumeration-type approaches (https://wikipedia.org/wiki/Lattice_problem#Shortest_vector_problem_(SVP)).

Parameters:

structure (Structure) – Structure object.

Returns:

Minimum image distance.

Return type:

float
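
The enumeration approach mentioned above can be sketched as a brute-force search over integer combinations of the lattice vectors. This is a hypothetical illustration working on a plain 3x3 lattice matrix (rows are lattice vectors, in Å) rather than a Structure; the fixed search range is an assumption and may be too small for strongly skewed cells.

```python
import itertools
import math


def brute_force_min_image_distance(lattice, search_range=2):
    """Hypothetical sketch: length of the shortest non-zero lattice vector."""
    shortest = math.inf
    indices = range(-search_range, search_range + 1)
    for n in itertools.product(indices, repeat=3):
        if n == (0, 0, 0):
            continue  # skip the zero vector (a site and itself)
        vec = [sum(n[i] * lattice[i][k] for i in range(3)) for k in range(3)]
        shortest = min(shortest, math.sqrt(sum(x * x for x in vec)))
    return shortest


cubic_distance = brute_force_min_image_distance(
    [[5, 0, 0], [0, 5, 0], [0, 0, 5]]
)
# For a skewed cell the shortest vector need not be a cell edge: here it is b - a
skewed_distance = brute_force_min_image_distance(
    [[5, 0, 0], [4, 3, 0], [0, 0, 10]]
)
```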

doped.utils.supercells.get_pmg_cubic_supercell_dict(struct: Structure, uc_range: tuple = (1, 200)) → dict[source]

Get a dictionary of (near-)cubic supercell matrices for the given structure and range of numbers of unit cells (in the supercell).

Returns a dictionary of format:

{Number of Unit Cells:
    {"P": transformation matrix,
     "min_dist": minimum image distance}
}

for (near-)cubic supercells generated by the pymatgen CubicSupercellTransformation class. If a (near-)cubic supercell cannot be found for a given number of unit cells, then the corresponding dict value will be set to an empty dict.

Parameters:

struct (Structure) – pymatgen Structure object to generate supercells for
uc_range (tuple) – Range of numbers of unit cells to search over

Returns:

Dictionary of format {Number of Unit Cells: {"P": transformation matrix, "min_dist": minimum image distance}}.

Return type:

dict

doped.utils.symmetry module

Utility code and functions for symmetry analysis of structures and defects.

doped.utils.symmetry.apply_symm_op_to_site(symm_op: SymmOp, site: PeriodicSite, lattice: Lattice) → PeriodicSite[source]: Apply the given symmetry operation to the input site (not in place) and return the new site.

doped.utils.symmetry.apply_symm_op_to_struct(struct: Structure, symm_op: SymmOp, fractional: bool = False) → Structure[source]

Apply a symmetry operation to a structure and return the new structure.

This differs from pymatgen’s apply_operation method in that it does not apply the operation in place (i.e. it does not modify the input structure), which avoids the need for unnecessary and slow Structure.copy() calls, making the structure manipulation / symmetry analysis functions more efficient.
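
The non-in-place behaviour can be illustrated on bare fractional coordinates. This sketch is hypothetical and does not use pymatgen objects: the symmetry operation is represented by a plain rotation matrix and translation vector acting on fractional coordinates.

```python
def apply_op_to_frac_coords(rotation, translation, frac_coords):
    """Hypothetical helper: return transformed coordinates, leaving the input untouched."""
    new_coords = []
    for coord in frac_coords:
        rotated = [
            sum(rotation[i][j] * coord[j] for j in range(3)) for i in range(3)
        ]
        # Add the translation and wrap back into the unit cell, [0, 1)
        new_coords.append([(r + t) % 1.0 for r, t in zip(rotated, translation)])
    return new_coords


inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]
original = [[0.25, 0.25, 0.25]]
transformed = apply_op_to_frac_coords(inversion, [0, 0, 0], original)
```

A new list is returned and original is left unchanged, so no defensive copy of the input is needed before applying the operation.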

doped.utils.symmetry.get_BCS_conventional_structure(structure, pbar=None, return_wyckoff_dict=False)[source]

Get the conventional crystal structure of the input structure, according to the Bilbao Crystallographic Server (BCS) definition. Also returns the transformation matrix from the spglib (SpaceGroupAnalyzer) conventional structure definition to the BCS definition.

Parameters:

structure (Structure) – pymatgen Structure object for which to get the corresponding BCS conventional crystal structure.
pbar (ProgressBar) – tqdm progress bar object, to update progress.
return_wyckoff_dict (bool) – Whether to return the Wyckoff label dict ({Wyckoff label: coordinates}).

Returns:

pymatgen Structure object and spglib -> BCS conv cell transformation matrix.

doped.utils.symmetry.get_clean_structure(structure: Structure, return_T: bool = False)[source]

Get a ‘clean’ version of the input structure by searching over equivalent Niggli reduced cells, and finding the most optimal according to _lattice_matrix_sorting_func (most symmetric, with mostly positive diagonals and c >= b >= a), with a positive determinant (required to avoid VASP bug for negative determinant cells).

Parameters:

structure (Structure) – Structure object.
return_T (bool) – Whether to return the transformation matrix. (Default = False)

doped.utils.symmetry.get_conv_cell_site(defect_entry)[source]

Gets an equivalent site of the defect entry in the conventional structure of the host material. If the conventional_structure attribute is not defined for defect_entry, then it is generated using SpaceGroupAnalyzer and then reoriented to match the Bilbao Crystallographic Server’s conventional structure definition.

Parameters:

defect_entry – DefectEntry object.

doped.utils.symmetry.get_primitive_structure(sga, ignored_species: list | None = None, clean: bool = True)[source]

Get a consistent/deterministic primitive structure from a SpacegroupAnalyzer object.

For some materials (e.g. zinc blende), there are multiple equivalent primitive cells, so for reproducibility and in line with most structure conventions/definitions, take the one with the lowest summed norm of the fractional coordinates of the sites (i.e. favour Cd (0,0,0) and Te (0.25,0.25,0.25) over Cd (0,0,0) and Te (0.75,0.75,0.75) for F-43m CdTe).

If ignored_species is set, then the sorting function used to determine the ideal primitive structure will ignore sites with species in ignored_species.
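
The selection criterion described above (lowest summed norm of the fractional coordinates) can be sketched for the CdTe example. This is an illustration on plain coordinate lists with hypothetical names, not the doped implementation, which works on structures from a SpacegroupAnalyzer.

```python
import math


def summed_frac_coord_norm(frac_coords):
    """Sum of the Euclidean norms of each site's fractional coordinates."""
    return sum(math.sqrt(sum(x * x for x in coord)) for coord in frac_coords)


# Two equivalent settings of the zinc blende primitive cell (Cd first, then Te)
candidates = {
    "Te at 0.25": [[0, 0, 0], [0.25, 0.25, 0.25]],
    "Te at 0.75": [[0, 0, 0], [0.75, 0.75, 0.75]],
}
preferred = min(
    candidates, key=lambda name: summed_frac_coord_norm(candidates[name])
)
```

With these inputs the setting with Te at (0.25, 0.25, 0.25) is preferred, as in the description above.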

doped.utils.symmetry.get_spglib_conv_structure(sga)[source]

Get a consistent/deterministic conventional structure from a SpacegroupAnalyzer object. Also returns the corresponding SpacegroupAnalyzer (for getting Wyckoff symbols corresponding to this conventional structure definition).

For some materials (e.g. zinc blende), there are multiple equivalent primitive/conventional cells, so for reproducibility and in line with most structure conventions/definitions, take the one with the lowest summed norm of the fractional coordinates of the sites (i.e. favour Cd (0,0,0) and Te (0.25,0.25,0.25) over Cd (0,0,0) and Te (0.75,0.75,0.75) for F-43m CdTe; SGN 216).

doped.utils.symmetry.get_wyckoff(frac_coords, struct, symm_ops: list | None = None, equiv_sites=False, symprec=0.01)[source]

Get the Wyckoff label of the input fractional coordinates in the input structure. If the symmetry operations of the structure have already been computed, these can be input as a list to speed up the calculation.

Parameters:

frac_coords – Fractional coordinates of the site to get the Wyckoff label of.
struct – pymatgen Structure object to which frac_coords corresponds.
symm_ops – List of pymatgen SymmOps of the structure. If None (default), will recompute these from the input struct.
equiv_sites – If True, also returns a list of equivalent sites in struct.
symprec – Symmetry precision for SpacegroupAnalyzer.
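
The equiv_sites option relates to the set of symmetry-equivalent images of a site. The sketch below is not how doped or pymatgen compute this; it applies a hand-written list of (rotation, translation) operations, here just the identity and inversion as in space group P-1, to a fractional coordinate and collects the distinct images. The function name and the operation format are assumptions made for illustration.

```python
def equivalent_frac_coords(frac_coords, operations, decimals=6):
    """Apply each (rotation, translation) operation and keep the distinct images."""
    images = []
    for rotation, translation in operations:
        image = tuple(
            # Rotate, translate, wrap into [0, 1) and round to merge near-duplicates.
            round((sum(r * x for r, x in zip(row, frac_coords)) + t) % 1.0, decimals) % 1.0
            for row, t in zip(rotation, translation)
        )
        if image not in images:
            images.append(image)
    return images


identity = (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0, 0, 0))
inversion = (((-1, 0, 0), (0, -1, 0), (0, 0, -1)), (0, 0, 0))
p_minus_1 = [identity, inversion]

general_site = equivalent_frac_coords((0.1, 0.2, 0.3), p_minus_1)
special_site = equivalent_frac_coords((0.5, 0.0, 0.0), p_minus_1)
print(len(general_site), len(special_site))
```

A general position yields two images under these operations, while the site at (0.5, 0, 0) maps onto itself, which is the difference in multiplicity that distinguishes Wyckoff positions.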

doped.utils.symmetry.get_wyckoff_dict_from_sgn(sgn)[source]

Get dictionary of {Wyckoff label: coordinates} for a given space group number.

The database used here for Wyckoff analysis (wyckpos.dat) was obtained from code written by JaeHwan Shim @schinavro (ORCID: 0000-0001-7575-4788) (https://gitlab.com/ase/ase/-/merge_requests/1035) based on the tabulated datasets in https://github.com/xtalopt/randSpg (also found at https://github.com/spglib/spglib/blob/develop/database/Wyckoff.csv). By default, however, doped uses the Wyckoff functionality of spglib (along with symmetry operations in pymatgen) when possible.

doped.utils.symmetry.get_wyckoff_label_and_equiv_coord_list(defect_entry=None, conv_cell_site=None, sgn=None, wyckoff_dict=None)[source]

Return the Wyckoff label and list of equivalent fractional coordinates within the conventional cell for the input defect_entry or conv_cell_site (whichever is provided; defaults to defect_entry if both are provided), given a dictionary of Wyckoff labels and coordinates (wyckoff_dict).

If wyckoff_dict is not provided, it is generated from the spacegroup number (sgn) using get_wyckoff_dict_from_sgn(sgn). If sgn is not provided, it is obtained from the bulk structure of the defect_entry if provided.

doped.utils.symmetry.group_order_from_schoenflies(sch_symbol)[source]

Return the order of the point group from the Schoenflies symbol.

Useful for symmetry and orientational degeneracy analysis.
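
For reference, the orders of the 32 crystallographic point groups can be tabulated directly. The sketch below is a standalone lookup, not doped's implementation, and the names POINT_GROUP_ORDERS and orientational_degeneracy are hypothetical. It assumes the commonly used definition of the orientational degeneracy as the ratio of the bulk-site group order to the relaxed-defect group order.

```python
# Orders of the 32 crystallographic point groups (Schoenflies notation).
POINT_GROUP_ORDERS = {
    "C1": 1, "Ci": 2,
    "C2": 2, "Cs": 2, "C2h": 4,
    "D2": 4, "C2v": 4, "D2h": 8,
    "C4": 4, "S4": 4, "C4h": 8, "D4": 8, "C4v": 8, "D2d": 8, "D4h": 16,
    "C3": 3, "C3i": 6, "D3": 6, "C3v": 6, "D3d": 12,
    "C6": 6, "C3h": 6, "C6h": 12, "D6": 12, "C6v": 12, "D3h": 12, "D6h": 24,
    "T": 12, "Th": 24, "O": 24, "Td": 24, "Oh": 48,
}


def orientational_degeneracy(bulk_site_symbol, relaxed_symbol):
    """Assumed definition: order of the bulk-site group / order of the relaxed group."""
    return POINT_GROUP_ORDERS[bulk_site_symbol] / POINT_GROUP_ORDERS[relaxed_symbol]


# A defect on a Td site that relaxes to C3v symmetry.
print(orientational_degeneracy("Td", "C3v"))
```

For a Td site relaxing to C3v, the ratio is 24 / 6 = 4, corresponding to four equivalent orientations of the distorted defect.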

doped.utils.symmetry.point_symmetry(structure: Structure, bulk_structure: Structure | None = None, symm_ops: list | None = None, symprec: float | None = None, relaxed: bool = True, verbose: bool = True, return_periodicity_breaking: bool = False)[source]

Get the point symmetry of a given structure.

Note: For certain non-trivial supercell expansions, the broken cell periodicity can break the site symmetry and lead to incorrect point symmetry determination (particularly if using non-scalar supercell matrices with high symmetry materials). If the unrelaxed bulk structure (bulk_structure) is also supplied, then doped will determine the defect site and then automatically check if this is the case, and warn you if so.

This can also be checked by using this function on your doped generated defects:

from doped.generation import get_defect_name_from_entry
for defect_name, defect_entry in defect_gen.items():
    print(defect_name, get_defect_name_from_entry(defect_entry, relaxed=False),
          get_defect_name_from_entry(defect_entry), "\n")

And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry - otherwise periodicity-breaking prevents this.
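
Before running calculations, it can also help to see which category a supercell matrix falls into, since the note above singles out non-scalar matrices. The sketch below classifies a 3x3 integer supercell matrix as scalar (a multiple of the identity), diagonal but non-scalar, or non-diagonal. The function name is hypothetical and this is only a pre-screen; it does not replace the periodicity-breaking check that doped performs when bulk_structure is supplied.

```python
def classify_supercell_matrix(matrix):
    """Return 'scalar', 'diagonal' or 'non-diagonal' for a 3x3 supercell matrix."""
    off_diagonal = [matrix[i][j] for i in range(3) for j in range(3) if i != j]
    if any(value != 0 for value in off_diagonal):
        return "non-diagonal"
    diagonal = [matrix[i][i] for i in range(3)]
    # A scalar matrix has the same expansion factor along all three axes.
    if len(set(diagonal)) == 1:
        return "scalar"
    return "diagonal"


print(classify_supercell_matrix([[2, 0, 0], [0, 2, 0], [0, 0, 2]]))  # 2x2x2
print(classify_supercell_matrix([[2, 0, 0], [0, 1, 0], [0, 0, 2]]))  # 2x1x2
print(classify_supercell_matrix([[0, 2, 2], [2, 0, 2], [2, 2, 0]]))
```

A "diagonal" or "non-diagonal" result does not by itself mean the symmetry determination will fail; it only flags cases where comparing the relaxed=False and relaxed=True names, as described above, is worthwhile.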

If bulk_structure is supplied and relaxed is set to False, then returns the bulk site symmetry of the defect, which for vacancies/substitutions is the symmetry of the corresponding bulk site, while for interstitials it is the point symmetry of the final relaxed interstitial site when placed in the (unrelaxed) bulk structure.

Parameters:

structure (Structure) – Structure object for which to determine the point symmetry.
bulk_structure (Structure) – Structure object of the bulk structure, if known. Default is None. If provided and relaxed = True, will be used to check if the supercell is breaking the crystal periodicity (and thus preventing accurate determination of the relaxed defect point symmetry) and warn you if so.
symm_ops (list) – List of symmetry operations of either the defect_entry.bulk_supercell structure (if relaxed=False) or defect_entry.defect_supercell (if relaxed=True), to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib. Default is 0.01 for unrelaxed structures, 0.1 for relaxed (to account for residual structural noise, matching that used by the Materials Project). You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.).
relaxed (bool) – If False, determines the site symmetry using the defect site in the unrelaxed bulk supercell (i.e. the bulk site symmetry), otherwise tries to determine the point symmetry of the relaxed defect in the defect supercell. Default is True.
verbose (bool) – If True, prints a warning if the supercell is detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry). Default is True.
return_periodicity_breaking (bool) – If True, also returns a boolean specifying if the supercell has been detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry) or not. Default is False.

Returns:

Structure point symmetry (and if return_periodicity_breaking = True, a boolean specifying if the supercell has been detected to break the crystal periodicity).

Return type:

str

doped.utils.symmetry.point_symmetry_from_defect(defect, symm_ops=None, symprec=0.01)[source]

Get the defect site point symmetry from a Defect object.

Note that this is intended only to be used for unrelaxed, as-generated Defect objects (rather than parsed defects).

Parameters:

defect (Defect) – Defect object.
symm_ops (list) – List of symmetry operations of defect.structure, to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib. Default is 0.01.

Returns:

Defect point symmetry.

Return type:

str

doped.utils.symmetry.point_symmetry_from_defect_entry(defect_entry: DefectEntry, symm_ops: list | None = None, symprec: float | None = None, relaxed: bool = True, verbose: bool = True, return_periodicity_breaking: bool = False)[source]

Get the defect site point symmetry from a DefectEntry object.

Note: If relaxed = True (default), then this tries to use the defect_entry.defect_supercell to determine the site symmetry. This will thus give the relaxed defect point symmetry if this is a DefectEntry created from parsed defect calculations. However, it should be noted that this is not guaranteed to work in all cases; namely for non-diagonal supercell expansions, or sometimes for non-scalar supercell expansion matrices (e.g. a 2x1x2 expansion), particularly with high-symmetry materials, which can break the periodicity of the cell. doped tries to automatically check if this is the case, and will warn you if so.

This can also be checked by using this function on your doped generated defects:

from doped.generation import get_defect_name_from_entry
for defect_name, defect_entry in defect_gen.items():
    print(defect_name, get_defect_name_from_entry(defect_entry, relaxed=False),
          get_defect_name_from_entry(defect_entry), "\n")

And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry - otherwise periodicity-breaking prevents this.

If periodicity-breaking prevents auto-symmetry determination, you can manually determine the relaxed defect and bulk-site point symmetries, and/or orientational degeneracy, from visualising the structures (e.g. using VESTA), and then set the corresponding values in the calculation_metadata['relaxed point symmetry']/['bulk site symmetry'] and/or degeneracy_factors['orientational degeneracy'] attributes. get_orientational_degeneracy can be used to obtain the orientational degeneracy factor for given defect/bulk-site point symmetries. Note that the bulk-site point symmetry corresponds to that of DefectEntry.defect, or equivalently calculation_metadata["bulk_site"]/["unrelaxed_defect_structure"], which for vacancies/substitutions is the symmetry of the corresponding bulk site, while for interstitials it is the point symmetry of the final relaxed interstitial site when placed in the (unrelaxed) bulk structure. The degeneracy factor is used in the calculation of defect/carrier concentrations and Fermi level behaviour (see e.g. https://doi.org/10.1039/D2FD00043A & https://doi.org/10.1039/D3CS00432E).
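
The manual override described above amounts to writing three dictionary entries. The sketch below uses a SimpleNamespace as a stand-in for a parsed DefectEntry, so that it runs without doped installed; only the attribute names and keys are taken from the text, and the symmetry values are an illustrative example rather than the result of any calculation.

```python
from types import SimpleNamespace

# Stand-in exposing only the two dict attributes named in the text above.
defect_entry = SimpleNamespace(calculation_metadata={}, degeneracy_factors={})

# Point symmetries determined by hand, e.g. from inspecting the structures in VESTA.
defect_entry.calculation_metadata["bulk site symmetry"] = "Td"
defect_entry.calculation_metadata["relaxed point symmetry"] = "C3v"

# Td has order 24 and C3v has order 6; this assumes the degeneracy is their ratio.
defect_entry.degeneracy_factors["orientational degeneracy"] = 24 / 6

print(defect_entry.calculation_metadata)
print(defect_entry.degeneracy_factors)
```

With a real DefectEntry, the same assignments would be made on the parsed object before any concentration analysis that relies on the degeneracy factors.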

Parameters:

defect_entry (DefectEntry) – DefectEntry object.
symm_ops (list) – List of symmetry operations of either the defect_entry.bulk_supercell structure (if relaxed=False) or defect_entry.defect_supercell (if relaxed=True), to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib. Default is 0.01 for unrelaxed structures, 0.1 for relaxed (to account for residual structural noise, matching that used by the Materials Project). You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.).
relaxed (bool) – If False, determines the site symmetry using the defect site in the unrelaxed bulk supercell (i.e. the bulk site symmetry), otherwise tries to determine the point symmetry of the relaxed defect in the defect supercell. Default is True.
verbose (bool) – If True, prints a warning if the supercell is detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry). Default is True.
return_periodicity_breaking (bool) – If True, also returns a boolean specifying if the supercell has been detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry) or not. Mainly for internal doped usage. Default is False.

Returns:

Defect point symmetry (and if return_periodicity_breaking = True, a boolean specifying if the supercell has been detected to break the crystal periodicity).

Return type:

str

doped.utils.symmetry.schoenflies_from_hermann(herm_symbol)[source]

Convert from Hermann-Mauguin to Schoenflies.
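
For orientation, the correspondence between the two notations for the 32 crystallographic point groups is a fixed table. The sketch below is a standalone lookup, not doped's implementation: the names HERMANN_TO_SCHOENFLIES and hermann_to_schoenflies are hypothetical, and the exact symbol spellings accepted by schoenflies_from_hermann (for example alternative settings such as -4m2 or -62m) may differ from those assumed here.

```python
# Short Hermann-Mauguin symbol -> Schoenflies symbol for the 32 point groups.
HERMANN_TO_SCHOENFLIES = {
    "1": "C1", "-1": "Ci",
    "2": "C2", "m": "Cs", "2/m": "C2h",
    "222": "D2", "mm2": "C2v", "mmm": "D2h",
    "4": "C4", "-4": "S4", "4/m": "C4h", "422": "D4",
    "4mm": "C4v", "-42m": "D2d", "4/mmm": "D4h",
    "3": "C3", "-3": "C3i", "32": "D3", "3m": "C3v", "-3m": "D3d",
    "6": "C6", "-6": "C3h", "6/m": "C6h", "622": "D6",
    "6mm": "C6v", "-6m2": "D3h", "6/mmm": "D6h",
    "23": "T", "m-3": "Th", "432": "O", "-43m": "Td", "m-3m": "Oh",
}


def hermann_to_schoenflies(symbol):
    """Look up a point group symbol, raising a clear error if it is not tabulated."""
    try:
        return HERMANN_TO_SCHOENFLIES[symbol]
    except KeyError:
        raise ValueError(f"Unrecognised Hermann-Mauguin symbol: {symbol!r}") from None


print(hermann_to_schoenflies("-43m"), hermann_to_schoenflies("3m"))
```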

doped.utils.symmetry.swap_axes(structure, axes)[source]

Swap axes of the given structure.

The new order of the axes is given by the axes parameter. For example, axes=(2, 1, 0) will swap the first and third axes.
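
One plausible reading of this permutation, sketched with plain lists rather than a pymatgen Structure, is to reorder the lattice vectors and the components of each fractional coordinate in the same way, which leaves the Cartesian positions unchanged. The helper names permute_axes and cartesian are hypothetical, and the sketch makes no claim about how swap_axes is implemented internally. Note that swapping two lattice vectors flips the sign of the lattice determinant, which this sketch does not correct for.

```python
def permute_axes(lattice, frac_coords, axes):
    """Reorder lattice vectors (rows) and fractional coordinate components by `axes`."""
    new_lattice = [list(lattice[i]) for i in axes]
    new_frac_coords = [[site[i] for i in axes] for site in frac_coords]
    return new_lattice, new_frac_coords


def cartesian(lattice, frac):
    """Cartesian position of a site: sum over lattice vectors weighted by frac."""
    return [sum(frac[i] * lattice[i][k] for i in range(3)) for k in range(3)]


lattice = [[4.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 6.0]]
frac_coords = [[0.1, 0.2, 0.3]]

# axes=(2, 1, 0) swaps the first and third axes.
new_lattice, new_frac_coords = permute_axes(lattice, frac_coords, axes=(2, 1, 0))
print(new_lattice)
print(new_frac_coords)
```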

Module contents

Submodule for utility functions in doped.