doped.utils package
Submodules
doped.utils.configurations module
Utility functions for generating and parsing configurational coordinate (CC) diagrams, for potential energy surfaces (PESs), Nudged Elastic Band (NEB), non-radiative recombination calculations etc.
- doped.utils.configurations.get_dQ(struct_a: Structure, struct_b: Structure) float[source]
Get the mass-weighted displacement (ΔQ in amu^(1/2)Å) between two structures, assuming matched atomic indices.
- Parameters:
struct_a (Structure) – Initial structure.
struct_b (Structure) – Final structure.
- Returns:
The mass-weighted displacement (ΔQ in amu^(1/2)Å) between the two structures, assuming matched atomic indices. Returns np.inf if the structures do not match.
- Return type:
float
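As a rough illustration of the quantity described above, the following sketch computes a mass-weighted displacement from plain arrays using the standard definition ΔQ = sqrt(Σ m_i |Δr_i|²). The function name mass_weighted_displacement and the toy inputs are hypothetical and not part of doped; the sketch assumes index-matched atoms and ignores periodic boundary conditions, which a real implementation would need to handle.

```python
import numpy as np


def mass_weighted_displacement(masses, coords_a, coords_b):
    """Return sqrt(sum_i m_i * |r_b,i - r_a,i|^2) for index-matched atoms.

    masses: atomic masses in amu, shape (N,)
    coords_a, coords_b: Cartesian coordinates in Å, shape (N, 3)
    Note: hypothetical helper; periodic images are NOT considered here.
    """
    masses = np.asarray(masses, dtype=float)
    delta = np.asarray(coords_b, dtype=float) - np.asarray(coords_a, dtype=float)
    return float(np.sqrt(np.sum(masses * np.sum(delta**2, axis=1))))


# Toy example: only the second atom (mass 16 amu) moves, by 0.5 Å along x.
masses = [1.0, 16.0]
coords_a = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
coords_b = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
dQ = mass_weighted_displacement(masses, coords_a, coords_b)  # in amu^(1/2)Å
```

Because heavier atoms contribute more to ΔQ, the same geometric shift gives a larger value for a heavy atom than for a light one.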
- doped.utils.configurations.get_path_structures(struct1: Structure, struct2: Structure, n_images: int | ndarray | list[float] = 7, displacements: ndarray | list[float] | None = None, displacements2: ndarray | list[float] | None = None) dict[str, Structure] | tuple[dict[str, Structure], dict[str, Structure]][source]
Generate a series of interpolated structures along the linear path between struct1 and struct2, typically for use in NEB calculations or configuration coordinate (CC) diagrams.
Structures are output as a dictionary with keys corresponding to either the index of the interpolated structure (0-indexed; 00, 01 etc. as for VASP NEB calculations) or the fractional displacement along the interpolation path between structures, and values corresponding to the interpolated structure. If displacements is set (and thus two sets of structures are generated), a tuple of such dictionaries is returned.
Note that for NEB calculations, the lattice vectors and order of sites (atomic indices) must be consistent in both struct1 and struct2. This can be ensured by using the orient_s2_like_s1() function in doped.utils.configurations, as shown in the doped tutorials. This is also desirable for CC diagrams, as the atomic indices are assumed to match for many parsing and plotting functions (e.g. in nonrad and CarrierCapture.jl), but is not strictly necessary. If the input structures are detected to be different (symmetry-inequivalent) geometries (e.g. not a simple defect migration between two symmetry-equivalent sites), but have mismatching orientations/positions (such that they do not correspond to the shortest linear path between them), a warning will be raised. See the doped configuration coordinate / NEB path generation tutorial for a deeper explanation.
If only n_images is set (and displacements is None, the default), then only one set of interpolated structures is generated (in other words, assuming a standard NEB/PES calculation is being performed). If displacements (and possibly displacements2) is set, then two sets of interpolated structures are generated (in other words, assuming a CC / non-radiative recombination calculation is being performed, where the two sets of structures are to be calculated in separate charge/spin etc. states).
- Parameters:
struct1 (Structure) – Initial structure.
struct2 (Structure) – Final structure.
n_images (int | np.ndarray | list[float]) – Number of images to interpolate between struct1 and struct2, or a list of fractional interpolation values (displacements) to use. Note that n_images is ignored if displacements is set (in which case CC / non-radiative recombination calculations are assumed – generating two sets of interpolated structures – otherwise a standard NEB / PES calculation is assumed – generating one set of structures). Default: 7
displacements (np.ndarray or list) – Displacements to use for struct1 along the linear transformation path to struct2. If set, then CC / non-radiative recombination calculations are assumed, and two sets of interpolated structures will be generated. If set and displacements2 is not set, then the same set of displacements is used for both sets of interpolated structures. Default: None
displacements2 (np.ndarray or list) – Displacements to use for struct2 along the linear transformation path to struct1. If not set and displacements is not None, then the same set of displacements is used for both sets of interpolated structures. Default: None
- Returns:
Dictionary of structures (for NEB/PES calculations), or tuple of two dictionaries of structures (for CC / non-radiative calculations, when displacements is not None).
- Return type:
dict[str, Structure] | tuple[dict[str, Structure], dict[str, Structure]]
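The linear interpolation and zero-padded image keys described above can be sketched with plain coordinate arrays. This is only an illustration: interpolate_path is a hypothetical helper, it works on Cartesian coordinates rather than Structure objects, and it assumes that n_images counts intermediate images with both end points added, which may differ from how doped counts images.

```python
import numpy as np


def interpolate_path(coords_a, coords_b, n_images=7):
    """Linearly interpolate between two index-matched coordinate sets.

    Returns a dict with zero-padded keys ("00", "01", ...) in the style of
    VASP NEB folders. Assumption (not confirmed by the docs): n_images is the
    number of intermediate images, and the two end points are included too.
    """
    coords_a = np.asarray(coords_a, dtype=float)
    coords_b = np.asarray(coords_b, dtype=float)
    fractions = np.linspace(0.0, 1.0, n_images + 2)
    return {
        f"{i:02d}": coords_a + x * (coords_b - coords_a)
        for i, x in enumerate(fractions)
    }


# One atom moving 1 Å along x, with three intermediate images.
path = interpolate_path([[0.0, 0.0, 0.0]], [[1.0, 0.0, 0.0]], n_images=3)
midpoint = path["02"]  # halfway along the path
```

A straight-line path like this is only meaningful when both end points use the same lattice and site ordering, which is why the re-orientation step is recommended first.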
- doped.utils.configurations.get_s2_like_s1(struct1: Structure, struct2: Structure, new_lattice: str | None = None, verbose: bool = False, **sm_kwargs)
Re-orient struct2 to match the orientation of struct1 as closely as possible, with matching atomic indices as needed for VASP NEB calculations and other structural transformation analyses (e.g. configuration coordinate (CC) diagrams via nonrad, CarrierCapture.jl etc.).
This will give a fully symmetry-equivalent orientation (i.e. will not change the actual geometry) of struct2, except if struct1 and struct2 have inequivalent lattices (e.g. different space groups) and new_lattice is "struct1".
This corresponds to minimising the root-mean-square displacement for the shortest linear path from struct1 to a symmetry-equivalent definition of struct2, with matched atomic indices and lattices as required by VASP NEB and nonrad functions. This function uses an accelerated version of the StructureMatcher.get_s2_like_s1() method, extended to ensure the correct atomic indices matching and lattice vector definitions.
If verbose=True, information about the mass-weighted displacement (ΔQ in amu^(1/2)Å) between the input and re-oriented structures is printed. This is the typical x-axis unit in configurational coordinate diagrams (see e.g. 10.1103/PhysRevB.90.075202).
- Parameters:
struct1 (Structure) – Initial structure.
struct2 (Structure) – Final structure.
new_lattice (str | None) – If "struct1", then the lattice of struct1 is used for the re-oriented structure; if "struct2", then the lattice of struct2 is used; or if "s2_like_s1", then the output lattice of StructureMatcher.get_s2_like_s1 (a symmetry-equivalent version of struct2.lattice) is used. Default is None, where new_lattice is set to "struct1" if struct1 and struct2 have equivalent lattices (expected to be the case for defect NEBs/CC diagrams), and "s2_like_s1" otherwise.
verbose (bool) – Print information about the mass-weighted displacement (ΔQ in amu^(1/2)Å) between the input and re-oriented structures. Default: False
**sm_kwargs – Additional keyword arguments to pass to StructureMatcher() (e.g. ignored_species, comparator etc.).
- Returns:
struct2 re-oriented to match struct1 as closely as possible.
- Return type:
Structure
- doped.utils.configurations.orient_s2_like_s1(struct1: Structure, struct2: Structure, new_lattice: str | None = None, verbose: bool = False, **sm_kwargs)[source]
Re-orient struct2 to match the orientation of struct1 as closely as possible, with matching atomic indices as needed for VASP NEB calculations and other structural transformation analyses (e.g. configuration coordinate (CC) diagrams via nonrad, CarrierCapture.jl etc.).
This will give a fully symmetry-equivalent orientation (i.e. will not change the actual geometry) of struct2, except if struct1 and struct2 have inequivalent lattices (e.g. different space groups) and new_lattice is "struct1".
This corresponds to minimising the root-mean-square displacement for the shortest linear path from struct1 to a symmetry-equivalent definition of struct2, with matched atomic indices and lattices as required by VASP NEB and nonrad functions. This function uses an accelerated version of the StructureMatcher.get_s2_like_s1() method, extended to ensure the correct atomic indices matching and lattice vector definitions.
If verbose=True, information about the mass-weighted displacement (ΔQ in amu^(1/2)Å) between the input and re-oriented structures is printed. This is the typical x-axis unit in configurational coordinate diagrams (see e.g. 10.1103/PhysRevB.90.075202).
- Parameters:
struct1 (Structure) – Initial structure.
struct2 (Structure) – Final structure.
new_lattice (str | None) – If "struct1", then the lattice of struct1 is used for the re-oriented structure; if "struct2", then the lattice of struct2 is used; or if "s2_like_s1", then the output lattice of StructureMatcher.get_s2_like_s1 (a symmetry-equivalent version of struct2.lattice) is used. Default is None, where new_lattice is set to "struct1" if struct1 and struct2 have equivalent lattices (expected to be the case for defect NEBs/CC diagrams), and "s2_like_s1" otherwise.
verbose (bool) – Print information about the mass-weighted displacement (ΔQ in amu^(1/2)Å) between the input and re-oriented structures. Default: False
**sm_kwargs – Additional keyword arguments to pass to StructureMatcher() (e.g. ignored_species, comparator etc.).
- Returns:
struct2 re-oriented to match struct1 as closely as possible.
- Return type:
Structure
- doped.utils.configurations.write_path_structures(struct1: Structure, struct2: Structure, output_dir: str | Path | None = None, n_images: int | list = 7, displacements: ndarray | list[float] | None = None, displacements2: ndarray | list[float] | None = None)[source]
Generate a series of interpolated structures along the linear path between struct1 and struct2, typically for use in NEB calculations or configuration coordinate (CC) diagrams, and write to folders.
Folder names are labelled by the index of the interpolated structure (0-indexed; 00, 01 etc. as for VASP NEB calculations) or the fractional displacement along the interpolation path between structures (e.g. delQ_0.0, delQ_0.1, delQ_-0.1 etc.), depending on the input n_images/displacements settings.
Note that for NEB calculations, the lattice vectors and order of sites (atomic indices) must be consistent in both struct1 and struct2. This can be ensured by using the orient_s2_like_s1() function in doped.utils.configurations, as shown in the doped tutorials. This is also desirable for CC diagrams, as the atomic indices are assumed to match for many parsing and plotting functions (e.g. in nonrad and CarrierCapture.jl), but is not strictly necessary. If the input structures are detected to be different (symmetry-inequivalent) geometries (e.g. not a simple defect migration between two symmetry-equivalent sites), but have mismatching orientations/positions (such that they do not correspond to the shortest linear path between them), a warning will be raised. See the doped configuration coordinate / NEB path generation tutorial for a deeper explanation.
If only n_images is set (and displacements is None, the default), then only one set of interpolated structures is written (in other words, assuming a standard NEB/PES calculation is being performed). If displacements (and possibly displacements2) is set, then two sets of interpolated structures are written (in other words, assuming a CC / non-radiative recombination calculation is being performed, where the two sets of structures are to be calculated in separate charge/spin etc. states).
- Parameters:
struct1 (Structure) – Initial structure.
struct2 (Structure) – Final structure.
output_dir (PathLike) – Directory to write the interpolated structures to. Defaults to “Configuration_Coordinate” if displacements is set, otherwise “NEB”.
n_images (int | list) – Number of images to interpolate between struct1 and struct2, or a list of fractional interpolation values (displacements) to use. Note that n_images is ignored if displacements is set (in which case CC / non-radiative recombination calculations are assumed – generating two sets of interpolated structures – otherwise a standard NEB / PES calculation is assumed – generating one set of structures). Default: 7
displacements (np.ndarray or list) – Displacements to use for struct1 along the linear transformation path to struct2. If set, then CC / non-radiative recombination calculations are assumed, and two sets of interpolated structures will be written to file. If set and displacements2 is not set, then the same set of displacements is used for both sets of interpolated structures. Default: None
displacements2 (np.ndarray or list) – Displacements to use for struct2 along the linear transformation path to struct1. If not set and displacements is not None, then the same set of displacements is used for both sets of interpolated structures. Default: None
doped.utils.displacements module
Code to analyse site displacements around defects.
- doped.utils.displacements.calc_displacements_ellipsoid(defect_entry: DefectEntry, quantile: float = 0.8, relaxed_distances: bool = False, return_extras: bool = False, tolerance: float = 0.0005) tuple[source]
Calculate displacements around a defect site and fit an ellipsoid to these displacements, returning a tuple of the ellipsoid’s center, radii, rotation matrix and dataframe of anisotropy information.
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
quantile (float) – The quantile threshold for selecting significant displacements (between 0 and 1). Default is 0.8.
relaxed_distances (bool) – Whether to use the atomic positions in the relaxed defect supercell for 'Distance to defect', 'Vector to site from defect' and 'Displacement wrt defect' values (True), or unrelaxed positions (i.e. the bulk structure positions) (False). Defaults to False.
return_extras (bool) – Whether to also return the disp_df (output from calc_site_displacements(defect_entry, relative_to_defect=True)) and the points used to fit the ellipsoid, corresponding to the Cartesian coordinates of the sites with displacements above the threshold, where the structure has been shifted to place the defect at the cell midpoint ([0.5, 0.5, 0.5]) in fractional coordinates. Default is False.
tolerance (float) – Tolerance for the minimum volume ellipsoid fitting algorithm. Default is 5e-4. Smaller is more precise, but slower.
- Returns:
(ellipsoid_center, ellipsoid_radii, ellipsoid_rotation, anisotropy_df): A tuple containing the ellipsoid’s center, radii, rotation matrix, and a dataframe of anisotropy information, or (None, None, None, None) if fitting was unsuccessful.
(disp_df and points): If return_extras=True, also returns disp_df and the points used to fit the ellipsoid, appended to the return tuple.
- Return type:
tuple
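To make the role of quantile more concrete, here is a minimal sketch of selecting "significant" displacements by a quantile threshold on their magnitudes. The helper select_significant is hypothetical, and whether doped uses a strict or inclusive comparison, or a different quantile interpolation, is an assumption here.

```python
import numpy as np


def select_significant(displacements, quantile=0.8):
    """Return indices of displacement vectors whose norm is at or above the
    given quantile of all norms, together with the threshold itself.

    Hypothetical helper: the inclusive (>=) comparison is an assumption.
    """
    norms = np.linalg.norm(np.asarray(displacements, dtype=float), axis=1)
    threshold = np.quantile(norms, quantile)
    return np.flatnonzero(norms >= threshold), threshold


# Five sites; only the last one is displaced strongly.
displacements = [
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0],
    [0.0, 0.2, 0.0],
    [0.0, 0.0, 0.3],
    [1.0, 0.0, 0.0],
]
idx, threshold = select_significant(displacements, quantile=0.8)
```

Raising quantile keeps fewer, larger displacements for the ellipsoid fit; lowering it includes more weakly displaced sites.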
- doped.utils.displacements.calc_site_displacements(defect_entry: DefectEntry, relaxed_distances: bool = False, relative_to_defect: bool = False, vector_to_project_on: list | None = None, threshold: float = 2.0) DataFrame[source]
Calculates the site displacements in the defect supercell, relative to the bulk supercell, and returns a DataFrame of site displacement info.
The signed displacements are stored in the calculation_metadata of the DefectEntry object under the "site_displacements" key.
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
relaxed_distances (bool) – Whether to use the atomic positions in the relaxed defect supercell for 'Distance to defect', 'Vector to site from defect' and 'Displacement wrt defect' values (True), or unrelaxed positions (i.e. the bulk structure positions) (False). Defaults to False.
relative_to_defect (bool) – Whether to calculate the signed displacements along the line from the (relaxed) defect site to that atom. Negative values indicate the atom moves towards the defect (compressive strain), positive values indicate the atom moves away from the defect. Defaults to False. If True, the relative displacements are stored in the Displacement wrt defect key of the returned dictionary.
vector_to_project_on (list) – Direction to project the site displacements along (e.g. [0, 0, 1]). Defaults to None (displacements are given as vectors in Cartesian space).
threshold (float) – If the distance between a pair of matched sites is larger than this, then a warning will be thrown. Default is 2.0 Å.
- Returns:
pandas DataFrame with site displacements (compared to pristine supercell), and other displacement-related information.
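The sign convention for relative_to_defect can be illustrated by projecting a displacement vector onto the unit vector pointing from the defect to the site. This is a sketch only: signed_radial_displacement is a hypothetical helper working on Cartesian vectors, and it ignores periodic images, which a supercell calculation would need to account for.

```python
import numpy as np


def signed_radial_displacement(defect_pos, site_pos, displacement):
    """Project a site displacement onto the defect-to-site direction.

    Negative: the atom moves towards the defect; positive: away from it.
    Hypothetical helper; periodic boundary conditions are not handled.
    """
    direction = np.asarray(site_pos, dtype=float) - np.asarray(defect_pos, dtype=float)
    unit = direction / np.linalg.norm(direction)
    return float(np.dot(np.asarray(displacement, dtype=float), unit))


# Site 2 Å from the defect along x, relaxing 0.1 Å inwards (plus a small
# sideways shift that does not contribute to the radial component).
inward = signed_radial_displacement([0, 0, 0], [2.0, 0, 0], [-0.1, 0.05, 0.0])
outward = signed_radial_displacement([0, 0, 0], [2.0, 0, 0], [0.2, 0.0, 0.0])
```

Projecting onto a fixed direction instead (as with vector_to_project_on) would use the same dot product with a user-supplied unit vector.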
- doped.utils.displacements.plot_displacements_ellipsoid(defect_entry: DefectEntry, plot_ellipsoid: bool = True, plot_anisotropy: bool = False, quantile: float = 0.8, use_plotly: bool = False, show_supercell: bool = True, style_file: str | Path | None = None) tuple[source]
Plot the displacement ellipsoid and/or anisotropy around a relaxed defect.
Set use_plotly = True to get an interactive plotly plot, useful for analysis!
The supercell edges are also plotted if show_supercell = True (default).
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
plot_ellipsoid (bool) – If True, plot the fitted ellipsoid in the crystal lattice.
plot_anisotropy (bool) – If True, plot the anisotropy of the ellipsoid radii.
quantile (float) – The quantile threshold for selecting significant displacements (between 0 and 1). Default is 0.8.
use_plotly (bool) – Whether to use plotly for plotting. Default is False. Set to True to get an interactive plot.
show_supercell (bool) – Whether to show the supercell edges in the plot. Default is True.
style_file (PathLike) – Path to matplotlib style file. If not set, will use the doped default displacements style.
- Returns:
Either a single plotly or matplotlib Figure, if only one of plot_ellipsoid or plot_anisotropy is True, or a tuple of plots if both are True.
- doped.utils.displacements.plot_site_displacements(defect_entry: DefectEntry, separated_by_direction: bool = False, relaxed_distances: bool = False, relative_to_defect: bool = False, vector_to_project_on: list | None = None, use_plotly: bool = False, style_file: str | Path | None = None)[source]
Plots site displacements around a defect.
Set use_plotly = True to get an interactive plotly plot, useful for analysis!
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
separated_by_direction (bool) – Whether to plot site displacements separated by direction (x, y, z). Default is False.
relaxed_distances (bool) – Whether to use the atomic positions in the relaxed defect supercell for 'Distance to defect', 'Vector to site from defect' and 'Displacement wrt defect' values (True), or unrelaxed positions (i.e. the bulk structure positions) (False). Defaults to False.
relative_to_defect (bool) – Whether to plot the signed displacements along the line from the (relaxed) defect site to that atom. Negative values indicate the atom moves towards the defect (compressive strain), positive values indicate the atom moves away from the defect (tensile strain). Default is False.
vector_to_project_on (list) – Direction to project the site displacements along (e.g. [0, 0, 1]). Defaults to None (i.e. the displacements are calculated in the Cartesian basis x, y, z).
use_plotly (bool) – Whether to use plotly for plotting. Default is False. Set to True to get an interactive plot.
style_file (PathLike) – Path to matplotlib style file. If not set, will use the doped default displacements style.
- Returns:
plotly or matplotlib Figure.
doped.utils.efficiency module
Utility functions to improve the efficiency of common
functions/workflows/calculations in doped.
- class doped.utils.efficiency.DopedTopographyAnalyzer(structure: Structure, image_tol: float = 0.0001, max_cell_range: int = 1, constrained_c_frac: float = 0.5, thickness: float = 0.5)[source]
Bases: object
This is a modified version of pymatgen.analysis.defects.utils.TopographyAnalyzer to lean down the input options and make initialisation far more efficient (~2 orders of magnitude faster).
The original code was written by Danny Broberg and colleagues (10.1016/j.cpc.2018.01.004), which was then added to pymatgen before being cut.
- Parameters:
structure (Structure) – Structure to analyse.
image_tol (float) – A tolerance distance for the analysis, used to determine if sites are periodic images of each other. Default (of 1e-4) is usually fine.
max_cell_range (int) – This is the range of periodic images to construct the Voronoi tessellation. A value of 1 means that we include all points from (x ± 1, y ± 1, z ± 1) in the Voronoi construction. This is because the Voronoi polyhedra extend beyond the standard unit cell because of PBC. Typically, the default value of 1 works fine for most structures and is fast. But for very small unit cells with high symmetry, this may need to be increased to 2 or higher. If there are < 5 atoms in the input structure and max_cell_range is 1, this will automatically be increased to 2.
constrained_c_frac (float) – Constrain the region where topology analysis is performed. Only sites with z fractional coordinates between constrained_c_frac ± thickness are considered. Default of 0.5 (with thickness of 0.5) includes all sites in the unit cell.
thickness (float) – Constrain the region where topology analysis is performed. Only sites with z fractional coordinates between constrained_c_frac ± thickness are considered. Default of 0.5 (with thickness of 0.5) includes all sites in the unit cell.
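The effect of constrained_c_frac and thickness can be pictured as a simple mask on the fractional z coordinate. The helper below, in_constrained_region, is hypothetical and only mirrors the description above; how the class treats sites exactly on the boundary, or wraps coordinates periodically, is not specified here and is an assumption.

```python
import numpy as np


def in_constrained_region(frac_coords, constrained_c_frac=0.5, thickness=0.5):
    """Boolean mask of sites with |z - constrained_c_frac| <= thickness.

    Hypothetical helper: inclusive bounds and no periodic wrapping are
    assumptions, not confirmed behaviour of DopedTopographyAnalyzer.
    """
    z = np.asarray(frac_coords, dtype=float)[:, 2]
    return np.abs(z - constrained_c_frac) <= thickness


frac_coords = [[0.0, 0.0, 0.05], [0.0, 0.0, 0.5], [0.0, 0.0, 0.95]]
all_sites = in_constrained_region(frac_coords)                 # defaults keep everything
slab_only = in_constrained_region(frac_coords, thickness=0.2)  # keeps 0.3 <= z <= 0.7
```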
- class doped.utils.efficiency.DopedVacancyGenerator(symprec: float = 0.01, angle_tolerance: float = 5)[source]
Bases: VacancyGenerator
Vacancy defects generator, subclassed from pymatgen-analysis-defects to improve efficiency (particularly when handling defect complexes).
Initialize the vacancy generator.
- generate(structure: Structure, rm_species: set[str | Species] | list[str | Species] | None = None, **kwargs) Generator[Vacancy, None, None][source]
Generate vacancy defects.
- Parameters:
structure (Structure) – The structure to generate vacancy defects in.
rm_species (set[str | Species] | list[str | Species] | None) – List/set of species to be removed (i.e. to consider for vacancy generation). If None, considers all species.
**kwargs – Additional keyword arguments for the Vacancy constructor.
- Returns:
Generator that yields Vacancy objects.
- Return type:
Generator[Vacancy, None, None]
- doped.utils.efficiency.StructureMatcher_scan_stol(struct1: Structure, struct2: Structure, func_name: str = 'get_s2_like_s1', min_stol: float | None = None, max_stol: float = 5.0, stol_factor: float = 0.5, **sm_kwargs)[source]
Utility function to scan through a range of stol values for StructureMatcher until a match is found between struct1 and struct2 (i.e. StructureMatcher.{func_name} returns a result).
The speed of the StructureMatcher.match() function (used in most StructureMatcher methods) is heavily dependent on stol, with smaller values being faster, so we can speed up evaluation by starting with small values and increasing until a match is found (especially with the doped efficiency tools, which implement caching and other improvements to ensure no redundant work here).
Note that ElementComparator() is used by default here! (So sites with different species but the same element (e.g. “S2-” & “S0+”) will be considered match-able). This can be controlled with sm_kwargs['comparator'].
- Parameters:
struct1 (Structure) – struct1 for StructureMatcher.match().
struct2 (Structure) – struct2 for StructureMatcher.match().
func_name (str) – The name of the StructureMatcher method to return the result of StructureMatcher.{func_name}(struct1, struct2) for, such as:
"get_s2_like_s1" (default)
"get_rms_dist"
"fit"
"fit_anonymous"
"get_rms_anonymous"
min_stol (float) – Minimum stol value to try. Default is to use doped's get_min_stol_for_s1_s2() function to estimate the minimum stol necessary, and start with 2x this value to achieve fast structure-matching in most cases.
max_stol (float) – Maximum stol value to try. Default: 5.0.
stol_factor (float) – Fractional increment to increase stol by each time (when a match is not found). Default value of 0.5 increases stol by 50% each time.
**sm_kwargs – Additional keyword arguments to pass to StructureMatcher().
- Returns:
Result of StructureMatcher.{func_name}(struct1, struct2) or None if no match is found.
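The scanning strategy described above (start with a small tolerance and grow it geometrically until a match is found) can be sketched independently of pymatgen. In this illustration scan_tolerance and toy_match are hypothetical names: try_match stands in for building a StructureMatcher with the given stol and calling the chosen method, and the final attempt at max_stol is an assumption rather than documented behaviour.

```python
def scan_tolerance(try_match, min_stol, max_stol=5.0, stol_factor=0.5):
    """Call try_match(stol) with growing stol until it returns non-None.

    try_match: hypothetical callable standing in for a structure-matching
    call at a given site tolerance; returns None when no match is found.
    """
    stol = min_stol
    while stol < max_stol:
        result = try_match(stol)
        if result is not None:
            return result
        stol *= 1 + stol_factor  # stol_factor=0.5 increases stol by 50% each time
    return try_match(max_stol)  # assumption: one last attempt at the upper bound


tried = []  # record of every tolerance attempted


def toy_match(stol):
    """Toy matcher that only 'succeeds' once stol reaches 1.0."""
    tried.append(stol)
    return stol if stol >= 1.0 else None


result = scan_tolerance(toy_match, min_stol=0.4)  # tries 0.4, 0.6, 0.9, then 1.35
```

Since cheap small-tolerance attempts fail quickly, the total cost is usually dominated by the first successful call.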
- doped.utils.efficiency.cache_ready_PeriodicSite__eq__(self, other)[source]
Custom __eq__ method for PeriodicSite instances, using a cached equality function to speed up comparisons.
- doped.utils.efficiency.cache_species(structure_cls)[source]
Context manager that makes Structure.species a cached property, which significantly speeds up pydefect eigenvalue parsing in large structures (due to repeated use of Structure.indices_from_symbol).
- doped.utils.efficiency.cached_Structure_eq_func(self_hash, other_hash)[source]
Cached equality function for Structure instances.
- doped.utils.efficiency.cached_allclose(a: tuple, b: tuple, rtol: float = 1e-05, atol: float = 1e-08)[source]
Cached version of np.allclose, taking tuples as inputs (so that they are hashable and thus cacheable).
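The idea of caching np.allclose on hashable inputs can be sketched with functools.lru_cache. The name allclose_cached is hypothetical and this is not necessarily how doped implements cached_allclose; it only shows why tuples (rather than arrays or lists) are needed as arguments.

```python
from functools import lru_cache

import numpy as np


@lru_cache(maxsize=None)
def allclose_cached(a: tuple, b: tuple, rtol: float = 1e-05, atol: float = 1e-08) -> bool:
    """np.allclose on tuples; tuples are hashable, so results can be memoised."""
    return bool(np.allclose(a, b, rtol=rtol, atol=atol))


first = allclose_cached((0.0, 0.5), (0.0, 0.5 + 1e-10))   # computed
second = allclose_cached((0.0, 0.5), (0.0, 0.5 + 1e-10))  # served from the cache
info = allclose_cached.cache_info()
```

Passing a NumPy array or a list here would raise a TypeError, because unhashable arguments cannot be used as cache keys.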
- doped.utils.efficiency.doped_Composition_eq_func(self_hash, other_hash)[source]
Updated equality function for Composition instances, which breaks early for mismatches and also uses caching, making it orders of magnitude faster than pymatgen's equality function.
- doped.utils.efficiency.doped_Structure__eq__(self, other: IStructure) bool[source]
Copied from pymatgen, but updated to break early once a mismatching site is found, to speed up structure matching by ~2x.
- doped.utils.efficiency.fast_Composition_eq(self, other)[source]
Fast equality function for Composition instances, breaking early for mismatches.
- doped.utils.efficiency.get_dist_equiv_stol(dist: float, structure: Structure) float[source]
Get the equivalent stol value for a given Cartesian distance (dist) in a given Structure.
stol is a site tolerance parameter used in pymatgen StructureMatcher functions, expressed as a fraction of the average free length per atom, defined as (V / Nsites)^(1/3).
- Parameters:
dist (float) – Cartesian distance in Å.
structure (Structure) – Structure to calculate stol for.
- Returns:
Equivalent stol value for the given distance.
- Return type:
float
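Following the definition above, the conversion from a Cartesian distance to stol is a single division by the average free length per atom. The sketch below uses a hypothetical helper, dist_to_stol, that takes the cell volume and number of sites directly rather than a Structure.

```python
def dist_to_stol(dist, volume, n_sites):
    """Convert a Cartesian distance (Å) to a fractional site tolerance.

    stol = dist / (V / N_sites)^(1/3), following the definition in the text.
    Hypothetical helper taking volume (Å^3) and site count directly.
    """
    free_length = (volume / n_sites) ** (1 / 3)
    return dist / free_length


# 8 sites in a 512 Å^3 cell: free length per atom is 4 Å, so 1 Å -> stol of 0.25.
stol = dist_to_stol(1.0, volume=512.0, n_sites=8)
```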
- doped.utils.efficiency.get_element_indices(structure: Structure, elements: list[Element | Species | str] | None = None, comparator: AbstractComparator | None = None) dict[str, list[int]][source]
Convenience function to generate a dictionary of {element: [indices]} for a given Structure, where indices are the indices of the sites in the structure corresponding to the given elements (default is all elements in the structure).
- Parameters:
structure (Structure) – Structure to get the indices from.
elements (list[Element | Species | str] | None) – List of elements to get the indices of. If None (default), all elements in the structure are used.
comparator (AbstractComparator | None) – Comparator to check if we should return the str(element) representation (which includes charge information if element is a Species), or just the element symbol (i.e. element.element.symbol) – which is the case when comparator is None (default) or ElementComparator/FrameworkComparator.
- Returns:
Dictionary of {element: [indices]} for the given elements in the structure.
- Return type:
dict[str, list[int]]
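The shape of the returned dictionary can be illustrated with a plain list of per-site element symbols in place of a Structure. The helper element_indices is hypothetical and ignores the comparator logic (species versus element keys) described above.

```python
from collections import defaultdict


def element_indices(symbols, elements=None):
    """Map each element symbol to the list of site indices where it occurs.

    symbols: per-site element symbols (a stand-in for a Structure's sites).
    elements: optional subset of symbols to report; None means all.
    Hypothetical helper; species/comparator handling is not modelled.
    """
    wanted = set(symbols) if elements is None else set(elements)
    indices = defaultdict(list)
    for i, symbol in enumerate(symbols):
        if symbol in wanted:
            indices[symbol].append(i)
    return dict(indices)


all_indices = element_indices(["Cd", "Te", "Cd", "Te"])
te_only = element_indices(["Cd", "Te", "Cd", "Te"], elements=["Te"])
```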
- doped.utils.efficiency.get_element_min_max_bond_length_dict(structure: Structure, **sm_kwargs) dict[source]
Get a dictionary of {element: (min_bond_length, max_bond_length)} for a given Structure, where min_bond_length and max_bond_length are the minimum and maximum bond lengths for each element in the structure.
- Parameters:
structure (Structure) – Structure to calculate bond lengths for.
**sm_kwargs – Additional keyword arguments to pass to StructureMatcher(). Just used to check if comparator has been set here (if ElementComparator/FrameworkComparator used, then we use Elements rather than Species as the keys), or if ignored_species is set (in which case these species are ignored when calculating bond lengths).
- Returns:
Dictionary of {element: (min_bond_length, max_bond_length)}.
- Return type:
dict
- doped.utils.efficiency.get_min_stol_for_s1_s2(struct1: Structure, struct2: Structure, **sm_kwargs) float[source]
Get the minimum possible stol value which will give a match between struct1 and struct2 using StructureMatcher, based on the ranges of per-element minimum interatomic distances in the two structures.
- Parameters:
struct1 (Structure) – Initial structure.
struct2 (Structure) – Final structure.
**sm_kwargs – Additional keyword arguments to pass to StructureMatcher(). Just used to check if ignored_species or comparator has been set here.
- Returns:
Minimum stol value for a match between struct1 and struct2. If a direct match is detected (corresponding to min stol = 0), then 1e-4 is returned.
- Return type:
float
- doped.utils.efficiency.get_voronoi_nodes(structure: Structure) list[PeriodicSite][source]
Get the Voronoi nodes of a pymatgen Structure.
Maximises efficiency by mapping down to the primitive cell, doing Voronoi analysis (with the efficient DopedTopographyAnalyzer class), and then mapping back to the original structure (typically a supercell).
- Parameters:
structure (Structure) – pymatgen Structure object.
- Returns:
List of PeriodicSite objects representing the Voronoi nodes.
- Return type:
list[PeriodicSite]
doped.utils.eigenvalues module
Helper functions for setting up PHS analysis.
Contains modified versions of functions from pydefect and vise
(https://github.com/kumagai-group/pydefect / vise).
Note that this module attempts to import modules from pydefect & vise,
which are highly-recommended but not strictly required dependencies of
doped (currently not available on conda-forge), and so any imports of
code from this module will attempt their import, raising an ImportError if
not available.
- doped.utils.eigenvalues.band_edge_properties_from_vasprun(vasprun: Vasprun, integer_criterion: float = 0.1) BandEdgeProperties[source]
Create a pydefect BandEdgeProperties object from a Vasprun object.
- Parameters:
vasprun (Vasprun) – Vasprun object.
integer_criterion (float) – Threshold criterion for determining if a band is unoccupied (< integer_criterion), partially occupied (between integer_criterion and 1 - integer_criterion), or fully occupied (> 1 - integer_criterion). Default is 0.1.
- Returns:
BandEdgeProperties object.
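The three occupation classes controlled by integer_criterion can be written as a small function. This sketch assumes occupations normalised to the range 0 to 1; classify_occupation is a hypothetical helper and not the pydefect implementation.

```python
def classify_occupation(occupation, integer_criterion=0.1):
    """Classify a band occupation (assumed normalised to [0, 1]).

    < integer_criterion        -> "unoccupied"
    > 1 - integer_criterion    -> "occupied"
    otherwise                  -> "partially occupied"
    Hypothetical helper mirroring the thresholds described in the text.
    """
    if occupation < integer_criterion:
        return "unoccupied"
    if occupation > 1 - integer_criterion:
        return "occupied"
    return "partially occupied"


labels = [classify_occupation(occ) for occ in (0.02, 0.5, 0.97)]
```

A smaller integer_criterion makes the classification stricter, so more bands with slightly fractional occupations are labelled as partially occupied.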
- doped.utils.eigenvalues.get_band_edge_info(bulk_vr: Vasprun, defect_vr: Vasprun, bulk_procar: str | Path | Procar | None = None, defect_procar: str | Path | Procar | None = None, defect_supercell_site: PeriodicSite | None = None, neighbor_cutoff_factor: float = 1.3) tuple[BandEdgeOrbitalInfos, EdgeInfo, EdgeInfo][source]
Generate metadata required for performing eigenvalue & orbital analysis, specifically pydefect BandEdgeOrbitalInfos, and EdgeInfo objects for the bulk VBM and CBM.
See: https://doped.readthedocs.io/en/latest/Tips.html#perturbed-host-states-shallow-defects
- Parameters:
bulk_vr (Vasprun) – Vasprun object of the bulk supercell calculation. If bulk_procar is not provided, then this must have the projected_eigenvalues attribute (i.e. from a calculation with LORBIT > 10 in the INCAR and parsed with parse_projected_eigen = True (default)).
defect_vr (Vasprun) – Vasprun object of the defect supercell calculation. If defect_procar is not provided, then this must have the projected_eigenvalues attribute (i.e. from a calculation with LORBIT > 10 in the INCAR and parsed with parse_projected_eigen = True (default)).
bulk_procar (PathLike, Procar) – Either a path to the VASP PROCAR(.gz) output file (with LORBIT > 10 in the INCAR) or a pymatgen Procar object, for the reference bulk supercell calculation. Not required if the supplied bulk_vr was parsed with parse_projected_eigen = True (default). Default is None.
defect_procar (PathLike, Procar) – Either a path to the VASP PROCAR(.gz) output file (with LORBIT > 10 in the INCAR) or a pymatgen Procar object, for the defect supercell calculation. Not required if the supplied defect_vr was parsed with parse_projected_eigen = True (default). Default is None.
defect_supercell_site (PeriodicSite) – PeriodicSite object of the defect site in the defect supercell, from which the defect neighbours are determined for localisation analysis. If None (default), then the defect site is determined automatically from the defect and bulk supercell structures.
neighbor_cutoff_factor (float) – Sites within min_distance * neighbor_cutoff_factor of the defect site in the relaxed defect supercell are considered neighbours for localisation analysis, where min_distance is the minimum distance between sites in the defect supercell. Default is 1.3 (matching the pydefect default).
- Returns:
pydefect BandEdgeOrbitalInfos, and EdgeInfo objects for the bulk VBM and CBM.
- doped.utils.eigenvalues.get_eigenvalue_analysis(defect_entry: DefectEntry | None = None, plot: bool = True, filename: str | None = None, ks_labels: bool = False, style_file: str | None = None, bulk_vr: str | Path | Vasprun | None = None, bulk_procar: str | Path | Procar | None = None, defect_vr: str | Path | Vasprun | None = None, defect_procar: str | Path | Procar | None = None, force_reparse: bool = False, ylims: tuple[float, float] | None = None, legend_kwargs: dict | None = None, similar_orb_criterion: float | None = None, similar_energy_criterion: float | None = None) BandEdgeStates | tuple[BandEdgeStates, Figure][source]
Get eigenvalue & orbital info (with automated classification of PHS states) for the band edge and in-gap electronic states for the input defect entry / calculation outputs, as well as a plot of the single-particle electronic eigenvalues and their occupation (if
plot=True).Can be used to determine if a defect is adopting a perturbed host state (PHS / shallow state), see: https://doped.readthedocs.io/en/latest/Tips.html#perturbed-host-states-shallow-defects
Note that the classification of electronic states as band edges or localised orbitals is based on the similarity of orbital projections and eigenvalues between the defect and bulk cell calculations (see
similar_orb/energy_criterionargument descriptions below for more details). You may want to adjust the default values of these keyword arguments, as the defaults may not be appropriate in all cases. In particular, the P-ratio values can give useful insight, revealing the level of (de)localisation of the states.Either a
doped DefectEntry object can be provided, or the required VASP output files/objects for the bulk and defect supercell calculations (Vaspruns, or Vaspruns and Procars). If a DefectEntry is provided but eigenvalue data has not already been parsed (default in doped is to parse this data with DefectsParser/DefectParser, as controlled by the parse_projected_eigen flag), then this function will attempt to load the eigenvalue data from either the input Vasprun/PROCAR objects or files, or from the bulk/defect_paths in defect_entry.calculation_metadata. If so, it will initially try to load orbital projections from vasprun.xml(.gz) files (more accurate due to fewer rounding errors), or failing that from PROCAR(.gz) files if present.
This function uses code from pydefect, so please cite the pydefect paper: https://doi.org/10.1103/PhysRevMaterials.5.123803
- Parameters:
defect_entry (DefectEntry) –
dopedDefectEntryobject. Default isNone.plot (bool) – Whether to plot the single-particle eigenvalues. (Default: True)
filename (str) – Filename to save the eigenvalue plot to (if
plot = True). IfNone(default), plots are not saved.ks_labels (bool) – Whether to add band index labels to the KS levels. (Default: False)
style_file (str) – Path to a
mplstylefile to use for the plot. IfNone(default), uses thedopeddisplacement plot style (doped/utils/displacement.mplstyle).bulk_vr (PathLike, Vasprun) – Not required if
defect_entryprovided and eigenvalue data already parsed (default behaviour when parsing withdoped, data indefect_entry.calculation_metadata["eigenvalue_data"]). Either a path to theVASPvasprun.xml(.gz)output file or apymatgenVasprunobject, for the reference bulk supercell calculation. IfNone(default), tries to load theVasprunobject fromdefect_entry.calculation_metadata["run_metadata"]["bulk_vasprun_dict"]or, failing that, from avasprun.xml(.gz)file atdefect_entry.calculation_metadata["bulk_path"].bulk_procar (PathLike, Procar) – Not required if
defect_entryprovided and eigenvalue data already parsed (default behaviour when parsing withdoped, data indefect_entry.calculation_metadata["eigenvalue_data"]), or ifbulk_vrwas parsed withparse_projected_eigen = True(default). Either a path to theVASPPROCARoutput file (withLORBIT > 10in theINCAR) or apymatgenProcarobject, for the reference bulk supercell calculation. IfNone(default), tries to load from aPROCAR(.gz)file atdefect_entry.calculation_metadata["bulk_path"].defect_vr (PathLike, Vasprun) – Not required if
defect_entryprovided and eigenvalue data already parsed (default behaviour when parsing withdoped, data indefect_entry.calculation_metadata["eigenvalue_data"]). Either a path to theVASPvasprun.xml(.gz)output file or apymatgenVasprunobject, for the defect supercell calculation. IfNone(default), tries to load theVasprunobject fromdefect_entry.calculation_metadata["run_metadata"]["defect_vasprun_dict"]or, failing that, from avasprun.xml(.gz)file atdefect_entry.calculation_metadata["defect_path"].defect_procar (PathLike, Procar) – Not required if
defect_entryprovided and eigenvalue data already parsed (default behaviour when parsing withdoped, data indefect_entry.calculation_metadata["eigenvalue_data"]), or ifdefect_vrwas parsed withparse_projected_eigen = True(default). Either a path to theVASPPROCARoutput file (withLORBIT > 10in theINCAR) or apymatgenProcarobject, for the defect supercell calculation. IfNone(default), tries to load from aPROCAR(.gz)file atdefect_entry.calculation_metadata["defect_path"].force_reparse (bool) – Whether to force re-parsing of the eigenvalue data, even if already present in the
calculation_metadatadict.ylims (tuple[float, float]) – Custom y-axis limits for the eigenvalue plot. If
None(default), the y-axis limits are automatically set to +/-5% of the eigenvalue range.legend_kwargs (dict) – Custom keyword arguments to pass to the
ax.legendcall in the eigenvalue plot (e.g. “loc”, “fontsize”, “framealpha” etc.). If set toFalse, then no legend is shown. Default isNone.similar_orb_criterion (float) – Threshold criterion for determining if the orbitals of two eigenstates are similar (for identifying band-edge and defect states). If the summed orbital projection differences, normalised by the total orbital projection coefficients, are less than this value, then the orbitals are considered similar. Default is to try with 0.2 (
pydefectdefault), then if this fails increase to 0.35, and lastly 0.5.similar_energy_criterion (float) – Threshold criterion for considering two eigenstates similar in energy, used for identifying band-edge (and defect states). Bands within this energy difference from the VBM/CBM of the bulk are considered potential band-edge states. Default is to try with the larger of either 0.25 eV or 0.1 eV + the potential alignment from defect to bulk cells as determined by the charge correction in
defect_entry.corrections_metadataif present. If this fails, then it is increased to thepydefectdefault of 0.5 eV.
- Returns:
pydefectBandEdgeStatesobject, containing the band-edge and defect eigenvalue information, and the eigenvalue plot (ifplot=True).
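The default behaviour described for similar_orb_criterion (try 0.2, then 0.35, then 0.5) amounts to a simple fallback loop. The sketch below only illustrates that pattern: classify_with_fallback and toy_classifier are hypothetical names, and the assumption that a failed classification raises ValueError is made purely for illustration.

```python
def classify_with_fallback(classify, thresholds=(0.2, 0.35, 0.5)):
    """Try each threshold in turn; return (threshold, result) for the first success.

    `classify` is a hypothetical callable that raises ValueError when no
    band-edge state can be identified at the given threshold.
    """
    last_error = None
    for threshold in thresholds:
        try:
            return threshold, classify(threshold)
        except ValueError as exc:
            last_error = exc
    raise ValueError(f"No threshold in {thresholds} succeeded") from last_error


def toy_classifier(threshold, orbital_difference=0.3):
    # Hypothetical stand-in: two states count as similar only if their
    # normalised orbital-projection difference is below the threshold.
    if orbital_difference >= threshold:
        raise ValueError("no similar band-edge state found")
    return "band-edge state identified"


print(classify_with_fallback(toy_classifier))
```

With a normalised orbital difference of 0.3, the first threshold (0.2) fails and the second (0.35) succeeds, which is why loosening the criterion in stages can rescue cases where the strictest default is too tight.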
- doped.utils.eigenvalues.make_band_edge_orbital_infos(defect_vr: Vasprun, vbm: float, cbm: float, eigval_shift: float = 0.0, neighbor_indices: list[int] | None = None, defect_procar: Procar | None = None)[source]
Make BandEdgeOrbitalInfos from a Vasprun object.
Modified from pydefect to use projected orbitals stored in the Vasprun object.
- Parameters:
defect_vr (Vasprun) – Defect Vasprun object.
vbm (float) – VBM eigenvalue in eV.
cbm (float) – CBM eigenvalue in eV.
eigval_shift (float) – Shift eigenvalues by this value in eV. Default is 0.0.
neighbor_indices (list[int]) – Indices of neighboring atoms to the defect site, for localisation analysis. Default is None.
defect_procar (Procar) – pymatgen Procar object, for the defect supercell, if projected eigenvalue/orbitals data is not provided in defect_vr.
- Returns:
BandEdgeOrbitalInfos object.
- doped.utils.eigenvalues.make_perfect_band_edge_state_from_vasp(vasprun: Vasprun, procar: Procar, integer_criterion: float = 0.1) PerfectBandEdgeState[source]
Create a pydefect PerfectBandEdgeState object from just a Vasprun and Procar object, without the need for the Outcar input (as in pydefect).
- Parameters:
vasprun (Vasprun) – Vasprun object.
procar (Procar) – Procar object.
integer_criterion (float) – Threshold criterion for determining if a band is unoccupied (< integer_criterion), partially occupied (between integer_criterion and 1 - integer_criterion), or fully occupied (> 1 - integer_criterion). Default is 0.1.
- Returns:
PerfectBandEdgeState object.
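The three occupation classes controlled by integer_criterion can be written out directly. This is a minimal sketch of the rule as stated in the parameter description; the function name classify_occupation is hypothetical, and occupations are assumed to be per-spin-channel values between 0 and 1.

```python
def classify_occupation(occupation, integer_criterion=0.1):
    """Classify a band occupation (0 to 1) using the integer_criterion threshold."""
    if occupation < integer_criterion:
        return "unoccupied"
    if occupation > 1 - integer_criterion:
        return "fully occupied"
    return "partially occupied"


for occ in (0.02, 0.5, 0.97):
    print(occ, classify_occupation(occ))
```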
doped.utils.legacy_corrections module
Functions for computing legacy finite-size charge corrections (Makov-Payne, Murphy-Hine, Lany-Zunger) for defect formation energies.
Mostly adapted from the deprecated AIDE package developed by the dynamic duo Adam Jackson and Alex Ganose.
Note that bandfilling corrections are no longer supported, as in most cases they shouldn’t be used (see https://doi.org/10.1038/s41578-025-00879-y). If for some reason bandfilling corrections are desired, they can be manually added to corrections attributes of DefectEntry objects. See https://github.com/materialsproject/pymatgen/pull/2193
- doped.utils.legacy_corrections.get_murphy_image_charge_correction(lattice, dielectric_matrix, conv=0.3, factor=30, verbose=False)[source]
Calculates the anisotropic image charge correction by Sam Murphy in eV.
This is a rewrite of the code ‘madelung.pl’ written by Sam Murphy (see [1]). The default convergence parameter of conv = 0.3 seems to work perfectly well. However, it may be worth testing convergence of defect energies with respect to the factor (i.e. cut-off radius).
Reference: S. T. Murphy and N. D. H. Hine, Phys. Rev. B 87, 094111 (2013).
- Parameters:
lattice (list) – The defect cell lattice as a 3x3 matrix.
dielectric_matrix (list) – The dielectric tensor as 3x3 matrix.
conv (float) – A value between 0.1 and 0.9 which adjusts how much real space vs reciprocal space contribution there is.
factor – The cut-off radius, defined as a multiple of the longest cell parameter.
verbose (bool) – If True details of the correction will be printed.
- Returns:
The image charge correction as a {charge: correction} dictionary.
- doped.utils.legacy_corrections.lany_zunger_corrected_defect_dict(defect_dict: dict)[source]
Convert charge corrections from (e)FNV to Lany-Zunger in the input parsed defect dictionary.
This function is used to convert the finite-size charge corrections for parsed defect entries in a dictionary to the same dictionary but with the Lany-Zunger charge correction (0.65 * Makov-Payne image charge correction, with the same potential alignment).
- Parameters:
defect_dict (dict) – Dictionary of parsed defect calculations. Must have
'freysoldt_meta'inDefectEntry.calculation_metadatafor each charged defect (fromDefectParser.load_FNV_data()).- Returns:
Parsed defect dictionary with Lany-Zunger charge corrections.
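The Lany-Zunger scaling described above is simple arithmetic, sketched below. The function name and argument names are hypothetical, and treating the potential alignment term as an additive energy correction in eV is an assumption made for illustration, not a statement of how doped stores these values.

```python
LZ_SCALING = 0.65  # Lany-Zunger scaling of the Makov-Payne image charge term


def lany_zunger_correction(makov_payne_image_charge, potential_alignment_correction):
    """Total Lany-Zunger charge correction in eV.

    Assumption: both inputs are energy corrections in eV, and the potential
    alignment term is carried over unchanged from the (e)FNV correction.
    """
    return LZ_SCALING * makov_payne_image_charge + potential_alignment_correction


print(lany_zunger_correction(1.0, 0.1))
```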
doped.utils.parsing module
Helper functions for parsing defect supercell calculations.
- doped.utils.parsing.check_atom_mapping_far_from_defect(bulk_supercell: Structure, defect_supercell: Structure, defect_coords: ndarray[float], coords_are_cartesian: bool = False, displacement_tol: float = 0.5, warning: bool | str = 'verbose') bool[source]
Check the displacement of atoms far from the determined defect site, and warn the user if they are large (often indicates a mismatch between the bulk and defect supercell definitions).
Displacements are flagged as ‘large’ if the mean displacement of any species is greater than displacement_tol Ångströms for sites of that species outside the Wigner-Seitz radius of the defect in the defect supercell. The Wigner-Seitz radius corresponds to the radius of the largest sphere which can fit in the cell.
- Parameters:
bulk_supercell (Structure) – The bulk structure.
defect_supercell (Structure) – The defect structure.
defect_coords (np.ndarray[float]) – The coordinates of the defect site.
coords_are_cartesian (bool) – Whether the defect coordinates are in Cartesian or fractional coordinates. Default is
False(fractional).displacement_tol (float) – The tolerance for the displacement of atoms far from the defect site, in Ångströms. Default is 0.5 Å.
warning (bool, str) – Whether to throw a warning if a mismatch is detected. If
warning = "verbose"(default), the individual atomic displacements are included in the warning message.
- Returns:
Returns
Falseif a mismatch is detected, elseTrue.- Return type:
bool
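The displacement criterion can be sketched as follows. This is an illustrative re-implementation, not the doped code: the function name is hypothetical, sites are assumed to be already paired between bulk and defect cells, coordinates are Cartesian (Å), and periodic images are ignored.

```python
import math
from collections import defaultdict


def species_with_large_displacements(sites, defect_coord, ws_radius, displacement_tol=0.5):
    """Return species whose mean displacement far from the defect exceeds the tolerance.

    sites: list of (species, bulk_xyz, defect_xyz) tuples with matched indices.
    Only sites further than ws_radius from the defect are considered.
    """
    displacements = defaultdict(list)
    for species, bulk_xyz, defect_xyz in sites:
        if math.dist(defect_xyz, defect_coord) > ws_radius:
            displacements[species].append(math.dist(bulk_xyz, defect_xyz))
    return sorted(
        sp for sp, d in displacements.items() if sum(d) / len(d) > displacement_tol
    )


sites = [
    ("Cd", (6.0, 0.0, 0.0), (6.1, 0.0, 0.0)),
    ("Cd", (0.0, 6.0, 0.0), (0.0, 6.1, 0.0)),
    ("Te", (7.0, 7.0, 0.0), (7.8, 7.0, 0.0)),
    ("Te", (1.0, 0.0, 0.0), (2.5, 0.0, 0.0)),  # close to the defect: ignored
]
print(species_with_large_displacements(sites, (0.0, 0.0, 0.0), ws_radius=5.0))
```

An empty result corresponds to the function returning True (no mismatch detected); any listed species corresponds to a detected mismatch.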
- doped.utils.parsing.find_archived_fname(fname, raise_error=True)[source]
Find a suitable filename, taking account of possible use of compression software.
- doped.utils.parsing.get_coords_and_idx_of_species(structure_or_sites: SiteCollection, species_name: str, frac_coords: bool = True, use_oxi_states: bool = False) tuple[ndarray, ndarray][source]
Get arrays of the coordinates and indices of the given species in the structure/list of sites.
- doped.utils.parsing.get_core_potentials_from_outcar(outcar_path: str | Path, dir_type: str = '', total_energy: list | float | None = None)[source]
Get the core potentials from the OUTCAR file, which are needed for the Kumagai-Oba (eFNV) finite-size correction.
This parser skips the full
pymatgenOutcarinitialisation/parsing, to expedite parsing and make it more robust (doesn’t fail ifOUTCARis incomplete, as long as it has the core potentials information).- Parameters:
outcar_path (PathLike) – The path to the OUTCAR file.
dir_type (str) – The type of directory the OUTCAR is in (e.g.
bulkordefect) for informative error messages.total_energy (Optional[Union[list, float]]) – The already-parsed total energy for the structure. If provided, will check that the total energy of the
OUTCARmatches this value / one of these values, and throw a warning if not.
- Returns:
The core potentials from the last ionic step in the
OUTCAR.- Return type:
np.ndarray
- doped.utils.parsing.get_defect_type_and_composition_diff(bulk: Structure | Composition, defect: Structure | Composition) tuple[str, dict][source]
Get the difference in composition between a bulk structure and a defect structure.
- Parameters:
bulk (Union[Structure, Composition]) – The bulk structure or composition.
defect (Union[Structure, Composition]) – The defect structure or composition.
- Returns:
The defect type (
interstitial,vacancy,substitutionorcomplex) and the composition difference between the bulk and defect structures as a dictionary.- Return type:
tuple[str, Dict[str, int]]
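A composition difference and a defect type can be derived from element counts alone. The sketch below uses plain dictionaries instead of pymatgen objects, and the classification rules (a single +1 is an interstitial, a single -1 is a vacancy, one of each is a substitution, anything else is complex) are an assumption about what the four labels mean rather than a copy of the doped logic.

```python
def defect_type_and_composition_diff(bulk, defect):
    """bulk/defect: dicts of element -> atom count for equally sized supercells."""
    elements = set(bulk) | set(defect)
    diff = {el: defect.get(el, 0) - bulk.get(el, 0) for el in elements}
    diff = {el: n for el, n in diff.items() if n != 0}  # keep only changes
    values = sorted(diff.values())
    if values == [1]:
        defect_type = "interstitial"
    elif values == [-1]:
        defect_type = "vacancy"
    elif values == [-1, 1]:
        defect_type = "substitution"
    else:
        defect_type = "complex"
    return defect_type, diff


bulk = {"Cd": 32, "Te": 32}
print(defect_type_and_composition_diff(bulk, {"Cd": 31, "Te": 32}))
print(defect_type_and_composition_diff(bulk, {"Cd": 31, "Te": 32, "Cu": 1}))
```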
- doped.utils.parsing.get_defect_type_and_site_indices(bulk_supercell: Structure, defect_supercell: Structure, site_tol: float | None = None, abs_tol: bool = False, use_oxi_states: bool = False, use_rms: bool = False) tuple[str, list[int], list[int]][source]
Get the defect type, and indices of defect sites in the bulk (vacancies / substitutions) and defect (interstitials / substitutions) supercells.
Defect sites are determined by matching sites in the bulk and defect structures (by element and distances), according to
site_tol.- Parameters:
bulk_supercell (Structure) – The bulk supercell structure.
defect_supercell (Structure) – The defect supercell structure.
site_tol (float | None) – The (fractional) tolerance for matching sites between the defect and bulk structures. If
abs_tolisFalse(default), then the distance threshold for matching is set to the product ofsite_toland the shortest bond length in the bulk structure for the given species, otherwise the value is used directly (as a length in Å). If set toNone, the defect is assumed to be a point defect, and the largest site mismatch is assigned as the defect site. Default is 0.5 (i.e. half the shortest bond length in the bulk structure for the given species).abs_tol (bool) – Whether to use
site_tolas an absolute distance tolerance (in Å) instead of a fractional tolerance (in terms of the shortest bond length in the structure). Default isFalse.use_oxi_states (bool) – Whether to use the oxidation states of the sites in the bulk and defect structures when considering matching sites (such that e.g.
Fe3+andFe2+would be considered different species). Default isFalse.use_rms (bool) – Site mapping (using linear assignment) – used to determine defect sites – will be that which minimises either the summed RMS distances (if
use_rmsisTrue) or just simple linear sum of distances (ifFalse, default) between all paired sites.
- Returns:
defect_type (str): The type of defect as a string (interstitial, vacancy or substitution).
missing_bulk_site_indices (list[int]): Indices of sites in the bulk structure that do not match any site in the defect structure (according to site_tol choice).
additional_defect_site_indices (list[int]): Indices of sites in the defect structure that do not match any site in the bulk structure (according to site_tol choice).
- Return type:
tuple[str, list[int], list[int]]
- doped.utils.parsing.get_dimer_bonds(structure: Structure, rtol: float = 1.05) dict[str, list[float]][source]
Get a dictionary of all homoionic (dimer) bonds in the structure.
This function uses the get_homoionic_bonds and get_dimer_bond_length functions from shakenbreak to identify dimer bonds in the structure (where any pair of atoms of the same element with distance < rtol * get_dimer_bond_length(elt, elt) is considered a dimer bond), returning a dictionary of the site names and the dimer bond lengths.
- Parameters:
structure (Structure) – The structure to get the dimer bond lengths for.
rtol (float) – The relative tolerance to use for classifying bonds as dimer bonds, where distances <
rtol * get_dimer_bond_length(elt, elt)are considered dimer bonds. Default is 1.05.
- Returns:
A dictionary of element names with values being sub-dictionaries of site names and their homoionic neighbours and distances (in Å) which are classified as dimer bonds. (e.g. {‘O’: {‘O(1)’: {‘O(3)’: ‘1.44 Å’}}})
- Return type:
dict[str, list[float]]
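The rtol criterion can be illustrated without shakenbreak. In this sketch the reference bond length table is a hypothetical placeholder (not the values shakenbreak returns), site labels are 0-indexed, distances are returned as floats rather than formatted strings, and periodic images are ignored.

```python
import math
from itertools import combinations

# Hypothetical reference dimer bond lengths in Å (illustrative placeholder values)
REFERENCE_DIMER_LENGTHS = {"O": 1.45}


def find_dimer_bonds(sites, rtol=1.05, reference=REFERENCE_DIMER_LENGTHS):
    """sites: list of (element, (x, y, z)). Returns {element: {site: {neighbour: distance}}}."""
    bonds = {}
    for (i, (el_a, xyz_a)), (j, (el_b, xyz_b)) in combinations(enumerate(sites), 2):
        if el_a != el_b or el_a not in reference:
            continue  # only homoionic pairs with a known reference length
        distance = math.dist(xyz_a, xyz_b)
        if distance < rtol * reference[el_a]:
            site_bonds = bonds.setdefault(el_a, {}).setdefault(f"{el_a}({i})", {})
            site_bonds[f"{el_b}({j})"] = round(distance, 2)
    return bonds


sites = [
    ("O", (0.0, 0.0, 0.0)),
    ("Ti", (2.0, 0.0, 0.0)),
    ("O", (0.0, 1.44, 0.0)),
    ("O", (4.0, 4.0, 4.0)),
]
print(find_dimer_bonds(sites))
```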
- doped.utils.parsing.get_locpot(locpot_path: str | Path)[source]
Read the LOCPOT(.gz) file as a pymatgen Locpot object.
- doped.utils.parsing.get_magnetization_from_vasprun(vasprun: Vasprun) int | float | ndarray[float][source]
Determine the total magnetization from a
Vasprunobject.For spin-polarised calculations, this is the difference between the number of spin-up vs spin-down electrons. For non-spin-polarised calculations, there is no magnetization. For non-collinear (NCL) magnetization (e.g. spin-orbit coupling (SOC) calculations), the magnetization becomes a vector (spinor), in which case we take the vector norm as the total magnetization.
VASP does not write the total magnetization to the vasprun.xml file (but does to the OUTCAR file), and so here we have to reverse-engineer it from the eigenvalues (for normal spin-polarised calculations) or the projected magnetization & eigenvalues (for NCL calculations). For NCL calculations, we sum the projected orbital magnetizations for all occupied states, weighted by the k-point weights and normalised by the total orbital projections for each band and k-point. This gives the best estimate of the total magnetization from the projected magnetization array, but due to incomplete orbital projections and orbital-dependent non-uniform scaling factors (i.e. completeness of orbital projections for s vs p vs d orbitals etc.), there can be inaccuracies of up to ~30% in the estimated total magnetization for tricky cases.
- Parameters:
vasprun (Vasprun) – The
Vasprunobject from which to extract the total magnetization.- Returns:
The total magnetization of the system.
- Return type:
int or float or np.ndarray[float]
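For the collinear spin-polarised case, the "difference between the number of spin-up vs spin-down electrons" can be computed from band occupations and k-point weights. The sketch below is a simplified illustration with plain lists; the function name and input layout are hypothetical, and it does not cover the NCL projection-based estimate described above.

```python
def collinear_magnetization(occ_up, occ_down, kpoint_weights):
    """Total magnetization (in Bohr magnetons) from per-spin band occupations.

    occ_up / occ_down: occupations[kpoint][band], each between 0 and 1.
    kpoint_weights: one weight per k-point, assumed to sum to 1.
    """
    n_up = sum(w * sum(bands) for w, bands in zip(kpoint_weights, occ_up))
    n_down = sum(w * sum(bands) for w, bands in zip(kpoint_weights, occ_down))
    return n_up - n_down


occ_up = [[1, 1, 1, 0], [1, 1, 1, 0]]
occ_down = [[1, 1, 0, 0], [1, 1, 0, 0]]
print(collinear_magnetization(occ_up, occ_down, kpoint_weights=[0.5, 0.5]))
```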
- doped.utils.parsing.get_matching_site(site: PeriodicSite | ndarray[float], structure: Structure, anonymous: bool = False, tol: float = 0.5) PeriodicSite[source]
Get the (closest) matching PeriodicSite in structure for the input site, which can be a PeriodicSite or fractional coordinates.
If the closest matching site in structure is > tol Å (0.5 Å by default) away from the input site coordinates, an error is raised.
Automatically accounts for possible differences in assigned oxidation states, site property dicts etc.
- Parameters:
site (PeriodicSite | np.ndarray[float]) – The site for which to find the closest matching site in
structure, either as aPeriodicSiteor fractional coordinates array. If fractional coordinates, thenanonymousis set toTrue.structure (Structure) – The structure in which to search for matching sites to
site.anonymous (bool) – Whether to use anonymous matching, allowing different species/elements to match each other (i.e. just matching based on coordinates). Default is
Falseifsiteis aPeriodicSite, andTrueifsiteis fractional coordinates.tol (float) – A distance tolerance (in Å), where an error will be thrown if the closest matching site is >
tolÅ away from the inputsite. Default is 0.5 Å.
- Returns:
The closest matching site in
structureto the inputsite.- Return type:
PeriodicSite
- doped.utils.parsing.get_nelect_from_vasprun(vasprun: Vasprun) int | float[source]
Determine the number of electrons (
NELECT) from aVasprunobject.- Parameters:
vasprun (Vasprun) – The
Vasprunobject from which to extractNELECT.- Returns:
The number of electrons in the system.
- Return type:
int or float
- doped.utils.parsing.get_neutral_nelect_from_vasprun(vasprun: Vasprun, skip_potcar_init: bool = False) int[source]
Determine the number of electrons (
NELECT) from aVasprunobject, corresponding to a neutral charge state for the structure.- Parameters:
vasprun (Vasprun) – The
Vasprunobject from which to extractNELECT.skip_potcar_init (bool) – Whether to skip the initialisation of the
POTCARstatistics (i.e. the auto-charge determination) and instead try to reverse engineerNELECTusing theDefectDictSet.
- Returns:
The number of electrons in the system for a neutral charge state.
- Return type:
int
- doped.utils.parsing.get_outcar(outcar_path: str | Path)[source]
Read the OUTCAR(.gz) file as a pymatgen Outcar object.
- doped.utils.parsing.get_procar(procar_path: str | Path) Procar[source]
Read the PROCAR(.gz) file as a pymatgen Procar object.
Previously, pymatgen Procar parsing did not support SOC calculations; however, this was updated in https://github.com/materialsproject/pymatgen/pull/3890 to use code from easyunfold (https://smtg-bham.github.io/easyunfold – a package for unfolding electronic band structures for symmetry-broken / defect / dopant systems, with many plotting & analysis tools).
- doped.utils.parsing.get_site_mappings(struct1: Structure, struct2: Structure, species: str | Element | Species | DummySpecies | None = None, allow_duplicates: bool = False, threshold: float = 2.0, anonymous: bool = False, ignored_species: list[str] | None = None, use_rms: bool = False) list[tuple[float | None, int | None, int | None]][source]
Get the site mappings between two structures (from
struct1tostruct2), based on the shortest distances between sites.The two structures may have different species orderings.
NOTE: This assumes that both structures have the same lattice definitions (i.e. that they match, and aren’t rigidly translated/rotated with respect to each other), which is mostly the case unless we have a mismatching defect/bulk supercell (in which case the
check_atom_mapping_far_from_defectwarning should be thrown anyway during parsing).- Parameters:
struct1 (Structure) – The input structure.
struct2 (Structure) – The template structure.
species (str) – If provided, only sites of this species will be considered when matching sites. Default is
None(all species).allow_duplicates (bool) – If
True, allow multiple sites instruct1to be matched to the same site instruct2. Default isFalse.threshold (float) – If the distance between a pair of matched sites is larger than this, then a warning will be thrown. Default is 2.0 Å.
anonymous (bool) – If
True, the species of the sites will not be considered when matching sites. Default isFalse(only matching species can be matched together).ignored_species (list[str]) – A list of species to ignore when matching sites. Default is no species ignored.
use_rms (bool) – The returned site mapping (using linear assignment – only applicable when
allow_duplicatesisFalse) will be that which minimises either the summed RMS distances (ifuse_rmsisTrue) or just simple linear sum of distances (ifFalse, default) between all paired sites.
- Returns:
A list of lists containing the distance, index in
struct1and index instruct2for each matched site.- Return type:
list
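The effect of use_rms on the site assignment can be shown with a small sketch. Note the swapped-in technique: instead of a linear assignment solver (for which something like scipy.optimize.linear_sum_assignment would normally be used), this example brute-forces all permutations, which is only sensible for a handful of sites. The function name is hypothetical, both inputs are assumed to have the same number of sites in Cartesian coordinates (Å), and species and periodic images are ignored.

```python
import math
from itertools import permutations


def site_mapping(coords1, coords2, use_rms=False):
    """Return [(distance, idx1, idx2), ...] minimising the summed cost.

    Cost per pair is the distance (use_rms=False) or squared distance
    (use_rms=True). Brute force over permutations, for illustration only.
    """
    power = 2 if use_rms else 1
    best = min(
        permutations(range(len(coords2))),
        key=lambda perm: sum(
            math.dist(coords1[i], coords2[j]) ** power for i, j in enumerate(perm)
        ),
    )
    return [(math.dist(coords1[i], coords2[j]), i, j) for i, j in enumerate(best)]


coords1 = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
coords2 = [(3.1, 0.0, 0.0), (0.2, 0.0, 0.0)]
for distance, idx1, idx2 in site_mapping(coords1, coords2):
    print(f"struct1[{idx1}] -> struct2[{idx2}] ({distance:.2f} Å)")
```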
- doped.utils.parsing.get_vasprun(vasprun_path: str | Path, parse_mag: bool = True, **kwargs)[source]
Read the vasprun.xml(.gz) file as a pymatgen Vasprun object.
- doped.utils.parsing.get_wigner_seitz_radius(lattice: Structure | Lattice) float[source]
Calculates the Wigner-Seitz radius of the structure, which corresponds to the maximum radius of a sphere fitting inside the cell.
Templated on the calc_max_sphere_radius function from pydefect, but rewritten to avoid calling vise, which causes hanging on Windows (https://github.com/SMTG-Bham/doped/issues/147).
- Parameters:
lattice (Union[Structure,Lattice]) – The lattice of the structure (either a
pymatgenStructureorLatticeobject).- Returns:
The Wigner-Seitz radius of the structure.
- Return type:
float
- doped.utils.parsing.parse_projected_eigen(elem: Element, parse_mag: bool = True) tuple[dict[Spin, ndarray], ndarray | None][source]
Parse the projected eigenvalues from a Vasprun object (used during initialisation), optionally excluding the projected magnetization for efficiency.
Note that following SK’s PRs to pymatgen (#4359, #4360), parsing of projected eigenvalues adds minimal additional cost to Vasprun parsing (~1-5%), while parsing of projected magnetization can add ~30% cost.
This is a modified version of _parse_projected_eigen from pymatgen.io.vasp.outputs.Vasprun, which allows skipping of projected magnetization parsing in order to expedite parsing in doped, as well as some small adjustments to maximise efficiency.
- Parameters:
elem (Element) – The XML element to parse, with projected eigenvalues/magnetization.
parse_mag (bool) – Whether to parse the projected magnetization. Default is
True.
- Returns:
A dictionary of projected eigenvalues for each spin channel (up/down), and the projected magnetization (if parsed).
- Return type:
Tuple[Dict[Spin, np.ndarray], Optional[np.ndarray]]
- doped.utils.parsing.reorder_s2_like_s1(s1_structure: Structure, s2_structure: Structure, threshold=5.0) Structure[source]
Reorder the atoms of a (relaxed) structure, s2_structure, to match the ordering of the atoms in s1_structure.
s1/s2 structures may have different species orderings.
NOTE: This assumes that both structures have the same lattice definitions (i.e. that they match, and aren’t rigidly translated/rotated with respect to each other), which is mostly the case unless we have a mismatching defect/bulk supercell (in which case the
check_atom_mapping_far_from_defectwarning should be thrown anyway during parsing).- Parameters:
s1_structure (Structure) – The template structure.
s2_structure (Structure) – The structure to reorder, to match
s1_structure.threshold (float) – If the distance between a pair of matched sites is larger than this value in Å, then a warning will be thrown. Default is 5.0 Å.
- Returns:
s2_structurereordered to matchs1_structure.- Return type:
Structure
- doped.utils.parsing.spin_degeneracy_from_vasprun(vasprun: Vasprun, charge_state: int | None = None) int[source]
Get the spin degeneracy (multiplicity) of a system from a VASP vasprun output.
Spin degeneracy is determined by first getting the total magnetization and thus electron spin (S = N_μB/2, where N_μB is the magnetization in Bohr magnetons, i.e. electronic units, as used in VASP), and then using the spin multiplicity equation: g_spin = 2S + 1. The total magnetization N_μB is determined using get_magnetization_from_vasprun (see docstring for details), and if this fails, then simple spin behaviour is assumed with singlet (S = 0) behaviour for even-electron systems and doublet behaviour (S = 1/2) for odd-electron systems.
For non-collinear (NCL) magnetization (e.g. spin-orbit coupling (SOC) calculations), the magnetization N_μB becomes a vector (spinor), in which case we take the vector norm as the total magnetization. This can be non-integer in these cases (e.g. due to SOC mixing of spin states, as S is no longer a good quantum number). As an approximation for these cases, we round N_μB to the nearest integer which would be allowed under collinear magnetism (i.e. even numbers for even-electron systems, odd numbers for odd-electron systems).
- Parameters:
vasprun (Vasprun) –
pymatgenVasprunfor which to determine spin degeneracy.charge_state (int) – The charge state of the system, which can be used to determine the number of electrons. If
None(default), automatically determines the number of electrons usingget_nelect_from_vasprun(vasprun).
- Returns:
Spin degeneracy of the system.
- Return type:
int
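The spin multiplicity rule, including the parity-constrained rounding for non-integer magnetization, can be sketched as below. The function name and signature are hypothetical (the real function takes a Vasprun object), and the electron count is assumed to be an integer.

```python
def spin_degeneracy(magnetization, n_electrons):
    """g_spin = 2S + 1 with S = N_muB / 2.

    N_muB is rounded to the nearest non-negative integer with the same
    parity as n_electrons (even for even-electron systems, odd for odd).
    """
    parity = n_electrons % 2
    n_mu_b = parity + 2 * round((abs(magnetization) - parity) / 2)
    n_mu_b = max(n_mu_b, parity)
    spin = n_mu_b / 2
    return int(2 * spin + 1)


print(spin_degeneracy(0.0, 10))   # closed-shell singlet
print(spin_degeneracy(1.93, 12))  # NCL-like value, rounds to N_muB = 2
print(spin_degeneracy(0.87, 11))  # odd-electron system, rounds to N_muB = 1
```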
- doped.utils.parsing.total_charge_from_vasprun(vasprun: Vasprun) int | None[source]
Determine the total charge state of a system from the vasprun, and compare to the expected charge state if provided.
Note that if the system is charged, then this function relies on access to
POTCARdata, which can be setup withpymatgenas detailed on the installation page here: https://doped.readthedocs.io/en/latest/Installation.html#setup-potcars-and-materials-project-api- Parameters:
vasprun (Vasprun) –
pymatgenVasprunobject for which to determine the total charge.- Returns:
The total charge state, or
Noneif it cannot be determined.- Return type:
int or None
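The charge state follows from comparing NELECT with the electron count of the neutral cell, which is why POTCAR data (the valence electron count per species) is needed. In this sketch the function name is hypothetical and the valence electron table is an illustrative stand-in for values that would normally come from the POTCARs used in the calculation.

```python
# Illustrative valence electron counts per species (assumed values,
# standing in for the POTCAR data of the actual calculation)
VALENCE_ELECTRONS = {"Cd": 12, "Te": 6}


def total_charge(composition, nelect, valence_electrons=VALENCE_ELECTRONS):
    """Charge state = (electrons in the neutral cell) - NELECT."""
    neutral_nelect = sum(n * valence_electrons[el] for el, n in composition.items())
    return round(neutral_nelect - nelect)


# Cd vacancy supercell (Cd31Te32) with two extra electrons: charge state -2
print(total_charge({"Cd": 31, "Te": 32}, nelect=566))
```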
doped.utils.plotting module
Code for plotting defect formation energies.
These functions were built from a combination of useful modules from
pymatgen and AIDE (by Adam Jackson and Alex Ganose), alongside substantial
modification, in the effort of making an efficient, user-friendly package for
managing and analysing defect calculations with publication-quality outputs.
- doped.utils.plotting.format_defect_name(defect_species: str, include_site_info_in_name: bool = False, wout_charge: bool = False) str | None[source]
Format defect name for plot titles (i.e. from "Cd_i_C3v_0" to "$Cd_{i}^{0}$" or "$Cd_{i_{C3v}}^{0}$"). Note this assumes “V_…” means vacancy, not vanadium.
- Parameters:
defect_species (str) – Name of defect including charge state (e.g.
"Cd_i_C3v_0").include_site_info_in_name (bool) – Whether to include site info in name (e.g.
"$Cd_{i}^{0}$"or"$Cd_{i_{C3v}}^{0}$"). Defaults toFalse.wout_charge (bool) – Whether to exclude the charge state from the formatted
defect_speciesname. Defaults toFalse.
- Returns:
Formatted defect name.
- Return type:
str
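The documented example conversion can be reproduced with a much-simplified formatter. This sketch only handles names of the form element_site[_siteinfo]_charge, as in "Cd_i_C3v_0"; the function name is hypothetical, and the real function handles many more naming patterns (vacancies, substitutions, unrecognised names returning None).

```python
def format_defect_name_sketch(defect_species, include_site_info_in_name=False, wout_charge=False):
    """Format names like 'Cd_i_C3v_0' as LaTeX strings for plot labels."""
    *name_parts, charge = defect_species.split("_")
    element, site = name_parts[0], name_parts[1]
    site_info = name_parts[2] if len(name_parts) > 2 else None
    subscript = site
    if include_site_info_in_name and site_info:
        subscript = f"{site}_{{{site_info}}}"  # nested subscript for site info
    formatted = f"{element}_{{{subscript}}}"
    if not wout_charge:
        formatted += f"^{{{charge}}}"  # charge state as superscript
    return f"${formatted}$"


print(format_defect_name_sketch("Cd_i_C3v_0"))
print(format_defect_name_sketch("Cd_i_C3v_0", include_site_info_in_name=True))
```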
- doped.utils.plotting.formation_energy_plot(defect_thermodynamics: DefectThermodynamics, dft_chempots: dict | None = None, el_refs: dict | None = None, chempot_table: bool = True, all_entries: bool | str = False, xlim: tuple[float, float] | None = None, ylim: tuple[float, float] | None = None, fermi_level: float | None = None, include_site_info: bool = False, title: str | None = None, colormap: str | Colormap | None = None, linestyles: str | list[str] = '-', auto_labels: bool = False, filename: str | Path | None = None)[source]
Produce defect formation energy vs Fermi level plot.
- Parameters:
defect_thermodynamics (DefectThermodynamics) –
DefectThermodynamicsobject containing defect entries to plot.dft_chempots (dict) – Dictionary of
{Element: value}giving the chemical potential of each element.el_refs (dict) – Dictionary of
{Element: value}giving the reference energy of each element.chempot_table (bool) – Whether to print the chemical potential table above the plot. (Default: True)
all_entries (bool, str) – Whether to plot the formation energy lines of all defect entries, rather than the default of showing only the equilibrium states at each Fermi level position (traditional). If instead set to “faded”, will plot the equilibrium states in bold, and all unstable states in faded grey. (Default: False)
xlim – Tuple (min,max) giving the range of the x-axis (Fermi level). May want to set manually when including transition level labels, to avoid crossing the axes. Default is to plot from -0.3 to +0.3 eV above the band gap.
ylim – Tuple (min,max) giving the range for the y-axis (formation energy). May want to set manually when including transition level labels, to avoid crossing the axes. Default is from 0 to just above the maximum formation energy value in the band gap.
fermi_level (float) – If set, plots a dashed vertical line at this Fermi level value, typically used to indicate the equilibrium Fermi level position. (Default: None)
include_site_info (bool) – Whether to include site info in defect names in the plot legend (e.g. $Cd_{i_{C3v}}^{0}$ rather than $Cd_{i}^{0}$). Default is False, where site info is not included unless there are inequivalent sites for the same defect type. If, even with site info added, there are duplicate defect names, then “-a”, “-b”, “-c” etc. are appended to the names to differentiate them.
title (str) – Title for the plot. (Default: None)
colormap (str, matplotlib.colors.Colormap) – Colormap to use for the formation energy lines, either as a string (which can be a colormap name from https://matplotlib.org/stable/users/explain/colors/colormaps or from https://www.fabiocrameri.ch/colourmaps – append ‘S’ if using a sequential colormap from the latter) or a Colormap / ListedColormap object. If None (default), uses tab10 with alpha=0.75 (if 10 or fewer lines to plot), tab20 (if 20 or fewer lines) or batlow (if more than 20 lines).
linestyles (str, list[str]) – Linestyles to use for the formation energy lines, either as a single linestyle (str) or list of linestyles (list[str]) in the order of appearance of lines in the plot legend. Default is "-"; i.e. solid linestyle for all entries.
auto_labels (bool) – Whether to automatically label the transition levels with their charge states. If there are many transition levels, the labels can clutter the plot. (Default: False)
filename (PathLike) – Filename to save the plot to. (Default: None (not saved)).
- Returns:
matplotlib Figure object.
- doped.utils.plotting.get_colormap(colormap: str | Colormap | None = None, default: str = 'batlow') Colormap[source]
Get a colormap from a string or a Colormap object.
If _alpha_X is in the colormap name, sets the alpha value to X (0-1).
cmcrameri colour maps citation: https://zenodo.org/records/8409685
- Parameters:
colormap (str, matplotlib.colors.Colormap) – Colormap to use, either as a string (which can be a colormap name from https://www.fabiocrameri.ch/colourmaps or https://matplotlib.org/stable/users/explain/colors/colormaps), or a Colormap / ListedColormap object. If None (default), uses the default colormap (which is "batlow" by default). Append “S” to the colormap name if using a sequential colormap from https://www.fabiocrameri.ch/colourmaps.
default (str) – Default colormap to use if colormap is None. Defaults to "batlow" from https://www.fabiocrameri.ch/colourmaps.
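The `_alpha_X` naming convention described above can be illustrated with a minimal sketch. The helper name `split_alpha_suffix` is hypothetical, and it assumes the alpha marker appears as a trailing suffix of the name; it is not the `doped` implementation and does not touch `matplotlib`.

```python
from __future__ import annotations

import re


def split_alpha_suffix(colormap_name: str) -> tuple[str, float | None]:
    """Split a name like 'batlow_alpha_0.5' into ('batlow', 0.5).

    Hypothetical helper: assumes the '_alpha_X' marker is a suffix.
    """
    match = re.fullmatch(r"(.+)_alpha_([0-9]*\.?[0-9]+)", colormap_name)
    if match is None:
        return colormap_name, None  # no alpha marker in the name
    alpha = float(match.group(2))
    if not 0.0 <= alpha <= 1.0:
        raise ValueError(f"alpha must be between 0 and 1, got {alpha}")
    return match.group(1), alpha
```

The returned base name could then be looked up as a colormap and the alpha applied to its colours.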
- doped.utils.plotting.get_legend_font_size() float[source]
Convenience function to get the current matplotlib legend font size, in points (pt).
- Returns:
Current legend font size in points (pt).
- Return type:
float
- doped.utils.plotting.get_linestyles(linestyles: str | list[str] = '-', num_lines: int = 1) list[str][source]
Get a list of linestyles to use for plotting, from a string or list of strings (linestyles).
If a list is provided which doesn’t match the number of lines, the list is repeated until it does.
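As a rough illustration of the repeat-until-it-matches behaviour described above, the following sketch cycles through the supplied list. The function name `repeat_linestyles` is hypothetical and the code is an assumption about the behaviour, not the `doped` source.

```python
from itertools import cycle, islice


def repeat_linestyles(linestyles="-", num_lines=1):
    """Return a list of exactly ``num_lines`` linestyles."""
    if isinstance(linestyles, str):
        # A single linestyle string is used for every line.
        return [linestyles] * num_lines
    # A list is cycled (and truncated) until num_lines entries are produced.
    return list(islice(cycle(linestyles), num_lines))
```

For example, `repeat_linestyles(["-", "--"], 5)` alternates the two styles across five lines.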
- Parameters:
linestyles (str, list[str]) – Linestyles to use for plotting. If a string, uses that linestyle for all lines. If a list, uses each linestyle in the list for each line. Defaults to "-".
num_lines (int) – Number of lines to plot (and thus number of linestyles to output in list). Defaults to 1.
- doped.utils.plotting.plot_chemical_potential_table(ax: Axes, dft_chempots: dict[str, float], cellLoc: str = 'left', el_refs: dict[str, float] | None = None) table[source]
Plot a table of chemical potentials above the plot in ax.
- Parameters:
ax (plt.Axes) – Axes object to plot the table in.
dft_chempots (dict) – Dictionary of chemical potentials of the form {Element: value}.
cellLoc (str) – Alignment of text in cells. Default is “left”.
el_refs (dict) – Dictionary of elemental reference energies of the form {Element: value}. If provided, the chemical potentials are given with respect to these reference energies.
- Returns:
The matplotlib.table.Table object (which has been added to the ax object).
doped.utils.stenciling module
Utility functions to re-generate a relaxed defect structure in a different supercell.
- doped.utils.stenciling.get_defect_in_supercell(defect_entry: DefectEntry, target_supercell: Structure, check_bulk: bool = True, target_frac_coords: ndarray[float] | list[float] | bool = True, edge_tol: float = 1, min_dist_tol_factor: float = 0.99) tuple[Structure, Structure][source]
Re-generate a relaxed defect structure in a different supercell.
This function takes the relaxed defect structure of the input DefectEntry (from DefectEntry.defect_supercell) and re-generates it in the target_supercell structure, at the closest possible position to target_frac_coords (default is the supercell centre = [0.5, 0.5, 0.5]), also providing the corresponding bulk supercell (which should be the same for each generated defect supercell given the same target_supercell and base supercell for defect_entry, see note below).
target_supercell should be the same host crystal structure, just with different supercell dimensions, having the same lattice parameters and bond lengths.
Note: This function does not guarantee that the generated defect supercell atomic position basis exactly matches that of target_supercell, which may have come from a different primitive structure definition (e.g. CdTe with {"Cd": [0,0,0], "Te": [0.25,0.25,0.25]} vs {"Cd": [0,0,0], "Te": [0.75,0.75,0.75]}). The generated supercell will have the exact same lattice/cell definition with fully symmetry-equivalent atom positions, but if the actual position basis differs then this can cause issues with parsing finite-size corrections (which rely on site-matched potentials). This is perfectly fine if it occurs, but it requires the use of a matching bulk/reference supercell when parsing (rather than the input target_supercell) – doped will also throw a warning about this when parsing if a non-matching bulk supercell is used anyway. This function will automatically check if the position basis in the generated supercell differs from that of target_supercell, printing a warning if so (unless check_bulk is False) and returning the corresponding bulk supercell which should be used for parsing defect calculations with the generated supercell. Of course, if generating multiple defects in the same target_supercell, only one such bulk supercell calculation should be required (it should correspond to the same bulk supercell in each case).
Briefly, this function works by:
1. Translating the defect site to the centre of the original supercell.
2. Identifying a super-supercell which fully encompasses the target supercell (regardless of orientation).
3. Generating this super-supercell, using one copy of the original defect supercell (DefectEntry.defect_supercell), with the rest of the sites (outside of the original defect supercell box, with the defect translated to the centre) populated using the bulk supercell (DefectEntry.bulk_supercell).
4. Translating the defect site in this super-supercell to the Cartesian coordinates of the centre of target_supercell, then stencilling out all sites in the target_supercell portion of the super-supercell, accounting for possible site displacements in the relaxed defect supercell (e.g. if target_supercell has a different shape and does not fully encompass the original defect supercell). This is done by scanning over possible combinations of sites near the boundary regions of the target_supercell portion, and identifying the combination which maximises the minimum inter-atomic distance in the new supercell (i.e. the most bulk-like arrangement).
5. Re-orienting this new stenciled supercell to match the orientation and site positions of target_supercell.
6. If target_frac_coords is not False, scanning over all symmetry operations of target_supercell and applying that which places the defect site closest to target_frac_coords.
- Parameters:
defect_entry (DefectEntry) – A DefectEntry object for which to re-generate the relaxed structure (taken from DefectEntry.defect_supercell) in the target_supercell lattice.
target_supercell (Structure) – The supercell structure to re-generate the relaxed defect structure in.
check_bulk (bool) – Whether to check if the generated defect/bulk supercells have different atomic position bases to target_supercell (as described above) – if so, a warning will be printed (unless check_bulk is False). Default is True.
target_frac_coords (Union[np.ndarray[float], list[float], bool]) – The fractional coordinates to target for defect placement in the new supercell. If just set to True (default), will try to place the defect nearest to the centre of the superset cell (i.e. target_frac_coords = [0.5, 0.5, 0.5]), as is default in doped defect generation. Note that defect placement is harder in this case than in generation with DefectsGenerator, as we are not starting from primitive cells and we are working with relaxed geometries.
edge_tol (float) – A tolerance (in Angstrom) for site displacements at the edge of the stenciled supercell, when determining the best match of sites to stencil out in the new supercell (of target_supercell dimension). Default is 1 Angstrom, and then this is sequentially increased up to 4.5 Angstrom if the initial scan fails.
min_dist_tol_factor (float) – Tolerance factor when checking the minimum interatomic distance in the stenciled defect supercell, as a factor of the minimum distance in the original DefectEntry.defect_supercell. Default is 0.99.
- Returns:
The re-generated defect supercell in the target_supercell lattice, and the corresponding bulk/reference supercell for the generated defect supercell (see explanations above).
- Return type:
tuple[Structure, Structure]
- doped.utils.stenciling.is_within_frac_bounds(lattice: Lattice, cart_coords: ndarray[float] | list[float], tol: float = 1e-05) bool[source]
Check if a given Cartesian coordinate is inside the unit cell defined by the lattice object.
- Parameters:
lattice (Lattice) – Lattice object defining the unit cell.
cart_coords (Union[np.ndarray[float], list[float]]) – The Cartesian coordinates to check.
tol (float) – A tolerance (in Angstrom / Cartesian units) for coordinates to be considered within the unit cell. If positive, expands the bounds of the unit cell by this amount; if negative, shrinks the bounds.
- Returns:
Whether the Cartesian coordinates are within the fractional bounds of the unit cell, accounting for tol.
- Return type:
bool
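The bounds check described above can be sketched in plain Python as follows. This is an illustrative sketch, not the `doped` implementation: the lattice is passed as three row vectors rather than a `Lattice` object, the helper names are hypothetical, and converting the Å tolerance to a fractional one by dividing by each lattice vector length is an assumption.

```python
import math


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def cart_to_frac(lattice, cart):
    """Solve cart = f0*a + f1*b + f2*c for (f0, f1, f2) via Cramer's rule."""
    a, b, c = lattice
    volume = det3(a, b, c)
    return (det3(cart, b, c) / volume,
            det3(a, cart, c) / volume,
            det3(a, b, cart) / volume)


def within_frac_bounds(lattice, cart, tol=1e-5):
    """True if cart lies inside the cell, with bounds grown/shrunk by tol (Å)."""
    frac = cart_to_frac(lattice, cart)
    for f, vec in zip(frac, lattice):
        # Assumption: convert the Cartesian tolerance to a fractional one.
        frac_tol = tol / math.sqrt(sum(x * x for x in vec))
        if not (-frac_tol <= f <= 1 + frac_tol):
            return False
    return True
```

A negative `tol` shrinks the accepted region, so a point close to a cell face is rejected in that case.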
doped.utils.supercells module
Utility code and functions for generating & analysing defect supercells.
- doped.utils.supercells.cell_metric(cell_matrix: ndarray, target: str = 'SC', rms: bool = True, eff_cubic_length: float | None = None) float[source]
Calculates the deviation of the given cell matrix from an ideal simple cubic (if target = “SC”) or face-centred cubic (if target = “FCC”) matrix, by evaluating the root mean square (RMS) difference of the vector lengths from that of the idealised values (i.e. the corresponding SC/FCC lattice vector lengths for the given cell volume).
For target = “SC”, the idealised lattice vector length is the effective cubic length (i.e. the cube root of the volume), while for “FCC” it is 2^(1/6) (~1.12) times the effective cubic length.
This is an expanded version of the cell metric function in ASE (get_deviation_from_optimal_cell_shape), described in https://wiki.fysik.dtu.dk/ase/tutorials/defects/defects.html, which previously did not account for rotational invariance (now fixed; https://gitlab.com/ase/ase/-/merge_requests/3404) and has less flexibility.
- Parameters:
cell_matrix (np.ndarray) – Cell matrix for which to calculate the cell metric.
target (str) – Target cell shape, for which to calculate the normalised deviation score from. Either “SC” for simple cubic or “FCC” for face-centred cubic. Default = “SC”
rms (bool) – Whether to return the root mean square (RMS) difference of the vector lengths from that of the idealised values (default), or just the mean square difference (to reduce computation time when scanning over many possible matrices). Default = True
eff_cubic_length (float) – Effective cubic length of the cell matrix (to reduce computation time during looping). Default = None
- Returns:
Cell metric (0 is perfect score).
- Return type:
float
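The metric described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the `doped` source: the name `cell_metric_sketch` is hypothetical, the cell is given as three row vectors, and normalising each deviation by the ideal length is an assumption.

```python
import math


def cell_metric_sketch(cell, target="SC", rms=True):
    """(Root) mean square deviation of lattice vector lengths from ideal."""
    a, b, c = cell
    volume = abs(a[0] * (b[1] * c[2] - b[2] * c[1])
                 - a[1] * (b[0] * c[2] - b[2] * c[0])
                 + a[2] * (b[0] * c[1] - b[1] * c[0]))
    eff_cubic_length = volume ** (1 / 3)
    # Ideal vector length: L for SC, 2^(1/6) * L for FCC (as stated above).
    ideal = eff_cubic_length * (2 ** (1 / 6) if target == "FCC" else 1.0)
    lengths = [math.sqrt(sum(x * x for x in vec)) for vec in cell]
    # Assumption: deviations are normalised by the ideal length.
    mean_square = sum(((length - ideal) / ideal) ** 2 for length in lengths) / 3
    return math.sqrt(mean_square) if rms else mean_square
```

A cubic cell scores zero against the "SC" target, and the primitive FCC cell scores zero against the "FCC" target, since its vectors have length a/√2 = 2^(1/6) times its effective cubic length.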
- doped.utils.supercells.find_ideal_supercell(cell: ndarray, target_size: int, limit: int = 2, clean: bool = True, return_min_dist: bool = False, verbose: bool = False) ndarray | tuple[ndarray, float][source]
Given an input cell matrix (e.g. Structure.lattice.matrix or Atoms.cell) and chosen target_size (size of supercell in number of cells), finds an ideal supercell matrix (P) that yields the largest minimum image distance (i.e. minimum distance between periodic images of sites in a lattice), while also being as close to cubic as possible.
Supercell matrices are searched for by first identifying the ideal (fractional) transformation matrix (P) that would yield a perfectly cubic supercell with volume equal to target_size, and then scanning over all matrices where the elements are within +/-limit of the ideal P matrix elements (rounded to the nearest integer). For relatively small target_sizes (<100) and/or cells with mostly similar lattice vector lengths, the default limit of +/-2 performs very well. For larger target_sizes, cells with very different lattice vector lengths, and/or cases where small differences in minimum image distance are very important, a larger limit may be required (though this typically only improves the minimum image distance by 1-6%).
Finding the minimum image distance is also known as the Shortest Vector Problem (SVP), and has no known analytical solution, requiring enumeration type approaches. https://wikipedia.org/wiki/Lattice_problem#Shortest_vector_problem_(SVP)
Note that this function is used by default to generate defect supercells with the doped DefectsGenerator class, unless specific supercell settings are used.
- Parameters:
cell (np.ndarray) – Unit cell matrix for which to find a supercell.
target_size (int) – Target supercell size (in number of cells).
limit (int) – Supercell matrices are searched for by first identifying the ideal (fractional) transformation matrix (P) that would yield a perfectly SC/FCC supercell with volume equal to target_size, and then scanning over all matrices where the elements are within +/-limit of the ideal P matrix elements (rounded to the nearest integer). (Default = 2)
clean (bool) – Whether to return the supercell matrix which gives the ‘cleanest’ supercell (according to _lattice_matrix_sort_func; most symmetric, with mostly positive diagonals and c >= b >= a). (Default = True)
return_min_dist (bool) – Whether to return the minimum image distance (in Å) as a second return value. (Default = False)
verbose (bool) – Whether to print out extra information. (Default = False)
- Returns:
The supercell transformation matrix (P), and if return_min_dist is True, the minimum image distance (in Å).
- Return type:
np.ndarray | tuple[np.ndarray, float]
- doped.utils.supercells.find_optimal_cell_shape(cell: ndarray, target_size: int, target_shape: str = 'SC', limit: int = 2, return_score: bool = False, verbose: bool = False) ndarray | tuple[ndarray, float][source]
Find the transformation matrix that produces a supercell corresponding to target_size unit cells that most closely approximates the shape defined by target_shape.
This is an updated version of ASE’s find_optimal_cell_shape function, made rotationally invariant (since also fixed in ASE with MR 3404) and significantly more efficient. Candidate matrices are secondarily sorted by the (fixed) cell metric (in doped), and then by some other criteria to give the cleanest output.
Note: This function will be deprecated by the updates in https://gitlab.com/ase/ase/-/merge_requests/3616, which improves performance, and will be removed once that MR is merged. (TODO)
Finds the optimal supercell transformation matrix by calculating the deviation of the possible supercell matrices from an ideal simple cubic (if target = “SC”) or face-centred cubic (if target = “FCC”) matrix, and then taking that with the best (lowest) score by evaluating the root mean square (RMS) difference of the vector lengths from that of the idealised values (i.e. the corresponding SC/FCC lattice vector lengths for the given cell volume).
For target = “SC”, the idealised lattice vector length is the effective cubic length (i.e. the cube root of the volume), while for “FCC” it is 2^(1/6) (~1.12) times the effective cubic length.
- Parameters:
cell (np.ndarray) – Unit cell matrix for which to find a supercell transformation.
target_size (int) – Target supercell size (in number of cells).
target_shape (str) – Target cell shape, for which to calculate the normalised deviation score from. Either “SC” for simple cubic or “FCC” for face-centred cubic. Default = “SC”
limit (int) – Supercell matrices are searched for by first identifying the ideal (fractional) transformation matrix (P) that would yield a perfectly SC/FCC supercell with volume equal to target_size, and then scanning over all matrices where the elements are within +/-limit of the ideal P matrix elements (rounded to the nearest integer). (Default = 2)
return_score (bool) – Whether to return the cell metric score as a second return value. (Default = False)
verbose (bool) – Whether to print out extra information. (Default = False)
- Returns:
The supercell transformation matrix (P), and if return_score is True, the cell metric (where 0 is perfect score).
- Return type:
np.ndarray | tuple[np.ndarray, float]
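The search strategy described above (round the ideal fractional P, then scan integer matrices within +/-limit) can be sketched as a brute-force loop in plain Python. This is only an illustration under stated assumptions: all names are hypothetical, only the "SC" target is handled, lattice vectors are rows with supercell = P · cell, the score is an un-normalised sum of squared length deviations, and `limit=1` is used to keep the pure-Python scan fast. It omits the secondary sorting and efficiency improvements of the real function.

```python
import itertools
import math


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = det3(m)
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def find_cell_shape_sketch(cell, target_size, limit=1):
    """Integer P with det(P) == target_size giving the most cubic P @ cell."""
    ideal_length = (target_size * abs(det3(cell))) ** (1 / 3)
    # Ideal fractional P satisfies P_ideal @ cell = ideal_length * identity.
    start = [round(ideal_length * x) for row in inv3(cell) for x in row]
    best, best_score = None, math.inf
    for offsets in itertools.product(range(-limit, limit + 1), repeat=9):
        flat = [s + o for s, o in zip(start, offsets)]
        P = [flat[0:3], flat[3:6], flat[6:9]]
        if det3(P) != target_size:
            continue  # wrong supercell size
        supercell = [[sum(P[r][k] * cell[k][col] for k in range(3))
                      for col in range(3)] for r in range(3)]
        score = sum((math.sqrt(sum(x * x for x in vec)) - ideal_length) ** 2
                    for vec in supercell)
        if score < best_score:
            best, best_score = P, score
    return best
```

For a simple cubic input cell and `target_size=8`, the scan recovers the 2×2×2 diagonal expansion.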
- doped.utils.supercells.get_min_image_distance(structure: Structure) float[source]
Get the minimum image distance (i.e. minimum distance between periodic images of sites in a lattice) for the input structure.
This is also known as the Shortest Vector Problem (SVP), and has no known analytical solution, requiring enumeration type approaches. https://wikipedia.org/wiki/Lattice_problem#Shortest_vector_problem_(SVP)
- Parameters:
structure (Structure) – Structure object.
- Returns:
Minimum image distance.
- Return type:
float
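The enumeration-type approach mentioned above can be sketched as a bounded brute-force search over integer combinations of the lattice vectors. The name `min_image_distance_sketch` and the `search_range` parameter are hypothetical, and a small fixed range is only reliable for reasonably compact (e.g. reduced) cells; this is not the `doped` implementation.

```python
import itertools
import math


def min_image_distance_sketch(cell, search_range=2):
    """Shortest non-zero lattice vector length, by bounded enumeration."""
    shortest = math.inf
    indices = range(-search_range, search_range + 1)
    for n in itertools.product(indices, repeat=3):
        if n == (0, 0, 0):
            continue  # skip the zero vector (a site and itself)
        vec = [sum(n[i] * cell[i][k] for i in range(3)) for k in range(3)]
        shortest = min(shortest, math.sqrt(sum(x * x for x in vec)))
    return shortest
```

For an orthorhombic cell the result is simply the shortest cell edge, while for skewed cells a combination of lattice vectors can be shorter than any single one.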
- doped.utils.supercells.get_pmg_cubic_supercell_dict(struct: Structure, uc_range: tuple = (1, 200)) dict[source]
Get a dictionary of (near-)cubic supercell matrices for the given structure and range of numbers of unit cells (in the supercell).
Returns a dictionary of format:
{Number of Unit Cells: {"P": transformation matrix, "min_dist": minimum image distance} }
for (near-)cubic supercells generated by the pymatgen CubicSupercellTransformation class. If a (near-)cubic supercell cannot be found for a given number of unit cells, then the corresponding dict value will be set to an empty dict.
- Parameters:
struct (Structure) – Structure to generate supercells for.
uc_range (tuple) – Range of numbers of unit cells to search over.
- Returns:
{Number of Unit Cells: {"P": transformation matrix, "min_dist": minimum image distance}}
- Return type:
dict
- doped.utils.supercells.min_dist(structure: Structure, ignored_species: list[str] | None = None) float[source]
Return the minimum interatomic distance in a structure (ignoring any zero distances).
Uses numpy vectorisation for fast computation.
- Parameters:
structure (Structure) – The structure to check.
ignored_species (list[str]) – A list of species symbols to ignore when calculating the minimum interatomic distance. Default is None (don’t ignore any species).
- Returns:
The minimum interatomic distance in the structure.
- Return type:
float
doped.utils.symmetry module
Utility code and functions for symmetry analysis of structures and defects.
- doped.utils.symmetry.apply_symm_op_to_site(symm_op: SymmOp, site: PeriodicSite, fractional: bool = False, rotate_lattice: Lattice | bool = True, just_unit_cell_frac_coords: bool = False) PeriodicSite[source]
Apply the given symmetry operation to the input site (not in place) and return the new site.
By default, also rotates the lattice accordingly. If you want to apply the symmetry operation but keep the same lattice definition, set rotate_lattice=False.
- Parameters:
symm_op (SymmOp) – pymatgen SymmOp object.
site (PeriodicSite) – pymatgen PeriodicSite object.
fractional (bool) – If the SymmOp is in fractional or Cartesian (default) coordinates (i.e. to apply to site.frac_coords or site.coords). Default: False
rotate_lattice (Union[Lattice, bool]) – Either a pymatgen Lattice object (to use as the new lattice basis of the transformed site, which can be provided to reduce computation time when looping) or True/False. If True (default), the SymmOp rotation matrix will be applied to the input site lattice, or if False, the original lattice will be retained.
just_unit_cell_frac_coords (bool) – If True, just returns the fractional coordinates of the transformed site (rather than the site itself), within the unit cell. Default: False
- Returns:
Site with the symmetry operation applied.
- Return type:
PeriodicSite
- doped.utils.symmetry.apply_symm_op_to_struct(symm_op: SymmOp, struct: Structure, fractional: bool = False, rotate_lattice: bool = True) Structure[source]
Apply a symmetry operation to a structure and return the new structure.
This differs from pymatgen’s apply_operation method in that it does not apply the operation in place (i.e. it does not modify the input structure), which avoids the use of unnecessary and slow Structure.copy() calls, making the structure manipulation / symmetry analysis functions more efficient. It also fixes an issue when applying fractional symmetry operations.
By default, also rotates the lattice accordingly. If you want to apply the symmetry operation to the sites but keep the same lattice definition, set rotate_lattice=False.
- Parameters:
symm_op – pymatgen SymmOp object.
struct – pymatgen Structure object.
fractional – If the SymmOp is in fractional or Cartesian (default) coordinates (i.e. to apply to site.frac_coords or site.coords). Default: False
rotate_lattice – If the lattice of the input structure should be rotated according to the symmetry operation. Default: True.
- Returns:
Structure with the symmetry operation applied.
- Return type:
Structure
- doped.utils.symmetry.are_equivalent_lattices(lattice_1: Lattice | Structure, lattice_2: Lattice | Structure, ltol: float = 0.005, atol: float = 1) bool[source]
Check if two lattices are (symmetry-)equivalent, allowing for different cell sizes.
- Parameters:
lattice_1 (Lattice | Structure) – The first lattice to check for equivalence.
lattice_2 (Lattice | Structure) – The second lattice to check for equivalence.
ltol (float) – Fractional tolerance for matching lattice vector lengths. Defaults to 5e-3 (i.e. 0.5% tolerance).
atol (float) – Tolerance for matching angles. Defaults to 1 degree.
- Returns:
True if the two lattices are (symmetry-)equivalent, False otherwise.
- Return type:
bool
- doped.utils.symmetry.cached_simplify(eq)[source]
Cached simplification function for sympy equations, for efficiency.
- doped.utils.symmetry.cached_solve(equation, variable)[source]
Cached solve function for sympy equations, for efficiency.
- doped.utils.symmetry.cluster_coords(fcoords: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | complex | bytes | str | _NestedSequence[complex | bytes | str], structure: Structure | Lattice, dist_tol: float = 0.01, method: str = 'single', criterion: str = 'distance') ndarray[source]
Cluster fractional coordinates based on their distances (using scipy functions) and return the cluster numbers (as an array matching the shape and order of fcoords).
method chooses the clustering algorithm to use with linkage() ("single" by default, matching the scipy default), along with a dist_tol distance tolerance in Å. "single" corresponds to the Nearest Point algorithm and is the recommended choice for method when dist_tol is small, but can be sensitive to how many fractional coordinates are included in fcoords (allowing for daisy-chaining of sites to give large spaced-out clusters), while "centroid" or "ward" are good choices to avoid this issue.
See the scipy API docs for more info.
- Parameters:
fcoords (ArrayLike) – Fractional coordinates to cluster.
structure (Structure | Lattice) – Structure or lattice to which the fractional coordinates correspond.
dist_tol (float) – Distance tolerance for clustering, in Å (default: 0.01). For the most part, fractional coordinates with distances less than this tolerance will be clustered together (when method = "single", giving the Nearest Point algorithm, as is the default).
method (str) – Clustering algorithm to use with linkage() (default: "single").
criterion (str) – Criterion to use for flattening hierarchical clusters from the linkage matrix, used with fcluster() (default: "distance").
- Returns:
Array of cluster numbers, matching the shape and order of fcoords (i.e. corresponding to the index/number of the cluster to which that fractional coordinate belongs).
- Return type:
np.ndarray
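The daisy-chaining behaviour of "single" linkage noted above can be illustrated without scipy. With a distance cut-off, single linkage should give the same grouping as joining any two points closer than the tolerance and taking connected groups. The sketch below does this with a small union-find; the function name is hypothetical, and it works on Cartesian points without periodic boundary conditions, which is a simplifying assumption.

```python
import math


def single_linkage_clusters(points, dist_tol=0.01):
    """Cluster numbers (starting at 1) for points chained within dist_tol."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= dist_tol:
                parent[find(i)] = find(j)  # merge the two clusters

    labels = {}
    return [labels.setdefault(find(i), len(labels) + 1)
            for i in range(len(points))]
```

Three points spaced 0.009 Å apart along a line end up in one cluster with `dist_tol=0.01`, even though the end points are 0.018 Å apart. This is the chaining effect that "centroid" or "ward" linkage avoids.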
- doped.utils.symmetry.cluster_sites_by_dist_tol(sites: Iterable[PeriodicSite | ndarray[float]], structure: Structure | Lattice, dist_tol: float = 0.01, method: str = 'single', criterion: str = 'distance') list[PeriodicSite | ndarray[float]][source]
Cluster sites based on their distances (using cluster_coords).
- Parameters:
sites (Iterable[PeriodicSite | np.ndarray[float]]) – Sites to cluster, as an iterable of PeriodicSite objects or fractional coordinates.
structure (Structure | Lattice) – Structure or lattice to which the sites correspond.
dist_tol (float) – Distance tolerance for clustering, in Å (default: 0.01).
method (str) – Clustering algorithm to use with scipy's linkage() clustering function in cluster_coords (default: "single").
criterion (str) – Criterion to use for flattening hierarchical clusters from the linkage matrix, used with fcluster() (default: "distance").
- Returns:
List of clustered sites, as PeriodicSite objects or fractional coordinates depending on the input sites type.
- Return type:
list[PeriodicSite | np.ndarray[float]]
- doped.utils.symmetry.doped_cluster_frac_coords(fcoords: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | complex | bytes | str | _NestedSequence[complex | bytes | str], structure: Structure, tol: float = 0.55, symm_pref_dist_factor: float = 0.85, method: str = 'centroid', criterion: str = 'distance') ndarray[tuple[Any, ...], dtype[_ScalarT]][source]
Cluster fractional coordinates that are within a certain distance tolerance of each other, and return the cluster site.
Modified from the pymatgen-analysis-defects function as follows: For each site cluster, the possible sites to choose from are the sites in the cluster and the cluster midpoint (average position). Of these sites, the site with the highest symmetry, and then largest min_dist (distance to any host lattice site), is chosen, provided its min_dist is at least symm_pref_dist_factor (0.85 by default) times the largest possible min_dist. This is because we want to favour the higher symmetry interstitial sites (as these are typically the more intuitive sites for placement, cleaner, easier for analysis etc, and work well when combined with ShakeNBreak or other structure-searching techniques to account for symmetry-breaking), but also interstitials are often lowest-energy when furthest from host atoms (i.e. in the largest interstitial voids – particularly for fully-ionised charge states), and so this approach tries to strike a balance between these two goals.
In pymatgen-analysis-defects, the average cluster position is used, which breaks symmetries and is less easy to manipulate in the following interstitial generation functions.
- Parameters:
fcoords (ArrayLike) – Fractional coordinates of points to cluster.
structure (Structure) – The host structure.
tol (float) – Distance tolerance for clustering Voronoi nodes. Default is 0.55 Å.
symm_pref_dist_factor (float) – Minimum acceptable ratio of distance to host atoms for symmetry-favoured sites vs distance-to-host-favoured sites, for which to prefer symmetry-favoured sites. Default is 0.85.
method (str) – Clustering algorithm to use with linkage() (default: "centroid", better than the scipy default of "single" for interstitial generation to avoid daisy-chaining clusters).
criterion (str) – Criterion to use for flattening hierarchical clusters from the linkage matrix, used with fcluster() (default: "distance").
- Returns:
Clustered fractional coordinates.
- Return type:
np.typing.NDArray
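One plausible reading of the site selection rule, based on the symm_pref_dist_factor description as a minimum acceptable distance ratio, is sketched below. Everything here is an assumption for illustration: the function name, the candidate dictionaries, and the use of an integer `symm_order` as a stand-in for site symmetry are hypothetical, and the `doped` source may order or filter candidates differently.

```python
def pick_cluster_site(candidates, symm_pref_dist_factor=0.85):
    """Pick the highest-symmetry candidate that is not too close to the host.

    Each candidate is a dict with hypothetical keys:
      "symm_order": larger means higher site symmetry
      "min_dist":   distance (Å) to the nearest host lattice site
    """
    largest_min_dist = max(c["min_dist"] for c in candidates)
    # Keep only candidates whose min_dist is an acceptable fraction of the best.
    allowed = [c for c in candidates
               if c["min_dist"] >= symm_pref_dist_factor * largest_min_dist]
    # Among those, prefer higher symmetry, then larger min_dist.
    return max(allowed, key=lambda c: (c["symm_order"], c["min_dist"]))
```

With this reading, a high-symmetry midpoint is chosen over a lower-symmetry node unless the midpoint sits markedly closer to the host atoms.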
- doped.utils.symmetry.get_BCS_conventional_structure(structure: Structure, pbar: tqdm | None = None, return_wyckoff_dict: bool = False) tuple[Structure, ndarray] | tuple[Structure, ndarray, dict[str, ndarray]][source]
Get the conventional crystal structure of the input structure, according to the Bilbao Crystallographic Server (BCS) definition.
Also returns an array of the lattice vector swaps (used with swap_axes) to convert from the spglib (SpaceGroupAnalyzer) conventional structure definition to the BCS definition.
- Parameters:
structure (Structure) – Structure for which to get the corresponding BCS conventional crystal structure.
pbar (ProgressBar) – tqdm progress bar object, to update progress. Default is None.
return_wyckoff_dict (bool) – Whether to return the Wyckoff label dict (as {Wyckoff label: coordinates}).
- Returns:
A tuple of the BCS conventional structure of the input structure, the lattice vector swapping array and, if return_wyckoff_dict is True, the Wyckoff label dict.
- Return type:
tuple[Structure, np.ndarray] | tuple[Structure, np.ndarray, dict[str, np.ndarray]]
- doped.utils.symmetry.get_all_equiv_sites(frac_coords: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | complex | bytes | str | _NestedSequence[complex | bytes | str], structure: Structure, symprec: float = 0.01, dist_tol_factor: float = 1.0, species: str = 'X', just_frac_coords: bool = False, return_symprec_and_dist_tol_factor: bool = False, fixed_symprec_and_dist_tol_factor: bool = False, verbose: bool = False) list[PeriodicSite | ndarray] | tuple[list[PeriodicSite | ndarray], float][source]
Get a list of all equivalent sites of the input fractional coordinates in structure.
Tries to use hashing and caching to accelerate if possible.
- Parameters:
frac_coords (ArrayLike) – Fractional coordinates to get equivalent sites of.
structure (Structure) – Structure to use for the lattice, to which the fractional coordinates correspond, and for determining symmetry operations if not provided.
symprec (float) – Symmetry precision to use for determining symmetry operations. Default is 0.01. If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor values), and setting return_symprec_and_dist_tol_factor to True will return the final symprec (and dist_tol_factor) used for the equivalent site generation.
dist_tol_factor (float) – Distance tolerance for clustering generated sites (to ensure they are truly distinct), as a multiplicative factor of symprec. Default is 1.0 (i.e. dist_tol = symprec, in Å). If fixed_symprec_and_dist_tol_factor is False (default), this value will also be automatically adjusted if necessary (up to 10x, down to 0.1x) (after symprec adjustments) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled dist_tol_factor (and symprec) values, and setting return_symprec_and_dist_tol_factor to True will return the final symprec (and dist_tol_factor) used for the equivalent site generation.
species (str) – Species to use for the equivalent sites (default: “X”).
just_frac_coords (bool) – If True, just returns the fractional coordinates of the equivalent sites (rather than pymatgen PeriodicSite objects). Default: False.
return_symprec_and_dist_tol_factor (bool) – If True, returns the final symmetry precision and distance tolerance factor used for the equivalent site generation (see symprec and dist_tol_factor argument descriptions). Default is False.
fixed_symprec_and_dist_tol_factor (bool) – If True, uses the provided symprec and dist_tol_factor values without any automatic adjustments (see symprec and dist_tol_factor argument descriptions). Default is False.
verbose (bool) – If True, prints information on the trialled symprec and dist_tol_factor values, and the identified equivalent sites. Default is False.
- Returns:
List of equivalent sites of the input fractional coordinates in structure, either as pymatgen PeriodicSite objects or as fractional coordinates (depending on the value of just_frac_coords).
- Return type:
list[PeriodicSite | np.ndarray]
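The generate-then-cluster idea behind dist_tol_factor can be illustrated with a minimal, library-free sketch. It applies a hand-written list of symmetry operations to one fractional coordinate, wraps the images into the unit cell, and discards images that coincide within a tolerance. This is not how doped obtains its operations (those come from spglib), the helper names apply_op, periodic_diff and unique_images are hypothetical, and the tolerance here is in fractional units, whereas the documented dist_tol is in Å.

```python
def apply_op(rotation, translation, frac):
    """Apply a symmetry operation (3x3 rotation + translation) and wrap into [0, 1)."""
    new = [sum(rotation[i][j] * frac[j] for j in range(3)) + translation[i] for i in range(3)]
    return [x % 1.0 for x in new]


def periodic_diff(a, b):
    """Largest per-axis fractional separation, measured to the nearest periodic image."""
    return max(abs((x - y + 0.5) % 1.0 - 0.5) for x, y in zip(a, b))


def unique_images(frac, ops, tol=1e-3):
    """Generate all images of `frac` under `ops`, keeping only distinct ones."""
    images = []
    for rotation, translation in ops:
        candidate = apply_op(rotation, translation, frac)
        if all(periodic_diff(candidate, kept) > tol for kept in images):
            images.append(candidate)
    return images


# Assumed toy symmetry: identity, inversion and a two-fold rotation about z.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]
two_fold_z = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]
ops = [(matrix, [0, 0, 0]) for matrix in (identity, inversion, two_fold_z)]

general = unique_images([0.1, 0.2, 0.3], ops)  # general position: every image is distinct
special = unique_images([0.5, 0.5, 0.5], ops)  # special position: all images coincide
print(len(general), len(special))
```

A site on a special position maps onto itself under some operations, which is why the clustering step is needed to avoid returning duplicates.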
- doped.utils.symmetry.get_clean_structure(structure: Structure, return_T: bool = False, dist_precision: float = 0.001, niggli_reduce: bool = True) Structure | tuple[Structure, ndarray][source]
Get a ‘clean’ version of the input structure by searching over equivalent cells and selecting the optimal one according to _lattice_matrix_sort_func (most symmetric, with mostly positive diagonals and c >= b >= a).
- Parameters:
structure (Structure) – Structure object.
return_T (bool) – Whether to return the transformation matrix from the original structure lattice to the new structure lattice (T * Orig = New). (Default = False)
dist_precision (float) – The desired distance precision in Å for rounding of lattice parameters and fractional coordinates. (Default: 0.001)
niggli_reduce (bool) – Whether to Niggli reduce the lattice before searching for the optimal lattice matrix. If this is set to False, we also skip the search for the best positive determinant lattice matrix. (Default: True)
- Returns:
The ‘clean’ version of the input structure, or a tuple of the ‘clean’ structure and the transformation matrix from the original structure lattice to the new structure lattice (T * Orig = New).
- Return type:
Structure | tuple[Structure, np.ndarray]
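The relation T * Orig = New can be made concrete with a small sketch. It builds a permutation matrix T that reorders lattice vectors so that c >= b >= a, then applies it to the original lattice matrix (rows are lattice vectors). This only illustrates the transformation-matrix convention and the length ordering; it is not the _lattice_matrix_sort_func search itself, and sort_axes_transformation is a hypothetical helper.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def length(vector):
    return math.sqrt(sum(x * x for x in vector))


def sort_axes_transformation(lattice):
    """Permutation matrix T such that the rows of T * lattice satisfy c >= b >= a."""
    order = sorted(range(3), key=lambda i: length(lattice[i]))
    return [[1 if j == order[i] else 0 for j in range(3)] for i in range(3)]


orig = [[6.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 5.0]]  # rows: a, b, c
T = sort_axes_transformation(orig)
new = matmul(T, orig)  # T * Orig = New
print([length(v) for v in new])
```

A general permutation can flip the handedness of the cell (determinant -1), which is one reason a real implementation would also check the sign of the determinant, as the niggli_reduce description hints.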
- doped.utils.symmetry.get_conv_cell_site(defect_entry: DefectEntry) PeriodicSite | None[source]
Gets an equivalent site of the defect entry in the conventional structure of the host material. If the conventional_structure attribute is not defined for defect_entry, then it is generated using SpacegroupAnalyzer and then reoriented to match the Bilbao Crystallographic Server’s conventional structure definition.
- Parameters:
defect_entry – DefectEntry object.
- Returns:
The equivalent site of the defect entry in the conventional structure of the host material, or None if not found.
- Return type:
PeriodicSite | None
- doped.utils.symmetry.get_distance_matrix(fcoords: ArrayLike, lattice: Lattice) ndarray[source]
Get a matrix of the distances between the input fractional coordinates in the input lattice.
- Parameters:
fcoords (ArrayLike) – Fractional coordinates to get distances between.
lattice (Lattice) – Lattice for the fractional coordinates.
- Returns:
Matrix of distances between the input fractional coordinates in the input lattice.
- Return type:
np.ndarray
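As an illustration of what a periodic distance matrix contains, the following sketch computes minimum-image distances between fractional coordinates using only the standard library. It assumes the lattice is given as three row vectors in Å and searches the 27 neighbouring images, which is adequate for cells that are not strongly skewed. The function names are hypothetical and this is not the doped implementation.

```python
import math
from itertools import product


def to_cartesian(frac, lattice):
    """Convert fractional to Cartesian coordinates (lattice rows are the cell vectors)."""
    return [sum(frac[i] * lattice[i][j] for i in range(3)) for j in range(3)]


def min_image_distance(frac_1, frac_2, lattice):
    """Shortest distance between two sites, considering neighbouring periodic images."""
    diff = [(a - b) - round(a - b) for a, b in zip(frac_1, frac_2)]
    best = math.inf
    for shift in product((-1, 0, 1), repeat=3):
        vector = to_cartesian([d + s for d, s in zip(diff, shift)], lattice)
        best = min(best, math.sqrt(sum(x * x for x in vector)))
    return best


def distance_matrix(fcoords, lattice):
    return [[min_image_distance(a, b, lattice) for b in fcoords] for a in fcoords]


lattice = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]  # 4 Å cubic cell
fcoords = [[0.0, 0.0, 0.0], [0.9, 0.0, 0.0], [0.5, 0.5, 0.5]]
dmat = distance_matrix(fcoords, lattice)
```

The first two sites are 0.1 apart in fractional terms across the cell boundary, so their separation is about 0.4 Å rather than 3.6 Å.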
- doped.utils.symmetry.get_equiv_frac_coords_in_primitive(frac_coords: ArrayLike, primitive: Structure, supercell: Structure, symprec: float = 0.01, dist_tol_factor: float = 1.0, equiv_coords: bool = True, return_symprec_and_dist_tol_factor: bool = False, fixed_symprec_and_dist_tol_factor: bool = False, verbose: bool = False) list[ndarray] | ndarray | tuple[list[ndarray] | ndarray, float, float][source]
Get equivalent fractional coordinates of frac_coords (in supercell) in the given primitive cell.
Returns a list of equivalent fractional coords in the primitive cell if equiv_coords is True (default).
Note that there may be multiple possible symmetry-equivalent sites, all of which are returned if equiv_coords is True, otherwise the first site in the list (sorted using _frac_coords_sort_func) is returned.
- Parameters:
frac_coords (ArrayLike) – Fractional coordinates in the supercell, for which to get equivalent coordinates in the primitive cell.
primitive (Structure) – Primitive cell structure.
supercell (Structure) – Supercell structure.
symprec (float) – Symmetry precision to use for determining symmetry operations. Default is 0.01. If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor values), and setting return_symprec_and_dist_tol_factor to True will return the final symprec (and dist_tol_factor) used for the equivalent site generation.
dist_tol_factor (float) – Distance tolerance for clustering generated sites (to ensure they are truly distinct), as a multiplicative factor of symprec. Default is 1.0 (i.e. dist_tol = symprec, in Å). If fixed_symprec_and_dist_tol_factor is False (default), this value will also be automatically adjusted if necessary (up to 10x, down to 0.1x), after symprec adjustments, until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled dist_tol_factor (and symprec) values, and setting return_symprec_and_dist_tol_factor to True will return the final symprec (and dist_tol_factor) used for the equivalent site generation.
equiv_coords (bool) – If True, returns a list of equivalent fractional coords in the primitive cell. If False, returns the first equivalent fractional coordinates in the list, sorted using _frac_coords_sort_func. Default: True.
return_symprec_and_dist_tol_factor (bool) – If True, returns the final symmetry precision and distance tolerance factor used for the equivalent site generation (see symprec and dist_tol_factor argument descriptions). Default is False.
fixed_symprec_and_dist_tol_factor (bool) – If True, uses the provided symprec and dist_tol_factor values without any automatic adjustments (see symprec and dist_tol_factor argument descriptions). Default is False.
verbose (bool) – If True, prints information on the trialled symprec and dist_tol_factor values, and the identified equivalent sites. Default is False.
- Returns:
List of equivalent fractional coordinates in the primitive cell, or the first equivalent fractional coordinate in the list (sorted using _frac_coords_sort_func), depending on the value of equiv_coords. If return_symprec_and_dist_tol_factor is True, also returns the final symprec and dist_tol_factor used for the equivalent site generation.
- Return type:
list[np.ndarray] | np.ndarray | tuple[list[np.ndarray] | np.ndarray, float, float]
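The geometric core of mapping a supercell coordinate into the primitive cell can be sketched as follows, under a simplifying assumption that the documented function does not require: the supercell lattice is an integer matrix M times the primitive lattice, with the same orientation (rows as lattice vectors). In that case f_prim = f_super · M, wrapped into [0, 1). The helper name is hypothetical, and the symmetry-equivalent alternatives that the real function returns are not generated here.

```python
def supercell_to_primitive_frac(frac_super, supercell_matrix):
    """Map supercell fractional coordinates to primitive ones: f_prim = f_super . M (mod 1)."""
    frac_prim = [
        sum(frac_super[i] * supercell_matrix[i][j] for i in range(3)) for j in range(3)
    ]
    # Round before wrapping so values such as 0.9999999999 do not survive as near-1 artefacts.
    return [round(x, 8) % 1.0 for x in frac_prim]


M = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]  # assumed 2x2x2 supercell expansion
site_a = supercell_to_primitive_frac([0.125, 0.125, 0.125], M)
site_b = supercell_to_primitive_frac([0.625, 0.125, 0.625], M)
print(site_a, site_b)
```

Both supercell sites are periodic copies of the same primitive-cell position, so they map to the same fractional coordinate.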
- doped.utils.symmetry.get_min_dist_between_equiv_sites(site_1: PeriodicSite | Sequence[float] | Defect | DefectEntry, site_2: PeriodicSite | Sequence[float] | Defect | DefectEntry, structure: Structure | None = None, symprec: float = 0.01, dist_tol_factor: float = 1.0, return_symprec_and_dist_tol_factor: bool = False, fixed_symprec_and_dist_tol_factor: bool = False, verbose: bool = False) float | tuple[float, float, float][source]
Get the minimum distance (in Å) between equivalent sites of two input site/Defect/DefectEntry objects in a structure.
- Parameters:
site_1 (PeriodicSite | Sequence[float, float, float] | Defect | DefectEntry) – First site to get equivalent sites of, to determine minimum distance to equivalent sites of site_2. Can be a PeriodicSite object, a sequence of fractional coordinates, or a Defect/DefectEntry object.
site_2 (PeriodicSite | Sequence[float, float, float] | Defect | DefectEntry) – Second site to get equivalent sites of, to determine minimum distance to equivalent sites of site_1. Can be a PeriodicSite object, a sequence of fractional coordinates, or a Defect/DefectEntry object.
structure (Structure) – Structure to use for determining symmetry-equivalent sites of site_1 and site_2. Required if site_1 and site_2 are not Defect or DefectEntry objects. Default: None.
symprec (float) – Symmetry precision to use for determining symmetry operations. Default is 0.01. If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor values), and setting return_symprec_and_dist_tol_factor to True will return the final symprec (and dist_tol_factor) used for the equivalent site generation.
dist_tol_factor (float) – Distance tolerance for clustering generated sites (to ensure they are truly distinct), as a multiplicative factor of symprec. Default is 1.0 (i.e. dist_tol = symprec, in Å). If fixed_symprec_and_dist_tol_factor is False (default), this value will also be automatically adjusted if necessary (up to 10x, down to 0.1x), after symprec adjustments, until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled dist_tol_factor (and symprec) values, and setting return_symprec_and_dist_tol_factor to True will return the final symprec (and dist_tol_factor) used for the equivalent site generation.
return_symprec_and_dist_tol_factor (bool) – If True, returns the final symmetry precision and distance tolerance factor used for the equivalent site generation (see symprec and dist_tol_factor argument descriptions). Default is False.
fixed_symprec_and_dist_tol_factor (bool) – If True, uses the provided symprec and dist_tol_factor values without any automatic adjustments (see symprec and dist_tol_factor argument descriptions). Default is False.
verbose (bool) – If True, prints information on the trialled symprec and dist_tol_factor values, and the identified equivalent sites. Default is False.
- Returns:
Minimum distance (in Å) between equivalent sites of site_1 and site_2, or a tuple of (minimum distance, symprec, dist_tol_factor) if return_symprec_and_dist_tol_factor is True.
- Return type:
float | tuple[float, float, float]
- doped.utils.symmetry.get_orientational_degeneracy(defect_entry: DefectEntry | None = None, relaxed_point_group: str | None = None, bulk_site_point_group: str | None = None, symprec: float = 0.1, bulk_symprec: float = 0.01, **kwargs) float[source]
Get the orientational degeneracy factor for a given relaxed DefectEntry, by supplying either the DefectEntry object or the bulk-site & relaxed defect point group symbols (e.g. “Td”, “C3v” etc.).
If a DefectEntry is supplied (and the point group symbols are not), this is computed by determining the relaxed defect point symmetry and the (unrelaxed) bulk site symmetry, and then getting the ratio of their point group orders (equivalent to the ratio of partition functions or number of symmetry operations (i.e. degeneracy)).
For interstitials, the bulk site symmetry corresponds to the point symmetry of the interstitial site with no relaxation of the host structure, while for vacancies/substitutions it is simply the symmetry of their corresponding bulk site. This corresponds to the point symmetry of DefectEntry.defect, or calculation_metadata["bulk_site"]/["unrelaxed_defect_structure"].
Note: This tries to use the defect_entry.defect_supercell to determine the relaxed site symmetry. However, this is not guaranteed to work in all cases; namely for non-diagonal supercell expansions, or sometimes for non-scalar supercell expansion matrices (e.g. a 2x1x2 expansion), particularly with high-symmetry materials, which can break the periodicity of the cell. doped tries to automatically check if this is the case, and will warn you if so.
This can also be checked by using this function on your doped generated defects:
from doped.generation import get_defect_name_from_entry

for defect_name, defect_entry in defect_gen.items():
    print(
        defect_name,
        get_defect_name_from_entry(defect_entry, relaxed=False),
        get_defect_name_from_entry(defect_entry),
        "\n",
    )
And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry (and orientational degeneracy) – otherwise periodicity-breaking prevents this.
If periodicity-breaking prevents auto-symmetry determination, you can manually determine the relaxed defect and bulk-site point symmetries, and/or orientational degeneracy, from visualising the structures (e.g. using VESTA), and then set the corresponding values in the calculation_metadata['relaxed point symmetry']/['bulk site symmetry'] and/or degeneracy_factors['orientational degeneracy'] attributes; get_orientational_degeneracy can be used to obtain the corresponding orientational degeneracy factor for given defect/bulk-site point symmetries. Note that the bulk-site point symmetry corresponds to that of DefectEntry.defect, or equivalently calculation_metadata["bulk_site"]/["unrelaxed_defect_structure"], which for vacancies/substitutions is the symmetry of the corresponding bulk site, while for interstitials it is the point symmetry of the final relaxed interstitial site when placed in the (unrelaxed) bulk structure. The degeneracy factor is used in the calculation of defect/carrier concentrations and Fermi level behaviour (discussion in https://doi.org/10.1039/D2FD00043A, https://doi.org/10.1039/D3CS00432E, https://doi.org/10.1038/s41578-025-00879-y…).
- Parameters:
defect_entry (DefectEntry) – DefectEntry object. (Default = None)
relaxed_point_group (str) – Point group symmetry (e.g. “Td”, “C3v” etc.) of the relaxed defect structure, if already calculated / manually determined. Default is None (automatically calculated by doped).
bulk_site_point_group (str) – Point group symmetry (e.g. “Td”, “C3v” etc.) of the defect site in the bulk, if already calculated / manually determined. For vacancies/substitutions, this should match the site symmetry label from doped when generating the defect, while for interstitials it should be the point symmetry of the final relaxed interstitial site, when placed in the bulk structure. Default is None (automatically calculated by doped).
symprec (float) – Symmetry precision to use for determining symmetry operations and thus point symmetries with spglib, for the relaxed point symmetry. Default is 0.1, which matches that used by the Materials Project and is larger than the pymatgen default of 0.01 to account for residual structural noise in relaxed defect supercells. You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.). If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor values).
bulk_symprec (float) – Symmetry precision to use for determining symmetry operations and thus point symmetries with spglib, for the unrelaxed (bulk site) point symmetry. Default is 0.01, which matches the pymatgen default. You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.). If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor values).
**kwargs – Additional keyword arguments to pass to get_all_equiv_sites, such as dist_tol_factor, fixed_symprec_and_dist_tol_factor, and verbose.
- Returns:
Orientational degeneracy factor for the defect.
- Return type:
float
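The ratio of point group orders described above can be worked through by hand. The sketch below uses a small lookup table of standard point group orders and assumes the ratio is taken as (bulk site order) / (relaxed defect order), so that symmetry lowering on relaxation gives a factor greater than 1. The table and function name are illustrative, not part of doped.

```python
# Number of symmetry operations (group order) for a few common point groups.
POINT_GROUP_ORDER = {"C1": 1, "Cs": 2, "C2v": 4, "C3v": 6, "D2d": 8, "Td": 24, "Oh": 48}


def orientational_degeneracy(bulk_site_point_group, relaxed_point_group):
    """Ratio of bulk-site to relaxed-defect point group orders (assumed convention)."""
    return POINT_GROUP_ORDER[bulk_site_point_group] / POINT_GROUP_ORDER[relaxed_point_group]


# A defect on a Td site that relaxes to C3v has 24 / 6 = 4 equivalent orientations.
print(orientational_degeneracy("Td", "C3v"))
# No symmetry lowering gives a factor of 1.
print(orientational_degeneracy("Td", "Td"))
```

Because the result is a ratio, it can also fall below 1 if the relaxed structure has higher symmetry than the unrelaxed site, which is consistent with the float return type.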
- doped.utils.symmetry.get_primitive_structure(structure: Structure, ignored_species: list | None = None, clean: bool = True, return_all: bool = False, **kwargs)[source]
Get a consistent/deterministic primitive structure from a pymatgen Structure.
For some materials (e.g. zinc blende), there are multiple equivalent primitive cells (e.g. Cd (0,0,0) & Te (0.25,0.25,0.25); Cd (0,0,0) & Te (0.75,0.75,0.75) for F-43m CdTe), so for reproducibility and in line with most structure conventions/definitions, take the one with the cleanest lattice and structure definition, according to struct_sort_func.
If ignored_species is set, then the sorting function used to determine the ideal primitive structure will ignore sites with species in ignored_species.
- Parameters:
structure (Structure) – Structure to get the corresponding primitive structure of.
ignored_species (list | None) – List of species to ignore when determining the ideal primitive structure. (Default: None)
clean (bool) – Whether to return a ‘clean’ version of the primitive structure, with the lattice matrix in a standardised form. (Default: True)
return_all (bool) – Whether to return all possible primitive structures tested, sorted by the sorting function. (Default: False)
**kwargs – Additional keyword arguments to pass to the get_sga function (e.g. symprec etc).
- Returns:
The primitive structure of the input structure, or a list of all possible primitive structures tested, sorted by the sorting function.
- Return type:
Structure | list[Structure]
- doped.utils.symmetry.get_sga(struct: Structure, symprec: float = 0.01) SpacegroupAnalyzer[source]
Get a SpacegroupAnalyzer object of the input structure, dynamically adjusting symprec if needs be.
Note that by default, magnetic symmetry (i.e. MAGMOMs) is not used in symmetry analysis in doped, as noise in these values (particularly in structures from the Materials Project) often leads to incorrect symmetry determinations. To use magnetic moments in symmetry analyses, set the environment variable USE_MAGNETIC_SYMMETRY=1 (i.e. os.environ["USE_MAGNETIC_SYMMETRY"] = "1" in Python).
- Parameters:
struct (Structure) – The input structure.
symprec (float) – The symmetry precision to use (default: 0.01).
- Returns:
The symmetry analyzer object.
- Return type:
SpacegroupAnalyzer
- doped.utils.symmetry.get_sga_and_symprec(struct: Structure, symprec: float = 0.01) tuple[SpacegroupAnalyzer, float][source]
Get a SpacegroupAnalyzer object of the input structure, dynamically adjusting symprec if needs be, and the final successful symprec used for SpacegroupAnalyzer initialisation.
Note that by default, magnetic symmetry (i.e. MAGMOMs) is not used in symmetry analysis in doped, as noise in these values (particularly in structures from the Materials Project) often leads to incorrect symmetry determinations. To use magnetic moments in symmetry analyses, set the environment variable USE_MAGNETIC_SYMMETRY=1 (i.e. os.environ["USE_MAGNETIC_SYMMETRY"] = "1" in Python).
- Parameters:
struct (Structure) – The input structure.
symprec (float) – The symmetry precision to use (default: 0.01).
- Returns:
Tuple of the SpacegroupAnalyzer object and the final symprec used.
- Return type:
tuple[SpacegroupAnalyzer, float]
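The "adjust symprec until it works, then report the value used" pattern can be sketched generically. The source does not state which trial values get_sga_and_symprec uses, so the factor list below is purely an assumption (bounded by the 0.1x to 10x range quoted elsewhere in this module for equivalent-site generation), and both first_successful_symprec and toy_analyser are hypothetical stand-ins rather than doped or pymatgen functions.

```python
def first_successful_symprec(analyse, symprec=0.01, factors=(1.0, 2.0, 0.5, 5.0, 0.2, 10.0, 0.1)):
    """Call `analyse(trial)` for each trial symprec; return (result, symprec_used)."""
    last_error = None
    for factor in factors:
        trial = symprec * factor
        try:
            return analyse(trial), trial
        except ValueError as exc:  # assumed failure mode of the analyser
            last_error = exc
    raise ValueError(f"No trial symprec succeeded (last error: {last_error})")


def toy_analyser(symprec):
    """Stand-in for a symmetry analysis that fails when the tolerance is too tight."""
    if symprec < 0.04:
        raise ValueError("symmetry search failed")
    return f"analysed with symprec={symprec:g}"


result, used = first_successful_symprec(toy_analyser)
print(result, used)
```

Returning the successful tolerance alongside the result lets downstream calls reuse the same value instead of repeating the search.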
- doped.utils.symmetry.get_spglib_conv_structure(sga: SpacegroupAnalyzer) tuple[Structure, SpacegroupAnalyzer][source]
Get a consistent/deterministic conventional structure from a SpacegroupAnalyzer object. Also returns the corresponding SpacegroupAnalyzer (for getting Wyckoff symbols corresponding to this conventional structure definition).
For some materials (e.g. zinc blende), there are multiple equivalent primitive/conventional cells, so for reproducibility and in line with most structure conventions/definitions, take the one with the lowest summed norm of the fractional coordinates of the sites (i.e. favour Cd (0,0,0) and Te (0.25,0.25,0.25) over Cd (0,0,0) and Te (0.75,0.75,0.75) for F-43m CdTe; SGN 216).
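The "lowest summed norm of the fractional coordinates" criterion can be checked directly on the CdTe example. The sketch represents each candidate cell as a list of (species, fractional coordinates) pairs, which is an assumed toy representation rather than a pymatgen Structure, and picks the candidate with the smallest summed norm.

```python
import math


def summed_frac_norm(sites):
    """Sum of the Euclidean norms of the fractional coordinates of all sites."""
    return sum(math.sqrt(sum(x * x for x in frac)) for _, frac in sites)


candidates = [
    [("Cd", (0.0, 0.0, 0.0)), ("Te", (0.75, 0.75, 0.75))],
    [("Cd", (0.0, 0.0, 0.0)), ("Te", (0.25, 0.25, 0.25))],
]
chosen = min(candidates, key=summed_frac_norm)
print(chosen)
```

The Te (0.25,0.25,0.25) setting wins because its summed norm (about 0.43) is lower than that of the Te (0.75,0.75,0.75) setting (about 1.30).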
- doped.utils.symmetry.get_wyckoff(frac_coords: ArrayLike, struct: Structure, equiv_sites: bool = False, symprec: float = 0.01, **kwargs) str | tuple[source]
Get the Wyckoff label of the input fractional coordinates in the input structure. If the symmetry operations of the structure have already been computed, these can be input as a list to speed up the calculation.
- Parameters:
frac_coords (ArrayLike) – Fractional coordinates of the site to get the Wyckoff label of.
struct (Structure) – Structure to which frac_coords corresponds.
equiv_sites (bool) – If True, returns a tuple of (Wyckoff label, list of equivalent sites). Default is False.
symprec (float) – Symmetry precision to use for determining symmetry operations. Default is 0.01. If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor values).
**kwargs – Additional keyword arguments to pass to get_all_equiv_sites, such as dist_tol_factor, fixed_symprec_and_dist_tol_factor, and verbose.
- Returns:
The Wyckoff label of the input fractional coordinates in the structure. If equiv_sites is True, also returns a list of equivalent sites in the structure.
- Return type:
str | tuple
- doped.utils.symmetry.get_wyckoff_dict_from_sgn(sgn: int) dict[str, list[list[Expr]]][source]
Get dictionary of {Wyckoff label: coordinates} for a given space group number.
The database used here for Wyckoff analysis (wyckpos.dat) was obtained from code written by JaeHwan Shim @schinavro (ORCID: 0000-0001-7575-4788) (https://gitlab.com/ase/ase/-/merge_requests/1035) based on the tabulated datasets in https://github.com/xtalopt/randSpg (also found at https://github.com/spglib/spglib/blob/develop/database/Wyckoff.csv). By default, doped uses the Wyckoff functionality of spglib (along with symmetry operations in pymatgen) when possible, however.
- Parameters:
sgn (int) – Space group number.
- Returns:
Dictionary of Wyckoff labels and their corresponding coordinates.
- Return type:
dict[str, list[list[float]]]
- doped.utils.symmetry.get_wyckoff_label_and_equiv_coord_list(defect_entry: DefectEntry | None = None, conv_cell_site: PeriodicSite | None = None, sgn: int | None = None, wyckoff_dict: dict | None = None) tuple[str, list[list[float]]][source]
Return the Wyckoff label and list of equivalent fractional coordinates within the conventional cell for the input defect_entry or conv_cell_site (whichever is provided, defaults to defect_entry if both), given a dictionary of Wyckoff labels and coordinates (wyckoff_dict).
If wyckoff_dict is not provided, it is generated from the spacegroup number (sgn) using get_wyckoff_dict_from_sgn(sgn). If sgn is not provided, it is obtained from the bulk structure of the defect_entry if provided.
- doped.utils.symmetry.group_order_from_schoenflies(sch_symbol)[source]
Return the order of the point group from the Schoenflies symbol.
Useful for symmetry and orientational degeneracy analysis.
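For reference, point group orders follow simple rules in Schoenflies notation: C_n has order n, C_nv and C_nh have order 2n, D_n has order 2n, D_nh and D_nd have order 4n, S_n (n even) has order n, and the cubic and icosahedral groups have fixed orders. The sketch below encodes these rules; it is an independent illustration with a hypothetical function name, not the doped implementation, and it accepts only plain ASCII symbols such as "C3v".

```python
import re

# Orders of groups that do not follow an n-dependent rule.
FIXED_ORDERS = {"C1": 1, "Ci": 2, "Cs": 2, "T": 12, "Td": 24, "Th": 24,
                "O": 24, "Oh": 48, "I": 60, "Ih": 120}


def group_order(symbol):
    """Order (number of symmetry operations) of a point group, by Schoenflies symbol."""
    if symbol in FIXED_ORDERS:
        return FIXED_ORDERS[symbol]
    match = re.fullmatch(r"([CDS])(\d+)([vhd]?)", symbol)
    if match is None:
        raise ValueError(f"Unrecognised Schoenflies symbol: {symbol!r}")
    family, n, suffix = match.group(1), int(match.group(2)), match.group(3)
    if family == "S" and not suffix:
        return n  # S_n (n even) is cyclic, so its order is n
    if family == "C" and suffix in ("", "v", "h"):
        return n if suffix == "" else 2 * n  # C_n: n; C_nv and C_nh: 2n
    if family == "D" and suffix in ("", "h", "d"):
        return 2 * n if suffix == "" else 4 * n  # D_n: 2n; D_nh and D_nd: 4n
    raise ValueError(f"Unrecognised Schoenflies symbol: {symbol!r}")


for name in ("C3v", "D2d", "D4h", "S4", "Td"):
    print(name, group_order(name))
```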
- doped.utils.symmetry.is_periodic_image(sites_1: Iterable[PeriodicSite | ndarray], sites_2: Iterable[PeriodicSite | ndarray], frac_tol: float = 0.01, same_image: bool = False) bool[source]
Determine if the PeriodicSite/frac_coords in sites_1 are a periodic image of those in sites_2.
This function determines if the set of fractional coordinates in sites_1 is a periodic image of the set in sites_2, with only unique site matches permitted (i.e. no repeat matches; each site can only have one match).
If same_image is True, then the sites must all be of the same periodic image translation (i.e. the same rigid translation vector), such that sites_1 can be rigidly translated by any combination of lattice vectors to match the set of fractional coordinates in sites_2.
Note that this function tests if the full set of sites is a periodic image of the other, and not just that each site in sites_1 is (individually) a periodic image of a site in sites_2 (for which the PeriodicSite.is_periodic_image method could be used).
- Parameters:
sites_1 (list) – List of PeriodicSites or frac_coords arrays.
sites_2 (list) – List of PeriodicSites or frac_coords arrays.
frac_tol (float) – Fractional coordinate tolerance for comparing sites.
same_image (bool) – If True, also check that the sites are the same periodic image translation (i.e. the same rigid translation vector). Default is False.
- Returns:
True if sites_1 is a periodic image of sites_2, False otherwise.
- Return type:
bool
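The set-wise test described above, including the effect of same_image, can be sketched on plain fractional coordinates. The version below uses greedy one-to-one matching, which is a simplification that could mis-pair sites lying within the tolerance of each other, and the function name is hypothetical; it is meant only to show the logic, not to reproduce the doped implementation.

```python
def periodic_image_sets(fracs_1, fracs_2, frac_tol=0.01, same_image=False):
    """True if every site in fracs_1 matches a distinct site in fracs_2 up to lattice shifts."""
    if len(fracs_1) != len(fracs_2):
        return False
    unmatched = list(fracs_2)
    shifts = []
    for a in fracs_1:
        for idx, b in enumerate(unmatched):
            delta = [x - y for x, y in zip(b, a)]
            nearest = [round(d) for d in delta]  # candidate integer lattice translation
            if all(abs(d - n) <= frac_tol for d, n in zip(delta, nearest)):
                shifts.append(tuple(nearest))
                del unmatched[idx]  # each site may be matched only once
                break
        else:
            return False  # no partner found for this site
    # With same_image, all matched pairs must share one rigid translation vector.
    return len(set(shifts)) <= 1 if same_image else True


sites_1 = [[0.1, 0.2, 0.3], [0.6, 0.6, 0.6]]
mixed_shift = [[1.1, 0.2, 0.3], [0.6, 0.6, 0.6]]  # only the first site is shifted
rigid_shift = [[1.1, 0.2, 0.3], [1.6, 0.6, 0.6]]  # both sites shifted by (1, 0, 0)

print(periodic_image_sets(sites_1, mixed_shift))
print(periodic_image_sets(sites_1, mixed_shift, same_image=True))
print(periodic_image_sets(sites_1, rigid_shift, same_image=True))
```

The mixed_shift set is a periodic image site by site, but not under a single rigid translation, which is the distinction same_image captures.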
- doped.utils.symmetry.point_symmetry_from_defect(defect: Defect, symprec: float = 0.01, **kwargs) str[source]
Get the defect site point symmetry from a Defect object.
Note that this is intended only to be used for unrelaxed, as-generated Defect objects (rather than parsed defects).
- Parameters:
defect (Defect) – Defect object.
symprec (float) – Symmetry precision to use for determining symmetry operations and thus point symmetries. Default is 0.01. If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor values).
**kwargs – Additional keyword arguments to pass to get_all_equiv_sites, such as dist_tol_factor, fixed_symprec_and_dist_tol_factor, and verbose.
- Returns:
Defect point symmetry.
- Return type:
str
- doped.utils.symmetry.point_symmetry_from_defect_entry(defect_entry: DefectEntry, symprec: float | None = None, relaxed: bool = True, verbose: bool | None = None, return_periodicity_breaking: bool = False, **kwargs) str | tuple[str, bool][source]
Get the defect site point symmetry from a DefectEntry object.
Note: If relaxed = True (default), then this tries to use the defect_entry.defect_supercell to determine the site symmetry. This will thus give the relaxed defect point symmetry if this is a DefectEntry created from parsed defect calculations. However, this is not guaranteed to work in all cases; namely for non-diagonal supercell expansions, or sometimes for non-scalar supercell expansion matrices (e.g. a 2x1x2 expansion), particularly with high-symmetry materials, which can break the periodicity of the cell. doped tries to automatically check if this is the case, and will warn you if so.
This can also be checked by using this function on your doped generated defects:
from doped.generation import get_defect_name_from_entry

for defect_name, defect_entry in defect_gen.items():
    print(
        defect_name,
        get_defect_name_from_entry(defect_entry, relaxed=False),
        get_defect_name_from_entry(defect_entry),
        "\n",
    )
And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry – otherwise periodicity-breaking prevents this.
If periodicity-breaking prevents auto-symmetry determination, you can manually determine the relaxed defect and bulk-site point symmetries, and/or orientational degeneracy, from visualising the structures (e.g. using VESTA), and then set the corresponding values in the calculation_metadata['relaxed point symmetry']/['bulk site symmetry'] and/or degeneracy_factors['orientational degeneracy'] attributes; get_orientational_degeneracy can be used to obtain the corresponding orientational degeneracy factor for given defect/bulk-site point symmetries. Note that the bulk-site point symmetry corresponds to that of DefectEntry.defect, or equivalently calculation_metadata["bulk_site"]/["unrelaxed_defect_structure"], which for vacancies/substitutions is the symmetry of the corresponding bulk site, while for interstitials it is the point symmetry of the final relaxed interstitial site when placed in the (unrelaxed) bulk structure. The degeneracy factor is used in the calculation of defect/carrier concentrations and Fermi level behaviour (discussion in https://doi.org/10.1039/D2FD00043A, https://doi.org/10.1039/D3CS00432E, https://doi.org/10.1038/s41578-025-00879-y…).
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
symprec (float) – Symmetry precision to use for determining symmetry operations and thus point symmetries with spglib. Default is 0.01 for unrelaxed structures, 0.1 for relaxed (to account for residual structural noise, matching that used by the Materials Project). You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.). If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor values).
relaxed (bool) – If False, determines the site symmetry using the defect site in the unrelaxed bulk supercell (i.e. the bulk site symmetry), otherwise tries to determine the point symmetry of the relaxed defect in the defect supercell. Default is True.
verbose (bool) – If None (default) or True, prints a warning if the supercell is detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry). True corresponds to higher verbosity, where information on trialled symprec and dist_tol_factor values in equivalent site generation is also printed. Default is None.
return_periodicity_breaking (bool) – If True, also returns a boolean specifying if the supercell has been detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry) or not. Mainly for internal doped usage. Default is False.
**kwargs – Additional keyword arguments to pass to get_all_equiv_sites, such as dist_tol_factor and fixed_symprec_and_dist_tol_factor.
- Returns:
Defect point symmetry (and if return_periodicity_breaking = True, a boolean specifying if the supercell has been detected to break the crystal periodicity).
- Return type:
str | tuple[str, bool]
- doped.utils.symmetry.point_symmetry_from_site(site: PeriodicSite | ndarray | list, structure: Structure, coords_are_cartesian: bool = False, symprec: float = 0.01, **kwargs) str[source]
Get the point symmetry of a site in a structure.
- Parameters:
site (Union[PeriodicSite, np.ndarray, list]) – Site for which to determine the point symmetry. Can be a PeriodicSite object, or a list or numpy array of the coordinates of the site (fractional coordinates by default, or Cartesian if coords_are_cartesian = True).
structure (Structure) – Structure object for which to determine the point symmetry of the site.
coords_are_cartesian (bool) – If True, the site coordinates are assumed to be in Cartesian coordinates. Default is False.
symprec (float) – Symmetry precision to use for determining symmetry operations and thus point symmetries with spglib. Default is 0.01. You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.). If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor values).
**kwargs – Additional keyword arguments to pass to get_all_equiv_sites, such as dist_tol_factor, fixed_symprec_and_dist_tol_factor, and verbose.
- Returns:
Site point symmetry.
- Return type:
str
- doped.utils.symmetry.point_symmetry_from_structure(structure: Structure, bulk_structure: Structure | None = None, symprec: float | None = None, relaxed: bool = True, verbose: bool | None = None, return_periodicity_breaking: bool = False, skip_atom_mapping_check: bool = False, **kwargs) str | tuple[str, bool][source]
Get the point symmetry of a given structure.
Note: For certain non-trivial supercell expansions, the broken cell periodicity can break the site symmetry and lead to incorrect point symmetry determination (particularly if using non-scalar supercell matrices with high symmetry materials). If the unrelaxed bulk structure (bulk_structure) is also supplied, then doped will determine the defect site and then automatically check if this is the case, and warn you if so.
This can also be checked by using this function on your doped generated defects:
from doped.generation import get_defect_name_from_entry for defect_name, defect_entry in defect_gen.items(): print(defect_name, get_defect_name_from_entry(defect_entry, relaxed=False), get_defect_name_from_entry(defect_entry), "\n")
And if the point symmetries match in each case, then using this function on your parsed relaxed
DefectEntryobjects should correctly determine the final relaxed defect symmetry – otherwise periodicity-breaking prevents this.If
bulk_structureis supplied andrelaxedis set toFalse, then returns the bulk site symmetry of the defect, which for vacancies/substitutions is the symmetry of the corresponding bulk site, while for interstitials it is the point symmetry of the final relaxed interstitial site when placed in the (unrelaxed) bulk structure.- Parameters:
structure (Structure) – Structure object for which to determine the point symmetry.
bulk_structure (Structure) – Structure object of the bulk structure, if known. Default is None. If provided and relaxed = True, will be used to check if the supercell is breaking the crystal periodicity (and thus preventing accurate determination of the relaxed defect point symmetry) and warn you if so.
symprec (float) – Symmetry precision to use for determining symmetry operations and thus point symmetries with spglib. Default is 0.01 for unrelaxed structures, 0.1 for relaxed (to account for residual structural noise, matching that used by the Materials Project). You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.). If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor) values.
relaxed (bool) – If False, determines the site symmetry using the defect site in the unrelaxed bulk supercell (i.e. the bulk site symmetry), otherwise tries to determine the point symmetry of the relaxed defect in the defect supercell. Default is True.
verbose (bool) – If None (default) or True, prints a warning if the supercell is detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry). True corresponds to higher verbosity, where information on trialled symprec and dist_tol_factor values in equivalent site generation is also printed. Default is None.
return_periodicity_breaking (bool) – If True, also returns a boolean specifying if the supercell has been detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry) or not. Default is False.
skip_atom_mapping_check (bool) – If True, skips the atom mapping check which ensures that the bulk and defect supercell lattice definitions are matched (important for accurate defect site determination and charge corrections). Can be used to speed up parsing when you are sure the cell definitions match (e.g. both supercells were generated with doped). Default is False.
**kwargs – Additional keyword arguments to pass to get_all_equiv_sites, such as dist_tol_factor and fixed_symprec_and_dist_tol_factor.
- Returns:
Structure point symmetry (and if return_periodicity_breaking = True, a boolean specifying if the supercell has been detected to break the crystal periodicity).
- Return type:
str, or tuple[str, bool] if return_periodicity_breaking = True
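Because the shape of the return value depends on return_periodicity_breaking, calling code may find it convenient to normalise the result before using it. A minimal sketch follows; unpack_point_symmetry is a hypothetical helper written for this illustration and is not part of doped.

```python
from typing import Optional, Tuple, Union


def unpack_point_symmetry(
    result: Union[str, Tuple[str, bool]],
) -> Tuple[str, Optional[bool]]:
    """Normalise either documented return shape to (symbol, flag).

    The flag is None when only the point symmetry string was returned,
    i.e. when the periodicity-breaking check result was not requested.
    """
    if isinstance(result, tuple):
        symbol, periodicity_breaking = result
        return symbol, periodicity_breaking
    return result, None
```

With this helper, downstream code can always unpack two values and treat a None flag as "not checked" rather than as "no periodicity breaking".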
- doped.utils.symmetry.schoenflies_from_hermann(herm_symbol)[source]
Convert from Hermann-Mauguin to Schoenflies.
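For reference, the correspondence between the two notations for the 32 crystallographic point groups can be written as a plain lookup table. This is a sketch under stated assumptions rather than the doped implementation: the names HM_TO_SCHOENFLIES and hermann_to_schoenflies are hypothetical, and only short Hermann-Mauguin point group symbols (with a leading minus sign for rotoinversion axes) are handled.

```python
# Short Hermann-Mauguin symbol -> Schoenflies symbol for the 32
# crystallographic point groups. "-3" is written here as C3i; it is
# also commonly denoted S6.
HM_TO_SCHOENFLIES = {
    "1": "C1", "-1": "Ci",
    "2": "C2", "m": "Cs", "2/m": "C2h",
    "222": "D2", "mm2": "C2v", "mmm": "D2h",
    "4": "C4", "-4": "S4", "4/m": "C4h",
    "422": "D4", "4mm": "C4v", "-42m": "D2d", "4/mmm": "D4h",
    "3": "C3", "-3": "C3i", "32": "D3", "3m": "C3v", "-3m": "D3d",
    "6": "C6", "-6": "C3h", "6/m": "C6h",
    "622": "D6", "6mm": "C6v", "-6m2": "D3h", "6/mmm": "D6h",
    "23": "T", "m-3": "Th", "432": "O", "-43m": "Td", "m-3m": "Oh",
}


def hermann_to_schoenflies(symbol: str) -> str:
    """Look up the Schoenflies symbol for a short Hermann-Mauguin symbol."""
    key = symbol.replace(" ", "")  # tolerate spaces such as "m -3 m"
    if key not in HM_TO_SCHOENFLIES:
        raise ValueError(f"Unrecognised point group symbol: {symbol!r}")
    return HM_TO_SCHOENFLIES[key]
```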
- doped.utils.symmetry.summed_rms_dist(struct_a: Structure, struct_b: Structure, ignored_species: list[str] | None = None) float[source]
Get the summed root-mean-square (RMS) distance between the sites of two structures, in Å.
Note that this assumes the lattices of the two structures are equal!
- Parameters:
struct_a – pymatgen Structure object.
struct_b – pymatgen Structure object.
ignored_species – List of species to ignore when calculating the RMS distance (default: None).
- Returns:
The summed RMS distance between the sites of the two structures, in Å.
- Return type:
float
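To illustrate why equal lattices matter here, the sketch below sums per-site periodic distances for two sets of fractional coordinates that share one cell. It rests on several assumptions that go beyond the description above: sites are taken to be matched by index, the cell is taken to be orthorhombic (so wrapping each fractional component independently gives the nearest image), and the name summed_site_distance is hypothetical. It is not the doped implementation.

```python
import math
from typing import Sequence


def summed_site_distance(
    frac_a: Sequence[Sequence[float]],
    frac_b: Sequence[Sequence[float]],
    cell_lengths: Sequence[float],
) -> float:
    """Sum of nearest-image distances between index-matched sites.

    Assumes both coordinate sets refer to the same orthorhombic cell
    with edge lengths ``cell_lengths`` (in Angstrom).
    """
    if len(frac_a) != len(frac_b):
        raise ValueError("Both structures must have the same number of sites.")
    total = 0.0
    for site_a, site_b in zip(frac_a, frac_b):
        squared = 0.0
        for fa, fb, length in zip(site_a, site_b, cell_lengths):
            delta = fb - fa
            delta -= round(delta)  # wrap to the nearest periodic image
            squared += (delta * length) ** 2
        total += math.sqrt(squared)
    return total
```

If the two lattices differed, the same fractional displacement would correspond to different Cartesian lengths in each structure, and a single summed distance would no longer be well defined.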
- doped.utils.symmetry.swap_axes(structure: Structure, axes: list[int] | tuple[int, ...]) Structure[source]
Swap axes of the given structure.
The new order of the axes is given by the axes parameter. For example, axes=(2, 1, 0) will swap the first and third axes.
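The meaning of the axes argument can be shown with plain lists. The sketch below reorders the lattice vectors and the fractional coordinates with the same permutation, which is one common way to relabel axes; whether doped applies the permutation in exactly this way is an assumption, and permute_axes is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def permute_axes(
    lattice: Sequence[Sequence[float]],
    frac_coords: Sequence[Sequence[float]],
    axes: Sequence[int],
) -> Tuple[List[List[float]], List[List[float]]]:
    """Reorder lattice vectors and fractional coordinates by ``axes``.

    New axis i is taken from old axis axes[i]; the inputs are not modified.
    """
    if sorted(axes) != [0, 1, 2]:
        raise ValueError("axes must be a permutation of (0, 1, 2).")
    new_lattice = [list(lattice[i]) for i in axes]
    new_frac = [[site[i] for i in axes] for site in frac_coords]
    return new_lattice, new_frac
```

Because the lattice vectors and the fractional coordinates are permuted together, each site keeps the same Cartesian position in this sketch; only the labelling of the a, b and c axes changes.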
- doped.utils.symmetry.translate_structure(structure: Structure, vector: ndarray, frac_coords: bool = True, to_unit_cell: bool = True) Structure[source]
Translate a structure and its sites by a given vector (not in place).
- Parameters:
structure – pymatgen Structure object.
vector – Translation vector, fractional or Cartesian.
frac_coords – Whether the input vector is in fractional coordinates. (Default: True)
to_unit_cell – Whether to translate the sites to the unit cell. (Default: True)
- Returns:
pymatgen Structure object with translated sites.
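The effect of frac_coords=True together with to_unit_cell=True can be sketched on bare fractional coordinates. This is an illustration only, with translate_frac_coords as a hypothetical helper name; it works on nested lists rather than pymatgen objects and is not the doped implementation.

```python
from typing import List, Sequence


def translate_frac_coords(
    frac_coords: Sequence[Sequence[float]],
    vector: Sequence[float],
    to_unit_cell: bool = True,
) -> List[List[float]]:
    """Return translated copies of the fractional coordinates.

    The input is left untouched (not in place). With ``to_unit_cell``,
    each component is wrapped back into the interval [0, 1).
    """
    translated = []
    for site in frac_coords:
        new_site = [c + v for c, v in zip(site, vector)]
        if to_unit_cell:
            new_site = [c % 1.0 for c in new_site]
        translated.append(new_site)
    return translated
```

A Cartesian translation vector would first need converting to fractional coordinates using the lattice, which this sketch deliberately leaves out.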
Module contents
Submodule for utility functions in doped.