doped.utils package
Submodules
doped.utils.configurations module
Utility functions for generating and parsing configurational coordinate (CC) diagrams, for potential energy surfaces (PESs), Nudged Elastic Band (NEB), non-radiative recombination calculations etc.
- doped.utils.configurations.get_dist_equiv_stol(dist: float, structure: Structure) float [source]
Get the equivalent stol value for a given Cartesian distance (dist) in a given Structure.
stol is a site tolerance parameter used in pymatgen StructureMatcher functions, defined as the fraction of the average free length per atom := ( V / Nsites ) ** (1/3).
- Parameters:
dist (float) – Cartesian distance in Å.
structure (Structure) – Structure to calculate stol for.
- Returns:
Equivalent stol value for the given distance.
- Return type:
float
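The conversion described above can be sketched in a few lines. This is a minimal illustration of the stated definition only; dist_equiv_stol is a hypothetical stand-alone helper that takes the cell volume and site count directly, rather than a pymatgen Structure as the doped function does.

```python
def dist_equiv_stol(dist, volume, n_sites):
    """Convert a Cartesian distance (Å) to the equivalent stol value.

    stol is measured in units of the average free length per atom,
    (V / Nsites) ** (1/3), so the conversion is a simple division.
    Hypothetical stand-alone helper (not the doped function itself).
    """
    avg_free_length = (volume / n_sites) ** (1 / 3)
    return dist / avg_free_length


# Example: a 1000 Å^3 cell with 8 sites has an average free length of 5 Å,
# so a 0.5 Å distance corresponds to stol of approximately 0.1.
stol = dist_equiv_stol(0.5, volume=1000.0, n_sites=8)
```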
- doped.utils.configurations.get_min_stol_for_s1_s2(struct1: Structure, struct2: Structure, **sm_kwargs) float [source]
Get the minimum possible stol value which will give a match between struct1 and struct2 using StructureMatcher, based on the ranges of per-element minimum interatomic distances in the two structures.
- Parameters:
struct1 (Structure) – Initial structure.
struct2 (Structure) – Final structure.
**sm_kwargs – Additional keyword arguments to pass to StructureMatcher(). Just used to check if ignored_species or comparator has been set here.
- Returns:
Minimum stol value for a match between struct1 and struct2. If a direct match is detected (corresponding to a minimum stol of 0), then 1e-4 is returned.
- Return type:
float
- doped.utils.configurations.get_path_structures(struct1: Structure, struct2: Structure, n_images: int | ndarray | list[float] = 7, displacements: ndarray | list[float] | None = None, displacements2: ndarray | list[float] | None = None) dict[str, Structure] | tuple[dict[str, Structure], dict[str, Structure]] [source]
Generate a series of interpolated structures along the linear path between struct1 and struct2, typically for use in NEB calculations or configuration coordinate (CC) diagrams.
Structures are output as a dictionary with keys corresponding to either the index of the interpolated structure (0-indexed; 00, 01 etc as for VASP NEB calculations) or the fractional displacement along the interpolation path between structures, and values corresponding to the interpolated structure. If displacements is set (and thus two sets of structures are generated), a tuple of such dictionaries is returned.
Note that for NEB calculations, the lattice vectors and order of sites (atomic indices) must be consistent in both struct1 and struct2. This can be ensured by using the orient_s2_like_s1() function in doped.utils.configurations, as shown in the doped tutorials. This is also desirable for CC diagrams, as the atomic indices are assumed to match for many parsing and plotting functions (e.g. in nonrad and CarrierCapture.jl), but is not strictly necessary. If the input structures are detected to be different (symmetry-inequivalent) geometries (e.g. not a simple defect migration between two symmetry-equivalent sites), but have mis-matching orientations/positions (such that they do not correspond to the shortest linear path between them), a warning will be raised. See the doped configuration coordinate / NEB path generation tutorial for a deeper explanation.
If only n_images is set (and displacements is None) (default), then only one set of interpolated structures is generated (in other words, assuming a standard NEB/PES calculation is being performed). If displacements (and possibly displacements2) is set, then two sets of interpolated structures are generated (in other words, assuming a CC / non-radiative recombination calculation is being performed, where the two sets of structures are to be calculated in separate charge/spin etc states).
- Parameters:
struct1 (Structure) – Initial structure.
struct2 (Structure) – Final structure.
n_images (int) – Number of images to interpolate between struct1 and struct2, or a list of fractional interpolation values (displacements) to use. Note that n_images is ignored if displacements is set (in which case CC / non-radiative recombination calculations are assumed, otherwise a standard NEB / PES calculation is assumed). Default: 7
displacements (np.ndarray or list) – Displacements to use for struct1 along the linear transformation path to struct2. If set, then CC / non-radiative recombination calculations are assumed, and two sets of interpolated structures will be generated. If set and displacements2 is not set, then the same set of displacements is used for both sets of interpolated structures. Default: None
displacements2 (np.ndarray or list) – Displacements to use for struct2 along the linear transformation path to struct1. If not set and displacements is not None, then the same set of displacements is used for both sets of interpolated structures. Default: None
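As a rough illustration of linear interpolation between two structures with matching site order, the sketch below works on plain lists of fractional coordinates. interpolate_frac_coords and its n_segments argument are hypothetical (not part of doped), and including both end points in the output is an assumption made for the example rather than a statement about how get_path_structures counts images.

```python
def interpolate_frac_coords(coords1, coords2, n_segments):
    """Linearly interpolate between two matched lists of fractional coordinates.

    Returns a dict keyed "00", "01", ... with both end points included
    (n_segments + 1 entries). Hypothetical helper, not the doped implementation.
    """
    path = {}
    for i in range(n_segments + 1):
        x = i / n_segments  # fractional position along the path
        image = []
        for site1, site2 in zip(coords1, coords2):
            new_site = []
            for c1, c2 in zip(site1, site2):
                delta = c2 - c1
                delta -= round(delta)  # shortest displacement under periodicity
                new_site.append((c1 + x * delta) % 1.0)
            image.append(new_site)
        path[f"{i:02d}"] = image
    return path


# The second site crosses the cell boundary (0.9 -> 0.1 along a)
coords1 = [[0.0, 0.0, 0.0], [0.9, 0.5, 0.5]]
coords2 = [[0.0, 0.0, 0.0], [0.1, 0.5, 0.5]]
path = interpolate_frac_coords(coords1, coords2, n_segments=4)
```

The wrap-around step only gives the shortest path when the two structures already share atomic indices and orientation, which is why the site order consistency noted above matters.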
- doped.utils.configurations.get_s2_like_s1(struct1: Structure, struct2: Structure, verbose: bool = False, **sm_kwargs)
Re-orient struct2 to a fully symmetry-equivalent orientation (i.e. without changing the actual geometry) to match the orientation of struct1 as closely as possible, with matching atomic indices as needed for VASP NEB calculations and other structural transformation analyses (e.g. configuration coordinate (CC) diagrams via nonrad, CarrierCapture.jl etc.).
This corresponds to minimising the root-mean-square displacement from the shortest _linear_ path to transform from struct1 to a symmetry-equivalent definition of struct2. Uses the StructureMatcher.get_s2_like_s1() method from pymatgen, but extended to ensure the correct atomic indices matching and lattice vector definitions.
If verbose=True, information about the mass-weighted displacement (ΔQ in amu^(1/2)Å) between the input and re-oriented structures is printed. This is the typical x-axis unit in configurational coordinate diagrams (see e.g. 10.1103/PhysRevB.90.075202).
- Parameters:
struct1 (Structure) – Initial structure
struct2 (Structure) – Final structure
verbose (bool) – Print information about the mass-weighted displacement (ΔQ in amu^(1/2)Å) between the input and re-oriented structures. Default: False
**sm_kwargs – Additional keyword arguments to pass to StructureMatcher() (e.g. ignored_species, comparator etc).
- Returns:
struct2 re-oriented to match struct1 as closely as possible.
- Return type:
Structure
- doped.utils.configurations.orient_s2_like_s1(struct1: Structure, struct2: Structure, verbose: bool = False, **sm_kwargs)[source]
Re-orient struct2 to a fully symmetry-equivalent orientation (i.e. without changing the actual geometry) to match the orientation of struct1 as closely as possible, with matching atomic indices as needed for VASP NEB calculations and other structural transformation analyses (e.g. configuration coordinate (CC) diagrams via nonrad, CarrierCapture.jl etc.).
This corresponds to minimising the root-mean-square displacement from the shortest _linear_ path to transform from struct1 to a symmetry-equivalent definition of struct2. Uses the StructureMatcher.get_s2_like_s1() method from pymatgen, but extended to ensure the correct atomic indices matching and lattice vector definitions.
If verbose=True, information about the mass-weighted displacement (ΔQ in amu^(1/2)Å) between the input and re-oriented structures is printed. This is the typical x-axis unit in configurational coordinate diagrams (see e.g. 10.1103/PhysRevB.90.075202).
- Parameters:
struct1 (Structure) – Initial structure
struct2 (Structure) – Final structure
verbose (bool) – Print information about the mass-weighted displacement (ΔQ in amu^(1/2)Å) between the input and re-oriented structures. Default: False
**sm_kwargs – Additional keyword arguments to pass to StructureMatcher() (e.g. ignored_species, comparator etc).
- Returns:
struct2 re-oriented to match struct1 as closely as possible.
- Return type:
Structure
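The mass-weighted displacement ΔQ mentioned above is commonly defined as the square root of the sum over atoms of m_i |ΔR_i|^2. The sketch below evaluates that formula for plain Cartesian coordinates; mass_weighted_displacement is a hypothetical helper, and it assumes matching atomic indices and no periodic wrapping between the two structures.

```python
import math


def mass_weighted_displacement(masses, coords1, coords2):
    """Return ΔQ = sqrt(sum_i m_i * |R2_i - R1_i|^2) in amu^(1/2)Å.

    masses are in amu and coordinates in Å (Cartesian). Hypothetical helper
    that assumes the i-th atom is the same atom in both structures.
    """
    total = 0.0
    for mass, r1, r2 in zip(masses, coords1, coords2):
        total += mass * sum((b - a) ** 2 for a, b in zip(r1, r2))
    return math.sqrt(total)


# Two atoms: a heavy atom moving 0.1 Å and a light atom moving 0.2 Å
masses = [16.0, 1.0]
coords1 = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
coords2 = [[0.1, 0.0, 0.0], [1.0, 0.2, 0.0]]
delta_q = mass_weighted_displacement(masses, coords1, coords2)
```

Because each displacement is weighted by the atomic mass, a mismatch in atomic indices between the two structures would inflate ΔQ, which is one reason re-orienting struct2 first is useful.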
- doped.utils.configurations.write_path_structures(struct1: Structure, struct2: Structure, output_dir: str | PathLike | None = None, n_images: int | list = 7, displacements: ndarray | list[float] | None = None, displacements2: ndarray | list[float] | None = None)[source]
Generate a series of interpolated structures along the linear path between struct1 and struct2, typically for use in NEB calculations or configuration coordinate (CC) diagrams, and write to folders.
Folder names are labelled by the index of the interpolated structure (0-indexed; 00, 01 etc as for VASP NEB calculations) or the fractional displacement along the interpolation path between structures (e.g. delQ_0.0, delQ_0.1, delQ_-0.1 etc), depending on the input n_images/displacements settings.
Note that for NEB calculations, the lattice vectors and order of sites (atomic indices) must be consistent in both struct1 and struct2. This can be ensured by using the orient_s2_like_s1() function in doped.utils.configurations, as shown in the doped tutorials. This is also desirable for CC diagrams, as the atomic indices are assumed to match for many parsing and plotting functions (e.g. in nonrad and CarrierCapture.jl), but is not strictly necessary. If the input structures are detected to be different (symmetry-inequivalent) geometries (e.g. not a simple defect migration between two symmetry-equivalent sites), but have mis-matching orientations/positions (such that they do not correspond to the shortest linear path between them), a warning will be raised. See the doped configuration coordinate / NEB path generation tutorial for a deeper explanation.
If only n_images is set (and displacements is None) (default), then only one set of interpolated structures is written (in other words, assuming a standard NEB/PES calculation is being performed). If displacements (and possibly displacements2) is set, then two sets of interpolated structures are written (in other words, assuming a CC / non-radiative recombination calculation is being performed, where the two sets of structures are to be calculated in separate charge/spin etc states).
- Parameters:
struct1 (Structure) – Initial structure.
struct2 (Structure) – Final structure.
output_dir (PathLike) – Directory to write the interpolated structures to. Defaults to "Configuration_Coordinate" if displacements is set, otherwise "NEB".
n_images (int) – Number of images to interpolate between struct1 and struct2, or a list of fractional interpolation values (displacements) to use. Note that n_images is ignored if displacements is set (in which case CC / non-radiative recombination calculations are assumed, otherwise a standard NEB / PES calculation is assumed). Default: 7
displacements (np.ndarray or list) – Displacements to use for struct1 along the linear transformation path to struct2. If set, then CC / non-radiative recombination calculations are assumed, and two sets of interpolated structures will be written to file. If set and displacements2 is not set, then the same set of displacements is used for both sets of interpolated structures. Default: None
displacements2 (np.ndarray or list) – Displacements to use for struct2 along the linear transformation path to struct1. If not set and displacements is not None, then the same set of displacements is used for both sets of interpolated structures. Default: None
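The two folder-naming schemes described above can be mimicked with a short sketch. path_folder_names and its n_folders argument are hypothetical; the exact number of folders doped writes for a given n_images, and any sub-folder layout for the two CC structure sets, are not reproduced here.

```python
import tempfile
from pathlib import Path


def path_folder_names(n_folders=None, displacements=None):
    """Return folder labels following the naming patterns described above.

    Hypothetical helper: uses delQ_<value> labels when displacements are
    given, otherwise zero-padded indices (00, 01, ...).
    """
    if displacements is not None:
        return [f"delQ_{float(d)}" for d in displacements]
    return [f"{i:02d}" for i in range(n_folders)]


# Create the folders in a temporary directory to show the resulting layout
with tempfile.TemporaryDirectory() as tmp:
    for name in path_folder_names(displacements=[-0.1, 0.0, 0.1]):
        (Path(tmp) / name).mkdir()
    created = sorted(p.name for p in Path(tmp).iterdir())

neb_names = path_folder_names(n_folders=3)
```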
doped.utils.displacements module
Code to analyse site displacements around a defect.
- doped.utils.displacements.calc_displacements_ellipsoid(defect_entry: DefectEntry, quantile: float = 0.8, relaxed_distances: bool = False, return_extras: bool = False) tuple [source]
Calculate displacements around a defect site and fit an ellipsoid to these displacements, returning a tuple of the ellipsoid’s center, radii, rotation matrix and dataframe of anisotropy information.
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
quantile (float) – The quantile threshold for selecting significant displacements (between 0 and 1). Default is 0.8.
relaxed_distances (bool) – Whether to use the atomic positions in the _relaxed_ defect supercell for 'Distance to defect', 'Vector to site from defect' and 'Displacement wrt defect' values (True), or unrelaxed positions (i.e. the bulk structure positions) (False). Defaults to False.
return_extras (bool) – Whether to also return the disp_df (output from calc_site_displacements(defect_entry, relative_to_defect=True)) and the points used to fit the ellipsoid, corresponding to the Cartesian coordinates of the sites with displacements above the threshold, where the structure has been shifted to place the defect at the cell midpoint ([0.5, 0.5, 0.5]) in fractional coordinates. Default is False.
- Returns:
(ellipsoid_center, ellipsoid_radii, ellipsoid_rotation, anisotropy_df): A tuple containing the ellipsoid’s center, radii, rotation matrix, and a dataframe of anisotropy information, or (None, None, None, None) if fitting was unsuccessful. If return_extras=True, also returns disp_df and the points used to fit the ellipsoid, appended to the return tuple.
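To make the role of quantile concrete, the sketch below picks out the largest displacements using a linearly interpolated quantile. quantile_threshold is a hypothetical helper, and treating the threshold as a quantile of displacement magnitudes with a >= comparison is an assumption for illustration, not a description of the doped internals.

```python
def quantile_threshold(values, quantile):
    """Return the linearly interpolated quantile of a list of numbers.

    Hypothetical helper; quantile must lie between 0 and 1.
    """
    ordered = sorted(values)
    position = quantile * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Displacement magnitudes (Å) for five sites around a defect
displacements = [0.01, 0.02, 0.05, 0.10, 0.30]
threshold = quantile_threshold(displacements, 0.8)

# Only sites at or above the threshold would be used for the ellipsoid fit
significant = [d for d in displacements if d >= threshold]
```

With quantile=0.8, only the top fifth of displacements is retained, so lowering the quantile includes more weakly displaced sites in the fit.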
- doped.utils.displacements.calc_site_displacements(defect_entry: DefectEntry, relaxed_distances: bool = False, relative_to_defect: bool = False, vector_to_project_on: list | None = None, threshold: float = 2.0) DataFrame [source]
Calculates the site displacements in the defect supercell, relative to the bulk supercell, and returns a DataFrame of site displacement info.
The signed displacements are stored in the calculation_metadata of the DefectEntry object under the "site_displacements" key.
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
relaxed_distances (bool) – Whether to use the atomic positions in the _relaxed_ defect supercell for 'Distance to defect', 'Vector to site from defect' and 'Displacement wrt defect' values (True), or unrelaxed positions (i.e. the bulk structure positions) (False). Defaults to False.
relative_to_defect (bool) – Whether to calculate the signed displacements along the line from the defect site to that atom. Negative values indicate the atom moves towards the defect (compressive strain), positive values indicate the atom moves away from the defect (tensile strain). Defaults to False. If True, the relative displacements are stored in the 'Displacement wrt defect' column of the returned DataFrame.
vector_to_project_on (list) – Direction to project the site displacements along (e.g. [0, 0, 1]). Defaults to None.
threshold (float) – If the distance between a pair of matched sites is larger than this, then a warning will be thrown. Default is 2.0 Å.
- Returns:
pandas DataFrame with site displacements (compared to pristine supercell), and other displacement-related information.
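The sign convention for relative_to_defect can be illustrated by projecting a displacement vector onto the unit vector pointing from the defect to the site. signed_displacement is a hypothetical helper working on Cartesian vectors without periodic boundary handling; it is a sketch of the projection idea rather than the doped implementation.

```python
import math


def signed_displacement(defect_pos, site_pos, displacement):
    """Project a site displacement onto the defect-to-site direction.

    Negative values mean the atom moved towards the defect, positive values
    mean it moved away. Hypothetical helper using Cartesian coordinates (Å).
    """
    vector = [s - d for s, d in zip(site_pos, defect_pos)]
    norm = math.sqrt(sum(v * v for v in vector))
    return sum(u * v for u, v in zip(displacement, vector)) / norm


# A site 2 Å from the defect along x, relaxing 0.1 Å towards it
inward = signed_displacement([0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [-0.1, 0.05, 0.0])

# The same site relaxing 0.1 Å away from the defect
outward = signed_displacement([0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.1, 0.05, 0.0])
```

The same dot-product form applies to vector_to_project_on, with the chosen direction replacing the defect-to-site vector.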
- doped.utils.displacements.plot_displacements_ellipsoid(defect_entry: DefectEntry, plot_ellipsoid: bool = True, plot_anisotropy: bool = False, quantile: float = 0.8, use_plotly: bool = False, show_supercell: bool = True, style_file: str | PathLike | None = None) tuple [source]
Plot the displacement ellipsoid and/or anisotropy around a relaxed defect.
Set use_plotly = True to get an interactive plotly plot, useful for analysis!
The supercell edges are also plotted if show_supercell = True (default).
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
plot_ellipsoid (bool) – If True, plot the fitted ellipsoid in the crystal lattice.
plot_anisotropy (bool) – If True, plot the anisotropy of the ellipsoid radii.
quantile (float) – The quantile threshold for selecting significant displacements (between 0 and 1). Default is 0.8.
use_plotly (bool) – Whether to use plotly for plotting. Default is False. Set to True to get an interactive plot.
show_supercell (bool) – Whether to show the supercell edges in the plot. Default is True.
style_file (PathLike) – Path to matplotlib style file. If not set, will use the doped default displacements style.
- Returns:
Either a single plotly or matplotlib Figure, if only one of plot_ellipsoid or plot_anisotropy is True, or a tuple of plots if both are True.
- doped.utils.displacements.plot_site_displacements(defect_entry: DefectEntry, separated_by_direction: bool = False, relaxed_distances: bool = False, relative_to_defect: bool = False, vector_to_project_on: list | None = None, use_plotly: bool = False, style_file: str | PathLike | None = None)[source]
Plots site displacements around a defect.
Set use_plotly = True to get an interactive plotly plot, useful for analysis!
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
separated_by_direction (bool) – Whether to plot site displacements separated by direction (x, y, z). Default is False.
relaxed_distances (bool) – Whether to use the atomic positions in the _relaxed_ defect supercell for 'Distance to defect', 'Vector to site from defect' and 'Displacement wrt defect' values (True), or unrelaxed positions (i.e. the bulk structure positions) (False). Defaults to False.
relative_to_defect (bool) – Whether to plot the signed displacements along the line from the defect site to that atom. Negative values indicate the atom moves towards the defect (compressive strain), positive values indicate the atom moves away from the defect (tensile strain). Uses the relaxed defect position as reference.
vector_to_project_on (list) – Direction to project the site displacements along (e.g. [0, 0, 1]). Defaults to None (i.e. the displacements are calculated in the Cartesian basis x, y, z).
use_plotly (bool) – Whether to use plotly for plotting. Default is False. Set to True to get an interactive plot.
style_file (PathLike) – Path to matplotlib style file. If not set, will use the doped default displacements style.
- Returns:
plotly or matplotlib Figure.
doped.utils.efficiency module
Utility functions to improve the efficiency of common functions/workflows/calculations in doped.
- class doped.utils.efficiency.DopedTopographyAnalyzer(structure: Structure, image_tol: float = 0.0001, max_cell_range: int = 1, constrained_c_frac: float = 0.5, thickness: float = 0.5)[source]
Bases: object
This is a modified version of pymatgen.analysis.defects.utils.TopographyAnalyzer to lean down the input options and make initialisation far more efficient (~2 orders of magnitude faster).
The original code was written by Danny Broberg and colleagues (10.1016/j.cpc.2018.01.004), which was then added to pymatgen before being removed.
- Parameters:
structure (Structure) – An initial structure.
image_tol (float) – A tolerance distance for the analysis, used to determine if sites are actually periodic boundary images of each other. Default is usually fine.
max_cell_range (int) – The range of periodic images used to construct the Voronoi tessellation. A value of 1 means that all points from (x ± 1, y ± 1, z ± 1) are included in the Voronoi construction. This is needed because the Voronoi polyhedra extend beyond the standard unit cell due to periodic boundary conditions (PBC). Typically, the default value of 1 works fine for most structures and is fast, but for very small unit cells with high symmetry, this may need to be increased to 2 or higher. If there are < 5 atoms in the input structure and max_cell_range is 1, this will automatically be increased to 2.
constrained_c_frac (float) – Constrains the region in which to perform the topology analysis. Default is 0.5, given as a fractional coordinate of the cell.
thickness (float) – Along with constrained_c_frac, limits the thickness of the region to explore. Default is 0.5, which maps all the sites of the unit cell.
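The effect of max_cell_range on the number of periodic images can be seen by enumerating the lattice translations it implies. image_offsets is a hypothetical helper written for this illustration; it only lists the integer offsets and does not perform any Voronoi analysis.

```python
from itertools import product


def image_offsets(max_cell_range):
    """List the integer lattice translations within +/- max_cell_range.

    Hypothetical helper: each offset (i, j, k) corresponds to one periodic
    image of the unit cell, including the original cell (0, 0, 0).
    """
    cell_range = range(-max_cell_range, max_cell_range + 1)
    return list(product(cell_range, repeat=3))


n_images_range_1 = len(image_offsets(1))  # 3 x 3 x 3 block of cells
n_images_range_2 = len(image_offsets(2))  # 5 x 5 x 5 block of cells
```

The cubic growth in the number of cells is why a value of 1 is preferred whenever it is sufficient.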
- doped.utils.efficiency.cache_ready_PeriodicSite__eq__(self, other)[source]
Custom __eq__ method for PeriodicSite instances, using a cached equality function to speed up comparisons.
- doped.utils.efficiency.cached_Structure_eq_func(self_hash, other_hash)[source]
Cached equality function for Structure instances.
- doped.utils.efficiency.cached_allclose(a: tuple, b: tuple, rtol: float = 1e-05, atol: float = 1e-08)[source]
Cached version of np.allclose, taking tuples as inputs (so that they are hashable and thus cacheable).
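The caching idea can be sketched with functools.lru_cache and a pure-Python closeness check that follows the same element-wise criterion as np.allclose, |a - b| <= atol + rtol * |b|. cached_allclose_sketch is a hypothetical stand-in: it does not call NumPy, does not broadcast, and simply returns False for tuples of different lengths.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_allclose_sketch(a, b, rtol=1e-05, atol=1e-08):
    """Return True if two tuples of floats are element-wise close.

    Hypothetical pure-Python analogue of a cached np.allclose; tuples are
    hashable, so repeated calls with the same inputs hit the cache.
    """
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


# First call computes the result, second identical call is served from cache
result = cached_allclose_sketch((0.0, 0.5), (0.0, 0.5 + 1e-9))
cached_allclose_sketch((0.0, 0.5), (0.0, 0.5 + 1e-9))
hits = cached_allclose_sketch.cache_info().hits
```

Lists or arrays would raise a TypeError here because they are unhashable, which is why the inputs are converted to tuples first.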
- doped.utils.efficiency.doped_Composition_eq_func(self_hash, other_hash)[source]
Updated equality function for Composition instances, which breaks early for mismatches and also uses caching, making it orders of magnitude faster than the pymatgen equality function.
- doped.utils.efficiency.doped_Structure__eq__(self, other: IStructure) bool [source]
Copied from pymatgen, but updated to break early once a mis-matching site is found, to speed up structure matching by ~2x.
- doped.utils.efficiency.fast_Composition_eq(self, other)[source]
Fast equality function for Composition instances, breaking early for mismatches.
- doped.utils.efficiency.get_voronoi_nodes(structure: Structure) list[PeriodicSite] [source]
Get the Voronoi nodes of a pymatgen Structure.
Maximises efficiency by mapping down to the primitive cell, doing Voronoi analysis (with the efficient DopedTopographyAnalyzer class), and then mapping back to the original structure (typically a supercell).
- Parameters:
structure (Structure) – pymatgen Structure object.
- Returns:
List of PeriodicSite objects representing the Voronoi nodes.
- Return type:
list[PeriodicSite]
doped.utils.eigenvalues module
Helper functions for setting up perturbed host state (PHS) analysis.
Contains modified versions of functions from pydefect (https://github.com/kumagai-group/pydefect) and vise (https://github.com/kumagai-group/vise), to avoid requiring additional files (i.e. PROCARs).
Note that this module attempts to import modules from pydefect & vise, which are highly-recommended but not strictly required dependencies of doped (currently not available on conda-forge), and so any imports of code from this module will attempt their import, raising an ImportError if not available.
- doped.utils.eigenvalues.band_edge_properties_from_vasprun(vasprun: Vasprun, integer_criterion: float = 0.1) BandEdgeProperties [source]
Create a pydefect BandEdgeProperties object from a Vasprun object.
- Parameters:
vasprun (Vasprun) – Vasprun object.
integer_criterion (float) – Threshold criterion for determining if a band is unoccupied (< integer_criterion), partially occupied (between integer_criterion and 1 - integer_criterion), or fully occupied (> 1 - integer_criterion). Default is 0.1.
- Returns:
BandEdgeProperties object.
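The three occupation classes defined by integer_criterion can be written out directly. classify_occupation is a hypothetical helper for illustration, and it assumes occupations are normalised to the range 0 to 1 (i.e. per spin channel).

```python
def classify_occupation(occupation, integer_criterion=0.1):
    """Classify a band occupation using the thresholds described above.

    Hypothetical helper; assumes occupations lie between 0 and 1.
    """
    if occupation < integer_criterion:
        return "unoccupied"
    if occupation > 1 - integer_criterion:
        return "occupied"
    return "partially occupied"


labels = [classify_occupation(occ) for occ in (0.02, 0.5, 0.97)]
```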
- doped.utils.eigenvalues.get_band_edge_info(bulk_vr: Vasprun, defect_vr: Vasprun, bulk_procar: str | PathLike | EasyunfoldProcar | Procar | None = None, defect_procar: str | PathLike | EasyunfoldProcar | Procar | None = None, defect_supercell_site: PeriodicSite | None = None, neighbor_cutoff_factor: float = 1.3) tuple[BandEdgeOrbitalInfos, EdgeInfo, EdgeInfo] [source]
Generate metadata required for performing eigenvalue & orbital analysis, specifically pydefect BandEdgeOrbitalInfos, and EdgeInfo objects for the bulk VBM and CBM.
See https://doped.readthedocs.io/en/latest/Tips.html#perturbed-host-states.
- Parameters:
bulk_vr (Vasprun) – Vasprun object of the bulk supercell calculation. If bulk_procar is not provided, then this must have the projected_eigenvalues attribute (i.e. from a calculation with LORBIT > 10 in the INCAR and parsed with parse_projected_eigen = True).
defect_vr (Vasprun) – Vasprun object of the defect supercell calculation. If defect_procar is not provided, then this must have the projected_eigenvalues attribute (i.e. from a calculation with LORBIT > 10 in the INCAR and parsed with parse_projected_eigen = True).
bulk_procar (PathLike, EasyunfoldProcar, Procar) – Either a path to the VASP PROCAR output file (with LORBIT > 10 in the INCAR) or an easyunfold/pymatgen Procar object, for the bulk supercell calculation. Not required if the supplied bulk_vr was parsed with parse_projected_eigen = True. Default is None.
defect_procar (PathLike, EasyunfoldProcar, Procar) – Either a path to the VASP PROCAR output file (with LORBIT > 10 in the INCAR) or an easyunfold/pymatgen Procar object, for the defect supercell calculation. Not required if the supplied defect_vr was parsed with parse_projected_eigen = True. Default is None.
defect_supercell_site (PeriodicSite) – PeriodicSite object of the defect site in the defect supercell, from which the defect neighbours are determined for localisation analysis. If None (default), then the defect site is determined automatically from the defect and bulk supercell structures.
neighbor_cutoff_factor (float) – Sites within min_distance * neighbor_cutoff_factor of the defect site in the relaxed defect supercell are considered neighbors for localisation analysis, where min_distance is the minimum distance between sites in the defect supercell. Default is 1.3 (matching the pydefect default).
- Returns:
pydefect BandEdgeOrbitalInfos, and EdgeInfo objects for the bulk VBM and CBM.
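The neighbor selection controlled by neighbor_cutoff_factor amounts to a simple distance cutoff. In the sketch below, neighbor_indices is a hypothetical helper that takes precomputed distances from the defect site and a min_distance value as inputs; how doped or pydefect obtain those quantities is not shown.

```python
def neighbor_indices(distances_to_defect, min_distance, neighbor_cutoff_factor=1.3):
    """Return indices of sites within min_distance * neighbor_cutoff_factor.

    Hypothetical helper: distances are in Å, and a distance of exactly zero
    (the defect site itself) is excluded.
    """
    cutoff = min_distance * neighbor_cutoff_factor
    return [i for i, d in enumerate(distances_to_defect) if 0 < d <= cutoff]


# With min_distance = 2.0 Å the cutoff is 2.6 Å, so sites 1 and 2 qualify
neighbors = neighbor_indices([0.0, 2.0, 2.5, 3.0], min_distance=2.0)
```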
- doped.utils.eigenvalues.get_eigenvalue_analysis(defect_entry: DefectEntry | None = None, plot: bool = True, filename: str | None = None, ks_labels: bool = False, style_file: str | None = None, bulk_vr: str | PathLike | Vasprun | None = None, bulk_procar: str | PathLike | EasyunfoldProcar | Procar | None = None, defect_vr: str | PathLike | Vasprun | None = None, defect_procar: str | PathLike | EasyunfoldProcar | Procar | None = None, force_reparse: bool = False, ylims: tuple[float, float] | None = None, legend_kwargs: dict | None = None, similar_orb_criterion: float | None = None, similar_energy_criterion: float | None = None) BandEdgeStates | tuple[BandEdgeStates, Figure] [source]
Get eigenvalue & orbital info (with automated classification of PHS states) for the band edge and in-gap electronic states for the input defect entry / calculation outputs, as well as a plot of the single-particle electronic eigenvalues and their occupation (if plot=True).
Can be used to determine if a defect is adopting a perturbed host state (PHS / shallow state), see https://doped.readthedocs.io/en/latest/Tips.html#perturbed-host-states. Note that the classification of electronic states as band edges or localized orbitals is based on the similarity of orbital projections and eigenvalues between the defect and bulk cell calculations (see similar_orb/energy_criterion argument descriptions below for more details). You may want to adjust the default values of these keyword arguments, as the defaults may not be appropriate in all cases. In particular, the P-ratio values can give useful insight, revealing the level of (de)localisation of the states.
Either a doped DefectEntry object can be provided, or the required VASP output files/objects for the bulk and defect supercell calculations (Vaspruns, or Vaspruns and Procars). If a DefectEntry is provided but eigenvalue data has not already been parsed (default in doped is to parse this data with DefectsParser/DefectParser, as controlled by the parse_projected_eigen flag), then this function will attempt to load the eigenvalue data from either the input Vasprun/PROCAR objects or files, or from the bulk/defect_paths in defect_entry.calculation_metadata. If so, will initially try to load orbital projections from vasprun.xml(.gz) files (slightly slower but more accurate), or failing that from PROCAR(.gz) files if present.
This function uses code from pydefect, so please cite the pydefect paper: "Insights into oxygen vacancies from high-throughput first-principles calculations" Yu Kumagai, Naoki Tsunoda, Akira Takahashi, and Fumiyasu Oba, Phys. Rev. Materials 5, 123803 (2021) – 10.1103/PhysRevMaterials.5.123803
- Parameters:
defect_entry (DefectEntry) – doped DefectEntry object. Default is None.
plot (bool) – Whether to plot the single-particle eigenvalues. (Default: True)
filename (str) – Filename to save the eigenvalue plot to (if plot = True). If None (default), plots are not saved.
ks_labels (bool) – Whether to add band index labels to the KS levels. (Default: False)
style_file (str) – Path to a mplstyle file to use for the plot. If None (default), uses the doped displacement plot style (doped/utils/displacement.mplstyle).
bulk_vr (PathLike, Vasprun) – Not required if defect_entry provided and eigenvalue data already parsed (default behaviour when parsing with doped, data in defect_entry.calculation_metadata["eigenvalue_data"]). Either a path to the VASP vasprun.xml(.gz) output file or a pymatgen Vasprun object, for the reference bulk supercell calculation. If None (default), tries to load the Vasprun object from defect_entry.calculation_metadata["run_metadata"]["bulk_vasprun_dict"], or, failing that, from a vasprun.xml(.gz) file at defect_entry.calculation_metadata["bulk_path"].
bulk_procar (PathLike, EasyunfoldProcar, Procar) – Not required if defect_entry provided and eigenvalue data already parsed (default behaviour when parsing with doped, data in defect_entry.calculation_metadata["eigenvalue_data"]), or if bulk_vr was parsed with parse_projected_eigen = True. Either a path to the VASP PROCAR output file (with LORBIT > 10 in the INCAR) or an easyunfold/pymatgen Procar object, for the reference bulk supercell calculation. If None (default), tries to load from a PROCAR(.gz) file at defect_entry.calculation_metadata["bulk_path"].
defect_vr (PathLike, Vasprun) – Not required if defect_entry provided and eigenvalue data already parsed (default behaviour when parsing with doped, data in defect_entry.calculation_metadata["eigenvalue_data"]). Either a path to the VASP vasprun.xml(.gz) output file or a pymatgen Vasprun object, for the defect supercell calculation. If None (default), tries to load the Vasprun object from defect_entry.calculation_metadata["run_metadata"]["defect_vasprun_dict"], or, failing that, from a vasprun.xml(.gz) file at defect_entry.calculation_metadata["defect_path"].
defect_procar (PathLike, EasyunfoldProcar, Procar) – Not required if defect_entry provided and eigenvalue data already parsed (default behaviour when parsing with doped, data in defect_entry.calculation_metadata["eigenvalue_data"]), or if defect_vr was parsed with parse_projected_eigen = True. Either a path to the VASP PROCAR output file (with LORBIT > 10 in the INCAR) or an easyunfold/pymatgen Procar object, for the defect supercell calculation. If None (default), tries to load from a PROCAR(.gz) file at defect_entry.calculation_metadata["defect_path"].
force_reparse (bool) – Whether to force re-parsing of the eigenvalue data, even if already present in the calculation_metadata.
ylims (tuple[float, float]) – Custom y-axis limits for the eigenvalue plot. If None (default), the y-axis limits are automatically set to +/-5% of the eigenvalue range.
legend_kwargs (dict) – Custom keyword arguments to pass to the ax.legend call in the eigenvalue plot (e.g. "loc", "fontsize", "framealpha" etc.). If set to False, then no legend is shown. Default is None.
similar_orb_criterion (float) – Threshold criterion for determining if the orbitals of two eigenstates are similar (for identifying band-edge and defect states). If the summed orbital projection differences, normalised by the total orbital projection coefficients, are less than this value, then the orbitals are considered similar. Default is to try with 0.2 (pydefect default), then if this fails increase to 0.35, and lastly 0.5.
similar_energy_criterion (float) – Threshold criterion for considering two eigenstates similar in energy, used for identifying band-edge (and defect) states. Bands within this energy difference from the VBM/CBM of the bulk are considered potential band-edge states. Default is to try with the larger of either 0.25 eV or 0.1 eV + the potential alignment from defect to bulk cells as determined by the charge correction in defect_entry.corrections_metadata if present. If this fails, then it is increased to the pydefect default of 0.5 eV.
- Returns:
pydefect BandEdgeStates object, containing the band-edge and defect eigenvalue information, and the eigenvalue plot (if plot=True).
- doped.utils.eigenvalues.make_band_edge_orbital_infos(defect_vr: Vasprun, vbm: float, cbm: float, eigval_shift: float = 0.0, neighbor_indices: list[int] | None = None, defect_procar: EasyunfoldProcar | Procar | None = None)[source]
Make BandEdgeOrbitalInfos from a Vasprun object.
Modified from pydefect to use projected orbitals stored in the Vasprun object.
- Parameters:
defect_vr (Vasprun) – Defect Vasprun object.
vbm (float) – VBM eigenvalue in eV.
cbm (float) – CBM eigenvalue in eV.
eigval_shift (float) – Shift eigenvalues down by this value (to set VBM at 0 eV). Default is 0.0.
neighbor_indices (list[int]) – Indices of neighboring atoms to the defect site, for localisation analysis. Default is None.
defect_procar (EasyunfoldProcar, Procar) – EasyunfoldProcar or Procar object, for the defect supercell, if projected eigenvalue/orbitals data is not provided in defect_vr.
- Returns:
BandEdgeOrbitalInfos object.
- doped.utils.eigenvalues.make_perfect_band_edge_state_from_vasp(vasprun: Vasprun, procar: Procar, integer_criterion: float = 0.1) PerfectBandEdgeState [source]
Create a pydefect PerfectBandEdgeState object from just a Vasprun and Procar object, without the need for the Outcar input (as in pydefect).
- Parameters:
vasprun (Vasprun) – Vasprun object.
procar (Procar) – Procar object.
integer_criterion (float) – Threshold criterion for determining if a band is unoccupied (< integer_criterion), partially occupied (between integer_criterion and 1 - integer_criterion), or fully occupied (> 1 - integer_criterion). Default is 0.1.
- Returns:
PerfectBandEdgeState object.
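The integer_criterion thresholds can be illustrated with a minimal sketch. The function name classify_occupation and the returned labels are hypothetical (they are not part of doped or pydefect), and occupations are assumed to be normalised to the range 0 to 1.

```python
def classify_occupation(occupation: float, integer_criterion: float = 0.1) -> str:
    """Classify a normalised band occupation using the integer_criterion thresholds."""
    if occupation < integer_criterion:
        return "unoccupied"
    if occupation > 1 - integer_criterion:
        return "occupied"
    # Anything between integer_criterion and 1 - integer_criterion.
    return "partial"


for occ in (0.02, 0.5, 0.97):
    print(occ, classify_occupation(occ))
```

A larger integer_criterion widens the windows treated as effectively empty or full, and so narrows the partially occupied window.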
doped.utils.legacy_corrections module
Functions for computing legacy finite-size charge corrections (Makov-Payne, Murphy-Hine, Lany-Zunger) for defect formation energies.
Mostly adapted from the deprecated AIDE package developed by the dynamic duo Adam Jackson and Alex Ganose (https://github.com/SMTG-Bham/aide).
- doped.utils.legacy_corrections.get_murphy_image_charge_correction(lattice, dielectric_matrix, conv=0.3, factor=30, verbose=False)[source]
Calculates the anisotropic image charge correction by Sam Murphy in eV.
This is a rewrite of the code ‘madelung.pl’ written by Sam Murphy (see [1]). The default convergence parameter of conv = 0.3 seems to work perfectly well. However, it may be worth testing convergence of defect energies with respect to the factor (i.e. cut-off radius).
References
[1] S. T. Murphy and N. D. H. Hine, Phys. Rev. B 87, 094111 (2013).
- Parameters:
lattice (list) – The defect cell lattice as a 3x3 matrix.
dielectric_matrix (list) – The dielectric tensor as a 3x3 matrix.
conv (float) – A value between 0.1 and 0.9 which adjusts the balance between the real-space and reciprocal-space contributions.
factor – The cut-off radius, defined as a multiple of the longest cell parameter.
verbose (bool) – If True, details of the correction will be printed.
- Returns:
The image charge correction as a {charge: correction} dictionary.
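As a rough illustration of how factor relates to the cut-off radius, the sketch below multiplies the longest lattice vector length by factor. The helper name cutoff_radius is hypothetical and not part of doped, and the summation itself is not reproduced here.

```python
import math


def cutoff_radius(lattice, factor=30):
    """Return factor times the longest lattice vector length (same units as lattice)."""
    longest = max(
        math.sqrt(sum(component**2 for component in vector)) for vector in lattice
    )
    return factor * longest


# Hypothetical 10 x 12 x 15 Å orthorhombic defect cell.
print(cutoff_radius([[10, 0, 0], [0, 12, 0], [0, 0, 15]]))
```

When testing convergence with respect to factor, this is the radius that grows with it.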
- doped.utils.legacy_corrections.lany_zunger_corrected_defect_dict(defect_dict: dict)[source]
Convert an input parsed defect dictionary (presumably created using DefectParser) with Freysoldt/Kumagai charge corrections to the same parsed defect dictionary but with the Lany-Zunger charge correction (same potential alignment plus 0.65 * Makov-Payne image charge correction).
- Parameters:
defect_dict (dict) – Dictionary of parsed defect calculations (presumably created using DefectParser; see tutorials). Must have ‘freysoldt_meta’ in defect.calculation_metadata for each charged defect (from DefectParser.load_FNV_data()).
- Returns:
Parsed defect dictionary with Lany-Zunger charge corrections.
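The arithmetic of the Lany-Zunger correction described above can be sketched as follows. The function and argument names are hypothetical, and both inputs are assumed to already be energy terms in eV; how doped extracts them from the parsed metadata is not shown.

```python
def lany_zunger_correction(makov_payne_image_charge: float, potential_alignment_term: float) -> float:
    """Potential alignment term plus 0.65 times the Makov-Payne image charge correction (eV)."""
    return potential_alignment_term + 0.65 * makov_payne_image_charge


# Hypothetical values in eV.
print(lany_zunger_correction(makov_payne_image_charge=1.0, potential_alignment_term=0.1))
```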
doped.utils.parsing module
Helper functions for parsing VASP supercell defect calculations.
- doped.utils.parsing.check_atom_mapping_far_from_defect(bulk_supercell: Structure, defect_supercell: Structure, defect_coords: ndarray[float], coords_are_cartesian: bool = False, displacement_tol: float = 0.5, warning: bool | str = 'verbose') bool [source]
Check the displacement of atoms far from the determined defect site, and warn the user if they are large (often indicates a mismatch between the bulk and defect supercell definitions).
The threshold for identifying ‘large’ displacements is if the mean displacement of any species is greater than displacement_tol Ångströms for sites of that species outside the Wigner-Seitz radius of the defect in the defect supercell. The Wigner-Seitz radius corresponds to the radius of the largest sphere which can fit in the cell.
- Parameters:
bulk_supercell (Structure) – The bulk structure.
defect_supercell (Structure) – The defect structure.
defect_coords (np.ndarray[float]) – The coordinates of the defect site.
coords_are_cartesian (bool) – Whether the defect coordinates are in Cartesian or fractional coordinates. Default is False (fractional).
displacement_tol (float) – The tolerance for the displacement of atoms far from the defect site, in Ångströms. Default is 0.5 Å.
warning (bool, str) – Whether to throw a warning if a mismatch is detected. If warning = "verbose" (default), the individual atomic displacements are included in the warning message.
- Returns:
Returns False if a mismatch is detected, else True.
- Return type:
bool
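The threshold logic above can be sketched in plain Python. This is an illustrative simplification rather than the doped implementation: the function name atom_mapping_ok and the (species, distance to defect, displacement) input tuples are assumptions, and the distances are taken as already computed.

```python
from collections import defaultdict
from statistics import mean


def atom_mapping_ok(sites, wigner_seitz_radius, displacement_tol=0.5):
    """Return False if any species has a mean far-from-defect displacement above tol.

    sites: iterable of (species, distance_to_defect, displacement) tuples, all in Å.
    """
    far_displacements = defaultdict(list)
    for species, distance_to_defect, displacement in sites:
        # Only sites outside the Wigner-Seitz radius count as far from the defect.
        if distance_to_defect > wigner_seitz_radius:
            far_displacements[species].append(displacement)
    return all(mean(values) <= displacement_tol for values in far_displacements.values())


sites = [("Cd", 2.0, 1.2), ("Cd", 9.0, 0.05), ("Te", 8.5, 0.1)]
print(atom_mapping_ok(sites, wigner_seitz_radius=6.5))
```

The large displacement of the Cd site 2.0 Å from the defect is ignored here, since strong relaxation near the defect is expected and does not indicate a supercell mismatch.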
- doped.utils.parsing.defect_charge_from_vasprun(defect_vr: Vasprun, charge_state: int | None) int [source]
Determine the defect charge state from the defect vasprun, and compare to the manually-set charge state if provided.
- Parameters:
defect_vr (Vasprun) – Defect pymatgen Vasprun object.
charge_state (int) – Manually-set charge state for the defect, to check if it matches the auto-determined charge state.
- Returns:
The auto-determined defect charge state.
- Return type:
int
- doped.utils.parsing.doped_entry_id(vasprun: Vasprun) str [source]
Generate an entry_id from a pymatgen Vasprun object, to use with ComputedEntry/ComputedStructureEntry objects (from Vasprun.get_computed_entry()).
The entry_id is set to: {reduced chemical formula}_{vr.energy}
This is to avoid the use of the parsing-time-dependent entry_id from pymatgen, and may be replaced in the future if this issue is resolved: https://github.com/materialsproject/pymatgen/issues/4259
Currently not used in doped parsing however, as ComputedEntry.parameters gets randomly re-organised upon saving to json, so the same ComputedEntry saved to file at different times still gives slightly different json files.
- Parameters:
vasprun (Vasprun) – The Vasprun object from which to generate the entry_id.
- Returns:
The generated entry_id.
- Return type:
str
- doped.utils.parsing.find_archived_fname(fname, raise_error=True)[source]
Find a suitable filename, taking account of possible use of compression software.
- doped.utils.parsing.find_missing_idx(frac_coords1: list | ndarray, frac_coords2: list | ndarray, lattice: Lattice)[source]
Find the missing/outlier index between two sets of fractional coordinates (differing in size by 1), by grouping the coordinates based on the minimum distances between coordinates or, if that doesn’t give a unique match, the site combination that gives the minimum summed squared distances between paired sites.
The index returned is the index of the missing/outlier coordinate in the larger set of coordinates.
- Parameters:
frac_coords1 (Union[list, np.ndarray]) – First set of fractional coordinates.
frac_coords2 (Union[list, np.ndarray]) – Second set of fractional coordinates.
lattice (Lattice) – The lattice object to use with the fractional coordinates.
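The first (minimum-distance) criterion can be sketched as below. This is a simplified illustration, not the doped implementation: it assumes an orthorhombic cell described only by its three lengths (the real function takes a pymatgen Lattice), it omits the summed-squared-distance fallback, and the names pbc_distance and find_outlier_idx are hypothetical.

```python
import math


def pbc_distance(frac_a, frac_b, cell_lengths):
    """Minimum-image distance between two fractional coordinates (orthorhombic cell)."""
    total = 0.0
    for a, b, length in zip(frac_a, frac_b, cell_lengths):
        delta = (a - b) % 1.0
        delta = min(delta, 1.0 - delta)  # wrap across the periodic boundary
        total += (delta * length) ** 2
    return math.sqrt(total)


def find_outlier_idx(frac_coords1, frac_coords2, cell_lengths):
    """Index (in the larger set) of the coordinate furthest from any site in the smaller set."""
    larger, smaller = sorted((frac_coords1, frac_coords2), key=len, reverse=True)
    nearest = [
        min(pbc_distance(coords, other, cell_lengths) for other in smaller)
        for coords in larger
    ]
    return max(range(len(larger)), key=nearest.__getitem__)


bulk = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.5, 0.0, 0.0]]
defect = [[0.01, 0.0, 0.99], [0.5, 0.49, 0.5]]  # one site missing (e.g. a vacancy)
print(find_outlier_idx(bulk, defect, cell_lengths=(5.0, 5.0, 5.0)))
```

Using the minimum-image convention matters here: the first defect site sits just across the cell boundary from the bulk site at the origin, and would look far away without it.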
- doped.utils.parsing.find_nearest_coords(candidate_frac_coords: list | ndarray, target_frac_coords: list | ndarray, lattice: Lattice, return_idx: bool = False) tuple[list | ndarray, int] | list | ndarray [source]
Find the nearest coords in candidate_frac_coords to target_frac_coords.
If return_idx is True, also returns the index of the nearest coords in candidate_frac_coords to target_frac_coords.
- Parameters:
candidate_frac_coords (Union[list, np.ndarray]) – Fractional coordinates (typically from a bulk supercell), to find the nearest coordinates to target_frac_coords.
target_frac_coords (Union[list, np.ndarray]) – The target coordinates to find the nearest coordinates to in candidate_frac_coords.
lattice (Lattice) – The lattice object to use with the fractional coordinates.
return_idx (bool) – Whether to also return the index of the nearest coordinates in candidate_frac_coords to target_frac_coords.
- doped.utils.parsing.get_coords_and_idx_of_species(structure, species_name, frac_coords=True)[source]
Get arrays of the coordinates and indices of the given species in the structure.
- doped.utils.parsing.get_core_potentials_from_outcar(outcar_path: str | PathLike, dir_type: str = '', total_energy: list | float | None = None)[source]
Get the core potentials from the OUTCAR file, which are needed for the Kumagai-Oba (eFNV) finite-size correction.
This parser skips the full pymatgen Outcar initialisation/parsing, to expedite parsing and make it more robust (doesn’t fail if OUTCAR is incomplete, as long as it has the core potentials information).
- Parameters:
outcar_path (PathLike) – The path to the OUTCAR file.
dir_type (str) – The type of directory the OUTCAR is in (e.g. bulk or defect) for informative error messages.
total_energy (Optional[Union[list, float]]) – The already-parsed total energy for the structure. If provided, will check that the total energy of the OUTCAR matches this value / one of these values, and throw a warning if not.
- Returns:
The core potentials from the last ionic step in the OUTCAR file.
- Return type:
np.ndarray
- doped.utils.parsing.get_defect_site_idxs_and_unrelaxed_structure(bulk_supercell: Structure, defect_supercell: Structure)
Get the defect type, site (indices in the bulk and defect supercells) and unrelaxed structure, where ‘unrelaxed structure’ corresponds to the pristine defect supercell structure for vacancies / substitutions (with no relaxation), and the pristine bulk structure with the final relaxed interstitial site for interstitials.
Initial draft contributed by Dr. Alex Ganose (@ Imperial Chemistry) and refactored for extrinsic species and several code efficiency/robustness improvements.
- Parameters:
bulk_supercell (Structure) – The bulk supercell structure.
defect_supercell (Structure) – The defect supercell structure.
- Returns:
- defect_type: The type of defect as a string (interstitial, vacancy or substitution).
- bulk_site_idx: Index of the site in the bulk structure that corresponds to the defect site in the defect structure.
- defect_site_idx: Index of the defect site in the defect structure.
- unrelaxed_defect_structure: Pristine defect supercell structure for vacancies/substitutions (i.e. pristine bulk with unrelaxed vacancy/substitution), or the pristine bulk structure with the final relaxed interstitial site for interstitials.
- doped.utils.parsing.get_defect_type_and_composition_diff(bulk: Structure | Composition, defect: Structure | Composition) tuple[str, dict] [source]
Get the difference in composition between a bulk structure and a defect structure.
Contributed by Dr. Alex Ganose (@ Imperial Chemistry) and refactored for extrinsic species and code efficiency/robustness improvements.
- Parameters:
bulk (Union[Structure, Composition]) – The bulk structure or composition.
defect (Union[Structure, Composition]) – The defect structure or composition.
- Returns:
The defect type (interstitial, vacancy or substitution) and the composition difference between the bulk and defect structures as a dictionary.
- Return type:
Tuple[str, Dict[str, int]]
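The mapping from composition difference to defect type can be sketched with plain dictionaries of element counts. This is a hedged illustration for single point defects only: the function name and the dict inputs are assumptions (the documented function takes pymatgen Structure or Composition objects), and error handling in doped may differ.

```python
def defect_type_and_composition_diff(bulk_counts: dict, defect_counts: dict):
    """Return (defect type, {element: change}) for a single point defect."""
    elements = sorted(set(bulk_counts) | set(defect_counts))
    diff = {
        el: defect_counts.get(el, 0) - bulk_counts.get(el, 0)
        for el in elements
        if defect_counts.get(el, 0) != bulk_counts.get(el, 0)
    }
    gained = [el for el, change in diff.items() if change == 1]
    lost = [el for el, change in diff.items() if change == -1]
    if len(diff) == 1 and gained:
        return "interstitial", diff  # one extra atom
    if len(diff) == 1 and lost:
        return "vacancy", diff  # one missing atom
    if len(diff) == 2 and gained and lost:
        return "substitution", diff  # one atom swapped for another
    raise ValueError(f"Not a single point defect: {diff}")


bulk = {"Cd": 32, "Te": 32}
print(defect_type_and_composition_diff(bulk, {"Cd": 31, "Te": 32}))
print(defect_type_and_composition_diff(bulk, {"Cd": 31, "Te": 33}))
print(defect_type_and_composition_diff(bulk, {"Cd": 32, "Te": 32, "Li": 1}))
```

The last call shows an extrinsic species (Li) that is absent from the bulk composition, which is why the union of both element sets is used.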
- doped.utils.parsing.get_defect_type_site_idxs_and_unrelaxed_structure(bulk_supercell: Structure, defect_supercell: Structure)[source]
Get the defect type, site (indices in the bulk and defect supercells) and unrelaxed structure, where ‘unrelaxed structure’ corresponds to the pristine defect supercell structure for vacancies / substitutions (with no relaxation), and the pristine bulk structure with the final relaxed interstitial site for interstitials.
Initial draft contributed by Dr. Alex Ganose (@ Imperial Chemistry) and refactored for extrinsic species and several code efficiency/robustness improvements.
- Parameters:
bulk_supercell (Structure) – The bulk supercell structure.
defect_supercell (Structure) – The defect supercell structure.
- Returns:
- defect_type: The type of defect as a string (interstitial, vacancy or substitution).
- bulk_site_idx: Index of the site in the bulk structure that corresponds to the defect site in the defect structure.
- defect_site_idx: Index of the defect site in the defect structure.
- unrelaxed_defect_structure: Pristine defect supercell structure for vacancies/substitutions (i.e. pristine bulk with unrelaxed vacancy/substitution), or the pristine bulk structure with the final relaxed interstitial site for interstitials.
- doped.utils.parsing.get_interstitial_site_and_orientational_degeneracy(interstitial_defect_entry: DefectEntry, dist_tol: float = 0.15) int [source]
Get the combined site and orientational degeneracy of an interstitial defect entry.
The standard approach of using _get_equiv_sites() for interstitial site multiplicity and then point_symmetry_from_defect_entry() & get_orientational_degeneracy for symmetry/orientational degeneracy is preferred (as used in the DefectParser code), but alternatively this function can be used to compute the product of the site and orientational degeneracies.
This is done by determining the number of equivalent sites in the bulk supercell for the given interstitial site (from defect_supercell_site), which gives the combined site and orientational degeneracy if there was no relaxation of the bulk lattice atoms. This matches the true combined degeneracy in most cases, except for split-interstitial type defects etc., where this would give an artificially high degeneracy. For example, the interstitial site is automatically assigned to one of the split-interstitial atoms and not the midpoint, giving a doubled degeneracy, as the two split-interstitial sites are counted as two separate (degenerate) interstitial sites instead of one. This is counteracted by dividing by the number of sites which are present in the defect supercell (within a distance tolerance of dist_tol in Å) with the same species, ensuring none of the predicted different equivalent sites are actually included in the defect structure.
- Parameters:
interstitial_defect_entry – DefectEntry object for the interstitial defect.
dist_tol – distance tolerance in Å for determining equivalent sites.
- Returns:
combined site and orientational degeneracy of the interstitial defect entry (int).
- doped.utils.parsing.get_locpot(locpot_path: str | PathLike)[source]
Read the LOCPOT(.gz) file as a pymatgen Locpot object.
- doped.utils.parsing.get_magnetization_from_vasprun(vasprun: Vasprun) int | float [source]
Determine the magnetization (number of spin-up vs spin-down electrons) from a Vasprun object.
- Parameters:
vasprun (Vasprun) – The Vasprun object from which to extract the total magnetization.
- Returns:
The total magnetization of the system.
- Return type:
int or float
- doped.utils.parsing.get_nelect_from_vasprun(vasprun: Vasprun) int | float [source]
Determine the number of electrons (NELECT) from a Vasprun object.
- Parameters:
vasprun (Vasprun) – The Vasprun object from which to extract NELECT.
- Returns:
The number of electrons in the system.
- Return type:
int or float
- doped.utils.parsing.get_neutral_nelect_from_vasprun(vasprun: Vasprun, skip_potcar_init: bool = False) int | float [source]
Determine the number of electrons (NELECT) from a Vasprun object, corresponding to a neutral charge state for the structure.
- Parameters:
vasprun (Vasprun) – The Vasprun object from which to extract NELECT.
skip_potcar_init (bool) – Whether to skip the initialisation of the POTCAR statistics (i.e. the auto-charge determination) and instead try to reverse engineer NELECT using the DefectDictSet.
- Returns:
The number of electrons in the system for a neutral charge state.
- Return type:
int or float
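The two NELECT values relate to the defect charge state in the usual way: removing electrons relative to the neutral cell gives a positive charge. The sketch below illustrates that arithmetic only. The function name is hypothetical, and whether doped's defect_charge_from_vasprun performs exactly this calculation is an assumption.

```python
def charge_state_from_nelect(nelect: float, neutral_nelect: float) -> int:
    """Charge state implied by the electron count relative to the neutral cell."""
    # Fewer electrons than neutral gives a positive charge, more gives a negative charge.
    return round(neutral_nelect - nelect)


print(charge_state_from_nelect(nelect=255, neutral_nelect=256))
print(charge_state_from_nelect(nelect=258, neutral_nelect=256))
```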
- doped.utils.parsing.get_orientational_degeneracy(defect_entry: DefectEntry | None = None, relaxed_point_group: str | None = None, bulk_site_point_group: str | None = None, bulk_symm_ops: list | None = None, defect_symm_ops: list | None = None, symprec: float = 0.1) float [source]
Get the orientational degeneracy factor for a given relaxed DefectEntry, by supplying either the DefectEntry object or the bulk-site & relaxed defect point group symbols (e.g. “Td”, “C3v” etc).
If a DefectEntry is supplied (and the point group symbols are not), this is computed by determining the relaxed defect point symmetry and the (unrelaxed) bulk site symmetry, and then getting the ratio of their point group orders (equivalent to the ratio of partition functions or number of symmetry operations (i.e. degeneracy)).
For interstitials, the bulk site symmetry corresponds to the point symmetry of the interstitial site with no relaxation of the host structure, while for vacancies/substitutions it is simply the symmetry of their corresponding bulk site. This corresponds to the point symmetry of DefectEntry.defect, or calculation_metadata["bulk_site"]/["unrelaxed_defect_structure"].
Note: This tries to use the defect_entry.defect_supercell to determine the relaxed site symmetry. However, this is not guaranteed to work in all cases; namely for non-diagonal supercell expansions, or sometimes for non-scalar supercell expansion matrices (e.g. a 2x1x2 expansion), particularly with high-symmetry materials, which can mess up the periodicity of the cell. doped tries to automatically check if this is the case, and will warn you if so.
This can also be checked by running the following on your doped generated defects:
    from doped.generation import get_defect_name_from_entry

    for defect_name, defect_entry in defect_gen.items():
        print(
            defect_name,
            get_defect_name_from_entry(defect_entry, relaxed=False),
            get_defect_name_from_entry(defect_entry),
            "\n",
        )
And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry (and orientational degeneracy) - otherwise periodicity-breaking prevents this.
If periodicity-breaking prevents auto-symmetry determination, you can manually determine the relaxed defect and bulk-site point symmetries, and/or orientational degeneracy, from visualising the structures (e.g. using VESTA), and set the corresponding values in the calculation_metadata['relaxed point symmetry']/['bulk site symmetry'] and/or degeneracy_factors['orientational degeneracy'] attributes. get_orientational_degeneracy can be used to obtain the corresponding orientational degeneracy factor for given defect/bulk-site point symmetries. Note that the bulk-site point symmetry corresponds to that of DefectEntry.defect, or equivalently calculation_metadata["bulk_site"]/["unrelaxed_defect_structure"], which for vacancies/substitutions is the symmetry of the corresponding bulk site, while for interstitials it is the point symmetry of the final relaxed interstitial site when placed in the (unrelaxed) bulk structure. The degeneracy factor is used in the calculation of defect/carrier concentrations and Fermi level behaviour (see e.g. https://doi.org/10.1039/D2FD00043A & https://doi.org/10.1039/D3CS00432E).
- Parameters:
defect_entry (DefectEntry) – DefectEntry object. (Default = None)
relaxed_point_group (str) – Point group symmetry (e.g. “Td”, “C3v” etc) of the relaxed defect structure, if already calculated / manually determined. Default is None (automatically calculated by doped).
bulk_site_point_group (str) – Point group symmetry (e.g. “Td”, “C3v” etc) of the defect site in the bulk, if already calculated / manually determined. For vacancies/substitutions, this should match the site symmetry label from doped when generating the defect, while for interstitials it should be the point symmetry of the final relaxed interstitial site, when placed in the bulk structure. Default is None (automatically calculated by doped).
bulk_symm_ops (list) – List of symmetry operations of the defect_entry.bulk_supercell structure (used in determining the unrelaxed bulk site symmetry), to avoid re-calculating. Default is None (recalculates).
defect_symm_ops (list) – List of symmetry operations of the defect_entry.defect_supercell structure (used in determining the relaxed point symmetry), to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib to use when determining point symmetries and thus orientational degeneracies. Default is 0.1, which matches that used by the Materials Project and is larger than the pymatgen default of 0.01, to account for residual structural noise in relaxed defect supercells. You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.).
- Returns:
orientational degeneracy factor for the defect.
- Return type:
float
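The ratio of point group orders described above can be worked through with a small lookup table. This is a minimal sketch, not the doped implementation: the function name and the POINT_GROUP_ORDERS table are hypothetical and cover only a handful of point groups (the orders themselves are the standard group orders).

```python
# Number of symmetry operations (group order) for a few common point groups.
POINT_GROUP_ORDERS = {
    "C1": 1, "Cs": 2, "C2": 2, "C3": 3, "C2v": 4,
    "C3v": 6, "D2d": 8, "D3d": 12, "Td": 24, "Oh": 48,
}


def orientational_degeneracy_from_point_groups(relaxed_point_group: str, bulk_site_point_group: str) -> float:
    """Ratio of the bulk-site point group order to the relaxed defect point group order."""
    return POINT_GROUP_ORDERS[bulk_site_point_group] / POINT_GROUP_ORDERS[relaxed_point_group]


# A defect on a Td site that relaxes to C3v symmetry has 24 / 6 = 4 equivalent orientations.
print(orientational_degeneracy_from_point_groups("C3v", "Td"))
```

If the relaxed defect keeps the full site symmetry, the ratio is 1 and there is no orientational degeneracy.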
- doped.utils.parsing.get_outcar(outcar_path: str | PathLike)[source]
Read the OUTCAR(.gz) file as a pymatgen Outcar object.
- doped.utils.parsing.get_procar(procar_path: str | PathLike)[source]
Read the PROCAR(.gz) file as an easyunfold Procar object (if easyunfold is installed), else a pymatgen Procar object (doesn’t support SOC).
If easyunfold is installed, the Procar will be parsed with easyunfold and then the proj_data attribute will be converted to a data attribute (to be compatible with pydefect, which uses the pymatgen format).
- doped.utils.parsing.get_site_mapping_indices(struct1: Structure, struct2: Structure, species=None, allow_duplicates: bool = False, threshold: float = 2.0, dists_only: bool = False)[source]
Get the site mapping indices between two structures (from struct1 to struct2), based on the fractional coordinates of the sites.
The template structure may have a different species ordering to the input_structure.
NOTE: This assumes that both structures have the same lattice definitions (i.e. that they match, and aren’t rigidly translated/rotated with respect to each other), which is mostly the case unless we have a mismatching defect/bulk supercell (in which case the check_atom_mapping_far_from_defect warning should be thrown anyway during parsing). Currently this function is only used for analysing site displacements in the displacements module so this is fine (user will already have been warned at this point if there is a possible mismatch).
- Parameters:
struct1 (Structure) – The input structure.
struct2 (Structure) – The template structure.
species (str) – If provided, only sites of this species will be considered when matching sites. Default is None (all species).
allow_duplicates (bool) – If True, allow multiple sites in struct1 to be matched to the same site in struct2. Default is False.
threshold (float) – If the distance between a pair of matched sites is larger than this, then a warning will be thrown. Default is 2.0 Å.
dists_only (bool) – Whether to return only the distances between matched sites, rather than a list of lists containing the distance, index in struct1 and index in struct2. Default is False.
- Returns:
A list of lists containing the distance, index in struct1 and index in struct2 for each matched site. If dists_only is True, then only the distances between matched sites are returned.
- Return type:
list
- doped.utils.parsing.get_vasprun(vasprun_path: str | PathLike, **kwargs)[source]
Read the vasprun.xml(.gz) file as a pymatgen Vasprun object.
- doped.utils.parsing.get_wigner_seitz_radius(lattice: Structure | Lattice) float [source]
Calculates the Wigner-Seitz radius of the structure, which corresponds to the maximum radius of a sphere fitting inside the cell.
Uses the calc_max_sphere_radius function from pydefect, with a wrapper to avoid unnecessary logging output and warning suppression from vise.
- Parameters:
lattice (Union[Structure,Lattice]) – The lattice of the structure (either a pymatgen Structure or Lattice object).
- Returns:
The Wigner-Seitz radius of the structure.
- Return type:
float
- doped.utils.parsing.parse_projected_eigen_no_mag(elem)[source]
Parse the projected eigenvalues from a Vasprun object (used during initialisation), but excluding the projected magnetisation for efficiency.
This is a modified version of _parse_projected_eigen from pymatgen.io.vasp.outputs.Vasprun, which skips parsing of the projected magnetisation in order to expedite parsing in doped, as well as some small adjustments to maximise efficiency.
- doped.utils.parsing.reorder_s1_like_s2(s1_structure: Structure, s2_structure: Structure, threshold=5.0) Structure [source]
Reorder the atoms of a (relaxed) structure, s1, to match the ordering of the atoms in s2_structure.
The s1/s2 structures may have different species orderings.
Previously used to ensure correct site matching when pulling site potentials for the eFNV Kumagai correction, though no longer used for this purpose. If threshold is set to a low value, it will raise a warning if there is a large site displacement detected.
NOTE: This assumes that both structures have the same lattice definitions (i.e. that they match, and aren’t rigidly translated/rotated with respect to each other), which is mostly the case unless we have a mismatching defect/bulk supercell (in which case the check_atom_mapping_far_from_defect warning should be thrown anyway during parsing). Currently this function is no longer used, but if it is reintroduced at any point, this point should be noted!
- Parameters:
s1_structure (Structure) – The input structure.
s2_structure (Structure) – The template structure.
threshold (float) – If the distance between a pair of matched sites is larger than this, then a warning will be thrown. Default is 5.0 Å.
- Returns:
The reordered structure.
- Return type:
Structure
- doped.utils.parsing.simple_spin_degeneracy_from_charge(structure, charge_state: int = 0) int [source]
Get the defect spin degeneracy from the supercell and charge state, assuming either simple singlet (S=0) or doublet (S=1/2) behaviour.
Even-electron defects are assumed to have a singlet ground state, and odd-electron defects are assumed to have a doublet ground state.
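The even/odd electron counting rule can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the supercell is represented simply as a list of atomic numbers rather than a pymatgen Structure, and all electrons are counted, which gives the same parity as the valence electron count whenever the excluded core electrons are paired.

```python
def simple_spin_degeneracy(atomic_numbers, charge_state: int = 0) -> int:
    """Return 1 (singlet, S=0) for an even electron count, else 2 (doublet, S=1/2)."""
    n_electrons = sum(atomic_numbers) - charge_state
    return 1 if n_electrons % 2 == 0 else 2


# Hypothetical Cd vacancy in a 64-atom CdTe supercell: 31 Cd (Z=48) and 32 Te (Z=52).
supercell = [48] * 31 + [52] * 32
for charge in (0, -1, -2):
    print(charge, simple_spin_degeneracy(supercell, charge))
```

This simple rule does not capture higher-spin ground states (e.g. triplets), which would need to be set manually.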
doped.utils.plotting module
Code to analyse VASP defect calculations.
These functions are built from a combination of useful modules from pymatgen and AIDE (by Adam Jackson and Alex Ganose), with substantial modifications, in an effort to make an efficient, user-friendly package for managing and analysing defect calculations, with publication-quality outputs.
- doped.utils.plotting.format_defect_name(defect_species: str, include_site_info_in_name: bool = False, wout_charge: bool = False) str | None [source]
Format defect name for plot titles (i.e. from "Cd_i_C3v_0" to "$Cd_{i}^{0}$" or "$Cd_{i_{C3v}}^{0}$"). Note this assumes “V_…” means vacancy, not vanadium.
- Parameters:
defect_species (str) – Name of defect including charge state (e.g. "Cd_i_C3v_0").
include_site_info_in_name (bool) – Whether to include site info in name (e.g. "$Cd_{i}^{0}$" or "$Cd_{i_{C3v}}^{0}$"). Defaults to False.
wout_charge (bool, optional) – Whether to exclude the charge state from the formatted defect_species name. Defaults to False.
- Returns:
formatted defect name
- Return type:
str
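The documented example can be reproduced with a deliberately simplified formatter. This sketch only handles names of the form element_site[_siteinfo]_charge and is not the doped implementation: the function name is hypothetical, and the special handling of vacancies, charge signs and unrecognised names is omitted.

```python
def sketch_format_defect_name(defect_species: str, include_site_info_in_name: bool = False, wout_charge: bool = False) -> str:
    """Format e.g. 'Cd_i_C3v_0' as a LaTeX-style string for plot titles."""
    element, site, *site_info, charge = defect_species.split("_")
    subscript = site
    if include_site_info_in_name and site_info:
        # Nest the site information as a second-level subscript.
        subscript += "_{" + "_".join(site_info) + "}"
    formatted = "$" + element + "_{" + subscript + "}"
    if not wout_charge:
        formatted += "^{" + charge + "}"
    return formatted + "$"


print(sketch_format_defect_name("Cd_i_C3v_0"))
print(sketch_format_defect_name("Cd_i_C3v_0", include_site_info_in_name=True))
```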
- doped.utils.plotting.formation_energy_plot(defect_thermodynamics: DefectThermodynamics, dft_chempots: dict | None = None, el_refs: dict | None = None, chempot_table: bool = True, all_entries: bool | str = False, xlim: tuple[float, float] | None = None, ylim: tuple[float, float] | None = None, fermi_level: float | None = None, include_site_info: bool = False, title: str | None = None, colormap: str | Colormap | None = None, linestyles: str | list[str] = '-', auto_labels: bool = False, filename: str | PathLike | None = None)[source]
Produce defect formation energy vs Fermi energy plot.
- Parameters:
defect_thermodynamics (DefectThermodynamics) – DefectThermodynamics object containing defect entries to plot.
dft_chempots (dict) – Dictionary of {Element: value} giving the chemical potential of each element.
el_refs (dict) – Dictionary of {Element: value} giving the reference energy of each element.
chempot_table (bool) – Whether to print the chemical potential table above the plot. (Default: True)
all_entries (bool, str) – Whether to plot the formation energy lines of all defect entries, rather than the default of showing only the equilibrium states at each Fermi level position (traditional). If instead set to “faded”, will plot the equilibrium states in bold, and all unstable states in faded grey. (Default: False)
xlim – Tuple (min,max) giving the range of the x-axis (Fermi level). May want to set manually when including transition level labels, to avoid crossing the axes. Default is to plot from -0.3 eV to 0.3 eV above the band gap (i.e. from 0.3 eV below the VBM to 0.3 eV above the CBM).
ylim – Tuple (min,max) giving the range for the y-axis (formation energy). May want to set manually when including transition level labels, to avoid crossing the axes. Default is from 0 to just above the maximum formation energy value in the band gap.
fermi_level (float) – If set, plots a dashed vertical line at this Fermi level value, typically used to indicate the equilibrium Fermi level position (e.g. calculated with py-sc-fermi). (Default: None)
include_site_info (bool) – Whether to include site info in defect names in the plot legend (e.g. $Cd_{i_{C3v}}^{0}$ rather than $Cd_{i}^{0}$). Default is False, where site info is not included unless we have inequivalent sites for the same defect type. If, even with site info added, there are duplicate defect names, then “-a”, “-b”, “-c” etc are appended to the names to differentiate.
title (str) – Title for the plot. (Default: None)
colormap (str, matplotlib.colors.Colormap) – Colormap to use for the formation energy lines, either as a string (which can be a colormap name from https://matplotlib.org/stable/users/explain/colors/colormaps or from https://www.fabiocrameri.ch/colourmaps – append ‘S’ if using a sequential colormap from the latter) or a Colormap / ListedColormap object. If None (default), uses tab10 with alpha=0.75 (if 10 or fewer lines to plot), tab20 (if 20 or fewer lines) or batlow (if more than 20 lines).
linestyles (list) – Linestyles to use for the formation energy lines, either as a single linestyle (str) or list of linestyles (list[str]) in the order of appearance of lines in the plot legend. Default is "-"; i.e. solid linestyle for all entries.
auto_labels (bool) – Whether to automatically label the transition levels with their charge states. If there are many transition levels, this can be quite ugly. (Default: False)
filename (PathLike) – Filename to save the plot to. (Default: None (not saved))
- Returns:
matplotlib Figure object.
- doped.utils.plotting.get_colormap(colormap: str | Colormap | None = None, default: str = 'batlow') Colormap [source]
Get a colormap from a string or a Colormap object.
If _alpha_X is in the colormap name, sets the alpha value to X (0-1).
cmcrameri colour maps citation: https://zenodo.org/records/8409685
- Parameters:
colormap (str, matplotlib.colors.Colormap) – Colormap to use, either as a string (which can be a colormap name from https://www.fabiocrameri.ch/colourmaps or https://matplotlib.org/stable/users/explain/colors/colormaps), or a Colormap / ListedColormap object. If None (default), uses default colormap (which is "batlow" by default). Append “S” to the colormap name if using a sequential colormap from https://www.fabiocrameri.ch/colourmaps.
default (str) – Default colormap to use if colormap is None. Defaults to "batlow" from https://www.fabiocrameri.ch/colourmaps.
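The _alpha_X naming convention can be illustrated by splitting the colormap name on that marker. This is a sketch of the string handling only: the helper name split_alpha_suffix is hypothetical, and how doped then applies the alpha value to the colormap is not shown.

```python
def split_alpha_suffix(colormap_name: str):
    """Split e.g. 'tab10_alpha_0.75' into ('tab10', 0.75); alpha is None if absent."""
    name, marker, alpha = colormap_name.partition("_alpha_")
    if not marker:
        return colormap_name, None
    return name, float(alpha)


print(split_alpha_suffix("tab10_alpha_0.75"))
print(split_alpha_suffix("batlow"))
```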
- doped.utils.plotting.get_legend_font_size() float [source]
Convenience function to get the current matplotlib legend font size, in points (pt).
- Returns:
Current legend font size in points (pt).
- Return type:
float
- doped.utils.plotting.get_linestyles(linestyles: str | list[str] = '-', num_lines: int = 1) list[str] [source]
Get a list of linestyles to use for plotting, from a string or list of strings (linestyles).
If a list is provided which doesn’t match the number of lines, the list is repeated until it does.
- Parameters:
linestyles (str, list[str]) – Linestyles to use for plotting. If a string, uses that linestyle for all lines. If a list, uses each linestyle in the list for each line. Defaults to "-".
num_lines (int) – Number of lines to plot (and thus number of linestyles to output in list). Defaults to 1.
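The repetition behaviour described above can be sketched with the standard library. The function name repeat_linestyles is hypothetical, and truncating a list that is longer than num_lines is an assumption of this sketch rather than documented behaviour.

```python
from itertools import cycle, islice
from typing import List, Union


def repeat_linestyles(linestyles: Union[str, List[str]] = "-", num_lines: int = 1) -> List[str]:
    """Return exactly num_lines linestyles (illustrative sketch, not the doped source)."""
    if isinstance(linestyles, str):
        # A single linestyle is used for every line
        return [linestyles] * num_lines
    # Cycle through the list until num_lines entries have been produced
    return list(islice(cycle(linestyles), num_lines))


print(repeat_linestyles("-", 3))
print(repeat_linestyles(["-", "--"], 5))
```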
- doped.utils.plotting.plot_chemical_potential_table(ax: Axes, dft_chempots: dict[str, float], cellLoc: str = 'left', el_refs: dict[str, float] | None = None) table [source]
Plot a table of chemical potentials above the plot in ax.
- Parameters:
ax (plt.Axes) – Axes object to plot the table in.
dft_chempots (dict) – Dictionary of chemical potentials of the form {Element: value}.
cellLoc (str) – Alignment of text in cells. Default is “left”.
el_refs (dict) – Dictionary of elemental reference energies of the form {Element: value}. If provided, the chemical potentials are given with respect to these reference energies.
- Returns:
The matplotlib.table.Table object (which has been added to the ax object).
doped.utils.stenciling module
doped.utils.supercells module
Utility code and functions for generating defect supercells.
- doped.utils.supercells.cell_metric(cell_matrix: ndarray, target: str = 'SC', rms: bool = True, eff_cubic_length: float | None = None) float [source]
Calculates the deviation of the given cell matrix from an ideal simple cubic (if target = “SC”) or face-centred cubic (if target = “FCC”) matrix, by evaluating the root mean square (RMS) difference of the vector lengths from that of the idealised values (i.e. the corresponding SC/FCC lattice vector lengths for the given cell volume).
For target = “SC”, the idealised lattice vector length is the effective cubic length (i.e. the cube root of the volume), while for “FCC” it is 2^(1/6) (~1.12) times the effective cubic length. This is a fixed version of the cell metric function in ASE (get_deviation_from_optimal_cell_shape), described in https://wiki.fysik.dtu.dk/ase/tutorials/defects/defects.html, which currently does not account for rotated matrices (e.g. a cubic cell with target = “SC”, which should have a perfect score of 0, will have a bad score if its lattice vectors are rotated away from x, y and z, or even if they are just swapped as z, x, y). For example, with ASE, [[1, 0, 0], [0, 1, 0], [0, 0, 1]] and [[0, 0, 1], [1, 0, 0], [0, 1, 0]] give scores of 0 and 1, but with this function they both give perfect scores of 0 as desired.
- Parameters:
cell_matrix (np.ndarray) – Cell matrix for which to calculate the cell metric.
target (str) – Target cell shape, from which to calculate the normalised deviation score. Either “SC” for simple cubic or “FCC” for face-centred cubic. Default = “SC”
rms (bool) – Whether to return the root mean square (RMS) difference of the vector lengths from that of the idealised values (default), or just the mean square difference (to reduce computation time when scanning over many possible matrices). Default = True
eff_cubic_length (float) – Effective cubic length of the cell matrix (to reduce computation time during looping). Default = None
- Returns:
Cell metric (0 is perfect score)
- Return type:
float
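To make the rotational invariance concrete, here is a minimal pure-Python sketch of a metric based only on lattice vector lengths. The function names are hypothetical, and normalising each deviation by the ideal length and averaging over the three vectors are assumptions of this sketch; the actual doped implementation may differ in those details.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested sequences."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def length_based_cell_metric(cell_matrix, target="SC", rms=True):
    """Deviation of the lattice vector lengths from the ideal SC/FCC length (sketch)."""
    eff_cubic_length = abs(det3(cell_matrix)) ** (1 / 3)
    if target.upper() == "SC":
        ideal = eff_cubic_length
    elif target.upper() == "FCC":
        ideal = eff_cubic_length * 2 ** (1 / 6)
    else:
        raise ValueError(f"Unknown target: {target}")
    # Only vector lengths enter, so rotating or permuting the rows changes nothing
    deviations = [(math.sqrt(sum(x * x for x in row)) - ideal) / ideal for row in cell_matrix]
    mean_square = sum(dev * dev for dev in deviations) / len(deviations)
    return math.sqrt(mean_square) if rms else mean_square


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
permuted = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
fcc_primitive = [[0, 0.5, 0.5], [0.5, 0, 0.5], [0.5, 0.5, 0]]
print(length_based_cell_metric(identity), length_based_cell_metric(permuted))
print(length_based_cell_metric(fcc_primitive, target="FCC"))
```

Both the identity matrix and its row-permuted form score zero here, because the metric never refers to the Cartesian axes.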
- doped.utils.supercells.find_ideal_supercell(cell: ndarray, target_size: int, limit: int = 2, clean: bool = True, return_min_dist: bool = False, verbose: bool = False) ndarray | tuple [source]
Given an input cell matrix (e.g. Structure.lattice.matrix or Atoms.cell) and chosen target_size (size of supercell in number of cells), finds an ideal supercell matrix (P) that yields the largest minimum image distance (i.e. minimum distance between periodic images of sites in a lattice), while also being as close to cubic as possible.
Supercell matrices are searched for by first identifying the ideal (fractional) transformation matrix (P) that would yield a perfectly cubic supercell with volume equal to target_size, and then scanning over all matrices where the elements are within +/-limit of the ideal P matrix elements (rounded to the nearest integer). For relatively small target_size values (<100) and/or cells with mostly similar lattice vector lengths, the default limit of +/-2 performs very well. For larger target_size values, cells with very different lattice vector lengths, and/or cases where small differences in minimum image distance are very important, a larger limit may be required (though this typically only improves the minimum image distance by 1-6%).
This is also known as the Shortest Vector Problem (SVP) (https://wikipedia.org/wiki/Lattice_problem#Shortest_vector_problem_(SVP)), which has no known analytical solution and requires enumeration-type approaches, so it can be slow for certain cases.
Note that this function is used by default to generate defect supercells with the doped DefectsGenerator class, unless specific supercell settings are used.
- Parameters:
cell (np.ndarray) – Unit cell matrix for which to find a supercell.
target_size (int) – Target supercell size (in number of cells).
limit (int) – Supercell matrices are searched for by first identifying the ideal (fractional) transformation matrix (P) that would yield a perfectly SC/FCC supercell with volume equal to target_size, and then scanning over all matrices where the elements are within +/-limit of the ideal P matrix elements (rounded to the nearest integer). (Default = 2)
clean (bool) – Whether to return the supercell matrix which gives the ‘cleanest’ supercell (according to _lattice_matrix_sort_func; most symmetric, with mostly positive diagonals and c >= b >= a). (Default = True)
return_min_dist (bool) – Whether to return the minimum image distance (in Å) as a second return value. (Default = False)
verbose (bool) – Whether to print out extra information. (Default = False)
- Returns:
Supercell matrix (P), and the minimum image distance (in Å, as a float) if return_min_dist is True.
- Return type:
np.ndarray
- doped.utils.supercells.find_optimal_cell_shape(cell: ndarray, target_size: int, target_shape: str = 'SC', limit: int = 2, return_score: bool = False, verbose: bool = False) ndarray | tuple [source]
Find the transformation matrix that produces a supercell corresponding to target_size unit cells that most closely approximates the shape defined by target_shape.
This is an updated version of ASE’s find_optimal_cell_shape() function, but fixed to be rotationally-invariant (explained below), with significant efficiency improvements, and then secondarily sorted by the (fixed) cell metric (in doped), and then by some other criteria to give the cleanest output.
Finds the optimal supercell transformation matrix by calculating the deviation of the possible supercell matrices from an ideal simple cubic (if target = “SC”) or face-centred cubic (if target = “FCC”) matrix - and then taking that with the best (lowest) score, by evaluating the root mean square (RMS) difference of the vector lengths from that of the idealised values (i.e. the corresponding SC/FCC lattice vector lengths for the given cell volume).
For target = “SC”, the idealised lattice vector length is the effective cubic length (i.e. the cube root of the volume), while for “FCC” it is 2^(1/6) (~1.12) times the effective cubic length. The get_deviation_from_optimal_cell_shape function in ASE, described in https://wiki.fysik.dtu.dk/ase/tutorials/defects/defects.html, currently does not account for rotated matrices (e.g. a cubic cell with target = “SC”, which should have a perfect score of 0, will have a bad score if its lattice vectors are rotated away from x, y and z, or even if they are just swapped as z, x, y). For example, with ASE, [[1, 0, 0], [0, 1, 0], [0, 0, 1]] and [[0, 0, 1], [1, 0, 0], [0, 1, 0]] give scores of 0 and 1, but with this function they both give perfect scores of 0 as desired.
- Parameters:
cell (np.ndarray) – Unit cell matrix for which to find a supercell transformation.
target_size (int) – Target supercell size (in number of cells).
target_shape (str) – Target cell shape, from which to calculate the normalised deviation score. Either “SC” for simple cubic or “FCC” for face-centred cubic. Default = “SC”
limit (int) – Supercell matrices are searched for by first identifying the ideal (fractional) transformation matrix (P) that would yield a perfectly SC/FCC supercell with volume equal to target_size, and then scanning over all matrices where the elements are within +/-limit of the ideal P matrix elements (rounded to the nearest integer). (Default = 2)
return_score (bool) – Whether to return the cell metric score as a second return value. (Default = False)
verbose (bool) – Whether to print out extra information. (Default = False)
- Returns:
Supercell matrix (P), and the cell metric (as a float; 0 is a perfect score) if return_score is True.
- Return type:
np.ndarray
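The search strategy described for the limit parameter (round the ideal transformation matrix, then scan integer matrices within +/-limit of it) can be sketched in pure Python as follows. All function names are hypothetical, the score is a simple length-based simple cubic metric chosen for this example, and the sketch omits the efficiency improvements and secondary sorting mentioned above.

```python
import math
from itertools import product


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = det3(m)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def matmul3(p, q):
    return [[sum(p[r][k] * q[k][c] for k in range(3)) for c in range(3)] for r in range(3)]


def sc_score(cell):
    """RMS relative deviation of the vector lengths from the effective cubic length."""
    ideal = abs(det3(cell)) ** (1 / 3)
    norms = [math.sqrt(sum(x * x for x in row)) for row in cell]
    return math.sqrt(sum((n / ideal - 1) ** 2 for n in norms) / 3)


def search_sc_supercell(cell, target_size, limit=1):
    """Scan integer matrices within +/-limit of the rounded ideal P (illustrative sketch)."""
    # Ideal (fractional) P maps the cell onto a perfect cube: P @ cell = L * identity
    target_length = (target_size * abs(det3(cell))) ** (1 / 3)
    start = [round(target_length * x) for row in inv3(cell) for x in row]
    best_P, best_score = None, math.inf
    for offsets in product(range(-limit, limit + 1), repeat=9):
        flat = [s + o for s, o in zip(start, offsets)]
        P = [flat[0:3], flat[3:6], flat[6:9]]
        if det3(P) != target_size:  # keep only supercells of the requested size
            continue
        score = sc_score(matmul3(P, cell))
        if score < best_score - 1e-12:
            best_P, best_score = P, score
    return best_P, best_score


fcc_primitive = [[0, 0.5, 0.5], [0.5, 0, 0.5], [0.5, 0.5, 0]]
P, score = search_sc_supercell(fcc_primitive, target_size=4)
print(P, score)
```

With limit=1 the loop visits 3^9 = 19683 candidate matrices; since the count grows as (2 * limit + 1)^9, larger limits quickly become expensive.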
- doped.utils.supercells.get_min_image_distance(structure: Structure) float [source]
Get the minimum image distance (i.e. minimum distance between periodic images of sites in a lattice) for the input structure.
This is also known as the Shortest Vector Problem (SVP), and has no known analytical solution, requiring enumeration type approaches. (https://wikipedia.org/wiki/Lattice_problem#Shortest_vector_problem_(SVP))
- Parameters:
structure (Structure) – Structure object.
- Returns:
Minimum image distance.
- Return type:
float
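The enumeration idea can be sketched as a brute-force search over integer combinations of the lattice vectors. The function name and the fixed search_range are assumptions of this example; a fixed range is not guaranteed to find the shortest vector for strongly skewed cells, and doped's own implementation is not reproduced here.

```python
import math
from itertools import product


def brute_force_min_image_distance(lattice, search_range=2):
    """Shortest non-zero lattice vector length, by enumeration (illustrative sketch)."""
    shortest = math.inf
    for n in product(range(-search_range, search_range + 1), repeat=3):
        if n == (0, 0, 0):
            continue  # the zero vector is the site itself, not a periodic image
        vec = [sum(n[k] * lattice[k][c] for k in range(3)) for c in range(3)]
        shortest = min(shortest, math.sqrt(sum(x * x for x in vec)))
    return shortest


cubic = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
skewed = [[1, 0, 0], [0.9, 0.1, 0], [0, 0, 1]]
print(brute_force_min_image_distance(cubic))
print(brute_force_min_image_distance(skewed))
```

The skewed cell shows why the shortest lattice vector length alone is not enough: the difference of its first two lattice vectors is much shorter than either of them.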
- doped.utils.supercells.get_pmg_cubic_supercell_dict(struct: Structure, uc_range: tuple = (1, 200)) dict [source]
Get a dictionary of (near-)cubic supercell matrices for the given structure and range of numbers of unit cells (in the supercell).
Returns a dictionary of format:
{Number of Unit Cells: {"P": transformation matrix, "min_dist": minimum image distance} }
for (near-)cubic supercells generated by the pymatgen CubicSupercellTransformation class. If a (near-)cubic supercell cannot be found for a given number of unit cells, then the corresponding dict value will be set to an empty dict.
- Parameters:
struct (Structure) – pymatgen Structure object to generate supercells for
uc_range (tuple) – Range of numbers of unit cells to search over
- Returns:
{Number of Unit Cells: {"P": transformation matrix, "min_dist": minimum image distance}}
- Return type:
dict
- doped.utils.supercells.min_dist(structure: Structure, ignored_species: list[str] | None = None) float [source]
Return the minimum interatomic distance in a structure.
Uses numpy vectorisation for fast computation.
- Parameters:
structure (Structure) – The structure to check.
ignored_species (list[str]) – A list of species symbols to ignore when calculating the minimum interatomic distance. Default is None.
- Returns:
The minimum interatomic distance in the structure.
- Return type:
float
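For reference, the quantity being computed can be written out with plain loops. This sketch is deliberately unvectorised, uses a hypothetical function name, takes the lattice, species and fractional coordinates as plain lists instead of a Structure, assumes fractional coordinates within [0, 1), and only checks the 27 nearest periodic images.

```python
import math
from itertools import product


def min_interatomic_distance(lattice, species, frac_coords, ignored_species=None):
    """Minimum distance between distinct sites under periodic boundaries (sketch)."""
    ignored = set(ignored_species or [])
    sites = [fc for sp, fc in zip(species, frac_coords) if sp not in ignored]
    images = list(product((-1, 0, 1), repeat=3))
    shortest = math.inf
    for i, site_a in enumerate(sites):
        for site_b in sites[i + 1:]:
            for shift in images:
                # Fractional separation to this periodic image of site_b
                dfrac = [site_b[k] + shift[k] - site_a[k] for k in range(3)]
                cart = [sum(dfrac[k] * lattice[k][c] for k in range(3)) for c in range(3)]
                shortest = min(shortest, math.sqrt(sum(x * x for x in cart)))
    return shortest


lattice = [[4, 0, 0], [0, 4, 0], [0, 0, 4]]
species = ["Na", "Cl", "H"]
frac_coords = [[0, 0, 0], [0.5, 0.5, 0.5], [0.1, 0, 0]]
print(min_interatomic_distance(lattice, species, frac_coords))
print(min_interatomic_distance(lattice, species, frac_coords, ignored_species=["H"]))
```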
doped.utils.symmetry module
Utility code and functions for symmetry analysis of structures and defects.
- doped.utils.symmetry.apply_symm_op_to_site(symm_op: SymmOp, site: PeriodicSite, fractional: bool = False, rotate_lattice: Lattice | bool = True, just_unit_cell_frac_coords: bool = False) PeriodicSite [source]
Apply the given symmetry operation to the input site (not in place) and return the new site.
By default, also rotates the lattice accordingly. If you want to apply the symmetry operation but keep the same lattice definition, set rotate_lattice=False.
- Parameters:
symm_op (SymmOp) – pymatgen SymmOp object.
site (PeriodicSite) – pymatgen PeriodicSite object.
fractional (bool) – Whether the SymmOp is in fractional or Cartesian (default) coordinates (i.e. to apply to site.frac_coords or site.coords). Default: False
rotate_lattice (Union[Lattice, bool]) – Either a pymatgen Lattice object (to use as the new lattice basis of the transformed site, which can be provided to reduce computation time when looping) or True/False. If True (default), the SymmOp rotation matrix will be applied to the input site lattice, or if False, the original lattice will be retained.
just_unit_cell_frac_coords (bool) – If True, just returns the fractional coordinates of the transformed site (rather than the site itself), within the unit cell. Default: False
- Returns:
pymatgen PeriodicSite object with the symmetry operation applied
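As a rough picture of the fractional case with the lattice kept fixed, a symmetry operation acts on fractional coordinates as a rotation matrix plus a translation, optionally wrapped back into the unit cell. The sketch below uses plain lists and a hypothetical function name; it does not use or reproduce the pymatgen SymmOp or PeriodicSite classes.

```python
def apply_fractional_symm_op(rotation, translation, frac_coords, to_unit_cell=True):
    """Return rotation @ frac_coords + translation as a new list (illustrative sketch)."""
    new_coords = [
        sum(rotation[r][c] * frac_coords[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if to_unit_cell:
        # Wrap each fractional coordinate back into [0, 1)
        new_coords = [x % 1.0 for x in new_coords]
    return new_coords


four_fold_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 degree rotation about z
original = [0.1, 0.2, 0.3]
print(apply_fractional_symm_op(four_fold_z, [0, 0, 0], original))
print(original)  # the input list is left unmodified
```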
- doped.utils.symmetry.apply_symm_op_to_struct(symm_op: SymmOp, struct: Structure, fractional: bool = False, rotate_lattice: bool = True) Structure [source]
Apply a symmetry operation to a structure and return the new structure.
This differs from pymatgen’s apply_operation method in that it does not apply the operation in place (i.e. does not modify the input structure), which avoids the use of unnecessary and slow Structure.copy() calls, making the structure manipulation / symmetry analysis functions more efficient. Also fixes an issue when applying fractional symmetry operations.
By default, also rotates the lattice accordingly. If you want to apply the symmetry operation to the sites but keep the same lattice definition, set rotate_lattice=False.
- Parameters:
symm_op – pymatgen SymmOp object.
struct – pymatgen Structure object.
fractional – Whether the SymmOp is in fractional or Cartesian (default) coordinates (i.e. to apply to site.frac_coords or site.coords). Default: False
rotate_lattice – Whether the lattice of the input structure should be rotated according to the symmetry operation. Default: True.
- Returns:
pymatgen Structure object with the symmetry operation applied.
- doped.utils.symmetry.cached_simplify(eq)[source]
Cached simplification function for sympy equations, for efficiency.
- doped.utils.symmetry.cached_solve(equation, variable)[source]
Cached solve function for sympy equations, for efficiency.
- doped.utils.symmetry.get_BCS_conventional_structure(structure, pbar=None, return_wyckoff_dict=False)[source]
Get the conventional crystal structure of the input structure, according to the Bilbao Crystallographic Server (BCS) definition. Also returns the transformation matrix from the spglib (SpacegroupAnalyzer) conventional structure definition to the BCS definition.
- Parameters:
structure (Structure) – pymatgen Structure object for which to get the corresponding BCS conventional crystal structure.
pbar (ProgressBar) – tqdm progress bar object, to update progress.
return_wyckoff_dict (bool) – Whether to return the Wyckoff label dict ({Wyckoff label: coordinates}).
- Returns:
pymatgen Structure object and spglib -> BCS conv cell transformation matrix.
- doped.utils.symmetry.get_clean_structure(structure: Structure, return_T: bool = False, dist_precision: float = 0.001, niggli_reduce: bool = True)[source]
Get a ‘clean’ version of the input structure by searching over equivalent cells, and finding the optimal one according to _lattice_matrix_sort_func (most symmetric, with mostly positive diagonals and c >= b >= a).
- Parameters:
structure (Structure) – Structure object.
return_T (bool) – Whether to return the transformation matrix from the original structure lattice to the new structure lattice (T * Orig = New). (Default = False)
dist_precision (float) – The desired distance precision in Å for rounding of lattice parameters and fractional coordinates. (Default: 0.001)
niggli_reduce (bool) – Whether to Niggli reduce the lattice before searching for the optimal lattice matrix. If this is set to False, we also skip the search for the best positive determinant lattice matrix. (Default: True)
- doped.utils.symmetry.get_conv_cell_site(defect_entry)[source]
Gets an equivalent site of the defect entry in the conventional structure of the host material. If the conventional_structure attribute is not defined for defect_entry, then it is generated using SpacegroupAnalyzer and then reoriented to match the Bilbao Crystallographic Server’s conventional structure definition.
- Parameters:
defect_entry – DefectEntry object.
- doped.utils.symmetry.get_primitive_structure(structure: Structure, ignored_species: list | None = None, clean: bool = True, return_all: bool = False, **kwargs)[source]
Get a consistent/deterministic primitive structure from a pymatgen Structure.
For some materials (e.g. zinc blende), there are multiple equivalent primitive cells (e.g. Cd (0,0,0) & Te (0.25,0.25,0.25); Cd (0,0,0) & Te (0.75,0.75,0.75) for F-43m CdTe), so for reproducibility and in line with most structure conventions/definitions, take the one with the cleanest lattice and structure definition, according to struct_sort_func.
If ignored_species is set, then the sorting function used to determine the ideal primitive structure will ignore sites with species in ignored_species.
- Parameters:
structure – Structure object to get the corresponding primitive structure of.
ignored_species – List of species to ignore when determining the ideal primitive structure. (Default: None)
clean – Whether to return a ‘clean’ version of the primitive structure, with the lattice matrix in a standardised form. (Default: True)
return_all – Whether to return all possible primitive structures tested, sorted by the sorting function. (Default: False)
**kwargs – Additional keyword arguments to pass to the get_sga function (e.g. symprec etc).
- doped.utils.symmetry.get_sga(struct: Structure, symprec: float = 0.01, return_symprec: bool = False)[source]
Get a SpacegroupAnalyzer object of the input structure, dynamically adjusting symprec if needed.
- Parameters:
struct (Structure) – The input structure.
symprec (float) – The symmetry precision to use (default: 0.01).
return_symprec (bool) – Whether to return the final symprec used (default: False).
- Returns:
The symmetry analyzer object. If return_symprec is True, returns a tuple of the symmetry analyzer object and the final symprec used.
- Return type:
SpacegroupAnalyzer
- doped.utils.symmetry.get_spglib_conv_structure(sga)[source]
Get a consistent/deterministic conventional structure from a SpacegroupAnalyzer object. Also returns the corresponding SpacegroupAnalyzer (for getting Wyckoff symbols corresponding to this conventional structure definition).
For some materials (e.g. zinc blende), there are multiple equivalent primitive/conventional cells, so for reproducibility and in line with most structure conventions/definitions, take the one with the lowest summed norm of the fractional coordinates of the sites (i.e. favour Cd (0,0,0) and Te (0.25,0.25,0.25) over Cd (0,0,0) and Te (0.75,0.75,0.75) for F-43m CdTe; SGN 216).
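The selection criterion (lowest summed norm of the fractional coordinates) is simple to sketch on plain coordinate lists. The function names are hypothetical, and rounding the key to six decimals to avoid floating-point ties is an assumption of this example, not a documented detail.

```python
import math


def summed_frac_norm(frac_coords):
    """Sum of the Euclidean norms of the fractional coordinates of all sites."""
    return sum(math.sqrt(sum(x * x for x in site)) for site in frac_coords)


def pick_lowest_norm_setting(candidates):
    """Choose the candidate setting with the lowest summed norm (illustrative sketch)."""
    return min(candidates, key=lambda coords: round(summed_frac_norm(coords), 6))


# Two equivalent settings for zinc blende CdTe: Cd at the origin, Te at 1/4 or 3/4
setting_a = [[0, 0, 0], [0.75, 0.75, 0.75]]
setting_b = [[0, 0, 0], [0.25, 0.25, 0.25]]
print(pick_lowest_norm_setting([setting_a, setting_b]))
```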
- doped.utils.symmetry.get_wyckoff(frac_coords, struct, symm_ops: list | None = None, equiv_sites=False, symprec=0.01)[source]
Get the Wyckoff label of the input fractional coordinates in the input structure. If the symmetry operations of the structure have already been computed, these can be input as a list to speed up the calculation.
- Parameters:
frac_coords – Fractional coordinates of the site to get the Wyckoff label of.
struct – pymatgen Structure object to which frac_coords corresponds.
symm_ops – List of pymatgen SymmOps of the structure. If None (default), will recompute these from the input struct.
equiv_sites – If True, also returns a list of equivalent sites in struct.
symprec – Symmetry precision for SpacegroupAnalyzer.
- doped.utils.symmetry.get_wyckoff_dict_from_sgn(sgn)[source]
Get dictionary of {Wyckoff label: coordinates} for a given space group number.
The database used here for Wyckoff analysis (wyckpos.dat) was obtained from code written by JaeHwan Shim @schinavro (ORCID: 0000-0001-7575-4788) (https://gitlab.com/ase/ase/-/merge_requests/1035) based on the tabulated datasets in https://github.com/xtalopt/randSpg (also found at https://github.com/spglib/spglib/blob/develop/database/Wyckoff.csv). By default, however, doped uses the Wyckoff functionality of spglib (along with symmetry operations in pymatgen) when possible.
- doped.utils.symmetry.get_wyckoff_label_and_equiv_coord_list(defect_entry=None, conv_cell_site=None, sgn=None, wyckoff_dict=None)[source]
Return the Wyckoff label and list of equivalent fractional coordinates within the conventional cell for the input defect_entry or conv_cell_site (whichever is provided, defaults to defect_entry if both), given a dictionary of Wyckoff labels and coordinates (wyckoff_dict).
If wyckoff_dict is not provided, it is generated from the spacegroup number (sgn) using get_wyckoff_dict_from_sgn(sgn). If sgn is not provided, it is obtained from the bulk structure of the defect_entry if provided.
- doped.utils.symmetry.group_order_from_schoenflies(sch_symbol)[source]
Return the order of the point group from the Schoenflies symbol.
Useful for symmetry and orientational degeneracy analysis.
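For orientation, the orders of the crystallographic point groups follow directly from their Schoenflies symbols (C_n has order n, C_nv and C_nh have 2n, D_n has 2n, D_nh and D_nd have 4n, and so on). The sketch below encodes those standard rules under a hypothetical function name; it is not the doped implementation and does not cover icosahedral or linear groups.

```python
import re

# Groups without a numeric axis order in their symbol
SPECIAL_ORDERS = {"C1": 1, "Cs": 2, "Ci": 2, "T": 12, "Td": 24, "Th": 24, "O": 24, "Oh": 48}


def point_group_order(schoenflies):
    """Order of a crystallographic point group from its Schoenflies symbol (sketch)."""
    if schoenflies in SPECIAL_ORDERS:
        return SPECIAL_ORDERS[schoenflies]
    match = re.fullmatch(r"([CDS])(\d+)([vhdi]?)", schoenflies)
    if match is None:
        raise ValueError(f"Unrecognised Schoenflies symbol: {schoenflies}")
    family, n, suffix = match.group(1), int(match.group(2)), match.group(3)
    if family == "C" and suffix == "":
        return n  # cyclic group C_n
    if family == "C" and suffix in ("v", "h", "i"):
        return 2 * n  # C_nv, C_nh and C_3i (= S_6)
    if family == "D" and suffix == "":
        return 2 * n  # dihedral group D_n
    if family == "D" and suffix in ("h", "d"):
        return 4 * n  # D_nh and D_nd
    if family == "S" and suffix == "" and n % 2 == 0:
        return n  # improper rotation group S_n (n even)
    raise ValueError(f"Unrecognised Schoenflies symbol: {schoenflies}")


for symbol in ("C3v", "D2d", "D4h", "S4", "Td", "Oh"):
    print(symbol, point_group_order(symbol))
```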
- doped.utils.symmetry.point_symmetry_from_defect(defect, symm_ops=None, symprec=0.01)[source]
Get the defect site point symmetry from a Defect object.
Note that this is intended only to be used for unrelaxed, as-generated Defect objects (rather than parsed defects).
- Parameters:
defect (Defect) – Defect object.
symm_ops (list) – List of symmetry operations of defect.structure, to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib. Default is 0.01.
- Returns:
Defect point symmetry.
- Return type:
str
- doped.utils.symmetry.point_symmetry_from_defect_entry(defect_entry: DefectEntry, symm_ops: list | None = None, symprec: float | None = None, relaxed: bool = True, verbose: bool = True, return_periodicity_breaking: bool = False)[source]
Get the defect site point symmetry from a DefectEntry object.
Note: If relaxed = True (default), then this tries to use the defect_entry.defect_supercell to determine the site symmetry. This will thus give the relaxed defect point symmetry if this is a DefectEntry created from parsed defect calculations. However, it should be noted that this is not guaranteed to work in all cases; namely for non-diagonal supercell expansions, or sometimes for non-scalar supercell expansion matrices (e.g. a 2x1x2 expansion), particularly with high-symmetry materials, which can mess up the periodicity of the cell. doped tries to automatically check if this is the case, and will warn you if so.
This can also be checked by using this function on your doped generated defects:
from doped.generation import get_defect_name_from_entry
for defect_name, defect_entry in defect_gen.items():
    print(defect_name, get_defect_name_from_entry(defect_entry, relaxed=False),
          get_defect_name_from_entry(defect_entry), "\n")
And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry - otherwise periodicity-breaking prevents this.
If periodicity-breaking prevents auto-symmetry determination, you can manually determine the relaxed defect and bulk-site point symmetries, and/or orientational degeneracy, from visualising the structures (e.g. using VESTA), and then set the corresponding values in the calculation_metadata['relaxed point symmetry']/['bulk site symmetry'] and/or degeneracy_factors['orientational degeneracy'] attributes. get_orientational_degeneracy can be used to obtain the corresponding orientational degeneracy factor for given defect/bulk-site point symmetries. Note that the bulk-site point symmetry corresponds to that of DefectEntry.defect, or equivalently calculation_metadata["bulk_site"]/["unrelaxed_defect_structure"], which for vacancies/substitutions is the symmetry of the corresponding bulk site, while for interstitials it is the point symmetry of the final relaxed interstitial site when placed in the (unrelaxed) bulk structure. The degeneracy factor is used in the calculation of defect/carrier concentrations and Fermi level behaviour (see e.g. https://doi.org/10.1039/D2FD00043A & https://doi.org/10.1039/D3CS00432E).
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
symm_ops (list) – List of symmetry operations of either the defect_entry.bulk_supercell structure (if relaxed=False) or defect_entry.defect_supercell (if relaxed=True), to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib. Default is 0.01 for unrelaxed structures, 0.1 for relaxed (to account for residual structural noise, matching that used by the Materials Project). You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.).
relaxed (bool) – If False, determines the site symmetry using the defect site in the unrelaxed bulk supercell (i.e. the bulk site symmetry), otherwise tries to determine the point symmetry of the relaxed defect in the defect supercell. Default is True.
verbose (bool) – If True, prints a warning if the supercell is detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry). Default is True.
return_periodicity_breaking (bool) – If True, also returns a boolean specifying if the supercell has been detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry) or not. Mainly for internal doped usage. Default is False.
- Returns:
Defect point symmetry (and if return_periodicity_breaking = True, a boolean specifying if the supercell has been detected to break the crystal periodicity).
- Return type:
str
- doped.utils.symmetry.point_symmetry_from_site(site: PeriodicSite | ndarray | list, structure: Structure, coords_are_cartesian: bool = False, symm_ops: list | None = None, symprec: float = 0.01)[source]
Get the point symmetry of a site in a structure.
- Parameters:
site (Union[PeriodicSite, np.ndarray, list]) – Site for which to determine the point symmetry. Can be a PeriodicSite object, or a list or numpy array of the coordinates of the site (fractional coordinates by default, or cartesian if coords_are_cartesian = True).
structure (Structure) – Structure object for which to determine the point symmetry of the site.
coords_are_cartesian (bool) – If True, the site coordinates are assumed to be in cartesian coordinates. Default is False.
symm_ops (list) – List of symmetry operations of the structure to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib. Default is 0.01. You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.).
- Returns:
Site point symmetry.
- Return type:
str
- doped.utils.symmetry.point_symmetry_from_structure(structure: Structure, bulk_structure: Structure | None = None, symm_ops: list | None = None, symprec: float | None = None, relaxed: bool = True, verbose: bool = True, return_periodicity_breaking: bool = False)[source]
Get the point symmetry of a given structure.
Note: For certain non-trivial supercell expansions, the broken cell periodicity can break the site symmetry and lead to incorrect point symmetry determination (particularly if using non-scalar supercell matrices with high symmetry materials). If the unrelaxed bulk structure (bulk_structure) is also supplied, then doped will determine the defect site and then automatically check if this is the case, and warn you if so.
This can also be checked by using this function on your doped generated defects:
from doped.generation import get_defect_name_from_entry
for defect_name, defect_entry in defect_gen.items():
    print(defect_name, get_defect_name_from_entry(defect_entry, relaxed=False),
          get_defect_name_from_entry(defect_entry), "\n")
And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry - otherwise periodicity-breaking prevents this.
If bulk_structure is supplied and relaxed is set to False, then returns the bulk site symmetry of the defect, which for vacancies/substitutions is the symmetry of the corresponding bulk site, while for interstitials it is the point symmetry of the final relaxed interstitial site when placed in the (unrelaxed) bulk structure.
- Parameters:
structure (Structure) – Structure object for which to determine the point symmetry.
bulk_structure (Structure) – Structure object of the bulk structure, if known. Default is None. If provided and relaxed = True, will be used to check if the supercell is breaking the crystal periodicity (and thus preventing accurate determination of the relaxed defect point symmetry) and warn you if so.
symm_ops (list) – List of symmetry operations of either the bulk_structure structure (if relaxed=False) or structure (if relaxed=True), to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib. Default is 0.01 for unrelaxed structures, 0.1 for relaxed (to account for residual structural noise, matching that used by the Materials Project). You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.).
relaxed (bool) – If False, determines the site symmetry using the defect site in the unrelaxed bulk supercell (i.e. the bulk site symmetry), otherwise tries to determine the point symmetry of the relaxed defect in the defect supercell. Default is True.
verbose (bool) – If True, prints a warning if the supercell is detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry). Default is True.
return_periodicity_breaking (bool) – If True, also returns a boolean specifying if the supercell has been detected to break the crystal periodicity (and hence not be able to return a reliable relaxed point symmetry) or not. Default is False.
- Returns:
Structure point symmetry (and if return_periodicity_breaking = True, a boolean specifying if the supercell has been detected to break the crystal periodicity).
- Return type:
str
- doped.utils.symmetry.schoenflies_from_hermann(herm_symbol)[source]
Convert from Hermann-Mauguin to Schoenflies.
- doped.utils.symmetry.summed_rms_dist(struct_a: Structure, struct_b: Structure) float [source]
Get the summed root-mean-square (RMS) distance between the sites of two structures, in Å.
Note that this assumes the lattices of the two structures are equal!
- Parameters:
struct_a – pymatgen Structure object.
struct_b – pymatgen Structure object.
- Returns:
The summed RMS distance between the sites of the two structures, in Å.
- Return type:
float
- doped.utils.symmetry.swap_axes(structure, axes)[source]
Swap axes of the given structure.
The new order of the axes is given by the axes parameter. For example, axes=(2, 1, 0) will swap the first and third axes.
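A minimal sketch of what such an axis swap means for the underlying data, using plain lists in place of a Structure: the lattice vectors are reordered, and the fractional coordinates are reordered in the same way so that the Cartesian positions are unchanged. The function name and this interpretation are assumptions of the example.

```python
def swap_axes_sketch(lattice, frac_coords, axes):
    """Reorder lattice vectors and fractional coordinates by axes (illustrative sketch)."""
    if sorted(axes) != [0, 1, 2]:
        raise ValueError("axes must be a permutation of (0, 1, 2)")
    new_lattice = [list(lattice[i]) for i in axes]
    # Permute fractional coordinates identically so Cartesian positions are preserved
    new_frac_coords = [[site[i] for i in axes] for site in frac_coords]
    return new_lattice, new_frac_coords


lattice = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
frac_coords = [[0.1, 0.2, 0.3]]
print(swap_axes_sketch(lattice, frac_coords, axes=(2, 1, 0)))
```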
- doped.utils.symmetry.translate_structure(structure: Structure, vector: ndarray, frac_coords: bool = True, to_unit_cell: bool = True) Structure [source]
Translate a structure and its sites by a given vector (not in place).
- Parameters:
structure – pymatgen Structure object.
vector – Translation vector, fractional or Cartesian.
frac_coords – Whether the input vector is in fractional coordinates. (Default: True)
to_unit_cell – Whether to translate the sites to the unit cell. (Default: True)
- Returns:
pymatgen Structure object with translated sites.
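For the fractional-coordinate case, the translation and the optional wrapping into the unit cell can be sketched on plain coordinate lists as below. The function name is hypothetical, and a Cartesian vector would first need converting to fractional coordinates using the lattice, which this sketch does not do.

```python
def translate_frac_coords(frac_coords, vector, to_unit_cell=True):
    """Translate fractional coordinates by vector, returning new lists (sketch)."""
    translated = [[x + v for x, v in zip(site, vector)] for site in frac_coords]
    if to_unit_cell:
        # Wrap every coordinate back into [0, 1)
        translated = [[x % 1.0 for x in site] for site in translated]
    return translated


sites = [[0.9, 0.5, 0.0]]
print(translate_frac_coords(sites, [0.2, 0.0, -0.1]))
print(translate_frac_coords(sites, [0.2, 0.0, -0.1], to_unit_cell=False))
print(sites)  # the input coordinates are left unmodified
```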
Module contents
Submodule for utility functions in doped.