doped.core module

Core functions and classes for defects in doped.

class doped.core.Defect(structure: Structure, site: PeriodicSite, multiplicity: int | None = None, oxi_state: float | str | None = None, equivalent_sites: list[PeriodicSite] | None = None, symprec: float = 0.01, angle_tolerance: float = 5, user_charges: list[int] | None = None, **doped_kwargs)[source]

Bases: Defect

doped Defect object.

Subclass of Defect with additional attributes and methods used by doped.

Parameters:

structure (Structure) – The structure in which to create the defect. Typically the primitive structure of the host crystal for defect generation, and/or the calculation supercell for defect parsing.
site (PeriodicSite) – The defect site in the structure.
multiplicity (int) – The multiplicity of the defect in the structure.
oxi_state (float, int or str) – The oxidation state of the defect. If not specified, this will be determined automatically.
equivalent_sites (list[PeriodicSite]) – A list of equivalent sites for the defect in the structure.
symprec (float) – Symmetry tolerance for identifying equivalent sites. Default is 0.01.
angle_tolerance (float) – Angle tolerance for identifying equivalent sites. Default is 5.
user_charges (list[int]) – User specified charge states. If specified, get_charge_states will return this list. If None or an empty list, the charge states will be determined automatically.
**doped_kwargs – Additional keyword arguments to define doped-specific attributes (listed below), in the form doped_attribute_name=value; (e.g. wyckoff = "4a").

as_dict()[source]

JSON-serializable dict representation of Defect.

Needs to be redefined because attributes not explicitly specified in subclasses, which is required for monty functions.

property defect_site: PeriodicSite[source]

The defect site in the structure.

Re-written from pymatgen-analysis-defects version to be far more efficient, when used in loops (e.g. for calculating defect concentrations as functions of chemical potentials, temperature etc.).

Cached for efficiency; the cache is invalidated (in __setattr__) whenever structure or site are changed.

property element_changes: dict[Element, int]

The stoichiometry changes of the defect, as a dict.

e.g. {“Mg”: -1, “O”: +1} for a O-on-Mg antisite in MgO. Redefined from the pymatgen-analysis-defects method to be far more efficient when used in loops (e.g. for calculating defect concentrations as functions of chemical potentials, temperature etc.).

Returns:: The species changes of the defect.
Return type:: dict[Element, int]

classmethod from_json(filename: str)[source]

Load a Defect object from a json(.gz) file.

Note that .json.gz files can be loaded directly.

Parameters:: filename (PathLike) – Filename of json file to load Defect from.
Returns:: Defect object

get_charge_states(padding: int = 1) → list[int][source]: Refactored version of pymatgen-analysis-defects’s get_charge_states to not break when oxi_state is not set.

get_multiplicity(primitive_structure: Structure | None = None, symprec: float | None = None, dist_tol_factor: float = 1.0, **kwargs) → int[source]

Calculate the multiplicity of the defect site (self.site) in the host structure (self.structure).

This function determines all equivalent sites of self.site in self.structure, by first folding down to the primitive unit cell (which may be the same as self.structure) and getting all equivalent primitive cell sites (which avoids issues with periodicity-breaking supercells, and boosts efficiency), then multiplying by len(self.structure)/len(primitive_structure), giving the site multiplicity in self.structure.

Parameters:

primitive_structure (Structure | None) – Structure to use for the primitive unit cell. Can be provided to avoid recalculation of the primitive cell.
symprec (float) – Symmetry precision to use for determining symmetry operations and thus equivalent sites with spglib. Default is None, which uses self.symprec (which is 0.01 by default, matching the pymatgen default. You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.). If fixed_symprec_and_dist_tol_factor is False (default), this value will be automatically adjusted (up to 10x, down to 0.1x) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled symprec (and dist_tol_factor) values.
dist_tol_factor (float) – Distance tolerance for clustering generated sites (to ensure they are truly distinct), as a multiplicative factor of symprec. Default is 1.0 (i.e. dist_tol = symprec, in Å). If fixed_symprec_and_dist_tol_factor is False (default), this value will also be automatically adjusted if necessary (up to 10x, down to 0.1x)(after symprec adjustments) until the identified equivalent sites from spglib have consistent point group symmetries. Setting verbose to True will print information on the trialled dist_tol_factor (and symprec) values.
**kwargs – Additional keyword arguments to pass to get_all_equiv_sites, such as fixed_symprec_and_dist_tol_factor and verbose.

Returns:

The multiplicity of self.site in self.structure.

Return type:

int

get_supercell_structure(sc_mat: ndarray | None = None, target_frac_coords: ndarray | list[float] | None = None, return_sites: bool = False, min_image_distance: float = 10.0, min_atoms: int = 50, force_cubic: bool = False, force_diagonal: bool = False, ideal_threshold: float = 0.1, min_length: float | None = None, dummy_species: str | None = None) → Structure[source]

Generate the simulation supercell for a defect.

Redefined from the parent class to allow the use of target_frac_coords to place the defect at the closest equivalent site to the target fractional coordinates in the supercell, while keeping the supercell fixed (to avoid any issues with defect parsing). Also returns information about equivalent defect sites in the supercell.

If sc_mat is None, then the supercell is generated automatically using the doped algorithm described in the get_ideal_supercell_matrix function docstring in doped.generation.

Parameters:

sc_mat (3x3 matrix) – Transformation matrix of self.structure to create the supercell. If None (default), then automatically computed using get_ideal_supercell_matrix from doped.generation.
target_frac_coords (3x1 matrix) – If set, the defect will be placed at the closest equivalent site to these fractional coordinates (using self.equivalent_sites).
return_sites (bool) – If True, returns a tuple of the defect supercell, defect supercell site and list of equivalent supercell sites.
dummy_species (str) – Dummy species to highlight the defect position (for visualizing vacancies).
min_image_distance (float) – Minimum image distance in Å of the generated supercell (i.e. minimum distance between periodic images of atoms/sites in the lattice), if sc_mat is None. (Default = 10.0)
min_atoms (int) – Minimum number of atoms allowed in the generated supercell, if sc_mat is None. (Default = 50)
force_cubic (bool) – Enforce usage of CubicSupercellTransformation from pymatgen for supercell generation (if sc_mat is None). (Default = False)
force_diagonal (bool) – If True, return a transformation with a diagonal transformation matrix (if sc_mat is None). (Default = False)
ideal_threshold (float) – Threshold for increasing supercell size (beyond that which satisfies min_image_distance and min_atoms`) to achieve an ideal supercell matrix (i.e. a diagonal expansion of the primitive or conventional cell). Supercells up to 1 + perfect_cell_threshold times larger (rounded up) are trialled, and will instead be returned if they yield an ideal transformation matrix (if sc_mat is None). (Default = 0.1; i.e. 10% larger than the minimum size)
min_length (float) – Same as min_image_distance (kept for compatibility).

Returns:

The defect supercell structure. If return_sites is True, also returns the defect supercell site and list of equivalent supercell sites.

to_json(filename: str | Path | None = None)[source]

Save the Defect object to a json file, which can be reloaded with the Defect.from_json() class method.

Note that file extensions with “.gz” will be automatically compressed (recommended to save space)!

Parameters:: filename (PathLike) – Filename to save json file as. If None, the filename will be set as “{Defect.name}.json.gz”.

property volume: float

The volume (in Å³) of the structure in which the defect is created (i.e. Defect.structure).

Ensures volume is only computed once when calculating defect concentrations in loops (e.g. for calculating defect concentrations as functions of chemical potentials, temperature etc.).

class doped.core.DefectEntry(defect: Defect, charge_state: int, sc_entry: ComputedStructureEntry, corrections: dict[str, float] | None = None, corrections_metadata: dict[str, Any] | None = None, sc_defect_frac_coords: tuple[float, float, float] | None = None, bulk_entry: ComputedEntry | None = None, entry_id: str | None = None, name: str | None = None, calculation_metadata: dict[str, Any] | None = None, degeneracy_factors: dict[str, float] | None = None, conventional_structure: Structure | None = None, conv_cell_frac_coords: ndarray | None = None, equiv_conv_cell_frac_coords: list[ndarray] | None = None, _BilbaoCS_conv_cell_vector_mapping: list[int] | None = None, wyckoff: str | None = None, charge_state_guessing_log: list[dict] | None = None, defect_supercell: Structure | None = None, defect_supercell_site: PeriodicSite | None = None, equivalent_supercell_sites: list[PeriodicSite] | None = None, bulk_supercell: Structure | None = None, _bulk_entry_energy: float | None = None, _bulk_entry_hash: int | None = None, _sc_entry_energy: float | None = None, _sc_entry_hash: int | None = None)[source]

Bases: DefectEntry

doped DefectEntry object.

Subclass of DefectEntry with additional attributes used by doped.

Core Attributes:

defect:: doped/pymatgen defect object corresponding to the defect in the entry.
charge_state:: Charge state of the defect.
sc_entry:: pymatgen ComputedStructureEntry for the defect supercell.
sc_defect_frac_coords:: The fractional coordinates of the defect in the supercell.
bulk_entry:: pymatgen ComputedEntry for the bulk supercell reference. Required for calculating defect formation energy.
corrections:: A dictionary of energy corrections which are summed and added to the defect formation energy.
corrections_metadata:: A dictionary that acts as a generic container for storing information about how the corrections were calculated. Only used for debugging and plotting purposes.

Parsing Attributes:

calculation_metadata:: A dictionary of calculation parameters and data, used to perform charge corrections and compute formation energies.
degeneracy_factors:: A dictionary of degeneracy factors contributing to the total degeneracy of the defect species (such as spin and configurational degeneracy etc). This is an important factor in the defect concentration equation (https://doi.org/10.1039/D2FD00043A, https://doi.org/10.1038/s41578-025-00879-y, https://doi.org/10.1039/D3CS00432E), and so affects the output of the defect concentration / Fermi level functions. Spin and configurational (geometry) degeneracy factors are automatically determined by doped during parsing (for details, see the spin_degeneracy_from_vasprun(), get_orientational_degeneracy() and point_symmetry_from_defect_entry functions), but can also be edited in DefectEntry.degeneracy_factors. For discussion, see the Magnetization tips section.

Generation Attributes:

name:: The doped-generated name of the defect entry. See DefectsGenerator for the doped naming algorithm.
conventional_structure:: Conventional cell structure of the host according to the Bilbao Crystallographic Server (BCS) definition, used to determine defect site Wyckoff labels and multiplicities.
conv_cell_frac_coords:: Fractional coordinates of the defect in the conventional cell.
equiv_conv_cell_frac_coords:: Symmetry-equivalent defect positions in fractional coordinates of the conventional cell.
_BilbaoCS_conv_cell_vector_mapping:: A vector mapping the lattice vectors of the spglib-defined conventional cell to that of the Bilbao Crystallographic Server definition (these are equal for most space groups).
wyckoff:: Wyckoff label of the defect site.
charge_state_guessing_log:: A log of the input & computed values used to determine charge state probabilities.
defect_supercell:: pymatgen Structure object of the defect supercell.
defect_supercell_site:: pymatgen PeriodicSite object of the defect in the defect supercell.
equivalent_supercell_sites:: List of pymatgen PeriodicSite objects of symmetry-equivalent defect sites in the defect supercell.
bulk_supercell:: pymatgen Structure object of the bulk (pristine, defect-free) supercell.

as_dict() → dict[source]

Return a JSON-serializable dict representation of DefectEntry.

Slightly modified from the parent function to remove any hash values, as these are only relevant to the current python session.

property bulk_entry_energy

Get the raw energy of the bulk supercell calculation (i.e. bulk_entry.energy).

Refactored from pymatgen-analysis-defects to be more efficient, reducing compute times when looping over formation energy / concentration functions.

property bulk_site_concentration: Return the site concentration (in cm^-3) of the corresponding atomic site of the defect in the pristine bulk material (e.g. if the defect is V_O in SrTiO3, returns the site concentration of (symmetry-equivalent) oxygen atoms in SrTiO3).

property corrected_energy: float

Get the energy of the defect supercell calculation, with all corrections (in DefectEntry.corrections) applied.

Refactored from pymatgen-analysis-defects to be more efficient, reducing compute times when looping over formation energy / concentration functions.

equilibrium_concentration(temperature: float = 300, chempots: dict | None = None, limit: str | None = None, el_refs: dict | None = None, fermi_level: float = 0, vbm: float | None = None, per_site: bool = False, symprec: float | None = None, formation_energy: float | None = None, site_competition: bool | None = True, **kwargs) → float[source]

Compute the equilibrium concentration (in cm^-3) for the DefectEntry at a given chemical potential limit, fermi_level and temperature, assuming the dilute limit approximation.

Note that these are the equilibrium defect concentrations! DefectThermodynamics.get_fermi_level_and_concentrations() can instead be used to calculate the Fermi level and defect concentrations for a material grown/annealed at higher temperatures and then cooled (quenched) to room/operating temperature (where defect concentrations are assumed to remain fixed) – this is known as the frozen defect approach and is typically the most valid approximation (see its docstring for more information, and discussion in 10.1039/D3CS00432E).

The degeneracy/multiplicity factor “g” is an important parameter in the defect concentration equation, affecting the final concentration by up to 2 orders of magnitude. This factor is taken from the product of the defect_entry.defect.multiplicity and defect_entry.degeneracy_factors attributes. Discussion in: https://doi.org/10.1039/D2FD00043A, https://doi.org/10.1039/D3CS00432E, https://doi.org/10.1038/s41578-025-00879-y…

Note that this function sets lower and upper bounds on the per-site defect concentrations of 1e-50 and 1 (the latter only applying when site_competition = False).

Parameters:

temperature (float) – Temperature in Kelvin at which to calculate the equilibrium concentration. Default is 300 K.
chempots (dict) –
Dictionary of chemical potentials to use for calculating the defect formation energy (and thus concentration). This can have the form of: {"limits": [{'limit': [chempot_dict]}]} (the format generated by doped's chemical potential parsing functions (see tutorials)) and specific limits (chemical potential limits) can then be chosen using limit.

Alternatively this can be a dictionary of chemical potentials for a single limit, in the format: {element symbol: chemical potential}. If manually specifying chemical potentials this way, you can set the el_refs option with the DFT reference energies of the elemental phases, in which case it is the formal chemical potentials (i.e. relative to the elemental references) that should be given here, otherwise the absolute (DFT) chemical potentials should be given.

If None (default), sets all chemical potentials to 0 (inaccurate formation energies and concentrations!).
limit (str) –
The chemical potential limit for which to calculate the formation energy and thus concentration. Can be:
- None, default if chempots corresponds to a single chemical potential limit – otherwise will use the first chemical potential limit in the chempots dict.
- ”X-rich”/”X-poor” where X is an element in the system, in which case the most X-rich/poor limit will be used (e.g. “Li-rich”).
- A key in the (self.)chempots["limits"] dictionary.
The latter two options can only be used if chempots is in the doped format (see chemical potentials tutorial). (Default: None)
el_refs (dict) – Dictionary of elemental reference energies for the chemical potentials in the format: {element symbol: reference energy} (to determine the formal chemical potentials, when chempots has been manually specified as {element symbol: chemical potential}). Unnecessary if chempots is provided/present in format generated by doped (see tutorials). (Default: None)
vbm (float) – VBM eigenvalue to use as Fermi level reference point for calculating the formation energy. If None (default), will use "vbm" from the calculation_metadata dict attribute if present – which corresponds to the VBM of the bulk supercell calculation by default, unless bulk_band_gap_vr is set during defect parsing.
fermi_level (float) – Value corresponding to the electron chemical potential, referenced to the VBM. Default is 0 (i.e. the VBM).
per_site (bool) – Whether to return the concentration as fractional concentration per site, rather than the default of per cm^3. Multiply by 100 for concentration in percent. Default is False.
symprec (float) – Symmetry tolerance for spglib to use when determining relaxed defect point symmetries and thus orientational degeneracies. Default in doped is 0.1 which matches that used by the Materials Project and is larger than the pymatgen default of 0.01 to account for residual structural noise in relaxed defect supercells. If set, then site symmetries & degeneracies will be re-parsed/computed even if already present in the DefectEntry object calculation_metadata. You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc.).
formation_energy (float) – Pre-calculated formation energy to use for the defect concentration calculation, in order to reduce compute times (e.g. when looping over many chemical potential / temperature / etc ranges). Only really intended for internal doped usage. If None (default), will calculate the formation energy using the input chempots, limit, el_refs, vbm and fermi_level arguments. (Default: None)
site_competition (bool) – If True (default), uses the updated Fermi-Dirac-like formula for defect concentration, which accounts for defect site competition at high concentrations (see Kasamatsu et al. (10.1016/j.ssi.2010.11.022) appendix for derivation – updated here to additionally account for configurational degeneracies g (see https://doi.org/10.1039/D3CS00432E)), which gives the following defect concentration equation: N_X = N*[g*exp(-E/kT) / (1 + sum(g_i*exp(-E_i/kT)))] (https://doi.org/10.1038/s41578-025-00879-y, https://doi.org/10.1021/jacs.5c07104) where i runs over all defects which occupy the same site as the defect of interest. Otherwise, uses the standard dilute limit approximation. Note that when used with DefectEntry.equilibrium_concentration() here, only this defect itself is considered in the sum over i in the denominator (as it has no knowledge of other defect concentrations), but if used with DefectThermodynamics.get_equilibrium_concentrations() or DefectThermodynamics.get_fermi_level_and_concentrations() (recommended) then all defects in the system occupying the same lattice site are considered. If set to None (primarily intended for internal usage), the standard dilute limit is used without capping the defect concentration at the bulk site concentration.
**kwargs – Additional keyword arguments to pass to _parse_and_set_symmetries_and_degeneracies, such as bulk_symprec, dist_tol_factor etc.

Returns:

Concentration in cm^-3 (or as fractional per site, if per_site is True).

Return type:

float

formation_energy(chempots: dict | None = None, limit: str | None = None, el_refs: dict | None = None, vbm: float | None = None, fermi_level: float = 0) → float[source]

Compute the formation energy for the DefectEntry at a given chemical potential limit and fermi_level.

Parameters:

chempots (dict) –
Dictionary of chemical potentials to use for calculating the defect formation energy. This can have the form of: {"limits": [{'limit': [chempot_dict]}]} (the format generated by doped's chemical potential parsing functions (see tutorials)) and specific limits (chemical potential limits) can then be chosen using limit.

Alternatively this can be a dictionary of chemical potentials for a single limit, in the format: {element symbol: chemical potential}. If manually specifying chemical potentials this way, you can set the el_refs option with the (QM/DFT) reference energies of the elemental phases, in which case it is the formal chemical potentials (i.e. relative to the elemental references) that should be given here, otherwise the absolute (QM/DFT) chemical potentials should be given.

If None (default), sets all chemical potentials to zero. (Default: None)
limit (str) –
The chemical potential limit for which to calculate the formation energy. Can be either:
- None, default if chempots corresponds to a single chemical potential limit – otherwise will use the first chemical potential limit in the chempots dict.
- ”X-rich”/”X-poor” where X is an element in the system, in which case the most X-rich/poor limit will be used (e.g. “Li-rich”).
- A key in the (self.)chempots["limits"] dictionary.
The latter two options can only be used if chempots is in the doped format (see chemical potentials tutorial). (Default: None)
el_refs (dict) – Dictionary of elemental reference energies for the chemical potentials in the format: {element symbol: reference energy} (to determine the formal chemical potentials, when chempots has been manually specified as {element symbol: chemical potential}). Unnecessary if chempots is provided/present in format generated by doped (see tutorials). (Default: None)
vbm (float) – VBM eigenvalue to use as Fermi level reference point for calculating formation energy. If None (default), will use "vbm" from the calculation_metadata dict attribute if present – which corresponds to the VBM of the bulk supercell calculation by default, unless bulk_band_gap_vr is set during defect parsing).
fermi_level (float) – Value corresponding to the electron chemical potential, referenced to the VBM eigenvalue. Default is 0 (i.e. the VBM).

Returns:

Formation energy value (float)

classmethod from_dict(d: dict)[source]

Class method to create a DefectEntry object from a dictionary.

Defined to avoid unnecessary vise/pydefect INFO messages.

Parameters:: d (dict) – Dictionary representation of the DefectEntry object.
Returns:: DefectEntry object

classmethod from_json(filename: str)[source]

Load a DefectEntry object from a json(.gz) file.

Note that .json.gz files can be loaded directly.

Parameters:: filename (PathLike) – Filename of json file to load DefectEntry from.
Returns:: DefectEntry object

get_ediff() → float[source]

Get the energy difference between the defect and bulk supercell calculations, including finite-size corrections.

Refactored from pymatgen-analysis-defects to be more efficient, reducing compute times when looping over formation energy / concentration functions.

Returns information about the band edge and in-gap electronic states and their orbital character / localisation degree for the defect entry, as well as a plot of the single-particle electronic eigenvalues and their occupation (if plot=True).

Can be used to determine if a defect is adopting a perturbed host state (PHS / shallow state), see the Perturbed Host States (Shallow Defects) tips section.

If eigenvalue data has not already been parsed for DefectEntry (default in doped is to parse this data with DefectsParser/ DefectParser, as controlled by the parse_projected_eigen flag), then this function will attempt to load the eigenvalue data from either the input Vasprun/Procar objects or files, or from the bulk/defect_paths in defect_entry.calculation_metadata. If so, will initially try to load orbital projections from vasprun.xml(.gz) files (more accurate), or failing that from PROCAR(.gz) files if present.

This function uses code from pydefect, so please cite the pydefect paper: 10.1103/PhysRevMaterials.5.123803

Note that this function sets the eigenvalues, projected_eigenvalues and projected_magnetization attributes to None to reduce memory demand (as these properties are not required in later stages of doped analysis workflows), if clear_attributes is True (default).

Parameters:

plot (bool) – Whether to plot the single-particle eigenvalues. (Default: True)
filename (PathLike) – Filename to save the eigenvalue plot to (if plot = True). If None (default), plots are not saved.
defect_vr (PathLike, Vasprun) – Not required if eigenvalue data has already been parsed for DefectEntry (default behaviour when parsing, with data in defect_entry.calculation_metadata["eigenvalue_data"]). Either a path to the VASP vasprun.xml(.gz) output file or a pymatgen Vasprun object, for the defect supercell calculation. If None (default), tries to load the Vasprun object from self.calculation_metadata["run_metadata"]["defect_vasprun_dict"], or, failing that, from a vasprun.xml(.gz) file at self.calculation_metadata["defect_path"].
defect_procar (PathLike, Procar) – Not required if eigenvalue data has already been parsed for DefectEntry (default behaviour when parsing, with data in defect_entry.calculation_metadata["eigenvalue_data"]), and/or if defect_vr was parsed with parse_projected_eigen = True.
bulk_vr (PathLike, Vasprun) – Not required if eigenvalue data has already been parsed for DefectEntry (default behaviour when parsing, with data in defect_entry.calculation_metadata["eigenvalue_data"]). Either a path to the VASP vasprun.xml(.gz) output file or a pymatgen Vasprun object, for the reference bulk supercell calculation. If None (default), tries to load the Vasprun object from calculation_metadata["run_metadata"]["bulk_vasprun_dict"], or, failing that, from a vasprun.xml(.gz) file at self.calculation_metadata["bulk_path"].
bulk_procar (PathLike, Procar) – Not required if eigenvalue data has already been parsed for DefectEntry (default behaviour when parsing, with data in defect_entry.calculation_metadata["eigenvalue_data"]), and/or if bulk_vr was parsed with parse_projected_eigen = True. Either a path to the VASP PROCAR(.gz) output file (with LORBIT > 10 in the INCAR) or a pymatgen Procar object, for the reference bulk supercell calculation. If None (default), tries to load from a PROCAR(.gz) file at self.calculation_metadata["bulk_path"]. Either a path to the VASP PROCAR(.gz) output file (with LORBIT > 10 in the INCAR) or a pymatgen Procar object, for the defect supercell calculation. If None (default), tries to load from a PROCAR(.gz) file at self.calculation_metadata["defect_path"].
force_reparse (bool) – Whether to force re-parsing of the eigenvalue data, even if already present in the calculation_metadata.
clear_attributes (bool) – If True (default), sets the eigenvalues, projected_eigenvalues and projected_magnetization attributes to None to reduce memory demand (as these properties are not required in later stages of doped analysis workflows).
**kwargs – Additional kwargs to pass to doped.utils.eigenvalues.get_eigenvalue_analysis, such as style_file, ks_levels, ylims, legend_kwargs, similar_orb_criterion, similar_energy_criterion.

Returns:

pydefect BandEdgeStates object and matplotlib Figure object (if plot=True).

Compute the isotropic Freysoldt (FNV) correction for the defect_entry.

The correction is added to the defect_entry.corrections dictionary (to be used in following formation energy calculations). If this correction is used, please cite Freysoldt’s original paper; 10.1103/PhysRevLett.102.016402.

The charge correction error is estimated by computing the average standard deviation of the planar-averaged potential difference in the sampling region, and multiplying by the defect charge. This is expected to be a lower bound estimate of the true charge correction error.

The defect coordinates are taken as the relaxed defect site by default (DefectEntry.defect_supercell_site) – which is the bulk site for vacancies, but this can be overridden with the defect_frac_coords keyword argument.

As a general rule of thumb, the charge correction terms should follow relatively consistent trends in terms of magnitudes. A large outlier (easily scanned with get_formation_energies()) often indicates something unusual/unexpected. See the FNV/eFNV and other finite-size charge correction papers for further details.

Parameters:

dielectric (float or int or 3x1 matrix or 3x3 matrix) – Total dielectric constance (ionic + static contributions), in the same xyz Cartesian basis as the supercell calculations (likely but not necessarily the same as the raw output of a VASP dielectric calculation, if an oddly-defined primitive cell is used). If None, then the dielectric constant is taken from the DefectEntry calculation_metadata if available. See the Dielectric Constant tutorial section for information on calculating and converging the dielectric constant.
defect_locpot – Path to the output VASP LOCPOT file from the defect supercell calculation, or the corresponding pymatgen Locpot object, or a dictionary of the planar-averaged potential in the form: {i: Locpot.get_average_along_axis(i) for i in [0,1,2]}. If None, will try to use defect_locpot from the defect_entry calculation_metadata if available.
bulk_locpot – Path to the output VASP LOCPOT file from the bulk supercell calculation, or the corresponding pymatgen Locpot object, or a dictionary of the planar-averaged potential as: {i: Locpot.get_average_along_axis(i) for i in [0,1,2]}. If None, will try to use bulk_locpot from the defect_entry calculation_metadata if available.
plot (bool) – Whether to plot the FNV electrostatic potential plots (for manually checking the behaviour of the charge correction here).
filename (PathLike) – Filename to save the FNV electrostatic potential plots to. If None, plots are not saved.
axis (int or None) – If int, then the FNV electrostatic potential plot along the specified axis (0, 1, 2 for a, b, c) will be plotted. Note that the output charge correction is still that for all axes. If None, then all three axes are plotted.
return_correction_error (bool) – If True, also returns the average standard deviation of the planar-averaged potential difference times the defect charge (which gives an estimate of the error range of the correction energy). Default is False.
error_tolerance (float) – If the estimated error in the charge correction, based on the variance of the potential in the sampling region, is greater than this value (in eV), then a warning is raised. (default: 0.05 eV)
style_file (PathLike) – Path to a .mplstyle file to use for the plot. If None (default), uses the default doped style (from doped/utils/doped.mplstyle).
**kwargs – Additional kwargs to pass to get_freysoldt_correction() (e.g. energy_cutoff, mad_tol, q_model, step, defect_frac_coords).

Returns:

utils.CorrectionResults (summary of the corrections applied and metadata), and the matplotlib Figure object (or axis object if axis specified) if plot is True, and the estimated charge correction error if return_correction_error is True.

Compute the Kumagai (eFNV) finite-size charge correction for the defect_entry. Compatible with both isotropic/cubic and anisotropic systems.

The correction is added to the defect_entry.corrections dictionary (to be used in following formation energy calculations).

Typically for reasonably well-converged supercell sizes, the default defect_region_radius works perfectly well. However, for certain materials at small/intermediate supercell sizes, you may want to adjust this (and/or excluded_indices) to ensure the best sampling of the plateau region away from the defect position – doped should throw a warning in these cases (about the correction error being above the default tolerance (50 meV)). For example, with layered materials, the defect charge is often localised to one layer, so we may want to adjust defect_region_radius and/or excluded_indices to ensure that only sites in other layers are used for the sampling region (plateau) - see example on doped docs Tips page.

The correction error is estimated by computing the standard error of the mean of the sampled site potential differences, multiplied by the defect charge. This is expected to be a lower bound estimate of the true charge correction error.

If this correction is used, please cite the Kumagai & Oba (eFNV) paper: 10.1103/PhysRevB.89.195205 and the pydefect paper: “Insights into oxygen vacancies from high-throughput first-principles calculations” Yu Kumagai, Naoki Tsunoda, Akira Takahashi, and Fumiyasu Oba Phys. Rev. Materials 5, 123803 (2021) – 10.1103/PhysRevMaterials.5.123803

The defect coordinates are taken as the relaxed defect site by default (DefectEntry.defect_supercell_site) – which is the bulk site for vacancies, but this can be overridden with the defect_coords keyword argument.

As a general rule of thumb, the charge correction terms should follow relatively consistent trends in terms of magnitudes. A large outlier (easily scanned with get_formation_energies()) often indicates something unusual/unexpected. See the FNV/eFNV and other finite-size charge correction papers for further details.

Parameters:

dielectric (float or int or 3x1 matrix or 3x3 matrix) – Total dielectric constance (ionic + static contributions), in the same xyz Cartesian basis as the supercell calculations (likely but not necessarily the same as the raw output of a VASP dielectric calculation, if an oddly-defined primitive cell is used). If None, then the dielectric constant is taken from the DefectEntry calculation_metadata if available. See the Dielectric Constant tutorial section for information on calculating and converging the dielectric constant.
defect_region_radius (float) – Radius of the defect region (in Å). Sites outside the defect region are used for sampling the electrostatic potential far from the defect (to obtain the potential alignment). If None (default), uses the Wigner-Seitz radius of the supercell.
excluded_indices (list) – List of site indices (in the defect supercell) to exclude from the site potential sampling in the correction calculation/plot. If None (default), no sites are excluded.
defect_outcar (PathLike or Outcar) – Path to the output VASP OUTCAR file from the defect supercell calculation, or the corresponding pymatgen Outcar object. If None, will use defect_supercell_site_potentials from the defect_entry calculation_metadata if available.
bulk_outcar (PathLike or Outcar) – Path to the output VASP OUTCAR file from the bulk supercell calculation, or the corresponding pymatgen Outcar object. If None, will try to use bulk_supercell_site_potentials from the defect_entry calculation_metadata if available.
plot (bool) – Whether to plot the Kumagai site potential plots (for manually checking the behaviour of the charge correction here).
filename (PathLike) – Filename to save the Kumagai site potential plots to. If None, plots are not saved.
return_correction_error (bool) – If True, also returns the standard error of the mean of the sampled site potential differences times the defect charge (which gives an estimate of the error range of the correction energy). Default is False.
error_tolerance (float) – If the estimated error in the charge correction, based on the variance of the potential in the sampling region, is greater than this value (in eV), then a warning is raised. (default: 0.05 eV)
style_file (PathLike) – Path to a .mplstyle file to use for the plot. If None (default), uses the default doped style (from doped/utils/doped.mplstyle).
**kwargs – Additional kwargs to pass to pydefect.corrections.efnv_correction.ExtendedFnvCorrection (e.g. charge, defect_region_radius, defect_coords).

Returns:

utils.CorrectionResults (summary of the corrections applied and metadata), and the matplotlib Figure object if plot is True, and the estimated charge correction error if return_correction_error is True.

property is_shallow: bool[source]: Whether the DefectEntry is determined to be a shallow (perturbed host) state, based on pydefect eigenvalue analysis, or not.

plot_site_displacements(relative_to_defect: bool = True, separated_by_direction: bool = False, relaxed_distances: bool = False, vector_to_project_on: list | None = None, use_plotly: bool = False, ax: mpl.axes.Axes | None = None, fig: go.Figure | None = None, style_file: str | Path | None = '')[source]

Plot the site displacements as a function of distance from the defect site.

Set use_plotly = True to get an interactive plotly plot, useful for analysis!

The plot mode depends on the combination of options:

relative_to_defect=True (default): Single-panel signed displacement along the defect -> atom direction (negative = towards defect).
relative_to_defect=False: Single-panel absolute displacement vs. distance to defect.
vector_to_project_on=[x,y,z]: 2-panel plot showing displacement parallel and perpendicular to the given vector.
separated_by_direction=True: 3-panel plot showing the x, y, z displacement components separately.

separated_by_direction and vector_to_project_on are mutually exclusive, and if either is set then relative_to_defect is set to False.

Parameters:

defect_entry (DefectEntry) – DefectEntry object.
relative_to_defect (bool) – Whether to plot the signed displacements along the line from the (relaxed) defect site to that atom. Negative values indicate the atom moves towards the defect (compressive strain), positive values indicate the atom moves away from the defect (tensile strain). Default is True.
separated_by_direction (bool) – Whether to plot site displacements separated into x, y, z components (3-panel figure). Default is False.
relaxed_distances (bool) – Whether to use the atomic positions in the relaxed defect supercell for 'Distance to defect', 'Vector to site from defect' and 'Displacement wrt defect' values (True), or unrelaxed positions (i.e. the bulk structure positions)(False). Defaults to False.
vector_to_project_on (list) – Direction to project the site displacements along (e.g. [0, 0, 1]). Produces a 2-panel figure showing displacement parallel and perpendicular to the given vector. Defaults to None (i.e. don’t project displacements).
use_plotly (bool) – Whether to use plotly for plotting. Default is False (i.e. use matplotlib for plotting). Set to True to get an interactive plot.
ax (matplotlib.axes.Axes or sequence of matplotlib.axes.Axes) – Optional matplotlib Axes to plot on. If None, a new figure and axes are created. For single-panel modes (default), provide a single Axes. For multi-panel modes, provide a matching sequence of Axes: 2 axes for vector_to_project_on, 3 axes for separated_by_direction. A ValueError is raised if the wrong number of axes is supplied. Only used with use_plotly=False. Default is None.
fig (plotly.graph_objects.Figure) – Optional plotly Figure to add traces to. If None, a new figure is created (including the required subplot layout and titles for multi-panel modes). When supplying an existing figure for multi-panel modes, it must already have the correct number of subplots configured (2 for vector_to_project_on, 3 for separated_by_direction). Only used with use_plotly=True. Default is None.
style_file (PathLike) – Path to a matplotlib style file. If not set, uses the doped default displacement plotting style.

property sc_entry_energy

Get the raw energy of the defect supercell calculation (i.e. sc_entry.energy).

Refactored from pymatgen-analysis-defects to be more efficient, reducing compute times when looping over formation energy / concentration functions.

to_json(filename: str | Path | None = None)[source]

Save the DefectEntry object to a json file, which can be reloaded with the DefectEntry.from_json() class method.

Note that file extensions with “.gz” will be automatically compressed (recommended to save space)!

Parameters:: filename (PathLike) – Filename to save json file as. If None, the filename will be set as {DefectEntry.name}.json.gz.

class doped.core.Interstitial(*args, **kwargs)[source]

Bases: Defect, Interstitial

Subclass of Interstitial with additional attributes and methods used by doped.

If multiplicity is not set in kwargs, then it will be automatically calculated using get_multiplicity. Keyword arguments for get_multiplicity, such as symprec (-> self.symprec), dist_tol_factor, fixed_symprec_and_dist_tol_factor and verbose can also be passed in kwargs.

class doped.core.Substitution(*args, **kwargs)[source]

Bases: Defect, Substitution

Subclass of Substitution with additional attributes and methods used by doped.

class doped.core.Vacancy(*args, **kwargs)[source]

Bases: Defect, Vacancy

Subclass of Vacancy with additional attributes and methods used by doped.

doped.core.doped_defect_from_pmg_defect(defect: Defect, bulk_oxi_states: Structure | Composition | dict | bool = False, **doped_kwargs)[source]

Create the corresponding doped Defect (Vacancy, Interstitial, Substitution) from an input pymatgen Defect object.

Parameters:

defect – pymatgen Defect object.
bulk_oxi_states –
Controls oxi-state guessing (later used for charge state guessing). By default, oxidation states are taken from doped_kwargs['oxi_state'] if set, otherwise from bulk_oxi_states which can be either a pymatgen Structure or Composition object, or a dict (of {element: oxi_state}), or otherwise guessed using the doped methods. If bulk_oxi_states is False, then just uses the already-set Defect oxi_state attribute (default = 0), with no more guessing. If True, re-guesses the oxidation state of the defect (ignoring the pymatgen Defect oxi_state attribute).

If the structure is mixed-valence, then bulk_oxi_states should be either a structure input or True (to re-guess).

Default behaviour in doped generation is to provide bulk_oxi_states as an oxi-state decorated Structure, to make defect setup more robust and efficient (particularly for odd input structures, such as defect supercells etc.). Oxidation states are removed from structures in the pymatgen defect generation functions, so this allows us to re-add them after.
**doped_kwargs – Additional keyword arguments to define doped-specific attributes (see class docstring).

doped.core.get_oxi_probabilities(element_symbol: str) → dict[source]

Get a dictionary of oxidation states and their probabilities for an element.

Tries to get the probabilities from the pymatgen tabulated ICSD oxidation state probabilities, and if not available, uses the common oxidation states of the element.

Parameters:: element_symbol (str) – Element symbol.
Returns:: Dictionary of oxidation states (ints) and their probabilities (floats).
Return type:: dict

doped.core.guess_and_set_oxi_states_with_timeout(structure: Structure, timeout_1: float = 10, timeout_2: float = 15, break_early_if_expensive: bool = False, common_oxi_states: bool = True) → Structure | bool[source]

Tries to guess (and set) the oxidation states of the input structure, with a timeout catch for cases where the structure is complex and oxi state guessing will take a very very long time.

Tries first without using the pymatgen BVAnalyzer class, and if this fails, tries using the ICSD oxidation state probabilities (with timeouts) to guess.

Parameters:

structure (Structure) – The structure for which to guess the oxidation states.
timeout_1 (float) – Timeout in seconds for the second attempt to guess the oxidation states, using ICSD oxidation state probabilities (with max_sites=-1). Default is 10 seconds.
timeout_2 (float) – Timeout in seconds for the third attempt to guess the oxidation states, using ICSD oxidation state probabilities (without max_sites=-1). Default is 15 seconds.
break_early_if_expensive (bool) – Whether to stop the function if the first oxi state guessing attempt (with BVAnalyzer) fails and the cost estimate for the ICSD probability guessing is high (expected to take a long time; > 10 seconds). Default is False.
common_oxi_states (bool) – Whether to allow falling back to just using the most common oxidation states for the elements present (with a warning), if other oxidation state guessing routines fail. Default is True.

Returns:

The structure with oxidation states guessed and set, or False if oxidation states could not be guessed.

Return type:

Structure

doped.core.guess_and_set_struct_oxi_states(structure: Structure, try_without_max_sites: bool = False)[source]

Tries to guess (and set) the oxidation states of the input structure, first using the pymatgen BVAnalyzer class, and if that fails, using the ICSD oxidation state probabilities to guess.

Parameters:

structure (Structure) – The structure for which to guess the oxidation states.
try_without_max_sites (bool) – Whether to try to guess the oxidation states without using the max_sites=-1 argument (True)(which attempts to use the reduced composition for guessing oxi states) or not (False; default), when using the ICSD oxidation state probability guessing.

Returns:

The structure with oxidation states guessed and set, or False if oxidation states could not be guessed.

Return type:

Structure

doped.core.is_shallow(defect_entry: DefectEntry, default: bool = False) → bool[source]

Return whether a DefectEntry is determined to be a shallow (perturbed host) state, based on pydefect eigenvalue analysis.

Parameters:

defect_entry (DefectEntry) – doped DefectEntry object.
default (bool) – Default value to return if the eigenvalue analysis fails (e.g. if eigenvalue data is not present). Default is False.

doped.core.most_common_oxi(element_symbol: str) → int[source]

Convenience function to get the most common oxidation state of an element, using elemental data from pymatgen.

Parameters:: element_symbol (str) – Element symbol.
Returns:: Most common oxidation state of the element.

doped.core.remove_site_oxi_state(site: PeriodicSite)[source]

Remove site oxidation state in-place.

Same method as Structure.remove_oxidation_states(), but applied to an individual site.

Parameters:: site (PeriodicSite) – The site to remove oxidation states from.

doped.core.template_defect_entry_from_structures(defect_supercell: Structure, bulk_supercell: Structure, **kwargs) → DefectEntry[source]

Helper function to create a template/partial DefectEntry from the input bulk and defect supercells.

Uses defect_and_info_from_structures to extract the defect structural information, and creates a corresponding DefectEntry object (which has no bulk_entry and a fake zero-energy sc_entry, and so cannot be used for energy analyses). Primarily intended for internal usage in doped parsing/analysis functions, but can be useful in other workflows.

Parameters:

defect_supercell (Structure) – The defect supercell structure.
bulk_supercell (Structure) – The bulk supercell structure.
**kwargs – Keyword arguments to pass to get_equiv_frac_coords_in_primitive (such as symprec, dist_tol_factor, fixed_symprec_and_dist_tol_factor, verbose) and/or Defect initialization (such as oxi_state, multiplicity, symprec, dist_tol_factor) in the defect_and_info_from_structures function.

Returns:

A partial DefectEntry object containing the defect and defect site information, but no bulk_entry and a zero-energy sc_entry.

Return type:

DefectEntry