doped.analysis module
Code to analyse VASP defect calculations.
These functions are built from a combination of useful modules from pymatgen, alongside substantial modification, in an effort to provide an efficient, user-friendly package for managing and analysing defect calculations, with publication-quality outputs.
- class doped.analysis.DefectParser(defect_entry: DefectEntry, defect_vr: Vasprun | None = None, bulk_vr: Vasprun | None = None, skip_corrections: bool = False, error_tolerance: float = 0.05, parse_projected_eigen: bool | None = None, **kwargs)[source]
Bases: object
Create a DefectParser object, which has methods for parsing the results of defect supercell calculations.
Direct initialisation with DefectParser() is typically not recommended. Rather, DefectParser.from_paths() or defect_entry_from_paths() are preferred, as shown in the doped parsing tutorials.
- Parameters:
defect_entry (DefectEntry) – doped DefectEntry
defect_vr (Vasprun) – pymatgen Vasprun object for the defect supercell calculation
bulk_vr (Vasprun) – pymatgen Vasprun object for the reference bulk supercell calculation
skip_corrections (bool) – Whether to skip calculation and application of finite-size charge corrections to the defect energy (not recommended in most cases). Default = False.
error_tolerance (float) – If the estimated error in the defect charge correction, based on the variance of the potential in the sampling region, is greater than this value (in eV), then a warning is raised. (default: 0.05 eV)
parse_projected_eigen (bool) – Whether to parse the projected eigenvalues & orbitals from the bulk and defect calculations (so DefectEntry.get_eigenvalue_analysis() can then be used with no further parsing). Will initially try to load orbital projections from vasprun.xml(.gz) files (slightly slower but more accurate), or failing that from PROCAR(.gz) files if present in the bulk/defect directories. Parsing this data can increase total parsing time by anywhere from ~5-25%, so set to False if parsing speed is crucial. Default is None, which will attempt to load this data but with no warning if it fails (otherwise if True a warning will be printed).
**kwargs – Keyword arguments to pass to DefectParser() methods (load_FNV_data(), load_eFNV_data(), load_bulk_gap_data()), point_symmetry_from_defect_entry() or defect_from_structures, including bulk_locpot_dict, bulk_site_potentials, use_MP, mpid, api_key, symprec or oxi_state. Primarily used by DefectsParser to expedite parsing by avoiding reloading bulk data for each defect.
- classmethod from_paths(defect_path: str | PathLike, bulk_path: str | PathLike | None = None, bulk_vr: Vasprun | None = None, bulk_procar: EasyunfoldProcar | Procar | None = None, dielectric: float | int | ndarray | list | None = None, charge_state: int | None = None, initial_defect_structure_path: str | PathLike | None = None, skip_corrections: bool = False, error_tolerance: float = 0.05, bulk_band_gap_vr: str | PathLike | Vasprun | None = None, parse_projected_eigen: bool | None = None, **kwargs)[source]
Parse the defect calculation outputs in defect_path and return the DefectParser object. By default, the DefectParser.defect_entry.name attribute (later used to label defects in plots) is set to the defect_path folder name (if it is a recognised defect name), else it is set to the default doped name for that defect (using the estimated unrelaxed defect structure, for the point group and neighbour distances).
Note that the bulk and defect supercells should have the same definitions/basis sets (for site-matching and finite-size charge corrections to work appropriately).
- Parameters:
defect_path (PathLike) – Path to defect supercell folder (containing at least vasprun.xml(.gz)).
bulk_path (PathLike) – Path to bulk supercell folder (containing at least vasprun.xml(.gz)). Not required if bulk_vr is provided.
bulk_vr (Vasprun) – pymatgen Vasprun object for the reference bulk supercell calculation, if already loaded (can be supplied to expedite parsing). Default is None.
bulk_procar (Procar) – easyunfold/pymatgen Procar object, for the reference bulk supercell calculation if already loaded (can be supplied to expedite parsing). Default is None.
dielectric (float or int or 3x1 matrix or 3x3 matrix) – Total dielectric constant (ionic + static contributions), in the same xyz Cartesian basis as the supercell calculations (likely but not necessarily the same as the raw output of a VASP dielectric calculation, if an oddly-defined primitive cell is used). If not provided, charge corrections cannot be computed and so skip_corrections will be set to True. See https://doped.readthedocs.io/en/latest/GGA_workflow_tutorial.html#dielectric-constant for information on calculating and converging the dielectric constant.
charge_state (int) – Charge state of defect. If not provided, will be automatically determined from defect calculation outputs, or if that fails, using the defect folder name (must end in “_+X” or “_-X” where +/-X is the defect charge state).
initial_defect_structure_path (PathLike) – Path to the initial/unrelaxed defect structure. Only recommended for use if structure matching with the relaxed defect structure(s) fails (rare). Default is None.
skip_corrections (bool) – Whether to skip the calculation and application of finite-size charge corrections to the defect energy (not recommended in most cases). Default = False.
error_tolerance (float) – If the estimated error in the defect charge correction, based on the variance of the potential in the sampling region, is greater than this value (in eV), then a warning is raised. (default: 0.05 eV)
bulk_band_gap_vr (PathLike or Vasprun) – Path to a vasprun.xml(.gz) file, or a pymatgen Vasprun object, from which to determine the bulk band gap and band edge positions. If the VBM/CBM occur at k-points which are not included in the bulk supercell calculation, then this parameter should be used to provide the output of a bulk bandstructure calculation so that these are correctly determined. Alternatively, you can edit/add the "gap" and "vbm" entries in self.defect_entry.calculation_metadata to match the correct (eigen)values. If None, will use DefectEntry.calculation_metadata["bulk_path"] (i.e. the bulk supercell calculation output).
Note that the "gap" and "vbm" values should only affect the reference for the Fermi level values output by doped (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.
parse_projected_eigen (bool) – Whether to parse the projected eigenvalues & orbitals from the bulk and defect calculations (so DefectEntry.get_eigenvalue_analysis() can then be used with no further parsing). Will initially try to load orbital projections from vasprun.xml(.gz) files (slightly slower but more accurate), or failing that from PROCAR(.gz) files if present in the bulk/defect directories. Parsing this data can increase total parsing time by anywhere from ~5-25%, so set to False if parsing speed is crucial. Default is None, which will attempt to load this data but with no warning if it fails (otherwise if True a warning will be printed).
**kwargs – Keyword arguments to pass to DefectParser() methods (load_FNV_data(), load_eFNV_data(), load_bulk_gap_data()), point_symmetry_from_defect_entry() or defect_from_structures, including bulk_locpot_dict, bulk_site_potentials, use_MP, mpid, api_key, symprec or oxi_state. Primarily used by DefectsParser to expedite parsing by avoiding reloading bulk data for each defect.
- Returns:
DefectParser object.
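The folder-name fallback described for charge_state can be illustrated with a short sketch. This is not doped's implementation: the helper name charge_state_from_folder_name and the regular expression are assumptions, based only on the stated rule that the defect folder name must end in "_+X" or "_-X".

```python
import re
from pathlib import Path

# Hypothetical helper (not part of doped): illustrates the documented
# naming rule that a defect folder name ends in "_+X" or "_-X".
_CHARGE_SUFFIX = re.compile(r"_([+-]\d+)$")


def charge_state_from_folder_name(defect_path):
    """Return the signed integer charge encoded at the end of the folder name."""
    folder_name = Path(defect_path).name
    match = _CHARGE_SUFFIX.search(folder_name)
    if match is None:
        raise ValueError(
            f"Folder name {folder_name!r} does not end in '_+X' or '_-X'."
        )
    return int(match.group(1))


print(charge_state_from_folder_name("CdTe/v_Cd_-2"))
print(charge_state_from_folder_name("Te_i_+1"))
```

In this sketch only the last path component is inspected, and names without an explicit sign are rejected. How doped itself treats such names is not stated here.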
- load_FNV_data(bulk_locpot_dict: dict | None = None)[source]
Load metadata required for performing Freysoldt correction (i.e. LOCPOT planar-averaged potential dictionary).
Requires “bulk_path” and “defect_path” to be present in DefectEntry.calculation_metadata, and VASP LOCPOT files to be present in these directories. Can read compressed “LOCPOT.gz” files. The bulk_locpot_dict can be supplied if already parsed, for expedited parsing of multiple defects.
Saves the bulk_locpot_dict and defect_locpot_dict dictionaries (containing the planar-averaged electrostatic potentials along each axis direction) to the DefectEntry.calculation_metadata dict, for use with DefectEntry.get_freysoldt_correction().
- Parameters:
bulk_locpot_dict (dict) – Planar-averaged potential dictionary for bulk supercell, if already parsed. If None (default), will load from LOCPOT(.gz) file in defect_entry.calculation_metadata["bulk_path"].
- Returns:
bulk_locpot_dict for reuse in parsing other defect entries
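As a rough illustration of what a "planar-averaged potential along each axis direction" means, the sketch below averages a small 3D grid over the two directions perpendicular to each axis. It uses plain nested lists rather than a parsed LOCPOT. The function name planar_average and the dictionary keys "0", "1", "2" are assumptions of this sketch, not a description of doped's stored format.

```python
def planar_average(grid, axis):
    """Average a 3D grid (nested lists, indexed [x][y][z]) over the two
    directions perpendicular to `axis`, giving one value per plane."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    shape = (nx, ny, nz)
    sums = [0.0] * shape[axis]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                sums[(i, j, k)[axis]] += grid[i][j][k]
    points_per_plane = (nx * ny * nz) // shape[axis]
    return [total / points_per_plane for total in sums]


# Toy 2x2x3 "potential" that varies only along z.
toy_grid = [[[0.0, 1.0, 2.0] for _ in range(2)] for _ in range(2)]

# Keys "0", "1", "2" for the three axes are an assumption of this sketch.
toy_locpot_dict = {str(axis): planar_average(toy_grid, axis) for axis in range(3)}
print(toy_locpot_dict)
```

Because the toy potential varies only along z, the averages along x and y are flat while the average along z recovers the 0, 1, 2 profile.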
- load_and_check_calculation_metadata()[source]
Pull metadata about the defect supercell calculations from the outputs, and check if the defect and bulk supercell calculation settings are compatible.
- load_bulk_gap_data(bulk_band_gap_vr: str | PathLike | Vasprun | None = None, use_MP: bool = False, mpid: str | None = None, api_key: str | None = None)[source]
Load the "gap" and "vbm" values for the parsed DefectEntry objects.
If bulk_band_gap_vr is provided, then these values are parsed from it, else taken from the parsed bulk supercell calculation.
Alternatively, one can query the Materials Project (MP) database for the bulk gap data, using use_MP = True, in which case the MP entry with the lowest number ID and composition matching the bulk will be used, or the MP ID (mpid) of the bulk material to use can be specified. This is not recommended as it will correspond to a severely-underestimated GGA DFT bandgap!
- Parameters:
bulk_band_gap_vr (PathLike or Vasprun) – Path to a vasprun.xml(.gz) file, or a pymatgen Vasprun object, from which to determine the bulk band gap and band edge positions. If the VBM/CBM occur at k-points which are not included in the bulk supercell calculation, then this parameter should be used to provide the output of a bulk bandstructure calculation so that these are correctly determined. Alternatively, you can edit/add the "gap" and "vbm" entries in self.defect_entry.calculation_metadata to match the correct (eigen)values. If None, will use DefectEntry.calculation_metadata["bulk_path"] (i.e. the bulk supercell calculation output).
Note that the "gap" and "vbm" values should only affect the reference for the Fermi level values output by doped (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.
use_MP (bool) – If True, will query the Materials Project database for the bulk gap data.
mpid (str) – If provided, will query the Materials Project database for the bulk gap data, using this Materials Project ID.
api_key (str) – Materials API key to access database.
- load_eFNV_data(bulk_site_potentials: list | None = None)[source]
Load metadata required for performing Kumagai correction (i.e. atomic site potentials from the OUTCAR files).
Requires “bulk_path” and “defect_path” to be present in DefectEntry.calculation_metadata, and VASP OUTCAR files to be present in these directories. Can read compressed OUTCAR.gz files. The bulk_site_potentials can be supplied if already parsed, for expedited parsing of multiple defects.
Saves the bulk_site_potentials and defect_site_potentials lists (containing the atomic site electrostatic potentials, from -1*np.array(Outcar.electrostatic_potential)) to DefectEntry.calculation_metadata, for use with DefectEntry.get_kumagai_correction().
- Parameters:
bulk_site_potentials (list) – Atomic site potentials for the bulk supercell, if already parsed. If None (default), will load from OUTCAR(.gz) file in defect_entry.calculation_metadata["bulk_path"].
- Returns:
bulk_site_potentials for reuse in parsing other defect entries
- class doped.analysis.DefectsParser(output_path: str | PathLike = '.', dielectric: float | int | ndarray | list | None = None, subfolder: str | PathLike | None = None, bulk_path: str | PathLike | None = None, skip_corrections: bool = False, error_tolerance: float = 0.05, bulk_band_gap_vr: str | PathLike | Vasprun | None = None, processes: int | None = None, json_filename: str | PathLike | bool | None = None, parse_projected_eigen: bool | None = None, **kwargs)[source]
Bases: object
A class for rapidly parsing multiple VASP defect supercell calculations for a given host (bulk) material.
Loops over calculation directories in output_path (likely the same output_path used with DefectsSet for file generation in doped.vasp) and parses the defect calculations into a dictionary of: {defect_name: DefectEntry}, where the defect_name is set to the defect calculation folder name (if it is a recognised defect name), else it is set to the default doped name for that defect (using the estimated unrelaxed defect structure, for the point group and neighbour distances). By default, searches for folders in output_path with subfolder containing vasprun.xml(.gz) files, and tries to parse them as DefectEntry objects.
By default, tries multiprocessing to speed up defect parsing, which can be controlled with processes. If parsing hangs, this may be due to memory issues, in which case you should reduce processes (e.g. to 4 or fewer).
Defect charge states are automatically determined from the defect calculation outputs if POTCARs are set up with pymatgen (see docs Installation page), or if that fails, using the defect folder name (must end in “_+X” or “_-X” where +/-X is the defect charge state).
Uses the (single) DefectParser class to parse the individual defect calculations. Note that the bulk and defect supercells should have the same definitions/basis sets (for site-matching and finite-size charge corrections to work appropriately).
- Parameters:
output_path (PathLike) – Path to the output directory containing the defect calculation folders (likely the same output_path used with DefectsSet for file generation in doped.vasp). Default = current directory.
dielectric (float or int or 3x1 matrix or 3x3 matrix) – Total dielectric constant (ionic + static contributions), in the same xyz Cartesian basis as the supercell calculations (likely but not necessarily the same as the raw output of a VASP dielectric calculation, if an oddly-defined primitive cell is used). If not provided, charge corrections cannot be computed and so skip_corrections will be set to True. See https://doped.readthedocs.io/en/latest/GGA_workflow_tutorial.html#dielectric-constant for information on calculating and converging the dielectric constant.
subfolder (PathLike) – Name of subfolder(s) within each defect calculation folder (in the output_path directory) containing the VASP calculation files to parse (e.g. vasp_ncl, vasp_std, vasp_gam etc.). If not specified, doped checks first for vasp_ncl, vasp_std, vasp_gam subfolders with calculation outputs (vasprun.xml(.gz) files) and uses the highest level VASP type (ncl > std > gam) found as subfolder, otherwise uses the defect calculation folder itself with no subfolder (set subfolder = "." to enforce this).
bulk_path (PathLike) – Path to bulk supercell reference calculation folder. If not specified, searches for folder with name “X_bulk” in the output_path directory (matching the default doped name for the bulk supercell reference folder).
skip_corrections (bool) – Whether to skip the calculation & application of finite-size charge corrections to the defect energies (not recommended in most cases). Default = False.
error_tolerance (float) – If the estimated error in any charge correction, based on the variance of the potential in the sampling region, is greater than this value (in eV), then a warning is raised. (default: 0.05 eV)
bulk_band_gap_vr (PathLike or Vasprun) – Path to a vasprun.xml(.gz) file, or a pymatgen Vasprun object, from which to determine the bulk band gap and band edge positions. If the VBM/CBM occur at k-points which are not included in the bulk supercell calculation, then this parameter should be used to provide the output of a bulk bandstructure calculation so that these are correctly determined. Alternatively, you can edit/add the "gap" and "vbm" entries in self.defect_entry.calculation_metadata to match the correct (eigen)values. If None, will use DefectEntry.calculation_metadata["bulk_path"] (i.e. the bulk supercell calculation output).
Note that the "gap" and "vbm" values should only affect the reference for the Fermi level values output by doped (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.
processes (int) – Number of processes to use for multiprocessing for expedited parsing. If not set, defaults to one less than the number of CPUs available. Set to 1 for no multiprocessing.
json_filename (PathLike) – Filename to save the parsed defect entries dict (DefectsParser.defect_dict) to in output_path, to avoid having to re-parse defects when later analysing further and aiding calculation provenance. Can be reloaded using the loadfn function from monty.serialization (and then input to DefectThermodynamics etc.). If None (default), set as {Host Chemical Formula}_defect_dict.json.gz. If False, no json file is saved.
parse_projected_eigen (bool) – Whether to parse the projected eigenvalues & orbitals from the bulk and defect calculations (so DefectEntry.get_eigenvalue_analysis() can then be used with no further parsing). Will initially try to load orbital projections from vasprun.xml(.gz) files (slightly slower but more accurate), or failing that from PROCAR(.gz) files if present in the bulk/defect directories. Parsing this data can increase total parsing time by anywhere from ~5-25%, so set to False if parsing speed is crucial. Default is None, which will attempt to load this data but with no warning if it fails (otherwise if True a warning will be printed).
**kwargs – Keyword arguments to pass to DefectParser() methods (load_FNV_data(), load_eFNV_data(), load_bulk_gap_data()), point_symmetry_from_defect_entry() or defect_from_structures, including bulk_locpot_dict, bulk_site_potentials, use_MP, mpid, api_key, symprec or oxi_state. Primarily used by DefectsParser to expedite parsing by avoiding reloading bulk data for each defect.
- defect_dict
Dictionary of parsed defect calculations in the format: {"defect_name": DefectEntry}, where the defect_name is set to the defect calculation folder name (if it is a recognised defect name), else it is set to the default doped name for that defect (using the estimated unrelaxed defect structure, for the point group and neighbour distances).
- Type:
dict
- get_defect_thermodynamics(chempots: dict | None = None, el_refs: dict | None = None, vbm: float | None = None, band_gap: float | None = None, dist_tol: float = 1.5, check_compatibility: bool = True, bulk_dos: FermiDos | None = None, skip_check: bool = False) DefectThermodynamics [source]
Generates a DefectThermodynamics object from the parsed DefectEntry objects in self.defect_dict, which can then be used to analyse and plot the defect thermodynamics (formation energies, transition levels, concentrations etc).
Note that the DefectEntry.name attributes (rather than the defect_name key in the defect_dict) are used to label the defects in plots.
- Parameters:
chempots (dict) – Dictionary of chemical potentials to use for calculating the defect formation energies. This can have the form of {"limits": [{'limit': [chempot_dict]}]} (the format generated by doped's chemical potential parsing functions (see tutorials)), which allows easy analysis over a range of chemical potentials, where the limit(s) (chemical potential limit(s)) to analyse/plot can later be chosen using the limits argument.
Alternatively this can be a dictionary of chemical potentials for a single limit (limit), in the format: {element symbol: chemical potential}. If manually specifying chemical potentials this way, you can set the el_refs option with the DFT reference energies of the elemental phases in order to show the formal (relative) chemical potentials above the formation energy plot, in which case it is the formal chemical potentials (i.e. relative to the elemental references) that should be given here, otherwise the absolute (DFT) chemical potentials should be given.
If None (default), sets all chemical potentials to zero. Chemical potentials can also be supplied later in each analysis function. (Default: None)
el_refs (dict) – Dictionary of elemental reference energies for the chemical potentials in the format: {element symbol: reference energy} (to determine the formal chemical potentials, when chempots has been manually specified as {element symbol: chemical potential}). Unnecessary if chempots is provided in format generated by doped (see tutorials). (Default: None)
vbm (float) – VBM eigenvalue to use as Fermi level reference point for analysis. If None (default), will use "vbm" from the calculation_metadata dict attributes of the parsed DefectEntry objects, which by default is taken from the bulk supercell VBM (unless bulk_band_gap_vr is set during parsing). Note that vbm should only affect the reference for the Fermi level values output by doped (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.
band_gap (float) – Band gap of the host, to use for analysis. If None (default), will use "gap" from the calculation_metadata dict attributes of the parsed DefectEntry objects.
dist_tol (float) – Threshold for the closest distance (in Å) between equivalent defect sites, for different species of the same defect type, to be grouped together (for plotting and transition level analysis). If the minimum distance between equivalent defect sites is less than dist_tol, then they will be grouped together, otherwise treated as separate defects. (Default: 1.5)
check_compatibility (bool) – Whether to check the compatibility of the bulk entry for each defect entry (i.e. that all reference bulk energies are the same). (Default: True)
bulk_dos (FermiDos or Vasprun or PathLike) – pymatgen FermiDos for the bulk electronic density of states (DOS), for calculating Fermi level positions and defect/carrier concentrations. Alternatively, can be a pymatgen Vasprun object or path to the vasprun.xml(.gz) output of a bulk DOS calculation in VASP. Can also be provided later when using get_equilibrium_fermi_level(), get_quenched_fermi_level_and_concentrations etc, or set using DefectThermodynamics.bulk_dos = ... (with the same input options).
Usually this is a static calculation with the primitive cell of the bulk material, with relatively dense k-point sampling (especially for materials with disperse band edges) to ensure an accurately-converged DOS and thus Fermi level. ISMEAR = -5 (tetrahedron smearing) is usually recommended for best convergence wrt k-point sampling. Consistent functional settings should be used for the bulk DOS and defect supercell calculations. (Default: None)
skip_check (bool) – Whether to skip the warning about the DOS VBM differing from the defect entries VBM by >0.05 eV. Should only be used when the reason for this difference is known/acceptable. (Default: False)
- Returns:
doped DefectThermodynamics object (DefectThermodynamics)
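The relationship between the single-limit chempots format and el_refs can be made concrete with a small sketch. It assumes only the standard definition that a formal chemical potential is the absolute (DFT) value minus the elemental reference energy. The function absolute_chempots and all numbers are illustrative and not part of doped.

```python
def absolute_chempots(formal_chempots, el_refs):
    """Convert formal chemical potentials (relative to the elemental
    references) to absolute (DFT) values: mu_abs = mu_formal + E_ref."""
    missing = set(formal_chempots) - set(el_refs)
    if missing:
        raise KeyError(f"No elemental reference energy for: {sorted(missing)}")
    return {el: formal_chempots[el] + el_refs[el] for el in formal_chempots}


# Illustrative numbers only (eV); not taken from any real calculation.
formal = {"Cd": -1.25, "Te": 0.0}  # {element symbol: formal chemical potential}
el_refs = {"Cd": -1.0, "Te": -4.5}  # {element symbol: reference energy}

print(absolute_chempots(formal, el_refs))
```

With el_refs supplied, the formal values (here a Te-rich limit with a formal Te potential of zero) are what would be passed as chempots. Without el_refs, the absolute values would be passed instead.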
- doped.analysis.check_and_set_defect_entry_name(defect_entry: DefectEntry, possible_defect_name: str = '', bulk_symm_ops: list | None = None) None [source]
Check that possible_defect_name is in a format recognised by doped (i.e. in the format "{defect_name}_{optional_site_info}_{charge_state}").
If the DefectEntry.name attribute is not defined or does not end with the charge state, then the entry will be renamed with the doped default name for the unrelaxed defect (i.e. using the point symmetry of the defect site in the bulk cell).
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
possible_defect_name (str) – Possible defect name (usually the folder name) to check if recognised by doped, otherwise defect name is re-determined.
bulk_symm_ops (list) – List of symmetry operations of the defect_entry.bulk_supercell structure (used in determining the unrelaxed point symmetry), to avoid re-calculating. Default is None (recalculates).
- doped.analysis.defect_entry_from_paths(defect_path: str | PathLike, bulk_path: str | PathLike, dielectric: float | int | ndarray | list | None = None, charge_state: int | None = None, initial_defect_structure_path: str | PathLike | None = None, skip_corrections: bool = False, error_tolerance: float = 0.05, bulk_band_gap_vr: str | PathLike | Vasprun | None = None, **kwargs)[source]
Parse the defect calculation outputs in defect_path and return the parsed DefectEntry object.
By default, the DefectEntry.name attribute (later used to label the defects in plots) is set to the defect_path folder name (if it is a recognised defect name), else it is set to the default doped name for that defect (using the estimated unrelaxed defect structure, for the point group and neighbour distances).
Note that the bulk and defect supercells should have the same definitions/basis sets (for site-matching and finite-size charge corrections to work appropriately).
- Parameters:
defect_path (PathLike) – Path to defect supercell folder (containing at least vasprun.xml(.gz)).
bulk_path (PathLike) – Path to bulk supercell folder (containing at least vasprun.xml(.gz)).
dielectric (float or int or 3x1 matrix or 3x3 matrix) – Total dielectric constant (ionic + static contributions), in the same xyz Cartesian basis as the supercell calculations (likely but not necessarily the same as the raw output of a VASP dielectric calculation, if an oddly-defined primitive cell is used). If not provided, charge corrections cannot be computed and so skip_corrections will be set to True. See https://doped.readthedocs.io/en/latest/GGA_workflow_tutorial.html#dielectric-constant for information on calculating and converging the dielectric constant.
charge_state (int) – Charge state of defect. If not provided, will be automatically determined from the defect calculation outputs.
initial_defect_structure_path (PathLike) – Path to the initial/unrelaxed defect structure. Only recommended for use if structure matching with the relaxed defect structure(s) fails (rare). Default is None.
skip_corrections (bool) – Whether to skip the calculation and application of finite-size charge corrections to the defect energy (not recommended in most cases). Default = False.
error_tolerance (float) – If the estimated error in the defect charge correction, based on the variance of the potential in the sampling region, is greater than this value (in eV), then a warning is raised. (default: 0.05 eV)
bulk_band_gap_vr (PathLike or Vasprun) – Path to a vasprun.xml(.gz) file, or a pymatgen Vasprun object, from which to determine the bulk band gap and band edge positions. If the VBM/CBM occur at k-points which are not included in the bulk supercell calculation, then this parameter should be used to provide the output of a bulk bandstructure calculation so that these are correctly determined. Alternatively, you can edit/add the "gap" and "vbm" entries in self.defect_entry.calculation_metadata to match the correct (eigen)values. If None, will use DefectEntry.calculation_metadata["bulk_path"] (i.e. the bulk supercell calculation output).
Note that the "gap" and "vbm" values should only affect the reference for the Fermi level values output by doped (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.
**kwargs – Keyword arguments to pass to DefectParser() methods (load_FNV_data(), load_eFNV_data(), load_bulk_gap_data()), point_symmetry_from_defect_entry() or defect_from_structures, including bulk_locpot_dict, bulk_site_potentials, use_MP, mpid, api_key, symprec or oxi_state.
- Returns:
Parsed DefectEntry object.
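The dielectric parameter accepts a scalar, a 3x1 set of values, or a full 3x3 matrix. One plausible way to bring these onto a common footing is to expand each into a 3x3 tensor, as sketched below. The helper dielectric_to_3x3 is hypothetical, and treating three values as the diagonal of the tensor is an assumption of this sketch rather than a statement about doped's internals.

```python
def dielectric_to_3x3(dielectric):
    """Hypothetical helper: expand a scalar, three values, or a 3x3 input
    into a 3x3 nested-list tensor."""
    if isinstance(dielectric, (int, float)):
        # Isotropic case: same value along x, y and z.
        diagonal = [float(dielectric)] * 3
    else:
        rows = [
            row if isinstance(row, (int, float)) else list(row)
            for row in dielectric
        ]
        if len(rows) != 3:
            raise ValueError("Expected a scalar, three values, or a 3x3 matrix.")
        if all(isinstance(row, (int, float)) for row in rows):
            # Flat sequence [xx, yy, zz].
            diagonal = [float(value) for value in rows]
        elif all(isinstance(row, list) and len(row) == 1 for row in rows):
            # 3x1 column [[xx], [yy], [zz]].
            diagonal = [float(row[0]) for row in rows]
        elif all(isinstance(row, list) and len(row) == 3 for row in rows):
            # Already a full 3x3 tensor.
            return [[float(value) for value in row] for row in rows]
        else:
            raise ValueError("Expected a scalar, three values, or a 3x3 matrix.")
    return [[diagonal[i] if i == j else 0.0 for j in range(3)] for i in range(3)]


print(dielectric_to_3x3(9))
print(dielectric_to_3x3([9, 9, 7]))
```

Whatever form is supplied, the values should be expressed in the same xyz Cartesian basis as the supercell calculations, as noted in the parameter description.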
- doped.analysis.defect_from_structures(bulk_supercell: Structure, defect_supercell: Structure, return_all_info: bool = False, bulk_voronoi_node_dict: dict | None = None, skip_atom_mapping_check: bool = False, **kwargs)[source]
Auto-determines the defect type and defect site from the supplied bulk and defect structures, and returns a corresponding Defect object.
If return_all_info is set to true, then also returns:
- relaxed defect site in the defect supercell
- the defect site in the bulk supercell
- defect site index in the defect supercell
- bulk site index (index of defect site in bulk supercell)
- guessed initial defect structure (before relaxation)
- ‘unrelaxed defect structure’ (also before relaxation, but with interstitials at their final relaxed positions, and all bulk atoms at their unrelaxed positions).
- Parameters:
bulk_supercell (Structure) – Bulk supercell structure.
defect_supercell (Structure) – Defect structure to use for identifying the defect site and type.
return_all_info (bool) – If True, returns additional python objects related to the site-matching, listed above. (Default = False)
bulk_voronoi_node_dict (dict) – Dictionary of bulk supercell Voronoi node information, for expedited site-matching. If None, will be re-calculated.
skip_atom_mapping_check (bool) – If True, skips the atom mapping check which ensures that the bulk and defect supercell lattice definitions are matched (important for accurate defect site determination and charge corrections). Can be used to speed up parsing when you are sure the cell definitions match (e.g. both supercells were generated with doped). Default is False.
**kwargs – Keyword arguments to pass to Defect initialization, such as oxi_state or multiplicity. These are mainly intended for use cases when fast site matching and Defect creation are desired (e.g. when analysing MD trajectories of defects), where providing these parameters can greatly speed up parsing. Setting oxi_state='N/A' and multiplicity=1 will skip their auto-determination and accelerate parsing, if these properties are not required.
- Returns:
doped Defect object.
If return_all_info is True, then also:
- defect_site (Site):
pymatgen Site object of the relaxed defect site in the defect supercell.
- defect_site_in_bulk (Site):
pymatgen Site object of the defect site in the bulk supercell (i.e. unrelaxed vacancy/substitution site, or final relaxed interstitial site for interstitials).
- defect_site_index (int):
index of defect site in defect supercell (None for vacancies)
- bulk_site_index (int):
index of defect site in bulk supercell (None for interstitials)
- guessed_initial_defect_structure (Structure):
pymatgen Structure object of the guessed initial defect structure.
- unrelaxed_defect_structure (Structure):
pymatgen Structure object of the unrelaxed defect structure.
- bulk_voronoi_node_dict (dict):
Dictionary of bulk supercell Voronoi node information, for further expedited site-matching.
- Return type:
defect (Defect)
- doped.analysis.defect_name_from_structures(bulk_structure: Structure, defect_structure: Structure)[source]
Get the doped/SnB defect name using the bulk and defect structures.
- Parameters:
bulk_structure (Structure) – Bulk (pristine) structure.
defect_structure (Structure) – Defect structure.
- Returns:
Defect name.
- Return type:
str
- doped.analysis.guess_defect_position(defect_supercell: Structure) ndarray[float] [source]
Guess the position (in Cartesian coordinates) of a defect in an input defect supercell, without a bulk/reference supercell.
This is achieved by computing cosine dissimilarities between site SOAP vectors (and the mean SOAP vectors for each species) and then determining the centre of mass of sites, weighted by the squared cosine dissimilarities. For accurate defect site determination, the defect_from_structures function (or underlying code) is preferred. These coordinates are unlikely to directly match the defect position (especially in the presence of random noise), but should provide a pretty good estimate in most cases. If the defect is an extrinsic interstitial/substitution, then this will identify the exact defect site.
- Parameters:
defect_supercell (Structure) – Defect supercell structure.
- Returns:
Guessed position of the defect in Cartesian coordinates.
- Return type:
np.ndarray[float]
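The weighting scheme described for guess_defect_position can be sketched in plain Python. This is a toy illustration, not the doped implementation: the descriptors are arbitrary short vectors standing in for SOAP vectors, periodic boundary conditions are ignored, and the names cosine_dissimilarity and weighted_defect_guess are hypothetical.

```python
import math


def cosine_dissimilarity(u, v):
    """1 - cos(angle between u and v)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def weighted_defect_guess(species, coords, descriptors):
    """Centre of mass of sites, weighted by the squared cosine dissimilarity
    between each site's descriptor and the mean descriptor of its species."""
    # Mean descriptor per species.
    grouped = {}
    for element, vector in zip(species, descriptors):
        grouped.setdefault(element, []).append(vector)
    means = {
        element: [sum(column) / len(vectors) for column in zip(*vectors)]
        for element, vectors in grouped.items()
    }
    # Sites whose environment differs most from the species average
    # receive the largest weights.
    weights = [
        cosine_dissimilarity(vector, means[element]) ** 2
        for element, vector in zip(species, descriptors)
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("All sites look identical; no defect signal found.")
    return [
        sum(w * xyz[axis] for w, xyz in zip(weights, coords)) / total
        for axis in range(3)
    ]


# Four like atoms on a line; the last one has an unusual environment.
species = ["A", "A", "A", "A"]
coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
descriptors = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

guess = weighted_defect_guess(species, coords, descriptors)
print(guess)
```

The guess lands close to, but not exactly on, the anomalous site at x = 3, which mirrors the caveat above that the returned coordinates are an estimate rather than an exact defect position.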