doped.analysis module
Code to analyse VASP defect calculations.
These functions are built from a combination of useful modules from
pymatgen, alongside substantial modification, in an effort to make an
efficient, user-friendly package for managing and analysing defect
calculations, with publication-quality outputs.
- class doped.analysis.DefectParser(defect_entry: DefectEntry, defect_vr: Vasprun | None = None, bulk_vr: Vasprun | None = None, error_tolerance: float = 0.05, parse_projected_eigen: bool | None = None, **kwargs)[source]
Bases: object
Create a DefectParser object, which has methods for parsing the results of defect supercell calculations.
Direct initialisation with DefectParser() is typically not recommended. Rather, DefectParser.from_paths() or defect_entry_from_paths() are preferred.
- Parameters:
defect_entry (DefectEntry) – doped DefectEntry
defect_vr (Vasprun) – pymatgen Vasprun object for the defect supercell calculation.
bulk_vr (Vasprun) – pymatgen Vasprun object for the reference bulk supercell calculation.
error_tolerance (float) – If the estimated error in the defect charge correction, based on the variance of the potential in the sampling region, is greater than this value (in eV), then a warning is raised. Default is 0.05 eV.
parse_projected_eigen (bool) – Whether to parse the projected eigenvalues & magnetization from the bulk and defect calculations (so DefectEntry.get_eigenvalue_analysis() can then be used with no further parsing, and magnetization values can be pulled for SOC / non-collinear magnetism calculations). Will initially try to load orbital projections from vasprun.xml(.gz) files (slightly slower but more accurate), or failing that from PROCAR(.gz) files if present in the bulk/defect directories. Parsing this data can increase total parsing time by anywhere from ~5-25%, so set to False if parsing speed is crucial. Default is None, which will attempt to load this data but with no warning if it fails (otherwise if True a warning will be printed).
**kwargs – Keyword arguments to pass to DefectParser() methods (load_FNV_data(), load_eFNV_data(), load_bulk_gap_data()), point_symmetry_from_defect_entry(), parse_symmetry_and_degeneracy_metadata or defect_and_info_from_structures, including bulk_locpot_dict, bulk_site_potentials, use_MP, mpid, api_key, oxi_state, multiplicity, angle_tolerance, attempt_periodicity_restoration, user_charges, initial_defect_structure_path etc (see their docstrings). Primarily used by DefectsParser to expedite parsing by avoiding reloading bulk data for each defect. Note that bulk_symprec can be supplied as the symprec value to use for determining equivalent sites (and thus defect multiplicities / unrelaxed site symmetries), while an input symprec value will be used for determining relaxed site symmetries.
- apply_corrections()[source]
Get and apply defect corrections, and warn if likely to be inappropriate (based on error tolerances).
- classmethod from_paths(defect_path: str | Path, bulk_path: str | Path | None = None, bulk_vr: Vasprun | None = None, bulk_procar: Procar | None = None, dielectric: float | ndarray | list | None = None, charge_state: int | None = None, skip_corrections: bool = False, error_tolerance: float = 0.05, bulk_band_gap_vr: str | Path | Vasprun | None = None, parse_projected_eigen: bool | None = None, **kwargs)[source]
Parse the defect calculation outputs in defect_path and return the DefectParser object. By default, the DefectParser.defect_entry.name attribute (later used to label defects in plots) is set to the defect_path folder name (if it is a recognised defect name), else it is set to the default doped name for that defect (using the estimated unrelaxed defect structure, for the point group and neighbour distances).
Note that the bulk and defect supercells should have the same definitions/basis sets (for site-matching and finite-size charge corrections to work appropriately).
- Parameters:
defect_path (PathLike) – Path to defect supercell folder (containing at least vasprun.xml(.gz)).
bulk_path (PathLike) – Path to bulk supercell folder (containing at least vasprun.xml(.gz)). Not required if bulk_vr is provided.
bulk_vr (Vasprun) – pymatgen Vasprun object for the reference bulk supercell calculation, if already loaded (can be supplied to expedite parsing). Default is None.
bulk_procar (Procar) – pymatgen Procar object, for the reference bulk supercell calculation if already loaded (can be supplied to expedite parsing). Default is None.
dielectric (float or int or 3x1 matrix or 3x3 matrix) – Total dielectric constant (ionic + static contributions), in the same xyz Cartesian basis as the supercell calculations (likely but not necessarily the same as the raw output of a VASP dielectric calculation, if an oddly-defined primitive cell is used). If not provided, charge corrections cannot be computed and so skip_corrections will be set to True. See https://doped.readthedocs.io/en/latest/GGA_workflow_tutorial.html#dielectric-constant for information on calculating and converging the dielectric constant.
charge_state (int) – Charge state of defect. If not provided, will be automatically determined from defect calculation outputs, or if that fails, using the defect folder name (must end in “_+X” or “_-X” where +/-X is the defect charge state).
skip_corrections (bool) – Whether to skip the calculation and application of finite-size charge corrections to the defect energy (not recommended in most cases). Default = False.
error_tolerance (float) – If the estimated error in the defect charge correction, based on the variance of the potential in the sampling region, is greater than this value (in eV), then a warning is raised. Default is 0.05 eV.
bulk_band_gap_vr (PathLike or Vasprun) – Path to a vasprun.xml(.gz) file, or a pymatgen Vasprun object, from which to determine the bulk band gap and band edge positions. If the VBM/CBM occur at k-points which are not included in the bulk supercell calculation, then this parameter should be used to provide the output of a bulk bandstructure calculation so that these are correctly determined. Alternatively, you can edit the "band_gap" and "vbm" entries in self.defect_entry.calculation_metadata to match the correct (eigen)values. If None, will use DefectEntry.calculation_metadata["bulk_path"] (i.e. the bulk supercell calculation output).
Note that the "band_gap" and "vbm" values should only affect the reference for the Fermi level values output by doped (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.
parse_projected_eigen (bool) – Whether to parse the projected eigenvalues & magnetization from the bulk and defect calculations (so DefectEntry.get_eigenvalue_analysis() can then be used with no further parsing, and magnetization values can be pulled for SOC / non-collinear magnetism calculations). Will initially try to load orbital projections from vasprun.xml(.gz) files (slightly slower but more accurate), or failing that from PROCAR(.gz) files if present in the bulk/defect directories. Parsing this data can increase total parsing time by anywhere from ~5-25%, so set to False if parsing speed is crucial. Default is None, which will attempt to load this data but with no warning if it fails (otherwise if True a warning will be printed).
**kwargs – Keyword arguments to pass to DefectParser() methods (load_FNV_data(), load_eFNV_data(), load_bulk_gap_data()), point_symmetry_from_defect_entry(), parse_symmetry_and_degeneracy_metadata or defect_and_info_from_structures, including bulk_locpot_dict, bulk_site_potentials, use_MP, mpid, api_key, oxi_state, multiplicity, angle_tolerance, attempt_periodicity_restoration, user_charges, initial_defect_structure_path etc (see their docstrings). Primarily used by DefectsParser to expedite parsing by avoiding reloading bulk data for each defect. Note that bulk_symprec can be supplied as the symprec value to use for determining equivalent sites (and thus defect multiplicities / unrelaxed site symmetries), while an input symprec value will be used for determining relaxed site symmetries.
- Returns:
DefectParser object.
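The folder-name fallback described for charge_state above can be pictured with a short sketch. This is not doped's implementation: charge_state_from_folder_name is a hypothetical helper, and accepting an unsigned suffix such as "_0" is an assumption that goes beyond the “_+X” / “_-X” forms stated above.

```python
import re
from typing import Optional


def charge_state_from_folder_name(folder_name: str) -> Optional[int]:
    """Hypothetical helper: read a trailing charge state such as "_+2" or "_-1".

    Returns None when the name does not end in a recognisable charge suffix.
    """
    # Assumption: an unsigned suffix (e.g. "_0") is also accepted here.
    match = re.search(r"_([+-]?\d+)$", folder_name)
    return int(match.group(1)) if match else None


# Example folder names are illustrative only.
for name in ["v_Cd_-2", "Te_i_+1", "Cd_Te_0", "relax"]:
    print(name, charge_state_from_folder_name(name))
```

A name without a trailing charge suffix yields None in this sketch, which is the case where an explicit charge_state argument would be needed.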
- load_FNV_data(bulk_locpot_dict: dict | None = None)[source]
Load metadata required for performing Freysoldt correction (i.e. LOCPOT planar-averaged potential dictionary).
Requires “bulk_path” and “defect_path” to be present in DefectEntry.calculation_metadata, and VASP LOCPOT files to be present in these directories. Can read compressed “LOCPOT.gz” files. The bulk_locpot_dict can be supplied if already parsed, for expedited parsing of multiple defects.
Saves the bulk_locpot_dict and defect_locpot_dict dictionaries (containing the planar-averaged electrostatic potentials along each axis direction) to the DefectEntry.calculation_metadata dict, for use with DefectEntry.get_freysoldt_correction().
- Parameters:
bulk_locpot_dict (dict) – Planar-averaged potential dictionary for bulk supercell, if already parsed. If None (default), will try to load from the LOCPOT(.gz) file in defect_entry.calculation_metadata["bulk_path"].
- Returns:
bulk_locpot_dict for reuse in parsing other defect entries.
- load_and_check_calculation_metadata()[source]
Pull metadata about the defect supercell calculations from the outputs, and check if the defect and bulk supercell calculations settings are compatible.
- load_bulk_gap_data(bulk_band_gap_vr: str | Path | Vasprun | None = None, use_MP: bool = False, mpid: str | None = None, api_key: str | None = None)[source]
Load the "band_gap", "vbm" and "cbm" values for the parsed DefectEntry objects.
If bulk_band_gap_vr is provided, then these values are parsed from it, else taken from the parsed bulk supercell calculation. "band_gap" and "vbm" are used by default when generating DefectThermodynamics objects, to be used in plotting & analysis.
Alternatively, one can query the Materials Project (MP) database for the bulk gap data, using use_MP = True, in which case the MP entry with the lowest number ID and composition matching the bulk will be used, or the MP ID (mpid) of the bulk material to use can be specified. This is not recommended as it will correspond to a severely-underestimated GGA DFT bandgap!
- Parameters:
bulk_band_gap_vr (PathLike or Vasprun) – Path to a vasprun.xml(.gz) file, or a pymatgen Vasprun object, from which to determine the bulk band gap and band edge positions. If the VBM/CBM occur at k-points which are not included in the bulk supercell calculation, then this parameter should be used to provide the output of a bulk bandstructure calculation so that these are correctly determined. Alternatively, you can edit the "band_gap" and "vbm" entries in self.defect_entry.calculation_metadata to match the correct (eigen)values. If None, will use DefectEntry.calculation_metadata["bulk_path"] (i.e. the bulk supercell calculation output).
Note that the "band_gap" and "vbm" values should only affect the reference for the Fermi level values output by doped (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.
use_MP (bool) – If True, will query the Materials Project database for the bulk gap data.
mpid (str) – If provided, will query the Materials Project database for the bulk gap data, using this Materials Project ID.
api_key (str) – Materials Project API key to access database.
- load_eFNV_data(bulk_site_potentials: list | None = None)[source]
Load metadata required for performing Kumagai correction (i.e. atomic site potentials from the OUTCAR files).
Requires “bulk_path” and “defect_path” to be present in DefectEntry.calculation_metadata, and VASP OUTCAR files to be present in these directories. Can read compressed OUTCAR.gz files. The bulk_site_potentials can be supplied if already parsed, for expedited parsing of multiple defects.
Saves the bulk_site_potentials and defect_site_potentials lists (containing the atomic site electrostatic potentials, from -1*np.array(Outcar.electrostatic_potential)) to DefectEntry.calculation_metadata, for use with DefectEntry.get_kumagai_correction().
- Parameters:
bulk_site_potentials (list) – Atomic site potentials for the bulk supercell, if already parsed. If None (default), will load from OUTCAR(.gz) file in defect_entry.calculation_metadata["bulk_path"].
- Returns:
bulk_site_potentials to reuse in parsing other defect entries.
- class doped.analysis.DefectsParser(output_path: str | Path = '.', dielectric: float | ndarray | list | None = None, subfolder: str | Path | None = None, bulk_path: str | Path | None = None, skip_corrections: bool = False, error_tolerance: float = 0.05, bulk_band_gap_vr: str | Path | Vasprun | None = None, processes: int | None = None, json_filename: str | Path | bool | None = None, parse_projected_eigen: bool | None = None, **kwargs)[source]
Bases: object
A class for rapidly parsing multiple VASP defect supercell calculations for a given host (bulk) material.
Loops over calculation directories in output_path (likely the same output_path used with DefectsSet for file generation in doped.vasp) and parses the defect calculations into a dictionary of: {defect_name: DefectEntry}, where the defect_name is set to the defect calculation folder name (if it is a recognised defect name), else it is set to the default doped name for that defect (using the estimated unrelaxed defect structure, for the point group and neighbour distances). By default, searches for folders in output_path with subfolder containing vasprun.xml(.gz) files, and tries to parse them as DefectEntry objects.
By default, tries multiprocessing to speed up defect parsing, which can be controlled with processes. If parsing hangs, this may be due to memory issues, in which case you should manually reduce processes (e.g. <=4).
Defect charge states are automatically determined from the defect calculation outputs if POTCARs are set up with pymatgen (see docs Installation page), or if that fails, using the defect folder name (must end in “_+X” or “_-X” where +/-X is the defect charge state).
Uses the (single) DefectParser class to parse the individual defect calculations. Note that the bulk and defect supercells should have the same definitions/basis sets (for site-matching and finite-size charge corrections to work appropriately).
- Parameters:
output_path (PathLike) – Path to the output directory containing the defect calculation folders (likely the same output_path used with DefectsSet for file generation in doped.vasp). Default is current directory.
dielectric (float or int or 3x1 matrix or 3x3 matrix) – Total dielectric constant (ionic + static contributions), in the same xyz Cartesian basis as the supercell calculations (likely but not necessarily the same as the raw output of a VASP dielectric calculation, if an oddly-defined primitive cell is used). If not provided, charge corrections cannot be computed and so skip_corrections will be set to True. See https://doped.readthedocs.io/en/latest/GGA_workflow_tutorial.html#dielectric-constant for information on calculating and converging the dielectric constant.
subfolder (PathLike) – Name of subfolder(s) within each defect calculation folder (in the output_path directory) containing the VASP calculation files to parse (e.g. vasp_ncl, vasp_std, vasp_gam etc.). If not specified, doped checks, case-insensitively and in order, for "vasp_ncl", "singlepoint", "final", "relax", "vasp_std", "vasp_nkred_std", "vasp_gam" subfolders (following _SUBFOLDER_PRIORITY) with calculation outputs (vasprun.xml(.gz) files), and uses the first matching subfolder name as subfolder, otherwise uses the defect calculation folder itself with no subfolder (set subfolder = "." to enforce this).
bulk_path (PathLike) – Path to bulk supercell reference calculation folder. If not specified, searches for folder with name “X_bulk” in the output_path directory (matching the default doped name for the bulk supercell reference folder). Can be the full path, or the relative path from the output_path directory.
skip_corrections (bool) – Whether to skip the calculation & application of finite-size charge corrections to the defect energies (not recommended in most cases). Default is False.
error_tolerance (float) – If the estimated error in any charge correction, based on the variance of the potential in the sampling region, is greater than this value (in eV), then a warning is raised. Default is 0.05 eV. Note that this warning is skipped for defects which are predicted to not be stable for any Fermi level in the band gap (based on all parsed defects here), or are predicted to be shallow (perturbed host) states according to eigenvalue analysis and only be stable for Fermi levels within a small window to a band edge (taken as the smaller of error_tolerance or 10% of the band gap, by default, or can be set by a shallow_charge_stability_tolerance = X keyword argument).
bulk_band_gap_vr (PathLike or Vasprun) – Path to a vasprun.xml(.gz) file, or a pymatgen Vasprun object, from which to determine the bulk band gap and band edge positions. If the VBM/CBM occur at k-points which are not included in the bulk supercell calculation, then this parameter should be used to provide the output of a bulk bandstructure calculation so that these are correctly determined. Alternatively, you can edit the "band_gap" and "vbm" entries in self.defect_entry.calculation_metadata to match the correct (eigen)values. If None, will use DefectEntry.calculation_metadata["bulk_path"] (i.e. the bulk supercell calculation output).
Note that the "band_gap" and "vbm" values should only affect the reference for the Fermi level values output by doped (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.
processes (int) – Number of processes to use for multiprocessing for expedited parsing. If not set, defaults to one less than the number of CPUs available. Set to 1 for no multiprocessing.
json_filename (PathLike) – Filename to save the parsed defect entries dict (DefectsParser.defect_dict) to in output_path, to avoid having to re-parse defects when later analysing further and aiding calculation provenance. Can be reloaded using the loadfn function from monty.serialization (and then input to DefectThermodynamics etc.). If None (default), set as {Host Chemical Formula}_defect_dict.json.gz. If False, no json file is saved.
parse_projected_eigen (bool) – Whether to parse the projected eigenvalues & magnetization from the bulk and defect calculations (so DefectEntry.get_eigenvalue_analysis() can then be used with no further parsing, and magnetization values can be pulled for SOC / non-collinear magnetism calculations). Will initially try to load orbital projections from vasprun.xml(.gz) files (slightly slower but more accurate), or failing that from PROCAR(.gz) files if present in the bulk/defect directories. Parsing this data can increase total parsing time by anywhere from ~5-25%, so set to False if parsing speed is crucial. Default is None, which will attempt to load this data but with no warning if it fails (otherwise if True a warning will be printed).
**kwargs – Keyword arguments to pass to DefectParser() methods (load_FNV_data(), load_eFNV_data(), load_bulk_gap_data()), point_symmetry_from_defect_entry(), parse_symmetry_and_degeneracy_metadata or defect_and_info_from_structures or get_dimer_bonds(), including bulk_locpot_dict, bulk_site_potentials, use_MP, mpid, api_key, oxi_state, multiplicity, angle_tolerance, attempt_periodicity_restoration, user_charges, initial_defect_structure_path, rtol etc. (see their docstrings); or for controlling shallow defect charge correction error warnings (see error_tolerance description) with shallow_charge_stability_tolerance. Note that bulk_symprec can be supplied as the symprec value to use for determining equivalent sites (and thus defect multiplicities / unrelaxed site symmetries), while an input symprec value will be used for determining relaxed site symmetries.
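The default subfolder search described for the subfolder parameter above (case-insensitive, in priority order, falling back to the defect folder itself) can be sketched as follows. This is an illustration rather than doped's own code: pick_subfolder and SUBFOLDER_PRIORITY are hypothetical names, and the input is assumed to be the names of subfolders already known to contain vasprun.xml(.gz) files.

```python
from typing import Iterable

# Search order quoted in the subfolder parameter description.
SUBFOLDER_PRIORITY = [
    "vasp_ncl", "singlepoint", "final", "relax",
    "vasp_std", "vasp_nkred_std", "vasp_gam",
]


def pick_subfolder(subfolders_with_outputs: Iterable[str]) -> str:
    """Hypothetical helper: first priority match (case-insensitive), else "."."""
    # Map lower-cased names back to the names as they appear on disk.
    by_lower = {name.lower(): name for name in subfolders_with_outputs}
    for candidate in SUBFOLDER_PRIORITY:
        if candidate in by_lower:
            return by_lower[candidate]
    # No recognised subfolder: use the defect calculation folder itself.
    return "."


print(pick_subfolder(["vasp_gam", "VASP_STD"]))
print(pick_subfolder(["my_custom_run"]))
```

Returning the on-disk spelling rather than the lower-cased name keeps the result usable as a path on case-sensitive filesystems.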
- defect_dict
Dictionary of parsed defect calculations in the format: {"defect_name": DefectEntry} where the defect_name is set to the defect calculation folder name (if it is a recognised defect name), else it is set to the default doped name for that defect (using the estimated unrelaxed defect structure, for the point group and neighbour distances).
- Type:
dict
- get_defect_thermodynamics(chempots: dict | None = None, el_refs: dict | None = None, vbm: float | None = None, band_gap: float | None = None, dist_tol: float = 1.5, check_compatibility: bool = True, bulk_dos: FermiDos | None = None, skip_dos_check: bool = False, **kwargs) DefectThermodynamics[source]
Generates a DefectThermodynamics object from the parsed DefectEntry objects in self.defect_dict, which can then be used to analyse and plot the defect thermodynamics (formation energies, transition levels, concentrations etc).
Note that the DefectEntry.name attributes (rather than the defect_name key in the defect_dict) are used to label the defects in plots.
See the DefectThermodynamics and accompanying methods docstrings in doped.thermodynamics for more.
- Parameters:
chempots (dict) – Dictionary of chemical potentials to use for calculating the defect formation energies. This can have the form of {"limits": [{'limit': [chempot_dict]}]} (the format generated by doped's chemical potential parsing functions (see tutorials)) which allows easy analysis over a range of chemical potentials – where limit(s) (chemical potential limit(s)) to analyse/plot can later be chosen using the limits argument.
Alternatively this can be a dictionary of chemical potentials for a single limit, in the format: {element symbol: chemical potential}. If manually specifying chemical potentials this way, you can set the el_refs option with the DFT reference energies of the elemental phases in order to show the formal (relative) chemical potentials above the formation energy plot, in which case it is the formal chemical potentials (i.e. relative to the elemental references) that should be given here, otherwise the absolute (DFT) chemical potentials should be given.
If None (default), sets all chemical potentials to zero. Chemical potentials can also be supplied later in each analysis function. (Default: None)
el_refs (dict) – Dictionary of elemental reference energies for the chemical potentials in the format: {element symbol: reference energy} (to determine the formal chemical potentials, when chempots has been manually specified as {element symbol: chemical potential}). Unnecessary if chempots is provided in format generated by doped (see tutorials).
If None (default), sets all elemental reference energies to zero. Reference energies can also be supplied later in each analysis function, or set using DefectThermodynamics.el_refs = ... (with the same input options).
vbm (float) – VBM eigenvalue to use as Fermi level reference point for analysis. If None (default), will use "vbm" from the calculation_metadata dict attributes of the parsed DefectEntry objects, which by default is taken from the bulk supercell VBM (unless bulk_band_gap_vr is set during parsing). Note that vbm should only affect the reference for the Fermi level values output by doped (as this VBM eigenvalue is used as the zero reference), thus affecting the position of the band edges in the defect formation energy plots and doping window / dopability limit functions, and the reference of the reported Fermi levels.
band_gap (float) – Band gap of the host, to use for analysis. If None (default), will use “band_gap” from the calculation_metadata dict attributes of the parsed DefectEntry objects.
dist_tol (float) – Threshold for the closest distance (in Å) between equivalent defect sites, for different species of the same defect type, to be grouped together (for plotting, transition level analysis and defect concentration calculations). For the most part, if the minimum distance between equivalent defect sites is less than dist_tol, then they will be grouped together, otherwise treated as separate defects. See plot() and get_fermi_level_and_concentrations() docstrings for more information. (Default: 1.5)
check_compatibility (bool) – Whether to check the compatibility of the bulk entry for each defect entry (i.e. that all reference bulk energies are the same). (Default: True)
bulk_dos (FermiDos or Vasprun or PathLike) – pymatgen FermiDos for the bulk electronic density of states (DOS), for calculating Fermi level positions and defect/carrier concentrations. Alternatively, can be a pymatgen Vasprun object or path to the vasprun.xml(.gz) output of a bulk DOS calculation in VASP. Can also be provided later when using get_equilibrium_fermi_level(), get_fermi_level_and_concentrations etc, or set using DefectThermodynamics.bulk_dos = ... (with the same input options).
Usually this is a static calculation with the primitive cell of the bulk material, with relatively dense k-point sampling (especially for materials with disperse band edges) to ensure an accurately-converged DOS and thus Fermi level. Using large NEDOS (>3000) and ISMEAR = -5 (tetrahedron smearing) are recommended for best convergence (wrt k-point sampling) in VASP. Consistent functional settings should be used for the bulk DOS and defect supercell calculations. See https://doped.readthedocs.io/en/latest/Tips.html#density-of-states-dos-calculations (Default: None)
skip_dos_check (bool) – Whether to skip the warning about the DOS VBM differing from the defect entries VBM by >0.05 eV. Should only be used when the reason for this difference is known/acceptable. (Default: False)
**kwargs – Additional keyword arguments to pass to the DefectThermodynamics constructor.
- Returns:
doped DefectThermodynamics object
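The distinction drawn above between absolute (DFT) and formal chemical potentials, where formal values are taken relative to the elemental references, amounts to a per-element subtraction. The sketch below assumes the single-limit {element symbol: chemical potential} format; to_formal_chempots is a hypothetical helper and the numbers are illustrative only, not from any real calculation.

```python
def to_formal_chempots(absolute_chempots: dict, el_refs: dict) -> dict:
    """Hypothetical helper: formal value = absolute (DFT) value - elemental reference."""
    return {el: mu - el_refs[el] for el, mu in absolute_chempots.items()}


# Illustrative values in eV (assumptions, chosen only to show the arithmetic).
el_refs = {"Cd": -1.0, "Te": -4.5}
absolute = {"Cd": -2.25, "Te": -4.5}

formal = to_formal_chempots(absolute, el_refs)
print(formal)
```

In this sketch an element sitting at its elemental reference has a formal chemical potential of zero, which is the sense in which formal values are "relative".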
- doped.analysis.check_and_set_defect_entry_name(defect_entry: DefectEntry, possible_defect_name: str = '') None[source]
Check that possible_defect_name is in a format recognised by doped (i.e. in the format "{defect_name}_{optional_site_info}_{charge_state}").
If the DefectEntry.name attribute is not defined or does not end with a charge state, then the entry will be renamed with the doped default name for the unrelaxed defect (i.e. using the point symmetry of the defect site in the bulk cell).
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
possible_defect_name (str) – Possible defect name (usually the folder name) to check if recognised by doped, otherwise defect name is re-determined.
- doped.analysis.defect_and_info_from_structures(defect_supercell: Structure, bulk_supercell: Structure, skip_atom_mapping_check: bool = False, initial_defect_structure_path: str | Path | None = None, _parameter_order_warn: bool = True, **kwargs) tuple[Defect, PeriodicSite, dict][source]
Generates a corresponding Defect object from the supplied bulk and defect supercells (using defect_from_structures), and returns the Defect object, the relaxed defect site in the defect supercell, and a dictionary of calculation metadata (including the defect site in the bulk supercell, defect site indices in the defect and bulk supercells, the guessed initial defect structure, and the unrelaxed defect structure).
Note that this assumes consistent cell definitions (lattice vectors and bases) for the input defect and bulk supercells, and does not perform any structural re-orientations.
- Parameters:
defect_supercell (Structure) – Defect structure to use for identifying the defect site and type.
bulk_supercell (Structure) – Bulk supercell structure.
skip_atom_mapping_check (bool) – If True, skips the atom mapping check which ensures that the bulk and defect supercell lattice definitions are matched (important for accurate defect site determination and charge corrections). Can be used to speed up parsing when you are sure the cell definitions match (e.g. both supercells were generated with doped). Default is False.
initial_defect_structure_path (PathLike) – Path to the initial/unrelaxed defect structure. Only recommended for use if structure matching with the relaxed defect structure(s) fails (rare). Default is None.
**kwargs – Keyword arguments to pass to get_equiv_frac_coords_in_primitive (such as symprec, dist_tol_factor, fixed_symprec_and_dist_tol_factor, verbose) and/or Defect initialization (such as oxi_state, multiplicity, symprec, dist_tol_factor). Mainly intended for cases where fast site matching and Defect creation are desired (e.g. when analysing MD trajectories of defects), where providing these parameters can greatly speed up parsing. Setting oxi_state='N/A' and multiplicity=1 will skip their auto-determination and accelerate parsing, if these properties are not required.
- Returns:
- defect (Defect): doped Defect object, defined in the primitive structure.
- defect_site (PeriodicSite): pymatgen PeriodicSite object of the relaxed defect site in the defect supercell.
- defect_structure_metadata (dict): Dictionary containing metadata about the defect structure, including:
  - guessed_initial_defect_structure: The guessed initial defect structure (before relaxation).
  - guessed_defect_displacement: Displacement from the guessed initial defect site to the final relaxed site (None for vacancies).
  - defect_site_index: Index of the defect site in the defect supercell (None for vacancies).
  - bulk_site_index: Index of the defect site in the bulk supercell (None for interstitials).
  - unrelaxed_defect_structure: The unrelaxed defect structure (similar to guessed_initial_defect_structure, but with interstitials at their final relaxed positions, and all bulk atoms at their unrelaxed positions).
  - bulk_site: The defect site in the bulk supercell (i.e. unrelaxed vacancy/substitution site, or final relaxed site for interstitials).
- Return type:
tuple[Defect, PeriodicSite, dict]
- doped.analysis.defect_from_structures(defect_supercell: Structure, bulk_supercell: Structure, return_all_info: bool = False, skip_atom_mapping_check: bool = False, _parameter_order_warn: bool = True, **kwargs) Defect | tuple[Defect, PeriodicSite, PeriodicSite, int | None, int | None, Structure, Structure][source]
Auto-determines the defect type and defect site from the supplied bulk and defect structures, and returns a corresponding Defect object with the defect site in the primitive structure.
Note that this assumes consistent cell definitions (lattice vectors and bases) for the input defect and bulk supercells, and does not perform any structural re-orientations.
If return_all_info is set to true, then also returns:
relaxed defect site in the defect supercell
the defect site in the bulk supercell
defect site index in the defect supercell
bulk site index (index of defect site in bulk supercell)
guessed initial defect structure (before relaxation)
‘unrelaxed defect structure’ (also before relaxation, but with interstitials at their final relaxed positions, and all bulk atoms at their unrelaxed positions).
- Parameters:
defect_supercell (Structure) – Defect structure to use for identifying the defect site and type.
bulk_supercell (Structure) – Bulk supercell structure.
return_all_info (bool) – If True, returns additional info related to the site-matching; see return signature. (Default: False)
skip_atom_mapping_check (bool) – If True, skips the atom mapping check which ensures that the bulk and defect supercell lattice definitions are matched (important for accurate defect site determination and charge corrections). Can be used to speed up parsing when you are sure the cell definitions match (e.g. both supercells were generated with doped). Default is False.
**kwargs – Keyword arguments to pass to get_equiv_frac_coords_in_primitive (such as symprec, dist_tol_factor, fixed_symprec_and_dist_tol_factor, verbose) and/or Defect initialization (such as oxi_state, multiplicity, symprec, dist_tol_factor). Mainly intended for cases where fast site matching and Defect creation are desired (e.g. when analysing MD trajectories of defects), where providing these parameters can greatly speed up parsing. Setting oxi_state='N/A' and multiplicity=1 will skip their auto-determination and accelerate parsing, if these properties are not required.
- Returns:
doped Defect object, defined in the primitive structure.
If return_all_info is True, then also returns:
- defect_site (PeriodicSite): pymatgen PeriodicSite object of the relaxed defect site in the defect supercell.
- defect_site_in_bulk (PeriodicSite): pymatgen PeriodicSite object of the defect site in the bulk supercell (i.e. unrelaxed vacancy/substitution site, or final relaxed interstitial site for interstitials).
- defect_site_index (int): Index of defect site in defect supercell (None for vacancies).
- bulk_site_index (int): Index of defect site in bulk supercell (None for interstitials).
- guessed_initial_defect_structure (Structure): pymatgen Structure object of the guessed initial defect structure.
- unrelaxed_defect_structure (Structure): pymatgen Structure object of the unrelaxed defect structure.
- Return type:
defect (Defect)
- doped.analysis.defect_name_from_structures(defect_supercell: Structure, bulk_supercell: Structure, _parameter_order_warn: bool = True, **kwargs) → str
Get the doped/SnB defect name using the bulk and defect structures.
- Parameters:
- Returns:
Defect name.
- Return type:
str
- doped.analysis.defect_site_from_structures(defect_supercell: Structure, bulk_supercell: Structure, return_all_info: bool = False, _parameter_order_warn: bool = True) → PeriodicSite | tuple[PeriodicSite, str, PeriodicSite, int | None, int | None, Structure]
Auto-determines the defect site from the supplied bulk and defect structures, returning the corresponding PeriodicSite.
Note that this assumes consistent cell definitions (lattice vectors and bases) for the input defect and bulk supercells, and does not perform any structural re-orientations.
- Parameters:
- Returns:
pymatgen PeriodicSite object for the relaxed defect site in the defect supercell.
If return_all_info is True, then also returns:
- defect_type (str): The type of defect as a string (interstitial, vacancy or substitution).
- defect_site_in_bulk (PeriodicSite): pymatgen PeriodicSite object of the defect site in the bulk supercell (i.e. unrelaxed vacancy/substitution site, or final relaxed interstitial site for interstitials).
- defect_site_index (int): Index of defect site in defect supercell (None for vacancies).
- bulk_site_index (int): Index of defect site in bulk supercell (None for interstitials).
- unrelaxed_defect_structure (Structure): pymatgen Structure object of the unrelaxed defect structure.
- Return type:
defect_site (PeriodicSite)
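The site-matching idea behind the return values above (a missing defect-supercell index for vacancies, a missing bulk index for interstitials) can be illustrated with a toy, standard-library-only sketch. This is not the doped implementation: the helper names `frac_distance` and `classify_point_defect`, the `(species, fractional_coords)` site representation and the fractional-coordinate tolerance are all assumptions made for illustration.

```python
import math


def frac_distance(a, b):
    """Minimum-image distance between two fractional coordinates.

    Measured in fractional units, so it is only a sensible tolerance
    for roughly cubic cells (an assumption of this toy example).
    """
    deltas = [(x - y) - round(x - y) for x, y in zip(a, b)]
    return math.sqrt(sum(d * d for d in deltas))


def classify_point_defect(bulk_sites, defect_sites, tol=0.1):
    """Hypothetical helper: match sites and classify a single point defect.

    Each site is a (species, fractional_coords) tuple. Returns
    (defect_type, bulk_site_index, defect_site_index).
    """
    unmatched_defect = set(range(len(defect_sites)))
    unmatched_bulk = []
    for i, (species, coords) in enumerate(bulk_sites):
        # Defect-supercell sites of the same species within the tolerance
        candidates = [
            j for j in unmatched_defect
            if defect_sites[j][0] == species
            and frac_distance(coords, defect_sites[j][1]) < tol
        ]
        if candidates:
            closest = min(
                candidates,
                key=lambda j: frac_distance(coords, defect_sites[j][1]),
            )
            unmatched_defect.remove(closest)
        else:
            unmatched_bulk.append(i)

    if len(unmatched_bulk) == 1 and not unmatched_defect:
        return "vacancy", unmatched_bulk[0], None
    if not unmatched_bulk and len(unmatched_defect) == 1:
        return "interstitial", None, unmatched_defect.pop()
    if len(unmatched_bulk) == 1 and len(unmatched_defect) == 1:
        return "substitution", unmatched_bulk[0], unmatched_defect.pop()
    raise ValueError("Could not identify a single point defect.")


bulk = [("Cd", (0.0, 0.0, 0.0)), ("Te", (0.25, 0.25, 0.25)),
        ("Cd", (0.5, 0.5, 0.0)), ("Te", (0.75, 0.75, 0.25))]
# Te vacancy, with small relaxations of the remaining atoms
te_vacancy = [("Cd", (0.01, 0.0, 0.0)), ("Cd", (0.5, 0.49, 0.0)),
              ("Te", (0.75, 0.75, 0.26))]
print(classify_point_defect(bulk, te_vacancy))  # ('vacancy', 1, None)
```

As in the documented functions, the sketch only works if both cells use the same lattice definition and origin, because sites are compared by position without any re-orientation.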
- doped.analysis.guess_defect_position(defect_supercell: Structure, bulk_supercell: Structure | None = None, soap_n_jobs: int = 1, soap_r_cut: float = 5.0, soap_n_max: int = 6, soap_l_max: int = 4) → ndarray[float]
Guess the position (in Cartesian coordinates) of a defect in an input defect supercell, optionally using a bulk/reference supercell (not required).
This is achieved by computing cosine dissimilarities between site SOAP vectors and a reference, and then determining the centre of mass of the squared cosine dissimilarities.
If no bulk_supercell is provided (default), each site’s SOAP vector is compared to the mean SOAP vector of all sites of the same species in defect_supercell. If a bulk_supercell is provided, each defect supercell site’s SOAP vector is instead compared to the SOAP vector of its nearest site (by Cartesian distance, accounting for periodic boundary conditions) in the bulk supercell, which typically gives a stronger signal around the defect site. This assumes the defect and bulk supercells share the same lattice and are in the same origin frame (as is the case for supercells generated by doped).
For accurate defect site determination, the defect_from_structures function (or underlying code) is preferred. These coordinates are unlikely to directly match the defect position (especially in the presence of random noise), but should provide a good estimate in most cases. If the defect is an extrinsic interstitial / substitution, then this will identify the exact defect site.
Performance: Creating SOAP descriptors (via dscribe) is usually the bottleneck. You can: (1) set soap_n_jobs > 1 to parallelise over site-centres; (2) tune soap_l_max / soap_n_max / soap_r_cut as needed. Default hyperparameters are a compact real-species dscribe SOAP (n_max=6, l_max=4).
- Parameters:
defect_supercell (Structure) – Defect supercell structure.
bulk_supercell (Structure | None) – Optional bulk (pristine) reference supercell. When provided, site cosine dissimilarities are computed relative to the nearest matching bulk-supercell site (rather than the per-species mean in the defect supercell). Assumes defect_supercell and bulk_supercell share the same lattice/origin alignment. Default is None.
soap_n_jobs (int) – n_jobs passed to dscribe’s create() (parallelise over site centres). Default is 1 (no parallelisation).
soap_r_cut (float) – SOAP cut-off radius in Å (for dscribe), default 5.0.
soap_n_max (int) – SOAP radial basis size (for dscribe), default 6.
soap_l_max (int) – SOAP maximum angular momentum (for dscribe), default 4.
- Returns:
Guessed position of the defect in Cartesian coordinates.
- Return type:
np.ndarray[float]
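The weighting scheme described for guess_defect_position (squared cosine dissimilarities used as weights for a centre of mass) can be sketched without dscribe. In this minimal example the descriptor vectors are made-up stand-ins for SOAP vectors, the function names are hypothetical, and periodic boundary conditions are ignored, so it illustrates the idea rather than the library's code.

```python
import math


def cosine_dissimilarity(u, v):
    """1 - cos(angle) between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def per_species_mean(species, descriptors):
    """Reference vector for each site: mean descriptor of its species."""
    sums, counts = {}, {}
    for sp, vec in zip(species, descriptors):
        acc = sums.setdefault(sp, [0.0] * len(vec))
        for k, x in enumerate(vec):
            acc[k] += x
        counts[sp] = counts.get(sp, 0) + 1
    return [[x / counts[sp] for x in sums[sp]] for sp in species]


def guess_position(cart_coords, descriptors, references):
    """Centre of mass of the sites, weighted by squared cosine dissimilarity.

    Simplification: plain Cartesian averaging, no periodic images.
    """
    weights = [
        cosine_dissimilarity(d, r) ** 2 for d, r in zip(descriptors, references)
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("All sites match their reference; no defect signal.")
    return tuple(
        sum(w * c[k] for w, c in zip(weights, cart_coords)) / total
        for k in range(3)
    )


# Four sites of one species along x; the last one has a distinct environment
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
species = ["A", "A", "A", "A"]
descriptors = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
references = per_species_mean(species, descriptors)
print(guess_position(coords, descriptors, references))
```

Squaring the dissimilarities suppresses the small, noisy contributions of bulk-like sites, so the weighted centre is pulled strongly towards the sites whose environments differ most from the reference.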
- doped.analysis.parse_symmetry_and_degeneracy_metadata(defect_entry: DefectEntry, **kwargs)
Determine the unrelaxed (‘bulk’) and relaxed defect point symmetries for the input DefectEntry, whether there is any periodicity-breaking in the supercell, and the corresponding orientational degeneracy factor.
If the supercell is detected to break the crystal periodicity, and attempt_periodicity_restoration is True (default), then the function attempts to restore periodicity by stenciling the relaxed defect geometry into a supercell which retains periodicity, and then determines the point symmetry for that.
Results are stored in the calculation_metadata and degeneracy_factors property dicts of the DefectEntry.
- Parameters:
defect_entry (DefectEntry) – The DefectEntry object to parse the symmetry and degeneracy metadata for. Parsed results are stored in the calculation_metadata and degeneracy_factors property dicts of the DefectEntry.
**kwargs – Additional keyword arguments to pass to the point_symmetry_from_defect_entry function, such as symprec, dist_tol_factor, fixed_symprec_and_dist_tol_factor, verbose and bulk_symprec. Also includes attempt_periodicity_restoration, which if True (default), will attempt to restore periodicity for periodicity-breaking defect supercells (mostly an edge case) by stenciling the relaxed defect geometry into a supercell which retains periodicity, and then getting the point symmetry for that.
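To make the link between the two point symmetries and the orientational degeneracy concrete, the sketch below uses a common convention: the degeneracy is the ratio of the number of symmetry operations in the unrelaxed (bulk site) point group to that in the relaxed defect point group. That this is exactly how the function computes it is an assumption here, and the lookup table and function name are hypothetical.

```python
# Number of symmetry operations for a few point groups (Schoenflies notation)
POINT_GROUP_ORDER = {
    "C1": 1, "Cs": 2, "Ci": 2, "C2": 2, "C2v": 4, "C3": 3, "C3v": 6,
    "C4v": 8, "D2d": 8, "D3d": 12, "Td": 24, "Oh": 48,
}


def orientational_degeneracy(bulk_site_symmetry, relaxed_symmetry):
    """Ratio of point-group orders (assumed convention, see text above).

    A non-integer result indicates that the relaxed point group is not a
    subgroup of the bulk site point group, which is worth checking by hand.
    """
    return POINT_GROUP_ORDER[bulk_site_symmetry] / POINT_GROUP_ORDER[relaxed_symmetry]


# A tetrahedral (Td) site relaxing to C3v has 4 equivalent orientations,
# one per <111> axis of the tetrahedron.
print(orientational_degeneracy("Td", "C3v"))  # 4.0
```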
- doped.analysis.shallow_dopant_binding_energy(eff_mass: float, dielectric: float | ndarray | list)
Estimate the binding energy of a shallow dopant/defect in a semiconductor, using effective mass theory.
Discussion here: https://doped.readthedocs.io/en/latest/Tips.html#perturbed-host-states-shallow-defects
For delocalised, shallow states (a.k.a. perturbed host states), the hydrogenic effective mass model typically gives quite a good estimate of the binding energy, at least for dispersive 3D semiconductors.
Note that this formula can also be used to estimate the binding energy of a delocalised (Wannier-Mott) exciton, in which case the reduced effective mass of the electron-hole pair should be used, as:
\[\mu_{\mathrm{reduced}} = \frac{m_e m_h}{m_e + m_h}\]
- Parameters:
eff_mass (float) – Effective mass of the dopant.
dielectric (float or int or 3x1 matrix or 3x3 matrix) – Total (static) dielectric constant (ionic + electronic contributions) of the semiconductor host.
- Returns:
Binding energy of the shallow dopant, in eV.
- Return type:
float
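For orientation, the textbook hydrogenic effective mass model scales the hydrogen binding energy (about 13.6 eV) by the effective mass and the inverse square of the dielectric constant. The sketch below is a standalone illustration of that formula, not the library function. It assumes the effective mass is given in units of the free electron mass and that the dielectric constant is a single isotropic value; how tensor inputs are averaged is not covered. The function names are hypothetical.

```python
RYDBERG_EV = 13.605693  # hydrogen ground-state binding energy, in eV


def reduced_effective_mass(m_e, m_h):
    """Reduced mass of an electron-hole pair (for Wannier-Mott excitons)."""
    return (m_e * m_h) / (m_e + m_h)


def hydrogenic_binding_energy(eff_mass, dielectric):
    """Hydrogenic estimate E_b = 13.6 eV * m* / eps^2.

    Assumes eff_mass is in units of the free electron mass and
    dielectric is a scalar relative permittivity.
    """
    return RYDBERG_EV * eff_mass / dielectric**2


# Illustrative, GaAs-like inputs (m* = 0.067, eps = 12.9): a few meV
print(f"{1000 * hydrogenic_binding_energy(0.067, 12.9):.1f} meV")

# Exciton estimate using the reduced electron-hole mass
mu = reduced_effective_mass(0.1, 0.4)
print(f"{1000 * hydrogenic_binding_energy(mu, 10.0):.1f} meV")
```

Because the binding energy scales as 1/ε², hosts with large dielectric constants and light carriers give binding energies of only a few meV, which is why such states are readily ionised at room temperature.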