doped.generation module
Code to generate Defect objects and supercell structures for ab-initio
calculations.
- class doped.generation.DefectsGenerator(structure: Structure | Atoms | str | Path, extrinsic: str | list | dict | None = None, interstitial_coords: list | None = None, generate_supercell: bool = True, charge_state_gen_kwargs: dict | None = None, supercell_gen_kwargs: dict[str, int | float | bool] | None = None, interstitial_gen_kwargs: dict | bool | None = None, target_frac_coords: list | None = None, processes: int | None = None, **kwargs)[source]
Bases:
MSONableClass for generating
dopedDefectEntryobjects.Generates
dopedDefectEntryobjects for defects in the input host structure. By default, generates all intrinsic defects, but extrinsic defects (impurities) can also be created using theextrinsicargument.Interstitial sites are generated using Voronoi tessellation by default (found to be the most reliable) using the
get_interstitial_sitesfunction, which also generates adsorbate sites if the structure is determined to be a slab/layer/rod. This can be controlled using theinterstitial_gen_kwargsargument. Alternatively, a list of interstitial sites (or single interstitial site) can be manually specified using theinterstitial_coordsargument.By default, supercells are generated for each defect using the
dopedget_ideal_supercell_matrix()function (see docstring), with default settings ofmin_image_distance = 10(minimum distance between periodic images of 10 Å),min_atoms = 50(minimum 50 atoms in the supercell) andideal_threshold = 0.1(allow up to 10% larger supercell if it is a diagonal expansion of the primitive or conventional cell). This uses a custom algorithm indopedto efficiently search over possible supercell transformations and identify that with the minimum number of atoms (hence computational cost) that satisfies the minimum image distance, number of atoms andideal_thresholdconstraints. These settings can be controlled by specifying keyword arguments withsupercell_gen_kwargs, which are passed toget_ideal_supercell_matrix()(e.g. for a minimum image distance of 15 Å with at least 100 atoms,supercell_gen_kwargswould be:{'min_image_distance': 15, 'min_atoms': 100}). If the input structure already satisfies these constraints (for the same number of atoms as thedoped-generated supercell), then it will be used. Alternatively ifgenerate_supercell = False, then no supercell is generated and the input structure is used as the defect & bulk supercell. (Note this may give a slightly different (but fully equivalent) set of coordinates).The algorithm for determining defect entry names is to use the
pymatgendefect name (e.g.v_Cd,Cd_Teetc.) for vacancies / antisites / substitutions, unless there are multiple inequivalent sites for the defect, in which case the point group of the defect site is appended (e.g.v_Cd_Td,Cd_Te_Tdetc.), and if this is still not unique, then element identity and distance to the nearest neighbour of the defect site is appended (e.g.v_Cd_Td_Te2.83,Cd_Te_Td_Cd2.83etc.). For interstitials, the same naming scheme is used, but the point group is always appended to thepymatgenname.Possible charge states for the defects are estimated using the probability of the corresponding defect element oxidation state, the magnitude of the charge state, and the maximum magnitude of the host oxidation states (i.e. how ‘charged’ the host is). Large (absolute) charge states, low probability oxidation states and/or greater charge/oxidation state magnitudes than that of the host are disfavoured. This can be controlled using the
probability_threshold(default = 0.0075) orpadding(default = 1) keys in thecharge_state_gen_kwargsparameter, which are passed to theguess_defect_charge_states()function. The input and computed values used to guess charge state probabilities are provided in theDefectEntry.charge_state_guessing_logattributes. See docs for examples of modifying the generated charge states, and the docstrings ofcharge_state_probability()&get_vacancy_charge_states()for more details on the charge state guessing algorithm. Thedopedalgorithm was found to give optimal performance in terms of efficiency and completeness (see JOSS paper), but of course may not be perfect in all cases, so make sure to critically consider the estimated charge states for your system!Note that Wyckoff letters can depend on the ordering of elements in the conventional standard structure, for which doped uses the
spglibconvention.- Parameters:
structure (Structure) – Structure of the host material, either as a
pymatgenStructure,ASEAtomsor path to a structure file (e.g.CONTCAR). If this is not the primitive unit cell, it will be reduced to the primitive cell for defect generation, before supercell generation.extrinsic (Union[str, list, dict]) – List or dict of elements (or string for single element) to be used for extrinsic defect generation (i.e. dopants/impurities). If a list is provided, all possible substitutional defects for each extrinsic element will be generated. If a dict is provided, the keys should be the host elements to be substituted, and the values the extrinsic element(s) to substitute in; as a string or list. In both cases, all possible extrinsic interstitials are generated.
interstitial_coords (list) – List of fractional coordinates (corresponding to the input structure), or a single set of fractional coordinates, to use as interstitial defect site(s). Default (when
interstitial_coordsnot specified) is to automatically generate interstitial sites using Voronoi tessellation. The inputinterstitial_coordsare converted toDefectsGenerator.prim_interstitial_coords_mult_and_equiv_coords, which are the corresponding fractional coordinates inDefectsGenerator.primitive_structure(which is used for defect generation), along with the multiplicity and equivalent coordinates, sorted according to thedopedconvention.generate_supercell (bool) – Whether to generate a supercell for the output defect entries (using the custom algorithm in
dopedwhich efficiently searches over possible supercell transformations and identifies that with the minimum number of atoms (hence computational cost) that satisfies the minimum image distance, number of atoms andideal_thresholdconstraints – which can be controlled withsupercell_gen_kwargs). IfFalse, then the input structure is used as the defect & bulk supercell (note this may give a slightly different (but fully equivalent) set of coordinates). Default isTrue.charge_state_gen_kwargs (dict) – Keyword arguments to pass to the
guess_defect_charge_statesfunction (such asprobability_threshold(default = 0.0075, used for substitutions and interstitials) andpadding(default = 1, used for vacancies)) to control defect charge state generation.supercell_gen_kwargs (dict) – Keyword arguments to pass to the
get_ideal_supercell_matrixfunction (such asmin_image_distance(default = 10),min_atoms(default = 50),ideal_threshold(default = 0.1),force_cubic– which enforces a (near-)cubic supercell output (default = False), orforce_diagonal(default = False)).interstitial_gen_kwargs (dict, bool) – Keyword arguments to be passed to
get_interstitial_sites()(such asmin_dist(0.9 Å, or 0.5 Å for Hydrogen),clustering_tol(0.8 Å),symm_pref_dist_factor(0.85),stol(0.32),tight_stol(0.02),symprec(0.01),vacuum_radius(1.5 * bulk bond length),include_unique_wyckoffs(False) – see its docstring, parentheses indicate default values), orInterstitialGenerator(where onlymin_distis used) ifinterstitial_coordsis specified. If set toFalse, interstitial generation will be skipped entirely.target_frac_coords (list) – Defects are placed at the closest equivalent site to these fractional coordinates in the generated supercells. Default is
[0.5, 0.5, 0.5](i.e. the supercell centre, to aid visualisation).processes (int) – Number of processes to use for multiprocessing. If not set, defaults to one less than the number of CPUs available.
**kwargs –
Additional keyword arguments for defect generation. Options:
{defect}_elementswhere{defect}isvacancy,substitution, orinterstitial, in which cases only those defects of the specified elements will be generated (where{defect}_elementsis a list of element symbol strings). Setting{defect}_elementsto an empty list will skip defect generation for that defect type entirely.{defect}_charge_statesto specify the charge states to use for all defects of that type (as a list of integers).neutral_onlyto only generate neutral charge states.symprecforget_sga_and_symprec(),get_primitive_structure, defect object initialisation etc. Default is0.01.Keyword arguments for
get_all_equiv_sites, such asdist_tol_factor,fixed_symprec_and_dist_tol_factor, andverbose.
- Key attributes:
- defect_entries (dict):
Dictionary of
{defect_species: DefectEntry}for all defect entries (with charge state and supercell properties) generated.- defects (dict):
Dictionary of
{defect_type: [Defect, ...]}for all defect objects generated.- primitive_structure (Structure):
Primitive cell structure of the host used to generate defects.
- supercell_matrix (Matrix):
Matrix to generate defect/bulk supercells from the primitive cell structure.
- bulk_supercell (Structure):
Supercell structure of the host (equal to
self.primitive_structure * self.supercell_matrix).- conventional_structure (Structure):
Conventional cell structure of the host according to the Bilbao Crystallographic Server (BCS) definition, used to determine defect site Wyckoff labels and multiplicities.
- prim_interstitial_coords_mult_and_equiv_coords (list):
List of interstitial coordinates in the primitive cell, their multiplicity and the equivalent fractional coordinates, as a list of tuples.
DefectsGeneratorinput parameters are also set as attributes.
- add_charge_states(defect_entry_name: str, charge_states: list | int)[source]
Add additional
DefectEntrys with the specified charge states toself.defect_entries.- Parameters:
defect_entry_name (str) – Name of defect entry to add charge states to. Doesn’t need to include the charge state.
charge_states (list) – List of charge states to add to defect entry (e.g. [-2, -3]), or a single charge state (e.g. -2).
- classmethod from_dict(d)[source]
Reconstructs
DefectsGeneratorobject from a dict representation created usingDefectsGenerator.as_dict().- Parameters:
d (dict) – dict representation of
DefectsGenerator.- Returns:
DefectsGeneratorobject
- classmethod from_json(filename: str | Path)[source]
Load a
DefectsGeneratorobject from a json(.gz) file.Note that
.json.gzfiles can be loaded directly.- Parameters:
filename (PathLike) – Filename of json file to load
DefectsGeneratorobject from.- Returns:
DefectsGeneratorobject
- remove_charge_states(defect_entry_name: str, charge_states: list | int)[source]
Remove
DefectEntrys with the specified charge states fromself.defect_entries.- Parameters:
defect_entry_name (str) – Name of defect entry to remove charge states from. Doesn’t need to include the charge state.
charge_states (list) – List of charge states to add to defect entry (e.g. [-2, -3]), or a single charge state (e.g. -2).
- to_json(filename: str | Path | None = None)[source]
Save the
DefectsGeneratorobject as a json file, which can be reloaded with theDefectsGenerator.from_json()class method.Note that file extensions with “.gz” will be automatically compressed (recommended to save space)!
- Parameters:
filename (PathLike) – Filename to save
jsonfile as. IfNone, the filename will be set as{Chemical Formula}_defects_generator.json.gzwhere {Chemical Formula} is the chemical formula of the host material.
- doped.generation.charge_state_probability(charge_state: int, defect_el_oxi_state: int, defect_el_oxi_probability: float, max_host_oxi_magnitude: int, return_log: bool = False) float | dict[source]
Function to estimate the probability of a given defect charge state, using the probability of the corresponding defect element oxidation state, the magnitude of the charge state, and the maximum magnitude of the host oxidation states (i.e. how ‘charged’ the host is).
Disfavours large (absolute) charge states, low probability oxidation states and greater charge/oxidation state magnitudes than that of the host. This charge state probability function is primarily intended for substitutions and interstitials, while the
get_vacancy_charge_states()function is used for vacancies.Specifically, the overall probability is given by the product of these probability factors:
The probability of the corresponding oxidation state of the defect element (e.g. Na_i^+1 has Na in the +1 oxidation state), as given by its prevalence in the ICSD.
The magnitude of the charge state; with a probability function:
1/|charge_state|^(2/3)The magnitude of the charge state relative to the max host oxidation state (i.e. how ‘charged’ the host is); with a probability function:
1/(2*|charge_state - max_host_oxi_magnitude|)^(2/3)ifcharge_state > max_host_oxi_magnitude, otherwise 1.The magnitude of the defect element oxidation state relative to the max host oxidation state; with a probability function:
1/(2*|defect_el_oxi_state - max_host_oxi_magnitude|)^(2/3)ifdefect_el_oxi_state > max_host_oxi_magnitude, otherwise 1.
Note that neutral charge states are always included.
This probability function was found to give optimal performance in terms of efficiency and completeness when tested against other approaches (see the
dopedJOSS paper: https://doi.org/10.21105/joss.06433) but of course may not be perfect in all cases, so make sure to critically consider the estimated charge states for your system!- Parameters:
charge_state (int) – Charge state of defect.
defect_el_oxi_state (int) – Oxidation state of defect element.
defect_el_oxi_probability (float) – Probability of oxidation state of defect element.
max_host_oxi_magnitude (int) – Maximum host oxidation state magnitude.
return_log (bool) – If true, returns a dictionary of input & computed values used to determine charge state probability. Default is False.
- Returns:
Probability of the defect charge state (between 0 and 1) if
return_logisFalse, otherwise a dictionary of input & computed values used to determine charge state probability.
- doped.generation.closest_site_info(defect_entry_or_defect: DefectEntry | Defect, n: int = 1, element_list: list[str] | None = None)[source]
Return the element and distance (rounded to 2 decimal places) of the nth closest site to the defect site in the input
DefectEntryorDefectobject.If there are multiple elements with the same distance (to within ~0.01 Å), then the preference for which element to return is controlled by
element_list. Ifelement_listis not provided, then it is set to match the order of appearance in the structure composition.If
n> 1, then it returns the nth closest site, where the nth site must be at least 0.02 Å further away than the (n-1)th site.- Parameters:
defect_entry_or_defect (Union[DefectEntry, Defect]) –
DefectEntryorDefectobject, to get neighbour info.n (int) – Return the element symbol and distance for the
nth closest site. Default is 1, corresponding to the (1st) closest neighbour.element_list (list) – Sorted list of elements in the host structure to govern the preference of elemental symbols to return, when the distance to multiple neighbours with different elements is the same. Default is to use
_get_element_list(), which follows the order of appearance of elements in theStructurecomposition.
- Returns:
Element symbol and distance (rounded to 2 decimal places) of the nth closest site to the defect site.
- Return type:
str
- doped.generation.get_defect_entry_from_defect(defect: Defect, defect_supercell: Structure, charge_state: int, dummy_species: DummySpecies = DummySpecies X0+, sc_defect_frac_coords: ndarray | None = None)[source]
Generate a
dopedDefectEntryobject from adopedDefectobject.This is used to describe a
Defectwith a specified simulation cell. To set thesc_defect_frac_coordsattribute forDefectEntry(fractional coordinates of the defect in thedefect_supercell), eitherdummy_speciesmust be present in thedefect_supercell(which is taken as the defect site), orsc_defect_frac_coordsmust be set.- Parameters:
defect (Defect) –
doped/pymatgenDefectobject.defect_supercell (Structure) – Defect supercell structure.
charge_state (int) – Charge state of the defect.
dummy_species (DummySpecies) – Dummy species present in the
defect_supercellstructure, used to determinesc_defect_frac_coords. If not found andsc_defect_frac_coordsis not set,DefectEntry.sc_defect_frac_coordsis set toNone. Default isDummySpecies("X").sc_defect_frac_coords (np.ndarray) – Fractional coordinates of the defect in the defect supercell. If not set and
dummy_speciesis not found in thedefect_supercell,DefectEntry.sc_defect_frac_coordsis set toNone. Default is None.
- Returns:
dopedDefectEntryobject.- Return type:
- doped.generation.get_defect_name_from_defect(defect: Defect, element_list: list[str] | None = None, symprec: float = 0.01)[source]
Get the doped/SnB defect name from a
Defectobject.- Parameters:
defect (Defect) –
Defectobject.element_list (list) – Sorted list of elements in the host structure, so that
closest_site_info()returns deterministic results (in case two different elements located at the same distance from defect site). Default is None.symprec (float) – Symmetry tolerance for
spglib. Default is 0.01.
- Returns:
Defect name.
- Return type:
str
- doped.generation.get_defect_name_from_entry(defect_entry: DefectEntry, element_list: list | None = None, symprec: float | None = None, relaxed: bool = True)[source]
Get the doped/SnB defect name from a
DefectEntryobject.Note: If relaxed = True (default), then this tries to use the defect_entry.defect_supercell to determine the site symmetry. This will thus give the relaxed defect point symmetry if this is a DefectEntry created from parsed defect calculations. However, it should be noted that this is not guaranteed to work in all cases; namely for non-diagonal supercell expansions, or sometimes for non-scalar supercell expansion matrices (e.g. a 2x1x2 expansion)(particularly with high-symmetry materials) which can mess up the periodicity of the cell. doped tries to automatically check if this is the case, and will warn you if so.
This can also be checked by using this function on your doped generated defects:
from doped.generation import get_defect_name_from_entry for defect_name, defect_entry in defect_gen.items(): print(defect_name, get_defect_name_from_entry(defect_entry, relaxed=False), get_defect_name_from_entry(defect_entry), "\n")
And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry (and closest site info) – otherwise periodicity-breaking prevents this.
- Parameters:
defect_entry (DefectEntry) –
DefectEntryobject.element_list (list) – Sorted list of elements in the host structure, so that
closest_site_info()returns deterministic results (in case two different elements located at the same distance from defect site). Default is None.symprec (float) – Symmetry tolerance for
spglib. Default is 0.01 for unrelaxed structures, 0.2 for relaxed (to account for residual structural noise). You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc).relaxed (bool) – If
False, determines the site symmetry using the defect site in the unrelaxed bulk supercell, otherwise tries to determine the point symmetry of the relaxed defect in the defect supercell). Default isTrue.
- Returns:
Defect name.
- Return type:
str
- doped.generation.get_ideal_supercell_matrix(structure: Structure, min_image_distance: float = 10.0, min_atoms: int = 50, force_cubic: bool = False, force_diagonal: bool = False, ideal_threshold: float = 0.1, pbar: tqdm | None = None) ndarray | None[source]
Determine the ideal supercell matrix for a given structure, based on the minimum image distance, minimum number of atoms and
ideal_thresholdfor further expanding if a diagonal expansion of the primitive/conventional cell is possible.The ideal supercell is the smallest possible supercell which has a minimum image distance (i.e. minimum distance between periodic images of atoms/sites in a lattice) greater than
min_image_distance(default = 10 Å – which is a typical threshold value used in DFT defect supercell calculations) and a number of atoms greater thanmin_atoms(default = 50). Once these criteria have been reached,dopedwill then continue searching up to supercell sizes (numbers of atoms)1 + ideal_thresholdtimes larger (rounded up) to see if they return a diagonal expansion of the primitive/conventional cell (which can make later visualisation and analysis much easier) – if so, this larger supercell will be returned.This search for the ideal supercell transformation matrix is performed using the
find_ideal_supercellfunction fromdoped.utils.supercells(see its docstring for more details), which efficiently scans over possible supercell matrices and identifies that with the minimum image distance and most cubic-like supercell shape. The advantage of this over that inpymatgen-analysis-defectsis that it avoids thefind_optimal_cell_shapefunction fromASE(which currently does not work for rotated matrices, is inefficient, and optimises based on cubic- like shape rather than minimum image distance), giving greatly reduced supercell sizes for a given minimum image distance.If
force_cubicorforce_diagonalareTrue, then theCubicSupercellTransformationfrompymatgenis used to identify any simple near-cubic supercell transformations which satisfy the minimum image distance and atom number criteria.- Parameters:
structure (Structure) – Primitive unit cell structure to generate supercell for.
min_image_distance (float) – Minimum image distance in Å of the supercell (i.e. minimum distance between periodic images of atoms/sites in the lattice). (Default = 10.0)
min_atoms (int) – Minimum number of atoms allowed in the supercell. (Default = 50)
force_cubic (bool) – Enforce usage of
CubicSupercellTransformationfrompymatgenfor supercell generation. (Default = False)force_diagonal (bool) – If True, return a transformation with a diagonal transformation matrix. (Default = False)
ideal_threshold (float) – Threshold for increasing supercell size (beyond that which satisfies
min_image_distanceand min_atoms`) to achieve an ideal supercell matrix (i.e. a diagonal expansion of the primitive or conventional cell). Supercells up to1 + perfect_cell_thresholdtimes larger (rounded up) are trialled, and will instead be returned if they yield an ideal transformation matrix. (Default = 0.1; i.e. 10% larger than the minimum size)pbar (tqdm) –
tqdmprogress bar object to update (for internaldopedusage). Default isNone.
- Returns:
Ideal supercell matrix (
np.ndarray) orNoneif no suitable supercell could be found.
- doped.generation.get_interstitial_sites(host_structure: Structure, min_dist: float = 0.9, clustering_tol: float = 0.8, symm_pref_dist_factor: float = 0.85, stol: float = 0.32, tight_stol: float = 0.02, symprec: float = 0.01, vacuum_radius: float | None = None, include_unique_wyckoffs: bool = False) list[source]
Generate candidate interstitial sites using Voronoi analysis.
This function uses a similar approach to that in
VoronoiInterstitialGeneratorfrompymatgen-analysis-defects, but with modifications to make interstitial generation much faster, fix bugs with interstitial grouping (which could lead to undesired dropping of candidate sites) and achieve better control over site placement in order to favour sites which are higher-symmetry and furthest from the host lattice atoms (typically the most favourable interstitial sites).If the structure is detected to have a significant vacuum volume (determined using
vacuum_radius), thus corresponding to a slab/layer/line structure, then thepymatgenadsorbate finder will be used to generate further candidate interstitial sites.The logic for picking interstitial sites is as follows:
Generate all candidate sites using (efficient) Voronoi analysis
If vacuum volume present, include adsorbate sites as candidate sites
Remove any sites which are within
min_distof any host atomsCluster the remaining sites using a tolerance of
clustering_toland symmetry-preference ofsymm_pref_dist_factor(seedoped_cluster_frac_coords)Determine the multiplicities and symmetry-equivalent coordinates of the clustered sites using
dopedsymmetry functions.Group the clustered sites by symmetry using (looser) site matching as controlled by
stol.From each group, pick the site with the highest symmetry and furthest distance from the host atoms, if its
min_distis no more than no more thansymm_pref_dist_factor(0.85 by default) times the largest possiblemin_dist(distanceto the host atoms).
(Parameters mentioned here can be supplied via
interstitial_gen_kwargsas noted in the args section below.)The motivation for favouring high symmetry interstitial sites and then distance to host atoms is because higher symmetry interstitial sites are typically the more intuitive sites for placement, cleaner, easier for analysis etc., and interstitials are often lowest-energy when furthest from host atoms (i.e. in the largest interstitial voids – particularly for fully-ionised charge states), and so this approach tries to strike a balance between these two goals.
One caveat to this preference for high symmetry interstitial sites, is that they can also be slightly more prone to being stuck in local minima on the PES, and so as always it is highly recommended to use
ShakeNBreakor another structure-searching technique to account for symmetry-breaking when performing defect relaxations!You can see what Cartesian distance the chosen
stolcorresponds to using theget_stol_equiv_distfunction. Note that you will likely want to reducemin_distfor hydrogen interstitials! (This is done by default withDefectsGenerator, reducing it to 0.5 Å for Hydrogen.)- Parameters:
host_structure (Structure) – Host structure.
min_dist (float) – Minimum distance from host atoms for interstitial sites. Defaults to 0.9 Å.
clustering_tol (float) – Tolerance for clustering interstitial sites. Defaults to 0.8 Å.
symm_pref_dist_factor (float) – Minimum acceptable ratio of distance to host atoms for symmetry-favoured sites vs distance-to-host-favoured sites, for which to prefer symmetry-favoured sites. Defaults to 0.85.
stol (float) – Structure matcher tolerance for looser site matching. Defaults to 0.32.
tight_stol (float) – Structure matcher tolerance for tighter site matching. Defaults to 0.02.
symprec (float) – Symmetry precision for (symmetry-)equivalent site determination. Defaults to 0.01.
vacuum_radius (float) – Tolerance radius for determining if a significant vacuum volume is present in the structure (if any Voronoi node has a minimum distance to a host atom greater than this value), in which case adsorbate sites will also be included for interstitial generation. Defaults to
None, in which case it is set to 1.5 times the minimum bulk bond length.include_unique_wyckoffs (bool) – Include unique Wyckoff positions (those with no free variables) in the set of candidate interstitial sites. This can help ensure that expected canonical interstitial sites (e.g. distorted tetrahedral and octahedral in BCC crystals) are included, but will increase the number of candidate interstitials and is not expected to locate lower-energy sites compared to the default Voronoi approach in most cases. Defaults to
False.
- Returns:
List of interstitial sites as fractional coordinates
- Return type:
list
- doped.generation.get_neighbour_distances_and_symbols(site: PeriodicSite, structure: Structure, n: int = 1, element_list: list[str] | None = None, dist_tol_prefactor: float = 3.0, min_dist: float = 0.02)[source]
Get a list of sorted tuples of (distance, element) for the
nth closest sites to the inputsiteinstructure, where each consecutive neighbour in the list must be at least 0.02 Å further away than the previous neighbour.If there are multiple elements with the same distance (to within ~0.01 Å), then the preference for which element to return is controlled by
element_list. Ifelement_listis not provided, then it is set to match the order of appearance in the structure composition.For efficiency, this function dynamically increases the radius when searching for nearest neighbours as necessary, using the input
dist_tol_prefactorandnto estimate a reasonable starting range.- Parameters:
site (PeriodicSite) – Site to get neighbour info.
structure (Structure) – Structure containing the site and neighbours.
n (int) – Return the element symbol and distance tuples for the
nth closest neighbours. Default is 1, corresponding to only the nearest neighbour.element_list (list) – Sorted list of elements in the host structure to govern the preference of elemental symbols to return, when the distance to multiple neighbours with different elements is the same. Default is to use the order of appearance of elements in the
Structurecomposition.dist_tol_prefactor (float) – Initial distance tolerance prefactor to use when searching for neighbours, where the initial search radius is set to
dist_tol_prefactor * sqrt(n)Å. This value is dynamically updated as needed, and so should only be adjusted if providing odd systems with very small/large bond lengths for which efficiency improvements are possible. Default is 3.0.min_dist (float) – Minimum distance in Å of neighbours from
site. Intended as a distance tolerance to exclude the site itself from the neighbour list, for which the default value (0.02 Å) should be perfectly fine in most cases. Set to 0 to include the site itself in the output list (in which case it counts towardn).
- Returns:
Sorted list of tuples of (distance, element) for the
nth closest neighbours to the inputsite.- Return type:
list
- doped.generation.get_stol_equiv_dist(stol: float, structure: Structure) float[source]
Get the equivalent Cartesian distance of a given
stolvalue for a givenStructure.stolis a site tolerance parameter used inpymatgenStructureMatcherfunctions, defined as the fraction of the average free length per atom := ( V / Nsites ) ** (1/3).- Parameters:
stol (float) – Site tolerance parameter.
structure (Structure) – Structure to get equivalent distance for.
- Returns:
Equivalent Cartesian distance for the given
stolvalue.- Return type:
float
- doped.generation.get_vacancy_charge_states(vacancy: Vacancy, padding: int = 1) list[int][source]
Get the estimated charge states for a vacancy defect, which is from +/-
paddingto the fully-ionised vacancy charge state (a.k.a. the vacancy oxidation state).e.g. for vacancies in Sb2O5 (doi.org/10.1021/acs.chemmater.3c03257), the fully-ionised charge states for
V_SbandV_Oare -5 and +2 respectively (i.e. the negative of the elemental oxidation states in Sb2O5), so the estimated charge states would be from +1 to -5 forV_Sband from +2 to -1 forV_Ofor the defaultpaddingof 1.This probability function was found to give optimal performance in terms of efficiency and completeness when tested against other approaches (see the
dopedJOSS paper: https://doi.org/10.21105/joss.06433) but of course may not be perfect in all cases, so make sure to critically consider the estimated charge states for your system!- Parameters:
vacancy (Defect) – A
dopedVacancyobject.padding (int) – Padding for vacancy charge states, such that the vacancy charge states are set to
range(vacancy oxi state, padding)if vacancy oxidation state is negative, or torange(-padding, vacancy oxi state)if positive. Default is 1.
- Returns:
A list of estimated charge states for the defect.
- Return type:
list[int]
- doped.generation.guess_defect_charge_states(defect: Defect, probability_threshold: float = 0.0075, padding: int = 1, return_log: bool = False) list[int] | tuple[list[int], list[dict]][source]
Guess the possible stable charge states of a defect.
This function estimates the probabilities of the charge states of a defect, using the probability of the corresponding defect element oxidation states, the magnitudes of the charge states, and the maximum magnitude of the host oxidation states (i.e. how ‘charged’ the host is), and returns a list of charge states that have an estimated probability greater than the
probability_threshold.Disfavours large (absolute) charge states, low probability oxidation states and greater charge/oxidation state magnitudes than that of the host. Note that neutral charge states are always included.
For specific details on the probability functions employed, see the
charge_state_probability(for substitutions and interstitials) andget_vacancy_charge_states()(for vacancies) functions.These probability functions were found to give optimal performance in terms of efficiency and completeness when tested against other approaches (see the
dopedJOSS paper: https://doi.org/10.21105/joss.06433) but of course may not be perfect in all cases, so make sure to critically consider the estimated charge states for your system!- Parameters:
defect (Defect) –
dopedDefectobject.probability_threshold (float) – Probability threshold for including defect charge states (for substitutions and interstitials). Default is 0.0075.
padding (int) – Padding for vacancy charge states, such that the vacancy charge states are set to
range(vacancy oxi state, padding)if vacancy oxidation state is negative, or torange(-padding, vacancy oxi state)if positive. Default is 1.return_log (bool) – If true, returns a tuple of the defect charge states and a list of dictionaries of input & computed values used to determine charge state probability. Default is
False.
- Returns:
List of defect charge states (int) or a tuple of the defect charge states (list) and a list of dictionaries of input & computed values used to determine charge state probability.
- doped.generation.name_defect_entries(defect_entries: list[DefectEntry | Defect], element_list: list[str] | None = None)[source]
Create a dictionary of
{name: DefectEntry}from a list ofDefectEntryobjects, where the names are set according to the default doped algorithm; which is to use the pymatgen defect name (e.g. v_Cd, Cd_Te etc.) for vacancies/antisites/substitutions, unless there are multiple inequivalent sites for the defect, in which case the point group of the defect site is appended (e.g. v_Cd_Td, Cd_Te_Td etc.), and if this is still not unique, then element identity and distance to the nearest neighbour of the defect site is appended (e.g. v_Cd_Td_Te2.83, Cd_Te_Td_Cd2.83 etc.). Names do not yet have charge states included.For interstitials, the same naming scheme is used, but the point group is always appended to the pymatgen defect name.
If still not unique after the 3rd nearest neighbour info, then “a, b, c” etc. is appended to the name of different defects to distinguish.
- Parameters:
defect_entries (list) – List of
DefectEntryorDefectobjects to name.element_list (list) – Sorted list of elements in the host structure, so that
closest_site_info()returns deterministic results (in case two different elements located at the same distance from defect site). Default is None.
- Returns:
Dictionary of
{name: DefectEntry}objects.- Return type:
dict
- doped.generation.sort_defect_entries(defect_entries: dict | list, element_list: list | None = None)[source]
Sort defect entries for deterministic behaviour; for output and when reloading
DefectsGeneratorobjects, withDefectThermodynamicsentries (particularly for deterministic plotting behaviour), and withDefectsParserobjects.Sorts defect entries by defect type (vacancies, substitutions, interstitials), then by order of appearance of elements in the host composition, then by periodic group (main groups 1, 2, 13-18 first, then TMs), then by atomic number, then (for defect entries of the same type) sort by name and charge state (from positive to negative).
- Parameters:
defect_entries (dict or list) – Dictionary (in the format:
{defect_entry_name: defect_entry}) or list of defect entries to sort.element_list (list, optional) – List of elements present, used to determine preferential ordering. If
None, determined by_get_element_list(), which orders by appearance of elements in the host composition, then by periodic group (main groups 1, 2, 13-18 first, then TMs), then by atomic number.
- Returns:
Sorted dictionary or list of defect entries.