doped.generation module
Code to generate Defect objects and supercell structures for ab-initio calculations.
- class doped.generation.DefectsGenerator(structure: Structure, extrinsic: str | list | dict | None = None, interstitial_coords: list | None = None, generate_supercell: bool = True, charge_state_gen_kwargs: dict | None = None, supercell_gen_kwargs: dict[str, int | float | bool] | None = None, interstitial_gen_kwargs: dict | None = None, target_frac_coords: list | None = None, processes: int | None = None)[source]
Bases: MSONable
Class for generating doped DefectEntry objects.
Generates doped DefectEntry objects for defects in the input host structure. By default, generates all intrinsic defects, but extrinsic defects (impurities) can also be created using the extrinsic argument.
Interstitial sites are generated using Voronoi tessellation by default (found to be the most reliable), which can be controlled using the interstitial_gen_kwargs argument (passed as keyword arguments to the VoronoiInterstitialGenerator class). Alternatively, a list of interstitial sites (or single interstitial site) can be manually specified using the interstitial_coords argument.
By default, supercells are generated for each defect using the doped get_ideal_supercell_matrix() function (see docstring), with default settings of min_image_distance = 10 (minimum distance between periodic images of 10 Å), min_atoms = 50 (minimum 50 atoms in the supercell) and ideal_threshold = 0.1 (allow up to 10% larger supercell if it is a diagonal expansion of the primitive or conventional cell). This uses a custom algorithm in doped to efficiently search over possible supercell transformations and identify that with the minimum number of atoms (hence computational cost) that satisfies the minimum image distance, number of atoms and ideal_threshold constraints. These settings can be controlled by specifying keyword arguments with supercell_gen_kwargs, which are passed to get_ideal_supercell_matrix() (e.g. for a minimum image distance of 15 Å with at least 100 atoms, use: supercell_gen_kwargs = {'min_image_distance': 15, 'min_atoms': 100}). If the input structure already satisfies these constraints (for the same number of atoms as the doped-generated supercell), then it will be used. Alternatively, if generate_supercell = False, then no supercell is generated and the input structure is used as the defect & bulk supercell. (Note this may give a slightly different (but fully equivalent) set of coordinates).
The algorithm for determining defect entry names is to use the pymatgen defect name (e.g. v_Cd, Cd_Te etc.) for vacancies/antisites/substitutions, unless there are multiple inequivalent sites for the defect, in which case the point group of the defect site is appended (e.g. v_Cd_Td, Cd_Te_Td etc.), and if this is still not unique, then element identity and distance to the nearest neighbour of the defect site is appended (e.g. v_Cd_Td_Te2.83, Cd_Te_Td_Cd2.83 etc.). For interstitials, the same naming scheme is used, but the point group is always appended to the pymatgen defect name.
Possible charge states for the defects are estimated using the probability of the corresponding defect element oxidation state, the magnitude of the charge state, and the maximum magnitude of the host oxidation states (i.e. how ‘charged’ the host is), with large (absolute) charge states, low probability oxidation states and/or greater charge/oxidation state magnitudes than that of the host being disfavoured. This can be controlled using the probability_threshold (default = 0.0075) or padding (default = 1) keys in the charge_state_gen_kwargs parameter, which are passed to the _charge_state_probability() function. The input and computed values used to guess charge state probabilities are provided in the DefectEntry.charge_state_guessing_log attributes. See docs for examples of modifying the generated charge states.
- Parameters:
structure (Structure) – Structure of the host material (as a pymatgen Structure object). If this is not the primitive unit cell, it will be reduced to the primitive cell for defect generation, before supercell generation.
extrinsic (Union[str, list, dict]) – List or dict of elements (or string for single element) to be used for extrinsic defect generation (i.e. dopants/impurities). If a list is provided, all possible substitutional defects for each extrinsic element will be generated. If a dict is provided, the keys should be the host elements to be substituted, and the values the extrinsic element(s) to substitute in; as a string or list. In both cases, all possible extrinsic interstitials are generated.
interstitial_coords (list) – List of fractional coordinates (corresponding to the input structure), or a single set of fractional coordinates, to use as interstitial defect site(s). Default (when interstitial_coords not specified) is to automatically generate interstitial sites using Voronoi tessellation. The input interstitial_coords are converted to DefectsGenerator.prim_interstitial_coords, which are the corresponding fractional coordinates in DefectsGenerator.primitive_structure (which is used for defect generation), along with the multiplicity and equivalent coordinates, sorted according to the doped convention.
generate_supercell (bool) – Whether to generate a supercell for the output defect entries (using the custom algorithm in doped which efficiently searches over possible supercell transformations and identifies that with the minimum number of atoms (hence computational cost) that satisfies the minimum image distance, number of atoms and ideal_threshold constraints - which can be controlled with supercell_gen_kwargs). If False, then the input structure is used as the defect & bulk supercell. (Note this may give a slightly different (but fully equivalent) set of coordinates).
charge_state_gen_kwargs (dict) – Keyword arguments to be passed to the _charge_state_probability function (such as probability_threshold (default = 0.0075, used for substitutions and interstitials) and padding (default = 1, used for vacancies)) to control defect charge state generation.
supercell_gen_kwargs (dict) – Keyword arguments to be passed to the get_ideal_supercell_matrix function (such as min_image_distance (default = 10), min_atoms (default = 50), ideal_threshold (default = 0.1), force_cubic - which enforces a (near-)cubic supercell output (default = False), or force_diagonal (default = False)).
interstitial_gen_kwargs (dict, bool) – Keyword arguments to be passed to the VoronoiInterstitialGenerator class (such as clustering_tol, stol, min_dist etc), or to InterstitialGenerator if interstitial_coords is specified. If set to False, interstitial generation will be skipped entirely.
target_frac_coords (list) – Defects are placed at the closest equivalent site to these fractional coordinates in the generated supercells. Default is [0.5, 0.5, 0.5] if not set (i.e. the supercell centre, to aid visualisation).
processes (int) – Number of processes to use for multiprocessing. If not set, defaults to one less than the number of CPUs available.
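The three accepted forms of the extrinsic argument (string, list, dict) can be pictured as one mapping from host element to the extrinsic elements substituted onto it. The sketch below is illustrative only: normalise_extrinsic is a hypothetical helper, not part of doped, and it simply mirrors the behaviour described for each input form.

```python
def normalise_extrinsic(extrinsic, host_elements):
    """Map each host element to the extrinsic elements substituted onto it.

    Hypothetical helper: mirrors the documented str / list / dict forms of
    the ``extrinsic`` argument, not doped's internal code.
    """
    if extrinsic is None:
        # No extrinsic species requested: intrinsic defects only.
        return {host: [] for host in host_elements}
    if isinstance(extrinsic, str):
        extrinsic = [extrinsic]  # a single element given as a string
    if isinstance(extrinsic, dict):
        # Keys are host elements; values are a string or list of dopants.
        return {
            host: [subs] if isinstance(subs, str) else list(subs)
            for host, subs in extrinsic.items()
        }
    # A list: every extrinsic element is substituted on every host element.
    return {host: list(extrinsic) for host in host_elements}


print(normalise_extrinsic("Se", ["Cd", "Te"]))
print(normalise_extrinsic({"Te": ["Se", "S"]}, ["Cd", "Te"]))
```

In both the list and dict cases the docstring states that all possible extrinsic interstitials are generated as well; the mapping above only covers the substitutions.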
- defect_entries
Dictionary of {defect_species: DefectEntry} for all defect entries (with charge state and supercell properties) generated.
- Type:
dict
- defects
Dictionary of {defect_type: [Defect, …]} for all defect objects generated.
- Type:
dict
- primitive_structure
Primitive cell structure of the host used to generate defects.
- Type:
Structure
- supercell_matrix
Matrix to generate defect/bulk supercells from the primitive cell structure.
- Type:
Matrix
- bulk_supercell
Supercell structure of the host (equal to primitive_structure * supercell_matrix).
- Type:
Structure
- conventional_structure
Conventional cell structure of the host according to the Bilbao Crystallographic Server (BCS) definition, used to determine defect site Wyckoff labels and multiplicities.
- Type:
Structure
- DefectsGenerator input parameters are also set as attributes.
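A minimal usage sketch, assuming doped and pymatgen are installed and that structure is a pymatgen Structure you have already loaded. The keyword names come from the signature above; the values are only examples, and build_generator is a hypothetical wrapper that keeps the doped import lazy.

```python
# Keyword arguments named in the class docstring; the values are examples.
supercell_gen_kwargs = {"min_image_distance": 15, "min_atoms": 100}
charge_state_gen_kwargs = {"probability_threshold": 0.0075, "padding": 1}


def build_generator(structure):
    """Create a DefectsGenerator for a pymatgen Structure (requires doped)."""
    # Imported here so that this sketch can be read and loaded without doped.
    from doped.generation import DefectsGenerator

    return DefectsGenerator(
        structure,
        extrinsic=None,  # intrinsic defects only
        generate_supercell=True,
        supercell_gen_kwargs=supercell_gen_kwargs,
        charge_state_gen_kwargs=charge_state_gen_kwargs,
    )
```

The returned object would then expose the attributes listed above, such as defect_entries and bulk_supercell.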
- add_charge_states(defect_entry_name: str, charge_states: list)[source]
Add additional DefectEntry objects with the specified charge states to self.defect_entries.
- Parameters:
defect_entry_name (str) – Name of defect entry to add charge states to. Doesn’t need to include the charge state.
charge_states (list) – List of charge states to add to defect entry (e.g. [-2, -3]).
- classmethod from_dict(d)[source]
Reconstructs DefectsGenerator object from a dict representation created using DefectsGenerator.as_dict().
- Parameters:
d (dict) – dict representation of DefectsGenerator.
- Returns:
DefectsGenerator object
- classmethod from_json(filename: str)[source]
Load a DefectsGenerator object from a json file.
- Parameters:
filename (str) – Filename of the json file to load the DefectsGenerator object from.
- Returns:
DefectsGenerator object
- remove_charge_states(defect_entry_name: str, charge_states: list)[source]
Remove DefectEntry objects with the specified charge states from self.defect_entries.
- Parameters:
defect_entry_name (str) – Name of defect entry to remove charge states from. Doesn’t need to include the charge state.
charge_states (list) – List of charge states to remove from defect entry (e.g. [-2, -3]).
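The bookkeeping behind add_charge_states and remove_charge_states can be sketched with a plain dictionary. This is a toy model, not doped's implementation: the entry-name format used here (defect name, underscore, signed charge, e.g. v_Cd_-2 or v_Cd_+1) is an assumption, and the dictionary values are placeholders standing in for DefectEntry objects.

```python
import re


def base_name(defect_entry_name):
    """Strip a trailing charge suffix such as '_+1', '_-2' or '_0', if present."""
    return re.sub(r"_[+-]?\d+$", "", defect_entry_name)


def species_name(defect_entry_name, charge):
    """Assumed naming format: '<defect name>_<signed charge>'."""
    sign = "+" if charge > 0 else ""
    return f"{base_name(defect_entry_name)}_{sign}{charge}"


def add_charge_states(entries, defect_entry_name, charge_states):
    for charge in charge_states:
        # Placeholder value; doped stores a DefectEntry here.
        entries.setdefault(species_name(defect_entry_name, charge), {"charge_state": charge})


def remove_charge_states(entries, defect_entry_name, charge_states):
    for charge in charge_states:
        entries.pop(species_name(defect_entry_name, charge), None)


entries = {"v_Cd_0": {"charge_state": 0}, "v_Cd_-1": {"charge_state": -1}}
add_charge_states(entries, "v_Cd", [-2, -3])      # name without a charge state
remove_charge_states(entries, "v_Cd_0", [-1])     # name with a charge state
print(sorted(entries))
```

Both calls accept the name with or without a charge suffix, matching the note that the name doesn't need to include the charge state.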
- to_json(filename: str | None = None)[source]
Save the DefectsGenerator object as a json file, which can be reloaded with the DefectsGenerator.from_json() class method.
- Parameters:
filename (str) – Filename to save json file as. If None, the filename will be set as “{Chemical Formula}_defects_generator.json” where {Chemical Formula} is the chemical formula of the host material.
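The default filename rule and a save/reload round trip can be sketched as follows. default_json_filename and round_trip are hypothetical helpers; how doped formats the chemical formula is not specified here, so the formula string is taken as an input, and the round trip assumes doped is installed.

```python
def default_json_filename(chemical_formula, filename=None):
    """Return the filename to_json would use, per the rule documented above."""
    if filename is not None:
        return filename
    return f"{chemical_formula}_defects_generator.json"


def round_trip(defect_gen, filename="defects_generator.json"):
    """Save a DefectsGenerator and load it back (requires doped)."""
    # Lazy import so the sketch loads without doped installed.
    from doped.generation import DefectsGenerator

    defect_gen.to_json(filename)
    return DefectsGenerator.from_json(filename)


print(default_json_filename("CdTe"))
```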
- doped.generation.closest_site_info(defect_entry_or_defect, n=1, element_list=None)[source]
Return the element and distance (rounded to 2 decimal places) of the closest site to the defect in the input DefectEntry or Defect object.
If DefectEntry, uses defect_entry.defect_supercell_site if set, otherwise defect_entry.sc_defect_frac_coords, with defect_entry.sc_entry.structure. If Defect, uses defect.get_supercell_structure() with a 2x2x2 supercell to ensure none of the detected sites are periodic images of the defect site.
Requires distances > 0.01 (i.e. so not the site itself), and if there are multiple elements with the same distance, sort by order of appearance of elements in the composition, then alphabetically and return the first one.
If n is set, then it returns the nth closest site, where the nth site must be at least 0.02 Å further away than the n-1th site.
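The selection rules above can be sketched on a plain list of (element, distance) pairs. This is an illustration under stated assumptions, not doped's code: closest_site_label is a hypothetical name, the neighbour list is supplied by hand rather than read from a structure, and the 0.02 Å separation is measured from the first site of the previous shell.

```python
def closest_site_label(neighbours, n=1, element_list=None):
    """Return e.g. 'Te2.83' for the nth closest neighbour shell.

    neighbours: iterable of (element_symbol, distance_in_angstrom) pairs.
    """
    neighbours = list(neighbours)
    if element_list is None:
        element_list = sorted({element for element, _ in neighbours})

    def sort_key(pair):
        element, distance = pair
        rank = element_list.index(element) if element in element_list else len(element_list)
        # Distance first, then order in element_list, then alphabetical.
        return (round(distance, 2), rank, element)

    # Distances must exceed 0.01 Å, which excludes the defect site itself.
    candidates = sorted((p for p in neighbours if p[1] > 0.01), key=sort_key)

    shells = []  # preferred site of each distinct distance shell
    for element, distance in candidates:
        # A new shell must lie at least 0.02 Å beyond the previous one.
        if not shells or distance - shells[-1][1] >= 0.02:
            shells.append((element, distance))

    if n > len(shells):
        raise ValueError(f"Only {len(shells)} distinct neighbour shells found.")
    element, distance = shells[n - 1]
    return f"{element}{distance:.2f}"


sites = [("Cd", 0.0), ("Te", 2.83), ("Cd", 2.83), ("Cd", 2.84), ("Cd", 4.62)]
print(closest_site_label(sites, n=1, element_list=["Cd", "Te"]))
print(closest_site_label(sites, n=2, element_list=["Cd", "Te"]))
```

Passing a fixed element_list is what makes the tie between Cd and Te at 2.83 Å resolve the same way every time.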
- doped.generation.get_defect_entry_from_defect(defect: Defect, defect_supercell: Structure, charge_state: int, dummy_species: DummySpecies = DummySpecies X0+)[source]
Generate doped DefectEntry object from a doped Defect object.
This is used to describe a Defect with a specified simulation cell.
- Parameters:
defect (Defect) – doped/pymatgen Defect object.
defect_supercell (Structure) – Defect supercell structure.
charge_state (int) – Charge state of the defect.
dummy_species (DummySpecies) – Dummy species used to keep track of defect
- Returns:
doped DefectEntry object.
- Return type:
DefectEntry
- doped.generation.get_defect_name_from_defect(defect, element_list=None, symm_ops=None, symprec=0.01)[source]
Get the doped/SnB defect name from Defect object.
- Parameters:
defect (Defect) – Defect object.
element_list (list) – Sorted list of elements in the host structure, so that closest_site_info returns deterministic results (in case two different elements located at the same distance from defect site). Default is None.
symm_ops (list) – List of symmetry operations of defect.structure, to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib. Default is 0.01.
- Returns:
Defect name.
- Return type:
str
- doped.generation.get_defect_name_from_entry(defect_entry: DefectEntry, element_list: list | None = None, symm_ops: list | None = None, symprec: float | None = None, relaxed: bool = True)[source]
Get the doped/SnB defect name from a DefectEntry object.
Note: If relaxed = True (default), then this tries to use the defect_entry.defect_supercell to determine the site symmetry. This will thus give the relaxed defect point symmetry if this is a DefectEntry created from parsed defect calculations. However, this is not guaranteed to work in all cases; namely for non-diagonal supercell expansions, or sometimes for non-scalar supercell expansion matrices (e.g. a 2x1x2 expansion), particularly with high-symmetry materials, which can break the periodicity of the cell. doped tries to automatically check if this is the case, and will warn you if so.
This can also be checked by using this function on your doped generated defects:
    from doped.generation import get_defect_name_from_entry

    # defect_gen is a DefectsGenerator object
    for defect_name, defect_entry in defect_gen.items():
        print(
            defect_name,
            get_defect_name_from_entry(defect_entry, relaxed=False),
            get_defect_name_from_entry(defect_entry),
            "\n",
        )
And if the point symmetries match in each case, then using this function on your parsed relaxed DefectEntry objects should correctly determine the final relaxed defect symmetry (and closest site info) - otherwise periodicity-breaking prevents this.
- Parameters:
defect_entry (DefectEntry) – DefectEntry object.
element_list (list) – Sorted list of elements in the host structure, so that closest_site_info returns deterministic results (in case two different elements are located at the same distance from the defect site). Default is None.
symm_ops (list) – List of symmetry operations of either the defect_entry.bulk_supercell structure (if relaxed=False) or defect_entry.defect_supercell (if relaxed=True), to avoid re-calculating. Default is None (recalculates).
symprec (float) – Symmetry tolerance for spglib. Default is 0.01 for unrelaxed structures, 0.2 for relaxed (to account for residual structural noise). You may want to adjust for your system (e.g. if there are very slight octahedral distortions etc).
relaxed (bool) – If False, determines the site symmetry using the defect site in the unrelaxed bulk supercell; otherwise tries to determine the point symmetry of the relaxed defect in the defect supercell. Default is True.
- Returns:
Defect name.
- Return type:
str
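Two small pieces of the behaviour described above can be sketched without doped: the documented symprec defaults, and the consistency check between names obtained with relaxed=False and relaxed=True for generated defects. Both helper names are hypothetical, and comparing whole names (rather than only the point-group part) is a simplifying assumption.

```python
def default_symprec(symprec=None, relaxed=True):
    """Documented defaults: 0.2 for relaxed structures, 0.01 for unrelaxed."""
    if symprec is not None:
        return symprec
    return 0.2 if relaxed else 0.01


def name_mismatches(names):
    """Return the entries whose unrelaxed and relaxed names differ.

    names: {defect_name: (name_with_relaxed_False, name_with_relaxed_True)},
    collected from generated (unrelaxed) defects.
    """
    return {key: pair for key, pair in names.items() if pair[0] != pair[1]}


checked = {
    "v_Cd_0": ("v_Cd_Td_Te2.83", "v_Cd_Td_Te2.83"),
    "Cd_i_C3v_0": ("Cd_i_C3v", "Cd_i_Cs"),  # made-up mismatch for illustration
}
print(name_mismatches(checked))
```

An empty result would suggest the supercell preserves the periodicity well enough for relaxed symmetries to be determined; any entry returned is a case to treat with caution.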
- doped.generation.get_ideal_supercell_matrix(structure: Structure, min_image_distance: float = 10.0, min_atoms: int = 50, force_cubic: bool = False, force_diagonal: bool = False, ideal_threshold: float = 0.1, pbar: tqdm | None = None) → ndarray | None[source]
Determine the ideal supercell matrix for a given structure, based on the minimum image distance, minimum number of atoms and ideal_threshold for further expanding if a diagonal expansion of the primitive/conventional cell is possible.
The ideal supercell is the smallest possible supercell which has a minimum image distance (i.e. minimum distance between periodic images of atoms/sites in a lattice) greater than min_image_distance (default = 10 Å - which is a typical threshold value used in DFT defect supercell calculations) and a number of atoms greater than min_atoms (default = 50). Once these criteria have been reached, doped will then continue searching up to supercell sizes (numbers of atoms) 1 + ideal_threshold times larger (rounded up) to see if they return a diagonal expansion of the primitive/conventional cell (which can make later visualisation and analysis much easier) - if so, this larger supercell will be returned.
This search for the ideal supercell transformation matrix is performed using the find_ideal_supercell function from doped.utils.supercells (see its docstring for more details), which efficiently scans over possible supercell matrices and identifies that with the minimum image distance and most cubic-like supercell shape. The advantage of this over that in pymatgen-analysis-defects is that it avoids the find_optimal_cell_shape function from ASE (which currently does not work for rotated matrices, is inefficient, and optimises based on cubic-like shape rather than minimum image distance), giving greatly reduced supercell sizes for a given minimum image distance.
If force_cubic or force_diagonal are True, then the CubicSupercellTransformation from pymatgen is used to identify any simple near-cubic supercell transformations which satisfy the minimum image distance and atom number criteria.
- Parameters:
structure (Structure) – Primitive unit cell structure to generate supercell for.
min_image_distance (float) – Minimum image distance in Å of the supercell (i.e. minimum distance between periodic images of atoms/sites in the lattice). (Default = 10.0)
min_atoms (int) – Minimum number of atoms allowed in the supercell. (Default = 50)
force_cubic (bool) – Enforce usage of CubicSupercellTransformation from pymatgen for supercell generation. (Default = False)
force_diagonal (bool) – If True, return a transformation with a diagonal transformation matrix. (Default = False)
ideal_threshold (float) – Threshold for increasing supercell size (beyond that which satisfies min_image_distance and min_atoms) to achieve an ideal supercell matrix (i.e. a diagonal expansion of the primitive or conventional cell). Supercells up to 1 + ideal_threshold times larger (rounded up) are trialled, and will instead be returned if they yield an ideal transformation matrix. (Default = 0.1; i.e. 10% larger than the minimum size)
pbar (tqdm) – tqdm progress bar object to update (for internal doped usage). Default is None.
- Returns:
Ideal supercell matrix (np.ndarray) or None if no suitable supercell could be found.
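The two quantities that drive the search, the minimum image distance of a cell and the size cap implied by ideal_threshold, can be sketched in plain Python. This is not the find_ideal_supercell algorithm: the brute-force scan over small integer combinations of lattice vectors is a stand-in that suits reasonably compact cells, all three function names are hypothetical, and treating both thresholds as inclusive is an assumption (the docstring wording mixes "minimum" and "greater than").

```python
import itertools
import math
from fractions import Fraction


def min_image_distance(lattice, search=2):
    """Shortest non-zero lattice translation (Å), found by brute force.

    lattice: three row vectors in Å. Only integer combinations within
    [-search, search] are checked, which is an assumption of this sketch.
    """
    shortest = math.inf
    for i, j, k in itertools.product(range(-search, search + 1), repeat=3):
        if (i, j, k) == (0, 0, 0):
            continue
        vec = [i * lattice[0][x] + j * lattice[1][x] + k * lattice[2][x] for x in range(3)]
        shortest = min(shortest, math.sqrt(sum(c * c for c in vec)))
    return shortest


def max_trial_size(size, ideal_threshold=0.1):
    """Largest size trialled: (1 + ideal_threshold) * size, rounded up."""
    # Fraction avoids floating-point overshoot (e.g. 50 * 1.1 > 55 in floats).
    return math.ceil(size * (1 + Fraction(str(ideal_threshold))))


def satisfies_constraints(lattice, n_atoms, min_image_dist=10.0, min_atoms=50):
    """Check a candidate supercell against both thresholds (inclusive here)."""
    return min_image_distance(lattice) >= min_image_dist and n_atoms >= min_atoms


cubic = [[13.1, 0.0, 0.0], [0.0, 13.1, 0.0], [0.0, 0.0, 13.1]]
print(min_image_distance(cubic), satisfies_constraints(cubic, 64), max_trial_size(64))
```

For a 64-atom candidate with the default ideal_threshold of 0.1, sizes up to 71 atoms would be trialled in the hope of finding a diagonal expansion.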
- doped.generation.get_oxi_probabilities(element_symbol: str) → dict[source]
Get a dictionary of oxidation states and their probabilities for an element.
Tries to get the probabilities from the pymatgen tabulated ICSD oxidation state probabilities, and if not available, uses the common oxidation states of the element.
- Parameters:
element_symbol (str) – Element symbol.
- Returns:
Dictionary of oxidation states (ints) and their probabilities (floats).
- Return type:
dict
- doped.generation.guess_defect_charge_states(defect: Defect, probability_threshold: float = 0.0075, padding: int = 1, return_log: bool = False) → list[int] | tuple[list[int], list[dict]][source]
Guess the possible charge states of a defect.
- Parameters:
defect (Defect) – doped Defect object.
probability_threshold (float) – Probability threshold for including defect charge states (for substitutions and interstitials). Default is 0.0075.
padding (int) – Padding for vacancy charge states, such that the vacancy charge states are set to range(vacancy oxi state, padding), if vacancy oxidation state is negative, or to range(-padding, vacancy oxi state), if positive. Default is 1.
return_log (bool) – If true, returns a tuple of the defect charge states and a list of dictionaries of input & computed values used to determine charge state probability. Default is False.
- Returns:
List of defect charge states (int) or a tuple of the defect charge states (list) and a list of dictionaries of input & computed values used to determine charge state probability.
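The padding rule for vacancies can be written out as a short function. This is a sketch, not doped's code: the ranges are read here as inclusive of both endpoints, which is an assumption, and the handling of a zero oxidation state is not covered by the docstring and is likewise assumed. vacancy_charge_states is a hypothetical name.

```python
def vacancy_charge_states(vacancy_oxi_state, padding=1):
    """Charge states for a vacancy, following the documented padding rule."""
    if vacancy_oxi_state < 0:
        # From the vacancy oxidation state up to +padding (inclusive).
        return list(range(vacancy_oxi_state, padding + 1))
    if vacancy_oxi_state > 0:
        # From -padding up to the vacancy oxidation state (inclusive).
        return list(range(-padding, vacancy_oxi_state + 1))
    # Zero oxidation state: symmetric padding (assumed, not documented).
    return list(range(-padding, padding + 1))


print(vacancy_charge_states(-2))
print(vacancy_charge_states(2))
```

Increasing padding widens the range on the side opposite the vacancy oxidation state, so vacancy_charge_states(-2, padding=2) would also include +2.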
- doped.generation.name_defect_entries(defect_entries, element_list=None, symm_ops=None)[source]
Create a dictionary of {Name: DefectEntry} from a list of DefectEntry objects, where the names are set according to the default doped algorithm; which is to use the pymatgen defect name (e.g. v_Cd, Cd_Te etc.) for vacancies/antisites/substitutions, unless there are multiple inequivalent sites for the defect, in which case the point group of the defect site is appended (e.g. v_Cd_Td, Cd_Te_Td etc.), and if this is still not unique, then element identity and distance to the nearest neighbour of the defect site is appended (e.g. v_Cd_Td_Te2.83, Cd_Te_Td_Cd2.83 etc.). Names do not yet have charge states included.
For interstitials, the same naming scheme is used, but the point group is always appended to the pymatgen defect name.
If still not unique after the 3rd nearest neighbour info, then “a, b, c” etc is appended to the name of different defects to distinguish.
- Parameters:
defect_entries (list) – List of DefectEntry objects to name.
element_list (list) – Sorted list of elements in the host structure, so that closest_site_info returns deterministic results (in case two different elements located at the same distance from defect site). Default is None.
symm_ops (list) – List of symmetry operations of defect.structure (i.e. the primitive structure), to avoid re-calculating. Default is None (recalculates).
- Returns:
Dictionary of {Name: DefectEntry} objects.
- Return type:
dict
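The staged naming scheme can be sketched on simple dictionaries. This is illustrative only, not doped's implementation: name_defects and the field names (base, point_group, closest_sites, interstitial) are hypothetical, the inputs are supplied by hand rather than computed from DefectEntry objects, and the way second and third neighbour labels are joined onto the name is an assumption.

```python
from collections import Counter


def _duplicated(names):
    counts = Counter(names)
    return [counts[name] > 1 for name in names]


def name_defects(defects):
    """Assign names stage by stage until they are unique.

    defects: list of dicts with 'base' (pymatgen-style name), 'point_group',
    'closest_sites' (labels for the 1st to 3rd neighbours) and 'interstitial'.
    """
    # Interstitials always carry the point group; others start from the base name.
    names = [
        f"{d['base']}_{d['point_group']}" if d["interstitial"] else d["base"]
        for d in defects
    ]

    # Stage 1: append the point group where base names clash.
    dup = _duplicated(names)
    for i, d in enumerate(defects):
        if dup[i] and not d["interstitial"]:
            names[i] += f"_{d['point_group']}"

    # Stage 2: append closest-site info, up to the 3rd neighbour.
    for k in range(3):
        dup = _duplicated(names)
        if not any(dup):
            break
        for i, d in enumerate(defects):
            if dup[i] and k < len(d["closest_sites"]):
                names[i] += ("_" if k == 0 else "") + d["closest_sites"][k]

    # Stage 3: fall back to letter suffixes a, b, c, ...
    dup = _duplicated(names)
    seen = Counter()
    for i, name in enumerate(names):
        if dup[i]:
            names[i] = name + "abcdefghijklmnopqrstuvwxyz"[seen[name]]
            seen[name] += 1
    return names


example = [
    {"base": "v_Cd", "point_group": "Td", "closest_sites": ["Te2.83"], "interstitial": False},
    {"base": "Cd_i", "point_group": "Td", "closest_sites": ["Te2.83"], "interstitial": True},
    {"base": "Cd_i", "point_group": "Td", "closest_sites": ["Cd2.83"], "interstitial": True},
]
print(name_defects(example))
```

As stated above, the resulting names do not yet include charge states.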