hikari.dataframes

This module contains all dataframes utilised in hikari. A dataframe is a low-level object, which stores and manipulates certain crystallographic information. At the moment, the following dataframes are implemented:

  • hkl - for single crystal reflection data

  • cif - for crystallographic open format data (partially)

  • res - for shelx crystal structure data (partially)

Please note that the hkl frame, HklFrame, is the most developed. The other frames are at an early stage of development.

Classes

CifFrame

A master object which manages cif files. It utilises other Cif* classes to manage multiple CifBlocks with crystallographic information.

CifBlock

CifBlock object handles all data inside an individual block of Cif file.

BaseFrame

This class stores and manipulates basic information present in the majority of crystallographic information files.

UBaseFrame

A sub-class of hikari.dataframes.BaseFrame capable of the same operations as its parent, but using uncertainties.ufloat values instead of floats.

HklFrame

A master object which manages single-crystal diffraction files.

ResFrame

This class stores and manipulates basic information present in the majority of crystallographic information files.

LstFrame

Package Contents

class hikari.dataframes.CifFrame(dict=None, /, **kwargs)[source]

Bases: collections.UserDict

A master object which manages cif files. It utilises other Cif* classes to manage multiple CifBlocks with crystallographic information. As a subclass of UserDict, in Python 3.7+ it is ordered by design. Individual Cif blocks and the items within them can be accessed or assigned using a single or nested dict-like syntax.

Similarly to other Frames, CifFrame is designed to work in place: it should first be created and only then filled or exported using methods such as read() or write(), rather than through chained calls.

Unlike dict, CifBlock always initiates empty and does not accept any parameters at creation.
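
The in-place convention and the nested dict-like access described above can be sketched with a stand-in class. SketchFrame below is hypothetical: it only mimics the documented behaviour (a UserDict whose read() returns None), takes a mapping instead of a file path, and is not the hikari implementation.

```python
from collections import UserDict


class SketchFrame(UserDict):
    """Hypothetical stand-in for a frame: a UserDict that is filled in place."""

    def read(self, blocks):
        # Assumption: a mapping replaces the real file path argument.
        # Like the documented read(), this fills self and returns None.
        for name, items in blocks.items():
            self[name] = dict(items)


frame = SketchFrame()                                   # 1. create an empty frame
frame.read({"hikari": {"_cell_length_a": "5.43"}})      # 2. fill it in place

# Nested dict-like access: first the block name, then the item name.
cell_a = frame["hikari"]["_cell_length_a"]
frame["hikari"]["_cell_length_b"] = "5.43"

# Chaining does not work, because read() returns None rather than the frame.
chained = SketchFrame().read({"hikari": {}})
```

In this sketch, chained ends up bound to None, which is why the frame has to be created first and read into afterwards.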

read(path)[source]

Read the contents of .cif file specified by the path parameter. Store each found block as a {block_name: CifBlock} pair.

Parameters:

path (str) – Absolute or relative path to the .cif file.

Return type:

None

write(path)[source]

Write the contents of CifFrame to the .cif file specified by the path parameter.

Parameters:

path (str) – Absolute or relative path to the .cif file.

Return type:

None

class hikari.dataframes.CifBlock(dict=None, /, **kwargs)[source]

Bases: collections.UserDict

A CifBlock object handles all data inside an individual block of a Cif file. As a subclass of UserDict, in Python 3.7+ it is ordered by design. Individual Cif items can be accessed or assigned using a dict-like syntax.

get_as_type(key, typ, default=None)[source]

Get value of self[key] converted to typ. If value is a list, convert its contents element-wise.

Parameters:
  • key (str) – key associated with accessed element

  • typ (Callable[[Any], T]) – type/function applied to a value or its every element

  • default (Any) – if given, return it on KeyError

Returns:

value of self[key] or default converted to typ

Return type:

T
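
The element-wise conversion described above can be illustrated with a small stand-in. SketchBlock is a hypothetical class written only for this example; in particular, re-raising KeyError when no default is given is an assumption, not documented behaviour.

```python
from collections import UserDict


class SketchBlock(UserDict):
    """Hypothetical stand-in that mimics the documented get_as_type()."""

    def get_as_type(self, key, typ, default=None):
        try:
            value = self[key]
        except KeyError:
            # Assumption: without a default, the KeyError propagates.
            if default is None:
                raise
            value = default
        # Lists are converted element by element, scalars directly.
        if isinstance(value, list):
            return [typ(v) for v in value]
        return typ(value)


block = SketchBlock({"_cell_length_a": "5.43",
                     "_refln_index_h": ["1", "0", "-2"]})
a = block.get_as_type("_cell_length_a", float)
h = block.get_as_type("_refln_index_h", int)
missing = block.get_as_type("_cell_length_b", float, default="1.0")
```

Here the default is converted as well, following the "value of self[key] or default converted to typ" wording of the return description.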

read(path, block)[source]

Read the contents of the .cif file specified by the path parameter, but access only the data block named block and store its contents in self.

Parameters:
  • path (str) – Absolute or relative path to the .cif file.

  • block (str) – Name of the cif data block to be accessed

Return type:

None

write(path)[source]

Write the contents of CifBlock to the .cif file specified by the path parameter, using ‘hikari’ as block name.

Parameters:

path (str) – Absolute or relative path to the .cif file.

Return type:

None

class hikari.dataframes.BaseFrame[source]

This class stores and manipulates basic information present in the majority of crystallographic information files, such as unit cell parameters stored in scalars and vectors.

BaseFrame utilises the following notation for stored attributes:

  • The name begins from a unit cell property we are interested in:

      • “a”, “b”, “c” describe unit cell lengths/vectors a, b, c,

      • “al”, “be”, “ga” describe unit cell angles alpha, beta, gamma,

      • “v” describes unit cell volume,

      • “x”, “y”, “z” describe directions - normalised unit cell vectors,

      • “A”, “G” describe stacked vector and metric matrix, respectively.

  • The unit cell parameter symbol is then followed by an underscore “_”.

  • The name ends with a single letter denoting type of space and variable:

      • “d” (from Direct) denotes direct space scalars/matrices,

      • “r” (from Reciprocal) denotes reciprocal space scalars/matrices,

      • “v” (from Vector) denotes direct space vectors,

      • “w” (similar to “v”) denotes reciprocal space vectors.

The values can be accessed by referencing a given attribute of the object. For example, BaseFrame.a_d stores the lattice constant a in direct space as a floating point number, whereas BaseFrame.a_v is the corresponding direct space vector. All available attributes are summarised in the table below:

Available constants   in direct space     in reciprocal space   Unit (^-1 in reciprocal)
-------------------   -----------------   -------------------   ------------------------
Scalars
a, b, c               a_d, b_d, c_d       a_r, b_r, c_r         Angstrom
al, be, ga            al_d, be_d, ga_d    al_r, be_r, ga_r      Radian
v                     v_d                 v_r                   Angstrom^3

Vectors
a, b, c               a_v, b_v, c_v       a_w, b_w, c_w         Angstrom
x, y, z               x_v, y_v, z_v       x_w, y_w, z_w         Angstrom

Matrices
A                     A_d                 A_r                   Angstrom^2
G                     G_d                 G_r                   Angstrom^2
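
As an illustration of how the scalar and matrix quantities above relate to each other, the following minimal sketch builds a direct space metric matrix from six cell parameters and derives the volumes from it. The function names are hypothetical, angles are assumed to be in radians, and the reciprocal volume follows the crystallographic convention without a factor of 2 pi; none of this is taken from the hikari source.

```python
import math


def metric_matrix(a, b, c, al, be, ga):
    """Direct space metric matrix with elements a_i . a_j (angles in radians)."""
    ab = a * b * math.cos(ga)
    ac = a * c * math.cos(be)
    bc = b * c * math.cos(al)
    return [[a * a, ab, ac],
            [ab, b * b, bc],
            [ac, bc, c * c]]


def determinant3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Example: an orthorhombic cell of 3 x 4 x 5 Angstrom, all angles pi/2.
g_d = metric_matrix(3.0, 4.0, 5.0, math.pi / 2, math.pi / 2, math.pi / 2)
v_d = math.sqrt(determinant3(g_d))   # direct volume equals sqrt(det G)
v_r = 1.0 / v_d                      # reciprocal volume in Angstrom^-3
```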

IMPORTED_FROM_CIF
orientation

3x3 matrix describing orientation of crystal during experiment.

edit_cell(**parameters)[source]

Edit direct space unit cell using a dictionary with the following keys:

  • “a” - for unit cell parameter a given in Angstrom,

  • “b” - for unit cell parameter b given in Angstrom,

  • “c” - for unit cell parameter c given in Angstrom,

  • “al” - for unit cell parameter alpha given in degrees or radians,

  • “be” - for unit cell parameter beta given in degrees or radians,

  • “ga” - for unit cell parameter gamma given in degrees or radians.

This method is equivalent to manually setting all six unit cell parameters in direct space, a_d, b_d, c_d, al_d, be_d, ga_d, and then running a private method _refresh_cell() to update other values.

Please note that while “a”, “b” and “c” are always given in Angstrom, the angles may be given either in degrees or in radians. For details, see the function hikari.utility.math_tools.angle2rad().

Not all of the keys listed above need to be present at each method call. If a key is not given, the previously provided and stored value is used. If no value has ever been given, the default length of 1.0 for a, b, c and the default angle of pi/2 for al, be, ga are used instead.

Parameters:

parameters (float) – Values of unit cell parameters to be changed
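
The fallback rules described above (omitted keys keep their stored values, which start from the documented defaults) can be sketched as follows. CellSketch is a hypothetical stand-in; it accepts angles in radians only and does not reproduce the degree/radian detection of angle2rad().

```python
import math


class CellSketch:
    """Hypothetical stand-in mimicking the documented edit_cell() fallbacks."""

    def __init__(self):
        # Documented defaults: lengths of 1.0 and angles of pi/2.
        self.cell = {"a": 1.0, "b": 1.0, "c": 1.0,
                     "al": math.pi / 2, "be": math.pi / 2, "ga": math.pi / 2}

    def edit_cell(self, **parameters):
        # Keys that are not given keep their previously stored values.
        self.cell.update(parameters)


frame = CellSketch()
frame.edit_cell(a=5.0, b=6.0)           # c and the angles keep their defaults
frame.edit_cell(ga=2 * math.pi / 3)     # a and b keep the values set above
```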

fill_from_cif_block(block, fragile=False)[source]

Import all data specified in IMPORTED_FROM_CIF, such as unit cell parameters and the orientation matrix, from the provided instance of hikari.dataframes.cif.CifBlock called block. Unless fragile is True, use defaults instead of raising KeyError.

Parameters:
  • block (hikari.dataframes.CifBlock) – CifBlock containing imported information.

  • fragile (bool) – If True, raise Error when any imported info is missing

_refresh_cell()[source]

Recalculate all vectors and scalars other than a_d, b_d, c_d, al_d, be_d, ga_d based on the currently stored values of the aforementioned six.
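
One part of such a recalculation is deriving reciprocal space vectors from direct space ones. The sketch below uses the textbook relation a* = (b x c) / V and its cyclic permutations; the helper names are hypothetical and the code is not taken from the private hikari method.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def reciprocal_vectors(a_v, b_v, c_v):
    """Return (a_w, b_w, c_w), where a_w = (b_v x c_v) / V and so on."""
    volume = dot(a_v, cross(b_v, c_v))
    a_w = tuple(x / volume for x in cross(b_v, c_v))
    b_w = tuple(x / volume for x in cross(c_v, a_v))
    c_w = tuple(x / volume for x in cross(a_v, b_v))
    return a_w, b_w, c_w


# Example: orthorhombic cell of 3 x 4 x 5 Angstrom aligned with the axes.
a_v, b_v, c_v = (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 5.0)
a_w, b_w, c_w = reciprocal_vectors(a_v, b_v, c_v)
```

With this convention each direct vector has a unit dot product with its own reciprocal partner and a zero dot product with the other two.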

property a_d

Length of unit cell vector a in direct space.

Return type:

float

property b_d

Length of unit cell vector b in direct space.

Return type:

float

property c_d

Length of unit cell vector c in direct space.

Return type:

float

property al_d

Angle between vectors b and c in degrees.

Return type:

float

property be_d

Angle between vectors c and a in degrees.

Return type:

float

property ga_d

Angle between vectors a and b in degrees.

Return type:

float

property v_d

Unit cell volume in direct space.

Return type:

float

property a_v

Unit cell vector a in direct space.

Return type:

numpy.array

property b_v

Unit cell vector b in direct space.

Return type:

numpy.array

property c_v

Unit cell vector c in direct space.

Return type:

numpy.array

property A_d

Basis matrix A with vertically stacked direct space vectors.

Return type:

np.array

property G_d

Direct space metric matrix [ai . aj]ij.

Return type:

np.array

property a_r

Length of unit cell vector a* in reciprocal space.

Return type:

float

property b_r

Length of unit cell vector b* in reciprocal space.

Return type:

float

property c_r

Length of unit cell vector c* in reciprocal space.

Return type:

float

property al_r

Angle between vectors b* and c* in degrees.

Return type:

float

property be_r

Angle between vectors c* and a* in degrees.

Return type:

float

property ga_r

Angle between vectors a* and b* in degrees.

Return type:

float

property v_r

Unit cell volume in reciprocal space.

Return type:

float

property a_w

Unit cell vector a* in reciprocal space.

Return type:

numpy.array

property b_w

Unit cell vector b* in reciprocal space.

Return type:

numpy.array

property c_w

Unit cell vector c* in reciprocal space.

Return type:

numpy.array

property A_r

Basis matrix A* with vertically stacked reciprocal space vectors.

Return type:

np.array

property G_r

Reciprocal space metric matrix [ai* . aj*]ij.

Return type:

np.array

SELLING_S6_TRANSFORMATIONS
SELLING_E3_TRANSFORMATIONS
class hikari.dataframes.UBaseFrame[source]

Bases: hikari.dataframes.BaseFrame

A sub-class of hikari.dataframes.BaseFrame capable of the same operations as its parent, but using uncertainties.ufloat values instead of floats. As a result, the types specified in inherited docstrings might be wrong.

IMPORTED_FROM_CIF
orientation

3x3 matrix describing orientation of crystal during experiment.

_refresh_cell()[source]

Recalculate all vectors and scalars other than a_d, b_d, c_d, al_d, be_d, ga_d based on the currently stored values of the aforementioned six.

class hikari.dataframes.HklFrame[source]

Bases: hikari.dataframes.BaseFrame

A master object which manages single-crystal diffraction files. It utilises other Hkl* classes to import, store, manipulate and output information about single-crystal diffraction patterns.

HklFrame acts as a container which stores the diffraction data (a pandas dataframe, table) and elementary crystal cell data (hikari.dataframes.BaseFrame). Demanding methods belonging to this class are vectorized, providing satisfactory performance and high memory capacity. HklFrame methods are designed to work in place, so the intended strategy is to create a new instance of HklFrame for each reflection dataset, manipulate it using methods such as merge() or trim(), and then copy() it to another object or output it using write() if needed.

The HklFrame always initiates empty and does not accept any arguments. Some magic methods, such as __len__() and __add__() are defined and describe/operate on the frame.

HKL_LIMIT = 127

Highest absolute value of h, k or l index, which can be interpreted correctly by current version of the software.

__la = 0.71069

Wavelength of radiation used in experiment.

table

Pandas dataframe containing diffraction data information. Each row represents one reflection observation, while each column holds one piece of information about the reflections. For a list of available keys, see HklKeys, whose instance is used to manage the keys of this table.

__add__(other)[source]
Parameters:

other (HklFrame) – HklFrame to be added to data

Returns:

concatenated table dataframes with metadata from first

Return type:

HklFrame

__len__()[source]
Returns:

Number of rows (individual reflections) in self.data

Return type:

int

__str__()[source]
Returns:

Human-readable representation of self.data

Return type:

str

property la

Wavelength of radiation used in the diffraction experiment. Can be set using popular abbreviations such as “MoKa” or “CuKb”, where a and b stand for alpha and beta. Implemented anode materials include: “Ag”, “Co”, “Cr”, “Cu”, “Fe”, “Mn”, “Mo”, “Ni”, “Pd”, “Rh”, “Ti”, “Zn”. Their wavelengths have been taken from International Tables for Crystallography, Volume C, Table 4.2.4.1, 3rd Edition.

Returns:

wavelength of radiation used in experiment

Return type:

float

property r_lim

Radius of the limiting sphere in reciprocal Angstrom, calculated as 2/la.

Return type:

float
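
The 2/la relation follows from the Bragg equation: the smallest accessible d-spacing is la/2, and the radius is its inverse. A minimal sketch, using a hypothetical function name and the default wavelength listed above:

```python
def limiting_sphere_radius(la):
    """Radius of the limiting sphere in reciprocal Angstrom for wavelength la."""
    return 2.0 / la


r_lim = limiting_sphere_radius(0.71069)   # default wavelength listed above
d_min = 1.0 / r_lim                       # smallest accessible d-spacing, la / 2
```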

_in_dacs(opening_angle, vectors)[source]
dac_trim(opening_angle=35.0, vector=None)[source]

Remove reflections outside the DAC-accessible volume defined by opening_angle. The sample/DAC orientation can be supplied either by specifying the crystal orientation in the orientation attribute of hikari.dataframes.BaseFrame, or by providing an xyz* vector perpendicular to the DAC-accessible disc. For further details, see Tchoń & Makal, IUCrJ 8, 1006-1017 (2021).

Parameters:
  • opening_angle (float) – DAC single opening angle in degrees, default 35.

  • vector (tuple[float]) – Provides information about orientation of crystal relative to DAC. If None, orientation is used instead.

Returns:

HklFrame containing only reflections in dac-accessible region.

Return type:

HklFrame

dacs_count(opening_angle=35.0, vectors=np.array((1, 0, 0)))[source]

Count unique dac-accessible reflections for n crystals placed such that vector n is perpendicular to diamond. For details see dac_trim().

Parameters:
  • opening_angle (float) – DAC single opening angle in degrees, default 35.

  • vectors (np.array) – Array with rotational axes of available DAC-discs.

Returns:

Array with numbers of unique reflns in DAC-accessible region.

Return type:

np.array

copy()[source]
Returns:

An exact deep copy of this HklFrame.

Return type:

HklFrame

extinct(space_group=SG['P1'])[source]

Remove from the dataframe those reflections which should be extinct in the given space group, supplied as a hikari.symmetry.group.Group. For reference, see ITC-A12.3.5.

Parameters:

space_group (hikari.symmetry.group.Group) – Space group used to extinct the reflections.

find_equivalents(point_group=PG['1'])[source]

Assign each reflection its symmetry equivalence identifier and store it in the hikari.dataframes.HklFrame.data[‘equiv’] column. The ID is an integer unique for each set of equivalent reflections.

In order to determine equivalence, a point_group of reciprocal space must be provided (default PG[‘1’]). Point groups and their notation can be found in the hikari.symmetry sub-package.

Parameters:

point_group (hikari.symmetry.Group) – Point group used to determine symmetry equivalence

from_dict(dictionary)[source]

Construct the self.data using information stored in dictionary. The dictionary keys must be valid strings, see HklKeys for a list of valid keys. The dictionary values must be iterable of equal size, preferably numpy.ndarray.

Parameters:

dictionary (Dict[str, numpy.ndarray]) – Dictionary with “key - iterable of values” pairs.

fill(radius=2.0)[source]

Fill dataframe with all reflections within radius from space origin.

Parameters:

radius (float) – Maximum distance from the reciprocal space origin to placed reflection (in reciprocal Angstrom).

Return type:

None
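
The idea of filling a sphere in reciprocal space can be sketched for the simplest case of a cubic cell, where the distance from the origin is sqrt(h^2 + k^2 + l^2) / a. The function below is a hypothetical illustration, not the vectorized hikari implementation; whether the real method keeps the (0, 0, 0) node is not stated above, and this sketch keeps it.

```python
import math
from itertools import product


def fill_cubic(a, radius):
    """List all hkl with |h a* + k b* + l c*| <= radius for a cubic cell."""
    a_r = 1.0 / a                        # reciprocal cell edge in Angstrom^-1
    h_max = math.floor(radius / a_r)     # no index can exceed this bound
    indices = range(-h_max, h_max + 1)
    return [(h, k, l) for h, k, l in product(indices, repeat=3)
            if a_r * math.sqrt(h * h + k * k + l * l) <= radius]


# Cubic cell with a = 5 Angstrom, sphere of radius 0.25 reciprocal Angstrom.
reflections = fill_cubic(a=5.0, radius=0.25)
```

For this small radius only the origin and the six (±1, 0, 0)-type nodes fall inside the sphere.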

stats(bins=10, space_group=SG['P1'])[source]

Returns completeness, redundancy, number of all, unique & theoretically possible reflections within equal-volume bins in given space group.

Parameters:
  • bins (int) – Number of equal-volume bins to divide the data into.

  • space_group (hikari.symmetry.Group) – Group used to calculate equivalence and extinctions

Returns:

String containing table with stats as a function of resolution

Return type:

str
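
One way to read "equal-volume bins" is as concentric shells in reciprocal space that each enclose the same volume. Under that assumption, shell boundaries scale with the cube root of the bin index, as in this hypothetical sketch:

```python
def equal_volume_edges(r_max, bins):
    """Radii splitting a sphere of radius r_max into shells of equal volume."""
    # Volume grows with r^3, so the i-th edge scales with (i / bins)^(1/3).
    return [r_max * (i / bins) ** (1.0 / 3.0) for i in range(bins + 1)]


edges = equal_volume_edges(r_max=2.0, bins=8)
# Shell volumes up to the common factor 4 * pi / 3.
volumes = [hi ** 3 - lo ** 3 for lo, hi in zip(edges, edges[1:])]
```

Because of the cube-root scaling, the outer shells are much thinner than the inner ones while holding a similar number of reciprocal lattice nodes.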

merge(point_group=PG['1'])[source]

Average down each set of redundant reflections present in the table, to one reflection. The redundancy is determined using the find_equivalents() method with appropriate point group. Thus, the merging can be used in different ways depending on point group:

  • For PG[‘1’], only reflections with exactly the same h, k, l indices will be merged. Resulting dataframe will not contain any duplicates.

  • For PG[‘-1’] reflections with the same h, k and l as well as their Friedel pairs will be merged together to one reflection.

  • For PG[‘mmm’] all equivalent reflections of “mmm” point group will be merged. Since “mmm” is centrosymmetric, Friedel pairs will be merged.

  • For PG[‘mm2’] symmetry-equivalent reflections within the “mm2” point group will be merged, but the Friedel pairs will be preserved.

The procedure will have a different effect on different dataframe keys, depending on their “reduce_behaviour” specified in HklKeys. Fixed parameters h, k, l, x, y, z, r and equiv will be preserved; Floating points such as intensity I, structure factor F and their uncertainties si and sf will be averaged using arithmetic mean; Multiplicity m will be summed; Other parameters which would lose their meaning such as batch number b will be discarded.

The merging inevitably removes some information from the dataframe, but it can be necessary for some operations. For example, the drawing procedures work faster and provide clearer image if multiple points occupying the same position in space are reduced to one instance.

Parameters:

point_group (hikari.symmetry.Group) – Point Group used to determine symmetry equivalence
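
The reduce behaviour described above (keep indices, average I and si, sum m) can be sketched in plain Python for the two simplest point groups. The function and its friedel flag are hypothetical; the real method works on the pandas table and supports arbitrary point groups via find_equivalents().

```python
from collections import defaultdict


def merge_reflections(rows, friedel=False):
    """Average I and si, and sum m, over groups of equivalent reflections.

    rows: iterable of (h, k, l, I, si, m) tuples.
    friedel: if True, treat (h, k, l) and (-h, -k, -l) as equivalent,
    which corresponds to point group -1; otherwise to point group 1.
    """
    groups = defaultdict(list)
    for h, k, l, intensity, sigma, mult in rows:
        key = (h, k, l)
        if friedel:
            key = max(key, (-h, -k, -l))     # one representative per pair
        groups[key].append((intensity, sigma, mult))
    merged = []
    for (h, k, l), obs in groups.items():
        n = len(obs)
        merged.append((h, k, l,
                       sum(o[0] for o in obs) / n,    # arithmetic mean of I
                       sum(o[1] for o in obs) / n,    # arithmetic mean of si
                       sum(o[2] for o in obs)))       # summed multiplicity
    return merged


rows = [(1, 0, 0, 10.0, 1.0, 1),
        (1, 0, 0, 12.0, 1.0, 1),
        (-1, 0, 0, 14.0, 1.0, 1)]
merged_p1 = merge_reflections(rows)                 # Friedel pair kept apart
merged_m1 = merge_reflections(rows, friedel=True)   # Friedel pair merged
```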

place()[source]

Assign reflections their positions in reciprocal space (“x”, “y”, “z”) and calculate their distance from origin (“r”) in reciprocal Angstrom. Save four new keys and their values into the dataframe.
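
The position of a reflection is the linear combination h a* + k b* + l c* of the reciprocal space vectors. A minimal sketch for a single reflection follows; the function name and the example cell are assumptions made for illustration.

```python
import math


def place(hkl, a_w, b_w, c_w):
    """Return (x, y, z, r) for one reflection given reciprocal space vectors."""
    h, k, l = hkl
    x, y, z = (h * a + k * b + l * c for a, b, c in zip(a_w, b_w, c_w))
    return x, y, z, math.sqrt(x * x + y * y + z * z)


# Reciprocal vectors of an orthorhombic 3 x 4 x 5 Angstrom cell (example only).
a_w, b_w, c_w = (1 / 3, 0.0, 0.0), (0.0, 0.25, 0.0), (0.0, 0.0, 0.2)
x, y, z, r = place((3, 4, 0), a_w, b_w, c_w)
```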

calculate_fcf_statistics()[source]

Calculate values of zeta, (I - Ic) / si, and other statistics based on the contents of fcf files. Save the new key and its values into the dataframe.

read(hkl_path, hkl_format='shelx_4')[source]

Read the contents of .hkl file as specified by path and format, and store them in the pandas dataframe in self.data. For a list of all available .hkl formats, please refer to hikari.dataframes.HklIo.format.

Parameters:
  • hkl_path (str) – Absolute or relative path to the .hkl file.

  • hkl_format (union[int, str, dict]) – Format of provided .hkl file.

_recalculate_structure_factors_and_intensities()[source]

Calculate ‘I’ and ‘si’ or ‘F’ and ‘sf’, depending on which are missing.

_recalculate_structure_factors_from_intensities()[source]

Recalculate the structure factor F and its uncertainty sf.

Structure factor is calculated as follows: F = signum(I) * sqrt(abs(I)).

Structure factor’s uncertainty is calculated as follows: sf = si / (2 * sqrt(abs(I))).

_recalculate_intensities_from_structure_factors()[source]

Recalculate the intensity I and its uncertainty si.

Intensity is calculated as follows: I = signum(F) * F ** 2.

Intensity’s uncertainty is calculated as follows: si = 2 * sf * abs(F).
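
The two pairs of formulas above translate directly into scalar Python functions. The names below are hypothetical, and the sketch does not handle I = 0, where the uncertainty formula divides by zero; how hikari treats that case is not stated here.

```python
import math


def f_from_i(i, si):
    """F = signum(I) * sqrt(abs(I)), sf = si / (2 * sqrt(abs(I))); I != 0."""
    root = math.sqrt(abs(i))
    return math.copysign(root, i), si / (2.0 * root)


def i_from_f(f, sf):
    """I = signum(F) * F ** 2, si = 2 * sf * abs(F)."""
    return math.copysign(f * f, f), 2.0 * sf * abs(f)


# A negative intensity keeps its sign through the round trip.
f, sf = f_from_i(-16.0, 2.0)
i, si = i_from_f(f, sf)
```

Carrying the sign through both conversions is what makes them inverse to each other for negative measured intensities.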

transform(operations)[source]

Apply a symmetry operation or list of symmetry operations to transform the diffraction pattern.

If one symmetry operation (3x3 or 4x4 numpy array) is provided, it effectively multiplies the hkl matrix by the operation matrix and accordingly alters the self.data dataframe. As a result, the length of self.data before and after transformation is the same.

However, the function behaves slightly counter-intuitively if two or more operation matrices are provided. In such case the method applies the transformation procedure independently for each operation, and then concatenates resulting matrices. Resulting self.data is len(operations) times longer than the initial.

The function can use 3x3 or larger (e.g. 4x4) matrices, as it selects only the upper-left 3x3 segment for the sake of calculations. Also, while reconstructing the symmetry of merged reflection file it is important to use all symmetry operations, not only generators.

Single symmetry operations or their lists belonging to certain point groups can be imported from hikari.symmetry module.

Parameters:

operations (Union[Iterable[np.ndarray], np.ndarray]) – Iterable of operation matrices to be applied
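
The concatenating behaviour for multiple operations can be sketched with plain lists. This is a hypothetical illustration: the multiplication order (row vector times matrix) is an assumption, and the real method operates on the whole table rather than on bare index tuples.

```python
def transform(hkl_rows, operations):
    """Apply each operation to every hkl row and concatenate the results."""
    result = []
    for op in operations:
        m = [row[:3] for row in op[:3]]      # upper-left 3x3 segment only
        for h, k, l in hkl_rows:
            # Row vector times matrix; the order is an assumption.
            result.append(tuple(h * m[0][j] + k * m[1][j] + l * m[2][j]
                                for j in range(3)))
    return result


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
inversion_4x4 = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
hkl = [(1, 2, 3), (0, 0, 4)]
expanded = transform(hkl, [identity, inversion_4x4])
```

With two operations the output is twice as long as the input, matching the len(operations) scaling described above.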

thin_out(target_cplt=1.0)[source]

Randomly remove reflections from the dataframe in order to decrease the completeness to target_cplt (relative to the initial completeness).

Parameters:

target_cplt (float) – Fraction of data not removed from dataframe
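
Random thinning to a target fraction can be sketched as below. The function is hypothetical: it thins a plain list of rows rather than tracking completeness of unique reflections, and the seed parameter is added here only to make the example reproducible.

```python
import random


def thin_out(rows, target_cplt=1.0, seed=None):
    """Keep a random subset holding a target_cplt fraction of the rows."""
    if not 0.0 <= target_cplt <= 1.0:
        raise ValueError("target_cplt must lie between 0 and 1")
    rng = random.Random(seed)
    keep = round(len(rows) * target_cplt)
    # Sample indices without replacement and preserve the original order.
    kept_indices = sorted(rng.sample(range(len(rows)), keep))
    return [rows[i] for i in kept_indices]


rows = [(h, 0, 0) for h in range(1, 11)]
thinned = thin_out(rows, target_cplt=0.7, seed=0)
```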

to_res(path='hkl.res', colored='m')[source]

Export the reflection information from table to a .res file, so that software used to visualize .res files can display the diffraction data in three dimensions.

Parameters:
  • colored (str) – Which key of dataframe should be visualized using color

  • path (str) – Absolute or relative path where the file should be saved

trim(limit)[source]

Remove reflections further than limit from reciprocal space origin.

Parameters:

limit (float) – Radius of the trimming sphere in reciprocal Angstrom

write(hkl_path, hkl_format='shelx_4')[source]

Write the contents of dataframe to a .hkl file using specified path and format. For a list of all available .hkl formats, please refer to hikari.dataframes.HklIo.format.

Parameters:
  • hkl_path (str) – Absolute or relative path to the .hkl file.

  • hkl_format (union[int, str, dict]) – Desired format of .hkl file.

class hikari.dataframes.ResFrame[source]

Bases: hikari.dataframes.BaseFrame

This class stores and manipulates basic information present in the majority of crystallographic information files, such as unit cell parameters stored in scalars and vectors.

data
atomic_form_factor(atom, hkl)[source]

Calculate X-ray atomic form factors for a single atom and a hkl array

Parameters:
  • atom (str) – Atom/ion name/identifier interpreted by form factor table

  • hkl (np.array) – A 2D array listing all hkls to consider

Returns:

A 1D array listing atomic form factors for desired hkls

Return type:

np.array

temperature_factor(hkl, u)[source]

Calculate temperature factor for single u matrix and a hkl array

Parameters:
  • hkl (np.array) – A 2D array listing all hkls to consider

  • u (np.array) – A classical anisotropic displacement parameters matrix

Returns:

A 1D array listing temperature factors for desired hkls

Return type:

np.array

form_factor(hkl, space_group)[source]

Calculate form factors based on current structure, hkls, and space group

Parameters:
  • hkl (np.array) – A 2D array listing all hkls to consider

  • space_group (hikari.symmetry.Group) – Space group describing the internal crystal symmetry

Returns:

A 1D array listing total form factors for desired hkls

Return type:

np.array

read(path)[source]

Read data from the specified ins/res file and return a dict

Parameters:

path (str) – Relative or absolute path to the res file to be read

Returns:

None

Return type:

None

class hikari.dataframes.LstFrame[source]
static read_r1(path)[source]

Read and return the final value of R1 from lst file