ozzy.utils

This submodule provides utility functions for other parts of ozzy, from simple formatting operations to more complicated file-finding tasks.

axis_from_extent §

axis_from_extent(nx, lims)

Create a numerical axis from the number of cells and extent limits. The axis values are centered with respect to each cell.

Parameters:

Name	Type	Description	Default
`nx` §	`int`	The number of cells in the axis.	required
`lims` §	`tuple[float, float]`	The extent limits (min, max).	required

Returns:

Name	Type	Description
`ax`	`ndarray`	The numerical axis.

Raises:

Type	Description
`ZeroDivisionError`	If the number of cells is zero.
`TypeError`	If the second element of `lims` is not larger than the first element.

Examples:

Simple axis

import ozzy as oz
axis = oz.utils.axis_from_extent(10, (0,1))
axis
# array([0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95])

Note how the axis values correspond to the center of each cell.
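
The cell-centering arithmetic can be sketched in plain Python. This only illustrates the formula implied by the example above and is not ozzy's implementation; the helper name `cell_centers` is hypothetical.

```python
def cell_centers(nx, lims):
    """Return the centers of nx equal cells spanning lims = (min, max)."""
    lo, hi = lims
    dx = (hi - lo) / nx  # cell width
    # Cell i spans [lo + i*dx, lo + (i+1)*dx], so its center sits half a cell in
    return [lo + (i + 0.5) * dx for i in range(nx)]


centers = cell_centers(10, (0, 1))
print([round(c, 2) for c in centers])
# [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
```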

bins_from_axis §

bins_from_axis(axis)

Create bin edges from a numerical axis. This is useful for binning operations that require the bin edges.

Parameters:

Name	Type	Description	Default
`axis` §	`ndarray`	The numerical axis.	required

Returns:

Name	Type	Description
`binaxis`	`ndarray`	The bin edges.

Examples:

Bin edges from simple axis

First we create a simple axis with the axis_from_extent function:

import ozzy as oz
axis = oz.utils.axis_from_extent(10, (0,1))
print(axis)
# [0.05 0.15 0.25 0.35 0.45 0.55 0.65 0.75 0.85 0.95]

Now we get the bin edges:

bedges = oz.utils.bins_from_axis(axis)
bedges
# array([-6.9388939e-18,  1.0000000e-01,  2.0000000e-01,  3.0000000e-01, 4.0000000e-01,  5.0000000e-01,  6.0000000e-01,  7.0000000e-01, 8.0000000e-01,  9.0000000e-01,  1.0000000e+00])

(In this example there is some rounding error for the zero edge.)
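
For a uniformly spaced axis, the relationship between cell centers and bin edges can be sketched as follows. The helper `edges_from_centers` is hypothetical and assumes uniform spacing with at least two centers; how `bins_from_axis` treats other cases is not covered here.

```python
def edges_from_centers(centers):
    """Return len(centers) + 1 bin edges for uniformly spaced cell centers."""
    dx = centers[1] - centers[0]  # assumes uniform spacing and >= 2 centers
    first_edge = centers[0] - 0.5 * dx  # half a cell below the first center
    return [first_edge + i * dx for i in range(len(centers) + 1)]


edges = edges_from_centers([0.5, 1.5, 2.5, 3.5])
print(edges)
# [0.0, 1.0, 2.0, 3.0, 4.0]
```

The values in this sketch are exactly representable as floats, which is why no rounding error appears here.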

check_h5_availability §

check_h5_availability(path)

Check if an HDF5 file can be opened for writing.

Note

This function is useful for long analysis operations that save a file at the end. Checking at the start that the output file is writable avoids running the lengthy processing only to fail when writing the result.

Parameters:

Name	Type	Description	Default
`path` §	`str`	The path to the HDF5 file.	required

Raises:

Type	Description
`FileNotFoundError`	If the file is not found.
`BlockingIOError`	If the file is in use and cannot be overwritten.
`OSError`	If there is another issue with the file.

Examples:

Writing a custom analysis function

import ozzy as oz

def my_analysis(ds, output_file='output.h5'):

    # Check whether output file is writeable
    oz.utils.check_h5_availability(output_file)

    # Perform lengthy analysis
    # ...
    new_ds = 10 * ds

    # Save result
    new_ds.ozzy.save(output_file)

    return
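
The fail-early idea is not specific to HDF5. Below is a standard-library sketch of the same pattern for an arbitrary output path. The helper `check_output_location` is hypothetical and only inspects the destination folder; it does not reproduce the checks that `check_h5_availability` performs (for example, whether the file is in use).

```python
import os
import tempfile


def check_output_location(path):
    """Raise early if the folder that should hold `path` is missing or read-only."""
    folder = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"Output folder does not exist: {folder}")
    if not os.access(folder, os.W_OK):
        raise PermissionError(f"Output folder is not writable: {folder}")


failed_early = False
with tempfile.TemporaryDirectory() as tmp:
    # A valid destination passes silently
    check_output_location(os.path.join(tmp, "output.h5"))
    try:
        # A destination inside a non-existent folder fails before any work is done
        check_output_location(os.path.join(tmp, "missing", "output.h5"))
    except FileNotFoundError:
        failed_early = True
        print("Caught before any processing started")
```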

find_runs §

find_runs(path, runs_pattern)

Find run directories matching a glob pattern.

Parameters:

Name	Type	Description	Default
`path` §	`str`	The base path.	required
`runs_pattern` §	`str \| list[str]`	The run directory name or glob pattern(s).	required

Returns:

Name	Type	Description
`dirs_dict`	`dict`	A dictionary mapping run names to their relative directory paths.

Examples:

Finding set of run folders

Let's say we have a set of simulations that pick up from different checkpoints of a baseline simulation, with the following folder tree:

.
└── all_simulations/
    ├── baseline/
    │   ├── data.h5
    │   ├── checkpoint_t_00200.h5
    │   ├── checkpoint_t_00400.h5
    │   ├── checkpoint_t_00600.h5
    │   └── ...
    ├── from_t_00200/
    │   └── data.h5
    ├── from_t_00400/
    │   └── data.h5
    ├── from_t_00600/
    │   └── data.h5
    ├── ...
    └── other_simulation

To get the directories of each subfolder, we could use either

import ozzy as oz
run_dirs = oz.utils.find_runs(path = "all_simulations", runs_pattern = "from_t_*")
print(run_dirs)
# {'from_t_00200': 'from_t_00200', 'from_t_00400': 'from_t_00400', ...}

or

import ozzy as oz
run_dirs = oz.utils.find_runs(path = ".", runs_pattern = "all_simulations/from_t_*")
print(run_dirs)
# {'from_t_00200': 'all_simulations/from_t_00200', 'from_t_00400': 'all_simulations/from_t_00400', ...}

Note that this function does not search recursively. If no run folders are found, it still returns an entry pointing to the current directory, under the name 'undefined':

import ozzy as oz
run_dirs = oz.utils.find_runs(path = ".", runs_pattern = "from_t_*")
# Could not find any run folder:
# - Checking whether already inside folder...
#     ...no
# - Proceeding without a run name.
print(run_dirs)
# {'undefined': '.'}
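
The behavior shown in these examples can be approximated with `glob`. This is a sketch under the assumption that returned paths are relative to `path`, as the outputs above suggest; `find_run_dirs` is a hypothetical name and the code is not taken from ozzy.

```python
import glob
import os
import tempfile


def find_run_dirs(path, pattern):
    """Map run folder names to paths relative to `path` (non-recursive)."""
    matches = sorted(glob.glob(os.path.join(path, pattern)))
    dirs = {
        os.path.basename(m): os.path.relpath(m, path)
        for m in matches
        if os.path.isdir(m)
    }
    # Fallback mirroring the documented output when nothing matches
    return dirs or {"undefined": "."}


with tempfile.TemporaryDirectory() as base:
    for name in ("baseline", "from_t_00200", "from_t_00400"):
        os.mkdir(os.path.join(base, name))
    found = find_run_dirs(base, "from_t_*")
    nothing = find_run_dirs(base, "no_such_run_*")

print(found)
# {'from_t_00200': 'from_t_00200', 'from_t_00400': 'from_t_00400'}
print(nothing)
# {'undefined': '.'}
```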

force_str_to_list §

force_str_to_list(var)

Convert a string to a list containing the string.

Parameters:

Name	Type	Description	Default
`var` §	`str \| object`	The input variable.	required

Returns:

Name	Type	Description
`var`	`list`	A list containing the input variable if it was a string, or the original object.

Examples:

Example

import ozzy as oz
oz.utils.force_str_to_list('hello')
# ['hello']
oz.utils.force_str_to_list([1, 2, 3])
# [1, 2, 3]
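
A short sketch of why this normalization helps: code that accepts either one file name or a list of them can then loop over the argument safely. The names `as_list` and `count_files` are hypothetical and only mimic the documented behavior.

```python
def as_list(var):
    """Wrap a lone string in a list; return anything else unchanged."""
    return [var] if isinstance(var, str) else var


def count_files(files):
    # Without normalization, len("data.h5") would count 7 characters, not 1 file
    return len(as_list(files))


print(count_files("data.h5"))
# 1
print(count_files(["a.h5", "b.h5"]))
# 2
```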

get_attr_if_exists §

get_attr_if_exists(
    da, attr, str_exists=None, str_doesnt=None
)

Retrieve an attribute from an xarray DataArray if it exists, or return a specified value otherwise.

Parameters:

Name	Type	Description	Default
`da` §	`DataArray`	The xarray DataArray object to check for the attribute.	required
`attr` §	`str`	The name of the attribute to retrieve.	required
`str_exists` §	`str \| Iterable[str] \| Callable \| None`	The value or function to use if the attribute exists. If `str`: return as-is. If `Iterable`: concatenate the first element, existing value, and second element. If `Callable`: apply this function to the existing attribute value. If `None`: return attribute if it exists, otherwise return `None`.	`None`
`str_doesnt` §	`str \| None`	The value to return if the attribute doesn't exist. If `None`, returns `None`.	`None`

Returns:

Type	Description
`str \| None`	The processed attribute value if it exists, `str_doesnt` if it doesn't exist, or `None` if `str_doesnt` is `None` and the attribute doesn't exist.

Notes

If str_exists is an Iterable with more than two elements, only the first two are used, and a warning is printed.

Examples:

Basic usage with string

import ozzy as oz
import numpy as np

# Create a sample DataArray with an attribute
da = oz.DataArray(np.random.rand(3, 3), attrs={'units': 'meters'})

result = oz.utils.get_attr_if_exists(da, 'missing_attr', 'Exists', 'Does not exist')
print(result)
# Output: Does not exist

Using an Iterable and a Callable

import ozzy as oz
import numpy as np

da = oz.DataArray(np.random.rand(3, 3), attrs={'units': 'meters'})

# Using an Iterable
result = oz.utils.get_attr_if_exists(da, 'units', ['Unit: ', ' (SI)'], 'No unit')
print(result)
# Output: Unit: meters (SI)

# Using a Callable
result = oz.utils.get_attr_if_exists(da, 'units', lambda x: f'The unit is: {x}', 'No unit found')
print(result)
# Output: The unit is: meters

# Using another Callable
result = oz.utils.get_attr_if_exists(da, 'units', lambda x: x.upper(), 'No unit')
print(result)
# Output: METERS
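
The dispatch on the type of `str_exists` described in the parameter table can be sketched on a plain dictionary of attributes. The function `resolve_attr` is hypothetical and not ozzy's code; in particular, the order in which the types are tested is an assumption (a `str` is also an Iterable, so it has to be tested first).

```python
from collections.abc import Iterable


def resolve_attr(attrs, attr, str_exists=None, str_doesnt=None):
    """Apply the documented str_exists / str_doesnt rules to a plain dict."""
    if attr not in attrs:
        return str_doesnt
    value = attrs[attr]
    if str_exists is None:
        return value
    if isinstance(str_exists, str):  # check str first: a str is also Iterable
        return str_exists
    if callable(str_exists):
        return str_exists(value)
    if isinstance(str_exists, Iterable):
        parts = list(str_exists)
        if len(parts) > 2:
            print("Warning: only the first two elements are used")
        return parts[0] + value + parts[1]
    raise TypeError("str_exists must be a str, an Iterable, a Callable, or None")


attrs = {"units": "meters"}
print(resolve_attr(attrs, "units"))                       # meters
print(resolve_attr(attrs, "units", "Exists"))             # Exists
print(resolve_attr(attrs, "units", ["Unit: ", " (SI)"]))  # Unit: meters (SI)
print(resolve_attr(attrs, "units", str.upper))            # METERS
print(resolve_attr(attrs, "missing", "Exists", "n/a"))    # n/a
```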

get_regex_snippet §

get_regex_snippet(pattern, string)

Extract the first substring that matches a regular expression pattern, using re.search.

Tip

Use regex101.com to experiment with and debug regular expressions.

Parameters:

Name	Type	Description	Default
`pattern` §	`str`	The regular expression pattern.	required
`string` §	`str`	The input string.	required

Returns:

Name	Type	Description
`match`	`str`	The matched substring.

Examples:

Get number from file name

import ozzy as oz
oz.utils.get_regex_snippet(r'\d+', 'field-001234.h5')
# '001234'
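
A standard-library sketch of the same idea, using `re.search` directly. The helper `first_match` is hypothetical; returning `None` when nothing matches is a choice made for this sketch, and the no-match behavior of `get_regex_snippet` is not documented here.

```python
import re


def first_match(pattern, string):
    """Return the first substring matching `pattern`, or None if there is none."""
    found = re.search(pattern, string)
    return found.group(0) if found else None


print(first_match(r"\d+", "field-001234.h5"))
# 001234
print(int(first_match(r"\d+", "field-001234.h5")))
# 1234
print(first_match(r"\d+", "field.dat"))
# None
```

Note that `re.search` stops at the first match, so the trailing `5` in `.h5` is never reached in the first two calls.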

get_user_methods §

get_user_methods(clss)

Get a list of user-defined methods in a class.

Parameters:

Name	Type	Description	Default
`clss` §	`class`	The input class.	required

Returns:

Name	Type	Description
`methods`	`list[str]`	A list of user-defined method names in the class.

Examples:

Minimal class

class MyClass:
    def __init__(self):
        pass
    def my_method(self):
        pass

import ozzy as oz
oz.utils.get_user_methods(MyClass)
# ['my_method']
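
One way to obtain such a list is to inspect the class namespace and skip double-underscore names. This is a sketch under that assumption, with the hypothetical name `user_methods`; it only sees methods defined directly on the class and may differ from what `get_user_methods` does for inherited or single-underscore methods.

```python
def user_methods(cls):
    """List callables defined directly on `cls`, skipping double-underscore names."""
    return [
        name
        for name, member in vars(cls).items()
        if callable(member) and not name.startswith("__")
    ]


class Particle:
    def __init__(self):
        self.charge = -1

    def push(self):
        pass

    def deposit(self):
        pass


print(user_methods(Particle))
# ['push', 'deposit']
```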

path_list_to_pars §

path_list_to_pars(pathlist)

Split a list of file paths into common directory, run directories, and quantities.

Parameters:

Name	Type	Description	Default
`pathlist` §	`list[str]`	A list of file paths.	required

Returns:

Name	Type	Description
`common_dir`	`str`	The common directory shared by all file paths.
`dirs_runs`	`dict[str, str]`	A dictionary mapping run folder names to their absolute paths.
`quants`	`list[str]`	A list of unique quantities (file names) present in the input paths.

Examples:

Simple example

from ozzy.utils import path_list_to_pars

pathlist = ['/path/to/run1/quantity1.txt',
            '/path/to/run1/quantity2.txt',
            '/path/to/run2/quantity1.txt']

common_dir, dirs_runs, quants = path_list_to_pars(pathlist)

print(f"Common directory: {common_dir}")
# Common directory: /path/to
print(f"Run directories: {dirs_runs}")
# Run directories: {'run1': '/path/to/run1', 'run2': '/path/to/run2'}
print(f"Quantities: {quants}")
# Quantities: ['quantity2.txt', 'quantity1.txt']

Single file path

from ozzy.utils import path_list_to_pars

pathlist = ['/path/to/run1/quantity.txt']

common_dir, dirs_runs, quants = path_list_to_pars(pathlist)

print(f"Common directory: {common_dir}")
# Common directory: /path/to/run1
print(f"Run directories: {dirs_runs}")
# Run directories: {'.': '/path/to/run1'}
print(f"Quantities: {quants}")
# Quantities: ['quantity.txt']
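
The decomposition shown in both examples can be approximated with path utilities from the standard library. This sketch uses `posixpath` so that it behaves the same on any platform for POSIX-style paths; `split_path_list` is a hypothetical name, and the file names are sorted here only to make the output reproducible.

```python
import posixpath


def split_path_list(pathlist):
    """Return (common directory, {run name: run directory}, file names)."""
    run_dirs = sorted({posixpath.dirname(p) for p in pathlist})
    common_dir = posixpath.commonpath(run_dirs)
    # Name each run by its path relative to the common directory
    dirs_runs = {posixpath.relpath(d, common_dir): d for d in run_dirs}
    # Sorted for reproducible output; a set alone has no defined order
    quants = sorted({posixpath.basename(p) for p in pathlist})
    return common_dir, dirs_runs, quants


paths = [
    "/path/to/run1/quantity1.txt",
    "/path/to/run1/quantity2.txt",
    "/path/to/run2/quantity1.txt",
]
print(split_path_list(paths))
# ('/path/to', {'run1': '/path/to/run1', 'run2': '/path/to/run2'}, ['quantity1.txt', 'quantity2.txt'])
print(split_path_list(["/path/to/run1/quantity.txt"]))
# ('/path/to/run1', {'.': '/path/to/run1'}, ['quantity.txt'])
```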

prep_file_input §

prep_file_input(files)

Prepare path input argument by expanding user paths and converting to absolute paths.

Parameters:

Name	Type	Description	Default
`files` §	`str \| list of str`	The input file(s).	required

Returns:

Name	Type	Description
`filelist`	`list of str`	A list of absolute file paths.

Examples:

Expand user folder

import ozzy as oz
oz.utils.prep_file_input('~/example.txt')
# ['/home/user/example.txt']
oz.utils.prep_file_input(['~/file1.txt', '~/file2.txt'])
# ['/home/user/file1.txt', '/home/user/file2.txt']
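
The two steps named in the description (user expansion, then conversion to an absolute path) correspond to `os.path.expanduser` and `os.path.abspath`. A minimal sketch, with the hypothetical name `normalize_paths`; any further processing that `prep_file_input` may do is not represented.

```python
import os


def normalize_paths(files):
    """Return a list of absolute paths with any leading '~' expanded."""
    if isinstance(files, str):
        files = [files]
    return [os.path.abspath(os.path.expanduser(f)) for f in files]


single = normalize_paths("~/example.txt")
several = normalize_paths(["~/file1.txt", "data/file2.txt"])
print(single)   # one absolute path; the exact value depends on the home folder
print(several)  # the relative path is resolved against the current directory
```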

print_file_item §

print_file_item(file)

Print a file name with a leading ' - '.

Parameters:

Name	Type	Description	Default
`file` §	`str`	The file name to be printed.	required

Examples:

Example

import ozzy as oz
oz.utils.print_file_item('example.txt')
# - example.txt

recursive_search_for_file §

recursive_search_for_file(fname, path=os.getcwd())

Recursively search for files with a given name or pattern in a specified directory and its subdirectories.

Parameters:

Name	Type	Description	Default
`fname` §	`str`	The name or name pattern of the file to search for.	required
`path` §	`str`	The path to the directory where the search should start. If not specified, uses the current directory via `os.getcwd`.	`getcwd()`

Returns:

Type	Description
`list[str]`	A list of paths to the files found, relative to `path`.

Examples:

Search for a file in the current directory

from ozzy.utils import recursive_search_for_file
files = recursive_search_for_file('example.txt')
# files = ['example.txt']

Search for many files in a subdirectory

from ozzy.utils import recursive_search_for_file
files = recursive_search_for_file('data-*.h5', '/path/to/project')
# files = ['data/data-000.h5', 'data/data-001.h5', 'analysis/data-modified.h5']
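
A recursive name-pattern search of this kind can be sketched with `os.walk` and `fnmatch`. The function `search_tree` is hypothetical; it returns paths relative to the starting directory, as stated in the Returns table, and sorts them for reproducibility.

```python
import fnmatch
import os
import tempfile


def search_tree(fname, path):
    """Return paths (relative to `path`) of files whose name matches `fname`."""
    hits = []
    for folder, _subdirs, files in os.walk(path):
        for name in fnmatch.filter(files, fname):
            hits.append(os.path.relpath(os.path.join(folder, name), path))
    return sorted(hits)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "data"))
    for rel in ("data/data-000.h5", "data/data-001.h5", "notes.txt"):
        open(os.path.join(root, *rel.split("/")), "w").close()
    found = search_tree("data-*.h5", root)

print([p.replace(os.sep, "/") for p in found])
# ['data/data-000.h5', 'data/data-001.h5']
```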

set_attr_if_exists §

set_attr_if_exists(da, attr, str_exists, str_doesnt=None)

Set or modify an attribute of a DataArray if it exists, or set it to a fallback value if it doesn't exist or is `None`.

Parameters:

Name	Type	Description	Default
`da` §	`DataArray`	The input DataArray.	required
`attr` §	`str`	The name of the attribute to set or modify.	required
`str_exists` §	`str \| Iterable[str] \| Callable \| None`	The value or function to use if the attribute exists. If `str`: replace the attribute with this string. If `Iterable`: concatenate the first element, existing value, and second element. If `Callable`: apply this function to the existing attribute value. If `None`: do not change the attribute.	required
`str_doesnt` §	`str \| None`	The value to set if the attribute doesn't exist. If `None`, no action is taken.	`None`

Returns:

Type	Description
`DataArray`	The modified DataArray with updated attributes.

Notes

If str_exists is an Iterable with more than two elements, only the first two are used, and a warning is printed.

Examples:

Set an existing attribute

import ozzy as oz
import numpy as np

# Create a sample DataArray
da = oz.DataArray(np.random.rand(3, 3), attrs={'units': 'meters'})

# Set an existing attribute
da = oz.utils.set_attr_if_exists(da, 'units', 'kilometers')
print(da.attrs['units'])
# Output: kilometers

Modify an existing attribute with a function

import ozzy as oz
import numpy as np

# Create a sample DataArray
da = oz.DataArray(np.random.rand(3, 3), attrs={'description': 'Random data'})

# Modify an existing attribute with a function
da = oz.utils.set_attr_if_exists(da, 'description', lambda x: x.upper())
print(da.attrs['description'])
# Output: RANDOM DATA

Set a non-existing attribute

import ozzy as oz
import numpy as np

# Create a sample DataArray
da = oz.DataArray(np.random.rand(3, 3))

# Set a non-existing attribute
da = oz.utils.set_attr_if_exists(da, 'units', 'meters', str_doesnt='unknown')
print(da.attrs['units'])
# Output: unknown
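
The update rules, including the Iterable form and the case where nothing changes, can be sketched on a plain dictionary. The function `update_attr` is hypothetical and works on a copy of the dictionary; it is not how ozzy handles DataArray attributes internally.

```python
def update_attr(attrs, attr, str_exists, str_doesnt=None):
    """Return a copy of `attrs` updated following the documented rules."""
    new = dict(attrs)
    current = new.get(attr)
    if current is None:  # attribute missing or explicitly None
        if str_doesnt is not None:
            new[attr] = str_doesnt
    elif isinstance(str_exists, str):
        new[attr] = str_exists
    elif callable(str_exists):
        new[attr] = str_exists(current)
    elif str_exists is not None:  # treated as an iterable (prefix, suffix)
        prefix, suffix = list(str_exists)[:2]
        new[attr] = prefix + current + suffix
    return new


print(update_attr({"units": "m"}, "units", ["[", "]"]))
# {'units': '[m]'}
print(update_attr({}, "units", "meters", str_doesnt="unknown"))
# {'units': 'unknown'}
print(update_attr({}, "units", "meters"))
# {}
```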

stopwatch §

stopwatch(method)

Decorator function to measure the execution time of a method.

Parameters:

Name	Type	Description	Default
`method` §	`callable`	The method to be timed.	required

Returns:

Name	Type	Description
`timed`	`callable`	A wrapped version of the input method that prints the execution time.

Examples:

Get execution time whenever a function is called

from ozzy.utils import stopwatch

@stopwatch
def my_function(a, b):
    return a + b

my_function(2, 3)
# -> 'my_function' took: 0:00:00.000001
# 5
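
A timing decorator of this kind can be sketched with the standard library. The name `timed` is hypothetical, and formatting the elapsed time with `timedelta` is an assumption based on the output shown above.

```python
import functools
import time
from datetime import timedelta


def timed(func):
    """Print how long each call to `func` takes, then return its result."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = timedelta(seconds=time.perf_counter() - start)
        print(f"'{func.__name__}' took: {elapsed}")
        return result

    return wrapper


@timed
def add(a, b):
    return a + b


total = add(2, 3)  # prints the elapsed time; the value varies between runs
print(total)
```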

tex_format §

tex_format(str)

Format a string for TeX by enclosing it with '$' symbols.

Parameters:

Name	Type	Description	Default
`str` §	`str`	The input string.	required

Returns:

Name	Type	Description
`newstr`	`str`	The TeX-formatted string.

Examples:

Example

import ozzy as oz
oz.utils.tex_format('k_p^2')
# '$k_p^2$'
oz.utils.tex_format('')
# ''
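
Leaving an empty string untouched is convenient when composing labels, since no stray `$$` ends up in the result. A small sketch with the hypothetical helpers `dollar_wrap` and `axis_label`:

```python
def dollar_wrap(text):
    """Enclose non-empty text in '$'; leave an empty string untouched."""
    return f"${text}$" if text else ""


def axis_label(name, units=""):
    # An empty units string adds nothing, so no stray "$$" ends up in the label
    unit_part = f" [{dollar_wrap(units)}]" if units else ""
    return dollar_wrap(name) + unit_part


print(axis_label("k_p^2"))
# $k_p^2$
print(axis_label("E_z", "V/m"))
# $E_z$ [$V/m$]
```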

unpack_attr §

unpack_attr(attr)

Unpack a NumPy array attribute, typically from HDF5 files.

This function handles different shapes and data types of NumPy arrays, particularly focusing on string (byte string) attributes. It's useful for unpacking attributes read from HDF5 files using h5py.

Parameters:

Name	Type	Description	Default
`attr` §	`ndarray`	The input NumPy array to unpack.	required

Returns:

Type	Description
`object`	The unpacked attribute. For string attributes, it returns a UTF-8 decoded string. For other types, it returns either a single element (if the array has only one element) or the entire array.

Raises:

Type	Description
`AssertionError`	If the input is not a NumPy array.

Notes

For string attributes (dtype.kind == 'S'):
- 0D arrays: returns the decoded string
- 1D arrays: returns the first element decoded
- 2D arrays: returns the first element if size is 1, otherwise the entire array
For non-string attributes:
- If the array has only one element, returns that element
- Otherwise, returns the entire array

Examples:

Unpacking a string attribute

import numpy as np
import ozzy.utils as utils

# Create a NumPy array with a byte string
attr = np.array(b'Hello, World!')
result = utils.unpack_attr(attr)
print(result)
# Output: Hello, World!

Unpacking a numeric attribute

import numpy as np
import ozzy.utils as utils

# Create a NumPy array with a single number
attr = np.array([42])
result = utils.unpack_attr(attr)
print(result)
# Output: 42
