ozzy.utils

This submodule provides utility functions for other parts of ozzy, from simple formatting operations to more complicated file-finding tasks.

axis_from_extent

axis_from_extent(nx, lims)

Create a numerical axis from the number of cells and extent limits. The axis values are centered with respect to each cell.

Parameters:

  • nx (int, required): The number of cells in the axis.
  • lims (tuple[float, float], required): The extent limits (min, max).

Returns:

  • ax (ndarray): The numerical axis.

Raises:

  • ZeroDivisionError: If the number of cells is zero.
  • TypeError: If the second element of lims is not larger than the first element.

Examples:

Simple axis

import ozzy as oz
axis = oz.utils.axis_from_extent(10, (0,1))
axis
# array([0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95])
Note how the axis values correspond to the center of each cell.
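
The cell-centering arithmetic can be sketched in plain Python as follows. This is only an illustration, not ozzy's implementation; the helper name cell_centers is hypothetical, and it simply mirrors the exception types documented above.

```python
def cell_centers(nx, lims):
    """Hypothetical helper: centers of nx equal cells spanning lims = (min, max)."""
    if nx == 0:
        raise ZeroDivisionError("number of cells cannot be zero")
    lo, hi = lims
    if hi <= lo:
        # Same exception type as documented for axis_from_extent
        raise TypeError("second element of lims must be larger than the first")
    dx = (hi - lo) / nx  # cell width
    # Cell i spans [lo + i*dx, lo + (i+1)*dx], so its center is half a cell further in
    return [lo + (i + 0.5) * dx for i in range(nx)]


centers = cell_centers(10, (0, 1))
print([round(c, 2) for c in centers])
# [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
```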

bins_from_axis

bins_from_axis(axis)

Create bin edges from a numerical axis. This is useful for binning operations that require the bin edges.

Parameters:

  • axis (ndarray, required): The numerical axis.

Returns:

  • binaxis (ndarray): The bin edges.

Examples:

Bin edges from simple axis

First we create a simple axis with the axis_from_extent function:

import ozzy as oz
axis = oz.utils.axis_from_extent(10, (0,1))
print(axis)
# [0.05 0.15 0.25 0.35 0.45 0.55 0.65 0.75 0.85 0.95]
Now we get the bin edges:

bedges = oz.utils.bins_from_axis(axis)
bedges
# array([-6.9388939e-18,  1.0000000e-01,  2.0000000e-01,  3.0000000e-01, 4.0000000e-01,  5.0000000e-01,  6.0000000e-01,  7.0000000e-01, 8.0000000e-01,  9.0000000e-01,  1.0000000e+00])

(In this example there is some rounding error for the zero edge.)
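
For intuition, the relation between cell centers and bin edges can be sketched in plain Python. This sketch assumes a uniformly spaced axis, which may be narrower than what bins_from_axis supports; the helper name edges_from_centers is hypothetical.

```python
def edges_from_centers(centers):
    """Hypothetical helper: bin edges for a uniformly spaced, cell-centered axis."""
    if len(centers) < 2:
        raise ValueError("need at least two points to infer the spacing")
    dx = (centers[-1] - centers[0]) / (len(centers) - 1)  # uniform cell width
    start = centers[0] - 0.5 * dx  # left edge of the first cell
    # n cells are delimited by n + 1 edges
    return [start + i * dx for i in range(len(centers) + 1)]


centers = [0.05 + 0.1 * i for i in range(10)]
edges = edges_from_centers(centers)
print(len(centers), len(edges))
# 10 11
```

The first and last edges land on 0 and 1 only up to floating-point rounding, which is the same effect seen in the output above.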

check_h5_availability

check_h5_availability(path)

Check if an HDF5 file can be opened for writing.

Note

This function is useful for long analysis operations that save a file at the end. Checking at the start that the output file is writeable avoids running the lengthy processing only to fail when writing the result.

Parameters:

  • path (str, required): The path to the HDF5 file.

Raises:

  • FileNotFoundError: If the file is not found.
  • BlockingIOError: If the file is in use and cannot be overwritten.
  • OSError: If there is another issue with the file.

Examples:

Writing a custom analysis function
import ozzy as oz

def my_analysis(ds, output_file='output.h5'):

    # Check whether output file is writeable
    oz.utils.check_h5_availability(output_file)

    # Perform lengthy analysis
    # ...
    new_ds = 10 * ds

    # Save result
    new_ds.ozzy.save(output_file)

    return

find_runs

find_runs(path, runs_pattern)

Find run directories matching a glob pattern.

Parameters:

  • path (str, required): The base path.
  • runs_pattern (str | list[str], required): The run directory name or glob pattern(s).

Returns:

  • dirs_dict (dict): A dictionary mapping run names to their relative directory paths.

Examples:

Finding set of run folders

Let's say we have a set of simulations that pick up from different checkpoints of a baseline simulation, with the following folder tree:

.
└── all_simulations/
    ├── baseline/
    │   ├── data.h5
    │   ├── checkpoint_t_00200.h5
    │   ├── checkpoint_t_00400.h5
    │   ├── checkpoint_t_00600.h5
    │   └── ...
    ├── from_t_00200/
    │   └── data.h5
    ├── from_t_00400/
    │   └── data.h5
    ├── from_t_00600/
    │   └── data.h5
    ├── ...
    └── other_simulation

To get the directories of each subfolder, we could use either

import ozzy as oz
run_dirs = oz.utils.find_runs(path = "all_simulations", runs_pattern = "from_t_*")
print(run_dirs)
# {'from_t_00200': 'from_t_00200', 'from_t_00400': 'from_t_00400', ...}
or
import ozzy as oz
run_dirs = oz.utils.find_runs(path = ".", runs_pattern = "all_simulations/from_t_*")
print(run_dirs)
# {'from_t_00200': 'all_simulations/from_t_00200', 'from_t_00400': 'all_simulations/from_t_00400', ...}

Note that this function does not search recursively. If no run folders are found, it still returns the current directory, under the name 'undefined':

import ozzy as oz
run_dirs = oz.utils.find_runs(path = ".", runs_pattern = "from_t_*")
# Could not find any run folder:
# - Checking whether already inside folder...
#     ...no
# - Proceeding without a run name.
print(run_dirs)
# {'undefined': '.'}
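
The non-recursive matching can be illustrated with the standard glob module. This is a sketch under assumptions, not ozzy's code: the helper name match_run_dirs is hypothetical, and it does not reproduce the 'undefined' fallback shown above.

```python
import glob
import os
import tempfile


def match_run_dirs(path, runs_pattern):
    """Hypothetical helper: map run folder names to paths relative to `path`."""
    patterns = [runs_pattern] if isinstance(runs_pattern, str) else list(runs_pattern)
    dirs = {}
    for pattern in patterns:
        # Without recursive=True, glob only looks at the level named by the pattern
        for hit in sorted(glob.glob(os.path.join(path, pattern))):
            if os.path.isdir(hit):
                dirs[os.path.basename(hit)] = os.path.relpath(hit, path)
    return dirs


# Build a throwaway folder tree similar to the one above
with tempfile.TemporaryDirectory() as base:
    for name in ("baseline", "from_t_00200", "from_t_00400"):
        os.makedirs(os.path.join(base, "all_simulations", name))
    run_dirs = match_run_dirs(os.path.join(base, "all_simulations"), "from_t_*")
    nested = match_run_dirs(base, "from_t_*")  # one level too high: nothing matches

print(run_dirs)
# {'from_t_00200': 'from_t_00200', 'from_t_00400': 'from_t_00400'}
print(nested)
# {}
```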

force_str_to_list

force_str_to_list(var)

Convert a string to a list containing the string.

Parameters:

  • var (str | object, required): The input variable.

Returns:

  • var (list): A list containing the input variable if it was a string, or the original object.

Examples:

Example
import ozzy as oz
oz.utils.force_str_to_list('hello')
# ['hello']
oz.utils.force_str_to_list([1, 2, 3])
# [1, 2, 3]
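
A likely reason for such a helper is that a bare string is itself iterable, so looping over it yields characters rather than one item. The sketch below shows this with a hypothetical stand-in named as_list; it is an assumption about the motivation, not a copy of ozzy's code.

```python
def as_list(var):
    """Hypothetical stand-in: wrap a lone string in a list, pass anything else through."""
    return [var] if isinstance(var, str) else var


# Iterating over a bare string yields its characters
print(list("ab*"))
# ['a', 'b', '*']

# After wrapping, a loop sees the string as a single item
print([item for item in as_list("ab*")])
# ['ab*']

# Non-string input is returned unchanged
patterns = as_list(["run_a", "run_b"])
print(patterns)
# ['run_a', 'run_b']
```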

get_attr_if_exists

get_attr_if_exists(
    da, attr, str_exists=None, str_doesnt=None
)

Retrieve an attribute from an xarray DataArray if it exists, or return a specified value otherwise.

Parameters:

  • da (DataArray, required): The xarray DataArray object to check for the attribute.
  • attr (str, required): The name of the attribute to retrieve.
  • str_exists (str | Iterable[str] | Callable | None, default: None): The value or function to use if the attribute exists. If str: return as-is. If Iterable: concatenate the first element, existing value, and second element. If Callable: apply this function to the existing attribute value. If None: return attribute if it exists, otherwise return None.
  • str_doesnt (str | None, default: None): The value to return if the attribute doesn't exist. If None, returns None.

Returns:

  • str | None: The processed attribute value if it exists, str_doesnt if it doesn't exist, or None if str_doesnt is None and the attribute doesn't exist.

Notes

If str_exists is an Iterable with more than two elements, only the first two are used, and a warning is printed.

Examples:

Basic usage with string
import ozzy as oz
import numpy as np

# Create a sample DataArray with an attribute
da = oz.DataArray(np.random.rand(3, 3), attrs={'units': 'meters'})

result = oz.utils.get_attr_if_exists(da, 'missing_attr', 'Exists', 'Does not exist')
print(result)
# Output: Does not exist
Using an Iterable and a Callable
import ozzy as oz
import numpy as np

da = oz.DataArray(np.random.rand(3, 3), attrs={'units': 'meters'})

# Using an Iterable
result = oz.utils.get_attr_if_exists(da, 'units', ['Unit: ', ' (SI)'], 'No unit')
print(result)
# Output: Unit: meters (SI)

# Using a Callable
result = oz.utils.get_attr_if_exists(da, 'units', lambda x: f'The unit is: {x}', 'No unit found')
print(result)
# Output: The unit is: meters

# Using another Callable
result = oz.utils.get_attr_if_exists(da, 'units', lambda x: x.upper(), 'No unit')
print(result)
# Output: METERS
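
The way the type of str_exists selects the behavior can be sketched on a plain dictionary of attributes. This is a simplified stand-in, not ozzy's implementation: the name format_attr is hypothetical, and it uses warnings.warn where the documentation only says a warning is printed.

```python
import warnings
from collections.abc import Iterable


def format_attr(attrs, attr, str_exists=None, str_doesnt=None):
    """Hypothetical stand-in operating on a plain dict of attributes."""
    if attr not in attrs:
        return str_doesnt
    value = attrs[attr]
    if str_exists is None:
        return value
    if isinstance(str_exists, str):  # test str first: strings are iterable too
        return str_exists
    if callable(str_exists):
        return str_exists(value)
    if isinstance(str_exists, Iterable):
        parts = list(str_exists)
        if len(parts) > 2:
            warnings.warn("only the first two elements of str_exists are used")
        return parts[0] + value + parts[1]
    raise TypeError("unsupported type for str_exists")


attrs = {"units": "meters"}
print(format_attr(attrs, "units", ["Unit: ", " (SI)"], "No unit"))
# Unit: meters (SI)
print(format_attr(attrs, "units", str.upper))
# METERS
print(format_attr(attrs, "label", "Exists", "Does not exist"))
# Does not exist
```

Checking for str before Iterable matters in this sketch, because a string would otherwise be treated as a sequence of characters.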

get_regex_snippet

get_regex_snippet(pattern, string)

Extract the first substring of a string that matches a regular expression pattern, using re.search.

Tip

Use regex101.com to experiment with and debug regular expressions.

Parameters:

  • pattern (str, required): The regular expression pattern.
  • string (str, required): The input string.

Returns:

  • match (str): The matched substring.

Examples:

Get number from file name
import ozzy as oz
oz.utils.get_regex_snippet(r'\d+', 'field-001234.h5')
# '001234'
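
The underlying re.search call can be tried directly with the standard library. In the sketch below, first_match is a hypothetical helper; how get_regex_snippet behaves when nothing matches is not documented here, so raising ValueError is an assumption of this sketch.

```python
import re


def first_match(pattern, string):
    """Hypothetical helper: return the first substring matching `pattern`."""
    match = re.search(pattern, string)
    if match is None:
        # Assumption of this sketch: fail loudly when there is no match
        raise ValueError(f"no match for {pattern!r} in {string!r}")
    return match.group(0)


print(first_match(r"\d+", "field-001234.h5"))
# 001234

# re.search returns only the first match, scanning from the left
print(first_match(r"\d+", "run12-field-001234.h5"))
# 12
```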

get_user_methods

get_user_methods(clss)

Get a list of user-defined methods in a class.

Parameters:

  • clss (class, required): The input class.

Returns:

  • methods (list[str]): A list of user-defined method names in the class.

Examples:

Minimal class
class MyClass:
    def __init__(self):
        pass
    def my_method(self):
        pass

import ozzy as oz
oz.utils.get_user_methods(MyClass)
# ['my_method']
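
One way to obtain such a list is to inspect the class namespace and skip double-underscore names. The sketch below is an assumption about the approach, not ozzy's code; public_methods is a hypothetical name, and it only sees functions defined directly on the class, not inherited ones.

```python
import inspect


def public_methods(cls):
    """Hypothetical helper: names of plain functions defined directly on cls."""
    names = []
    for name, member in vars(cls).items():
        # Skip dunder names such as __init__ and anything that is not a function
        if inspect.isfunction(member) and not name.startswith("__"):
            names.append(name)
    return names


class MyClass:
    def __init__(self):
        pass

    def my_method(self):
        pass


print(public_methods(MyClass))
# ['my_method']
```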

path_list_to_pars

path_list_to_pars(pathlist)

Split a list of file paths into common directory, run directories, and quantities.

Parameters:

  • pathlist (list[str], required): A list of file paths.

Returns:

  • common_dir (str): The common directory shared by all file paths.
  • dirs_runs (dict[str, str]): A dictionary mapping run folder names to their absolute paths.
  • quants (list[str]): A list of unique quantities (file names) present in the input paths.

Examples:

Simple example
import os
from ozzy.utils import path_list_to_pars

pathlist = ['/path/to/run1/quantity1.txt',
            '/path/to/run1/quantity2.txt',
            '/path/to/run2/quantity1.txt']

common_dir, dirs_runs, quants = path_list_to_pars(pathlist)

print(f"Common directory: {common_dir}")
# Common directory: /path/to
print(f"Run directories: {dirs_runs}")
# Run directories: {'run1': '/path/to/run1', 'run2': '/path/to/run2'}
print(f"Quantities: {quants}")
# Quantities: ['quantity2.txt', 'quantity1.txt']
Single file path
import os
from ozzy.utils import path_list_to_pars

pathlist = ['/path/to/run1/quantity.txt']

common_dir, dirs_runs, quants = path_list_to_pars(pathlist)

print(f"Common directory: {common_dir}")
# Common directory: /path/to/run1
print(f"Run directories: {dirs_runs}")
# Run directories: {'.': '/path/to/run1'}
print(f"Quantities: {quants}")
# Quantities: ['quantity.txt']
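
The two examples above can be reproduced with standard path functions. The sketch below is only a guess at the logic, covering these simple cases; split_paths is a hypothetical name, posixpath is used so the result does not depend on the operating system, and the quantities are sorted here whereas the order returned by path_list_to_pars may differ.

```python
import posixpath


def split_paths(pathlist):
    """Hypothetical helper reproducing the outputs of the examples above."""
    # Folder containing each file, with duplicates removed
    run_paths = sorted({posixpath.dirname(p) for p in pathlist})
    common_dir = posixpath.commonpath(run_paths)
    # Run name = folder path relative to the common directory ('.' if identical)
    dirs_runs = {posixpath.relpath(d, common_dir): d for d in run_paths}
    quants = sorted({posixpath.basename(p) for p in pathlist})
    return common_dir, dirs_runs, quants


print(split_paths(["/path/to/run1/quantity1.txt",
                   "/path/to/run1/quantity2.txt",
                   "/path/to/run2/quantity1.txt"]))
# ('/path/to', {'run1': '/path/to/run1', 'run2': '/path/to/run2'}, ['quantity1.txt', 'quantity2.txt'])
print(split_paths(["/path/to/run1/quantity.txt"]))
# ('/path/to/run1', {'.': '/path/to/run1'}, ['quantity.txt'])
```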

prep_file_input

prep_file_input(files)

Prepare path input argument by expanding user paths and converting to absolute paths.

Parameters:

  • files (str | list of str, required): The input file(s).

Returns:

  • filelist (list of str): A list of absolute file paths.

Examples:

Expand user folder
import ozzy as oz
oz.utils.prep_file_input('~/example.txt')
# ['/home/user/example.txt']
oz.utils.prep_file_input(['~/file1.txt', '~/file2.txt'])
# ['/home/user/file1.txt', '/home/user/file2.txt']
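
The two steps named in the description, expanding the user folder and converting to absolute paths, correspond to os.path.expanduser and os.path.abspath in the standard library. The sketch below combines them in a hypothetical helper called normalize_paths; whether prep_file_input does anything beyond this is not covered.

```python
import os


def normalize_paths(files):
    """Hypothetical helper: always return a list of absolute paths."""
    filelist = [files] if isinstance(files, str) else list(files)
    # expanduser replaces a leading '~'; abspath resolves relative paths against the cwd
    return [os.path.abspath(os.path.expanduser(f)) for f in filelist]


paths = normalize_paths(["~/file1.txt", "data/file2.txt"])
print(all(os.path.isabs(p) for p in paths))
# True
print(len(normalize_paths("~/example.txt")))
# 1
```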

print_file_item

print_file_item(file)

Print a file name with a leading ' - '.

Parameters:

  • file (str, required): The file name to be printed.

Examples:

Example
import ozzy as oz
oz.utils.print_file_item('example.txt')
# - example.txt

recursive_search_for_file

recursive_search_for_file(fname, path=os.getcwd())

Recursively search for files with a given name or pattern in a specified directory and its subdirectories.

Parameters:

  • fname (str, required): The name or name pattern of the file to search for.
  • path (str, default: os.getcwd()): The path to the directory where the search should start. If not specified, uses the current directory via os.getcwd.

Returns:

  • list[str]: A list of paths to the files found, relative to path.

Examples:

Search for a file in the current directory
from ozzy.utils import recursive_search_for_file
files = recursive_search_for_file('example.txt')
# files = ['example.txt']
Search for many files in a subdirectory
from ozzy.utils import recursive_search_for_file
files = recursive_search_for_file('data-*.h5', '/path/to/project')
# files = ['data/data-000.h5', 'data/data-001.h5', 'analysis/data-modified.h5']
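
A recursive pattern search of this kind can be sketched with os.walk and fnmatch from the standard library. This is an illustration under assumptions rather than ozzy's implementation; walk_for is a hypothetical name, the results are sorted for reproducibility, and the printed separators are those of a POSIX system.

```python
import fnmatch
import os
import tempfile


def walk_for(fname, path):
    """Hypothetical helper: paths (relative to `path`) of files matching `fname`."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(path):
        # fnmatch.filter keeps the names that match the shell-style pattern
        for name in fnmatch.filter(filenames, fname):
            found.append(os.path.relpath(os.path.join(dirpath, name), path))
    return sorted(found)


# Build a small throwaway tree to search
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "data"))
    for parts in (("data", "data-000.h5"), ("data", "data-001.h5"), ("notes.txt",)):
        open(os.path.join(root, *parts), "w").close()
    files = walk_for("data-*.h5", root)

print(files)
# ['data/data-000.h5', 'data/data-001.h5']
```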

set_attr_if_exists

set_attr_if_exists(da, attr, str_exists, str_doesnt=None)

Set or modify an attribute of a DataArray if it exists, or set it to a fallback value if it doesn't exist or is None.

Parameters:

  • da (DataArray, required): The input DataArray.
  • attr (str, required): The name of the attribute to set or modify.
  • str_exists (str | Iterable[str] | Callable | None, required): The value or function to use if the attribute exists. If str: replace the attribute with this string. If Iterable: concatenate the first element, existing value, and second element. If Callable: apply this function to the existing attribute value. If None: do not change the attribute.
  • str_doesnt (str | None, default: None): The value to set if the attribute doesn't exist. If None, no action is taken.

Returns:

  • DataArray: The modified DataArray with updated attributes.

Notes

If str_exists is an Iterable with more than two elements, only the first two are used, and a warning is printed.

Examples:

Set an existing attribute
import ozzy as oz
import numpy as np

# Create a sample DataArray
da = oz.DataArray(np.random.rand(3, 3), attrs={'units': 'meters'})

# Set an existing attribute
da = oz.utils.set_attr_if_exists(da, 'units', 'kilometers')
print(da.attrs['units'])
# Output: kilometers
Modify an existing attribute with a function
import ozzy as oz
import numpy as np

# Create a sample DataArray
da = oz.DataArray(np.random.rand(3, 3), attrs={'description': 'Random data'})

# Modify an existing attribute with a function
da = oz.utils.set_attr_if_exists(da, 'description', lambda x: x.upper())
print(da.attrs['description'])
# Output: RANDOM DATA
Set a non-existing attribute
import ozzy as oz
import numpy as np

# Create a sample DataArray
da = oz.DataArray(np.random.rand(3, 3))

# Set a non-existing attribute
da = oz.utils.set_attr_if_exists(da, 'units', 'meters', str_doesnt='unknown')
print(da.attrs['units'])
# Output: unknown

stopwatch

stopwatch(method)

Decorator function to measure the execution time of a method.

Parameters:

  • method (callable, required): The method to be timed.

Returns:

  • timed (callable): A wrapped version of the input method that prints the execution time.

Examples:

Get execution time whenever a function is called
from ozzy.utils import stopwatch

@stopwatch
def my_function(a, b):
    return a + b

my_function(2, 3)
# -> 'my_function' took: 0:00:00.000001
# 5
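
For readers unfamiliar with timing decorators, the general pattern looks like the sketch below. It is a generic standard-library version, not ozzy's implementation: the name timed_call is hypothetical, and the choice of time.perf_counter and of the message format are assumptions.

```python
import functools
import time
from datetime import timedelta


def timed_call(method):
    """Hypothetical decorator: print how long each call to `method` takes."""

    @functools.wraps(method)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = method(*args, **kwargs)
        elapsed = timedelta(seconds=time.perf_counter() - start)
        print(f"-> '{method.__name__}' took: {elapsed}")
        return result  # the wrapped function's return value passes through

    return wrapper


@timed_call
def add(a, b):
    return a + b


total = add(2, 3)
print(total)
```

The elapsed time printed will vary from run to run, while the return value is unaffected by the wrapper.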

tex_format

tex_format(str)

Format a string for TeX by enclosing it with '$' symbols.

Parameters:

  • str (str, required): The input string.

Returns:

  • newstr (str): The TeX-formatted string.

Examples:

Example
import ozzy as oz
oz.utils.tex_format('k_p^2')
# '$k_p^2$'
oz.utils.tex_format('')
# ''

unpack_attr

unpack_attr(attr)

Unpack a NumPy array attribute, typically from HDF5 files.

This function handles different shapes and data types of NumPy arrays, particularly focusing on string (byte string) attributes. It's useful for unpacking attributes read from HDF5 files using h5py.

Parameters:

  • attr (ndarray, required): The input NumPy array to unpack.

Returns:

  • object: The unpacked attribute. For string attributes, it returns a UTF-8 decoded string. For other types, it returns either a single element (if the array has only one element) or the entire array.

Raises:

  • AssertionError: If the input is not a NumPy array.

Notes
  • For string attributes (dtype.kind == 'S'):
    • 0D arrays: returns the decoded string
    • 1D arrays: returns the first element decoded
    • 2D arrays: returns the first element if size is 1, otherwise the entire array
  • For non-string attributes:
    • If the array has only one element, returns that element
    • Otherwise, returns the entire array

Examples:

Unpacking a string attribute
import numpy as np
import ozzy.utils as utils

# Create a NumPy array with a byte string
attr = np.array(b'Hello, World!')
result = utils.unpack_attr(attr)
print(result)
# Output: Hello, World!
Unpacking a numeric attribute
import numpy as np
import ozzy.utils as utils

# Create a NumPy array with a single number
attr = np.array([42])
result = utils.unpack_attr(attr)
print(result)
# Output: 42