ozzy§

Core functions of the ozzy library.

This module contains the main entry points for working with ozzy, including functions to create new DataArray and Dataset objects, and to open data files of various types.

The open() function is the primary way to load data into ozzy, and supports a variety of file types. The open_series() function can be used to load a series of files, and open_compare() can be used to compare data across multiple file types and runs.

DataArray §

DataArray(
    *args, pic_data_type=None, data_origin=None, **kwargs
)

Create a new DataArray object with added ozzy functionality. See xarray.DataArray for more information on *args and **kwargs.

Warning

This function should be used instead of xarray.DataArray() to create a new DataArray object, since it sets attributes that enable access to ozzy-specific methods.

Parameters:

Name	Type	Description	Default
`*args` §		Positional arguments passed to xarray.DataArray.	`()`
`pic_data_type` §	`str \| None`	Type of data in the DataArray. Current options: `'grid'` (data defined on an n-dimensional grid, as a function of some coordinate(s)), or `'part'` (data defined on a particle-by-particle basis). If given, this overwrites the corresponding attribute in any data objects passed as positional arguments (*args).	`None`
`data_origin` §	`str \| None`	Type of simulation data. Current options: `'ozzy'`, `'osiris'`, or `'lcode'`.	`None`
`**kwargs` §		Keyword arguments passed to xarray.DataArray.	`{}`

Returns:

Type	Description
`DataArray`	The newly created DataArray object.

Examples:

Empty DataArray

import ozzy as oz
da = oz.DataArray()
print(da)
# <xarray.DataArray ()> Size: 8B
# array(nan)
# Attributes:
#     pic_data_type:  None
#     data_origin:    None
da.size, da.shape
# (1, ())

A DataArray cannot be truly empty, so it is initialized as a zero-dimensional array holding a single NaN value.

Dummy DataArray

import ozzy as oz
import numpy as np
da = oz.DataArray(np.random.rand(10,30), dims=['t','x'], coords={'x': np.linspace(-5,0,30)}, name='var1', pic_data_type='grid', data_origin='ozzy')
da
# <xarray.DataArray 'var1' (t: 10, x: 30)> Size: 2kB
# array([[0.64317574, 0.24791049, 0.54208619, 0.27064002, 0.65152958,
# ...
#         0.28523593, 0.76475677, 0.86068012, 0.03214018, 0.55055121]])
# Coordinates:
# * x        (x) float64 240B -5.0 -4.828 -4.655 -4.483 ... -0.3448 -0.1724 0.0
# Dimensions without coordinates: t
# Attributes:
#     pic_data_type:  grid
#     data_origin:    ozzy

Dataset §

Dataset(
    *args, pic_data_type=None, data_origin=None, **kwargs
)

Create a new Dataset object with added ozzy functionality. See xarray.Dataset for more information on *args and **kwargs.

Warning

This function should be used instead of xarray.Dataset() to create a new Dataset object, since it sets attributes that enable access to ozzy-specific methods.

Parameters:

Name	Type	Description	Default
`*args` §		Positional arguments passed to xarray.Dataset.	`()`
`pic_data_type` §	`str \| list[str] \| None`	Type of data contained in the Dataset. Current options: `'grid'` (data defined on an n-dimensional grid, as a function of some coordinate(s)), or `'part'` (data defined on a particle-by-particle basis). If given, this overwrites the corresponding attribute in any data objects passed as positional arguments (*args).	`None`
`data_origin` §	`str \| list[str] \| None`	Type of simulation data. Current options: `'ozzy'`, `'osiris'`, or `'lcode'`.	`None`
`**kwargs` §		Keyword arguments passed to xarray.Dataset.	`{}`

Returns:

Type	Description
`Dataset`	The newly created Dataset object.

Examples:

Empty Dataset

import ozzy as oz
ds = oz.Dataset()
ds
# <xarray.Dataset> Size: 0B
# Dimensions:  ()
# Data variables:
#     *empty*
# Attributes:
#     pic_data_type:  None
#     data_origin:    None

Dummy Dataset

import ozzy as oz
import numpy as np
ds = oz.Dataset({'var1': (['t','x'], np.random.rand(10,30))}, coords={'x': np.linspace(-5,0,30)}, pic_data_type='grid', data_origin='ozzy')
ds
# <xarray.Dataset> Size: 3kB
# Dimensions:  (t: 10, x: 30)
# Coordinates:
# * x        (x) float64 240B -5.0 -4.828 -4.655 -4.483 ... -0.3448 -0.1724 0.0
# Dimensions without coordinates: t
# Data variables:
#     var1     (t, x) float64 2kB 0.9172 0.3752 0.1873 ... 0.5211 0.8016 0.335
# Attributes:
#     pic_data_type:  grid
#     data_origin:    ozzy

available_backends §

available_backends()

List available backend options for reading simulation data.

Returns:

Type	Description
`list[str]`	Available backend names.

Examples:

Show available file backends

import ozzy as oz
backends = oz.available_backends()
print(backends)
# ['osiris', 'lcode', 'ozzy']
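
The returned list can be used to validate a file type before trying to open anything. The snippet below is a minimal sketch: `check_file_type` is a hypothetical helper (not part of ozzy), and the `backends` list is simply copied from the example output above rather than obtained from the library.

```python
def check_file_type(file_type, backends):
    """Return file_type if it is a known backend, otherwise raise ValueError.

    Hypothetical helper for illustration; not part of ozzy.
    """
    if file_type not in backends:
        options = ", ".join(repr(b) for b in backends)
        raise ValueError(f"Unknown file type {file_type!r}; expected one of: {options}")
    return file_type


# Assumption: this list mirrors the output of oz.available_backends() shown above
backends = ["osiris", "lcode", "ozzy"]

checked = check_file_type("lcode", backends)
print(checked)
# lcode
```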

open §

open(file_type, path, axes_lims=None)

Open a data file and return a data object (DataArray or Dataset).

Parameters:

Name	Type	Description	Default
`file_type` §	`str`	The type of data file to open. Current options: `'ozzy'`, `'osiris'`, or `'lcode'`.	required
`path` §	`str \| list[str]`	The path to the data file(s) to open. Can be a single path or a list of paths. Paths can be absolute or relative, but cannot contain wildcards or glob patterns.	required
`axes_lims` §	`dict[str, tuple[float, float]] \| None`	A dictionary specifying the limits for each axis in the data (only used for `'lcode'` data type, optionally). Keys are axis names, and values are tuples of (min, max) values.	`None`

Returns:

Type	Description
`Dataset \| DataArray`	The ozzy data object containing the data from the opened file(s).

Examples:

Read Osiris field data

import ozzy as oz
ds = oz.open('osiris', 'path/to/file/e1-000020.h5')
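
Since the path argument is described as not accepting wildcards, one possible approach is to expand a pattern into an explicit list of paths first and pass that list on. The following is only a sketch using the standard library: `expand_paths` is a hypothetical helper, and the temporary files exist purely so the snippet runs on its own.

```python
import glob
import os
import tempfile


def expand_paths(pattern):
    """Return the sorted list of existing files matching a glob pattern.

    Hypothetical helper for illustration; not part of ozzy.
    """
    return sorted(glob.glob(pattern))


# Create a few empty placeholder files so that the example is self-contained
with tempfile.TemporaryDirectory() as tmp:
    for step in (20, 21, 22):
        open(os.path.join(tmp, f"e1-{step:06d}.h5"), "w").close()
    paths = expand_paths(os.path.join(tmp, "e1-*.h5"))
    names = [os.path.basename(p) for p in paths]

print(names)
# ['e1-000020.h5', 'e1-000021.h5', 'e1-000022.h5']
```

With real data, the resulting list of paths would be passed as the second argument of oz.open().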

Read LCODE field data

LCODE simulation files do not contain any axis information, so the simulation window size has to be supplied through axes_lims if the axis coordinates are to be defined. The argument itself is optional.

import ozzy as oz
ds = oz.open('lcode', 'path/to/file/ez02500.swp', axes_lims = {'x1': (-100,0.0), 'x2': (0.0, 6.0)})
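
To make the role of axes_lims more concrete, the sketch below turns a (min, max) pair into evenly spaced coordinates. This is an assumption for illustration only: `axis_from_limits` is a hypothetical helper, the numbers of points are arbitrary, and ozzy may construct its axes differently (for example at cell centres rather than including both end points).

```python
def axis_from_limits(limits, npoints):
    """Return npoints evenly spaced values from limits[0] to limits[1], inclusive.

    Hypothetical helper for illustration; not part of ozzy.
    """
    if npoints < 2:
        raise ValueError("npoints must be at least 2")
    lower, upper = limits
    step = (upper - lower) / (npoints - 1)
    return [lower + i * step for i in range(npoints)]


# Same dictionary layout as in the example above: axis name -> (min, max)
axes_lims = {"x1": (-100, 0.0), "x2": (0.0, 6.0)}

x1 = axis_from_limits(axes_lims["x1"], 5)
x2 = axis_from_limits(axes_lims["x2"], 4)
print(x1)
# [-100.0, -75.0, -50.0, -25.0, 0.0]
print(x2)
# [0.0, 2.0, 4.0, 6.0]
```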

open_compare §

open_compare(
    file_types,
    path=os.getcwd(),
    runs="*",
    quants="*",
    axes_lims=None,
)

Open and compare data files of different types and from different runs.

Parameters:

Name	Type	Description	Default
`file_types` §	`str \| list[str]`	The type(s) of data files to open. Current options are: `'ozzy'`, `'osiris'`, or `'lcode'`.	required
`path` §	`str`	The path to the directory containing the run folders. Default is the current working directory.	`getcwd()`
`runs` §	`str \| list[str]`	A string or glob pattern to match the run folder names. Default is '*' to match all folders.	`'*'`
`quants` §	`str \| list[str]`	A string or glob pattern to match the quantity names. Default is '*' to match all quantities.	`'*'`
`axes_lims` §	`dict[str, tuple[float, float]] \| None`	A dictionary specifying the limits for each axis in the data (only used for `'lcode'` data type, optionally). Keys are axis names, and values are tuples of (min, max) values.	`None`

Returns:

Type	Description
`DataFrame`	A DataFrame containing the data objects for each run and quantity, with runs as rows and quantities as columns.

Examples:

Opening files across different folders

Let's say we have the following directory:

.
└── parameter_scans/
    ├── run_a/
    │   ├── e1-000000.h5
    │   ├── e1-000001.h5
    │   ├── e1-000002.h5
    │   ├── ...
    │   └── e1-000100.h5
    ├── run_b/
    │   ├── e1-000000.h5
    │   ├── e1-000001.h5
    │   ├── e1-000002.h5
    │   ├── ...
    │   └── e1-000100.h5
    └── test_run/
        └── ...

We want to compare the simulation results for the longitudinal field from two different simulations, run_a and run_b.

import ozzy as oz
df = oz.open_compare('osiris', path='parameter_scans', runs='run_*', quants='e1')
df

This function returns a pandas.DataFrame. Each dataset can be accessed with a standard pandas lookup method such as .at/.iat or .loc/.iloc:

ds = df.at['run_b', 'e1']
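
The pattern 'run_*' used above selects run_a and run_b but leaves out test_run. The sketch below reproduces that selection with shell-style matching from the standard library. It assumes that ozzy matches run folder names in a comparable way; `match_runs` is a hypothetical helper and not part of the library.

```python
from fnmatch import fnmatch


def match_runs(folders, patterns):
    """Return the folder names that match at least one shell-style pattern.

    Hypothetical helper for illustration; not part of ozzy.
    """
    if isinstance(patterns, str):
        patterns = [patterns]
    return [name for name in folders if any(fnmatch(name, p) for p in patterns)]


# Folder names taken from the directory tree shown above
folders = ["run_a", "run_b", "test_run"]

selected = match_runs(folders, "run_*")
print(selected)
# ['run_a', 'run_b']
```

The pattern has to match the whole folder name, which is why test_run is excluded even though it contains "run".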

Opening files with two different backends

Let's say we have the following directory:

/MySimulations/
├── OSIRIS/
│   └── my_sim_1/
│       └── MS/
│           └── DENSITY/
│               └── electrons/
│                   └── charge/
│                       ├── charge-electrons-000000.h5
│                       ├── charge-electrons-000001.h5
│                       ├── charge-electrons-000002.h5
│                       └── ...
└── LCODE/
    └── my_sim_2/
        ├── ez00200.swp
        ├── ez00400.swp
        ├── ez00600.swp
        └── ...

We can read two quantities produced by two different simulation codes:

import ozzy as oz
df = oz.open_compare(
    ["osiris", "lcode"],
    path='/MySimulations',
    runs=["OSIRIS/my_sim_1", "LCODE/my_sim_2"],
    quants=["charge", "ez"],
)
# ...
print(df)
#                   charge-electrons    ez
# OSIRIS/my_sim_1           [charge]    []
# LCODE/my_sim_2                  []  [ez]

open_series §

open_series(file_type, files, axes_lims=None, nfiles=None)

Open a series of data files and return a data object (DataArray or Dataset).

Parameters:

Name	Type	Description	Default
`file_type` §	`str`	The type of data files to open (currently: `'ozzy'`, `'osiris'`, or `'lcode'`).	required
`files` §	`str \| list`	The path(s) to the data file(s) to open. Can be a single path or a list of paths. Paths can be absolute or relative, but cannot contain wildcards or glob patterns.	required
`axes_lims` §	`dict[str, tuple[float, float]] \| None`	A dictionary specifying the limits for each axis in the data (only used for `'lcode'` data type, optionally). Keys are axis names, and values are tuples of (min, max) values.	`None`
`nfiles` §	`int \| None`	The maximum number of files to open. If not provided, all files will be opened.	`None`

Returns:

Type	Description
`DataArray \| Dataset`	The ozzy data object containing the data from the opened file(s).

Examples:

Open time series of data

Let's say we are located in the following directory, which contains a time series of ozzy data in HDF5 format:

.
└── my_data/
    ├── Ez_0001.h5
    ├── Ez_0002.h5
    ├── Ez_0003.h5
    ├── ...
    └── Ez_0050.h5

We want to open only the first three files.

import ozzy as oz
ds = oz.open_series('ozzy', 'my_data/Ez_*.h5', nfiles=3)

The three files have been put together in a single dataset with a new time dimension.
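
Because the files argument also accepts a list of paths, the same three files could be named explicitly. The sketch below builds such a list for the directory shown above. `series_paths` is a hypothetical helper, and the four-digit zero padding is inferred from the file names in the tree.

```python
def series_paths(folder, prefix, first, last, digits=4, ext=".h5"):
    """Build explicit paths such as my_data/Ez_0001.h5 for a numbered file series.

    Hypothetical helper for illustration; not part of ozzy.
    """
    return [f"{folder}/{prefix}_{i:0{digits}d}{ext}" for i in range(first, last + 1)]


paths = series_paths("my_data", "Ez", 1, 3)
print(paths)
# ['my_data/Ez_0001.h5', 'my_data/Ez_0002.h5', 'my_data/Ez_0003.h5']
```

The resulting list would then be passed as the files argument of oz.open_series().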
