// N5Factory can make N5Readers and N5Writers
var factory = new N5Factory();
// trying to open a reader for a container that does not yet exist will throw an error
// var n5Reader = factory.openReader("my-container.n5");
// creating a writer creates a container at the given location
// if it does not already exist
var n5Writer = factory.openWriter("my-container.n5");
// now we can make a reader
var n5Reader = factory.openReader("my-container.n5");
// test if the container exists
n5Reader.exists(""); // true
// "" and "/" both refer to the root of the container
n5Reader.exists("/"); // true
This tutorial for Java developers covers the most basic functionality of the N5 API for storing large, chunked n-dimensional image data and structured metadata. The N5 API and documentation refer to n-dimensional images as “datasets”, terminology inherited from HDF5, and this tutorial does the same. If you are used to working with Python and NumPy, an n-dimensional image or dataset is what you know as an ndarray. We will learn about:
- creating readers and writers
- modifying and inspecting the hierarchy (“folder structure”)
- saving and loading datasets
- saving and loading metadata
Readers and writers
N5Readers and N5Writers form the basis of the N5 API and allow you to read and write data, respectively. We generally recommend using an N5Factory to create readers and writers, as in the snippet at the top of this tutorial.
The N5 API gives you access to a number of different storage formats: HDF5, Zarr, and N5’s own format. N5Factory’s convenience methods try to infer the storage format from the extension of the path you provide:
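To make the idea of extension-based inference concrete, here is a minimal sketch in plain Java. It is not N5Factory's actual logic: the class StorageFormatSketch, the method inferFromExtension, and the extensions ".h5", ".hdf5", and ".zarr" are assumptions chosen for illustration (only ".n5" appears in this tutorial).

```java
import java.util.Locale;
import java.util.Optional;

public class StorageFormatSketch {

    // The three storage formats named in this tutorial.
    public enum Format { HDF5, ZARR, N5 }

    // Hypothetical helper: guesses a format from the path's extension.
    // The extension list is an assumption; the real factory may behave differently.
    public static Optional<Format> inferFromExtension(String path) {
        String p = path.toLowerCase(Locale.ROOT);
        // Ignore trailing slashes so "my-container.n5/" is treated like "my-container.n5".
        while (p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        if (p.endsWith(".h5") || p.endsWith(".hdf5")) {
            return Optional.of(Format.HDF5);
        }
        if (p.endsWith(".zarr")) {
            return Optional.of(Format.ZARR);
        }
        if (p.endsWith(".n5")) {
            return Optional.of(Format.N5);
        }
        // Unknown extension: this sketch makes no guess.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(inferFromExtension("my-container.n5"));
        System.out.println(inferFromExtension("my-container.zarr"));
        System.out.println(inferFromExtension("my-container.h5"));
        System.out.println(inferFromExtension("my-container"));
    }
}
```

Returning an empty Optional for unknown extensions keeps the guess explicit, so a caller has to decide what to do when the path gives no hint.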
In fact, it is possible to read with N5Writers since every N5Writer is also an N5Reader, so from now on we’ll just be using the n5Writer.
We use the N5 storage format for the rest of the tutorial, but everything will work just as well with an HDF5 file or a Zarr container.
Groups
N5 containers form hierarchies of groups - think “nested folders on your file system.” It’s easy to create groups and test if they exist:
The list method lists groups that are children of the given group:
and deepList recursively lists every descendant of the given group:
Notice that these methods only give information about what groups are present and do not provide information about metadata or datasets.
Some storage and access systems (such as AWS S3) separate permissions for reading and listing, meaning it may be possible to read data but not to list it.
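Following the “nested folders” analogy, the difference between list and deepList can be sketched with plain directories and java.nio.file. This is only an analogy and does not use the N5 API: the class FolderListingSketch, its methods, and the folder names are hypothetical, with children playing the role of list and descendants the role of deepList.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FolderListingSketch {

    // Names of the immediate child folders of dir (the analogue of list).
    public static List<String> children(Path dir) {
        try (Stream<Path> s = Files.list(dir)) {
            return s.filter(Files::isDirectory)
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Relative paths of every folder below dir (the analogue of deepList).
    public static List<String> descendants(Path dir) {
        try (Stream<Path> s = Files.walk(dir)) {
            return s.filter(Files::isDirectory)
                    .filter(p -> !p.equals(dir)) // skip the starting folder itself
                    .map(p -> dir.relativize(p).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small example hierarchy in a temporary directory.
    public static Path makeExample() {
        try {
            Path root = Files.createTempDirectory("container-sketch");
            Files.createDirectories(root.resolve("foo").resolve("bar"));
            Files.createDirectories(root.resolve("lorum").resolve("ipsum").resolve("dolor"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = makeExample();
        System.out.println(children(root));    // [foo, lorum]
        System.out.println(descendants(root)); // [foo, foo/bar, lorum, lorum/ipsum, lorum/ipsum/dolor]
    }
}
```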
Datasets
N5 stores datasets (n-dimensional arrays) in particular groups in the hierarchy.
Datasets must be terminal (leaf) nodes in the container hierarchy, i.e. a dataset cannot contain another group or dataset. (Is this strictly true? May be confusing with names like multiscale “datasets”)
We recommend using code from n5-ij or n5-imglib2 to write datasets. The examples in this post will use the latter.
The N5Utils class in n5-imglib2 has many useful methods, but in this post, we’ll cover simple methods for reading and writing. First, N5Utils.save writes a dataset and required metadata to the container at a group that you specify. The group will be created if it does not already exist. The parameters will be discussed in more detail below.
You can write in parallel by passing an ExecutorService to the variant of N5Utils.save that accepts one.
Reading the dataset from the container is also easy with N5Utils.open:
This save method does NOT perform any checks before writing and will overwrite data that already exists at the specified location. If data might already be present at that location in the container, check first and take appropriate action to avoid data loss or corruption.
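One way to “check and take appropriate action” is a small guard that runs before every save. The sketch below is an assumption about how you might structure such a check, not part of the N5 API: OverwriteGuardSketch and requireAbsent are hypothetical names, and the Predicate stands in for an existence check such as n5Writer::exists, so the example runs without any N5 dependency.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

public class OverwriteGuardSketch {

    // Throws if something already exists at groupPath.
    // `exists` is a stand-in for a check like n5Writer::exists.
    public static void requireAbsent(Predicate<String> exists, String groupPath) {
        if (exists.test(groupPath)) {
            throw new IllegalStateException(
                    "Refusing to overwrite existing group: " + groupPath);
        }
    }

    public static void main(String[] args) {
        // A plain set of paths simulates the container's existing groups.
        Set<String> existing = new HashSet<>(Arrays.asList("data"));

        // Nothing is stored at "new-data", so the guard passes silently.
        requireAbsent(existing::contains, "new-data");

        try {
            // "data" already exists, so the guard throws instead of letting a save proceed.
            requireAbsent(existing::contains, "data");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a real container, the call would look like requireAbsent(n5Writer::exists, "data") placed immediately before the save. Throwing is only one possible policy; skipping the write or choosing a new group name are equally valid.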
This example shows that data can be overwritten:
Parameter details
groupPath
is the location inside the container that will store the dataset. You can store a dataset at the root of a container by specifying "" or "/" as the groupPath. In this case, the container will only be able to store one dataset (see the warning above).
blockSize
is a very important parameter. HDF5, N5, and Zarr all break up the datasets they store into equally sized blocks or “chunks”. The block size parameter specifies the size of these blocks.
For the example above, we stored an image of size 64 x 64 using blocks sized 32 x 32. As a result, N5 uses four blocks to store the entire image:
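The four blocks can be enumerated with a little arithmetic. The following is a minimal sketch in plain Java rather than N5 code; BlockGridSketch and gridPositions are hypothetical names, and the sketch only handles the 2D case used in this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockGridSketch {

    // Grid position of every block needed to cover a 2D image.
    // Ceiling division makes sure partial blocks at the edges are counted.
    public static List<long[]> gridPositions(long[] dimensions, int[] blockSize) {
        long nx = (dimensions[0] + blockSize[0] - 1) / blockSize[0];
        long ny = (dimensions[1] + blockSize[1] - 1) / blockSize[1];
        List<long[]> positions = new ArrayList<>();
        for (long y = 0; y < ny; y++) {
            for (long x = 0; x < nx; x++) {
                positions.add(new long[] {x, y});
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        long[] dimensions = {64, 64};
        int[] blockSize = {32, 32};
        for (long[] p : gridPositions(dimensions, blockSize)) {
            // Pixel coordinate of the first element stored in this block.
            long[] offset = {p[0] * blockSize[0], p[1] * blockSize[1]};
            System.out.println("block " + Arrays.toString(p)
                    + " starts at pixel " + Arrays.toString(offset));
        }
        // block [0, 0] starts at pixel [0, 0]
        // block [1, 0] starts at pixel [32, 0]
        // block [0, 1] starts at pixel [0, 32]
        // block [1, 1] starts at pixel [32, 32]
    }
}
```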
Quiz: How many blocks would there be if the block size was 64 x 8?
There would be eight blocks.
One block covers the first dimension, but it takes 8 blocks to cover the second dimension (\(8 \times 8 = 64\)). Also demonstrated by the code below:
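The block count is the product, over all dimensions, of the image size divided by the block size, rounded up. A minimal sketch of that calculation in plain Java follows; BlockCountSketch and numBlocks are hypothetical names and not part of the N5 API.

```java
public class BlockCountSketch {

    // Number of blocks needed to cover an image of the given dimensions.
    public static long numBlocks(long[] dimensions, int[] blockSize) {
        long n = 1;
        for (int d = 0; d < dimensions.length; d++) {
            // Ceiling division: a partially filled block still counts as a block.
            n *= (dimensions[d] + blockSize[d] - 1) / blockSize[d];
        }
        return n;
    }

    public static void main(String[] args) {
        long[] dimensions = {64, 64};
        System.out.println(numBlocks(dimensions, new int[] {64, 8}));  // 8
        System.out.println(numBlocks(dimensions, new int[] {32, 32})); // 4
        System.out.println(numBlocks(dimensions, new int[] {64, 64})); // 1
    }
}
```

Because of the rounding up, any block size at least as large as the image in every dimension gives a count of one.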
N5 lets you store your image in a single file if you want - just provide a block size that is equal to or larger than the image size.
compression
Each block is compressed independently, using the specified compression. Use RawCompression to store blocks without compression.
Notice that blocks were previously ~1700-2000 bytes and are now ~4100 without compression.
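The uncompressed size is easy to sanity-check. Assuming a 4-byte element type (an assumption, though one consistent with the numbers above), a 32 x 32 block holds 1024 elements, or 4096 bytes of payload, and a small header could account for the remainder of the ~4100 bytes. The sketch below uses plain java.util.zip rather than N5's own compression classes, and BlockCompressionSketch is a hypothetical name; it only shows the general effect of compressing one block independently.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.zip.GZIPOutputStream;

public class BlockCompressionSketch {

    // Serializes a block of 32-bit values to raw bytes (4 bytes per element).
    public static byte[] toBytes(int[] block) {
        ByteBuffer buffer = ByteBuffer.allocate(block.length * Integer.BYTES);
        for (int value : block) {
            buffer.putInt(value);
        }
        return buffer.array();
    }

    // Compresses one block on its own, as each block is compressed independently.
    public static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // A 32 x 32 block filled with a horizontal gradient (made-up example data).
    public static int[] exampleBlock() {
        int[] block = new int[32 * 32];
        for (int i = 0; i < block.length; i++) {
            block[i] = i % 32;
        }
        return block;
    }

    public static void main(String[] args) {
        byte[] raw = toBytes(exampleBlock());
        byte[] compressed = gzip(raw);
        System.out.println("raw bytes:     " + raw.length); // 4096
        System.out.println("gzipped bytes: " + compressed.length);
    }
}
```

How much smaller the compressed block is depends entirely on the data: smooth or repetitive images compress well, while noisy images may barely shrink.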
The available compression options at the time of this writing are:
Metadata
N5 can also store rich structured metadata in addition to array data. This tutorial will discuss basic, low-level metadata operations. Advanced operations and metadata standards may be described in a future tutorial.
Basics
N5Writers have a setAttribute method for writing metadata to the storage backend. It takes three arguments:
<T> void setAttribute(String groupPath, String attributePath, T attribute)
- groupPath: the group in which to store this metadata
- attributePath: the name of this attribute
- attribute: the metadata attribute to be stored. Can be an arbitrary type (denoted T).
There are differences between an attribute “name” and an attribute “path”, but attribute “paths” are an advanced topic and will be covered elsewhere.
Similarly, N5Readers have a getAttribute method:
<T> T getAttribute(String groupPath, String attributePath, Class<T> clazz)
The last argument (Class<T>) lets you specify the type that getAttribute should return. An N5Exception will be thrown if the requested type cannot be created from the requested attribute. If an attribute does not exist, null will be returned (see the last example of this section). Consider these examples:
Sometimes it is possible to interpret an attribute as multiple different types:
Rich metadata
It is possible to save attributes of arbitrary types, enabling you to structure your metadata into classes that are easy to save and load directly. For example, if we define a metadata class FunWithMetadata:
then make an instance and save it:
To retrieve all the metadata in a group as JSON:
Removing metadata
You can remove attributes by their name as well. To return the element that was removed, just provide the class for that element (this mirrors the remove method for Lists in Java).
Working with Dataset Metadata
Metadata that describes datasets can be read and written in the same way as any other metadata. However, there are special DatasetAttributes methods for working with dataset metadata safely. N5Reader.getDatasetAttributes and N5Writer.setDatasetAttributes ensure that the metadata is always a valid representation of dataset metadata. DatasetAttributes should, however, only be set when the dataset is initially saved. This ensures that the required metadata is tightly coupled with the data. For example, dataset metadata should be set through the N5Writer.createDataset methods (or indirectly through the N5Utils.save methods mentioned above).
The attributes that N5 uses to read datasets can be set with setAttribute, and modifying them could corrupt your data. Do not manually set these attributes unless you absolutely know what you’re doing!
- dimensions
- blockSize
- dataType
- compression
The attributes that describe datasets are also accessible using getAttribute. Try running:
n5Writer.getAttribute("data", "dimensions", long[].class);
though using getDatasetAttributes().getDimensions() is generally recommended.