HDF5Folder

class Stoner.HDF5.HDF5Folder(*args, **kargs)[source]

Bases: HDF5FolderMixin, DataFolder

Just enforces the loader attribute to be an HDF5File.

Attributes Summary

basenames

Return a list of just the filename parts of the objectFolder.

clone

Clone just does a deepcopy as a property for compatibility with Stoner.Core.DataFile.

debug

Just read the local debug value.

defaults

Build a single list of all of our defaults by iterating over the __mro__, caching the result.

depth

Give the maximum number of levels of group below the current objectFolder.

directory

Just alias directory to root now.

each

Return a Stoner.folders.each.item proxy object.

files

Return an iterator of potentially unloaded named objects.

groups

Subfolders are held in an ordered dictionary of groups.

instance

Return a default instance of the type of object in the folder.

is_empty

Return True if the folder is empty.

key

Override the parent class key to use the directory attribute.

layout

Return a tuple that describes the number of files and groups in the folder.

loaded

Iterate only over those members of the folder in memory.

loader

Return a callable that will load the files on demand.

ls

List just the names of the objects in the folder.

lsgrp

Return a list of the groups as a generator.

metadata

Return a Stoner.folders.metadata.MetadataProxy object.

mindepth

Give the minimum number of levels of group below the current objectFolder.

not_empty

Iterate over the objectFolder, checking whether the loaded metadataObject objects have any data.

not_loaded

Return an array of True/False for whether we've loaded a metadataObject yet.

objects

Return the objects in the folder, which are stored in a regexpDict.

pattern

Provide support for getting the pattern attribute.

root

Return the real folder root.

setas

Return the proxy for the setas attribute for each object in the folder.

shape

Return a data structure that is characteristic of the objectFolder's shape.

trunkdepth

Return the number of levels of group before a group with files is found.

type

Return the (sub)class of the Stoner.Core.metadataObject instances.

Methods Summary

add_group(key)

Add a new group to the current baseFolder with the given key.

all()

Iterate over all the files in the Folder and all its sub-Folders recursively.

append(value)

Append an item to the folder object.

clear()

Clear the subgroups.

close()

Close the current HDF5 file.

compress([base, key, keep_terminal])

Compresses all empty groups from the root up until the first non-empty group is located.

concatenate([sort, reverse])

Concatenates all the files in an objectFolder into a single metadataObject-like object.

count(value)

Provide a count method like a sequence.

extend(values)

S.extend(iterable) -- extend sequence by appending elements from the iterable

extract(*metadata, **kargs)

Extract metadata from each of the files in the terminal group.

fetch()

Preload the contents of the DiskBasedFolderMixin.

file(name, value[, create, pathsplit])

Recursively add groups in order to put the named value into a virtual tree of baseFolder.

filter([filter, invert, copy, recurse, prune])

Filter the current set of files by some criterion.

filterout(filter[, copy, recurse, prune])

Synonym for self.filter(filter,invert=True).

flatten([depth])

Compresses all the groups and sub-groups into a single flat file list.

gather([xcol, ycol])

Collect x and y columns from the subfiles in the final group in the tree.

get(name[, default])

Return either a sub-group or named object from this folder.

getlist([recursive, directory, flatten])

Read the HDF5 file to construct a list of HDF5File objects.

group(key)

Sort Files into a series of objectFolders according to the value of the key.

index(value[, start, end])

Provide an index method like a sequence.

insert(index, value)

Implement the insert method with the option to append as well.

items()

Return the key, value pairs for the subgroups of this folder.

keep_latest()

Filter out earlier revisions of files with the same name.

keys()

Return the keys used to access the sub-groups of this folder.

make_name([value])

Construct a name from the value object if possible.

on_load_process(tmp)

Carry out processing on a newly loaded file to set means and extra metadata.

pop([name, default])

Return and remove either a subgroup or named object from this folder.

popitem()

Return the most recent subgroup from this folder.

prune([name])

Remove any empty groups from the objectFolder (and subgroups).

remove(value)

S.remove(value) -- remove first occurrence of value.

reverse()

S.reverse() -- reverse IN PLACE

save([root])

Save a load of files to a single HDF5 file, creating groups as it goes.

select(*args, **kargs)

Select a subset of the objects in the folder based on flexible search criteria on the metadata.

setdefault(k[, d])

Return or set a subgroup or named object.

slice_metadata(key[, output])

Return an array of the metadata values for each item/file in the top level group.

sort([key, reverse, recurse])

Sort the files by some key.

unflatten()

Take the file list and unflatten it according to the file paths.

unload([name])

Remove the instance from memory without losing the name in the Folder.

update(other)

Update this folder with a dictionary or another folder.

values()

Return the sub-groups of this folder.

walk_groups(walker, **kargs)

Walk through a hierarchy of groups and calls walker for each file.

zip_groups(groups)

Return a list of tuples of metadataObjects drawn from the specified groups.

Attributes Documentation

basenames

Return a list of just the filename parts of the objectFolder.

clone

Clone just does a deepcopy as a property for compatibility with Stoner.Core.DataFile.

debug

Just read the local debug value.

defaults

Build a single list of all of our defaults by iterating over the __mro__, caching the result.

depth

Give the maximum number of levels of group below the current objectFolder.

directory

Just alias directory to root now.

each

Return a Stoner.folders.each.item proxy object.

This is for calling attributes of the member type of the folder.

files

Return an iterator of potentially unloaded named objects.

groups

Subfolders are held in an ordered dictionary of groups.

instance

Return a default instance of the type of object in the folder.

is_empty

Return True if the folder is empty.

key

Override the parent class key to use the directory attribute.

layout

Return a tuple that describes the number of files and groups in the folder.

loaded

Iterate only over those members of the folder in memory.

loader

Return a callable that will load the files on demand.

ls

List just the names of the objects in the folder.

lsgrp

Return a list of the groups as a generator.

metadata

Return a Stoner.folders.metadata.MetadataProxy object.

This allows for operations on combined metadata.

mindepth

Give the minimum number of levels of group below the current objectFolder.

not_empty

Iterate over the objectFolder, checking whether the loaded metadataObject objects have any data.

Returns the next non-empty DataFile member of the objectFolder.

Note

not_empty will also silently skip over any cases where loading the metadataObject object raises an exception.

not_loaded

Return an array of True/False for whether we’ve loaded a metadataObject yet.

objects

Return the objects in the folder, which are stored in a regexpDict.

pattern

Provide support for getting the pattern attribute.

root

Return the real folder root.

setas

Return the proxy for the setas attribute for each object in the folder.

shape

Return a data structure that is characteristic of the objectFolder’s shape.

trunkdepth

Return the number of levels of group before a group with files is found.

type

Return the (sub)class of the Stoner.Core.metadataObject instances.

Methods Documentation

add_group(key)

Add a new group to the current baseFolder with the given key.

Parameters:

key (string) – A hashable value to be used as the dictionary key in the groups dictionary

Returns:

A copy of the objectFolder

Note

If key already exists in the groups dictionary then no action is taken.

Todo

Propagate any extra attributes into the groups.

all()

Iterate over all the files in the Folder and all its sub-Folders recursively.

Yields:

(path/filename,file)

append(value)

Append an item to the folder object.

clear()

Clear the subgroups.

close()

Close the current HDF5 file.

compress(base=None, key='.', keep_terminal=False)

Compresses all empty groups from the root up until the first non-empty group is located.

Returns:

A copy of the now flattened DataFolder

concatenate(sort=None, reverse=False)

Concatenates all the files in an objectFolder into a single metadataObject-like object.

Keyword Arguments:
  • sort (column index, None or bool, or callable function) – Sort the resultant metadataObject by this column (if a column index), or by the x column if None or True, or not at all if False. sort is passed directly to the eponymous method as the order parameter.

  • reverse (bool) – Reverse the order of the sort (defaults to False)

Returns:

The current objectFolder with only one metadataObject item containing all the data.

count(value)

Provide a count method like a sequence.

Parameters:

value (str, regexp, or Stoner.Core.metadataObject) – The thing to count matches for.

Returns:

(int) – The number of matching metadataObject instances.

Notes

If value is a string, then matching is based on either exact matches of the name or, if it includes a * or ?, on a globbing match. value may also be a regular expression, in which case matches are made against the name of the metadataObject. Finally, if value is a metadataObject, then it is matched with an equality test.
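The matching rules in the note above can be sketched in plain Python. This is an illustration only and not the Stoner implementation: the helper name matches is hypothetical, and plain strings stand in for the names of metadataObject instances.

```python
import fnmatch
import re


def matches(name, value):
    """Return True if name matches value under the rules described above."""
    if isinstance(value, re.Pattern):
        # Regular expression: test against the object's name.
        return value.search(name) is not None
    if isinstance(value, str) and ("*" in value or "?" in value):
        # Glob-style match when wildcards are present.
        return fnmatch.fnmatchcase(name, value)
    # Plain strings (and anything else) fall back to an equality test.
    return name == value


names = ["run_1.txt", "run_2.txt", "calib.txt"]
glob_count = sum(matches(n, "run_*.txt") for n in names)
regex_count = sum(matches(n, re.compile(r"^calib")) for n in names)
exact_count = sum(matches(n, "run_1.txt") for n in names)
```

Counting is then just a sum over the per-name tests, which is why the same rules can serve both count and index.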

extend(values)

S.extend(iterable) – extend sequence by appending elements from the iterable

extract(*metadata, **kargs)

Extract metadata from each of the files in the terminal group.

Walks through the terminal group and gets the listed metadata from each file and constructs a replacement metadataObject.

Parameters:

*metadata (str) – One or more metadata indices that should be used to construct the new data file.

Keyword Arguments:

copy (bool) – Take a copy of the DataFolder before starting the extract (default is True)

Returns:

An instance of a metadataObject like object.

fetch()

Preload the contents of the DiskBasedFolderMixin.

With multiprocess enabled this will parallel load the contents of the folder into memory.

file(name, value, create=True, pathsplit=None)

Recursively add groups in order to put the named value into a virtual tree of baseFolder.

Parameters:
  • name (str) – A name (which may be a nested path) of the object to file.

  • value (metadataObject) – The object to be filed - it should be an instance of baseFolder.type.

Keyword Arguments:
  • create (bool) – Whether to create missing groups or to raise an error (default True to create groups).

  • pathsplit (str or None) – Character to use to split the name into path components. Defaults to using os.path.split()

Returns:

(baseFolder) – A reference to the group where the value was eventually filed
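As a rough sketch of the behaviour described for file(), the following plain-Python function files a value under a nested path, creating groups on the way. It is not the Stoner code: the function name file_into, the nested-dictionary layout and the explicit "/" separator are all assumptions made for illustration.

```python
def file_into(tree, name, value, create=True, pathsplit="/"):
    """File value under a nested path, creating missing groups if allowed."""
    *path, leaf = name.split(pathsplit)
    node = tree
    for part in path:
        if part not in node["groups"]:
            if not create:
                raise KeyError(f"No group named {part!r}")
            node["groups"][part] = {"files": {}, "groups": {}}
        node = node["groups"][part]
    node["files"][leaf] = value
    # Return the group where the value was eventually filed.
    return node


root = {"files": {}, "groups": {}}
target = file_into(root, "sample1/4K/iv.txt", "data")
```

Returning the final group rather than the root mirrors the documented return value and lets a caller keep working in the group that was just populated.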

filter(filter=None, invert=False, copy=False, recurse=False, prune=True)

Filter the current set of files by some criterion.

Parameters:

filter (string or callable) – Either a string filename pattern or a callable function which takes a single parameter x which is an instance of a metadataObject and evaluates True or False

Keyword Arguments:
  • invert (bool) – Invert the sense of the filter (done by doing an XOR with the filter condition)

  • copy (bool) – If set True then the DataFolder is copied before being filtered. Default is False - work in place.

  • recurse (bool) – If True, apply the filter recursively to all groups. Default False

  • prune (bool) – If True, execute a baseFolder.prune() to remove empty groups after filtering

Returns:

The current objectFolder object
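The interaction between the filter argument and invert can be sketched as below. This is a stand-alone illustration, not the library code: the name filter_files is hypothetical, files are modelled as dictionaries, and treating the string form as a shell-style wildcard pattern is an assumption.

```python
import fnmatch


def filter_files(files, criterion=None, invert=False):
    """Return the files that pass criterion, or fail it if invert is True."""
    if criterion is None:
        return list(files)
    if callable(criterion):
        test = criterion
    else:
        # Assumption: a string is a filename pattern with shell-style wildcards.
        def test(x):
            return fnmatch.fnmatchcase(x["name"], criterion)
    # bool(...) != invert is an XOR of the filter result with invert.
    return [f for f in files if bool(test(f)) != invert]


files = [
    {"name": "iv_4K.txt", "temp": 4.0},
    {"name": "iv_300K.txt", "temp": 300.0},
    {"name": "rt_sweep.txt", "temp": 150.0},
]
iv_names = [f["name"] for f in filter_files(files, "iv_*.txt")]
warm_names = [f["name"] for f in filter_files(files, lambda x: x["temp"] < 10, invert=True)]
```

With invert=True the same call behaves like filterout, which is why the two methods share their remaining keyword arguments.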

filterout(filter, copy=False, recurse=False, prune=True)

Synonym for self.filter(filter,invert=True).

Parameters:

filter (string or callable) – Either a string filename pattern or a callable function which takes a single parameter x which is an instance of a metadataObject and evaluates True or False

Keyword Arguments:
  • copy (bool) – If set True then the DataFolder is copied before being filtered. Default is False - work in place.

  • recurse (bool) – If True, apply the filter recursively to all groups. Default False

  • prune (bool) – If True, execute a baseFolder.prune() to remove empty groups after filtering

Returns:

The current objectFolder object with the files in the file list filtered.

flatten(depth=None)

Compresses all the groups and sub-groups into a single flat file list.

Keyword Arguments:
  • depth – Only flatten sub-groups that are within depth of the deepest level.

Returns:

A copy of the now flattened DataFolder
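A minimal sketch of what flattening a group hierarchy means, using nested dictionaries in place of folder objects. The layout, the "/" separator in the resulting names and the omission of the depth argument are assumptions for illustration; this is not how the library stores its data.

```python
def flatten(folder, prefix=""):
    """Return a flat {path: file} mapping for a nested folder."""
    flat = {prefix + name: f for name, f in folder["files"].items()}
    for group_name, sub in folder["groups"].items():
        # Prefix each file with the names of the groups it came from.
        flat.update(flatten(sub, prefix + group_name + "/"))
    return flat


tree = {
    "files": {"top.txt": 1},
    "groups": {
        "4K": {
            "files": {"a.txt": 2},
            "groups": {"up": {"files": {"b.txt": 3}, "groups": {}}},
        }
    },
}
flat = flatten(tree)
```

Keeping the group names in each flattened key is what makes the operation reversible by unflatten.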

gather(xcol=None, ycol=None)

Collect x and y columns from the subfiles in the final group in the tree.

Builds the collected data into a Stoner.Core.metadataObject

Keyword Arguments:
  • xcol (index or None) – Column in each file that has x data. If None, then the setas settings are used.

  • ycol (index or None) – Column(s) in each file that contain the y data. If None, then the setas settings are used.

Notes

This is a wrapper around walk_groups that assembles the data into a single file for further analysis or plotting.

get(name, default=None)

Return either a sub-group or named object from this folder.

getlist(recursive=None, directory=None, flatten=False)

Read the HDF5 file to construct a list of HDF5File objects.

group(key)

Sort Files into a series of objectFolders according to the value of the key.

Parameters:

key (string or callable or list) – Either a simple string or callable function or a list. If a string then it is interpreted as an item of metadata in each file. If a callable function then it takes a single argument x which should be an instance of a metadataObject and returns some value. If key is a list then the grouping is done recursively for each element in key.

Returns:

A copy of the current objectFolder object in which the groups attribute is a dictionary of objectFolder objects with sub lists of files

Notes

If one of the grouping metadata keys does not exist in one file then no exception is raised - rather the files will be returned into the group with key None. Metadata keys that are generated from the filename are supported.
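The grouping rule, including the None group for files that lack the key, can be sketched as follows. This is a simplified stand-in rather than the library code: the function operates on plain dictionaries, returns a dictionary of lists, and does not cover the recursive list form of key.

```python
def group(files, key):
    """Group file records by a metadata key or by a callable."""
    groups = {}
    for f in files:
        if callable(key):
            value = key(f)
        else:
            # A missing metadata key places the file in the None group.
            value = f.get(key)
        groups.setdefault(value, []).append(f)
    return groups


files = [
    {"name": "a", "temp": 4},
    {"name": "b", "temp": 300},
    {"name": "c"},
]
by_temp = group(files, "temp")
by_range = group(files, lambda x: "cold" if x.get("temp", 0) < 77 else "warm")
```

A callable key is useful when the grouping value has to be derived, for example by binning a continuous quantity as by_range does here.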

index(value, start=None, end=None)

Provide an index method like a sequence.

Parameters:

value (str, regexp, or Stoner.Core.metadataObject) – The thing to search for.

Keyword Arguments:

start,end (int) – Limit the index search to a sub-range as per Python 3.5+ list.index

Returns:

(int) – The index of the first matching metadataObject instance.

Notes

If value is a string, then matching is based on either exact matches of the name or, if it includes a * or ?, on a globbing match. value may also be a regular expression, in which case matches are made against the name of the metadataObject. Finally, if value is a metadataObject, then it is matched with an equality test.

insert(index, value)

Implement the insert method with the option to append as well.

items()

Return the key, value pairs for the subgroups of this folder.

keep_latest()

Filter out earlier revisions of files with the same name.

The CM group LabVIEW software will avoid overwriting files when measuring by inserting !#### where #### is an integer revision number just before the filename extension. This method will look for instances of several files which differ in name only by the presence of the revision number and will keep only the highest revision number. This is useful if several measurements of the same experiment have been carried out, but only the last file is the correct one.

Returns:

A copy of the DataFolder.
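The revision-number convention can be illustrated with a short stand-alone sketch. The regular expression, the function body and the choice to treat an unmarked file as revision 0 are assumptions made here for illustration and are not taken from the library.

```python
import re

# Assumed pattern: "!" followed by digits immediately before the extension.
REVISION = re.compile(r"^(?P<stem>.*?)!(?P<rev>\d+)(?P<ext>\.[^.]*)?$")


def keep_latest(names):
    """Keep only the highest revision of each base filename."""
    latest = {}
    for name in names:
        m = REVISION.match(name)
        if m:
            base = m.group("stem") + (m.group("ext") or "")
            rev = int(m.group("rev"))
        else:
            # Assumption: a file without a revision marker is revision 0.
            base, rev = name, 0
        if base not in latest or rev > latest[base][0]:
            latest[base] = (rev, name)
    return sorted(name for _, name in latest.values())


files = ["iv.txt", "iv!0001.txt", "iv!0002.txt", "rt.txt"]
kept = keep_latest(files)
```

Here the three iv files collapse to the single highest revision while rt.txt, which has no competing revisions, is left alone.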

keys()

Return the keys used to access the sub-groups of this folder.

make_name(value=None)

Construct a name from the value object if possible.

on_load_process(tmp)

Carry out processing on a newly loaded file to set means and extra metadata.

pop(name=-1, default=None)

Return and remove either a subgroup or named object from this folder.

popitem()

Return the most recent subgroup from this folder.

prune(name=None)

Remove any empty groups from the objectFolder (and subgroups).

Returns:

A copy of the pruned objectFolder.
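Pruning is naturally recursive: a group is empty if it has no files and all of its own sub-groups are empty. A minimal sketch on nested dictionaries follows; the layout is an assumption for illustration, and the name argument is not modelled.

```python
def prune(folder):
    """Remove empty groups (and empty chains of groups) in place."""
    for name in list(folder["groups"]):
        # Prune the children first so that chains of empty groups collapse.
        sub = prune(folder["groups"][name])
        if not sub["files"] and not sub["groups"]:
            del folder["groups"][name]
    return folder


tree = {
    "files": [],
    "groups": {
        "empty": {"files": [], "groups": {"also_empty": {"files": [], "groups": {}}}},
        "kept": {"files": ["a.txt"], "groups": {}},
    },
}
pruned = prune(tree)
```

Iterating over list(folder["groups"]) avoids modifying the dictionary while it is being traversed.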

remove(value)

S.remove(value) – remove first occurrence of value. Raise ValueError if the value is not present.

reverse()

S.reverse() – reverse IN PLACE

save(root=None)

Save a load of files to a single HDF5 file, creating groups as it goes.

Keyword Arguments:

root (string) – The name of the HDF5 file to save to. If set to None, will prompt for a filename.

Returns:

A list of group paths in the HDF5 file

select(*args, **kargs)

Select a subset of the objects in the folder based on flexible search criteria on the metadata.

Parameters:

args (various) – A single positional argument if present is interpreted as follows:

  • If a callable function is given, the entire metadataObject is presented to it. If it evaluates True then that metadataObject is selected. This allows arbitrary select operations

  • If a dict is given, then it and the kargs dictionary are merged and used to select the metadataObjects

Keyword Arguments:
  • recurse (bool) – Also recursively select through the sub-groups

  • kargs (various) –

    Arbitrary keyword arguments are interpreted as requesting matches against the corresponding metadata values. The keyword argument may have an additional __operator appended to it which is interpreted as follows:

    • eq metadata value equals argument value (this is the default test for scalar argument)

    • ne metadata value does not equal argument value

    • gt metadata value is greater than argument value

    • lt metadata value is less than argument value

    • ge metadata value is greater than or equal to argument value

    • le metadata value is less than or equal to argument value

    • contains metadata value contains argument value

    • in metadata value is in the argument value (this is the default test for non-tuple iterable arguments)

    • startswith metadata value starts with argument value

    • endswith metadata value ends with argument value

    • icontains, iin, istartswith, iendswith as above but case insensitive

    • between metadata value lies between the minimum and maximum values of the argument (the default test for 2-length tuple arguments)

    • ibetween, ilbetween, iubetween as above but include both, lower or upper values

    The syntax is inspired by the Django project for selecting, but is not quite as rich.

Returns:

(baseFolder) – A new baseFolder instance that contains just the matching metadataObjects.

Note

If any of the tests is True, then the metadataObject will be selected, so the effect is a logical OR. To achieve a logical AND, you can chain two selects together:

d.select(temp__le=4.2,vti_temp__lt=4.2).select(field__gt=3.0)

will select metadata objects that have either temp or vti_temp metadata values below 4.2 AND field metadata values greater than 3.

There are a few cases where special treatment is needed:

  • If you need to select on a parameter called recurse, pass a dictionary of {"recurse": value} as the sole positional argument.

  • If you need to select on a metadata value that ends in an operator word, then append __eq in the keyword name to force the equality test.

  • If the metadata keys to select on are not valid python identifiers, then pass them via the first positional dictionary value.

If the metadata item being checked exists in a regular expression file pattern for the folder, then the files are not loaded and the metadata is evaluated based on the filename. This can speed up operations where a file load is not required.
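The keyword syntax and the OR/AND behaviour described in the note can be sketched in plain Python. This is a simplified model and not the library implementation: only a subset of the operators is included, plain dictionaries stand in for metadataObjects, the default test is always eq, and the helper names are hypothetical.

```python
import operator

# A subset of the operators listed above, mapped to Python tests.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
    "ge": operator.ge,
    "le": operator.le,
    "contains": lambda meta, arg: arg in meta,
    "in": lambda meta, arg: meta in arg,
}


def split_keyword(keyword):
    """Split 'temp__le' into ('temp', 'le'), defaulting to 'eq'."""
    key, sep, op = keyword.rpartition("__")
    if sep and op in OPERATORS:
        return key, op
    return keyword, "eq"


def select(items, **kargs):
    """Keep items for which ANY of the tests passes (logical OR)."""
    tests = [split_keyword(k) + (v,) for k, v in kargs.items()]
    return [
        item
        for item in items
        if any(key in item and OPERATORS[op](item[key], arg) for key, op, arg in tests)
    ]


records = [
    {"name": "a", "temp": 4.0, "field": 5.0},
    {"name": "b", "temp": 10.0, "field": 5.0},
    {"name": "c", "temp": 4.0, "field": 1.0},
]
# Chaining two selects gives a logical AND of the two conditions.
cold_high_field = select(select(records, temp__le=4.2), field__gt=3.0)
selected_names = [r["name"] for r in cold_high_field]
```

Because each call ORs its own tests together, conditions that must all hold have to be spread over chained calls, as in the last two lines.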

setdefault(k, d=None)

Return or set a subgroup or named object.

slice_metadata(key, output='smart')

Return an array of the metadata values for each item/file in the top level group.

Parameters:

key (str, regexp or list of str) – the meta data key(s) to return

Keyword Arguments:

output (str) – Output format. Values are:

  • dict: return an array of dictionaries

  • list: return a list of lists

  • array: return a numpy array

  • Data: return a Stoner.Data object

  • smart: (default) return either a list if only one key or a list of dictionaries

Returns:

(array of metadata) – If single key is given and is an exact match then returns an array of the matching values. If the key results in a regular expression match, then returns an array of dictionaries of all matching keys. If key is a list or other iterable, then return a 2D array where each column corresponds to one of the keys.

Todo

Add options to recurse through all groups? Put back RCT’s values only functionality?
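The difference between slicing on one key and on several keys can be sketched as below, loosely following the smart output format. This is an assumption-laden simplification: it handles exact key matches only, uses plain dictionaries for the per-file metadata, and ignores the other output formats.

```python
def slice_metadata(files, key):
    """Return per-file values for one key, or per-file dicts for several keys."""
    if isinstance(key, str):
        # One key: a flat list with one value per file.
        return [f[key] for f in files]
    # Several keys: one dictionary per file.
    return [{k: f[k] for k in key} for f in files]


files = [{"temp": 4.0, "field": 1.0}, {"temp": 300.0, "field": 0.5}]
temps = slice_metadata(files, "temp")
both = slice_metadata(files, ["temp", "field"])
```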

sort(key=None, reverse=False, recurse=True)

Sort the files by some key.

Keyword Arguments:
  • key (string, callable or None) – Either a string or a callable function. If a string then this is interpreted as a metadata key, if callable then it is assumed that this is a function of one parameter x that is a Stoner.Core.metadataObject object and that returns a key value. If key is not specified (default), then a sort is performed on the filename.

  • reverse (bool) – Optionally sort in reverse order

  • recurse (bool) – If True (default) sort the sub-groups as well.

Returns:

A copy of the current objectFolder object
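The three forms of the key argument can be sketched as follows. The function name sort_files and the dictionary records are hypothetical stand-ins, and the recurse option is not modelled.

```python
def sort_files(files, key=None, reverse=False):
    """Sort by filename, by a metadata key, or by a callable."""
    if key is None:
        # Default: sort on the filename.
        def keyfunc(x):
            return x["name"]
    elif callable(key):
        keyfunc = key
    else:
        # A string is interpreted as a metadata key.
        def keyfunc(x):
            return x[key]
    return sorted(files, key=keyfunc, reverse=reverse)


files = [
    {"name": "c.txt", "temp": 4.0},
    {"name": "a.txt", "temp": 300.0},
    {"name": "b.txt", "temp": 77.0},
]
by_name = [f["name"] for f in sort_files(files)]
by_temp = [f["name"] for f in sort_files(files, key="temp")]
by_func = [f["name"] for f in sort_files(files, key=lambda x: -x["temp"])]
```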

unflatten()

Take the file list and unflatten it according to the file paths.

Returns:

A copy of the objectFolder

unload(name=None)

Remove the instance from memory without losing the name in the Folder.

Parameters:

name (string, int or None) – Specifies the entry to unload from memory. If set to None all loaded entries are unloaded.

Returns:

(DataFolder) – returns a copy of itself.

update(other)

Update this folder with a dictionary or another folder.

values()

Return the sub-groups of this folder.

walk_groups(walker, **kargs)

Walk through a hierarchy of groups and calls walker for each file.

Parameters:

walker (callable) – A callable object that takes either a metadataObject instance or an objectFolder instance.

Keyword Arguments:
  • group (bool) – (default False) determines whether the walker function will expect to be given the objectFolder representing the lowest level group or individual metadataObject objects from the lowest level group

  • replace_terminal (bool) – If group is True and the walker function returns an instance of metadataObject then the return value is appended to the files and the group is removed from the current objectFolder. This will unwind the group hierarchy by one level.

  • only_terminal (bool) – Only execute the walker function on groups that have no sub-groups inside them (i.e. are terminal groups)

  • walker_args (dict) – A dictionary of static arguments for the walker function.

Notes

The walker function should have a prototype of the form:

walker(f,list_of_group_names,**walker_args)

where f is either an objectFolder or metadataObject.
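A walker with the prototype above might look like the following sketch. The traversal function here is a simplified stand-in written for illustration: the nested-dictionary layout, the visiting order and the fact that files in non-terminal groups are also visited are all assumptions, and the group, replace_terminal and only_terminal options are not modelled.

```python
def walk_groups(folder, walker, breadcrumb=None, walker_args=None):
    """Call walker(f, list_of_group_names, **walker_args) for every file."""
    breadcrumb = [] if breadcrumb is None else breadcrumb
    walker_args = {} if walker_args is None else walker_args
    results = [walker(f, breadcrumb, **walker_args) for f in folder["files"]]
    for name, sub in folder["groups"].items():
        # Extend the list of group names as the walk descends.
        results.extend(walk_groups(sub, walker, breadcrumb + [name], walker_args))
    return results


def label(f, list_of_group_names, sep="/"):
    """Example walker: build a path-like label for each file."""
    return sep.join(list_of_group_names + [f])


tree = {
    "files": ["top.txt"],
    "groups": {
        "4K": {"files": ["a.txt"], "groups": {}},
        "300K": {
            "files": ["b.txt"],
            "groups": {"up": {"files": ["c.txt"], "groups": {}}},
        },
    },
}
labels = walk_groups(tree, label, walker_args={"sep": "/"})
```

Passing the static arguments as a dictionary keeps the walker signature fixed while still letting each call site configure it.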

zip_groups(groups)

Return a list of tuples of metadataObjects drawn from the specified groups.

Parameters:

groups (list of strings) – A list of keys of groups in the objectFolder

Returns:

A list of tuples of groups of files – [(grp_1_file_1, grp_2_file_1, …, grp_n_file_1), (grp_1_file_2, grp_2_file_2, …, grp_n_file_2), …, (grp_1_file_m, grp_2_file_m, …, grp_n_file_m)]
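The shape of the return value is the same as that produced by Python's built-in zip over the selected groups. A minimal sketch, assuming groups are held in a dictionary of lists (strings stand in for the file objects):

```python
groups = {
    "up": ["up_1", "up_2", "up_3"],
    "down": ["down_1", "down_2", "down_3"],
}


def zip_groups(groups, keys):
    """Pair the i-th entry of each named group into a tuple."""
    return list(zip(*(groups[key] for key in keys)))


pairs = zip_groups(groups, ["up", "down"])
```

Note that zip stops at the shortest group; whether the library pads or truncates groups of unequal length is not stated above.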