[Caffe] HDF5 Layer

I struggled with the HDF5 data layer when I wanted a vector label for each of my data samples. Below I share some of my experience with this layer, which is very flexible about the data it accepts but less straightforward to use.

Note that the HDF5 data layer doesn’t support data transformation. This means that you have to either pre-process your data in the desired way before feeding it in, or add an extra processing layer, such as an element-wise multiplication layer for data scaling.
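As a rough sketch of the first option, the scaling and mean subtraction can be done in NumPy before anything is written to disk. The function name `preprocess`, its default scale of 1/255 and the dummy batch are my own choices for illustration, not anything Caffe requires:

```python
import numpy as np

def preprocess(images, scale=1.0 / 255.0, mean=None):
    """Convert to float32, optionally subtract a mean, then scale."""
    out = np.asarray(images, dtype=np.float32)
    if mean is not None:
        out = out - np.asarray(mean, dtype=np.float32)
    return out * np.float32(scale)

# Hypothetical batch of 4 RGB images, 8x8 pixels, with values in [0, 255].
raw = np.random.randint(0, 256, size=(4, 3, 8, 8)).astype(np.uint8)
X = preprocess(raw)
```

The result is what you would then store under the 'data' key, so the network sees already-scaled values.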

Overall, the HDF5 data layer requires a .h5 file and a .txt file. The .h5 file contains your data and labels, while the .txt file lists the path(s) to the .h5 file(s), one per line.
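When the data is split across several .h5 files, the .txt file simply gets one line per file. A minimal sketch, where `write_list_file` and the file names are hypothetical and a temporary directory stands in for your real data directory:

```python
import os
import tempfile

def write_list_file(list_path, h5_paths):
    """Write one .h5 path per line."""
    with open(list_path, 'w') as f:
        for p in h5_paths:
            f.write(p + '\n')

# Hypothetical file names, used only for illustration.
out_dir = tempfile.mkdtemp()
h5_paths = [os.path.join(out_dir, 'train_%d.h5' % i) for i in range(3)]
list_path = os.path.join(out_dir, 'train.txt')
write_list_file(list_path, h5_paths)
```

This only writes the list; the .h5 files themselves still have to be created separately.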

The following example creates the .h5 file and its corresponding .txt file in Python:

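from __future__ import print_function  # must come before any other import

import os

import h5py

# X, Y1 and Y2 are NumPy arrays that you have prepared beforehand.
DIR = "/PATH TO xxx.h5/"
h5_fn = os.path.join(DIR, 'xxx.h5')

# Each key becomes a dataset that the layer can expose as a top blob.
with h5py.File(h5_fn, 'w') as f:
    f['data'] = X
    f['label1'] = Y1
    f['label2'] = Y2

# The .txt file lists the path to the .h5 file.
text_fn = os.path.join(DIR, 'xxx.txt')
with open(text_fn, 'w') as f:
    print(h5_fn, file=f)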

Now you should have a .txt file and a .h5 file in your specified path.
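The snippet above takes X, Y1 and Y2 as given. For the vector-label case that got me started, they could look roughly like the following. The sizes are made up, and the float32 dtype and the N x C x H x W layout are my assumptions about what works smoothly with Caffe rather than something stated above:

```python
import numpy as np

num_samples = 10  # hypothetical size, chosen only for illustration

# Inputs in N x C x H x W layout (assumed here for image data).
X = np.zeros((num_samples, 3, 32, 32), dtype=np.float32)

# A vector label: one row of 5 values per sample.
Y1 = np.zeros((num_samples, 5), dtype=np.float32)

# A second label with a different length per sample.
Y2 = np.zeros((num_samples, 2), dtype=np.float32)

# Every dataset should agree on the number of samples (first dimension).
assert X.shape[0] == Y1.shape[0] == Y2.shape[0]
```

The point is that a label does not have to be a single number per sample; each label dataset can carry a whole row per sample.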

The keys ‘data’, ‘label1’ and ‘label2’ are names you choose for your datasets. You can have an arbitrary number of them, as long as you use the same names for the top blobs when you feed your data to the HDF5 data layer. An example HDF5 data layer looks like this:

layer {
   name: "example"
   type: "HDF5Data"
   top: "data"
   top: "label1"
   top: "label2"

   hdf5_data_param {
     source: "/PATH TO .txt file/"
     batch_size: 100
   }
}

Notice that the top blobs use the same names as the keys I wrote to the .h5 file.
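One way to keep the two in sync is to generate the layer text from the same list of keys used when writing the .h5 file. This is only a sketch; `hdf5_layer_prototxt` is a hypothetical helper of mine, not part of Caffe:

```python
def hdf5_layer_prototxt(name, keys, source, batch_size):
    """Return the text of an HDF5Data layer with one top blob per key."""
    lines = ['layer {',
             '  name: "%s"' % name,
             '  type: "HDF5Data"']
    lines += ['  top: "%s"' % k for k in keys]
    lines += ['  hdf5_data_param {',
              '    source: "%s"' % source,
              '    batch_size: %d' % batch_size,
              '  }',
              '}']
    return '\n'.join(lines)

# The same keys that were written to the .h5 file.
keys = ['data', 'label1', 'label2']
layer_text = hdf5_layer_prototxt('example', keys, '/PATH TO .txt file/', 100)
print(layer_text)
```

The printed text can be pasted into the network definition in place of a hand-written layer.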

That’s it! Now you can use the HDF5 data layer 🙂

3 thoughts on “[Caffe] HDF5 Layer”

  1. Hi, thanks for your great post. I have a question: how should the image dimensions (channel, height and width) be arranged in the HDF5 file? Is there a specific order?

    • Sorry for the late reply, I hope you have already figured this out.

      Just in case this is still relevant: there is no required channel order for the HDF5 files, it all depends on your application and usage. Normally we save images as [batch, channel, height, width] with the channels in R, G, B order, but in particular cases the R, G and B channels might be swapped or reordered.
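      As a rough illustration, assuming the images come out of an image library as height x width x channel arrays in RGB order (the sizes and variable names below are made up), the rearrangement could be sketched in NumPy like this:

```python
import numpy as np

# Hypothetical images as many image libraries return them: H x W x C, RGB.
images_hwc = [np.zeros((32, 48, 3), dtype=np.uint8) for _ in range(4)]

# Stack into N x H x W x C, then move the channel axis forward: N x C x H x W.
batch = np.stack(images_hwc).transpose(0, 3, 1, 2).astype(np.float32)

# Optional: reverse the channel axis (RGB -> BGR) if your model expects BGR input.
batch_bgr = batch[:, ::-1, :, :]
```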
