Yuhang He's Blog

Some birds are not meant to be caged, their feathers are just too bright.

How to Add New Data Layer in Caffe

Caffe was initially designed for classification: one image corresponds to one integer label. In many real-world applications, however, we want the deep neural network to accept multiple images as well as multiple float/integer labels. To this end, we have to write a new data layer, which is actually fairly involved, as you have to rewrite or change the source code in many places. Here I give a step-by-step hands-on guide to achieve it. (Many alternatives exist, such as the Python layer, the ImageData layer, and other stopgap schemes, but they are not ideal solutions, especially when we deal with large numbers of images and have to convert them into an LMDB/LevelDB dataset to speed up training.)

How the Caffe Data Layer Works

Before writing any code, we'd better figure out the details of how the Caffe data layer works. The following figure illustrates the data layer's dependencies on the other relevant layers.

[Figure: Caffe data layer dependency diagram]

Note that the latest Caffe has abandoned the DataReader layer (I still don't know why), so this may not match your copy of Caffe exactly. Anyway, here I try to open the black box of how Caffe converts image input (in most cases) into a Caffe-acceptable data format. First, let's look at the anatomy of these layers.

DataReader Layer

The DataReader layer is responsible for reading the new data format you define in caffe.proto and converting it into Caffe Batch datasets. Before going into how DataReader handles your data format, let me first explain how Caffe uniformly stores and manages such data. Caffe purposely defines a class Batch to handle this (in base_data_layer.hpp):

template <typename Dtype>
class Batch {
  public:
    Blob<Dtype> data_, label_;
};

Yes, Batch stores whatever data format you define in two blobs: data_ and label_. Keeping this in mind helps you cut through the redundancy and the uncertainty of the Caffe code. All you have to do is convert your data into the data_ and label_ blobs respectively (if label_ is needed at all). The DataReader layer reads one item of your data at a time via full().peek() and full().pop(), and recycles consumed items through the free() queue. DataReader builds on InternalThread and BlockingQueue, both of which are somewhat difficult to fully understand, as they guarantee that the dataset is read sequentially and correctly without blocking threads.
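To make the full()/free() producer-consumer pattern concrete, here is a minimal standalone sketch of a blocking queue with the peek/pop/push operations the text mentions. This is an illustration only, not Caffe's actual BlockingQueue (which lives in util/blocking_queue.hpp and is templated over datum pointers); the class name SimpleBlockingQueue is my own.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Simplified stand-in for Caffe's BlockingQueue: consumers block until a
// producer has pushed an element, so reads are sequential and never race.
template <typename T>
class SimpleBlockingQueue {
 public:
  void push(const T& t) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(t);
    }
    cond_.notify_one();
  }
  // Blocks until an element is available, returns it without removing it
  // (analogous to full().peek() in the DataReader).
  T peek() {
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [this] { return !queue_.empty(); });
    return queue_.front();
  }
  // Blocks until an element is available, removes and returns it
  // (analogous to full().pop()).
  T pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [this] { return !queue_.empty(); });
    T t = queue_.front();
    queue_.pop();
    return t;
  }

 private:
  std::queue<T> queue_;
  std::mutex mutex_;
  std::condition_variable cond_;
};
```

In Caffe, two such queues cooperate: the reader thread pops empty datums from free(), fills them from the database, and pushes them onto full(); the data layer does the reverse, which is why items are "recycled" rather than allocated per batch.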

BaseData Layer

DataReader just reads your data; it does not know how to transform it into the data_ and label_ blobs appropriately. For example, how should the four blob dimensions $(n, c, h, w)$ be determined? How should your data be split into data_ and label_? The BaseData layer is responsible for this task. Specifically, it involves BasePrefetchingDataLayer, which holds the member variable transformed_data_, the member data_transformer_, and the member function load_batch(). data_transformer_ is used to infer the data_ blob shape and to transform the original input data. transformed_data_ is an intermediate blob that stores each temporarily transformed datum. load_batch() loads your data items one by one until the batch size is reached, forming the final data_ and label_ blobs.
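The load_batch() flow above can be sketched in standalone C++. Note this is a conceptual simplification under stated assumptions: the real load_batch() pulls datums from the DataReader queue and writes into Blob objects via data_transformer_; the names Item, FlatBatch, and the vector-based storage here are illustrative stand-ins, not Caffe APIs.

```cpp
#include <cstddef>
#include <vector>

// Illustrative stand-in for one transformed datum (image pixels + label).
struct Item {
  std::vector<float> pixels;
  float label;
};

// Illustrative stand-in for Caffe's Batch: data_ and label_ as flat storage.
struct FlatBatch {
  std::vector<float> data_;   // batch_size * item_size values
  std::vector<float> label_;  // batch_size values
};

// Conceptual load_batch(): consume items one by one until batch_size is
// reached, appending each into the contiguous data_/label_ storage.
FlatBatch load_batch(const std::vector<Item>& source, std::size_t batch_size) {
  FlatBatch batch;
  for (std::size_t i = 0; i < batch_size && i < source.size(); ++i) {
    const Item& item = source[i];  // Caffe would pop this from the full() queue
    batch.data_.insert(batch.data_.end(),
                       item.pixels.begin(), item.pixels.end());
    batch.label_.push_back(item.label);
  }
  return batch;
}
```

The key takeaway is the shape of the loop: whatever your custom datum contains (one image or three, one label or many floats), load_batch() reduces it to contiguous data_ and label_ storage per batch.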

Take a deep breath: it is hard to fully understand the working mechanism without reading the code line by line. So do not expect to skip chewing through the irritating code snippets if you truly want to add a new data layer! In sum, there are five steps:

  • Modify the caffe.proto to define your new data format. For example:
message TripletDatum {
  optional int32 channels = 1;
  optional int32 height = 2;
  optional int32 width = 3;
  // actual image data for the anchor, positive and negative, in bytes
  optional bytes data_anchor = 4;
  optional bytes data_pos = 5;
  optional bytes data_neg = 6;

  optional bool encoded = 7 [default = false];
}
  • Write a new data reader. For example, triplet_data_reader.hpp/cpp.
  • Write a new blocking queue. For example, triplet_blocking_queue.hpp/cpp.
  • Write a new data layer. For example, triplet_db_data_layer.hpp/cpp.
  • Modify db.hpp/cpp so that the system recognizes your data format. For example, in db.hpp you may have to add the following declaration:
DB* GetDB(TripletDataParameter::DB backend);

In db.cpp you accordingly have to add the following code:

DB* GetDB(TripletDataParameter::DB backend) {
  switch (backend) {
#ifdef USE_LEVELDB
  case TripletDataParameter_DB_LEVELDB:
    return new LevelDB();
#endif  // USE_LEVELDB
#ifdef USE_LMDB
  case TripletDataParameter_DB_LMDB:
    return new LMDB();
#endif  // USE_LMDB
  default:
    LOG(FATAL) << "Unknown database backend";
    return NULL;
  }
}
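The GetDB() function above is a plain factory-by-enum dispatch. Stripped of Caffe and protobuf types, the pattern boils down to the following standalone sketch; DB, LevelDB, LMDB, and Backend here are minimal stand-ins I made up for illustration, not the real classes from db.hpp.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Minimal stand-ins for Caffe's database abstraction: one abstract base
// class and one concrete subclass per storage backend.
class DB {
 public:
  virtual ~DB() {}
  virtual std::string name() const = 0;
};

class LevelDB : public DB {
 public:
  std::string name() const { return "leveldb"; }
};

class LMDB : public DB {
 public:
  std::string name() const { return "lmdb"; }
};

// Stand-in for the protobuf-generated TripletDataParameter::DB enum.
enum class Backend { LEVELDB, LMDB };

// Analogue of GetDB(TripletDataParameter::DB backend): map the enum value
// chosen in the prototxt to a freshly constructed backend instance.
std::unique_ptr<DB> GetDB(Backend backend) {
  switch (backend) {
    case Backend::LEVELDB:
      return std::unique_ptr<DB>(new LevelDB());
    case Backend::LMDB:
      return std::unique_ptr<DB>(new LMDB());
  }
  throw std::runtime_error("Unknown database backend");
}
```

This is why the db.hpp/cpp change is needed at all: the dispatch is keyed on the enum type generated from your new message's parameter, so a new overload must exist for TripletDataParameter::DB.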

Summary

Honestly, I do not expect you to master adding a new data layer just by reading this blog. What I hope is to provide an intuitive understanding and a clear guide that helps you quickly figure out, step by step, what you should do to reach the final goal.

I hope you enjoy it and don't find it too obscure.