Projects

chainer_prefetch_multiprocess_iterator (Python)

  • This is the reference implementation of my study “Accelerating Machine Learning I/O by Overlapping Data Staging and Mini-batch Generations”.
  • This is a Chainer Iterator class that prefetches training data from slow storage (such as a parallel file system) into fast storage (such as an SSD) and generates mini-batches at the same time.
  • The aim of this study is to hide the time spent staging the training dataset into node-local storage on the compute nodes of HPC clusters (such as ABCI, TSUBAME, and Cygnus).
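
The overlap described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the Chainer iterator itself: the function names stage_files and overlapped_batches are hypothetical, a thread stands in for the worker processes, and a file copy stands in for reading from a parallel file system.

```python
import queue
import shutil
import threading
from pathlib import Path


def stage_files(src_paths, fast_dir, staged):
    """Copy each file to fast storage and announce it on the queue."""
    for src in src_paths:
        dst = Path(fast_dir) / Path(src).name
        shutil.copy(src, dst)      # stands in for the slow read
        staged.put(dst)
    staged.put(None)               # sentinel: staging has finished


def overlapped_batches(src_paths, fast_dir, batch_size):
    """Yield mini-batches while staging continues in the background."""
    staged = queue.Queue()
    worker = threading.Thread(
        target=stage_files, args=(src_paths, fast_dir, staged), daemon=True)
    worker.start()
    batch = []
    while True:
        path = staged.get()        # waits only for the next staged file
        if path is None:
            break
        batch.append(path.read_bytes())
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch                # final, possibly smaller, batch
    worker.join()
```

The consumer can start on the first mini-batch as soon as batch_size files have been staged, instead of waiting for the whole dataset to be copied.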

chainer_minibatch_size_optimizer (Python)

inference_engine (C++)

  • This is an ONNX runtime implementation, similar to onnxruntime or menoh.
  • For now, the Gemm, Conv, MaxPool, Relu, Softmax, Dropout, and Reshape operators are supported.
  • The whole backend is my own CPU implementation, which means that the current backend does not use optimized matrix libraries such as BLAS or Intel MKL-DNN.
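
To give a feel for what library-free operator kernels look like, here is a minimal Python sketch of three of the listed operators plus a tiny dispatcher. It is not taken from the C++ project: the KERNELS table and the run function are hypothetical, Gemm is shown without its transpose attributes, and the "model" is assumed to be a simple linear chain of nodes.

```python
import math


def gemm(a, b, c, alpha=1.0, beta=1.0):
    """ONNX-style Gemm without transposes: Y = alpha * A @ B + beta * C.

    a is M x K, b is K x N, and c is a length-N bias broadcast over rows.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[alpha * sum(a[i][k] * b[k][j] for k in range(inner)) + beta * c[j]
             for j in range(cols)] for i in range(rows)]


def relu(x):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in x]


def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in x:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


# Hypothetical operator table: op type -> kernel function.
KERNELS = {"Gemm": gemm, "Relu": relu, "Softmax": softmax}


def run(nodes, x):
    """Evaluate a linear chain of (op_type, extra_inputs) nodes on input x."""
    for op_type, extra in nodes:
        x = KERNELS[op_type](x, *extra)
    return x
```

A call such as run([("Gemm", (w, bias)), ("Relu", ()), ("Softmax", ())], x) then evaluates a one-layer classifier using nothing but these hand-written loops.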

simple_map_reduce - Distributed MapReduce framework (Ruby)

optimization_experiments (C)

  • This is an experimental project to study optimizations for GEMM and GEMV.
  • This project includes dgemm and dgemv implementations that are optimized by successively applying loop interchange, loop unrolling, blocking, and padding.
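
Two of the techniques named above, loop interchange and blocking, can be illustrated as follows. The project itself is written in C; this Python sketch only shows the loop structure (it will not show the speedups), and the function names and the default block size are assumptions made for the example.

```python
def gemm_ijk(a, b, n):
    """Baseline: C = A * B for n x n row-major flat lists, loops in i-j-k order."""
    c = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += a[i * n + k] * b[k * n + j]   # b is walked with stride n
            c[i * n + j] = acc
    return c


def gemm_ikj(a, b, n):
    """Loop interchange: the innermost loop walks b and c with stride 1."""
    c = [0.0] * (n * n)
    for i in range(n):
        for k in range(n):
            aik = a[i * n + k]
            for j in range(n):
                c[i * n + j] += aik * b[k * n + j]
    return c


def gemm_blocked(a, b, n, block=2):
    """Blocking: work on block x block tiles so that a tile can stay in cache."""
    c = [0.0] * (n * n)
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, n)):
                        aik = a[i * n + k]
                        for j in range(jj, min(jj + block, n)):
                            c[i * n + j] += aik * b[k * n + j]
    return c
```

All three functions compute the same product; only the order in which memory is visited changes, which is what matters for cache behavior in the C versions.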

convolution_experiments (Python)

  • This is an experimental project to study implementations of convolution in the direct style and the im2col style, respectively.
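
The difference between the two styles can be sketched for the simplest case. The code below is an illustration rather than the project's code: it assumes a single channel, stride 1, no padding, and the cross-correlation convention commonly used by deep learning frameworks, and the function names are made up for the example.

```python
def conv2d_direct(image, kernel):
    """Direct style: slide the kernel and accumulate at each output position."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            for dy in range(kh):
                for dx in range(kw):
                    out[y][x] += image[y + dy][x + dx] * kernel[dy][dx]
    return out


def im2col(image, kh, kw):
    """Unroll every kh x kw patch into one row of a 2-D matrix."""
    h, w = len(image), len(image[0])
    return [[image[y + dy][x + dx] for dy in range(kh) for dx in range(kw)]
            for y in range(h - kh + 1) for x in range(w - kw + 1)]


def conv2d_im2col(image, kernel):
    """im2col style: one matrix-vector product over the unrolled patches."""
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    flat_kernel = [v for row in kernel for v in row]
    cols = im2col(image, kh, kw)
    flat = [sum(p * k for p, k in zip(patch, flat_kernel)) for patch in cols]
    return [flat[i:i + out_w] for i in range(0, len(flat), out_w)]
```

The im2col version uses more memory because overlapping patches are duplicated, but it reduces the convolution to a matrix product, which is why it pairs well with a tuned GEMM.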

config2args (Rust)

  • This is a CLI tool that converts a JSON config file into GNU-style CLI options (such as --key1 value1 --key2 value2).
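
The conversion can be sketched in Python as follows. The tool itself is written in Rust, so this is only an illustration of the mapping: the function name config_to_args is hypothetical, and the sketch assumes a flat JSON object whose values are strings or numbers (how the real tool treats booleans, arrays, or nested objects is not covered here).

```python
import json
import shlex


def config_to_args(config_text):
    """Turn a flat JSON object into a list like ['--key1', 'value1', ...]."""
    config = json.loads(config_text)
    args = []
    for key, value in config.items():
        args.append(f"--{key}")
        args.append(str(value))
    return args


def config_to_command_line(config_text):
    """Join the options into one shell-safe string."""
    return shlex.join(config_to_args(config_text))
```

Passing the text {"key1": "value1", "key2": "value2"} to config_to_command_line gives the string --key1 value1 --key2 value2, which can be appended to a command.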
