New publication added!

Posted on Tue 14 June 2016 in research • Tagged with action recognition, rbm

Our work on Action Recognition Using Convolutional Restricted Boltzmann Machines has been published in the Proceedings of the 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction (MARMI), which was held in conjunction with the ACM International Conference on Multimedia Retrieval (ICMR) in New York, June 6-9, 2016.

Abstract

Action Recognition Using Convolutional Restricted Boltzmann Machines

In this work we study deep learning architectures for the problem of action recognition in image sequences focusing on generative neural networks, namely the convolutional extension of restricted Boltzmann machines (RBMs). We first use a stack of convolutional restricted Boltzmann machines to learn and extract features from sequences of images in an unsupervised way, and then use them for the task of action classification. We modify the energy function of the convolutional RBM in such a way that the training updates reported in the literature follow directly from the differentiation of the objective function, which we define in terms of the free energy function. This is in contrast to other works on convolutional RBMs in the literature whose update equations do not directly follow from a well defined energy function or optimization framework without any ad hoc normalizations. We show that the representations that are derived from unsupervised training of the RBMs have very similar or better descriptive power than hand-designed image descriptors and give competitive performance in the problem of action recognition.

Link: http://doi.acm.org/10.1145/2927006.2927012


My dishes were misbehaving today!

Posted on Sat 27 February 2016 in fun • Tagged with animation, ffmpeg

I knew something weird had been happening in my kitchen lately, so I decided to investigate. I hid my camera in the cupboard and went to write some code on my laptop in the living room. Of course, I had music playing, but not so loud that I wouldn't notice the noises coming from the kitchen.

Needless to say, I was shocked by the state of the kitchen when I opened the door. Luckily, the camera recorded everything! Look what happened!

The video

Stop lying!

Oh, OK... It's all a lie :(

How to do it?

Just take a bunch of photos, copy them to your computer and run the following two pieces of code. The first one renames the photos into a more convenient form:

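rename.sh
a=0
mkdir -p new  # the renamed frames go here; create it if it is missing
for i in *.JPG; do
  new=$(printf "frame_%04d.JPG" "${a}")  # %04d zero-pads the index to 4 digits
  mv -- "${i}" "new/${new}"  # quoted so file names with spaces survive
  a=$((a + 1))
done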
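
If you would rather see the new names before anything gets moved, a dry-run variant of the same loop can help. This is only a sketch: the function name `rename_frames`, its two arguments and the `DRY_RUN` variable are made up for this example and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: rename_frames SRC_DIR DEST_DIR
# Moves SRC_DIR/*.JPG to DEST_DIR/frame_0000.JPG, frame_0001.JPG, ...
# With DRY_RUN=1 it only prints the planned moves and touches nothing.
rename_frames() {
  local src="$1" dest="$2" a=0 f new
  # Only create the destination when files are really going to be moved.
  [ "${DRY_RUN:-0}" = "1" ] || mkdir -p "${dest}"
  for f in "${src}"/*.JPG; do
    [ -e "${f}" ] || continue  # no match: the glob stays literal, skip it
    new=$(printf "frame_%04d.JPG" "${a}")
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "${f} -> ${dest}/${new}"
    else
      mv -- "${f}" "${dest}/${new}"
    fi
    a=$((a + 1))
  done
}
```

Running `DRY_RUN=1 rename_frames . new` prints the plan, and `rename_frames . new` then does the actual move. The frames are numbered in the order the shell expands `*.JPG`, which is alphabetical, so this assumes the camera's file names already sort in shooting order.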

The second one turns the frames into a video:

make_video.sh
# Run this in the directory that holds the renamed frames (new/).
ffmpeg -i frame_%04d.JPG -c:v mpeg4 -b:v 800k -vf "setpts=3*PTS" video.avi
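
The pace of the video comes from two numbers: ffmpeg's image sequence reader assumes 25 frames per second unless told otherwise, and `setpts=3*PTS` stretches every timestamp by a factor of 3, so each photo stays on screen for about 3/25 = 0.12 s. Here is a small sketch with both knobs exposed. The function names `build_video_cmd` and `seconds_per_photo` and their parameters are made up for this example, and the first one only prints the command so you can look at it before running anything.

```shell
#!/bin/bash
# Hypothetical helper: build_video_cmd [FPS] [SLOWDOWN] [OUTPUT]
# Prints (does not run) an ffmpeg command like the one above, with the
# input frame rate and the setpts factor exposed as parameters.
build_video_cmd() {
  local fps="${1:-25}" slowdown="${2:-3}" out="${3:-video.avi}"
  # %% in the format string prints a literal %, giving frame_%04d.JPG
  printf 'ffmpeg -framerate %s -i frame_%%04d.JPG -c:v mpeg4 -b:v 800k -vf "setpts=%s*PTS" %s\n' \
    "${fps}" "${slowdown}" "${out}"
}

# Hypothetical helper: seconds_per_photo FPS SLOWDOWN
# Each photo is shown for SLOWDOWN / FPS seconds.
seconds_per_photo() {
  awk -v fps="$1" -v s="$2" 'BEGIN { printf "%.2f\n", s / fps }'
}
```

For example, `seconds_per_photo 25 3` prints `0.12`, and `build_video_cmd 25 3 | sh` runs the generated command. A larger slowdown or a smaller frame rate both make the dishes move more slowly.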

That's it! Hooray!


Let's transfer some style!

Posted on Sat 13 February 2016 in fun • Tagged with deep, style, python

A couple of months ago I had some fun playing with style-transfer, an implementation of the "A Neural Algorithm of Artistic Style" paper by L. Gatys, A. Ecker, and M. Bethge that allows you to transfer a style from one image to another image, while keeping the contents of the input image.

So, here are some examples of the images I generated. Below each of them you'll see the original image and the image the style was transferred from.

Čakovec

Cakovec output

Used images

The input image is the centre of my hometown Čakovec.

The style image is The Swan by Leonid Afremov.

A selfie

Selfie

Used images

The input image is a selfie of Yoshi, Yuki and me taken in Donja Dubrava.

The style image is a Dragon Ball Z drawing that we found somewhere online.

Watch it being generated!

I even made a video of this one being generated. Have a look at it here:

Tree

Tree output

Used images

The input image is a famous old tree in Čakovec.

The style image used is Wheatfield with Crows by Van Gogh.

Čakovec park

Park output

Used images

The input image is the park in Čakovec.

The style image used is Wheatfield with Crows by Van Gogh.

New Year's Eve in Manchester

Party output

Used images

The input image was taken by Joel Goodman.

The style image used is La muse by Picasso.

Conclusion

Transferring style from one image to another is fun!!! Yaaaaaaay!!!