Petar Palasek

New publication added!

Posted on Tue 14 June 2016 in research • Tagged with action recognition, rbm

Our work on "Action Recognition Using Convolutional Restricted Boltzmann Machines" has been published in the Proceedings of the 1st International Workshop on Multimedia Analysis and Retrieval for Multimodal Interaction (MARMI), which was held in conjunction with the ACM Conference on Multimedia Retrieval (ICMR) in New York, June 6-9, 2016.

Abstract

Action Recognition Using Convolutional Restricted Boltzmann Machines

In this work we study deep learning architectures for the problem of action recognition in image sequences focusing on generative neural networks, namely the convolutional extension of restricted Boltzmann machines (RBMs). We first use a stack of convolutional restricted Boltzmann machines to learn and extract features from sequences of images in an unsupervised way, and then use them for the task of action classification. We modify the energy function of the convolutional RBM in such a way that the training updates reported in the literature follow directly from the differentiation of the objective function, which we define in terms of the free energy function. This is in contrast to other works on convolutional RBMs in the literature whose update equations do not directly follow from a well defined energy function or optimization framework without any ad hoc normalizations. We show that the representations that are derived from unsupervised training of the RBMs have very similar or better descriptive power than hand-designed image descriptors and give competitive performance in the problem of action recognition.
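For readers unfamiliar with the terms used in the abstract, here is a brief reminder of the energy and free energy of a standard binary RBM. This is the usual textbook form and not the modified convolutional energy function from the paper; the symbols (visible units v, hidden units h, biases a and b, weights W, parameters θ) are notation assumed here for illustration.

```latex
\begin{align}
E(v, h) &= -a^\top v - b^\top h - v^\top W h \\
F(v) &= -\log \sum_{h} e^{-E(v, h)}
      = -a^\top v - \sum_{j} \log\left(1 + e^{\,b_j + (W^\top v)_j}\right) \\
-\frac{\partial \log p(v)}{\partial \theta}
  &= \frac{\partial F(v)}{\partial \theta}
   - \mathbb{E}_{v' \sim p}\left[\frac{\partial F(v')}{\partial \theta}\right]
\end{align}
```

The last line is the sense in which training updates can follow from differentiating an objective written in terms of the free energy.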

Link: http://doi.acm.org/10.1145/2927006.2927012

My dishes were misbehaving today!

Posted on Sat 27 February 2016 in fun • Tagged with animation, ffmpeg

I knew something weird had been happening in my kitchen lately, so I decided to investigate. I hid my camera in the cupboard and went to write some code on my laptop in the living room. Of course, I had music playing, but it was not so loud that I couldn't hear the noises coming from the kitchen.

Needless to say, I was shocked by the state of the kitchen when I opened the door. Luckily, the camera recorded everything! Look what happened!

The video

Stop lying!

Oh, OK... It's all a lie :(

How to do it?

Just take a bunch of photos, copy them to your computer and run the following two pieces of code. The first one renames the photos into a more convenient form:

rename.sh

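#!/bin/bash
# Rename the photos to frame_0000.JPG, frame_0001.JPG, ... inside new/
mkdir -p new  # the target directory has to exist before mv can use it
a=0
for i in *.JPG; do
  new=$(printf "frame_%04d.JPG" "$a")  # %04d zero-pads the counter to 4 digits
  mv -- "$i" "new/$new"                # quoted so names with spaces survive
  a=$((a + 1))
done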
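If you want to check the naming scheme before moving any files, here is a minimal sketch that only prints the names the loop above would produce. The helper frame_name is hypothetical and is not part of rename.sh.

```shell
#!/usr/bin/env bash
# frame_name is a hypothetical helper used only for illustration.
# It prints the zero-padded file name for a given frame index.
frame_name() {
  printf 'frame_%04d.JPG' "$1"
}

# Preview a few names without touching any files.
for n in 0 7 123; do
  echo "$(frame_name "$n")"
done
```

The fixed width matters because the names then sort in shooting order and match a single numbered pattern.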

And the second one turns the frames into a video (run it from the new/ directory, where the renamed frames are):

make_video.sh

# setpts=3*PTS multiplies every timestamp by 3, so the video plays three times slower
ffmpeg -i frame_%04d.JPG -vcodec mpeg4 -b:v 800k -vf "setpts=3*PTS" video.avi
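To get a feeling for how long the result will be, here is a rough sketch of the arithmetic. It assumes ffmpeg reads the image sequence at its default rate of 25 frames per second (no -framerate option is given above) and that the setpts factor is 3; the function name estimate_seconds is made up for this example.

```shell
#!/usr/bin/env bash
# estimate_seconds is a hypothetical helper, not part of make_video.sh.
# Arguments: number of frames, setpts factor (default 3),
# input frame rate (default 25, assumed to be ffmpeg's default for image sequences).
estimate_seconds() {
  local frames=$1 factor=${2:-3} fps=${3:-25}
  # Integer division, so the result is rounded down to whole seconds.
  echo $(( frames * factor / fps ))
}

estimate_seconds 100   # 100 frames * 3 / 25 fps
```

So 100 photos give roughly 12 seconds of video with these settings; a smaller setpts factor makes the animation faster and shorter.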

That's it! Hooray!

Let's transfer some style!

Posted on Sat 13 February 2016 in fun • Tagged with deep, style, python

A couple of months ago I had some fun playing with style-transfer, an implementation of the "A Neural Algorithm of Artistic Style" paper by L. Gatys, A. Ecker, and M. Bethge that allows you to transfer the style of one image onto another while keeping the content of the input image.

So, here are some examples of the images I generated. Below each of them you'll see the original image and the image the style was transferred from.