Background modelling is a crucial step in background/foreground detection, which is used in video analysis tasks such as surveillance, people counting, face detection and pose estimation. Most existing methods require manually chosen hyperparameters or ground truth background masks (GT). In this work, we present an unsupervised deep background (BG) modelling method called BM-Unet, based on a generative architecture that, given a frame as input, generates the corresponding background image as output - more precisely, a probabilistic heat map of the colour values. Our method learns its parameters automatically, and an augmented version that utilises colour and intensity differences and optical flow between a reference and a target frame is robust to rapid illumination changes and camera jitter. Moreover, it can be applied to a new video sequence without requiring ground truth background/foreground masks for training. Experimental evaluations on challenging sequences of the SBMnet dataset demonstrate promising results compared to state-of-the-art methods.
Deep Refinement Convolutional Networks for Human Pose Estimation
Ioannis Marras, Petar Palasek, Ioannis Patras
This work introduces a novel Convolutional Network architecture (ConvNet) for the task of human pose estimation, that is, the localization of body joints in a single static image. The proposed coarse-to-fine architecture addresses shortcomings of the baseline architecture that stem from the fact that large inaccuracies of its coarse ConvNet cannot be corrected by the refinement ConvNet, which refines the estimation within small windows of the coarse prediction. This is achieved by a) changes in architectural parameters that both increase the accuracy of the coarse model and make the refinement model more capable of correcting the errors of the coarse model, b) the introduction of a Markov Random Field (MRF)-based spatial model network between the coarse and the refinement model that introduces geometric constraints, and c) a training scheme that adapts the data augmentation and the learning rate according to the difficulty of the data examples. The proposed architecture is trained in an end-to-end fashion. Experimental results show that the proposed method improves the baseline model and provides state-of-the-art results on the FashionPose [8] and MPII [1] benchmarks.
A couple of months ago I was asked by Lidija Matulin, an academy-trained artist from Croatia, to help her on a project that she wanted to work on. She had this idea to make paintings that would extend to an additional dimension and include sound. That sounded (tee-hee) cool to me so I decided to help! Yay!
First, here's one of the paintings Lidija made as part of this project, entitled "Vi niste pozvani pa izvolite sjesti" which translates roughly to "You're not invited so please have a seat":
The idea
We brainstormed how a painting could be transformed into sound and concluded that a simple way would be to have a line that scans through the painting and turns whatever it sees into sound. But how to turn colour information into sound? We tried a couple of approaches.
What we tried
The first approach was simply to take the average colour value of all the colours under the scanner line, turn it into greyscale, map this value to a frequency and produce a sine wave. Here I have to say thanks to Carl Bussey from Sheffield, who now lives in Berlin and writes software that makes music, because he showed me how to play sine waves in Python using pyaudio. Thanks Carl!
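If you're curious what that looks like in code, here's a minimal sketch of the idea (not the actual script we used): it assumes a 44.1 kHz sample rate, a linear mapping from the grey value to a 200-2000 Hz range, and a made-up helper called `column_to_sine`.

```python
import numpy as np
import pyaudio

RATE = 44100  # assumed sample rate

def column_to_sine(image, x, duration=0.5):
    """Average the colours under a vertical scanner line at column x,
    turn them into a single grey value and map it to a sine wave."""
    column = image[:, x, :].astype(np.float64)   # all pixels under the line
    grey = column.mean()                         # average colour -> grey value in [0, 255]
    freq = 200.0 + (grey / 255.0) * 1800.0       # assumed mapping: 200-2000 Hz
    t = np.linspace(0.0, duration, int(RATE * duration), endpoint=False)
    return 0.3 * np.sin(2.0 * np.pi * freq * t)  # keep the amplitude modest

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32, channels=1, rate=RATE, output=True)

# In practice you'd load the painting with OpenCV or PIL;
# a random H x W x 3 uint8 array stands in for it here.
image = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
for x in range(0, image.shape[1], 10):           # scan across the painting
    stream.write(column_to_sine(image, x).astype(np.float32).tobytes())

stream.stop_stream()
stream.close()
p.terminate()
```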
This first try didn't produce sounds that were too exciting so we had to make some changes. We split the scanner line into parts and had each part of the line make a sound on its own. This sounded a bit better.
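Roughly, that change looks like this: split the column of pixels into a few segments, make one sine wave per segment and mix them together. The segment count and the frequency mapping below are just example values, same as in the sketch above.

```python
import numpy as np

RATE = 44100  # same assumed sample rate as above

def column_to_chord(image, x, n_segments=5, duration=0.5):
    """Split the scanner line at column x into segments and mix one sine per segment."""
    column = image[:, x, :].astype(np.float64)
    segments = np.array_split(column, n_segments, axis=0)
    t = np.linspace(0.0, duration, int(RATE * duration), endpoint=False)
    mix = np.zeros_like(t)
    for segment in segments:
        grey = segment.mean()
        freq = 200.0 + (grey / 255.0) * 1800.0   # same assumed grey-to-frequency mapping
        mix += np.sin(2.0 * np.pi * freq * t)
    return 0.3 * mix / n_segments                # normalise so the mix doesn't clip
```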
The other thing that was annoying was that all of the sounds the program made always had the same length, which was dull. So, we made the length of the sound being played also depend on what was in the painting. Here we also decided that it would be better to have two different sources of sound from the painting: one from the background and one from the foreground. As the painting was now split into two different images, it was easy to make the sound of the foreground parts change their length depending on the area that was under each part of the scanner line as it scanned through the painting.
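A sketch of that length mapping, assuming the foreground is given as a binary mask and using made-up minimum and maximum durations:

```python
import numpy as np

def segment_durations(foreground_mask, x, n_segments=5,
                      min_duration=0.1, max_duration=1.0):
    """Map the amount of foreground under each segment of the scanner line at
    column x to a sound duration (foreground_mask is a binary H x W array)."""
    column = foreground_mask[:, x].astype(np.float64)
    segments = np.array_split(column, n_segments)
    durations = []
    for segment in segments:
        area = segment.mean()  # fraction of foreground pixels in this segment
        durations.append(min_duration + area * (max_duration - min_duration))
    return durations
```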
One more thing we did was to try colour representations other than the usual RGB before mapping the colour to the frequency of the sine wave being played. So we experimented with different combinations of hue, saturation and value from the HSV representation.
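For example, one such mapping could use only the hue of the average colour under the line; the helper name and the frequency range below are again just illustrative.

```python
import colorsys

def hue_to_frequency(image, x):
    """Use the hue of the average colour under the scanner line at column x
    instead of the grey value (one of several HSV-based mappings to try)."""
    r, g, b = image[:, x, :].mean(axis=0) / 255.0  # average colour, assuming RGB order
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return 200.0 + h * 1800.0                      # hue in [0, 1] -> 200-2000 Hz
```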
In the end we had a script written in Python, with a lot of parameters to play with, that produced different sounds for different things appearing in an image. Given a foreground and a background image, running the script would produce a video of a scanner line turning whatever it saw into sound.
So I gave this code to Lidija; she tried all kinds of different combinations of parameters and figured out what to paint to produce what kind of sound, which brings us to...
The final results
The final paintings were shown at an exhibition under the title "Jednosmjernom kartom, umrljanom marmeladom, kroz zečju rupu" / "With a one-way ticket, stained with marmalade, through the rabbit hole" in the Center for Culture in Čakovec, Croatia.