Deconvolution (Messages 13 to 22 of 22)
From: scott
Subject: Re: Deconvolution
Date: 10 May 2010 02:51:36
Message: <4be7acf8$1@news.povray.org>
>> The problem is when you apply a convolution to an image, you get a bigger 
>> image (the size of the original image + the size of the convolution 
>> window). The extra data around the edges is needed in order to do the 
>> deconvolution. Obviously in a photo you don't get this information.
>
> This merely means that you can't deconvolve the edges properly.

I thought about this some more and don't agree with you.  Take this 1D 
example:

Convolution kernel: {1,2,1}
Source image: {a,b,c,d,e,f,g,h}
Result: {12,12,12,12,12,12,12,12}

Can you figure out *any* of the pixels in the source image?

If you write out the convolution by hand, the first pixel gives you one 
equation with 3 unknowns - obviously you can't solve this.  Each additional 
pixel gives you one more equation and 1 more unknown.  You can never solve 
this!

In the mathematical world though, your source image is actually:

{0,0,...,0,0,a,b,c,d,e,f,g,h,0,0,...,0,0}

and when you apply your convolution you get edge data too.

With this extra information you can solve the deconvolution (because you 
know the source pixels are zero outside of the image).

IRL the source data is not zero outside of your photo.
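
To make the mathematical-world case concrete, here is a small sketch 
(Octave/MATLAB, with made-up source values) showing that given the *full*, 
zero-padded convolution, the source is pinned down exactly:

k = [1 2 1];
x = [3 5 1 4 2 6 0 3];          % hypothetical source image
y = conv(x, k);                 % full convolution, length 10: edge data included
n = numel(x);
A = zeros(n + 2, n);            % (n+2) x n convolution matrix
for i = 1:n
    A(i:i+2, i) = k';
end
x_rec = (A \ y')';              % 10 equations, 8 unknowns: exactly recoverable
disp(max(abs(x_rec - x)))       % ~0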



From: John VanSickle
Subject: Re: Deconvolution
Date: 12 May 2010 17:55:51
Message: <4beb23e7$1@news.povray.org>
Invisible wrote:
> Echo, focal blur and motion blur are all instances of convolution. That 
> means, hypothetically, that by applying a suitable *deconvolution*, you 
> should be able to get back the original signal.
> 
> Question: Is it actually feasible to do this in the real world?

Without artifacting?  No.

Now you could make a huge matrix (N x N, where N is the number of 
elements in the image to be sharpened) and solve it.  But for an image 
of, say, typical embedded YouTube size, you are looking at a matrix 
on the order of 80K x 80K elements.  Granted, most of the elements 
are zero, but it's still a headache.
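
As a rough sketch of the scale involved (Octave/MATLAB; the 320x240 frame 
and the row-only {1,2,1}/4 blur are assumptions for illustration), the 
matrix is at least representable sparsely:

w = 320; h = 240; N = w * h;    % N is about 77k pixels
% 1D blur along the flattened image as a sparse N x N tridiagonal matrix:
A = spdiags(repmat([1 2 1] / 4, N, 1), -1:1, N, N);
b = rand(N, 1);                 % stand-in for the blurred image, flattened
x = A \ b;                      % direct sparse solve
% A real 2D blur also couples neighbouring rows, adding more diagonals
% and making the solve considerably heavier.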

Regards,
John



From: Kevin Wampler
Subject: Re: Deconvolution
Date: 12 May 2010 18:27:15
Message: <4beb2b43$1@news.povray.org>
John VanSickle wrote:
> 
> Now you could make a huge matrix (N x N, where N is the number of 
> elements in the image to be sharpened) and solve it.  But for an image 
> of, say, typical embedded YouTube size, you are looking an matrix that 
> is on the order of 80K x 80k elements.  Granted, most of the elements 
> are zero, but it's still a headache.

In the particular case of deconvolution there are much more efficient 
algorithms available.  There are various techniques, but one insight that 
is commonly exploited is that convolution in the spatial domain is 
equivalent to multiplication in the Fourier domain.  Thus in the absence 
of noise (which is a *big* assumption) and ignoring image boundaries you 
could compute (non-blind) deconvolution by using FFT on the image and 
the kernel and dividing, which is a pretty efficient operation.

IIRC the approach above isn't used much in practice, largely because of 
the noise assumptions, but some of the more sophisticated methods still 
perform operations in frequency space for efficiency.
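
As a minimal check of the spatial/Fourier equivalence mentioned above 
(a sketch; the random signal and padded kernel are arbitrary), a cyclic 
convolution written out by hand matches the Fourier-domain product:

n = 8;
x = rand(1, n);
k = [1 2 1 zeros(1, n - 3)];        % kernel zero-padded to length n
y1 = real(ifft(fft(x) .* fft(k)));  % multiply in the Fourier domain
y2 = zeros(1, n);                   % direct cyclic convolution
for i = 1:n
    for j = 1:n
        y2(i) = y2(i) + x(j) * k(mod(i - j, n) + 1);
    end
end
disp(max(abs(y1 - y2)))             % ~0 up to rounding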



From: scott
Subject: Re: Deconvolution
Date: 13 May 2010 02:56:32
Message: <4beba2a0$1@news.povray.org>
> Thus in the absence of noise (which is a *big* assumption) and ignoring 
> image boundaries you could compute (non-blind) deconvolution by using FFT 
> on the image and the kernel and dividing, which is a pretty efficient 
> operation.

The problem is that the convolved version you have (the photo) is not made 
of complex numbers (which it would be if you did the convolution yourself). 
You need this in order to do the deconvolution; otherwise you don't have 
enough information.  See my 1D example in the other post.



From: Kevin Wampler
Subject: Re: Deconvolution
Date: 13 May 2010 09:42:38
Message: <4bec01ce$1@news.povray.org>
scott wrote:
>> Thus in the absence of noise (which is a *big* assumption) and 
>> ignoring image boundaries you could compute (non-blind) deconvolution 
>> by using FFT on the image and the kernel and dividing, which is a 
>> pretty efficient operation.
> 
> The problem is that the convolved version you have (the photo) is not made 
> of complex numbers (which it would be if you did the convolution 
> yourself). You need this in order to do the deconvolution; otherwise you 
> don't have enough information.  See my 1D example in the other post.


I'm a bit confused by what you mean here.  If both the image and the 
kernel are real then complex numbers would only be used in the frequency 
space versions of the images; once you apply the IFFT, everything should 
be real again.  I don't see how you'd need a (non-trivial) complex 
version of the image anywhere.

Your 1D example seemed to be about the image boundaries rather than the 
image being represented by real/complex numbers.  The point about the 
image boundaries is true, and will prevent an exact deconvolution as you 
pointed out.  You can still often get a good approximate solution by 
either assuming boundary conditions or using an image prior.  I 
believe the latter of these is what's generally used in practice for 
image deblurring.

In the case of the most basic FFT method for convolution, the boundary 
conditions for the image would be implicitly cyclic.  I should point out 
that (IIRC) the naive deconvolution method of x=ifft(fft(y)/fft(k)) 
generally doesn't work well in practice, but the idea of using the 
Fourier transform to accelerate the computations still carries across to 
more sophisticated approaches.
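
To see why the naive version breaks down, here is a small sketch 
(Octave/MATLAB; the signal, kernel, and noise level are all made up): 
frequencies the kernel suppressed get divided by near-zero numbers, so 
the noise at those frequencies explodes.

n = 64;
x = double(mod(1:n, 7) > 3);        % toy source signal
k = exp(-((0:n-1) - n/2).^2 / 8);   % Gaussian blur kernel
k = k / sum(k);
y = real(ifft(fft(x) .* fft(k)));   % cyclic blur
yn = y + 1e-3 * randn(1, n);        % tiny additive noise
xr = real(ifft(fft(yn) ./ fft(k))); % naive deconvolution
disp(max(abs(xr - x)))              % enormous error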

Unless I'm missing something somewhere?



From: scott
Subject: Re: Deconvolution
Date: 17 May 2010 02:55:23
Message: <4bf0e85b@news.povray.org>
> Your 1D example seemed to be about the image boundaries rather than the 
> image being represented by real/complex numbers.  The point about the 
> image boundaries is true, and will prevent an exact deconvolution as you 
> pointed out.  You can still often get a good approximate solution by 
> either assuming boundary conditions or using an image prior.  I 
> believe the latter of these is what's generally used in practice for 
> image deblurring.

The example I was referring to was this one:

Convolution kernel: {1,2,1}
Source image: {a,b,c,d,e,f,g,h}
Result: {12,12,12,12,12,12,12,12}

What is the source image?



From: Kevin Wampler
Subject: Re: Deconvolution
Date: 17 May 2010 15:51:48
Message: <4bf19e54@news.povray.org>
scott wrote:
>> Your 1D example seemed to be about the image boundaries rather than 
>> the image being represented by real/complex numbers.  The point about 
>> the image boundaries is true, and will prevent an exact deconvolution 
>> as you pointed out.  You can still often get a good approximate 
>> solution by either assuming boundary conditions or using an image 
>> prior.  I believe the latter of these is what's generally used in 
>> practice for image deblurring.
> 
> The example I was referring to was this one:
> 
> Convolution kernel: {1,2,1}
> Source image: {a,b,c,d,e,f,g,h}
> Result: {12,12,12,12,12,12,12,12}
> 
> What is the source image?
> 


I think my last post was not clear enough.  I'm agreeing with you that 
you can't exactly recover the source image in your example.  What I'm 
disagreeing with are two things:

1) This has anything to do with complex numbers, even if you use the FFT 
method to recover the source image.

2) This prevents you from getting an approximate answer in practice.

Both of these assertions can be tested pretty easily, so I'll treat them 
in turn.  Also, sorry for the long post, but it seemed useful to go into 
a fair bit of detail.


---------- (1) complex numbers aren't the issue

First of all, let's try the FFT method for solving for the source image. 
This code should run in both MATLAB and Octave:

n = 8; % size of images
k = [1 2 1]; % kernel
k2 = [zeros(1,ceil((n-3)/2)) k zeros(1,floor((n-3)/2))]; % padded kernel
y = 12 + zeros(1,n); % result image

disp('source image (FFT method)');
ifft(fft(y)./(fft(k2)+1e-8)) % +1e-8 is a regularization hack

 >>>>>> result:

{3 3 3 3 3 3 3 3}

So despite not having any complex numbers in "the photo" (i.e. the 
result image) the method worked just fine and gave us a pretty 
reasonable looking result.

"But there's not enough information to derive a result!" I imagine 
you're thinking.  This is correct, so clearly something was happening 
behind the scenes that allows the method to actually compute a source 
image.  The answer is that the FFT method makes an implicit assumption 
about what happens beyond the boundaries of the result image by treating 
the image as cyclic.  Thus the source image isn't just 
{a,b,c,d,e,f,g,h}, it's actually treated as 
{...,f,g,h,a,b,c,d,e,f,g,h,a,b,c,...}, which is what allows this method 
to work.
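
As a sanity check (reusing k2 from the snippet above), the recovered image 
convolves back to the result image under the cyclic model:

x = 3 * ones(1, 8);
real(ifft(fft(x) .* fft(k2)))   % gives {12 12 12 12 12 12 12 12}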

This is the point I was making earlier: one way to solve the problem 
is to make an assumption about what happens beyond the boundaries of the 
image.  A natural next question to ask is what sort of errors this 
assumption might introduce into the reconstructed source image.


---------- (2) approximate solutions are possible

Let's look at the source images reconstructed under three different 
"reasonable" assumptions about what happens beyond the boundaries of the 
image:

cyclic: {...,f,g,h,a,b,c,d,e,f,g,h,a,b,c,...}

zero:  {...,0,0,0,a,b,c,d,e,f,g,h,0,0,0,...}

full-intensity: {...,12,12,12,a,b,c,d,e,f,g,h,12,12,12,...}

For each case the resulting source images are:

cyclic-source: {3 3 3 3 3 3 3 3}

zero-source: {5.3333, 1.3333, 4.0000, 2.6667, 2.6667, 4.0000, 1.3333, 5.3333}

full-intensity-source: {4.5556, 1.8889, 3.6667, 2.7778, 2.7778, 3.6667, 1.8889, 4.5556}

Clearly the different assumptions have given rise to different 
reconstructed images.  Nevertheless, notice that the center pixels in 
each of the three cases take the values of 3.0, 2.67, and 2.78.  Despite 
the fact that the assumptions for what happened beyond the boundary 
range from 0 to 12, the value of the center pixels only ranges from 2.67 
to 3.0 -- a factor of 36 less.

Basically, an incorrect assumption about what happens beyond the edges 
of the image resulted in errors concentrated at the edges of the 
reconstructed image, so in the center of the source image the FFT method 
can actually give a pretty accurate answer.  For a larger image 
(and the same kernel) the center values will be even more accurate.
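
For reference, here is a small sketch that reproduces the zero-source 
numbers above by building the convolution matrix (with zeros assumed 
outside the image) and solving it directly:

n = 8; k = [1 2 1];
y = 12 * ones(n, 1);            % result image as a column vector
A = zeros(n);                   % n x n convolution matrix, zero boundary
for i = 1:n
    for j = max(1, i-1):min(n, i+1)
        A(i, j) = k(j - i + 2);
    end
end
x = A \ y                       % {5.3333, 1.3333, 4.0000, 2.6667, ...}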


---------- side notes

I want to emphasize that the deblurring method I gave is only to 
illustrate the basic theoretical principles behind using FFT to 
accelerate the deconvolution process.  In practice inaccurate kernels 
and image noise will result in a pretty poor result if you just use the 
method I gave.  However, people can and have developed pretty good 
deblurring methods which are more sophisticated and do work well in 
practice.  In many (but not all) of these approaches, using the FFT and 
performing some operations in frequency space is very useful for 
speeding up the algorithm.



From: scott
Subject: Re: Deconvolution
Date: 18 May 2010 02:41:07
Message: <4bf23683@news.povray.org>
> 1) This has anything to do with complex numbers, even if you use the FFT 
> method to recover the source image.

Agreed, I incorrectly thought that if you had some complex version of the 
resultant you would have more information to regenerate the source, but with 
a real source image and a real kernel that is obviously not possible.

> 2) This prevents you from getting an approximate answer in practice.

> >>>>>> result:
>
> {3 3 3 3 3 3 3 3}
>
> cyclic-source: {3 3 3 3 3 3 3 3}
>
> zero-source: {5.3333, 1.3333, 4.0000, 2.6667, 2.6667, 4.0000, 1.3333, 
> 5.3333}
>
> full-intensity-source: {4.5556, 1.8889, 3.6667, 2.7778, 2.7778, 3.6667, 
> 1.8889, 4.5556}

My point was that the following are all valid solutions to the source image:

{3,3,3,3,3,3,3,3}
{2,4,2,4,2,4,2,4}
{1,5,1,5,1,5,1,5}
{6,0,6,0,6,0,6,0}

In this case even the centre pixels could be anywhere from 0 to 6, and the 
"appearance" of the image is radically different in each case.  Surely this 
is exactly the sort of detail a "deblurring" filter needs to recover, but it 
seems impossible.
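
Each of these can be checked quickly in Octave/MATLAB under the cyclic 
model from before:

k2 = [0 0 0 1 2 1 0 0];
for x = {[3 3 3 3 3 3 3 3], [2 4 2 4 2 4 2 4], ...
         [1 5 1 5 1 5 1 5], [6 0 6 0 6 0 6 0]}
    disp(real(ifft(fft(x{1}) .* fft(k2))))  % {12 12 12 12 12 12 12 12} each time
end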



From: Kevin Wampler
Subject: Re: Deconvolution
Date: 18 May 2010 14:43:03
Message: <4bf2dfb7@news.povray.org>
scott wrote:
> 
> My point was that the following are all valid solutions to the source 
> image:
> 
> {3,3,3,3,3,3,3,3}
> {2,4,2,4,2,4,2,4}
> {1,5,1,5,1,5,1,5}
> {6,0,6,0,6,0,6,0}
> 
> In this case even the centre pixels could be anywhere from 0 to 6, and 
> the "appearance" of the image is radically different in each case.  
> Surely this is exactly the sort of detail a "deblurring" filter needs to 
> recover, but it seems impossible.

Ok, I think I better understand what you were asking now.  To answer it 
I think I should get into some stuff I had alluded to earlier (i.e. "use 
an image prior").  This will probably be another semi-lengthy post, and 
I'll break it into two parts:

1) Looking at your example in the frequency domain to get a good 
theoretical grasp of what's happening.

2) Some discussion of the practical (as opposed to theoretical) issues 
and solutions.

---------- 1) theoretical discussion

Throughout this section I'll be assuming that the image is "cyclic" so 
that it's easy to analyze it in the frequency domain.  Fortunately your 
examples above still work in a cyclic image.

As you've noted, it's possible to reconstruct multiple source images in 
your example.  The reason for this is that you happened to choose an 
"unlucky" kernel.  The reason why it's unlucky can be seen pretty easily 
by looking at the kernel in frequency space.

In using the FFT on the kernel, I had to pad the kernel to make it the 
same size as the image, yielding the padded kernel:

k = {0, 0, 0, 1, 2, 1, 0, 0}

If we now calculate abs(fft(k)), we get:

kf = {4.0, 3.4, 2.0, 0.6, 0.0, 0.6, 2.0, 3.4}

The key thing to notice here is that this contains an element which is 
zero.  This means that the convolution kernel will "kill" certain 
frequencies in the source image.  Basically, we'll get the same result 
image for any source-image + c*ifft([0 0 0 0 1 0 0 0]). Unsurprisingly, 
if we calculate 8*ifft([0 0 0 0 1 0 0 0]) we get:

{1, -1, 1, -1, 1, -1, 1, -1}

This perfectly explains the nature of the examples you gave, which all 
convolve to the same result image.  IIRC most kernels in practice won't 
contain exact zeros like this, and it turns out that even a slightly 
different version of your kernel wouldn't have had this problem:

abs(fft([0 0 0 0.01 1.98 1.01 0 0]))
   = {4.0, 3.4, 1.98, 0.55, 0.04, 0.55, 1.98, 3.4}

This trick also wouldn't work on images which are an odd number of pixels 
across (still assuming a cyclic image of course), which is possibly why you 
chose an image that was an even number of pixels across.
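
As a quick check of all this (the 5*v multiplier is arbitrary):

k = [0 0 0 1 2 1 0 0];
disp(abs(fft(k)))                       % {4.0, 3.4, 2.0, 0.6, 0, 0.6, 2.0, 3.4}
v = real(8 * ifft([0 0 0 0 1 0 0 0]));  % {1, -1, 1, -1, 1, -1, 1, -1}
x = 3 * ones(1, 8);
% Adding any multiple of v to the source leaves the cyclic result unchanged:
disp(max(abs(ifft(fft(x + 5*v) .* fft(k)) - ifft(fft(x) .* fft(k)))))  % ~0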

In the code snippet I gave to deconvolve your example, I dealt with this 
issue in a very simple way:

ifft(fft(y)./(fft(k2)+1e-8))

It's the +1e-8 that was the key here, and in your example it's necessary 
to prevent a divide by zero.  The effect of this is that the 
deconvolution didn't "hallucinate" any frequencies that weren't in the 
result image, even if they would technically also be correct 
deconvolutions.  This is why it gave the {3, 3, 3, 3, 3, 3, 3, 3} answer 
rather than one of the others you gave.


---------- 2) practical discussion

As far as I'm aware in practice having a kernel like yours which happens 
to kill certain "unlucky" frequencies isn't really an issue.  What can 
be an issue is that your kernel will suppress high frequencies to the 
point that they fall below the accuracy of your camera sensor (or even 
numerical precision).  This is why undoing a depth of field blur won't 
always be possible (even if everything is at the same depth), since the 
blur kernel in that case heavily suppresses high frequencies.  This 
means that only a limited amount of deblurring is possible (as an aside, 
there's also been work on designing camera apertures so as to minimize 
this).
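
To illustrate with a 1D stand-in for a focal-blur kernel (sizes arbitrary):

n = 64;
k = [ones(1, 9) zeros(1, n - 9)] / 9;   % 9-pixel box blur, zero-padded
disp(min(abs(fft(k))))                  % ~0.02: those frequencies are nearly
                                        % destroyed, so sensor noise swamps them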

Fortunately in the case of blur from camera shake the kernel is normally 
a bit more kind and doesn't destroy high frequencies quite so much. 
What is a much bigger issue is that you don't know what the blur kernel 
was, and getting it wrong can induce artifacts into the image.

Fortunately, we can do quite a bit by assuming a good image prior.  The 
artifacts produced by an incorrect deconvolution tend to pretty 
obviously not look like any "real" image.  There are some examples with a 
good explanation here:

http://www.mathworks.com/products/demos/image/ipexblind/ipexblind.html

The best image deblurring algorithms I know use this insight pretty 
heavily, and optimize a deconvolution by comparing the statistics of 
the deconvolved image to the expected statistics of "real" images. 
It turns out that even pretty weak models of what real images look 
like can help quite a bit and often give surprisingly good results.

In some sense the simple deconvolution method I gave was also implicitly 
using an image prior which favors smoother images (for a very particular 
definition of smoothness).
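
For what it's worth, here is a minimal sketch of that implicit-prior idea 
as a Wiener-style regularized deconvolution (lambda is a made-up tuning 
constant, and this is not the sophisticated prior-based method described 
above):

n = 8;
k2 = [0 0 0 1 2 1 0 0];
y = 12 * ones(1, n);
lambda = 0.01;                  % strength of the smoothness preference
Kf = fft(k2); Yf = fft(y);
Xf = conj(Kf) .* Yf ./ (abs(Kf).^2 + lambda);
x = real(ifft(Xf))              % close to {3 3 3 3 3 3 3 3}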

Did this answer your questions?



From: scott
Subject: Re: Deconvolution
Date: 19 May 2010 03:02:55
Message: <4bf38d1f@news.povray.org>
> If we now calculate abs(fft(k)), we get:
>
> kf = {4.0, 3.4, 2.0, 0.6, 0.0, 0.6, 2.0, 3.4}
>
> The key thing to notice here is that this contains an element which is 
> zero.  This means that the convolution kernel will "kill" certain 
> frequencies in the source image.

OK that makes sense.

> As far as I'm aware in practice having a kernel like yours which happens 
> to kill certain "unlucky" frequencies isn't really an issue.  What can be 
> an issue is that your kernel will suppress high frequencies to the point 
> that they fall below the accuracy of your camera sensor (or even numerical 
> precision).

Yes, this was my expectation of what would happen in a practical situation. 
I would imagine a focal-blur kernel would suppress the high frequencies, 
such that the information was either impossible to recover (due to noise 
levels) or what was recoverable would be heavily aliased due to the limited 
precision.

> This is why undoing a depth of field blur won't always be possible (even 
> if everything is at the same depth),

This was my initial thought in response to Andrew's original statement that 
you could get back the original signal because it was a convolution.

> Fortunately in the case of blur from camera shake the kernel is normally a 
> bit more kind and doesn't destroy high frequencies quite so much. What is 
> a much bigger issue is that you don't know what the blur kernel was, and 
> getting it wrong can induce artifacts into the image.

Yes, often you can even see a bright light source in the resultant image and 
how it has moved (maybe a zig-zag pattern due to camera shake) - you could 
probably use this as a basis for the kernel, maybe optimising the kernel 
until the resultant image had the light source as small as possible.

> Did this answer your questions?

Yes thanks!


