|
|
|
|
|
|
| |
| |
|
|
|
|
| |
| |
|
|
scott wrote:
>> No. Only if everything in the frame is at approximately the same focal
>> distance does it approximate a 2D convolution.
>
> I doubt that happens very often, usually the camera gets a point in
> focus you didn't intend (and the bit you did intend is then out of focus).
No, as demonstrated, the problem I usually have is that my camera's
optics physically can't focus near enough. That blurry image I posted is
of a flat piece of ground. The focal depth is probably near-identical
everywhere. (Except that I think the ground plane might be tilted
relative to the camera...)
> But is it accurate enough to visibly improve the sharpness of an image?
> Judging by all the software I've seen that claims to do this, usually not.
OK. That's kind of what I was asking. DSP theory says it can be done,
but are the results worth it?
(DSP theory also says that a perfect deconvolution would require
infinity gain at certain frequencies. You just *know* that's not going
to work well in the Real World.)
>> This merely means that you can't deconvolve the edges properly.
>
> True, more precisely a region half the size of the convolution kernel
> along each edge.
For the image I'm thinking of, that wouldn't matter. The bit I want is
the person's head in the center.
>> A much bigger problem is figuring out what the hell the convolution
>> kernel might have been, given only the blurred image...
>
> Indeed, and it might not be constant for every pixel.
I suppose if it changes sufficiently slowly, it might still be possible
to estimate... maybe...
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
> (Except that I think the ground plane might be tilted relative to the
> camera...)
So each point will have a different convolution kernel...
Maybe if you thought about the optics of your camera you could derive an
approximate kernel as a function of depth and focal distance, you could then
apply this for each pixel using some trial&error method to get the sharpest
image (or some other method).
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
>> The problem is when you apply a convolution to an image, you get a bigger
>> image (the size of the original image + the size of the convolution
>> window). The extra data around the edges is needed in order to do the
>> deconvolution. Obviously in a photo you don't get this information.
>
> This merely means that you can't deconvolve the edges properly.
I thought about this some more and don't agree with you. Take this 1D
example:
Convolution kernel: {1,2,1}
Source image: {a,b,c,d,e,f,g,h}
Result: {12,12,12,12,12,12,12,12}
Can you figure out *any* of the pixels in the source image?
If you write out the convolution by hand, the first pixel gives you one
equation with 3 unknowns - obviously you can't solve this. Each additional
pixel gives you one more equation and 1 more unknown. You can never solve
this!
In the mathematical world though, your source image is actually:
{0,0,...,0,0,a,b,c,d,e,f,g,h,0,0,...,0,0}
and when you apply your convolution you get edge data too.
With this extra information you can solve the deconvolution (because you
know the source pixels are zero outside of the image).
IRL the source data is not zero outside of your photo.
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Invisible wrote:
> Echo, focal blur and motion blur are all instances of convolution. That
> means, hypothetically, that by applying a suitable *deconvolution*, you
> should be able to get back the original signal.
>
> Question: Is it actually feasible to do this in the real world?
Without artifacting? No.
Now you could make a huge matrix (N x N, where N is the number of
elements in the image to be sharpened) and solve it. But for an image
of, say, typical embedded YouTube size, you are looking an matrix that
is on the order of 80K x 80k elements. Granted, most of the elements
are zero, but it's still a headache.
Regards,
John
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
John VanSickle wrote:
>
> Now you could make a huge matrix (N x N, where N is the number of
> elements in the image to be sharpened) and solve it. But for an image
> of, say, typical embedded YouTube size, you are looking an matrix that
> is on the order of 80K x 80k elements. Granted, most of the elements
> are zero, but it's still a headache.
In the particular case of deconvolution there are much more efficient
algorithms available. There's various techniques, but one insight that
is commonly exploited is that convolution in the spatial domain is
equivalent to multiplication in the Fourier domain. This in the absence
of noise (which is a *big* assumption) and ignoring image boundaries you
could compute (non-blind) deconvolution by using FFT on the image and
the kernel and dividing, which is a pretty efficient operation.
IIRC the approach above isn't used much in practice, largely because of
the noise assumptions, but some of the more sophisticated methods still
perform operations in frequency space for efficiency.
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
> This in the absence of noise (which is a *big* assumption) and ignoring
> image boundaries you could compute (non-blind) deconvolution by using FFT
> on the image and the kernel and dividing, which is a pretty efficient
> operation.
The problem is the convoluted version you have "the photo" is not made of
complex numbers (which it would be if you did the convolution yourself).
You need this in order to do the deconvolution, otherwise you don't have
enough information. See my 1D example in the other post.
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
scott wrote:
>> This in the absence of noise (which is a *big* assumption) and
>> ignoring image boundaries you could compute (non-blind) deconvolution
>> by using FFT on the image and the kernel and dividing, which is a
>> pretty efficient operation.
>
> The problem is the convoluted version you have "the photo" is not made
> of complex numbers (which it would be if you did the convolution
> yourself). You need this in order to do the deconvolution, otherwise you
> don't have enough information. See my 1D example in the other post.
I'm a bit confused by what you mean here. If both the image and the
kernel are real then complex numbers would only be used in the frequency
space versions of the images, once you apply the IFFT everything should
be real again. I don't see how you'd need a (non-trivial) complex
version of the image anywhere.
Your 1D example seemed to be about the image boundaries rather than the
image being represented by real/complex numbers. The point about the
image boundaries is true, and will prevent an exact deconvolution as you
pointed out. You can still often get a good approximate solution by
either assuming boundary conditions or by using an image prior. I
believe the latter is these is what's generally used in practice for
image deblurring.
In the case of the most basic FFT method for convolution, the boundary
conditions for the image would be implicitly cyclic. I should point out
that (IIRC) the naive deconvolution method of x=ifft(fft(y)/fft(k))
generally doesn't work well in practice, but the idea of using the
Fourier transform to accelerate the computations still carries across to
more sophisticated approaches.
Unless I'm missing something somewhere?
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
> Your 1D example seemed to be about the image boundaries rather than the
> image being represented by real/complex numbers. The point about the
> image boundaries is true, and will prevent an exact deconvolution as you
> pointed out. You can still often get a good approximate solution by
> either assuming boundary conditions or by using an image prior. I
> believe the latter is these is what's generally used in practice for
> image deblurring.
The example I was referring to was this one:
Convolution kernel: {1,2,1}
Source image: {a,b,c,d,e,f,g,h}
Result: {12,12,12,12,12,12,12,12}
What is the source image?
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
scott wrote:
>> Your 1D example seemed to be about the image boundaries rather than
>> the image being represented by real/complex numbers. The point about
>> the image boundaries is true, and will prevent an exact deconvolution
>> as you pointed out. You can still often get a good approximate
>> solution by either assuming boundary conditions or by using an image
>> prior. I believe the latter is these is what's generally used in
>> practice for image deblurring.
>
> The example I was referring to was this one:
>
> Convolution kernel: {1,2,1}
> Source image: {a,b,c,d,e,f,g,h}
> Result: {12,12,12,12,12,12,12,12}
>
> What is the source image?
>
I think my last post was not clear enough. I'm agreeing with you that
you can't exactly recover the source image in your example. What I'm
disagreeing with are two things:
1) This has anything to do with complex numbers, even if you use the FFT
method to recover the source image.
2) This prevents you from getting an approximate answer in practice.
Both of these assertions can be tested pretty easily, so I'll treat them
in turn. Also, sorry for the long post, but it seemed useful to go into
a fair bit of detail.
---------- (1) complex numbers aren't the issue
First of all, let's try the FFT method for solving for the source image.
This code should run in both matlab and octave:
n = 8; % size of images
k = [1 2 1]; % kernel
k2 = [zeros(1,ceil((n-3)/2)) k zeros(1,floor((n-3)/2))]; % padded kernel
y = 12 + zeros(1,n); % result image
disp('source image (FFT method)');
ifft(fft(y)./(fft(k2)+1e-8)) % +1e-8 is a regularization hack
>>>>>> result:
{3 3 3 3 3 3 3 3}
So despite not having any complex numbers in "the photo" (i.e. the
result image) the method worked just fine and gave us a pretty
reasonable looking result.
"But there's not enough information to derive a result!" I imagine
you're thinking. This is correct, so clearly something was happening
behind the scenes that allows the method to actually compute a source
image. The answer is that the FFT method makes an implicit assumption
about what happens beyond the boundaries of the result image by treating
the image as cyclic. Thus the source image isn't just
{a,b,c,d,e,f,g,h}, it's actually treated as
{...,f,g,h,a,b,c,d,e,f,g,h,a,b,c,...}, which is what allows this method
to work.
This is the point I was making earlier that one way to solve the problem
is to make an assumption about what happens beyond the boundaries of the
image. A natural next question to ask is what sort of errors this
outright assumption might give in the reconstructed source image.
---------- (2) approximate solutions are possible
Let's look at the source images reconstructed under three different
"reasonable" assumptions about what happens beyond the boundaries of the
image:
cyclic: {...,f,g,h,a,b,c,d,e,f,g,h,a,b,c,...}
zero: {...,0,0,0,a,b,c,d,e,f,g,h,0,0,0,...}
full-intensity: {...,12,12,12,a,b,c,d,e,f,g,h,12,12,12,...}
For each case the resulting source images are:
cyclic-source: {3 3 3 3 3 3 3 3}
zero-source: {5.3333, 1.3333, 4.0000, 2.6667, 2.6667, 4.0000, 1.3333,
5.3333}
full-intensity-source: {4.5556, 1.8889, 3.6667, 2.7778, 2.7778, 3.6667,
1.8889, 4.5556}
Clearly the different assumptions have given rise to different
reconstructed images. Nevertheless, notice that the center pixels in
each of the three cases take the values of 3.0, 2.67, and 2.78. Despite
the fact that the assumptions for what happened beyond the boundary
range from 0 to 12, the value of the center pixels only ranges from 2.67
to 3.0 -- a factor of 36 less.
Basically, an incorrect assumption about what happens beyond the edges
of the image resulted in errors concentrated at the edges of the
reconstructed image, so in the center of the source image the FFT method
actually can actually give a pretty accurate answer. For a larger image
(and the same kernel) the center values will be even more accurate.
---------- side notes
I want to emphasize that the deblurring method I gave is only to
illustrate the basic theoretical principles behind using FFT to
accelerate the deconvolution process. In practice inaccurate kernels
and image noise will result in a pretty poor result if you just use the
method I gave. However, people can and have developed pretty good
deblurring methods which are more sophisticated and do work well in
practice. In many (but not all) of these approaches, using the FFT and
performing some operations in frequency space is very useful for
speeding up the algorithm.
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
> 1) This has anything to do with complex numbers, even if you use the FFT
> method to recover the source image.
Agreed, I incorrectly thought that if you had some complex version of the
resultant you would have more information to regenerate the source, but with
real source image and real kernel that is obviously not possible.
> 2) This prevents you from getting an approximate answer in practice.
> >>>>>> result:
>
> {3 3 3 3 3 3 3 3}
>
> cyclic-source: {3 3 3 3 3 3 3 3}
>
> zero-source: {5.3333, 1.3333, 4.0000, 2.6667, 2.6667, 4.0000, 1.3333,
> 5.3333}
>
> full-intensity-source: {4.5556, 1.8889, 3.6667, 2.7778, 2.7778, 3.6667,
> 1.8889, 4.5556}
My point was that the following are all valid solutions to the source image:
{3,3,3,3,3,3,3,3}
{2,4,2,4,2,4,2,4}
{1,5,1,5,1,5,1,5}
{6,0,6,0,6,0,6,0}
In this case even the centre pixels could be anywhere from 0 to 6, and the
"appearance" of the image is radically different in each case. Surely this
is exactly the sort of detail a "deblurring" filter needs to recover, but it
seems impossible.
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
|
|