Francois LE COAT <lec### [at] atari org> wrote:
> I have the first reference image and the second one, from which I
> determine the optical-flow, that is, every pixel's integer displacement
> from one image to the other. That gives a vector field which can be
> approximated by a global projective transformation: eight parameters in
> rotation and translation that best match the two images. The quality of
> the match is measured with the correlation between the first image and
> the second one, (projectively) transformed.
So, you have this vector field that sort of maps the pixels in the second
frame to the pixels in the first frame.
And then somehow the projection matrix is calculated from that.
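For concreteness, here is a minimal sketch of how I picture such a pipeline with OpenCV (my own guess at the steps, not your actual code); the filenames and the sampling step are invented:

import cv2
import numpy as np

prev = cv2.imread("frame_0001.png", cv2.IMREAD_GRAYSCALE)  # hypothetical filenames
curr = cv2.imread("frame_0002.png", cv2.IMREAD_GRAYSCALE)

# dense optical flow: per-pixel (dx, dy) displacement from prev to curr
dis = cv2.DISOpticalFlow_create(cv2.DISOPTICAL_FLOW_PRESET_MEDIUM)
flow = dis.calc(prev, curr, None)

# subsample the field and fit one global projective transform (8 parameters)
h, w = prev.shape
ys, xs = np.mgrid[0:h:8, 0:w:8]
pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)
dxy = flow[ys.ravel(), xs.ravel()]
H, inliers = cv2.findHomography(pts, pts + dxy, cv2.RANSAC, 3.0)

# quality of the match: correlation between frame 1 and frame 2 warped back
warped = cv2.warpPerspective(curr, np.linalg.inv(H), (w, h))
score = np.corrcoef(prev.ravel().astype(float), warped.ravel().astype(float))[0, 1]
print("correlation:", score)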
> The disparity field, which is the integer displacement between images, is
> linked to depth in the image (the 3rd dimension) by an inverse relation
> (depth = base / disparity). That means we can evaluate the image's depth
> from a continuous video stream.
Presumably when an object gets closer, the image gets "scaled up" in the frame,
and you use that to calculate the distance of the object from the camera.
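That inverse relation is easy to illustrate numerically; a toy example with an arbitrary baseline, just to show the shape of the relation (and, reusing the flow field from the sketch above, how it could give a rough relative depth map):

import numpy as np

base = 1.0                                  # whatever baseline the camera motion provides
disparity = np.array([8.0, 4.0, 2.0, 1.0])  # pixel displacement: near objects move more
depth = base / disparity                    # -> [0.125, 0.25, 0.5, 1.0]: small shift = far away

# applied to a dense flow field (see the sketch above), up to an unknown scale:
# depth_map = base / np.maximum(np.linalg.norm(flow, axis=2), 1e-3)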
> > I'm also wondering if you could create a 3D rendering with the data you're
> > extracting, and maybe a 2D orthographic overhead map of the scene that the
> > drones are flying through, mapping the position of the drones in the forest.
> Nothing is so perfect yet that we can do what you're wishing for, given
> the state of the art...
Well, I'm just thinking that you must have an approximate idea of where each
tree is, given that you calculate a projection matrix and know something about
the depth. So I was just wondering if, given that information, you could simply
place a cylinder of the approximate diameter and at the right depth.
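Something like this is what I have in mind: a purely hypothetical sketch that writes a small .pov overhead map, assuming one already had a list of estimated tree positions and radii (the numbers below are made up):

# write a minimal POV-Ray scene: one cylinder per estimated tree, seen from above
trees = [(-2.0, 5.0, 0.15), (1.2, 7.5, 0.20), (0.4, 11.0, 0.18)]  # (x, z, radius), invented values

with open("forest_map.pov", "w") as f:
    f.write("camera { orthographic location <0, 30, 6> look_at <0, 0, 6> }\n")
    f.write("light_source { <0, 50, 0> color rgb <1, 1, 1> }\n")
    f.write("plane { y, 0 pigment { rgb <0.2, 0.5, 0.2> } }\n")
    for x, z, r in trees:
        f.write(f"cylinder {{ <{x}, 0, {z}>, <{x}, 4, {z}>, {r} "
                f"pigment {{ rgb <0.45, 0.3, 0.2> }} }}\n")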
> It has been a long effort before I could show some promising results. Now
> I'm happy to share it with a larger audience. Unfortunately, all the good
> people who worked with me are not here to appreciate it. That's why I'm
> surprised by your interest =)
Well, I have been interested in the fundamentals of photogrammetry for quite
some time. And I have been following your work for the last several years,
hoping to learn how to create such projection matrices and apply them.
https://news.povray.org/povray.advanced-users/thread/%3C5be592ea%241%40news.povray.org%3E/
I don't work in academia or the graphics industry, so I only have whatever free
time I can devote to learning this stuff on my own.
Even if I were to simply use a POV-Ray scene, where I rendered two images with
different camera locations, then I'm assuming that I could calculate a vector
field and a projection matrix (something simple like cubes, spheres, and
cylinders).
Given the projection matrix and one of the two renders, would I then have the
necessary and sufficient information to write a .pov scene to recreate the
render from scratch?
- BW
Hi,
Bald Eagle writes:
> Francois LE COAT wrote:
>> I have the first reference image and the second one, from which I
>> determine the optical-flow, that is, every pixel's integer displacement
>> from one image to the other. That gives a vector field which can be
>> approximated by a global projective transformation: eight parameters in
>> rotation and translation that best match the two images. The quality of
>> the match is measured with the correlation between the first image and
>> the second one, (projectively) transformed.
>
> So, you have this vector field that sort of maps the pixels in the second
> frame to the pixels in the first frame.
> And then somehow the projection matrix is calculated from that.
>
>> The disparity field, which is the integer displacement between images, is
>> linked to depth in the image (the 3rd dimension) by an inverse relation
>> (depth = base / disparity). That means we can evaluate the image's depth
>> from a continuous video stream.
>
> Presumably when an object gets closer, the image gets "scaled up" in the frame,
> and you use that to calculate the distance of the object from the camera.
>
>>> I'm also wondering if you could create a 3D rendering with the data you're
>>> extracting, and maybe a 2D orthographic overhead map of the scene that the
>>> drones are flying through, mapping the position of the drones in the forest.
>
>> Nothing is so perfect yet that we can do what you're wishing for, given
>> the state of the art...
>
> Well, I'm just thinking that you must have an approximate idea of where each
> tree is, given that you calculate a projection matrix and know something about
> the depth. So I was just wondering if, given that information, you could simply
> place a cylinder of the approximate diameter and at the right depth.
>
>> It has been a long effort before I could show some promising results. Now
>> I'm happy to share it with a larger audience. Unfortunately, all the good
>> people who worked with me are not here to appreciate it. That's why I'm
>> surprised by your interest =)
>
> Well, I have been interested in the fundamentals of photogrammetry for quite
> some time. And I have been following your work for the last several years,
> hoping to learn how to create such projection matrices and apply them.
>
> https://news.povray.org/povray.advanced-users/thread/%3C5be592ea%241%40news.povray.org%3E/
>
> I don't work in academia or the graphics industry, so I only have whatever free
> time I can devote to learning this stuff on my own.
>
> Even if I were to simply use a POV-Ray scene, where I rendered two images with
> different camera locations, then I'm assuming that I could calculate a vector
> field and a projection matrix (something simple like cubes, spheres, and
> cylinders).
>
> Given the projection matrix and one of the two renders, would I then have the
> necessary and sufficient information to write a .pov scene to recreate the
> render from scratch?
>
> - BW
I understand your question. The problem is that I'm far from reconstructing
a 3D scene from the monocular information I have at the moment. I know a
company doing this sort of application, called Stereolabs...

<https://www.stereolabs.com/>

I'm far from their polished 3D acquisition process, and it is obtained
with two cameras. I only have one camera, and a video stream that I
didn't acquire myself. Is there interest, and are there applications? I'm
not at that stage of the work yet.

I know that similar monocular image processing has been used on planet
Mars, because the helicopter only had one piloting camera, for weight
and on-board constraints.

The main goal at this point of the work is to show that we could
eventually do the same job with several cameras, or with only one.
But I'm far from obtaining results similar to elaborate systems such as
the Stereolabs ZED camera, for instance. That is already done very well
with a stereoscopic system...

Do you understand? Thanks for your attention.
Best regards,
--
<https://eureka.atari.org/>
Hi,
Bald Eagle writes:
> Even if I were to simply use a POV-Ray scene, where I rendered two images with
> different camera locations, then I'm assuming that I could calculate a vector
> field and a projection matrix (something simple like cubes, spheres, and
> cylinders).
>
> Given the projection matrix and one of the two renders, would I then have the
> necessary and sufficient information to write a .pov scene to recreate the
> render from scratch?
>
> - BW
For the moment, the work on depth from monocular vision is not advanced
enough to recreate the visible scene. Vision with two cameras or more
gives much more advanced results for 3D reconstruction of scenes.

Let's recall the starting point of this thread... We've redone the
experiment of Hernan Badino, who walks with a camera on his head:

<https://www.youtube.com/watch?v=GeVJMamDFXE>

Hernan determines his 2D ego-motion in the x-y plane from corresponding
interest points that persist in the video stream. That means he is
calculating the projection matrix of the movement to deduce translations
in the ground plane. With time integration, this gives him the trajectory.

We're doing almost the same, but I work with OpenCV's optical-flow rather
than interest points. And my motion model is 3D, giving 8 parameters in
rotation and translation that I can use in Persistence Of Vision.

I hope this is clear... I'm reconstructing the 3D movement, and I find
that it gives "temporal disparity", that is, depth from motion.
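To picture the time-integration step, here is a rough sketch of how the per-frame translations and yaw could be accumulated into a ground-plane trajectory (a simplified illustration, not the actual code; the (tx, tz, ry) values are assumed to come from the fitted projective model):

import numpy as np

def integrate_trajectory(steps):
    """steps: iterable of (tx, tz, ry) per frame; ry is the yaw increment in radians."""
    x = z = yaw = 0.0
    path = [(x, z)]
    for tx, tz, ry in steps:
        yaw += ry
        # rotate the frame-local translation into world coordinates, then accumulate
        x += tx * np.cos(yaw) - tz * np.sin(yaw)
        z += tx * np.sin(yaw) + tz * np.cos(yaw)
        path.append((x, z))
    return path

# e.g. a quarter turn while moving forward (made-up numbers):
print(integrate_trajectory([(0.0, 1.0, np.pi / 20)] * 10))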
Best regards,
--
<https://eureka.atari.org/>
Hi,
Here is another result...
> Bald Eagle writes:
>> Even if I were to simply use a POV-Ray scene, where I rendered two images with
>> different camera locations, then I'm assuming that I could calculate a vector
>> field and a projection matrix (something simple like cubes, spheres, and
>> cylinders).
>>
>> Given the projection matrix and one of the two renders, would I then have the
>> necessary and sufficient information to write a .pov scene to recreate the
>> render from scratch?
>>
>> - BW
>
> For the moment, the work on depth from monocular vision is not advanced
> enough to recreate the visible scene. Vision with two cameras or more
> gives much more advanced results for 3D reconstruction of scenes.
>
> Let's recall the starting point of this thread... We've redone the
> experiment of Hernan Badino, who walks with a camera on his head:
>
> <https://www.youtube.com/watch?v=GeVJMamDFXE>
>
> Hernan determines his 2D ego-motion in the x-y plane from corresponding
> interest points that persist in the video stream. That means he is
> calculating the projection matrix of the movement to deduce translations
> in the ground plane. With time integration, this gives him the trajectory.
>
> We're doing almost the same, but I work with OpenCV's optical-flow rather
> than interest points. And my motion model is 3D, giving 8 parameters in
> rotation and translation that I can use in Persistence Of Vision.
>
> I hope this is clear... I'm reconstructing the 3D movement, and I find
> that it gives "temporal disparity", that is, depth from motion.
An instrumented motorcycle rides around a speed circuit. Thanks to the
approximation of the optical flow (DIS, OpenCV) by the dominant projective
motion, we determine the translations in the ground plane, plus roll and
yaw; that is to say, the trajectory through the projective parameters
(Tx, Tz, Ry, Rz).
<https://www.youtube.com/watch?v=-QLJ2ke9mN8>
Image data comes from the publication:
Bastien Vincke, Pauline Michel, Abdelhafid El Ouardi, Bruno Larnaudie,
Rodriguez, Abderrahmane Boubezoul (Dec. 2024). Real Track Experiment
Dataset for Motorcycle Rider Behavior and Trajectory Reconstruction.
Data in Brief, Vol. 57, 111026.
The instrumented motorcycle makes a complete lap of the track. The
correlation threshold between successive images is set at 90%, to reset
the calculation of the projective dynamic model.
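The reset rule could be sketched as follows (not the actual implementation; estimate_homography stands for a DIS optical-flow plus homography fit as in the earlier sketch, and frames is the grayscale image sequence):

import cv2
import numpy as np

def correlation(a, b):
    return float(np.corrcoef(a.ravel().astype(float), b.ravel().astype(float))[0, 1])

THRESHOLD = 0.90                  # 90% correlation between reference and current image
reference = None
for frame in frames:              # `frames`: assumed grayscale image sequence
    if reference is None:
        reference = frame
        continue
    H = estimate_homography(reference, frame)            # hypothetical helper (see earlier sketch)
    aligned = cv2.warpPerspective(frame, np.linalg.inv(H), frame.shape[::-1])
    if correlation(reference, aligned) < THRESHOLD:
        reference = frame         # matching got too weak: restart from a new reference image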
Best regards,
--
<https://eureka.atari.org/>
Hi,
Monocular Depth:
<https://www.youtube.com/watch?v=34zUDqEzHos>
A drone flies between the trees of a forest. Thanks to the optical-flow
measured on successive images, the temporal disparity reveals the forest
of trees... We take a reference image, and the optical-flow is measured
on the two rectified images. Then we change the reference when the
inter-correlation drops below 60%. We can perceive the relief in depth
with a single camera, over time.
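One way to read "temporal disparity" in code (a guess at the idea, not the exact method): subtract the motion explained by the dominant homography from the measured optical-flow; the residual is larger for the near trees than for the distant background, and its inverse then behaves like a relative depth map. Here flow and H are as in the DIS/homography sketch earlier in the thread:

import cv2
import numpy as np

def temporal_disparity(flow, H):
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    grid = np.stack([xs, ys], axis=-1)
    # where the dominant projective motion alone would send each pixel
    predicted = cv2.perspectiveTransform(grid.reshape(-1, 1, 2), H).reshape(h, w, 2)
    residual = (grid + flow) - predicted           # motion not explained by the global model
    return np.linalg.norm(residual, axis=2)        # "temporal disparity", in pixels

# relative depth, up to scale; clip to avoid dividing by ~0 on the far background
# depth_rel = 1.0 / np.maximum(temporal_disparity(flow, H), 0.5)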
In fact, when we watch images captured by a drone, although there is
only one camera, we often see the relief. This is particularly marked
for trees in a forest. The goal here is to evaluate this relief, with a
measurement of "optical-flow", which allows one image to be matched with
another when they seem to be close (we say they are "correlated").

We have two eyes, and the methods for measuring visible relief by
stereoscopy are very well developed. Since the beginning of photography,
there have been devices like the “stereoscope” which allow you to see
the relief with two pictures, naturally. It is possible to measure relief
thanks to epipolar geometry and well-known mathematics. There are many
measurement methods, very effective and based on human vision.

When it comes to measuring relief with a single camera, knowledge is
less established. There are 3D cameras, called "RGBD" with a "D" for
"depth". But how do they work? Is it possible to improve them? What I am
showing here does not require any “artificial neural network”. It is a
physical measurement, with a classic algorithm, which comes from neither
A.I. nor a big computer :-)
Best regards,
--
François LE COAT
<https://eureka.atari.org/>
Hi,
Francois LE COAT writes:
> Monocular Depth:
>
> <https://www.youtube.com/watch?v=34zUDqEzHos>
>
> A drone flies between the trees of a forest. Thanks to the optical-flow
> measured on successive images, the temporal disparity reveals the forest
> of trees... We take a reference image, and the optical-flow is measured
> on the two rectified images. Then we change the reference when the
> inter-correlation drops below 60%. We can perceive the relief in depth
> with a single camera, over time.
>
> In fact, when we watch images captured by a drone, although there is
> only one camera, we often see the relief. This is particularly marked
> for trees in a forest. The goal here is to evaluate this relief, with a
> measurement of "optical-flow", which allows one image to be matched with
> another when they seem to be close (we say they are "correlated").
>
> We have two eyes, and the methods for measuring visible relief by
> stereoscopy are very well developed. Since the beginning of photography,
> there have been devices like the “stereoscope” which allow you to see
> the relief with two pictures, naturally. It is possible to measure relief
> thanks to epipolar geometry and well-known mathematics. There are many
> measurement methods, very effective and based on human vision.
>
> When it comes to measuring relief with a single camera, knowledge is
> less established. There are 3D cameras, called "RGBD" with a "D" for
> "depth". But how do they work? Is it possible to improve them? What I am
> showing here does not require any “artificial neural network”. It is a
> physical measurement, with a classic algorithm, which comes from neither
> A.I. nor a big computer :-)
A Web page was made to illustrate Monocular Depth...

<https://hebergement.universite-paris-saclay.fr/lecoat/demoweb/monocular_depth.html>

This is about measuring monocular depth, just as stereoscopic disparity
is measured. It means quantifying the depth with images from a single
camera. We can see this relief naturally, but it is a matter of
measuring it with the optical-flow :-)
Best regards,
--
François LE COAT
<https://eureka.atari.org/>
Hi,
Francois LE COAT writes:
>> Monocular Depth:
>>
>> <https://www.youtube.com/watch?v=34zUDqEzHos>
>>
>> A drone flies between the trees of a forest. Thanks to the optical-flow
>> measured on successive images, the temporal disparity reveals the forest
>> of trees... We take a reference image, and the optical-flow is measured
>> on the two rectified images. Then we change the reference when the
>> inter-correlation drops below 60%. We can perceive the relief in depth
>> with a single camera, over time.
>>
>> In fact, when we watch images captured by a drone, although there is
>> only one camera, we often see the relief. This is particularly marked
>> for trees in a forest. The goal here is to evaluate this relief, with a
>> measurement of "optical-flow", which allows one image to be matched with
>> another when they seem to be close (we say they are "correlated").
>>
>> We have two eyes, and the methods for measuring visible relief by
>> stereoscopy are very well developed. Since the beginning of photography,
>> there have been devices like the “stereoscope” which allow you to see
>> the relief with two pictures, naturally. It is possible to measure relief
>> thanks to epipolar geometry and well-known mathematics. There are many
>> measurement methods, very effective and based on human vision.
>>
>> When it comes to measuring relief with a single camera, knowledge is
>> less established. There are 3D cameras, called "RGBD" with a "D" for
>> "depth". But how do they work? Is it possible to improve them? What I am
>> showing here does not require any “artificial neural network”. It is a
>> physical measurement, with a classic algorithm, which comes from neither
>> A.I. nor a big computer :-)
>
> A Web page was made to illustrate Monocular Depth...
>
> <https://hebergement.universite-paris-saclay.fr/lecoat/demoweb/monocular_depth.html>
>
> This is about measuring monocular depth, just as stereoscopic disparity
> is measured. It means quantifying the depth with images from a single
> camera. We can see this relief naturally, but it is a matter of
> measuring it with the optical-flow :-)
Until now, drone images came from forests in France. The first images
were obtained in the French Vosges.
<https://www.youtube.com/watch?v=245yJJrwMQ0> Drone in the forest
We are now seeing more and more drones in forests outside of France.
The available image sources are diversifying...
Best regards,
--
François LE COAT
<https://eureka.atari.org/>
Hi,
Francois LE COAT writes:
>>> Monocular Depth:
>>>
>>> <https://www.youtube.com/watch?v=34zUDqEzHos>
>>>
>>> A drone flies between the trees of a forest. Thanks to the optical-flow
>>> measured on successive images, the temporal disparity reveals the forest
>>> of trees... We take a reference image, and the optical-flow is measured
>>> on the two rectified images. Then we change the reference when the
>>> inter-correlation drops below 60%. We can perceive the relief in depth
>>> with a single camera, over time.
>>>
>>> In fact, when we watch images captured by a drone, although there is
>>> only one camera, we often see the relief. This is particularly marked
>>> for trees in a forest. The goal here is to evaluate this relief, with a
>>> measurement of "optical-flow", which allows one image to be matched with
>>> another when they seem to be close (we say they are "correlated").
>>>
>>> We have two eyes, and the methods for measuring visible relief by
>>> stereoscopy are very well developed. Since the beginning of photography,
>>> there have been devices like the “stereoscope” which allow you to see
>>> the relief with two pictures, naturally. It is possible to measure relief
>>> thanks to epipolar geometry and well-known mathematics. There are many
>>> measurement methods, very effective and based on human vision.
>>>
>>> When it comes to measuring relief with a single camera, knowledge is
>>> less established. There are 3D cameras, called "RGBD" with a "D" for
>>> "depth". But how do they work? Is it possible to improve them? What I am
>>> showing here does not require any “artificial neural network”. It is a
>>> physical measurement, with a classic algorithm, which comes from neither
>>> A.I. nor a big computer :-)
>>
>> A Web page was made to illustrate Monocular Depth...
>>
>> <https://hebergement.universite-paris-saclay.fr/lecoat/demoweb/monocular_depth.html>
>>
>> This is about measuring monocular depth, just as stereoscopic disparity
>> is measured. It means quantifying the depth with images from a single
>> camera. We can see this relief naturally, but it is a matter of
>> measuring it with the optical-flow :-)
>
> Until now, drone images came from forests in France. The first images
> were obtained in the French Vosges.
>
> <https://www.youtube.com/watch?v=245yJJrwMQ0> Drone in the forest
>
> We are now seeing more and more drones in forests outside of France.
> The available image sources are diversifying...
Here is another example of a drone in a forest, outside of France:

<https://www.youtube.com/watch?v=BO_OUFHFzTY> Forest

We also obtained the trajectory of the flight in the X-Z plane:

<https://sketchfab.com/3d-models/forest-e253ebd61e2a4bd6abcd21ac56f25bae>

These are filtered measurements, and it is not ideal computer graphics.
Best regards,
--
François LE COAT
<https://eureka.atari.org/>
Hi,
Francois LE COAT writes:
> Here is another example of a drone in a forest, outside of France:
>
> <https://www.youtube.com/watch?v=BO_OUFHFzTY> Forest
>
> We also obtained the trajectory of the flight in the X-Z plane:
>
> <https://sketchfab.com/3d-models/forest-e253ebd61e2a4bd6abcd21ac56f25bae>
>
> These are filtered measurements, and it is not ideal computer graphics.
Here is a sequence of images from a walk in the forest. The scene
is observed by a tracking drone...

<https://www.youtube.com/watch?v=46VWJ6-YqtY>

The camera's movement is estimated in the images using a measurement of
the dominant projective motion. The presence of a man in the image
sequence does not interfere with the trajectory estimation, because the
person occupies a part of the field of view that is not dominant. The
dominant motion corresponds to the scrolling of the scenery, that is,
the movement of the forest relative to the observing camera.
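The robustness to the walking man can be illustrated with a tiny synthetic check (an illustration of the principle, not the actual code): most samples follow one homography (the scenery), a small minority (the person) follows a different motion, and a robust fit still recovers the scenery's transform while rejecting the minority as outliers:

import cv2
import numpy as np

rng = np.random.default_rng(0)
pts = rng.uniform(0, 640, size=(500, 2)).astype(np.float32)

# dominant motion: a mild zoom plus translation (made-up "scenery" homography)
H_scene = np.array([[1.01, 0.0, 3.0],
                    [0.0, 1.01, 1.5],
                    [0.0, 0.0, 1.0]])
dst = cv2.perspectiveTransform(pts.reshape(-1, 1, 2), H_scene).reshape(-1, 2)

# ~12% of the samples move differently: the "person" walking through the frame
dst[:60] += np.array([12.0, -4.0], dtype=np.float32)

H_est, inliers = cv2.findHomography(pts, dst, cv2.RANSAC, 2.0)
print(np.round(H_est, 3))          # close to H_scene despite the perturbed samples
print(inliers[:60].mean())         # the person's samples are rejected as outliers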
Best regards,
--
<https://eureka.atari.org/>