[Beowulf] cluster for doing real time video panoramas?
landman at scalableinformatics.com
Fri Dec 23 01:15:24 EST 2005
I seem to remember SGI doing things like this in Reality Engine texture
memory many moons ago. We would map multiple video streams onto the
exterior of a mirrored sphere for one of the demos. The graphics
hardware then (and more so now) is very good for calculations like these.
Jim Lux wrote:
> At 08:40 PM 12/22/2005, Gerry Creager wrote:
>> I'll take a stab at some of this... the parts that appear intuitively
>> obvious to me.
>> On 12/21/05, Bogdan Costescu <Bogdan.Costescu at iwr.uni-heidelberg.de> wrote:
>> > On Wed, 21 Dec 2005, Jim Lux wrote:
>> > > I've got a robot on which I want to put a half dozen or so video
>> > > cameras (video in that they capture a stream of images, but not
>> > > necessarily that put out analog video..)
>> Streaming video from 'n' cameras, or streams of interleaved images
>> (full frames).
> 6 "channels" of video from unsynchronized cameras (unless there's a
> convenient way to genlock the inexpensive 1394 cameras), so, presumably,
> at some level, you've got 6 streams of full video frames, all at the
> same rate, but not necessarily synchronized in arrival.
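One way to cope with streams that run at the same rate but arrive unsynchronized is to pick, for each frame of a reference camera, the closest-in-time frame from every other camera. The sketch below assumes each frame carries a capture timestamp in seconds and that each stream's timestamps are sorted; the function names and the choice of camera 0 as reference are illustrative only, not something from the thread.

```python
from bisect import bisect_left

def nearest_frame(timestamps, t):
    """Index of the timestamp closest to t (timestamps sorted ascending)."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Choose whichever neighbour is closer in time.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1

def group_frames(streams, reference=0):
    """For each reference-camera frame, pick the nearest frame index per camera.

    streams: list of per-camera timestamp lists (assumed layout).
    Returns one list of frame indices (one per camera) per reference frame.
    """
    return [[nearest_frame(ts, t) for ts in streams]
            for t in streams[reference]]
```

With cameras at the same frame rate, the residual skew inside a group is at most half a frame period; whether that is acceptable depends on how fast the robot moves.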
>> > > I've also got some telemetry that tells me what the orientation of
>> > > the robot is.
>> > Does it also give info about the orientation of the assembly of
>> > cameras ? I'm thinking of the usual problem: is the horizon line going
>> > down or is my camera tilted ? Although if you really mean spherical
>> > (read below), this might not be a problem anymore as you might not
>> > care about what horizon is anyway.
>> The only sane, rational way to do this I can see is if the camera
>> reference frame is appropriately mapped and each camera's orientation
>> is well-known.
> This would be the case. The camera/body is a rigid body, so knowing
> orientation of the body tells you orientation of the cameras. More to
> the point, the relative orientation of all cameras is known, in body
> coordinates. Also, the camera/lens optical parameters are known
> (aberrations, exposure variation, etc.)
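Because the cameras are fixed to a rigid body, each camera's world orientation is the body's orientation (from telemetry) composed with that camera's fixed mounting rotation. A minimal sketch, assuming orientations are expressed as 3x3 rotation matrices and using only yaw rotations for illustration; the helper names and the angles in the usage note are hypothetical.

```python
import math

def rot_z(deg):
    """Rotation matrix for a yaw of `deg` degrees about the z axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(r, v):
    """Rotate 3-vector v by matrix r."""
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]

def camera_to_world(r_world_body, r_body_cam):
    """Compose body orientation with the camera's fixed mounting rotation."""
    return matmul(r_world_body, r_body_cam)
```

For example, a body yawed 30 degrees carrying a camera mounted at 60 degrees yaw in body coordinates has a boresight `apply(camera_to_world(rot_z(30), rot_z(60)), [1, 0, 0])` pointing along the world y axis, up to rounding. Only the body rotation changes per telemetry sample; the mounting rotations are measured once.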
>> > Given that I have a personal interest in video, I thought about
>> > something similar: not a circular or spherical coverage, but at least
>> > a large field of view from which I can choose when editing the
>> > "window" that is then given to the viewer - this comes from a very
>> > mundane need: I'm not such a good camera operator, so I always miss
>> > things or make a wrong framing. So this would be similar to how
>> > Electronic (as opposed to optical) Image Stabilization (EIS) works,
>> > although EIS does not need any transformation or calibration as the
>> > whole big image comes from one source.
>> With the combination of on-plane and off-axis image origination, one
>> has the potential for a stereoscopic and thus distance effect. A
>> circular coverage wouldn't provide this. Remapping a spherical
>> coverage into an immersive planar or cave coverage could accomplish
> There's some literature out there on this. Notably, a system with 6
> cameras, each with >120 deg FOV, pairs at each vertex of a triangle, so
> you essentially have a stereo pair pointing every 120 degrees, with
> overlap. There's some amount of transformation that can synthesize (at
> least for things reasonably far away) a stereo pair from any arbitrary
>> First step is to rigidly characterize each camera's optical path: the
>> lens. Once its characteristics are known and correlated with its
>> cohort, the math for the reconstruction and projection becomes
>> somewhat simpler (took a lot to not say, "Trivial" having been down
>> this road before!). THEN one characterizes the physical frame and
>> identifies the optical image overlap of adjacent cameras. Evenly
>> splitting the overlap might not necessarily help here, if I understand
>> the process.
> Fortunately, this is exactly what things like panotools and PTGui are for.
>> Interlacing would have to be almost camera-frame sequential video and
>> at high frame rates. I agree that deinterlaced streams would offer
>> better results. One stream per CPU might be taxing: This might
>> require more'n one CPU per stream at 1394 image speeds!
> And why I posted on the beowulf list!
> > once it's done you don't need to do it again...
>> I'm not so sure this is beneficial if you can appropriately model the
>> lens systems of each camera and the optical qualities of the imager
>> (CMOS or CCD...) I think you can apply the numerical mixing and
>> projection in near real time if you have resolved the models early.
> Seems like this should be true. Once all the transformations are
> calculated, it's just lots of arithmetic operations.
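The "calculate once, then it's just arithmetic" idea can be sketched as a precomputed lookup table: the geometric mapping is evaluated once per output pixel, and every subsequent frame is a plain table-driven copy. This is a simplified illustration using nearest-neighbour lookup on frames stored as lists of rows; the function names and the `mapping` callback are assumptions, and a real pipeline would interpolate and fold in exposure correction.

```python
def build_lut(out_w, out_h, mapping):
    """Evaluate the (expensive) mapping once for every output pixel.

    mapping(x, y) returns the source pixel (sx, sy), or None if the
    output pixel is not covered by this camera.
    """
    return [mapping(x, y) for y in range(out_h) for x in range(out_w)]

def remap(frame, lut, out_w, fill=0):
    """Apply a precomputed table to one frame (nearest-neighbour copy)."""
    flat = [frame[s[1]][s[0]] if s is not None else fill for s in lut]
    return [flat[i:i + out_w] for i in range(0, len(flat), out_w)]
```

The table depends only on the lens model and the fixed camera geometry, so it is rebuilt only if the calibration changes; the per-frame cost is one lookup per output pixel.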
>> > To come back to cluster usage, I think that you can treat the whole
>> > thing by doing something like a spatial decomposition, such that
>> > either:
>> > - each computer gets the same amount of video data (to avoid
>> > overloading). This way it's clear which computer takes video from
>> > which camera, but the amount of "real world" space that each of them
>> > gives is not equal, so putting them together might be difficult.
>> > - each computer takes care of the same amount of the "real world" space,
>> > so each computer provides the same amount of output data. However the
>> > video streams splitting between the CPUs might be a problem as each
>> > frame might need to be distributed to several CPUs.
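The second option (equal shares of output space) can be made concrete by asking which cameras each node would need. The sketch below assumes a purely horizontal panorama split into equal azimuth bands, one per node, with each camera described by a yaw and a horizontal field of view in degrees; the names and the numbers in the usage note are hypothetical.

```python
def angular_overlap(c1, w1, c2, w2):
    """True if two azimuth intervals (centre, width; degrees) overlap."""
    # Smallest angular distance between the two centres, in [0, 180].
    d = abs((c1 - c2 + 180.0) % 360.0 - 180.0)
    return d < (w1 + w2) / 2.0

def cameras_per_band(camera_yaws, camera_fov, n_nodes):
    """List, per node, the camera indices whose view overlaps its band."""
    band = 360.0 / n_nodes
    plan = []
    for node in range(n_nodes):
        centre = (node + 0.5) * band
        plan.append([i for i, yaw in enumerate(camera_yaws)
                     if angular_overlap(centre, band, yaw, camera_fov)])
    return plan
```

With six cameras at 60 degree spacing, a 90 degree field of view, and three nodes, `cameras_per_band([0, 60, 120, 180, 240, 300], 90.0, 3)` gives `[[0, 1, 2], [2, 3, 4], [0, 4, 5]]`: cameras 0, 2 and 4 each feed two nodes, which is the frame-distribution cost mentioned above. Decomposing by camera instead avoids that fan-out but leaves the merge step to handle unequal output regions.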
>> I would vote for decomposition by hardware device (camera).
> In that, you'd have each camera feed a CPU which transforms its images
> into a common space (applying that camera's exposure and lens corrections)
>> And, I'd
>> have some degree of preprocessing with consideration that the cluster
>> might not necessarily be our convenient little flat NUMA cluster we're
>> all so used to. If I had a cluster of 4-way nodes I'd be considering
>> reordering the code to have preprocessing of the image-stream on one
>> core, making it in effect a 'head' core, and letting it decompose the
>> process to the remaining cores. I'm not convinced the typical CPU can
>> appropriately handle a feature-rich environment imaged using a decent
>> DV video stream.
> I don't know that the content of the images makes any difference,
> they're all just arrays of pixels that need to be mapped from one space,
> into another.
>> > > But, then, how do you do the real work... should the camera
>> > > recalibration be done all on one processor?
>> > It's not clear to me what you call "recalibration". You mean color
>> > correction, perspective change and so on ? These could be done in
>> > parallel, but if the cameras don't move relative to each other, the
>> > transformations are always the same so it's probably easy to partition
>> > them on CPUs even as much as to get a balanced load.
> Lots of good ideas, all.
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615