
New Frontiers in Computer Visualization

From 3D models to multimodal “conversation systems” to recreating the “visual complexities” of physical appearance, these researchers are taking computing to the next level.

Published July 1, 2003

By Dan Van Atta
Academy Contributor


Three decades ago engineers in California demonstrated a prototype personal computer, called the Alto, that would usher in the PC era and forever alter the course of human communications. In today’s online “e” era, the quest to conquer new challenges in computer science continues at an accelerating pace.

Imagining the potential impact of today’s research on human activity three decades hence is precisely the kind of creative stimulus from which innovative discovery is made. Three examples of such creative work underway in today’s computer science laboratories were discussed on April 1 at The New York Academy of Sciences’ (the Academy’s) semi-annual Computer Science Mini-Symposium.

At the session, titled Frontiers in Visualization, research scientists from Columbia and Princeton Universities and from IBM’s T. J. Watson Research Center each described efforts to create computer-based graphical imaging capabilities that overcome current limitations and open the door to a world of new possibilities.

A Search Engine for 3D Models

At Princeton University, Thomas Funkhouser, PhD, is working to advance the day when true three-dimensional (3D) models can be readily created electronically and transmitted via the Internet. “Scanners are getting cheaper and fast graphics cards are readily available on PCs,” Funkhouser told the gathering. “Someday 3D models will be as common as images are on the Web today.”

While 3D models already exist on Web sites, Funkhouser said they are often deeply embedded in data and not easy to locate. To remedy this, he and his Princeton colleagues have built a search engine specifically for locating 3D models on the Web.

To locate a 3D model using the search engine, the query can begin with a simple text word, such as “chair.” Or it can be based on a 2D sketch – a simple drawing of a chair, for example. The search engine also allows users to scan in an actual model or sketch and instruct the computer to “find similar shape,” thereby producing a plethora of similarly shaped chairs. The new “query interfaces” the team is building will also allow searches based on inputting 3D sketches and models.

“My goal is to create a metric for similarity,” the computer scientist said, “so that we can quickly search the database and find a similar shape. This requires that we create an index of the database.”

As an example, Funkhouser described the challenge of asking the search engine to provide the best matches of shapes similar to a 3D image of a Volkswagen Beetle. To accomplish this, he said the team needed to create a “shape descriptor” concise enough to store in the database, fast to compute, and both efficient and discriminating in its selections.
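Funkhouser did not present code, but the pipeline he outlined – compute a concise descriptor for each model, store the descriptors in an index, then rank models by their distance to the query’s descriptor – can be sketched in a few lines. The Python below is purely illustrative: the distance-histogram descriptor and all the names are stand-ins, not the Princeton team’s actual method.

```python
import numpy as np

def shape_descriptor(points, n_shells=32):
    """Toy descriptor: a histogram of point distances from the centroid,
    normalized for scale. Like the descriptor Funkhouser describes, it is
    concise, fast to compute, and comparable by simple vector distance
    (the actual Princeton descriptor is more sophisticated)."""
    centered = points - points.mean(axis=0)          # translation invariance
    radii = np.linalg.norm(centered, axis=1)
    radii = radii / radii.max()                      # scale invariance
    hist, _ = np.histogram(radii, bins=n_shells, range=(0.0, 1.0))
    return hist / hist.sum()                         # this vector goes into the index

def most_similar(query_desc, index):
    """Rank indexed models by Euclidean distance to the query's descriptor."""
    names, descs = zip(*index.items())
    dists = np.linalg.norm(np.array(descs) - query_desc, axis=1)
    return [names[i] for i in np.argsort(dists)]

# Index a few models from point clouds sampled off their surfaces (random here).
rng = np.random.default_rng(0)
index = {name: shape_descriptor(rng.normal(size=(1000, 3)))
         for name in ("chair_01", "chair_02", "beetle")}
print(most_similar(shape_descriptor(rng.normal(size=(1000, 3))), index))
```

Because the comparison reduces to vector distance, the index can answer “find similar shape” queries without ever touching the original geometry.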

The Challenges

One challenge is to match 3D models effectively even when they appear in arbitrary alignments. To address this, Funkhouser’s team is building a “harmonic shape descriptor” that is invariant to rotations and yet as discriminating as possible. For this the team decomposes the 3D shapes into “an irreducible set of rotation-independent components,” then stores “how much” of the model resides in each component.
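That 3D decomposition uses spherical harmonics, but the underlying trick is easier to see in a 2D analogue, sketched below: describe the boundary as a function of angle, split it into frequency components, and keep only each component’s magnitude, which no rotation can change. This is a simplified stand-in, not the team’s implementation.

```python
import numpy as np

def rotation_invariant_signature(radii, n_freqs=16):
    """2D analogue of the harmonic descriptor: decompose the boundary
    function into frequency components with an FFT and store only each
    component's magnitude ("how much" of the shape each one holds).
    Rotating the shape shifts the phases but cannot change magnitudes."""
    return np.abs(np.fft.rfft(radii))[:n_freqs]

# A star-shaped contour sampled at 360 angles, plus the same contour rotated 45°.
theta = np.linspace(0, 2 * np.pi, 360, endpoint=False)
contour = 1.0 + 0.3 * np.cos(5 * theta)
rotated = 1.0 + 0.3 * np.cos(5 * (theta - np.pi / 4))

print(np.allclose(rotation_invariant_signature(contour),
                  rotation_invariant_signature(rotated)))   # True
```

In 3D the same property holds band by band: rotating a model reshuffles the spherical-harmonic coefficients within each frequency band but leaves each band’s total energy – the “how much” that gets stored – unchanged.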

In tests conducted by students at Princeton, Funkhouser said the recently developed search engine proved most effective – 90 percent – when the user query was based on matching to an existing 3D shape. While the engine is still a work in progress, he noted that more than 35,000 3D models have been indexed thus far and more than 260,000 queries were processed this past year.

“This field is so young that there are no real benchmark tests,” Funkhouser added. “We want to develop such a test so that people can test different methods and measure their effectiveness.” Additional work is planned to improve 2D matching methods, develop new query interfaces, and devise new matching and indexing algorithms for better shape matching and shape analysis.

Automating Info Graphics

At IBM’s T. J. Watson Research Center, Michelle Zhou leads a group that is developing next-generation multimodal “conversation systems” to aid users in searching for information. Their system can automatically produce information graphics – such as graphs, charts and diagrams – during the course of “computer-human conversations.”

Zhou’s PhD dissertation at Columbia was on building automated visualization systems that create a coherent series of animated displays for visualizing a wide variety of information. Now she aims to make these “conversations” both “multimodal,” meaning users can employ speech and gesture inputs to express their information requests, and “multimedia,” meaning computers may employ speech, graphics and video to present the desired information to users.

When computer users search for information today – real estate market trends for a particular area, for example – the desired information graphics must be carefully handcrafted using tools such as Microsoft PowerPoint or Adobe Illustrator. Without training in graphic design, handcrafting such graphics is difficult and time-consuming. And within a dynamic human-computer conversation, it is extremely difficult to handcraft every possible information graphic in advance.

To simplify matters, researchers have built systems that can help people design information graphics automatically. After receiving a user request – such as “display sales data for the first quarter” – these systems can produce information graphics – such as a bar chart – automatically.
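Zhou did not describe the internals, but the mapping from a request plus the data’s characteristics to a chart type can be caricatured in a few rules. The snippet below is a toy stand-in for such a system; the rules and names are invented for illustration.

```python
# Toy rule-based generator in the spirit of such systems (invented rules,
# not IBM's implementation): map a request plus simple data characteristics
# to a chart type.
def choose_graphic(request: str, data: dict) -> str:
    text = request.lower()
    if "trend" in text or "over time" in text:
        return "line chart"                       # temporal phrasing suggests a trend view
    if all(isinstance(v, (int, float)) for v in data.values()):
        return "bar chart" if len(data) <= 12 else "table"
    return "table"                                # non-numeric data falls back to a table

sales = {"Jan": 120, "Feb": 135, "Mar": 160}
print(choose_graphic("display sales data for the first quarter", sales))  # bar chart
```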

A New Graphics Generation System

Now Zhou and her team are building a new graphics generation system, called IMPROVISE+, that will allow users to provide more specific preferences, then adjust the graphic using a “feedback” input. The result: a new, customized information graphic.

“By allowing users to critique a sketch first,” Zhou said, “IMPROVISE+ can save the cost of fine-tuning the undesirable design.” After the computer processes the initial input to customize the image, however, the human user is once again asked for input. “The system is not foolproof,” Zhou acknowledged, “so at the last stage we take the user’s input to validate the generation process.”
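In outline, that generate-critique-validate loop might look like the following sketch, where the default design and the “key=value” critique format are hypothetical placeholders rather than IMPROVISE+’s actual interface.

```python
# Illustrative generate-critique-validate loop in the spirit of IMPROVISE+;
# the default design and the "key=value" critique format are hypothetical.
def generate_graphic(request, critiques):
    """Propose a default design sketch, fold in each user critique,
    then return the adjusted design for final user validation."""
    design = {"data": request, "chart": "bar", "color": "blue"}
    for critique in critiques:                    # e.g. "color=red"
        key, _, value = critique.partition("=")
        design[key] = value                       # adjust the sketch before costly fine-tuning
    return design

print(generate_graphic("first-quarter sales", ["color=red", "chart=line"]))
# {'data': 'first-quarter sales', 'chart': 'line', 'color': 'red'}
```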

The team’s approach, she said, selects from a database of existing graphic examples, or cases, and uses a “similarity metric” to retrieve the case most similar to the request. The retrieved case is then either directly reused or adapted.
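That retrieval step is essentially a nearest-neighbor search over the case base. A minimal sketch, assuming each case and each request can be reduced to a numeric feature vector (the features and case names here are invented):

```python
import numpy as np

# Each stored case and the incoming request are reduced to feature vectors
# (number of data points, number of series, temporal or not); the features
# and case names are invented for illustration.
CASES = {
    "quarterly_bar_chart": np.array([3, 1, 1.0]),
    "regional_pie_chart":  np.array([5, 1, 0.0]),
    "multi_series_lines":  np.array([12, 4, 1.0]),
}

def retrieve_case(request_features):
    """Return the stored case nearest the request under Euclidean distance,
    ready to be reused directly or adapted."""
    return min(CASES, key=lambda name: np.linalg.norm(CASES[name] - request_features))

print(retrieve_case(np.array([4, 1, 1.0])))   # -> 'quarterly_bar_chart'
```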

“This approach allows us to extend our work to cover a wider variety of applications,” Zhou said, “since existing graphic examples are abundant and the learning process itself doesn’t have to be changed.”

Modeling Visual Appearances

Computer scientist Peter N. Belhumeur has had a remarkable career since receiving his PhD from Harvard University a decade ago. Recipient of both the Presidential Early Career Award for Scientists and Engineers and the National Science Foundation CAREER Award, Belhumeur was appointed a full professor of Electrical Engineering at Yale University in 2001.

Belhumeur recently moved to Columbia University, where he is creating computer models that attempt to recreate the “visual complexities” of physical appearance. To do this requires understanding and attempting to replicate the complex variations related to shape, reflection, viewpoint and illumination.

In looking at even very common and seemingly simple images, Belhumeur noted that the differences in appearance among the surfaces of the various materials are quite stunning. “Because of the variation in the composition of the materials,” he said, “there’s really a great disparity in the appearance.”

To accurately model the visual appearance of an object, researchers must account for its shape, reflectance, viewpoint and illumination. Of the four, Belhumeur said reflectance – a four-dimensional function of both the incoming and outgoing light directions – is the most complex and least understood, despite its critical importance. “As a result, you have to make assumptions about the nature of reflectance,” he said, “and this has been sort of the Achilles’ heel of nearly all image-based shape reconstruction.”
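The four dimensions come from the geometry: the incoming light direction contributes two angles and the outgoing direction two more. The toy reflectance function below (a standard Lambertian-plus-Phong model, not anything presented in Belhumeur’s talk) makes those four angular degrees of freedom concrete:

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def brdf(w_in, w_out, normal, kd=0.7, ks=0.3, shininess=32):
    """Toy reflectance: how much light arriving from direction w_in leaves
    toward w_out. Each direction carries two angular degrees of freedom,
    which is what makes reflectance four-dimensional. (Lambertian-plus-
    Phong here; real measured reflectance is far less well-behaved.)"""
    w_in, w_out, normal = map(normalize, (w_in, w_out, normal))
    diffuse = kd / np.pi                                   # matte term, equal in all out-directions
    mirror = 2 * np.dot(normal, w_in) * normal - w_in      # mirror reflection of the incoming ray
    specular = ks * max(np.dot(mirror, w_out), 0.0) ** shininess  # glossy highlight
    return diffuse + specular

# Same geometry, different material balance: matte vs. glossy surface.
n, light, view = np.array([0, 0, 1.0]), np.array([1, 0, 1.0]), np.array([-1, 0, 1.0])
print(brdf(light, view, n, kd=0.9, ks=0.05))   # mostly diffuse
print(brdf(light, view, n, kd=0.4, ks=0.60))   # strong highlight toward the mirror direction
```

Varying only the diffuse/specular balance turns a matte surface into a glossy one while geometry and base color stay fixed.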

Many Applications

Referring to side-by-side photos of a peach and a nectarine, Belhumeur said: “Here you have two objects that are essentially the same shape and coloration. Yet, because of the differences in reflection, they appear different enough.” That difference, he pointed out, is due to the way the surface of each object reflects light.


Despite the challenges, he said researchers at Columbia have developed a new method for reconstructing models of objects from images of the objects themselves, as well as a new algorithm for determining reflectance. The models will allow scientists to recover the shape of an object from a single image, he said, then produce reasonably accurate “synthetic images” showing how the object would look under different viewpoints or lighting conditions.
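Reduced to its simplest possible form, that relighting step takes recovered per-pixel surface normals and albedo and shades them under a new light direction. The sketch below uses plain Lambertian shading as a stand-in for the Columbia algorithm, which handles far richer reflectance:

```python
import numpy as np

def relight(normals, albedo, light_dir):
    """Shade a reconstructed surface under a new light using plain
    Lambertian shading: intensity = albedo * max(n · l, 0). A stand-in
    for the Columbia algorithm, which handles richer reflectance."""
    l = light_dir / np.linalg.norm(light_dir)
    shading = np.clip(normals @ l, 0.0, None)    # per-pixel n · l, clamped where self-shadowed
    return albedo * shading

# A tiny 2x2 "image": per-pixel unit normals recovered by reconstruction.
normals = np.array([[[0.0, 0.0, 1.0], [0.6, 0.0, 0.8]],
                    [[-0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]])
albedo = np.full((2, 2), 0.9)
print(relight(normals, albedo, np.array([0.0, 0.0, 1.0])))  # lit from the viewpoint
print(relight(normals, albedo, np.array([1.0, 0.0, 1.0])))  # light swung to the side
```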

In addition, the researchers have invented a device called the Lighting Sensitive Display that uses photo-detectors, cameras and optical fibers to sense the illumination in the environment and modify the content of the image. Potential applications of this work, Belhumeur said, include face and object recognition, image-based rendering, computer graphics, content-based image and video compression and human-computer interfaces.


About the Researchers

Peter N. Belhumeur graduated in 1985 from Brown University with Highest Honors, receiving an Sc.B. degree in Computer and Information Engineering. He received an S.M. in 1991 and a Ph.D. in 1993 from Harvard University, where he studied under David Mumford and was supported by a Harvard Fellowship. In 1993 he was a Postdoctoral Fellow at the University of Cambridge’s Sir Isaac Newton Institute for Mathematical Sciences. He was appointed professor of Electrical Engineering at Yale University in 2001. He recently joined the faculty at Columbia University.

Thomas Funkhouser is an assistant professor in the Department of Computer Science at Princeton University. Previously, he was a member of the technical staff at Bell Laboratories. His current research interests include interactive computer graphics, computational geometry, distributed systems, and shape analysis. He received a B.S. in Biological Sciences from Stanford University in 1983, an M.S. in computer science from UCLA in 1989, and a Ph.D. in computer science from UC Berkeley in 1993.

Michelle Zhou is a research staff member at IBM T.J. Watson Research Center, where she manages the intelligent multimedia interaction group. Before joining IBM, Zhou completed her Ph.D. in computer science at Columbia University, where her thesis focused on automated visualization systems that can create a coherent series of animated displays for visualizing a wide variety of information.

