
Porter Duff Image Compositing Part 1


Introduction

I want to do a series of short articles on Porter/Duff image compositing: how it works and how it is used under AmigaOS 4.1, possibly also covering some Cairo. It seems that many people still have difficulty with the concept of Porter/Duff compositing, with how to use it, and with why one would use it.

This is part one. We'll just cover the basics here, i.e. no programming knowledge required. We look at what it is, and how it is used.

Overview

Porter-Duff image compositing (PDIC in the following) is a technique that was first used in motion pictures to layer different render passes or to combine computer-generated images with live action. Developed at Lucasfilm for the motion picture Star Trek II, it has, curiously enough, become quite popular on desktop computers in recent years.

In essence, PDIC "composes" a source image onto a destination image. This is similar in principle to a normal blitter - the operation essentially replaces pixels in the destination depending on the source (and destination) pixel. Some blitters even support so-called "transparent blits", i.e. replacing only those destination pixels whose corresponding source pixel is non-zero (i.e. not transparent).
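To make the idea of a transparent blit concrete, here is a minimal sketch in C. It assumes one byte per pixel and treats the value 0 as "transparent"; the function name `transparent_blit` is made up for this illustration and is not part of any AmigaOS API.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy `count` pixels from src to dst, skipping every source pixel that
   is 0. Pixel value 0 is assumed to mean "transparent" in this sketch. */
void transparent_blit(const uint8_t *src, uint8_t *dst, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (src[i] != 0) {
            dst[i] = src[i];   /* opaque source pixel replaces destination */
        }
        /* otherwise the destination pixel is left untouched */
    }
}
```

Note that the decision is strictly all-or-nothing per pixel: a source pixel either replaces the destination or leaves it alone.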

The classic Amiga's blitter was better than most modern-day counterparts in that it could use three different inputs (mostly used as source, destination, and mask) and hence use a stencil to "cookie-cut" parts of the source image and copy only those parts into the destination (among other things; the Amiga blitter could actually compute any boolean function of its three inputs, specified in canonical form as a set of minterms).
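The minterm idea can be sketched in a few lines of C. This is a software illustration, not the hardware's actual implementation: it assumes that bit i of an 8-bit minterm mask enables the product term in which input A is bit 2 of i, B is bit 1, and C is bit 0. The function name `blit_logic` is hypothetical.

```c
#include <stdint.h>

/* Evaluate a three-input boolean function on 16 one-bit pixels at once.
   Bit i of `minterms` enables the product term for the input combination
   A = bit 2 of i, B = bit 1 of i, C = bit 0 of i. */
uint16_t blit_logic(uint8_t minterms, uint16_t a, uint16_t b, uint16_t c)
{
    uint16_t d = 0;
    for (unsigned i = 0; i < 8; i++) {
        if (minterms & (1u << i)) {
            /* use each input directly or inverted, as the term demands */
            uint16_t ta = (i & 4u) ? a : (uint16_t)~a;
            uint16_t tb = (i & 2u) ? b : (uint16_t)~b;
            uint16_t tc = (i & 1u) ? c : (uint16_t)~c;
            d |= (uint16_t)(ta & tb & tc);
        }
    }
    return d;
}
```

With A as the mask, B as the source and C as the destination, the mask value 0xCA selects the function D = (A AND B) OR (NOT A AND C), which is exactly the "cookie-cut": where the mask is set the source is taken, elsewhere the destination is kept.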

In a sense, PDIC takes the original Amiga blitter one step further. Instead of a single-bit stencil, PDIC allows the programmer/artist to assign a "coverage" value to each pixel. What this means is the following: Suppose that you have a bitmap that has a red triangle painted into it. At the edge of the triangle, if the triangle were mathematically exact, each pixel would only be partially covered by the triangle's shape. The rest of the pixel would not be covered.
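One simple way to obtain such a coverage value is to test several sample points inside each pixel and count how many fall inside the shape. The following C sketch does this for a triangle with a 4 x 4 sample grid. It is only one possible approach, chosen for clarity; the names `pixel_coverage` and `edge_side` are made up for this example.

```c
/* Signed test: positive if (px, py) lies on one side of the edge a->b,
   negative on the other side, zero exactly on the edge. */
static double edge_side(double ax, double ay, double bx, double by,
                        double px, double py)
{
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax);
}

/* Estimate which fraction (0.0 .. 1.0) of the pixel whose top-left corner
   is (x, y) is covered by the triangle tri = {x0, y0, x1, y1, x2, y2}. */
double pixel_coverage(const double tri[6], int x, int y)
{
    const int n = 4;            /* n * n sample points per pixel */
    int hits = 0;

    for (int j = 0; j < n; j++) {
        for (int i = 0; i < n; i++) {
            double px = x + (i + 0.5) / n;
            double py = y + (j + 0.5) / n;
            double e0 = edge_side(tri[0], tri[1], tri[2], tri[3], px, py);
            double e1 = edge_side(tri[2], tri[3], tri[4], tri[5], px, py);
            double e2 = edge_side(tri[4], tri[5], tri[0], tri[1], px, py);
            /* inside if the point is on the same side of all three edges */
            if ((e0 >= 0 && e1 >= 0 && e2 >= 0) ||
                (e0 <= 0 && e1 <= 0 && e2 <= 0)) {
                hits++;
            }
        }
    }
    return (double)hits / (n * n);
}
```

Pixels deep inside the triangle get a coverage of 1.0, pixels far outside get 0.0, and pixels crossed by an edge get something in between.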

Obviously, if we just paint this triangle onto a normal bitmap, we get something that is frequently called "jaggies" or "aliasing" - namely, a pixel cannot be partially colored but is either colored (red in this case) or it isn't. This aliasing effect becomes more obvious the lower the resolution is, but even at high resolutions it is still visible. There are ways to prevent this (aptly called "anti-aliasing"), but these techniques usually only work well for basic shapes like lines, arcs, and similar things (AmigaOS uses these techniques for "font smoothing", for example).

If we assume that each pixel of our shape also contains a value that specifies its coverage (i.e. what fraction of the pixel is actually colored by the image, and how much is really transparent), we can use this information to blend the red triangle with the color that is already in our destination bitmap. For example, if we want to compose this triangle onto a white canvas, we can weight the color of the incoming pixel by its coverage, and the color of the destination by the complement (one minus the coverage). This way, pixels at the edge of the triangle start to "light up", i.e. the coverage information lets us weight the contributions of source and destination pixels to compute a better color for each pixel. This process is usually called alpha blending, since the coverage is usually called alpha for historical reasons (incidentally, this is also the meaning of the 'A' in ARGB color modes). Alpha blending is only a special case of PDIC, though; PDIC goes beyond that.
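As a small worked example, here is how one 8-bit color channel could be blended in C. The sketch assumes that coverage is stored as a value from 0 (fully transparent) to 255 (fully covered); the source is weighted by the coverage and the destination by 255 minus the coverage. The function name `blend_channel` is hypothetical.

```c
#include <stdint.h>

/* Blend one 8-bit colour channel of a source pixel onto the destination.
   alpha is the source coverage: 0 = transparent, 255 = fully covered. */
uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    unsigned sum = (unsigned)src * alpha + (unsigned)dst * (255u - alpha);
    return (uint8_t)((sum + 127u) / 255u);   /* divide by 255, rounded */
}
```

For the red triangle on a white canvas, an edge pixel with a coverage of 128 keeps its red channel at 255, while green and blue drop from 255 to 127. The result is a light red, halfway between the triangle and the background.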

For starters, there is nothing that prevents us from making the same assumptions about coverage for the destination bitmap as well. We do not need to limit ourselves to completely opaque destinations. In fact, the result of one compose operation can be the input of another. Suppose, for example, that we want to label the red triangle with anti-aliased text. Since, as mentioned above, the coverage information for arcs and lines is easily calculated and fonts are built from arcs and lines, coverage for fonts is easy to obtain. A font can therefore be composed onto our red triangle without effort.

Let's consider another example. Everybody will probably be familiar with the effect that is often used in movies to give the impression of looking through a telescope. Usually, the movie picture is cut to a circular shape to generate this impression. Likewise, a graphic effect that is sometimes encountered is viewing another image through a large text (for example, the title sequence of a movie). While it is easy to generate an "inverted" circle for the telescope effect, generating a reverse of a font might be more difficult; furthermore, we might not actually want to generate extra bitmaps but use an existing bitmap which contains a rendered text with coverage information. In this case, using plain alpha blending doesn't work - we would get the reverse of what we actually want, black letters on top of the destination.

PDIC works with operators. An operator is a rule for computing the resulting pixel color and coverage from the source and destination. These rules determine how the source and destination bitmaps affect each other. A commonly used operator is the Source_Over_Destination operator (the one we described above). A detailed description of these operators is outside the scope of this introduction; the original paper introducing Porter/Duff image compositing is "Compositing Digital Images" by Thomas Porter and Tom Duff (SIGGRAPH 1984).
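Without going through the full set of operators, the general shape of such a rule can be sketched in C. In the Porter/Duff model each operator is just a pair of weights: one for the source and one for the destination. The sketch below assumes floating-point pixels whose color channels are already multiplied by their coverage ("premultiplied alpha"). The type and enum names are made up for this illustration and are not the AmigaOS identifiers.

```c
/* One pixel; r, g, b are assumed to be premultiplied by the coverage a. */
typedef struct { float r, g, b, a; } Pixel;

typedef enum {
    OP_SRC_OVER_DST,   /* source on top of destination          */
    OP_SRC_IN_DST,     /* source, only where destination exists */
    OP_SRC_OUT_DST,    /* source, only where destination is not */
    OP_DST_OVER_SRC    /* destination on top of source          */
} Operator;

/* result = source * fs + destination * fd, with (fs, fd) set by the operator */
Pixel compose(Operator op, Pixel s, Pixel d)
{
    float fs = 0.0f, fd = 0.0f;

    switch (op) {
    case OP_SRC_OVER_DST: fs = 1.0f;       fd = 1.0f - s.a; break;
    case OP_SRC_IN_DST:   fs = d.a;        fd = 0.0f;       break;
    case OP_SRC_OUT_DST:  fs = 1.0f - d.a; fd = 0.0f;       break;
    case OP_DST_OVER_SRC: fs = 1.0f - d.a; fd = 1.0f;       break;
    }

    Pixel out = {
        s.r * fs + d.r * fd,
        s.g * fs + d.g * fd,
        s.b * fs + d.b * fd,
        s.a * fs + d.a * fd
    };
    return out;
}
```

The "image seen through text" effect from the previous section corresponds to the "in" rule here: with the picture as source and the rendered text as destination, the picture survives only where the text has coverage.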

The AmigaOS 4.1 implementation of PDIC can do more than "just" compositing. There are two basic modes of operation: Blit mode, and triangle mode. The Blit mode works like a blitter, i.e. you give two bitmaps (source and destination) and the compositing engine applies the source to the destination with a specified operator at a given coordinate set, optionally scaling the source bitmap by arbitrary factors, and clipping the result to a specified rectangle on the destination.
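What Blit mode does can be approximated in plain C. The sketch below is not the AmigaOS API; it is a simplified software model with one byte per pixel and nearest-neighbour scaling. All names in it (`Bitmap`, `Rect`, `PixelOp`, `scaled_blit`, `op_replace`) are hypothetical. The whole source bitmap is stretched to a target rectangle, only pixels inside the clip rectangle are touched, and each source/destination pair is combined by an operator function.

```c
#include <stdint.h>

typedef struct { int x0, y0, x1, y1; } Rect;      /* x1, y1 are exclusive */
typedef struct { int width, height; uint8_t *pixels; } Bitmap;
typedef uint8_t (*PixelOp)(uint8_t src, uint8_t dst);

static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Simplest possible operator: the source replaces the destination. */
uint8_t op_replace(uint8_t src, uint8_t dst) { (void)dst; return src; }

/* Stretch `src` to the rectangle `target` in `dst`, touching only pixels
   that also lie inside `clip` and inside the destination bitmap. */
void scaled_blit(const Bitmap *src, Bitmap *dst, Rect target, Rect clip,
                 PixelOp op)
{
    int tw = target.x1 - target.x0;
    int th = target.y1 - target.y0;
    if (tw <= 0 || th <= 0) return;

    int x0 = max_int(max_int(target.x0, clip.x0), 0);
    int y0 = max_int(max_int(target.y0, clip.y0), 0);
    int x1 = min_int(min_int(target.x1, clip.x1), dst->width);
    int y1 = min_int(min_int(target.y1, clip.y1), dst->height);

    for (int y = y0; y < y1; y++) {
        int sy = (y - target.y0) * src->height / th;    /* nearest source row */
        for (int x = x0; x < x1; x++) {
            int sx = (x - target.x0) * src->width / tw; /* nearest source column */
            uint8_t *d = &dst->pixels[y * dst->width + x];
            *d = op(src->pixels[sy * src->width + sx], *d);
        }
    }
}
```

Passing a different `PixelOp` changes how source and destination are combined without touching the scaling or clipping logic.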

Triangle mode is much more flexible than that. Instead of target coordinates, the programmer may pass the compositing engine a list of triangles. Each triangle is made up of three vertices, each of which can have individual texture coordinates (including a depth coordinate for perspective correction). The individual triangles are still applied using the Porter/Duff compositing operator, but the programmer has very fine-grained control over the placement and mapping of these. It is easy to e.g. rotate, shear, or otherwise "distort" the bitmap. Effects like the famous "desktop cube", "jiggly" windows, and other desktop effects are easily implemented with this functionality.
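To illustrate how per-vertex texture coordinates determine which source pixel ends up where, here is a small C sketch. It uses barycentric weights and the usual perspective-correct interpolation. The `Vertex` layout and the function name `texcoord_at` are assumptions made for this example, not the actual AmigaOS structures; in particular, `w` is assumed to hold the reciprocal of the depth.

```c
typedef struct {
    float x, y;   /* position in the destination bitmap            */
    float s, t;   /* texture coordinate in the source bitmap       */
    float w;      /* assumed: reciprocal depth, 1.0 = flat mapping */
} Vertex;

/* Twice the signed area of the triangle (a, b, p). */
static float edge_fn(const Vertex *a, const Vertex *b, float px, float py)
{
    return (b->x - a->x) * (py - a->y) - (b->y - a->y) * (px - a->x);
}

/* Compute the texture coordinate for destination point (px, py).
   Returns 1 on success, 0 if the point is outside the triangle or the
   triangle is degenerate. */
int texcoord_at(const Vertex v[3], float px, float py, float *s, float *t)
{
    float area = edge_fn(&v[0], &v[1], v[2].x, v[2].y);
    if (area == 0.0f) return 0;

    /* barycentric weights of the three vertices */
    float b0 = edge_fn(&v[1], &v[2], px, py) / area;
    float b1 = edge_fn(&v[2], &v[0], px, py) / area;
    float b2 = edge_fn(&v[0], &v[1], px, py) / area;
    if (b0 < 0.0f || b1 < 0.0f || b2 < 0.0f) return 0;

    /* perspective correction: interpolate s*w, t*w and w, then divide */
    float w = b0 * v[0].w + b1 * v[1].w + b2 * v[2].w;
    if (w == 0.0f) return 0;
    *s = (b0 * v[0].s * v[0].w + b1 * v[1].s * v[1].w + b2 * v[2].s * v[2].w) / w;
    *t = (b0 * v[0].t * v[0].w + b1 * v[1].t * v[1].w + b2 * v[2].t * v[2].w) / w;
    return 1;
}
```

Moving the vertex positions while keeping the texture coordinates fixed is what rotates, shears, or otherwise distorts the bitmap.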

Cairo

The Cairo Graphics API is a vector graphics API that allows the programmer to take full advantage of resolution-independent graphics and internally uses compositing for some of its rendering (for example for anti-aliased text and shapes). It supports multiple output devices; that means that the same code that draws e.g. a text document in a word processor can generate printer output via a PostScript or PDF file.

Cairo works using paths. A path is an arbitrarily shaped concatenation of line segments and arcs. A path may or may not be closed. Once a path is defined, it can be either "stroked" or "filled". Stroking a path means having an imaginary brush follow along the path, applying color to the "canvas" beneath. Filling, obviously, means filling the area that is enclosed by the path with a color or pattern. Individual pieces of rendering can then be composed onto each other.
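To give an idea of what "filling" a closed path involves, here is a plain C sketch that decides whether a point lies inside a path made of straight line segments, using the even-odd rule. This is not Cairo's API and not necessarily how Cairo implements filling; arcs are left out, and the names `Point` and `path_contains` are made up for this illustration.

```c
typedef struct { double x, y; } Point;

/* Even-odd rule: a point is inside a closed path if a horizontal ray
   starting at the point crosses the outline an odd number of times.
   pts[0..n-1] are the corners; the last corner connects back to the first. */
int path_contains(const Point *pts, int n, double px, double py)
{
    int inside = 0;

    for (int i = 0, j = n - 1; i < n; j = i++) {
        /* does the segment pts[j] -> pts[i] straddle the ray's height? */
        if ((pts[i].y > py) != (pts[j].y > py)) {
            double xcross = pts[j].x + (py - pts[j].y) *
                            (pts[i].x - pts[j].x) / (pts[i].y - pts[j].y);
            if (px < xcross) {
                inside = !inside;   /* one more crossing to the right */
            }
        }
    }
    return inside;
}
```

A filler would color every pixel (or sample point) for which this test succeeds, which also yields a coverage value for pixels on the outline.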

Cairo and Compositing - Applications

The obvious question now is, what do we need this for? Surely, getting transparent windows and rounded corners is nice but hardly rocket science. Therefore, we'll have a look at a few sample applications for these new technologies.

The obvious one: Mozilla Firefox

Mozilla Firefox has, since version 3.0, switched all of its rendering to Cairo. It goes without saying that a fast, efficient, hardware-accelerated implementation of Cairo is a cornerstone of bringing this application to AmigaOS 4.1. There are a couple of applications beyond Firefox that depend on or support Cairo, like OpenOffice, GNU Classpath, and others. So, Cairo is a stepping-stone to getting these applications ported over.

A better GUI design

Mainly, though, the use of Cairo and Porter/Duff compositing allows us to build better user interfaces. For one thing, drawing into off-screen bitmaps and then compositing them into the visible window is, by its very nature, flicker-free - you do not see anything being drawn, you just see the end result. This prevents the usually annoying flicker when drawing animated elements.

By extension, it allows us to make better use of animation in user interfaces. While many people might dismiss this as eye candy, virtually all modern user interfaces make use of it. Suppose you have an "Iconify" button in your window border that you just clicked. If you have several icons on your desktop, chances are that you start looking for the icon of your iconified window. If, however, the window iconification is animated (for example, by zooming the window "into" the icon) you immediately know where the icon went.

Animation also allows for better highlighting of focus. If you have a list of items in a window, using scalable GUI graphics (Cairo) and compositing allows you to zoom the item that the cursor is on, possibly revealing additional information. Defining user interface elements as vector graphics allows us to arbitrarily zoom items, be it for drawing attention to them, or to fit more items onto the screen when things get more crowded.

Compositing also gives users more control over the look of their user interface. Typically, GUI "themes" are built from a set of bitmaps that are fine-tuned to fit together. A checkbox, for example, is usually drawn in such a way that it fits e.g. the background pattern. Compositing, on the other hand, allows an artist or GUI designer to define transparency channels for their GUI element images, and the program can simply compose these elements together. This not only makes the design more flexible, it also allows the user to exchange certain elements (for example the background). Trolltech's Qt4 toolkit makes extensive use of these features to allow modifying the look of an application via style sheets.

Impact for AmigaOS 4.1

Clearly, we're not there yet. But AmigaOS 4.1 and the new technologies introduced with it are the foundation for a more feature-rich user interface experience in future versions. For application developers, Cairo and PDIC offer great possibilities for enhancing the look and feel of their applications. Will there be eye candy? Very likely, yes. But if you look at it, the name "user interface" already implies how important this part of the system is - it is, after all, the direct interface between the system and you, the user. A good user interface experience is a very important part of working with a computer.


(C) 2008 H & T Frieden. All rights reserved