[Award Winner] Bitmapped Images - The technical side of things explained.
This tutorial is going to talk about some technical aspects of bitmapped images which keep coming up as things to point out. Bitmapped images are made up of a rectangle of pixels or dots and contrast with vector images, which are made up of mathematical shapes like lines, polygons, ellipses and so on. Eventually even vector images are viewed by conversion to an array of dots, as almost all of them are displayed on regular PC monitors which are themselves rectangular arrays of dots.
This tutorial is not going to dwell on file formats, compression of images and so on. This is the bit about images that goes on inside the RAM of the computer as part of the drawing application whilst you're working on them.
I should also mention that many of the effects are quite small but significant so you may need to view the attached images at full size or even zoomed in. To view full size, click on them to get the black bordered 'light box' of vbulletin and then click again. If your cursor shows that you can magnify it with a + sign in it then use it and use scroll bars to pan.
Last edited by Redrobes; 07-27-2008 at 11:59 AM.
Pixels and Color
So let's start with a dot or pixel. It is the smallest part of an image that is stored in the computer and is assigned a color. The amount and style of storage of the color depends on how many colors that pixel could have, and there are always limits, so let's begin with the most restrictive. If the pixel can take only black or white then it needs just 1 bit to store it. That means that you can have 8 pixels to one byte. So a 1000x1000 pixel image would take approximately 125 KBytes of storage in RAM (i.e. not too much). At the other extreme is "Full Color" or "True Color" and this can have up to 16 million colors by virtue of the fact that the Red, Green, and Blue channels which make up the color can each take any one of 256 shades. That happens to be the number of states in one byte so it takes exactly 3 bytes for an RGB encoded pixel at full color. So a 1000x1000 pixel image in full color takes approximately 3 MBytes (i.e. a lot more).
The R,G,B in an RGB pixel are known as the color channels or components and the number of bits that are required to encode the shade is known as the bit depth. You can have a 24 bit depth image for the 3 channels, or sometimes it's called 8 bits per component, with the assumed 3 components per pixel.
A grayscale image is normally understood to have one channel at 8 bit depth. The shades within the single channel are luminosity such that a value of 0 is black and a value of 255 is white (0 to 255 inclusive makes the 256 shades).
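To make the storage sums above concrete, here is a minimal sketch. The function name `image_bytes` is made up for illustration, and it assumes rows are packed with no padding beyond rounding each row up to a whole byte (real applications often pad rows further).

```python
def image_bytes(width, height, bits_per_pixel):
    """Bytes of RAM for an uncompressed image.

    Assumption: each row is rounded up to a whole number of bytes
    and there is no other padding or header.
    """
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height


black_and_white = image_bytes(1000, 1000, 1)   # 1 bit per pixel
full_color = image_bytes(1000, 1000, 24)       # 3 bytes per pixel

print(black_and_white, "bytes for black and white")
print(full_color, "bytes for full color")
```

The two results are 125000 and 3000000 bytes, which are the 125 KBytes and 3 MBytes quoted in the text.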
Due to the precious nature of memory in computers (especially older ones), there was a requirement to have something in between these two extremes. It would be nice to have images with a few colors on them for graphs, pie charts and other non photographic type diagrams. Where all of the previous types of image used the color bit value to directly represent the color of the pixel, there is another completely different way of doing it and these types of image are known as color index or paletted. They also have a fixed number of bits assigned to each pixel, which is usually 4 or 8, but instead of the value being the color, the value is an index into a table of colors. Each color in the table can be precisely defined using lots of bits because it is only defined once per image instead of once per pixel. So now the image has two parts: the color index table or palette, and the pixel information.
Using 4 bits per pixel gives a range of 16 indexes. So a set of 16 colors is defined in the color table, each described with 24 bit RGB. Windows has a standard set but it is possible to have any colors in that table for a particular custom image.
Much more common though is to have 8 bits per pixel and a color table or palette of 256 entries. Often some of them will be made to match the standard Windows set, which leaves 240 (or sometimes 236) spare which you can use to define the most common colors in the image. Usually, having 200 or so colors is enough to turn a full color photograph into one using a third of the memory without losing too much color quality. We will see later why it's not usually a good idea to do it though.
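A tiny sketch of the two-part layout might help here. The palette colors, the 2x3 image and the helper names are all invented for illustration; no particular paint application stores things exactly this way.

```python
# A hypothetical 4 entry color table; each entry is a full 24 bit RGB color.
palette = [(0, 0, 0), (255, 255, 255), (200, 30, 30), (30, 60, 200)]

# The pixel data holds only small index values, one per pixel (a 2x3 image).
indexed_pixels = [
    [0, 1, 2],
    [3, 2, 1],
]


def to_full_color(pixels, table):
    """Look every index up in the color table to get direct RGB pixels."""
    return [[table[index] for index in row] for row in pixels]


def indexed_bytes(width, height, bits_per_pixel, palette_entries):
    """Rough storage: packed pixel indexes plus 3 bytes per table entry."""
    pixel_bytes = (width * height * bits_per_pixel + 7) // 8
    return pixel_bytes + palette_entries * 3


rgb_pixels = to_full_color(indexed_pixels, palette)
print(rgb_pixels[0][2])                      # the color stored at index 2
print(indexed_bytes(1000, 1000, 8, 256))     # 8 bit index image with 256 colors
```

For a 1000x1000 image the 8 bit version comes to 1000768 bytes, so the color table itself is a negligible cost next to the pixel data.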
I have implied that the black and white image type is like a direct pixel color type of image but it's equally valid to treat it as a 2 shade color index type where it looks up into an implied 2 entry color table. In most painting applications it is treated more like the latter type than the former. Very few people use it these days in any case because you can do nicer lines using grayscale with antialiasing, which will be discussed later.
I think we can finally talk about these images as bit mapped. By this we mean that each image has sets of bits mapped to its pixels.
Here are some examples of an image in the following formats. a) Full Color, b) Grayscale, c) Black and White, d) 8 bit color index, e) 4 bit color index (custom palette), f) 4 bit color index (windows palette)
In all that talk about color we completely ignored transparency, which images often have. It comes in two flavors depending on whether the image is a color index type or not. If it is then it's quite simple. One of the index colors is assigned to be the transparent color so any pixel which references that index is transparent. For the other type of image you need another channel or component. This channel is called the Alpha Channel and it usually (but not always) has as many bits as each of the main RGB color channels. The net effect is to have RGBA images. This means that a true color image with alpha transparency has 4 bytes per pixel instead of 3. What it also implies is that the transparency has a range of 256 shades so that it can range from completely transparent to slightly transparent through to nearly opaque and then fully opaque. Note however that color index type images with transparency have just the one transparent color index, so these types can only do either fully opaque or fully transparent and nothing in between. The upside is that you only lose one color index and the number of bits stays the same so the image size does not increase.
Most virtual table top (VTT) applications use the alpha channel in full color images to provide transparency and thus allow images to take on shapes other than rectangles - e.g. characters holding weapons and shields.
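As a rough sketch of the difference between the two flavors, the snippet below shows a pixel being drawn over an opaque background. The blend formula is the usual straight (non premultiplied) alpha mix, which is my assumption about what an application would do, and the function names are invented.

```python
def blend_over(fg_rgba, bg_rgb):
    """Mix an RGBA pixel over an opaque RGB background.

    Assumes straight alpha where 0 is fully transparent and 255 fully opaque.
    """
    r, g, b, a = fg_rgba
    alpha = a / 255.0
    return tuple(
        round(fg * alpha + bg * (1.0 - alpha))
        for fg, bg in zip((r, g, b), bg_rgb)
    )


def show_indexed(index, palette, transparent_index, bg_rgb):
    """Color index transparency is all or nothing: one index means 'see through'."""
    return bg_rgb if index == transparent_index else palette[index]


background = (0, 0, 255)
print(blend_over((255, 0, 0, 255), background))   # fully opaque red
print(blend_over((255, 0, 0, 128), background))   # about half transparent
print(blend_over((255, 0, 0, 0), background))     # fully transparent

palette = [(255, 0, 0), (0, 255, 0)]
print(show_indexed(1, palette, transparent_index=1, bg_rgb=background))
```

Only the alpha channel version can produce the in between result; the indexed version either shows the palette color or the background.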
A bitmapped image is a rectangle of pixels and the resolution of the image is the number of pixels in each direction. For example a video monitor screen might have a resolution of 1280x1024 pixels. Higher resolution images have more pixels and can therefore describe a similar image in more detail than a low resolution image. This term is pretty much universally recognized.
Image size has different meanings depending on who you ask. Often it is synonymous with resolution - just the number of pixels. When the image is mapped onto something physical or something representing a physical device then it might have a real world size. For example a monitor screen has a width and height of viewing area. The image could be printed to a sheet of paper having known dimensions. Whenever (and only when) an image has a known resolution mapped to a known physical size, then it can be said to have a pixels per inch value. As soon as anything has a pixels per inch, dots per inch, lines per inch or similar kind of metric then it also has a maximum spatial frequency. All of these names are used to describe whether an image will look good but usually miss out an important but implied 3rd value, which is how far away from the image it will be viewed. For a monitor or printed sheet of paper it is assumed that it will be a short distance, perhaps half a meter. When viewing a poster or billboard the distance will be much larger and therefore the image might look just as good at a much lower dots per inch. So here is some technical proof and guidelines about what constitutes a good value.
The most important surface on which an image must fall is the retina of an eye, and the pupil in front of it has a diameter of about 4mm in medium to low light conditions. As the light dims the pupil opens up and when bright it closes down. The maximum resolution that anything could possibly detect, even given a superhuman retina, still depends on the pupil size, so in theory during low light conditions the eye is potentially sharper even if less sensitive. It's all a bit moot but allows for a calculation of the absolute maximum angular spatial frequency and thus pixels per inch at different distances.
Angular resolution can be determined from the Rayleigh Criterion for mid green (500nm) light in a 4mm pupil as:
sin(theta) = 1.22 * (500 x 10^-9) / (4 x 10^-3)
theta = 0.153 mrad
So that means that at half a meter away, Mr Supereyes can see lines spaced at about 0.077 mm apart which is 330 pixels per inch.
A billboard by the side of a road 10m away could not be resolved by Mr Supereyes beyond 1.5mm which is 17 pixels per inch.
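If you want to try other viewing distances, the same sum can be sketched as below. The function names and the default wavelength and pupil size are just the figures used above wrapped up for convenience, and the spacing uses the small angle approximation (distance times angle).

```python
import math


def rayleigh_angle(wavelength_m, aperture_m):
    """Smallest resolvable angle in radians from the Rayleigh Criterion."""
    return math.asin(1.22 * wavelength_m / aperture_m)


def max_pixels_per_inch(distance_m, wavelength_m=500e-9, pupil_m=4e-3):
    """Finest useful pixels per inch for a perfect eye at a given distance."""
    theta = rayleigh_angle(wavelength_m, pupil_m)
    spacing_m = distance_m * theta      # small angle approximation
    return 0.0254 / spacing_m           # 0.0254 m in one inch


for distance in (0.5, 1.0, 10.0):
    print(distance, "m :", round(max_pixels_per_inch(distance)), "pixels per inch")
```

At half a meter this comes out in the low 330s and at 10m about 17, in line with the rounded figures above.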
So if you print a character sheet or page of a book to be read up close then 600dpi is about the maximum that you will need. A picture to hold up and share needs about 300dpi, a battle mat viewed at about a meter away 150dpi, and a poster on the wall maybe 100dpi. For comparison, a top spec 17" laptop screen is 1920x1200 which is approximately 130dpi. Any digital camera purchased for printing full images onto A4 will be a bit pointless after 10 megapixels.
It's also worth mentioning that if you're going to print any of the maps onto A4, which is 8.3 x 11.7 inches, then at 300dpi that means images much larger than 2500x3500 are a bit wasted - but you can cater for large format printers if you fancy.
More about color
I mentioned recently that in the world of computers color is a bit of a nightmare generally. You can specify absolute physical quantities for color spectra and you can buy Pantone swatches which have color calibrated printed areas for unified color, but there's something which you can never calibrate or make any adjustment to and that is the human eye. Members of my family have a very common red-green deficiency and some friends see almost gray tones only. Arcana mentioned recently he is color blind to some extent also. Everyone sees color differently to a greater or lesser degree.
If you can clearly see the two digit number in the image below you're not color blind, or at least not red-green deficient, but there are many charts to check all sorts of color deficiencies - one is not enough to cover them.
Light, as I am sure you all know, is a continuous spectrum from deep red to deep violet and yet computer screens display it with just 3 colors - Red, Green and Blue. It just so happens that you can get the effect of most colors that the eye responds to by using a mix of the three. The word 'most' in there is very important and it is a fact that there are colors that simple red, green and blue will never be able to produce. Also, the specific red, green and blue used by monitors and cameras are even more limited. Basically what a monitor can produce is a subset of all colors. Also, what an individual's eye can see is a subset of all the colors that all people can see. Notably, people who have had a damaged lens removed and substituted with a prosthetic (clear plastic I suppose) can allegedly see further into the 'ultra' violet than normal people.
All eyes and all devices (monitors, printers etc) have a color gamut which is the range of colors that they can 'deal' with. If you try to send one color from one device into another device, it might end up being 'out of gamut' for that device. A good example of this is an infra red camera sending that color to the eye. It's in gamut for the specialist camera but not for the eye - it needs a color shift up into the visible.
EDIT -- Link to page of tests:
I have mentioned that you can encode colors using 3 values of shades of Red, Green, and Blue to provide a wide range of colors but not a complete set compared to an average eye. The trio of R,G & B make up a 3 dimensional space where on one axis you have red, another green and another blue. Three axes, all perpendicular - well that's a cube right ? So you can also find references to color cubes, but most people talk about color spaces since it's possible to have more than 3 components to encode a color.
Another common color space is Hue, Saturation, and Luminance (HSL; its close relative HSV uses Value in place of Luminance). The hue is the color tint, the saturation determines how strong it is and the luminance is how bright. Within this space it is easy to desaturate an image by merely reducing the single saturation value.
There are more color spaces but they start to appear more complicated as you go on. One to just mention is the Y, Cr, Cb one which is used in color TV and another important one is CMYK which I will deal with in a mo.
There is always a method for converting between different color spaces though some of them are not exact equations - especially when going from one to another with a different number of channels - notably RGB to CMYK. I.e. there may be more than one set of values in one space that means the same as one value in another. Take HSL: any value with L = 0 is black no matter what H & S are, but only a zero in all of R,G, and B will mean black. Since every different value in RGB means a different color, and more than one value in HSL can mean the same color in RGB, it follows that there are colors in 3 byte RGB which cannot be represented in 3 byte HSL. When a limited number of bits is used to represent each component, a true conversion value might have to be rounded. Therefore converting between color spaces can reduce the color quality of an image.
I need to cover color spaces so that we know what happens when we blend two different colors in a particular color space. Mixing two colors and averaging them is the equivalent of plotting a line across the color space between the two color points and picking the point in the middle of the line, and that is the color that will be produced. In RGB this is easily done in numbers. So if you have black RGB( 0,0,0 ) and white RGB( 255,255,255 ) and average them then that's approximately RGB( 128,128,128 ) which is a mid gray color. Easy. Had we gone from pure red to pure blue then we have RGB( 255,0,0 ) to RGB( 0,0,255 ) which gives RGB( 128,0,128 ) - a kind of deep mauve. Had I done this in HSL format though, with the hue scaled from 0 to 255 around the color wheel, it would have been HSL( 0,255,128 ) to HSL( 170,255,128 ) giving HSL( 85,255,128 ) which is actually green as the hue has gone through the spectrum halfway. So noting that result we can see that averaging colors in different color spaces gives very different results. There is no 'right' answer or logically 'right' color space to use for this purpose.
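You can try the red to blue example yourself with Python's standard `colorsys` module. Note that `colorsys` works with values from 0.0 to 1.0 and orders the channels as H, L, S, so the helper names and the scaling to 0-255 below are my own choices for this sketch.

```python
import colorsys


def average_rgb(c1, c2):
    """Midpoint of the straight line between two colors in RGB space."""
    return tuple((a + b) / 2 for a, b in zip(c1, c2))


def average_hls(c1, c2):
    """Midpoint of the line between the same two colors in HLS space."""
    hls1 = colorsys.rgb_to_hls(*c1)
    hls2 = colorsys.rgb_to_hls(*c2)
    middle = [(a + b) / 2 for a, b in zip(hls1, hls2)]
    return colorsys.hls_to_rgb(*middle)


def to_bytes(color):
    """Scale 0.0-1.0 channels to the 0-255 values used in the text."""
    return tuple(round(channel * 255) for channel in color)


red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)

print("averaged in RGB:", to_bytes(average_rgb(red, blue)))
print("averaged in HLS:", to_bytes(average_hls(red, blue)))
```

The RGB average is the mauve (128, 0, 128) whereas the hue average lands on pure green (0, 255, 0), because halfway round the hue wheel from red to blue passes through green.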
The letters CMYK stand for Cyan, Magenta, Yellow, and Black (K so as not to confuse B with blue I suppose). The first three of these are the primary colors for paint or color absorbing materials and are opposite to RGB which are the primary colors for light or color emitting materials. If you take a full light spectrum and remove red from it then it looks cyan in color. If you remove green then it looks magenta and if you remove blue then it looks yellow. Similarly if you paint some magenta and yellow then it looks red, if you paint cyan and yellow then it looks green, and cyan and magenta looks blue. If you paint perfect cyan, magenta and yellow all together then it goes black. However the inks used in printers are not perfect and it comes out a dark muddy brown color, and people like their black very black, so the addition of the last K channel accounts for that.
As mentioned before there is not a single equation to convert from RGB to CMYK and some color curves are usually used to look up what levels of CMYK should be used for any RGB. The usual rough rule is to work out how much gray is involved and put that on the K channel, then work out the remainder with the C, M and Y. Because of the funny color curves used and the four axis color space, averaging colors in CMYK can also produce odd shades when blending colors.
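The rough rule can be sketched as follows. This is only the naive textbook version, assuming perfect inks and ignoring the color curves a real printer driver would use, and the function name is invented.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB (0-255) to CMYK (0.0-1.0) conversion.

    Assumption: perfect inks, no color profile or lookup curves.
    """
    # Each ink removes its opposite light primary.
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
    # The gray part shared by all three inks goes onto the K channel.
    k = min(c, m, y)
    if k == 1.0:
        return (0.0, 0.0, 0.0, 1.0)     # pure black, no colored ink needed
    # Whatever is left over is made up with C, M and Y.
    return tuple((ink - k) / (1 - k) for ink in (c, m, y)) + (k,)


print(rgb_to_cmyk(255, 0, 0))       # red is magenta plus yellow
print(rgb_to_cmyk(0, 0, 0))         # black uses only the K ink
print(rgb_to_cmyk(128, 128, 128))   # mid gray is K ink only
```

Red comes out as full magenta and yellow with no cyan or black, which matches the paint mixing described above.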
Right, we have covered color and resolution now so we can finally start doing something with them. We can exchange color for resolution or resolution for color. If we up the resolution and reduce the number of colors then this is called dithering and is a very common process. In fact when you print out an image it will be done for you as part of the printer driver even if you don't ask for it. Dithering can be applied to all images but generally paint apps do not perform it on the color index type. For the other type, the dithering is done per channel so we might as well stick to one channel and use grayscale images to show the effects.
There are many types of dithering but the most common is halftoning, which is also known as an ordered dither. Each pixel with a large number of gray shades is converted to fewer shades (usually black and white only) by substituting it with a small grid of new pixels where the average of the new grid is the same shade as the original pixel. What is vital to understand is that if you dither an image then you must up its resolution by the dither grid size if you want to preserve the image quality. The dither grid size should depend on the number of shades being converted. So if we were to go from full grayscale to black and white that is a 256 to 2 color drop. So using a 16x16 grid should be the minimum to allow for the full averaging. I doubt any paint app would use that and it's more likely to use a 3x3 or 4x4 grid instead. Therefore, expect some loss and at least multiply the resolution by 4 in each direction.
A second but slightly less common form is error diffusion. When converting, the PC keeps track of what color the dither so far would average to, compares that with the next pixel and gives the next dither pixel whichever color gets the dither average as close as possible to the sample image pixel. If that was in black and white only then if the dither average is too bright it puts a black dot down, if too dark then a white one. Over space, the error wibbles up and down but averages out to track the sample image. If the sample image is made up of thin lines then this technique does not work so well as it confuses it, but if the resolution is multiplied by 4 in each direction then, generally, error diffusion is better than halftone.
The image shown next is the color one dithered into the 16 color Windows palette. Not that bad considering the bizarre set of colors contained in that palette.
Now that you can appreciate what is happening behind the scenes you will also appreciate printer DPI settings. A standard color desktop printer might have a DPI rating of 1440 but it only has 4 or maybe 6 ink colors. A color photo has to be dithered in CMYK down to 4 inks before being sprayed. The 1440 rating is the number of dots per inch per component. So it is not the DPI that you can print a full color photo at. A modern color ink jet printer uses error diffusion (sometimes called giclee for some bizarre and cost increasing marketing hype reason) so to print a full color pixel you should expect to divide this number by something between 4 and 16 to get the real 'true color' DPI rating of the final print. So a 1440 DPI printer can print photos at about 200 pixels per inch which is still quite good as can be seen from our previous discussion about resolution. Also, as a tip, don't try to pre-dither images before sending to the printer - let the printer do it.
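Here is a minimal sketch of the grid substitution idea using a 4x4 grid. The threshold table is the standard 4x4 Bayer pattern, which is my choice for illustration; paint apps may well use a different pattern or a clustered dot, and the function name is invented.

```python
# Standard 4x4 Bayer ordering: dots switch on in this order as shades get lighter.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def halftone(gray, matrix=BAYER_4X4):
    """Replace each 0-255 gray pixel with an n x n grid of 0 (black) / 1 (white).

    The output is n times larger in each direction than the input.
    """
    n = len(matrix)
    cells = n * n
    out = []
    for row in gray:
        block_rows = [[] for _ in range(n)]
        for value in row:
            # How many white dots (0..cells) best match this shade.
            dots = round(value * cells / 255)
            for y in range(n):
                block_rows[y].extend(
                    1 if matrix[y][x] < dots else 0 for x in range(n)
                )
        out.extend(block_rows)
    return out


for line in halftone([[0, 64, 128, 255]]):
    print("".join("#" if bit else "." for bit in line))
```

With a 4x4 grid there are only 17 possible dot counts, so the 256 input shades get squeezed into 17, which is the loss mentioned above.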
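The running error idea can be sketched along a single row of pixels as below. This is a deliberately simplified one dimensional version with an invented function name; real implementations such as Floyd-Steinberg also spread the error onto the rows below.

```python
def diffuse_row(row):
    """Convert one row of 0-255 gray values to black (0) / white (255) dots.

    The error made at each pixel is carried forward to the next one so that
    the running average of the dots tracks the original shades.
    """
    out = []
    error = 0.0
    for value in row:
        wanted = value + error           # shade we owe, including past error
        dot = 255 if wanted >= 128 else 0
        out.append(dot)
        error = wanted - dot             # too bright gives negative, too dark positive
    return out


print(diffuse_row([128] * 8))   # mid gray alternates white and black
print(diffuse_row([64] * 8))    # quarter gray gives roughly one white dot in four
```

In the quarter gray case two of the eight dots come out white, so the average over the row is very close to the original shade.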
We stated earlier that we can go back the other way too. We can trade resolution for more colors. First, up the number of colors in the image to full color or grayscale and then average sections of the image out. For example assume that we can see that a halftone image was processed with a 4x4 grid. Then average those grids out. We can do the same with an error diffused dither by using a blur. By looking at dithered images from farther away than you can resolve the dot pattern, your eye is just doing the averaging for you.
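Going back the other way for the halftone case might look like this sketch. It assumes the image is stored as rows of 0 (black) and 1 (white), that its size is an exact multiple of the grid, and that you already know the grid size; the function name is invented.

```python
def average_blocks(bits, n):
    """Turn each n x n block of 0/1 dots back into a single 0-255 gray pixel."""
    height, width = len(bits), len(bits[0])
    out = []
    for top in range(0, height, n):
        out_row = []
        for left in range(0, width, n):
            white = sum(
                bits[y][x]
                for y in range(top, top + n)
                for x in range(left, left + n)
            )
            out_row.append(round(255 * white / (n * n)))
        out.append(out_row)
    return out


# A 4x4 checkerboard of dots (half white) next to a solid white block.
dots = [
    [0, 1, 0, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 1, 1, 1],
]
print(average_blocks(dots, 4))
```

The checkerboard block averages out to mid gray and the solid block to white, at a quarter of the resolution in each direction.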
You're sat in a tour bus in the Serengeti looking out over the vista to a herd of zebra. You get your digital camera out, click and capture that image, and driving away you capture some more of the same zebra. The original is effectively infinite in detail but your camera will capture a number of pixels depending on the camera resolution, encode them (probably compress them too) and store them on the camera flash card. From that point on you cannot get any detail from that image with a finer resolution than that of the camera's imaging sensor. The camera sensor has 'sampled' the infinitely detailed image at regular intervals and collected only those samples in the camera.
Later on you look at the zebra and note that in the middle picture the stripes on his front leg are reversed to the first picture and then later, wham !, what the hell happened with that third image ?
What you're looking at are the effects of undersampling and a process called 'aliasing'. The same effect is also called Moire when it refers to the 'fringes' that appear in closely spaced lines at a changing angle. An alias is a second instance of something, usually a second name of a gunslinger like "Alias Smith and Jones" (http://www.imdb.com/title/tt0066625/). In graphics it refers to secondary instances of stuff which is not supposed to be there - like the new zebra stripes. Also, once you have them it is extremely difficult to recover the image so that it shows what the stripes should have been like.
The precise nature of what is happening to your zebra leg is quite complicated and is wrapped up in a lot of math which we must lightly dip our toes into in order to fix the problem. Suffice it to say that in this instance what happened was that the next sample skipped over a stripe and went into the next one after. It's the same as the wagon wheels going backwards in old cowboy movies, where the film camera sample rate of 24 frames per second is just below the rate at which the wheel spokes move around. The movie camera frame rate is undersampling the action.
This is the main reason why holding original photos and art at high resolution is essential. If you're going to produce a final piece of art at 1000x1000 pixels then you should work in much more than that - say 4000x4000 - and only convert to 1000x1000 at the final stages. If for any reason there are details in the original that are too fine to reproduce at 1000x1000 then they will not only be lost but could cause weird effects to be produced at the final re-sample.
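A toy one dimensional zebra shows the skipping effect. The stripe width of 5 pixels and the sample spacings are made up for illustration, as are the function names.

```python
def stripe(x, period=10):
    """Black (0) for the first half of each period, white (1) for the second."""
    return 1 if (x % period) >= period // 2 else 0


def sample(step, count, period=10):
    """Take `count` samples of the stripes, one every `step` pixels."""
    return [stripe(i * step, period) for i in range(count)]


print("every pixel    :", sample(1, 20))   # the real stripes, 5 black then 5 white
print("every 10 pixels:", sample(10, 6))   # always lands on black, stripes vanish
print("every 9 pixels :", sample(9, 20))   # slow false stripes, an alias
```

Sampling every 9 pixels, just under the stripe period, produces stripes that look ten samples wide and run in the reverse order, which is the same trick the wagon wheels play on the movie camera.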
Ok so we have an aliasing issue, what we need is an aliasing busting technique - cue antialiasing.
Antialiasing is all about doing stuff at a higher sample rate than the final, filtering and then resampling to the final rate. The idea is that the effects that would have been present from the undersample will be beyond the filter and therefore not in the final image. So we're back to our zebra. If the camera had filtered off any higher frequencies than the pixel sample rate then as the stripes on the leg of the zebra got thinner, eventually they would blur out and become a solid mid gray color. And then as it got farther away still the whole animal would become solid gray but would at least look better than the new stripes from the earlier example.
So, always work at a higher resolution than the final. Make a copy of the high resolution image and blur it just a little so that a few pixels blur together and then resample it smaller to final size. If going 2 to 1 then blur just enough for two pixels to look like a single blob and then make half the size, if going 10 to 1 then blur so that 10 pixels become one blob and then make one tenth the size.
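Here is a minimal sketch of the difference on one row of pixels. The "blur" is the simplest possible one, a plain average over each group of pixels (a box filter), which is my simplification of what a paint app's blur followed by a resize would do; the function names are invented.

```python
def downsample_naive(row, factor):
    """Just keep every `factor`-th pixel with no filtering at all."""
    return row[::factor]


def downsample_box(row, factor):
    """Average each group of `factor` pixels into one blob, then keep that."""
    return [
        sum(row[i:i + factor]) / factor
        for i in range(0, len(row) - factor + 1, factor)
    ]


# Very fine zebra stripes: alternating black and white, one pixel wide.
fine_stripes = [0, 255] * 8

print(downsample_naive(fine_stripes, 2))   # only ever hits black pixels
print(downsample_box(fine_stripes, 2))     # stripes blur to solid mid gray
```

The unfiltered version turns the stripes into solid black, which is simply wrong, while the filtered version gives the solid mid gray that the stripes should fade into.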
This form of antialiasing is known as "super sampling". It is the type also known as "Full Scene Antialiasing" (FSAA) when setting up video cards for games.