View Full Version : [Award Winner] Bitmapped Images - The technical side of things explained.
07-27-2008, 11:40 AM
This tutorial is going to talk about some technical aspects of bitmapped images which keep coming up as things to point out. Bitmapped images are made up of a rectangle of pixels or dots, in contrast with vector images, which are made up of mathematical shapes like lines, polygons, ellipses and so on. Eventually even vector images are viewed by conversion to an array of dots, since all but the smallest fraction are displayed on regular PC monitors, which are themselves rectangular arrays of dots.
This tutorial is not going to dwell on file formats, compression of images and so on. This is the bit about images that goes on inside the RAM of the computer, as part of the drawing application, whilst you're working on them.
I should also mention that many of the effects are quite small but significant so you may need to view the attached images at full size or even zoomed in. To view full size, click on them to get the black bordered 'light box' of vbulletin and then click again. If your cursor shows that you can magnify it with a + sign in it then use it and use scroll bars to pan.
07-27-2008, 11:41 AM
So let's start with a dot or pixel. It is the smallest part of an image that is stored in the computer and is assigned a color. The amount and style of storage of the color depends on how many colors that pixel could have, and there are always limits, so let's begin with the most restrictive. If the pixel can take only black or white then it needs just 1 bit to store it. That means that you can have 8 pixels to one byte. So a 1000x1000 pixel image would take approximately 125 KBytes of storage in RAM (i.e. not too much). At the other extreme is "Full Color" or "True Color" and this can have up to 16 million colors by virtue of the fact that the Red, Green, and Blue channels which make up the color can each take any one of 256 shades. That happens to be the number of states in one byte so it takes exactly 3 bytes for an RGB encoded pixel at full color. So a 1000x1000 pixel image in full color takes approximately 3 MBytes (i.e. a lot more).
The R, G, B in an RGB pixel are known as the color channels or components, and the number of bits required to encode the shade is known as the bit depth. You can have a 24 bit depth image for the 3 channels, or sometimes it's called 8 bits per component, with the assumed 3 components per pixel.
A grayscale image is generally understood to have one channel and 8 bit depth. The shades within the single channel are luminosity such that a value of 0 is black and a value of 255 is white (0 to 255 inclusive makes the 256 shades).
Due to the precious nature of memory in computers (especially older ones), there was a requirement to have something in between these two extremes. It would be nice to have images with a few colors on them for graphs, pie charts and other non photographic type diagrams. Where all of the previous types of image used the color bit value to directly represent the color of the pixel, there is another completely different way of doing it and these types of image are known as color index or paletted. They also have a fixed number of bits assigned to each pixel, which is usually 4 or 8, but instead of the value being the color, the value is an index into a table of colors. The color in the color table can be precisely defined using lots of bits because you only have it defined once per image instead of once per pixel. So now the image has two parts: the color index table or palette, and the pixel information.
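The storage sums above can be checked with a few lines of Python. This is only a sketch: the function name image_bytes is made up for illustration, and it assumes the pixels are packed tightly with no row padding, which real applications may not do.

```python
def image_bytes(width, height, bits_per_pixel):
    """Bytes needed to hold an uncompressed image in RAM (no row padding assumed)."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes

black_and_white = image_bytes(1000, 1000, 1)   # 1 bit per pixel
full_color = image_bytes(1000, 1000, 24)       # 3 bytes per pixel

print(black_and_white)  # 125000 bytes, about 125 KBytes
print(full_color)       # 3000000 bytes, about 3 MBytes
```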
By using 4 bits per pixel that gives a range of 16 indexes. So a set of 16 colors are defined in the color table and described each with 24 bit RGB. Windows has a standard set but it is possible to have any colors in that table for a particular custom image.
Much more common tho is to have 8 bits per pixel and a color table or palette of 256 entries. Often some of them will be made to match the standard Windows set, which leaves either 240 (or sometimes 236) spare which you can use to define the most common colors in the image. Usually, having 200 or so colors is enough to turn a full color photograph into one using a third of the memory without losing too much color quality. We will see later why it's not usually a good idea to do it tho.
I have implied that the black and white image type is like a direct pixel color type of image but it's equally valid to treat it as a 2 shade color index type where it looks up into an implied 2 entry color table. In most painting applications it is treated more like the latter type than the former. Very few people use it these days in any case because you can do nicer lines using grayscale with antialiasing, which will be discussed later.
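To make the two-part idea concrete, here is a minimal sketch of a color index image in Python. The palette entries and pixel values are invented for illustration and only 4 entries are shown instead of 16 or 256, but the lookup works the same way.

```python
# A hypothetical 4-entry palette; real 8 bit images have up to 256 entries.
palette = [
    (0, 0, 0),        # index 0: black
    (255, 255, 255),  # index 1: white
    (200, 30, 30),    # index 2: a red
    (30, 60, 200),    # index 3: a blue
]

# The pixel data holds small index numbers, not colors.
indexed_pixels = [
    [0, 1, 1, 2],
    [3, 3, 2, 0],
]

def to_rgb(pixels, table):
    """Expand an indexed image to direct RGB by looking each index up."""
    return [[table[index] for index in row] for row in pixels]

rgb_pixels = to_rgb(indexed_pixels, palette)
print(rgb_pixels[0][3])  # (200, 30, 30)
```

Changing one palette entry recolors every pixel that points at it, which is why the precise color only needs storing once per image.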
I think we can finally talk about these images as bit mapped. By this we mean that it has sets of bits mapped to the pixels for the image.
Here are some examples of an image in the following formats. a) Full Color, b) Grayscale, c) Black and White, d) 8 bit color index, e) 4 bit color index (custom palette), f) 4 bit color index (windows palette)
07-27-2008, 11:41 AM
In all that talk about color we completely ignored transparency, which images often have. It comes in two flavors depending on whether the image is a color index type or not. If it is then it's quite simple. One of the index colors is assigned to be the transparent color so any pixel which references that index is transparent. For the other type of image you need another channel or component. This channel is called the Alpha Channel and it usually (but not always) has as many bits as each of the main RGB color channels. The net effect is to have RGBA images. This means that a true color image with alpha transparency has 4 bytes per pixel instead of 3. What it also implies is that the transparency has a range of 256 shades so that it can range from completely transparent to slightly transparent through to nearly opaque and then fully opaque. Note however that color index type images with transparency have the one single color index so that these types can only do either fully opaque or fully transparent and nothing between the two. The upside is that you only lose one color index and the number of bits stays the same so the image size does not increase.
Most virtual table top (VTT) applications use the alpha channel in full color images to provide transparency and thus allow images to take on shapes other than rectangles - e.g. characters holding weapons and shields.
07-27-2008, 11:42 AM
A bitmapped image is a rectangle of pixels and the resolution of the image is the number of pixels in each direction. For example a video monitor screen might have a resolution of 1280x1024 pixels. Higher resolution images have more pixels and can therefore describe a similar image in more detail than a low resolution image. This term is pretty much universally recognized.
An image size has a different meaning depending on who you ask. Often it is synonymous with resolution - just the number of pixels. When the image is mapped onto something physical or something representing a physical device then it might have a real world size. For example a monitor screen has a width and height of viewing area. The image could be printed to a sheet of paper having known dimensions. Whenever (and only when) an image has a known resolution mapped to a known physical size, then it can be said to have a pixels per inch value. As soon as anything has a pixels per inch, dots per inch, lines per inch or similar kind of metric then it also has a maximum spatial frequency. All of these names are used to describe whether an image will look good but usually miss out an important but implied 3rd value, which is how far away the image will be viewed from. For a monitor or printed sheet of paper it is assumed that it will be a short distance, perhaps half a meter. When viewing a poster or billboard the distance will be much larger and therefore the image might look just as good at a much lower dots per inch. So here is some technical proof and guidelines about what constitutes a good value.
The most important surface on which an image must fall is the retina of the eye, whose lens / pupil has a diameter of about 4mm in medium to low light conditions. As the light dims the pupil opens up and when bright it closes down. The maximum resolution that anything could possibly detect, even given a superhuman retina, still depends on the pupil size, so in theory during low light conditions the eye is potentially sharper even if less sensitive. It's all a bit moot but allows for a calculation of absolute maximum angular spatial frequency and thus pixels per inch at different distances.
Angular resolution (http://en.wikipedia.org/wiki/Angular_resolution) can be determined from the Rayleigh Criterion for mid green (500nm) light in a 4mm pupil as:
sin(theta) = 1.22 * 500x10^-9 / 4x10^-3
theta = 0.153mrads
So that means that at half a meter away, Mr Supereyes can see lines spaced at about 0.077 mm apart which is 330 pixels per inch.
A billboard by the side of a road 10m away could not be resolved by Mr Supereyes beyond 1.5mm which is 17 pixels per inch.
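The same Rayleigh sums can be run for any viewing distance with a short Python sketch. The function names are made up for illustration and the 500nm / 4mm figures are the ones used above. It is an upper bound for a perfect eye, not a claim about what real people can see.

```python
import math

WAVELENGTH_M = 500e-9   # mid green light
PUPIL_M = 4e-3          # pupil diameter
MM_PER_INCH = 25.4

# Rayleigh criterion: sin(theta) = 1.22 * wavelength / aperture
theta = math.asin(1.22 * WAVELENGTH_M / PUPIL_M)  # radians

def finest_spacing_mm(distance_m):
    """Smallest resolvable line spacing at a viewing distance, in millimetres."""
    return distance_m * math.tan(theta) * 1000.0

def max_useful_ppi(distance_m):
    """Pixels per inch beyond which the perfect eye could not tell the difference."""
    return MM_PER_INCH / finest_spacing_mm(distance_m)

for distance in (0.5, 1.0, 10.0):
    print(distance, round(finest_spacing_mm(distance), 3), round(max_useful_ppi(distance)))
```

At half a meter this comes out at roughly 330 pixels per inch and at 10m roughly 17, matching the figures above.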
So if you print a character sheet or page of book to be read up close then 600dpi is about the maximum that you will need. A picture to hold up and share 300dpi. A battle mat viewed at about a meter away 150dpi and a poster on the wall maybe 100dpi. For comparison, a top spec 17" laptop screen is 1920x1200 which is approximately 130dpi. Any digital camera purchased for printing full images onto A4 will be a bit pointless after 10 megapixels.
It's also worth mentioning that if you're going to print any of the maps onto A4, which is 8.3 x 11.7 inches, then at 300dpi that means images much larger than 2500x3500 are a bit wasted - but you can cater for large format printers if you fancy.
07-27-2008, 11:42 AM
I mentioned recently that in the world of computers color is a bit of a nightmare generally. You can specify absolute physical quantities for color spectra and you can buy Pantone swatches which have color calibrated printed areas for unified color, but there's something which you can never calibrate or make any adjustment to and that is the human eye. Members of my family have a very common red-green deficiency and friends are almost grey tone only. Arcana mentioned recently he is color blind to some extent also. Everyone sees color differently to a greater or lesser degree.
If you can clearly see the two digit number in the image below you're not color blind, or at least not red-green deficient, but there are many charts to check all sorts of color deficiencies - one is not enough to cover them.
Light, as I am sure you all know, is a continuous spectrum from deep red to deep violet and yet computer screens display it with just 3 colors - Red, Green and Blue. It just so happens that you can get the effect of most colors that the eye responds to from using a mix of the three. The word 'most' in there is very important and it is a fact that there are colors that simple red, green and blue will never be able to produce. Also, the specific red, green and blue used by monitors and cameras is even more limited. Basically what a monitor can produce is a subset of all colors. Also, what an individual's eye can see is a subset of all colors from all people. Notably, people who have had a damaged lens removed and substituted with a prosthetic (clear plastic I suppose) can allegedly see further into the 'ultra' violet than normal people.
All eyes and all devices (monitors, printers etc) have a color gamut, which is the range of colors that they can 'deal' with. If you try to send one color from one device into another device, it might end up being 'out of gamut' for that device. A good example of this is an infra red camera sending that color to the eye. It's in gamut for the specialist camera but not for the eye - it needs a color shift up into the visible.
EDIT -- Link to page of tests:
07-27-2008, 11:43 AM
I have mentioned that you can encode colors using 3 values of shades of Red, Green, and Blue to provide a wide range of colors but not a complete set compared to an average eye. The trio of R, G & B make up a 3 dimensional space where on one axis you have red, another green and another blue. Three axes, all perpendicular - well that's a cube right? So you can also find references to color cubes but most people talk about color spaces since it's possible to have more than 3 components to encode a color.
Another common color space is the Hue, Saturation, and Luminance (HSL) one, with HSV as a close relative that uses a "value" channel in place of luminance. The hue is the color tint, the saturation determines how strong it is and the luminance is how bright. Within this space it is easy to desaturate an image by merely reducing the single saturation value.
There are more color spaces but they start to appear more complicated as you go on. One to just mention is the Y, Cr, Cb one which is used in color TV and another important one is CMYK which I will deal with in a mo.
There is always a method for converting between different color spaces though some of them are not exact equations - especially when going from one to another of different number of channels - notably RGB to CMYK. I.e. there may be more than one set of values in one space to mean the same as one value in another. Take HSL, any value with L = 0 is black no matter what H & S but only a zero in all of R,G, and B will mean black. Since all different values in RGB mean different colors and more than one value in HSL mean one color in RGB then it implies that there are colors in 3 byte RGB which cannot be represented in 3 byte HSL. When using a limited number of bits used to represent the component then a true conversion value might have to be rounded. Therefore converting between color spaces reduces the color quality of an image.
I need to cover color spaces so that we know what happens when we blend two different colors in a particular color space. Mixing two colors and averaging them is the equivalent of plotting a line across the color space between the two color points and picking the point in the middle of the line, and that is the color that will be produced. In RGB this is easily done in numbers. So if you have black RGB( 0,0,0 ) and white RGB( 255,255,255 ) and average them then that's approximately RGB( 128,128,128 ) which is a mid gray color. Easy. Had we gone from pure red to pure blue then we have RGB( 255,0,0 ) to RGB( 0,0,255 ) which gives RGB( 128,0,128 ) - a kind of deep mauve. Had I done this in HSL format tho, with hue scaled 0 to 255, it would have been HSL( 0,255,128 ) to HSL( 170,255,128 ) giving HSL( 85,255,128 ) which is actually green as the hue has gone through the spectrum halfway. So noting that result we can see that averaging colors in different color spaces gives very different results. There is no 'right' answer or logically 'right' color space to use for this purpose.
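The red to blue blend can be tried in both spaces with Python's standard colorsys module. This is just a sketch: colorsys works with channels scaled 0.0 to 1.0 and orders them hue, lightness, saturation (HLS), and the to_bytes helper is made up for printing.

```python
import colorsys

red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)

# Average in RGB: the midpoint of a straight line through the RGB cube.
rgb_mid = tuple((a + b) / 2 for a, b in zip(red, blue))

# Average in HLS: the midpoint of a line through the hue/lightness/saturation space.
hls_red = colorsys.rgb_to_hls(*red)
hls_blue = colorsys.rgb_to_hls(*blue)
hls_mid = tuple((a + b) / 2 for a, b in zip(hls_red, hls_blue))
hsl_mid_as_rgb = colorsys.hls_to_rgb(*hls_mid)

def to_bytes(color):
    """Scale 0.0-1.0 channels to 0-255 for display."""
    return tuple(round(c * 255) for c in color)

print(to_bytes(rgb_mid))         # (128, 0, 128): the deep mauve
print(to_bytes(hsl_mid_as_rgb))  # (0, 255, 0): green
```

Same two colors, same "average", two very different answers.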
07-27-2008, 11:43 AM
The letters CMYK stand for Cyan, Magenta, Yellow, and Black (K so as not to confuse B with blue I suppose). The first three of these are the primary colors for paint or color absorbing materials and are opposite to RGB, which are the primary colors for light or color emitting materials. If you take a full light spectrum and remove red from it then it looks cyan in color. If you remove green then it looks magenta and if you remove blue then it looks yellow. Similarly if you paint some magenta and yellow together then it looks red, cyan and yellow looks green, and cyan and magenta looks blue. If you paint perfect cyan, magenta and yellow all together then it goes black. However the inks used in printers are not perfect and it comes out a dark muddy brown color, and people like their black very black, so the addition of the last K channel accounts for that.
As mentioned before there is not a single equation to convert from RGB to CMYK and some color curves are usually used to look up what levels of CMYK should be used for any RGB. The usual rough rule is to work out how much gray is involved and put that on the K channel, then work out the remainder with the C, M and Y. Because of the funny color curves used and the four axis color space, averaging colors in CMYK can also produce odd shades when blending colors.
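The rough rule can be written down as the naive textbook formula below. To be clear this is a sketch of the rule of thumb only: real printer drivers use measured color curves and profiles, so nothing here should be taken as what any actual printer does. The function name is invented for illustration.

```python
def rgb_to_cmyk_naive(r, g, b):
    """Naive RGB (0-255) to CMYK (0.0-1.0); ignores real ink and paper behaviour."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    k = 1.0 - max(r, g, b)          # the gray shared by all three goes on the black ink
    if k == 1.0:                    # pure black: no colored ink at all
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r - k) / (1.0 - k)   # what is left over goes on the colored inks
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return (c, m, y, k)

print(rgb_to_cmyk_naive(255, 0, 0))      # red: full magenta plus full yellow
print(rgb_to_cmyk_naive(128, 128, 128))  # gray: black ink only, near zero C, M and Y
```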
07-27-2008, 11:44 AM
Right, we have covered color and resolution now so we can finally start doing something with them. We can exchange color for resolution or resolution for color. If we up the resolution and reduce the number of colors then this is called dithering and is a very common process. In fact when you print out an image it will be done for you as part of the printer driver even if you don't ask for it. Dithering can be applied to all images but generally paint apps do not perform it on the color index type. For the other type, the dithering is done per channel so we might as well stick to one channel and use grayscale images to show the effects.
There are many types of dithering but the most common is halftoning, which is also known as an ordered dither. Each pixel with a large number of gray shades is converted to fewer shades (usually black and white only) by substituting it for a small grid of new pixels where the average of the new grid is the same shade as the original pixel. What is vital to understand is that if you dither an image then you must up the resolution of it by the dither grid size if you want to preserve the image quality. The dither grid size should depend on the number of shades being converted. So if we were to go from full grayscale to black and white that is a 256 to 2 color drop. So using a 16x16 grid should be the minimum to allow for the full averaging. I doubt any paint app would use that and it's more likely to use a 3x3 or 4x4 grid instead. Therefore, expect some loss and at least multiply the resolution by 4 in each direction.
A second but slightly less common form is error diffusion. When converting, the PC keeps track of what color the dither would average to, compares it with the next pixel and gives the dither pixel a color which gets the dither average as close as possible to the sample image pixel. If that was in black and white only then if the dither average is too bright it puts a black dot down, if too dark then a white one. Over space, the error wibbles up and down but averages out to track the sample image. If the sample image is made up of thin lines then this technique does not work so well as it confuses it, but if the resolution is multiplied by 4 in each direction then, generally, error diffusion is better than halftone.
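Here is a minimal Python sketch of the halftone idea with a 4x4 grid, where every gray pixel is swapped for a 4x4 block of pure black or white dots. It assumes the well known 4x4 Bayer threshold pattern; paint apps and printers may use other patterns, so treat it as an illustration rather than what any particular app does.

```python
# 4x4 Bayer threshold matrix (values 0..15), a common ordered-dither pattern.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]

def halftone(gray_rows):
    """Replace every 0-255 gray pixel with a 4x4 block of 0 (black) / 1 (white) dots."""
    out = []
    for row in gray_rows:
        for ty in range(4):
            out_row = []
            for shade in row:
                level = shade * 16 // 255  # how many of the 16 dots should be white
                out_row.extend(1 if BAYER_4X4[ty][tx] < level else 0 for tx in range(4))
            out.append(out_row)
    return out

dithered = halftone([[0, 64, 128, 255]])   # one row of four gray pixels
for line in dithered:
    print("".join("#" if dot else "." for dot in line))
```

Note that the output is 4 times bigger in each direction and that a 4x4 grid can only show 17 different gray levels, which is the loss mentioned above.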
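A stripped down sketch of error diffusion along a single row looks like this in Python. Real implementations (Floyd-Steinberg being the usual named example) spread the error over neighboring pixels in two dimensions, so this one-direction version is a simplification for illustration only.

```python
def diffuse_row(gray_row):
    """Turn one row of 0-255 grays into 0/255 dots, carrying the error forward."""
    out = []
    error = 0.0
    for shade in gray_row:
        wanted = shade + error              # what this pixel should contribute
        dot = 255 if wanted >= 128 else 0   # nearest available color
        error = wanted - dot                # too bright or too dark: pass it on
        out.append(dot)
    return out

row = [64] * 16                 # a flat 25% gray
dots = diffuse_row(row)
print(dots)
print(sum(dots) / len(dots))    # close to the original 64
```

The running error is the thing that wibbles up and down while the average tracks the source.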
An image shown next is the color one dithered into the 16 color windows palette. Not that bad considering the bizarre set of colors contained in that palette.
Now that you can appreciate what is happening behind the scenes you will also appreciate printer DPI settings. A standard color desktop printer might have a DPI rating of 1440 but it only has 4 or maybe 6 ink colors. A color photo has to be dithered in CMYK down to 4 inks before being sprayed. The 1440 rating is the number of dots per inch per component. So it is not the DPI at which you can print a full color photo. A modern color ink jet printer uses error diffusion (sometimes called giclee for some bizarre and cost increasing marketing hype reason) so you should expect that to print a full color pixel you would need to divide this number by something between 4 and 16 to get the real 'true color' DPI rating of the final print. So a 1440 DPI printer can print photos at about 200 pixels per inch, which is still quite good as can be seen from our previous discussion about resolution. Also, as a tip, don't try to pre-dither images before sending to the printer - let the printer do it.
We stated earlier that we can go back the other way too. We can trade resolution for more colors. First, up the image number of colors to full color or grayscale and then average sections of the image out. For example assume that we can see that a halftone image was processed with a 4x4 grid. Then average those grids out. We can do the same with error diffused dither by using a blur. By looking at dithered images from farther than the ability to resolve the dot pattern, your eye is just doing the averaging for you.
07-27-2008, 11:45 AM
You're sat in a tour bus in the Serengeti looking out over the vista to a herd of zebra. You get your digital camera out, click and capture that image, and driving away you capture some more of the same zebra. The original is effectively infinite in detail but your camera will capture a number of pixels depending on the camera resolution, encode them (probably compress them too) and store them on the camera flash card. From that point on you cannot get any detail from that image with a finer resolution than that of the camera's imaging sensor. The camera sensor has 'sampled' the infinitely detailed image at regular intervals and collected only those samples in the camera.
Later on you look at the zebra and note that in the middle picture the stripes on his front leg are reversed to the first picture and then later, wham !, what the hell happened with that third image ?
What you're looking at are the effects of undersampling and a process called 'aliasing'. The same effect is also called Moire when referring to 'fringes' that appear in closely spaced lines at a changing angle. An alias is a second instance of something, usually a second name of a gunslinger like "Alias Smith and Jones" (http://www.imdb.com/title/tt0066625/). In graphics it refers to secondary instances of stuff which is not supposed to be there - like the new zebra stripes. Also, once you have them it is extremely difficult to recover the image so that it shows what the stripes should have been like.
The precise nature of what is happening to your zebra leg is quite complicated and is wrapped up in a lot of math which we must lightly dip our toes into in order to fix the problem. Suffice it to say that in this instance what happened was that the next sample skipped over a stripe and went into the next one after. It's the same as the wheels going backwards in old cowboy movies of the wagons rolling, where the film camera sample rate of 24 frames per second is just below the rate of the wheel spokes moving around. The movie camera frame rate is undersampling the action.
This is the main reason why holding original photos and art at high resolution is essential. If you're going to produce a final piece of art at 1000x1000 pixels then you should work in much more than that - say 4000x4000 - and only convert to 1000x1000 at the final stages. If for any reason there are details in the original that are too fine to reproduce at 1000x1000 then they will not only be lost but could cause weird effects to be produced at the final re-sample.
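The stripe skipping is easy to reproduce in one dimension with a few lines of Python. This is a toy sketch with made up numbers: stripes 5 pixels wide, sampled once every 9 pixels.

```python
def stripe(x, period=10):
    """1 for a white stripe, 0 for a black one; each stripe is period/2 wide."""
    return 1 if (x % period) < period // 2 else 0

fine_samples = [stripe(x) for x in range(0, 90, 1)]     # sampled every pixel
coarse_samples = [stripe(x) for x in range(0, 90, 9)]   # sampled every 9th pixel

print("".join(str(s) for s in fine_samples))
print("".join(str(s) for s in coarse_samples))   # 1000001111
```

The coarse samples show one fat black stripe and one fat white stripe where the original had nine of each. Those fat stripes are the alias, and nothing in the coarse data tells you they are not real.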
07-27-2008, 11:45 AM
Ok so we have an aliasing issue, what we need is an aliasing busting technique - cue antialiasing.
Antialiasing is all about doing stuff at a higher sample rate than the final, filtering and then resampling to the final rate. The idea is that the effects that would have been present from the undersample will be beyond the filter and therefore not in the final image. So we're back to our zebra. If the camera had filtered off any frequencies higher than the pixel sample rate then as the stripes on the leg of the zebra got thinner they would eventually blur out and become a solid mid gray color. And then as it got farther away still the whole animal would become solid gray but would at least look better than the new stripes from the earlier example.
So, always work at a higher resolution than the final. Make a copy of the high resolution image and blur it just a little so that a few pixels blur together and then resample it smaller to final size. If going 2 to 1 then blur just enough for two pixels to look like a single blob and then make half the size, if going 10 to 1 then blur so that 10 pixels become one blob and then make one tenth the size.
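A one-row Python sketch shows the difference between shrinking with and without the blur. The blur here is a plain average of each run of pixels (a box filter), which is an assumption made to keep things short; a paint app's gaussian blur would behave similarly but not identically.

```python
def point_sample(row, factor):
    """Downsample by keeping every factor-th pixel (no filtering)."""
    return row[::factor]

def blur_then_sample(row, factor):
    """Average each run of `factor` pixels into one (a box blur plus resample)."""
    return [sum(row[i:i + factor]) / factor for i in range(0, len(row) - factor + 1, factor)]

# Alternating white and black pixels: detail far too fine to survive a 10 to 1 shrink.
stripes = [255 if x % 2 == 0 else 0 for x in range(100)]

print(point_sample(stripes, 10))      # every sample lands on white: solid white, which is wrong
print(blur_then_sample(stripes, 10))  # every block averages to 127.5: solid mid gray
```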
This form of antialiasing is known as "super sampling". It is the type also known as "Full Scene Antialiasing" (FSAA) when setting up video cards for games.
07-27-2008, 11:46 AM
This means changing the image resolution or the number of pixels. The idea is to preserve the original image in the best possible way. The techniques involved are split depending on whether going up or down.
Down means taking a big image and making it smaller. You're going to lose information and what we would like to do is lose the least amount. Although all paint packages have the ability to do it in a few key presses and one pass, they all seem to be universally useless at it. If you're going from 1000 to 800 then the best way is to upsample from 1000 to a large multiple of 800 like 3200 or 4000 and then blur and downsample to 1/4 or 1/5. If changing in much larger ratios like 1000 to 173 or something like that then find a nice multiple of 173 larger than 1000 - i.e. 1730 - and upsample to that first, then blur 10 pixels into one and then downsample to 1/10th. Many people say - oh always use Bicubic or always use Lanczos - but I disagree and I will show the results here with bicubic from 1000 down to 173. Maybe you disagree. Mathematically a sinc resample should be the best possible but I don't think the paint apps implement the full sinc and the windowing makes the resample less effective as more scaling is applied.
Up means taking a small image and making it bigger and here the paint apps seem to do it as well as can be expected so just picking the right algorithm is all that's needed. Again here, people often say that Bicubic or Lanczos is the best and for most images I would agree but there are exceptions. For general work including maps and photos I think it's true and below is a sample sheet.
Where the situation changes is with noise and ringing and with small scales. If you are resampling from 997 to 1000 or something very very close to what you want then I would use a pixel/point/nearest neighbor based resample because for about 99% of that image the pixels will not change. Below is a set of lines resampled up very slightly.
You will have to save and zoom up this last image to see what's happening. My monitor makes all three pretty much gray.
07-27-2008, 11:46 AM
We can clearly see in many cases of resampling that some pixels are made from the average of several others. This is particularly true in an upsample using cubic weighting. We have also said earlier that averaging two pixels in any color space can cause some odd results so this section is some gotchas to look out for and what to do about them. This is an area that I am least familiar with since I only use RGB and don't bother trying to fix these but it's worth knowing anyway.
We noted that a blend from red to blue in RGB color space would give deep mauve - ok but look at this. This is an upsample of some colors which have some issues. Some of the corners here are pure green but the others are red & blue (magenta). When we average these two colors together the red & blue from one drop halfway and the green from the other drops halfway. End result - all halfway to give mid gray! Now it's worth noting that for resample types that don't allow blends, like the nearest neighbor, you don't get the issue. Also, it turns out that if you apply a gamma adjustment to the image, resample with a cubic and then apply the reverse gamma adjustment then the problem gets fixed. I haven't fully convinced myself why this is true but I am assured it is.
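One common reading of the gamma trick, sketched below in Python, is that stored pixel values are gamma encoded, so they get converted to linear light, averaged there, and converted back. This is an assumption about what the gamma adjustment is doing rather than something established in this thread, and the plain 2.2 power curve is itself only an approximation of what monitors and sRGB actually use.

```python
GAMMA = 2.2  # assumed display gamma; true sRGB uses a slightly different curve

def to_linear(value):
    """Stored 0-255 value to linear light 0.0-1.0."""
    return (value / 255.0) ** GAMMA

def to_stored(linear):
    """Linear light 0.0-1.0 back to a stored 0-255 value."""
    return round(255.0 * linear ** (1.0 / GAMMA))

def blend_plain(a, b):
    """Average the stored values directly."""
    return tuple(round((x + y) / 2) for x, y in zip(a, b))

def blend_gamma_aware(a, b):
    """Average in linear light, then re-encode."""
    return tuple(to_stored((to_linear(x) + to_linear(y)) / 2) for x, y in zip(a, b))

green = (0, 255, 0)
magenta = (255, 0, 255)
print(blend_plain(green, magenta))        # (128, 128, 128)
print(blend_gamma_aware(green, magenta))  # still a gray, but a noticeably lighter one
```

The blend is still neutral gray either way, since green and magenta are opposites, but the gamma aware version is lighter and sits closer to the brightness of the two colors it came from, so the seam between them stands out less.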
07-27-2008, 12:30 PM
Reserved space - but this one came up recently too.
EDIT -- actually this comes up a lot and again today so here we go again... how to resample up just B&W stuff.
First, upsample any way you like in factors of double (200%) in stages until one more stage would be less than double (say 1.7).
For each stage double the size of the image which makes it pixellated - a nearest neighbor / pixel resize is fine.
Next, blur it - preferably using gaussian blur. The blur radius can be experimented with but about 3 pixels or so is a good start.
Then use contrast to clamp it back to being B&W again. Actually I use about 95% or so not 100% but it's up to you. Don't brighten it or darken it when you do it, just up the contrast.
Keep doing this in stages until the last stage is <200% in which case you might want slightly less blur than usual but not by much.
Here are the results. Everything is cool except for where there is an acute internal angle, where it tends to start filling in depending on the amount of blur used. So less blur helps, but more blur is better to get rid of pixellation. You have to experiment with it.
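For anyone who prefers to see the stages as code, here is a rough pure Python sketch of one doubling stage. It makes several simplifying assumptions that differ from the recipe above: a small box blur stands in for the gaussian blur, and the contrast step is a hard 100% threshold at mid gray rather than about 95%. All the function names are invented for illustration.

```python
def double_nearest(img):
    """Double width and height by repeating pixels (nearest neighbor)."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def box_blur(img, radius=1):
    """Simple box blur standing in for the gaussian blur; edges are clamped."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = count = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
                    count += 1
            row.append(total / count)
        out.append(row)
    return out

def clamp_contrast(img, threshold=127.5):
    """Full contrast about mid gray: everything becomes pure black or white."""
    return [[255 if p > threshold else 0 for p in row] for row in img]

def upsample_bw_once(img):
    """One stage: double, blur, then push the contrast back to B&W."""
    return clamp_contrast(box_blur(double_nearest(img)))

small = [
    [255, 0, 0],
    [0, 255, 0],
    [0, 0, 255],
]
for row in upsample_bw_once(small):
    print("".join("#" if p else "." for p in row))
```

Calling upsample_bw_once repeatedly gives the staged doubling. The blur radius is the thing to experiment with, exactly as described above.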
07-27-2008, 02:15 PM
Thanks for spending ALL the time to explain all this. Unfortunately, my head just can't comprehend 90% of it, so I stopped reading halfway through the first article. Sooooo I gave you a rating and rep, cause even if it won't help me, I am 100% sure someone will read it and get some help from the work.
07-27-2008, 02:35 PM
At least I now comprehend why the anomalies happen even though I need much much more practice in fixing them, as explained here, to be really comfortable with actually doing them and fully understanding them. Thanks Red, very informative.
07-27-2008, 07:51 PM
Awesome! Thank you so much, Redrobes.
Do you happen to be able to explain in general terms how the different resampling algorithms work? I would love to be able to choose one based on an understanding of what it's going to do rather than trying to rely on an imprecise rule of thumb.
It's so nice to have a programmer around who knows how to talk down to my level!
07-28-2008, 07:47 AM
This tutorial is great, and answers many questions I have had wrt imaging terms. Thank you for posting it.
While I am still something of a dummy (at least a newbie) when it comes to imaging software in general, I am less of a dummy when it comes to electronics.
"it turns out that if you apply a gamma adjustment to the image, resample with a cubic and then apply the reverse gamma adjustment then the problem gets fixed. I haven't fully convinced myself why this is true but I am assured it is."
Since I was also curious about gamma correction, I looked up gamma correction in a search engine and found the following document:
Based on this, I think that gamma correction does nothing to the image data itself, it is simply a setting for when image is displayed on monitor. So if you first halve the original value, and later double it for a separate image, the new image ends up with the original value, and at no point is the image data altered by changing the gamma setting.
I'm not sure if this helps to understand the issue or not. I would need to be a bit more familiar with sampling methods.
Edit: Warning! The attempt at insight contained in this post, is most likely either dead wrong or completely irrelevant to this topic. Every line after I thank Red Robes for posting this tutorial, should probably be disregarded.
07-28-2008, 10:58 AM
"Do you happen to be able to explain in general terms how the different resampling algorithms work?"
Ok, first let me say that all of these resamples work separately in X and Y so think of it doing a stretch in width followed by a stretch in height and the same process is applied in each case. So we can talk about just a stretch in one direction. Often the PC will do both at once for performance reasons tho.
Well the first is the simplest and quickest and you have seen that it often produces the worst results but can be the best in certain circumstances and that is the point sampled. Often called nearest neighbor or sometimes a pixel resize.
Say your going up from 100 pixels to 140 pixel image - thats a multiply by 1.4. What the PC does is loop over the 140 pixels and divide the pixel number by 1.4 and then look up the nearest pixel in the original image and use that. What that means also is that occasionally it will hit the same original pixel a few times duplicating the result and you can see that in the 997 to 1000 stretch example. Occasionally theres a double black or double white line. When you use this stretch the output is always blocky in appearance.
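As a sketch, the point sampled stretch for a single row of pixels could look like this in Python. The function name is made up and it rounds to the nearest source pixel; real apps differ on exactly where they consider the pixel centers to be.

```python
def resample_nearest(src, new_len):
    """Stretch a 1-D row of pixels to new_len by picking the nearest source pixel."""
    scale = new_len / len(src)
    out = []
    for i in range(new_len):
        pos = i / scale                                # position in the source row
        index = min(int(pos + 0.5), len(src) - 1)      # round to nearest, stay in range
        out.append(src[index])
    return out

print(resample_nearest([10, 20, 30, 40, 50], 7))  # [10, 20, 20, 30, 40, 50, 50]
```

The doubled 20 and 50 in the output are the duplicated pixels that show up as the occasional double line.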
Linear resampling takes the same approach: it runs over the 140 pixels and divides the position by 1.4 so that it gives a value of, say, 86.3. What the PC does is look at pixels 86 and 87, take 70% of 86 and 30% of 87, and add them up. It's a straight up linear interpolation.
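Here is the same idea as a small Python sketch, using the 86.3 example. The helper names are made up for illustration, and edge handling (clamping at the last pixel) is just one reasonable assumption.

```python
def sample_linear(src, pos):
    """Linearly interpolate one row of pixels at a fractional position."""
    last = len(src) - 1
    pos = min(max(pos, 0.0), float(last))    # keep the position inside the row
    left = int(pos)                          # e.g. 86 for pos = 86.3
    right = min(left + 1, last)              # e.g. 87
    frac = pos - left                        # e.g. 0.3
    return (1.0 - frac) * src[left] + frac * src[right]


def resize_linear(src, dst_len):
    """Stretch a row to dst_len pixels with linear interpolation."""
    scale = dst_len / len(src)
    return [sample_linear(src, i / scale) for i in range(dst_len)]


row = [0.0] * 100
row[86], row[87] = 100.0, 200.0
print(sample_linear(row, 86.3))              # 70% of 100 plus 30% of 200, about 130
```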
For cubic it's harder to describe, but it does the same loop and gets the same 86.3 value. Now it looks at 4 pixels: 85, 86, 87, and 88. It applies a cubic weighting function over the top of the four and adds them up. The cubic curve in the middle is like an S shape so that it's a smoother blend at the known source pixels.
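A rough Python sketch of the four-pixel cubic blend follows. The post does not say which cubic is used, so this assumes the common Catmull-Rom weighting (the a = -0.5 member of the usual cubic convolution family); other applications may pick a different curve.

```python
def cubic_weight(x, a=-0.5):
    """Cubic convolution weight for a source pixel at distance x (assumed a = -0.5)."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def sample_cubic(src, pos):
    """Blend the 4 pixels around pos, e.g. 85, 86, 87 and 88 for pos = 86.3."""
    base = int(pos)                          # 86 for pos = 86.3 (pos assumed >= 0)
    last = len(src) - 1
    total = 0.0
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), last)           # clamp at the image edges
        total += cubic_weight(pos - k) * src[idx]
    return total


ramp = [float(v) for v in range(100)]
print(sample_cubic(ramp, 86.3))              # a straight ramp comes back unchanged, about 86.3
```

Note that the two outer weights are slightly negative, which is what lets cubic keep edges a little crisper than linear.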
Lanczos is a windowed sinc function. That means it's a sinc, which is just sin(x) / x, but restricted to a window of either 5 or 7 pixels (either side) depending on whether you're using Lanczos5 or Lanczos7. It just so happens (lots of math) that a sinc function is the perfect function to use for resampling, but a sinc function goes on and on and never ends, so they confine it to a smaller window. Anyway, I think Lanczos5 will look at 9 pixels, multiply them with a windowed sinc function and then add them all up.
It's called bilinear and bicubic because the function is applied in two directions at once, so what's actually happening is that a patch of pixels is being processed at once.
There are more types but they are all much the same style. Look at a patch of source pixels and interpolate a new destination pixel based on that patch. You can get better image results the more you know about what was going on in the location around the source.
Edit -- just fact checking, it seems the 5 or 7 is not either side but the total amount of points so either 2 or 3 either side.
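For anyone who wants to see the windowed sinc written out, here is a small Python sketch. It assumes the usual textbook form of the Lanczos kernel, with the normalized sinc sin(pi x) / (pi x) and a reach of a = 2 or 3 pixels either side; the function names are made up, and the exact tap counts in any given application may differ.

```python
import math


def lanczos_weight(x, a):
    """Windowed sinc: sinc(x) * sinc(x / a) inside the window, zero outside."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def sample_lanczos(src, pos, a=2):
    """Weighted sum of the source pixels within a pixels either side of pos."""
    base = math.floor(pos)
    last = len(src) - 1
    total = 0.0
    weight_sum = 0.0
    for k in range(base - a + 1, base + a + 1):
        w = lanczos_weight(pos - k, a)
        total += w * src[min(max(k, 0), last)]   # clamp at the image edges
        weight_sum += w
    return total / weight_sum                    # normalize so flat areas stay flat


flat = [50.0] * 100
print(sample_lanczos(flat, 86.3, a=3))           # a flat row stays at about 50
```

The negative lobes of the kernel (for example the weight at a distance of 1.5 pixels) are where the ringing near hard edges comes from.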
07-28-2008, 11:04 AM
I think that gamma correction does nothing to the image data itself; it is simply a setting for when the image is displayed on a monitor.
You can apply color correction and gamma adjustment globally at the video card stage, and this would affect everything without touching the pixel values. But to do that color fix-up with resampling, you have to apply a gamma function to the pixels first, then apply the stretch, and then apply the inverse gamma function. In the example provided the first gamma function does nothing because all of the colors start saturated, but if you tried to do it to a photo then you need that first gamma adjustment if you're going to apply an inverse one later. You're right in that gamma adjustments are used to compensate for the effects of monitors, especially the CRT type. Why this process works to fix this problem I am not sure. I am not sure if a different compensation curve would also work.
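As a sketch of that three-step process (gamma function, blend, inverse gamma function), here is a tiny Python example on a single channel. The gamma value of 2.2 is an assumption chosen as a typical monitor figure, not something taken from the tutorial, and the function names are illustrative only. In a real resample the blend step would be whichever stretch is in use, applied to each of R, G and B.

```python
GAMMA = 2.2  # assumed display gamma, for illustration only


def to_linear(v):
    """Apply the gamma function: stored 0..255 value to linear light 0.0..1.0."""
    return (v / 255.0) ** GAMMA


def to_encoded(x):
    """Apply the inverse gamma function: linear light back to a stored 0..255 value."""
    return round(255.0 * x ** (1.0 / GAMMA))


def blend_naive(a, b, t):
    """Mix two stored values directly, as a plain resample would."""
    return round((1.0 - t) * a + t * b)


def blend_gamma_aware(a, b, t):
    """Mix in linear light, then convert back to a stored value."""
    return to_encoded((1.0 - t) * to_linear(a) + t * to_linear(b))


print(blend_naive(0, 255, 0.5))        # mid-blend of black and white, stored values mixed directly
print(blend_gamma_aware(0, 255, 0.5))  # noticeably lighter than the naive result
```

The in-between pixel comes out lighter with the gamma-aware blend, which matches the less muddy transitions seen in the corrected resample.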
07-28-2008, 07:30 PM
This is a link to a program that does an incredible job at upscaling images
I can not tell you how much I wish I had this program at my current job
I am sick of getting 15 kb .jpg logo files
07-29-2008, 08:50 AM
Would you like to take the middle image from post #11, take the top left small bit and scale it up very high (800%) and cut out the bit of petal like the others so we can compare ?
Here is another freeware application that provides a few other enlarging modes:
Here are some results:
Fractal 5 - XinLi:
07-29-2008, 09:58 PM
That's brilliant, Rob. Are you able to edit the post and label them as to what they mean? I have to say I do not think that they are an improvement over ordinary methods tho. The one Mathuwm suggested is critically acclaimed and supposed to be very good indeed, so I am curious. It appears I have repped you enough already :)
07-29-2008, 10:34 PM
resampled with Genuine fractals
07-29-2008, 11:06 PM
I am expecting great things from this super resampler but I can't accept this. This is too good. Is this the small image in the top left of the middle image of post 11? I have attached it again below. How can it reproduce perfect stamens from a single pixel... tell me this was from a different image!
Can you do the image attached at 800% or x8 mag ?
Edit -- I can see that this must have come from top left of post #2 and not post #11. It's still very good but it would be great to have a side by side comparison from the same source.
07-29-2008, 11:33 PM
the 120x96 pixel image
No, I do not think that would work too well.
The larger the image, the better it upscales.
This program is used a lot for upscaling images for printing on a large format printer.
What I did was take this:
and scale it to this
I actually cropped it after I scaled it so the actual differences might be slightly off
07-29-2008, 11:42 PM
Well, the little one is what's being used for all of these samples, just so we can compare what a small image resampled would become under different algorithms. It looks from the dialog images as if you're using Bicubic in any case - unless it says that when using the Genuine Fractals anyway.
Just for reference, here is my GTS doing it. I will have to look at the source to see what kind it's using. I think it's Lanczos tho.
Thanks for putting the labels on yours Rob. Interesting ringing on the sinc. That looks a bit over the top for a proper sinc tho. I think there might be a coding issue going on in that one. Sinc in a box window will ring a little but that seems a bit too much to be correct.
07-29-2008, 11:56 PM
ok so here is your tiny image scaled up
07-30-2008, 12:15 AM
A few more with that freebie Image Analyzer I referred to (quite the Swiss Army knife, really!), using the sample image, enlarged 800%
Two of the fractals (no idea what the parameters do, just playing). I got the best results by going x2, x2, x2 rather than straight to 800% (the image names include the parameters):
And two more from Gimp - progressive resize, 800% in 20 steps:
and a gimp plugin "smart enlarge" using Resynthesizer. I enlarged 2x 2x 2x. The first was OK, but after that it certainly added detail, but bolloxed up the image:
And for your viewing pleasure, the gimp resynth plugin going the full 800% in one shot. Not a good resize, but an interesting 70's acid trip nonetheless:
And here is the best I could do with some playing.
This was done with an Image Analyzer fractal resize x2, then a Wiener resize x2, followed by a fractal resize x2. Pulled into gimp and added film grain.
07-30-2008, 01:11 AM
Someone is having too much fun:)
07-30-2008, 08:55 AM
Thanks Mathuwm and Rob for those. I like that Genuine Fractals one. I would agree that it is the best of them all. It's artificially sharpened up those edges just where it was required. That really is some clever software there.
09-29-2008, 06:53 PM
I have edited in the space I left in post #13 of this thread about B&W resampling.
11-23-2008, 11:06 AM
In this thread I was talking about antialiasing and how occasionally you see the spokes of the wheels of a western movie wagon go backwards or sometimes even look stationary because the camera frame rate is similar to the action. Well, I came across this today - you will have to forgive the title of this movie but I have never seen a better example of temporal aliasing - ever. It's excellent!
Time to revisit this one...
A few more tools. Here is a demo of a technique called "smartedge" (http://audio.rightmark.org/lukin/graphics/resampling.htm). It only resizes by x2. I ran it three times to get the reference image up to the 800% we are comparing:
Also, I tried a demo of EnlargerPro by Bearded Frog (http://www.beardedfrog.com/download.htm). Here is the (watermarked in the demo) result:
As a final note, apply sharpening after resizing an image to its final size, not the other way around. Otherwise imperceptible sharpening halos may become clearly visible!
12-11-2008, 04:07 PM
Thanks for a great tutorial. There is a tremendous amount of excellent information here.
FOR ALL INTERESTED
I've attached a PDF document I created from this tutorial. The only thing I've done is to reformat the tutorial and correct any spelling and grammatical errors. The content has not been altered.
12-11-2008, 04:55 PM
Well done, that's very nice. Spread it about a bit to anyone who might benefit from it.
The only issue is that some of the images need to be viewed without any stretching applied because they are describing what goes on when you stretch images. The PDF viewer is scaling them itself, so just be mindful that it's happening and will affect the images. The one in particular is the 997 to 1000 pixel stretch using all three types (3rd pic of post #11), where I argue (not very firmly tho...) that pixel resample is possibly the best in that particular case. In the reader they all munge into something which looks the same, since there's another stretch being applied on top. Same goes for some of the dithering images etc.
I think in general it's easier to read in your PDF than the original tho, as the images are inline and bigger.
03-01-2009, 09:34 PM
(1) Genuine Fractal sampling
(2) Gamma correction in saturated images
(3) Upsampling rasterized vector images
(1) Genuine Fractal Sampling
I was interested in the "Fractal" upsampling technique since the literature on image resampling doesn't contain this term --- the closest I could find were statistical techniques for preserving edges and Haar/other wavelet transformations. I found the website for the "Genuine Fractal" (GF) method, and it is interesting to see how it works. I believe, overall, the GF upsampling method is roughly the same as repeated bicubic upsamplings at factors of 2, followed by smoothing, and edge-sharpening to keep the image from blurring. This, effectively, automates the "by-hand" upsampling technique you describe earlier in this tutorial.
I have no clue why they call this "Fractal" sampling, since it does not look like any of the mathematics of fractals. Also, the method is patented, although, the patent is from the early/mid-90s so it is probably close to being expired.
(2) Gamma Correction
I'm not sure of the exact colorspace or the gamma correction method being referred to. However, gamma correction usually occurs in an unclamped colorspace, and samples to a 32-bit floating-point channel. By adding a gamma of .5 for a saturated image, then upsampling, then inverting the gamma to 2, you've applied an isomorphism to all the pixels that maintains their saturation levels (up to round-off error). However, by increasing the gamma, the newly generated pixels (the "dark" muddy pixels between the mauve and green) are gamma'ed out of existence --- they become very light gray. I would propose that a light-gray pixel will be interpreted as a transition zone in the light field your eye picks up, rather than an edge. The dark-gray transition in the light field will be interpreted as an edge. This makes the light-gray transition "look" better without having any discernibly different characteristics, other than being lighter in color.
(3) Upsampling Rasterized Vector Images
There are a number of (free) programs for converting from raster images (especially two-color, 1-bit raster images) to stroked vector images. I have had success with both POTrace and AutoTrace/Delineate. They can be found here:
Another update on this topic.
There is a great review of different up-sizing techniques over here: http://www.cambridgeincolour.com/tutorials/digital-photo-enlargement.htm
(as well as a wealth of digital photo tips, techniques and tutorials at that site!)
10-11-2009, 04:42 PM
So I wonder if anyone would care to explain some of the operations that can be performed on an image? What does Multiply mean in terms of the color space? What do the different "Other" filters in Photoshop do (high pass, maximum, minimum, and custom. I think most of us have figured out offset)?
And once again, RR, thank you very much for this thread. I just refreshed myself on it and learned almost as much the second time through as I did the first time.
10-12-2009, 12:51 AM
I found this yesterday when I started trying to translate Ascension's atlas tutorial into gimp. Since PS has a set of blend modes gimp doesn't have, this was very useful. It has a pretty detailed explanation of what each mode means (multiply, etc.), but I don't think it includes the "others."
10-12-2009, 09:09 AM
Good link there, Gidde, and I think that covers most of the modes. There's a minimum and a maximum which I didn't see. That's where you take two images and pick each pixel's value from one or the other depending on whether its red, green or blue value is the brightest of the two, or the darkest. It's done on a per-component basis, so if one had high green and low blue then it takes the green value from one and the blue from the other, and the result is the RGB value with that green and blue in it. Same in reverse for minimum.
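In code the per-component maximum and minimum are about as short as it gets. This is a minimal Python sketch with made-up function names, working on one pair of RGB pixels at a time:

```python
def lighten(p, q):
    """Maximum: keep the brighter value of each R, G, B component."""
    return tuple(max(a, b) for a, b in zip(p, q))


def darken(p, q):
    """Minimum: keep the darker value of each R, G, B component."""
    return tuple(min(a, b) for a, b in zip(p, q))


first = (10, 200, 30)     # high green, low blue
second = (40, 50, 220)    # low green, high blue
print(lighten(first, second))   # green from the first pixel, red and blue from the second
print(darken(first, second))
```

Here the maximum gives (40, 200, 220) and the minimum gives (10, 50, 30), so the result can be a color that appears in neither source image.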
Treat an image's pixel values from 0 to 255 as brightness and divide those integer values by 255 to get a real value between 0.0 and 1.0. Most of the math is done like that, on a per-component basis, and the results are then multiplied by 255 to get back to one byte per component RGB again.
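Multiply is a good example of that recipe. The following Python sketch (function name assumed, one pixel pair at a time) divides by 255, multiplies the two values for each component, and scales back up:

```python
def multiply(p, q):
    """Multiply blend on one pair of RGB pixels, done per component."""
    out = []
    for a, b in zip(p, q):
        x = (a / 255.0) * (b / 255.0)   # both values as 0.0..1.0, then multiplied
        out.append(round(x * 255.0))    # back to one byte per component
    return tuple(out)


print(multiply((255, 128, 0), (128, 128, 255)))
```

White (1.0) leaves the other image unchanged, black (0.0) always gives black, and anything in between can only darken, so the example prints (128, 64, 0).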
So in theory you can do lots of math with images as long as you don't mind using a limited precision of 1/255 as the smallest increment. This also shows that it's better to do all of your math ops on images which have brightnesses across the full range. No point in doing them on limited range shades of gray like dark colors etc. That's where some of the height banding comes from on the height mapped greyscale images you see.
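A quick way to see the precision problem is to squash a full grey ramp into the dark quarter of the range and then stretch it back. This is only an illustrative Python sketch with an assumed helper name, not what any particular application does:

```python
def scale_levels(values, factor):
    """Scale 0..255 brightness values, rounding back to whole numbers and clamping."""
    return [min(255, round(v * factor)) for v in values]


ramp = list(range(256))                 # every possible grey level once
dark = scale_levels(ramp, 0.25)         # squash into the darkest quarter
restored = scale_levels(dark, 4.0)      # try to stretch it back out

# Count how many distinct grey levels survive at each stage.
print(len(set(ramp)), len(set(dark)), len(set(restored)))
```

The ramp starts with 256 distinct levels but only 65 survive the round trip, and those gaps are the bands you see in a stretched height map.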
10-12-2009, 12:25 PM
That article does just fine up through Screen, then the details (i.e. the math) vanish. I had the same problem with the CGTextures tutorials (http://www.cgtextures.com/content.php?action=tutorial&name=blendmodes) (which I find far more enlightening than the Wikipedia article, by the way). Some of the blend modes are explained in great depth, and others just say something like
"Soft Light: Very much like Overlay, but the result is much more subtle." So what's going on in the channels that makes the effect more subtle?
10-12-2009, 03:50 PM
Ahh, well, the exact names that people give these things I can't help out with much. If it's brightest, darkest, add, subtract, multiply, difference, invert (NOT), AND, OR, or XOR, those I can cover. The other stuff like blur etc are not math functions, but I can explain how to do those along with sharpen, emboss, edge enhancements or edge finding. But the more esoteric stuff like buttonize, soft plastic, watercolor etc are all voodoo depending on what the programmer did. I don't think that these have exact specifications.
A lot of the non strict math stuff has a set of parameters - like the blur amount, for example. In this regard soft light and hard light are algorithms that might consist of several operations, so it's a bit of a recipe, like a vintage wine blended to suit taste.
10-12-2009, 05:14 PM
Here's a site with the math on a lot of these blend modes, and a few of his own devising.
Blend modes (http://www.pegtop.net/delphi/articles/blendmodes/)
05-31-2011, 07:49 AM
This is a paper and some results of a new collaboration between two researchers, one from a university and one from Microsoft. The results are very impressive:
some interactive samples of it.
Dead horse beating time...
I think I just found the best free enlarger out there:
Here are some samples of the little image enlarged to 800% with the 4 enlarging presets (all png to avoid jpg artifacts):
Sharp and Noisy:
And as a special treat, a jpeg of the same source image blown up with the default settings to 3000%:
07-14-2012, 09:19 PM
Another paper outlining a new way of doing something similar. I don't know much about it but you can compare by clicking on the buttons. It looks good to me, with a few instances where it's breaking down compared to the bicubic but not many. Also, I have to say that some of these images seem to lend themselves to the algorithm. Interesting zebra pic, but without a high res version of the original it's not possible to tell whether the leg of the zebra with stripes has been corrupted and then fixed up into something visually appealing but still incorrect. Hard to say...