image processing basics
first post in this forum, i'm new here and to 2d coding, though i have a fair background in 1d dsp, which of course lends itself to 2d.
audio has a nice repository of algorithms at musicdsp.org, many of which work straight out of the box. some things like interpolation can be applied directly to images. i was somewhat surprised to not find a similar site for graphics.. though of course a lot of 2d includes 3d and then things become extremely involved... so i asked for this forum to be created, and expect i'll post in it as my knowledge progresses.
so itfp, thanks to the admin here and to waldronate, who is the first coder i've interacted with here, for steering me in the right direction.
one of the topics i've researched in audio is algorithmic/aleatoric/stochastic/okay, "random" generation, so my interests carry over with a focus on what is possible procedurally instead of by direct specification.
i'll post by topic:
i expect that most often the first step for map generation is height field generation, commonly using perlin noise.
i prefer to roll my own rand() function using hal chamberlin's algorithm from 'musical applications of microprocessors' - i use unsigned INTs and perform the math in INTs for speed (remember i know next to nothing about image coding, using GDIs etc., so this may be unwise) -
nrnd = 196314165 * nrnd + 907633515;
if you scale that sensibly and use it to generate white noise at audio rate, there is no discernible repetition. you can use 16 bit INTs and hear repetition every 2^16 samples, so SHORTs are useful for finite applications.
here's chamberlin's chart:
word len   A           B
8          77          55
12         1485        865
16         13709       13849
24         732573      3545443
32         196314165   907633515
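a rough sketch of the 32 bit and 16 bit rows of the chart in C, for anyone who wants to try them. the function names, the scaling to -1..1 and the period counter are my own additions for illustration, not anything from chamberlin's book. it relies on unsigned math wrapping around, which is what gives the modulo for free.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplier A and increment B for 32-bit and 16-bit words,
   taken from the chart above. */
#define LCG32_A 196314165u
#define LCG32_B 907633515u
#define LCG16_A 13709u
#define LCG16_B 13849u

/* 32-bit generator: unsigned overflow wraps modulo 2^32, which is
   exactly the modulus the recurrence needs. */
static uint32_t lcg32_next(uint32_t *state)
{
    assert(state);
    *state = LCG32_A * *state + LCG32_B;
    return *state;
}

/* 16-bit generator: compute in 32 bits, then truncate to 16. */
static uint16_t lcg16_next(uint16_t *state)
{
    *state = (uint16_t)(LCG16_A * (uint32_t)*state + LCG16_B);
    return *state;
}

/* Scale a 32-bit draw to a double in [-1, 1), e.g. for white noise.
   (hypothetical helper; this particular scaling is an assumption) */
static double lcg32_bipolar(uint32_t *state)
{
    return (double)lcg32_next(state) / 2147483648.0 - 1.0;
}

/* Count the steps until the 16-bit generator returns to its seed. */
static uint32_t lcg16_period(uint16_t seed)
{
    uint16_t s = seed;
    uint32_t steps = 0;
    do {
        lcg16_next(&s);
        steps++;
    } while (s != seed && steps < 70000u);  /* guard against a bad A/B pair */
    return steps;
}
```

lcg16_period() is only there to check the "repetition every 2^16 samples" remark: it walks the sequence until the seed comes round again.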
perlin noise isn't the only method of generating height fields, eg. obviously each pixel could be randomly generated then the image could be blurred/smoothed. but it is convenient and efficient.. interpolation is used to smooth random points at different scales (typically 'octaves', ie. successive powers of 2), and the octaves are then summed, with octave n typically at amplitude 1/(2^n). the idea is also that the same data set (256*256 unsigned CHARs?) can be used for each octave because the change of scale obscures the similarity.
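here is a minimal sketch of the octave idea in C. strictly speaking this is interpolated "value noise" rather than ken perlin's gradient noise, and the details are my assumptions: a 256*256 table of random bytes, plain bilinear interpolation, and each octave doubling the frequency and halving the amplitude. all the names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LATTICE 256   /* table edge, matching the 256*256 data set mentioned above */

static unsigned char lattice[LATTICE][LATTICE];

/* Fill the table with random bytes from the 32-bit generator shown earlier. */
static void lattice_init(uint32_t seed)
{
    for (int y = 0; y < LATTICE; y++)
        for (int x = 0; x < LATTICE; x++) {
            seed = 196314165u * seed + 907633515u;
            lattice[y][x] = (unsigned char)(seed >> 24);  /* keep the high byte */
        }
}

/* Bilinear interpolation of the table at a fractional, non-negative
   position; coordinates wrap at the table edge. Result is in 0..1. */
static double lattice_sample(double x, double y)
{
    int xi = (int)x, yi = (int)y;
    double fx = x - xi, fy = y - yi;
    int x0 = xi & (LATTICE - 1), x1 = (xi + 1) & (LATTICE - 1);
    int y0 = yi & (LATTICE - 1), y1 = (yi + 1) & (LATTICE - 1);
    double top = lattice[y0][x0] + fx * (lattice[y0][x1] - lattice[y0][x0]);
    double bot = lattice[y1][x0] + fx * (lattice[y1][x1] - lattice[y1][x0]);
    return (top + fy * (bot - top)) / 255.0;
}

/* Sum `octaves` layers of the same table; each layer doubles the
   frequency and halves the amplitude. Result is normalised to 0..1. */
static double fractal_noise(double x, double y, int octaves)
{
    double sum = 0.0, norm = 0.0, amp = 1.0, freq = 1.0;
    assert(octaves > 0);
    for (int o = 0; o < octaves; o++) {
        sum  += amp * lattice_sample(x * freq, y * freq);
        norm += amp;
        amp  *= 0.5;
        freq *= 2.0;
    }
    return sum / norm;
}
```

to get a height field you would call lattice_init() once and then fractal_noise(x * scale, y * scale, octaves) per pixel, where scale (some small fraction) sets how many pixels one table cell covers.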
ken perlin's lecture and webpage on the topic:
there are several great illustrations of the topic online, easy to find, eg.
so i'd be a ponce to repeat them all here. having generated noise by several methods in audio i can think of no method that lends itself better to the solution.
i haven't implemented this yet, only verified that it is effective:
take the four adjacent pixels, to each side and vertically, and take the differences (or 'delta' as i believe mathematicians like to call it), eg.
dX = n[x+1][y] - n[x-1][y];
dY = n[x][y+1] - n[x][y-1];
arctan2 (whatever that is..) the two arguments:
..and you'll get something with a sign and up to +/- 2*pi which can conveniently be applied to a sin() function eg. for directionally lighting/shadowing each pixel, and adding an appropriately scaled constant allows modification of the direction.
so far, the largest single repository for image processing algorithms i've found is the FAQ for a mailing list:
whether this is from there or elsewhere (so far i've noted a few but used none) -
corners = ( Noise(x-1, y-1)+Noise(x+1, y-1)+Noise(x-1, y+1)+Noise(x+1, y+1) ) / 16
sides = ( Noise(x-1, y) +Noise(x+1, y) +Noise(x, y-1) +Noise(x, y+1) ) / 8
center = Noise(x, y) / 4
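as a sketch in C, assuming the three terms are simply added together to give the smoothed pixel (the weights then sum to 1, so overall brightness is kept). the float array layout, the function names and the choice to copy border pixels unchanged are mine.

```c
#include <assert.h>

/* 3x3 smoothing of one interior pixel: each corner weighs 1/16, each
   side 1/8 and the centre 1/4, so the nine weights sum to 1. */
static float smooth_pixel(const float *n, int w, int h, int x, int y)
{
    assert(x > 0 && x < w - 1 && y > 0 && y < h - 1);
    const float *p = n + y * w + x;
    float corners = (p[-w - 1] + p[-w + 1] + p[w - 1] + p[w + 1]) / 16.0f;
    float sides   = (p[-1] + p[1] + p[-w] + p[w]) / 8.0f;
    float center  = p[0] / 4.0f;
    return corners + sides + center;
}

/* Smooth a whole image from src into dst (separate buffers). Border
   pixels are copied unchanged, one of several possible edge policies. */
static void smooth_image(const float *src, float *dst, int w, int h)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            int border = (x == 0 || y == 0 || x == w - 1 || y == h - 1);
            dst[y * w + x] = border ? src[y * w + x]
                                    : smooth_pixel(src, w, h, x, y);
        }
}
```

running smooth_image() repeatedly, swapping src and dst each time, widens the blur a little with every pass.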
in audio these would be called FIRs or finite impulse response filters, as opposed to IIRs, which i haven't seen much reference to for images yet. the few other filters i've seen for blurring, smoothing, sharpening and edge detection are all based on the adjacent filters, and i suppose performed repeatedly. i'll stop posting now and give myself a chance to code more :)
The reason that you don't see an IIR in image processing is because there is no time component. If you have an array of images over time, it's called video and now you need to do fancy 3D image processing and/or pixel areas over time (that latter bit is exactly analogous to a 1-D process like audio processing).
I have a tut and talk about some of the common / basic image algorithms in the post:
also, arctan2 is a function in most programming languages which implements arctan for a pair of arguments. Plain arctan would need to be given a value of infinity to produce certain results, so arctan2 is coded to return explicit results in those cases. Arctan is usually used to get an angle from a gradient, so when the denominator of the ratio goes to zero (dy -> 0 for dx/dy) the function returns PI/2 or -PI/2 as appropriate. It also handles angles beyond +/- 90 deg properly. So arctan2 is a doddle to use; with plain arctan you need to write a few extra lines of code to trap the infinities.
oh and arctan2 should give +/- pi not 2pi. From -pi to +pi is one full rotation.
And if you're after lots of formulas like the one you quoted for the random number gen then the wiki page of them is a good source:
IIRs can be applied to any set of periodically sampled data. it may not be conventional to do so, and there may be a differing vernacular, but an IIR could certainly be applied to rows and/or columns of pixels :)
as they are not computationally intensive, eg. a highpass filter with a steep coefficient could be run once in both directions over every row and column with dramatic results and achieve a radius/"wavelength" effect that would require hundreds of passes with grid filters :) as said, i'm new at this so i expect i have plenty to learn.
one of the common foundations for audio processing is dspguide.com - downloadable in pdf chapters. it's general theory and has accessible explanations of FFT and other common dsp methods suitable for 2d procedure as much as 3d procedure. might be a good read for some aspirants :)
of course, now we are seeing that my secret agenda in soliciting for this forum was to mine goodies for myself :D thanks redrobes and waldronate :)
i'll add a thingy on IIRs for images..
this filtering was performed using the well-known formulas from robert bristow-johnson's biquad cookbook.. code's right in there (IIRC the signs for the bandpass y coefficients need to be flipped..)
actually using them will probably be aggravating unless you have a background in audio processing or IIR filtering or otherwise are oriented with the fourier theorem :) generally these filters (the state variable is another trivial algorithm) have a phase effect as well as a frequency dependent attenuation (dspguide.com mentioned above provides a thorough background here) but not always.
what they would allow you to do to images is apply a wide effect at low computation. the actual process (after the coefficients are calculated) uses a few buffer variables to record the last two states of the input (x[n], x[n-1], x[n-2]) and output (y[n], y[n-1], y[n-2]) and a scalar for each.. so the computation would look like this:
y[n] = a0*x[n] + a1*x[n-1] + a2*x[n-2] + b1*y[n-1] + b2*y[n-2];
x[n-2] = x[n-1]; x[n-1] = x[n];
y[n-2] = y[n-1]; y[n-1] = y[n];
this can often be reduced by a few multiplies as often a0 and a2 are the same coefficient or similar.
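for anyone who wants to try it on pixels, a sketch of that loop in C over one row. i've kept the naming used above (a* on the inputs, b* on the previous outputs, all added); note the cookbook itself uses the letters the other way round and subtracts the feedback terms, so its coefficients need translating before they go in here. the struct and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Coefficients named as in the formula above: a* scale the inputs,
   b* scale the previous outputs (added, not subtracted). */
typedef struct {
    double a0, a1, a2, b1, b2;
} Biquad;

/* Filter one row of n samples, left to right, starting from an
   all-zero history. in and out may point to the same buffer. */
static void biquad_row(const Biquad *f, const float *in, float *out, size_t n)
{
    double xm1 = 0.0, xm2 = 0.0;   /* x[n-1], x[n-2] */
    double ym1 = 0.0, ym2 = 0.0;   /* y[n-1], y[n-2] */
    assert(f && in && out);
    for (size_t i = 0; i < n; i++) {
        double x0 = in[i];
        double y0 = f->a0 * x0 + f->a1 * xm1 + f->a2 * xm2
                  + f->b1 * ym1 + f->b2 * ym2;
        xm2 = xm1; xm1 = x0;       /* shift the input history */
        ym2 = ym1; ym1 = y0;       /* shift the output history */
        out[i] = (float)y0;
    }
}
```

the cost per pixel is five multiplies no matter how wide the effect is, which is the whole attraction.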
in this image, the first frame compares the original signal (a bandlimited triangle wave.. similar to terrain lol) to the 2nd order highpass filter from the cookbook.. the phase distortion is endemic to the filter, it is relative to the wavelength in one direction or the other.. and of course audio/1d filtering is an *extensive* field.. zero-phase filters exist, SINC filters (dspguide) would probably do fascinating things with images..
in the 2nd frame, i recorded the 'audio' and reversed it and filtered it again.. this would be equivalent to running the biquad in one direction along a row of pixels, then running it in the other direction.
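a sketch of that two-direction trick in C, extended to every row and then every column of an image. the second pass should cancel the phase shift of the first, apart from some transient at the edges where the history starts from zero. names, the in-place float buffer and the zero starting history are all my assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double a0, a1, a2, b1, b2;   /* a*: input taps, b*: feedback taps (added) */
} Biquad;

/* One in-place pass over `count` samples, beginning at data[start] and
   moving by `step` each time; a negative step walks backwards. */
static void biquad_pass(const Biquad *f, float *data, ptrdiff_t start,
                        ptrdiff_t step, size_t count)
{
    double xm1 = 0.0, xm2 = 0.0, ym1 = 0.0, ym2 = 0.0;
    ptrdiff_t idx = start;
    for (size_t i = 0; i < count; i++, idx += step) {
        double x0 = data[idx];
        double y0 = f->a0 * x0 + f->a1 * xm1 + f->a2 * xm2
                  + f->b1 * ym1 + f->b2 * ym2;
        xm2 = xm1; xm1 = x0;
        ym2 = ym1; ym1 = y0;
        data[idx] = (float)y0;
    }
}

/* Left to right, then right to left, over one row of w pixels. */
static void biquad_row_both_ways(const Biquad *f, float *row, size_t w)
{
    assert(w > 0);
    biquad_pass(f, row, 0, 1, w);
    biquad_pass(f, row, (ptrdiff_t)w - 1, -1, w);
}

/* The same treatment for every row and then every column of a
   row-major w*h image. */
static void biquad_image_both_ways(const Biquad *f, float *img,
                                   size_t w, size_t h)
{
    assert(w > 0 && h > 0);
    for (size_t y = 0; y < h; y++)
        biquad_row_both_ways(f, img + y * w, w);
    for (size_t x = 0; x < w; x++) {
        biquad_pass(f, img + x, 0, (ptrdiff_t)w, h);
        biquad_pass(f, img + x, (ptrdiff_t)((h - 1) * w), -(ptrdiff_t)w, h);
    }
}
```

a quick way to see the effect is to filter a single bright pixel: after one pass the smear trails off to one side only, after both passes it is near enough symmetric.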
in the third frame, i've scaled the output to resemble the input. i'm not particularly anxious to prove any point about the utility of IIR filters, only demonstrating them for those who wish to explore them :)
How about this: An IIR on a fixed-size data set such as an image can be implemented as an FIR with a filter width the size of the image, if I recollect correctly. The classic use of IIRs to provide feedback into a signal isn't nearly as useful for the most part in image processing, where there is a distinctly limited data size. There are some examples of low-pass filtering out there (often under the term recursive filters rather than IIR), but it's not a particularly common usage, in my experience.
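A quick C sketch of that equivalence, using a simple one-pole recursive low-pass as the stand-in IIR (my choice for brevity, not anything specific from this thread). The impulse response is truncated to the length of the row, and convolving with it should reproduce the recursive output, provided the recursion starts from zero. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One-pole recursive low-pass, y[n] = g*x[n] + (1-g)*y[n-1], starting
   from zero. in and out may be the same buffer. */
static void iir_lowpass(const double *in, double *out, size_t n, double g)
{
    double prev = 0.0;
    assert(g > 0.0 && g <= 1.0);
    for (size_t i = 0; i < n; i++) {
        prev = g * in[i] + (1.0 - g) * prev;
        out[i] = prev;
    }
}

/* Plain FIR convolution with a kernel of `taps` coefficients; samples
   before the start of the row are treated as zero. */
static void fir_filter(const double *in, double *out, size_t n,
                       const double *kernel, size_t taps)
{
    for (size_t i = 0; i < n; i++) {
        double acc = 0.0;
        for (size_t k = 0; k < taps && k <= i; k++)
            acc += kernel[k] * in[i - k];
        out[i] = acc;
    }
}

/* Feed a unit impulse through the IIR to obtain its first n taps. */
static void iir_impulse_response(double *kernel, size_t n, double g)
{
    for (size_t i = 0; i < n; i++)
        kernel[i] = (i == 0) ? 1.0 : 0.0;
    iir_lowpass(kernel, kernel, n, g);
}
```

The two routes give the same numbers, but the FIR route costs on the order of n multiplies per pixel against a fixed two for the recursive form.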
now that i've had time to read it :) familiar territory for me, with the exception of the upsampling work. which is astounding.
you may be interested to know that the audio field has a different take on resampling. itfp, i would expect that audio upsampling congruent to the work in this thread is extremely pricey. audio people are an extremely precious bunch as i expect you're aware. i am nescient in regards to anything expensive, so all the upsamplers i've encountered for audio are very primitive.
itsp, downsampling and bitcrushers are par for the course in audio - most modern genres would be half of what they are without it, it's used very extensively. quite honestly, some of the "creative" upsampling in that thread would probably be lucrative. if you've got something that no one else has, you can easily set your own price and have a significant customer base, if interested in such things. i expect audio parallels if not dramatically exceeds auto and harley davidson sales in terms of translating testosterone into capital. it's quite horrifying really.