One thing you might try is using ellipses and rectangles rather than circles and squares. These shapes give you an additional degree of freedom (aspect ratio), which might well get you more interesting shapes with fewer primitives.
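
To make the extra degree of freedom concrete, here is a rough sketch of an ellipse primitive with a point-containment test. The `Ellipse` class and its field names are my own invention for illustration, not anything from your code; setting `aspect` to 1.0 gives you back the circle you have now.

```python
from dataclasses import dataclass


@dataclass
class Ellipse:
    """Hypothetical axis-aligned ellipse primitive (names are illustrative)."""
    cx: float      # center x
    cy: float      # center y
    radius: float  # semi-axis along x
    aspect: float  # semi-axis along y divided by semi-axis along x (1.0 = circle)

    def contains(self, x: float, y: float) -> bool:
        # Scale each offset by its own semi-axis, then apply the unit-circle test.
        u = (x - self.cx) / self.radius
        v = (y - self.cy) / (self.radius * self.aspect)
        return u * u + v * v <= 1.0


circle = Ellipse(0.0, 0.0, 2.0, 1.0)
flat = Ellipse(0.0, 0.0, 2.0, 0.5)
print(circle.contains(0.0, 1.5), flat.contains(0.0, 1.5))
```

The same idea carries over to rectangles: store a width plus an aspect ratio instead of a single side length.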

As for the shapes themselves, they don't seem statistically different from a slice through a fractional Brownian motion (fBm) surface or a simple wavelet noise surface. I suppose that's not surprising, because the basic process is the same: add progressively smaller details to a basic shape.
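
For comparison, this is roughly what I mean by a slice through an fBm-style surface. It is a minimal sketch that sums octaves of simple value noise (my choice of basis for brevity, not true fBm or wavelet noise), and every function name and parameter default here is an assumption for illustration.

```python
import math
import random


def _lattice(ix: int, iy: int, seed: int) -> float:
    # Deterministic pseudo-random value in [0, 1) for an integer lattice point.
    return random.Random(hash((ix, iy, seed))).random()


def value_noise(x: float, y: float, seed: int = 0) -> float:
    # Smoothly interpolate the four lattice values surrounding (x, y).
    ix, iy = math.floor(x), math.floor(y)
    fx, fy = x - ix, y - iy
    sx = fx * fx * (3.0 - 2.0 * fx)  # smoothstep weights
    sy = fy * fy * (3.0 - 2.0 * fy)
    v00 = _lattice(ix, iy, seed)
    v10 = _lattice(ix + 1, iy, seed)
    v01 = _lattice(ix, iy + 1, seed)
    v11 = _lattice(ix + 1, iy + 1, seed)
    top = v00 + sx * (v10 - v00)
    bottom = v01 + sx * (v11 - v01)
    return top + sy * (bottom - top)


def fbm(x, y, octaves=5, lacunarity=2.0, gain=0.5, seed=0):
    # Each octave adds finer detail at a smaller amplitude.
    total, norm, amplitude, frequency = 0.0, 0.0, 1.0, 1.0
    for octave in range(octaves):
        total += amplitude * value_noise(x * frequency, y * frequency, seed + octave)
        norm += amplitude
        amplitude *= gain
        frequency *= lacunarity
    return total / norm  # normalized to [0, 1]


def slice_mask(width, height, scale=0.1, level=0.5):
    # The "slice": keep every pixel whose height is above the cutting level.
    return [[fbm(i * scale, j * scale) > level for i in range(width)]
            for j in range(height)]


for row in slice_mask(40, 12):
    print("".join("#" if inside else "." for inside in row))
```

The octave loop is the "progressively smaller details" step: each pass doubles the frequency and halves the amplitude.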

If I understand correctly, you're generating an abstract shape tree and then rasterizing it, which would mean that performance breaks down into the generation phase and the rendering phase. Are you limited on the placement end or on the rendering end?
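
If you haven't measured it yet, something like the following would tell you which end dominates. This is only a sketch: `time_phases` is a made-up helper, and the two callables stand in for whatever your generation and rasterization entry points actually are.

```python
import time


def time_phases(generate, render):
    """Time the two phases separately (hypothetical helper).

    `generate` builds the shape tree; `render` takes that tree and rasterizes it.
    """
    t0 = time.perf_counter()
    tree = generate()
    t1 = time.perf_counter()
    image = render(tree)
    t2 = time.perf_counter()
    return {"generate_s": t1 - t0, "render_s": t2 - t1}, image


# Stand-in phases, just to show the call shape.
timings, image = time_phases(
    lambda: list(range(100_000)),        # placeholder for tree generation
    lambda tree: sum(x * x for x in tree),  # placeholder for rasterization
)
print(timings)
```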