Thanks @enver-arco, my prompts were pretty descriptive, anywhere from 20 to 50 words each. Objects that the AI has seen a lot (e.g. people, some animals, photos of the real world) are easier to describe with fewer words, and objects that were likely scarce in the training dataset (e.g. a rustic pit trap in the woods with sharp spikes at the bottom) take a lot more words and attempts to define for the AI (and I was still not satisfied with some of them).
Probably close to half of the text within my prompts was description of content (e.g. feral vicious wolf sharp teeth in the misty woods at night with heavy rain) and half was description of style (e.g. realistic, concept art, volumetric light, etc.). A few words that I found were important: beautiful, realistic, fantasy, concept art, digital illustration, portrait, videogame asset, digital icon, volumetric light, 3d render, photo, highly detailed, masterpiece, seamless texture.
For all the icons, using "videogame asset" helps a lot, but when I used it for the characters it made them worse. For the backgrounds and spike trap, using "photo" and even terms like "200 mm" helped a lot to bring them closer to what I was looking for. Typing "portrait" instantly makes faces and characters more beautiful, but does not help if you are trying to get a full body. For characters, depending on what I was getting, I had to add or remove things like "fantasy", "realistic", "beautiful".
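To make the content/style split concrete, here is a minimal Python sketch of how prompts like these could be assembled. It is only an illustration: `build_prompt` and the `STYLE_BY_ASSET` table are hypothetical names, not part of SD or any library, and the table only encodes the keyword pairings mentioned above.

```python
# Hypothetical helper for illustration only; not part of Stable Diffusion.
# The table encodes only the pairings mentioned in this post.
STYLE_BY_ASSET = {
    "icon": ["videogame asset"],
    "background": ["photo", "200 mm"],
    "character": ["portrait"],
}


def build_prompt(content, asset_type, extra=(), drop=()):
    """Join a content description with style keywords for one asset type."""
    # Start from the default keywords, minus anything explicitly dropped.
    styles = [s for s in STYLE_BY_ASSET[asset_type] if s not in drop]
    # Append extra keywords that are not already among the defaults.
    styles += [s for s in extra if s not in styles]
    return ", ".join([content.strip(), *styles])


prompt = build_prompt(
    "feral vicious wolf sharp teeth in the misty woods at night with heavy rain",
    "character",
    extra=["realistic", "fantasy"],
    drop=["portrait"],  # "portrait" does not help when you want a full body
)
print(prompt)
```

The `extra` and `drop` arguments mirror the trial and error of adding or removing words like "fantasy" or "realistic" between generations; the resulting string is what you would paste into the prompt box.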
Img2img helped a lot, especially with the card frames. I drew the rough shapes for them in MS Paint and then SD respected my dimensions and direction almost perfectly. I was very impressed with the power of that tool.
Most images took a lot of attempts: I'd say 20-ish generations plus some tweaking in GIMP, for a total of probably half an hour or so to finalize each. Sorry I rambled a bit :sweat_smile: