Rawhed's Tutorial #3:

32bpp Graphics Coding

[Introduction]
[32BPP Basics]
[Alpha Channel?]
[More 32BPP RGB]
[MMX Helps out]
[Conversion]
[Converting to 24BPP]
[Converting to 16BPP]
[Converting to 8BPP(mode13h)]
[Converting to TextMode]
[Closing Words]

---==[Introduction]==---------------------------------------------------------

This tutorial is based on how my current vesa/gfx engine works. I'd previously been doing just 16bpp graphics, and I had to code all of my routines for 16bpp. When I first started with 16bpp it was quite a novelty, but after a while (tut2) I was getting irritated with it and wanted a more flexible model. I saw demos which could run in TONS of modes like 8bpp, 15bpp, 16bpp, 24bpp, 32bpp and even textmode! A lot of demos could have their mode changed from the commandline, and I realised that this pure 16bpp model of mine was not so cool and very inflexible.

What this tutorial covers is a different way of coding gfx engines so that they can handle multiple color depths. Basically what happens is that you create all your memory buffers as if they were holding 32bpp graphics, and all of your internal graphics code works at the 32bpp level, and then finally when you want to flip the frame to the screen you just convert to the appropriate bpp level. So you could have conversion functions to convert between 32bpp-->16bpp and 32bpp-->8bpp, and then you would flip that into video memory.

I also had a lot of trouble finding out the mode number for 32BPP modes. All the vesa docs I read only went up to 24BPP. Eventually I found (from UNIVBE) that 320x200x32bpp is mode 146h.

I can't remember where I heard of this idea, but I do know that it's not original. In fact A LOT of demo groups use it. But since I couldn't find any tuts on it, and I think it works very well, I wrote this tut. So let's go then.

---==[32BPP Basics]==---------------------------------------------------------

Although 32bpp allows way more colours than the other modes (16bpp etc) it is actually the easiest to code for! 15 & 16bpp modes are cool, but they only offer 32768 & 65536 colours, and they are difficult to work with because they have the RGB values packed into them (see tut1).

The 32BPP format is easy, and of course each pixel takes up 32 bits (4 bytes) of memory. You have to be careful because a 320x200 surface can take up a lot more memory than lesser modes: 320 x 200 x 4 = 256,000 bytes at 32bpp, against 128,000 at 16bpp and 64,000 at 8bpp.

So only 4 32bpp layers (4 x 256,000 = 1,024,000 bytes) and you are using a MEG of memory!
Here is how the 4 bytes are structured:

8 bits for the Alpha channel, 8 for Red, 8 for Green and 8 for Blue. As you can see, you have the same range of RGB colours as you do in 24bpp. So why use 32bpp if it's just gonna take up more memory? Simple. First of all it's faster. Why would it be faster to read/write 4 bytes as opposed to 3? Basically the CPU handles reads and writes fastest when every pixel is one aligned 32-bit dword; with 3-byte pixels, every other pixel straddles a dword boundary. Also, you don't get 24-bit registers, so a 24bpp pixel has to be written as a word plus a byte (or as three bytes) instead of one dword store.

Because there is no easy way of writing 3 bytes at a time, it's much easier to write 4 bytes. Hence 32bpp modes :)
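
A minimal C sketch of the difference, assuming the pixel is held as 0xAARRGGBB in one 32-bit value and that a 24bpp surface stores its bytes in B, G, R order. The function names are made up for illustration and are not from any library:

```c
#include <stdint.h>

/* Assumed layout: alpha in the top byte, then red, green, blue (0xAARRGGBB). */
static uint32_t pack_argb(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}

static uint8_t get_a(uint32_t p) { return (uint8_t)(p >> 24); }
static uint8_t get_r(uint32_t p) { return (uint8_t)(p >> 16); }
static uint8_t get_g(uint32_t p) { return (uint8_t)(p >> 8); }
static uint8_t get_b(uint32_t p) { return (uint8_t)p; }

/* 32bpp: one dword store per pixel. pitch_px is the surface width in pixels. */
static void put_pixel32(uint32_t *surface, int pitch_px, int x, int y, uint32_t colour)
{
    surface[y * pitch_px + x] = colour;
}

/* 24bpp: three byte stores per pixel, and the address is x*3, not a shift. */
static void put_pixel24(uint8_t *surface, int pitch_bytes, int x, int y, uint32_t colour)
{
    uint8_t *p = surface + y * pitch_bytes + x * 3;
    p[0] = get_b(colour);
    p[1] = get_g(colour);
    p[2] = get_r(colour);
}
```

The 32bpp version is a single indexed store, which is the whole appeal.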

---==[Alpha Channel?]==-------------------------------------------------------

Well I must confess, at the time that I'm typing this I've never used the alpha channel, or really thought about what it could be used for... So I'm sort of gonna be making this up as I go along. But I'm sure you can think of groovy things to use it for. Having an extra 8 bits on your layers/surfaces is very handy indeed.

1]You could use it to define MANY characteristics of the surface pixels, for instance by giving individual alpha bits their own meaning.

That is just one way you could do things, although I think a simplified scheme would be better for the realtime demos of today.

2]You could keep things simple and just use the 8 alpha bits for doing your own internal transparency etc. This is probably what most people use it for. Very handy, but not something I've done myself.
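
For option 2, here is a minimal sketch of internal transparency. It assumes the 0xAARRGGBB layout and treats the source pixel's alpha as opacity (0 = invisible, 255 = solid). `blend_over` is a hypothetical name, and this is only one of several ways to weight the two pixels:

```c
#include <stdint.h>

/* Blend src over dst, using the top 8 bits of src as opacity.
   The destination's own alpha byte is kept unchanged. */
static uint32_t blend_over(uint32_t dst, uint32_t src)
{
    uint32_t a = src >> 24;
    uint32_t out = 0;
    for (int shift = 0; shift <= 16; shift += 8) {      /* B, G, then R */
        uint32_t s = (src >> shift) & 0xFF;
        uint32_t d = (dst >> shift) & 0xFF;
        /* (s*a + d*(255-a)) / 255, rounded to nearest */
        uint32_t c = (s * a + d * (255 - a) + 127) / 255;
        out |= c << shift;
    }
    return out | (dst & 0xFF000000u);
}
```

With alpha 255 the source replaces the destination, with alpha 0 the destination is left alone, and anything in between gives a mix.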

---==[More 32BPP RGB]==-------------------------------------------------------

Ok, so now you know the format etc. Now to show you some nice things. Want to add 2 RGB pixels together? Sure, easy - not like 16bpp.

A very nice trick that I found was with MMX instructions. They have something which I found perfect for 32BPP functions. I'm not about to write an MMX tutorial :) so go and read another doc for that, but I want to introduce one MMX feature in particular. Saturated registers.

Let's take a simple additive surface loop. Here you have 2 320x200x32bpp surfaces, both with pictures on them, and you want to add them together channel by channel, clipping each result at 255.
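
In plain C the straightforward version of that loop could look like the sketch below. It assumes 0xAARRGGBB pixels and adds the alpha byte along with the colours; `add_clip` and `add_surface` are illustrative names only:

```c
#include <stddef.h>
#include <stdint.h>

/* Add two 8-bit channel values, clipping the result at 255. */
static uint32_t add_clip(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;
    return sum > 255 ? 255 : sum;
}

/* dst[i] = dst[i] + src[i] for every pixel, one channel at a time. */
static void add_surface(uint32_t *dst, const uint32_t *src, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++) {
        uint32_t out = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            uint32_t c = add_clip((dst[i] >> shift) & 0xFF,
                                  (src[i] >> shift) & 0xFF);
            out |= c << shift;
        }
        dst[i] = out;
    }
}
```

For a 320x200 surface that is 64,000 pixels, so 256,000 shift, mask, add and compare steps per frame.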

Doing a clipped add on each channel of every pixel would be VERY slow, yes? Even doing that in normal ASM would be slowish. But MMX can make it easier. I use NASM, you should too :)

---==[MMX Helps out]==--------------------------------------------------------

Saturated registers are really saturating instructions: adds and subtracts which clip instead of overflowing. Normally if you added 250+20 in a byte value (say AL), at the end AL would = 14, because 270 wraps around modulo 256. What MMX's saturated instructions do is clip it. So when you do a saturated MMX add, 250+20 = 255. Funky eh? MMX works with 8 MMX registers (MM0-MM7), each of which is a 64-bit register. So you can store 2 32BPP pixels in each register! This is VERY cool because it means that using 1 instruction you can additively add 2 pairs of pixels.
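
The difference between a wrapping and a saturating byte operation is easy to check in C. This is a small sketch with made-up function names, and it shows the arithmetic only, not MMX itself:

```c
#include <stdint.h>

/* Ordinary byte add: the result wraps around modulo 256. */
static uint8_t add_wrap(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Saturating byte add: the result is clipped at 255. */
static uint8_t add_sat(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > 255 ? 255 : sum);
}

/* Saturating byte subtract: the result is clipped at 0. */
static uint8_t sub_sat(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : 0);
}
```

Here `add_wrap(250, 20)` gives 14 (270 - 256), while `add_sat(250, 20)` gives 255 and `sub_sat(20, 250)` gives 0.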

Two MMX instructions which I have found handy are: PADDUSB & PSUBUSB

Picture 2 MMX registers (64 bits each) filled with 2 pixels each, so each register holds 8 channel bytes (A, R, G, B twice).

The MMX instruction PADDUSB MM0,MM1 basically adds each 8-bit segment of MM1 to the matching segment of MM0, and clips each sum to 255. PSUBUSB MM0,MM1 does the same for subtraction, except that it clips to 0. Used in a loop over the whole surface, this does the same job as the plain per-channel add, but MUCH quicker.
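
To make the lane-by-lane behaviour concrete, here is a portable C sketch that emulates what PADDUSB and PSUBUSB do to the 8 byte lanes of a 64-bit value, plus a surface loop that handles 2 pixels per step. It is not MMX code and will not be fast; the `_emul` names are invented for this example. On an x86 compiler the `_mm_adds_pu8` and `_mm_subs_pu8` intrinsics from `<mmintrin.h>` map to the real instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Emulates PADDUSB: add 8 unsigned byte lanes, each sum clipped to 255. */
static uint64_t paddusb_emul(uint64_t a, uint64_t b)
{
    uint64_t out = 0;
    for (int shift = 0; shift < 64; shift += 8) {
        uint64_t sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF);
        if (sum > 255) sum = 255;
        out |= sum << shift;
    }
    return out;
}

/* Emulates PSUBUSB: subtract 8 unsigned byte lanes, each clipped to 0. */
static uint64_t psubusb_emul(uint64_t a, uint64_t b)
{
    uint64_t out = 0;
    for (int shift = 0; shift < 64; shift += 8) {
        uint64_t x = (a >> shift) & 0xFF;
        uint64_t y = (b >> shift) & 0xFF;
        out |= (x > y ? x - y : 0) << shift;
    }
    return out;
}

/* Additive surface loop, 2 pixels (one 64-bit value) per step.
   Assumes an even pixel count; an odd last pixel is left untouched. */
static void add_surface_pairs(uint32_t *dst, const uint32_t *src, size_t pixels)
{
    for (size_t i = 0; i + 1 < pixels; i += 2) {
        uint64_t d = ((uint64_t)dst[i + 1] << 32) | dst[i];
        uint64_t s = ((uint64_t)src[i + 1] << 32) | src[i];
        uint64_t r = paddusb_emul(d, s);
        dst[i]     = (uint32_t)r;
        dst[i + 1] = (uint32_t)(r >> 32);
    }
}
```

In the real thing the two inner loops disappear: each one is a single instruction working on all 8 bytes at once, which is where the speed comes from.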

You won't believe how fast this is until you try it.

---==[Conversion]==-----------------------------------------------------------

Ok, so you've written a groovy internal 32bpp gfx library. Complete with texture-mapped four dimensional splines and beautiful particle algorithms. Now what? Well, you have to copy your buffer into video memory so that it can be seen :) The nice thing is that the viewer doesn't have to have a videocard that can handle 32BPP. You can convert the image in the buffer to the appropriate format and then flip.
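
A sketch of what such a flip could look like in C. The mode list, the function names and the plain `vram` pointer are all assumptions for illustration; real code would write through whatever linear or banked frame buffer access your VESA setup gives you, and would have one case per supported mode:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum video_mode { MODE_32BPP, MODE_24BPP };   /* hypothetical mode list */

/* 32bpp card: the back buffer already has the right format, so just copy. */
static size_t flip_32(const uint32_t *src, uint8_t *vram, size_t pixels)
{
    memcpy(vram, src, pixels * 4);
    return pixels * 4;
}

/* 24bpp card: drop the alpha byte and write B, G, R for each pixel. */
static size_t flip_24(const uint32_t *src, uint8_t *vram, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++) {
        vram[i * 3 + 0] = (uint8_t)(src[i]);
        vram[i * 3 + 1] = (uint8_t)(src[i] >> 8);
        vram[i * 3 + 2] = (uint8_t)(src[i] >> 16);
    }
    return pixels * 3;
}

/* Convert the 32bpp back buffer for the current mode.
   Returns the number of bytes written to video memory. */
static size_t flip(enum video_mode mode, const uint32_t *src,
                   uint8_t *vram, size_t pixels)
{
    switch (mode) {
    case MODE_32BPP: return flip_32(src, vram, pixels);
    case MODE_24BPP: return flip_24(src, vram, pixels);
    }
    return 0;
}
```

All the drawing code stays 32bpp; only this one function knows what the card wants.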

A nice feature that I've added to my demo (which I'm busy writing) is that you can change videomodes while running the demo by pressing F1-F4. I thought this was quite a groovy idea :)

Before I actually sat down to code my 32BPP engine, I thought it would be very slow to convert all the time. I mean one fullscreen color conversion MUST be slow. But it's not that bad :) Why not? Ok, let's take a 320x200 frame in each mode: 32bpp pushes 256,000 bytes to the card, 24bpp 192,000, 16bpp 128,000 and 8bpp only 64,000.

As you can see, even though you have to convert, the amount of data you have to push to the video card becomes less, so it sort of compensates :) And besides, the conversion routine ISN'T that costly. I actually love figuring out new ways (and faster ways) to convert between different pixel formats. It's FUN :) Below are the algorithms that I use. If you use them please credit me and send me a little email ;). I don't claim that they are the best or anything, and if you can see kewl ways to improve them please give me a shout.

---==[Converting to 24BPP]==--------------------------------------------------

Well, this should be very easy :) Just chop off the ALPHA channel. So I'll leave this one up to you :)

---==[Converting to 16BPP]==--------------------------------------------------

Have fun trying to come up with your own methods :) I think PTC has some nice conversion routines, although I have yet to check them out.
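
As a starting point, one common way to do it is to keep the top 5 bits of red, 6 of green and 5 of blue. This sketch assumes 0xAARRGGBB input and a 5:6:5 (RGB565) target; a 15bpp 5:5:5 mode would need different shifts and masks. The function names are illustrative:

```c
#include <stddef.h>
#include <stdint.h>

/* 0xAARRGGBB -> RGB565. The alpha byte is masked away. */
static uint16_t to_rgb565(uint32_t p)
{
    return (uint16_t)(((p >> 8) & 0xF800) |   /* R: bits 23..19 -> 15..11 */
                      ((p >> 5) & 0x07E0) |   /* G: bits 15..10 -> 10..5  */
                      ((p >> 3) & 0x001F));   /* B: bits  7..3  ->  4..0  */
}

/* Convert a whole 32bpp buffer into a 16bpp buffer of the same pixel count. */
static void convert_32_to_16(const uint32_t *src, uint16_t *dst, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++)
        dst[i] = to_rgb565(src[i]);
}
```

Three shifts, three masks and two ORs per pixel, with no multiplies or lookups.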

---==[Converting to 8BPP(mode13h)]==------------------------------------------

The function I use doesn't take the palette into account. In fact, all it does is assume you've set your palette to go from 0 (black) to 255 (white), and then finds the approximate brightness of the RGB values and uses that as the colour index. I know it's lame :), but I've seen other demos doing the same thing. Oh well, I'm sure I'll write a color palette version very soon, as I only adopted this 32BPP internal mode about 2 weeks ago.
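
A minimal sketch of that greyscale approach, assuming 0xAARRGGBB input and taking "approximate brightness" to mean the plain average of R, G and B (a weighted sum would be closer to how the eye sees it). The palette builder assumes the standard VGA DAC, which takes 6-bit values (0-63) per component, so 256 grey entries really give 64 distinct shades. All names are made up for the example:

```c
#include <stddef.h>
#include <stdint.h>

/* Approximate brightness: the plain average of the three colour channels. */
static uint8_t to_grey8(uint32_t p)
{
    uint32_t r = (p >> 16) & 0xFF;
    uint32_t g = (p >> 8) & 0xFF;
    uint32_t b = p & 0xFF;
    return (uint8_t)((r + g + b) / 3);
}

/* Convert a whole 32bpp buffer into 8bpp palette indices. */
static void convert_32_to_8(const uint32_t *src, uint8_t *dst, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++)
        dst[i] = to_grey8(src[i]);
}

/* Fill a 256-entry black-to-white palette with 6-bit VGA DAC values. */
static void build_grey_palette(uint8_t pal[256][3])
{
    for (int i = 0; i < 256; i++)
        pal[i][0] = pal[i][1] = pal[i][2] = (uint8_t)(i >> 2);
}
```

The palette table still has to be sent to the card once at mode set; after that the conversion is the only per-frame work.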

---==[Converting to TextMode]==-----------------------------------------------

Hmmm, this was hard :) hehe, it's amazing that with these graphics modes, it seems to get easier the more colors you can have. I mean 32BPP is dead easy to code, 16BPP is harder, and textmode is quite a mission :) This is a VERY simple hack, and if you can make a better one, please let me know all about it. This one just writes character #176, #177, #178 or #219 to the screen depending on the brightness of the RGB value. It also selects the color (0-15) based on the "brightness" of the RGB value, so it assumes that your palette goes from dark-->bright. Unfortunately I haven't made it do funky things like change the palette in realtime or search for the best color. I'll probably do this soon. This is basically just a test.
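
One possible reading of that hack, as a C sketch. The exact split is an assumption: here the top 4 bits of the brightness pick the color (0-15) and the next 2 bits pick one of the 4 shade characters. It also assumes an 80x25 text screen, a 320x200 source and one sampled pixel per character cell, with each cell stored as a 16-bit word (character in the low byte, attribute in the high byte):

```c
#include <stdint.h>

/* Build one text cell (attribute << 8 | character) from a 0xAARRGGBB pixel. */
static uint16_t to_text_cell(uint32_t p)
{
    static const uint8_t shades[4] = { 176, 177, 178, 219 };
    uint32_t r = (p >> 16) & 0xFF;
    uint32_t g = (p >> 8) & 0xFF;
    uint32_t b = p & 0xFF;
    uint32_t bright = (r + g + b) / 3;          /* 0..255 */
    uint32_t colour = bright >> 4;              /* 0..15, dark to bright */
    uint32_t ch = shades[(bright >> 2) & 3];    /* 4 steps inside each colour */
    return (uint16_t)((colour << 8) | ch);
}

/* Sample the top-left pixel of every 4x8 block of a 320x200 surface
   into an 80x25 text buffer. */
static void convert_32_to_text(const uint32_t *src, uint16_t *text)
{
    for (int ty = 0; ty < 25; ty++)
        for (int tx = 0; tx < 80; tx++)
            text[ty * 80 + tx] = to_text_cell(src[(ty * 8) * 320 + tx * 4]);
}
```

Averaging each 4x8 block instead of sampling one pixel would look smoother, at the cost of reading every source pixel.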

---==[Closing Words]==--------------------------------------------------------

Phew :) I really hope this helps some people out there, in some way or another. Please send me any thoughts/ideas/improvements on this topic, I'd really like to hear/see them. The scene is wonderful, long live the scene. When I die I want to go to a scene heaven ;)

-Rawhed/Sensory Overload
-Mailto:sfeist@netactive.co.za
-Http://www.surf.to/demos/
-Andrew Griffiths
-South Africa
-05-07-1999