Oldskooler Ramblings

the unlikely child born of the home computer wars

to the last, I grapple with thee

Posted by Trixter on August 28, 2011

MindCandy 3 is 99.9% finished.  From January until now, it has crept along from about 98% done to 99.9% done.  Why the slow progress?

It is almost entirely Adobe Encore’s fault.  Encore is the only halfway decent solution to creating a Blu-ray, allowing Photoshop files for menu creation (very flexible and handy), multi-page menus, subtitles including a subtitle editor, and other fun stuff.  It also has a lot of help for the newbie if you need it, including encoding of assets, a library of themes and buttons, and most importantly a clear interface.  Coupled with the excellent Blustreak Tracer CMF, you can produce BDCMF output suitable for professional replication.  This is a lot of money (roughly $1800), but the nearest solution for BDCMF output upwards from this is Netblender’s DoStudio, which is nearly double the cost at $3000.  We are of a limited budget, and already familiar with Adobe’s tools, so we chose Encore+Tracer.

Encore’s blu-ray support, we have discovered, is extremely buggy and almost unusable.  Here are a few fun showstoppers we’ve had to work around:

  1. You can have only 15 buttons per page of a multi-page menu.  Any more and they’re not guaranteed to show up on hardware players.
  2. You can have only 9 pages per multi-page menu.  Additional pages aren’t guaranteed to show up on hardware players.
  3. You can have around 90 buttons spread across your multi-page menu.  Any more and they’re not guaranteed to show up on hardware players.
  4. Subtitles closer than 5 frames together will choke the project.  (CS 5.5 has an option to fix this, but I refuse to pay $600 for a bugfix, so for CS 5.0 I had to write a Subtitle Workshop script to adjust the subtitles so this wouldn’t happen.)
  5. Any video asset used as a background to a multi-page menu will be transcoded whether it is in a compliant format or not.  This is especially idiotic when you consider that multi-page menus are just graphical overlays on whatever is playing in the background, so no transcoding is even necessary.  Even more hilarious, it forces a transcode to 30fps, even if your asset is 24fps or 60fps.
  6. Using H.264 video with open GOPs (higher quality in a smaller space, perfectly valid for blu-ray) causes Encore to freak out and decode the entire asset once, causing near lock-up of your computer at 100% CPU across all cores while this is happening.  Working with any timeline greater than a few minutes is impractical because of this.
  7. Encore, for lack of a more eloquent term, fucks with blu-ray player registers it has no business fucking with.  As a result, subtitles turn on when they’re not supposed to.  BluStreak Tracer (mac) or BDEdit (PC) is required to fix this.
  8. There is no way to enable “title” or “return” remote control button functionality.
  9. Trying to encode a 480i pop-up menu results in a garbled menu (720p and 1080p/i works fine).
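The workaround for item 4 can be sketched roughly as follows.  This is not the Subtitle Workshop script itself; it is a Python illustration that assumes subtitles are (start, end) frame pairs sorted by start time, and that trimming the earlier subtitle's end frame is an acceptable fix.  The names enforce_min_gap and MIN_GAP_FRAMES are made up for this example.

```python
MIN_GAP_FRAMES = 5  # threshold quoted in item 4; everything else here is assumed


def enforce_min_gap(subs, min_gap=MIN_GAP_FRAMES):
    """Trim subtitle end frames so consecutive subtitles sit at least
    min_gap frames apart.

    subs is a list of (start_frame, end_frame) pairs sorted by start.
    A subtitle shorter than the gap itself is clamped to zero length
    rather than moved, so that case would still need manual attention.
    """
    fixed = []
    for i, (start, end) in enumerate(subs):
        if i + 1 < len(subs):
            next_start = subs[i + 1][0]
            # Pull the end in if it crowds the next subtitle,
            # but never to before this subtitle's own start.
            end = max(start, min(end, next_start - min_gap))
        fixed.append((start, end))
    return fixed


subs = [(0, 100), (102, 200), (210, 300)]
print(enforce_min_gap(subs))
```

Here the first subtitle ends only 2 frames before the second begins, so its end frame is pulled back from 100 to 97; the other two are left alone.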

Keep in mind that Adobe doesn’t disclose these issues.  (They disclose three of them in the CS5.5 release notes and claim to fix one of them, but like I said, I shouldn’t have to pay an extra $600 for a bugfix that restores advertised functionality.)  So my build process for the last two months has been something like this:

  1. Make changes to the blu-ray project in Encore to work around bugs (1 hour)
  2. Build the project (2 hours, thanks to bug #5 above)
  3. Burn and verify to a rewritable BD-RW50 (3 hours to burn, 2 hours to verify)
  4. Test in a PS3, as these bugs only affect hardware players.  Note bugs and issues.
  5. GOTO 1

This means it takes a minimum of 8 wallclock hours to test a change, and that’s if it happens on a weekend when I’m there to babysit the process and start one step as soon as the prior one finishes.  But, of course, that usually flushes out yet another bug that you need to fix.  I’ve been through at least 20 iterations of this when Encore should have just simply worked as advertised.
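For a sense of scale, here is the arithmetic as a small Python sketch.  The step names are just labels I made up; the hours are the ones listed above, and the PS3 testing in step 4 is left out because no time is given for it.

```python
# Hours per step, taken from the numbered build process above.
steps_hours = {"edit": 1, "build": 2, "burn": 3, "verify": 2}

hours_per_iteration = sum(steps_hours.values())
iterations = 20  # "at least 20 iterations"
total_hours = hours_per_iteration * iterations

print(hours_per_iteration, "hours per iteration")
print(total_hours, "hours minimum across", iterations, "iterations")
```

That works out to 8 hours per pass and at least 160 wallclock hours overall, before counting any time spent actually testing on the player.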

Looking back, we should have spent the $3000 for DoStudio.  It was significantly more expensive, and we would have had to probably borrow money to pay for it hoping that the sales of MC3 would repay the cost, but the time it would have saved would have been worth it.  We might have been done months ago.

I wouldn’t be so frustrated if Adobe publicly acknowledged bugs.  Hell, I’d be happy if they acknowledged bug reports, of which I’ve submitted 5 (they don’t even acknowledge if they’ve received a bug report!)  I could have designed MC3 around those bugs a year ago, saving all this time.

Posted in MindCandy | 7 Comments »

Happy Birthday to the IBM PC (and MTV)

Posted by Trixter on August 12, 2011

Today, the IBM PC celebrates its 30th birthday.  11 days earlier, MTV did the same.  Both of those events changed the world and shaped my life, so I had a little fun with my own IBM PC to commemorate the event, which I call MTV Corruption:

Posted in Demoscene, Technology, Vintage Computing | 3 Comments »

Walking the road to dead

Posted by Trixter on August 1, 2011

As of this very minute, I am 40 years old.  Barring any unforeseen disease or accident, my life is essentially half over.

So, how’s my driving?

Directly after graduating high school, my senior class went to a party thrown by the school in a rented skating rink masquerading as a giant dance hall.  Despite being less than 5 miles away from the graduation ceremony, some teens were showing up drunk, something I hadn’t ever seen before.  Some arrived with JBF hair, something else I hadn’t seen before.  And when the party was over at 2am, everybody went to a Lake Michigan beach about 2 miles away where the party continued (under the watchful eye of police who had been given “incentive” by wealthy township parents to watch over the party without arresting anyone) with much alcohol and the occasional disappearance into the bushes.  I imbibed of neither, being a completely sheltered and, at that moment, shocked virgin.  Midway through the second party, I asked a similarly-sheltered friend, “I thought only 10% of our class had sex and did drugs; what the hell is going on?”  “Where have YOU been?” he replied.  “Your percentages are inverted.” He then explained to me how some of our friends were able to convince www.vaporizervendor.com to provide some free drug paraphernalia.

I vowed a few things that morning:

  • I would stop contemplating suicide
  • If I was still a virgin by New Year’s Eve 1999, I would commit suicide
  • I would give alcohol a chance

I’m happy to report that I was no longer a virgin less than a year later, having met my soulmate in college.  21 years, 16 marriage anniversaries, and two children later, things simply couldn’t be better.  For anyone who thinks that there is nobody out there for them, I say this:  Get out more.  Someone, somewhere, really wants to meet you, and you really want to meet them.

What about that alcohol vow?  I’ve had so few drinks in my life that I can remember every single one of them, and to prove it, here goes:  A Miller Light at a party when I was 16, a small glass of everclear punch at a frat party when I was 19, rum and coke at my bachelor party, Malibu rum and coke at a company party, a Corona at a company outing, a Bud Light after a successful day of running the MobyGames booth at Classic Gaming Expo 2004, a glass of salmiakki at Pilgrimage 2004, another one at Block Party 2009, three different types of spirits at Whiskeyfest Chicago 2010, about 10 beers over an 18 month period at a recent company, two “rum barrels” at same said company’s outing, two shots of something unidentifiable yet quite strong while leaving said company, and a Malibu rum and coke at a recent wedding.  That’s everything.  I think that’s enough to say I’ve given alcohol a chance, and I still really fucking hate it.  Every one of them has burned on the way down.  Every single one.  I don’t understand the appeal of a substance that directly attacks you as you imbibe.  “Well, you didn’t drink enough!” I hear someone shout in the back of the room.  Maybe not, but if I wanted to get relaxed and/or euphoric, I would rather just go to a demoparty or get sleep-deprived (or, as is usual for demoparties, both simultaneously).  You know what really lifts me?  Watching something so goddamn funny that tears stream down my face from all the laughing.  I can’t believe being drunk is better than that.

As a physical specimen, I could have gone better.  I was born with one foot turned 80 degrees towards the other.  I inherited terrible eyes from both my parents; one was cross-eyed with astigmatism, and the other quite nearsighted, so naturally I got all three of those and am legally blind without my glasses.  My eyes are so bad, in fact, that I don’t qualify for LASIK (the best it could do for me is reduce my prescription, two eye doctors have told me; no point in doing it if I still have to wear glasses!).  I’ve never had any natural athletic ability.  Every September is hell thanks to hayfever allergies.  But it’s not all bad; innovative eye training at a young age almost completely cured my crossed eyes without surgery (and earned me a Speak’n’Spell as a reward), and a leg brace worn until I was three corrected the foot.  I shot up to 6 feet 2 inches by age 16, where I remain.  My weight is a problem, but I’ve started running again and it’s something I have control over and hope to be in good shape in four months.  Heck, I still have all my hair.  I could have turned out a lot worse.

I’ve experienced a lot of heartache my first 40 years.  I’ve been beaten up on a regular basis, nearly got kicked out of high school for ditching class, was kicked out of college for the same thing, washed out of a physical labor job after only two days, and blew a shot at a potentially high-earning new career by screwing up a managerial position.  I’ve also CAUSED a lot of heartache, by being pretentious and rude to people who didn’t deserve it, treating every member of my immediate family badly or disrespectfully at least once, dumping my first girlfriend in a truly horrific way, acting unprofessionally in front of customers, and even stealing (in both the plagiarism and retail sense).  I’ve nearly doubled my high-school graduation weight.  Early in my career, I was known (and treated) as “the smartest kid in the room”, something I’ve lost due to age and time and has resulted in some depression.  I’ve even lost a few friendships along the way.  Deservedly, I am cursed with extremely detailed memories of every single one of these events.

Thankfully, I’ve had a lot of good things happen to me as well, some by chance, and others by my own doing.  I met my wonderful wife, who I somehow convinced to put up with me and who gave me two wonderful children.  I made some considered and crafty career choices that kept me fulfilled with how I earn a living, something I’m especially proud of given that I never completed college.  I’ve personally witnessed the birth (and death, in some cases) of home computers, music videos, the space shuttle, digital media, the internet, the web, the fall of the Berlin wall, cell phones, the bicentennial, and of course video games.  The day I was born, astronauts from Apollo 15 first took the lunar rover out for a spin.  I’ve started a few projects that I am well-known for in certain small circles, including one that wildly outgrew what I could give it and continues to survive without me.  I even gained approval and acceptance from a small group of underground creative hackers, which tickles me.

If I had to go back and live my life again, I’d do it all exactly the same.  Cliché or not, I really would, since deviating from the course would put me somewhere else entirely today, and I’m not sure I want that.  If I hadn’t gotten picked on and beat up so much as a youth, I probably wouldn’t have turned to computers and music for solace and comfort.  (And believe me, computers pretty much saved my life.)  If I hadn’t done so poorly in high school, I wouldn’t have picked Monmouth College to attend (the only nice college that would take me based on my ACT scores and not my GPA) and I wouldn’t have met my wife, and consequently had our children.  If I hadn’t flunked out of college, I wouldn’t have had the career path that led to where I am today; I probably would have graduated with a liberal arts degree with a specialization in computer science, and gotten work in a local rural town doing mediocre application programming.  And so on.

No, really – I really would do it all over again.  Want one last example?  High school.  Most people never want to revisit high school.  Me, I wish I could do some of this stuff ten times over:

Today on the train ride into work, I sat across the aisle from a large mid-40’s guy with unkempt shaggy balding hair 2 inches too long, black sneakers worn with blue jeans, an 80’s hair-metal black t-shirt one size too small, and a dirty no-name mp3 player that he was using to listen to uncomfortably loud metal on his cheap earbuds.  Think Brian Posehn but without the personality and success.  His music was so loud that I could make out the lyrics, and my initial impulse was to ask him to turn it down.  But as I kept glancing over, I saw he was really rocking out to what he was listening to, in his confined sitting-in-a-train-seat way.  This loser had nothing but his cheap metal, which was enough.  I opted not to bother him; let him have his moment, something nice to sustain him for the rest of his inevitably crappy day at a crappy job.  I mention this to illustrate two things:  The first is this attitude I have, something I’ve gained with age and did not have 20 years ago — patience, forgiveness, empathy, consideration.  The second is how tiny changes early in life could have turned me into this guy.  It’s in these moments that I’m actually glad I’m older.

Every six months, one aspect of your life gets much easier, while something else gets much, much harder.  I can live with those odds for the second half.

Posted in Lifehacks, Sociology | 4 Comments »

My thoughts exactly

Posted by Trixter on July 23, 2011

I normally don’t post short articles that just link to other places, but I ran across two posts recently that say exactly what I was going to try to say in coming weeks.  Rather than stab at the topics badly, I thought it would be better to just refer you to them.  So here they are.

Bryan Jones wrote a wistful account of the end of the space shuttle program, along with his personal photo of Atlantis’ final approach.  I saw the first and the last shuttle launches live on TV (as a 10-year-old, my mother woke me at 5:30 in the morning to watch the first one), and I feel, as he does, that our lack of commitment to a space program is a shame.  For those who wonder what we gained from spending money on the shuttle program, he lists some of the advances the shuttle program has given us, such as cell phone cameras and LED lights.

Optimus wrote a little on why he has pulled back from the demoscene a bit, and I urge all my scener friends to read this because he sums up very closely the state of mind I’ve had in the last couple of years.  In fact, his history mirrors mine a little, including how I felt when I first discovered the scene, how I treated the scene the first few years, why I attempted some scene “outreach” at times, and why I mostly hold back.

So there you go.

Posted in Demoscene, Technology | 1 Comment »

At a disadvantage

Posted by Trixter on June 4, 2011

Quick, without doing any research: What early 1980s computer was faster, the IBM PC or the Commodore 64? The IBM PC ran an 8088 at nearly 5MHz, whereas the C64 ran a 6502 variant at 1MHz. The PC cost thousands of dollars, the C64 hundreds. The PC had a 1 megabyte address space; the C64 only 64K. Is this a trick question?

It is!  The C64 was faster.  The original IBM PC, despite appearances and bias on the part of both consumers and marketing, was actually the slowest popular personal computer on the market at the time of its release, even compared to the Apple II and Atari 400.  Here’s why.

The 8088 holds an uncomfortable position between the realm of 8-bit and 16-bit personal computing; while the internal word size was indeed 16-bit, the 8 in 8088 means that its external data bus was only 8 bits wide.  This means that the 8088 could only access one byte of data in a single bus operation, giving it speeds much more like an 8-bit personal computer than a 16-bit one. Normally this is no big deal; the 6502 used in the C64 had the same limitation.  But unlike the 6502, which could access a byte in a single cycle, the 8088 took 4 cycles to access that same byte.  Another way of looking at this: every time memory is touched, the 8088 wastes 75% of its cycles, effectively turning the IBM PC from a 4.77MHz computer into a 1.1925MHz computer.  This gave it a “lead” of only 0.1695 MHz over the C64.
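The arithmetic in that last step, as a quick Python sketch.  The 1.023MHz C64 clock is not stated in the paragraph above; it is the NTSC figure implied by the 0.1695 MHz “lead”, so treat it as my assumption.

```python
PC_CLOCK_MHZ = 4.77    # original IBM PC clock
CYCLES_PER_BYTE = 4    # 8088 bus cycles needed to move one byte
C64_CLOCK_MHZ = 1.023  # assumed NTSC C64 clock, implied by the quoted lead

# If every memory access costs 4 cycles, the memory-bound speed is a quarter
# of the nominal clock.
effective_pc_mhz = PC_CLOCK_MHZ / CYCLES_PER_BYTE
lead_mhz = effective_pc_mhz - C64_CLOCK_MHZ

print(f"effective PC speed: {effective_pc_mhz:.4f} MHz")
print(f"lead over the C64:  {lead_mhz:.4f} MHz")
```

This reproduces the 1.1925MHz and 0.1695 MHz figures quoted above.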

If it still had a slight lead, then why was it slower?  While the 8088 could indeed operate on 16 bits at a time, the machine instructions were typically 2 to 4 bytes long, and only the simplest instructions took 2 cycles to execute (not counting the 4 cycles per byte needed to fetch them).  Contrast that with the 6502, where instructions are 1 to 3 bytes long and the simplest finish in 2 cycles with the opcode fetch included.

Let’s illustrate this with a fun example:  Rotating a byte of memory once using ROR (rotate right). We’ll keep it fair by treating the PC like it only has a single 64K segment of memory. First, the 6502 version using ROR:

Cycle Operation
1 fetch opcode, increment program counter
2 fetch low byte of address, increment program counter
3 fetch high byte of address, increment program counter
4 read from effective address
5 write value back and do operation
6 write the new value to the effective address

6 cycles. Now the 8088 version:

Cycle Operation
1 ROR BYTE PTR [1234],1 expands to “D0 0E 34 12” so let’s get to fetching the opcode:
2 (still fetching…)
3 (still fetching…)
4 (still fetching…)
5 (still fetching…)
6 (still fetching…)
7 (still fetching…)
8 (still fetching…)
9 Fetch lowbyte of address
10 (still fetching…)
11 (still fetching…)
12 (still fetching…)
13 Fetch hibyte of address
14 (still fetching…)
15 (still fetching…)
16 (still fetching…)
17 Perform operation, which takes 15 cycles + EA calculation (6)
37 Final cycle of calculation, we’re done, yay :-/

What took 6 cycles on the C64 takes 37 cycles on the IBM PC, no thanks to the slow memory access of 4 cycles per byte. Taking both machines’ clock speeds into account, this means the operation takes about 6 microseconds on the C64 and about 8 microseconds on the IBM PC.  It can get much worse than that, especially if you’re foolish enough to access more than a single 64K memory segment.  IBM PC is teh suck! (*)
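To check the conversion from cycles to wall-clock time, here is a minimal Python sketch.  It assumes the NTSC C64 clock of 1.023MHz and the PC's 4.77MHz; the helper name microseconds is mine.

```python
def microseconds(cycles, clock_mhz):
    """Time for a given cycle count: cycles / MHz gives microseconds."""
    return cycles / clock_mhz


c64_us = microseconds(6, 1.023)  # 6502: ROR absolute, 6 cycles
pc_us = microseconds(37, 4.77)   # 8088: ROR BYTE PTR [1234],1, 37 cycles

print(f"C64:    {c64_us:.2f} us")
print(f"IBM PC: {pc_us:.2f} us")
print(f"The PC takes {pc_us / c64_us:.2f}x as long")
```

The results round to the 6 and 8 microseconds quoted above, so the PC needs roughly a third more time for the same rotate despite the much higher clock.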

The gap between the IBM PC and the Atari 400 is even wider, if you can believe that, because the Atari 400 ran the 6502 faster (1.79MHz) than the C64 (1.023MHz).  The BBC Micro?  2MHz!  It’s painful to think about!

Ever wonder why there hasn’t been a true demoscene demo on the original IBM PC aside from three scrollers (all Sorcerers releases, btw)? Well, now you know one major reason. (Lack of decent graphics is another; in fact, I’d be willing to argue that only the Apple II had slower graphics.)

(*) Yes, I know the 8088 has a 4-byte prefetch queue that sometimes speeds things up.  That comes in handy, oh, almost never.

Posted in Demoscene, Programming, Uncategorized, Vintage Computing | 43 Comments »

MindCandy: What’s taking so damn long?

Posted by Trixter on May 24, 2011

Work on MindCandy 3 continues, and I wouldn’t be posting something if the end wasn’t firmly in sight.  After three years, it is 99% finished and the end really is in sight.  Here’s the status:

  • All the demos, intros, NVScene footage, production notes, and easter eggs are completely finished and through production.  (And the blu-ray footage looks absolutely stunning.)
  • All the group commentary is in, except for the very last one which I hope will come through because it’s pretty important in my opinion, but I’m not going to wait until MekkaSymposiumBreakpointRevision 2018 to get it.  (edit: we got it!)
  • Our cover is done, another masterpiece from fthr.  Our booklet is 95% done.
  • I dusted the cobwebs off my Cinema 4D knowledge and put together an intro animation for the disc.  (Just a 15-second abstract thing, mind you, but it’s better than being dumped unceremoniously into the main menu without so much as a how-do-you-do.)  I also did some background drone and foley for it — shocking, I know!  Don’t be too impressed; I used loops.
  • The blu-ray is finished authoring, which was an arduous process because Adobe Encore is so damn buggy.  Phoenix did some great menus given the limitations we had to work with.  I had to start over from scratch a few times, and even then there are some bugs which will just have to stay in.

If things are looking so rosy, why are there still about 8 weeks left before you can hold this masterpiece in your hands?  One word:

Subtitles.

At my maximum speed, typing between 90 and 100 wpm with a clear understanding of what is being said, it takes at best 4x realtime to subtitle what people are saying on the commentary.  Because there is a mixture of accents and varying degrees of being able to speak English, this can take as much as 10x realtime.  And you can only do about an hour of it before your hands start to cramp up.  So let’s do some math:  If it takes, say, 7x realtime on average to subtitle, and we have 4 hours to subtitle (main feature+intro featurette+production notes), it would take one person about 28 solid hours to complete the subtitling.  I have about 90 minutes a day to do subtitling, from my train ride back home from work where I can get a good seat and fall into a groove, to free time during evenings.  Still, that means the soonest I can get done is about 19 days (over 2.5 weeks!) from now.
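The same estimate as a small Python sketch, using only the numbers above (the variable names are mine):

```python
hours_of_footage = 4   # main feature + intro featurette + production notes
realtime_factor = 7    # assumed average, between the 4x best and 10x worst case
minutes_per_day = 90   # subtitling time available per day

work_hours = hours_of_footage * realtime_factor
days_needed = work_hours * 60 / minutes_per_day

print(f"{work_hours} hours of typing")
print(f"{days_needed:.1f} days at {minutes_per_day} minutes a day")
```

That gives 28 hours of work and a little under 19 days for one person working alone, in line with the rough figure above.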

Luckily, I have some weekend time too, and other members of the group are taking chunks, so hopefully we’ll be done in less than 2 weeks.

I hate subtitling.  I really, really hate it, especially since you are creating subtitles for something that should never be watched without audio in the first place (these are demos for goodness sakes!).  But because we have an international audience, and that audience may not understand English all that well, we are going through this ordeal for you, the customer.  All praise attention to detail!  All hail the customer!

I haven’t thought about who is going to do the translation of the subtitles, which is unfortunately going to extend time even further.  Maybe we’ll only offer English subtitles.  I really don’t want to delay MindCandy 3 beyond Assembly — I want it to be ready by Assembly.  Which is also the reason I’m not going to subtitle the additional TEN HOURS of NVScene 2008 footage, even though it is hard to understand sometimes.  I’m sorry, but really, do you want MindCandy 3 to be delayed until the end of the year for subtitling?

Enjoy a frame from the opening anim:

Posted in Demoscene, Digital Video, Entertainment, MindCandy | 3 Comments »

PixelJam @ NOTACON

Posted by Trixter on April 14, 2011

This weekend I will be at one of three (!) scheduled North American demoparties, PixelJam hosted at NOTACON.  Feel free to stop by the demoroom and say hi.

No entries from me this year; my sole agenda for the party is to finish MindCandy 3, since Phoenix will be there as well.

Posted in Demoscene | 3 Comments »

When you reach the top, keep climbing

Posted by Trixter on March 15, 2011

(Rather than break up the discussion, I’ve edited this entry with the promised timing information at the end of the post.)

First off, you owe it to yourself to check out Paku Paku, the astonishingly great pac-man clone written by Jason Knight.  Why astonishingly great?  Because, as a hobbyist retrogaming project, it does everything right:

  • Uses a 160×100 16-color tweakmode on CGA, PCjr/Tandy, EGA, VGA, and MCGA, despite only VGA being capable of a truly native 160×100 resolution
  • Plays multi-voice sound and music through the PC speaker, Tandy/PCjr 3-voice chip, Gameblaster CMS, and Adlib (yes, CMS support!)
  • Runs on any machine, even a slow stock 128K PCjr
  • Has convincing game mechanics (ghosts have personalities, etc.)
  • Comes with full Pascal+ASM source code

This is just as good a job, if not better, than I like to do with my retroprogramming stunts.  Very impressive work!

One of the things I love about coding for the 8088/8086 is that all timings and behavior are known. Like other old platforms like the C64, Apple II, ZX Spectrum, etc. (or embedded platforms), it truly is possible to write the “best” code for a particular situation — no unpredictable caches or unknown architectures screwing up your optimization. Whenever I see a bit of 808x assembly that I like, I try to see if it can be reworked to be “best”.  I downloaded Paku Paku just as much for the opportunity to read the source code as for the opportunity to play the game (which I did play, on my trusty IBM 5160).

On Mike Brutman’s PCjr programming forum, a discussion of optimizing for the 8088 broke out, with Jason giving his masked sprite routine inner loop as an example of how to do things fast:

lodsw
mov  bx,ax
mov  ax,es:[di]
and  al,bh
or   al,bl
stosw

It takes advantage of his sprite/mask format by loading a byte of sprite data and a byte of the sprite mask with a single instruction, then it loads the existing screen byte, AND’s the sprite mask out of the background, OR’s the sprite data into the background, then writes the background data.  It takes advantage of many 808x architecture quirks, such as the magic 1-byte LODS and STOS instructions (which read a word into/write a word out of AX and then auto-increment the SI or DI registers, setting up for the next load/store) , and the 808x’s affinity for the accumulator (AX, for which many operations are faster than for other registers).  In the larger function, it’s unrolled, specialized for the size of the sprite.  It’s pretty tight code.
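For readers who don't speak 808x, here is roughly what that inner loop does, modeled in Python.  This is my own illustration, not code from Paku Paku: it assumes each sprite word carries the mask in its high byte and the data in its low byte (as BH/BL are used in the listing), and that only the low byte of each screen word is modified while the other byte is written back unchanged.

```python
def blit_masked(screen, sprite_words):
    """Model of the unrolled LODSW / AND / OR / STOSW sequence.

    screen:       bytes of the off-screen buffer, two bytes per word
    sprite_words: 16-bit values, high byte = mask, low byte = sprite data
    Returns a new bytearray; the input is not modified.
    """
    out = bytearray(screen)
    for i, word in enumerate(sprite_words):
        mask = (word >> 8) & 0xFF  # BH in the listing
        data = word & 0xFF         # BL in the listing
        pos = 2 * i                # STOSW advances DI by two bytes
        # AND knocks a hole in the background, OR drops the sprite into it.
        out[pos] = (out[pos] & mask) | data
    return out


background = bytearray([0xAB, 0xDE, 0xCD, 0xDE])
sprite = [0xF005, 0x0F30]
print(blit_masked(background, sprite).hex())
```

With a mask of F0 and data of 05, the background byte AB becomes A5: the high nibble survives and the low nibble is replaced by the sprite.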

However, one line (“MOV BX,AX”) bugged me, as it also bugged the author:

The sprite data format is stored as byteMask:byteData words which I point to with DS:SI for LODSW… which I then move to BX (which sucks, but is still faster than MOV reg16,mem; add SI,2) so I can use bh as the mask and bl as the data.

So, was that code “best”?  Is there no faster way to write a masked sprite in 160×100 tweaked text mode on the 8088?

First, let’s look at his original code, with timings and size:

lodsw            16c 1b
mov  bx,ax       2c  2b
mov  ax,es:[di]  10c 3b
and  al,bh       3c  2b
or   al,bl       3c  2b
stosw            15c 1b
--------------------------
subtotal:        49c 11b
total cycles (4c per byte): 93 cycles

On 8088, reading a byte of memory takes 4 cycles, whether it’s “MOV AX,mem” or the MOV instruction opcode itself. That’s why smaller slower code can sometimes win over larger faster code on 808x. So it’s important to take the size of the code into account when optimizing for speed.
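The “total cycles” line can be reproduced with a very simple cost model: execution cycles plus 4 cycles for every byte of code fetched.  The Python below is only a sketch of that model applied to the listing above; it ignores the prefetch queue entirely, and the function name total_cost is mine.

```python
FETCH_CYCLES_PER_BYTE = 4  # every code byte costs 4 cycles to read on the 8088

# (instruction, execution cycles, size in bytes), copied from the listing above
original = [
    ("lodsw",          16, 1),
    ("mov bx,ax",       2, 2),
    ("mov ax,es:[di]", 10, 3),
    ("and al,bh",       3, 2),
    ("or al,bl",        3, 2),
    ("stosw",          15, 1),
]


def total_cost(listing):
    """Return (execution cycles, code bytes, cycles including code fetch)."""
    exec_cycles = sum(cycles for _, cycles, _ in listing)
    code_bytes = sum(size for _, _, size in listing)
    return exec_cycles, code_bytes, exec_cycles + FETCH_CYCLES_PER_BYTE * code_bytes


print(total_cost(original))
```

For this listing the model gives 49 execution cycles, 11 bytes, and 93 cycles in total, with almost half of the total spent just fetching the code.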

Some background knowledge of how Paku Paku works can help us:  The game does all drawing to an off-screen buffer that mirrors the video buffer, and when the screen needs to be updated, only the changed memory is copied to the video buffer.  Because Jason does all drawing to an off-screen buffer in system RAM, and the video buffer is smaller than the size of a segment, you have room left over in that segment to store other stuff. So if you store your sprite data in that same segment after where the video buffer ends, you can get DS to point to both screen buffer AND sprite data. Doing that lets us point BX to the offset where the sprite is (it was originally meant to be an index register after all), and use the unused DX register to hold the sprite/mask. We can then rewrite the unrolled inner loop to this:

mov  dx,[bx]     8+5=13c 2b ;load sprite data/mask
lodsw            16c     1b ;load existing screen pixels
and  al,dh       3c      2b ;mask out sprite
or   al,dl       3c      2b ;or sprite data
stosw            15c     1b ;store modified screen pixels
inc  bx          3c      2b ;move to next sprite data grouping
--------------------------
subtotal:        53c     10b
total cycles (4c per byte): 93 cycles

Although we saved a byte, it’s a wash — exactly the same number of cycles in practice.  However, since he is already unrolling the sprite loop for extra speed, we can change INC BX to just some fixed offset in the loop every time we need to read more sprite data, like this:

mov dx,[bx+1]
(next iteration)
mov dx,[bx+2]
(next iteration)
mov dx,[bx+3]

By adding a fixed offset, we can get rid of the INC BX:

mov  dx,[bx+NUM] 12+9=21c 3b ; "NUM" being the iteration in the loop at this point
lodsw            16c      1b
and  al,dh       3c       2b
or   al,dl       3c       2b
stosw            15c      1b
----------------------------
subtotal:        58c      9b
total cycles (4c per byte): 94 cycles

We shaved two bytes off of the original, but we’re one cycle longer than the original.  While the smaller code is most likely faster because of the 8088’s 4-byte prefetch queue, it’s frustrating from a purely theoretical standpoint.

Reverse-engineer extraordinaire Andrew Jenner thinks two steps ahead of me and provides the final optimization that not only gets the cycle count down, but frees up two registers (DX and SI) in the process.  He writes only what is necessary, and since we need to skip over every other byte when writing in 160×100 mode, manually updates the DI index register to do so.  The end result is obtuse to look at, but undeniably the fastest:

mov ax,[bx+NUM]  12+9=21c 3b ; "NUM" being the iteration in the loop at this point
and al,[di]      9+5=14c  2b
or al,ah         3c       2b
stosb            11c      1b
inc di           3c       1b
----------------------------
subtotal:        52c      9b
total cycles (4c per byte): 88 cycles

…successfully squeezing blood from a stone.
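Putting the four listings side by side, here is a small Python sketch using the same rule of thumb as the listings (subtotal cycles plus 4 cycles per code byte).  The labels are mine, and this is a paper estimate only; it says nothing about what the prefetch queue does on real hardware.

```python
FETCH_CYCLES_PER_BYTE = 4

# (subtotal cycles, code bytes) as given under each listing above
variants = {
    "original (LODSW + MOV BX,AX)": (49, 11),
    "BX-indexed with INC BX":       (53, 10),
    "BX-indexed, fixed offsets":    (58, 9),
    "Andrew Jenner's version":      (52, 9),
}

totals = {
    name: cycles + FETCH_CYCLES_PER_BYTE * size
    for name, (cycles, size) in variants.items()
}

baseline = totals["original (LODSW + MOV BX,AX)"]
for name, total in totals.items():
    print(f"{name:30s} {total:3d} cycles ({total - baseline:+d})")
```

On paper the first three land within a cycle of each other at 93, 93, and 94, and only the last one pulls ahead at 88.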

Is this truly “best”?  I think so.  But to prove it, we have to time the code running on the real hardware.  Thanks to Abrash’s Zen Timer, we have the following results:

  • Jason’s original code as listed above, repeated three times to plot a 5×5 sprite:  48 microseconds
  • My code block, three times with [bx], [bx+1], [bx+2]: 41 microseconds
  • Andrew’s optimization, also written with [bx], [bx+1], [bx+2]: 37 microseconds

And just to make your head spin, check the comments for this entry — the resulting discussion shows that if you’re willing to rearrange both your sprite data and your thinking, you can get things even faster!

Posted in Gaming, Programming, Vintage Computing | 24 Comments »

Doppelgangers!

Posted by Trixter on February 13, 2011

I have free time to work on a single project at a time, and that project this weekend has been MindCandy.  (We’re very close to a test disc (yay!) — minus subtitles.  Subtitling 4 hours of multi-speaker dialog is a massive chore, multiplied by the number of languages you want to have, so we’re strongly considering not doing subtitles.)  But if I had time to work on multiple projects simultaneously?  I’ve always wanted to produce videos about classic hardware and games, 99% centered on the PC/DOS platforms of the 1980s.  Imagine how happy I am to have discovered the following people:

Lazy Game Reviews – Produces 10-minute reviews on both hardware and games, with a touch of humor and lots of footage captured from the real hardware whenever possible.  The Carmageddon review in particular is perfection, having been captured from a real 3Dfx card and with meaningful illustrations of gameplay, including some accurate history of the development of the game.  His YouTube channel makes it easier to navigate past shows, but the blip.tv channel earns him a modicum of cash and has better quality video, so… choose.

Ancient DOS Games – While LGR covers the gamut of classic personal computers and gaming, Ancient DOS Games covers only DOS games, and the thoroughness and attention to detail is astounding.  Features like tips and tricks on how to play the game, recommending the best graphics mode or DOSBOX settings per game, noticing what the framerate of the game is and how it affects gameplay, and even a comparison of dithering methods in Thexder and whether or not they were effective — these are all OCD traits that I would have put into my own coverage of the material.  His fly-outs are pixel-art amusing.

Those guys are doing such an amazing job that I really don’t see the need for me to do so.  The both of them combined equals a quality of work that I can’t see myself improving upon, which not only makes me very happy, but frees me up to work on other projects.  Check them out, dammit!

PS: I found I have a true doppelganger over on tumblr.  We have very much in common, and would have more so were I a lesbian.

Posted in Digital Video, Entertainment, Gaming, MindCandy, Vintage Computing | 3 Comments »

An attempt at podcasting

Posted by Trixter on February 11, 2011

If you’d like to hear what I sound like imitating a podcaster, head on over to Hacker Public Radio to listen to an argument against emulators.  This was lightly scripted and lightly edited, and while I don’t think it turned out very well, I’ve received some nice comments on it, so maybe you’ll like it too.

I have an idea for a regular podcast I would indeed like to do, but not until MindCandy is finished and maybe one or two other projects as well.  The idea would center around vintage IBM PC and DOS-era computing as a hobby, of course.

Posted in Vintage Computing | 2 Comments »