Reverse-engineering an old wound

Posted by Trixter on November 8, 2012

Nearly two decades ago on the Usenet newsgroups comp.sys.ibm.pc.demos and comp.sys.ibm.pc.soundcard, there were some accusations flung around that Josh Jensen (Cyberstrike of Renaissance, for those who still remember the PC demoscene) had copied entire chunks of Mark J. Cox’s MODPLAY to use in his own mod player SuperProPlay (and later the MASI sound system). Just as time has a way of healing old wounds, advances in technology have a way of ripping them open again, and a chance encounter with some familiar assembly code in October got me thinking about the accusations against Jensen all those years ago. I didn’t give it much attention back then, but I’m a different person now, with much more skill than I had 20 years ago. With decades of x86 assembler, reverse-engineering, and programming experience under my belt, I decided to take another look at this issue to see if it could be answered definitively. I armed myself with much better RE tools (IDA) as well as Josh’s released Protracker Playing Source (PPS) v1.10 source code (PPS110.ZIP) and spent about an hour comparing it against a disassembly of MODPLAY.

My verdict: Josh absolutely copied entire chunks of MODPLAY for use in his own code.

When accused, Josh’s explanation at the time, paraphrased, was “it’s a modplayer, of course some things are going to be the same from player to player”, but that only makes sense at a high level. Yes, the basics of playing a mod are the same across all players, such as interpreting the data structures and effects, mixing four channels into a single output channel, etc. But the devil is in the details, and it is the details that point to copying. The code is not 100% identical at an assembler level, but there are some highly distinctive choices Mark made in the original MODPLAY that mysteriously show up in Josh’s source, such as the internal housekeeping and the inner mixing loop.

The mixing loop, I have found, is a good “fingerprint” for a modplayer, because almost every author implements it in a different way. There are bare-metal fastest-possible loops (such as the self-modifying fixed-length code of Carlo’s Galaxy Player), loops optimized for low memory usage (such as MODPLAY’s loop, which uses a MUL in the inner loop), loops that trade memory for speed (such as TANTRAKR, which uses a 128K lookup table to eliminate MUL), and everything in between. 4 channels or N channels? 32-bit mixing or 16-bit mixing? Logarithmic or linear volume tables? Cubic interpolation or linear interpolation? Just about every x86 mixing modplayer is different. Mark Cox chose to use a MUL in the inner loop and to make heavy use of memory variables because he knew his target was a 286 or later and could handle it. You can also tell from the MODPLAY disassembly that Mark was working in a vacuum, because his performance-sensitive code is nowhere near as optimized as it could be (sorry Mark!). Looking at Jensen’s source, you can see exactly the same methods at play, including the inner loop (although Jensen made a few tiny 1- and 2-opcode optimization changes here and there).
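To make the MUL-versus-table trade-off concrete, here is a minimal Python sketch. It is not code from MODPLAY, TANTRAKR, or any other player; it assumes signed 8-bit samples and the usual MOD volume range of 0 to 64, and every function name and the table layout are hypothetical.

```python
# Illustrative only: two equivalent ways to scale a signed 8-bit sample
# by a MOD volume (0..64). Names and table layout are assumptions.

MAX_VOLUME = 64


def scale_with_multiply(sample, volume):
    """One multiply per sample: tiny memory footprint, more work per sample."""
    return (sample * volume) // MAX_VOLUME


def build_volume_table():
    """Precompute every (volume, sample) result once, trading memory for speed."""
    return [
        [((index - 128) * volume) // MAX_VOLUME for index in range(256)]
        for volume in range(MAX_VOLUME + 1)
    ]


def scale_with_table(table, sample, volume):
    """Replace the multiply with a lookup; sample is offset to a 0..255 index."""
    return table[volume][sample + 128]


def mix(channels, scale):
    """Sum the scaled samples of all channels into a single output stream.

    channels is a list of (samples, volume) pairs; scale is one of the
    scaling strategies above, taking (sample, volume).
    """
    length = min(len(samples) for samples, _ in channels)
    return [
        sum(scale(samples[i], volume) for samples, volume in channels)
        for i in range(length)
    ]
```

Both strategies produce the same output; what differs between real players is which cost (cycles or memory) the author chose to pay, and that choice is what makes the loop recognizable.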

As much as I love the inner loop as a fingerprint, the most convincing evidence that copying occurred is actually in the most boring sections of both programs: general housekeeping (things like program startup/initialization, maintaining player state, etc.). Mark does something in MODPLAY that struck me as odd: he calls two tiny procedures to set some variables based on whether a mod has 15 or 31 instruments (labels are mine; I don’t have access to Mark’s source code):

sub_1E23 proc near
  mov sequence_offset, 1D8h
  mov word_124, 258h
  mov header_size, 258h
  mov num_inst, 0Fh
  retn
sub_1E23 endp

sub_1E3C proc near
  mov sequence_offset, 3B8h
  mov word_124, 438h
  mov header_size, 43Ch
  mov num_inst, 1Fh
  retn
sub_1E3C endp

That’s a weird way to set some vars. You don’t normally call a tiny procedure simply to set a handful of memory variables to fixed values; usually, you just set the values directly. I don’t know Mark’s motivation for doing it this way. This is a very unusual section of code that I wouldn’t expect to see again…

…and yet, Jensen follows the very same odd practice in PPM.ASM:

proc sd_Set15Ins
 uses ds
 mov ax,@data
 mov ds,ax
 mov [Word NumberInstruments],15
 mov [Word SequenceOffset],01D8h
 mov [Word HeaderSize],0258h
 ret
endp sd_Set15Ins

proc sd_Set31Ins
 uses ds
 mov ax,@data
 mov ds,ax
 mov [Word NumberInstruments],31
 mov [Word SequenceOffset],03B8h
 mov [Word HeaderSize],043Ch
 ret
endp sd_Set31Ins

Again, it’s not that the exact instructions or their order were copied; it’s that entire concepts were copied, and because Mark implemented them in a unique way, they stand out in Jensen’s code.
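For readers wondering where the constants in both listings come from: they fall straight out of the standard MOD header layout, so matching values are expected in any player. The short Python sketch below derives them; the function and constant names are hypothetical, and it assumes the common layout of a 20-byte title, 30-byte instrument records, two single-byte fields, a 128-byte order table, and a 4-byte tag that exists only in 31-instrument files.

```python
# Standard MOD header layout (sizes in bytes). Names are illustrative.
TITLE_SIZE = 20
INSTRUMENT_RECORD_SIZE = 30
SONG_LENGTH_AND_RESTART = 2   # one byte each
ORDER_TABLE_SIZE = 128
TAG_SIZE = 4                  # e.g. "M.K."; 31-instrument files only


def header_layout(num_instruments):
    """Return (sequence_offset, order_table_end, header_size) for a MOD file."""
    sequence_offset = (
        TITLE_SIZE
        + num_instruments * INSTRUMENT_RECORD_SIZE
        + SONG_LENGTH_AND_RESTART
    )
    order_table_end = sequence_offset + ORDER_TABLE_SIZE
    has_tag = num_instruments == 31
    header_size = order_table_end + (TAG_SIZE if has_tag else 0)
    return sequence_offset, order_table_end, header_size


for count in (15, 31):
    print(count, [hex(value) for value in header_layout(count)])
```

The middle value happens to match what the disassembly stores in word_124 (258h and 438h); what MODPLAY actually uses that variable for cannot be determined from the listing alone. The telling part is therefore not the numbers but the decision to set them through a pair of tiny procedures.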

As someone who has done a lot of cracking and reverse-engineering of vintage software (including, I’m ashamed to say, outright theft of other people’s code), I sense other subtle touches in Jensen’s released source that indicate large sections of it are not his original work. The most obvious is the switching between hex and decimal as the basic notation from procedure to procedure: the copied chunks favor hexadecimal notation, while the original code favors decimal. Also, throughout the code some lines are indented using 8-character tab stops while other lines use spaces for padding. That is indicative of generating a file using one padding style and then editing it using another, which would not typically happen if you wrote all the code from scratch.
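These stylistic tells can be counted mechanically. The following is a rough Python sketch of such a check, not a tool that was used for this post; the function name, the regular expressions, and the assumption that literals follow the usual assembler conventions (hex written with a leading digit and a trailing `h`, comments starting with `;`) are all mine.

```python
import re

# Assembler-style hex literal: leading digit, hex digits, trailing h (e.g. 01D8h).
HEX_LITERAL = re.compile(r"\b[0-9][0-9A-Fa-f]*[hH]\b")
# Plain decimal literal; digits embedded in identifiers do not match.
DEC_LITERAL = re.compile(r"\b[0-9]+\b")


def style_profile(source):
    """Count indentation styles and numeric-literal notations in assembly text."""
    profile = {"tab_indented": 0, "space_indented": 0, "hex": 0, "decimal": 0}
    for line in source.splitlines():
        if line.startswith("\t"):
            profile["tab_indented"] += 1
        elif line.startswith(" "):
            profile["space_indented"] += 1
        code = line.split(";", 1)[0]          # ignore assembler comments
        profile["hex"] += len(HEX_LITERAL.findall(code))
        without_hex = HEX_LITERAL.sub(" ", code)
        profile["decimal"] += len(DEC_LITERAL.findall(without_hex))
    return profile
```

Running this per procedure rather than per file would show whether the notation flips at procedure boundaries; a mixed profile is only a hint, since editors and habits change over the life of a project.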

In the last 20 years, Jensen has remained a professional programmer, and just as my skills and integrity have increased over that time, I have no doubt that his have increased as well. It is not my intention to libel Jensen as a whole; I simply wish to set the record straight regarding only one of his claims.

This entry was posted on November 8, 2012 at 10:20 pm and is filed under Demoscene, Programming.

2 Responses to “Reverse-engineering an old wound”

Aaron J. Grier said

November 18, 2012 at 6:10 pm
(replying here since my ISP dropped usenet access years ago, and inews needs some updating to support authentication for posting via eternal-september.)

aside from the look / feel copying between the two players, the thing that really made me suspicious was the handling of the effects. modplay had specific behavior with a few of u4ia’s mods, and I remember going back and forth with him and playing songs between the reference protracker on my amiga 500 and various players on the PC to figure out what effects were broken where. for a period of time, superproplay had the same effect bugs that modplay did, although they did eventually diverge as bugs were fixed.

- Trixter said
  
  November 19, 2012 at 12:49 pm
  I emailed Mark hoping to get a copy of the modplay source, not because I wanted additional confirmation but because I was really curious why he made some of the choices he made. So far, no response, but that isn’t indicative of anything (maybe my email got lost). I hope someday to see the real source, as there’s only so much you can infer from the disassembly.
  

Oldskooler Ramblings

the unlikely child born of the home computer wars
