The truth about netopsystems

Posted by Trixter on August 19, 2007

I was first made aware of Net Op Systems (currently going by the name of NOS Microsystems Ltd.) when downloading Adobe Acrobat Reader about 3 years ago. I was struck by how small the compressed deliverable was, so, being a compression hobbyist, I did some preliminary analysis and found that they used a considerable amount of context arrangement and prediction (e.g. “solid” mode in rar/7-zip, or the content-specific predictors in PAQ) to get the size down. I recently ran across their product again when downloading the most recent version of Solaris x86; it comes in a 1.1G NOSSO executable package. The sole payload was a 3.1G .ISO image, which meant the compressed deliverable was 37% of the uncompressed size. This is very impressive, given that the .ISO image is itself filled with a lot of compressed .JAR and .BZ2 files. Producing a compressed deliverable from which a workable .ISO file can be extracted means that NOSSO has to perform the following to work its magic:

1. Identify the various compressed files in the .ISO wrapper
2. Extract those compressed files
3. Decompress the content inside those compressed files
4. Arrange everything by context (e.g. all ASCII text in one group, all binary executables in another group, etc.)
5. Compress the entire thing to a proprietary stream, using content-specific prediction for various content groups
6. Store the original arrangement of the compressed and uncompressed content
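
To illustrate why arranging content by context helps, here is a minimal sketch. It is not NOSSO's algorithm, which is proprietary; it simply groups some made-up in-memory files by extension and compresses each group as one LZMA stream, in the spirit of "solid" mode. The function names and the sample payload are assumptions invented for this example.

```python
import lzma
from collections import defaultdict


def compress_separately(files):
    """Compress every file on its own; return the total compressed size."""
    return sum(len(lzma.compress(data)) for data in files.values())


def compress_by_context(files):
    """Group files by extension, then compress each group as one stream."""
    groups = defaultdict(list)
    for name in sorted(files):
        ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
        groups[ext].append(files[name])
    return sum(len(lzma.compress(b"".join(chunks))) for chunks in groups.values())


# Hypothetical payload: many small, similar text files plus a few binaries.
files = {}
for i in range(30):
    files[f"doc{i}.txt"] = (b"License terms apply to component %d.\n" % i) * 20
for i in range(5):
    files[f"blob{i}.bin"] = bytes(range(256)) * 4

separate = compress_separately(files)
grouped = compress_by_context(files)
print(f"separate: {separate} bytes, grouped by context: {grouped} bytes")
```

On a payload like this the grouped total comes out smaller, because redundancy shared between similar files is only visible to the compressor when they sit in the same stream.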

Upon extraction, the NOSSO distributable has to perform the following:

1. Decompress everything and keep track of it
2. RECOMPRESS the data originally found in compressed files, so that their effective format is kept the same. There may be small differences due to compression options and implementations, but as long as the end result is usable by the end program (e.g. a reassembled .ZIP file can still be decompressed into the same contents) then there’s no harm done.
3. Rearrange the end result back into the original container (in my case an .ISO file)

This is why they call their process “reconstitution” instead of “decompression”, because the end result, while functionally identical, is usually not bit-for-bit identical. By taking advantage of context and recompressing files from less-efficient formats into the more efficient format NOSSO uses internally, we can get these excellent compression ratios. (In fact, I’ll wager that, to speed up reconstitution times, they use a very fast and less efficient version of recompression of the files inside the target wrapper, which would inflate them slightly and result in even more “impressive” compression ratios :-)
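
The recompression step can be sketched with zlib as a stand-in. This is only an illustration under stated assumptions: it handles plain zlib streams, and the helper names (`find_zlib_level`, `pack`, `reconstitute`) are invented for the example. Whether NOSSO searches for the original parameters like this, or settles for a functionally equivalent result, is not something I can confirm.

```python
import zlib


def find_zlib_level(original_stream):
    """Return a zlib level that reproduces original_stream exactly, or None."""
    raw = zlib.decompress(original_stream)
    for level in range(10):  # zlib accepts levels 0 through 9
        if zlib.compress(raw, level) == original_stream:
            return level
    return None


def pack(original_stream):
    """Keep the raw data plus a level when reproducible, else the stream as-is."""
    level = find_zlib_level(original_stream)
    if level is None:
        return ("stored", original_stream)
    return ("recompress", zlib.decompress(original_stream), level)


def reconstitute(packed):
    """Rebuild the compressed stream from a packed entry."""
    if packed[0] == "stored":
        return packed[1]
    _, raw, level = packed
    return zlib.compress(raw, level)


payload = b"The quick brown fox jumps over the lazy dog.\n" * 200
original = zlib.compress(payload, 9)
assert reconstitute(pack(original)) == original
```

The raw data kept by `pack` is what a stronger compressor would then work on. The cost is visible in `reconstitute`: every embedded stream has to be compressed again on the end user's machine.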

What’s the downside? The downside is that this entire process defeats its own purpose. I’ll explain:

NOSSO is marketed as a delivery format that saves everybody bandwidth and, presumably, time. It’s that presumption that allows them to shoot themselves in the foot. While the compressed distributable only took 39 minutes to download on my 6mbit/s cable modem connection, it took a whopping 124 minutes to “reconstitute” on a 2.6GHz P4 with 700MB RAM free (out of 1G RAM total). My total time to get the end result was 163 minutes. (A 2.6GHz machine is not the bleeding edge in 2007, but it’s no slouch either, and is representative of the system most people use every day.) At its original size, 3.1G, it would have taken me only 104 minutes to download it.

It would have been faster to get the end result had it not been compressed at all.

Now, 6mbit/s is a pretty fast broadband connection, so I understand that skews the results a bit. With a more common broadband connection speed of 3mbit/s, let’s check the numbers again: Compressed download + extraction: 202 minutes. Uncompressed download: 208 minutes. Okay, so it’s break-even at a 3mbit/s connection. But break-even still involves 100% CPU utilization as the thing is decompressed, resulting in an unusable system for two hours, so it’s still not “free”.
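
The arithmetic behind these comparisons is simple enough to sketch. The function and constant names below are made up for illustration; the minute figures are the ones reported above, and the sketch assumes download time scales linearly with link speed while reconstitution time stays fixed.

```python
def total_minutes(download_min, processing_min, slowdown=1.0):
    """Download time scales with the link slowdown; local processing does not."""
    return download_min * slowdown + processing_min


# Figures reported above for the measured connection.
NOSSO_DOWNLOAD = 39        # minutes to download the 1.1G package
NOSSO_RECONSTITUTE = 124   # minutes of reconstitution
RAW_DOWNLOAD = 104         # minutes to download the 3.1G .ISO uncompressed

for label, slowdown in (("6mbit/s", 1.0), ("3mbit/s", 2.0)):
    packed = total_minutes(NOSSO_DOWNLOAD, NOSSO_RECONSTITUTE, slowdown)
    raw = total_minutes(RAW_DOWNLOAD, 0, slowdown)
    print(f"{label}: NOSSO {packed:.0f} min, uncompressed {raw:.0f} min")

# Slowdown factor at which both routes take the same total time.
break_even = NOSSO_RECONSTITUTE / (RAW_DOWNLOAD - NOSSO_DOWNLOAD)
print(f"break-even at {break_even:.2f}x slower than the measured link")
```

This reproduces the 163 vs. 104 and 202 vs. 208 minute totals, and puts the break-even point at a link roughly 1.9 times slower than the one I measured on.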

Is there any benefit to using compression at all? Let’s check both WinRAR and 7-Zip on the original 3.1G unmodified .ISO file:

7-Zip compressed size: 2.68G. Time to download at 3mbit/s: 187 minutes. Decompression time: 14 minutes. Total time to get the end result: 201 minutes.
WinRAR compressed size: 2.69G. Time to download at 3mbit/s: 189 minutes. Decompression time: 3 minutes. Total time to get the end result: 192 minutes.

So, at 3mbit/s, the end result was just about the same, except our system was only tied up for 3 or 14 minutes instead of two hours. We’d get even more compression at the same decompression speed if we burst the .ISO like NOSSO does, compressed the contents using WinRAR’s or 7-Zip’s “solid” mode, and then reconstituted them back into an .ISO when done with a small utility program.

My conclusion from all this is that there’s really no point in using NetOpSystems’ product, unless the end-user’s broadband speed is 1mbit/s or slower. But if it’s that slow, the user is already used to ordering DVD-ROMs for delivery instead of trying to download them, right? Or, if the user downloads them anyway, they’re used to firing them off before they go to bed, to download overnight. So, again, no need for the product…

…unless you’re the content producer and want to transfer cost (bandwidth) to the end user (time). Which is probably why NetOpSystems is still in business.

This entry was posted on August 19, 2007 at 5:54 pm and is filed under Technology.

16 Responses to “The truth about netopsystems”

phoenix said

August 19, 2007 at 6:41 pm
Just looking at that first webpage, it’s clear that this product is about marketing and business, and not the end-user experience. So your reaction is probably typical but of little concern to the content producers. Hopefully we’ll continue to enjoy a variety of options.

James said

August 19, 2007 at 7:07 pm
A wonderful review of something which has always annoyed me!

In the UK, Virgin Media (which recently took over NTL) have announced they are upgrading all 10Mbit customers to 20Mbit. Every time I go to download something from Adobe I come to the same conclusion as you; it would be faster for me to just download the data since the ‘reconstruction’ takes so much longer, even on a quad-core machine with 4GB of RAM!

mpz said

August 21, 2007 at 2:32 pm
Oh, they don’t care about *your* bandwidth or *your* CPU time that goes to waste. All they care about is *their* bandwidth that they have to pay per gigabyte for. NOSSO helps quite handily there..

(Insert obligatory rant about proper content delivery systems like Akamai or even BitTorrent..)

BTW, I’m pretty sure the end result has to be very nearly if not exactly bit identical to the original. The ISO file has a self-contained filesystem inside it; if the “re-constituted” zip/jar and bz2 archives were sized any differently from the originals, the filesystem references to file sizes and starting sectors would have to be fixed too. This is not an impossible feat with ISO files, but it would certainly be impossible with proprietary file formats that just embed ZIPs etc..

Therefore I would guess that they simply figure out the exact parameters and if those are not found, store the files as they are. This is probably the easier way because most ZIP files on the internet and on *nix distribution CDs/DVDs are compressed with zlib (infozip or gzip) – searching through the few options shouldn’t take a prohibitively long amount of time during the compression phase. Chances are most ZIPs are compressed either at the default level or -9.

Trixter said

August 21, 2007 at 9:19 pm
James: Amazing that you have 20mbit to your house!! The best we can get in the USA is 6mbit (you can get more but you have to pay business costs, ie. $1200 a month or more).

mpz: They’re definitely not leaving compressed things alone because the .ISO in question is 80% compressed files (*.JAR and *.BZ2) so if they just left it alone, it would be 2.8G instead of 1.1G. Which is why the RAR/7-Zip tests showed 2.8G. So they most definitely recompress the content, and re-recompress it into the original file format upon “reconstitution”.

mpz said

August 22, 2007 at 10:47 am
Oh, of course they do, I was just saying that they pretty much *must* reproduce the original data exactly (at the “reconstitution” phase, in other words re-recompressing into the “original” zips and bz2s) – which isn’t that big of a feat (there are also others that do it like Precomp and so on).

Brolin Empey said

August 23, 2007 at 11:14 pm
Trixter: I am curious how your cable modem can transfer less than 1 bit per second, considering that a bit is a fundamental unit of information. ;) ‘m’ = milli, ‘M’ = mega. Thus, mbit = millibit.

At least “mbit” is clearly a mistake — “Mbit” was intended — because of the bit’s status as a fundamental unit of information.

However, using SI multiplicative prefixes without a unit is poor form. Granted, it is usually assumed that such quantities are using units of bytes in the context of data storage, and bits in the context of data transmission. Regardless, it is better to be explicit and unambiguous than implicit and potentially ambiguous — especially when being explicit requires the use of only a /single character/ more. :P

You are lucky that you did not mention HDD capacities. If you did, you would have been mixing decimal usage and binary misuse of SI multiplicative prefixes. :)

The CPU utilisation might become less of an issue as multi-core PCs become more common. All Intel Macs, for example, have /at least/ 2 logical processors.

mpz: .bz2 files are not archives! :) bzip2 is used to compress uncompressed archive files, such as .tar files. This is because both (GNU) tar and bzip2, unlike zip archivers such as PKZIP and InfoZIP, follow the Unix philosophy: each program should do only one task (archiving xor compression), and do it well.

Trixter: Your wildcard globbing patterns will not work on a standard, case-sensitive *nix file system, since *nix archive files (well, *nix files in general, unless they were e.g. created on DOS, which uses SHOUTING names :)) use lowercase file extensions. ;)

JAR archives, like ZIP, can be created without compression (store only), so it is possible that some of the .jar files in your disc image are not compressed. Granted, the default seems to be to use compression with at least the jar program from sun-jdk-1.5.0.06, which has a -0 option to store only.

Ruairi (rc55) Fullam said

February 2, 2008 at 11:53 am
Brolin: Good lord, your attention to detail is wasted on picking at Jim’s blog posts!

Jim: Fascinating post! I wondered how they did it also. It did make me think about how much you could optimise things further – I always wondered why no one worked on a flexible compression scheme (a lossy system perhaps), and further methods to repack compression schemes, being more aggressive with huffman tables in mp3 (I think there was a tool called Rehuff that did that and shaved off 1-3% iirc).

The only context in which I’ve seen this done is manually ripped warez releases where audio is recompressed as well as having video stripped out, although I think it stopped with Dreamcast releases (as you well know the GD-ROM format is more spacious than that of CD-R).

Alex said

March 23, 2008 at 10:13 am
I’ve just looked for myself and come to a different conclusion to you lot – Nosso talk about GetPlus which as far as I know hasn’t been used for Adobe or the Sun Solaris ISO (where I got my interest in digging around a bit more) – so yes, in those particular cases the companies were just interested in reducing their bandwidth bills rather than improving the result for end users. However, both those cases are ‘free’ tools – one is large distribution and one is simply a large download.

I say if there is a business to be made from saving companies bandwidth bills then so be it, and if it means they don’t have to change their current distribution models or dabble in bittorrent then I can understand why they’d go for it.

Jorge said

October 5, 2008 at 12:28 pm
GetPlus _is_ used with Acrobat 9.

I came across this page while searching for information on NOSSO. Man, I have hated that stuff for ages. It’s _so_ slow and annoying. I would think they should worry more about customer perception than saving a tiny bit of bandwidth. Currently it looks like their software isn’t very well designed because it’s so slow to install.

H said

November 30, 2008 at 8:27 am
I worked for NOS in 2004 and was one of those who were responsible for the Mac OS X release and worked also on the MS Windows version of NOSSO (whose name was FEAD back then).

I’m still bound by NDAs so I’m not allowed to say much, however your observations are mostly correct. As those were the worst months in my career as a software engineer (it’s one of those classical graduate-grinders), I wouldn’t be able to say many nice things anyway. ;)

Their products become obsolete as bandwidths grow so I’m leaning back and enjoying their fall. ;)

- Trixter said
  
  November 23, 2018 at 10:14 pm
  H, your NDAs are up. Any chance you could elaborate? ;-)
  
Trixter said

November 30, 2008 at 7:58 pm
Thanks for the confirmation that my observations were correct :-)

Max said

June 24, 2009 at 2:20 am
The latest versions of Adobe Reader use a new version of Nosso – they seem to start the reconstitution while still downloading, which substantially reduces the processing after the end of the download.
As you say, still mainly of benefit to the publisher rather than the user.
I think they must have to reconstruct a bit-for-bit copy, as usually all the files in an installation package have CRCs.

Yuhong Bao said

September 22, 2009 at 10:33 am
“The latest versions of Adobe Reader use a new version of Nosso – they seem to start the reconstitution while still downloading, which substantially reduces the processing after the end of the download.
As you say, still mainly of benefit to the publisher rather than the user.”
A good step forward, but I have a better idea. How about benchmarking both the system and the internet connection and using that to determine how much compression to use? If for example the system is slow but the internet connection is fast, less compression can be used, but if it is the opposite, more compression can be used.

Trixter said

September 28, 2009 at 10:28 pm
A good idea, but as I suspected, this would only benefit the end-user and not the publisher. The publisher will always want to save the most bandwidth.

yuhong said

February 13, 2014 at 10:11 pm
To be honest, I think the first version of Acrobat Reader to do this was 6.0, which dates back to 2003. I think dial-up modems were still common back then.


Oldskooler Ramblings

the unlikely child born of the home computer wars
