Open Video Codecs and Flash

8 05 2008

General
Theora
Xvid
OMS Video
Dirac / Schrödinger
Flash / flv / f4v

General

When a standard is open, anyone can adopt it and be sure that their data isn’t locked away behind a specific company they must deal with to access their own content. Open standards are what run the Internet. The problem is that being an ‘open standard’ isn’t all that’s required. H.264, for instance, is an open standard, but it’s not royalty-free: there are patents on it, and implementing it requires a licensing fee. While these licenses are cheap and easy for companies to obtain, which makes them attractive, they block the formats for the non-commercial open source community. You are still allowing a third party to dictate the requirements for access to your data.

This is where the much-hated software patents come into it. You cannot distribute patented software in binary, precompiled form, as a patent has to be applied to a physical object (thanks to a court case in America, binary code somehow now counts as such; other countries have various laws, but America is where Silicon Valley is, so we all lose out, and some countries seem to be specifically making exceptions to allow patents to be applied to computers). You can distribute patented software as source code, since source is not an actual implementation of it. Whether you can legally compile that code for personal use varies from country to country; there is some discussion on that here.

Firstly, it’s important to have a royalty-free, unencumbered codec for streaming video so that things such as Firefox and Linux/Unix distributions can legally play back these formats. Patents are the reason that you have to install codecs separately in order to support MP3 playback (which in newer distributions is a lot easier and is set up automatically). Commercial distros can afford to pay the patent license fees, but this isn’t much help for the open source community or hobbyists. Ubuntu/Debian/Fedora/Gentoo/Arch/BSDs etc… aren’t commercial distros; they don’t charge you, so they can’t pay for the codecs, and even if they could pay for them, the media would still be in a format that is locked away, accessible only at the whim of the patent holder.

Since the HTML 5 draft (due to be finalized 4 years from now, in 2012) includes video streaming, having a decent open codec is more important now than ever before. Originally the draft mentioned the use of Ogg; however, Nokia and Apple raised objections, citing concerns about hidden ‘submarine patents’, a low compression ratio and the lack of hardware decoders. Nokia wants support for H.264 (which also happens to be the codec Apple is already using for iTunes/iPod video, along with AAC for audio) or, alternatively, leaving streaming video out and letting corporations fight it out. As a royalty-bearing format, H.264 is impossible to include in the standard.

As for the royalty-free video codecs that are around, we have Theora, Dirac and OMS.

Ogg Theora

Firstly there is the oldest and most widely known codec, Theora, often referred to as “Ogg Theora” as it’s contained in the Ogg container format. It is not to be confused with Ogg Vorbis, an audio codec designed to be a royalty-free alternative to MP3, which also lives in the Ogg container format and is often used to provide the audio for Theora videos.

Theora is a project of the Xiph.Org Foundation (also responsible for the royalty-free codecs FLAC, for lossless audio, and Speex, a voice audio codec with an extremely good compression ratio). It’s based on VP3, which was donated to the public by its creator On2, who dropped all claims on it.

Unfortunately it seems that Theora is now out of date and has fairly bad compression compared to other codecs. Xiph.Org are apparently working on an improved version of Theora for HTML 5, but with the binary format locked for compatibility it’s unclear to me whether it can be improved enough to reduce file sizes and improve quality, or whether it’s just work on improving the tools around Theora.

Xvid

Xvid is apparently a royalty-free codec. It was forked from the OpenDivX code when DivX 4 went closed source. The problem is that Xvid is based on the MPEG-4 standard, which has two dozen companies claiming patents on it, and licenses are apparently no longer being offered.

OMS Video

Sun’s Open Media Commons recently announced OMS Video, an open codec. It has an audio component, and the video component is based on H.261, which is now out of the 17-year patent restriction, with newer unpatented technologies added on top. Currently there isn’t anything from them code-wise. Another worry is another Open Media Commons project, DReaM, which is a DRM specification. As far as DRM goes it seems less evil, since it’s designed to be open and royalty-free itself, but it’s still DRM. In the end, as long as the DRM isn’t built into OMS it shouldn’t be a problem, but I have a small concern that they will use OMS as an infection vector for DReaM. The announcement and specification overview don’t mention DReaM at all, other than saying it’s also part of Open Media Commons, so it’s probably not an issue, but it’s worth watching. Fortunately DRM is its own worst enemy. DReaM is supposed to bring an open, royalty-free DRM system to allow music to interoperate, but DRM seems less about protecting music and more about online music retailers locking clients to their systems/devices: once someone has a whole database of DRM’d songs they will have to buy hardware that supports it forever and keep shopping at the same place. They can never leave (at least not without breaking the DRM or losing all their music). You can read more about why DRM sucks at the Defective By Design website.

Dirac / Schrödinger

The BBC, who have been experimenting with streaming video, created Dirac (wikipedia), which is designed to be completely unencumbered by using patent-free technologies. Wikipedia says it is in the same range of compression as H.264. There is an implementation of Dirac called Schrödinger, which has libraries and GStreamer plug-ins, and the intention is to get it into the Ogg container.

Flash / flv / f4v

Recently Adobe, with their Open Screen Project, opened Flash and the flv/f4v format for use without license restrictions; the swf and flv specifications are already published. This is great news for projects like Gnash; however, my main concern is that flv has technologies using patents in it. For instance, flv in Flash 9 supports AAC for audio, and the Wikipedia article on AAC says:
“…a patent license is required for all manufacturers or developers of AAC codecs, that require encoding or decoding. It is for this reason FOSS implementations such as FAAC and FAAD are distributed in source form only, in order to avoid patent infringement.”. This makes it seem that even though the license restriction is removed, the open source community will benefit from having the APIs available but won’t be able to actually make a binary version of the Flash client. You won’t be able to expect Flash to be built into Firefox or shipped with Ubuntu. The real clients of Adobe will still likely need a license from Adobe, unless they want to go to the individual patent holders (such as those for AAC) and independently obtain licenses, which is likely to end up costing more. Another format used is MP3, which has a whole load of patent issues: the MP3 decoding patents run out around 2012 and the encoding ones later, around 2016 (I’ve seen various different dates but they’re fairly close; there is a big list of MP3 patents but it doesn’t say what is needed for decoding/encoding and what’s optional, and the latest is 2017). flv also uses yet another commercial proprietary codec, Nellymoser.

These are just the audio codecs. For the video there is H.263 since Flash 6 and, as of Flash 8, VP6. I haven’t found much information on the license issues around them but they do seem to be patented. Wikipedia says “As of September 2006, an open-source implementation of the decoder is part of the libavcodec project, though producing or dealing with VP6 video streams inside libavcodec/libavformat seems to be discouraged and/or refused due to clashes between the ffmpeg’s developers and On2 technologies by a claim of Intellectual Property and Trade Secrets Infringement made by the corporation itself.”

As for Flash itself, I have no idea what other patents on the technology exist, given that we live in a world where anti-aliasing fonts is patented. For Flash to really be open source friendly, we would need to see it adopt patent-free codecs for flv (such as Dirac, Vorbis, Theora or OMS).


code, code.back, code.back2… – A better way with Revision Control (svn/git/bzr/hg tutorials & comparisons)

18 06 2007

This article is very long. It covers some basics of what revision control systems (RCS) / source code management systems (SCMS) are, a basic tutorial on using subversion for a personal repository, what distributed systems are, the basics of using git, bzr and hg for a personal repository, and my comparisons of them. It’s only a basic introduction; I’ve never had to manage any large complex projects, so advanced stuff isn’t covered (plus it’s long enough).

If you program and don’t use some kind of RCS you are making your life much harder than it should be. RCSs are great, and distributed ones are greatest. All you need is to learn a few steps to set up a repo, and somewhere to put it; anything with ssh can be used, or just the local disk.

Even for non-programmers, if you find yourself making changes to config files often, then having a repository containing them is definitely a good idea. If you botch an edit, you can always revert to the previous version and compare the two with diff.
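As a rough sketch of the config-file idea, here is how it could look with git (one of the systems covered later in this article), run in a throwaway directory. The file name app.conf, its contents and the commit identity are all made up for illustration.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

git init -q
# Identity for this scratch repo only (placeholder values).
git config user.name "Example User"
git config user.email "user@example.com"

# First known-good version of a hypothetical config file.
printf 'port = 8080\n' > app.conf
git add app.conf
git commit -q -m "Working config"

# Botch the file, then look at what changed.
printf 'port = eighty\n' > app.conf
changes=$(git diff -- app.conf)
printf '%s\n' "$changes"

# Throw the bad edit away and return to the committed version.
git checkout -- app.conf
restored=$(cat app.conf)
printf 'restored: %s\n' "$restored"
```

The same pattern should work with any of the systems described below; only the command names change.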

Introduction to RCS

Originally when I would code, I would intermittently ‘cp -rf directory directory.backup’, so that if I screwed up my code I could always go back. This worked fine for my smaller projects, at least until one particularly painful uni assignment (SunRPC will segfault on anything). Eventually I had reached backup.22, and often I had to go back a few revisions. That was not an easy task, because I wouldn’t remember the exact number and I had often made more than one change to the code, like adding comments to everything. Because the code had started to randomly segfault, I ended up creating even more backups for changes as small as a single added comment. I’m sure there was a simple memory leak, but with the deadline a few hours away I didn’t have time to hunt it down (basic gdb wasn’t working because it was the SunRPC libs that were crashing). In the end I got my assignment in (although it was probably my worst mark on an assignment yet; once again, I hate RPC).

After that I decided to try using a revision control system. Previously I had never actually thought about using one for my simple coding and just assumed they were only needed for larger projects. The only times I had encountered them were when I occasionally grabbed some code because I needed something newer than what was shipping with my Linux distribution. However, while googling for stuff about uni I managed to find this website from another student about setting up SVN for projects. I had only previously used svn for grabbing code from public projects; I had also used cvs, although it was fairly clear that cvs was a fairly outdated system.

SVN – Subversion tutorial

SVN works rather well for me, as it is on the systems at uni and can be tunneled over ssh, so I can push/pull to/from my server at home. The basic functionality of svn allows for going back to any previous revision with ‘-r #’, coding from any system that can connect to the repository (all I need to do is checkout/update it), and seeing a ‘diff’ between revisions to see what I changed.

Unfortunately subversion isn’t distributed (explained later), so I wouldn’t recommend it, but understanding the basics of revision control is important, so I have instructions on using it here. The same basic outline of commands is used by most of the revision control systems around, with a few minor differences. I might use svn as a basic repository for editing config files, but any of the distributed ones would work just as well.

To store your code you can use any system you have either direct access to or ssh access to (http etc… also work); I’m using ssh in this tutorial.

SourceForge (from the owners of Slashdot) provides free public svn (and cvs) hosting for open source projects, including bug tracking and basic forums. I haven’t found the site very nice to navigate, although you can just host a normal website on it.

Setting up a personal local repository is easy.

Firstly we need to make a svn folder where all the other svn projects will live:
mkdir ~/svn

Then we need to make a repository for the project:
svnadmin create ~/svn/PROJECTNAME

Next is importing the current code. You do this from the directory where your code lives, not the one svn created. Make sure you clean up any unneeded files like binaries and generated output first:
svn import . file:///home/USERNAME/svn/PROJECTNAME/trunk -m "Initial commit."

Notice that it is going into the sub folder trunk. This is important because later on you might need to tag code, so you might end up with /trunk/, /1.0rc1/ and /1.0/; you can just put the code in the main directory if you don’t want this kind of functionality. Make sure there are 3 /’s in the URI: normally the server name goes after the first two, but since this is local there isn’t one. You must also specify the full path to your folder. -m is the commit message that describes the changes for the revision.

You can also use svn+ssh://USERNAME@SERVER/home/USERNAME/svn/PROJECTNAME/trunk if you want to do it over ssh.

The next step is to check out your repository. Even though you have a local copy, you still need the subversion metadata (annoying URL prefix, I wish it was just ssh://):
svn checkout \
  svn+ssh://USERNAME@SERVER/home/USERNAME/svn/PROJECTNAME/trunk \
  PROJECTNAME

This time I’m doing it over ssh; once again, remember that it’s coming from the trunk folder. The trailing PROJECTNAME makes svn name the checked-out directory PROJECTNAME rather than trunk. co is a shorter alias of checkout if you’re excessively lazy.

That’s the hard bits done. From now on it’s very simple, as all the information about where to upload is stored in the .svn folder in your project.
Now you just edit your code, and once you’re happy with the changes you type:
svn commit -m "Description of changes."

When you create a new file that you want in the repository, you must first tell svn manually that you want to add it. This avoids accidentally uploading compiled binary files or files output by your program:
svn add filename

To update to the version of the code in the repository (or a particular version with -r#):
svn update

To see the difference between revisions, you can also specify a particular revision with -r:
svn diff

To see the logs:
svn log

To make a tag:
svn copy \
  svn+ssh://USERNAME@SERVER/home/USERNAME/svn/PROJECTNAME/trunk \
  svn+ssh://USERNAME@SERVER/home/USERNAME/svn/PROJECTNAME/1.0

Distributed Revision Control Systems

SVN was a massive improvement for managing even simple personal code, and I used it for several months without issues. However, there is a new breed of RCS appearing: distributed ones. There are currently 3 main contenders:

Git – Made by Linus for maintaining the Linux kernel, also used by KDE.

Mercurial – Run with the command ‘hg’, a popular Python based one, used by Sun (for hosting of Java).

Bazaar-NG – Or ‘bzr’, Python again, as used on launchpad.net and by the Ubuntu community (they’re all Canonical made).

There is also darcs (written in Haskell), GNU Arch, and monotone. There is a Wikipedia article listing various revision control software (commercial/free and central/distributed).

Being distributed means that when you check out a central repository, you actually have your own local repository rather than just a copy of the code from it, so you can commit changes without having access to the central repository. This allows for much easier experimentation, as you can quickly branch off from your local repository, and it’s useful for people with laptops who might not have an internet connection. With subversion, you can check out a repository but then you’re stuck with the one version; you can only commit back to the main repository, and the most you could do is try copying the directory and other painful workarounds. Also, there isn’t technically a ‘central’ repository, although there will generally be an official one everyone downloads from. These are still handy features to have even when it’s just for personal use; for instance, a simple ‘svn log’ needs to talk to the central server, which can take some time if it’s a large repo and/or is over a slow connection.

Speed-wise, Git is currently the fastest for most operations, as it was designed for maintaining the massive Linux kernel. Next fastest is Mercurial and then Bazaar (which is planning to match Git speeds in its 1.0 release). However, for most simple projects speed isn’t that much of a requirement; as long as it’s not tediously slow for simple changes, any of them should work fine.

The functionality of all of these is fairly similar: you tell it who you are, you init the original source directory, commit the initial repository, then you can check out from anywhere with access, branch off code, modify code, commit it, merge it back into the master branch, and push it to the server. You can also review logs, see changes with diffs etc…

Most of these support checking out code from repositories of a different type, although you might need a plugin. You can also convert between systems with tailor, although you might lose some information.

In the end it’s probably just personal choice which one you prefer, as they all offer the same basic functionality.

DRCS Tutorials

Git

Firstly, there is a great talk from Linus about Git on Google video; it’s 1hr 10min long. It might be somewhat dated, however: some of the functionality talked about might have been implemented or sped up since then (for instance, pushing in Git now exists).

Git is written in C and is currently the fastest. It is probably best suited to larger projects. However, some of Git’s more advanced features are a bit harder to use and understand (although not by too much for basic usage), so it might not be suited to the less experienced user. The speed improvements in Git are apparently lost on Windows systems, as they rely on some specific methods of disk access (unless this has been fixed in newer versions), so Windows or multisystem developers might want to avoid it.

If you want, you can get free public Git hosting here, although its only a very basic service currently.
UPDATE: There is also github which has a free opensource developer plan (100mb, no private repos).

A nice thing about Git is that it keeps all your branches in the same folder. With bzr/hg, when you branch off it creates a separate folder for that branch. You could keep them all in one main project folder (for Bazaar you can create a repository that stores all your branches, saving space by sharing common files), but with Git everything is in the one folder by default, making for a much tidier feel: branches you aren’t working on are tucked away and you switch between them fairly painlessly with the checkout command. It might require a bit more effort to work between 2 branches at once, however.

Git also has nice SHA-1 ids for everything, so you can tell if things become corrupt, and it generally views all your code as one thing rather than as individual files, so it can track changes to a function even if it’s moved from one file to another.
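To illustrate the ids: git names a file’s contents by the SHA-1 of a short header plus the data. The sketch below recomputes that by hand with sha1sum (from GNU coreutils; other systems may ship shasum instead) and compares it with git’s own answer. The sample text is arbitrary.

```shell
# Ask git for the object id of some content (no repository needed).
git_id=$(printf 'hello\n' | git hash-object --stdin)

# Recompute it by hand: SHA-1 over "blob <size in bytes>", a NUL byte,
# then the content itself. 'hello\n' is 6 bytes long.
manual_id=$(printf 'blob 6\0hello\n' | sha1sum | cut -d ' ' -f 1)

printf 'git:    %s\nmanual: %s\n' "$git_id" "$manual_id"
```

Because the id is derived from the content, any corruption of the stored data no longer matches its id, which is how git can notice it.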

You can ‘apt-get install git-core’ on Ubuntu/Debian; however, it’s out of date, so the instructions will vary. For a newer version you can get the code from the site and compile it from source.

Firstly tell Git who you are (and enable pretty colours); the following is for newer versions of Git:
git config --global user.name "YOURNAME"
git config --global user.email EMAIL@DOMAIN.com
git config --global color.diff auto
git config --global color.status auto
git config --global color.branch auto
Note that those all use a double dash (--global), not a single long dash; WordPress screws it up.

To initialize the current code directory (older versions use ‘git-init-db’):
git init

When committing to Git, you need to maintain an index of files that are to be committed, and you use the ‘add’ command to do this. In svn you only need to add new files, but in Git you also need to add changed files. However, rather than adding changed files manually you can use ‘commit -a’, which will automatically add the changed files to the index (but not newly created ones). Since all your files are new in this initial import, you need to add them:
git add .

Then commit them:
git commit
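A small sketch of the difference between ‘add’ and ‘commit -a’, run in a scratch directory. The file names and the identity are placeholders, and -m is used only so that no editor opens.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

printf 'one\n' > tracked.txt
git add .                                 # first import: everything is new
git commit -q -m "Initial commit"

printf 'two\n' >> tracked.txt             # change a file git already knows
printf 'new\n' > untracked.txt            # create a file git has never seen

git commit -q -a -m "Commit changed files only"

# tracked.txt went in; untracked.txt is still waiting for 'git add'.
status=$(git status --porcelain)
printf '%s\n' "$status"
```

The leftover entry in the status output is the newly created file, which ‘commit -a’ deliberately leaves alone.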

When you want to grab your code from a remote repository and put it in the current directory, use:
git clone ssh://SERVER/home/USERNAME/git/PROJECTNAME

Enter your directory, you can then make a branch for hacking on:
git branch BRANCHNAME

View your list of branches:
git branch

Then you switch to that branch:
git checkout BRANCHNAME

Modify some code and check it into your local BRANCHNAME branch:
git commit -a

Switch back to your original local branch:
git checkout master

Merge the changes into the master branch:
git merge BRANCHNAME

Delete the extra branch (-D will force it to delete if you didn’t merge it):
git branch -d BRANCHNAME

Push the branch to your server:
git push ssh://USERNAME@SERVER/home/USERNAME/git/PROJECTNAME
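The branch, merge and push steps above can be tried end to end without a server. In this sketch a local bare repository stands in for the ssh server, and the names (project, experiment, main.c) and the identity are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A bare repository stands in for the server in this sketch.
git init -q --bare server.git

git init -q project
cd project || exit 1
# Newer git versions may pick a different default branch name;
# point HEAD at 'master' so the commands match the text.
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

printf 'int main(void) { return 0; }\n' > main.c
git add .
git commit -q -m "Initial commit"

git branch experiment                     # make a branch for hacking on
git checkout -q experiment                # switch to it
printf '/* tidy up later */\n' >> main.c
git commit -q -a -m "Experimental change"

git checkout -q master                    # back to the original branch
git merge -q experiment                   # fast-forwards here, no conflicts
git branch -d experiment                  # safe: the branch is merged

git push -q "$workdir/server.git" master  # publish to the stand-in server
branches=$(git branch)
printf '%s\n' "$branches"
```

With a real server the only change should be the push target, which becomes the ssh:// URL shown above.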

There’s some more tutorial information on Git here.

Bazaar – (bzr)

Bazaar, written in Python, is probably the slowest of the 3; however, the current project roadmap for 1.0 is to match the speed of Git, so there might be some improvements appearing. There are benchmarks here showing big speed improvements up to 0.15, though they don’t cover 0.16/0.17, which also list more performance improvements in their changelogs. I haven’t found any videos on Bazaar, but there have been three posts from Shuttleworth recently on Bazaar as a lossless RCS.

For public Bazaar hosting there is launchpad, which has bug tracking and such for projects, and can store personal user branches.

Bazaar seems fairly simple to use, I haven’t needed any of the more advanced features but it seems like advanced stuff would be simpler under Bazaar than Git, but for the simple stuff there isn’t any major difference.

Firstly set your name:
bzr whoami "Your Name <EMAIL@DOMAIN.com>"

Enter your source code directory and initialize it:
bzr init

Add the files to the index:
bzr add .

Commit the branch. This same command is also used to commit code after it’s modified; by default it will add all changed files to the index, like -a in git:
bzr commit

You can create a repository to store branches; this allows you to save space by sharing the common files between them:
bzr init-repo REPONAME
cd REPONAME

Now you can branch off from your remote branch into the local repository. Notice it’s sftp for ssh now, a different standard for the same thing again, although you can use ~ for the home folder now. There is also bzr+ssh://, which doesn’t seem to need the paramiko library, but I’m not sure of the difference between them other than that:
bzr clone sftp://USERNAME@SERVER/~/bzr/PROJECTNAME

In addition to ‘clone’, you can also use ‘checkout’. This means that any changes you commit, as well as being committed to the local branch, will also be committed to the branch you checked out from, if possible. This is somewhat similar to svn, except that changes are still committed to the local branch regardless of the remote branch being accessible (unless you use --lightweight, in which case it works just like svn and everything depends on the remote branch working). You can also use checkout inside a branch to obtain the latest committed version of that branch in the working directory, which is sometimes needed if you push branches, as a push will transfer the .bzr directory with the revisions but not the working tree.

You can fork off from your local branch for experimental coding, which will make a separate folder in the repository:
bzr clone PROJECTNAME PROJECTNAME-testcode

Then after coding, change to the main local branch directory and merge:
bzr merge ../PROJECTNAME-testcode

Then you can push the local branch back to your server’s branch:
bzr push sftp://USERNAME@SERVER/~/bzr/PROJECTNAME

Also see the official Bazaar tutorial.

Mercurial – (hg)

Mercurial works basically the same as Bazaar. There’s a Google video tech talk on it here (50min).

Thus you must firstly identify thyself:
echo -e "[ui]\nusername = YOUR NAME <EMAIL@DOMAIN.com>" > ~/.hgrc

Changeth to thy source code directory and initializeth with:
hg init

Addeth ye new files to thy index:
hg add

Commiteth thy files to thy repo:
hg commit

Snag your remote repo to a local location:
hg clone ssh://USERNAME@SERVER/~/hg/PROJECTNAME

Branch off your local main to a secondary branch:
hg clone PROJECTNAME PROJECTNAME-testcode

Modify some code, and commit to the secondary branch with:
hg commit

Change back to your primary local branch and merge (this needs 2 commands):
hg pull ../PROJECTNAME-testcode
hg update

Push it to your remote repo:
hg push ssh://USERNAME@SERVER/~/hg/PROJECTNAME

Official Mercurial Tutorial.

Finally

After trying out all 3, I found them to be very similar to each other, and any would be suitable for most purposes. You could probably pick one at random and be happy, or choose one based on the public services that are available, such as launchpad. I will probably end up using bzr. hg seemed to make merging a bit more of a pain, requiring an extra step, and the ‘merge’ command somehow changed from the docs to the ‘update’ command; the aesthetics of the output also weren’t as good, but that’s a bit nitpicky. Bazaar’s rapidly improving speed should see it ahead of hg if they meet their goals. I also liked Git quite a lot and might use it for some stuff, but it isn’t available on the Solaris systems at uni and requires 22mb just for the basic binaries, which is too much for me to install locally (50mb directory limit). I do favor the approach of having all the branches in the one local location rather than making a whole new one each time, as it cuts down on the appearance of clutter.

If you are looking for public hosting for your code with a repository of your choice, you can check this wikipedia article which shows a handy list of hosts and what systems they support.


SunRPC beginner tips

13 06 2007

RPC (Remote Procedure Calls) are used for making client/server programs where the client can call a function on the server without having to implement its own network code. The programs link in with rpcbind, which runs on just about every system nowadays. It works by taking a simple msg.x template, running it through the code generator rpcgen and linking the results with the client and server code.

I had to do 2 assignments this year for my distributed computing class using Sun’s RPC on Solaris.

RPC is extremely painful and there isn’t too much in the way of beginners’ resources. The server kept core dumping on me, and the bit causing the problem was often in the RPC libs rather than my code. Normally this is caused by breaking memory allocation, but it’s very hard to track down and gdb doesn’t really help.

A few tips I picked up while working on Sun RPC:

You can stop the server from backgrounding, allowing you to use standard printf’s for debugging. Just add -DRPC_SVC_FG to your server’s compile line.

Correct memory management is crucial, any leaks no matter how minor will cause crashes with RPC.

Don’t leave any pointers undefined, even if you haven’t put anything into them and are using a size variable of 0 (such as empty arrays/strings); NULL them. SunRPC will try to free() any pointers after receiving the struct back, which can cause your client to crash when it receives a successful response, because it frees the memory before putting the actual response struct data in there. Since the pointers are undefined, it will try to free random memory and segfault.

Make sure this applies to both the sent and the result struct. The server should initialize the result struct itself, as you can’t guarantee the client has passed a valid empty struct.

Make sure you test more than one remote function in a row; quite a few memory errors will not manifest until the server attempts to process the next request AFTER the bad function has worked. If your server is segfaulting on a function, make sure it’s not actually the previous function that is the problem.

It’s much easier to use one universal message struct for all the remote functions to pass back and forth rather than making a new struct type for each function. Possibly slightly less efficient, but not by too much for simple projects.

It’s much easier to let the server do all the work: if the information is just getting printed to the console, pass back a string rather than a struct with the information in it. Probably not good practice for real life usage, but much easier for learning basic RPC.

The Linux rpcgen seems to be fairly horrible. Most of my code wouldn’t work with it, and it doesn’t seem to support generating stubs. Maybe there is an entirely different approach to programming RPC on Linux, but I couldn’t find it, or the current version might be broken. Things like enums just wouldn’t work for me (which was a problem because my 1st assignment specified them). Some sample code I downloaded worked fine, other samples just wouldn’t.

You can make your code thread safe and able to handle concurrent connections with ‘rpcgen -MA’; you will still need semaphores or some other form of concurrency control.

Sun’s rpcgen has the ability to generate template stubs for server and client code with -a, which is very useful. They are called msg_server.c and msg_client.c. Don’t actually modify them, as they will be overwritten next time; just copy them. It can be combined with the other flags as -aAM for thread-safe templates.

Code generated by rpcgen is outdated and will give warnings when compiled, but it still works. You can fix the stubs by adding int before main and such, but the templates you will probably need to live with.

Variable sized arrays in the template file are declared with <>, not []. You can specify a max length, i.e. <255>; [255] will do fixed size arrays like normal.

RPC has a ‘string’ variable type in the template file, which is the equivalent of C’s char* (notice it’s string, not C++’s String), for example ‘string name<>;’. Your C code will see this as a normal char*.

An array in the RPC template file will make a struct with name.name_val and name.name_len.

char* for strings in the template file was causing me pain. I don’t remember why, but there is probably a reason for string.

Remember to NULL terminate your strings when they are passed around, otherwise RPC won’t know when to stop freeing.

Functions can only accept one struct, so make sure it contains everything needed.

Sometimes poking into the files generated by the template can help understanding some problems, such as typos in msg.x not being caught by rpcgen but causing your source to fail compiling.

If possible use something other than RPC (CORBA, XMLRPC, SOAP), I haven’t used them but they can’t be worse. They might have some overhead though.
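Several of the template tips above can be seen in one small file. This is only a sketch: every name in it (message, MSGPROG, SEND_MESSAGE and so on) and the program number are made up, and the generator is only run if rpcgen happens to be installed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a small interface file. Every name in it is hypothetical.
cat > msg.x <<'EOF'
/* msg.x: hypothetical interface, for illustration only */
const MAX_NAME = 255;

struct message {
    string name<MAX_NAME>;  /* seen as a plain char * in the C code */
    int    values<>;        /* variable sized: values.values_len, values.values_val */
    int    fixed[8];        /* fixed size array, as in C */
};

program MSGPROG {
    version MSGVERS {
        /* one argument only, so everything travels in the struct */
        message SEND_MESSAGE(message) = 1;
    } = 1;
} = 0x20000001;
EOF

# Generate the stubs and the sample templates if rpcgen is available.
if command -v rpcgen >/dev/null 2>&1; then
    rpcgen -a msg.x
fi
ls
```

On a system with a working rpcgen, the listing should then include the generated header and stubs alongside the msg_server.c and msg_client.c templates mentioned above; remember to compile the server with -DRPC_SVC_FG while debugging.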

Linkage:
http://www.cdk3.net/rmi/Ed2/SunRPC.pdf – SunRPC definition.
http://www.cs.cf.ac.uk/Dave/C/node34.html – Some basic rpc examples, the way it works is a bit different to the stubs generated by Sun’s rpcgen but its fairly easy to work out the changes needed.
http://www2.cs.uregina.ca/~hamilton/courses/430/notes/rpc.html – Same again, includes a linked list example.


Categories : ☢, code, distributed, rpc, semaphore, sun, unix

ZFS on Linux – Freedom can be so restrictive

9 06 2007

UPDATE2: Back in May, there was a post on Jeff Bonwick’s (lead ZFS developer) blog with pictures of him and Linus having lunch. They were linked to from Jim Grisanzio’s (another Sun employee) blog with the title of “ZFS Pics“.

There is also some work on developing a new Linux filesystem, btrfs, with many of the ZFS features: “the filesystem format has support for some advanced features that are designed to leapfrog ZFS”.

UPDATE: There has recently been some talk on the kernel development mailing list about GPLv2, GPLv3 and Solaris, including ZFS: Linus’s post, skeptical about Sun cooperating, and the reply from Sun’s CEO saying “if it was, we wouldn’t be so interested in seeing ZFS everywhere, including Linux, with full patent indemnity.”

ZFS is a great file system from Sun. ~~Currently it’s going to be the default file system in OS X Leopard when it’s released~~ (apparently it’s read-only) and it’s already in the FreeBSD kernel. And, of course, it’s in Sun’s own operating system, Solaris.

The GRUB boot loader already allows booting from it.

Sun claim it to be the last word in filesystems. Apparently, speed-wise it’s close to hard drive platter speed like XFS, it handles software RAID like LVM, and it is able to handle more storage capacity than anyone should ever need like ext4. It supports compression and snapshots, and encryption is being worked on.

There is ZFS on FUSE, which allows you to use it on Linux, but FUSE is slower than a real file system (benchmarks here) and it is much harder to have the main root partition on it, as the programs that access the hard drive must be loaded from somewhere. dpkg also requires a patch on systems using Debian’s apt.

Unfortunately there are two problems with getting it into the core Linux kernel.

Licensing and Patents.

Currently OpenSolaris is under Sun’s CDDL, which is incompatible with the GPL license that the Linux kernel uses. Sun have been talking about GPLing Solaris with the GPLv3. Would this mean we could see ZFS in Linux? Unfortunately no: the Linux kernel is under the GPLv2, with Linus previously saying that he would probably stick to GPLv2 for the Linux kernel, although he did recently say he was ‘pretty pleased’ with the new draft while remaining skeptical. The GPLv2 says that there must be no restrictions on how the software is used; the GPLv3 says you must not use it with DRM or on hardware that deliberately prevents modification of the software (i.e. TiVo). Some parts of ZFS are under the GPLv2 via GRUB, but only the very basic bits needed for booting, so probably not enough to use on a system.

The other problem is that Sun apparently have 56 patents on the technology that goes into ZFS. Even if it’s under a license compatible with the Linux kernel, these could still prevent widespread adoption of it in the Linux community. It’s theoretically possible that Sun is secretly being paid by MS to get their code into the kernel and then sue them, although that seems a bit too tinfoil hat to me. Sun apparently won’t sue anyone using their codebase, but I’m not sure how legally binding that is. The patents also prevent reverse engineering ZFS and rewriting it from scratch.

Sun have recently been making an attempt at getting the Linux community involved with Solaris, recently recruiting an ex-Debian developer, Ian Murdock, whose job it is to make Solaris more appealing to the Linux user with project Indiana (mailinglist), a binary-based Solaris distribution designed to be what people expect from Linux. It is possible that we could see Sun releasing ZFS in such a way that the Linux community can make use of it as a show of good faith. But it’s also possible that they will keep it as bait in an attempt to sway Linux developers to their side. Sun have been fairly good with the free software community of late, releasing Java under the GPLv2 (at least the bits they could), but it might be viewed as an attempt at keeping Java in play since C# and Flash have taken a large chunk out of the area.

We could also see a few GNU/Linux distributions switch to GNU/Solaris ones if/when/how Solaris is GPL’d; we could see Ubuntu Solaris one day. Being under the newer GPLv3 license could make it the free software OS of choice (well, maybe that would still be HURD because of its microkernel, but that doesn’t seem to be usable yet after almost 20 years of development). There already is Nexenta, which is a GNU/Solaris distribution similar to Ubuntu. I tried the version that shipped with the OpenSolaris demonstration pack (they ship ’em to you free here, like Ubuntu does here; it includes a bunch of Solaris versions on 2 DVDs, and the case smelled funny). It looked fairly nice for an alpha, although it didn’t detect my networking or sound. The newer Developer Solaris on the same disc had better hardware support, so Nexenta might just need a newer kernel version (they have already released an alpha7 and CP with ZFS boot support, but I haven’t tried them just yet).

I hope to see ZFS in the Linux kernel. Every time it’s brought up in discussions it generally goes: ZFS is cool, I want it; there’s a FUSE version; FUSE is slow, I want it for real; Linux can’t have it because of the CDDL; it’s really the patents that are the problem.

Hopefully someone will eventually code it, just ignoring the patent issues, for people’s personal use, and distros could start to include it when Solaris gets GPL’d, or Sun will make some statement about it, since it seems to be the most commented-on issue with ZFS.

http://en.wikipedia.org/wiki/ZFS
http://en.wikipedia.org/wiki/GNU_General_Public_License#Version_3
http://kerneltrap.org/node/8066
http://www.zdnet.com.au/news/software/soa/Sun-looks-to-GPL-v3-for-Java-Solaris/0,130061733,339273561,00.htm


Categories : ☢, filesystem, FUSE, gpl, gplv3, linux, solaris, sun, zfs


☠ I could not think of a blog title ☠
