Wed Aug 13 16:23:52 PDT 2008

Several groups of Linux kernel papers have been published recently. Here's my pick of them:

First we have the Proceedings of the 2008 Linux Symposium (these are in some sort of order, favourite first):

Next there's the ACM SIGOPS Operating Systems Review. These papers are about much more experimental developments in the kernel and are thus more fun, even if they are less likely to see the light of day:

Wed Aug 13 11:13:23 PDT 2008

I've just released two new curve25519 implementations: one in C and one in x86-64 assembly. The latter is 10% faster than djb's implementation.

curve25519 is an elliptic curve, developed by Dan Bernstein, for fast Diffie-Hellman key agreement. DJB's original implementation was written in a language of his own devising called qhasm. The original qhasm source isn't available, only the x86 32-bit assembly output.

Since many x86 systems are now 64-bit, and portability is important, this project provides alternative implementations for other platforms.

Implementation            Platform      Author  32-bit speed  64-bit speed
curve25519                x86 32-bit    djb     265µs         N/A
curve25519-donna-x86-64   x86 64-bit    agl     N/A           240µs
curve25519-donna          Portable C    agl     2179µs        628µs

(All tests run on a 2.33GHz Intel Core2)
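
For anyone who hasn't seen what these implementations actually compute, here's a slow but readable sketch of the curve25519 key agreement in Python. It's the usual Montgomery ladder over the field modulo 2^255-19, not a transcription of the donna code; the function names (clamp, x25519) are just mine, and Python integers are nowhere near constant time, so treat it as an illustration only.

```python
P = 2**255 - 19          # the field prime that gives the curve its name
A24 = 121665             # (486662 - 2) / 4, from the curve equation
BASE = (9).to_bytes(32, "little")   # the standard base point, u = 9


def clamp(secret: bytes) -> int:
    """Turn 32 secret bytes into a scalar of the required form."""
    b = bytearray(secret)
    b[0] &= 248
    b[31] &= 127
    b[31] |= 64
    return int.from_bytes(b, "little")


def x25519(secret: bytes, u_bytes: bytes) -> bytes:
    """Multiply the point with u-coordinate u_bytes by the clamped secret."""
    k = clamp(secret)
    x1 = int.from_bytes(u_bytes, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3 = 1, 0, x1, 1
    swap = 0
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        swap ^= bit
        if swap:                      # conditional swap of the two ladder points
            x2, x3 = x3, x2
            z2, z3 = z3, z2
        swap = bit
        a = (x2 + z2) % P
        aa = a * a % P
        b = (x2 - z2) % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3 = x3, x2
        z2, z3 = z3, z2
    # Convert from projective (x2 : z2) to an affine u-coordinate.
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


# Two parties with (very much not random) example secrets.
alice_secret = bytes(range(32))
bob_secret = bytes(range(32, 64))
alice_public = x25519(alice_secret, BASE)
bob_public = x25519(bob_secret, BASE)
shared_a = x25519(alice_secret, bob_public)
shared_b = x25519(bob_secret, alice_public)
```

Both sides should end up with the same 32 bytes in shared_a and shared_b, which is the whole point of the exercise.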

Mon Jul 7 21:50:46 PDT 2008

Google has, at last, open sourced Protocol Buffers. My very minor contribution to this is that I wrote the basis for the encoding documentation.

Protocol buffers pretty much hit the sweet spot of complexity and capability. (See XML and ASN.1 for examples of attempts which missed.) I have the beginnings of a protocol buffer compiler for Haskell that I wrote for internal apps. Now that the C++/Java/Python versions are out, I should probably clean that up and put it on Hackage. But every coder should consider protocol buffers for their serialisation needs from now on.

Tue Jul 1 20:22:57 PDT 2008

Firstly, if you're wondering what happened to all the ObsTCP stuff, it didn't disappear, it just moved to a different blog. Things are still moving as fast as I can push them.

The Black Swan

(ISBN: 1400063515)

This book has some good, if unoriginal, points about the stupidity of much of the modeling done in today's world, especially the world of finance. Sadly, these are hidden in many pages of self-centered rambling and discourse on adventitious topics. If you're thinking of buying this book, get The (Mis)behaviour of Markets by Mandelbrot instead; you'll thank me.

Wed Jun 11 12:58:49 PDT 2008

I've added a bunch of Obfuscated TCP stuff to the obstcp project page on code.google.com, including kernel patches, userland tools, specs and friendly introductions.

Also, I posted it to Reddit. If it doesn't get downvoted into /dev/null in 60 seconds, the comments will probably end up there.

Tue May 27 09:02:24 PDT 2008
OpenID - not actually spawn of Satan

A blog post aggregating complaints about OpenID has been popping up in different places this morning. If you've read it, you might want a little perspective. I'm not going to deal with each point in turn because there are so many, mostly repeating each other.

Phishing

At login time, the site that you're logging into can end up redirecting you to your OpenID provider. Your provider then tells you to go to their site and enter your login information, then click a button to try again. They don't provide a "link" to their site and they don't ask for your password.

Some early providers might not have followed these basic steps, but all the reasonable ones do.

Yes, it's still possible for users to be confused but, by habit, they'll be used to doing the right thing.

XSS and CSRF

XSS problems on the provider's site are a big deal. This criticism is reasonable.

CSRF may be a bigger deal because you are more likely to be 'logged in' to the target. However, most users already keep persistent cookies to save logging into these sites. The additional attack surface here is dubious; CSRF issues are a problem with or without OpenID.

DNS poisoning

If your OpenID starts with https://, you should be protected from DNS poisoning attacks and the like by the usual TLS PKI. This isn't perfect, but it's pretty good.

However, the OpenID spec says that plain domain names are normalised by prepending http://. This is a technical problem with the spec and should be fixed. Until then, this is a reasonable criticism but not a fundamental issue.
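
To make the complaint concrete, here's a rough sketch of that normalisation step in Python. It's a simplification (it ignores XRIs, fragments and everything else the spec deals with) and the function name is made up for illustration.

```python
def normalise_identifier(identifier: str) -> str:
    """Simplified sketch: add a scheme when the user typed a bare domain."""
    identifier = identifier.strip()
    if identifier.startswith(("http://", "https://")):
        return identifier
    # The problem: a bare domain silently becomes an unauthenticated http URL.
    return "http://" + identifier


bare = normalise_identifier("example.com")
secure = normalise_identifier("https://example.com")
```

So a user who types a bare domain ends up on plain HTTP unless they remember to type the https:// themselves.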

Privacy

The OpenID provider has a lot of information about your activities. This is little different than, say, your email account and many people are happy with Gmail. Likewise, password recovery on most of the sites which could use OpenID is based on email access, so most people already have a single password that suffices for entry to many sites.

If you don't like the idea of Gmail you can run your own email server. Likewise, you can run your own OpenID provider.

Using the same OpenID on many sites does allow them to link your activities. So does giving these sites your email address for password recovery. So does using the same IP (although to a lesser extent).

Some providers will let you have many OpenIDs linked to the same account for this reason. Joe user probably won't use that feature and probably gives the same email address to all those sites already and so loses nothing.

Trust problems

OpenID is not a trust system. Trust systems may be built on top of identity systems. Likewise, apples are not oranges and complaints about their lack of tanginess are moot.

Usability / Adoption

Somewhat valid points here. It's a big job to get widespread adoption and, at the moment, it's a pretty small crowd that uses OpenID. However, OpenID doesn't need a flag day; it can have incremental deployment.

Availability

Valid points. If your provider goes down you're going to have a bad day.

Conclusion

I don't believe that OpenID should be used to log in to your bank account. However, for the myriad of sites that I log in to (Google Reader, reddit, ...) it would be nice to just be able to type my OpenID in. It's decently suited to that, and I'm fed up with all these accounts.

Tue May 20 20:46:22 PDT 2008

I'm now running an Ubuntu-based laptop with a somewhat functional Obfuscated TCP patch in its kernel. (If you have a Neo-like view of the Internets you'll be able to see it by the funny options in the SYN packets.)

Hopefully soon I'll be able to post a first draft patch for other people to try. In the meantime, I wrote the start of the mounds of documentation I suspect it'll need: a very non-technical introduction.

Wed Apr 30 20:50:01 PDT 2008

I've updated the patches linked to in the last post with today's work. Both sides now end up with the same shared key (and not just because they got the same private key from lack of entropy like before). That took some fun tracking down of bugs.

Also, packets are now HMAC-MD5'ed with the shared key, and invalid packets are dropped. That also took far longer than expected. I ended up using the MD5 implementation from the CIFS filesystem because the kernel's crypto library is just plain terrible. It's also totally undocumented but, from what I can see, you can't look up an algorithm without taking a semaphore, and that requires that you be able to sleep. I almost think I must be missing something because that's dumber than the bastard offspring of Randy Hickey and Jade Goodie.
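
In userland terms, the check amounts to something like the Python sketch below. The kernel patch obviously doesn't look like this: the packet layout (tag appended after the payload) and the function names are assumptions made purely for illustration.

```python
import hashlib
import hmac

TAG_LEN = 16  # an MD5 digest is 16 bytes


def seal(key: bytes, payload: bytes) -> bytes:
    """Append an HMAC-MD5 tag, computed with the shared key, to the payload."""
    tag = hmac.new(key, payload, hashlib.md5).digest()
    return payload + tag


def open_packet(key: bytes, packet: bytes):
    """Return the payload if the tag verifies, or None to drop the packet."""
    if len(packet) < TAG_LEN:
        return None
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.md5).digest()
    # compare_digest avoids leaking how many tag bytes matched.
    return payload if hmac.compare_digest(tag, expected) else None


shared_key = b"example shared key"
packet = seal(shared_key, b"hello")
tampered = bytes([packet[0] ^ 1]) + packet[1:]
```

Here open_packet(shared_key, packet) hands back the payload, while the tampered copy (or the right packet under the wrong key) gets dropped.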

But there we go. Encryption (with Salsa20) to come next Wednesday.

Wed Apr 23 20:09:16 PDT 2008
First Obfuscated TCP patches

After a day of kernel hacking, I have a few patches which, together, make a start towards implementing ObsTCP.

At the moment, it will advertise ObsTCP on all connections and, if you have two kernels which support it, you'll get a shared key set up. For now, the private key is generated at boot time and, since the host doesn't have any entropy then, it's always the same. So I'll have to do something special there. Also, I have a problem where the ACK with the connecting host's public key can get lost. Since ACKs aren't ACKed, this can be a real pain. I think I need to include it in every transmitted packet until (yet another) option signifies that it's been received.

Wed Apr 16 15:04:39 PDT 2008

After the last post explained why small curves aren't good enough for Obfuscated TCP, I decided that, since I'm going to have to do some damage to the TCP header to get a bigger public key in there anyway, I might as well go the whole way and use curve25519, by djb. Now, djb has forgotten more about elliptic curves than I'll ever know and I feel much happier using a curve that's been designed by him. As you can probably guess from the name, it's a curve over the field of integers modulo 2^255-19, which is prime. So the public keys are 32 bytes long.
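
If you want to check the arithmetic behind the name and the key size, a couple of lines of Python will do. The Fermat test here is only a quick sanity check that's consistent with the modulus being prime, not a proof, and the variable names are mine.

```python
p = 2**255 - 19

# A prime must pass this; passing it doesn't prove primality on its own.
fermat_ok = pow(2, p - 1, p) == 1

field_bits = p.bit_length()           # bits needed for a field element
key_bytes = (field_bits + 7) // 8     # rounded up to whole bytes
```

A field element needs 255 bits, which rounds up to the 32 bytes that have to be squeezed into the TCP header.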

In order to get that much public key material into a TCP header, here's my proposed hack: Jumbo TCP options.

djb's sample implementation of curve25519 is written in a special assembly language called qhasm. Sadly, it's so alpha that he's not actually released it. So the sample implementation is for ia32 only, uses the floating point registers and has 5100 lines of uncommented assembly. It is, however, freaking quick.

However, since I have kernel-space in mind for this I've written a C implementation. It's about 1/3 the speed (and I've not really tried to optimise it yet), doesn't use any floating point (since kernel-space doesn't have easy access to the fp registers in Linux) and fuzz testing seems to indicate that it's correct. (At least, it's giving the same answers as djb's code.)

Next step: hacking up the kernel. (And I thought the elliptic curve maths was hard enough.)

Tue Apr 8 21:42:36 PDT 2008
Elliptic curves don't work either

(For context, see my previous post on OTCP)

In any Diffie-Hellman exchange based on elliptic curves, we have Q=aP where P and Q are points on an elliptic curve. The operation of multiplying a point and a scalar is well defined, but unimportant here. The problem facing the attacker is, given Q and P, find a. If they can do that, we're sunk.

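If you could find two different pairs of numbers, (c, d) and (e, f), such that cP + dQ = eP + fQ, then you're done because (c-e)P = (f-d)Q = (f-d)aP, and so a = (c-e)/(f-d) mod n, where n is the order of the point P (which is close to the size of the field underlying the curve). This needs f and d to differ mod n, otherwise there's nothing to divide by.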
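
Here's that last step with small numbers plugged in, as a Python sketch. The group order of 19 and the collision values are made up for illustration, and the function name is mine; the modular inverse via pow needs Python 3.8 or later.

```python
def recover_scalar(c: int, d: int, e: int, f: int, n: int) -> int:
    """Given cP + dQ = eP + fQ in a group of prime order n, return a with Q = aP."""
    if (f - d) % n == 0:
        raise ValueError("useless collision: f and d are equal mod n")
    return (c - e) * pow((f - d) % n, -1, n) % n


n = 19                      # toy group order
c, d = 5, 4                 # first way of writing the point
e, f = 6, 13                # second way of writing the same point
a = recover_scalar(c, d, e, f, n)

# Sanity check: both sides describe the same multiple of P.
same_point = (c + a * d) % n == (e + a * f) % n
```

With these numbers the division is -1/9 mod 19, which comes out as a = 2.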

Finding such a collision by picking random examples is never going to work because of the storage requirements. However, if you define a step function which takes a pair (c, d) and produces a new pair (c', d') you have defined a cycle through the search space. (It must be a cycle because the search space is finite. At some point you must hit a previous state and loop forever.) Now you can use Floyd's cycle finding algorithm to find a collision with constant space. This is an O(√n) algorithm for breaking this problem and is well known as Pollard's rho method.
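
The whole thing fits in a page of Python if the curve is tiny. This sketch uses a textbook toy curve, y^2 = x^3 + 2x + 2 over the integers mod 17, which has 19 points, with base point (5, 1). The three-way step function and the deterministic choice of starting pairs are my own simplifications (a real attack picks starting points at random), and all the names are made up.

```python
P_MOD, A = 17, 2      # toy curve y^2 = x^3 + 2x + 2 over F_17
N = 19                # number of points on it (prime)
G = (5, 1)            # base point, called P in the text; None is infinity


def add(p1, p2):
    """Add two points on the toy curve."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return x3, (s * (x1 - x3) - y1) % P_MOD


def mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def step(y, c, d, q):
    """One step of the walk, keeping the invariant y = c*G + d*q."""
    part = 0 if y is None else y[0] % 3
    if part == 0:
        return add(y, G), (c + 1) % N, d
    if part == 1:
        return add(y, y), 2 * c % N, 2 * d % N
    return add(y, q), c, (d + 1) % N


def rho(q):
    """Find a with q = a*G using Floyd's cycle finding on the walk."""
    for c0 in range(N):
        for d0 in range(N):
            start = (add(mul(c0, G), mul(d0, q)), c0, d0)
            slow = step(*start, q)                # tortoise: one step
            fast = step(*step(*start, q), q)      # hare: two steps
            while slow[0] != fast[0]:
                slow = step(*slow, q)
                fast = step(*step(*fast, q), q)
            (_, c, d), (_, e, f) = slow, fast
            if (f - d) % N:                       # otherwise try another start
                return (c - e) * pow((f - d) % N, -1, N) % N
    raise ValueError("no usable collision found")


secret = 7
recovered = rho(mul(secret, G))
```

Only the tortoise and the hare are ever stored, which is the constant-space property that makes the method practical on curves of a realistic size.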

Now, if you have many of these problems you get a big speed up by using some storage. Assume that you do the legwork to solve an instance of the problem and that you record some fraction of the points that you evaluated. (How you choose the points isn't important so long as it's a function of the point; say pick all points where the first m bits are zero.)

Now, future attempts to break the problem can collide with one of the previous points. If you find cP + dQ = eP + fR (note that P is a constant of the elliptic curve system), and you also know that R = bP (because we solved that instance previously), then cP + dQ = cP + adP = (e+fb)P, and so a = ((e+fb) - c) / d mod n (and we know all the values on the right-hand side).
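
Again with toy numbers, as a Python sketch: the group order of 19, the stored value b and the collision itself are all invented for illustration, and the function name is mine.

```python
def recover_from_stored(c: int, d: int, e: int, f: int, b: int, n: int) -> int:
    """Given cP + dQ = eP + fR with R = bP already solved, return a with Q = aP."""
    if d % n == 0:
        raise ValueError("useless collision: d is zero mod n")
    return ((e + f * b) - c) * pow(d % n, -1, n) % n


n = 19          # toy group order
b = 7           # R = 7P, known from the earlier, already-solved instance
c, d = 4, 3     # our walk reached the point cP + dQ
e, f = 13, 5    # which was stored earlier as eP + fR
a = recover_from_stored(c, d, e, f, b, n)

# Sanity check: c + a*d and e + f*b are the same multiple of P.
consistent = (c + a * d) % n == (e + f * b) % n
```

Here the answer is a = 2, recovered without ever finishing a full rho run for Q.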

Now, 2^112 (14 bytes) is about as big an elliptic curve point as we can fit in a TCP header. The maximum options payload is 40 bytes, of which 20 are already taken up in modern TCP stacks. We need 2 bytes of fluff per option and, unless we want this to be the last TCP header ever, we need to leave at least 4 bytes. That's where the 14 byte limit comes from.
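
The budget, spelled out in Python so that the subtraction is easy to check. The constant names are mine; the numbers are the ones from the paragraph above.

```python
MAX_OPTION_BYTES = 40    # most that the TCP options area can hold
ALREADY_USED = 20        # taken by the options that modern stacks send
OPTION_OVERHEAD = 2      # the "fluff": kind and length bytes for our option
LEFT_FOR_OTHERS = 4      # headroom so that future options still fit

key_bytes = MAX_OPTION_BYTES - ALREADY_USED - OPTION_OVERHEAD - LEFT_FOR_OTHERS
key_bits = key_bytes * 8
print(key_bytes, key_bits)   # prints "14 112"
```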

We give the attacker 2^50 bytes of space. I believe that each point will take 3*14 bytes of space for the (c,d,Y) triple, where Y = cP+dQ. Thus they can store about 2^44 distinguished points. Since Pollard's rho on a 112-bit curve takes about 2^56 steps, one in 2^(56-44) = 2^12 points is distinguished. Additionally, generating those 2^44 points isn't that hard, computationally. This suggests that an attacker can find a collision in only 2^12 iterations, or about 2^13 field multiplications.
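
The same estimate as a few lines of Python, working in powers of two. The rounding down to 2^44 follows the text; the variable names are mine and the whole thing is only as good as the assumptions in the paragraph above.

```python
import math

STORAGE_BYTES = 2**50          # space granted to the attacker
BYTES_PER_POINT = 3 * 14       # one (c, d, Y) triple
CURVE_BITS = 112               # size of the curve that fits in the header

stored_points = STORAGE_BYTES // BYTES_PER_POINT
stored_log2 = math.floor(math.log2(stored_points))   # rounded down, as in the text

rho_log2 = CURVE_BITS // 2                  # plain Pollard's rho: sqrt(2^112) steps
walk_log2 = rho_log2 - stored_log2          # steps until a stored point is hit
print(stored_log2, rho_log2, walk_log2)
```

That's 2^44 stored points against a 2^56 step search, leaving a walk of about 2^12 steps per connection.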

So, again, a reasonable attacker can break our crypto in real time.

This scheme becomes much harder to sell if we have to do evil things to the TCP header in order to make it work.

Thu Mar 20 10:23:36 PDT 2008

If you've been wondering what I'm up to at work, we now have a public blog for the RechargeIt project.

Wed Mar 19 15:02:10 PDT 2008

How sad: from reading the Sleepcat documentation on network partitions, it's clear that BDB uses a broken replication system (i.e. not Paxos). That's a shame because I was hoping to use it.

Tue Mar 18 20:35:49 PDT 2008

Yahoo now has OpenID for all its accounts, which is great. Wonderful in fact. OpenID is a good thing for many authentication needs on the Internet and will make the world a better place.

However,...