Saturday, 18 June 2011

IPv6 Day in Cambridge - Success and Non-event!

IPv6 day (8th June) in the University was largely a non-event, and so can be declared a success! This seems to match experiences reported elsewhere.

We are not aware of any significant problems experienced by University users in accessing any services on the day, including external ones known to be participating in IPv6 day such as Google and Facebook. The core of the University network is connected to the global v6 internet, but most distribution networks in departments and colleges only connect using v4 at the moment. Those networks that are connected to the global v6 internet (including the one connecting my desktop workstation) worked fine on the day as expected.

In the run-up to the day we enabled v6 on a number of central services, including the main University web server, the Streaming Media Service (both the web interface and the HTTP download service), the 'new' interface to our search engine, the University Training Booking System, and the central mail service (SMTP, POP, IMAP). On the day we published AAAA records for these services alongside the normal A records from about 08:30 to 19:00 BST.

With the exception of the web server, all these services were enabled more or less as they would be for a v6 production service, though a few features (such as automatic v6 address transition between cluster members, and adapting log analysis to recognise v6 addresses) were not completed in time. The web server used a separate Apache reverse proxy to provide v6 connectivity, to avoid having to disturb its configuration. While doing this, and subsequently, we identified various issues and surprises that I've already mentioned (here, here, and here).

The University web server received 8,981 requests from 280 distinct clients over v6. By comparison it received a total of 1,257,012 requests over both protocols for the entire 24 hour period, meaning that v6 requests probably represented about 1.5% of the total. The breakdown of 8,351 native v6 requests from 230 clients by approximate country of origin appears in the table below.
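The 1.5% estimate is not the raw ratio of the two figures above, and the working behind it isn't given. One plausible reading, sketched below, is that it allows for the AAAA records only being published for about ten and a half of the 24 hours. The even spread of traffic across the day is purely an assumption made for illustration; daytime traffic is presumably heavier, which would pull the share for the window somewhat below the uniform figure.

```python
# Figures quoted in the post.
V6_REQUESTS = 8_981
TOTAL_REQUESTS_24H = 1_257_012

# AAAA records were published from about 08:30 to 19:00 BST.
WINDOW_HOURS = 19.0 - 8.5   # 10.5 hours
DAY_HOURS = 24.0


def share(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# v6 requests as a share of the whole day's traffic.
whole_day_share = share(V6_REQUESTS, TOTAL_REQUESTS_24H)

# ASSUMPTION (not from the post): requests are spread evenly over the day,
# so the window carries WINDOW_HOURS / DAY_HOURS of the total.
window_total_uniform = TOTAL_REQUESTS_24H * WINDOW_HOURS / DAY_HOURS
window_share_uniform = share(V6_REQUESTS, window_total_uniform)

print(f"{whole_day_share:.2f}% of the 24-hour total")
print(f"{window_share_uniform:.2f}% of the window, if traffic were uniform")
```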

What was interesting was the relatively high number of clients (50) making requests over transitional 6to4 connections (630). Most of these (36 clients making 476 requests) were from inside the University. Most or all of these clients will have had perfectly good native v4 connectivity to www.cam and this confirms (if confirmation were needed) that rather a lot of systems prefer IPv6, even if provided by a transition technology such as 6to4, over IPv4. Interestingly, we didn't see any Teredo traffic.
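For anyone wanting to make the same split from their own logs, here is a minimal sketch of sorting client addresses into native, 6to4 and Teredo by prefix. It is not the script used for the figures above; the function names and the sample addresses (taken from documentation ranges) are illustrative assumptions, and extracting the addresses from a log file is left out.

```python
import ipaddress
from collections import Counter

# Well-known prefixes for the two transition mechanisms.
SIXTOFOUR = ipaddress.ip_network("2002::/16")
TEREDO = ipaddress.ip_network("2001::/32")


def connectivity_kind(addr: str) -> str:
    """Classify a client address as ipv4, 6to4, teredo or native-v6."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return "ipv4"
    if ip in SIXTOFOUR:
        return "6to4"
    if ip in TEREDO:
        return "teredo"
    return "native-v6"


def tally(addresses):
    """Return (requests per kind, distinct clients per kind)."""
    requests = Counter()
    clients = {}
    for addr in addresses:
        kind = connectivity_kind(addr)
        requests[kind] += 1
        clients.setdefault(kind, set()).add(addr)
    return requests, {kind: len(seen) for kind, seen in clients.items()}


# Illustrative input: one address per logged request (hypothetical data).
sample = [
    "2001:db8::1",          # native v6 (documentation prefix)
    "2001:db8::1",
    "2002:c000:201::1",     # 6to4, embeds 192.0.2.1
    "2001:0:4136:e378:8000:63bf:3fff:fdd2",  # Teredo example address
    "192.0.2.7",            # plain IPv4
]
requests, clients = tally(sample)
print(dict(requests))
print(clients)
```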

6to4 caused the only significant incident of the day, when a department mail server switched to using IPv6 over a 6to4 route being advertised by a user workstation elsewhere on the department subnet. This mail server sends all its outgoing mail via the University's central internal mail switch, but that won't accept messages from machines with 6to4 addresses because it doesn't see them as 'inside'. The problem was quickly fixed, but it seems clear that, ironically, problems caused by 6to4 and Teredo 'transitional' connectivity may represent a significant barrier to further IPv6 roll-out.



Native IPv6 requests to http://www.cam.ac.uk/ on IPv6 Day, by approximate country of origin

[Here 'UCS STAFF' represents clients on the Computing Service staff network, 'UNIVERSITY' represents those elsewhere in the University, 'JANET' those elsewhere on JANET, and 'United Kingdom' those elsewhere in the UK.]

  2619 UCS STAFF
  1373 China
  1290 Brazil
   835 JANET
   630 UNIVERSITY
   420 United Kingdom
   293 United States
   171 Greece
   123 France
   110 Czech Republic
    97 Russian Federation
    81 Germany
    66 Japan
    48 Portugal
    47 Netherlands
    36 Finland
    33 Canada
    33 Serbia
    17 Spain
     7 Switzerland
     6 Ireland
     5 Saudi Arabia
     3 Hong Kong
     2 Italy
     2 Korea, Republic of
     2 Norway
     1 Australia
     1 New Zealand
Geolocation provided by Maxmind's free GeoLite IPv6 Country database. "This product includes GeoLite data created by MaxMind, available from http://www.maxmind.com/."

Monday, 13 June 2011

More IPv6 gotchas

Our participation in IPv6 day (which I might get around to writing up one day) has led me to identify three more 'gotchas' relating to IPv6 deployment:

IPv6 tunnels come up outside the wire

As predicted in advance, and borne out by our experience on the day, it's clear that lots of clients will use transitional IPv6 connectivity (6to4 or Teredo) even when contacting services also available over native IPv4. Worse, some machines with 6to4 connectivity will advertise themselves as IPv6 routers and other machines on the same subnet will use their connectivity in preference to native IPv4.

In addition to the obvious problem that this transitional connectivity may be broken, or blocked, or massively sub-optimal, there is the additional unexpected (to me) problem that machines doing this will be using 6to4 or Teredo IP addresses (2002::/16 or 2001:0000::/32 respectively) and so will appear to be outside your local network even if they are actually inside. This has serious implications for continued attempts to do access control by IP address.

Both addressing schemes actually embed local IPv4 addresses in the v6 addresses they use so you could - perhaps - choose to recognise these. But if you do you'll be in the interesting position of having 'internal' traffic coming into your network from the outside!
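As a rough sketch of what 'recognising these' could look like, Python's standard ipaddress module can pull the embedded IPv4 address out of a 6to4 or Teredo address. The internal range below is a placeholder documentation prefix and the function names are hypothetical. Note that for Teredo the embedded client address is the one seen by the Teredo server, so behind a NAT it will be the NAT's external address rather than the machine's own.

```python
import ipaddress

# HYPOTHETICAL internal IPv4 range (a documentation prefix, used as a
# stand-in for whatever your site regards as 'inside').
INTERNAL_V4 = ipaddress.ip_network("192.0.2.0/24")


def embedded_ipv4(addr: str):
    """Return the IPv4 address embedded in a 6to4 or Teredo address, else None."""
    ip = ipaddress.ip_address(addr)
    if ip.version != 6:
        return None
    if ip.sixtofour is not None:
        # 6to4: the 32 bits after the 2002::/16 prefix are the IPv4 address.
        return ip.sixtofour
    if ip.teredo is not None:
        # Teredo: (server, client); the client address is stored inverted
        # and the ipaddress module undoes that for us.
        _server, client = ip.teredo
        return client
    return None


def looks_internal(addr: str) -> bool:
    """True if a transitional v6 address embeds an 'inside' IPv4 address."""
    v4 = embedded_ipv4(addr)
    return v4 is not None and v4 in INTERNAL_V4


for candidate in ("2002:c000:201::1",
                  "2001:0:4136:e378:8000:63bf:3fff:fdd2",
                  "2001:db8::1"):
    print(candidate, embedded_ipv4(candidate), looks_internal(candidate))
```

Treating such addresses as internal is a policy decision, not something the sketch settles: the traffic still arrives from outside, and the embedded address is only as trustworthy as the source address itself.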

Fragmentation

IPv6 doesn't support packet fragmentation by routers, but does require that a sender reduces its packet size and retransmits in response to an ICMPv6 type 2 'Packet too big' message. If this mechanism fails, perhaps because ICMP packets are being blocked but also for any other reason, you may find for example that users can connect to a web site but not get any content back.

This is because the initial connection establishment and HTTP GET request all use small packets but everything goes wrong the moment the web server starts sending full packets containing the data requested. Unhelpfully, web server access logs may look fine when this happens, with the only hint of problems being that too few bytes may have been transmitted (though given a big enough TCP window and a small enough document even this may not be obvious).
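The arithmetic behind this can be sketched as follows. The scenario is an assumption chosen for illustration: a server on ordinary 1500-byte Ethernet, and a path that crosses a tunnel adding a 20-byte IPv4 header (as 6to4 does), which leaves 1480 bytes for the IPv6 packet. Header sizes ignore TCP options, and the 300-byte GET is a made-up figure.

```python
IPV6_HEADER = 40   # fixed IPv6 header, bytes
TCP_HEADER = 20    # TCP header without options, bytes
IPV4_ENCAP = 20    # extra IPv4 header added by a 6to4-style tunnel

ETHERNET_MTU = 1500
# ASSUMPTION: the narrowest link on the path is such a tunnel.
TUNNEL_MTU = ETHERNET_MTU - IPV4_ENCAP  # 1480

# The server sizes its segments for its own link, not for the tunnel.
SERVER_MSS = ETHERNET_MTU - IPV6_HEADER - TCP_HEADER  # 1440


def packet_size(payload: int) -> int:
    """Size on the wire of an IPv6/TCP packet carrying `payload` bytes."""
    return IPV6_HEADER + TCP_HEADER + payload


def fits(payload: int, path_mtu: int) -> bool:
    """True if the packet can cross a link with the given MTU unfragmented."""
    return packet_size(payload) <= path_mtu


# A simplified HTTP exchange: (description, TCP payload in bytes).
exchange = [
    ("SYN / SYN-ACK", 0),
    ("HTTP GET request", 300),          # illustrative size
    ("first full data segment", SERVER_MSS),
]

for description, payload in exchange:
    verdict = "passes" if fits(payload, TUNNEL_MTU) else "dropped: too big"
    print(f"{description}: {packet_size(payload)} bytes, {verdict}")
```

If the 'Packet too big' message for that last segment never reaches the server, it keeps retransmitting at the same size and the connection stalls, even though everything up to that point looked healthy.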

Old software

Even though IPv6 has been around for a while, support for it is still missing or broken in a lot of software (especially if you use 'stable' or 'Long Term Support' Linux distributions whose versions will inevitably be somewhat less than 'bleeding edge').

For example even though the SLAPD LDAP daemon supports IPv6, my colleagues failed to find a way to get the version included in SLES 10 to support both v4 and v6 at the same time, though it was happy to do one or the other. In addition, this version didn't seem to support IPv6 addresses in its access control list syntax.

I also had a problem geolocating the IPv6 clients that accessed our web server. The geolocation database I normally use (the free GeoLite Country and friends from Maxmind) does support IPv6, and the version of their C API supplied with the current Ubuntu LTS (10.04 Lucid Lynx) is just new enough (1.4.6) to cope. But the versions of the Perl and Python bindings needed to process IPv6 both need 1.4.7 of the C API, and since the library is used by quite a lot of Ubuntu utilities, upgrading it isn't trivial. In the end I had to build a private version of the C API and the Perl and Python bindings, but that was one more bit of work I wasn't expecting.

Saturday, 4 June 2011

IPv6 day - more problems than expected?

A couple of posts on the JANET Development Eye blog, together with links from them to some useful pages on the ARIN wiki, suggest that rather more people may experience problems on IPv6 day than I had previously expected. The main problem, ironically, seems to be the widespread deployment by default in many OSs and networks of workarounds intended to provide access to IPv6-only resources from machines with only v4 connectivity. These workarounds are often broken, or blocked, or massively sub-optimal, yet applications may still try to use them in preference to v4 even when accessing dual-stack services.

Really worrying is that measurements by Google suggest that many University networks, with their 'light-touch' approach to regulating network-connected devices, may be badly affected by all this. I suppose we will see on Wednesday!