Sunday 13 December 2009

Apache configuration file layouts

A traditional Apache configuration consists of one file (httpd.conf) that contains all the required configuration directives. However, a single file is a problem for packaging systems where different packages are responsible for different aspects of Apache's operation. For them it's much easier if they can contribute one or more files containing configuration fragments, which are then incorporated into the Apache configuration using the 'Include' directive. While convenient for the packaging system, this is less convenient for the system administrator who now finds his Apache configuration spread across multiple files in several directories. Here are two diagrams showing the configuration file layout in two common Linux distributions - Debian (and so Ubuntu), and SLES:


Note that the mods-enabled and sites-enabled directories contain symlinks to files actually stored in the parallel mods-available and sites-available directories, and that the commands a2enmod, a2dismod, a2ensite and a2dissite are provided to manipulate these symlinks. Within the mods-{available,enabled} directories, the *.load files contain the Apache configuration directive to load the module in question; the corresponding *.conf files contain configuration directives necessary for the module. httpd.conf is included for backwards compatibility and to support installing 3rd party modules directly via apxs2. See the file /etc/apache2/README for more details.
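To make the symlink arrangement concrete, here is a minimal Python sketch of the kind of thing a2enmod and a2dismod do. It is an illustration only, not how the real Debian scripts are implemented (they do more, such as dependency handling), and it works in a scratch directory rather than the real /etc/apache2. The function names and the placeholder file contents are made up for the example.

```python
import tempfile
from pathlib import Path


def enable_module(apache_root: Path, name: str) -> list[Path]:
    """Symlink mods-available/<name>.{load,conf} into mods-enabled."""
    available = apache_root / "mods-available"
    enabled = apache_root / "mods-enabled"
    enabled.mkdir(exist_ok=True)
    links = []
    for suffix in (".load", ".conf"):
        source = available / (name + suffix)
        if not source.exists():
            continue  # a module need not ship a .conf file
        link = enabled / source.name
        if not link.is_symlink():
            # Relative target, so the whole tree can be moved as a unit.
            link.symlink_to(Path("..") / "mods-available" / source.name)
        links.append(link)
    return links


def disable_module(apache_root: Path, name: str) -> None:
    """Remove the symlinks; the files in mods-available are left alone."""
    for suffix in (".load", ".conf"):
        link = apache_root / "mods-enabled" / (name + suffix)
        if link.is_symlink():
            link.unlink()


# Demonstration in a scratch directory (assumes a POSIX-style filesystem).
root = Path(tempfile.mkdtemp())
(root / "mods-available").mkdir()
(root / "mods-available" / "ssl.load").write_text("# LoadModule line goes here\n")
(root / "mods-available" / "ssl.conf").write_text("# module settings go here\n")

links = enable_module(root, "ssl")
print(sorted(link.name for link in links))
disable_module(root, "ssl")
```

The point of the design is that enabling or disabling something never edits a configuration file: the master file includes whatever is in the *-enabled directories, and the commands only add or remove links.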


The files shown in yellow boxes, all of which appear in the /etc/apache2/sysconfig/ directory, are regenerated automatically from information in /etc/sysconfig/apache2/ on Apache startup and so shouldn't be hand edited. See comments at the top of /etc/apache2/httpd.conf for more information.

This diagram is for SLES10 and Apache 2; similar arrangements were used with SLES 9 and Apache 1.3, with 'apache2' replaced by 'httpd' in filenames. In SLES 9 it was necessary to run SuSEconfig to regenerate the files based on sysconfig information.

2010-02-23: Debian diagram amended - the 'master' file was incorrectly labelled httpd.conf and should have been apache2.conf. Apart from anything else, you can't have httpd.conf including itself!

Wednesday 9 December 2009

Paul Walk's 'Infrastructure service anti-pattern'

Following on from Service-to-service communication, I've just seen an excellent blog posting on Paul Walk's weblog entitled 'An infrastructure service anti-pattern', which makes a strong case for how machine APIs should be used. Well worth reading.

Monday 7 December 2009

Service-to-service communication

The University of Cambridge's Raven service works well enough for interactive logins using a web browser, but doesn't (and was never intended to) support non-interactive authentication, or authentication between one service and another, rather than between people and services. Here's a set of suggestions for filling this gap and for supporting general service-to-service communication - I happen to like these today but I'm making no promises that I won't have changed my mind by tomorrow.

For 'proxied' or non-interactive authentication on behalf of individuals I'd recommend OAuth. This is essentially a standardised protocol for establishing a token that grants one service limited, delegated access in a user's name to another service. There's a good example of how it could work in the Beginner’s Guide to OAuth – Part II : Protocol Workflow. OAuth is gaining significant traction in social networking applications.

For service to service communication I'd recommend SSL/TLS using mutual authentication by certificate. Since we are assuming that authentication is required we should also assume that confidentiality is necessary so the protection offered by SSL/TLS seems appropriate.
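As a rough illustration of what 'mutual authentication by certificate' means in practice, here is a minimal sketch using Python's standard ssl module. The function names and the file-path parameters are hypothetical; the essential point is that the server side must explicitly demand a client certificate, since by default only the server is authenticated.

```python
import ssl


def make_server_context(ca_file=None, cert_file=None, key_file=None):
    """TLS context for a service that insists on a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Without this the server would accept anonymous clients.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # The CA(s) whose client certificates we are prepared to trust.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # Our own identity, presented to connecting clients.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def make_client_context(ca_file=None, cert_file=None, key_file=None):
    """TLS context for a service acting as a client of another service."""
    # PROTOCOL_TLS_CLIENT verifies the server certificate and host name
    # by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The certificate identifying the calling service.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


server_ctx = make_server_context()
client_ctx = make_client_context()
print(server_ctx.verify_mode, client_ctx.check_hostname)
```

In real use both sides would pass the paths of their own certificate and key and of the CA certificate; they are left optional here only so that the sketch runs without any files on disk.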

Certificate trust could just be established bilaterally between pairs of services, but the complexity of this grows with the square of the number of services involved. Better would be to establish an in-house Public Key Infrastructure with a central in-house Certification Authority (CA) that could issue certificates for this purpose. Some difficult policy decisions will be needed about who is allowed to apply for certificates in the name of which services, but once made it should be possible to largely automate the CA by providing a Raven-authenticated web interface for certificate management. Note that these certificates would need to identify 'services', rather than just computers, so the parties to a conversation could for example be the 'CS IP Register Database' and the 'Department of Important Studies Network Management system'. We'd need to sort out a naming convention. An important service provided by the CA would need to be the maintenance of a Certificate Revocation List.
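The 'square of the number of services' claim is easy to check with a little arithmetic. The sketch below assumes that bilateral trust needs one relationship per pair of services, and that with a central CA each service needs just one certificate (and trusts the one CA); the function names are made up for the example.

```python
def bilateral_trust_links(services: int) -> int:
    """Number of distinct pairs among n services: n(n-1)/2."""
    return services * (services - 1) // 2


def ca_certificates(services: int) -> int:
    """With a central CA, one issued certificate per service."""
    return services


for n in (5, 20, 100):
    print(n, bilateral_trust_links(n), ca_certificates(n))
```

For 100 services that is 4,950 pairwise relationships to set up and maintain, against 100 certificates issued by the CA.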

Authorisation I'd leave to the services involved. Both OAuth and certificate authentication establish the 'identity' of a party in a conversation and it should be easy enough to use this identity within whatever authorisation system a service already uses. For example, Lookup could be adapted to allow certificate DNs to appear alongside user identifiers as members of groups used for access control. 

Finally we need to identify protocols by which services can communicate. I suggest something lightweight and vaguely 'REST'ish. Authorities differ on what exactly REST requires, but here I just mean a basic CRUD interface carried over HTTP and mapped onto HTTP primitives PUT, GET, POST, DELETE, etc. Data should probably be serialised using simple XML, though other formats such as JSON are a possibility. Existing XML schemas can be used where appropriate, for example the Atom Syndication Format can be used to represent lists (particularly search results), and the Atom Publishing Protocol is probably worth considering to support the creation and modification of resources (see Wikipedia for an introduction to Atom).
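To show what 'a basic CRUD interface mapped onto HTTP primitives' might look like, here is a toy in-memory sketch in Python. It is not a real HTTP server and not a proposal for any particular API: the class name, the /machines path and the choice of status codes are assumptions made for the example.

```python
class ResourceStore:
    """Toy in-memory store for one collection, e.g. /machines."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Dispatch one request and return a (status, payload) pair."""
        parts = [p for p in path.split("/") if p]
        if len(parts) == 1:                      # the collection itself
            if method == "GET":                  # list the known ids
                return 200, sorted(self._items)
            if method == "POST":                 # create; server picks the id
                item_id = str(self._next_id)
                self._next_id += 1
                self._items[item_id] = body
                return 201, item_id
            return 405, None
        if len(parts) != 2:
            return 404, None
        item_id = parts[1]
        if method == "PUT":                      # create or replace at a known id
            existed = item_id in self._items
            self._items[item_id] = body
            return (200 if existed else 201), item_id
        if item_id not in self._items:
            return 404, None
        if method == "GET":                      # read
            return 200, self._items[item_id]
        if method == "DELETE":                   # delete
            del self._items[item_id]
            return 204, None
        return 405, None


store = ResourceStore()
print(store.handle("POST", "/machines", {"name": "alpha"}))
print(store.handle("GET", "/machines/1"))
print(store.handle("PUT", "/machines/1", {"name": "beta"}))
print(store.handle("DELETE", "/machines/1"))
print(store.handle("GET", "/machines/1"))
```

A real service would wrap this sort of dispatch in an HTTP server and serialise the payloads as XML or JSON, but the mapping from operation to HTTP method is the whole of the idea.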

The advantage of this approach is that it provides a lightweight and technology neutral interface using tools (HTTP servers and clients) that are widely available and reasonably well understood. It even allows an amount of experimentation using nothing but a web browser. It also opens the possibility of in-browser manipulation of data, especially if results are available in JSON. Against this there's the need for an API design for each new service and the requirement for programming work at both the client and server ends. One way of supporting this is to distribute at least one example client library with each new API. An important selling point for this approach is the fact that it underpins almost all of the current 'cloud' offerings - see the Google Data Protocol, Amazon Web Services, Yahoo Social API, etc.

There are other possibilities for filling the various slots mentioned above - obvious ones being SSH to provide confidentiality and strong mutual authentication, and SOAP to provide interservice communication. I happen to think (today, as mentioned above) that the set I've listed here would currently provide the best solution. Why might be the subject of subsequent posts.

Saturday 14 November 2009

Re-using Raven's password database

[This posting was originally published on an internal wiki in early 2009 and is republished here to increase its exposure. What it says remains relevant today]

The University's Raven authentication system currently provides a usable authentication system for interactive, web-based applications but doesn't support non-interactive web-based applications, nor non-web ones. This excludes many potential uses: Windows/MacOS/Unix logon, IMAP, POP, SMTP, LDAP, WebDAV (and so CalDAV), etc., etc. Since Raven of necessity has a database containing username/password pairs for most people in the University, and most people know their Raven password, it is tempting to assume that extending it to support these other uses would be easy.

This post explores ways in which authentication based on 'Raven passwords' could be extended and tries to point out some of the advantages and drawbacks of each proposal. In reading this, you need to consider the security properties you might actually want from such a system (hint: it's not important that legitimate users have to provide their password before gaining access to something, what is important is that people can't realistically gain access using an identity that is not their own - these are not the same thing!). It is also important to consider who might be attempting to attack what, and how much we care about it: are we talking about a) a bored University student, b) a tabloid journalist, c) an organised criminal, or d) the American NSA, and are they trying to gain access to a) their mate's photo archive, b) the next heir to the throne's email account, c) the University financial system?

This paper only considers ways in which the existing, single Raven password might be usefully reused and ignores possibilities such as using multiple passwords, one-time password lists, cryptographic smartcards and tokens, fingerprint readers, multi-tier authentication, etc. It also ignores two very real problems with 'static' passwords:
  • It is effectively impossible to prevent people giving their password away, and they do!
  • Users have to type their password into something - typically the workstation that they are sitting in front of. Depending on the workstation, it is entirely possible that it has been compromised, for example with a virus that installs a key logger or by a malicious or inept system administrator.
Option the first: forget passwords
Stepping back a bit, we could forget passwords and just ask people what their user-id is. This would even simplify Raven itself:

This would save people from having a password that they needed to remember, and would save the Computing Service from having to issue them. Both of these would be significant advantages. Anyone could easily set up almost any system 'protected' in this way - the hard part might actually be stopping it from asking for a password.

The obvious downside is that we'd have to trust everyone who can get to a login prompt not to lie (and for many networked services this means everyone in the entire world), and so this is rather unrealistic.

Option the second: use a single fixed password
Rather than forgetting passwords completely, we could use a single, fixed, 'well known' password to protect all accounts. This would be easy to remember (and easy to recover if you forgot it). It would also be fairly easy to set up a system protected in this way.

Now we only have to trust all the people who have ever known the password not to lie. It is fair to assume that a 'secret' legitimately known by 55,000 (and rising) people (those who have ever had a Raven account) would not stay secret for long, so we'd still be trusting quite a lot of people not to lie, if not the whole world. It would also be next to impossible ever to change this password. Again, this is probably not a realistic option (though that doesn't stop people using it, usually on a small scale - think of the codes used to arm and disarm intruder alarm systems).

Option the third: distribute a list of user name/plaintext password pairs
We could (in theory, though not currently in practice) extract a list of user names and corresponding plain text passwords from Raven, one for each user, and then distribute this list to every system that needs to authenticate users.

Users could then quote their user name and password, the system could look up the user name and compare the offered password with the one on the list - if it matches they would get in, if not they wouldn't. Implementing authentication in this way would be fairly easy, and system administrators could post-process the list into whatever form their system actually needed.

This would avoid having to trust users not to lie - the vast majority of users would only be able to successfully authenticate as themselves.

The list itself, however, would now be a major problem. Anyone with access to it could authenticate as any user on any system relying on this authentication service. Worse, since many people use the same password in multiple places, anyone with access to the list could probably also forge authentication on entirely unrelated systems.

All of the managers of all of the systems using the authentication service would inherently have access to the list, so we would have to trust them. Even if we assume that none of these administrators would ever be actively malicious, there would still be the danger that they might accidentally or recklessly leak the list (from a hacked server, on a memory stick, on a laptop left on a train, etc.). So the security of each system under this model would still depend on a whole group of people that each system's administrator has no particular reason to trust. In practice the only option would be to severely restrict the systems that could participate, to the point where such a service could never provide 'universal' authentication for the University.

Option the fourth: distribute a list of user name/'crypted' password pairs
Rather than distributing plaintext passwords, we could distribute them 'non-reversibly encrypted' (or 'hashed') - for example using the 'crypt' or 'md5' password format used in Unix password files. In principle this would make it possible to check that a proffered password is correct (by hashing it and checking that the hashed versions match), without exposing all the passwords on the list.

Unfortunately, hashed passwords can be recovered by a 'dictionary attack', in which an attacker generates a dictionary of words and their hashed equivalents and then searches the hashed passwords for matches. Since users have a bad habit of choosing common words (or trivial variations of them) as passwords, such an attack can be expected to recover a reasonable proportion of passwords. So even with hashed passwords the list would be vulnerable to compromise and would still have to be treated as confidential.
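Both the hashed check and the dictionary attack on it can be sketched in a few lines of Python. Note the assumptions: salted SHA-256 stands in for the Unix 'crypt' or 'md5' formats (the principle is the same), and the user names, passwords and word list are invented for the example.

```python
import hashlib


def hash_password(password: str, salt: str) -> str:
    """Salted SHA-256, standing in for a Unix 'crypt' or 'md5' hash."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def check_password(entry, offered: str) -> bool:
    """Legitimate use: hash the offered password and compare."""
    salt, digest = entry
    return hash_password(offered, salt) == digest


def variations(word: str):
    """A word plus a few of the trivial variations users tend to choose."""
    yield word
    yield word.capitalize()
    for digit in "0123456789":
        yield word + digit


def dictionary_attack(hashed_list, words):
    """Try every variation of every word against every entry on the list."""
    recovered = {}
    for user, (salt, digest) in hashed_list.items():
        for word in words:
            match = next(
                (v for v in variations(word) if hash_password(v, salt) == digest),
                None,
            )
            if match is not None:
                recovered[user] = match
                break
    return recovered


# An invented list of (salt, hash) entries, as it might be distributed.
hashed_list = {
    "alice": ("s1", hash_password("password", "s1")),
    "bob": ("s2", hash_password("letmein1", "s2")),
    "carol": ("s3", hash_password("vX9#qT2!mLz", "s3")),
}

recovered = dictionary_attack(hashed_list, ["dragon", "letmein", "password"])
print(recovered)
```

The attacker never reverses a hash; they just run the same check as a legitimate system, many times over. Only the user with a password that is not in (or near) the dictionary survives.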

In addition, everyone logging in would still have to offer their plaintext password for verification, at which point it would still be vulnerable given a malicious or negligent system administrator. Consider two systems 'A' and 'B', both using such a system and with some users in common. What grounds can system 'A's administrators have for believing that system 'B's administrators will not (accidentally or deliberately) capture, and disclose or use, user name/password pairs that would allow forged logins on system 'A'? What if system 'A' were on the Student Run Computer System web site and system 'B' was the University Financial System, or vice-versa?

There is unfortunately a more insidious problem. If asked for their 'central authentication system password', how could a user know that it is safe to quote it? Passwords are only safe if you don't disclose them, but to use them that is exactly what you have to do. If you disclose them to a malicious system then you've lost. This is the basis of 'phishing' attacks aimed (with some success) at getting people to disclose their electronic banking or mail system passwords. Given all of the following, which are the bogus sites that are just trying to capture your password?

There is no obvious solution to any of this.

It's worth noting in addition that many authentication schemes (particularly those using some sort of challenge-response) need access to plaintext passwords, or at least particular hashes of the passwords, and so can't be supported by any scheme that distributes pre-hashed passwords.
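The reason challenge-response schemes clash with pre-hashed lists can be seen in a minimal sketch. This uses a generic HMAC-based exchange, not any particular real protocol, and the function names and passwords are made up. The server has to compute the same response as the client, so it needs the same secret the client uses.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Server side: a fresh random challenge for each login attempt."""
    return secrets.token_bytes(16)


def client_response(password: str, challenge: bytes) -> bytes:
    """Client side: prove knowledge of the password without sending it."""
    return hmac.new(password.encode("utf-8"), challenge, hashlib.sha256).digest()


def server_check(stored_secret: str, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the response from its own copy of the secret."""
    expected = hmac.new(stored_secret.encode("utf-8"), challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


challenge = make_challenge()
response = client_response("correct horse", challenge)

# A server holding the plaintext password can verify the response...
print(server_check("correct horse", challenge, response))

# ...but one holding only some unrelated hash of it cannot.
stored_hash = hashlib.sha256(b"correct horse").hexdigest()
print(server_check(stored_hash, challenge, response))
```

The password itself never crosses the network, which is the attraction, but the price is that the verifying system must hold the password or something equivalent to it.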

Option the fifth: central password verification
Even if a list could be made to work, distributing it would be difficult, especially if it needs to be done in a timely manner. One solution would be to have a 'central password verification service'. In this model, a system using the authentication service solicits a user name and password for a user and forwards them to the central service which returns a match/no match response. Almost any network protocol could be used for this, but it is common to use LDAP and overload LDAP's 'user login' process to provide authentication. Alternatives include POP, IMAP, RADIUS, and home-grown protocols.

This approach successfully avoids exposing the list of everyone's plaintext or hashed passwords to system administrators (and potentially others) which significantly reduces the exposure. It unfortunately still requires that users' plaintext passwords be disclosed as they authenticate, and does nothing to help users decide when they should and should not quote their password.

Protocols used for remote password verification need to be configured so that they don't expose plaintext passwords on the network, and so the authentication server can't be impersonated and used to collect passwords. Doing this correctly makes things considerably more complicated and is something which is often omitted.

Option the sixth: Kerberos (or similar)
Kerberos was designed precisely to overcome many of these problems. It allows a central verification service to assert that a user knows a password, and so has authenticated themselves, without the user having to disclose their password to anything other than their local workstation. It uses assorted cryptographic sleights of hand to do this and has numerous other important properties, such as preventing the recipient of an assertion from using that to impersonate the user on any other service.

In principle Raven could be extended to become a Kerberos central verification service (a 'KDC'). But using Kerberos comes at a cost. Firstly the user's local workstation needs Kerberos software installed and configured. Secondly the Kerberos protocol is significantly more complicated than a user name/password exchange, so unless the systems to be protected already support it there are likely to be difficulties.

One interesting development is that Microsoft have adopted Kerberos (or at least, a version of Kerberos) for authentication under Windows. For example every Windows Active Directory Domain is also a KDC. By establishing appropriate trust relationships between Windows KDCs throughout the University it might be possible to use a Raven password to authenticate to an Active Directory Domain and then to rely on Kerberos for further onward authentication as and when required. Similar Kerberos-based login arrangements are available for MacOS and Linux. It is possible that in this environment a malicious or inept AD administrator could compromise authentication - further research is required to establish if it is practical.

Kerberos reduces the number of times that passwords need to be entered and so reduces, but does not eliminate, the problem of educating users about when they should and should not provide their passwords when asked. Note that some software, for example this 'Kerberos' module for Apache, will misuse the initial Kerberos user authentication process by soliciting the user's user name and plaintext password and then seeing if it can authenticate as the user. In doing this, such software is just using the Kerberos system as a central password verification service, with all the problems described earlier. Again, it's not obvious how to enable users to safely detect this.

Where does Raven currently fit into this?
The current Ucam WebAuth system used by Raven, and a whole range of similar web-redirect-based systems, depends on the client software in use being a web browser and on browsers including as standard a fortuitous combination of features:
  • Support for HTTP redirects, allowing the client to be instructed to contact the authentication server direct, allowing communication that bypasses the server initiating authentication;
  • Support for the https: protocol, providing both security for the user's password on the wire and a way for users to positively confirm that they are communicating with the real authentication server and not an imposter (and so that it is safe to disclose their password); and
  • Provision of a user interface which can solicit a user name and password.
Ucam WebAuth has some features in common with Kerberos, but without the need to distribute or configure client software, though its use does require much more investment at the server end than a simple user name/password system. It does, crucially, only require the user to disclose their password to the central authentication server and provides a way for users to easily identify it. As a result it, just, manages to provide a reasonably reliable authentication system using passwords, but of course only in a web environment.

Friday 13 November 2009

Secure use of passwords

[This posting was originally published on an internal wiki in 2006 and is republished here to increase its exposure. I believe that what it says remains entirely relevant]

Passwords are about the best tool we have for identifying people on-line. There are alternatives, but they have financial and organisational costs that don't scale to the 30,000 people at the University of Cambridge. Unfortunately passwords are actually a very poor tool for this purpose and there are a number of prerequisites that must be met if they are to work at all. Three related ones interest me in particular - one that many people understand, two that they don't.
Prerequisite 1
Passwords must not travel in clear over insecure networks. Many networks are relatively easy to snoop. Most wireless networks are trivial to snoop and doing so doesn't even need physical access. Anyone who manages to capture a userid and password by snooping can impersonate the user for as long as the userid/password pair remains valid, and may be able to leverage this to gain further privilege. Most people understand this.

Prerequisite 2
Passwords must not be divulged in clear to untrusted systems. It's no good having a secure central password validation service if third-party systems collect userids and passwords and pass them on to the central service. Any one of these systems could maliciously capture userids and passwords, or accidentally or recklessly expose them. Even if access to the central validation system is itself secure, a common failing of such third-parties is for them to cause or encourage passwords to be sent in clear on insecure networks.

Of course it's tempting to say "My system's secure so it's safe for me to do this". The problem is that, given n such systems it's necessary for each to trust the other n - 1. Clearly this doesn't scale much beyond n = 1.
These two can actually be combined: Passwords must not travel in clear over insecure networks or through any system that anyone doesn't trust.
Prerequisite 3
It must be possible for password holders to decide when it is safe to divulge their password. They need to divulge their password to authenticate, but divulging it also makes it possible for others to impersonate them.

A system that only ever requires a password to be entered on one particular secure form on one particular web site goes some way towards meeting this requirement. Even if many people don't understand the issues it is likely that at least some will, and will identify and report other occasions on which they are asked for this password. Such other requests are likely to be dangerous or malicious. As the number of occasions on which a particular password is legitimately requested rises, so does the difficulty of explaining and understanding what these occasions are. Beyond a small number, most people will reflexively enter their password whenever they are asked for it. This will result in an increase in the likelihood of malicious or accidental password interception and so a decrease in the reliability of the authentication system.
Remember, when considering password-based authentication systems, that the important thing isn't that legitimate users get challenged for a password before gaining access, even though this is the behaviour commonly checked by management. The important thing is that people can not realistically gain access using an identity that is not their own, and this isn't the same thing!

Monday 9 November 2009

Bigfoot terminally broken?

The Bigfoot email forwarding service seems to be terminally broken this morning, with all its registered mail hosts returning 'No route to host' when accessed on port 25. Brief Googling suggests this has been going on for a while, and I've just had a message bounced that was originally sent last Wednesday which seems to confirm this.

While I have always used the basic service, and so haven't actually paid anything for it, this level of 'service' is unusable and suggests that Bigfoot don't care about mail forwarding any more. Time I think for my own domain name and the end of a long, and largely happy, relationship with Bigfoot.

Wednesday 4 November 2009

How come Google got it wrong?

We've noticed that various 'Points of Interest' on Google maps in Cambridge city centre are just plain wrong. For example Christ's College appears at the junction of St Andrew's Street and Downing Street when it's actually several hundred meters north on the east side of Hobson's Street. Emmanuel College and the Museum of Archaeology and Anthropology are also in the wrong place, though not so drastically. There are others.

I think Google have Christ's College in the wrong place because they think its post code is CB2 3AR when it should be CB2 3BU (even that's still not ideal for something as big as a college since it identifies the delivery entrance rather than anything suitable for visitors, but it's better than nothing).

For Christ's this seems to be a common error (try Googling for "Christ's College CB2 3AR" to see the extent of the propagation of this error) but I suspect the fact that have got it wrong may be at the root of the problem. It might be possible to get this fixed, though it's going to take a very long time for the fix to propagate.

Emmanuel has a similar problem - Google has their post code as just 'CB2' so they have been put where 'St. Andrews Street CB2' resolves to. It's harder to track down where this error originates, and I haven't tried.

The Museum of Arch and Anth is more interesting. Google have their correct address of "Downing St, Cambridge, Cambridgeshire, CB2 3DZ", and "CB2 3DZ" resolves to their correct location. However "Downing Street, CB2 3DZ" resolves to the incorrect location shown on the map so there is something odd about the way the Google geolocation service handles this particular address (perhaps it doesn't always find "CB2 3DZ" in its database and so falls back to "Downing Street, CB2"?).

These errors are not surprising, because Google isn't a traditional map company. They don't go out and survey anything but just purchase, collect and aggregate data from various sources (TeleAtlas for the streets, their own search index for address information, etc, etc.). Some of this information is wrong, which isn't surprising - what's really surprising is that the results are as good as they are. But it does cause us a very real problem, because some people feel that we can't use anything based on Google maps for as long as it shows well known University locations in the wrong place.

Addition 2009-11-05: A colleague has pointed out that part of the airport at Cambridge is labelled "London Luton Airport".

Tuesday 6 October 2009

Apache SSL/TLS security configuration

Recent installation of a new vulnerability scanner at the University has highlighted that a default-configured Apache supports SSL v2, which reportedly suffers from several cryptographic flaws and has been deprecated for several years, and supports the use of SSL ciphers that are very weak by today's standards.

The danger of dropping support for particular protocols and ciphers is that doing so denies access to any clients that don't support anything else. Ideally you should review your Apache configuration in the light of your security needs and the capabilities of your clients, which obviously only you will know. Failing that, I've reviewed a sample of the logs of the University's Raven server to see what its clients are actually using. This should represent general University client capabilities. Only 17 out of over 3,500,000 connections used SSLv2, all of which looked to be from robots or similar; only 68 of these 3,500,000 connections used ciphers with symmetric key lengths of 56 bits or below.

My conclusion is that, for general use, adding the following to your Apache configuration will provide a reasonable level of security while excluding few if any legitimate visitors:
SSLProtocol All -SSLv2
SSLCipherSuite ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:!LOW:!SSLv2:!EXP
When compared to the Apache default this a) drops SSLv2 while leaving everything else (including future developments); and b) drops the export-crippled ciphers, those using 64 or 56 bit encryption algorithms, and the SSLv2-only ones (since we've dropped SSLv2). Exactly what this will leave you depends on the version of OpenSSL you are using, but you can find out from the openssl command-line utility:
openssl ciphers -v 'ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:!LOW:!SSLv2:!EXP'
On my Ubuntu 8.04 box this leaves
DHE-RSA-AES256-SHA      SSLv3 Kx=DH       Au=RSA  Enc=AES(256)  Mac=SHA1
AES256-SHA              SSLv3 Kx=RSA      Au=RSA  Enc=AES(256)  Mac=SHA1
AES128-SHA              SSLv3 Kx=RSA      Au=RSA  Enc=AES(128)  Mac=SHA1
RC4-SHA                 SSLv3 Kx=RSA      Au=RSA  Enc=RC4(128)  Mac=SHA1
RC4-MD5                 SSLv3 Kx=RSA      Au=RSA  Enc=RC4(128)  Mac=MD5
According to NIST Special Publication 800-57, symmetric keys of at least 112 bits should be generally OK until 2030. Note however that this only applies when used in conjunction with certificates containing asymmetric keys of at least 2048 bits, so it would also be advisable to upgrade any certificates using smaller keys.
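If you'd rather check key strengths programmatically than read the openssl output by eye, something like the following Python sketch may help. It only reports on the OpenSSL library that the local Python happens to be linked against, which is not necessarily the one your Apache uses, so treat it as an illustration of applying the 112-bit threshold rather than an audit of any server. The function name and the MIN_BITS constant are made up for the example.

```python
import ssl

MIN_BITS = 112  # the symmetric key strength threshold discussed above


def strong_ciphers(context: ssl.SSLContext, min_bits: int = MIN_BITS):
    """Names of the context's enabled ciphers with at least min_bits of strength."""
    return [
        cipher["name"]
        for cipher in context.get_ciphers()
        if cipher["strength_bits"] >= min_bits
    ]


ctx = ssl.create_default_context()
all_ciphers = ctx.get_ciphers()
names = strong_ciphers(ctx)

print(len(names), "of", len(all_ciphers), "enabled ciphers meet the threshold")
for name in names:
    print(name)
```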

Wednesday 23 September 2009

Shibbolising Plone 3 - some experiences

In a previous posting I listed two (and a half) options for Shibbolising Plone 3. What follows is a comparison based on what I found when I actually tried installing them into default Plone instances.

[The 'and a half' option was Liberty Alliance / SAML 2 Authentication Plugin for PAS. I'm discounting it for the time being because a) it doesn't look as if it will work with Shib 3, b) there hasn't been a release since December 2008, and c) the most recent release gives '404 Not found' on the Plone web site]

The two remaining options were WebServerAuth and a set of products from Ithaka (AutoUserMakerPASPlugin, ShibbolethLogin, ShibbolethPermissions). They have quite a lot in common (which is not surprising since they share a common ancestor).

Common features
  • Both depend on the Apache Shibboleth module to perform authentication, and so require connections to Plone to be proxied through Apache.

  • For this to work, server variables provided by the Shib module have to be converted to headers in the proxied request - doing so is a matter of Apache configuration. It is important to ensure that unauthenticated users can't spoof any such headers used for anything related to security.

  • Both will allow Plone access by anyone who can successfully authenticate - if this isn't appropriate then Apache access controls could be used to limit this further (and WebServerAuth has an additional option).

  • Both require a unique user identifier. It is easiest to use the value in REMOTE_USER, which is set by default by the Shib module to the first non-blank value in eduPersonPrincipalName or the SAML 2 or SAML 1 versions of eduPersonTargetedID. This leads to some unwieldy userIDs - both products can strip domain names from IDs, but this removes any guarantee of uniqueness.

  • Both force authentication for an https: version of the site while allowing unauthenticated access to the http: version. While Shib's 'Lazy Session' features might make this unnecessary, both products will in principle work with other webserver-based authentication schemes that may not support anything similar.

  • For both, it's advisable to suppress the default 'login' portlet that is displayed to unauthenticated users.
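
To illustrate the point above about spoofed headers: whatever sits in front of Plone has to discard any client-supplied copy of a trusted header before setting it from the Shib module's server variables. In practice this is done in the Apache configuration; the Python below is only a sketch of the logic, and the header name X_REMOTE_USER and the function name are assumptions made for illustration.

```python
# Hypothetical header name; use whatever the Plone-side plugin is set to read.
TRUSTED_HEADERS = {"x_remote_user"}


def build_proxied_headers(client_headers, server_vars):
    """Return the headers to forward to Plone.

    client_headers: headers as sent by the browser (untrusted).
    server_vars: variables set by the authentication module (trusted).
    """
    # Drop every client-supplied copy of a trusted header, whatever its case
    # and whether it uses '-' or '_' as a separator.
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower().replace("-", "_") not in TRUSTED_HEADERS
    }
    # Only now add the value that the authentication module vouches for.
    remote_user = server_vars.get("REMOTE_USER")
    if remote_user:
        forwarded["X_REMOTE_USER"] = remote_user
    return forwarded
```

The order matters: removing first and setting second means an unauthenticated request can never carry a value for the trusted header through to Plone.
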
WebServerAuth features
  • WebServerAuth doesn't create users in the database - by default they just get assigned the 'Authenticated' role (not 'Member') when they log in. It uses the userID (so eduPersonPrincipalName or eduPersonTargetedID, optionally with domains stripped) as the user's Full Name - it doesn't use any other Shib attributes even if they are available.

  • Users requiring additional rights can be created in the Plone database and appropriate rights will be assigned when the corresponding users authenticate. Creating Plone users with IDs containing '@' and '.' requires a long-winded hack.

  • WebServerAuth can be configured only to authenticate users who have corresponding Plone accounts, but the user interface is sub-optimal: people will log in and apparently succeed, only to be greeted with a Plone page that still has a "Log In" link.

  • By default, WebServerAuth redirects to the https: version of the current page when authentication is required (and modifies the standard 'log in' link to achieve this). This will cause the Apache module's default SessionInitiator to be used for authentication. Optionally WebServerAuth can redirect to a customised URL which could perhaps be used to implement a local WAYF service (e.g. simplified login for local users; redirect to federation WAYF for others). Essentially WebServerAuth takes over all login access to the site (which can be problematic if it fails...).

  • WebServerAuth is under current development - its author contacted me within hours of my posting my earlier article to point out an error with it.

  • The standard logout link (ZMI --> Plone --> portal_actions --> user --> logout --> URL) needs to be set to something appropriate (or perhaps the link should be suppressed altogether?)
Ithaka product features
  • AutoUserMakerPASPlugin creates real Plone users as new people authenticate. It can use configurable Shib attributes to initialise full name, email, location, roles and group membership. This only applies to user creation - once created, changes to attributes are not propagated and the users need to be managed on Plone. It seems necessary to configure full name to fall back to userID if nothing else is available to avoid all such users ending up with a full name of '(null)'.

  • AutoUserMakerPASPlugin can be configured to strip either all or selected domains from userIDs - one approach might be to strip just a local domain, giving local users short userIDs while ensuring against name clashes with anyone else.

  • AutoUserMakerPASPlugin supports a configurable 'logout' URL that can invoke the local Shib module logout function. This may be confusing in a wider single sign-on context (but is handy for testing).

  • ShibbolethPermissions adds the ability to add local (per page or per container) access rights based on configurable Shib attributes. Again, this only applies at account creation time, after which rights have to be managed manually within Plone as usual.

  • ShibbolethLogin installs a replacement for the standard login page. This includes configurable links either to local SessionInitiator URLs or direct to Shib 1 IdPs or WAYFs. So for example it's possible to have a 'Log in with a Raven (University of Cambridge) user id' link for local users and a more general 'Log in with a different UK Federation user id' link for everyone else. Local Plone user authentication (with username/password) remains available. ShibbolethLogin appears to be targeted at Shib 1 functionality, with no obvious support for new Shib 2 functionality, such as the Discovery Service.

  • The maintenance status of the Ithaka products is unclear. Documentation reports the most recent tests to have been with "Zope 2.10.5 and Plone 3.0.6"; files in the distribution appear to have been modified most recently in May 2008.
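
The userID and full-name handling described in the list above can be sketched in a few lines of Python. This is not the plugin's code: the function names, the 'displayName' attribute and the example domains are assumptions made purely for illustration.

```python
def strip_domains(user_id, domains=None):
    """Strip the '@domain' part of user_id.

    domains=None strips any domain (short IDs, but uniqueness is lost);
    otherwise only the listed domains are stripped.
    """
    local, sep, domain = user_id.partition("@")
    if not sep:
        return user_id  # no domain part, nothing to strip
    if domains is None or domain in domains:
        return local
    return user_id


def initial_full_name(user_id, attributes, attr="displayName"):
    """Pick a full name at account creation, falling back to the userID."""
    return attributes.get(attr) or user_id
```

Stripping only a local domain, for example strip_domains("abc123@example.ac.uk", {"example.ac.uk"}), gives local users short IDs while an "abc123" from any other domain keeps its full, distinct ID. Stripping every domain maps both to the same "abc123", which is the name clash to avoid.
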

Tuesday 15 September 2009

Federated AUTH is less than half the battle

Federated web authentication, as provided by Shibboleth, OpenID, Pubcookie, Cambridge's Ucam WebAuth (a.k.a. Raven), etc., is less than half the battle of adapting an application to work in a federated environment. Much the harder part involves the related issues of authorisation and account provision.

Most existing applications assume that they have some sort of account for every user who can authenticate, not least because traditionally they need somewhere to store user names and passwords. In the federated authentication world this won't always be true, since lots of people that the application has never heard of will be able to authenticate.

Approaches to dealing with this include:

Require an account to exist locally before a user can authenticate

If handled manually, this can impose an administrative overhead which will only be manageable for a small user base. Alternatively accounts could be provisioned automatically, but only if the expected set of users can be identified in advance. It is often hard to adapt existing software so that it doesn't fail badly when presented with an authenticated user who doesn't have an account.

Users may have to tell administrators how their identity provider will identify them and this could be a problem - in Shib's case it is currently unlikely that users will know their eduPersonPrincipalName (even if it is available), and they can't predict their eduPersonTargetedID. An authenticated 'Application' page that captures available attributes might be a way around this.

Automatically provision an account when a user first authenticates

Doing this entirely automatically assumes that information is available about the user, and so assumes either Shib-like attribute provision or access to additional data in something like an LDAP directory. It also assumes that enough information is available to create accounts - some existing applications positively require things like forename and surname, which is problematic if these are not available. Alternatively or additionally, users can be prompted for additional information, though obviously this is less reliable and implementing such prompting can be difficult.

However it is also necessary to decide who will get such accounts, since it's unlikely that you will actually want to provision an account for everyone who can successfully authenticate, especially if such accounts come with some basic privileges. So some sort of rule engine will be needed to define who does and doesn't get an account, and perhaps what default privileges they get. This will have to be driven by Shib-like attributes or information from directories. For obvious (I hope) reasons, such decisions probably can't be taken based on information supplied by the users themselves.
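
As a rough sketch of what such a rule engine might look like, the Python below maps trusted attributes to a decision about whether to create an account and which default roles to grant. The rule format, the attribute name 'affiliation', the example values and the role names are all assumptions for illustration, not taken from any particular product.

```python
# Each rule is (predicate over trusted attributes, roles to grant).
# Attribute names and values here are hypothetical.
RULES = [
    (lambda a: "staff@example.ac.uk" in a.get("affiliation", ()),
     {"Member", "Contributor"}),
    (lambda a: "member@example.ac.uk" in a.get("affiliation", ()),
     {"Member"}),
]


def decide_account(attributes, rules=RULES):
    """Return the default roles for a new account, or None for no account.

    attributes must come from the identity provider or a directory,
    never from values typed in by the user.
    """
    for predicate, roles in rules:
        if predicate(attributes):
            return set(roles)  # first matching rule wins
    return None
```

Anyone who authenticates but matches no rule gets None, i.e. no account, which keeps 'can authenticate' separate from 'gets access'.
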

Dynamically log users in without creating accounts for them

In some applications it's possible for a user to appear 'logged-in' without having a corresponding user account. One example I happen to know about is Plone using the WebServerAuth product. This simplifies things somewhat, though you still need to address the question of who gets what access by default. You probably need some way to manually grant additional access to a limited number of people where the people and the access that they need can't be established based on attribute or directory data - for them, some sort of local account bound to their federated identity will probably still be needed.

Monday 14 September 2009

Shibbolising Plone 3 - a review

There are lots of Google references to Shibbolising Plone, but it's not clear how many of them apply to Plone 3 (rather than Plone 2 or earlier). All seem to rely on getting Apache to implement the Shibboleth protection, and then passing identity information over to Plone for it to use.

It looks as if the options include
which seem to have been replaced by
all from (or at least related to) the WebLion project at Penn State. There's a useful page on WebServerAuth on the WebLion site.

Alternatively there are three extensions from Ithaka (AutoUserMakerPASPlugin, ShibbolethLogin and ShibbolethPermissions), which are described in this slide set and this article.

It looks as if the Ithaka solutions actually provision Plone accounts for Shib-authenticated visitors, while the WebLion products give such users 'authenticated' state without creating accounts for them. I can see pros and cons for both approaches. WebServerAuth is being actively developed (last release August 2009); none of the others look as if they are very actively maintained. The Ithaka products apparently work with at least Plone 3.0.6; apachepass only claims to work with Plone 2.5; Auto Member Maker and WebServerAuth apparently work with Plone 3.

Note: updated 2009-09-15 to correct the development status of WebServerAuth in the light of comments by Erik Rose.

Note also that the Liberty Alliance / SAML 2 Authentication Plugin for PAS might be relevant, if it's sufficiently flexible to talk Shib.

See also Shibbolising Plone 3 - some experiences