Storing Information in the Cloud Unconference

One of the first things to strike me about this meeting was that it was held in the old UMIST building, a fantastic piece of Victorian Gothic complete with a huge hall and stained glass windows.  It’s certainly a fitting venue for a meeting that brought together records managers, suppliers, archives managers, local councils, cloud experts and publicly listed companies to discuss what storing information in the cloud means for them, but more on that later.  The unconference was part of a project run by the Department of Information Studies at Aberystwyth University and funded by the Society of Archivists, looking into the security, operational and governance issues of storing information in the cloud.  The aim of the one-day, workshop-based event was to generate debate and to highlight some of the security and governance issues surrounding the storage of information in a virtual environment.  Very helpfully, they used the hashtag #soacloud for the day and, I seem to remember, were going to archive the tweets on Twapper Keeper, so I’ll record my impressions here but please do see the feed for the ‘official’ notes.

So, one of the first jobs was for everyone to introduce themselves.  A few things were particularly noticeable:

– The diversity of job roles represented at the meeting.  As you can see from above, there was a wide range of people in the room.  It is fair to say they were mostly from a records management background, but that was probably to be expected given the remit of the day and the speakers;

– New titles – many people had new titles or had moved departments, and this was particularly the case in universities.  Conversations over coffee revealed that many people had changed their title to something like, say, digital archives manager, but that practice at their institution and support for the new demands of the role were not changing as quickly, leading to a challenging environment;

– There was a range of experience with the cloud; most people were fairly new to it and had come to the meeting to find out more and explore the challenges and opportunities it represents.  Very few had extensive experience, and that was mostly from either selling cloud solutions or working in a consultancy or research role.

The unconference went on to explore three different areas, so I’ll structure my post around them, although it is fair to say that the format and the active discussion amongst participants meant there were often not neat boundaries between them.

Security and Legal Issues

I’d like to state for this section that I am not a legal expert and that what follows are discussion points and comment.  The reader should not act in any way on the basis of the information below without seeking, where necessary, appropriate professional advice concerning their own individual circumstances.

One of the first key discussion points was raised here: the perpetual issue of how you define the cloud.  What everyone appeared to agree on was that it is important to be clear about the definition you use, that there are a lot of definitions around, and that the definition you choose very much affects everything else you say about it.  In this session, the cloud was defined by the issues, so the focus was on the cloud as it exists on the open internet rather than private clouds such as the G-Cloud.  My personal opinion is that it would be better to stick with the NIST definition (which describes cloud computing in terms of five essential characteristics, three service models and four deployment models) so we’re all talking about the same thing, and then specify which aspects you want to cover, rather than inventing a new definition; I find that useful and it prevents starting every discussion with a long debate about definitions.

Most of this session focused on a legal assessment of the risks of using the cloud, on the basis that there were already plenty of people willing to promote the positives.  Many people, myself included, stressed that there was a need for balance and that presenting all the negatives was just as bad as presenting all the positives.  So, the points in summary were:

– There is a risk versus value equation in purchasing a cloud service, and consumers are very reliant on trust in the event of the service not being available.  Under a standard contract with most cloud providers there is not much acceptance of risk, and quite often no acceptance at all.  It was quite evident that this caused the legal profession a great deal of concern compared to a normal outsourcing contract or providing the service internally.  Interestingly, though, many contributors in the room reported that universities had negotiated their own contracts with cloud service providers, giving them the acceptance of risk they felt was appropriate, and there was a feeling that sharing that experience with others would lead to cloud providers amending their terms appropriately.  A good example raised was Leeds Metropolitan University negotiating its own terms for the use of Google Apps (further information on Leeds Met’s use of Google Apps is here).  This could be an interesting trend to follow as I feel it goes right to the heart of institutional adoption of the cloud.  There were also many in the room who felt a number of solutions were on offer.  As mentioned, some institutions are likely to see value in doing their own legal work given the amount they could save.  There is also potential for collective negotiation, where a number of organisations or institutions agree common terms with cloud providers.  An interesting scenario presented was whether there would be some trickle-down: the terms in place now reflect what cloud providers are used to, so if they had enough customers of high enough value, would a revised SLA be possible at a slightly greater cost for the cloud service?  On a final note, many argued that it was important to understand what it means to get services from a cloud provider, as often the service offered is better than that provided internally.  Internal services also often don’t come with an SLA or any acceptance of risk;

– The cloud is very easy to get into, whether that be storing data in it as a service or storing data as part of a service that is offered (e-mail is a very good example of this).  This is not necessarily a good thing, particularly in light of the standard (and, it has to be said, in some ways understandable) answer of ‘no’ from legal and IS departments when end users ask whether they can use the cloud, which tends to lead to users doing it anyway.  What was also highlighted is that those tasked with information management of whatever sort are very often not even consulted, which can potentially be even more catastrophic as information both leaves the organisation and is generated outside the organisation, possibly never to return.  This may seem relatively trivial, but data and information are increasingly valuable, whether for re-use, exploitation of IP, transfer into other areas such as industry, or to ensure they are appropriately archived for future use in all these areas.  Many reported existing benign neglect around information management that they felt could get worse as the cloud freed people up to find their own solutions that worked for them but not necessarily for the greater good.  Several felt the answer was to say ‘maybe’ or ‘yes, and this provider is great’, and for information management professionals to talk more about what they do and raise awareness of the benefit they bring;

– One risk highlighted that I hadn’t even considered on the legal front was data being in the cloud temporarily for processing, and what happens to it after that.  We didn’t really explore this but I think it’s an area that bears more analysis;

– The Data Protection Act (DPA) raised its head, as it always does in any discussion involving government (local and national) and universities, because of the high risk of non-compliance and the difficulty of meeting its provisions if data is stored in the cloud (bearing in mind that the cloud was defined here as ‘on the internet’, so this excludes a private cloud or the proposed government cloud).  There were questions as to whether cloud providers could be certified and whether that would make a difference to compliance; there are certainly examples from the US of providers being certified under HIPAA for storage of health data.  The main conclusion here was to make a deliberate decision as to what data goes into the cloud and is processed there.
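
To make that last point a little more concrete, here is a minimal sketch (in Python, with made-up classification labels and rules rather than anything from a real policy) of the kind of decision gate an organisation might put in front of cloud storage:

```python
# Illustrative sketch only: a minimal "decision gate" for what data may go to
# the cloud. The classification levels and rules are invented for the example,
# not taken from any real policy.

CLOUD_ALLOWED = {"public", "internal"}       # assumed policy: low-risk classes only
CLOUD_BLOCKED = {"personal", "sensitive"}    # e.g. anything covered by the DPA

def may_store_in_cloud(record):
    """Return True if the record's classification permits cloud storage."""
    classification = record.get("classification", "unclassified")
    if classification in CLOUD_BLOCKED:
        return False
    return classification in CLOUD_ALLOWED

records = [
    {"id": 1, "classification": "public"},
    {"id": 2, "classification": "personal"},   # DPA-relevant, keep in-house
]
for r in records:
    destination = "cloud" if may_store_in_cloud(r) else "internal storage"
    print(f"record {r['id']} -> {destination}")
```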

Records Management in the Cloud

The ever jovial Steve Bailey took a wry look at the implications of the cloud for records management.  His driving premise was that every principle records managers had been able to hold to, even through the storage of records on electronic media, is broken apart by the cloud, because there is no longer one physical location where records can go to be managed (even records residing on a server in electronic form are in one physical location).

The main points of this session were:

– Records managers know very little about managing records in the cloud;

– There is a difference between those who are interested in developing and maintaining a product in the cloud and those whose primary interest is preserving what is in or on that product.  With content fragmented across platforms, it is becoming increasingly difficult to search for records in one place.  One subject may have a blog, photos online, mail conversations with many different providers in the cloud and discussion documents elsewhere.  Contrast this with Bailey’s example of Samuel Pepys’s papers, which can be found in one place and are professionally archived for future generations.  An interesting quote was ‘maybe Google doesn’t want to be the world’s archivist’;

– Is there a place for an information administrator in this world, similar to the financial administrator appointed when a company goes under?  The information administrator could make sure that the relevant information is taken and appropriately archived for future use.  Another thought that struck me whilst discussing this was whether we had thought about what happens to all the material we now generate in a Web 2.0 world and how long it stays valid; certainly something to raise with colleagues such as Neil Grindley, who is heading up JISC’s digital preservation and archiving work;

– On the back of this, Bailey suggested that it may be beneficial to have a publicly funded web repository for all this stuff so that it could be archived safely.  A contemporary example: who is going to remember what the BIS website looked like before the change of government?  Does anyone care?  Should they?  There was some interest in the Content Management Interoperability Services (CMIS) specification, which could help in this area by making content interoperable and therefore easier to save (there’s a small illustrative sketch of talking to a CMIS repository after this list).  Certainly the vendors were starting to see some join-up around it and mentioned that tenders were starting to request it;

– Linked data, as always, got a mention, but a very brief one, which I think was driven in part by the people in the room.  It’s maybe a topic that could be covered further (or maybe already has been) in terms of its use for finding and helping to preserve records;

– Another good question from Bailey was what would happen if some of the big providers started charging; would we start to realise the value of records, and of the records manager as a decision maker on what to keep and what to get rid of, if we had to face that choice;

– The final question was: what is the difference between records and information?  Users often do not want records management and yet they often need it, unfortunately often after the fact.  Maybe more needs to be done to get users to appreciate that, and to go to them rather than expecting them to arrive at the records manager’s desk.  Once that starts to happen, then perhaps records management will start to get recognised at a senior level and there will be an appreciation that information management is just as vital a consideration as other business drivers when procuring systems.  Bailey concluded by putting forward the proposition that maybe there aren’t records managers any more and that the role has become that of an information manager, with which there seemed to be broad agreement in the room.
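
As a small illustration of the CMIS point above, here is a sketch of what asking a CMIS repository to describe itself looks like over the AtomPub binding.  The endpoint URL and credentials are placeholders rather than a real service:

```python
# Illustrative sketch only: fetching the AtomPub service document from a CMIS
# repository to discover what it exposes. The URL and credentials are
# placeholders, not a real endpoint.
import xml.etree.ElementTree as ET
import requests

SERVICE_URL = "https://repository.example.ac.uk/cmis/atom"  # hypothetical endpoint

response = requests.get(SERVICE_URL, auth=("archivist", "secret"), timeout=10)
response.raise_for_status()

# The service document is an AtomPub workspace listing; each workspace is a repository.
ns = {"app": "http://www.w3.org/2007/app", "atom": "http://www.w3.org/2005/Atom"}
root = ET.fromstring(response.content)
for workspace in root.findall("app:workspace", ns):
    title = workspace.find("atom:title", ns)
    print("Repository:", title.text if title is not None else "(untitled)")
```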

Cloud and Security

Paul Miller’s presentation went down a well-worn road for me so I’ll let you pick it up from the Twitter feed.  The key points he made that stood out for me were:

– Software as a Service (SaaS) started off by providing most of what most people needed most of the time, targeting those who were never going to use that service outside the cloud.  So, Google Docs is OK as a basic word processor, but I am not going to leave MS Word for it.  Interestingly, though, SaaS is now starting to offer features that the incumbents have always had and is catering for more of what users want.  So where next?  There weren’t any answers, but it was an interesting thought experiment as to what could happen;

– Cloud SaaS offers rapid iteration, which is good for the user but bad for those supporting the user.  I’m not too sure that it is even good for the user.  Sure they get lots of new features but do they know what they are?  Do they suffer from feature deluge?  I’d certainly agree that support can be problematic if new features arrive without the time to develop training for users and appropriate knowledge in a support team;

– Understanding what you put in the cloud is vital.  There is a perception that the cloud is insecure, based on the security model you would want for personal data, but if what you are putting there is ideas for papers or general ‘fluff’, do you really need high levels of security?  I think this proved problematic for the audience, but I can see Paul’s point and mostly agreed with it.  An interesting counter-argument was that the reason many staff did not have e-mail in the cloud whilst their students did was that staff have a legal link to their institution: whilst staff interests might be served by going to the cloud, maybe those of the institution weren’t, and selecting which data mattered to the institution was just too difficult;

– An interesting point from earlier was a discussion of how auditing could be used to verify the security of a cloud facility.  Many argued that a cloud facility could be a lot more secure than an internal facility and that auditing, to some extent, reduced that security because it meant allowing people into the data centre who could then potentially compromise it, even unintentionally.  Do we need certification to overcome this need for an independent audit of security?  Which ones do we trust?  Is audit very much dependent on what the use is, so do we need to audit differently for different uses?  Does it matter that we know someone who has ‘touched the hardware’, or are we happy to trust that the provider has carried out the appropriate checks?  There is also the question of distinguishing between access, storage and use – audit often needs to answer who is doing these things, and in a virtual environment is this even possible?

– The penultimate point to run through is the fear of having your information in the cloud, out there available for all to get at.  There were a number of good arguments that the sheer volume of stuff in the cloud tended to militate against anyone being able to find what was yours – a sort of security by obscurity.  This even applied to agencies with legal powers to look at what you had (there is often concern, for example, that the US spies on data held in facilities located there).  Another mitigating factor mentioned was the level of interest these agencies have in data located in the cloud; going back to an earlier argument, what is held is often a large volume of stuff that no one other than the user has much interest in.  If that can be held in the cloud then it makes a lot of sense.  The more sensitive data can then be held elsewhere;

– A good point to finish off this section was about what people expect of security and their perception of security as opposed to the reality of it.  There was a great meme in here about trust: even if an institution has many users on its network, making it less secure, it is something they own and feel they trust more than a network they do not own that has far fewer users on it.  The harsh fact is that most networks and servers are compromised by people, and the more of them you have with access, the more likely it is to happen.  In a lot of ways, I think the cloud is a very secure place, but it is vital to ask the right questions so that you can be sure about those who have been trusted with what you put there.

So, I promised at the start of the post to cover off why the building was so appropriate, and I’ll do that here in conclusion.  When the UMIST building opened on Sackville Street in Manchester in 1902, it represented a triumph of knowledge and certainty.  Everything generated in the building was appropriately filed and managed, and you can no doubt still find it today as a result of the people who sifted records from information and diligently archived what was important.  Are we going to be able to come back to the same building in 100 years’ time and have that same certainty about finding information that we’ve stored in the cloud?

mashup Event: Location: It’s moving on…

Perhaps somewhat ironically, finding the mashup event on location was difficult, even with a smartphone, which brought out some of the pros and cons of using location.  It was raining hard, so rather than juggle my umbrella and my phone I had a quick look at the map on my phone before I got out of the tube station at Aldgate and then didn’t refer to it until I was lost.

Onto the meeting itself, and it was the usual vibrant mix of entrepreneurs and those interested in the subject material.  That meant a lively meeting with plenty of audience questions and some thought-provoking views from the panel, which included Vodafone, Yahoo, Broadsight and Rummble (hope I have that name right).

Broad points that came out of it that need to be considered in academia were, I think:

  • It’s going to be the tools in the background such as OpenStreetMap and GeoNames that do the heavy lifting to make location successful (or not).  This could equally apply to the geo services we run in education such as the Geo suite from EDINA;
  • Audience is nothing without data.  It’s notable that the services that have started up and failed are the ones that provide a service but don’t get data from users.  I think this could apply to academia too.  Whilst services need to provide value to the user, they also need some route to being sustainable, and having data about what users are doing is going to be valuable in providing this, though in a different way to the commercial sector.  Academic geo services are going to have to work out who wants this data and what they are going to do with it in a way that is acceptable to users (for more on this see below);
  • Where you have been is more important than where you are.  Linked to the above, location data is useful over time but is a lot less useful when you have a very slender slice of time over which it is measured (i.e. now!).  For academic use, I think this is particularly important to bear in mind as there is potential to use geo data for one application and then use the derived usage data set for an entirely different application – that derived set is only going to get more useful as more time is spent on the original application.  As an example, many campuses are now using interactive campus maps that use geo services and pass back a user’s location.  Analysing this data could well be very useful for a range of applications such as understanding human searching behaviour in a physical environment or even analysing the physics of crowds;
  • It is vital to understand what the use case is for location and then bring the tools to the use case rather than bringing the use cases to the tools.  It sounds simple but how many times have we done the latter in higher education?
  • It is easier to move users to new technology such as geo-location in small steps than to move them in one big step.  Again, this seems a relatively simple assertion but it is immensely useful in terms of how geo-location is introduced.  Taking our campus map example above, it has a use case, the applications often sit on top of what students already have in terms of hardware and it is building on what they use already.  More complex geo apps can come over time;
  • Users are likely to want applications where they can see other users and yet can’t themselves be seen.  There’s an obvious flaw in this, but it’s something that I would imagine is going to be a particular problem in academia given concerns over privacy.  There needs to be some resolution to it, which could be aggregation via anonymisation or what seems to be becoming a standard approach of giving granular controls over who can see where you are and how accurately, ranging from city level right down to wherever the device they are carrying can locate the person (there’s a small illustrative sketch of this granular approach after this list);
  • The Janus face of location – there are good and bad sides to geolocation.  It’s likely the bad side is going to cause more press and raise more concerns.  Robmyhouse.com has been the most recent, where those who share their location on Foursquare have had that data used to show when they are away from their house.  There are obviously patterns that can be discerned here too.  However, the flip side is that any would-be criminal could far more effectively do this by simply sitting outside someone’s house so it’s important to appreciate the risks but not overstate them;
  • Privacy of data v usefulness of releasing it – as with all business events, there had to be an equation.  This one is going to prove important in academia.  There will be a sweet spot where the value of the information users give away is balanced by the utility of what they get out of the application.  Again, a bit of a no-brainer, but in the time of fEC and reduced spending, any application or service, and particularly new ones, is going to have to prove its worth to stay sustainable.  It will be interesting to see whether this sweet spot is in a different place for academia and what the trade-off is.  In the commercial world it is relatively easy to see that a large number of people are happy to give up, say, their shopping habits to well-known supermarkets for vouchers.  What will be the equivalent in HE and FE?
  • Smartphones aren’t it – we keep talking about what devices apps are going to run on, and what we have seen at JISC so far is that smartphones have been a popular platform.  However, only a small percentage of the population is going to have a smartphone, so what happens for the others?  That’s going to need some thought: if we want equal access, it’s a question that needs resolving.  A thought-provoking insight from the panel was to ask how many people carried an Oyster card and then to reveal that it is a location device that benefits both you and the provider.  Could access cards in institutions prove to be the first location device that is broadly adopted on campus, and what would they be used for?
  • Aggregators.  The panel argued that the iPhone is a good example of this.  Apple didn’t provide anything novel in the individual features; they’d all been seen before.  What was novel was how it was brought together.  There was a strong feeling that this moment was yet to happen for location but once it had happened it could have a large impact;
  • Beware what location data can be used for – the panel raised the point that Google carried out the Street View project and then shortly afterwards ended its relationship with Tele Atlas for map data.  Whilst location data can be obtained for one use, it could prove to be useful elsewhere.  Perhaps something to bear in mind for academia working with commercial providers.
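
As promised above, here is a minimal sketch of the granular-control idea – letting the user choose how precisely their position is shared simply by truncating the coordinates.  The precision levels are arbitrary choices for the example:

```python
# Illustrative sketch only: one way to offer granular control over how
# precisely a location is shared, by rounding coordinates. The precision
# levels are arbitrary choices for the example.

GRANULARITY = {
    "exact": 5,          # ~1 m at these latitudes
    "street": 3,         # ~100 m
    "neighbourhood": 2,  # ~1 km
    "city": 1,           # ~10 km
}

def share_location(lat, lon, level="city"):
    """Round a position to the precision the user has chosen to reveal."""
    places = GRANULARITY[level]
    return round(lat, places), round(lon, places)

# A user in central Manchester sharing only at city level:
print(share_location(53.4794, -2.2453, "city"))    # (53.5, -2.2)
print(share_location(53.4794, -2.2453, "exact"))   # (53.4794, -2.2453)
```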

mashup Augmented Reality Event, 2009

I went along to the Augmented Reality (AR) mashup event yesterday evening to see what was happening outside the education sector for augmented reality.  JISC has been involved in a number of projects using smartphones, and the Walking through Time project is looking at using AR to show users what streets used to look like (see the project video for what it is doing).  I can see that JISC are likely to do more with AR.  The uses in research, particularly, could be very exciting.  Imagine an archaeology dig, for example, where you could have layer and finds information overlaid on what you were seeing through the camera on your smartphone.

The first thing that struck me about the event was the popularity and range of people there.  When I arrived at 6pm, there was a queue at the reception desk of those who had turned up on the night and failed to get a place and the attendees included small private companies, large multinationals, government agencies and representatives of national interest groups.

The event had a multitude of themes running through it both in the room and on a lively Twitter backchannel (#mashupevent).  I’ve tried to pick out a few of those below that I think are particularly relevant:

– What do we mean by AR?  This was an interesting question because some apps are fairly close to traditional multimedia apps.  It was also useful to be reminded that AR has been around for 15-20 years.  The reason for the current excitement is the potential of putting it into people’s hands through their phones and laptops – devices they have easy access to rather than expensive specialist devices used only for AR.  The eventual definition that the meeting seemed to settle on was any application that adds information to what the user is currently seeing and so augments their reality.  I think it’s going to be important to settle on one meaning and work with that, but not to spend too much effort getting there;

– AR has the potential to excite users because of its very visual nature.  This has both an upside and a downside.  What we’re seeing at the moment looks spectacular, but how useful is it?  That is a particular concern when we get to education.  I loved what companies such as Total Immersion were doing, using 2D images on paper to trigger the generation of 3D images on screen; it grabs your eye immediately.  But there was a question of whether the dull but useful apps, such as Connected’s education app that reads barcodes and then triggers media on PlayStation Portables (PSPs), would ultimately have more impact.  My feeling is that researchers would probably want functionality first.  Another interesting reflection was on not creating a PR soufflé – one company had Stephen Fry endorsing its product but still hasn’t had it approved by Apple, meaning its consumers are excited but don’t have anything to buy.  I think that is an equally useful lesson for JISC: if we develop some cool AR apps then we need to make sure they can be available if they are taken beyond prototype;

– AR is a medium but not an end in itself.  There was a feeling that what we are seeing at the moment are more gimmicks than solid apps because the apps focus on AR.  Indeed, one brand advisor is more concerned with telling companies NOT to develop an AR app for their brand!  It was agreed that there needs to be a focus on solving the problem by blending AR into the solution.  I think this has a lot of resonance in research applications and there is an argument to explore AR on a small scale at the moment and wait for it to mature before committing to larger projects;

– For AR to be successful, it needs the hardware to run it.  What was noted was that this did not necessarily mean smartphones, and one person even suggested that there may be dedicated AR phones.  My personal feeling is that for the hardware to be rolled out on a mass scale, devices need to be cheap, they need to be carried by a large retailer and they need an indispensable AR app.  A good example of how this is being done in another area is INQ’s Facebook phone, which is now being carried by a major supermarket retailer in the US;

– Data was a very interesting topic and unfortunately not one that got much discussion.  AR can give a lot to the user, but I think Dan Rickman, the Chair of the BCS Geospatial Group, had a good point when he said that metadata would prove crucial.  I think having attention data geo-located could prove immensely important, bringing space into the equation and allowing us to personalise information and bring it to the researcher based on where they are; it would also make the time dimension more valuable.  There were some apps that showed how useful this could be even without attention data, including one that shows the nearest tube stations and directions overlaid on your cameraphone view (NearestTube by AcrossAir) – there’s a small sketch of the distance-and-bearing sums such an overlay needs after this list.  This also brings up privacy and how a user controls what data they share about where they are.  Again, a fairly short discussion on this at the meeting, but a major issue to be addressed considering many institutions still face considerable challenges in how basic personal data is managed;

– Standards were another area that was only briefly touched on.  Somewhat worryingly, some of the panel felt it was important to develop proprietary apps first and then get to standards from there.  I think there needs to be a bit more work here, as otherwise we’re going to end up with a mess of apps that don’t interoperate.  Sure, build the standards alongside practical experience, but don’t wait until there is a wealth of practical experience and then try to build from there;
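
As a footnote to the NearestTube example above, here is a small sketch of the sort of sums an AR overlay needs – the distance and compass bearing from the user to a point of interest.  The coordinates are just sample values:

```python
# Illustrative sketch only: the kind of calculation an AR overlay app (like the
# nearest-tube example above) needs - distance and compass bearing from the
# user's position to a point of interest. Coordinates are just sample values.
from math import radians, degrees, sin, cos, atan2, sqrt

def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (metres) and initial bearing (degrees) between two points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    # Haversine distance
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    dist = 6371000 * 2 * atan2(sqrt(a), sqrt(1 - a))
    # Initial bearing from point 1 towards point 2
    y = sin(dlmb) * cos(phi2)
    x = cos(phi1) * sin(phi2) - sin(phi1) * cos(phi2) * cos(dlmb)
    bearing = (degrees(atan2(y, x)) + 360) % 360
    return dist, bearing

# Sample user position in London and a nearby point of interest (sample coords)
d, b = distance_and_bearing(51.5154, -0.0722, 51.5133, -0.0890)
print(f"{d:.0f} m away, bearing {b:.0f} degrees")
```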

An interesting comment from one of the panellists, to sum up, was ‘augmented reality is about looking through the window and not just looking at the window’.  I think when we can get to that state then AR will have truly taken off.  Until then, there is quite a bit more exploring to do to make it practically useful for researchers.

As You Like It: Identity at JISC Conference 09

So, we had a great session at the JISC Conference 2009 on identity, and both Lawrie and I were extremely pleased with the audience feedback and involvement we had.  I won’t repeat the notes of what was discussed, as the very diligent and accurate roving JISC blogger assigned to our session took those and they are here (many thanks for that).

What I will do is to go through what I got out of the session, a few more resources on identity and some of the material we didn’t get to cover.

So, starting on what I got out of the session:

– People are thinking a lot more about identity issues in academia; when we first started talking about identity in groups (a good example being the first blog post I ever did for JISC), we had a large number of listeners and fewer participants.  Great to see that more people have gone to find out more and are looking at how they need to deal with identity in their area;

– Universities such as Cardiff (thanks to David Harrison for mentioning this at the session) are starting to educate their students on how they need to deal with identity.  It would be useful to have more examples as one of the points raised in the session is that students are a key group to engage with;

– There are many people out in institutions who have a good grasp on the issues that need to be dealt with and an appreciation of what it means for their areas;

– Working with an audience rather than presenting to them can get the most out of a session for both the people leading the session and those attending.  I must admit I was a bit fearful of how this would work given we had a small room that was packed full (Lawrie even gave up his chair!) and it was quite a large group (70+);

– Identity is a complex subject yet it is one that can be approached simply by working with our peers to understand how it will affect how we work in academia;

– We can control our identity and reputation online, which benefits not only us but also our teams, our departments and our institutions.  We need to think at a variety of scales;

In terms of resources, we had a jargon segment to try to explain key terms that are used within identity management.  I’ll stick my hand up to a few admissions:

– These weren’t intended to be ‘absolute’ definitions;

– They aren’t intended to be a comprehensive glossary – there are no doubt omissions but I feel it is best to start small and these are the most common (and often for a new user most confusing) terms that are heard;

– When I wrote them, I was writing for the average learner, teacher or student in an institution – yes, they are simplified and lose some of the technical detail;

So, after all the excuses, here are the terms:

Identity credentials – generally a username and password but anything that identifies you as a user.  Also known more commonly as an ‘identity’.

Registration – ensuring that the person who you are issuing a set of identity credentials to is who they say they are.  When we talked about the birth of an identity in the session, registration is when this happens.

Authn – Authentication – re-verifying that the user is who they say they are before they are allowed to carry out an action.

Authz – Authorisation – the process of verifying that someone who is trying to access a resource (be that a paper, a journal article, some data or something else) is entitled to do so.

PII – Personally Identifiable Information – anything that can personally identify you or another person. This could be your name or a piece of information about you that could only apply to you.  This is what we most need to protect and ensure is up to date as it can affect our academic reputation.

User-centric identity – the concept of the user being able to control what information (PII or otherwise) they release about themselves and thus control their identity.

OpenID – a technology that came out of the social software world (think blogs and wikis) that allows you to control what information you release about yourself. One of the most popular user-centric identity systems.

OAuth – a newer technology that allows applications or sites to carry out an action for a user without handing over usernames and passwords to another site.  The best example is the recent discussion of software like TweetDeck being able to post to Twitter on behalf of users (there’s a small illustrative sketch of this after the glossary).

Federated identity – where a group of bodies that issue identity credentials (identity providers) and a group of bodies that control access to resources (service providers) agree a common set of rules, so that someone with credentials from any of the issuing bodies can access resources from any of the bodies controlling access to them.

UK federation – the body in the UK that provides federated identity for UK further and higher education.
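
As promised in the OAuth entry above, here is a small illustrative sketch of what delegated access looks like in practice, using the Python requests_oauthlib library.  All of the keys, tokens and the endpoint URL are placeholders – the point is simply that the application signs requests with tokens the user has authorised and never sees the user’s password:

```python
# Illustrative sketch only: the shape of an OAuth 1.0a call, using the
# requests_oauthlib library. All keys, tokens and the endpoint URL are
# placeholders - the point is that the application signs requests with tokens
# the user has authorised, and never sees the user's password.
from requests_oauthlib import OAuth1Session

twitter_like_api = OAuth1Session(
    client_key="app-consumer-key",              # identifies the application (e.g. a TweetDeck-style client)
    client_secret="app-consumer-secret",
    resource_owner_key="user-access-token",     # issued when the user clicked "allow"
    resource_owner_secret="user-access-token-secret",
)

# Post on the user's behalf; the request is signed, no password changes hands.
response = twitter_like_api.post(
    "https://api.example.com/statuses/update",  # hypothetical endpoint
    data={"status": "Posted via OAuth without sharing my password"},
)
print(response.status_code)
```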

Finally, here are a few further resources:

– The presentation and notes can be found here;

– www.pipl.com allows you to see what information is held about you.  You can equally use Google to do a similar search for yourself;

– http://openid.net tells you more about OpenID and how to get one;

– http://oauth.net/about/ gives more information on OAuth and how it works;

– Andy Powell is running a symposium on access and identity management in e-research; more details can be found here;

– JISC’s latest project is producing an identity management toolkit for institutions.  More details on what it is doing can be found here;

I’d welcome comments on this blog, the events blog or tweets tagged with either #jisc09 or #jisc09_id.  We need to keep this discussion going and build on the good work done in the session.  If you have any direct questions on what JISC is doing in access and identity management going forward then please talk with Chris Brown, who is taking over this area from me.

Report on Identity management for lifelong learning

This report has just gone up on the e-learning blog, so please click here to get to the blog post, which links through to the report, explains some of the context and offers a chance to comment.  Following on from the Review of OpenID, it would be good to get folks’ comments on what they think.

As the post says:

The study was intended to describe current practices, envision future processes in identity management and explore identity management issues within the context of lifelong learning.

The study set out to detail the identity management lifecycle in a series of episodes within the educational journey of a lifelong learner, both in an ideal world and as things currently are. It also looked at how far existing initiatives were meeting these requirements and mapped the differences between the ideal world and what happens currently.

The use cases in the study were expected to cover provisioning of identity, maintenance of identity, deprovisioning of identity and provision of authenticated information about learners to other organisations.
The study was commissioned to aid understanding of the challenges facing the education sector in identity management for lifelong learning, and to support JISC in future planning in this area.

Audiences

Increasingly at JISC we are asking projects to look at who their audiences are and what the audience would like from the project.  I thought I’d write a post to explain what it is we are looking for as I realise it can be quite a confusing area.  I’d also welcome comments from those who are grappling with this at the moment so we can improve the advice we give and hopefully make it a positive experience for all concerned.

To set the background, looking in much more detail at audiences has recently become important for JISC.  Back in the ‘Thousand Flowers’ days we knew broadly who the audience was for projects, and it was sufficient to outline roughly who the key stakeholders were (which, note, could be a larger group than the audience).  What was important was to get small projects out there experimenting with the technology and passing those lessons on to quite a broad audience, who would pick up what was of interest to them.  Some projects would fail as a result of not quite connecting with their audience, or simply not having one, but that was all part of the risk-taking we did for the sector.  Fast forward to today: we are commissioning some very large projects, and we are asking some of them to go down what we call the Development to Service route.  What this means is that we are interested in how they go from being a good idea to being something that can be used by others.  Unfortunately JISC can’t support all of those projects, so we have to know which ones are of particular interest, and for the others we’d like to help them find funding from other sources.  This is where knowing who could use the service and who would be interested in funding it proves vital.

So, how do you go about doing that?  I’ve tried to combine some do’s and don’ts below from projects that have been both successful and unsuccessful in finding audiences to use them.

DO:

– Get engaged with your community through events, mailing lists, blogs, etc and find out what they think about your idea and who they think would find it useful.  They are also likely to have some useful input into what you need to do.  A good recent example is my last post that I did on OpenID; the JISC-SHIB list are actively discussing its conclusions and helping suggest how we can take it forward;

– Identify named communities, institutions, companies and organisations who would be interested in your project.  It is so much easier if you can name members of your audience.   So, for example, my NAMES project is working very closely with the British Library.  That is so much better than saying the audience is ‘those in the academic community who would be interested in an authoritative registry of academic names’;

– Work with those named entities to establish their interest in your project.  They could well help with testing initial demos and prototypes, or be able to offer some financial assistance or resources in other areas, such as connections to other similar initiatives or to those who could help you;

– Talk with your programme manager who may be able to suggest useful people to get in touch with or audiences it may be useful to engage with;

– Put the work in on your project plan to identify key named audiences, record who those are and come back and revisit these, changing them as necessary;

– Get a demo or prototype out early so your potential audiences can see what it is you are doing and get to grips with it;

– Come along to JISC-organised events such as Andy McGregor’s Developer Happiness Days or the JISC Conference, as they provide a good opportunity to talk about what you are doing and find more potential named interested parties for your audience;

DON’T:

– Define your audience so widely that it will be impossible to take practical action to engage with them.  If you’re aiming at ‘the UK academic community’ or ‘those interested in repositories’ then you need to be doing some more work;

– Skimp on engaging with your audience and getting their feedback.  You need their interest even if JISC are doing the funding as it provides evidence that what you are doing is useful;

– Use surveys as a substitute for engaging with your audience or finding it.  Surveys are useful but they can’t be used on their own;

Hopefully that is helpful and if you have something to add to the above then please post a comment.

JISC OpenID Report

This morning I got the final copy of this report so I popped it straight up onto the JISC site, which means you can see it around lunchtime if you click here.

We feel this is an important report for the sector as it reviews a technology that we constantly get asked questions about and up to now we haven’t had authoritative answers for.  OpenID is, without a doubt, an important technology but up until now there hasn’t been a comprehensive review of how it could be used in the higher and further education sectors.  This has led to a lot of speculation and rhetoric with very strong advocates for the technology but, equally, very strong critics.  We’re hoping this report will inform the debate, particularly given the project has also developed a gateway between OpenID and the UK federation so those with OpenID credentials can access Shibbolised resources (subject to the resource provider being happy with providing access).

Overall, the conclusions were:
i) there is considerable interest in OpenID in the commercial market place, with players such as Google and Microsoft taking an active interest. However,
ii) all commercial players want to be OpenID providers, since this gives them some control over the users, but fewer want to be service providers since this increases their risks without any balancing rewards
iii) until we can get some level of assurance about the registration of users and their attributes with the OpenID providers, it won’t be possible to use OpenID for granting access to resources of any real value. In other words, without a trust infrastructure OpenID will remain only of limited use for public access type resources such as blogs, personal repositories, and wikis
iv) imposing such a trust infrastructure with barriers to the acquisition and use of OpenIDs may be seen to negate its open-access, user-centric advantages
v) OpenID has a number of security vulnerabilities that currently have not been addressed, but at least one of these is also present in the current UK federation.

The implications from this are:
i) Whilst OpenID does have its security vulnerabilities and weaknesses, some of these are also shared by Shibboleth as it is currently designed. Other technologies may subsequently solve these and therefore this could have implications for the UK federation.
ii) The UK federation as currently deployed has a significant shortcoming which is the readiness of IdPs to disclose the real-world identity of users to SPs (as distinct from providing opaque persistent identifiers to support simple customisation). This is not a technical shortcoming but an operational one. Whilst it is relatively easy to solve, until it is, it limits the applicability of Shibboleth to personalised and other services which need to know who the users are. OpenID does not suffer from this limitation and therefore there might be use for it in some scenarios where trust issues can be resolved.

And, finally, the recommendations are:
i) The UK academic community should keep track of both OpenID and CardSpace identity management systems as they evolve. There is clearly a great demand for a ubiquitous secure identity management system, but no consensus yet as to what this should be.
ii) Now that a publicly available OpenID gateway has been built, publicise its availability to the community and monitor its applications and usage. If usage becomes substantial, consider productising the service.
iii) Consider offering a more secure and more trustworthy gateway registration service for SPs that do not use, or use more than, the eduPersonPrincipalName attribute. This will allow them to use OpenIDs for authentication and a wider selection of eduPerson attributes for authorisation. (The current self-registration service is clearly open to abuse).

I’d welcome any comments on the report and/or the gateway.  I think what we need to do is keep the debate going and share experience to ensure that researchers and learners can get the most out of OpenID.

AHM2008

I am currently sitting waiting for a sleeper back to London, so it seemed a good time to reflect on this year’s All Hands Meeting, the main annual e-Science conference in the UK, which I attended in Edinburgh.

There were several changes this year.  The first was the venue: goodbye to the East Midlands Conference Centre and hello to a multi-location venue in Edinburgh.  I think the overall reaction was positive, with a variety of places to meet up with colleagues both on and off site and a series of venues from the National e-Science Centre (NeSC) to the Appleton Tower and the very shiny and slick Informatics Forum.  Accommodation was a little far from the main venue locations but had wireless and a good breakfast (always essential to get me going in the morning!).  Dinner was also very well received at Dynamic Earth, with plenty to look around before the meal and plenty of opportunities to mix with colleagues old and new.  I was rather less convinced about having coffee breaks that ran through the sessions, but most people seemed to get used to it, and there was a good deal of material to fit in, so you could forgive the programme committees for running out of space.

So, down to the sessions, which proved notable this year for being very much focused on researchers carrying out good research enabled by e-Science.  It seemed that this year we saw a good deal more adoption of the tools we’ve heard about in previous years, and that was good to see.  Whilst tool development is still vital, it’s equally vital that the tools are used in a production environment.

My first session was a BoF run by Alex Hardisty and Neil Chue Hong on e-Infrastructure.  I think the level of attendance rather took the organisers by surprise, and a couple of thought-provoking presentations helped kick off our consideration of the subject material.  I’ll pop a link in here to the conclusions of the session when I get it but, in summary:

  •  e-Infrastructure is for everyone and is useful for a range of different challenges.  What determines its success is how it is used;
  • e-Infrastructure is increasingly being used by researchers as part of what they do on a day to day basis;
  • There are increasingly varied ways in which e-infrastructure can be used and this is likely to get more diverse into the future;
  • There is a mix of requirements from e-Infrastructure.  Some are quite happy to glue together components and use it in a very ad hoc way.  Others would like a more structured approach.  All in all, it’s quite a complex landscape so it’s important to work with the researchers as to what best suits what they are doing;
  • e-Infrastructure is already part of everyday research and is likely to become more so as time goes on;

My first event followed the BoF.  We’d invited an Australian delegation led by Dr Ann Borda to a drinks reception so that they could meet the eResearch team and members of the JISC Support of Research Committee (JSR).  There were some great conversations as all of the eResearch team got a chance to swap experiences of eResearch.  From my side, I got up to date on Australian developments in eResearch tools with Jane Hunter, Paul Davies and Ann Borda.  I also had a great conversation with Andrew Treloar, David Groenewegen and Paul Bonnington that ranged from approaches to data and the latest on ANDS to internet TV.  Finally, I got the chance to catch up over dinner with Andy Richards and Neil Geddes from the National Grid Service.  As always at these events, one of the main reasons to attend is to meet those who are out practising eResearch, so I spent quite a lot of time on the stand on Tuesday.  It proved to be a great opportunity to catch up with some of my more recent projects, so thanks to Tom Jackson from iREAD, Pete Burnap from SPIDER and Stephen Booth from Grid-SAFE for popping in.

I also had an interesting conversation with Andrew Cormack on PII (Personally Identifiable Information).  Andrew’s point was that most applications simply don’t need to ship PII, and I would agree.  I think it’s often used as a comfort blanket, but it’s a comfort blanket that carries its own risks.  If SPs (or RPs, if you prefer that term) were to adhere to Kim Cameron’s second law (minimal disclosure for a constrained use) from his Laws of Identity, then the world would be a better place.  This brought us to the interesting case of grid computing.  It’s one of the few cases where you cannot get around issuing PII, because you need a way of contacting the user if their job fails or isn’t going as intended.  Even then, it still adheres to Kim’s second law in that only a contact address for the user needs to be released.
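
To illustrate the ‘minimal disclosure for a constrained use’ point, here is a toy sketch of an attribute release policy where each use only ever gets the minimum it needs – the grid-job case only gets a contact address.  The attribute names and the policy itself are invented for the example:

```python
# Illustrative sketch only: applying "minimal disclosure for a constrained use"
# to attribute release. The attribute names and the grid-job use case mirror
# the discussion above but the policy itself is invented for the example.

USER_ATTRIBUTES = {
    "name": "A. Researcher",
    "date_of_birth": "1970-01-01",
    "department": "Physics",
    "contact_email": "a.researcher@example.ac.uk",
    "eduPersonPrincipalName": "aresearcher@example.ac.uk",
}

# Each use is constrained to the minimum set it actually needs.
RELEASE_POLICY = {
    "grid_job_notification": {"contact_email"},    # just enough to reach the user
    "library_access": {"eduPersonPrincipalName"},  # an identifier only
}

def release_attributes(use, attributes=USER_ATTRIBUTES):
    """Return only the attributes the named use is allowed to see."""
    allowed = RELEASE_POLICY.get(use, set())
    return {k: v for k, v in attributes.items() if k in allowed}

print(release_attributes("grid_job_notification"))
# {'contact_email': 'a.researcher@example.ac.uk'}
```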

Finally, I talked with Richard Sinnott and David Medyckyj-Scott about geo data and access to more complex data sets.  Richard has a long history of working on complex access to data sets, particularly around medical data and using roles to determine who can access what.  I think we are reaching a stage where we can start moving towards a broader rollout of these technologies so that they become more ubiquitous, and it is hopefully something we can build on top of Richard’s work and that of the data access projects we are currently running.  On the geo side, we are already running quite a few geo projects and I can see that location is going to prove increasingly important for research data and collaboration.  One of the initiatives David is very interested in is INSPIRE, a European directive to create a spatial data infrastructure in Europe.  I think INSPIRE is going to prove important over the next few years as it will help make spatial data easier to access and also provide an incentive to talk about spatial data in a common way.

POSTSCRIPT – This has been a long time in gestation as I’ve tried to get my notes down to a reasonable size for a blog post, which reflects just how much material there is at All Hands, making it a very worthwhile event to attend if you are involved in research and want to find new and better ways of doing research.

Less of the XML

I’m sitting at another conference where I’ve seen several presentations littered with XML that is then dissected ad nauseam.  Now, I’m sure these slides are very valuable for the people presenting them, and we’re all familiar with the pride with which we talk about our new ‘baby’.  Unfortunately, it switches off the bulk of the audience (I’m not just talking about myself, btw – there are several people who feel the same way).  So, if you’re a developer or technical manager, please resist the temptation to put XML in your presentation.  Diagrams to show how your system works are good (animated ones even better).  Short, concise slides that outline where your solution could be used and how it helps the user are good.  Short demonstrations of your system working are good.  Sharing all that lot on a site like SlideShare or on a website (which is what we do at JISC) is even better.

If you do this, then you avoid the demoralising prospect of people switching off from what you are presenting altogether (and, yes, this even applies to technical people).  If someone wants to see exactly what is in the XML you output or take in, then you can safely rest assured that they’ll ask you about it over coffee.

IDM2008

Given this is my second blog entry in as many days, you’re either in for a treat or the tedium continues; I leave you to decide.  I’d also add that, due to train issues, this and the entry above were written separately but offline, as there doesn’t yet appear to be a 3G service that provides continuous coverage on the journeys I make; if anyone can suggest one then please comment below.

So, today was IDM2008, billed as an opportunity for those from business and government to get together and share their experiences of identity management.  I was the representative from higher and further education, giving a presentation on innovation that outlined what we had done on the Access Management Federation and subsequent developments.

The day featured the following presentations:

– Graham Morrison on getting Kerberos to solve the Home Office’s issue of ‘seamless authentication’ across a range of different systems.  I liked this one for a number of reasons.  The first was that it was using what was already there and proven to work, which I think is important in identity and access management (IAM).  Next, it has been kept simple – you can’t get much simpler than using Kerberos to issue a ticket to authenticate the user (the Ticket Granting Ticket) and a ticket to authorise them to do ‘stuff’ (the service ticket, issued by the Ticket Granting Service or TGS); there’s a small conceptual sketch of this two-ticket flow after the list of presentations.  Finally, it deals with levels of assurance but only gets into heavyweight biometrics, role-based access control and so on when it needs to;

– David McIntosh (hope I spelt that right) presenting on biometric technologies and SITC.  The former taught me that your ear echoes back any sound played into it in a way that is unique to you; interesting, but not particularly useful unless you want to biometrically identify someone in a quiet environment.  The latter could be more widely useful to JISC, as it is a body consisting of SMEs that would like to engage with universities;

– Jim Slevin on Manchester Airports’ IDM systems.  A very topical presentation, as the authentication of a user can now be carried out by National Identity Card, which has caused quite a stir in the papers this morning.  More interestingly, their focus was on delivering a capability, not a solution, which I think we should focus on more.  You can actually do something with a capability;

– Joe Baguley presented on AD as an identity store.  The sub-title was ‘are you mad?’ and I think this summed up many people’s impression of doing this, but Joe presented a very convincing argument for re-using what is already in place with Active Directory (AD), and carried out a rather unsubtle plug for his organisation, which does this and which I am not going to repeat here.  I also quite liked the idea of Segregation of Duties or SoD – I’ve known it as a concept, but having an acronym somehow makes me feel so much better;

– Fraud and IDM by Logica.  I quite liked the abstract for this so I attended.  I didn’t entirely regret it, but I found out more interesting facts about fraud than about the business case for IDM, which is why I’d originally gone;

– Dave Nesbitt on how to avoid an identity trainwreck.  Whilst this was saying what we all know – get senior-level sponsorship, agree clear priorities with key users on what is going to be done, deploy iteratively rather than going for big bang, and remember it’s not the technology that is difficult, it’s the human stuff – it’s all worth repeating.  Even the take-home message was worthwhile: ‘IDM is many small projects to constantly improve your infrastructure that never end’;

– David Bowen looked at how identity management works at Great Ormond Street Hospital.  I didn’t learn much from this but had a sharp intake of breath at the mention that single sign-out is more difficult but more valuable than single sign-on.  On the Shib front, I don’t think we are ever going to get there and we shouldn’t be trying, given the issues, IMHO;

– Yours truly was next up, and if you read this blog and the material on what I do on the JISC site then you’ll know what was presented;

– Conn Crawford went through how local authorities approach identity management, and specifically what Sunderland has been doing.  It was great to see Conn again.  He has a knack for connecting up a range of identity management ‘stuff’ to do really valuable things in the community.  What he has done ranges from federated solutions right through to user-centric identity management, and he was presenting on the Let’s Go Sunderland portal he has put together, which allows kids from disadvantaged backgrounds to load up a smart card with activities they can attend.  They have an allowance every month and sign up for activities, but the smart thing is that they also tell the portal what they are interested in, which gives the resource providers some anonymised marketing information back and hence an incentive to offer their resources to the scheme.  This is a great example of making personalisation work whilst protecting the individual;

– Alan Coburn presented on Glow, a teaching and learning portal for Scottish schools.  I think the most interesting thing out of this was that schools wanted to sign up for it, hence there were a great number of users, and that they had used Shibboleth but not the federation.  It turns out the latter was because Glow was specified before the federation existed;

– Hellmuth Broda had the rather unenviable task of being last up and went through the Liberty Alliance.  All very good stuff but nothing new for me.  What was of more interest was his company’s creation of batches of unique codes that can be attached to 2D barcodes, RFID tags and text messages; basically, name a medium and they can be attached to it.  The potential is huge, as these codes link to specific actions such as vouchers, one-time visits to web sites, etc.  More info on this is at www.firstondemand.com;
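
As promised above, here is a small conceptual sketch of the two-ticket idea from the Kerberos presentation: authenticate once for a ticket-granting ticket, then exchange it for service tickets as needed.  This is emphatically not real Kerberos – there is no cryptography here, just the flow:

```python
# Illustrative sketch only: a toy model of the two-ticket idea described above
# (authenticate once for a ticket-granting ticket, then exchange it for service
# tickets). This is NOT real Kerberos - no cryptography, just the flow.
import time

class ToyKDC:
    """Stands in for the Key Distribution Centre (AS + TGS) in the sketch."""
    def __init__(self, users):
        self.users = users          # username -> password (toy only)
        self.tgts = {}              # tgt id -> (user, expiry)

    def authenticate(self, user, password, lifetime=3600):
        """Authentication Service: verify the user once, issue a TGT."""
        if self.users.get(user) != password:
            raise PermissionError("authentication failed")
        tgt = f"TGT-{user}-{int(time.time())}"
        self.tgts[tgt] = (user, time.time() + lifetime)
        return tgt

    def issue_service_ticket(self, tgt, service):
        """Ticket Granting Service: swap a valid TGT for a service ticket."""
        user, expiry = self.tgts.get(tgt, (None, 0))
        if user is None or time.time() > expiry:
            raise PermissionError("TGT missing or expired - authenticate again")
        return f"ST-{user}-{service}"

kdc = ToyKDC({"alice": "s3cret"})
tgt = kdc.authenticate("alice", "s3cret")                # log in once...
print(kdc.issue_service_ticket(tgt, "email"))            # ...then reach services
print(kdc.issue_service_ticket(tgt, "case-management"))  # without re-entering a password
```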

 

Thanks also go to Professor Gloria Laycock, who did a great job chairing the meeting, to the extent that we even finished early!  All in all, a useful day, and there were quite a few contacts I made during the day that I’ll follow up further.  Well worth a look next year if you are interested in identity management outside the education sector.