The thoughts of a Code Gorilla

October 14, 2008

Repositories and “old school” academics

Filed under: repositories — codegorilla @ 8:42 am

On Repositories

I had an interesting chat with a self-confessed Old School academic: he’s in a deeply unfashionable area of research, and publishes in deeply unfashionable journals…. but he makes sure that everything he publishes goes into his local Institutional Repository.

I ran my idea of a CRIS-like system past him, and he spotted an immediate flaw: “It’s mine!”

He will not share anything until it has been published. He will not put unpublished work anywhere that it can be got at[1]. The problem is that your unpublished work can be plagiarised, and published, before you finish your work… meaning that you are now plagiarising someone else – on your own research!

I asked him about copies of his work, and if he keeps them on the fileservers in his college: Nope, he keeps them on a removable hard disk, which he takes home with him every night.

So where does that leave us?

  • I think we need to accept that the old school has a point: plagiarism is rife, and not just at undergrad level – it happens at all levels of academia.
  • I think that the “google generation” will be less paranoid about their work… and more aware of computing systems (on which: who else noticed that Peter Murray-Rust mentioned having disk-level encryption on his laptop when giving his presentation at OR08?).
  • I think that the idea of providing a backup (or archive) for “work in progress” is valid, and that the idea of a hierarchical system can be sold.

BUT (and you notice it is a pretty damn big “but”), we will need to be sure that the archive is secure, that work cannot be copied, and that the academic feels firmly in control.

On another topic

My friend was hugely supportive of his local repository: not only were the staff excellent at handling the deposit and sorting out all the metadata stuff for him; but he was actually able to raise the profile of his work!

He drums into his students two messages when it comes to publications:

  1. Do NOT release anything into the public domain until your work has been definitely accepted
  2. Make sure you put a copy into the local IR: the more people find your work, the greater the pool of people who might cite your work: a 1% citation rate from 10 readers is a 1-in-10 chance of a citation; a 1% citation rate from 100 readers is 1 citation: a 10-fold increase!
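
The arithmetic behind that second message can be sketched in a few lines. This is only an illustration: the function name and the flat 1% rate are assumptions made for the example, not measured figures.

```python
def expected_citations(readers: int, citation_rate: float = 0.01) -> float:
    """Expected number of citations if each reader cites with the given rate.

    Both the name and the default 1% rate are illustrative assumptions.
    """
    return readers * citation_rate


few_readers = expected_citations(10)     # a tenth of a citation, i.e. a 1-in-10 chance
many_readers = expected_citations(100)   # one whole citation
increase = many_readers / few_readers    # ten times the readers, ten times the citations
```

The rate never changes in this sketch; only the size of the audience does, which is the whole argument for putting a copy in the IR.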

[1] He told me a story from when he was in China over the summer: a student submitted a piece for his Masters degree. A quick read of it showed that this was an incomplete work, by someone else. Further, fairly simple, investigation revealed it was written by a PostDoc, in a US University, and was going through its final review process.

September 9, 2008

Understanding Organisational Cultures…. the journey down

Well, the plane boarded 10 minutes late… and we were held up for “15 to 20 minutes” due to a fault in the air-conditioning system. Once that was fixed, we taxi’d out… and ran over a bolt!

So, we taxi’d back, had the tyre checked over, and then waited for another space in the traffic to take off.

The flight down was fairly uneventful – £5 for a sarnie and a coffee! Talk about a captive market!

Plane lands, and I jump on the bendy-bus to Luton Airport Parkway station, pre-booked tickets to Milton Keynes… except no trains from LAPW go to MK…. they go from the other train station… and the last train went 10 minutes ago. My choices are: Train into London, cross London, back out to MK; or a taxi.

45 quid later, I’m at the hotel.

Thank god for big beds and strong showers!

… and Premier Inn hotels do an All You Can Eat breakfast – do I look like a man that picks at his food? Get stuck in there mon!

Now at Cranfield, free coffee on tap…. a happy gorilla :) (and I’ve already been ranting about CRIS’ and the need to market oneself :chuckle: )

September 2, 2008

Videos from Repository Fringe 2008

Filed under: Conferences, repositories — codegorilla @ 2:30 pm

During Repo Fringe 08 we recorded all the speakers up in the Playfair Library.

They will be made available via a Streaming Server at some point; however, this is a Microsoft-specific platform, so non-Windows/non-Internet Explorer users struggle to access the data.

For those of us who prefer our data less proprietary, I’ve uploaded them to Google Video.

August 13, 2008

Repositories are dead, long live repositories – redux

Filed under: Conferences, repositories — codegorilla @ 8:02 am

There has been talk around the place that the term Repositories (as in Institutional Repositories) is detrimental (it appears in several guises on the IdeaScale page “Repositories – communicating the idea”, and has been raised by a number of people in conversation and talks).

There is an interest in Repository Fringe ‘09…. but if the term “Repository” is to be replaced, what should we do about the not-a-conference title?

…. and what should the term “Repository” be replaced with?

August 11, 2008

Repository Rap?

Filed under: repositories — codegorilla @ 10:21 am

If the Large Hadron particle accelerator at CERN can raise its profile through song, perhaps OA and/or Repositories should do the same…. but who is this?

There are videos talking about it, but nothing catchy…

August 5, 2008

Will we ever get it right?

Filed under: humour, repositories — codegorilla @ 5:03 pm

Of course we will!

The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare.

Or, to put it another way, if we have enough Code Monkeys hacking away, we will (eventually) get the perfect system that will give us Green Open Access with Green Publishers.

In fact, the protocol for doing this is defined in RFC 2795.

We just need to “Keep the faith”

“I have a dream”

Filed under: repositories — codegorilla @ 9:18 am

I’ve talked about this many times before, but this is probably the first time that I’ve committed pen to paper (or, keypress to web-page, in this Web 2.0 world) and described it publicly.

I believe that the current Repository concept is flawed… actually, lots of people now realise that the current concept is flawed, and people are trying to decide what the correct model should be.

We know that the binary-object-centric view does not work, and we know that we need to involve the user much, much earlier in the acquisition process.

Here is my model:

Researchers have Interests: a general topic/area/subject that interests them; that they want to investigate; to understand.

Researchers apply for funds to Research aspects of that Interest: to look at particular facets; to search for significance; to find reasons or rules of behaviour.

Researchers produce things: they write articles; they produce data; they have significant emails; they go to conferences…. and all of these relate to specific pieces of Research, or to general Interests.

Articles go through iterations: a lengthy final draft; a pre-submission version tailored to a specific journal; a post-review version; a publisher’s final-copy version… and each of these links to the previous version, or directly to a specific Research thing or Interest thing.

The idea is that when someone puts a thing into their workspace, they describe it:

  • An Interest will have a Description, and inherit the primary “author” from the user profile.
  • A piece of Research will have a Title, a Description, a Funder (and “grant number” or other code), and will default the Principal Investigator to be Interest.Author[1].
  • The various Article/Conference/Email/Data things will have their appropriate meta-data, and inherit from the item above them.

The idea, therefore, is that the researcher deposits often, and rarely has to provide much supportive metadata… which could easily fit into their normal working flow… and if we could promise to back up their data, they will probably be happy.
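
To make the inheritance idea a bit more concrete, here is a minimal sketch of the model in Python. Everything in it is an assumption made for illustration (the class names, the field names, the `inherited_metadata` helper and the sample values); it is not code from any existing repository system.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Interest:
    description: str
    author: str  # in the model this comes from the user profile


@dataclass
class Research:
    title: str
    description: str
    funder: str
    grant_number: str
    interest: Interest
    principal_investigator: Optional[str] = None

    def __post_init__(self) -> None:
        # Default the PI to the author of the Interest this Research hangs off.
        if self.principal_investigator is None:
            self.principal_investigator = self.interest.author


@dataclass
class Item:
    kind: str  # hypothetical labels: "article", "data", "email", "conference"
    parent: Interest | Research | Item  # the "item above" in the hierarchy
    metadata: dict[str, str] = field(default_factory=dict)


def inherited_metadata(thing: Interest | Research | Item) -> dict[str, str]:
    """Collect metadata from the top of the hierarchy down; lower levels win."""
    if isinstance(thing, Interest):
        return {"interest": thing.description, "author": thing.author}
    if isinstance(thing, Research):
        meta = inherited_metadata(thing.interest)
        meta.update(
            research=thing.title,
            funder=thing.funder,
            grant_number=thing.grant_number,
            principal_investigator=thing.principal_investigator or "",
        )
        return meta
    meta = inherited_metadata(thing.parent)
    meta.update(thing.metadata)
    meta["kind"] = thing.kind
    return meta


# Sample values are made up for the example.
interest = Interest("Deposit behaviour of researchers", "A. Researcher")
research = Research("Deposit survey", "Why people do not deposit",
                    "Example Funder", "EX-001", interest)
draft = Item("article", research, {"title": "Final draft"})
post_review = Item("article", draft, {"version": "post-review"})
described = inherited_metadata(post_review)
```

The post-review version only had to say what was new about it (`version`); the title, funder, grant number and author all arrive from the things above it, which is what keeps each deposit cheap for the researcher.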

Look at http://www.myexperiment.org/ as an example of something heading in this direction.

[1] If we are clever, then we can link into the organisation’s MIS databases, and pull the “principal investigator” from there, and get a list of associated researchers (those paid from the grant), who are likely to be co-authors.

August 4, 2008

Repository Junction, a new world

Filed under: Coding, repositories — codegorilla @ 3:31 pm

I have, as many people know, written a small piece of code called Repository Junction.

The function was needed because we are running an ePrint repository called the Depot. The purpose of the Depot is to accept deposits on behalf of Institutions that don’t yet have repositories, with the intent to pass them on once said Institution sets up a repository. We do NOT want to take deposits that should be lodged somewhere else; we don’t want to be seen hogging the landscape.

With this in mind, I wanted to direct people to The Right Place[tm], hence the need for Repository Junction.

Currently, RJ is an integrated piece of code, with the lookup functions built into the code-base directly. I want to extract them, and re-write the whole thing so that the core Junction functionality becomes a simple API (maybe multiple APIs, with common/consistent parameters) which anyone can call, anyone can use.
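
As a rough sketch of what "lookups extracted behind a simple API" might look like (this is not the existing RJ code; the function names, the `Lookup` type and the example.org URLs are all placeholders invented for the illustration):

```python
from __future__ import annotations

from typing import Callable, Optional

# Every lookup shares one signature: institution name in, repository URL (or None) out.
Lookup = Callable[[str], Optional[str]]

# Placeholder address standing in for the Depot; not a real URL.
DEPOT_URL = "https://depot.example.org/"


def registry_lookup(registry: dict[str, str]) -> Lookup:
    """Build a lookup backed by a simple name-to-URL table (a hypothetical source)."""
    def lookup(institution: str) -> Optional[str]:
        return registry.get(institution.strip().lower())
    return lookup


def junction(institution: str, lookups: list[Lookup],
             fallback: str = DEPOT_URL) -> str:
    """Try each lookup in turn; only fall back to the Depot if none knows a home."""
    for lookup in lookups:
        url = lookup(institution)
        if url:
            return url
    return fallback


known = registry_lookup({"example university": "https://repository.example.org/"})
right_place = junction(" Example University ", [known])
no_home_yet = junction("Nowhere College", [known])
```

Because every lookup has the same shape, a new source (a directory service, a guess from the email domain, whatever turns up) can be slotted into the list without touching `junction` itself, which is the point of the common/consistent parameters.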

(But after the current three jobs on my list)

Repository Fringe 2008

“The Repository is dead, Long live the repository!” Dorothea Salo’s (University of Wisconsin) Keynote speech kick-started the inaugural Repository Fringe by lambasting Evangelists, policy makers and developers in a cutting diatribe that exposed the very foundations of the Open Access movement. Speaking from bitter experience, she highlighted issues with fundamental concepts; institutional procedures; deposit processes; and even the software developers… usually from first-hand experience.

Back in April, I was attending Open Repositories 2008 down in Southampton, and fell into discussion with Les Carr (General Chair of OR08, and one of the leading lights of the Repositories movement) about this & that and how there is lots of “little stuff” being done by “little groups” with no co-ordination…. and from this grew the insane idea of having a meeting to gather these people together, these people who are working on the fringes of the repository world, and try to foster co-operation and cross-pollination.
“Where should we have this group?”
“Well, it’s a Fringe thing, isn’t it? I’m from Edinburgh, let’s tap into the Fringe thing in Edinburgh, and give people an excuse to have a bit of a holiday at the same time?”

And thus was born the idea of the Repository Fringe event.

When I got back to Edinburgh, wiser heads interceded, and the event as it played out was planned by Theo Andrew, Philip Hunter, Clair Knowles, Stuart Macdonald, Robin Rice, Robin Taylor and Clare Whittaker (with me sticking my oar in occasionally).

The event was centered in the magnificence of Edinburgh University’s Playfair Library and was styled in homage to the Edinburgh Fringe, with “Soapbox” sessions (multiple, parallel, 20-minute pitches, incongruously terminated by a clamoring bell), “Group Improv” sessions (hour-long meetings aimed at audience participation), and “An Audience with…” talks (half-hour talks where someone presents an idea or piece of research).

There was a lot of very interesting information that came out of the conference, more than I can include here, partly because I was running around swapping laptops for speakers; toting microphones for audience participation; and just possibly hogging a microphone myself, and partly because I actually didn’t see all of the event, as most of the time we had two rooms going!

The highlights for me were:
- Andrew Girdwood echoing the “Build and they shall come” failure of the dot-com era from Dorothea’s “Build and they shall come” failure of the Open Access movement.
- Steven Hichcock’s analogy to banks and libraries
- Neamh Brennan’s hilarious soliloquy to a laptop in a shroud, echoing the death of the repository, followed by a superb example of how a well thought out, well designed, and well executed Current Research Information System can be a positive enhancement to researchers, and just incidentally, by happenstance, provide an Institutional Repository.

The closing Plenary by David De Roure (Southampton) looked at the GRID system, highlighting how the successes and failures of the grid system can be mapped to the Open Access/Repositories systems… and how the new world of the Web 2.0 can, and should, influence our decisions.

The event seemed to take on, in some places, more import than could be expected from an event thrown together in such a short time – and maybe some of this reflects on the import that Edinburgh, and EDINA, play in this field.

Everyone I spoke to seemed energised and enthused that, despite the recognised failings in our systems, we were looking forward to a better and brighter world…. different, but from evolution, not revolution.

And on a final note: I think we have a right to be proud that we put on an event in less than two months; an event that attracted over 80 participants; and an event that was interesting enough to keep the Director of EDINA present for two days when he was less than 20 minutes from his office!

(See pictures at http://www.flickr.com/search/?q=repofringe08)
