Tuesday, November 22, 2011

Unit 13: Semester Summary

I feel like this is not a good time for me to write a commentary on the semester as a whole. Given all the technical problems I have had with EPrints and now Protege, I feel like I am ending the semester on a really negative note. Even when I finally get things to work, I rarely understand why they weren't working to begin with. It makes me question whether this is really a good use of my time. At none of the several institutions where I have worked as a professional Archivist would I ever have been the person doing these installs and system configurations. There has always been at least one IT person on staff who was responsible for system installation and configuration. I would be responsible for implementing the system, writing documentation, and training other staff on how to use it, but not for these technical backend tasks that I don't really understand. I have spent far more hours doing repeated system installs and troubleshooting than I have been able to spend actually experimenting within the repositories themselves, which is the portion that would actually inform my job. So many more hours could have been spent really exploring what these systems are capable of doing. Hopefully, I will be able to come back to these repositories and test them out more in the future after the course is over...

Tuesday, November 15, 2011

Unit 12: Virtual Machines

This week we have been asked to discuss the possibility of downloading a pre-installed VM versus building your own, from a learning and pedagogical perspective. I am really glad to add the ability to set up a basic VM to my set of skills. I can set one up pretty quickly now after having done it so many times, so the repetition has definitely helped speed up the process. If downloading a pre-configured solution is almost as much work, then I am not really sure it would save enough time to justify giving up the VM experience. On the other hand, it would be nice to have more time to work with the collections and explore the functionality of the systems more. I think the only way to really make that possible would be to have hosted versions of each software package already configured, similar to our digital preservation class. From a professional perspective, as much as I may enjoy working with the command line and VM setup, if I am being honest those are the skills I will probably use the least on the job. Those are the sorts of tasks the Technology department would handle, whereas my expertise would be needed as an Archivist creating the digital collections, metadata, and controlled vocabulary.

Tuesday, November 8, 2011

Unit 11: Home Sites Evaluation

Particularly when working with open source software, the need to reference documentation online to troubleshoot or figure out how to use the systems becomes key. Unless you contract with a support service, there is no customer service line to call when you run into problems and no client consultant to contact when you need additional training or support. Ease of use and comprehensiveness of the information available online are crucial when evaluating and selecting systems. So far in our testing, I think the Drupal, DSpace, and Omeka home sites are the easiest to find and navigate and have the clearest, easiest-to-use documentation. The EPrints, JHOVE, and harvester sites have a fair amount of documentation, but their sites are not as easy to navigate, and the documentation that exists is not nearly as easy to access and make sense of.

Tuesday, November 1, 2011

Unit 10: OAI Harvesters

I chose to look at the following three OAI metadata harvester sites:
I think all three harvesters demonstrate the general pros and cons of OAI harvesters. On the plus side, they provide researchers a one-stop shopping experience by pulling disparate resources on a unified subject, geographic region, etc. together in one place and making them accessible via a single search. Potential drawbacks, however, include duplicate entries resulting from multiple repositories having copies of the same resource, frustration caused by broken or non-existent links to the described resource, and a false impression that the harvester is fully comprehensive, which can lead researchers to miss relevant resources it does not index.
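To make the duplicate-entry drawback a little more concrete, here is a minimal Python sketch of how harvested records from different repositories might be grouped when they appear to describe the same resource. It is only an illustration, not how any of these harvesters actually work: the record fields (title, creator, repository), the sample records, and the matching rule (same title and creator after stripping case and punctuation) are all assumptions of mine.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def group_duplicates(records):
    """Group records that share a normalized title and creator.

    Returns only the groups with more than one record, i.e. likely duplicates.
    """
    groups = defaultdict(list)
    for record in records:
        key = (normalize(record["title"]), normalize(record["creator"]))
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical harvested records, invented for this example.
records = [
    {"title": "Letters from the Front, 1917", "creator": "Smith, John", "repository": "Repo A"},
    {"title": "letters from the front 1917", "creator": "Smith, John.", "repository": "Repo B"},
    {"title": "Town Council Minutes", "creator": "Unknown", "repository": "Repo A"},
]

for group in group_duplicates(records):
    print([record["repository"] for record in group])
```

A rule this simple would miss duplicates whose titles were transcribed differently and could wrongly merge distinct items that happen to share a title, which is part of why duplicates are hard for a harvester to eliminate completely.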

Tuesday, October 25, 2011

Unit 9: Consistency

In my experience, creating the catalog or metadata record for digital objects takes the longest of any step in the digitization process and as a result winds up being the most expensive part. Enforcing consistency during the process is also challenging. In the past, I have relied heavily on detailed manuals describing each metadata field and providing specific examples to try to maintain some level of consistency. Consistency is easier to achieve when the person creating the metadata record has to make a simple, black and white judgement that can be clearly explained in a manual. It is much harder to achieve when dealing with subjective description, such as assigning controlled vocabulary. Particularly if you have more than one person working on a project, it is unrealistic to expect perfect consistency. I have had some success with identifying records that are inconsistent, and then sitting down periodically with staff and talking through the differences and their individual thought processes to try to get everyone as close to the same page as possible. It is a time consuming process which adds to the project length and expense. In the case of this project, it is somewhat easier since only one person is doing all the cataloging. It is surprising, however, how inconsistent even the same person can be from day to day. I think the different installations we are doing, which in a sense mean re-cataloging the digital objects in our collections each time, are helping to expose the inconsistencies, so by the time we complete our final projects the consistency level of our description should be rather high. That is not really a plausible tactic at large scale though, unless you are using sampling methods.
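One part of identifying inconsistent records can be automated. The following is a minimal Python sketch that flags subject terms that do not appear in the controlled vocabulary and suggests the closest approved term, if there is one. The vocabulary list, the record structure, and the record IDs are hypothetical examples, and the similarity cutoff of 0.6 is simply the default used by Python's difflib module, not a recommendation.

```python
from difflib import get_close_matches

# Hypothetical controlled vocabulary, invented for this example.
CONTROLLED_VOCABULARY = ["Photographs", "Correspondence", "Maps", "Diaries"]


def check_subjects(records, vocabulary):
    """Return (record id, term, suggestion) for each term not in the vocabulary.

    The suggestion is the closest vocabulary term, or None if nothing is close.
    Matching is exact and case-sensitive.
    """
    problems = []
    for record in records:
        for term in record["subjects"]:
            if term not in vocabulary:
                matches = get_close_matches(term, vocabulary, n=1, cutoff=0.6)
                problems.append((record["id"], term, matches[0] if matches else None))
    return problems


# Hypothetical catalog records, invented for this example.
records = [
    {"id": "obj-001", "subjects": ["Photographs", "Maps"]},
    {"id": "obj-002", "subjects": ["Photograph", "Letters"]},
]

for record_id, term, suggestion in check_subjects(records, CONTROLLED_VOCABULARY):
    print(record_id, term, "->", suggestion)
```

A check like this only catches terms that fall outside the vocabulary. It cannot tell whether a valid term was the right choice for a given object, so the subjective inconsistencies still need the sit-down conversations described above.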

Unit 8: Continued

So, my experience installing and configuring EPrints has not improved. It took me creating 4 virtual machines and attempting to install EPrints 5 times before it worked. I am not even sure if the one thing I did differently actually made it work or if it was just a fluke. I basically just edited the sources file in steps rather than uncommenting the two lines and adding the two new lines all at once, and ran the update and safe-upgrade commands twice after each change. Other than that, I didn't change a single step. Now, thanks to the EPrints problems, I am behind in classwork for the first time this semester, which will adversely affect my grade. I am not sure I can manage not to hold all the aggravation of the last week or so against the software and judge it objectively at this point...

Tuesday, October 18, 2011

Unit 8: Stay Tuned

Well, for the first time ever I have not been able to get a repository up and running. Sadly, I keep getting the same "unknown id" error and can't even get past the very first part of configuring EPrints. I have started over completely from scratch, created a second new virtual machine, and reinstalled everything, and I am still getting the same error. I don't understand what I am doing wrong. It is really frustrating, especially compared to the relatively painless installs of Drupal and DSpace. I hate that I am stuck and can't move any further. Sigh...stay tuned for a hopefully happy ending!