Tuesday, November 22, 2011
Unit 13: Semester Summary
I feel like this is not a good time for me to write a commentary on the semester as a whole. Given all the technical problems I have had with EPrints and now Protege, I feel like I am ending the semester on a really negative note. Even when I finally get things to work, I rarely understand why they weren't working to begin with. It makes me question whether this is really a good use of my time, knowing that at none of the several institutions where I have worked as a professional Archivist would I ever be the person doing these installs and system configurations. There has always been at least one IT person on staff whose responsibility the system installation and configuration would be. I would be responsible for implementing the system, writing documentation, and training other staff on how to use it, but not for these technical backend tasks that I don't really understand. I have spent far more hours doing repeated system installs and troubleshooting than I have been able to spend actually experimenting within the repositories themselves, which is the part that would actually inform my job. So many more hours could have been spent really exploring what these systems are capable of doing. Hopefully, I will be able to come back to these repositories and test them out more in the future after the course is over...
Tuesday, November 15, 2011
Unit 12: Virtual Machines
This week we have been asked to discuss the possibility of downloading a pre-installed VM versus building your own, from a learning and pedagogical perspective. I am really glad to add the ability to set up a basic VM to my set of skills. I can set one up pretty quickly now after having done it so many times, so the repetition has definitely helped speed up the process. If downloading a pre-configured solution is almost as much work, then I am not really sure it would save that much time at the expense of gaining VM experience. On the other hand, it would be nice to have more time to work with the collections and explore the functionality of the systems more. I think the only way that would really be enabled would be to have hosted versions of each software package already configured, similar to our digital preservation class. From a professional perspective, as much as I may enjoy working with the command line and VM setup, if I am being honest those are the skills I will probably use the least on the job. Those are the sorts of tasks the Technology department would handle, whereas my expertise would be needed as an Archivist creating the digital collections, metadata, and controlled vocabulary.
Tuesday, November 8, 2011
Unit 11: Home Sites Evaluation
Particularly when working with open source software, the need to reference documentation online to troubleshoot or figure out how to use the systems becomes key. Unless you contract with a support service, there is no customer service line to call when you run into problems or client consultant to contact when you need additional training or support. Ease of use and comprehensiveness of information available online is crucial when evaluating and selecting systems. So far in our testing, I think the Drupal, DSpace, and Omeka home sites are the easiest to find and navigate and have the clearest, easiest to use documentation. The EPrints, JHOVE, and harvester sites have a fair amount of documentation, but their sites are not as easy to navigate and the documentation that exists is not nearly as easy to access and make sense of.
Tuesday, November 1, 2011
Unit 10: OAI Harvesters
I chose to look at the following three OAI metadata harvester sites:
- Avano - an OAI harvester focused on aggregating published and unpublished research relating to the fields of marine and aquatic science, including peer-reviewed articles, working papers, posters, and cruise reports. (http://www.ifremer.fr/avano/)
- NORA (Norwegian Open Research Archives) - an OAI harvester covering the intellectual output of Norwegian university and college repositories. (http://www.ub.uio.no/nora/topic.html?siteLanguage=eng)
- Sheet Music Consortium - an OAI harvester dedicated to providing access to and use of online sheet music collections. (http://digital2.library.ucla.edu/sheetmusic/)
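All three of these services work the same way underneath: they send OAI-PMH requests to each repository and parse the Dublin Core metadata that comes back. Here is a minimal sketch of that exchange. The endpoint URL and the sample record are invented for illustration; a real harvester would also page through resumption tokens and handle many records per response.

```python
# A toy version of what an OAI-PMH harvester does: build a ListRecords
# request and pull Dublin Core titles out of the XML response.
# The base URL and record content below are made up for illustration.
import urllib.parse
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the ListRecords request a harvester would send to a repository."""
    query = urllib.parse.urlencode({"verb": "ListRecords",
                                    "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

def extract_titles(response_xml):
    """Pull the Dublin Core titles out of a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [t.text for t in root.iter("{%s}title" % DC)]

# A toy response standing in for what a repository might return.
sample = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Plankton survey, cruise report 1987</dc:title>
      </oai_dc:dc>
    </metadata></record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))
print(extract_titles(sample))
```

The nice thing about the protocol is that the same six verbs work against any compliant repository, which is what lets a subject-focused service like Avano aggregate from so many different institutions.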
Tuesday, October 25, 2011
Unit 9: Consistency
In my experience, creating the catalog or metadata record for digital objects takes the longest of any step in the digitization process and as a result winds up being the most expensive part. Enforcing consistency during the process is also challenging. In the past, I have relied heavily on detailed manuals describing each metadata field and providing specific examples to try to maintain some level of consistency. Consistency is easier to achieve when the person creating the metadata record has to make a simple, black-and-white judgment that can be clearly explained in a manual. It is much harder to achieve when dealing with subjective description, such as assigning controlled vocabulary. Particularly if you have more than one person working on a project, it is unrealistic to expect perfect consistency. I have had some success with identifying records that are inconsistent, and then sitting down periodically with staff and talking through the differences and their individual thought processes to try to get everyone on as close to the same page as possible. It is a time-consuming process which adds to the project length and expense. In the case of this project, it is somewhat easier since only one person is doing all the cataloging. It is surprising, however, how inconsistent even the same person can be from day to day. I think the different installations we are doing, and the re-cataloging of the digital objects in our collections that each one requires, are helping to expose the inconsistencies, so by the time we complete our final projects the consistency level of our description should be rather high. That is not really a plausible tactic at large scale though, unless you are using sampling methods.
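To make the "identifying records that are inconsistent" step concrete, here is a small illustration (the terms and record IDs are invented, not from my project) of one mechanical way to surface the easy cases: normalize each controlled-vocabulary term and flag any groups of records whose terms differ only in case, spacing, or punctuation. The genuinely subjective disagreements still need the sit-down conversations, but this kind of pass catches the day-to-day slippage cheaply.

```python
# Flag controlled-vocabulary terms that are "the same" term spelled
# inconsistently across records. Sample data is invented for illustration.
from collections import defaultdict

def normalize(term):
    """Collapse case, stray whitespace, and trailing punctuation."""
    return " ".join(term.lower().replace(".", "").split())

def find_inconsistencies(records):
    """Map each normalized term to the distinct raw spellings that produced it."""
    seen = defaultdict(set)
    for record_id, term in records:
        seen[normalize(term)].add(term)
    return {norm: spellings for norm, spellings in seen.items()
            if len(spellings) > 1}

records = [
    ("d001", "Public Health"),
    ("d002", "public health"),
    ("d003", "Tropical Medicine."),
    ("d004", "Tropical  Medicine"),
    ("d005", "Agriculture"),
]

for norm, spellings in sorted(find_inconsistencies(records).items()):
    print(norm, "->", sorted(spellings))
```

For sampling at large scale, the same idea applies: run a pass like this over a random sample of records per cataloger per month, and use the flagged groups as the agenda for the review meetings.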
Unit 8: Continued
So, my experience installing and configuring EPrints has not improved. It took me creating 4 virtual machines and attempting to install EPrints 5 times before it worked. I am not even sure if the one thing I did differently actually made it work or if it was just a fluke. I basically just edited the sources file in steps rather than uncommenting the two lines and adding the two new lines all at once, and ran the update and safe-upgrade commands after each change rather than once at the end. Other than that, I didn't change a single step. Now, thanks to the EPrints problems, I am behind in classwork for the first time this semester, which will adversely affect my grade. I am not sure I can manage not to hold all the aggravation of the last week or so against the software, so I may not be able to judge it objectively at this point...
Tuesday, October 18, 2011
Unit 8: Stay Tuned
Well, for the first time ever I have not been able to get a repository up and running. Sadly, I keep getting the same "unknown id" error and can't even get past the very first part of configuring EPrints. I have started over completely from scratch, created a second new virtual machine and reinstalled everything and am still getting the same error. I don't understand what I am doing wrong. It is really frustrating, especially compared to the relatively painless installs of Drupal and DSpace. I hate that I am stuck and can't move any further. Sigh...stay tuned for a hopefully happy ending!
Tuesday, October 11, 2011
Unit 7: DSpace Community
Since our blog assignment this week was flexible and there was no direction for how to report back on our research into the DSpace community, I thought this would be a good place to talk about my findings for Assignment 1b. Essentially, I found the DSpace community to be quite substantial. There are currently 1136 registered users of the software, and if it is anything like other open source software communities, I am sure there are a large number of non-registered users out there as well. A few quick searches turn up significant amounts of thorough documentation, training modules, user group meetings and mailing lists, blogs and wikis. In addition to the documentation and introductions to DSpace covered in our assignments last week, I also like http://www.dspace.org/training-grid/configurable-submission-system-for-dspace.html. There is also the monthly publication NewSpace and the DSpace Global Outreach Committee to help stay connected. And then there is DSpace's proven staying power, as well as, perhaps even more importantly for some, the several service providers available to assist with implementations at institutions lacking the necessary IT support. All in all, if I were ranking user community as one of the evaluation criteria for a system, I would give DSpace a top rating in that category.
Tuesday, October 4, 2011
Unit 6: DSpace Install
The install of DSpace actually went pretty smoothly. I had a problem early on with the connection which prevented me from running the aptitude commands, but that was just because I didn't realize from the instructions that we had to put the system in bridged mode for the static IP address to work properly. After I sorted that issue out with Professor Fulton's help, everything was fine. I think I understood the gist of the instructions and individual commands, i.e. it made overall sense what I was trying to accomplish. I still find that I am sometimes mainly demonstrating my ability to carefully follow instructions and pay attention to detail rather than fully comprehending what I am typing. In the "real world" though, I suspect that is frequently the position archival professionals are placed in, especially if working at institutions without adequate IT support. What matters is that it worked! The other two examples of instructions look different but familiar at the same time. It is not surprising to me that installations can be approached in slightly different ways. The overall components of the installs looked similar, if in slightly different orders. They weren't as helpful as Professor Fulton's videos, mainly because they lack the audio commentary patiently talking you through all the steps and what to expect next, but they still looked fairly simple and straightforward. I think it is important to remember that we are not alone and there is no need to reinvent the wheel. There are user communities and lots of resources out there to turn to for help in installing this kind of content management system, assuming of course that you have some basic IT infrastructure in place to begin with. I think this is something I could do on my own in a basic, default sort of way.
As I have learned from working with a lot of different software systems, it's when you want to get into customization or advanced functionality that the assistance of a systems specialist is usually needed.
Tuesday, September 27, 2011
Unit 5: Contemplate
Well, I chose Contemplate as my self-selected Drupal module. According to the website, this module "was written to solve a need with the Content Construction Kit (CCK), where it had a tendency toward outputting content in a not-very-pretty way". Since I agree that the default Drupal presentation is really not attractive at all, I thought this module might help me make some improvements. It also sounded like a good option because the website indicated that it "dovetails nicely with CCK" and required no additional modules. On the positive side, I was able to adapt previous directions and successfully download, unzip, and install the module! Yay me (or more accurately, yay Professor Fulton's detailed instructions)! On the downside, although the module description says it "makes it easy to rearrange fields" and other aspects of the Drupal display, it seems to be "easy" only if you have a better understanding of PHP than I do. Or maybe I am just doing something wrong, which is totally possible. Still, I feel confident now that I could explore the available modules in more detail and, if I find one I like, actually complete a successful installation, which I have to think is really one of the major goals here.
Tuesday, September 20, 2011
Unit 4: Drupal Impressions
I kind of feel I am being unfair to Drupal. Our installation is straight out of the box with no customization at all. As is, it looks terrible! I would never choose it as a Digital Library & Archive solution if the best it could ever look is what we have up and running now. I have seen Drupal in action as a content management solution for a university website, so I know that at least when it comes to managing web content, it can function and look a lot better than this. I have my doubts about its ability, at least without quite a significant amount of customization, to serve as a Digital Library & Archive solution. Several of the Drupal installations I have seen so far in exploring for this course have been working in tandem with CONTENTdm rather than relying solely on Drupal functionality. Some of my major concerns would be rights management and representation of archival hierarchy.
Tuesday, September 13, 2011
Unit 3 - Class Commentary
I am really excited about this semester's objective. I love the idea of installing multiple content management systems and experimenting with them as digital libraries and archives. In my experience, investigating, evaluating, testing, and selecting software is such a huge part of the responsibilities of a Digital Archivist. This is a great opportunity to get more experience with software I have not had the chance to explore before. In terms of the tech assignments, I actually wish there were a lot less reading and writing and more hands-on experimentation with the VM, MySQL, and the different software systems. In particular, it seems like two discussion assignments a week is unnecessary. The management portion of the course also doesn't feel integrated at all with the tech assignments.
Tuesday, September 6, 2011
Unit 2 - CMS Development and Design
I chose the theme article "LibData to LibCMS: One Library's Evolutionary Pathway to a Content Management System" by Paul F. Bramscher and John T. Butler from the Digital Library Development Lab at the University of Minnesota. The article describes their journey from a simple relational database for storing library-related data to an increasingly more dynamic and flexible content management system. While the technical and functional details of the CMS they ultimately ended up with were somewhat interesting, I was really struck more by the strong commitment to both user-driven design and local, open-source development.
One of their overarching goals was that a CMS "should enable the organization to provide a high degree of customization for users so that key user communities feel that the site has been designed expressly for them." The authors were very honest about the trade-offs of a centralized CMS, especially concerns over loss of freedom and creative license for staff and individual departments. In exchange, however, the university was able to empower staff lacking HTML skills to take more direct control over the content of their own websites. To help off-set staff concerns, very careful attention was paid to understanding existing workflows and roles in practice (not ideally) prior to any technical development of the system itself. Moreover, increasing amounts of flexibility and multiple authoring mechanisms were built into the system throughout its evolution.
The authors also frankly discuss the pros and cons of local development and make it clear that the decision between that and purchasing proprietary software is really a decision unique to each individual institution. While, like many organizations, they were challenged by the gap between library and computer science knowledge and skills, ultimately they found the benefits of local development to outweigh the drawbacks. Namely, they prioritized the complete technical control to design a system that met the specific individual needs of the university and integrated as seamlessly as possible with existing university systems, while proving cheaper in the long term and avoiding restrictive licensing agreements. Moreover, there was a hope that the open-source approach would eventually lead to shared community use and development in the future.
I do have one rather petty and insignificant criticism of the article. I generally think the purpose of an analogy is to compare one thing a person is likely to already have an understanding of to a similar thing a person perhaps is unfamiliar with as a means of helping a person more quickly or easily grasp a new concept. So please tell me, what is the purpose here of using this analogy: "This [system] approach may be roughly analogized to arranging DNA nucleotides in chains to produce an 'information molecule'." Did the authors forget who their target audience was momentarily or is it just me?
Tuesday, August 30, 2011
Unit 1 - The Collection
So, I have decided to use this semester's primary project as an excellent opportunity to test out how the Rockefeller Archive Center's first large-scale digitization project works in a variety of digital repository systems. The RAC digitized approximately 91,000 pages of Rockefeller Foundation Officer Diaries that had been previously microfilmed for preservation purposes. The Diaries offer fascinating details documenting the philanthropic efforts of the Rockefeller Foundation and the daily work of Foundation representatives across the globe. Although the digitization and OCR work is complete, the RAC is still investigating software options for the web delivery of the diaries and assessing the level and standardized form of description needed. For this project, I will be using a sample of diaries and also portraits of five Foundation officers. Permission has already been given by the Rockefeller Foundation for online public access to the materials. As for access terms, these will vary greatly depending on the geographic location and philanthropic area each officer is assigned to. Field Offices were maintained from India to Paris to Mexico, and field operations conducted there included tropical disease research, preventative medicine and nutrition, child and public health, and agriculture.
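While the RAC settles on a standardized level of description, here is a hypothetical sketch (every field value below is invented for illustration, not actual RAC metadata) of the sort of baseline Dublin Core-style record one diary volume might receive, with the coverage and subject fields carrying the officer-specific geographic and program information mentioned above.

```python
# A hypothetical baseline record for one officer diary volume.
# All values are invented placeholders, not real RAC description.
def diary_record(officer, years, region, program):
    """Assemble a minimal Dublin Core-style record for one officer diary."""
    return {
        "dc:title": "Officer diary, %s, %s" % (officer, years),
        "dc:creator": officer,
        "dc:date": years,
        "dc:coverage": region,        # varies by field office assignment
        "dc:subject": program,        # philanthropic program area
        "dc:type": "Text",
        "dc:format": "image/tiff",    # assuming digitization from microfilm to TIFF
        "dc:rights": "Online access by permission of the Rockefeller Foundation",
    }

record = diary_record("J. Doe", "1925-1930", "India", "public health")
print(record["dc:title"])
```

Keeping the record this flat makes it easy to load the same sample into each repository system we test, since simple Dublin Core is the lowest common denominator they all support.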