

OAI Harvester Manager Web Based User Interface - Getting Started


The official OAI-PMH homepage says it is all about "Interoperability through Metadata Exchange" and defines the protocol as:
"The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a low-barrier mechanism for repository interoperability. Data Providers are repositories that expose structured metadata via OAI-PMH. Service Providers then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of six verbs or services that are invoked within HTTP."
This document describes how to use OAI Harvester Manager in a web browser to plan, define and manage your harvesting activities. To learn how to extend or modify it for your own needs, see the developer documentation (JavaDoc) and the source code.
OAI Harvester Manager is Free Software as defined by the Open Source Initiative and is distributed under the GNU GPL v3, so please feel free to improve it. We would be grateful if you shared your improvements with us.
The source code is fully commented in English, but the user interface is in German. Translating it into your language would also be a great help.
You can contact the project lead, Kadir Karaca Koçer, directly through the internal mail function of SourceForge.

Thanks to all the creative folks at KDE-Look for providing us with such great-looking free icons.


Defining a new OAI-PMH Repository


Before any metadata can be harvested over the OAI protocol, you must define a repository that serves its metadata according to OAI-PMH v. 2.0.
The web form for defining a new repository is opened by clicking the corresponding icon on the taskbar. The form looks like this:


Screenshot new repository form


Steps to fill this form:
  1. Give this repository a meaningful, human-readable name.
  2. Enter the full URL of the repository.
  3. Check this checkbox if the server supports a time format with minutes; otherwise leave it unchecked (default: yes). For more information, see the official W3C definition of datetime.
  4. Check this checkbox if the server complies strictly with HTTP; otherwise leave it unchecked (default: yes). For more information, see the official Apache HTTP Client documentation.
  5. Choose how long to wait for a successful connection to this repository. The default is 5 minutes, which should be sufficient for nearly all servers. Increase it if this repository is often overloaded and cannot answer within the timeout.
  6. Choose the number of attempts to make until a connection to this repository succeeds. The default is 3. Increase it if there are problems with this repository.
  7. Choose the time to wait between two attempts. The default is 2 minutes. Increase it if there are problems with this repository.
  8. Check all of the values you entered. You can reset them to the defaults with the RESET button on the right.
  9. Once all of the values are correct, save them with the SAVE button on the left.
  10. You can define as many repositories as you like, or start defining tasks that use this repository right away.
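
The interplay of the timeout, the number of attempts and the waiting time (steps 5 to 7) can be sketched in a few lines of Python. This is only an illustration of the idea, not the Harvester Manager's actual Java implementation; the names `RepositorySettings` and `fetch_with_retries` and the example URL are hypothetical.

```python
import time
import urllib.request
from dataclasses import dataclass


@dataclass
class RepositorySettings:
    """Hypothetical container for the values entered in the repository form."""
    name: str
    base_url: str
    timeout_seconds: float = 5 * 60       # step 5: default 5 minutes
    max_tries: int = 3                    # step 6: default 3 attempts
    wait_between_tries: float = 2 * 60    # step 7: default 2 minutes


def fetch_with_retries(settings, url, opener=urllib.request.urlopen, sleep=time.sleep):
    """Try to read `url`, honouring the timeout and retry settings."""
    last_error = None
    for attempt in range(1, settings.max_tries + 1):
        try:
            with opener(url, timeout=settings.timeout_seconds) as response:
                return response.read()
        except OSError as error:  # URLError and socket timeouts are OSError subclasses
            last_error = error
            if attempt < settings.max_tries:
                sleep(settings.wait_between_tries)  # pause before the next attempt
    raise ConnectionError(
        f"{settings.name}: gave up after {settings.max_tries} tries"
    ) from last_error


if __name__ == "__main__":
    demo = RepositorySettings("Demo repository", "http://repository.example.org/oai")
    print(demo)
```

With the defaults, a repository that never answers is given up on after three attempts, with a pause of two minutes after the first and after the second failed attempt.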

Now you are ready to define your first harvesting task.


Defining a new Harvesting Task



Defining a new Harvesting Task that uses the OAI-PMH command ListRecords


Screenshot new task form (ListRecords)


Steps to fill this form:
  1. Choose the repository to be harvested from the drop-down menu. The names listed are the ones you entered when defining the repositories.
  2. Click the calendar icon to choose the start date and time of this task. If you choose a date in the past, the current date and time are used and harvesting starts immediately.
  3. Click the calendar icon to choose the date and time from which records are to be harvested.
  4. Click the calendar icon to choose the date and time until which records are to be harvested. Please note that this affects only one-time tasks. Tasks that are repeated automatically (see below) compute their from and until dates themselves.
  5. Type the OAI set to harvest (or leave the field blank if no sets are defined).
  6. Type the correct metadata prefix.
  7. Choose how often this task should be repeated (the default is a one-time task, i.e. do not repeat).
  8. Choose the Data Receiver, i.e. the user-defined module that processes the returned metadata. You can use the provided "Write to File System" Data Receiver for your first OAI-PMH experiments. This module saves each received record as an XML file in a given directory on your file system.
  9. Check this checkbox to have the OAI response validated.
  10. Check this checkbox to have the XML structure validated.
  11. Check this checkbox to have the original server response saved.
  12. Save the task by clicking the left button, or cancel it with the right one.
  13. You can define as many tasks as you like, with either ListRecords or GetRecord.
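
To make the form fields more concrete, the following Python sketch shows how the values from steps 3 to 6 map onto the arguments of an OAI-PMH ListRecords request (`from`, `until`, `set`, `metadataPrefix`). It is an illustration under assumptions, not the Harvester Manager's own code: the function names and the example URL are hypothetical, and the `with_time` flag is assumed to correspond to the time format checkbox of the repository form, switching between the two datestamp granularities that OAI-PMH 2.0 defines.

```python
from datetime import datetime
from urllib.parse import urlencode


def format_datestamp(moment: datetime, with_time: bool) -> str:
    """Format a datestamp (assumed to be in UTC) in one of the two OAI-PMH granularities."""
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ" if with_time else "%Y-%m-%d")


def list_records_url(base_url, metadata_prefix, from_=None, until=None,
                     set_spec=None, with_time=True):
    """Build the HTTP GET URL of a ListRecords request."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_ is not None:
        params["from"] = format_datestamp(from_, with_time)
    if until is not None:
        params["until"] = format_datestamp(until, with_time)
    if set_spec:  # a blank set field means: harvest the whole repository
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(list_records_url(
        "http://repository.example.org/oai",  # hypothetical repository URL
        "oai_dc",
        from_=datetime(2009, 1, 1),
        until=datetime(2009, 1, 31),
        set_spec="physics",
        with_time=False,
    ))
```

Note that `urlencode` percent-encodes the colons of a full timestamp, so `12:30:00` appears as `12%3A30%3A00` in the URL.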




Defining a new Harvesting Task that uses the OAI-PMH command GetRecord


Sometimes only one particular record is of interest rather than a whole list of records. If the identifier of this record is known, the OAI-PMH command GetRecord can be used instead of ListRecords or ListIdentifiers. The Harvester then asks only for this record, and the server returns only the metadata of this record.
To use this functionality, fill in the GetRecord form:

Screenshot new task form (GetRecord)


Steps to fill this form:
  1. Choose the repository to be harvested from the drop-down menu. The names listed are the ones you entered when defining the repositories.
  2. Click the calendar icon to choose the start date and time of this task. If you choose a date in the past, the current date and time are used and harvesting starts immediately.
  3. Type the unique identifier of the record to harvest.
  4. Type the correct metadata prefix.
  5. Choose the Data Receiver, i.e. the user-defined module that processes the returned metadata. You can use the provided "Write to File System" Data Receiver for your first OAI-PMH experiments. This module saves each received record as an XML file in a given directory on your file system.
  6. Check this checkbox to have the OAI response validated.
  7. Check this checkbox to have the XML structure validated.
  8. Check this checkbox to have the original server response saved.
  9. Save the task by clicking the left button, or cancel it with the right one.
  10. You can define as many tasks as you like, with either ListRecords or GetRecord.
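
As an illustration of steps 3 to 5, the Python sketch below builds a GetRecord request from an identifier and a metadata prefix, and then stores every record of a response as its own XML file, roughly in the spirit of the "Write to File System" Data Receiver. It is an assumption-laden sketch, not the shipped Java module: the function names, the example identifier and the file naming scheme are all hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import quote, urlencode

# XML namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build the HTTP GET URL of a GetRecord request."""
    params = {"verb": "GetRecord", "identifier": identifier,
              "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def write_records_to_directory(response_xml: bytes, directory: Path) -> list:
    """Save every <record> of an OAI-PMH response as its own XML file."""
    directory.mkdir(parents=True, exist_ok=True)
    root = ET.fromstring(response_xml)
    written = []
    for record in root.iterfind(".//oai:record", OAI_NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=OAI_NS)
        if identifier is None:
            continue  # skip malformed records that carry no identifier
        # Percent-encode the identifier so ':' and '/' are safe in a file name
        # (hypothetical naming scheme).
        target = directory / (quote(identifier, safe="") + ".xml")
        target.write_bytes(ET.tostring(record, encoding="utf-8"))
        written.append(target)
    return written


if __name__ == "__main__":
    print(get_record_url("http://repository.example.org/oai",
                         "oai:example.org:42", "oai_dc"))
```

Because the function looks for every `record` element, the same sketch works for ListRecords responses, which simply contain more than one record.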



Listing all the queued Tasks



Clicking the first icon on the taskbar returns a table that lists all waiting tasks:

Screenshot all waiting tasks.





Listing all the finished Requests



Clicking the second icon on the taskbar returns a table that lists all finished requests:

Screenshot all finished requests





Listing all the registered Repositories



Clicking the third icon on the taskbar returns a table that lists all registered repositories:

Screenshot all registered repositories





Listing all servers with access problems



Clicking the fourth icon on the taskbar returns a table that lists all the servers that could not be harvested successfully. Please note that only servers whose error count reaches the user-defined threshold (default: 5 or more errors) are listed, and that the error count is reset to 0 after every successful harvest.
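
The counting rule described above can be sketched as follows. This is a minimal Python illustration of the rule as stated in this document, not the Harvester Manager's actual code; the class `AccessProblemTracker` and its method names are hypothetical.

```python
from collections import defaultdict


class AccessProblemTracker:
    """Counts failed harvests per server and reports servers at or above a threshold."""

    def __init__(self, threshold: int = 5):  # default: 5 or more errors
        self.threshold = threshold
        self._errors = defaultdict(int)

    def record_failure(self, server: str) -> None:
        self._errors[server] += 1

    def record_success(self, server: str) -> None:
        # Every successful harvest resets the error count to 0.
        self._errors[server] = 0

    def problem_servers(self) -> list:
        """Return the servers that would appear in the table, sorted by name."""
        return sorted(name for name, count in self._errors.items()
                      if count >= self.threshold)


if __name__ == "__main__":
    tracker = AccessProblemTracker()
    for _ in range(5):
        tracker.record_failure("repository.example.org")
    print(tracker.problem_servers())
    tracker.record_success("repository.example.org")
    print(tracker.problem_servers())
```

A server with four consecutive failures is therefore not yet listed, and a single successful harvest removes a listed server from the table again.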






Disclaimer


The present OAI-PMH Harvester Manager software is experimental software (a so-called "beta release"). The software and its specification may include technical inaccuracies or typographical errors. Changes are periodically made to the information therein; these changes will be incorporated into new versions of the specification, if any. Use of the software is at your own risk. In no event will the SourceForge project or the German National Library be liable for any damage to hardware or software, lost data, or other special, indirect, consequential, incidental or punitive damages, however caused, arising from or related to any use of the specification or the software. See sections 15 and 16 of the licence.
In some countries, the cryptographic software or other components contained in the software may be subject to special export regulations or software patents. In such cases, the software may not be distributed to or within those countries.



© Copyright 2006-2009 - Kadir Karaca Koçer, German National Library, All Rights Reserved.
