DSpace AIP Extractor

DSpace is a repository web service developed by DuraSpace, an open source organization, together with several universities and institutions. It can safely store a range of different documents, and it supports the organization of these documents, easy backup and intuitive administration. Version 1.7 of DSpace introduced a feature called AIP (Archival Information Packages). It was created to facilitate data backup and consists of a format that stores all groups, collections and items (the documents themselves) from a DSpace instance.

The DSpace AIP Extractor uses this feature to perform an AIP backup and store it on the target machine. It then returns the system path where the generated AIP was stored.
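The page does not say how the extractor triggers the backup internally. As a rough manual equivalent, the sketch below builds an AIP export command with DSpace's `packager` launcher command and then prints the output path. The installation folder, e-person address, handle and output file are all placeholder assumptions. By default the command is only printed; set `DRY_RUN=0` on a DSpace host to execute it.

```shell
#!/bin/sh
# Sketch only: all four values below are placeholders, not taken from the extractor.
DSPACE_HOME="${DSPACE_HOME:-/dspace}"          # DSpace installation folder
EPERSON="${EPERSON:-admin@example.org}"        # account that runs the export
HANDLE="${HANDLE:-123456789/0}"                # handle of the object to export
OUT="${OUT:-/tmp/dspace-aip/site-aip.zip}"     # where the AIP is written

run() {
    # Print the command when DRY_RUN is 1 (the default); execute it otherwise.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

aip_export() {
    # -d: disseminate (export), -a: include all child objects, -t AIP: package type
    run "$DSPACE_HOME/bin/dspace" packager -d -a -t AIP \
        -e "$EPERSON" -i "$HANDLE" "$OUT" &&
        printf '%s\n' "$OUT"   # report where the AIP was (or would be) stored
}

aip_export
```

Printing the output path last mirrors the behaviour described above, where the extractor returns the location of the generated AIP.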

Besides the AIP backup, this module also crawls through the main DSpace installation folder and searches for files that may have been customized. When it finds any, it returns their paths.
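The page does not describe how the module decides that a file is customized. Purely as an illustration of such a crawl, the sketch below lists files modified after a reference file. Both the heuristic and the names `DSPACE_HOME` and `REFERENCE` are assumptions, not the extractor's actual logic.

```shell
#!/bin/sh
# Sketch: list files under a folder that were modified after a reference file.
# The "newer than a marker file" heuristic is an assumption for illustration.
changed_since() {
    # $1 = folder to crawl, $2 = reference file (e.g. one created at install time)
    find "$1" -type f -newer "$2" -print
}

DSPACE_HOME="${DSPACE_HOME:-/dspace}"                  # placeholder install folder
REFERENCE="${REFERENCE:-$DSPACE_HOME/.installed}"      # hypothetical marker file

# Only crawl when both the folder and the marker exist on this machine.
if [ -d "$DSPACE_HOME" ] && [ -e "$REFERENCE" ]; then
    changed_since "$DSPACE_HOME" "$REFERENCE"
fi
```

A comparison against checksums of a pristine DSpace release would be more reliable than modification times, but it needs a reference copy of the installation.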

Creator
Caixa Magica Software

License
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Find at opensourceprojects.eu
https://opensourceprojects.eu/p/timbus/context-population/extractors/dspace-aip-extractor

Go to the DSpace AIP Extractor's page on opensourceprojects.eu for more details about requirements, interaction and source code by following the link above.

How to install DSpace AIP Extractor

Tools:
- Git – Git is a distributed revision control and source code management (SCM) system with an emphasis on speed, data integrity, and support for distributed, non-linear workflows. In the Timbus project, Git is used to facilitate distributed development and cooperation between partners. All tools are available on the www.opensourceprojects.eu website. To get this particular tool, once Git is installed, run the following on the command line:
  git clone https://opensourceprojects.eu/git/p/timbus/context-population/extractors/linux-hw
  It will save the project in a new folder.
- Java 1.7 or later – A simple tutorial on how to install the required Java version on different platforms is available at https://opensourceprojects.eu/p/timbus/wiki/How%20to%20install%3A%20Java/
- Maven – Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information. It is used to manage all dependencies and the build of the project. Besides fetching all required dependencies during the build, it allows fine-grained control over the whole build process.
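Before building, it can help to confirm that the three tools listed above are available. The following is a minimal sketch that only checks whether `git`, `java` and `mvn` are on the PATH and prints the first line of the Java version banner. The helper name `have` is made up for this example, and the script does not verify the version number itself.

```shell
#!/bin/sh
# Report whether each build prerequisite is available on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

for tool in git java mvn; do
    if have "$tool"; then
        printf 'found:   %s\n' "$tool"
    else
        printf 'missing: %s\n' "$tool"
    fi
done

# Java writes its version banner to stderr; check by eye that it is 1.7 or later.
if have java; then
    java -version 2>&1 | head -n 1
fi
```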

Artifacts:
- Maven pom parents: osgi (2-beta-4) and core (2-beta-3) – As various projects within Timbus use common dependencies and have similar build behaviour, a series of Maven parents was created to facilitate the creation of new tools. Declaring a pom parent in the main project pom file states that the current project extends its parent and therefore follows its behaviour. However, it is possible to override certain build options in order to adapt the parent to the context of the tool being developed. Like any other tool or artifact in Timbus, the parent is stored in the opensourceprojects repository and can be fetched by running the following command:
  git clone https://opensourceprojects.eu/git/p/timbus/support/maven-parents timbus-support-maven-parents
- Extractors-core (currently version 0.0.3-RELEASE) – This module sets the behaviour for all remote extractors in TIMBUS, and it is necessary to invoke this artifact within the Virgo environment (for more detail, see the description of this tool). You can get it by running the command:
  git clone https://@opensourceprojects.eu/git/p/timbus/context-population/extractors-core

The project does have other dependencies, but all of them are fetched automatically from remote repositories during the build process.

Step-by-step compilation
Once all requirements are met, the following steps are necessary to properly compile the project:
- Install the Maven parents in the local Maven repository. All other project dependencies are available in remotely accessible repositories, but the parents have to be installed so that Maven is able to find them when compiling:
  • Go to timbus-support-maven-parents/core
  • Run “mvn install”. This command will recognize the pom.xml file in the folder and install it properly in the local repository.
  • Go to timbus-support-maven-parents/osgi and run the same command.
- Install the extractors-core artifact dependency in your local repository. To do this, go to the root directory of the previously fetched extractors-core project and run “mvn install”.
- Go to the DSpace AIP Extractor project's main folder and run “mvn clean package”. This command will build the project and place the result in a “target” folder. The “clean” option removes any existing “target” folder before building.
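The build order above can be sketched as a small script. It assumes that all repositories were cloned side by side into one workspace folder, that extractors-core was cloned under its default folder name, and that the extractor itself lives in a folder named by `EXTRACTOR_DIR`. That folder name is hypothetical and should be replaced with the actual one. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of the build order; prints commands by default (DRY_RUN=1).
WORKSPACE="${WORKSPACE:-$PWD}"                           # assumption: all clones share this folder
EXTRACTOR_DIR="${EXTRACTOR_DIR:-dspace-aip-extractor}"   # hypothetical folder name

mvn_in() {
    # Run Maven with the given goals inside the folder passed as $1.
    dir=$1
    shift
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '(cd %s && mvn %s)\n' "$dir" "$*"
    else
        (cd "$dir" && mvn "$@")
    fi
}

build_all() {
    # Parents first, then extractors-core, then the extractor itself.
    mvn_in "$WORKSPACE/timbus-support-maven-parents/core" install &&
    mvn_in "$WORKSPACE/timbus-support-maven-parents/osgi" install &&
    mvn_in "$WORKSPACE/extractors-core" install &&
    mvn_in "$WORKSPACE/$EXTRACTOR_DIR" clean package
}

build_all
```

Chaining the steps with `&&` stops the build at the first failure, so a missing parent is reported before Maven tries to compile the extractor.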

If everything was successful, the compiled .jar file can now be found in the “target” folder. You can deploy this .jar file into a configured Virgo installation (see THIS GUIDE for instructions on how to install and set up Virgo).

Create a New Extractor
To learn how to develop an extractor, follow this link to a tutorial by Caixa Magica Software: https://opensourceprojects.eu/p/timbus/context-population/extractors/wiki/How%20to%20create%20a%20new%20Extractor/

Learn more about the extractors from the video about the Linux Hardware Extractor.
