Collection DMP
The Oregon Flora Collection: For The Average Plant Enthusiast
The Oregon Flora Collection is a digital research collection for the average plant enthusiast to enjoy. We chose Oregon flora as our topic because our interest in flowers brought us all together, despite our different majors and interests. This collection focuses on botanical photographs and illustrations captured in Oregon as early as 1862. We hope the average Oregonian can access our collection and enjoy Oregon’s flora without the stress scientific names can often add for people who just want to enjoy plants.
Data Curation Methodology
Our group accessed Oregon forestry-related data via GLAMS, gathering objects and metadata. Each member used a personal Dropbox for object storage and an individual Excel sheet for metadata. SharePoint was utilized for shared documents with standardized file naming and citation format. Initially, metadata was consolidated in a shared Excel sheet. Later, complete metadata for selected objects related to industry workers was added to another shared Excel sheet. For updated and comprehensive metadata, a shared Google Drive folder with personal Google Sheets was employed. After data verification and matching, it was compiled into a shared Google Sheet for GitHub upload. Additionally, objects were resized, redownloaded to personal computers, and stored in Dropboxes as backups.
Roles & Responsibilities
The Oregon Flora Collection was created by four students: Chloe Gold, Cassidy Perkins, Danielle Lichtenstein, and Haley Sherman. Chloe is the Project Manager. She coordinated group meetings and assignments, acted as a point person for questions, and submitted all projects on behalf of the group. She also filled in if people were unable to complete their tasks. Cassidy, the Object Preservation Manager, reviewed and revised file naming standards, ensuring that files were consistently named, properly formatted, and machine readable. She also managed file types and formats to ensure platform compatibility while maintaining file quality (e.g., converting TIFF files into JPEGs using the GIMP image editing software). In addition, Cassidy took on tasks outside her role as Object Preservation Manager, such as collaborating with other group members to ensure metadata was consistent and correctly formatted, organizing files and their subjects, and assigning serial numbers. Danielle is the Repository Manager. She collaborated closely with the Project Manager to resolve any issues related to the repository’s structure or data, ensured the successful upload of materials, and conducted quality checks on the objects and information. Haley is the Metadata Manager. She reviewed and edited metadata protocols, reviewed and edited object descriptions, and assisted in the group’s effort to standardize data entry.
Expected Data
Data Types
There are 26 images (.jpg or .jpeg) in our digital collection.
The following table summarizes the collection by work type, listing the file format, total number of assets, average file size, and total size of the files included.
Work Type | File Format | Total Assets | Average File Size (MB) | Total Size (MB) |
---|---|---|---|---|
image | jpg/jpeg | 26 | 0.47 | 12.23 |
Our objects consist exclusively of jpeg/jpg images ranging in size from 0.012 MB to 3.1 MB, for a total of 12.23 MB.
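The figures in the table can be recomputed from a folder of images. The following is a minimal sketch, not the method we used: the function names are hypothetical, and it assumes decimal megabytes (1 MB = 1,000,000 bytes), which may differ from how a given file browser reports sizes.

```python
from pathlib import Path

# Assumption: decimal megabytes. Use 1024 * 1024 for binary megabytes instead.
BYTES_PER_MB = 1_000_000


def summarize_sizes(sizes_in_bytes):
    """Return count, total, average, smallest, and largest size in MB."""
    sizes_mb = [size / BYTES_PER_MB for size in sizes_in_bytes]
    if not sizes_mb:
        return {"count": 0, "total_mb": 0.0, "average_mb": 0.0,
                "min_mb": 0.0, "max_mb": 0.0}
    total = sum(sizes_mb)
    return {
        "count": len(sizes_mb),
        "total_mb": round(total, 2),
        "average_mb": round(total / len(sizes_mb), 2),
        "min_mb": round(min(sizes_mb), 3),
        "max_mb": round(max(sizes_mb), 3),
    }


def summarize_folder(folder):
    """Summarize every .jpg/.jpeg file directly inside the given folder."""
    sizes = [
        path.stat().st_size
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}
    ]
    return summarize_sizes(sizes)
```

Calling summarize_folder on the folder that holds the upload-ready images would give values comparable to the table row above, subject to the megabyte assumption.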
Data Handling
The objects included in our collection were collected, compiled, and cataloged by Chloe Gold, Cassidy Perkins, Danielle Lichtenstein, and Haley Sherman and are considered to be in the public domain or usable under fair use, based on their Creative Commons licenses. Our efforts began in SharePoint with a shared folder dedicated to our group, titled “Group 1.” This folder, held within the LIB 350M SharePoint website, served as the storage place for our data and metadata. We used Microsoft Word for text documents and Excel to begin cataloging our objects and metadata in a spreadsheet called “ofc_objectcatalog.xlsx.” Our objects were initially downloaded in their original formats, regardless of GitHub compatibility, and placed in a general objects folder later named “002objects.” From there, any original TIFF files were moved to a raw object folder labeled “001rawobjects” for storage, and copies were converted into JPEG files using GIMP and returned to the general object folder. Once we had compiled all of our objects, we transferred our data from Excel to Google Sheets so that we could export it as a .csv (comma-separated values) file for compatibility with GitHub and CollectionBuilder. Our project is complete as of March 2024, and we do not currently plan to add objects in the future.
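The folder layout described above can be expressed as a small planning function. This is only an illustrative sketch: the conversion itself was done by hand in GIMP, the function name plan_conversion is hypothetical, and the plant name in the usage example is made up. It only works out where a TIFF original and its JPEG copy would live; it does not move or convert any files.

```python
from pathlib import PurePosixPath

RAW_FOLDER = "001rawobjects"    # originals kept in their downloaded format
OBJECT_FOLDER = "002objects"    # upload-ready JPEG files
TIFF_EXTENSIONS = {".tif", ".tiff"}


def plan_conversion(filename):
    """Return (raw_path, converted_path) for a TIFF original.

    Raises ValueError for any other file type, since only TIFF files
    are described as needing the raw folder and a JPEG copy.
    """
    path = PurePosixPath(filename)
    if path.suffix.lower() not in TIFF_EXTENSIONS:
        raise ValueError(f"not a TIFF file: {filename}")
    raw_path = PurePosixPath(RAW_FOLDER) / path.name
    # The JPEG copy keeps the same base name with a .jpeg extension.
    converted_path = PurePosixPath(OBJECT_FOLDER) / path.with_suffix(".jpeg").name
    return str(raw_path), str(converted_path)


# Usage with a made-up filename:
raw, converted = plan_conversion("ofc_003_tigerlily_1900.tiff")
print(raw)        # 001rawobjects/ofc_003_tigerlily_1900.tiff
print(converted)  # 002objects/ofc_003_tigerlily_1900.jpeg
```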
File Naming Standard
Filenames should use the following format: collectionabbreviation_serialnumber_commonplantname_year.filetype
Example: ofc_001_crimsoncolumbine_1862.jpeg
- collectionabbreviation: ofc – Oregon Flora Collection
- serialnumber: three-digit number assigned to each object in a subjective order determined during collection development, to help further distinguish and organize objects
- commonplantname: the name most commonly used to identify the plant, chosen because our audience is everyday plant enthusiasts rather than specialists
- year: year the image was taken, or the earliest estimated date or decade. If no date is available, 0000 is used as a placeholder.
- filetype: type of file (e.g., jpg/jpeg)
Files are to be named in this order to ensure consistency and machine readability throughout the collection. The most general component of our file naming standard, the collection title abbreviation, is to appear first for the purpose of easy access to collection materials in the file search system. All file names are to be lower case with no spaces or special characters (excluding underscores) to ensure machine readability.
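The naming standard above can be checked automatically. The following is a minimal sketch rather than a tool we used: the function name parse_filename is hypothetical, and the pattern assumes that common plant names contain only lowercase letters and digits and that the file type is jpg or jpeg.

```python
import re
from typing import Optional

# collectionabbreviation_serialnumber_commonplantname_year.filetype
# Assumptions: plant names use only lowercase letters and digits,
# and only jpg/jpeg files are expected.
FILENAME_PATTERN = re.compile(
    r"(?P<collection>ofc)_"
    r"(?P<serial>\d{3})_"
    r"(?P<name>[a-z0-9]+)_"
    r"(?P<year>\d{4})\."
    r"(?P<filetype>jpg|jpeg)"
)


def parse_filename(filename: str) -> Optional[dict]:
    """Return the filename's components, or None if it breaks the standard."""
    match = FILENAME_PATTERN.fullmatch(filename)
    return match.groupdict() if match else None


print(parse_filename("ofc_001_crimsoncolumbine_1862.jpeg"))
# {'collection': 'ofc', 'serial': '001', 'name': 'crimsoncolumbine',
#  'year': '1862', 'filetype': 'jpeg'}
print(parse_filename("OFC_1_Crimson Columbine_1862.jpeg"))
# None
```

A year of 0000 is accepted by the pattern, so undated objects that use the placeholder still pass the check.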
Legal and Ethical Restrictions
At this time, we do not expect any legal or ethical restrictions or concerns regarding the objects and content of our digital collection. All objects in the collection have copyright statuses permitting general reuse or educational use. In addition, our topic, flora, is relatively noncontroversial and generally shouldn’t pose ethical concerns. Our collection consists of images and illustrations that were previously cataloged in other collections. Objects of special interest or concern, such as endangered or poisonous plant species, have been tagged as such to encourage further interest and exploration.
Aggregated and Shared Data
All of the images within our collection will be available for download through our public CollectionBuilder repository page. Many of our objects are in the public domain or available for reuse with relatively few restrictions. Information on rights and Creative Commons licensing is included in the object metadata on each individual item’s page so that viewers understand their rights for reuse.
Period of Data Retention
Once uploaded to the group GitHub repository, the data will be retained and will remain available and visible on CollectionBuilder. The information submitted to GitHub comes from SharePoint and the group folder on Google Drive called “ofc_group1_lib400.”
Data Formats and Dissemination
Our final dataset is accessible to the public via a digital collection that can be found here. It is made up of 26 images formatted as JPEGs or JPGs that we uploaded to our website using CollectionBuilder and GitHub. We chose JPEGs or JPGs for their compatibility with most devices. CollectionBuilder provided us with a template entitled “CollectionBuilder-GH” that made it easier to add our objects. These objects were sourced from a variety of existing databases, including the Oregon Institute of Marine Biology, Oregon State University’s Special Collections and Archives, Oregon Flora, Oregon Historical Society, and Oregon Digital. Our objects have a variety of rights and Creative Commons statuses, which can be found on their individual item pages. Each status is also linked to help our users understand both their own rights and the creators’ rights. Despite the variety of statuses, all objects are available for educational use.
Data Storage and Preservation of Access
The project’s data is stored across multiple platforms. The initial dataset, composed of metadata spreadsheets and object images, is available on Group 1’s team SharePoint, accessible to any team member. A folder within the SharePoint titled “001rawobjects” contains the original object downloads that were in an incompatible format (TIFFs, for example), and a folder titled “002objects” contains the converted files in JPEG format, ready for upload. Copies of the data from SharePoint have been transferred to Google Drive, where they can be accessed by anyone with a Google account. Additionally, the project data is hosted in a GitHub repository and is accessible via the CollectionBuilder site.
Appendix
Metadata Application Profile
Our metadata application profile is available in our GitHub repository: Group 1 Metadata Application Profile
File Naming and Citation Conventions
Our file naming and citation conventions can be found in our GitHub repository: Group 1 File Naming and Citation Conventions.