Databases
Design, query, and evaluate information retrieval systems.
Introduction
An information retrieval system is designed so that users can query it successfully for the information they need. Evaluation determines whether the implemented design is efficient and useful in every context it must serve. As G. G. Chowdhury explains, “Information retrieval is concerned with all the activities related to the organization of, processing of, and access to, information of all forms and formats” (93). The purpose, functions, and components of an information retrieval system need to be designed based on user needs. The purpose is simple: “to retrieve documents or information required by the user community” (Chowdhury, 98). To serve this purpose, the information retrieval system must function by “analyz[ing] the contents of the sources of information as well as the users’ queries, and then match these to retrieve those items that are relevant” (Chowdhury, 98). This description makes retrieval sound simple, until one realizes that an information retrieval system has multiple components, including document, indexing, vocabulary, searching, and matching subsystems, that together generate search results in a user-system interface (Chowdhury, 98-100).
Design
When a database for information retrieval is designed, the main goal is efficient search and retrieval. Search is defined as “a process of attempting to use a word or combination of words to retrieve the documents that will meet some kind of information need” (Weedman, 119). Understanding how people search, enter data, and browse through information will aid the design team; user studies and surveys can help develop a database that truly fills the needs of the user community. A requirements analysis determines the basic functions that the database should provide. In library settings, guidelines and standards are often followed for controlled vocabularies and metadata. All librarians will benefit from learning XML, EAD, and MARC, and how to crosswalk between multiple coding standards. When creating a database, standards for controlled vocabularies, such as the Library of Congress Subject Headings, should be followed. Using these standards provides consistency across multiple databases, which greatly helps data migrate successfully when platforms and databases evolve or change (Weedman, 223).
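A crosswalk between coding standards can be pictured as a lookup table from one standard's fields to another's. The sketch below is a minimal illustration in Python, not a complete or official crosswalk: the four MARC-to-Dublin-Core pairings, the record layout, and the function name `crosswalk` are simplifying assumptions chosen for the example.

```python
# Illustrative, simplified MARC-to-Dublin-Core mapping.
# This is an assumption for the sketch, not a complete or official crosswalk.
MARC_TO_DC = {
    "245": "title",
    "100": "creator",
    "650": "subject",
    "520": "description",
}

def crosswalk(marc_record):
    """Convert {marc_tag: [values]} into {dc_element: [values]}.

    Tags with no mapping are returned separately so that nothing is
    silently lost during a migration.
    """
    dc_record = {}
    unmapped = []
    for tag, values in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            unmapped.append(tag)
            continue
        dc_record.setdefault(element, []).extend(values)
    return dc_record, unmapped

# Hypothetical record for a handbag collection
sample = {
    "245": ["Handbags of the twentieth century"],
    "650": ["Handbags", "Fashion design"],
    "999": ["local note"],
}
dc, skipped = crosswalk(sample)
print(dc)       # title and subject values carried over
print(skipped)  # tags that need a manual decision
```

Reporting the unmapped tags, rather than dropping them, reflects why consistent standards matter for migration: any field outside the shared standard needs a manual decision.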
Query
Information retrieval is a science. Different types of search engines will generate varied results depending on how they are designed. Many search engines can be searched efficiently with a knowledge of the Boolean operators (AND, OR, NOT), which refine search results, but not every engine supports them. Nesting (using parentheses) tells the search engine to evaluate the terms inside the parentheses first, before combining that result with the terms outside. Large engines like Google take misspellings into consideration, but smaller databases often lack this capability. Instead, results can be refined with field restrictions, range searching, or synonym generators (Tucker, 316-317). Bibliographic databases and OPACs often include features for narrowing down items or finding similar items; many also link a source to all the other sources that cite it. These features have been enhanced over the years based on user needs. At one point, tagging was imperative when creating a new source, but now databases and search engines crawl the web and index the actual contents of a website. An exciting improvement in search capabilities is image searching: pictures can now be searched, not just words. Image searching opens whole new possibilities for research.
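The effect of Boolean operators and nesting can be sketched with Python sets. This is a minimal illustration, not how any particular search engine is implemented: the four records, the index structure, and the helper `term` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mini-collection: record id -> indexed text
RECORDS = {
    1: "folklore archive oral history",
    2: "folklore music recordings",
    3: "archive finding aid music",
    4: "oral history music",
}

def build_index(records):
    """Map each word to the set of record ids that contain it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.lower().split():
            index[word].add(record_id)
    return index

INDEX = build_index(RECORDS)

def term(word):
    """Return the ids of records containing the word (empty set if none)."""
    return INDEX.get(word.lower(), set())

# folklore AND (music OR archive): the parenthesized OR is evaluated first
nested = term("folklore") & (term("music") | term("archive"))

# (folklore AND music) OR archive: same words, different grouping
regrouped = (term("folklore") & term("music")) | term("archive")

# music NOT folklore: remove records that mention folklore
excluded = term("music") - term("folklore")

print(sorted(nested))     # [1, 2]
print(sorted(regrouped))  # [1, 2, 3]
print(sorted(excluded))   # [3, 4]
```

The same three words return different records depending on where the parentheses sit, which is why nesting matters when a query mixes AND and OR.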
Evaluate
There are multiple things to check when evaluating an information retrieval system design. First, the collection of documents or objects should be evaluated. Is it well represented in the system? Are new records efficiently created for use? Do the collection records describe the contents in a manner that is understandable to the user community? After analyzing the records and collections, the search terms should be evaluated. Do they make sense? Do the search terms follow the chosen standards? Do the search terms represent the collection well? The searchable terms rely on a controlled vocabulary. Evaluation can determine if this vocabulary works well for the database and its users. The search engine should also be evaluated for functionality; any errors with links or retrieval should be remedied. Finally, the user experience is an imperative area of evaluation (Tucker, 350-351). A repository might think its database works well, but librarians who use it daily have a comfort level with the database that the typical visitor does not have. User studies and needs assessments can determine if the database functions well for all users.
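One of the checks above, whether the search terms follow the chosen controlled vocabulary, lends itself to a simple automated audit. The following Python sketch assumes a hypothetical vocabulary, record layout, and function name (`audit_subjects`); a real evaluation would still need user studies alongside it.

```python
# Hypothetical controlled vocabulary and records, for illustration only
CONTROLLED_VOCABULARY = {"Handbags", "Leather goods", "Fashion design"}

RECORDS = [
    {"id": "rec-001", "subjects": ["Handbags", "Fashion design"]},
    {"id": "rec-002", "subjects": ["Purses", "Leather goods"]},
    {"id": "rec-003", "subjects": []},
]

def audit_subjects(records, vocabulary):
    """Report records whose subject terms are missing or not in the vocabulary."""
    report = {}
    for record in records:
        problems = [t for t in record["subjects"] if t not in vocabulary]
        if not record["subjects"]:
            problems.append("<no subject terms>")
        if problems:
            report[record["id"]] = problems
    return report

report = audit_subjects(RECORDS, CONTROLLED_VOCABULARY)
print(report)
# {'rec-002': ['Purses'], 'rec-003': ['<no subject terms>']}
```

A flagged term such as "Purses" is not necessarily an error; it may instead show that the vocabulary is missing a term that users actually search for.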
Evidence
The following evidence reflects my awareness and achievements towards accomplishing goals related to the design, query, and evaluation principles of information retrieval system design.
Evidence A: Beta Database Design
INFO202 Information Retrieval System Design
This group assignment for INFO202 Information Retrieval System Design was a creative endeavor to design a data structure for a collection of non-traditional objects. The group used handbags to explore different ways that objects can be categorized and presented to a user community. The target audience is described in the Statement of Purpose, which is followed by a workflow process for managing records in WebDataPro. The data entry steps include rules for which type of information goes in each section. For this group assignment, we met via Zoom while creating the rules in a shared Google Doc. After drafting the rules, we each entered a handbag into the system to test the design. I completed the final proofreading and formatting in Microsoft Word before submission.
Group 12: Project 1 Beta Design Document
[embeddoc url=”https://blogs.uoregon.edu/sarahfisherportfolio/files/2022/03/Group-12_202_Project-1-Beta.pdf” download=”all” viewer=”google”]
Evidence B: Website Evaluation and Redesign
INFO202 Information Retrieval System Design
This group project for INFO202 Information Retrieval System Design called for the group to evaluate a library website to improve navigation. The group evaluated the Huntington Beach Public Library website and provided recommendations for improving the website’s structure, organization, and labeling to achieve a better user experience when navigating the site. The project includes a current site map and a revised site map, with the primary recommendation of separating the library website from the city website, which will ease library user frustration. For this assignment, the group met via Zoom while co-editing in Google Docs. I completed the proofreading and formatting in Microsoft Word before submission.
Project 3: Huntington Beach Public Library Website Redesign
[embeddoc url=”https://blogs.uoregon.edu/sarahfisherportfolio/files/2022/03/Info-202-Project-3-Group-12.pdf” download=”all” viewer=”google”]
Evidence C: Image Indexing
INFO247 Vocabulary Design
This assignment for INFO247 Vocabulary Design explores concept-based image indexing versus content-based image indexing through searching digital images in online databases. Concept-based searching relies on standard search terms, such as subjects, assigned to an image. Reverse image lookup features, found on Google, TinEye, and iStock, offer new ways to find information beyond those terms. Content-based searching is a visual feature search: the engine scans the image and looks for other images with similar features. This can narrow down results greatly, especially when looking for a specific image. Content-based searching does not always meet the user’s search needs, though, because only the main feature of the image is indexed. Smaller, less noticeable features in an image are ignored, which limits the possible search results.
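The difference between the two approaches can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the image names, subject terms, and four-pixel "images" are invented, and color-histogram intersection stands in for the far richer visual features that real reverse image engines use.

```python
from collections import Counter

# Hypothetical images: assigned subject terms (concept-based)
# plus a tiny list of pixel colors (content-based)
IMAGES = {
    "beach.jpg": {"subjects": {"beach", "ocean"},
                  "pixels": ["blue", "blue", "tan", "tan"]},
    "lake.jpg": {"subjects": {"lake", "mountains"},
                 "pixels": ["blue", "blue", "blue", "green"]},
    "forest.jpg": {"subjects": {"forest", "trees"},
                   "pixels": ["green", "green", "green", "brown"]},
}

def concept_search(term):
    """Concept-based: match the query word against assigned subject terms."""
    return sorted(name for name, img in IMAGES.items() if term in img["subjects"])

def histogram(pixels):
    """Proportion of each color in the image."""
    counts = Counter(pixels)
    return {color: n / len(pixels) for color, n in counts.items()}

def similarity(pixels_a, pixels_b):
    """Histogram intersection: 1.0 means identical color proportions."""
    ha, hb = histogram(pixels_a), histogram(pixels_b)
    return sum(min(ha[c], hb.get(c, 0.0)) for c in ha)

def content_search(query_pixels):
    """Content-based: rank images by visual similarity to the query image."""
    scored = [(similarity(query_pixels, img["pixels"]), name)
              for name, img in IMAGES.items()]
    return [name for score, name in sorted(scored, reverse=True)]

print(concept_search("ocean"))   # ['beach.jpg']
print(concept_search("blue"))    # [] because no subject term says "blue"
print(content_search(["blue", "tan", "tan", "tan"]))
# ['beach.jpg', 'lake.jpg', 'forest.jpg']
```

The dominant colors drive the ranking, which mirrors the limitation noted above: a small detail that occupies few pixels barely changes the score.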
Image Indexing
[embeddoc url=”https://blogs.uoregon.edu/sarahfisherportfolio/files/2022/02/INFO247-Image-Indexing.pdf” download=”all” viewer=”google”]
Evidence D: ArchivesSpace and Archives West
Folklore Archivist Collection Coordinator, RVMA, UO
As the folklore archivist, I manage an instance of ArchivesSpace for the repository. This week, a new academic term started on campus, which means changes in repository staffing. In ArchivesSpace, I removed the winter term graduate students and intern as users since they no longer need access. I kept one student user account but changed the password for the new intern. I keep these accounts restricted. In my experience, students often accidentally delete entire records. Repeating work is not productive, so student accounts no longer have a delete button. This is one example of managing a database. Most of my work time is spent checking records, entering new records into the database, or doing maintenance on vocabulary design.
Database Management
Finding aids are first created on ArchivesSpace.
ArchivesSpace
When a record is entered, reviewed, and finally published on ArchivesSpace, I download the EAD file. On Archives West, I upload the file for conversion, and then upload the converted file into the online Archives West database. This shares the repository’s records with a wider user audience.
Archives West
Conclusion
When I was completing my first master’s degree, I worked as a student archivist in the Folklore Archive. Upon graduation, I took over as head of the archive, but I quickly realized that I was lacking in knowledge. I needed an MLIS to truly be good at my job. Now, thanks to SJSU, I can create full XML records, I understand vocabulary design, and I have confirmed that I did teach myself how to do Library of Congress classification correctly. My job has become much easier, archival management is more efficient, and the repository is better organized.
References
Chowdhury, G. G. “Basic Concepts of Information Retrieval Systems.” Information Retrieval System Design: Principles & Practice, V. Tucker, editor, edition 6.0. Academic Pub, 2019. 93-105.
Tucker, Virginia M. “Evaluation.” Information Retrieval System Design: Principles & Practice, V. Tucker, editor, edition 6.0. Academic Pub, 2019. 349-357.
Tucker, Virginia M. “Search.” Information Retrieval System Design: Principles & Practice, V. Tucker, editor, edition 6.0. Academic Pub, 2019. 317-326.
Weedman, Judy. “Designing for Search.” Information Retrieval System Design: Principles & Practice, V. Tucker, editor, edition 6.0. Academic Pub, 2019. 119-139.
Weedman, Judy. “The Design Process.” Information Retrieval System Design: Principles & Practice, V. Tucker, editor, edition 6.0. Academic Pub, 2019. 220-232.