2010 TLA Annual Conference Abstract: With the introduction of an Electronic Resources Management product, the University Libraries of the University of Memphis began an ongoing project to provide access to all databases, including free government databases, through the online public catalog. Discovery, selection, and presentation of the resources are discussed.
For over 100 years, the United States government has mandated the distribution of government information to the public. Today this dissemination is managed through the Federal Depository Library Program (FDLP). Until recently, distribution took the form of sending boxes of tangible materials to depository libraries, where they were cataloged and housed. It is now estimated that well over 90% of all federal publications are available in electronic format on the web. The greatest advantage of this electronic access is that major federal databases are now freely available to anyone with Internet access. The vast amounts of data contained in the American FactFinder and PubMed databases are well known: their respective agencies, the Bureau of the Census and the National Library of Medicine (NLM), have maintained these resources for years, and in the case of NLM, through many different online iterations.
With the shift from print to online delivery, it has grown easier to locate government information. NLM, the Census, and hundreds of other federal and state agencies generate material that is as useful as it is freely available, reflecting the long-standing adage among documents librarians that government information is lost treasure that people simply do not know about.
Government resources are useful and free, but still, patrons need assistance in finding them. The ease of locating a government document through a Google or a Google Uncle Sam search versus using the Monthly Catalog is something that only librarians can truly appreciate. It has been decades since average users touched the Monthly Catalog, and while they use Google with abandon, they still need help in finding the best resources from among the 1,365,817 hits their searches inevitably generate. Database resources—whether government or not—should be accessible, visible, and obvious. They should be placed in the way of potential users; library patrons, in a phrase, should be falling over them.
For this reason, librarians at the University Libraries of the University of Memphis decided to merge free government databases with licensed for-fee databases in their recently acquired Electronic Resources Management (ERM) tool. This paper focuses on how these resources were selected for incorporation into the ERM and thence into the online public catalog (OPAC).
In 2008, the University Libraries migrated its integrated library system (ILS) and OPAC from DRA Classic to an Innovative Millennium platform. An electronic resources management module was included in the new Millennium ILS. Implementation of the ERM began the following year, in 2009. Until the advent of the ERM, the University Libraries had attempted to manage access to its database holdings through multiple, increasingly lengthy, alphabetic lists: the first, a typical A-Z list with the names of the databases accompanied by brief information; the second, with names and fuller descriptions. In addition, there were ten pages listing databases in broad disciplinary categories (e.g. Science, Social Sciences, etc.). All these pages were mounted on the library’s website. As HTML creations, they could only be edited singly, and the longer the pages became, the more time it took to maintain them. Even in a small library, as numbers of electronic resources grow, A-Z lists would only become longer; in a large library, serving research interests in many disciplines, just the maintenance of access was unwieldy, inefficient, and time-consuming.
More importantly, and beyond the time involved in maintenance, A-Z lists have significant problems, despite librarians’ fondness for them. Their most significant flaw is that they require that users know the name of any resource they might want to use before they can find it. When the small percentage of people who actually ask a librarian for help is factored into the mix, A-Z lists become even less useful. Considering that most of those who do not ask for help are apt simply to pick something from the top of a list, A-Z lists are close to useless insofar as the majority of users is concerned.
Government Databases Background
Some government databases had always been incorporated into the University Libraries’ alphabetic lists (PubMed, Thomas, American FactFinder, ERIC), so there was no question that these would also be included in the ERM. In fact, recent years had seen the numbers of incorporated government databases grow beyond those usual suspects, driven by the frustrated question of a faculty member from the School of Nursing: “Why do I have to go to UT to use PubMed Nursing?” This question exemplifies the problem of leaving patrons to their own devices in relation to government databases. Regarding PubMed specifically, typical patrons do not necessarily know the full extent of its coverage; they may not know how to limit to a subset like Nursing; they may not want to have to limit, preferring to have it done for them. Why then should direct links to specific areas of databases not be embedded, if available? A similar effort at deep-linking the specific databases making up the CSA Biological Sciences Collection had resulted in a tripling of the collection’s usage. It seemed that the more links, names, and points of entry that could be made available, the more use would be made of resources.
Ultimately, choices related to the names and links to various databases plus their display in the ERM were based on these considerations. Patrons:
• do not know the proper names of resources
• do not know what makes up a resource
• do not know what a resource can do
• do not know “free” from “fee”
Therefore, the ERM should:
• use as many alternative names as necessary
• use deep-links when relevant
• display descriptions on results lists
• interfile licensed and open access resources
Implementation of the ERM allowed the replacement of all the bad old ways of keeping track of databases: HTML pages, Excel files, manila folders, scraps of paper, or the memory of the electronic resources librarian. One record would hold all information related to a resource. With the resource records, the ERM’s functionality allowed the generation of multiple ways of presenting information about databases to the public, offering searchability, browsability, name access, variant name access, keyword access, subject access, subject list access, provenance of the data, and even alphabetic access for those librarians who were still enamored of an A-Z list. All information was stored in individual resource records; the information populating the multiple public views came from those individual records; and any editing was done in the individual records, once and once only. [NOTE: recording information about databases and the public presentation of said information was only the first ERM application to be implemented; other applications have yet to be completed.]
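The single-record principle can be illustrated with a minimal Python sketch. It is not the Millennium ERM's actual data model: the `ResourceRecord` class, its field names, and the use of "MEDLINE" as an alternate name are hypothetical and chosen only for illustration. The point is that the alphabetic view and the annotated subject view are both generated from the same records, so an edit made once appears everywhere.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ResourceRecord:
    """Hypothetical resource record; field names are illustrative only."""
    name: str
    url: str
    description: str
    alternate_names: List[str] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)


def az_list(records: List[ResourceRecord]) -> List[Tuple[str, str]]:
    """Alphabetic view: every name and alternate name, sorted ignoring case."""
    entries = []
    for rec in records:
        for label in [rec.name, *rec.alternate_names]:
            entries.append((label, rec.url))
    return sorted(entries, key=lambda entry: entry[0].lower())


def subject_list(records: List[ResourceRecord], subject: str) -> List[Tuple[str, str, str]]:
    """Annotated subject view: name, link, and description for one subject."""
    matches = [rec for rec in records if subject in rec.subjects]
    matches.sort(key=lambda rec: rec.name.lower())
    return [(rec.name, rec.url, rec.description) for rec in matches]


# Two sample records; the descriptions are abbreviated.
records = [
    ResourceRecord(
        name="Thomas",
        url="http://thomas.loc.gov/",
        description="Federal legislative information freely available to the public.",
        subjects=["Government", "Law"],
    ),
    ResourceRecord(
        name="PubMed",
        url="http://www.ncbi.nih.gov/pubmed",
        description="Provides MEDLINE access.",
        alternate_names=["MEDLINE"],  # assumed variant name, for illustration
        subjects=["Government", "Nursing"],
    ),
]
```

Calling `az_list(records)` and `subject_list(records, "Government")` after changing a record's `url` shows the new link in both views without any further editing.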
Resource records for then-subscribed databases were created February through May 2009 with records activated for public view in June. Through July, ERM-generated pages were displayed simultaneously with the old HTML pages, allowing public service staff time to adjust to the new means of search and display.
The ERM and the OPAC
To access database information in the catalog, users can click on the word “Databases” from the Libraries’ website, or on the sidebar tab for Databases within the OPAC. From that page, there are several options for finding a database. It is possible to search for a database by its name (for those users—librarians—who actually know the names of databases), to browse by the alphabet (also for librarians), or to pull up lists of databases by subject or category (for everyone else).
Each subject list indicates the number of resources it contains; there are 105 from or about the Government. The lists also extract descriptions from resource records, generating not only a list of resources, but an annotated list, giving patrons an idea of what specific resources can do for them. Links are built into the names of the databases. Anyone wanting to see more details in the full resource records (the likelihood that this will be librarians only is high) can open them with an “About Resource” link. Given typical patrons’ willingness to click no more than is absolutely necessary to get where they want to go, any special instructions related to access were included in the description field so that they would display on the annotated subject lists.
Fields chosen for public display in government resource records included:
• Database name
• Alternate name – if necessary; repeated, if necessary
• Description – explanation of the coverage and use of the resource
• Database subjects – names of the disciplines and categories the database pertains to; repeated, if necessary
• Public notes – identifying the agency providing the resource
• Database type – such as citation index, fulltext database, federated searcher, etc.; repeated, if necessary
• Format – such as pdf, html, images; repeated, if necessary
• RefWorks instructions – a customized field requested by the University of Memphis in order to include instructions for using a resource with the campus’ citation management program; not applicable to all government resources
Fields suppressed from public view remained available for information and comments that might be relevant to library staff. These were less used with government resources than with licensed databases.
Templates allowed fields that are repeated from one record to another to be input automatically. Repeatable fixed fields classed all the government e-resources as public domain, open access, and suppressed. New records were only unsuppressed from the OPAC after a review. In the Millennium product, un-suppressing automatically and immediately opens a record for public view. Variable-length fields also included some repetitive wording—for example, that found in the public notes. For purposes of display, “subject” was reworded “academic area or category” so that certain types of databases could be listed en masse. All government resource templates included “Government” as a subject category so that a comprehensive list of government resources could be easily extracted by the OPAC. (See: http://bibliotech.memphis.edu/search/m?SEARCH=Government&searchscope=4.) With the Millennium ERM, subjects can be assigned with as great or as little detail as is desired, and multiple subjects can be assigned to a database. The generic lists—Sciences, Humanities, Social Sciences—are quite long themselves so more specific subjects became necessary. Beyond the Government category, each government resource was also assigned as many subjects as appropriate. For instance, Thomas, the collection of legislative information from the Library of Congress, can be found under the subjects “General Topics,” “Government,” “History,” “Law,” and “Politics and Political Science.”
Multiple subject listings of this sort keep the government resources from being siloed. Mixing them in with other resources useful for research in a field places them in the view of potential users. At a demonstration of the Database Search function, the Electronic Resources Librarian was greeted with an exclamation from a faculty member from the Geology Department: “I didn’t know we had LandSat!” Librarians forget that just because things are freely available, users (even specialists in a field) may not be aware of them; patrons really don’t know that everybody has LandSat.
As any number of templates could be created, and as some government agencies were heavily represented in the University Libraries’ collection, templates were also created for specific agencies (NCBI, for instance, with its many Entrez databases and PubMed subsets, all of which were listed individually to enhance access).
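The way templates pre-fill repeated fields can be sketched in Python as follows. This is an illustration under stated assumptions, not the Millennium template mechanism itself: the dictionary keys, the helper functions, and the wording of the NCBI public note are hypothetical. The defaults mirror those described above: government records start out as public domain, open access, and suppressed, and always carry the "Government" subject.

```python
import copy

# Hypothetical field names; the values mirror the defaults described in the text.
GOVERNMENT_TEMPLATE = {
    "access": "open access",
    "copyright": "public domain",
    "suppressed": True,          # new records stay hidden until reviewed
    "subjects": ["Government"],  # guarantees inclusion in the Government list
    "public_note": "",
}


def make_template(base, **overrides):
    """Derive a more specific template (for example, for one agency) from a base."""
    template = copy.deepcopy(base)
    template.update(overrides)
    return template


def new_record(template, name, url, description, extra_subjects=()):
    """Create a record pre-filled from a template, then add resource specifics."""
    record = copy.deepcopy(template)
    record.update(name=name, url=url, description=description)
    for subject in extra_subjects:
        if subject not in record["subjects"]:
            record["subjects"].append(subject)
    return record


def unsuppress(record):
    """Return a copy of the record opened for public view after review."""
    reviewed = copy.deepcopy(record)
    reviewed["suppressed"] = False
    return reviewed


# Agency-specific template; the note wording is an assumption.
NCBI_TEMPLATE = make_template(
    GOVERNMENT_TEMPLATE,
    public_note="Provided by the National Center for Biotechnology Information (NCBI).",
)

thomas = new_record(
    GOVERNMENT_TEMPLATE,
    name="Thomas",
    url="http://thomas.loc.gov/",
    description="Federal legislative information freely available to the public.",
    extra_subjects=["General Topics", "History", "Law", "Politics and Political Science"],
)
```

Deep copies keep each record independent of its template, so adding subjects to one record never alters the defaults used for the next.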
Isn't This Cataloging?
Organizing and inputting all this information into the ERM is akin to cataloging, for all that library staff are working in non-MARC records. Databases, even free ones, are complex resources and the need to impose order on the haystack of their existence is vital if libraries wish to help patrons.
It cannot be denied that identifying databases and collecting information related to them was and will continue to be time-consuming. However, the time spent in tracking this information benefits reference staff and better enables them to assist patrons. It is in many ways analogous to what long-standing government documents librarians remember as “opening the boxes.”
Staff in depository libraries quickly became familiar with what the library received from the GPO during the actual processing of physical materials: handling them, seeing what they were. Coupling this processing with reference was immensely helpful. Experienced personnel learned which agencies published what interesting data and where valuable research on various topics could be found. With online databases, however, it becomes necessary to look at each database in an attempt to at least understand the nature of searching it. Considering the number and varied sophistication of government databases, this task may seem impossible, but the impossibility of the task should not preclude its attempt. Librarians provide reference related to all matters of information in numerous disciplines while possessing limited personal knowledge of specific topics. The information hidden in these resources is to be located for patrons, not necessarily explained to them.
Identifying Government Resources
Beyond the likes of Thomas, ERIC, and PubMed (and does EVERYBODY really know about them?), which resources were selected from an admittedly enormous reserve of material? Why were they chosen? How were they found?
The primary consideration was the curriculum at the University of Memphis. What programs, what research areas, was the University Libraries being asked to support?
The University of Memphis grants bachelor’s and master’s degrees as well as the PhD; enrollment is around 20,000. Its undergraduate program is spread among the arts, the sciences, and the professions. Colleges include Arts and Sciences, Communication and Fine Arts, Education, the Fogelman College of Business and Economics, the Herff College of Engineering, the Loewenberg School of Nursing, the School of Audiology and Speech-Language Pathology, the School of Public Health, and the University College. The Graduate School offers comprehensive graduate programs. A library serving such a large and varied body can make use of many of the specialized resources provided by the U.S. government, so the search for resources to include in the ERM was a broad one.
As an example, the University’s Center for Earthquake Research & Information utilizes materials from the United States Geological Survey (USGS), so the USGS was a relevant agency to mine for resources. The Survey has dozens of different databases that have been included in the ERM. Many of these databases, though unfamiliar to the librarians, were located by doing simple searches on Google Uncle Sam (http://www.google.com/unclesam), using as search terms the agency name and the word databases (e.g. “U.S. Geological Survey” databases). Another government resource, USASearch.gov, can be used to find similar results. Indeed, many specific databases (on earthquakes, maps, land features, etc., and all from the USGS) can be located with either of these tools. This simple technique for locating databases can also be modified by file type. For example, a simple search of Google Uncle Sam using the terms “images” and “databases” results in hundreds of interesting image databases on a myriad of topics, many of which were selected for inclusion.
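The search technique described above (a quoted agency name plus a format or type term) is easy to make repeatable across many agencies. The following Python sketch only builds the search strings and URLs; it does not contact any service. The function names are hypothetical, and the `q` query parameter is an assumption about how a search engine accepts terms, not a documented feature of Google Uncle Sam or USASearch.gov.

```python
from urllib.parse import urlencode


def build_query(*terms, phrase=None):
    """Combine an optional exact phrase (quoted) with additional search terms."""
    parts = [f'"{phrase}"'] if phrase else []
    parts.extend(terms)
    return " ".join(parts)


def search_url(base_url, query, param="q"):
    """Attach the query to a search page; the parameter name is an assumption."""
    return f"{base_url}?{urlencode({param: query})}"


# Agency name as an exact phrase, plus the word "databases".
usgs_query = build_query("databases", phrase="U.S. Geological Survey")

# The file-type variation mentioned in the text: "images" and "databases".
image_query = build_query("images", "databases")

usgs_url = search_url("http://www.google.com/unclesam", usgs_query)
```

A list of agency names from the library's own template set could be looped through `build_query` to produce a checklist of searches for staff to run and review by hand.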
Large databases, such as PubMed, can be subdivided and deep-linked to point to more specific resources. Various agencies may create portals for databases, or they may federate search over several resources sharing a common platform. Science.gov is one such example; Entrez is another. An information portal for science information from the federal government, Science.gov covers all scientific disciplines, including health. Entrez is more subject specific, focusing on the life sciences, particularly the fields of medicine and genetics.
Another means of identifying databases was the tried and true method of looking at other libraries’ web pages and subject lists. Locating the website of another large public or academic depository can yield a web page identifying federal databases. Although federal databases have been the focus of this discussion, databases from states and municipalities may also be useful. Tennessee has many valuable information resources, as can be seen at the website TN.gov (http://www.tennesseeanytime.org/topics/Online+Services).
[Note: the concept of a “database” was applied quite loosely when selecting resources from the state of Tennessee. Websites offering government services should be considered for inclusion in the catalog, depending on the libraries’ user group. These resources offer points of information that might not be found unless cataloged.]
Datasets are online collections of raw data related to any number of subjects. Many of them are provided by the government and these are typically available for free download and analysis. Analyzing datasets is developing as an area of research, particularly in the sciences.
Datasets run the gamut. The Substance Abuse treatment episode dataset (http://www.oas.samhsa.gov/dasis.htm#teds2) provides sophisticated information on substance abuser admissions into treatment programs throughout the U.S. Data can be downloaded and analyzed for various geographic locations. The site also contains reports derived from the analyzed data. Datasets such as the numerous Summary Files for various populations are available from the Census Bureau (http://factfinder.census.gov/servlet/DatasetMainPageServlet). The Solar and Heliospheric Observatory (SOHO) (http://sohowww.nascom.nasa.gov/data/data.html) provides data measuring the sun’s irradiance, particle flare activities, helium intensity, proton intensity, and more. In many cases, the decision of whether or not to include datasets has already been made, as many government agencies simply include them as part of their databases. For a large academic or research library, datasets can be extremely useful, particularly to graduate students and faculty. Using these resources requires specialized knowledge and more than ordinary technical skill, bringing up issues related to training and support at the Reference Desk.
An ERM lifts several burdens from the shoulders of a librarian, but there are, and will always be, maintenance issues in relation to tracking resources provided by government agencies.
Although agency databases are released for public consumption, they are primarily created for each agency’s constituency, that is, for specialists. Agricultural data on pesticides assumes a basic background and knowledge of the issues involved. Understanding of genes and gene sequencing is necessary for optimum usage of many of the Entrez databases from the National Center for Biotechnology Information (NCBI). While librarians may not necessarily have backgrounds in these fields, it is their job to point to and provide access to relevant resources, depending on the needs of their user groups. Librarians are expected to give reference service related to all matters of information despite having only a cursory personal knowledge of a subject. Information is to be located for patrons, though not necessarily explained to them.
An additional problem for librarians is that there are no standards among agencies for the design and functioning of these databases and datasets. The sophistication (or lack thereof) of a resource is primarily a function of budgetary allocations for information technology support in each agency. Rest assured that the Department of Defense probably allocates much more of its budget to technology than does the Department of Veterans Affairs, having both more money and more IT support to work with. Some agencies have the very latest IT toys; others do not. Consequently, government databases can be all over the map regarding versions of software, interface changes, compatibility with the latest version of Microsoft software, and data being added and deleted.
A final maintenance issue is one that is very familiar to documents librarians. Even before the web, it was difficult for the Government Printing Office (GPO) to collect every document. For decades, GPO has acted as a clearing house for agency publications, distributing them via the government depository system. Each agency, however, bore responsibility for reporting the existence of documents. Inevitably, titles were missed. The same can be said for some of these databases. A librarian might trip across one of these herself and make note of its usefulness, but still find no mention of it from GPO or OCLC or anyplace else. It has just been there, waiting to be found.
However, substitute “vendors” for “agencies,” and many of the same problems will be found in the management of any electronic resource. Creating a resource record for a database is wholly unlike cataloging a book. Book records are done and done once. If a resource record is only done once, it is not done at all. This is the reality of cataloging in the electronic library: records must be as fluid as the resources that they document. Bearing a greater resemblance to serial records, ERM records are never entirely done. Names, URLs, additions, deletions, agency names, and focus (i.e. subject) can, and will, change, often frequently and always without warning. Federal agencies do not report these kinds of changes readily. Therefore, monitoring resources for these changes must become a constant routine for staff responsible for maintaining the ERM.
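One way to make such monitoring routine is to compare what the records say with what staff (or a link checker) most recently observed, and to flag the differences for review. The Python sketch below shows only the comparison step. The record identifiers, field names, and sample values are all hypothetical, and how the observed values are gathered is left open.

```python
def detect_changes(stored, observed, fields=("name", "url", "agency")):
    """Compare stored records with freshly observed values, keyed by record id.

    Returns a dict mapping each affected record id to the list of fields
    that differ, or to a note when the resource could not be found at all.
    """
    report = {}
    for record_id, old in stored.items():
        new = observed.get(record_id)
        if new is None:
            report[record_id] = ["resource not found"]
            continue
        changed = [f for f in fields if old.get(f) != new.get(f)]
        if changed:
            report[record_id] = changed
    return report


# Hypothetical sample data: what the ERM records currently say...
stored = {
    "rec-001": {"name": "Sample Atlas", "url": "http://example.gov/atlas", "agency": "Agency A"},
    "rec-002": {"name": "Sample Index", "url": "http://example.gov/index", "agency": "Agency B"},
    "rec-003": {"name": "Sample Gallery", "url": "http://example.gov/gallery", "agency": "Agency C"},
}

# ...and what was found on the most recent check.
observed = {
    "rec-001": {"name": "Sample Atlas", "url": "http://example.gov/maps/atlas", "agency": "Agency A"},
    "rec-002": {"name": "Sample Index", "url": "http://example.gov/index", "agency": "Agency B"},
}

report = detect_changes(stored, observed)
```

Here `report` flags the moved URL for `rec-001` and the missing resource `rec-003`, leaving the judgment about how to revise or retire each record to the staff member reviewing the list.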
With any electronic resource, there are costs involved in enabling patrons to “fall over” them, whether those resources are free or not, because any electronic resource is far more involved and complicated than any book.
Making things easy for patrons also means making more work for librarians. Not only does the maintenance of the data generate more work; maintaining staff’s awareness of changes also requires work—active teaching and cross-training which are especially necessary in this era of the flattening of reference services. Increasingly, generalist reference staff are being expected to field questions that formerly went to documents specialists. This need for training and awareness, like constant change, is also a reality of the electronic library. Enabling users to discover databases through an online catalog requires more than what is involved with finding the location of a book on the shelf.
Even if no new electronic resources were to be added to the University Libraries’ collection, existing resources would continue to require modification: there will always be new functions, incorporated resources, agency reorganization, and website redesigns that will affect users’ ability to access the information within databases. Changing URLs and modifying descriptions will be required to maintain records; adding and removing resources, to keep the lists current.
Following is a list of government databases mentioned in this paper, as well as other useful sites. For the complete list of government resources incorporated into the University Libraries catalog, see: http://bibliotech.memphis.edu/search/m?SEARCH=Government&searchscope=4. All resources can also be found using the OPAC’s Keyword search option. For the slideshow corresponding to the content of this paper, see: http://www.slideshare.net/kazlists/falling-over-free-resources
- Agricultural Image Gallery (http://www.ars.usda.gov/is/graphics/photos/search.htm): Searchable collection of 2000+ digital photos provided by the Agricultural Research Service Information Staff. Can also browse. Downloadable images available in several sizes.
- American FactFinder (http://factfinder.census.gov/home/saff/main.html?_lang=en): Source for population, housing, economic, and geographic data.
- American Memory Photos (http://memory.loc.gov/ammem/browse/ListSome.php?format=Photograph): Digital collections of photographs and prints related to different aspects of U.S. history. Collections address African American history, advertising, architecture, baseball, baseball cards, the Civil War, conservation, North American Indians, presidents and first ladies, quilts, the West, women's history, and much more. Copyright and permissions information can be found on the Library of Congress' home page. Coverage: varies by collection.
- Arago: People, Postage, and the Post (http://www.arago.si.edu/): Resource for the study of philately and postal operations in the United States.
- Entrez: The Life Sciences Search Engine (http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi?itool=toolbar): Cross searches all NCBI databases, including PubMed, PubMed Central, gene sequencing resources, the NLM Library catalog, online books, the PubChem databases, and more.
- ERIC (http://www.eric.ed.gov/): Open access version of ERIC, indexing journal and non-journal literature in the field of education; includes books, journal articles, ERIC Digests, policy papers, conference papers, technical reports. Coverage: 1966- .
- FBI Electronic Reading Room (http://foia.fbi.gov/alpha.htm): FBI reports requested under the Freedom of Information Act that have been in the news or are frequently requested.
- Household Products Database (http://hpd.nlm.nih.gov/): Search for health effects of commonly used household chemicals. Browse by product name, ingredients, or manufacturer. Includes contact information for manufacturers.
- Land Patents and Surveys (http://www.glorecords.blm.gov/): Includes land patents and surveys, with access to Federal land conveyance records for the Public Land States and image access to more than three million Federal land title records for Eastern Public Land States, issued between 1820 and 1908.
- LandSat Image Gallery (http://landsat.usgs.gov/gallery.php): Online archive of space-based moderate-resolution land remote sensing data. Coverage: past 40 years.
- National Atlas (http://nationalatlas.gov/): Source of maps from 20+ Federal organizations, devoted to the heritage, culture, and resources of the United States. Customized maps available through the Map Maker tool.
- PubMed (http://www.ncbi.nih.gov/pubmed): Provides MEDLINE access, along with OLDMEDLINE, Pre-MEDLINE (citations in-process), and citations outside the scope of MEDLINE provided by journal publishers. Interconnects with other NCBI databases and resources. Articles from many medical journals are free after a 6-12 month embargo. Linkouts to fulltext provided at publishers' discretion. Coverage: OLDMEDLINE, 1950-1965; MEDLINE, 1966-.
- Thomas (http://thomas.loc.gov/): Federal legislative information freely available to the public. Coverage: 1989-present.