A Webinar held by CRL on October 23, 2013
The open online distribution of government information is changing the ways that researchers access data and documentation produced by governmental agencies worldwide. This webinar explored:
- The changes in the “supply chain” for government information brought about by digital media;
- How libraries might continue to ensure access to this information for researchers; and
- What role CRL might play in supporting those library efforts.
Background: Current CRL Support for Access to Government Information
CRL has always played an important role in ensuring the survival and integrity of primary publications of U.S., Canadian, and foreign governments. CRL collections include historical documents from a great variety of jurisdictions and government agencies. CRL does not, however, hold extensive U.S. federal government publications, which are widely available through the Federal Depository Library Program.
CRL strengths include:
Canadian and U.S. state legislative journals: CRL has extensive holdings of Canadian legislative journals and U.S. state legislative journals. Many of these materials have now been digitized and are accessible in the LLMC-Digital database.
U.S. state documents issued prior to 1951: CRL holds more than a half-million volumes of monographic and serial publications of U.S. state government agencies and legislatures from the earliest periods through 1950. Of major historical value for various disciplines, these materials are notoriously difficult to access, since no comprehensive bibliographies or catalogs of them exist. Most are uncataloged but are listed in CRL finding aids.
Publications of foreign governments: CRL foreign documents holdings were initially formed from deposits by member libraries, with particular strengths in Western Europe and Latin America, dating from roughly 1800 to 1950. As CRL’s focus evolved from a depository to a center for cooperative acquisition, CRL initiated blanket orders and subscriptions to a limited number of foreign document series, over time concentrating on South Asia, Southeast Asia, Yugoslavia, Poland, and Israel. An important source of these was the PL-480 exchange program.
Other notable CRL holdings include an extensive set of central bank reports, dating from the late 19th century through the 1990s; official gazettes published by 161 countries from the 17th century through the 1990s; and a great variety of primary source materials and archives microfilmed and/or digitized by area studies interest groups affiliated with CRL. CRL has posted topic guides, listing key holdings of U.S. federal and state publications.
Presentation: Sustained Access to Government Information: the Changing Roles of Libraries
Annelise Sklar
Social Sciences Collections Coordinator
University of California San Diego Library
Even in the “straightforward” print-only Federal Depository Library Program (FDLP) collection of times past, up to a third of the collection was uncataloged and thus unfindable.[1] Today, this collection disorder is compounded by the individual agency quirkiness that accompanies distribution of state, local, and international documents, multiplied by the complexity of the online environment.
The infamous Ithaka S+R modeling initiative on the future direction of the FDLP (commissioned by the Government Printing Office) notes that libraries are moving away from the “just in case” model of purchasing tangible materials, to licensing electronic collections instead.[2] In fact, approximately one-fifth of FDLP libraries are almost exclusively collecting electronic-only new government publications.[3]
Libraries are adapting collection strategies to meet evolving user expectations and behaviors, of course. Consistent with discovery practices for other types of resources, the GPO has found that 55% of self-acknowledged FDLP users list “Google or other search engine” as a frequent source for U.S. government information (a number that rises to 91% with the addition of “sometimes” users). Not surprisingly, FDLP users also list “access to more materials online” as their most-desired improvement to government information access.[4]
Access to government information is also now part of the larger “Open” movements. The key driver behind Open Access is the idea that research results, especially those funded by taxpayers, should be freely shared for the greater good. Open Data, on the other hand, is about providing free access to data for the replication of scholarly results or repurposing into new research or new tools. Open Government, something of an extension of Open Access and Open Data, is about transparency and government accountability through public access to government-produced information.
Within this context, there is an increasing desire to remix government information. Organizations like the Sunlight Foundation and projects like Followthemoney.org and Govtrack.us are mashing up government information into new tools, and projects like IPUMS and the American Presidency Project are reformatting government data for one-stop shopping. The people who build these projects prefer direct access through APIs to feed their increasingly sophisticated tools, rather than having to mine that data from unwieldy PDF documents.
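The preference for APIs over PDFs can be illustrated with a minimal sketch. Both inputs below are invented for illustration: `api_response` stands in for the JSON a hypothetical agency API might return, and `pdf_text` stands in for text extracted from a PDF table carrying the same made-up figures. Neither reflects a real agency endpoint or dataset.

```python
import json
import re

# Hypothetical JSON payload, as a made-up agency API might return it.
api_response = (
    '[{"state": "Ohio", "year": 2012, "value": 1250},'
    ' {"state": "Iowa", "year": 2012, "value": 430}]'
)

# Hypothetical text extracted from a PDF table with the same made-up figures.
pdf_text = """Table 3. Illustrative counts by state, 2012
Ohio ........ 1,250
Iowa ........   430
"""


def rows_from_api(payload):
    """Structured data: one parsing call yields named, typed fields."""
    return {row["state"]: row["value"] for row in json.loads(payload)}


def rows_from_pdf_text(text):
    """Unstructured text: the table layout must be reverse-engineered."""
    # Match "<name> ....... <number with optional thousands separators>".
    pattern = re.compile(
        r"^([A-Za-z ]+?)\s*\.{2,}\s*([\d,]+)\s*$", re.MULTILINE
    )
    return {
        name.strip(): int(number.replace(",", ""))
        for name, number in pattern.findall(text)
    }
```

Both functions recover the same figures here, but the PDF route depends on a pattern tuned to one particular layout, and it would need rewriting whenever the agency changed the table's formatting.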
But at the same time, it should be noted that costs of distributing government information have shifted with the move to online access. As a 2012 CRS report notes, in the print model, depository libraries bore the costs of managing tangible materials, staff, physical plant needs, providing public access, etc. However, in the electronic model, costs of providing digital materials fall to the GPO for FDSys, and to the other agencies that provide online content.[5]
In contrast to the Open Government movement is a push toward smaller government, which means fewer government employees to produce and care for government information. The direct impact on librarians and government information users has been the defunding of key resources like the Statistical Abstract and the Sourcebook of Criminal Justice Statistics. In fact, during the 2013 (partial) federal government shutdown, and to the surprise of many people, a number of heavily used government websites went offline, including those of the Bureau of Economic Analysis, the Census Bureau, Data.gov, the Department of Agriculture, the National Center for Education Statistics and the ERIC database, the National Institute of Standards and Technology (NIST), and the National Science Foundation.
Likewise, content has been lost to the “temporary” removal of online ERIC documents and NASA technical reports. Though this material was available for years in the microfiche collections of FDLP libraries across the country, the implications of widespread access once the full-text documents were made available online led the issuing agencies to remove the full collections of those online documents pending review (in the case of ERIC, for “privacy concerns,” and at NASA, “to ensure that it does not contain technical information subject to U.S. export control laws and regulations”). Much of the content has since been restored, but there are no target completion dates, and any library that had weeded its microfiche collection with the intent of relying on the online versions simply had to do without.
Libraries, of course, need to respond to these changes in access to information in whatever ways are best for our individual user communities.
One option to compensate for the items no longer available through the depository system is to buy commercially produced government information resources. This is not a new concept (for example, libraries have bought commercially published case law for over a century), and sometimes it is worth the money to have a better interface or more stable (or at least seemingly more stable) access. Vendors typically have the resources that governments, nonprofits, and libraries may not have to carry out these big digitization projects, update and maintain interfaces, and provide value-added context and/or metadata. However, they still have to rely on the government to produce that data in the first place, and libraries then have to justify spending the money on something that is “free.”
Another option is to digitize and preserve the materials we want ourselves. One advantage to digitizing FDLP materials is the lack of copyright restrictions. But because one cannot simply guillotine a Regional Depository’s full collection, libraries still need to develop realistic workflows for these projects. Additionally, state, local, and regional collections are often scarcer in print, and may come with additional intellectual property restrictions.
Luckily, there are already a number of collaborative digitization projects underway for libraries with the time and resources to model or join. These include FDSys, the American Memory Project, HathiTrust, LLMC-Digital, TRAIL, the Internet Archive, the University of North Texas’s A to Z digitization project, and the CRS archive (Indiana Virtual CD-ROM Library), to name a few. Likewise, there is a slew of projects to model or join for capturing and preserving born-digital materials: the End of Term collaborative web archive, the Internet Archive, UNT’s CyberCemetery, the Web Archiving Service, Archive-It, LOCKSS-USDOCS, and public.resource.org.
However, born-digital materials (such as databases, websites, and publications) may be composed of dynamic content and formats that our current tools do not harvest well. Our biggest challenge is probably going to be figuring out how to preserve not just the data but the ability to interact with the data. This includes not only old CD-ROM collections, but also databases and database-driven content that go offline. We can harvest the search page, but someone needs to capture and then maintain the back-end code if we want to provide continued access to the content.
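The gap between harvesting a search page and preserving the database behind it can be shown with a small sketch. The HTML below is a hypothetical snapshot, invented for illustration, of the kind of page a web crawler might save; the `/cgi-bin/search` address and the `FormFinder` helper are likewise assumptions, not part of any real agency site or archiving tool.

```python
from html.parser import HTMLParser

# Hypothetical snapshot of a database search page, as a crawler might save it.
harvested_page = """
<html><body>
<h1>Report Search</h1>
<form action="/cgi-bin/search" method="get">
  <input type="text" name="q">
  <input type="submit" value="Search">
</form>
</body></html>
"""


class FormFinder(HTMLParser):
    """Collects the action address of every form found in a page."""

    def __init__(self):
        super().__init__()
        self.form_actions = []

    def handle_starttag(self, tag, attrs):
        # A form's "action" names the server-side program that answers queries.
        if tag == "form":
            self.form_actions.append(dict(attrs).get("action"))


def form_actions(html):
    """Return the server-side addresses that a saved page depends on."""
    finder = FormFinder()
    finder.feed(html)
    return finder.form_actions
```

Calling `form_actions(harvested_page)` on this snapshot yields only the address of the server-side program. The saved page contains no records at all, so once that program goes offline the archived search box has nothing left to query.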
Concrete solutions are still scarce, but the key probably ultimately lies in those things librarians are really good at—preservation, metadata standards, providing access and making it all findable—and in the thing we’re not so good at: working together to make all of these projects interoperable and seamlessly searchable.
Response: Sustained Access to Government Information
Paul Belloni
Business and Economics Reference Librarian and Selector for Psychology at the University of Chicago Library
As Annelise Sklar has indicated, government information includes so much more than it used to. But with numerous distribution sources, distribution roles are not clearly defined. Things have never been so messy and exciting for researchers and librarians.
Free software packages such as R, Google Drive, and Zotero help empower researchers from just about anywhere to do research on par with researchers from elite institutions. In addition to free tools, many organizations are taking free government information and repackaging it in helpful ways. (For example, see FollowTheMoney and Open City.)
The World Bank (WB) is one organization that works well with the new complexities of the supply chain. It looked at the landscape, saw what tools researchers were using, and responded accordingly. The World Bank’s Data site offers many options for gaining access to its data, including microdata, an open knowledge repository, and access to WB data via external platforms such as Google Public Data and Quandl.
The U.S. Census Bureau has taken a different approach. At first glance it appears unorganized about informing researchers where to access its information. There are in-house products such as American FactFinder and DataFerrett as well as commercial products, such as Social Explorer, and organizations like MetroTrends and MetroPulse. Even though it is difficult to be sure of all of the possibilities for locating census data, there is something exciting about how so many different government agencies, nonprofit organizations, research institutions and civic-minded geeks have placed themselves in the Census Bureau supply chain. The government information ecosystem is fluid and dynamic. And it doesn’t get any more fluid and dynamic than working with U.S. Census information.
The best strategy for libraries to remain relevant within the supply chain of information is more collaboration and better service. In the digital era, our expertise as information professionals becomes as important to our libraries as the collections that we house. Due to the complexity of the government information ecosystem, we must collaborate more to improve service. We must continue to grow and cultivate collaborations with librarians both inside and outside our own institutions, maintain strong relationships with the organizations that supply the information, work closely with our own IT services, and continue to educate ourselves.
It is an exciting and overwhelming time to be working with government information and we need a supportive community in order to do the best job we can.
1. R. Eric Petersen, Jennifer E. Manning, and Christina M. Bailey, Federal Depository Library Program: Issues for Congress, CRS Report R42457 (Washington, DC: Library of Congress, Congressional Research Service, March 29, 2012), accessed November 11, 2013, http://www.fas.org/sgp/crs/misc/R42457.pdf.
2. Ross Housewright and Roger C. Schonfeld, Future Direction of the FDLP: Modeling Initiative (New York, NY: Ithaka S+R, May 16, 2011), accessed November 11, 2013, http://beta.fdlp.gov/about-the-fdlp/projects/23-about/projects/135-future-direction-of-the-fdlp.
3. Federal Depository Library Program, Biennial Survey of Depository Libraries 2011 Result Highlights (Washington, DC: Government Printing Office, September 13, 2012), accessed November 11, 2013, http://www.fdlp.gov/home/repository/cat_view/72-about-the-fdlp/84-biennialsurvey/331-2011.
4. David Powell, Sheila King, and Leigh Watson Healy, FDLP Users Speak: The Value and Performance of Libraries Participating in the Federal Depository Library Program (Washington, DC: Government Printing Office, July 28, 2011), accessed November 11, 2013, http://www.fdlp.gov/component/content/article/19-general/1011-depositoryusersurveyreport.
5. R. Eric Petersen, Jennifer E. Manning, and Christina M. Bailey, Federal Depository Library Program: Issues for Congress, CRS Report R42457 (Washington, DC: Congressional Research Service, 2012).