Report of the Library Technology Officer

February 2003

Vanderbilt Television News Archive

The staff of the Archive continued their routine activities: recording the four regular evening newscasts, producing abstracts, and recording and cataloging Nightline and other special programs. Website activity continued to be heavy in February. A total of 1,300 new customers registered to use the site, the second-largest number since the launch of the new website. The number of videotape loan requests was also relatively high.

Off-air recording

Web server access

1,300 new customers registered on the website
80,211 total entries in activity log
14,165 views of the home page
6,593 views of the search page
9,646 searches executed
3,515 calendars viewed
22,610 individual records viewed
6,184 program listings viewed

Abstracting and Database Maintenance

Visitors

The Archive had seven on-site visitors in February: three affiliated with Vanderbilt, two from Concordia University in Montreal, Canada, and two other non-Vanderbilt visitors.

Loan Requests filled

View the Cumulative Table of Statistics for the Archive's activities.

Publicity

Marshall Breeding and John Lynch were interviewed by Michael Simms, who will write articles on the Archive for Vanderbilt Magazine and the Acorn Chronicle. These articles will focus on the history of the Archive, its current status, and our plans for digitizing the collection.

Other Activities and Projects

Cheryl Carpenter, the UT Library School Intern, spent part of her time with the Archive in February working on database cataloging related to the State of the Union addresses and on the backlog of uncataloged specials. She also learned other aspects of the Archive's work, including how to write abstracts for regular evening news programs, and spent a day learning how to produce videotapes to fulfill loan requests for duplications and compilations.

NSF Grant Report

Digitizing. Progress continues on a number of fronts toward digitizing video content. This month we made progress on preliminary projects for digitizing special broadcasts and evening news programs and for recording current programs digitally. These projects follow the tentative specifications that we have formulated: initial digitizing with MPEG-2 Main Profile at Main Level (MP@ML) at a 6000 kb/sec bit rate and full D1 resolution; transcoding into Realmedia 9 format with multiple bit rates (56k, 150k, 256k, 512k) at reduced screen resolution (320x212); and storing the resulting files on DVD-R in program stream format.
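As a rough check on what these specifications imply for storage, the sketch below converts the 6000 kb/sec video bit rate into gigabytes per hour. It assumes decimal units (1 kb = 1,000 bits, 1 GB = 1,000,000,000 bytes) and a constant bit rate, and it ignores audio and program-stream overhead, so actual files would be somewhat larger. The function name is ours, for illustration only.

```python
def gigabytes_per_hour(bit_rate_kbps: float) -> float:
    """Approximate storage for one hour of video at a constant bit rate.

    Assumes decimal units: 1 kb = 1,000 bits and 1 GB = 1e9 bytes.
    Audio and container overhead are not included.
    """
    bits_per_hour = bit_rate_kbps * 1000 * 3600  # kb/sec -> bits per hour
    bytes_per_hour = bits_per_hour / 8           # bits -> bytes
    return bytes_per_hour / 1e9                  # bytes -> GB


# 6000 kb/sec is the tentative MPEG-2 bit rate given in the specifications.
mpeg2_gb_per_hour = gigabytes_per_hour(6000)
print(f"MPEG-2 at 6000 kb/sec: about {mpeg2_gb_per_hour:.1f} GB per hour")
```

Under these assumptions the video stream alone comes to about 2.7 GB per hour of programming.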

Marshall has digitizing equipment set up on his workstation so that he can convert material as he works on other projects. During February, he worked on a project to digitize an entire month of evening news programming for all networks. The MPEG-2 files were also transcoded into Realmedia 9 streaming format, and copied onto DVD-R media. This project included about 36 hours of material. Marshall also worked on digitizing programs that include some of the extraordinary events that have been preserved by the Archive. Having these items available in digital form will help us create promotional materials for the Archive.

As of the beginning of March we have almost 100 hours of evening news programming digitized in MPEG-2, transcoded into Realmedia, and written to DVD-R.

Steve Davis has continued working on digitizing presidential speeches. We have now completed all the State of the Union Addresses and Inaugural Addresses from 1969 through 2003. Many of the speeches were split among multiple tapes when they were originally recorded, and we would like to have each speech preserved within a single digital file. Using MPEG-2 editing software, we can splice the digital files to produce a continuous version of the speech. This project has given us experience with the numerous problems involved in editing MPEG-2 files and the opportunity to test a few different video editors and utilities.

Steve has begun a project to digitize part of the Senate Watergate hearings. By processing a few days' worth of this content, we will be able to assess the effort that will be required to digitize this large and significant collection within the Archive.

A significant part of the content that we have digitized has been added to the prototype streaming video delivery system.

This month saw the first programs recorded digitally in the Archive. We have installed an MPEG-2 encoding card in the workstation associated with the CNN off-air recording equipment. This approach involves feeding the signal into the card after it has passed through the TV tuner, the signal generator that produces the Network/Time/Date stamp, the processing amplifier, and the distribution amplifier. We have been working with this equipment to find the best way to use it to record news programs. The software used with this card lacks automatic scheduling capabilities, which makes it more difficult to control the recording of regular broadcasts. We have been able to leave the station recording from the evening through the next morning, producing an MPEG-2 file that holds 14 or more hours of content and may be 40 or more gigabytes in size. We have then been able to use editing software to extract the regular 9:30 - 10:30 program that we normally record. This method shows great promise for recording both regularly scheduled programs and events as they occur.
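To illustrate the extraction step, the sketch below computes where a scheduled program falls inside a longer continuous recording, which is the information an editor needs to set cut points. The 6:00 PM recording start, the calendar date, and the reading of 9:30 - 10:30 as evening times are assumptions made for the example; the function name is hypothetical and is not part of any editing software we use.

```python
from datetime import datetime, timedelta


def segment_offsets(recording_start: datetime,
                    program_start: datetime,
                    program_end: datetime) -> tuple[timedelta, timedelta]:
    """Return the (start, end) offsets of a program within a longer recording."""
    if not (recording_start <= program_start < program_end):
        raise ValueError("program must fall after the recording start and have positive length")
    return program_start - recording_start, program_end - recording_start


# Assumed values for illustration only: recording begins at 6:00 PM,
# and the program runs 9:30 PM to 10:30 PM the same evening.
recording_start = datetime(2003, 2, 20, 18, 0)
program_start = datetime(2003, 2, 20, 21, 30)
program_end = datetime(2003, 2, 20, 22, 30)

start_offset, end_offset = segment_offsets(recording_start, program_start, program_end)
print(f"Cut from {start_offset} to {end_offset} into the recording")
```

Because the inputs are full dates and times, the same calculation works for a program that airs after midnight in a recording that began the previous evening.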

The approach described above contrasts with a second alternative that we have been investigating: using a self-contained PC-based television card. The PC-card approach gives digital files of comparable quality and includes advanced scheduling capabilities. The disadvantage of the PC-card approach lies in its inability to produce the Network/Time/Date stamp required for all Archive materials. We have sent an inquiry to Hauppauge, the manufacturer of the PC TV card, to investigate the possibility of having custom software developed to produce the Network/Time/Date stamp.

Library Technology Officer Activities

Investigation of storage options. The TV News digitization project will require very large amounts of digital storage. We must plan for the storage of the large MPEG-2 preservation-quality digital files as well as the smaller files created for the streaming video delivery system. A recent analysis shows that for the evening news alone, the MPEG-2 files will require 50,477.1 GB of storage and the Realmedia streaming files will require about 6,846 GB. The MPEG-2 files will reside on DVD-R optical media, and copies will be transferred to the Library of Congress for permanent archiving. The Realmedia files will reside on local servers as part of the streaming video delivery system. One of our main concerns is how we will accommodate the amount of online storage needed for the video delivery system. Our initial thinking involved using Storage Area Network (SAN) technology to create a single large and scalable storage system that would be used for this project and other library projects that require large-scale storage. We are finding SAN technology to be very expensive, with questionable returns on the investment.
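To give a sense of scale for the optical media, the sketch below estimates how many DVD-R discs the MPEG-2 files would occupy. It assumes the nominal 4.7 GB capacity of a single-layer DVD-R and assumes that the figure from our analysis is in the same decimal gigabytes. It also assumes files pack perfectly onto discs, which they will not, so the result is a lower bound. The names are ours, for illustration only.

```python
import math

# Assumption: nominal single-layer DVD-R capacity, in decimal gigabytes.
DVD_R_CAPACITY_GB = 4.7


def discs_needed(total_gb: float, disc_capacity_gb: float = DVD_R_CAPACITY_GB) -> int:
    """Lower bound on the number of discs, assuming perfect packing of files."""
    if total_gb < 0 or disc_capacity_gb <= 0:
        raise ValueError("total must be non-negative and capacity must be positive")
    return math.ceil(total_gb / disc_capacity_gb)


# 50,477.1 GB is the MPEG-2 storage estimate for the evening news alone.
mpeg2_discs = discs_needed(50_477.1)
print(f"At least {mpeg2_discs:,} DVD-R discs for the MPEG-2 evening news files")
```

Under these assumptions the evening news alone would fill at least 10,740 discs, before counting any duplicate copies.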

The model that we are currently investigating for the streaming video delivery system involves creating a cluster composed of a large number of mid-sized servers, each with its own storage. This cluster approach gives the overall system a large number of processors and network connections and provides a large amount of aggregate storage without the need for a monolithic storage system. One possibility, as an example, would use a cluster in which each unit serves out a single year of the collection. Storage requirements for each year range from 140 to 300 GB, which is well within what can be accommodated by the storage included in each server unit. We continue to model pricing for the servers, storage, and software for this conceptualization of the streaming video delivery system.
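The one-year-per-server idea can be sketched as a simple lookup from a program's broadcast date to the unit that would serve it. The host naming pattern, the 1968 to 2003 year range, and the function name below are all hypothetical and chosen only for illustration; no naming scheme or software has been selected.

```python
from datetime import date

# Assumptions for illustration: the range of years held in the cluster.
FIRST_YEAR = 1968
LAST_YEAR = 2003


def server_for_broadcast(broadcast_date: date,
                         host_pattern: str = "tvnews-{year}.example.edu") -> str:
    """Map a broadcast date to the (hypothetical) server holding that year."""
    year = broadcast_date.year
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"no server holds programs from {year}")
    return host_pattern.format(year=year)


# Example lookup for an arbitrary broadcast date.
example_host = server_for_broadcast(date(1974, 8, 8))
print(example_host)
```

A design of this kind keeps routing trivial, since the year alone identifies the unit, at the cost of uneven load if some years are requested far more often than others.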

Creative Services Image transfer assessment. Marshall also worked on a preliminary assessment of the transfer of digital images from Creative Services to the library's photographic archives. Creative Services has a large number of digitized images and produces many new digital photographs each year. This assessment, expected to be completed in March, will outline the issues involved in transferring images from the system used by Creative Services to a format that can be used by the library in its photographic archives digitization effort. Key issues include transfer of metadata, assessment of the levels of metadata available, standard image formats, storage requirements, the ability to make archived images available back to Creative Services, and digital preservation. Marshall met with Lynn Craddick to discuss these issues; Jody convened a meeting on this project that included interested individuals from Creative Services and the library.

This month Marshall participated in meetings of the Heard Library Web Task Force, its Search Engine sub-group, the new Digital Collections Committee, and the Library Management Council. Marshall worked with the Web Task Force to produce a new version of the interface for access to Electronic Journals.

Marshall met with two individuals from Concordia University in Montreal, Quebec, regarding their archive of BBC radio newscasts. We see some possibility for collaboration in the future. They have digitized their 8,000 hours of content but do not have abstracts or transcripts.

Marshall worked with Anne Womack on planning future development work for the ATLA Religious Iconography Database. The grant for this project has been extended to a second year. We plan to enhance the system to support the thesauri from the ICONCLASS system, which is widely used in the field of religious iconography.

Extra-curricular Activities

Marshall's regular Systems Librarian column was published in the February 2003 issue of Computers in Libraries.

Taught a workshop on virtual reference technologies for SOLINET at Western Kentucky University in Bowling Green, KY on February 21, 2003.