Making a Difference: Comments on Electronic Records Management R&D Projects at Ohio State University, Indiana University and City of Philadelphia: SAA Annual Meeting August 29, 1996, Session 6
Richard E. Barry
I have been asked to comment on the three projects presented here today. You will appreciate that, although I have heard presentations on the projects and have visited their WWW sites, I have never visited the facilities where these projects are being implemented or met and heard other stakeholders beyond the speakers. My remarks should therefore be considered as virtual. I was, however, involved on the panels for the University of Pittsburgh project on Functional Requirements (FRs) for Evidence in Recordkeeping, and did use the Pittsburgh FRs to evaluate the World Bank's electronic filing system. The Philadelphia and Indiana projects are follow-on projects to the Pittsburgh project. The Ohio project, which had roots preceding the Pittsburgh project, is silent on the Pittsburgh FRs. The Philadelphia and Indiana projects are similar in that they aim at ultimately implementing recordkeeping systems using the Pittsburgh requirements model. They differ in that one is applying the Pittsburgh Requirements in an academic and the other in a municipal business environment and, of course, they differ in some ways in how they are approaching their projects, and we will talk about some of those differences.
The Ohio project has a different thrust than the others and is therefore not amenable to comparisons in the same way that the other two are. The focus of the Philadelphia and Indiana projects is on electronic recordkeeping systems development and implementation. One might say they are outcome oriented. By contrast, the focus of the Ohio project is on awareness building and collaboration in the development not of a system but of a set of guidelines for the management of electronic records. One might say it is more process oriented. How does one frame brief comments on three such projects? I will not go into the detailed system design of these projects. I do not have a sufficient level of familiarity with them to do that, nor do I believe that doing so would yield much of general interest to this group. Furthermore, much has been said in recent conferences about two of these projects with respect to their responsiveness to archives and recordkeeping needs. I therefore propose to speak to a few topics that we do not typically focus on at SAA meetings, that are primarily related to information architecture and organizational aspects, through discussions of design process, the handling of functional requirements and metadata, the build-vs-buy options for systems development, project sustainability and the reporting of research results. I will begin with the Ohio project because it cuts the broadest swath and in many ways offers insights that are almost universally applicable to electronic records management projects, current and future.
Participative Design of Recordkeeping Electronic Information Systems
The Ohio project drives home a crucial lesson in participative design, whether the object of the exercise is (as in its own case) a set of guidelines that can be used to guide the design of future systems or the design of the system itself, as in the case of the Indiana and Philadelphia projects. It is difficult to conceive how any electronic recordkeeping system can be successfully implemented in the absence of a deep and trusting partnership among the archives and records management (ARM) function, the information management and technology (IM&T) function, often led by a chief information officer (CIO), and the mission-critical end-user community. Identifying the stakeholders and building that partnership are an essential first step in requirements definition and analysis. Why are these particular stakeholders crucial to the field of electronic records management?
The Ohio project brought to light some downside lessons that can be very useful to all of us as most of them could be avoided if worried about in advance. Some of them were related to unfortunate accidents of timing and resulting budget problems, others had to do with the unusual multi-institutional nature of the project, something that would not ordinarily be present in an internal system design process, though even there you have the multi-stakeholder situation to deal with. Despite these hard lessons, it is also clear that many good things happen when all of the stakeholders are engaged in the definition of functional and technical requirements, even if they come out with the same results in the end. It is a common learning experience that results in information managers gaining a much greater depth of understanding of records management issues, and a greater respect for their ARM colleagues and what they can bring to the information management table. At the same time, archivists and records managers gain insights into their organizations' broader information architecture and strategy and into some of the limitations of their organization's current technological environment and some of the intractable issues facing their IM&T colleagues. (Not all intractable problems are centered in archives and records management, though that is what we tend to hear most about at our conferences.) This common learning experience is more than an educational adventure. Well organized and facilitated, it also can:
Like anything else, the participative approach has its own downsides and risks:
In the Pittsburgh project, the balance of participation between the ARM and IM&T professions was heavily weighted in favor of the former, and there was no real participation by end-users as such in the articulation of requirements. This was more because of the research nature of the project and because the requirements sought were not intended to be limited to specific end user business areas. Nonetheless, the Pittsburgh project itself was in large part fashioned through a participative design process such as just described, and gained from many of the benefits outlined above. What this means however is that it is important how the Pittsburgh functional requirements are introduced into a new setting, such as they were in Philadelphia and Indiana. The Ohio project had the most extensive coverage of stakeholders. All three projects were interdisciplinary in their collaboration between ARM, IM&T and business area groups. So far as I can tell, none of the projects involved end-users outside of the target business area staff that create records.
What are the options open to the archivist in this respect at the beginning of the design process that may be too late to undertake once it gets started? One option would be to accept the Pittsburgh functional requirements, BAC model and metadata requirements as givens and move ahead into the design of system specifications that satisfy those needs. The advantages are obvious. Mainly it is that a good deal of time can be saved in the definition phase and work can move quickly into specifications definition, bid offerings and systems development. The risk however is that the energy and buy-in that was experienced in Pittsburgh is not transferable to the new teams in some other location. In the absence of experiencing a similar process, there is the danger that all of the important stakeholders will not be identified or engaged and that they will not have the same level of understanding of the issues -- the IM&T user people will not understand the ARM issues and the ARM group will not understand the interface or information architecture, technology architecture or technical issues, and the end users will understand neither -- and none will likely have the same commitment to the final design solution.
Another option would be to establish the stakeholder group and allow it to go through the kind of participative process outlined above and come to its own definition of functional requirements. The risk, of course, is that the ARM group in any given situation may not be sufficiently well equipped or persuasive to ensure that recordkeeping functional requirements are adequately provided for.
Are these approaches mutually exclusive? They need not be. My own view is that the Pittsburgh functional requirements can be very effectively used by the functional and technical requirements teams as an agenda for the stakeholders to use to learn to talk to one another and to begin to understand some of the ARM concepts and related technology and user interface issues. They can also be used by such a team in an exercise to evaluate one of the organization's current systems, thus demonstrating by use how the requirements relate to existing systems and building an excellent common vocabulary for discussion of functional requirements, possibly resulting in endorsement of the Pittsburgh requirements or some acceptable variation on them.
Treatment of Functional Requirements
How were these issues dealt with in the three projects presented today? It is clear that the Ohio project took the partnership building approach from the beginning, even including other corporate stakeholders such as legal counsel and auditors. How did the Indiana and Philadelphia projects attempt to rationalize the Pittsburgh functional requirements in a real-life information systems and recordkeeping environment? One might interpret that the Philadelphia project bought some time by basically adopting the Pittsburgh functional requirements largely at the outset. The project engaged consultants who were highly identified with the Pittsburgh project, probably a wise choice given the high level of commitment to the Pittsburgh model. That approach had the great merit of moving the project along at a rapid pace to the point where detailed metadata already have been identified and system acquisition, development and implementation will take place in the course of the next year. For a complex system in an organization of 10,000 staff, a system design and development cycle that spans less than three years is certainly a fast-track cycle. Whether this fast-lane approach will endure internally over time remains to be seen. In partial support of the Philadelphia approach, some professionals (including IM&T professionals) have begun to challenge the traditional highly structured approach to systems analysis. Indeed, one of the classical systems analysis methodologies is the "structured analysis" methodology. Large systems ventures are being challenged on several grounds -- they are too time consuming, too costly, rely on an outdated central computing paradigm that doesn't well serve distributed systems and are more likely to result in monolithic solutions that are impossible to keep well documented or properly maintain. 
These critics would rather see a minimal level of requirements analysis, just enough to rapidly implement a small-scale prototype that can be built out through trial and error and extended to other users over time.
The Indiana project began with a different approach: it was to carry out business systems analysis in key business areas and, in the process, test the viability of the Pittsburgh requirements in an academic business setting. It might be interpreted that the Pittsburgh requirements were seen as a useful starting point but might not survive in their entirety when applied. This will likely result, if it hasn't already, in a higher buy-in for what ultimately will be decided on as functional requirements to be used at time of project implementation. It will be interesting to revisit these projects 2-3 years hence and observe in retrospect how implementation times and systems acceptance varied if at all.
The fact that the approaches taken by the projects in Indiana and Philadelphia differ in some ways is not important in itself or a cause for alarm. On the contrary, we are plowing new ground here and need to see where different approaches take us. The three most important needs in electronic records management today are: implementation, implementation and implementation. The sooner we can achieve some solid observable and repeatable results from the systems being implemented by these projects, and others like them, such as in Vermont, New York and at UBC, the better.
All recordkeeping systems must manage the content, structure and context of individual records as they are bound to other related records. Essential contextual information about the record, much if not most of which is not inherently derivable from the content of the record, is variously referred to as metadata, profile or description information. Metadata, or data about data, is an old information management and engineering word that has been used for many years in the design of information directories and systems. Document profile is another more recent term in use jointly by ARM professionals and information scientists in some organizations to reflect similar kinds of data. Profiles are associated with individual documents, and may be used to associate individual documents with groups of like documents or documents associated with the same business process. The term metadata may be used in the same way; however, in connection with the Pittsburgh functional requirements and metadata encapsulated object, it is used not at the document level but at the transaction level. Archival description is still another term used to describe the context of records but with much broader connotations that go into the provenance of the record and the state of the organization and its policies and procedures at the time of the action or transaction of which the record is a residue. Just as the term "metadata" has long standing and meaning in the IM&T community, "description" is a term of long standing and meaning in archival theory. They are both forms of contextual information, but at very different levels. The differences in the meanings and applications of these highly charged terms are -- and have been -- the subjects of another session if not another conference. Suffice it to say here that both projects use the metadata approach, though differently, and neither, to the best of my knowledge, has provisions for the maintenance of electronic provenance or description databases.
In an undated paper on the Pittsburgh WWW site, David Bearman and Ken Sochats posed two scenarios for delivering metadata services in an electronic recordkeeping system: one was the encapsulated approach in which the agreed metadata is both logically and physically bound and encapsulated with the record. The other was one in which the metadata is logically bound but managed in a separate metadata database. Even in the second approach, of course, when a complete record is communicated externally, it must at that point have the metadata bundled with it. While the first approach has the advantage of ensuring that the record has all the needed metadata with it at all times, the second approach may be more desirable from a technology migration perspective, because of its applications independence. In fact, the paper goes on to suggest that some combination of approaches or some approach in between these two approaches (what the paper refers to as "the extremes") is probably the most practical. To elaborate metadata requirements, the Pittsburgh project further developed the so-called Business Acceptable Communications or BAC model. This uses the classical reference model approach made so familiar in the Open Systems Interconnection (OSI) Reference Model with various transport and application layers at which standards are articulated.
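The two scenarios can be contrasted in a short sketch. This is only an illustration under stated assumptions, not the Pittsburgh design: the class names, the record identifier and the metadata fields ("creator", "date") are hypothetical and are not taken from the Bearman/Sochats paper or the Pittsburgh metadata specification. It shows a record that physically carries its metadata, a store that keeps metadata in a separate, logically linked table, and the bundling step needed when a record leaves the second kind of system.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class EncapsulatedRecord:
    """First scenario: content and metadata are physically bound in one object."""
    content: bytes
    metadata: Dict[str, str]


class SeparateMetadataStore:
    """Second scenario: metadata is logically linked by record id
    but managed apart from the record content."""

    def __init__(self) -> None:
        self._content: Dict[str, bytes] = {}
        self._metadata: Dict[str, Dict[str, str]] = {}

    def add(self, record_id: str, content: bytes, metadata: Dict[str, str]) -> None:
        # Content and metadata go into separate tables under the same key.
        self._content[record_id] = content
        self._metadata[record_id] = dict(metadata)

    def export(self, record_id: str) -> EncapsulatedRecord:
        # When a complete record is communicated externally,
        # the metadata must be bundled with it at that point.
        return EncapsulatedRecord(
            self._content[record_id], dict(self._metadata[record_id])
        )


# Hypothetical usage: the field names and values are illustrative only.
store = SeparateMetadataStore()
store.add(
    "rec-001",
    b"Purchase order text",
    {"creator": "procurement", "date": "1996-08-29"},
)
bundle = store.export("rec-001")
```

The sketch also hints at why the second scenario may migrate more easily: the metadata table can be moved or re-indexed without touching record content, while the export step still yields a self-describing object.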
It may be that both projects use some combination of these approaches for capturing metadata. As far as I am able to determine, however, the Philadelphia project has adopted the Pittsburgh metadata encapsulated object (MEO) approach with the BAC model fully intact. This will certainly extend our knowledge base with respect to implementing the Pittsburgh requirements in a manner that should make them highly understandable to data administrators, information systems managers and software specialists, including commercial software development vendors who are operating in the field of electronic document management and workflow systems. The outcome of the Philadelphia project should therefore offer excellent testimony to the strengths and weaknesses of the whole Pittsburgh approach and I, for one, am waiting with bated breath. This also suggests, however, that the Philadelphia project has an unusual burden to ensure that we hear about the warts as well as the beauty of that project as it emerges.
It is not yet clear exactly what approach the Indiana project will take and, indeed, it may be premature to ask. The Indiana web site on this project has the metadata descriptions as yet under construction and without hyper-links. My understanding, however, is that Indiana does not plan to use the encapsulated approach of designing the system to capture the metadata at the front end -- sometimes called data harvesting; rather, as Phil Bantin noted in his remarks at the recent NAGARA presentation on this project, it will use an application-independent database approach in which metadata will be managed separately from records although logically linked to them. This approach is also known as data mining. This difference in approaches may simply be a reflection that the Philadelphia project is using a new system to carry out this work, in which it is feasible to design the metadata capture into system design, whereas Indiana is using legacy systems not easy to redesign retrospectively. It would be desirable to hear from the project leaders if that is the case or if there are other reasons why they have chosen the approaches they have. This information, and the subsequent testing of the results, is important because it will provide other organizations with very important insights into what information architectures work best for taking recordkeeping aspects into account and under what circumstances. This is especially interesting in the case of organizations that have already undertaken the development of enterprise electronic document management systems or intranets where an applications-independent approach to metadata may already have been incorporated in the information architecture for other reasons.
If that is the situation, and the ARM function is now proposing a quite different architecture to use encapsulation to make those systems trustworthy from a recordkeeping perspective, then the ARM function can anticipate some serious meetings with the chief information officer. It is what we refer to in polite company as a "non-trivial, non-technical problem", although of course there are also some non-trivial technical problems that would come with any such dispute.
The Bearman/Sochats paper makes another important point that is worthy of highlighting. It says:
"From the perspective of the business, all data in information systems can be treated as convenience copy, to be kept as long as required for on-going business purposes and to be altered as desired to increase efficiency.
"When needed, records from recordkeeping systems may be copied to information systems which need require their content, but the record itself will never be deleted from, or changed within, the recordkeeping system except with specific records disposition authority. Recordkeeping systems will store and provide access to metadata encapsulated objects."
The suggestion, unless I have misread the paper, is that the recordkeeping system must be a separate and parallel system to the operational system in order to maintain its integrity and the authenticity of its records through a custodial systems approach. It may also be interpreted as suggesting a central recordkeeping system as opposed to the kind of distributed system now planned by the Australian Archives. This parallel approach should certainly be an alternative, and it might be particularly well suited to certain business processes, for example certain financial or procurement processes that are susceptible to fraud, where it might be preferable not to have the organization's records easily altered or deleted by interested parties in the procurement office. I realize that this is a very attractive scheme to many archivists and diplomatists who believe that no costs should be spared in the interests of control. At the same time, it is also a very costly approach and one that seems to run counter to the idea of distributed, non-custodial electronic recordkeeping at a time when we are trying to inculcate the idea that responsible business areas should also be responsible recordkeepers. It is certainly an approach that the ARM function would have to be prepared to justify to both the CIO and the CFO, and the archivist doesn't need enemies like them. It would therefore also be of interest to learn what the thinking is among the architects of the Philadelphia and Indiana projects on this score.1 I find no references to this aspect in the project documentation for either project.
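The rule in the passage quoted above from the Bearman/Sochats paper, that records may be copied out to information systems but are never changed or deleted in the recordkeeping system without disposition authority, can be sketched as follows. This is a minimal illustration, assuming a simple in-memory store; the class, method and exception names and the authority string are hypothetical and do not describe any of the three projects' systems.

```python
import copy
from typing import Any, Dict, Optional


class DispositionAuthorityRequired(Exception):
    """Raised when a record would be removed without disposition authority."""


class RecordkeepingSystem:
    """Hypothetical store: records are captured once, copied out freely,
    and removed only under a stated disposition authority."""

    def __init__(self) -> None:
        self._records: Dict[str, Any] = {}

    def capture(self, record_id: str, record: Any) -> None:
        # A captured record is never overwritten in place.
        if record_id in self._records:
            raise ValueError(f"{record_id} is already captured and cannot be changed")
        self._records[record_id] = copy.deepcopy(record)

    def copy_out(self, record_id: str) -> Any:
        # Information systems receive a convenience copy they may alter.
        return copy.deepcopy(self._records[record_id])

    def dispose(self, record_id: str, authority: Optional[str] = None) -> None:
        # Deletion requires an explicit disposition authority reference.
        if not authority:
            raise DispositionAuthorityRequired(record_id)
        del self._records[record_id]


# Hypothetical usage: altering the convenience copy leaves the record intact.
rks = RecordkeepingSystem()
rks.capture("rec-001", {"body": "original"})
working = rks.copy_out("rec-001")
working["body"] = "edited for convenience"
```

Whether such a store runs centrally or is distributed among the business areas is exactly the architectural choice discussed above; the sketch takes no position on it.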
Build vs. Buy
Another way in which the two projects differ is in their system development approach. Philadelphia took the "buy" approach and is in the process of sending specifications out to bid. However, I also understand that the in-house information systems group will be allowed to bid on the project. This sounds like it could get messy if the in-house IM&T group is in any way involved in a bid evaluation in which it is one of the bidders or, even if it isn't, if it loses. Indiana, on the other hand, took the "build" approach and is carrying out the system development in house.
The buy alternative has several advantages, especially if it has the support of the CIO:
On the other side:
There is no single best recipe for this meal and often the best solution may be some combination in which the RFP specifically includes high weighting factors among the bid evaluation criteria for the existence of a good users' tool-kit and related training program for use by the in-house staff in building needed APIs (possibly with the assistance of the winning bidder's staff). Not all document management systems toolkits are equal, and some systems don't have them at all, making the buyer much more captive to the vendor.
Reporting of Research Results to the Professional Community
An extremely important element of research and development projects is how well they do at regularly reporting their results in ways that allow others to learn as quickly as possible from the on-going work. Unlike some fields of science, applied research and development in the area of electronic records -- which is really what we are talking about here, not basic research -- is prone to results being learned throughout the process, not just at the end. Especially when it involves the very rapidly changing field of information technology, it is very important to take conscious steps to ensure that results get out regularly and widely. In my opinion, as I said at its concluding meeting, the Pittsburgh project set a great standard in making its work and findings known to all colleagues in the field of archives and records management as it happened, typos and all, that other projects should seek to emulate. The Indiana and Philadelphia projects, each now with a WWW homepage, are working well in that direction.
One of the aspects of these projects that must be a serious matter of concern, in my opinion, is their sustainability over time in terms of leadership and professional support. Like some other topics outlined above, this topic has not been discussed in the literature on these projects or in the presentations today. Thus, my comments should be taken as purely conjectural, only to make the point that sustainability is something that should be consciously thought about and planned for each such project. It is my impression that the Ohio project has been organized in such a manner as to ensure continued support following the conclusion of the NHPRC grant, because the product is something that is designed to be used by people already in place in the participating organizations. Similarly, in the IU case, there is strong involvement on the part of managers in both the ARM and IM&T areas.
In the case of Philadelphia, we have a different situation. The leader of the project, an information scientist who is also the leading technical professional on the project, is funded through an NHPRC grant program only through end-1996. Another one-year grant proposal is in the making for funding during 1997. Without wishing to get personal in any way, the sustainability of a project of this significance and order depends greatly on the City's organizational buy-in and investments in the long-term implementation of the project, staff continuity, documentation and shared institutional experience. No doubt the project would never have happened without this short-term approach, and certainly it could not have happened with the speed and deliberation that has characterized this project. For this, NHPRC is highly to be commended. Nonetheless, having done its part, NHPRC cannot fund very many such positions or any one of them indefinitely. There is therefore a lesson for other organizations expecting to undertake projects of this kind. It is that there is a prima facie case for the establishment of positions for information management and technology specialists within the archives and records management organization. The ARM function can no longer hope to do what it is mandated to do that requires sophisticated technology-based systems solely depending for its technical expertise on borrowing time from the IM&T function as and when it becomes available and often with little understanding or interest in records or records management systems, especially those having anything to do with legacy paper systems. Clearly this has not been the case at IU where the Office of Information Technologies has become deeply involved and committed to the project.
As essential as it is to ensure the involvement of the IM&T group and that all ARM staff receive education or continuing education in IM&T skills, that level of technical support within the ARM function is inadequate. Rather, the ARM function needs to build and sustain its own technical staff with the capacity to carry out business systems analysis, articulate requirements and specifications for information systems, and collaborate and be able to go toe-to-toe with operational organizations and IM&T to ensure that organizational information systems are also designed, developed, implemented and maintained as trustworthy recordkeeping systems. Even if we were all to agree with this approach, and I doubt that everyone would, there is a great distance between ARM professionals understanding this and their obtaining the necessary budget to staff such positions. Thus, how well any organization succeeds in implementing a trustworthy recordkeeping system in the near future, before recordkeeping functionality becomes a common requirement of all automated systems, will depend in a very large way on how successful the ARM function is at persuasively articulating the organization's needs for such systems to their chief information and financial officers and to top management. In the meantime, most organizations will operate at increasing levels of risk.
Ohio takes a basically different approach than either the Indiana or Philadelphia projects because it does not begin with the Pittsburgh functional requirements as a given premise, although it did use those requirements in its deliberations. Rather, Ohio took a more holistic approach by first identifying and then engaging the key stakeholders in electronic systems and together working out the requirements through a partnership and brokering process.
Lessons for Future Projects
These projects, together with others like them, for example the projects at the University of British Columbia and Vermont State, which also use business systems analysis for the decomposition of business processes in ways that take recordkeeping requirements into account, have provided a new level of rigor in testing our thinking about recordness and recordkeeping and the potential for automated or computer-assisted records management. We owe a considerable round of applause to the managers and others related to these projects and to NHPRC for making most of them possible.
I would like to close on a few lessons that I take from reviewing these projects today:
1) Design the deliberative process by which recordkeeping requirements will be incorporated as carefully as you do the system itself, and learn as much about the technological and end-user environment and frustrations as you expect them to learn about your own. Don't get too far ahead of the business users or chief information officer, or they may not be there the next time you turn around when you need them.
2) Get budget analyst assistance up front, from within your own group or externally, and include this person in the stakeholder process so that at each stage that base is covered. Enlist the assistance of the offices of the chief financial officer and chief information officer to ensure that your project fits into current year, new year, medium-term and long-term planning and budgeting exercises to gain interest and avoid funding gaps.
3) Work toward the situation in which the ARM function has full-time internal IM&T skills, just as most other user organizations have done in the past 10 years. In the meantime, borrow or buy the help.
4) Understand and weigh the information architecture options for dealing with metadata and description issues.
5) Don't leave the outcome of the buy-vs-build options to circumstance, happenstance or arcane procurement regulations. Think through what is best and go for it in a way that minimizes risk to the organization as a whole.
6) Make the best use of the technology to share project results as they emerge. The World Wide Web is ideal. And share the real results, warts and all.
7) Don't deliberate. Implement. Like these guys are!
REBarry: c:\data\msw60\b-assoc\assns\saa\aa-my-p1.doc: 14 August, 1996
1 In communications with the author following this presentation, David Bearman indicated that this point had not been made clear and that the centralized model should be considered as a viable alternative, but not the only alternative.