Document control databases
'Document control' essentially refers to the creation of computerised
summaries of documents relevant to particular litigation, by the use of
database software which allows each document to be treated as a separate
record, and for each record to contain a series of fields which record the
salient features of the document. Litigation support literature often uses the
term 'document control' to refer to such methods of retrieving documents:
Broderick (1990). The documents concerned will most commonly be potential
exhibits, although witnesses' statements, investigators' reports, transcript of
previous hearings, and other documents not likely to be exhibits may well be
summarised in such a database.
The software used to create a document control database is likely to be a
'flat file' database. A relational database program may well be used, but it
is just as likely that none of its relational features will be used. A hybrid
database and free text retrieval system can also be used, and is of particular
value if the full text of all or some documents, not just summaries, is to be
used. Typical fields in an exhibit-oriented document control database might
include:
document number / exhibit number; date(s); title of document; number
/ name of manual file location; whether discoverable; whether privileged;
document type (original, photocopy, fax); date acquired; source acquired from;
other documents referred to; document from; document to; author; addressee;
copies to; key words; notes by person summarising. See Chapter 5 for further
discussion.
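As a rough illustration only, one record of such a database might be sketched in Python as follows. The class name, field names, defaults and the helper function are all hypothetical choices made for this sketch, covering only some of the fields listed above; they are not drawn from any particular litigation support product.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DocumentRecord:
    """One record in a hypothetical flat-file document control database."""
    document_number: str
    title: str
    document_date: Optional[date] = None
    exhibit_number: Optional[str] = None
    file_location: str = ""               # number / name of manual file
    discoverable: Optional[bool] = None   # None means not yet assessed
    privileged: Optional[bool] = None     # None means not yet assessed
    document_type: str = "original"       # original, photocopy or fax
    author: str = ""
    addressee: str = ""
    copies_to: List[str] = field(default_factory=list)
    refers_to: List[str] = field(default_factory=list)
    key_words: List[str] = field(default_factory=list)
    notes: str = ""                       # notes by person summarising


def find_by_keyword(records, word):
    """Return the records whose key words include `word`, ignoring case."""
    wanted = word.lower()
    return [r for r in records if wanted in (k.lower() for k in r.key_words)]
```

Because every document is a single self-contained record, a plain list of such records behaves like a 'flat file': no relational joins are needed to search it.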
Creation of document control databases
Document control databases cannot usually be created in any semi-automatic
way. Their creation involves people analysing each document and extracting its
salient features. Many of these summarising tasks require interpretative
skills and, in some cases, considerable legal skills. There is often a 'two
pass' approach to the creation of a document control database (see Broderick
and Adrian (1991), Rubenstein (1992), Staudt and Keane (1992) Chapter 6). In
the first pass, a person without professional legal skills extracts from the
documents those details which are relatively obvious, such as the author,
recipient (if any) and date of creation, and assigns the document a number. In
the second pass, a legally qualified person re-examines all documents and
completes (or confirms) those fields in the database involving legal matters
such as questions of privilege or likely admissibility. One or the other of
these people will usually also index the document against the issues which
are likely to arise in the matter.
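The 'two pass' division of labour can be sketched as below. This is a minimal illustration under assumed names: the functions, dictionary keys and the use of None to mark fields awaiting legal review are all hypothetical, not a description of any cited author's system.

```python
def first_pass(number, author, recipient, created):
    """Clerical pass: record the obvious details, leave legal fields open."""
    return {
        "number": number,
        "author": author,
        "recipient": recipient,
        "created": created,
        "privileged": None,   # to be decided in the second pass
        "admissible": None,   # to be decided in the second pass
        "issues": [],
    }


def second_pass(record, privileged, admissible, issues):
    """Legal pass: return a copy of the record with the legal fields set."""
    reviewed = dict(record)
    reviewed.update(
        privileged=privileged, admissible=admissible, issues=list(issues)
    )
    return reviewed


def awaiting_legal_review(records):
    """List records whose legal fields have not yet been completed."""
    return [
        r for r in records
        if r["privileged"] is None or r["admissible"] is None
    ]
```

Keeping the legal fields explicitly empty after the first pass makes it easy to list the documents that a legally qualified person has still to examine.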
Retrieval from the full text of documents
Free text retrieval technology can be used to retrieve three main categories
of documents in relation to litigation: transcript of proceedings (either the
current proceedings or prior proceedings); witnesses' and other statements; and
the full text of potential exhibits.
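To make the idea of free text retrieval concrete, the following is a minimal sketch of a word index over full text, with a search that returns the documents containing every word of a query. It is an assumption-laden simplification: the function names and the crude tokenising rule are hypothetical, and real retrieval software adds features such as truncation, proximity operators and ranking.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")


def build_index(texts):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in texts.items():
        for word in WORD.findall(text.lower()):
            index[word].add(doc_id)
    return index


def search_all(index, query):
    """Return the ids of documents that contain every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

The same index serves transcript, statements and exhibits alike, since each is treated simply as a sequence of words.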
Effectiveness of free text retrieval in law
The effectiveness of free-text retrieval as a means of litigation support has
been criticised severely (Blair and Maron (1985); Blair and Maron (1990)), on
the basis of an experiment involving lawyer users of a 40,000 document
litigation support database. The lawyers considered acceptable performance of
their system to be recall of 75% of all relevant documents, and thought they
were achieving it, whereas in fact they were retrieving only 20% of all
relevant documents. Blair and Maron's criticisms were partly on the basis that
lawyers using the system had exaggerated and misleading expectations as to
the effectiveness of their searches in recalling all relevant documents, but
also on the basis that it is inherently unreliable to expect word occurrences
to match concepts. Use of free text searching to retrieve case-law has
also been criticised, with recommendations that greater reliance be placed on
catchwords as the prime means of retrieval (Lindgren (1990)). To some extent,
these criticisms of free-text retrieval are criticisms of unrealistic user
expectations, as well as identifications of its inherent limitations.
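Recall, the measure at issue in these criticisms, is the proportion of all relevant documents that a search actually retrieves. A minimal sketch of the calculation follows; the function name is hypothetical, and any document counts fed to it here would be invented for illustration, not figures from the Blair and Maron study.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that appear among those retrieved."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when no document is relevant")
    return len(set(retrieved) & relevant) / len(relevant)
```

For instance, if 100 documents in a collection were relevant and a search returned 50 documents of which 20 were relevant, recall would be 20%, even though the searcher, seeing only the retrieved set, has no direct way of knowing that 80 relevant documents were missed.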
The deficiencies identified can also be overcome to some extent by
improvements in retrieval software and techniques, including better use of
truncation, synonyms, error checking, relevance ranking of retrieved results,
and more 'user friendly' user interfaces: see Bing (1984), Bing (1990),
Greenleaf, Mowbray and Lewis (1988) Chapter 3. There is little doubt that
free-text retrieval can be made more effective if the 'raw' text of cases,
transcript or potential exhibits is enhanced by the addition of indexing terms
(Bing (1984)) such as catchwords of cases, summaries of documents, or 'issue
identifiers' inserted into transcript (Broderick (1990)). Hybrid software
has clear advantages in facilitating this. However, all such 'value adding' of
indexing terms to the raw text involves the time and intellectual effort