FixRep project begins

k-int are participating in the JISC-funded FixRep project, which started in April 2009. The project is led by UKOLN, with the National Centre for Text Mining (NaCTeM) as a partner.

The project aims to conduct a practical evaluation of formal metadata generation methods within real-world workflows. Whereas most attention is normally directed to metadata generation as part of the deposit of metadata into a repository, the project will also highlight the need to improve metadata already within repositories and will look at the potential of a range of techniques to assist in the processes of 'triage' - incremental improvement of metadata through error identification and correction - and 'normalisation' - increasing consistency for a specific purpose, such as republishing of the record as part of an overlay journal.

The project will make use of expert knowledge from each partner to evaluate existing tools, services and prototypes in a number of real-world contexts, including UKOLN's managed harvesting and aggregation tool, the University of Bath OPUS repository, and the University of Minho's REPOSITORIUM, the latter enabling practical evaluation of tool performance on languages other than English.

In order to maximise the sustainability and practical impact of the work, the EPrints project, a member of the DSpace development team and an experienced Fedora developer will all be involved in a consultation process evaluating the results for practical application within mature institutional repository platforms.

k-int's main input to the project will be to look at the possibilities of extracting normalised named entities by reference to pre-defined controlled vocabularies, in a way that avoids common flaws and pitfalls. The work will examine a more robust approach that makes use of existing services to improve the scope of named entity, temporal and geographical information extraction, improving coverage and recall whilst maintaining precision.
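
As a rough illustration of what normalising named entities against a controlled vocabulary can mean in practice, the sketch below maps variant forms found in free text to a single preferred label. It is not the project's implementation: the vocabulary entries, the `VOCABULARY` structure and the function names `build_index` and `extract_entities` are all assumptions made for this example, and a real system would draw on external vocabulary services rather than a hard-coded dictionary.

```python
import re

# Illustrative vocabulary only: preferred label -> variant forms.
# These entries are assumptions for the example, not project data.
VOCABULARY = {
    "University of Bath": ["University of Bath", "Bath University"],
    "University of Minho": ["University of Minho", "Universidade do Minho"],
}


def build_index(vocabulary):
    """Map each case-folded variant to its preferred label."""
    index = {}
    for preferred, variants in vocabulary.items():
        for variant in variants:
            index[variant.casefold()] = preferred
    return index


def extract_entities(text, vocabulary):
    """Return preferred labels for vocabulary variants found in text.

    Labels are returned in order of first appearance, without duplicates.
    """
    index = build_index(vocabulary)
    # Try longer variants first so a longer form wins over a shorter one.
    variants = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in variants) + r")\b",
        re.IGNORECASE,
    )
    found = []
    for match in pattern.finditer(text):
        preferred = index[match.group(1).casefold()]
        if preferred not in found:
            found.append(preferred)
    return found
```

Matching only against known variants keeps precision high, since nothing outside the vocabulary is ever returned, but recall is limited to the variants listed. That trade-off is the one the paragraph above describes: widening coverage without letting false matches in.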