- Improve conversion of paper documents into machine-readable format
- Provide facilities to test new multilingual technologies
- Enhance tools for data discovery
- Detect duplicates in the repository
|
- Develop code to extract content from documents
- Develop algorithms to detect duplicates
- Host facilities to test new multilingual technologies
|
- Provided secure, traceable access for independent language experiments
- Reduced the number of duplicate records
- Improved the ability to extract intelligence from documents
- Increased the number of documents available to English-language users
|