NLP: Natural Language Processing, Disrupting Legal Document Classification
The project aims to conduct the market validation of our NLP-based solution, which automatically interprets and categorizes legal documents in Spanish while taking into account the peculiarities of legal jargon, and to extend it to new target languages (English, Italian) and new geographies.
Project Description
Employees of large companies in the area of Business Process Services spend too much time and effort browsing through legal documents in order to classify them and extract the right information during a due diligence process. This project proposes to validate and extend our state-of-the-art, open-source-based NLP solution for entity recognition and extraction so that it can classify legal documents in Spanish, English and Italian.
The expected outcome of the NLP solution is a dramatic cut in the operational cost of processing legal documents, reducing manual cost by up to 80% in document classification and by up to 40% in metadata extraction. This will be possible thanks to iterative testing, training and tailoring of our current NLP technology assets.
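As a rough illustration of how the per-task figures above combine into an overall saving, the sketch below computes a weighted average of the stated upper-bound reductions. The split of manual effort between tasks is a hypothetical assumption made only for this example; it is not project data, and the function name is illustrative.

```python
def blended_saving(effort_share, reduction):
    """Return the effort-weighted average cost reduction across tasks."""
    total = sum(effort_share.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("effort shares must sum to 1")
    # Tasks with no stated reduction are assumed to see no saving.
    return sum(share * reduction.get(task, 0.0)
               for task, share in effort_share.items())


# Hypothetical split of manual effort (assumption, not measured data).
effort_share = {
    "classification": 0.25,
    "metadata_extraction": 0.50,
    "other": 0.25,
}

# Upper-bound reductions stated in the project description.
reduction = {
    "classification": 0.80,
    "metadata_extraction": 0.40,
}

overall = blended_saving(effort_share, reduction)
print(f"Overall manual cost reduction: {overall:.0%}")
```

Under this assumed split the blended figure is 40%, which shows how the per-task upper bounds can sit alongside a lower overall process saving; a different effort split would give a different result.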
Participants
- Fondazione Bruno Kessler FBK
- Universidad Politécnica de Madrid
- Indra
- Ferrovial
- Ci3
Schedule
- EIT Digital decision in 2018
- Duration: 12 months, starting the 1st of July 2019
Results
Expected benefits if the project is developed successfully:
a) Technical benefits
- Localize the solution in English and Italian, opening up a larger target market.
- Re-train the AI models with new documents in a new business environment (which implies a significant effort in data pruning and data processing).
- Tailor the solution to the business needs of Ferrovial.
b) Economic benefits
- Save time when reading legal documents in Spanish, English and Italian.
- Reduce business process service costs by about 30-40%.
c) Environmental benefits
- Potential reduction of paper used.
d) Social benefits
- The tool to be developed will help avoid or minimize time wasted reading non-key points in legal documents.
e) Commercial benefits
- A product that is localized, tailored, validated and ready to be commercialized.
- It could represent a competitive advantage and a clear added value to our business.
Identified Risks
The main risks identified in this project are listed as follows:
a) Technical risks
- Suitability of the chosen Ferrovial/Amey use case (and accuracy of its results) for validating the solution, and of the change management process needed to deploy the solution successfully in different business units.
- Competing products already on the market. Some disruptive players (Luminance, Kira or Kim) are approaching some of the target segments of the industry we are addressing, although their use cases are only in English and not pretrained.
b) Organizational Risks
- One or more partners leaving the consortium.
- Time for development is underestimated.
c) Economic risks
- Financial problems at partner organizations require restructuring the project budget.
- Time for development is underestimated.
d) Environmental risks
- No environmental risks have been identified.
e) Social risks
- Lack of commitment from end-users to provide data.
- Lack of interest among managers in other business units in applying this type of technology.
f) Legal risks
- Continuous evolution of the reference standards.
- Legal obstacles to collecting or using data from other stakeholders.
Strategic Impact
The specific impacts of the project can be listed as follows:
- The NLP solution is expected to dramatically cut the operational cost of processing legal documents, reducing manual cost by up to 80% in document classification and by up to 40% in metadata extraction.
- The solution will be tailored to the business needs of Ferrovial.