Machine Learning NLP Applications - Contract Clauses Legal Review


When signing up to a consumer service, do you actually read the terms of service (TOS) before clicking on the “Accept” button? If you do, then you are an exception. The majority of us tend not to review them, and unknowingly sign our rights away. Governmental bodies such as the EU are attempting to address this issue through regulation.

Despite their efforts, online platform companies mostly do not adhere to these rules. 

The CLAUDETTE project aims to automate the legal evaluation of terms of service and privacy policies of online platforms using machine learning. The project’s philosophy is to empower consumers and civil society using artificial intelligence. It is an interdisciplinary research project hosted at the Law Department of the European University Institute. The research paper can be found here, and the associated dataset can be found here.

There has also been a lot of chatter over the last 12 months about the leap in state-of-the-art Natural Language Processing (NLP) models brought by Transformers. The breakthrough came from Google research on models that process each word in relation to all the other words in a sentence, rather than sequentially (one by one, in order). These models can consider the full context of a word by looking at the words that come before and after it, which is particularly useful for understanding intent. More information on Transformers can be found here.
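To make the idea of relating every word to every other word concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is a simplification and not the code used in the experiment: the embeddings are made-up toy values, and the learned query, key and value projections of a real Transformer are left out.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values.

    Simplification: a real Transformer first applies learned linear
    projections to obtain separate queries, keys and values.
    """
    dim = len(vectors[0])
    weights = []
    outputs = []
    for query in vectors:
        # Score this word against every word in the sentence, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # The new representation is a weighted mix of all word vectors.
        outputs.append([sum(w * vec[i] for w, vec in zip(row, vectors))
                        for i in range(dim)])
    return outputs, weights

# Hypothetical 2-dimensional embeddings for a three-word sentence.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(sentence)
```

Each row of weights says how much one word draws on every other word, regardless of position, which is what lets the model use context on both sides of a word at once.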


This article details the exploratory work we carried out on the CLAUDETTE dataset using our NoamAi automated NLP platform. The purpose was to:

  1. Train Transformer models to detect terms of service clause types and their associated fairness in contracts.

  2. Ascertain whether an NLP model trained on the CLAUDETTE dataset (online platform services) can be applied to terms of service from another domain (e.g. telco service contracts).

We achieved ~92% model accuracy on the CLAUDETTE dataset, then applied the model to an anonymised Telco customer wireless service contract, with interesting results. Data associated with the experiment can be found here.



The CLAUDETTE dataset was split into training and validation datasets:

Training: ~3k labeled clauses (2,993 in total) were used to train an NLP model. The cleansed data can be found here. Below is a breakdown of unclassified (labeled as unc) and classified (not labeled as unc) clauses:

  • Unclassified clauses: 2,422 

  • Classified clauses: 571

Validation: ~1.3k labeled clauses (1,273 in total) were used to evaluate the model’s accuracy. The cleansed data can be found here. Below is a breakdown of unclassified (labeled as unc) and classified (not labeled as unc) clauses:

  • Unclassified clauses: 1,026

  • Classified clauses: 247
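The two breakdowns above imply a strong class imbalance. The short sketch below, which uses only the counts reported in this article, works out the share of classified clauses in each split.

```python
# Clause counts as reported above for the CLAUDETTE splits.
splits = {
    "training": {"unclassified": 2422, "classified": 571},
    "validation": {"unclassified": 1026, "classified": 247},
}

def classified_share(counts):
    """Fraction of clauses that carry a label other than unc."""
    total = counts["unclassified"] + counts["classified"]
    return counts["classified"] / total

for name, counts in splits.items():
    total = sum(counts.values())
    print(f"{name}: {total} clauses, {classified_share(counts):.1%} classified")
```

Roughly one clause in five is classified in both splits. A model that always predicted unc would therefore already score about 80% on the validation set, which is worth keeping in mind when reading a single overall accuracy figure.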

Clause Categories

The clauses are broken up into 9 categories, as described below.

Fairness is measured on a scale of 1 to 4 (Fair: 1, Potentially Unfair: 2, Clearly Unfair: 3, Unclassified: 4).
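As a minimal sketch, the fairness scale could be encoded as an enumeration so that numeric labels and readable names stay in sync. The class and function names are illustrative assumptions and do not come from the CLAUDETTE dataset or the NoamAi platform.

```python
from enum import IntEnum

class Fairness(IntEnum):
    """Fairness scale as described in this article (names are illustrative)."""
    FAIR = 1
    POTENTIALLY_UNFAIR = 2
    CLEARLY_UNFAIR = 3
    UNCLASSIFIED = 4

def describe(level):
    """Map a numeric fairness label to a readable name."""
    return Fairness(level).name.replace("_", " ").title()

print(describe(2))
```

Passing a value outside 1 to 4 raises a ValueError, which helps catch malformed labels early.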

Telco Wireless Service Contract Data

A publicly available unlabeled Telco service contract was used as a test dataset. It contained terms of service for wireless products, features, applications, and services.

The contract was anonymised (company names and contact details removed) and contained ~620 lines of text. Details of the dataset can be found here.
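The article does not say how the contract was anonymised. Purely as an illustration, a simple rule-based pass could look like the sketch below; the company names, regular expressions and placeholder tokens are all assumptions, and the phone pattern is deliberately loose, so it may also catch other long digit runs.

```python
import re

# Hypothetical company names; the actual list used is not given in the article.
COMPANY_NAMES = ["Acme Wireless", "Acme"]

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Loose pattern: a run of at least 8 digits and common separators
# that starts and ends with a digit.
PHONE = re.compile(r"\+?\d[\d\s().-]{6,}\d")

def anonymise(text):
    """Replace emails, phone numbers and known company names with placeholders."""
    # Emails go first so a company name inside an address is not half-replaced.
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    # Replace longer names first so "Acme Wireless" does not become "[COMPANY] Wireless".
    for name in sorted(COMPANY_NAMES, key=len, reverse=True):
        text = re.sub(re.escape(name), "[COMPANY]", text, flags=re.IGNORECASE)
    return text

sample = "Contact Acme Wireless at support@acme.com or 1-800-555-0100."
print(anonymise(sample))
```

Rule-based redaction like this tends to miss variants, so a manual review of the output would still be needed before sharing a contract.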



The trained model achieved an accuracy of ~92% on the CLAUDETTE validation dataset. Details of the results can be found here.

Model accuracy for the unclassified (clause symbol equal to unc) and classified (clause symbol not equal to unc) clause types was as follows:

  • Unclassified clauses: 97.5%

  • Classified clauses: 69%
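These two figures can be reconciled with the ~92% overall accuracy by weighting each group by its number of validation clauses. The sketch below uses only the numbers reported in this article; since the per-group accuracies are rounded, the result is approximate.

```python
# Per-group figures as reported above for the validation set.
groups = {
    "unclassified": {"clauses": 1026, "accuracy": 0.975},
    "classified": {"clauses": 247, "accuracy": 0.69},
}

def overall_accuracy(groups):
    """Clause-weighted average of the per-group accuracies."""
    total = sum(g["clauses"] for g in groups.values())
    correct = sum(g["clauses"] * g["accuracy"] for g in groups.values())
    return correct / total

print(f"{overall_accuracy(groups):.1%}")
```

The weighted average comes out at roughly 92%, so the headline figure is driven mostly by the large unclassified group, while accuracy on the clauses of real interest is closer to 69%.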

Telco Wireless Service Contract

Below is an extract of some of the terms of service clauses from the Telco contract that the NLP model evaluated and classified. The model classified some of the clauses correctly, despite the difference in domain. Details of the results can be found here.


NoamAi is a fully automated NLP platform that embodies the principles of DevOps and CI/CD (MLOps) for Machine Learning and AI. It simplifies and automates the end-to-end ML model build process (data preparation -> model training -> model deployment) by way of standardisation, consistency, versioning, speed and scale. All users need to do is provide the data and define the problem.

The platform has the ability to build models for the following use cases:

  • Classification - Classify sentences or text narratives (e.g. sentiment analysis).

  • Comprehension - Answer questions based on paragraphs of text.

  • Multiple Choice Question Answering - Answer multiple choice questions.

  • Named entity recognition - Extract information from text into predefined categories such as person names, organisations, locations, medical codes, time expressions, quantities, monetary values and percentages.

  • Summarisation - Summarise text.

Market leaders have recognised that:

  • Machine learning and AI have demonstrated benefits and value in an organisation

  • Standardisation and automation of ML drive speed, scale and efficiency

  • Teams become more productive in their day jobs

  • Giving them the capacity to innovate

  • Allowing for breakthroughs and discoveries

  • Which improves and matures Machine Learning in an organisation

  • And increases return on investment

What Else?

Exciting times lie ahead, as we are in the midst of an “industrial revolution” for Machine Learning and AI. Organisations are productising and commercialising academic research and algorithms to solve real-world problems, gaining efficiencies, benefits and value. This has also changed the way businesses operate, with data-driven and machine learning solutions streamlining business operations and decision making.
