TVH is known for its forklifts, aerial platforms and other in-plant transport. It has branches in 30 countries, offers customer support in 37 languages and has 50 years of experience in the business.
TVH has two business units, TVH Parts and TVH Equipment. TVH Parts is responsible for Material handling parts (MPA), Industrial equipment parts (IPA) and Agricultural equipment parts (APA). TVH Equipment is responsible for Service & repair, Sales and Rental & operational leasing. This is where Brainjar comes in.
TVH Equipment receives 8000 orders per day in Belgium. A large part of these orders is placed by email. Right now employees read each email and place the order manually, which is time consuming and inefficient. That is why TVH Equipment was looking for a way to reduce the workload so that employees can focus on more essential tasks. Brainjar can speed up the process by using artificial intelligence to read the email and extract the information needed to place an order.
How did Brainjar speed up the order processing of TVH?
Brainjar built an end-to-end machine learning application that:
- integrates with the mailbox;
- classifies which emails can be handled by our software;
- can extract the necessary information for building an order from an email;
- presents the extracted information in a web interface for review.
To build this application Brainjar used a continuous process of 5 distinct steps:
- Data analysis: We went on-site at TVH to understand the data and the business process
- Research: The field of machine learning is advancing at a fast pace, therefore some additional research is required for each project
- Machine learning: During this stage we train and fine-tune our machine learning models
- Integration: Both at the input and output side we integrate with the systems of TVH
- Customer feedback: Periodically we go on-site or schedule a call to get customer feedback
A bird's-eye view of the application
The first step was to analyse the workflow of TVH: how clients send emails, in which language, and what information can be found in an email. We looked not only at the emails used for creating orders but also at other emails sent to TVH. The first system we need is one that can distinguish the emails containing orders from the ones that don't.
This is where we needed to research the best way to interact with the mailbox, manipulate emails and separate emails. To interact with the mailbox we built an API that integrates with both Gmail and Office365. For separating emails we used a branch of artificial intelligence called Natural Language Processing (NLP). With NLP we can classify emails based on the body of the email. There are many possible NLP approaches to text classification, so research was needed to find the one most suitable for the TVH use case. More on this below.
The next step is to solve the problem of extracting information from emails. We analysed the emails as well as the corresponding order. From this we could determine which information is necessary to create an order. To extract information from the email we used Named Entity Recognition (NER). NER will locate and classify certain entities in the text. There are again a lot of approaches to NER. More on this below.
The last step was to decide how we would build a user friendly web interface. TVH already used Angular for their web platform with their own house style. We used the same technology and layout to make the transition for the employees as intuitive as possible.
The business case behind the automation
Before going into the technical details, let us first present an overview of the business case. The following numbers are approximations of the actual TVH Equipment numbers. In Belgium alone, TVH Equipment receives around 8000 orders per day. Almost half of those orders arrive by email; the other half come in by phone or through the website. Each order requires on average 5 minutes of processing time, which adds up to around 330 man-hours per day for email order processing alone. With our order automation application we can reduce the processing time from an average of 5 minutes to an average of 1 minute, a reduction of 80%. That saves around 260 man-hours per day!
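The arithmetic behind these figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes exactly half of the orders arrive by email, whereas the text says "almost half", which is why it lands slightly above the rounded 330 and 260 man-hours quoted above.

```python
# Approximate figures taken from the business case above.
orders_per_day = 8000
email_share = 0.5      # assumption: "almost half" is rounded up to exactly half
minutes_before = 5     # average manual processing time per order
minutes_after = 1      # average processing time with the automation in place

email_orders = orders_per_day * email_share
hours_before = email_orders * minutes_before / 60
hours_after = email_orders * minutes_after / 60
hours_saved = hours_before - hours_after
reduction = 1 - minutes_after / minutes_before

print(f"before: {hours_before:.0f} man-hours per day")
print(f"after:  {hours_after:.0f} man-hours per day")
print(f"saved:  {hours_saved:.0f} man-hours per day ({reduction:.0%})")
```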
The nitty-gritty technical details
Gmail and Office365 API
One of the important parts about this project is manipulating the mailbox in a way that doesn't interrupt the workflow at TVH. This means we needed to be able to move emails around based on their labels or folder structure. That is why we developed an API that integrates both with Gmail and Office365. This API is written in Python and has the following functionality:
- Move incoming mails to the dedicated label or folder
- Extract all the information from an email
- Put all the information per email on a Cloud Pub/Sub
- Be able to add and remove labels from an email
- Send and reply to emails
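To give an idea of how one API can serve both providers, here is a minimal sketch of a shared interface. All names in it (`Email`, `MailProvider`, `InMemoryProvider`, `relabel`) are hypothetical and chosen for illustration; the real implementation would wrap the Gmail and Office365 clients, which are not shown here. The in-memory provider only exists so the example runs on its own.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Email:
    message_id: str
    sender: str
    subject: str
    body: str
    labels: Set[str] = field(default_factory=set)


class MailProvider(ABC):
    """Hypothetical common interface for a Gmail or Office365 backend."""

    @abstractmethod
    def get(self, message_id: str) -> Email: ...

    @abstractmethod
    def add_label(self, message_id: str, label: str) -> None: ...

    @abstractmethod
    def remove_label(self, message_id: str, label: str) -> None: ...

    def relabel(self, message_id: str, old: str, new: str) -> None:
        # Moving an email is expressed as "remove one label, add another".
        self.remove_label(message_id, old)
        self.add_label(message_id, new)


class InMemoryProvider(MailProvider):
    """Stand-in for a real mailbox, used only to make the sketch runnable."""

    def __init__(self) -> None:
        self._emails: Dict[str, Email] = {}

    def store(self, email: Email) -> None:
        self._emails[email.message_id] = email

    def get(self, message_id: str) -> Email:
        return self._emails[message_id]

    def add_label(self, message_id: str, label: str) -> None:
        self._emails[message_id].labels.add(label)

    def remove_label(self, message_id: str, label: str) -> None:
        self._emails[message_id].labels.discard(label)


provider = InMemoryProvider()
provider.store(Email("msg-1", "client@example.com", "Order", "Please send 2 forklifts."))
provider.add_label("msg-1", "In Progress")
provider.relabel("msg-1", "In Progress", "Processed")
print(provider.get("msg-1").labels)
```

Keeping the provider behind one interface means the rest of the system never needs to know whether a mailbox uses labels (Gmail) or folders (Office365).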
The backend is the backbone of this project. This will interact with all the other services. It will make sure that all emails are being processed correctly. This API is written in Python and has the following functionality:
- Listen on the Cloud Pub/Sub
- Send the email data to the text classifier
- Send data to the Named Entity Recognition
- Send instructions to the mail API
- Process requests from the web interface
- Create an order
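The responsibilities listed above can be sketched as a single handler that is called for every message taken from the queue. This is an illustration under assumptions, not the actual backend: the JSON payload layout, the function name `handle_message` and the three injected callables are hypothetical, and the Pub/Sub subscription itself is left out. The 'In Progress' and 'Processed' labels are the ones used in the mailbox workflow.

```python
import json
from typing import Callable, Dict, List, Optional, Tuple

Entity = Tuple[str, str]  # (label, text), e.g. ("QUANTITY", "2")


def handle_message(
    payload: bytes,
    is_order: Callable[[str], bool],
    extract_entities: Callable[[str], List[Entity]],
    relabel: Callable[[str, str, str], None],
) -> Optional[Dict[str, object]]:
    """Process one queued email and return the draft order, or None."""
    email = json.loads(payload.decode("utf-8"))  # assumed payload: {"id": ..., "body": ...}

    # Step 1: ask the text classifier whether this email is an order.
    if not is_order(email["body"]):
        return None  # handling of non-order emails is not covered in this sketch

    # Step 2: ask the NER service for the entities and group them per label.
    fields: Dict[str, List[str]] = {}
    for label, text in extract_entities(email["body"]):
        fields.setdefault(label, []).append(text)

    # Step 3: tell the mail API that the email has been handled.
    relabel(email["id"], "In Progress", "Processed")
    return {"message_id": email["id"], "fields": fields}


# Usage with stub services standing in for the real classifier, NER and mail API.
calls = []
payload = json.dumps({"id": "msg-1", "body": "Please deliver 2 forklifts"}).encode("utf-8")
order = handle_message(
    payload,
    is_order=lambda body: "deliver" in body,
    extract_entities=lambda body: [("QUANTITY", "2"), ("PRODUCT", "forklifts")],
    relabel=lambda message_id, old, new: calls.append((message_id, old, new)),
)
print(order)
```

Passing the services in as callables keeps the handler easy to test without a mailbox or a trained model.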
The text classifier is necessary to determine whether an email contains an order. In the future this can be expanded to classify different types of emails, for example order, cancellation, sign out or quotation. We ended up going for a neural network that is language independent. TVH offers customer support in 37 languages, so keeping that in mind from the beginning will save us time in the long run.
The API works as follows: the body of the email is converted to a sequence of tokens. These tokens are then embedded into vectors. We transform the body into vectors because a neural network cannot process raw text, only numbers. The vectors are then processed by the neural network, which produces an output vector. By adding a classification layer on top of the output for the whole text, the API can return whether the email is an order or not.
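The token, vector and classification steps can be illustrated with a toy forward pass. This is deliberately not the production model: the whitespace tokenizer, the hashed vocabulary, the mean pooling and the single logistic output are all simplifying assumptions, and the weights are random and untrained, so the resulting probability has no meaning beyond showing the data flow.

```python
import math
import random
import zlib
from typing import List

VOCAB_SIZE = 1000  # hypothetical size of the hashed vocabulary
EMBED_DIM = 8      # hypothetical embedding width

rng = random.Random(0)  # fixed seed so the sketch is reproducible
EMBEDDINGS = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(VOCAB_SIZE)]
WEIGHTS = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
BIAS = 0.0


def tokenize(body: str) -> List[str]:
    # Naive whitespace tokenizer; a real system would use a multilingual tokenizer.
    return body.lower().split()


def token_id(token: str) -> int:
    # Stable hash so the same token always maps to the same embedding row.
    return zlib.crc32(token.encode("utf-8")) % VOCAB_SIZE


def embed(tokens: List[str]) -> List[List[float]]:
    return [EMBEDDINGS[token_id(token)] for token in tokens]


def pool(vectors: List[List[float]]) -> List[float]:
    # Mean over the token vectors gives one vector for the whole text.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def order_probability(body: str) -> float:
    tokens = tokenize(body)
    if not tokens:
        return 0.0  # an empty body is treated as "not an order" in this sketch
    pooled = pool(embed(tokens))
    logit = sum(w * x for w, x in zip(WEIGHTS, pooled)) + BIAS
    return 1 / (1 + math.exp(-logit))  # sigmoid turns the score into a probability


print(order_probability("Please deliver 2 forklifts next week"))
```

In a trained model the embeddings and weights are learned from labeled emails, and the pooling step is usually replaced by a deeper network.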
Named Entity Recognition
For the Named Entity Recognition we used the same approach as for the text classifier. This means that we can extract entities from emails independent of the language. The difference is that for NER we want to predict a label for each token rather than one for the whole text. This is done by adding the classification on the output vector of each token, which returns a label for each token. With this we can extract the necessary entities, such as the delivery date. This API expects a body of text as input and returns all the tokens with their corresponding labels.
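Once every token has a label, the entities still have to be assembled from neighbouring tokens. The sketch below shows one simple way to do that. The label names (`QUANTITY`, `PRODUCT`, `DELIVERY_DATE`), the `"O"` marker for tokens outside any entity and the function name `group_entities` are assumptions made for the example; the actual label set is not given here.

```python
from itertools import groupby
from typing import List, Tuple

TaggedToken = Tuple[str, str]  # (token, label)


def group_entities(tagged: List[TaggedToken]) -> List[Tuple[str, str]]:
    """Merge runs of consecutive tokens that share a label into one entity."""
    entities = []
    for label, run in groupby(tagged, key=lambda pair: pair[1]):
        if label != "O":  # "O" marks tokens that belong to no entity
            entities.append((label, " ".join(token for token, _ in run)))
    return entities


# Example of what the NER API might return for a short email body.
tagged = [
    ("Please", "O"),
    ("deliver", "O"),
    ("2", "QUANTITY"),
    ("forklifts", "PRODUCT"),
    ("by", "O"),
    ("12", "DELIVERY_DATE"),
    ("March", "DELIVERY_DATE"),
]
print(group_entities(tagged))
```

Here "12" and "March" end up in a single delivery date entity because they are adjacent and carry the same label. Two separate entities of the same type that sit directly next to each other would be merged too, which is why many NER systems use a begin/inside tagging scheme instead.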
The web interface has two functions. The first is to verify if an order has been made correctly before sending it to the system. The second function is to save the corrections that are being made. This will help to improve the text classifier and the Named Entity Recognition. The web interface has a login page, overview page and an order page.
An email arrives in the TVH mailbox. The email is labeled 'In Progress' while all its information is put on the Cloud Pub/Sub. The backend retrieves the information from the Pub/Sub and sends it to the text classifier, which returns that the email is indeed an order. The backend then sends the body to the Named Entity Recognition, which returns all the entities in the email. The backend uses this information to create an order. Finally, it sends instructions to the mail API to remove the label 'In Progress' and add the label 'Processed'.
Now an employee of TVH can log in to the web interface and see a list of emails that need reviewing or that have been reviewed. When the employee opens an order, they see the original email on the left-hand side and the order on the right-hand side. The employee can then make changes as needed and click submit when done. This information is used to improve the different AI solutions, reducing the processing time even further.
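One way to turn a review into training feedback is to store only the fields the employee changed. The following is a minimal sketch of that idea; the flat dictionary layout of an order, the field names and the function name `diff_order` are hypothetical.

```python
from typing import Dict, Optional


def diff_order(
    predicted: Dict[str, str], reviewed: Dict[str, str]
) -> Dict[str, Dict[str, Optional[str]]]:
    """Return every field whose value differs between the model output and the review."""
    corrections = {}
    for key in sorted(set(predicted) | set(reviewed)):
        before, after = predicted.get(key), reviewed.get(key)
        if before != after:
            corrections[key] = {"predicted": before, "reviewed": after}
    return corrections


predicted = {"quantity": "2", "delivery_date": "12 March"}
reviewed = {"quantity": "2", "delivery_date": "21 March", "reference": "PO-123"}
print(diff_order(predicted, reviewed))
```

An empty result means the order was accepted as is, which is useful feedback as well: it confirms the predictions for that email.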
Brainjar is still working with TVH on improving the application. Right now we are working on PDF support. A lot of TVH's customers place orders as PDFs. The problem is that each customer has its own layout and PDF structure. The next step is to process those emails as well. Another future extension is adding classes of emails, for example stop orders and change orders.