About 1% of total payment value is lost to operational errors such as duplicate payments. Duplicate payments occur when one of the fields that identify an invoice is altered: vendor name, invoice number, invoice amount, or invoice date. For example, an OCR error can alter the invoice number so that a letter (“l”) is read as a digit (“1”). The payments department then sees a different invoice number and issues a second payment. Similar errors can affect the other fields.
Given the high value of payments made by SG EBS, this 1% represents a lot of money. As a consequence, they built a department dedicated to manually verifying all the potential duplicate invoices flagged by their existing invoice monitoring tools. These traditional tools look for combinations of three invoice criteria (for example, same invoice amount, same vendor name, and same date) and, unfortunately, return an immense number of invoices that must be analyzed by humans. Moreover, the false positive rate of the returned invoices is about 98%: of every 100 invoices analyzed, only 2 prove to be duplicates. As a result, the huge effort put in by the people in this department is extremely inefficient and costly. Our aim was to reduce the human effort by at least 40% using Machine Learning.
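To make the rule-based approach concrete, the following Python sketch flags invoices that share the same vendor name, amount, and date. The text does not describe the actual rules of the existing monitoring tools in detail, so this is only an assumed approximation, and `Invoice` and `candidate_duplicates` are hypothetical names introduced for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Invoice(NamedTuple):
    vendor: str
    number: str
    amount: float
    date: str  # ISO date string, e.g. "2020-01-31"


def candidate_duplicates(invoices):
    """Group invoices sharing vendor, amount, and date; keep groups of 2 or more."""
    groups = defaultdict(list)
    for inv in invoices:
        groups[(inv.vendor, inv.amount, inv.date)].append(inv)
    return [group for group in groups.values() if len(group) > 1]


# Made-up sample data for illustration.
invoices = [
    Invoice("Acme", "INV-100l", 250.0, "2020-01-31"),  # OCR-damaged copy
    Invoice("Acme", "INV-1001", 250.0, "2020-01-31"),  # original invoice
    Invoice("Acme", "INV-2000", 250.0, "2020-01-31"),  # separate, legitimate invoice
    Invoice("Beta", "B-7", 99.0, "2020-02-01"),
]
flagged = candidate_duplicates(invoices)
```

In this made-up sample, all three Acme invoices are flagged together, although one of them is a legitimate separate invoice. A coarse rule of this kind hands every such case to a human reviewer, which is consistent with the high false positive rate described above.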