
One of the key steps in our client’s data preprocessing pipeline was mapping payment terms and payment methods from various documents into standardized categories. This standardization enabled downstream systems to efficiently recognize the data and perform further analysis.
Given the client’s time constraints and urgency to deliver results quickly, we divided the project into two stages. The first stage provided an immediate intermediary solution, while the second focused on maximizing accuracy and reducing latency.
We created a standalone application integrated into the client’s cloud architecture that utilized an LLM as the primary engine for data categorization. For each financial term sent to our solution, the system responded with the proper mapping as requested by the client.
Leveraging the client’s labeled data, we repurposed Google’s BERT model to create a more accurate and efficient solution tailored to their specific use case.
During backtests, the final solution achieved approximately 95% accuracy across all payment term and payment method categories. The data classification process delivered near real-time responses with only a couple of seconds delay between input and output.
While the system didn’t require real-time interaction, the client was positively surprised that we increased solution performance while keeping processing time minimal. Additionally, the solution’s infrastructure deployed on existing computing resources, generating no additional maintenance costs for the client.