🚀 DD - Extract Invoice Information
Prompt
You are an expert in invoice processing.
Extract the key details of this invoice using OCR and any other method available to you. Accuracy is the top priority.
Return ONLY JSON, with no additional explanation or context. Here is an example of the desired format:
{
"Invoice Number": "string",
"Shipping Method": "string",
"ShipDate": "date string yyyy-MM-dd",
"TrackingNum": "string"
}
Shipping Method must be one of "FEDEX", "DHL", or "UPS".
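A downstream check can catch replies that drift from this format. The following is a minimal sketch, assuming the model's reply has already been parsed into a Python dict; `validate_record`, `EXPECTED_KEYS`, and `ALLOWED_METHODS` are hypothetical names introduced here for illustration, not part of the prompt.

```python
from datetime import datetime
from typing import List

# Assumption: these mirror the four keys and three couriers named in the prompt.
EXPECTED_KEYS = {"Invoice Number", "Shipping Method", "ShipDate", "TrackingNum"}
ALLOWED_METHODS = {"FEDEX", "DHL", "UPS"}
DATE_FORMAT = "%Y-%m-%d"  # Python equivalent of yyyy-MM-dd


def validate_record(record: dict) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []

    # The reply should contain exactly the four expected keys.
    if set(record) != EXPECTED_KEYS:
        diff = sorted(set(record) ^ EXPECTED_KEYS)
        problems.append(f"unexpected or missing keys: {diff}")

    # The courier must be one of the three allowed values.
    method = record.get("Shipping Method")
    if not isinstance(method, str) or method not in ALLOWED_METHODS:
        problems.append(f"invalid Shipping Method: {method!r}")

    # Round-trip the date so that unpadded values such as 2024-3-5 are rejected.
    ship_date = record.get("ShipDate")
    if isinstance(ship_date, str):
        try:
            parsed = datetime.strptime(ship_date, DATE_FORMAT)
            if parsed.strftime(DATE_FORMAT) != ship_date:
                problems.append(f"ShipDate is not zero-padded: {ship_date!r}")
        except ValueError:
            problems.append(f"ShipDate is not yyyy-MM-dd: {ship_date!r}")

    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that is wrong with a single reply.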
How to identify the TrackingNum depends on the courier:
UPS: Page 1. Search for ‘Tracking #:’ and copy the adjacent alphanumeric string.
DHL: Page 1. Search for ‘Waybill’ and copy the adjacent numeric string.
FedEx: Page 1. The tracking number is usually between ‘TRK#’ and ‘Ref’ and consists of three sets of four digits separated by spaces, i.e. xxxx xxxx xxxx.
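The courier rules above can also be expressed as regular expressions for a deterministic cross-check of the model's answer. This is a rough sketch, assuming the text of page 1 is already available as a string; the patterns are guesses derived from the rules above rather than official courier formats, and `find_tracking_number` is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed patterns, one per courier, based only on the rules in this prompt.
TRACKING_PATTERNS = {
    # 'Tracking #:' followed by an alphanumeric string
    "UPS": re.compile(r"Tracking\s*#:\s*([A-Za-z0-9]+)"),
    # 'Waybill' followed by optional punctuation or spaces, then digits
    "DHL": re.compile(r"Waybill\W*(\d+)"),
    # 'TRK#' followed by three groups of four digits separated by spaces
    "FEDEX": re.compile(r"TRK#\s*(\d{4} \d{4} \d{4})"),
}


def find_tracking_number(page_text: str, courier: str) -> Optional[str]:
    """Return the tracking number for the given courier, or None if not found."""
    pattern = TRACKING_PATTERNS.get(courier.upper())
    if pattern is None:
        return None  # courier not covered by the rules above
    match = pattern.search(page_text)
    return match.group(1) if match else None
```

For example, `find_tracking_number("TRK# 1234 5678 9012 Ref: A1", "FedEx")` yields the twelve digits with their spaces preserved, matching the xxxx xxxx xxxx layout described above.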
Chain of Thought:
Step 1:
Task: Identify key invoice fields
Reasoning: Determine which fields must be located. Invoices typically carry an invoice number, dates, amounts, and vendor details; this task needs the invoice number, shipping method, ship date, and tracking number.
Step 2:
Task: Analyze document structure
Reasoning: Understanding the layout and format of the document helps in locating relevant information systematically
Step 3:
Task: Define JSON schema
Reasoning: Creating a structured schema ensures consistent data extraction and organization
Step 4:
Task: Extract text content
Reasoning: Convert the document content into machine-readable text for processing
Step 5:
Task: Pattern matching
Reasoning: Identify patterns or keywords that indicate where specific information can be found
Step 6:
Task: Data validation
Reasoning: Verify extracted data matches expected formats and types (dates, numbers, text)
Step 7:
Task: Format output
Reasoning: Transform extracted data into the defined JSON structure while maintaining data integrity
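Steps 6 and 7 can be sketched in code as follows. This is only an illustration, assuming the four values have already been extracted as raw strings; the list of accepted input date layouts is an assumption (ambiguous numeric layouts such as 03/04/2024 are deliberately left out), and `normalize_date` and `format_output` are hypothetical names.

```python
import json
from datetime import datetime

# Hypothetical input layouts; extend to match the invoices actually processed.
# Month names are parsed using the current locale (English by default).
ASSUMED_DATE_FORMATS = ("%Y-%m-%d", "%d %b %Y", "%B %d, %Y")


def normalize_date(raw: str) -> str:
    """Convert a raw date string to yyyy-MM-dd, or raise ValueError."""
    text = raw.strip()
    for fmt in ASSUMED_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next assumed layout
    raise ValueError(f"unrecognized date: {raw!r}")


def format_output(invoice_number: str, shipping_method: str,
                  ship_date: str, tracking_num: str) -> str:
    """Build the JSON string in the key order given in the prompt."""
    record = {
        "Invoice Number": invoice_number.strip(),
        "Shipping Method": shipping_method.strip().upper(),
        "ShipDate": normalize_date(ship_date),
        "TrackingNum": tracking_num.strip(),
    }
    return json.dumps(record, indent=2)
```

Serializing with `json.dumps` instead of building the string by hand avoids malformed output such as trailing commas or unescaped quotes.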