Latest revision as of 03:29, 10 November 2025

Overview

The AI-Powered PDF Invoice Import System enables automated extraction and processing of supplier invoices from PDF documents using artificial intelligence. The system uses vision AI models to read invoice PDFs and automatically create AP (Accounts Payable) invoices in the accounting system, reducing manual data entry to a review of the created invoices.

System Capabilities

  • Multi-file processing: Select and process multiple PDF invoices in a single operation
  • AI vision processing: Leverages advanced AI models to read and interpret invoice documents
  • Flexible extraction: Adapts to different invoice formats and layouts automatically
  • OCR accuracy: Intelligent character recognition with disambiguation of similar characters
  • Mathematical validation: Verifies totals and GST calculations for data integrity
  • Automated invoice creation: Generates complete AP invoices with all line items
  • Batch support: Optional batch processing for grouping related supplier invoices

AI Processing Methods

The system supports two processing approaches:

Vision-Based Processing (Recommended)

Processes the PDF directly using AI vision models:

  • Analyzes the visual layout and structure of the invoice
  • Handles complex formatting, tables, and multi-column layouts
  • Works with scanned documents and image-based PDFs
  • Processes multi-page invoices as a complete document
  • More accurate for invoices with complex layouts

Text Extraction Processing

Extracts text from the PDF, then processes it with AI:

  • Uses GemBox libraries to extract structured text
  • Suitable for text-based PDFs with simple layouts
  • Faster processing for straightforward invoices
  • May struggle with complex table layouts or scanned documents

Supported AI Providers

Anthropic (Claude)

  • Default and recommended provider
  • Excellent document understanding capabilities
  • Strong structured data extraction
  • Handles complex invoice layouts

OpenAI (GPT Vision)

  • Alternative vision processing option
  • Compatible with GPT-4 Vision models
  • Good for standard invoice formats

Configuration Parameters

The system requires the following configuration:

  • AIVisionEnabled: Enable/disable vision processing (true/false)
  • AIModel: Model identifier string
  • AIModelId: Specific model version (e.g., "claude-sonnet-4-20250514")
  • AIApiKey: Encrypted API key for AI service authentication
  • FileName: Path to PDF file (when processing single files)
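
As a rough illustration, the parameters above could be gathered into one settings object and checked before any invoice is processed. The class, its `problems` method, and the value given to `AIModel` are assumptions made for this sketch, not the system's actual API; only the parameter names and the example model version come from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImportSettings:
    """Hypothetical settings container; field names mirror the parameters above."""
    AIVisionEnabled: bool
    AIModel: str
    AIModelId: str
    AIApiKey: str                    # encrypted value; avoid logging it
    FileName: Optional[str] = None   # only needed when processing a single file

    def problems(self) -> List[str]:
        """Return a list of configuration problems (empty when none are found)."""
        found = []
        for name in ("AIModel", "AIModelId", "AIApiKey"):
            if not getattr(self, name):
                found.append(f"{name} is empty")
        if self.FileName is not None and not self.FileName.lower().endswith(".pdf"):
            found.append("FileName is not a PDF")
        return found


settings = ImportSettings(
    AIVisionEnabled=True,
    AIModel="Anthropic",                   # assumed value for the provider
    AIModelId="claude-sonnet-4-20250514",  # example model version from the list above
    AIApiKey="<encrypted key>",            # placeholder, not a real key
)
print(settings.problems())
```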

OCR and Character Recognition

The AI system includes intelligent character disambiguation:

Common Character Confusions

The system is instructed to carefully distinguish between:

  • Z vs 2: Z has a diagonal line, 2 has curves
  • O vs 0: O is round, 0 may have a slash or be more oval
  • I vs 1 vs l: Context-aware recognition (numbers vs letters)
  • S vs 5: Shape and context analysis
  • G vs 6: Character form recognition

Validation Checks

  • Mathematical verification of line totals
  • GST calculation validation (typically 15% in NZ)
  • Cross-checking of subtotals and final amounts
  • Multi-page continuity verification
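
The arithmetic behind these checks can be sketched in a few lines. The function name, the argument layout, and the 0.02 rounding tolerance are assumptions for illustration; the 15% rate is the NZ figure mentioned above.

```python
from decimal import Decimal

GST_RATE = Decimal("0.15")    # NZ GST rate noted above
TOLERANCE = Decimal("0.02")   # assumed allowance for rounding differences


def check_totals(line_totals, subtotal, gst, total):
    """Return a list of inconsistencies found in the extracted amounts."""
    issues = []
    # Line totals should add up to the GST-exclusive subtotal.
    if abs(sum(line_totals, Decimal("0")) - subtotal) > TOLERANCE:
        issues.append("line items do not sum to subtotal")
    # GST should be 15% of the subtotal.
    if abs(subtotal * GST_RATE - gst) > TOLERANCE:
        issues.append("GST is not 15% of subtotal")
    # The final amount should be subtotal plus GST.
    if abs(subtotal + gst - total) > TOLERANCE:
        issues.append("total is not subtotal + GST")
    return issues


lines = [Decimal("100.00"), Decimal("50.00")]
print(check_totals(lines, Decimal("150.00"), Decimal("22.50"), Decimal("172.50")))
```

An empty list means every check passed; any message in the list is a candidate for manual review before posting.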

Data Extraction Process

The system can extract various fields depending on the invoice format:

Standard Fields

  • Invoice Number
  • Invoice Date
  • Company/Supplier Name
  • Client/Customer Name
  • Account Numbers
  • Order Numbers
  • Reference Numbers

Financial Totals

  • Line item amounts
  • Subtotal (excluding GST)
  • GST/Tax amount
  • Total amount (including GST)

Line Item Details

  • Product codes or descriptions
  • Quantities
  • Unit prices
  • Extended amounts
  • Date information (for time-based services)
  • Reference codes or job numbers

Custom Fields

The AI prompt can be customized to extract additional fields specific to your supplier's invoice format:

  • Vehicle registration numbers
  • Serial numbers
  • Odometer readings
  • Job descriptions
  • Barcode data
  • Custom reference fields

Invoice Creation Configuration

Transaction Settings

  • Transaction Type: APINV (AP Invoice)
  • Transaction Type Code: 20 (default for AP invoices)
  • Created Date: Defaults to current date/time
  • Created User: Configurable (default: "Admin")

Processing Dates

  • Invoice Date: Extracted from PDF
  • Payment Date: Calculated (typically 20th of following month)
  • Process Date: Defaults to current date
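
The "20th of following month" rule for the payment date can be written as a small helper. This is a minimal sketch; the function name and the configurable `day` argument are assumptions, and a site with different payment terms would change the rule.

```python
from datetime import date


def payment_date(invoice_date: date, day: int = 20) -> date:
    """Return the given day of the month that follows invoice_date."""
    if invoice_date.month == 12:
        # December invoices roll over into January of the next year.
        return date(invoice_date.year + 1, 1, day)
    return date(invoice_date.year, invoice_date.month + 1, day)


print(payment_date(date(2025, 11, 10)))  # 2025-12-20
print(payment_date(date(2025, 12, 31)))  # 2026-01-20
```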

Account Assignments

  • Supplier ID (OtherPartyId): Must be configured in script
  • Location ID: Configurable (default: "Misc")
  • Order Number: Extracted from invoice or use reference
  • Reference: Invoice number from PDF

Line Item Configuration

  • Item ID: Default item code for imported lines (must be configured)
  • Description: Extracted from PDF line items
  • Department: Optional, can be mapped from invoice fields
  • Quantity: Extracted from line items
  • Unit Price: Calculated or extracted
  • GST Treatment: Configurable (GST inclusive or exclusive)

GST Calculation Methods

The system supports different GST calculation approaches:

GST Inclusive Amounts

When invoice amounts include GST:

Total With GST = Line Amount (as shown on invoice)
GST Exc Total = Total With GST ÷ 1.15
GST Amount = Total With GST - GST Exc Total
Unit Price = GST Exc Total ÷ Quantity

GST Exclusive Amounts

When invoice amounts exclude GST:

GST Exc Total = Line Amount (as shown on invoice)
Total With GST = GST Exc Total × 1.15
GST Amount = Total With GST - GST Exc Total
Unit Price = GST Exc Total ÷ Quantity
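
Both sets of formulas can be expressed as one function that branches on the GST treatment. The sketch below assumes the 15% rate used in the formulas and rounds to cents with half-up rounding; the rounding rule and the function name are assumptions, not confirmed behaviour of the system.

```python
from decimal import Decimal, ROUND_HALF_UP

GST_FACTOR = Decimal("1.15")   # 1 + 15% GST, as in the formulas above
CENT = Decimal("0.01")


def gst_breakdown(line_amount: Decimal, gst_inclusive: bool):
    """Return (gst_exc_total, gst_amount, total_with_gst), rounded to cents."""
    if gst_inclusive:
        total_with_gst = line_amount.quantize(CENT, rounding=ROUND_HALF_UP)
        gst_exc_total = (line_amount / GST_FACTOR).quantize(CENT, rounding=ROUND_HALF_UP)
    else:
        gst_exc_total = line_amount.quantize(CENT, rounding=ROUND_HALF_UP)
        total_with_gst = (line_amount * GST_FACTOR).quantize(CENT, rounding=ROUND_HALF_UP)
    # GST Amount = Total With GST - GST Exc Total in both cases.
    gst_amount = total_with_gst - gst_exc_total
    return gst_exc_total, gst_amount, total_with_gst


print(gst_breakdown(Decimal("115.00"), gst_inclusive=True))
print(gst_breakdown(Decimal("200.00"), gst_inclusive=False))
```

Using `Decimal` rather than floating point keeps the cent values exact, which matters when the results are later compared against invoice totals.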

Multi-Page Invoice Handling

The system is designed to handle multi-page invoices:

  • Processes all pages of the PDF document
  • Extracts line items spanning multiple pages
  • Maintains line item sequence
  • Validates totals across entire document
  • Provides page count in extraction results

Batch Processing

Batch Creation

Optional batch grouping for related invoices:

  • Batch ID: Defaults to supplier ID or can be specified
  • Batch Comment: Descriptive text for the batch
  • Automatic Batching: Groups invoices by supplier
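
Grouping by supplier, with the batch ID defaulting to the supplier ID, might look like the following. The dictionary keys (`supplier_id`, `invoice_number`, `batch_id`) are hypothetical names chosen for this sketch.

```python
from collections import defaultdict


def group_into_batches(invoices):
    """Group invoice numbers by batch ID; the batch ID defaults to the supplier ID."""
    batches = defaultdict(list)
    for inv in invoices:
        # An explicitly specified batch ID wins over the supplier default.
        batch_id = inv.get("batch_id") or inv["supplier_id"]
        batches[batch_id].append(inv["invoice_number"])
    return dict(batches)


invoices = [
    {"supplier_id": "S1", "invoice_number": "A1"},
    {"supplier_id": "S2", "invoice_number": "B1"},
    {"supplier_id": "S1", "invoice_number": "A2"},
    {"supplier_id": "S2", "invoice_number": "B2", "batch_id": "MONTH-END"},
]
print(group_into_batches(invoices))
```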

Batch Benefits

  • Groups related invoices for review
  • Simplifies posting process
  • Maintains audit trail
  • Enables bulk approval workflows

Error Handling

The system detects and reports errors in the following categories:

File Selection Errors

  • No file selected
  • Invalid file type (non-PDF)
  • File not found or inaccessible
  • Corrupted PDF files
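
A pre-flight check covering these four cases could be sketched as below. The function name and messages are illustrative assumptions. The corruption test is deliberately shallow: it only confirms that the file begins with the `%PDF-` header, so a file can pass this check and still fail later in processing.

```python
from pathlib import Path


def check_pdf(path_text):
    """Return an error message for an unusable file, or None if it looks valid."""
    if not path_text:
        return "No file selected"
    path = Path(path_text)
    if path.suffix.lower() != ".pdf":
        return "Invalid file type (non-PDF)"
    if not path.is_file():
        return "File not found or inaccessible"
    try:
        with path.open("rb") as handle:
            header = handle.read(5)
    except OSError:
        return "File not found or inaccessible"
    if header != b"%PDF-":
        return "Corrupted PDF file"
    return None


print(check_pdf(""))
print(check_pdf("invoice.txt"))
```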

AI Processing Errors

  • AI service connection failures
  • Invalid or empty AI responses
  • JSON parsing errors
  • Incomplete data extraction

Data Validation Errors

  • Missing required fields
  • Invalid date formats
  • Mathematical inconsistencies
  • Zero or negative amounts
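
The field-level checks in this category (the mathematical checks are a separate concern) can be illustrated with a small validator. The field names, the required-field list, and the `yyyy-MM-dd` date format are assumptions for this sketch; the real schema depends on the extraction prompt in use.

```python
from datetime import datetime

REQUIRED = ("invoice_number", "invoice_date", "total")  # assumed field names


def validate_invoice(data):
    """Return validation error messages for one extracted invoice (empty if valid)."""
    errors = []
    for name in REQUIRED:
        if data.get(name) in (None, ""):
            errors.append(f"Missing required field: {name}")
    raw_date = data.get("invoice_date")
    if raw_date:
        try:
            datetime.strptime(raw_date, "%Y-%m-%d")  # yyyy-MM-dd
        except ValueError:
            errors.append("Invalid date format")
    total = data.get("total")
    if isinstance(total, (int, float)) and total <= 0:
        errors.append("Zero or negative amount")
    return errors


print(validate_invoice({"invoice_number": "A1", "invoice_date": "2025-11-10", "total": 115.0}))
print(validate_invoice({"invoice_number": "", "invoice_date": "10/11/2025", "total": 0}))
```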

Invoice Creation Errors

  • Invalid supplier ID
  • Missing item master data
  • Missing department codes
  • Transaction validation failures
  • Database constraint violations

Customizing the AI Prompt

The extraction prompt can be customized for specific invoice formats:

Prompt Structure

1. Role definition (extraction agent)
2. Task description (extract invoice data)
3. OCR guidance (character disambiguation)
4. Field list (specific fields to extract)
5. JSON schema (output format)
6. Validation rules (totals, dates, etc.)
7. Output constraints (no extra text, no markdown)
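
One way to keep a customized prompt maintainable is to hold the seven parts as separate sections and join them when the request is built. The section wording and field names below are placeholders invented for this sketch, not the prompt shipped with the system.

```python
# Hypothetical section texts; replace each body with supplier-specific wording.
PROMPT_SECTIONS = [
    ("Role", "You are an invoice data extraction agent."),
    ("Task", "Extract the invoice data from the attached PDF."),
    ("OCR guidance", "Distinguish carefully between Z/2, O/0, I/1/l, S/5 and G/6."),
    ("Fields", "invoice_number, invoice_date, subtotal, gst, total, line_items"),
    ("JSON schema", '{"invoice_number": "string", "invoice_date": "yyyy-MM-dd"}'),
    ("Validation rules", "Line items must sum to the subtotal; GST is 15% of the subtotal."),
    ("Output constraints", "Return only the JSON object, with no extra text and no markdown."),
]


def build_prompt(sections=PROMPT_SECTIONS):
    """Join the titled sections into a single prompt string."""
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


print(build_prompt())
```

Keeping the sections separate makes it easier to document which part was changed for a given supplier.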

Customization Examples

Adding Custom Fields:

Add fields to the JSON schema in the prompt to extract additional data specific to your invoices.

Changing Date Formats:

Specify date format requirements in the prompt (e.g., "yyyy-MM-dd", "dd/MM/yyyy").

Field Name Variations:

Provide alternative field names the AI should recognize (e.g., "Cust. Order No", "Customer Order", "Order Ref").

Calculation Rules:

Specify how amounts should be calculated or validated.
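
Field name variations can also be handled after extraction, as a safety net for labels the prompt did not anticipate. The sketch below maps the example aliases above onto a single key; the target name `order_number` and the helper functions are assumptions for illustration.

```python
ORDER_NUMBER_ALIASES = ("Cust. Order No", "Customer Order", "Order Ref")


def _key(label):
    """Lower-case a label and drop punctuation and spaces for loose matching."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


# Hypothetical canonical name for all order-number variants.
ALIAS_TO_FIELD = {_key(alias): "order_number" for alias in ORDER_NUMBER_ALIASES}


def normalise_fields(extracted):
    """Rename known alias labels to their canonical field name; keep others as-is."""
    return {ALIAS_TO_FIELD.get(_key(label), label): value
            for label, value in extracted.items()}


print(normalise_fields({"Cust. Order No": "PO-17", "Invoice Number": "A1"}))
```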

Implementation Workflow

Basic Implementation Steps

  1. Configure AI provider and API credentials
  2. Set default supplier ID and item codes
  3. Customize extraction prompt for invoice format
  4. Configure GST calculation method
  5. Set location and account defaults
  6. Test with sample invoices
  7. Review and validate created invoices
  8. Adjust prompt and settings as needed

Testing Recommendations

  • Start with clear, simple invoices
  • Verify mathematical accuracy of extractions
  • Check department and item code assignments
  • Validate date parsing and calculations
  • Test multi-page invoice handling
  • Review batch creation behavior

Best Practices

Invoice Preparation

  • Use clear, readable PDF scans
  • Ensure full pages are captured
  • Avoid skewed or rotated scans
  • Check PDF file integrity before processing
  • Process similar invoice types together

Configuration

  • Set appropriate default values for all parameters
  • Use descriptive batch comments
  • Configure supplier-specific item codes
  • Validate master data prerequisites
  • Document custom prompt modifications

Data Quality

  • Review AI-extracted data before posting
  • Verify mathematical calculations
  • Check supplier ID assignments
  • Validate department code mapping
  • Confirm date calculations

Performance

  • Process invoices in reasonable batch sizes
  • Monitor AI service response times
  • Handle errors gracefully with clear messages
  • Log processing results for audit trails

Advanced Features

JSON Response Extraction

The system includes a helper method to extract clean JSON from AI responses:

  • Navigates AI response structure
  • Extracts text content from nested JSON
  • Handles various response formats
  • Provides error handling for malformed responses
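
The system's helper itself is not shown here, so the following is only a sketch of the general technique. It assumes the raw response is a JSON envelope shaped either like an Anthropic Messages response (a `content` list of text blocks) or like an OpenAI chat completion (`choices[0].message.content`), and it tolerates markdown fences around the invoice JSON by slicing from the first `{` to the last `}`.

```python
import json


def extract_invoice_json(raw_response: str) -> dict:
    """Pull the invoice JSON object out of a provider response envelope."""
    envelope = json.loads(raw_response)
    if "content" in envelope:        # Anthropic-style envelope
        text = "".join(part.get("text", "")
                       for part in envelope["content"]
                       if part.get("type") == "text")
    elif "choices" in envelope:      # OpenAI-style envelope
        text = envelope["choices"][0]["message"]["content"]
    else:
        raise ValueError("Unrecognised AI response structure")
    # Ignore any surrounding prose or code fences.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in AI response text")
    return json.loads(text[start:end + 1])


raw = json.dumps({"content": [{"type": "text",
                               "text": "```json\n{\"invoice_number\": \"A1\"}\n```"}]})
print(extract_invoice_json(raw))
```

A `json.JSONDecodeError` (a subclass of `ValueError`) from either `json.loads` call signals a malformed response, so callers can catch `ValueError` for all failure cases.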

Dynamic Field Mapping

The system can map extracted fields to invoice line items:

  • Product codes to item IDs
  • Reference numbers to departments
  • Custom fields to standard accounting fields
  • Date parsing and conversion
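
A mapping step of this kind might use dictionary lookups against master data that was loaded once up front. Everything named below (the keys of the extracted line, the lookup tables, the default item code) is hypothetical and only meant to show the shape of the mapping.

```python
def map_line(line, item_ids, departments, default_item_id):
    """Map one extracted line item onto invoice-row fields."""
    code = line.get("product_code")
    reference = line.get("reference")
    return {
        # Fall back to the configured default item when the code is unknown.
        "item_id": item_ids.get(code, default_item_id),
        # Only assign a department that exists in the master data.
        "department": reference if reference in departments else None,
        "description": line.get("description", ""),
    }


item_ids = {"FLT-100": "ITEM-0042"}   # product code -> item ID
departments = {"SN12345"}             # known department codes
row = map_line({"product_code": "FLT-100", "reference": "SN12345",
                "description": "Oil filter"}, item_ids, departments, "MISC")
print(row)
```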

Calculation Flexibility

Supports various calculation scenarios:

  • Zero-quantity items (single services)
  • Division by zero protection
  • Rounding rules for currency
  • Tax-inclusive vs tax-exclusive amounts
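
The zero-quantity and division-by-zero cases can be covered by guarding the unit price calculation. In this sketch a zero quantity is treated as a single service, so the unit price equals the line total; that interpretation, the half-up rounding, and the function name are assumptions rather than documented behaviour.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def unit_price(gst_exc_total: Decimal, quantity: Decimal) -> Decimal:
    """Return the GST-exclusive unit price, guarding against a zero quantity."""
    if quantity == 0:
        # Assumed rule: a zero quantity means one service charged as a lump sum.
        return gst_exc_total.quantize(CENT, rounding=ROUND_HALF_UP)
    return (gst_exc_total / quantity).quantize(CENT, rounding=ROUND_HALF_UP)


print(unit_price(Decimal("100.00"), Decimal("3")))   # 33.33
print(unit_price(Decimal("100.00"), Decimal("0")))   # 100.00
```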

Integration Points

The PDF Import System integrates with:

  • Item Management: Item code lookup and validation
  • Department Management: Department code resolution
  • Supplier Management: Supplier/vendor record validation
  • Transaction Processing: Invoice creation and persistence
  • Batch Management: Batch header creation and tracking
  • AI Vision Services: External API for document analysis

Security Considerations

  • API keys are encrypted using 128-bit encryption
  • File access restricted to allowed paths
  • User authentication tracked for created invoices
  • Audit trail maintained for all imports

Troubleshooting

Common Issues

Problem: AI extracts incorrect invoice numbers
Solution: Add specific field location hints to the prompt and emphasize OCR character disambiguation

Problem: Missing line items from multi-page invoices
Solution: Ensure the prompt explicitly mentions checking all pages, and verify the PDF page count

Problem: GST calculations don't match
Solution: Verify that the GST inclusive/exclusive setting matches the invoice format

Problem: Department codes not assigned
Solution: Check that the department master data exists and verify the field mapping in the script

Problem: JSON deserialization errors
Solution: Check the AI response format, verify date format compatibility, and review the JSON schema

Future Enhancements

Potential improvements for consideration:

  • Automatic supplier detection and matching
  • Learning from correction patterns
  • Support for additional currencies
  • Purchase order matching (three-way matching)
  • Email-based invoice submission
  • Duplicate invoice detection
  • Confidence scoring for extracted data
  • Interactive review and correction interface
  • Export of extraction results for verification
  • Batch progress tracking and reporting