# Bulk Import
Universal bulk data import system for master data, transactions, and migrations with automatic sync/async routing
## Overview
Artifi's universal import system provides a single, consistent way to import data in bulk across all object types, from master data such as customers and vendors to financial transactions. The system automatically routes small and large imports differently, with real-time progress tracking for larger jobs.
## Key Features
- Single consistent interface for all object types
- Automatic sync/async routing based on record count
- Background processing with real-time progress tracking
- Configurable duplicate handling (skip, update, or error)
- Validate-only mode for dry runs before committing
- Preflight checks to discover required fields before importing
- External ID resolution for seamless data migration from other systems
- Detailed error reporting per record with recovery guidance
## Sync vs. Async Processing
The system automatically routes imports based on the number of records:
| Records | Mode | Behavior |
|---|---|---|
| 1-50 | Synchronous | Full result returned immediately with counts, errors, and created IDs |
| 51+ | Asynchronous | Returns an import ID immediately; processes in the background |
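The routing rule is a simple threshold on record count. A minimal sketch (the 50-record cutoff comes from the table above; the function name is illustrative, not part of the actual API):

```python
SYNC_LIMIT = 50  # imports of up to 50 records run synchronously

def choose_mode(record_count: int) -> str:
    """Route an import based on record count (illustrative sketch)."""
    if record_count <= 0:
        raise ValueError("import must contain at least one record")
    return "sync" if record_count <= SYNC_LIMIT else "async"
```

A 50-record file still returns a full result inline; the 51st record tips the whole job into background processing.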
### Why Async?
Each record can involve dozens of database operations (validation, entity resolution, GL posting, tax calculation). For large batches, synchronous processing would be too slow. The async mode processes records in batches and provides real-time progress updates.
### Progress Tracking
For async imports, you can poll the import status at any time to see:
- Current progress percentage
- Records processed so far
- Import, skip, and error counts
- Detailed error information on completion
Only one large import per organization runs at a time. Small imports (50 records or fewer) are unaffected by this limit.
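A client-side polling loop might look like the sketch below. `get_import_status` stands in for whatever status endpoint your client exposes; here it is stubbed with canned responses so the sketch is self-contained:

```python
import time

# Stubbed status responses; a real client would call the import-status API.
_statuses = iter([
    {"status": "importing", "progress": 40, "processed": 80, "errors": 0},
    {"status": "importing", "progress": 85, "processed": 170, "errors": 1},
    {"status": "completed", "progress": 100, "processed": 200, "errors": 1},
])

def get_import_status(import_id: str) -> dict:
    """Placeholder for the real status lookup."""
    return next(_statuses)

def wait_for_import(import_id: str, poll_seconds: float = 0.0) -> dict:
    """Poll until the import reaches a terminal state."""
    while True:
        status = get_import_status(import_id)
        print(f"{status['progress']}% ({status['processed']} processed, "
              f"{status['errors']} errors)")
        if status["status"] in ("completed", "failed"):
            return status
        time.sleep(poll_seconds)

result = wait_for_import("imp_123")
```

In practice you would use a poll interval of a few seconds; progress only advances after each batch completes, so tighter polling gains nothing.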
## Supported Object Types
### Master Data
| Object Type | Duplicate Detection | Notes |
|---|---|---|
| Customer | Tax ID, then name | Supports billing/shipping addresses |
| Vendor | Tax ID, then name | Supports billing/remit-to addresses |
| Employee | Work email | Auto-generates global employee ID |
| Account | Account number | Supports parent account hierarchy |
| Item | Item number | Revenue/expense/asset/COGS account linking |
| Dimension Type | Name | Hierarchy support |
| Dimension Value | Type + code | Parent code for hierarchies |
| Fixed Asset | Asset number + entity | Category resolution, depreciation config |
### Transactions
| Object Type | Duplicate Detection | Notes |
|---|---|---|
| Transaction (all types) | Type + reference + party | Handles 30+ transaction types through a single importer |
Supported transaction types include AP invoices, AR invoices, journal entries, payments, expense reports, bank transfers, and more.
### Banking & Opening Balances
| Object Type | Notes |
|---|---|
| Card Transaction | From payment processors or CSV |
| Opening Balance | For migration cutover |
## Import Options
| Option | Default | Description |
|---|---|---|
| Duplicate handling | Skip | Skip ignores duplicates, Update overwrites existing records, Error treats duplicates as failures |
| Batch size | 100 | Records per batch; each batch is a transaction boundary |
| Stop on error | No | When enabled, the entire import stops on the first error |
| Validate only | No | Validates all records without persisting (dry run) |
| Defaults | None | Default values merged into every record (e.g., currency, location) |
Notes:
- Duplicate update is not supported for financial transactions (immutable records)
- If a batch fails, only that batch rolls back; other batches are unaffected
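The defaults option and the three duplicate-handling modes can be illustrated with a small sketch (function names are illustrative, not the actual API):

```python
def merge_defaults(record: dict, defaults: dict) -> dict:
    """Defaults fill in missing fields; values explicit in the record win."""
    return {**defaults, **record}

def apply_duplicate_policy(record: dict, existing, policy: str) -> str:
    """Return the action taken for one record under a duplicate policy."""
    if existing is None:
        return "import"            # no duplicate found: import normally
    if policy == "skip":
        return "skip"              # ignore the incoming duplicate
    if policy == "update":
        existing.update(record)    # overwrite the existing record's fields
        return "update"
    if policy == "error":
        raise ValueError("duplicate record")
    raise ValueError(f"unknown policy: {policy}")
```

Note that `update` is unavailable for financial transactions, which are immutable once posted.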
## Preflight Checks
Before importing, you can run a preflight check to discover the required and optional fields for a specific object type or transaction type. Required fields are configured per organization, so they may vary.
A preflight check returns:
- Required header fields with types
- Required line fields (for transactions)
- Optional header and line fields
- Optionally, validation results for a sample record
This helps ensure your import file is properly structured before processing.
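A client can validate its own file against the preflight response before submitting. A minimal sketch, assuming the preflight result is a dict of field-name lists (the key names here are illustrative):

```python
def check_against_preflight(record: dict, preflight: dict) -> list:
    """Return the required fields missing from a record, header and lines."""
    missing = [f for f in preflight["required_header_fields"] if f not in record]
    for i, line in enumerate(record.get("lines", [])):
        missing += [f"lines[{i}].{f}"
                    for f in preflight.get("required_line_fields", [])
                    if f not in line]
    return missing
```

An empty result means the record carries every required field; it does not guarantee the values themselves will pass validation, which is what the optional sample-record check is for.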
## Entity Resolution
The import system flexibly resolves references to entities using multiple identifier types. This is particularly valuable during data migration when the source system uses different ID schemes.
Resolution priority (checked in order):
| Entity Type | Resolution Order |
|---|---|
| Vendor | Internal ID, global ID, external ID, tax ID, name |
| Customer | Internal ID, global ID, external ID, name |
| Employee | Internal ID, external ID, name |
| Account | Account number, external ID |
| Item | Internal ID, item number, external ID |
| Project | Internal ID, external ID, name |
Name-based lookups are case-insensitive. All reference data is pre-cached at import start for fast lookups during batch processing.
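Resolution against the pre-built caches amounts to a first-match-wins walk over the identifier order. A sketch for vendors, using the priority from the table above (cache shape and field names are illustrative):

```python
# Priority order from the resolution table: first match wins.
VENDOR_LOOKUP_ORDER = ["internal_id", "global_id", "external_id", "tax_id", "name"]

def resolve_vendor(ref: dict, caches: dict):
    """Return the internal ID for the first identifier that hits a cache."""
    for field in VENDOR_LOOKUP_ORDER:
        value = ref.get(field)
        if value is None:
            continue
        key = value.lower() if field == "name" else value  # names are case-insensitive
        match = caches[field].get(key)
        if match is not None:
            return match
    return None  # unresolved: surfaces as a per-record error
```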
## Data Migration
When migrating from external systems (QuickBooks, Xero, or legacy software), the import system supports a phased approach:
### Phase 1: Reference Data (Small Volume)
Import foundational data first:
- Chart of accounts
- Dimension types and values
- Payment terms, tax codes, number series
### Phase 2: Master Data (Medium Volume)
Import entities with external IDs for cross-referencing:
- Customers, vendors, employees
- Items and products
When importing master data, include the external system's ID in the metadata. This external ID is then used to resolve references when importing transactions.
### Phase 3: Transactions (High Volume, Async)
Import financial transactions that reference the master data imported in earlier phases:
- AP and AR invoices
- Payments
- Journal entries
- Bank statements
Transactions can reference entities by external ID, so you do not need to know the new internal IDs.
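As an illustration of the cross-referencing, a phase-2 vendor record carries the legacy system's ID in its metadata, and a phase-3 invoice then references the vendor by that same external ID (the field names are hypothetical):

```python
# Phase 2: the master record carries the legacy system's ID.
vendor = {
    "name": "Acme Supplies",
    "metadata": {"external_id": "QB-VEND-0042"},  # ID from the source system
}

# Phase 3: the transaction references the vendor by external ID,
# so the new internal ID never has to appear in the import file.
ap_invoice = {
    "type": "ap_invoice",
    "vendor": {"external_id": "QB-VEND-0042"},
    "reference": "INV-1001",
    "amount": "1250.00",
}
```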
## Error Handling
### Per-Record Errors
Each record is validated individually, and errors include:
- Row number for easy cross-reference
- Field name causing the error
- Descriptive error message
- Record identifier (name, number, etc.)
### Recovery from Partial Failures
- Review the error details in the import results
- Fix the problematic records in your source data
- Re-submit with duplicate handling set to "skip" to avoid re-importing successful records
- Successfully imported records are committed independently and are not rolled back
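Because each error reports its row number, you can also narrow the re-submission to just the failed rows instead of relying on "skip" to filter out the successes. A small sketch (error shape follows the per-record fields listed above):

```python
def rows_to_retry(source_rows: list, errors: list) -> list:
    """Pick out only the rows that failed, using the reported row numbers."""
    failed = {e["row"] for e in errors}  # rows are numbered from 1
    return [row for i, row in enumerate(source_rows, start=1) if i in failed]
```

Either approach is safe: re-importing the full file with "skip" leaves the committed records untouched.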
## Performance Expectations
| Volume | Mode | Expected Time |
|---|---|---|
| 10 records | Sync | ~5 seconds |
| 50 records | Sync | ~25 seconds |
| 200 records | Async | ~90 seconds |
| 500 records | Async | ~4 minutes |
| 5,000 records | Async | ~50 minutes |
Dimension caching reduces repeated lookups, improving performance by approximately 30% for large imports.
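The dimension cache is essentially memoization of repeated lookups. A minimal sketch with `functools.lru_cache`, using a counter to stand in for database round-trips:

```python
from functools import lru_cache

lookup_calls = 0  # counts simulated database round-trips

@lru_cache(maxsize=None)
def resolve_dimension(type_name: str, code: str) -> str:
    """Stand-in for a database lookup; cached so repeats are free."""
    global lookup_calls
    lookup_calls += 1
    return f"{type_name}:{code}"

# 1,000 records that reuse the same 3 dimension values
# hit the "database" only 3 times in total.
for _ in range(1000):
    for code in ("US", "EU", "APAC"):
        resolve_dimension("Region", code)
```

Real imports tend to reuse a small set of dimension values across many records, which is why the cache pays off on large jobs.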
## Import Tracking
Every import is tracked with a full audit trail:
- Import ID, object type, and source
- Status progression: pending, importing, completed, or failed
- Record counts: total, imported, updated, skipped, errors
- Error and warning details (first 100 of each)
- IDs of created and updated records
- Duration and timestamps
- User who initiated the import
For async imports, progress is updated after each batch, enabling real-time monitoring.
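One plausible encoding of the status progression is a small transition table that rejects illegal jumps (the allowed transitions here are an assumption based on the states listed above):

```python
# Assumed legal status transitions for an import job.
TRANSITIONS = {
    "pending": {"importing", "failed"},
    "importing": {"completed", "failed"},
}

def advance(current: str, new: str) -> str:
    """Move an import to a new status, rejecting illegal jumps."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```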