In web development, one of the most frequent requirements is a feature that lets users upload a spreadsheet to populate a database. Manual tools work for admins, but a production application needs an automated, secure, and performant pipeline. This guide covers three PHP approaches in order of speed — and the security rules that apply to all of them.
| Metric | pg_copy_from() | pg_put_line() stream | Prepared INSERT loop |
|---|---|---|---|
| Speed | Fastest | Fast | Moderate |
| Memory usage | High — full file in RAM | Low — line-by-line | Low — line-by-line |
| Row validation | Difficult | Difficult | Easy |
| Best for | System migrations | Multi-GB files | User uploads / web forms |
| Crash safety | Wrap in transaction | Wrap in transaction | Built-in per BEGIN/COMMIT |
Method 1: pg_copy_from() — Maximum Bulk Speed
PHP's PostgreSQL extension provides pg_copy_from(), which wraps the high-speed SQL COPY command. It is the fastest path for loading large CSVs — ideal for migrations and system log imports where validation happens before the upload. For a deep-dive on what makes COPY so much faster than INSERT, see the COPY vs. INSERT comparison.
<?php
$conn = pg_connect("host=localhost dbname=mydb user=myuser password=mypass");
// file() reads the entire CSV into an array of row strings
$rows = file('data.csv');
$result = pg_copy_from($conn, 'target_table', $rows);
if ($result) {
echo "Bulk import successful.";
} else {
echo pg_last_error($conn);
}
?>Memory ceiling
file() loads the entire CSV into a PHP array — a 500 MB file can easily hit PHP's memory_limit. For files over ~100 MB, use the streaming approach (Method 3) instead.
Method 2: Prepared INSERT Loop — Row-Level Validation
When you need to validate each row before it hits the database — checking email format, verifying a product code exists, or skipping malformed records — a prepared statement loop gives you that control. It is slower than COPY but far more flexible.
<?php
$handle = fopen("data.csv", "r");
$headers = fgetcsv($handle); // Consume and discard the header row
// Wrap the loop in a transaction — reduces disk I/O from
// thousands of individual commits to exactly one
pg_query($conn, "BEGIN");
$query = "INSERT INTO users (name, email) VALUES ($1, $2)";
pg_prepare($conn, "import_query", $query);
while (($row = fgetcsv($handle, 1000, ",")) !== FALSE) {
// Validation logic: skip empty or invalid rows
if (empty($row[0]) || !filter_var($row[1], FILTER_VALIDATE_EMAIL)) {
continue;
}
pg_execute($conn, "import_query", [$row[0], $row[1]]);
}
pg_query($conn, "COMMIT");
fclose($handle);
?>Why wrap the loop in BEGIN / COMMIT
Without a transaction, each pg_execute() call auto-commits to disk. For 10,000 rows that means 10,000 fsync operations. Wrapping the whole loop in one transaction reduces that to a single commit — often a 5–10× speed improvement on spinning disk.
Security: CSV Data is User Input
A critical mistake when building upload features is treating a CSV as "safe" because it came from a spreadsheet. CSV files are external input just like a web form — they can carry malicious data if not handled correctly.
Security checklist for CSV upload features
- Always use prepared statements — never concatenate CSV cell values directly into a query string.
- Validate MIME type on upload — ensure the file is actually a CSV (
text/csv), not a PHP script with a .csv extension. - Whitelist delimiter choices — if your UI lets users pick a delimiter, only accept
,/;/|/\t— anything else is rejected. - Cap file size server-side — enforce a maximum in both PHP (
upload_max_filesize) and your application logic.
Method 3: Streaming with pg_put_line() — Large Files Without Memory Limits
For multi-gigabyte files, loading the whole CSV into a PHP array (as pg_copy_from() does with file()) will exhaust server RAM. The pg_put_line() function lets you pipe data line-by-line directly to Postgres's COPY protocol — without ever holding the full file in memory.
<?php
$handle = fopen("large_data.csv", "r");
// Signal to Postgres that CSV data will follow on STDIN
pg_query($conn, "COPY target_table FROM STDIN WITH (FORMAT csv, HEADER)");
// Stream line by line — memory usage stays flat regardless of file size
while ($line = fgets($handle)) {
pg_put_line($conn, $line);
}
// Required terminator — tells Postgres the stream is complete
pg_put_line($conn, "\.
");
pg_end_copy($conn);
fclose($handle);
?>This approach scales to arbitrarily large files. Memory usage stays constant because PHP reads and forwards one line at a time. For the Postgres-side performance tuning that pairs with this (index drops, maintenance_work_mem, WAL tuning), see the bulk loading millions-of-rows guide.
If your PHP pipeline imports data submitted by customers, the data validation problem only grows with scale — different column names, different date formats, different encodings per customer. The CSV import troubleshooting guide covers the most common type mismatches and encoding errors you'll encounter in the wild.
Related reading
Replace the PHP pipeline with a hosted upload flow
Elvity gives your customers a validated, schema-aware CSV upload experience — no pg_put_line() boilerplate, no MIME-type checks, no memory limit engineering.