I’m importing a CSV that has 3 columns, and one of these columns could have duplicate records.
I have 2 things to check:
1. The field 'NAME' is not null and is a string
2. The field 'ID' is unique
So far, I’m parsing the CSV file once and checking 1. (NAME is valid); if that check fails, it simply breaks out of the while loop and stops.
I guess the question is, how would I check that ID is unique?
I have fields like the following:
NAME, ID,
Bob, 1,
Tom, 2,
James, 1,
Terry, 3,
Joe, 4,
This would output something like `Duplicate ID on line 3`
Thanks
P.S. This CSV file has more columns and can have around 100,000 records. I have simplified it here to focus on the duplicate column/field problem.
Answer
I assumed a certain type of design and stripped out the CSV part, but the idea remains the same:
<?php
/* Let's make an array of 100,000 rows
   (Be careful, you might run into memory issues with this,
   issues you won't have with a CSV read line by line) */
$arr = [];
for ($i = 0; $i < 100000; $i++) {
    $arr[] = [rand(0, 1000000), 'Hey'];
}

/* Now let's have fun */
$ids = [];
foreach ($arr as $line => $couple) {
    // isset() avoids an "undefined index" notice the first time an ID is seen
    if (isset($ids[$couple[0]])) {
        echo "Id " . $couple[0] . " on line " . $line . " already used<br />";
    } else {
        $ids[$couple[0]] = true;
    }
}
?>
100,000 rows isn’t that many, so this approach will be enough. (It ran in about 3 seconds on my machine.)
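Since the snippet above leaves out the CSV part, here is a minimal sketch of the same key-lookup idea applied while reading a CSV line by line with fgetcsv. The function name `findDuplicateIds`, the `$idColumn` parameter, and the assumption that the first row is a header are all mine, not something from the question. Line numbers count data rows only, which matches the `Duplicate ID on line 3` example.

```php
<?php
/**
 * Hypothetical helper: scan CSV rows from an open stream and report duplicate IDs.
 * Assumes the first row is a header and that the ID sits in column $idColumn.
 */
function findDuplicateIds($handle, int $idColumn = 1): array
{
    $seen = [];      // ID => data-row number where it first appeared
    $errors = [];
    $rowNumber = 0;  // counts data rows, header excluded

    fgetcsv($handle, 0, ',', '"', '\\'); // discard the header row

    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv returns [null] for a blank line
        }
        $rowNumber++;
        $id = trim((string) ($row[$idColumn] ?? ''));
        if (isset($seen[$id])) {
            $errors[] = "Duplicate ID on line $rowNumber (first seen on line {$seen[$id]})";
        } else {
            $seen[$id] = $rowNumber;
        }
    }
    return $errors;
}

// Usage with the sample data from the question, held in memory instead of a file.
$handle = fopen('php://memory', 'r+');
fwrite($handle, "NAME,ID\nBob,1\nTom,2\nJames,1\nTerry,3\nJoe,4\n");
rewind($handle);
$errors = findDuplicateIds($handle);
fclose($handle);
print_r($errors);
```

Only the set of IDs seen so far is kept in memory, so the full file never has to be loaded at once. For a real file, replace the `php://memory` stream with `fopen($path, 'r')`.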
EDIT: As pointed out, in_array is less efficient than key lookup. I’ve updated my code accordingly.
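To illustrate the difference mentioned in the edit, here is a small side-by-side sketch using the IDs from the question. The variable names are illustrative. Both loops find the same duplicates, but in_array scans the whole list of seen values on every row, so the work grows roughly with the square of the row count, while isset on an array key is a direct hash lookup.

```php
<?php
// IDs taken from the question's sample data.
$ids = [1, 2, 1, 3, 4];

// Value lookup: in_array() walks $seenValues on every call.
$seenValues = [];
$dupesByValue = [];
foreach ($ids as $index => $id) {
    if (in_array($id, $seenValues, true)) {
        $dupesByValue[] = $index + 1; // 1-based row number
    } else {
        $seenValues[] = $id;
    }
}

// Key lookup: isset() checks the array's hash table directly.
$seenKeys = [];
$dupesByKey = [];
foreach ($ids as $index => $id) {
    if (isset($seenKeys[$id])) {
        $dupesByKey[] = $index + 1; // 1-based row number
    } else {
        $seenKeys[$id] = true;
    }
}

var_dump($dupesByValue === $dupesByKey); // both contain [3]
```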