Validation of data (1 – 10,000 possible rows)
To give some context, I am building a web-based mapping interface that maps other data formats to our own format.
Problem
Taking an array of data and an array of formats like this:
$vehicle = [
    'cost' => '$159',
    'reconditioning' => '$140',
];

$formats = [
    'cost' => 'integer',
    'reconditioning' => 'integer',
];
And checking all of the fields for inconsistencies with our data structure; for example, 'cost' would be an error here because its value ('$159') contains a non-numeric character. The output of the validation has to show an error for each individual field, as shown below (true meaning that field has an error):

$vehicle_errors = [
    'cost' => true,
    'reconditioning' => true,
];
After the validation, I display the fields that had errors so the user can correct them.
I have succeeded in doing all of this using for-loops and a switch case to validate each individual field, but this becomes a problem once we get more than a few thousand vehicles.
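Reduced to a minimal sketch, my current approach looks roughly like this (simplified; the function name and the integer check are not the exact production code):

// Validate every vehicle against the expected formats, returning one
// error array per row (field => true for each field that failed).
function validateVehicles(array $vehicles, array $formats): array
{
    $all_errors = [];

    foreach ($vehicles as $index => $vehicle) {
        $vehicle_errors = [];

        foreach ($formats as $field => $type) {
            $value = $vehicle[$field] ?? null;

            switch ($type) {
                case 'integer':
                    // '$159' fails here because of the non-numeric '$'
                    $valid = ctype_digit((string) $value);
                    break;
                default:
                    $valid = $value !== null;
            }

            if (! $valid) {
                $vehicle_errors[$field] = true;
            }
        }

        $all_errors[$index] = $vehicle_errors;
    }

    return $all_errors;
}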
What I’ve Tried
I have tried using for-loops to accomplish this, with success, but PHP required me to increase "max_input_vars" substantially, which I would like to avoid because the operation heavily taxed my server. I have also tried Laravel's validation, which didn't give me the output I wanted and ran into the same max_input_vars problem as the for-loop approach.
Possible Solutions
I know that there is a way to validate data in a MySQL table and get an output similar to what I need. This would fix my problem because it would use the MySQL server's resources, but I have no idea how to accomplish it and couldn't find anything online about the topic.
Answer
It sounds like your fundamental problem is that the array of data is too large to load into memory in one go, so your application falls over once the array gets too big.
Generators were added to PHP to solve this problem. They allow you to iterate over a set of data without needing to hold the whole thing in memory, making it more practical to work with very large datasets. However, the syntax can be a bit cumbersome.
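As a rough sketch (the file name, header row, and row format here are assumptions), a generator lets you read and validate one row at a time instead of building the whole array up front:

// Yields one associative row at a time, keyed by the header row,
// so only the current row is held in memory.
function readRows(string $filePath): Generator
{
    $handle = fopen($filePath, 'r');
    $header = fgetcsv($handle); // first row holds the field names

    while (($line = fgetcsv($handle)) !== false) {
        yield array_combine($header, $line);
    }

    fclose($handle);
}

foreach (readRows('vehicles.csv') as $row) {
    // validate one row here instead of validating the whole array at once
}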
As it happens, the newly released Laravel 6 offers a solution to this problem in the form of lazy collections. These use generators, but provide a friendlier interface so that you can easily iterate through very large data sets without using too much memory. Here’s an example of it in use, taken from https://www.magutti.com/blog/laravel-lazy-collections:
use App\Project;
use Illuminate\Support\LazyCollection;

Route::get('/', function () {
    LazyCollection::make(function () {
        // project.csv with 300.000 rows
        $filePath = storage_path('app/public/project.csv');
        $handle = fopen($filePath, 'r');

        while ($line = fgetcsv($handle)) {
            yield $line;
        }
    })
        ->chunk(10000) // split into chunks to reduce the number of queries
        ->each(function ($lines) {
            $list = [];

            foreach ($lines as $line) {
                if (isset($line[1])) {
                    $list[] = [
                        'name' => $line[1],
                        'email' => $line[2],
                        'status_id' => $line[3],
                    ];
                }
            }

            // insert 10000 rows in one shot
            Project::insert($list);
        });

    /* display memory usage */
    echo number_format(memory_get_peak_usage() / 1048576, 2) . ' MB';
});
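Adapted to the validation in your question, the same pattern might look roughly like this (a sketch only: readRows() is the generator from above, the CSV path is an assumption, and $formats is the array from your question):

use Illuminate\Support\LazyCollection;

// Lazily validate every row and keep only the rows that have errors.
$vehicle_errors = LazyCollection::make(function () {
    yield from readRows(storage_path('app/vehicles.csv'));
})
    ->map(function ($vehicle) use ($formats) {
        $errors = [];

        foreach ($formats as $field => $type) {
            if ($type === 'integer' && ! ctype_digit((string) ($vehicle[$field] ?? ''))) {
                $errors[$field] = true;
            }
        }

        return $errors;
    })
    ->filter() // drop rows with no errors (empty arrays are falsy)
    ->all();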
The Eloquent ORM also now returns a lazy collection when you call the cursor() method, so you can use that to handle very large sets of data returned from the database as well. All of this is achievable with generators alone, but lazy collections provide an easier and more readable way to do it.
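If the vehicles are already in the database, the same idea applies to cursor() (Vehicle here is an assumed Eloquent model name):

use App\Vehicle; // assumed model

// In Laravel 6, cursor() returns a LazyCollection, so rows are hydrated
// one at a time rather than all loaded into memory at once.
foreach (Vehicle::cursor() as $vehicle) {
    // validate each $vehicle here
}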