Understanding ‘parse_str’ in PHP

I’m a PHP newbie trying to find a way to use parse_str to parse a number of URLs from a database (note: not from the request; they are already stored in a database, don’t ask… so $_GET won’t work).

So I’m trying this:

    $parts = parse_url('http://www.jobrapido.se/?w=teknikinformat%C3%B6r&l=malm%C3%B6&r=auto');
    parse_str($parts['query'], $query);
    return $query['w'];

Please note that I am just supplying an example URL here; in the real application the URL will be passed in as a parameter from the database. If I do this, it works fine. However, I don’t understand how to use this function properly, or how to avoid errors.

First of all, here I used “w” as the index to return, because I could clearly see it was in the query. But how do these things work? Is there a set of specific values I can use to get the entire query string? I mean, if I look further, I can see “l” and “r” here as well…

Sure I could extract those too and concatenate the result, but will these value names be arbitrary, or is there a way to know exactly which ones to extract? Of course there’s the “q” value, which I originally thought would be the only one I would need, but apparently not. It’s not even in the example URL, although I know it’s in lots of others.

So how do I do this? Here’s what I want:

  1. Extract all parts of the query string that give me a readable version of the search string in the URL (so for the above it would be “teknikinformatör malmö auto”). Note that I would need to translate the URL encoding to Swedish characters; is there an easy way to do that in PHP?
  2. Handle errors so that if the above doesn’t work for some reason, the method should only return an empty string, thus not breaking the code. Because at this point, if I were to use the above with an actual parameter, $url, passed in instead of the example URL, I would get errors, because many of the URLs do not have the “w” parameter, some may be empty fields in the database, some may be malformed, etc. So how can I handle such errors stably, and just return a value if the parsing works, and return empty string otherwise?
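
The two requirements above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function name getSearchText is hypothetical, and it assumes that every non-empty query value belongs to the search text, whatever the parameter is called. parse_str already URL-decodes values, so no separate decoding step is needed for the Swedish characters.

```php
<?php
// Hypothetical helper: returns all query-string values joined by spaces,
// or '' when the URL is empty, malformed, or has no query string.
function getSearchText($url)
{
    if (!is_string($url) || $url === '') {
        return '';
    }

    // With PHP_URL_QUERY, parse_url() returns false for seriously malformed
    // URLs and null when the URL has no query component.
    $queryString = parse_url($url, PHP_URL_QUERY);
    if (!is_string($queryString) || $queryString === '') {
        return '';
    }

    // parse_str() URL-decodes names and values, so %C3%B6 becomes "ö" (UTF-8 bytes).
    parse_str($queryString, $params);

    // Assumption: every non-empty string value is part of the search text.
    // Array-style parameters (e.g. a[]=1) are skipped.
    $values = [];
    foreach ($params as $value) {
        if (is_string($value) && $value !== '') {
            $values[] = $value;
        }
    }

    return implode(' ', $values);
}

echo getSearchText('http://www.jobrapido.se/?w=teknikinformat%C3%B6r&l=malm%C3%B6&r=auto'), "\n";
```

Because the parameter names are never hard-coded, a URL that uses “q” instead of “w” goes through the same path, and a missing or empty database field simply yields an empty string.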

There seems to be a very strange problem that occurs that I cannot see during debugging. I put this test code in just to see what is going on:

    function getQuery($url)
    {
        try
        {
            $parts = parse_url($url);
            parse_str($parts['query'], $query);
            if (isset($query['q'])) {
                /* return $query['q']; */
                return '';
            }
        } catch (Exception $e) {
            return '';
        }
    }

Now, obviously in the real code I would want something like the commented out part to be returned. However, the puzzling thing is this:

With this code, as far as I see, every path should lead to returning an empty string. But this does not work – it gives me a completely empty grid in the result page. No errors or anything during debugging, and objects look fine when I step through them during debugging.

However, if I remove everything from this method except return ''; then it works fine. Of course the field in the grid where the query is supposed to be is empty, but all the other fields have all the information as they should. So this was just a test. But how is it possible that code that should only be able to return an empty string does not work, while the one that only returns an empty string and does nothing else does work? I’m thoroughly confused…
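
One detail worth checking in the test function: not every path returns an empty string. When “q” is absent, execution falls off the end of the function and PHP returns null. A missing 'query' key also raises a notice or warning rather than an Exception, so the catch block is never reached for it. The sketch below reproduces only that control flow; the name getQueryTest and the example.com URLs are made up for illustration, and whether this explains the empty grid depends on how the result page uses the value, which is not shown here.

```php
<?php
// Same shape as the test function, reduced to its control flow.
function getQueryTest($url)
{
    try {
        $parts = parse_url($url);
        // Reading $parts['query'] directly when the key is missing emits a
        // notice/warning, not an Exception; isset() avoids it in this demo.
        $queryString = isset($parts['query']) ? $parts['query'] : '';
        parse_str($queryString, $query);
        if (isset($query['q'])) {
            return '';
        }
        // No return statement here: the function yields null.
    } catch (Exception $e) {
        return '';
    }
}

var_dump(getQueryTest('http://example.com/?q=test')); // string(0) ""
var_dump(getQueryTest('http://example.com/?w=test')); // NULL
var_dump(getQueryTest('http://example.com/'));        // NULL
```

Adding an explicit return ''; after the if block makes every path return a string.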

Answer

It turned out the problem was with Swedish characters – if I used utf8_encode() on the value before returning it, it worked fine.
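
For reference, utf8_encode() converts ISO-8859-1 to UTF-8 and is deprecated as of PHP 8.2; applied to a string that is already UTF-8, it double-encodes the non-ASCII characters. A more defensive variant might look like the sketch below. The helper name toUtf8 is hypothetical, the mbstring extension is assumed to be available, and it is an assumption that any value that is not valid UTF-8 is ISO-8859-1.

```php
<?php
// Hypothetical helper: returns $value as UTF-8.
// Assumption: anything that is not already valid UTF-8 is ISO-8859-1 (Latin-1).
function toUtf8($value)
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$latin1 = urldecode('malm%F6');    // "ö" as a single Latin-1 byte
$utf8   = urldecode('malm%C3%B6'); // "ö" as two UTF-8 bytes

var_dump(toUtf8($latin1) === $utf8); // bool(true)
var_dump(toUtf8($utf8) === $utf8);   // bool(true)
```

Checking the encoding first means values that were stored URL-encoded as UTF-8, like the example URL in the question, pass through unchanged.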

User contributions licensed under: CC BY-SA