PHP: Resolving a Relative Path Against a Base URL Into an Absolute URL

I fetch some HTML from another website using the code below:

$content =  file_get_contents('http://something.net/path/test.php');

From this I get HTML like the following:

<div class="main"><a href="/testother.php?abhijit=1">Test One</a></div>
<div class="main"><a href="top.php?kumar=1">Test One</a></div>
<div class="main"><a href="/testother.php?abhijit=3">Test One</a></div>
<div class="main"><a href="ww.php?kumar=1">Test One</a></div>

I extract all the href attributes using a regex:

/testother.php?abhijit=1
top.php?kumar=1
/testother.php?abhijit=3
ww.php?kumar=1
//otherdomain.com/something.php
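
For reference, the extraction step described above could look roughly like the sketch below. The pattern is an assumption (the actual regex is not shown here), and a regex only copes with simple markup like this sample; the variable names are illustrative.

```php
<?php
// Sample markup copied from the question; normally this comes from file_get_contents().
$content = '<div class="main"><a href="/testother.php?abhijit=1">Test One</a></div>' . "\n"
         . '<div class="main"><a href="top.php?kumar=1">Test One</a></div>' . "\n"
         . '<div class="main"><a href="/testother.php?abhijit=3">Test One</a></div>' . "\n"
         . '<div class="main"><a href="ww.php?kumar=1">Test One</a></div>';

// Assumed pattern: an <a ...> tag with a quoted href; group 2 holds the attribute value.
preg_match_all('/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1/is', $content, $matches);

// Attribute values may contain entities such as &amp;, so decode them.
$hrefs = array_map('html_entity_decode', $matches[2]);

echo implode("\n", $hrefs), "\n";
```

For real-world pages an HTML parser is usually the sturdier choice; the regex is kept here only because it matches the approach described.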

None of these links includes the domain or the full path.

I want to turn them all into absolute URLs like the ones below (I expect every link to end up in this format):

http://something.net/testother.php?abhijit=1
http://something.net/path/top.php?kumar=1
http://something.net/testother.php?abhijit=3
http://something.net/path/ww.php?kumar=1

How can I transform a relative path into an absolute URL using PHP, given my main (base) URL and the href attribute values?

Thanks.

Answer

Transform Relative Path Into Absolute URL Using PHP

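function rel2abs($rel, $base) {
    /* an empty reference points at the base document itself */
    if ($rel === '') return $base;

    /* return if already absolute URL */
    if (parse_url($rel, PHP_URL_SCHEME) != '') return $rel;

    /* queries: replace the query (and any fragment) of the base */
    if ($rel[0] == '?') return preg_split('/[?#]/', $base)[0] . $rel;

    /* anchors: replace only the fragment of the base */
    if ($rel[0] == '#') return explode("#", $base)[0] . $rel;

    /* parse base URL; read the parts explicitly rather than via extract(),
       so a missing component never leaves a variable undefined */
    $parts = parse_url($base);
    if (!isset($parts['scheme'], $parts['host'])) return $rel;
    $scheme = $parts['scheme'];
    $host   = $parts['host'] . (isset($parts['port']) ? ':' . $parts['port'] : '');
    $path   = isset($parts['path']) ? $parts['path'] : '';

    /* URL begins with // (protocol-relative) */
    if (substr($rel, 0, 2) == '//') {
        return "$scheme:$rel";
    }

    /* remove non-directory element from path */
    $path = preg_replace('#/[^/]*$#', '', $path);

    /* destroy path if relative url points to root */
    if ($rel[0] == '/') $path = '';

    /* dirty absolute URL */
    $abs = "$host$path/$rel";

    /* replace '//' or '/./' or '/foo/../' with '/' (the dots must be escaped) */
    $re = array('#(/\.?/)#', '#/(?!\.\.)[^/]+/\.\./#');
    for ($n = 1; $n > 0; $abs = preg_replace($re, '/', $abs, -1, $n)) {}

    /* absolute URL is ready! */
    return "$scheme://$abs";
}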

Testing …

echo '<h4>Queries</h4>';
echo rel2abs("?query=1", "http://something.net/path/test.php");
echo '<br>';
echo rel2abs("?query=1", "http://something.net/path/test.php?old_query=1");

echo '<h4>Anchors</h4>';
echo rel2abs("#newAnchores", "http://something.net/path/test.php?a=1");
echo '<br>';
echo rel2abs("#newAnchores", "http://something.net/path/test.php?a=1#something");

echo '<h4>Path</h4>';
echo rel2abs("/testother.php", "http://something.net/folder1/folder2/folder3/test.php");
echo '<br>';
echo rel2abs("./../../testother.php", "http://something.net/folder1/folder2/folder3/test.php");
echo '<br>';
echo rel2abs("./../testother.php", "http://something.net/folder1/folder2/folder3/test.php");
echo '<br>';
echo rel2abs("./testother.php", "http://something.net/folder1/folder2/folder3/test.php");
echo '<br>';
echo rel2abs("testother.php", "http://something.net/folder1/folder2/folder3/test.php");

echo '<h4>Url begins with //</h4>';
echo rel2abs("//google.com/path/", "https://something.net/path/test.php");
echo '<br>';
echo rel2abs("//google.com/path/", "http://something.net/path/test.php");

Test Output …

Queries

http://something.net/path/test.php?query=1
http://something.net/path/test.php?query=1

Anchors

http://something.net/path/test.php?a=1#newAnchores
http://something.net/path/test.php?a=1#newAnchores

Path

http://something.net/testother.php
http://something.net/folder1/testother.php
http://something.net/folder1/folder2/testother.php
http://something.net/folder1/folder2/folder3/testother.php
http://something.net/folder1/folder2/folder3/testother.php

Url begins with //

https://google.com/path/
http://google.com/path/
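
One caveat: rel2abs() runs its cleanup patterns over the whole URL, query string included, so a link such as go.php?next=/a/../b would have its query rewritten as well. Below is a minimal alternative sketch that normalizes only the path, loosely following the resolution steps of RFC 3986 section 5.2. The names resolve_url() and remove_dot_segments() are hypothetical helpers introduced for illustration, and the sketch ignores user:password parts in the base URL.

```php
<?php
/* Collapse "." and ".." segments of an absolute path (RFC 3986, section 5.2.4). */
function remove_dot_segments(string $path): string
{
    $segments = explode('/', $path);
    $last = end($segments);
    $out = [];
    foreach ($segments as $seg) {
        if ($seg === '.') {
            continue;
        }
        if ($seg === '..') {
            if (count($out) > 1) {
                array_pop($out); // never pop the leading empty segment (the root)
            }
            continue;
        }
        $out[] = $seg;
    }
    if ($last === '.' || $last === '..') {
        $out[] = ''; // a trailing dot segment leaves a trailing slash
    }
    return implode('/', $out);
}

/* Resolve $rel against the absolute URL $base. */
function resolve_url(string $rel, string $base): string
{
    // Already absolute (has a scheme such as "https:" or "mailto:").
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $rel)) {
        return $rel;
    }
    $b = parse_url($base);
    if (!isset($b['scheme'], $b['host'])) {
        return $rel; // base is not an absolute URL; nothing sensible to do
    }
    // Protocol-relative reference: only the scheme comes from the base.
    if (substr($rel, 0, 2) === '//') {
        return $b['scheme'] . ':' . $rel;
    }
    // Split the reference by hand into path, "?query" and "#fragment".
    $fragment = '';
    if (($i = strpos($rel, '#')) !== false) {
        $fragment = substr($rel, $i);
        $rel = substr($rel, 0, $i);
    }
    $query = '';
    if (($i = strpos($rel, '?')) !== false) {
        $query = substr($rel, $i);
        $rel = substr($rel, 0, $i);
    }
    $basePath = isset($b['path']) ? $b['path'] : '/';
    if ($rel === '') {
        // Query-only or fragment-only reference: keep the base path.
        $path = $basePath;
        if ($query === '' && isset($b['query'])) {
            $query = '?' . $b['query'];
        }
    } elseif ($rel[0] === '/') {
        $path = $rel; // root-relative
    } else {
        // Merge with the directory part of the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $rel;
    }
    $authority = $b['scheme'] . '://' . $b['host']
               . (isset($b['port']) ? ':' . $b['port'] : '');
    return $authority . remove_dot_segments($path) . $query . $fragment;
}

$base = 'http://something.net/path/test.php';
echo resolve_url('/testother.php?abhijit=1', $base), "\n";        // http://something.net/testother.php?abhijit=1
echo resolve_url('top.php?kumar=1', $base), "\n";                 // http://something.net/path/top.php?kumar=1
echo resolve_url('//otherdomain.com/something.php', $base), "\n"; // http://otherdomain.com/something.php
echo resolve_url('go.php?next=/a/../b', $base), "\n";             // http://something.net/path/go.php?next=/a/../b
```

Splitting off the query and fragment before touching the path is the main design choice here: dot segments are only meaningful in the path, so nothing after "?" or "#" is altered.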
User contributions licensed under: CC BY-SA