Find all links in a page using JavaScript
You can use the document.links collection to traverse all the links (A and AREA elements that have an href attribute) in the active document. Open the JavaScript console in the Firefox browser and run the following piece of code.
var s = "Number of links: " + document.links.length + "\r\n";
for (var i = 0; i < document.links.length; i++) {
  // Append each link with "+="; a plain "=" would overwrite the text built so far.
  s += "link " + i + ": " + document.links[i].href + "\r\n";
}
alert(s);
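
If the list is long, an alert box gets unwieldy. As a minimal sketch of the same idea, the snippet below collects the links into an array and builds the report as a string. The helper names collectLinkHrefs and formatLinkReport are made up for this illustration and are not part of any browser API; the only browser feature assumed is document.links.

```javascript
// Hypothetical helper: returns the href of every entry in doc.links.
// `doc` can be the page's document or any object with an array-like `links`.
function collectLinkHrefs(doc) {
  return Array.from(doc.links, function (link) {
    return link.href;
  });
}

// Hypothetical helper: builds the same kind of report as the alert() version.
function formatLinkReport(hrefs) {
  var lines = ["Number of links: " + hrefs.length];
  hrefs.forEach(function (href, i) {
    lines.push("link " + i + ": " + href);
  });
  return lines.join("\n");
}

// In a browser console this prints the report for the current page.
if (typeof document !== "undefined") {
  console.log(formatLinkReport(collectLinkHrefs(document)));
}
```

Keeping the collection step separate from the formatting step means the array of URLs can be reused, for example to filter or sort the links before printing them.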


about 1 year ago
Good one. Your article shows how to identify all the A links by running JavaScript in the HTML page itself.
A spider does the same job from outside the page, for example with a PHP script. The following code parses the entire HTML content and uses XPath to identify all the A links in the page.
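<?php
// Fetch the page; file_get_contents() returns false on failure.
$html = file_get_contents('http://www.google.com');
$dom = new DOMDocument();
// "@" suppresses the warnings that malformed real-world HTML triggers.
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every <a> element under <body>.
$hrefs = $xpath->evaluate("/html/body//a");
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    echo $url . "\n";
}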
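
One difference from the JavaScript version is worth noting: getAttribute('href') returns the attribute exactly as written in the markup, so relative links such as /about come back unresolved, whereas the href property in the browser is already absolute. A spider usually has to resolve them against the page address before following them. Here is a rough sketch of that step in JavaScript, using the standard URL constructor; the helper name resolveHrefs and the sample values are assumptions made up for illustration.

```javascript
// Hypothetical helper: resolves raw href attribute values against the page URL.
function resolveHrefs(rawHrefs, pageUrl) {
  var resolved = [];
  rawHrefs.forEach(function (raw) {
    try {
      // new URL(relative, base) applies the standard URL resolution rules.
      resolved.push(new URL(raw, pageUrl).href);
    } catch (e) {
      // Skip values that cannot be parsed as a URL.
    }
  });
  return resolved;
}

// Sample input (made-up values): one root-relative, one relative, one absolute.
var sample = resolveHrefs(
  ["/about", "news/today.html", "https://example.org/"],
  "http://www.google.com/intl/en/"
);
console.log(sample.join("\n"));
```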