ohrores wrote:I suppose a scripting language would be the best choice.
Perl FTW. It does wonders with file parsing. Want me to show you how to write a web page parser?
Also a correction if you don't mind:
ohrores wrote:Because Javascript is a client-side language and PHP a server-side language.
That means PHP runs as/on the server and Javascript on the PC of the client,
Yes, you're right, but you can still get a PHP and/or JS interpreter and run the code on your own PC.

But as I said above, Perl also excels at this.
EDIT: Just for fun, I wrote it. It was a nice exercise. It probably needs a bit of polishing, but it does extract the links from web pages.

Code:
#!/usr/bin/perl
# Online HTML Parser by m0skit0
# 12/01/2012
# Usage: script.pl <URL to parse>
use strict;
use warnings;
use HTML::Parser;
use HTTP::Request;
use LWP::UserAgent;
use constant HTTP_AGENT => "DarthPerl/3.0";
use constant HTML_LINK_TAG => "a";
# Invoked each time a starting tag is found
sub start
{
    my ($tagname, $attr) = @_;
    # Only print <a> tags that actually carry an href (skips named anchors)
    if ($tagname eq HTML_LINK_TAG && defined $attr->{href})
    {
        print ("$attr->{href}\n");
    }
}
########
# MAIN #
########
# Create a UserAgent (HTTP client)
my $ua = LWP::UserAgent->new();
$ua->agent(HTTP_AGENT);
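# Create an HTTP request (bail out with the usage line if no URL was given)
my $url = shift @ARGV or die "Usage: $0 <URL to parse>\n";
my $req = HTTP::Request->new(GET => $url);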
# Request the page
my $res = $ua->request($req);
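# If successful
if ($res->is_success)
{
    # Create an HTML parser
    my $parser = HTML::Parser->new
    (
        api_version => 3,
        start_h => [\&start, "tagname, attr"],
    );
    # Parse web page
    $parser->parse($res->content);
    # Signal end of document so any buffered text gets flushed
    $parser->eof;
}
# Request failed
else
{
    # Method calls are not interpolated inside double quotes, so print it separately
    print $res->status_line, "\n";
}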
# Bye bye and have fun
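
If you want to play with the handler without hitting the network, here's a minimal sketch that feeds an in-memory HTML string to HTML::Parser and collects the links into a list instead of printing them right away. The sample page and the `extract_links` helper are made up for this example (they're not part of the script above), so treat it as one possible way to do it, not the definitive one.

```perl
#!/usr/bin/perl
# Offline sketch: run a start-tag handler on an in-memory HTML string
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample page, invented for this example
my $html = <<'HTML';
<html><body>
<a href="http://example.com/one">One</a>
<a name="anchor-only">No href here</a>
<p>Some text <a href="/two">Two</a></p>
</body></html>
HTML

# Hypothetical helper: returns every href found in the <a> tags of an HTML string
sub extract_links
{
    my ($content) = @_;
    my @links;
    my $parser = HTML::Parser->new
    (
        api_version => 3,
        start_h => [
            sub {
                my ($tagname, $attr) = @_;
                # Keep only <a> tags that carry an href attribute
                push @links, $attr->{href}
                    if $tagname eq "a" && defined $attr->{href};
            },
            "tagname, attr"
        ],
    );
    $parser->parse($content);
    $parser->eof;
    return @links;
}

# Print one link per line
print "$_\n" for extract_links($html);
```

With the sample page above this should print the two href values and skip the anchor that has no href. Returning a list instead of printing inside the handler makes it easier to filter or de-duplicate the links afterwards.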