Simple Proxy Scraper

December 12th, 2009

I’ve been obsessed with all things black hat these days. I can’t say that’s a big change from what I’ve been about since day one, but I’m starting to take things seriously: writing my own tools and getting more splogs up as the days go on. It’s a great way to quickly learn the things I want to learn and profit a little as I go.

I wrote this up a few days ago to scrape some proxies for use with BookmarkWiz. I’ll be rewriting the script to write its results to a file, and I’m currently throwing together a tester to keep the working proxies and ditch the crap ones.

<?php
// URL of the page to scrape
// pretty much any page that lists proxies as ip:port will work
$url = "http://www.ddday.com/free-proxy-server/anonymous-proxy-server/anonymous-proxy-server-–-updated-2009-12-11/";

// Initialize a cURL session
$ch = curl_init();

curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Return the page contents instead of printing them
curl_setopt($ch, CURLOPT_URL, $url);            // URL to fetch
$result = curl_exec($ch);                       // Page body as a string, or false on failure
curl_close($ch);                                // Close the cURL resource and free up system resources

$regex = '/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}[:][0-9]{1,5}/';

if (preg_match_all($regex, $result, $matches)) {
    // $matches[0] holds every full ip:port match found on the page
    foreach ($matches[0] as $proxy) {
        echo $proxy . "\n";
    }
} else {
    echo "No proxies found!\n";
}
?>
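
The file-writing rewrite and the tester mentioned above might look roughly like this. It is a sketch, not the finished tool: the function names (extract_proxies, save_proxies, proxy_works), the default test URL and the timeout are all assumptions made for illustration. It also tightens the matching a little, since the regex above will happily accept things like 999.999.999.999:99999.

```php
<?php
// Sketch only: function names, the test URL and the timeout are illustrative assumptions.

/**
 * Pull unique ip:port pairs out of a page, dropping impossible values
 * such as octets above 255 or ports outside 1-65535.
 */
function extract_proxies(string $html): array
{
    $regex = '/\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}):(\d{1,5})\b/';
    $proxies = [];

    if (preg_match_all($regex, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $octets = array_map('intval', array_slice($m, 1, 4));
            $port   = (int) $m[5];
            if (max($octets) <= 255 && $port >= 1 && $port <= 65535) {
                $proxies[$m[0]] = true; // array keys de-duplicate for free
            }
        }
    }

    return array_keys($proxies);
}

/**
 * Write one proxy per line to $path and return how many were written.
 */
function save_proxies(array $proxies, string $path): int
{
    $body = $proxies ? implode("\n", $proxies) . "\n" : '';
    if (file_put_contents($path, $body, LOCK_EX) === false) {
        throw new RuntimeException("Could not write to $path");
    }
    return count($proxies);
}

/**
 * Try to fetch $testUrl through the proxy; true only on an HTTP 200.
 * Requires the cURL extension. The default URL is just a placeholder.
 */
function proxy_works(string $proxy, string $testUrl = 'http://example.com/', int $timeout = 5): bool
{
    $ch = curl_init($testUrl);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_PROXY          => $proxy,   // "ip:port"
        CURLOPT_CONNECTTIMEOUT => $timeout, // give up quickly on dead proxies
        CURLOPT_TIMEOUT        => $timeout,
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return $body !== false && $status === 200;
}
```

To wire it into the script above, pass $result to extract_proxies(), run the list through array_filter() with 'proxy_works' as the callback, and hand whatever survives to save_proxies() along with a file name of your choosing. Checking proxies one at a time is slow on a big list, so a short timeout matters more than anything else here.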
