Open proxies… Everybody likes them! Please don’t immediately think about malicious activities… Of course, open (and chained) proxies can be useful to make you anonymous on the Internet, but they can also be very interesting for “good” purposes. As a pentester, they can help you distribute your reconnaissance phase across multiple IP addresses and reduce the risk of being identified. Most log management solutions come with out-of-the-box lightweight correlation rules to detect brute-force attacks (“if event X is seen Y times in a time window of Z seconds, raise an alert“). Finally, you can use them to bypass simple blacklist systems which try to prevent you from abusing an online service. Note that in this (my) context, “abuse” means “accessing the same resources several times for research purposes“. Nothing malicious!
How to find open proxies? It’s easy: there are plenty of lists available on the Internet. Big lists of proxies are posted daily on pastebin.com, and some sites are dedicated to the business of compiling huge lists, like www.freeproxylists.com or www.xroxy.com. Of course, most of them propose premium (paying) services if you have some money to waste. Personally, I like xroxy.com just because they provide updates via an RSS feed (XML). The major issue for me is the reliability of the listed open proxies: xroxy.com gives a reliability indicator (0-100%) but I don’t find it… reliable! Most proxies are often unavailable or reject connections.
I wrote a small Perl script to help me maintain my own list of open proxies. The script is called oplb for “Open Proxies List Builder“. To avoid re-inventing the wheel, it is based on the PHP RSS Aggregator also provided by xroxy.com. Proxies are stored in a SQLite DB and their reliability is checked using the WWW::ProxyChecker Perl module. All proxies are tested on a regular basis (via a cron job) and their reliability is updated when needed. The script is used in two different modes: the first one is scheduled at a regular interval to grab new proxies published by xroxy.com and check their availability; the second one is a manual mode to generate your list of reliable proxies:
$ ./oplb.pl --help
Usage: ./oplb.pl [--debug] [--dump] [--force] [--help] [--reliability=percent] [--ttl=seconds]
Where:
  --debug         : Produce verbose output
  --dump          : Generate a list of reliable proxies (stdout)
  --force         : Ignore TTL and force a check of the xroxy.com RSS feed
  --reliability=x : Define minimum reliability for proxies
  --ttl=x         : TTL for xroxy.com RSS feed update (default: 3600)
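Since the collected proxies live in a SQLite DB, you can also inspect them directly with the sqlite3 command-line tool. A minimal sketch; note that the database file name, table and column names below are assumptions for illustration only, check the script’s source for the real schema:

```shell
# Show the five most reliable proxies currently stored by oplb.pl.
# "oplb.db", "proxies", "proxy" and "reliability" are assumed names.
sqlite3 oplb.db "SELECT proxy, reliability FROM proxies ORDER BY reliability DESC LIMIT 5;"
```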
Create a crontab like:
*/15 * * * * oplb.pl --reliability=90 --ttl=3600
This will check for new proxies every hour (the RSS feed TTL is 3600 seconds) and verify the reliability of proxies which are currently below 90%. To dump a list of reliable proxies, just use:
$ ./oplb.pl --dump --reliability=95
122.72.28.19:80
122.72.33.138:80
122.72.33.139:80
219.159.105.180:8080
196.1.178.254:3128
192.162.150.77:8080
88.85.108.16:8080
202.112.117.202:3128
59.172.208.186:8080
41.191.27.226:80
114.79.159.2:8080
Only proxies checked at least once during the last 3 days are dumped. The list is ready to be used by other tools. Personally, I’m using OPLB to build the list of proxies used by pastemon.pl (to avoid being blacklisted by pastebin.com). A rough version of the script is already available here. Comments/suggestions are welcome!
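Because the dump is a plain “ip:port” list (one proxy per line), feeding it to other tools is straightforward. A minimal sketch, assuming the file names and target URL are examples only, that converts the dump into proxy URLs and sends one test request through each of them with curl:

```shell
# Build a fresh list of reliable proxies and prefix each entry with "http://",
# the form expected by curl's --proxy option. File names are examples.
./oplb.pl --dump --reliability=95 > proxies.txt
sed 's|^|http://|' proxies.txt > proxy-urls.txt

# Send one request through each proxy (10s timeout) and print the HTTP
# status code next to the proxy used. The target URL is an example.
while read proxy; do
    curl -s -m 10 --proxy "$proxy" -o /dev/null -w "%{http_code} $proxy\n" "http://example.com/"
done < proxy-urls.txt
```

The same prefixed list can also be used to populate the HTTP_PROXY environment variable for tools that honour it.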
I find that it is more practical to look for fresh proxies on a free proxy list site like http://proxygaz.com/
Hi Xavier,
First of all thank you for your work! 🙂
I’m trying to use oplb.pl script but I receive errors. I have googled a while but I really can’t understand the issue.
If I try to run ./oplb.pl --debug --force I have the following error:
“Illegal field name ‘If-None-Match:’ at /usr/local/share/perl/5.10.1/LWP/UserAgent.pm line 725”
If I try to run ./oplb.pl --debug I have the following error:
“Illegal division by zero at /usr/local/share/perl/5.10.1/WWW/ProxyChecker.pm line 73.”
Could you help me, please?
Thank you!
RT @xme: [/dev/random] Manage an Efficient List of Open Proxies http://t.co/wpLWqpsQ
Tor is interesting too but… sometimes very slow and prohibited by some online services. For example, Google does not allow queries from Tor exit nodes!
Interesting. But you know that for distributed scraping you can also use Tor, along the lines of: http://blog.databigbang.com/distributed-scraping-with-multiple-tor-circuits/
I have already used this technique successfully. Very, very handy.