Rotating HTTP proxies in your python script

November 11, 2022 2 minutes

Ok, there can be a whole bunch of reasons why you would need to use multiple proxy servers at once. But most of them, to be honest, are related to playing around with bot detection, scrapping, and other not-so-honorable stuff.

Anyway. We have a task to get https://httpbin.org/ip content and to be sure that each time I will get a result that differs from the previous one.

import urllib.request
import json

previous = None
while True:
    with urllib.request.urlopen('https://httpbin.org/ip') as response:
        current = json.loads(response.read())['origin']
        assert previous != current, f"Got the same {current} twice in a row"
        print("OK: ", current)
        previous = current

That will obviously fail. The second request will get the same IP which is not OK for us.

$ python script.py 
OK:  79.143.111.139
Traceback (most recent call last):
  File "script.py", line 8, in <module>
    assert previous != current, f"Got the same {current} twice in a row"
AssertionError: Got the same 79.143.111.139 twice in a row

Let’s extend ProxyHandler a little bit. We need a way to define multiple proxies for one proto and circulate it for each request. A little trick with cycle iterator will do the thing

import urllib.request
import json
import itertools

class RotatingProxyHandler(urllib.request.ProxyHandler):

    def __init__(self, proxies):
        self.proxies = proxies or {}
        for type, urls in self.proxies.items():
            type = type.lower()
            setattr(self, f'_{type}_proxies', itertools.cycle(urls))
            setattr(self, f'{type}_open',
                    lambda r, proxy=getattr(self, f"_{type}_proxies"), type=type, meth=self.proxy_open:
                        meth(r, proxy, type))

    def proxy_open(self, req, proxy, type):
        super().proxy_open(req, next(proxy), type)


proxy_support = RotatingProxyHandler(proxies={'https': [
    "92.205.22.114:38080",
    "169.57.1.85:8123",
]})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

previous = None
while True:
    with urllib.request.urlopen('https://httpbin.org/ip') as response:
        current = json.loads(response.read())['origin']
        assert previous != current, f"Got the same {current} twice in a row"
        print("OK: ", current)
        previous = current

And now we have it!

$ python script.py
OK:  92.205.22.114
OK:  169.57.1.85
OK:  92.205.22.114
OK:  169.57.1.85
OK:  92.205.22.114
OK:  169.57.1.85
OK:  92.205.22.114

Have a comment on one of my posts? Start a discussion in my public inbox by sending an email to ~histrio/[email protected] [mailing list etiquette]