Here is a quick way to determine the country of origin for the IP addresses that are accessing your website.
IP to Location Mapping
There are a few services that offer IP-to-location mapping, but for this example we’ll be using WIPMania. It offers a quick and easy way to get a response: you simply make an HTTP GET request.
So, looking up the country code for an IP address is as simple as this:
[unabated] $ wget -q -O- http://api.wipmania.com/68.16.200.0
US[unabated] $
In the example above, we’re using wget to make the HTTP request, telling it to suppress its normal lines of output (-q) and to send the response body to STDOUT (-O-). We then append the IP address we’re interested in to the end of the URL. The country code is then written to our terminal without a trailing newline, which is why the next prompt appears directly after it.
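Since the response carries no trailing newline, it can be handy to wrap the lookup in a small shell function that adds one. This is only a sketch: the name lookup_country and the "??" fallback are made up for this example, and it assumes the WIPMania endpoint still answers with a bare country code as shown above.

```shell
# lookup_country is a hypothetical helper, not part of wget or WIPMania.
# Assumption: http://api.wipmania.com/<ip> returns a bare country code.
lookup_country() {
    # -q silences wget's own output, -O- writes the response body to STDOUT.
    code=$(wget -q -O- "http://api.wipmania.com/$1") || code=""
    # Print the code plus a newline; "??" is our own placeholder for "no answer".
    printf '%s\n' "${code:-??}"
}
```

Calling lookup_country 68.16.200.0 should then print the country code on a line of its own, provided the service responds as it does in the transcript above.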
Getting IP addresses from Apache
To apply this to our web server logs (Apache, in this case), we’ll need a quick command to grab the client IP addresses from the log. We’re also going to use sed to replace the last octet of each IP address with a zero, so that we end up performing lookups at the /24 (old class C) subnet level. This cuts down on the number of requests we will make, and we’re not too concerned with per-host accuracy anyway. Lastly, we’re going to use sort and uniq to reduce the list to the unique subnets.
[unabated] $ cat access_log | awk '{print $1}' | sed -E 's/[0-9]+$/0/' | sort | uniq
100.43.83.0
101.226.166.0
101.226.169.0
65.111.177.0
94.173.30.0
^C
[unabated] $
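The extraction step can also be sketched as a function that reads log lines on STDIN, which makes it easy to try against a few sample lines. The name unique_subnets is hypothetical, and the sketch assumes the client address is an IPv4 address in the first whitespace-separated field of each line.

```shell
# unique_subnets is a hypothetical helper: it reads access-log lines on STDIN
# and prints each distinct /24 network once.
# Assumption: the first field of every line is an IPv4 client address.
unique_subnets() {
    awk '{print $1}' |              # keep only the client address
        sed -E 's/[0-9]+$/0/' |     # replace the last octet with 0 (-E enables "+")
        sort -u                     # same result as sort | uniq
}
```

For example, unique_subnets < access_log produces the same list as the pipeline above, and piping two made-up lines starting with 10.1.2.3 and 10.1.2.99 into it yields the single line 10.1.2.0.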
Wrapping it together
Now we just need to loop over the list of IP subnets, call wget for each address and then output the subnet and country code together. A quick for loop handles that for us, and our final command looks like this:
[unabated] $ for NET in `cat access_log | awk '{print $1}' | sed -E 's/[0-9]+$/0/' | sort | uniq` ; do printf "%-16s %s\n" "${NET}" "`wget -q -O- http://api.wipmania.com/${NET}`" ; done
100.43.83.0      US
101.226.166.0    CN
101.226.169.0    CN
65.111.177.0     US
94.173.30.0      GB
^C
[unabated] $
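If the one-liner gets unwieldy, the same idea can be written as a function that pipes the subnet list into a while read loop instead of expanding it inside backticks. Treat this as a sketch under the same assumptions as before: country_report is a made-up name, the log is expected to have IPv4 client addresses in its first field, and the WIPMania endpoint is assumed to still return a bare country code.

```shell
# country_report is a hypothetical function version of the one-liner.
# Assumptions: IPv4 client address in the first log field, and the WIPMania
# endpoint answering with a bare country code.
# Usage: country_report /path/to/access_log
country_report() {
    awk '{print $1}' "$1" | sed -E 's/[0-9]+$/0/' | sort -u |
    while read -r net; do
        # Left-align the subnet in 16 columns, then print the country code.
        printf '%-16s %s\n' "$net" "$(wget -q -O- "http://api.wipmania.com/${net}")"
    done
}
```

Reading the subnets line by line keeps the quoting simpler and starts printing results as soon as the sorted list is available.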
And that’s it. You can now see which countries your website’s requests are coming from.