It’s a common need, when checking your Apache or Nginx logs, to see how many times a single IP visited your website on a given day (in that log file), to determine whether something is crawling your site, overloading the server, etc.
We’ll suppose that the visitor’s IP address is the first column in your Nginx/Apache log (it doesn’t matter which one, this works on both), so we will extract the 1st column from the log, count the number of occurrences, and build e.g. a top-10 list of the site’s visitors:
awk '{ print $1 }' /var/log/access.log | sort | uniq -c | sort -nr | head -n 10
The command above extracts the 1st column from the access log with awk '{ print $1 }' /var/log/access.log (replace this with the location of your Nginx or Apache log), sorts the addresses, and feeds them into uniq -c, which collapses the duplicate lines and counts them, inserting the number of occurrences before each line.
From this point on we have 2 columns: the number of occurrences in the first, and the IP address itself in the 2nd. Now we feed that into sort -nr, which sorts the whole output numerically (n) by the 1st column (the number of occurrences) in reverse order (r), from highest to lowest, and finally we feed it into head -n 10, which keeps only the top ten lines.
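For readability, here is the same pipeline again, spread over several lines with one stage per line (a trailing pipe lets the shell continue the command on the next line, so the comments are just annotations):

awk '{ print $1 }' /var/log/access.log |  # extract the IP (1st column)
  sort |                                  # group identical IPs together
  uniq -c |                               # count the duplicates, prepending the count
  sort -nr |                              # sort by count, highest first
  head -n 10                              # keep only the top ten lines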
This way we get a result such as this:
69429 138.201.252.71
19264 80.28.132.26
17572 116.203.57.178
13923 138.201.252.126
10068 127.0.0.1
4583 102.222.181.152
4096 54.36.148.131
3682 102.222.181.65
2371 40.77.167.152
2272 40.77.167.145
So now we have our top list of visitors for this log file.
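If one log file covers more than a single day, you can restrict the count to one day by filtering on the timestamp first; a quick sketch, assuming the default Apache/Nginx time format and using 10/Oct/2023 purely as a placeholder date:

grep '10/Oct/2023' /var/log/access.log | awk '{ print $1 }' | sort | uniq -c | sort -nr | head -n 10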
Now we can dig into the log further with grep, to see what those IPs were actually doing on our website.
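For example, to see which URLs the busiest IP from the list above was requesting, we can grep its lines out of the log and count the request paths the same way; a sketch, assuming the default combined log format, where the request path is the 7th whitespace-separated field:

grep '^138\.201\.252\.71 ' /var/log/access.log | awk '{ print $7 }' | sort | uniq -c | sort -nr | head -n 10

The ^ anchor and the escaped dots make sure grep matches that exact IP at the start of the line, rather than any line that merely contains a similar sequence of digits.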