Re: Suppress exit status of system() command

Path: i2pn2.org!i2pn.org!aioe.org!goblin3!goblin.stu.neva.ru!panix!not-for-mail
From: fork...@panix.com (John Forkosh)
Newsgroups: comp.os.linux.misc
Subject: Re: Suppress exit status of system() command
Date: Sun, 2 May 2021 10:35:37 +0000 (UTC)
Organization: PANIX Public Access Internet and UNIX, NYC
Lines: 234
Message-ID: <s6lv9p$84i$1@reader1.panix.com>
References: <s6letp$hu1$1@reader1.panix.com> <s6lr9t$3kh$1@dont-email.me>
NNTP-Posting-Host: panix3.panix.com
X-Trace: reader1.panix.com 1619951737 8338 166.84.1.3 (2 May 2021 10:35:37 GMT)
X-Complaints-To: abuse@panix.com
NNTP-Posting-Date: Sun, 2 May 2021 10:35:37 +0000 (UTC)
User-Agent: tin/2.4.5-20201224 ("Glen Albyn") (NetBSD/9.0 (amd64))

The Natural Philosopher <tnp@invalid.invalid> wrote:
> John Forkosh wrote:
>> I have a little one-line awk/shell script like this...
>> awk '{split($0,ip,"- -"); print ip[1]}' | sort | uniq -c | sort -nr
>> which analyzes my website logs, producing output like this...
>> 672 62.210.98.10
>> 178 54.173.189.222
>> 116 23.100.232.233
>> 116 101.19.4.45
>> 88 151.38.64.253
>> etc
>> which is merely a count (on the left) of the number of times
>> each ip-address (on the right) has accessed my site, highest
>> count first.
>>
>> Problem with that is there are many accesses with the same
>> first two aaa.bbb.. and different ..ccc.ddd. So they're
>> treated separately, even though they most likely should
>> be aggregated. So I wrote a little C program, ipprefix.c
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> int main ( int argc, char *argv[] ) {
>> char ipstr[999]="\000", *delim=NULL;
>> int ndots=2, idot=0;
>> if ( argc > 1 ) strcpy(ipstr,argv[1]);
>> if ( (delim=strchr(ipstr,',')) != NULL ) {
>> *delim = '\000'; ndots=atoi(delim+1); }
>> for ( delim=ipstr-1,idot=1; idot<=ndots; idot++ ) {
>> delim = strchr(delim+1,'.');
>> if ( delim == NULL ) break; }
>> if ( delim != NULL ) *delim = '\000';
>> printf("%s",ipstr);
>> } /* --- end-of-job --- */
>> whose argv[1] input is aaa.bbb.ccc.ddd and whose output is aaa.bbb
>>
>> And then I modified the awk/shell script like this...
>> awk '{split($0,ip,"- -"); print (system("ipprefix "ip[1]))}' |
>> sort | uniq -c | sort -nr
>>
>> So the annoying little problem is that its output is now...
>> 1986 116.1790
>> 672 62.2100
>> 576 114.1190
>> 355 140.820
>> etc
>> I've checked, and the ip addresses are all correct except for
>> that extra "0" at the end of every line. So it's obviously, I think,
>> the exit status from system(), and I can just ignore it. But I'd
>> prefer to suppress system()'s exit status so there is no
>> extraneous "0" to begin with. Is there any way to do that?
>
> Honestly, if you have gone so far with C, why not finish the job?
> This is just one more example of 'oh here's a tool that looks like
> it will make the job easier, after I have fixed this issue, and
> that issue, and this other issue', ...and in the end you look and see
> that 'the short cut was longer, by far'

Thanks, Natural Philosopher, for your remarks and the program below.
One pragmatic reason I did it this way is because it's not yet clear
to me what data needs to be collected from the logs (maybe what tables
in a mysql kind of view), what variables sorted by, summed over, etc.
The ultimate problem that needs solving is some dos and ddos attacks
hitting some of my .cgi programs over-and-over-and-over-and...,
and just some "deny from"'s in .htaccess, and some other stuff,
hasn't sufficiently reduced the problem. So I've been looking at
the logs this-way and that-way and any-other-way, just looking for
ideas about what might work better. That is, who are the main
culprits, exactly what are they doing, etc.
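
One more throwaway of that kind, as a rough sketch only: counting hits per
address and per .cgi path. It assumes the logs are in Apache's common or
combined format, so that field 1 is the address and field 7 is the request
path; that layout is an assumption, and the name cgi_hits is made up for
illustration.

```shell
# Sketch: hits per (address, .cgi path), highest count first.
# Assumes common/combined log format: $1 = client address, $7 = request path.
cgi_hits() {
    awk '$7 ~ /\.cgi/ {
        sub(/\?.*/, "", $7)   # drop the query string so repeated hits group together
        print $1, $7
    }' "$@" | sort | uniq -c | sort -nr
}
```

Usage would be something like "cgi_hits access.log", or piping the log in
on stdin.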

So these scripts are just one-line throwaways to help that effort.
If I knew what I actually wanted/needed, then it might be worth
the effort to program it more elaborately. But lots of effort at
this point in the "development cycle" would be wasted; they're
just "prototypes" at best. So I can easily live with that
concatenated "0". I only asked because for some reason the
question interested me, not because I really needed the problem
solved. I figured there'd either be a really easy answer, or
a followup of the form "it can't be done", and either way would
be okay.
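
For what it's worth, the "0" is there because awk's system() returns the
command's exit status, while the command's own output goes straight to
stdout. So print (system(...)) emits ipprefix's text followed by awk's "0"
and a newline. One way around it, sketched here, is to read the helper's
output with getline and print only that. The function name
prefix_counts_cmd and the PREFIX_CMD override are hypothetical, added just
so the pipeline can be tried with another command; ipprefix is assumed to
be on the PATH.

```shell
# Sketch: capture the helper's output instead of printing system()'s return value.
# PREFIX_CMD is a hypothetical override; by default the helper is ipprefix.
prefix_counts_cmd() {
    awk -v helper="${PREFIX_CMD:-ipprefix}" '{
        split($0, ip, "- -")     # ip[1] is the address, as in the original script
        cmd = helper " " ip[1]
        prefix = ""
        cmd | getline prefix     # read the first line the helper writes
        close(cmd)               # so the same command can be run again later
        print prefix
    }' "$@" | sort | uniq -c | sort -nr
}
```

Usage would be something like "prefix_counts_cmd access.log". Note that
the address is pasted unquoted into a shell command line, exactly as in
the system() version, so this is only for logs you trust.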

Another, not-pragmatic reason, is because I gotta say that
I can't quite agree with your
" This is just one more example of 'oh here's a tool that looks
like it will make the job easier, after I have fixed this issue,
and that issue, and this other issue', ...and in the end you look
and see that 'the short cut was longer, by far' "
At least, in this kind of case I can't quite agree. All these
so-called "little languages" are meant for just this kind of
purpose I have here, and they work quite well in the problem domain
they were designed for, e.g., just munging through some data and
spitting out the result. Rather than your "the short cut was longer,
by far", I'd suggest "don't swat a fly with a sledgehammer".
I use C a lot, and love it, but don't think it's always the best
solution to address every problem. And note that in this particular
case, I hadn't even initially anticipated the need for that little
ipprefix program. Once I realized it was needed, the five minutes
it took to write it (way less time than writing these posts) was
quite a lot less than starting from scratch in C.
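
In the same little-language spirit, the aaa.bbb trimming can probably also
be done inside awk itself with a second split(), which avoids running a
helper once per log line. This is only a sketch; the name prefix_counts is
made up, and it hard-codes two octets rather than taking a count the way
ipprefix does.

```shell
# Sketch: trim each address to its first two octets inside awk itself.
prefix_counts() {
    awk '{
        split($0, ip, "- -")            # ip[1] is the address, as in the original script
        n = split(ip[1], octet, ".")    # octet[1..4] = aaa bbb ccc ddd
        if (n >= 2) print octet[1] "." octet[2]
        else        print ip[1]         # not a dotted address, pass it through
    }' "$@" | sort | uniq -c | sort -nr
}
```

Usage would be something like "prefix_counts access.log", or piping the
log in on stdin.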

> The way I did all this is as follows
>
> Firstly pipe apache logs to a program, in the apache conf file...
>
> CustomLog "|/usr/local/bin/ipcounter" iponly env=!loopback
>
>
> The program that accepts the pipe logs ip addresses and hits in a mysql
> database
>
> /*
> This takes IP addresses as input from a pipe from apache
> It waits till the stream of addresses changes, then logs the result to the
> sql table hits with the date of the last access in database gridwatch.
> */
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <unistd.h>
> #include <time.h>
> #include <mysql/mysql.h>
>
> #define DBUSER "user"
> #define DBPASS "password"
> #define DBNAME "database"
> #define DBTABLE "hits"
>
> //#define DEBUG
> #ifdef DEBUG
> #define LOGFILE "/var/log/apache2/ipcounter.log"
> FILE *logfp=0;
>
> void logwrite(char *data)
> {
> if(!logfp)
> logfp=fopen(LOGFILE,"w");
> while(*data)
> {
> fputc(*data++,logfp);
> }
> fputc('\n',logfp);
> fflush(logfp);
> }
> #endif
>
> MYSQL mysql;
> MYSQL_RES *result;
> MYSQL_ROW row;
>
> int get_line(char *buf)
> {
> int c; /* int, not char, so the comparison with EOF is reliable */
> char *p;
> int count;
> count=0;
> p=buf;
> while ((c = getchar()) !=EOF)
> {
> if(c=='\n')
> {
> *p++=0;
> return 0;
> }
> if(count>16)
> {
> *p=0;
> return (count);
> }
> *p++=c;
> count++;
> }
> return (EOF);
> }
> void update_database(char *ip, int count)
> {
> char query[1024];
> sprintf(query,"insert into hits set ip='%s', timestamp=now(), "
> "count='%d' ON DUPLICATE KEY UPDATE timestamp=now(), count=count+'%d'",
> ip, count, count );
> // check server still there...if not wait a second and try again...and again..
> while (mysql_ping(&mysql))
> {
> sleep (1);
> mysql_real_connect(&mysql,"127.0.0.1",DBUSER,DBPASS,DBNAME,0,"",0);
> }
> mysql_query(&mysql,query);
> }
> int main()
> {
> char last_ip[24];
> char ip[24];
> last_ip[0]=0; // initialise to null
> int count=1;
> // open database
> if(!mysql_init(&mysql)) // initialise data structure
> return 1;
> if(!mysql_real_connect(&mysql,"127.0.0.1",DBUSER,DBPASS,DBNAME,0,"",0))
> // connect to database
> {
> printf("Connect failed -%s\n",mysql_error(&mysql));
> return 2;
> }
> // read a line from stdin
> while (get_line(ip) !=EOF)
> {
> #ifdef DEBUG
> logwrite(ip);
> #endif
> // found new input line or just too many hits from one source
> if(strcmp(last_ip,ip) || count > 20)
> {
> if(*last_ip) // dont update with a null entry!
> update_database(last_ip,count);
> count=1; //reset counter
> strcpy(last_ip,ip);
> }
> else count++;
> }
> mysql_close(&mysql);
> }
>
> --------------------------------------------
> That allows me to record number of separate ip addresses that hit the site.
>
> And build nice graphs using a cron script
>
> e.g.
>
> https://gridwatch.templar.co.uk/admin/

--
John Forkosh ( mailto: j@f.com where j=john and f=forkosh )
