cleanfeed help

<20220314210807.48ae7fec@wibble.sysadmininc.com>

https://www.novabbs.com/computers/article-flat.php?id=661&group=news.software.nntp#661

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.endofthelinebbs.com!.POSTED.47.186.38.89!not-for-mail
From: sys...@endofthelinebbs.com (Nigel Reed)
Newsgroups: news.software.nntp
Subject: cleanfeed help
Date: Mon, 14 Mar 2022 21:08:07 -0500
Organization: End Of The Line BBS
Message-ID: <20220314210807.48ae7fec@wibble.sysadmininc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Injection-Info: www.sysadmininc.com; posting-host="47.186.38.89";
logging-data="706466"; mail-complaints-to="usenet@www.sysadmininc.com"
X-Newsreader: Claws Mail 4.0.0git423 (GTK+ 3.24.20; x86_64-pc-linux-gnu)
 by: Nigel Reed - Tue, 15 Mar 2022 02:08 UTC

Hi all,

When I installed cleanfeed, I pretty much kept the defaults. While
poking around I decided to check the logfiles, and there seem to be a
lot of logged posts that look like they should be legitimate.

These for example:

From foo@bar Thu Jan 1 00:00:01 1970
INFO: EMP (md5)
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=ISO-8859-1
Date: Sun, 07 Nov 2021 01:26:06 +0100
From: Spijkeltje <rHBoJ0wNfNZAhDJubZ7mkbiQINQK0rXP-pCqT9Uo-pWuZdbsFP6LckFeQCHRojkgUV.hDJIkMWcbizrqeKjffaAd8yymWo7jTSKz9lOOhmXhh1HjimuWs-sDeqos9Jd9nmF4@spot.net>
Injection-Date: Sun, 07 Nov 2021 01:26:06 +0100
Injection-Info: reseller; mail-complaints-to="abuse@abavia.com"
Lines: 3
Message-ID: <82N4rKZ4Jk8KhaDYQhu08.0.VZ9YmLTp2GoHR2HYQ.BYsA@spot.net>
Newsgroups: free.usenet
Organization: www.abavia.com
Path: weretis.net!feeder6.news.weretis.net!feeder8.news.weretis.net!news.mixmin.net!feed.abavia.com!abe002.abavia.com!abp003.abavia.com!reseller!not-for-mail
References: <82N4rKZ4Jk8KhaDYQhu08@spot.net>
Subject: Re: Bandari - Music For Relaxation - Vol. 09
X-Newsreader: Spotnet 1.9.0.6
X-No-Archive: Yes
Xref: feeder6.news.weretis.net free.usenet:2926744
Thanks.^M
^M
..^M

From foo@bar Thu Jan 1 00:00:01 1970
INFO: EMP (md5)
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=ISO-8859-1
Date: Sun, 07 Nov 2021 01:26:50 +0100
From: Spijkeltje <rHBoJ0wNfNZAhDJubZ7mkbiQINQK0rXP-pCqT9Uo-pWuZdbsFP6LckFeQCHRojkgUV.jzapDQD-pZPbjojU321Cr-pJT7tyH2YQceiPjdQIb6eHKuKcKkjCQBXVHzmC9RkPg9@spot.net>
Injection-Date: Sun, 07 Nov 2021 01:26:50 +0100
Injection-Info: reseller; mail-complaints-to="abuse@abavia.com"
Lines: 3
Message-ID: <FXoIjoFB0pwx0ppYQ7iwe.0.B3mhtLf0A4wSR2HYQ.icoy@spot.net>
Newsgroups: free.usenet
Organization: www.abavia.com
Path: weretis.net!feeder8.news.weretis.net!news.mixmin.net!feed.abavia.com!abe002.abavia.com!abp003.abavia.com!reseller!not-for-mail
References: <FXoIjoFB0pwx0ppYQ7iwe@spot.net>
Subject: Re: Blasmusik aus Tirol - Musikkapellen aus Nord-, Ost- und Südtirol
X-Newsreader: Spotnet 1.9.0.6
X-No-Archive: Yes
Xref: feeder8.news.weretis.net free.usenet:535364
Thanks.^M
^M
..^M

I guess the MD5 check doesn't take into account that the poster is
actually responding to two different posts. How can I avoid blocking
things like this, other than by disabling the MD5 check?

Another thing: I'm seeing a lot of rejects due to "439 Subject (for)"
and I cannot for the life of me figure out what this is or why they're
blocked, other than their having "for" in the subject, which I imagine
is a pretty common occurrence.

--
End Of The Line BBS - Plano, TX
telnet endofthelinebbs.com 23

Re: cleanfeed help

<t12hhj$fk24$1@news.trigofacile.com>

https://www.novabbs.com/computers/article-flat.php?id=662&group=news.software.nntp#662

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.trigofacile.com!.POSTED.san13-h02-176-143-2-105.dsl.sta.abo.bbox.fr!not-for-mail
From: iul...@nom-de-mon-site.com.invalid (Julien ÉLIE)
Newsgroups: news.software.nntp
Subject: Re: cleanfeed help
Date: Fri, 18 Mar 2022 19:03:30 +0100
Organization: Groupes francophones par TrigoFACILE
Message-ID: <t12hhj$fk24$1@news.trigofacile.com>
References: <20220314210807.48ae7fec@wibble.sysadmininc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 18 Mar 2022 18:03:31 -0000 (UTC)
Injection-Info: news.trigofacile.com; posting-account="julien"; posting-host="san13-h02-176-143-2-105.dsl.sta.abo.bbox.fr:176.143.2.105";
logging-data="512068"; mail-complaints-to="abuse@trigofacile.com"
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0)
Gecko/20100101 Thunderbird/91.6.2
Cancel-Lock: sha1:xD1ZiSZPn2SF4nMauiYc2O8k0Mw= sha256:3DdoLUGsz0eNbFXKTKwm++k+2eOS/7bEWF2EZWC328I=
sha1:IpuePL6w7ZrzyMMgi69dio178/g= sha256:W7K2bPl8JNGhZoNpJ1DN/Q/Crs9wYYYtdGEWRuK/n2o=
In-Reply-To: <20220314210807.48ae7fec@wibble.sysadmininc.com>
 by: Julien ÉLIE - Fri, 18 Mar 2022 18:03 UTC

Hi Nigel,

> I guess the MD5 check doesn't take into account that the poster is
> actually responding to two different posts. How can I avoid blocking
> things like this, other than by disabling the MD5 check?
>
> References: <FXoIjoFB0pwx0ppYQ7iwe@spot.net>
> [...]
> Thanks.^M
The MD5 check should not have been performed on the 2 examples you gave.
In your Cleanfeed script, what's the value of:

md5_skips_followups => 1, # avoid MD5 check on articles with References?

The default (1) is to *not* perform MD5 checks in these cases.

I've checked how this parameter is used, and the code seems correct to
me (it should really not have rejected these articles).
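
To illustrate what this setting changes, here is a minimal Python sketch
(Cleanfeed itself is written in Perl, and this is not its code). The
function name is_md5_repeat, the "seen" set and the rule of flagging a
single repeat are assumptions made for illustration; the real filter
presumably counts repeats against a threshold before logging an EMP hit.

```python
import hashlib


def is_md5_repeat(headers, body, seen, md5_skips_followups=True):
    """Return True if this body's MD5 was already seen (hypothetical helper)."""
    # With the option enabled, articles carrying a References header
    # (follow-ups) are not hashed at all, so they can never be a repeat.
    if md5_skips_followups and headers.get("References"):
        return False
    digest = hashlib.md5(body.encode("utf-8")).hexdigest()
    repeat = digest in seen
    seen.add(digest)
    return repeat


# Two replies to different articles that share the same short body,
# modelled on the two logged examples.
body = "Thanks.\r\n\r\n"
first = {"References": "<82N4rKZ4Jk8KhaDYQhu08@spot.net>"}
second = {"References": "<FXoIjoFB0pwx0ppYQ7iwe@spot.net>"}

seen_off = set()  # md5_skips_followups => 0
off_results = [
    is_md5_repeat(first, body, seen_off, md5_skips_followups=False),
    is_md5_repeat(second, body, seen_off, md5_skips_followups=False),
]

seen_on = set()  # md5_skips_followups => 1
on_results = [
    is_md5_repeat(first, body, seen_on, md5_skips_followups=True),
    is_md5_repeat(second, body, seen_on, md5_skips_followups=True),
]

print(off_results)  # second reply is treated as a repeat of the first
print(on_results)   # neither reply is hashed
```

With the option off, the second "Thanks." reply hashes to the same value
as the first even though the two answer different articles; with the
option on, both are skipped.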

> Another thing: I'm seeing a lot of rejects due to "439 Subject (for)"
> and I cannot for the life of me figure out what this is or why they're
> blocked, other than their having "for" in the subject, which I imagine
> is a pretty common occurrence.

That's pretty strange.
What do you have in the "bad_subject" file? (this file is located in the
$config_dir directory set at the beginning of the Cleanfeed script)

The default only contains "simpbiz.software", which will reject all
articles containing that string in their Subject header field.


https://raw.githubusercontent.com/crooks/cleanfeed/master/samples/bad_subject

Seems like "for" appears uncommented in that file. You should
investigate its contents.

--
Julien ÉLIE

« Mathematical propositions are accepted as true because no one has
any interest in their being false. » (Montesquieu)

Re: cleanfeed help

<20220321015525.10407e93@wibble.sysadmininc.com>

https://www.novabbs.com/computers/article-flat.php?id=668&group=news.software.nntp#668

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.endofthelinebbs.com!.POSTED.47.186.38.89!not-for-mail
From: sys...@endofthelinebbs.com (Nigel Reed)
Newsgroups: news.software.nntp
Subject: Re: cleanfeed help
Date: Mon, 21 Mar 2022 01:55:25 -0500
Organization: End Of The Line BBS
Message-ID: <20220321015525.10407e93@wibble.sysadmininc.com>
References: <20220314210807.48ae7fec@wibble.sysadmininc.com>
<t12hhj$fk24$1@news.trigofacile.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Injection-Info: www.sysadmininc.com; posting-host="47.186.38.89";
logging-data="1039420"; mail-complaints-to="usenet@www.sysadmininc.com"
X-Newsreader: Claws Mail 4.0.0git423 (GTK+ 3.24.20; x86_64-pc-linux-gnu)
 by: Nigel Reed - Mon, 21 Mar 2022 06:55 UTC

On Fri, 18 Mar 2022 19:03:30 +0100
Julien ÉLIE <iulius@nom-de-mon-site.com.invalid> wrote:

> Hi Nigel,

> md5_skips_followups => 1, # avoid MD5 check on articles with
> References?
>
> The default (1) is to *not* perform MD5 checks in these cases.

OK. That is set to 0 here in my cleanfeed.local file. I don't recall
ever changing it but I'll go ahead and change it to 1.

>
>
> I've checked how this parameter is used, and the code seems correct
> to me (it should really not have rejected these articles).
>
>
>
> > Another thing: I'm seeing a lot of rejects due to "439 Subject
> > (for)" and I cannot for the life of me figure out what this is or
> > why they're blocked, other than their having "for" in the subject,
> > which I imagine is a pretty common occurrence.
>
> That's pretty strange.
> What do you have in the "bad_subject" file? (this file is located in
> the $config_dir directory set at the beginning of the Cleanfeed
> script)
>

Other than comments

simpbiz.software
Buy ketamine for depression

> Seems like "for" appears uncommented in that file. You should
> investigate its contents.

Ah, that is where the "for" is. I'm obviously using it incorrectly. I
thought PCREs would use a single space as a space but I guess I need to
use period (any character) or I guess \s instead.

--
End Of The Line BBS - Plano, TX
telnet endofthelinebbs.com 23

Re: cleanfeed help

<t1ai1l$kkik$1@news.trigofacile.com>

https://www.novabbs.com/computers/article-flat.php?id=672&group=news.software.nntp#672

Path: i2pn2.org!rocksolid2!i2pn.org!weretis.net!feeder8.news.weretis.net!news.trigofacile.com!.POSTED.san13-h02-176-143-2-105.dsl.sta.abo.bbox.fr!not-for-mail
From: iul...@nom-de-mon-site.com.invalid (Julien ÉLIE)
Newsgroups: news.software.nntp
Subject: Re: cleanfeed help
Date: Mon, 21 Mar 2022 20:01:08 +0100
Organization: Groupes francophones par TrigoFACILE
Message-ID: <t1ai1l$kkik$1@news.trigofacile.com>
References: <20220314210807.48ae7fec@wibble.sysadmininc.com>
<t12hhj$fk24$1@news.trigofacile.com>
<20220321015525.10407e93@wibble.sysadmininc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 21 Mar 2022 19:01:09 -0000 (UTC)
Injection-Info: news.trigofacile.com; posting-account="julien"; posting-host="san13-h02-176-143-2-105.dsl.sta.abo.bbox.fr:176.143.2.105";
logging-data="676436"; mail-complaints-to="abuse@trigofacile.com"
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0)
Gecko/20100101 Thunderbird/91.7.0
Cancel-Lock: sha1:9hKsjv/JAsp+AI3Z9m79qpC+enk= sha256:TPkG/RWDCD5ykCElYPmbvMkEh4Qw3hbbyCcc8sAPCkY=
sha1:b1fG0KWNpsFNjovvoO93dzR7pXQ= sha256:kV6rCWXez6zwe8+kK2/iKwdxZgaveGQcmqX6VIS22JA=
In-Reply-To: <20220321015525.10407e93@wibble.sysadmininc.com>
 by: Julien ÉLIE - Mon, 21 Mar 2022 19:01 UTC

Hi Nigel,

>> md5_skips_followups => 1, # avoid MD5 check on articles with
>> References?
>>
>> The default (1) is to *not* perform MD5 checks in these cases.
>
> OK. That is set to 0 here in my cleanfeed.local file. I don't recall
> ever changing it but I'll go ahead and change it to 1.

In fact, the cleanfeed.local sample sets it to 0:
https://github.com/crooks/cleanfeed/blob/master/cleanfeed.local.sample

Normally, you don't need to change the defaults (and therefore do not
need to have a cleanfeed.local).
Make sure all the other changes fit your needs (notably the special
filters for Google Groups posts, and the fact that all cancels are
blocked - even those using Cancel-Lock).

I've opened an issue to change the sample file (or at least comment out
the lines).

>> What do you have in the "bad_subject" file? (this file is located in
>> the $config_dir directory set at the beginning of the Cleanfeed
>> script)
>
> Other than comments
>
> simpbiz.software
> Buy ketamine for depression
>
>> Seems like "for" appears uncommented in that file. You should
>> investigate its contents.
>
> Ah, that is where the "for" is. I'm obviously using it incorrectly. I
> thought PCREs would use a single space as a space but I guess I need to
> use period (any character) or I guess \s instead.

I agree the documentation is not clear. Each line of the file is
expected to be a Perl regexp. "Buy ketamine for depression" would
normally be a valid one, but when parsing the file in read_file(),
Cleanfeed splits on whitespace, so the whole regexp becomes:
(simpbiz.software|Buy|ketamine|for|depression)
and therefore matches "for".

Though not tested, I believe using "Buy\sketamine\sfor\sdepression" will
work.
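
The effect can be reproduced outside Cleanfeed with a small Python
sketch. This is not the code of read_file(): build_pattern is a
hypothetical helper that only mimics the split-on-whitespace behaviour
described above, and skipping lines that start with "#" is an
assumption about how comments are handled.

```python
import re


def build_pattern(lines):
    """Join every whitespace-separated token into one alternation (sketch)."""
    tokens = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumed comment/blank-line handling
        tokens.extend(line.split())  # a space inside a line starts a new token
    return "(" + "|".join(tokens) + ")"


naive = build_pattern(["simpbiz.software", "Buy ketamine for depression"])
escaped = build_pattern(["simpbiz.software", r"Buy\sketamine\sfor\sdepression"])

subject = "Looking for a good newsreader"

print(naive)
print(bool(re.search(naive, subject)))    # the lone "for" alternative matches
print(bool(re.search(escaped, subject)))  # the \s version stays one alternative
print(bool(re.search(escaped, "Buy ketamine for depression")))
```

Because \s contains no literal whitespace, the line survives the split
as a single token, so only the full phrase is matched.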

--
Julien ÉLIE

« Tous les chemins mènent à rame… » (Mouléfix)
