[slrn] experiment: can bayesian filtering score usenet posts?

<j2abr7F1sctU1@mid.individual.net>


https://www.novabbs.com/computers/article-flat.php?id=722&group=news.software.readers#722

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.szaf.org!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: tav...@gmail.com (Tavis Ormandy)
Newsgroups: news.software.readers
Subject: [slrn] experiment: can bayesian filtering score usenet posts?
Date: 20 Dec 2021 03:32:55 GMT
Lines: 31
Message-ID: <j2abr7F1sctU1@mid.individual.net>
X-Trace: individual.net nexR7PMjT1Fj2X+H0E+gUA/4d+VJERzkjKY/IpqmnJT95Li87a
Cancel-Lock: sha1:k6hzcpI3lOqkpUPLKmqCksmk5JI=
User-Agent: slrn/pre1.0.4-5 (Linux)
 by: Tavis Ormandy - Mon, 20 Dec 2021 03:32 UTC

The problem with training spam filters with NNTP is that the protocol is
designed around offering headers and bodies separately.

Sure, in theory you could just download everything at once, but then you
lose all the performance benefits of the protocol. If you could just
score on the XOVER headers, then you would still have all the protocol
benefits, but is that enough data?
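To make "just the XOVER headers" concrete, here is a minimal sketch of splitting one overview line into named fields. It assumes the standard tab-separated overview layout (article number, Subject, From, Date, Message-ID, References, byte count, line count, then optional extra headers). The function name and the sample article number and byte count are made up for illustration.

```python
from typing import Dict

# Standard overview field order; a server may append extra "Name: value" fields.
OVERVIEW_FIELDS = ["Number", "Subject", "From", "Date",
                   "Message-ID", "References", "Bytes", "Lines"]


def parse_overview_line(line: str) -> Dict[str, str]:
    """Split one tab-separated overview line into a dict of header fields."""
    parts = line.rstrip("\r\n").split("\t")
    if len(parts) < len(OVERVIEW_FIELDS):
        raise ValueError("overview line has too few fields")
    fields = dict(zip(OVERVIEW_FIELDS, parts))
    # Optional trailing fields carry their own header name, e.g. "Xref: ...".
    for extra in parts[len(OVERVIEW_FIELDS):]:
        name, sep, value = extra.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


# Sample line: the article number (1) and byte count (1234) are placeholders.
sample = "\t".join([
    "1",
    "[slrn] experiment: can bayesian filtering score usenet posts?",
    "tav...@gmail.com (Tavis Ormandy)",
    "20 Dec 2021 03:32:55 GMT",
    "<j2abr7F1sctU1@mid.individual.net>",
    "",
    "1234",
    "31",
])
fields = parse_overview_line(sample)
```

That is all a filter has to work with in this setup: no body text, only these few fields per article.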

I decided to try it, and the answer is that it works! *But* it took a lot
of training before it started to work.

I used bogofilter (https://bogofilter.sourceforge.io/) and wrote a macro
to pipe just the overview headers into it. It then auto-generates a
scorefile.
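The pipeline described above could be sketched roughly like this in Python. This is not the actual macro (that is written for slrn), just an illustration under some assumptions: bogofilter is on PATH and signals its verdict through its exit status (0 spam, 1 non-spam, 2 unsure), and the scorefile entry matches on Message-ID. All function names and the escaping rule for the scorefile pattern are hypothetical.

```python
import subprocess
from typing import Dict


def overview_to_message(fields: Dict[str, str]) -> str:
    """Render overview fields as a header block with an empty body."""
    order = ["Subject", "From", "Date", "Message-ID", "References", "Lines"]
    headers = [f"{name}: {fields[name]}" for name in order if fields.get(name)]
    return "\n".join(headers) + "\n\n"


def classify(message: str) -> str:
    """Pipe the header block into bogofilter and map its exit status.

    Assumes bogofilter is installed; exit status 0 = spam, 1 = non-spam,
    2 = unsure, anything else is treated as an error.
    """
    result = subprocess.run(["bogofilter"], input=message, text=True)
    return {0: "spam", 1: "ham", 2: "unsure"}.get(result.returncode, "error")


def escape_pattern(text: str) -> str:
    """Backslash-escape common regex metacharacters (hypothetical helper)."""
    specials = set("\\.*+?[]^$")
    return "".join("\\" + ch if ch in specials else ch for ch in text)


def scorefile_entry(group: str, message_id: str, score: int) -> str:
    """Build one slrn-style scorefile entry keyed on the Message-ID."""
    return (f"[{group}]\n"
            f"Score: {score}\n"
            f"Message-ID: {escape_pattern(message_id)}\n")
```

For example, `scorefile_entry("news.software.readers", "<j2abr7F1sctU1@mid.individual.net>", 100)` gives an entry that raises the score of that one article. The score values themselves are a free choice here.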

For the last few months, it has been really accurate at identifying the
messages I want to read and I've been finding it really useful. If
anyone else wants to try it out, here is the macro I used:

https://lock.cmpxchg8b.com/files/bogofilter.sl

The macro automatically learns from any articles you read when you leave
a group. If a message had a positive score, it learns it as good. If it
had a very low score, it learns it as bad.
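The learning rule can be sketched as a small decision function. The two thresholds below are assumptions, since the post does not say what counts as "very low". The sketch relies on bogofilter's `-n` (register as non-spam) and `-s` (register as spam) options, with the header text supplied on stdin.

```python
from typing import List, Optional

GOOD_ABOVE = 0      # assumption: any positive score counts as "good"
BAD_BELOW = -1000   # assumption: placeholder cutoff for "very low"


def training_label(score: int) -> Optional[str]:
    """Decide how a read article should be learned, if at all."""
    if score > GOOD_ABOVE:
        return "ham"
    if score < BAD_BELOW:
        return "spam"
    return None  # in-between scores are not used for training


def training_command(score: int) -> Optional[List[str]]:
    """Return the bogofilter command that registers the article, or None."""
    label = training_label(score)
    if label == "ham":
        return ["bogofilter", "-n"]  # register as non-spam
    if label == "spam":
        return ["bogofilter", "-s"]  # register as spam
    return None
```

Leaving the middle band untrained keeps articles with an ambiguous score from reinforcing themselves in either direction.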

Tavis.

--
_o) $ lynx lock.cmpxchg8b.com
/\\ _o) _o) $ finger taviso@sdf.org
_\_V _( ) _( ) @taviso

Re: [slrn] experiment: can bayesian filtering score usenet posts?

<e32a3e0040860161f9635b8f60ab5d79@www.novabbs.com>


https://www.novabbs.com/computers/article-flat.php?id=1957&group=news.software.readers#1957

Date: Wed, 21 Feb 2024 22:54:47 +0000
Subject: Re: [slrn] experiment: can bayesian filtering score usenet posts?
From: HenHa...@gmail.com (HenHanna)
Newsgroups: news.software.readers
X-Rslight-Site: $2y$10$UwjGlDjZbKy8I0bEpc5.uOBQ7HskS5kv7dnznw34J3apdkalf4s7q
X-Rslight-Posting-User: 5a1f1f09909a70d7ae18ae9af00e018f83ece577
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Rocksolid Light
References: <j2abr7F1sctU1@mid.individual.net>
Organization: novaBBS
Message-ID: <e32a3e0040860161f9635b8f60ab5d79@www.novabbs.com>
 by: HenHanna - Wed, 21 Feb 2024 22:54 UTC

Tavis.

--
_o) $ lynx lock.cmpxchg8b.com
/\\ _o) _o) $ finger taviso@sdf.org
_\_V _( ) _( ) @taviso

------------ i love your .SIG lines !!!
