
INN2: importing archival messages an threads

<tj3ko6$197dn$3@dont-email.me>

https://www.novabbs.com/computers/article-flat.php?id=1307&group=news.software.nntp#1307

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: Usernet....@seniejitrakai.net (ejs)
Newsgroups: news.software.nntp
Subject: INN2: importing archival messages an threads
Date: Sun, 23 Oct 2022 17:57:10 +0300
Organization: A noiseless patient Spider
Lines: 33
Message-ID: <tj3ko6$197dn$3@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 23 Oct 2022 14:57:10 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="65eedcb790eac74b7bbe075f21b7e981";
logging-data="1351095"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+ByWgW1HBUx+p/vtWdJu2G"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101
Thunderbird/107.0
Cancel-Lock: sha1:gN29koo3hMOePP2Fd9ESGGbJRwo=
Content-Language: lt
 by: ejs - Sun, 23 Oct 2022 14:57 UTC

Hi All,

We are running a local instance of a Usenet server.

Sometimes we get batches of historical messages and, in order not to
obstruct the active groups, we moved them to a different hierarchy.

The current problem is that the messages we have to process will
interfere with the messages on the server, as they cover the same time
period.

Could someone explain how the messages are stored and referenced on the
server? I can see the numerical ID, corresponding to the physical file
on the server, and the Message-ID, which appears from ...?

My current idea is to query the INN server for the Message-IDs in a
specific newsgroup and, if they are found, we have duplicates. If not, I
can feed the message and the entire thread to the server. The file name
and the history entry will be created by the server.
Am I right? Or maybe there are messages in different groups and I may
have a clash there?
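
A minimal sketch of the duplicate check described above, assuming the
Message-IDs already present on the server have been collected into a set
by some other means. The function names and sample articles are
hypothetical; only the Python standard library is used and nothing here
talks to INN. Note that the check is made against all known Message-IDs
rather than per newsgroup, since a Message-ID identifies an article
regardless of the group it was posted to.

```python
from email import message_from_string


def message_id_of(article_text):
    """Return the Message-ID header of a raw article, or None if absent."""
    msg = message_from_string(article_text)
    mid = msg.get("Message-ID")
    return mid.strip() if mid else None


def split_duplicates(articles, known_ids):
    """Partition raw articles into (to_feed, held_back) by Message-ID.

    known_ids is a set of Message-IDs assumed to exist on the server
    already, in any group.
    """
    seen = set(known_ids)
    to_feed, held_back = [], []
    for text in articles:
        mid = message_id_of(text)
        if mid is None or mid in seen:
            # Missing ID, or an ID already seen: do not feed this copy.
            held_back.append(text)
        else:
            seen.add(mid)
            to_feed.append(text)
    return to_feed, held_back
```

Articles repeated inside the archive itself are held back as well,
because each accepted Message-ID is added to the set as it goes.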

What we did until now was just to place the files into the spool and
recreate the history. But we were sure there would be no clashes in
either file names or Message-IDs; this was checked with vgrep and oh,
boy ...
It works for small batches and short threads, and I'm not sure it will
scale easily.

I have nearly 3M messages to feed into the database and I can
perform almost any adjustments on the fly.

--
ejs
news://news.rkm.lt

Re: INN2: importing archival messages an threads

<tj5igi$hsc$1@sirius.aeon.icebear.cloud>

https://www.novabbs.com/computers/article-flat.php?id=1308&group=news.software.nntp#1308

Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: h_hucke+...@newsmail.aeon.icebear.org (Henning Hucke)
Newsgroups: news.software.nntp
Subject: Re: INN2: importing archival messages an threads
Date: Mon, 24 Oct 2022 08:31:14 -0000 (UTC)
Organization: aeon: think longer than you thought before
Lines: 34
Distribution: world
Message-ID: <tj5igi$hsc$1@sirius.aeon.icebear.cloud>
References: <tj3ko6$197dn$3@dont-email.me>
Reply-To: Henning Hucke <h_hucke+news.reply@newsmail.aeon.icebear.org>
X-Trace: individual.net 94gz7x1Hj/t6EiA4EAXMjQg0rgIp2OqIhY26shjO9yBYcx1PI+
X-Orig-Path: news.aeon.icebear.cloud!news1.aeon.icebear.cloud!.POSTED.romulus.aeon.icebear.cloud!not-for-mail
Cancel-Lock: sha1:3gJM+lxRm7UpcK+sd2LmTK40d9g= sha1:s9IbxzdOss265GWBgNRfxl4ueE4=
Injection-Date: Mon, 24 Oct 2022 08:31:14 -0000 (UTC)
Injection-Info: sirius.aeon.icebear.cloud; posting-host="romulus.aeon.icebear.cloud:fd09:afca:b044:1:4ecc:6aff:fecf:5c8f";
logging-data="18316"; mail-complaints-to="abuse+news@aeon.icebear.cloud"
User-Agent: tin/2.4.1-20161224 ("Daill") (UNIX) (Linux/4.9.0-15-amd64 (x86_64))
 by: Henning Hucke - Mon, 24 Oct 2022 08:31 UTC

ejs <Usernet.eternal-september@seniejitrakai.net> wrote:
> Hi All,

Hi stranger,

> We are running a local instace of Usenet server.
> [...]

Honestly, this is a somewhat unstructured request for help, or at least
it has no structure I recognise.

You want to import historic/archived postings into a running inn
instance?
In which format are the postings you have available? One article per
file? Batch files containing multiple postings?
And do you want to know whether the method you use to import the
postings is viable, or do you want to know an appropriate way to import
the postings?
(I think the latter would be the sensible way.)

Be aware that article numbers are highly problematic since they may be -
and quite likely are - already in use on the actual server. They are also
only partially helpful if you use storage methods other than "tradspool".

Duplicate detection and the like are performed if you "feed" the
postings into inn. This is processor intensive but from my point of
view the most secure method to feed - historic as well as current -
postings into the message base.
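
To illustrate what "feeding" implies for duplicates: with the NNTP IHAVE
command (RFC 3977) the sender offers a Message-ID first, and the server
can decline an article it does not want before the body is sent. The
sketch below only classifies the standard IHAVE response codes; the
function name and the action labels are made up for illustration, and no
connection to a server is made.

```python
# Response codes defined for the IHAVE command in RFC 3977.
IHAVE_ACTIONS = {
    335: "send",      # server wants the article: send it now
    435: "skip",      # article not wanted (typically already known)
    436: "retry",     # transfer not possible or failed: try again later
    235: "done",      # article transferred OK
    437: "rejected",  # transfer rejected: do not retry
}


def ihave_action(response_line):
    """Map an NNTP response line such as '435 Duplicate' to an action label."""
    code = int(response_line.split(None, 1)[0])
    return IHAVE_ACTIONS.get(code, "unexpected")
```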

Best regards
Henning
--
How many bits would a BitBlit blit if a BitBlit could blit bits?
-- macanespie@waves.pas.ti.com in <1993Nov16.130625.1@waves.pas.ti.com>

Re: INN2: importing archival messages an threads

<tj5rle$1lkov$1@dont-email.me>

https://www.novabbs.com/computers/article-flat.php?id=1309&group=news.software.nntp#1309

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: Usernet....@seniejitrakai.net (ejs)
Newsgroups: news.software.nntp
Subject: Re: INN2: importing archival messages an threads
Date: Mon, 24 Oct 2022 14:07:26 +0300
Organization: A noiseless patient Spider
Lines: 39
Message-ID: <tj5rle$1lkov$1@dont-email.me>
References: <tj3ko6$197dn$3@dont-email.me>
<tj5igi$hsc$1@sirius.aeon.icebear.cloud>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 24 Oct 2022 11:07:26 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="11ccf27260e2334d2d2424355123162d";
logging-data="1757983"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19TMOZoclz4Nu4qguN6UBIl"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101
Thunderbird/107.0
Cancel-Lock: sha1:1KhPWBfjVmFgB8jJ8Pi5t+UUHFI=
In-Reply-To: <tj5igi$hsc$1@sirius.aeon.icebear.cloud>
Content-Language: lt
 by: ejs - Mon, 24 Oct 2022 11:07 UTC

On 2022-10-24 11:31, Henning Hucke wrote:
> You want to import historic/archived postings into a running inn
> instance?

Yes. I've done it in off-line mode, but that was for the new hierarchy.

> In which format are the postings you have available? One article per
> file? Batch files containing multiple postings?

I can either export them as one file per message or feed them using a
Python NNTP library.

> And do you want to know whether or not you use a viable method to import
> the postings or do you want to know a / the appropriate way to import the
> postings?
> (I think the latter would be the sensible way.)

I need to do it in the proper way.

> Be aware that article numbers are highly problematic since they may be -
> and quite likely are - already in use on the actual server. They are also
> only partially helpful if you use storage methods other than "tradspool".

Right now I assume there will be Message-ID duplicates.

> Duplicate detection and the like are performed if you "feed" the
> postings into inn. This is processor intensive but from my point of
> view the most secure method to feed - historic as well as current -
> postings into the message base.

So, for consistency, I could fetch all the headers, build a list of the
Message-IDs in use, and alter the Message-ID as well as the 'References:'
and 'In-Reply-To:' fields of the imported messages.
I need to have proper threading and no duplicate messages on the server.

--
ejs
news://news.rkm.lt

Re: INN2: importing archival messages an threads

<tjdnmm$p32$1@sirius.aeon.icebear.cloud>

https://www.novabbs.com/computers/article-flat.php?id=1310&group=news.software.nntp#1310

Path: i2pn2.org!i2pn.org!aioe.org!usenet.goja.nl.eu.org!weretis.net!feeder8.news.weretis.net!lilly.ping.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: h_hucke+...@newsmail.aeon.icebear.org (Henning Hucke)
Newsgroups: news.software.nntp
Subject: Re: INN2: importing archival messages an threads
Date: Thu, 27 Oct 2022 10:48:54 -0000 (UTC)
Organization: aeon: think longer than you thought before
Lines: 44
Distribution: world
Message-ID: <tjdnmm$p32$1@sirius.aeon.icebear.cloud>
References: <tj3ko6$197dn$3@dont-email.me> <tj5igi$hsc$1@sirius.aeon.icebear.cloud> <tj5rle$1lkov$1@dont-email.me>
Reply-To: Henning Hucke <h_hucke+news.reply@newsmail.aeon.icebear.org>
X-Trace: individual.net 4rL+uLFZosByBrALmth48AKzJhncV8P6OY1dLAEO0BkMBlFBVC
X-Orig-Path: news.aeon.icebear.cloud!news1.aeon.icebear.cloud!.POSTED.romulus.aeon.icebear.cloud!not-for-mail
Cancel-Lock: sha1:bNJTLew6mz6o+U+XZ6tB+RRQhg8= sha1:z7Oe3N73kFj478b15a4MUpH+iDs=
Injection-Date: Thu, 27 Oct 2022 10:48:54 -0000 (UTC)
Injection-Info: sirius.aeon.icebear.cloud; posting-host="romulus.aeon.icebear.cloud:fd09:afca:b044:1:4ecc:6aff:fecf:5c8f";
logging-data="25698"; mail-complaints-to="abuse+news@aeon.icebear.cloud"
User-Agent: tin/2.4.1-20161224 ("Daill") (UNIX) (Linux/4.9.0-15-amd64 (x86_64))
 by: Henning Hucke - Thu, 27 Oct 2022 10:48 UTC

ejs <Usernet.eternal-september@seniejitrakai.net> wrote:
> [...]
> So, for consistency, I could fetch all the headers, build a list of the
> Message-IDs in use, and alter the Message-ID as well as the 'References:'
> and 'In-Reply-To:' fields of the imported messages.
> I need to have proper threading and no duplicate messages on the server.

Either I still misunderstand what you want to achieve or
you yourself misunderstand how INN and NNTP work...

Posting headers or extracted Message-IDs aren't helpful if you want to
write historic postings into an INN message base, since you want to write
the articles in whole and you will very seldom have duplicate postings
(otherwise this would mean that you already have a lot of these
"historic" postings in your message base).

INN already does duplicate detection by itself. That's nothing you need
to do. And you also need no separate Python-implemented NNTP feeder
since such tools already exist.
Maybe it's a good idea to use two computers so that the INN-powered news
server can do its work while the other computer manages the IO load
of grabbing and posting all the single posting files.

It's possibly also not a good idea to use the tradspool storage, and
so writing the postings directly into the storage would not be a
good idea either.

The more relevant stuff with this task is possibly the INN
configuration. To be able to write the historic postings into the message
base you need to adapt the "artcutoff" (and some other) settings and the
"expire.ctl" if you use a storage method affected by expire.
After having imported the historic postings you should certainly reset
the "artcutoff" setting (and some others).
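
The temporary change to "artcutoff" and its later reset can be scripted.
The following is only a sketch: it rewrites a "key: value" line in the
text of an inn.conf-style file, and the helper name is hypothetical. The
value 0 in the usage lines is an assumption; check the inn.conf manual
page of your INN version for the value that relaxes the age check, and
for the other settings that may need the same treatment.

```python
import re


def set_conf_value(conf_text, key, value):
    """Return conf_text with 'key: value' set, appending the line if missing."""
    pattern = re.compile(r"^([ \t]*)" + re.escape(key) + r"[ \t]*:.*$",
                         re.MULTILINE)
    new_line = f"{key}: {value}"
    if pattern.search(conf_text):
        # Replace the first existing setting, keeping its indentation.
        return pattern.sub(lambda m: m.group(1) + new_line, conf_text, count=1)
    # Setting absent: append it, making sure the text ends with a newline.
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + new_line + "\n"


original = "organization: Example\nartcutoff: 10\n"
relaxed = set_conf_value(original, "artcutoff", "0")    # before the import
restored = set_conf_value(relaxed, "artcutoff", "10")   # after the import
```

Working on the text rather than the file in place makes it easy to keep
the original around and compare before writing anything back.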

It would possibly be helpful if you simply described where /you/ see
problems in writing the postings via NNTP (tools) and /why/ you want to
write directly into a tradspool storage.

Regards
Henning
--
Can't open /usr/fortunes. Lid stuck on cookie jar.

Re: INN2: importing archival messages an threads

<8735b9p8un.fsf@hope.eyrie.org>

https://www.novabbs.com/computers/article-flat.php?id=1311&group=news.software.nntp#1311

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.killfile.org!news.eyrie.org!.POSTED!not-for-mail
From: eag...@eyrie.org (Russ Allbery)
Newsgroups: news.software.nntp
Subject: Re: INN2: importing archival messages an threads
Date: Thu, 27 Oct 2022 08:46:56 -0700
Organization: The Eyrie
Message-ID: <8735b9p8un.fsf@hope.eyrie.org>
References: <tj3ko6$197dn$3@dont-email.me>
<tj5igi$hsc$1@sirius.aeon.icebear.cloud>
<tj5rle$1lkov$1@dont-email.me>
<tjdnmm$p32$1@sirius.aeon.icebear.cloud>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: hope.eyrie.org;
logging-data="26537"; mail-complaints-to="news@eyrie.org"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
Cancel-Lock: sha1:rEL0OaHKpMVBlV06EwDnsT1V5+k=
 by: Russ Allbery - Thu, 27 Oct 2022 15:46 UTC

Henning Hucke <h_hucke+spam.news@newsmail.aeon.icebear.org> writes:

> INN does duplicate detection already by itself. That's nothing you need
> to do. And you also need no separate python implemented NNTP feeder
> since such tools already exist.

The big problem with injecting old posts is that INN uses strictly
increasing article numbers, so they'll get larger article numbers than
existing (more recent) posts.

How this will look to the user will vary by newsreader, but in general all
the historic posts will show up as new, and they may or may not be sorted
correctly when people view the group depending on whether the newsreader
sorts by article date or by article number. Sorting by article number is
quite common.

If you want this to look as if all the articles had arrived in normal
order, unfortunately you (speaking to the original poster here) have to do
major surgery. You'll have to assemble an article tree, probably with
manually assigned article numbers, that has all the articles you want
numbered in the right order. I think you'll have to use tradspool and
tradindexed overview, and then use the tdx-util program from tradindexed
to rebuild overview for that group. You'll also have to inject the
articles into history, probably by rebuilding history.
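
A rough sketch of the ordering step only, under stated assumptions: the
articles of one group are available as raw text, every article has a
parseable Date header, and numbering starts at 1. The function names are
hypothetical, and the path helper merely reflects the tradspool
convention of one directory per newsgroup component with the article
number as the file name. Rebuilding overview and history is not covered.

```python
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def posting_time(article_text):
    """Return the article's Date header as a POSIX timestamp."""
    msg = message_from_string(article_text)
    when = parsedate_to_datetime(msg["Date"])
    if when.tzinfo is None:
        # A "-0000" zone yields a naive datetime; treat it as UTC.
        when = when.replace(tzinfo=timezone.utc)
    return when.timestamp()


def number_by_date(articles, first_number=1):
    """Return (article_number, article_text) pairs in posting-date order."""
    ordered = sorted(articles, key=posting_time)
    return list(enumerate(ordered, start=first_number))


def tradspool_path(group, number):
    """Relative tradspool-style path, e.g. news/software/nntp/42."""
    return group.replace(".", "/") + "/" + str(number)
```

Sorting on the parsed timestamp rather than on the header string matters
here, because the Date headers come in different time zones.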

This is unfortunately not going to be easy to do and is going to be
disruptive for any existing readers of the group on that server (because
you'll end up renumbering the articles in that group). INN doesn't
provide any tools out of the box for doing this, although I have done
things like this before (many years ago) manually.

You may find it easier to set up a second INN server, assemble a list of
all the articles you want on that server in correct date sorted order, and
then feed all the articles to that server in that order using innxmit or
some similar tool. This will also require some work to assemble all the
pieces and build the batch file pointing to all the article files, so it
will take some manual experimentation. (Since I haven't done that
experimentation in over ten years, I unfortunately can't give you
step-by-step instructions.) But it may mean less fiddling than manually
assembling a tradspool structure and rebuilding history and overview. Or
it may not! I'm not sure which is easier.

--
Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>

Please post questions rather than mailing me directly.
<https://www.eyrie.org/~eagle/faqs/questions.html> explains why.

Re: INN2: importing archival messages an threads

<tjeerr$2ih6$1@nnrp.usenet.blueworldhosting.com>

https://www.novabbs.com/computers/article-flat.php?id=1312&group=news.software.nntp#1312

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!nnrp.usenet.blueworldhosting.com!.POSTED!not-for-mail
From: jesse.re...@blueworldhosting.com (Jesse Rehmer)
Newsgroups: news.software.nntp
Subject: Re: INN2: importing archival messages an threads
Date: Thu, 27 Oct 2022 17:24:12 -0000 (UTC)
Organization: BlueWorld Hosting Usenet (https://usenet.blueworldhosting.com)
Message-ID: <tjeerr$2ih6$1@nnrp.usenet.blueworldhosting.com>
References: <tj3ko6$197dn$3@dont-email.me> <tj5igi$hsc$1@sirius.aeon.icebear.cloud> <tj5rle$1lkov$1@dont-email.me> <tjdnmm$p32$1@sirius.aeon.icebear.cloud> <8735b9p8un.fsf@hope.eyrie.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=fixed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 27 Oct 2022 17:24:12 -0000 (UTC)
Injection-Info: nnrp.usenet.blueworldhosting.com;
logging-data="84518"; mail-complaints-to="usenet@blueworldhosting.com"
User-Agent: Usenapp for MacOS
Cancel-Lock: sha1:NNtC39fJzRKWIEk4+NQQb63U0IQ= sha256:tSJ165z8XL2vDrQcQnO8DKNIwzazrKmrtcRpSSLwOBY=
sha1:sU5N4ilkpS50GTXajE0vNOEHZYc= sha256:V+DZWRDIgTOcQwx8ihFvD+1uJVqQka5MtquJcQ63xFc=
X-Usenapp: v1.23/d - Full License
 by: Jesse Rehmer - Thu, 27 Oct 2022 17:24 UTC

On Oct 27, 2022 at 10:46:56 AM CDT, "Russ Allbery" <eagle@eyrie.org> wrote:

> You may find it easier to set up a second INN server, assemble a list of
> all the articles you want on that server in correct date sorted order, and
> then feed all the articles to that server in that order using innxmit or
> some similar tool. This will also require some work to assemble all the
> pieces and build the batch file pointing to all the article files, so it
> will take some manual experimentation. (Since I haven't done that
> experimentation in over ten years, I unfortunately can't give you
> step-by-step instructions.) But it may mean less fiddling than manually
> assembling a tradspool structure and rebuilding history and overview. Or
> it may not! I'm not sure which is easier.

This is the way to go - there was a thread I started some months back where
Julien helped provide the syntax necessary to generate the list of messages
sorted by posting date, which you can then transmit to another server. This is
the path of least resistance to get a large number of articles in a sane order
without several large operations in place.

Re: INN2: importing archival messages an threads

<tjehq3$8t72$1@news.trigofacile.com>

https://www.novabbs.com/computers/article-flat.php?id=1313&group=news.software.nntp#1313

Path: i2pn2.org!i2pn.org!news.nntp4.net!news.gegeweb.eu!gegeweb.org!news.trigofacile.com!.POSTED.176-143-2-105.abo.bbox.fr!not-for-mail
From: iul...@nom-de-mon-site.com.invalid (Julien ÉLIE)
Newsgroups: news.software.nntp
Subject: Re: INN2: importing archival messages an threads
Date: Thu, 27 Oct 2022 20:14:27 +0200
Organization: Groupes francophones par TrigoFACILE
Message-ID: <tjehq3$8t72$1@news.trigofacile.com>
References: <tj3ko6$197dn$3@dont-email.me>
<tj5igi$hsc$1@sirius.aeon.icebear.cloud> <tj5rle$1lkov$1@dont-email.me>
<tjdnmm$p32$1@sirius.aeon.icebear.cloud> <8735b9p8un.fsf@hope.eyrie.org>
<tjeerr$2ih6$1@nnrp.usenet.blueworldhosting.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 27 Oct 2022 18:14:27 -0000 (UTC)
Injection-Info: news.trigofacile.com; posting-account="julien"; posting-host="176-143-2-105.abo.bbox.fr:176.143.2.105";
logging-data="292066"; mail-complaints-to="abuse@trigofacile.com"
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0)
Gecko/20100101 Thunderbird/102.4.0
Cancel-Lock: sha1:U0jTDsWjKq+s8XjwtYzJkS+IDeA= sha256:mGZ5cFmGKcIh9l4NNv16AnFSJl2WYbgygOjIbaKSCvw=
sha1:vDjanlB7u5fLsi+bbg069KmUCJw= sha256:RhUYmvx+4WCiQbfyZTsUN4cAC6Mbz/o6/Ry2hf4Sowg=
In-Reply-To: <tjeerr$2ih6$1@nnrp.usenet.blueworldhosting.com>
 by: Julien ÉLIE - Thu, 27 Oct 2022 18:14 UTC

Hi Jesse,

>> You may find it easier to set up a second INN server, assemble a list of
>> all the articles you want on that server in correct date sorted order, and
>> then feed all the articles to that server in that order using innxmit or
>> some similar tool. This will also require some work to assemble all the
>> pieces and build the batch file pointing to all the article files, so it
>> will take some manual experimentation. (Since I haven't done that
>> experimentation in over ten years, I unfortunately can't give you
>> step-by-step instructions.) But it may mean less fiddling than manually
>> assembling a tradspool structure and rebuilding history and overview. Or
>> it may not! I'm not sure which is easier.
>
> This is the way to go - there was a thread I started some months back where
> Julien helped provide the syntax necessary to generate the list of messages
> sorted by posting date which you can then transmit to another server.

Yup, and I added that information in the FAQ as I thought it may be
useful to other people :-)

https://www.eyrie.org/~eagle/faqs/inn.html#S6.4

"""
[generating the "<pathoutgoing>/list" file]

The result file contains tokens ordered by arrival time on the old
server (which is usually roughly the same as the posting time). In case
the history file was not populated chronologically, it is better to sort
it by posting time so that articles are fed in the right order. This
can be achieved with the following command:

sort -t '~' -k3n < history > history.sorted

And then, consider history.sorted instead of history for the next steps.
"""

--
Julien ÉLIE

« 21.1.1 How to convert mSQL tools for MySQL?
1. Run the shell script msql2mysql on the source. This requires the
replace program, which is distributed with MySQL.
2. Compile.
3. Fix all compiler errors. » (MySQL online manual)
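
For anyone who prefers to do the ordering in a script, the sort command
quoted from the FAQ above can be mirrored in a few lines of Python. This
sketch splits each line on '~' and orders by the leading number of the
third field; it assumes nothing about the history format beyond what
that command implies, and the function names are hypothetical.

```python
import re


def third_field_number(line):
    """Leading integer of the third '~'-separated field, or 0 if there is none."""
    fields = line.split("~")
    if len(fields) < 3:
        return 0
    match = re.match(r"\s*(\d+)", fields[2])
    return int(match.group(1)) if match else 0


def sort_history(lines):
    """Order lines by the number in the third '~' field, as `sort -k3n` does.

    Lines with equal keys keep their input order.
    """
    return sorted(lines, key=third_field_number)
```

Lines whose third field does not start with a number sort first, with a
key of 0, which is also how a numeric sort treats them.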
