Rocksolid Light

import script from several formats

https://www.novabbs.com/devel/article-flat.php?id=411&group=rocksolid.programming#411

Path: i2pn2.org!.POSTED!not-for-mail
From: retro_...@novabbs.com (Retro Guy)
Newsgroups: rocksolid.programming
Subject: import script from several formats
Date: Sun, 7 Feb 2021 03:01:30 -0700
Organization: novaBBS
Message-ID: <20210207030130.3d1aec8d@desktop14.dt>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Info: i2pn2.org; posting-account="retrobbs1";
logging-data="13693"; mail-complaints-to="usenet@i2pn2.org"
X-Newsreader: Claws Mail 3.11.1 (GTK+ 2.24.25; x86_64-pc-linux-gnu)
 by: Retro Guy - Sun, 7 Feb 2021 10:01 UTC

I'm working on an import script for news archives into rslight.

The main reason was to make it easy to copy a newsgroup from one rslight
site to another without copying the entire spool: just copy the group's
.db3 file. Then I realized I have old .mbox files of some newsgroups
dating back to around 1998 and thought it would be fun to import those
messages as well.

I can easily pull the articles from .mbox, and I'll use the code from
spoolnews.php to plug them into rslight. The idea is I can just drop
a .mbox or rslight.db3 file into a directory and it will be
automatically imported.
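
A minimal sketch of the drop-directory idea, written in Python for
illustration only (rslight itself is PHP, and this is not the code from
spoolnews.php). The names find_pending, iter_mbox_articles and
KNOWN_TYPES are hypothetical; the .mbox reading relies on Python's
standard mailbox module.

```python
import mailbox
from pathlib import Path

# Hypothetical mapping from file extension to archive type; the real
# importer may recognise files differently.
KNOWN_TYPES = {".mbox": "mbox", ".db3": "rslight-db"}


def find_pending(dropdir):
    """Return (path, kind) pairs for importable files in the drop directory."""
    found = []
    for path in sorted(Path(dropdir).iterdir()):
        kind = KNOWN_TYPES.get(path.suffix.lower())
        if kind and path.is_file():
            found.append((path, kind))
    return found


def iter_mbox_articles(path):
    """Yield (message_id, newsgroups, raw_bytes) for each message in an mbox."""
    box = mailbox.mbox(str(path), create=False)
    try:
        for msg in box:
            yield msg.get("Message-ID"), msg.get("Newsgroups"), msg.as_bytes()
    finally:
        box.close()
```

Each yielded article could then be handed to whatever routine stores a
single article in the spool. How a .db3 file would be read is left out
here because its layout is not described in this thread.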

It shouldn't be difficult to get this done, so hopefully I can spend
some time next week and finish it up. I've been reading through some of
the old articles and it's interesting. I also need to make sure the
date format will work, or I'll convert the dates during import. Some
old news has dates like

Date: 1998/08/10

Just need to make sure these dates are read properly.
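
One way the date conversion could look, sketched in Python rather than
rslight's PHP. The function name normalize_date and the FALLBACK_FORMATS
tuple are hypothetical. Treating a date that carries no time or zone as
midnight UTC is an assumption, not something the old articles state.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Fallback layouts tried when the value is not a standard news date.
# Only this one comes from the example above; extend as needed.
FALLBACK_FORMATS = ("%Y/%m/%d",)


def normalize_date(value):
    """Return an RFC 5322 style date string, or None if nothing matches."""
    value = value.strip()
    parsed = None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        for fmt in FALLBACK_FORMATS:
            try:
                parsed = datetime.strptime(value, fmt)
                break
            except ValueError:
                continue
    if parsed is None:
        return None
    if parsed.tzinfo is None:
        # Assumption: treat dates without a zone as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return format_datetime(parsed)


print(normalize_date("1998/08/10"))
```

Dates that are already in the usual form pass through with their
original zone; anything unrecognised comes back as None so it can be
logged and checked by hand instead of being imported with a wrong date.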

Retro Guy

Re: import script from several formats

https://www.novabbs.com/devel/article-flat.php?id=412&group=rocksolid.programming#412

Path: i2pn2.org!.POSTED!not-for-mail
From: retro....@rocksolidbbs.com (Retro Guy)
Newsgroups: rocksolid.programming
Subject: Re: import script from several formats
Date: Tue, 9 Feb 2021 10:10:37 +0000
Organization: Rocksolid Light
Message-ID: <6402f5cbf9bd82f20a231e2f4f2a8111$1@news.novabbs.org>
References: <20210207030130.3d1aec8d@desktop14.dt>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org; posting-account="retrobbs1";
logging-data="3757"; mail-complaints-to="usenet@i2pn2.org"
User-Agent: Rocksolid Light (news.novabbs.com/getrslight)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on novabbs.org
X-Rslight-Site: $2y$10$PFqaDkuThg2r5c7P7Q0Tx.hydUumYC16BfDtQ6Q6AYLZYk.Lw/Wp2
 by: Retro Guy - Tue, 9 Feb 2021 10:10 UTC

Retro Guy wrote:

> It shouldn't be difficult to get this done, so hopefully I can spend
> some time next week and finish it up. I've been reading through some of
> the old articles and it's interesting. I also need to make sure the
> date format will work, or I'll convert the dates during import. Some

This seems to be going well. Conversion was not difficult.

I also have earlier articles (early 80s) which are fun to have, but the headers are severely lacking. Not sure I'll spend much time on those right now.

Threading is weak in earlier articles, but as long as I have an actual newsgroup name I can put it in the right place. I'll be working on this more over the next few weeks.

Retro Guy
--
Posted on Rocksolid Light
news.novabbs.org

Re: import script from several formats

https://www.novabbs.com/devel/article-flat.php?id=413&group=rocksolid.programming#413

Path: i2pn2.org!.POSTED!not-for-mail
From: retro....@rocksolidbbs.com (Retro Guy)
Newsgroups: rocksolid.programming
Subject: Re: import script from several formats
Date: Tue, 9 Feb 2021 10:28:14 +0000
Organization: Rocksolid Light
Message-ID: <fa2ad40a81c579200c90e825a22f30c6$1@news.novabbs.org>
References: <20210207030130.3d1aec8d@desktop14.dt> <6402f5cbf9bd82f20a231e2f4f2a8111$1@news.novabbs.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org; posting-account="retrobbs1";
logging-data="7435"; mail-complaints-to="usenet@i2pn2.org"
User-Agent: Rocksolid Light (news.novabbs.com/getrslight)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on novabbs.org
X-Rslight-Site: $2y$10$5qbM.Fpin.Vp1MGCUZ4MlO6WyPQza5IWinxb.QRnawI6i5oC/MB9m
 by: Retro Guy - Tue, 9 Feb 2021 10:28 UTC

Retro Guy wrote:

> I also have earlier articles (early 80s) which are fun to have, but the headers are severely lacking. Not sure I'll spend much time on those right now.

Here's an example of an early header and message. Quite different from current headers, but the info is there. Forget threading, lol:

Autzoo.101
test
utzoo!henry
Fri Feb 6 00:19:47 1981
first_test
This is the first U of T test of the Duke news program.
Here is some more text.
And some more.
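
A rough Python sketch of how an article in this early layout could be split into named fields. It assumes the first five lines are, in order, the letter A followed by the article ID, the newsgroup, the path, the date, and the title, which matches the example above; older articles that deviate from that would need their own handling. The function name parse_a_news and the Article-ID key are hypothetical, and a proper Message-ID would still have to be made up separately.

```python
def parse_a_news(text):
    """Split an early-format article into a header dict and a body string.

    Assumed layout (taken from the example above): 'A' + article ID,
    newsgroup, path, date, title, then the body.
    """
    lines = text.splitlines()
    if len(lines) < 5 or not lines[0].startswith("A"):
        raise ValueError("not an early-format article")
    headers = {
        "Article-ID": lines[0][1:],  # hypothetical field name
        "Newsgroups": lines[1],
        "Path": lines[2],
        "Date": lines[3],  # left as-is; convert separately
        "Subject": lines[4],
    }
    return headers, "\n".join(lines[5:])


sample = (
    "Autzoo.101\ntest\nutzoo!henry\nFri Feb 6 00:19:47 1981\nfirst_test\n"
    "This is the first U of T test of the Duke news program.\n"
    "Here is some more text.\nAnd some more.\n"
)
headers, body = parse_a_news(sample)
print(headers["Newsgroups"], headers["Subject"])
```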
--
Posted on Rocksolid Light
news.novabbs.org

Re: import script from several formats

https://www.novabbs.com/devel/article-flat.php?id=414&group=rocksolid.programming#414

Path: i2pn2.org!.POSTED!not-for-mail
From: retro....@rocksolidbbs.com (Retro Guy)
Newsgroups: rocksolid.programming
Subject: Re: import script from several formats
Date: Thu, 11 Feb 2021 07:43:48 +0000
Organization: Rocksolid Light
Message-ID: <c16e0801b24eface3fec0a359072ddd1$1@news.novabbs.org>
References: <20210207030130.3d1aec8d@desktop14.dt> <6402f5cbf9bd82f20a231e2f4f2a8111$1@news.novabbs.org> <fa2ad40a81c579200c90e825a22f30c6$1@news.novabbs.org>
Mime-Version: 1.0
Content-Type: multipart/mixed;boundary="------------6024e0307443f5.08382671"
Injection-Info: i2pn2.org; posting-account="retrobbs1";
logging-data="7881"; mail-complaints-to="usenet@i2pn2.org"
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on novabbs.org
X-Rslight-Site: $2y$10$eZgEyn4UAxM7wiEKXk2LheRympyGMf2FdE9IWsHULtgQhATttznWi
 by: Retro Guy - Thu, 11 Feb 2021 07:43 UTC
Attachments: archive2.png (image/png)

I've written a small script to modify headers as necessary, for example:

Title: is now Subject:
Posted: is now Date:
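
The renaming could be sketched like this in Python (an illustration, not the actual script). The names RENAMES and rename_headers are hypothetical; only the two header pairs above come from the post. The sketch touches the header block only, so a body line that happens to start with "Title:" is left alone.

```python
# Old header name (lower-cased) -> current header name.
# The two pairs are from the post; any others would be additions.
RENAMES = {"title": "Subject", "posted": "Date"}


def rename_headers(article):
    """Rename legacy header fields in the header block of a raw article."""
    head, sep, body = article.partition("\n\n")
    fixed = []
    for line in head.split("\n"):
        name, colon, rest = line.partition(":")
        # Continuation lines start with whitespace and are left alone.
        if colon and not line[:1].isspace():
            name = RENAMES.get(name.lower(), name)
        fixed.append(name + colon + rest)
    return "\n".join(fixed) + sep + body


raw = "Path: a!b\nTitle: hello\nPosted: 12 Apr 1988\n\nTitle: stays in body\n"
print(rename_headers(raw))
```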

This is simple to change. Attached is a very small example of my testing, from a very small number of articles. These are all dated in April 1988. They will thread better once I import more articles.

Retro Guy
