word count

<97d97016-3b36-4215-8afb-909e1fef3613n@googlegroups.com>

 by: Hugh Aguilar - Sat, 1 Jan 2022 19:38 UTC

Recently code for a word-count program was posted in Python
here on comp.lang.forth.
https://groups.google.com/g/comp.lang.forth/c/jw6yU_yI15E
This was said:
On Thursday, December 30, 2021 at 1:12:23 AM UTC-7, dxforth wrote:
> On 30/12/2021 14:59, Hugh Aguilar wrote:
> > On Tuesday, December 28, 2021 at 8:07:01 PM UTC-7, dxforth wrote:
> >> On 29/12/2021 01:44, Andy Valencia wrote:
> >> > ...
> >> > I could write this in Forth, but I don't want to. It would be tedious
> >> > and take a LOT more time than these four lines of Python.
> >
> > Andy Valencia doesn't want to do this program in Forth because he doesn't know
> > Forth very well and hence even simple programs seem tedious and difficult to him.
> > A program such as this is trivial given my novice-package which is ANS-Forth.
> > ASSOCIATION.4TH can be used to collect all of the distinct words. Each node's key
> > would be the string and it would have one datum tacked on which would be the count.
> > STRING-STACK.4TH can be used to parse out the words from the text.
> > STRING-STACK.4TH has pretty good pattern-matching and sub-string extraction
> > capability, so it could use a much more sophisticated definition of what a word is
> > than just a blank-delimited string --- no word-count program has been so crude as to
> > use blank-delimited strings since the early 1970s --- that is kindergarten-level programming.
> >
> >> "We choose to go to the Moon [...] and do the other things, not because
> >> they are easy, but because they are hard; because that goal will serve
> >> to organize and measure the best of our energies and skills"
> >>
> >> The de-skilling of people that has occurred over the last 40 years because
> >> subsequent govts chose the easy way, has made us all dumb and vulnerable.
> >
> > DXforth says this as if he had skill --- but he doesn't --- he is all talk and no programming
> > beyond kindergarten-level programming such as his END macro.
> >
> END - easy to remember, easy to use and no shortage of ELSE to eliminate.
> It doesn't get better than that. It'll do me as an epitaph.

I was patient with DXforth (the idiot previously known as HAA) and his
kindergarten-level Forth programming. I even included his much-hyped
END macro in the novice-package:
-----------------------------------------------------------------------------------------------
\ END was HAA's idea:
\ https://groups.google.com/forum/#!topic/comp.lang.forth/MdGWDMEbKIA
macro: end ( -- )
exit then ;

macro: ?exit ( -- ) \ like END but even less useful
if exit then ;
-----------------------------------------------------------------------------------------------

Of course, DXforth doesn't have MACRO: because that kind of programming
is way beyond his skill-level, so he would have to write END like this:
-----------------------------------------------------------------------------------------------
: end ( -- ) postpone exit postpone then ; immediate
-----------------------------------------------------------------------------------------------

I am not patient with DXforth anymore because he attacked me by
using the term "disambiguifier" for some idiotic nonsense that he was
blathering about. He was mocking me and my disambiguifiers (that are
required for MACRO: to work) --- this insult will not be forgiven.

Anyway, getting back to the subject of a word-count program,
this is trivial for me to write given the novice package. I post this challenge
now though, that the Forth experts of comp.lang.forth should attempt this.

The first part of the challenge is to parse distinct "words" out of the
text stream. This is the hard part. Counting distinct words is trivial
given ASSOCIATION.4TH or something similar.
Here is my definition of a "word" (I just thought this up off the top
of my head; the definition may need some revision).
Punctuation is defined as one of: . , : ; ! ? ( )
Punctuation delimits words, as does whitespace.
The ' is an apostrophe. The 's is removed and only the prefix is used.
Any word with an @ in it is assumed to be an email address and is
left as is (not broken apart on the dot character).
Numbers are left as is (not broken apart on the dot character).
I'll worry about the hyphen later --- what I have above is enough for now.
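As a rough illustration of two of the rules above (the @ rule and the 's rule), here is a minimal sketch in standard Forth. The word names HAS-@? and STRIP-'S are invented for this example; they are not part of the novice package or STRING-STACK.4TH, and only a lowercase 's is handled.

```forth
\ Hypothetical helper words for illustration; not from the novice package.

\ True if the string contains an @, in which case the rule above
\ treats it as an email address and leaves it intact.
: has-@? ( c-addr u -- flag )
   over + swap ?do
      i c@ [char] @ = if  true unloop exit  then
   loop  false ;

\ Drop a trailing 's so that only the prefix is counted.
: strip-'s ( c-addr u -- c-addr u' )
   dup 2 < if  exit  then          \ too short to end in 's
   2dup + 2 -                      \ address of the second-to-last char
   dup c@ [char] ' =               \ is it an apostrophe?
   swap char+ c@ [char] s = and    \ and is the last char an s?
   if  2 -  then ;
```

For instance, `s" moon's" strip-'s type` should display `moon`, while a string without the suffix is returned unchanged.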

Testing chars for punctuation can be done by DXforth using his END
macro. He would have a colon word with a series of IF ... END statements
testing for each character. This is not how I will do it though
(I have my <SWITCH ... FAST-SWITCH> construct).
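The two approaches to the punctuation test can be sketched in standard Forth as follows. This is only an illustration under stated assumptions: all word names are invented, the chain uses plain `exit then` (which is what the END macro expands to), and a byte table stands in for the <SWITCH ... FAST-SWITCH> construct, whose interface is not shown in this thread. Treating every character code up to BL as whitespace is also an assumption.

```forth
\ Hypothetical names throughout; a sketch, not the novice-package code.

\ 1. Chain of tests, each exiting early with a true flag.
: punct-chain? ( char -- flag )
   dup [char] . = if  drop true exit  then
   dup [char] , = if  drop true exit  then
   dup [char] : = if  drop true exit  then
   dup [char] ; = if  drop true exit  then
   dup [char] ! = if  drop true exit  then
   dup [char] ? = if  drop true exit  then
   dup [char] ( = if  drop true exit  then
   dup [char] ) = if  drop true exit  then
   drop false ;

\ 2. Table lookup: one byte per character code, non-zero marks punctuation.
create punct-table  256 chars allot
punct-table 256 chars erase

: mark ( char -- )  chars punct-table +  1 swap c! ;

char . mark  char , mark  char : mark  char ; mark
char ! mark  char ? mark  char ( mark  char ) mark

: punct? ( char -- flag )  255 and chars punct-table + c@ 0<> ;

\ Punctuation delimits words, as does whitespace.
: delimiter? ( char -- flag )
   dup bl 1+ < if  drop true exit  then   \ blank or control character
   punct? ;
```

The table costs 256 bytes but answers in constant time regardless of which character is tested, whereas the chain performs up to eight comparisons for a character that is not punctuation.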

So, comp.lang.forth experts --- prove that you haven't been de-skilled!

Re: word count

<sqr14c$evk$1@gioia.aioe.org>

 by: dxforth - Sun, 2 Jan 2022 02:02 UTC

On 2/01/2022 06:38, Hugh Aguilar wrote:
>
> I am not patient with DXforth anymore because he attacked me by
> using the term "disambiguifier" for some idiotic nonsense that he was
> blathering about. He was mocking me and my disambiguifiers (that are
> required for MACRO: to work) --- this insult will not be forgiven.
>
> Anyway, getting back to the subject of a word-count program,
> this is trivial for me to write given the novice package. I post this challenge
> now though, that the Forth experts of comp.lang.forth should attempt this.

I'd expect a Forth Grand Master to be writing applications and compilers.
Issuing duels and peddling novice packs brings them down to my level :)

Re: word count

<a40d4844-f820-4105-b9d7-91a269397a53n@googlegroups.com>

 by: NN - Sun, 2 Jan 2022 14:16 UTC

On Saturday, 1 January 2022 at 19:38:13 UTC, Hugh Aguilar wrote:
> [...]
> So, comp.lang.forth experts --- prove that you haven't been de-skilled!

What do you intend to use for the text stream ?

/usr/share/dict/words ?

Re: word count

<1841bfac-6f75-4944-bfc2-ab99c51212b2n@googlegroups.com>

 by: Hugh Aguilar - Mon, 3 Jan 2022 02:03 UTC

On Sunday, January 2, 2022 at 7:16:34 AM UTC-7, NN wrote:
> On Saturday, 1 January 2022 at 19:38:13 UTC, Hugh Aguilar wrote:
> > [...]
> > So, comp.lang.forth experts --- prove that you haven't been de-skilled!
> What do you intend to use for the text stream ?
>
> /usr/share/dict/words ?

Any English-language text file should be fine, as that would have punctuation in it.
I don't care.

I would like to see anybody on comp.lang.forth write any Forth --- a "Hello World" program
would be a step up from this endless debating about recognizers or other nonsense.
A word-count program is a super-easy challenge. I remember telling Gavino and
Elizabeth Rather to write some Forth code, but neither of them ever did.
Now we have DXforth endlessly whining about how any kind of language standard
or code-library puts his awesome creativity in a box, although he can't write even the
most simple programs --- he just wants to drag everybody down to his level, with his
endless stream of snarky comments and put-downs.
Stephen Pelc says that my disambiguifiers don't work, although he doesn't have either
SYNONYM or an early-binding MACRO: which both depend upon the disambiguifiers.
Stephen Pelc says that anybody can write a better string-stack than what I have,
although he has nothing to show, and this has been about 30 years now.

A lack of programming is why nobody considers Forth "programmers" to be programmers.
Write some Forth code! Anything!

Re: word count

<sqtu1u$fmm$1@gioia.aioe.org>

 by: dxforth - Mon, 3 Jan 2022 04:28 UTC

On 3/01/2022 13:03, Hugh Aguilar wrote:
>
> Now we have DXforth endlessly whining about how any kind of language standard
> or code-library puts his awesome creativity in a box, although he can't write even the
> most simple programs --- he just wants to drag everybody down to his level, with his
> endless stream of snarky comments and put-downs.

So post your word count program without condition and full source so it can be
tested. Let folks decide for themselves if that's a system they can use. Let
them do better if they can.

Re: word count

<squtjd$ns5$1@dont-email.me>

 by: Stephen Pelc - Mon, 3 Jan 2022 13:27 UTC

On 3 Jan 2022 at 03:03:03 CET, Mr Angry, "Hugh Aguilar"
<hughaguilar96@gmail.com> wrote:

> Stephen Pelc says that my disambiguifiers don't work, although he doesn't have either
> SYNONYM or an early-binding MACRO: which both depend upon the disambiguifiers.
> Stephen Pelc says that anybody can write a better string-stack than what I have,
> although he has nothing to show, and this has been about 30 years now.

Lies, lies, lies.

Stephen
--
Stephen Pelc, stephen@vfxforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, +44 (0)78 0390 3612, +34 649 662 974
http://www.mpeforth.com - free VFX Forth downloads

Re: word count

<0a436618-cb63-4080-bbbb-b8022f30bed5n@googlegroups.com>

 by: Hugh Aguilar - Tue, 4 Jan 2022 00:23 UTC

On Monday, January 3, 2022 at 6:27:11 AM UTC-7, Stephen Pelc wrote:
> On 3 Jan 2022 at 03:03:03 CET, Mr Angry, "Hugh Aguilar"
> <hughag...@gmail.com> wrote:
>
> > Stephen Pelc says that my disambiguifiers don't work, although he doesn't have either
> > SYNONYM or an early-binding MACRO: which both depend upon the disambiguifiers.
> > Stephen Pelc says that anybody can write a better string-stack than what I have,
> > although he has nothing to show, and this has been about 30 years now.
> Lies, lies, lies.
>
> Stephen

I have already shown how the liar Stephen Pelc lied about my disambiguifiers:
https://groups.google.com/g/comp.lang.forth/c/T-yYkpVwYew
Stephen Pelc is a failure at ANS-Forth programming. He doesn't understand
how to fix FIND and tick etc. in ANS-Forth so they aren't ambiguous.
If he doesn't know how FIND works, then what does he know???
If he is unable to program in ANS-Forth at the most basic level, then why
is he qualified to be the chair-person of Forth-200x???

Here the liar Stephen Pelc says that some anonymous African wrote a better
string-stack than mine 30 years ago, but he doesn't have any source-code.
Most likely that was just a warmed over version of Wil Baden's crap code.
Nobody other than myself has ever done this with COW (copy-on-write).
All other string stacks are slow because they move the entire strings around
during stack-juggling of the string-stack, and they have severe restrictions
on how big the strings are and how many strings are supported.
Also, I have a lot of pattern-matching and substring-extraction code that
all of those other string-stack implementations lack --- they are useless.

On Sunday, March 29, 2020 at 5:27:53 PM UTC-7, hughag...@gmail.com wrote:
> On Tuesday, June 25, 2019 at 3:35:43 PM UTC-7, Stephen Pelc wrote:
> > On Tue, 25 Jun 2019 06:39:51 -0700 (PDT), hughag...@gmail.com
> > wrote:
> >
> > >I don't know what they have in Africa, but I doubt that it is
> > >any better than America, Europe, Russia, etc..
> > >I stand by what I said.
> > >There is nothing comparable in quality on any continent.
> >
> > You are just demonstrating your ignorance and your failure to
> > accept that anyone else can do better than you.
> >
> > Stephen
>
> Stephen Pelc doesn't have any working code, and I don't think he is
> capable of implementing a string-stack because he doesn't know
> what COW (copy-on-write) is.
> He has done nothing himself, but he says that anyone can do better than me.

Re: word count

<sr5v3b$dhc$1@dont-email.me>

 by: Ron AARON - Thu, 6 Jan 2022 05:35 UTC

On 01/01/2022 21:38, Hugh Aguilar wrote:

> Anyway, getting back to the subject of a word-count program,
> this is trivial for me to write given the novice package. I post this challenge
> now though, that the Forth experts of comp.lang.forth should attempt this.
>
> The first part of the challenge is to parse distinct "words" out of the
> text stream. This is the hard part. Counting distinct words is trivial
> given ASSOCIATION.4TH or something similar.
> Here is my definition of a "word" (I just thought this up off the top
> of my head; the definition may need some revision).
> Punctuation is defined as one of: . , : ; ! ? ( )
> Punctuation delimits words, as does whitespace.
> The ' is an apostrophe. The 's is removed and only the prefix is used.
> Any word with an @ in it is assumed to be an email address and is
> left as is (not broken apart on the dot character).
> Numbers are left as is (not broken apart on the dot character).
> I'll worry about the hyphen later --- what I have above is enough for now.
>
> Testing chars for punctuation can be done by DXforth using his END
> macro. He would have a colon word with a series of IF ... END statements
> testing for each character. This is not how I will do it though
> (I have my <SWITCH ... FAST-SWITCH> construct).
>
> So, comp.lang.forth experts --- prove that you haven't been de-skilled!

Here's a version in 8th which fulfills all your requirements, AFAICT:

\ Answer to Hugh's "word count challenge". Give the name of the file to
\ 'count' on the command-line.

: remove-empties
  ( /^\s*$/ r:match not nip ) a:filter ;

: isnumber? \ s -- s T
  /^[+-]?[0-9]+\.?[0-9]*/ r:match swap
  r:str nip swap ;

: isemail? \ s -- s T
  "@" s:search null? not nip ;

\ read the requested file into a big string:
0 args f:slurp >s

\ split the string by whitespace
/\s+/ s:/

\ remove empty or blank lines
remove-empties

\ collapse the array to a dictionary, split out punctuation except
\ numbers and emails, and make "Cat" and "cat" count as one word
: process-array \ m a -- m
  \ iterate each item in the array
  (
    s:lc
    isnumber? not if
      isemail? not if
        \ not a special case, so remove punctuation and split again
        /[.,:;!?()]+/ " " s:replace! /\s+/ s:/
        remove-empties
        \ if more than one word, process the array again
        a:len 1 n:> if recurse ;then
        \ single word, remove 's at end of word
        a:open /'s$/ "" s:replace
      then
    then
    \ stick this word in the map
    true m:!
  ) a:each! drop ;

m:new swap process-array
\ count how many words there are
m:len . cr bye

Re: word count

<sr5vid$ffo$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=16153&group=comp.lang.forth#16153

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: clf...@8th-dev.com (Ron AARON)
Newsgroups: comp.lang.forth
Subject: Re: word count
Date: Thu, 6 Jan 2022 07:43:40 +0200
Organization: A noiseless patient Spider
Lines: 52
Message-ID: <sr5vid$ffo$1@dont-email.me>
References: <97d97016-3b36-4215-8afb-909e1fef3613n@googlegroups.com>
<sr5v3b$dhc$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 6 Jan 2022 05:43:41 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="03ff69d1c9215cbaa94c421291a14b6f";
logging-data="15864"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/yUH9wBe3SChf/vhaS6/wV"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.4.1
Cancel-Lock: sha1:bcY3PmXZY4HtUx9HqOcGiyhXzLE=
In-Reply-To: <sr5v3b$dhc$1@dont-email.me>
Content-Language: en-US
 by: Ron AARON - Thu, 6 Jan 2022 05:43 UTC

On 06/01/2022 7:35, Ron AARON wrote:

> Here's a version in 8th which fulfills all your requirements, AFAICT:
>
> \ Answer to Hugh's "word count challenge". Give the name of the file to
> \ 'count' on the command-line.
>
> : remove-empties
>   ( /^\s*$/ r:match not nip ) a:filter ;
>
> : isnumber? \ s -- s T
>   /^[+-]?[0-9]+\.?[0-9]*/ r:match swap
>   r:str nip swap ;
>
> : isemail? \ s -- s T
>   "@" s:search null? not nip ;
>
> \ read the requested file into a big string:
> 0 args f:slurp >s
>
> \ split the string by whitespace
> /\s+/ s:/
>
> \ remove empty or blank lines
> remove-empties
>
> \ collapse the array to a dictionary, split out punctuation except
> \ numbers and emails, and make "Cat" and "cat" count as one word
> : process-array \ m a -- m
>   \ iterate each item in the array
>   (
>     s:lc
>     isnumber? not if
>       isemail? not if
>         \ not a special case, so remove punctuation and split again
>         /[.,:;!?()]+/ " " s:replace! /\s+/ s:/
>         remove-empties
>         \ if more than one word, process the array again
>         a:len 1 n:> if recurse ;then
>         \ single word, remove 's at end of word
>         a:open /'s$/ "" s:replace
>       then
>     then
>     \ stick this word in the map
>     true m:!
>   ) a:each! drop ;
>
> m:new swap process-array
> \ count how many words there are
> m:len . cr bye

Correction: "recurse" should be "process-array".
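
For readers without 8th at hand, the pipeline in the program quoted above (split on whitespace, keep numbers and emails whole, split everything else on punctuation, strip a trailing 's, lowercase, collect distinct words) can be sketched in Python. This is an illustration of the approach only, not a translation checked against 8th. The names collect and count_distinct_words are made up for the sketch. The number test is a prefix match because the posted pattern has no end anchor, so a token such as 9.725.bar. is kept whole.

```python
import re

# Prefix match, like the posted /^[+-]?[0-9]+\.?[0-9]*/ pattern (no end anchor).
NUMBER = re.compile(r"[+-]?[0-9]+\.?[0-9]*")
PUNCT = re.compile(r"[.,:;!?()]+")


def collect(tokens, seen):
    """Add the distinct words found in tokens to the set seen (hypothetical helper)."""
    for token in tokens:
        token = token.lower()                # "Cat" and "cat" count as one word
        if NUMBER.match(token) or "@" in token:
            seen.add(token)                  # numbers and emails are left as is
            continue
        pieces = PUNCT.sub(" ", token).split()
        if len(pieces) > 1:
            collect(pieces, seen)            # explicit recursion by name
        elif pieces:
            seen.add(re.sub(r"'s$", "", pieces[0]))
    return seen


def count_distinct_words(text):
    """Return the number of distinct words in text."""
    return len(collect(text.split(), set()))
```

Calling count_distinct_words on the contents of a file gives the same kind of figure that the 8th program prints with m:len.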

Re: word count

<sr61tb$ai$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=16154&group=comp.lang.forth#16154

Path: i2pn2.org!i2pn.org!aioe.org!7AktqsUqy5CCvnKa3S0Dkw.user.46.165.242.75.POSTED!not-for-mail
From: dxfo...@gmail.com (dxforth)
Newsgroups: comp.lang.forth
Subject: Re: word count
Date: Thu, 6 Jan 2022 17:23:39 +1100
Organization: Aioe.org NNTP Server
Message-ID: <sr61tb$ai$1@gioia.aioe.org>
References: <97d97016-3b36-4215-8afb-909e1fef3613n@googlegroups.com>
<sr5v3b$dhc$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="338"; posting-host="7AktqsUqy5CCvnKa3S0Dkw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.4.1
Content-Language: en-GB
X-Notice: Filtered by postfilter v. 0.9.2
 by: dxforth - Thu, 6 Jan 2022 06:23 UTC

On 6/01/2022 16:35, Ron AARON wrote:
> On 01/01/2022 21:38, Hugh Aguilar wrote:
>
>> Anyway, getting back to the subject of a word-count program,
>> this is trivial for me to write given the novice package. I post this challenge
>> now though, that the Forth experts of comp.lang.forth should attempt this.
>>
>> The first part of the challenge is to parse distinct "words" out of the
>> text stream. This is the hard part. Counting distinct words is trivial
>> given ASSOCIATION.4TH or something similar.
>> Here is my definition of a "word" (I just thought this up off the top
>> of my head; the definition may need some revision).
>> Punctuation is defined as one of: . , : ; ! ? ( )
>> Punctuation delimits words, as does whitespace.
>> The ' is an apostrophe. The 's is removed and only the prefix is used.
>> Any word with an @ in it is assumed to be an email address and is
>> left as is (not broken apart on the dot character).
>> Numbers are left as is (not broken apart on the dot character).
>> I'll worry about the hyphen later --- what I have above is enough for now.
>>
>> Testing chars for punctuation can be done by DXforth using his END
>> macro. He would have a colon word with a series of IF ... END statements
>> testing for each character. This is not how I will do it though
>> (I have my <SWITCH ... FAST-SWITCH> construct).
>>
>> So, comp.lang.forth experts --- prove that you haven't been de-skilled!
>
> Here's a version in 8th which fulfills all your requirements, AFAICT:
> ...

He forgot to mention the snakes

https://youtu.be/ClwIj3x24Q4?t=18

But you already knew that before accepting his challenge.

Re: word count

<sr627e$qkj$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=16155&group=comp.lang.forth#16155

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: clf...@8th-dev.com (Ron AARON)
Newsgroups: comp.lang.forth
Subject: Re: word count
Date: Thu, 6 Jan 2022 08:29:02 +0200
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <sr627e$qkj$1@dont-email.me>
References: <97d97016-3b36-4215-8afb-909e1fef3613n@googlegroups.com>
<sr5v3b$dhc$1@dont-email.me> <sr61tb$ai$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 6 Jan 2022 06:29:02 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="03ff69d1c9215cbaa94c421291a14b6f";
logging-data="27283"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18BdNxdgbA8g1Ovh6BzNWDt"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.4.1
Cancel-Lock: sha1:xg/nEkVsJzEB8TxpsGvrvdeNMMI=
In-Reply-To: <sr61tb$ai$1@gioia.aioe.org>
Content-Language: en-US
 by: Ron AARON - Thu, 6 Jan 2022 06:29 UTC

On 06/01/2022 8:23, dxforth wrote:
> On 6/01/2022 16:35, Ron AARON wrote:
>> On 01/01/2022 21:38, Hugh Aguilar wrote:
>>
>>> Anyway, getting back to the subject of a word-count program,
>>> this is trivial for me to write given the novice package. I post this
>>> challenge
>>> now though, that the Forth experts of comp.lang.forth should attempt
>>> this.
>>>
>>> The first part of the challenge is to parse distinct "words" out of the
>>> text stream. This is the hard part. Counting distinct words is trivial
>>> given ASSOCIATION.4TH or something similar.
>>> Here is my definition of a "word" (I just thought this up off the top
>>> of my head; the definition may need some revision).
>>> Punctuation is defined as one of: . , : ; ! ? ( )
>>> Punctuation delimits words, as does whitespace.
>>> The ' is an apostrophe. The 's is removed and only the prefix is used.
>>> Any word with an @ in it is assumed to be an email address and is
>>> left as is (not broken apart on the dot character).
>>> Numbers are left as is (not broken apart on the dot character).
>>> I'll worry about the hyphen later --- what I have above is enough for
>>> now.
>>>
>>> Testing chars for punctuation can be done by DXforth using his END
>>> macro. He would have a colon word with a series of IF ... END statements
>>> testing for each character. This is not how I will do it though
>>> (I have my <SWITCH ... FAST-SWITCH> construct).
>>>
>>> So, comp.lang.forth experts --- prove that you haven't been de-skilled!
>>
>> Here's a version in 8th which fulfills all your requirements, AFAICT:
>> ...
>
> He forgot to mention the snakes
>
> https://youtu.be/ClwIj3x24Q4?t=18
>
> But you already knew that before accepting his challenge.

I did, but I've grown immune...

Re: word count

<6a23f170-2066-4194-93e0-1a95b5ee7d24n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=16156&group=comp.lang.forth#16156

X-Received: by 2002:a37:6856:: with SMTP id d83mr40830721qkc.500.1641453787076;
Wed, 05 Jan 2022 23:23:07 -0800 (PST)
X-Received: by 2002:ac8:4e96:: with SMTP id 22mr52448311qtp.76.1641453786745;
Wed, 05 Jan 2022 23:23:06 -0800 (PST)
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.forth
Date: Wed, 5 Jan 2022 23:23:06 -0800 (PST)
In-Reply-To: <97d97016-3b36-4215-8afb-909e1fef3613n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=37.33.148.225; posting-account=kiOBZQoAAADFsAs31ZHaefxTuQxv84Wm
NNTP-Posting-Host: 37.33.148.225
References: <97d97016-3b36-4215-8afb-909e1fef3613n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6a23f170-2066-4194-93e0-1a95b5ee7d24n@googlegroups.com>
Subject: Re: word count
From: jali.hei...@gmail.com (Jali Heinonen)
Injection-Date: Thu, 06 Jan 2022 07:23:07 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 302
 by: Jali Heinonen - Thu, 6 Jan 2022 07:23 UTC

On Saturday, 1 January 2022 at 21:38:13 UTC+2, Hugh Aguilar wrote:
> Recently code for a word-count program was posted in Python
> here on comp.lang.forth.
> https://groups.google.com/g/comp.lang.forth/c/jw6yU_yI15E
> This was said:
> On Thursday, December 30, 2021 at 1:12:23 AM UTC-7, dxforth wrote:
> > On 30/12/2021 14:59, Hugh Aguilar wrote:
> > > On Tuesday, December 28, 2021 at 8:07:01 PM UTC-7, dxforth wrote:
> > >> On 29/12/2021 01:44, Andy Valencia wrote:
> > >> > ...
> > >> > I could write this in Forth, but I don't want to. It would be tedious
> > >> > and take a LOT more time than these four lines of Python.
> > >
> > > Andy Valencia doesn't want to do this program in Forth because he doesn't know
> > > Forth very well and hence even simple programs seem tedious and difficult to him.
> > > A program such as this is trivial given my novice-package which is ANS-Forth.
> > > ASSOCIATION.4TH can be used to collect all of the distinct words. Each node's key
> > > would be the string and it would have one datum tacked on which would be the count.
> > > STRING-STACK.4TH can be used to parse out the words from the text.
> > > STRING-STACK.4TH has pretty good pattern-matching and sub-string extraction
> > > capability, so it could use a much more sophisticated definition of what a word is
> > > than just a blank-delimited string --- no word-count program has been so crude as to
> > > use blank-delimited strings since the early 1970s --- that is kindergarten-level programming.
> > >
> > >> "We choose to go to the Moon [...] and do the other things, not because
> > >> they are easy, but because they are hard; because that goal will serve
> > >> to organize and measure the best of our energies and skills"
> > >>
> > >> The de-skilling of people that has occurred over the last 40 years because
> > >> subsequent govts chose the easy way, has made us all dumb and vulnerable.
> > >
> > > DXforth says this as if he had skill --- but he doesn't --- he is all talk and no programming
> > > beyond kindergarten-level programming such as his END macro.
> > >
> > END - easy to remember, easy to use and no shortage of ELSE to eliminate.
> > It doesn't get better than that. It'll do me as an epitaph.
>
> I was patient with DXforth (the idiot previously known as HAA) and his
> kindergarten-level Forth programming. I even included his much-hyped
> END macro in the novice-package:
> -----------------------------------------------------------------------------------------------
> \ END was HAA's idea:
> \ https://groups.google.com/forum/#!topic/comp.lang.forth/MdGWDMEbKIA
> macro: end ( -- )
> exit then ;
>
> macro: ?exit ( -- ) \ like END but even less useful
> if exit then ;
> -----------------------------------------------------------------------------------------------
>
> Of course, DXforth doesn't have MACRO: because that kind of programming
> is way beyond his skill-level, so he would have to write END like this:
> -----------------------------------------------------------------------------------------------
> : end ( -- ) postpone exit postpone then ; immediate
> -----------------------------------------------------------------------------------------------
>
> I am not patient with DXforth anymore because he attacked me by
> using the term "disambiguifier" for some idiotic nonsense that he was
> blathering about. He was mocking me and my disambiguifiers (that are
> required for MACRO: to work) --- this insult will not be forgiven.
>
> Anyway, getting back to the subject of a word-count program,
> this is trivial for me to write given the novice package. I post this challenge
> now though, that the Forth experts of comp.lang.forth should attempt this.
>
> The first part of the challenge is to parse distinct "words" out of the
> text stream. This is the hard part. Counting distinct words is trivial
> given ASSOCIATION.4TH or something similar.
> Here is my definition of a "word" (I just thought this up off the top
> of my head; the definition may need some revision).
> Punctuation is defined as one of: . , : ; ! ? ( )
> Punctuation delimits words, as does whitespace.
> The ' is an apostrophe. The 's is removed and only the prefix is used.
> Any word with an @ in it is assumed to be an email address and is
> left as is (not broken apart on the dot character).
> Numbers are left as is (not broken apart on the dot character).
> I'll worry about the hyphen later --- what I have above is enough for now.
>
> Testing chars for punctuation can be done by DXforth using his END
> macro. He would have a colon word with a series of IF ... END statements
> testing for each character. This is not how I will do it though
> (I have my <SWITCH ... FAST-SWITCH> construct).
>
> So, comp.lang.forth experts --- prove that you haven't been de-skilled!

Here is my word count program in 8th. It takes a somewhat low-level approach using a scanner, but as an added benefit it reports the line and column numbers of bad input. It could easily be modified to count words, emails and numbers separately. Emails and numbers could be handled properly using a state machine. Testing for alphabetic characters is a bit ugly, as it requires a special-chars array for non-ASCII input (you may need to extend it, as I added just enough to handle my dictionary).

needs file/getc

ns?

ns: scanner

-1 constant EOF
10 constant LF
32 constant SPACE

0 constant .SYM
1 constant ,SYM
2 constant :SYM
3 constant ;SYM
4 constant !SYM
5 constant ?SYM
6 constant (SYM
7 constant )SYM
8 constant WORDSYM

[ 243, 228, 246, 229, 196, 214, 197, 252, 225,
  226, 241, 231, 232, 233, 234, 237, 244, 251 ] constant special-chars

: >char \ n -- char
  "" swap s:+ ;

: new \ file -- scanner
  m:new
  "file" rot m:!
  "column" 0 m:!
  "line" 1 m:!
  "char" SPACE m:!
  "word" "" m:! ;

: line@ \ scanner -- scanner line
  "line" m:@ ;

: column@ \ scanner -- scanner column
  "column" m:@ ;

: char@ \ scanner -- scanner char
  "char" m:@ ;

: word@ \ scanner -- scanner string
  "word" m:@ ;

: file@ \ scanner -- scanner file
  "file" m:@ ;

: accept \ n a -- n T
  ( over n:= ) a:map
  ' or false a:reduce ;

: digit? \ n -- bool
  '0 '9 n:between ;

: alpha? \ n -- bool
  dup >r 'a 'z n:between r@ 'A 'Z n:between or
  r> special-chars accept nip or ;

: read-char \ scanner -- scanner
  char@
  LF n:= if
    line@ n:1+ "line" m:_!
    "column" 0 m:!
  then

  "file" m:@
  f:getc swap -rot "char" m:_!
  swap
  f:eof? nip if
    "char" EOF m:!
  then

  char@ EOF n:= not if
    column@
    n:1+ "column" m:_!
  then ;

[ 39, 64 ] constant accept-word
[ 39, 46 ] constant accept-email

' accept-word deferred: accept-table
: read-word \ scanner -- scanner
  "word" "" m:!
  repeat
    char@ swap word@ rot s:+ "word" m:_!
    read-char

    char@ dup >r alpha? r@ accept-table accept nip or not r> EOF n:= or if
      ' accept-word w:is accept-table
      break
    else
      char@ '@ n:= if
        ' accept-email w:is accept-table
      then
    then
  again
  word@ s:len 2 n:> if
    -2 s:/ 1 a:@ "'s" s:= if
      0 a:@ nip "word" m:_!
    else
      drop
    then
  else
    drop
  then ;

: read-number \ scanner -- scanner
  "word" "" m:!
  repeat
    char@ swap word@ rot s:+ "word" m:_!
    read-char
    char@ dup >r digit? r@ '. n:= or not r> EOF n:= or if
      break
    then
  again ;

[ ( char@ EOF n:= ), ( EOF ),
  ( char@ '. n:= ), ( read-char .SYM ),
  ( char@ ', n:= ), ( read-char ,SYM ),
  ( char@ ': n:= ), ( read-char :SYM ),
  ( char@ '; n:= ), ( read-char ;SYM ),
  ( char@ '! n:= ), ( read-char !SYM ),
  ( char@ '? n:= ), ( read-char ?SYM ),
  ( char@ 40 n:= ), ( read-char 6 ),
  ( char@ 41 n:= ), ( read-char 7 ),
  ( char@ digit? ), ( read-number WORDSYM ),
  ( char@ alpha? ), ( read-word WORDSYM ),
  ( char@ >char swap column@ swap line@ nip "Line:%d Column:%d, Illegal character: %s" s:strfmt throw )
] var, symbols

: get-token \ scanner -- scanner token
  \ skip white space
  repeat
    char@ EOF n:= not swap char@ SPACE n:> not rot and if
      read-char
    else
      break
    then
  again
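
The line and column bookkeeping described at the top of this post can also be sketched outside 8th. The Python below is a rough illustration under stated assumptions, not a port of the code above. The names positions, scan and ScanError are hypothetical. str.isalpha() stands in for the special-chars table, and the token kinds are plain strings rather than the numeric constants.

```python
PUNCTUATION = ".,:;!?()"
DIGITS = "0123456789"


class ScanError(ValueError):
    """Raised for a character that fits no token class (hypothetical name)."""


def positions(text):
    """Yield (char, line, column), both 1-based; the column restarts after a newline."""
    line, column = 1, 0
    for ch in text:
        column += 1
        yield ch, line, column
        if ch == "\n":
            line, column = line + 1, 0


def scan(text):
    """Yield (kind, lexeme, line, column) tokens; raise ScanError on bad input."""
    chars = list(positions(text))
    i = 0
    while i < len(chars):
        ch, line, column = chars[i]
        if ord(ch) <= 32:                      # whitespace and control characters
            i += 1
        elif ch in PUNCTUATION:
            yield ("punct", ch, line, column)
            i += 1
        elif ch in DIGITS:                     # number: a run of digits and dots
            j = i
            while j < len(chars) and (chars[j][0] in DIGITS or chars[j][0] == "."):
                j += 1
            yield ("number", "".join(c for c, _, _ in chars[i:j]), line, column)
            i = j
        elif ch.isalpha():
            extras = "'@"                      # once an '@' is seen, a dot is accepted instead
            j = i
            while j < len(chars) and (chars[j][0].isalpha() or chars[j][0] in extras):
                if chars[j][0] == "@":
                    extras = "'."
                j += 1
            word = "".join(c for c, _, _ in chars[i:j])
            if len(word) > 2 and word.endswith("'s"):
                word = word[:-2]               # drop a trailing 's
            yield ("word", word, line, column)
            i = j
        else:
            raise ScanError(f"Line:{line} Column:{column}, Illegal character: {ch}")
```

Each token carries the position of its first character, so an error message can point at the exact spot in the input.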


Re: word count

<fb3752f7-6558-4469-830b-52c14c8e6b7bn@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=16167&group=comp.lang.forth#16167

X-Received: by 2002:a37:9ad8:: with SMTP id c207mr43356715qke.662.1641482084147;
Thu, 06 Jan 2022 07:14:44 -0800 (PST)
X-Received: by 2002:a05:620a:b1a:: with SMTP id t26mr39609317qkg.571.1641482083841;
Thu, 06 Jan 2022 07:14:43 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.forth
Date: Thu, 6 Jan 2022 07:14:43 -0800 (PST)
In-Reply-To: <6a23f170-2066-4194-93e0-1a95b5ee7d24n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=80.186.163.128; posting-account=kiOBZQoAAADFsAs31ZHaefxTuQxv84Wm
NNTP-Posting-Host: 80.186.163.128
References: <97d97016-3b36-4215-8afb-909e1fef3613n@googlegroups.com> <6a23f170-2066-4194-93e0-1a95b5ee7d24n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <fb3752f7-6558-4469-830b-52c14c8e6b7bn@googlegroups.com>
Subject: Re: word count
From: jali.hei...@gmail.com (Jali Heinonen)
Injection-Date: Thu, 06 Jan 2022 15:14:44 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 2
 by: Jali Heinonen - Thu, 6 Jan 2022 15:14 UTC

It seems the source code formatting was lost... I made my word count program more modular and added a validator for numbers, so it now parses numbers properly and supports E notation.

Source available here: https://www.dropbox.com/s/ettg5bra03m4y9w/wc.zip?dl=0

Re: word count

<strg3e$19uk$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=16708&group=comp.lang.forth#16708

Path: i2pn2.org!i2pn.org!aioe.org!6y+gbT8AvqyaCYwdyTY1rQ.user.46.165.242.75.POSTED!not-for-mail
From: No_spamm...@noWhere_7073.org (Robert L.)
Newsgroups: comp.lang.forth
Subject: Re: word count
Date: Mon, 7 Feb 2022 16:07:12 -0000 (UTC)
Organization: Aioe.org NNTP Server
Message-ID: <strg3e$19uk$1@gioia.aioe.org>
References: <97d97016-3b36-4215-8afb-909e1fef3613n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Injection-Info: gioia.aioe.org; logging-data="42964"; posting-host="6y+gbT8AvqyaCYwdyTY1rQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: XanaNews/1.18.1.6
X-Notice: Filtered by postfilter v. 0.9.2
 by: Robert L. - Mon, 7 Feb 2022 16:07 UTC

On 1/1/2022, Hugh Aguilar wrote:

> The first part of the challenge is to parse distinct "words" out of the
> text stream. This is the hard part. Counting distinct words is trivial
> given ASSOCIATION.4TH or something similar.
> Here is my definition of a "word" (I just thought this up off the top
> of my head; the definition may need some revision).
> Punctuation is defined as one of: . , : ; ! ? ( )
> Punctuation delimits words, as does whitespace.
> The ' is an apostrophe. The 's is removed and only the prefix is used.
> Any word with an @ in it is assumed to be an email address and is
> left as is (not broken apart on the dot character).
> Numbers are left as is (not broken apart on the dot character).
> I'll worry about the hyphen later --- what I have above is enough for now.

( SP-Forth )

\ Hash tables.
REQUIRE new-hash ~pinka/lib/hash-table.f
\ OFF and ON
REQUIRE OFF lib/ext/onoff.f
\ .R
REQUIRE .R lib/include/ansi.f
\ Ignore case.
REQUIRE CASE-INS lib/ext/caseins.f

8192 new-hash value word-table

variable in-word
variable allow-dot
variable dot-count
variable char-buf
create word-space 4096 allot
variable word-idx
0 value f-handle

: incr-count ( )
  word-space word-idx @ 2dup
  word-table HASH@N
  ( If entry not found, default to 0. )
  0= if 0 then
  1 + -rot
  word-table HASH!N ;

( Remove "'s" from end of word. )
: trim-word
  word-idx @ 2 >
  if
    word-space word-idx @ dup 2 - /string
    s" 's" compare 0=
    if
      -2 word-idx +!
    then
  then ;

: dot? ( char -- bool ) [char] . = ;

: char-in-string? ( char adr len -- bool )
  rot char-buf c!
  char-buf 1 search
  nip nip ;

: punctuation? ( char -- bool )
  s" .,:;!?()" char-in-string? ;

: numeral? ( char -- bool )
  s" 0123456789" char-in-string? ;

: consider-punctuation ( char -- bool )
  dot? allow-dot @ dot-count @ 1 < and and
  if
    allow-dot off
    1 dot-count +!
    -1
  else
    0
  then ;

: word-char? ( char -- bool )
  dup 33 <
  if drop 0
  else
    dup punctuation?
    if
      consider-punctuation
    else
      drop -1
    then
  then ;

: read-char char-buf 1 f-handle read-file throw 0= throw
  char-buf c@ ;

: start-new-word in-word on 0 word-idx ! 0 dot-count ! ;

: append-char ( char -- )
  word-space word-idx @ + c!
  1 word-idx +! ;

: process-word-char ( char -- )
  in-word @ 0= if start-new-word then
  dup [char] @ = over numeral? or
  if allow-dot on then
  append-char ;

: process-word
  word-idx @
  if
    trim-word
    \ word-space word-idx @ type cr
    incr-count
    0 word-idx !
  then ;

: process-non-word-char ( char -- )
  drop
  in-word @
  if
    process-word
    in-word off allow-dot off
  then ;

: parse-file
  begin
    read-char dup
    word-char?
    if
      process-word-char
    else
      process-non-word-char
    then
  again ;

s" foo.txt" r/o open-file throw to f-handle
' parse-file catch drop
f-handle close-file throw
process-word

:noname 5 .r space type cr ; word-table all-hash
word-table del-hash

If the file contains:

foo; bob@zip.org.bar.
foo! 9.725.bar.
(that's not foo's fault)

then the output is:

    1 not
    2 bar
    1 bob@zip.org
    1 fault
    1 that
    1 9.725
    3 foo

--
archive.org/details/nolies
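
The character-by-character rules used above (whitespace and punctuation end a word, a single dot is allowed once the word contains an @ or a digit, and a trailing 's is trimmed) can be cross-checked with a short Python sketch. It is an independent illustration under those assumptions, not the SP-Forth program itself; tokenize and count_words are hypothetical names.

```python
from collections import Counter

PUNCTUATION = ".,:;!?()"
DIGITS = "0123456789"


def tokenize(text):
    """Yield words one at a time, scanning the text character by character."""
    word = []
    allow_dot = False        # becomes true once the word holds an '@' or a digit
    dots_used = 0            # at most one dot is kept inside a word
    for ch in text + "\n":   # the trailing newline flushes the last word
        if ord(ch) < 33:
            keep = False
        elif ch in PUNCTUATION:
            keep = ch == "." and allow_dot and dots_used < 1
            if keep:
                allow_dot = False
                dots_used += 1
        else:
            keep = True
        if keep:
            if ch == "@" or ch in DIGITS:
                allow_dot = True
            word.append(ch)
        elif word:
            token = "".join(word)
            if len(token) > 2 and token.endswith("'s"):
                token = token[:-2]
            yield token
            word, allow_dot, dots_used = [], False, 0


def count_words(text):
    """Return a Counter mapping each word to its number of occurrences."""
    return Counter(tokenize(text))
```

Run on the three sample lines above, count_words gives the same counts as the listing: foo 3, bar 2, and 1 each for bob@zip.org, 9.725, that, not and fault.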

Re: word count

<su8lse$1qp$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=16760&group=comp.lang.forth#16760

Path: i2pn2.org!i2pn.org!aioe.org!4PEzYpqg//dYuxH1AxMC3A.user.46.165.242.75.POSTED!not-for-mail
From: No_spamm...@noWhere_7073.org (Robert L.)
Newsgroups: comp.lang.forth
Subject: Re: word count
Date: Sat, 12 Feb 2022 16:05:35 -0000 (UTC)
Organization: Aioe.org NNTP Server
Message-ID: <su8lse$1qp$1@gioia.aioe.org>
References: <97d97016-3b36-4215-8afb-909e1fef3613n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Injection-Info: gioia.aioe.org; logging-data="1881"; posting-host="4PEzYpqg//dYuxH1AxMC3A.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: XanaNews/1.18.1.6
X-Notice: Filtered by postfilter v. 0.9.2
 by: Robert L. - Sat, 12 Feb 2022 16:05 UTC

Hugh Aguilar wrote:

> The first part of the challenge is to parse distinct "words" out of the
> text stream. This is the hard part. Counting distinct words is trivial
> given ASSOCIATION.4TH or something similar.
> Here is my definition of a "word" (I just thought this up off the top
> of my head; the definition may need some revision).
> Punctuation is defined as one of: . , : ; ! ? ( )
> Punctuation delimits words, as does whitespace.
> The ' is an apostrophe. The 's is removed and only the prefix is used.
> Any word with an @ in it is assumed to be an email address and is
> left as is (not broken apart on the dot character).
> Numbers are left as is (not broken apart on the dot character).
> I'll worry about the hyphen later --- what I have above is enough for now.

( SP-Forth )

\ Hash tables.
REQUIRE new-hash ~pinka/lib/hash-table.f
\ OFF and ON
REQUIRE OFF lib/ext/onoff.f
\ .R
REQUIRE .R lib/include/ansi.f
\ Ignore case.
REQUIRE CASE-INS lib/ext/caseins.f

8192 new-hash value word-table

variable in-word
variable allow-dot
variable dot-count
variable char-buf
create word-space 4096 allot
variable word-idx
0 value f-handle

: incr-count ( )
  word-space word-idx @ 2dup
  word-table HASH@N
  ( If entry not found, default to 0. )
  0= if 0 then
  1 + -rot
  word-table HASH!N ;

( Remove "'s" from end of word. )
: trim-word
  word-idx @ 2 >
  if
    word-space word-idx @ dup 2 - /string
    s" 's" compare 0=
    if
      -2 word-idx +!
    then
  then ;

: dot? ( char -- bool ) [char] . = ;

: char-in-string? ( char adr len -- bool )
  rot char-buf c!
  char-buf 1 search
  nip nip ;

: punctuation? ( char -- bool )
  s" .,:;!?()" char-in-string? ;

: numeral? ( char -- bool )
  s" 0123456789" char-in-string? ;

: consider-punctuation ( char -- bool )
  dot? allow-dot @ dot-count @ 1 < and and
  if
    allow-dot off
    1 dot-count +!
    -1
  else
    0
  then ;

: word-char? ( char -- bool )
  dup 33 <
  if drop 0
  else
    dup punctuation?
    if
      consider-punctuation
    else
      drop -1
    then
  then ;

: read-char char-buf 1 f-handle read-file throw 0= throw
  char-buf c@ ;

: start-new-word in-word on 0 word-idx ! 0 dot-count ! ;

: append-char ( char -- )
  word-space word-idx @ + c!
  1 word-idx +! ;

: process-word-char ( char -- )
  in-word @ 0= if start-new-word then
  dup [char] @ = over numeral? or
  if allow-dot on then
  append-char ;

: process-word
  word-idx @
  if
    trim-word
    \ word-space word-idx @ type cr
    incr-count
    0 word-idx !
  then ;

: process-non-word-char ( char -- )
  drop
  in-word @
  if
    process-word
    in-word off allow-dot off
  then ;

: parse-file
  begin
    read-char dup
    word-char?
    if
      process-word-char
    else
      process-non-word-char
    then
  again ;

s" foo.txt" r/o open-file throw to f-handle
' parse-file catch drop
f-handle close-file throw
process-word

:noname 5 .r space type cr ; word-table all-hash
word-table del-hash

If the file contains:

foo; bob@zip.org.bar.
foo! 9.725.bar.
(that's not foo's fault)

then the output is:

    1 not
    2 bar
    1 bob@zip.org
    1 fault
    1 that
    1 9.725
    3 foo
