while read -r line ; do problem

<slrnt255id.1avaq.BitTwister@wb.home.test>

https://www.novabbs.com/devel/article-flat.php?id=4990&group=comp.unix.shell#4990
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: BitTwis...@mouse-potato.com (Bit Twister)
Newsgroups: comp.unix.shell
Subject: while read -r line ; do problem
Date: Fri, 4 Mar 2022 16:44:15 -0600
Organization: A noiseless patient Spider
Lines: 41
Message-ID: <slrnt255id.1avaq.BitTwister@wb.home.test>
Injection-Info: reader02.eternal-september.org; posting-host="a944a94c86929e4836a84a588aa27866";
logging-data="27789"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19ga8wsZlkxCt6QswbeUKHRrMLsbsVLBW8="
User-Agent: slrn/pre1.0.4-6 (Linux)
Cancel-Lock: sha1:O6Tj04HfkkGO7aS47o4zaEBJmR4=
 by: Bit Twister - Fri, 4 Mar 2022 22:44 UTC

while read -r line ; do problem

$ bash --version
GNU bash, version 5.1.4(1)-release (x86_64-mageia-linux-gnu)

I have a bash script which reads a script file and updates variables.
The contents of some lines are being modified without my script's intervention.

Code snippet
1 while read -r line; do
2 _t=$line
3 set -- $(IFS='=' ; echo $_t)
4 _wd=$1
5 case "$_wd" in
6 _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
7 <big if/case snip none of which modify _medicare line>
8 echo $line >> $_tmp_fn
9 done < $_taxes_paid_fn

If you look at the following output from set -vx, you'll notice that
the * in the _medicare line was converted to file names used in the script.

read -r line
_t='_medicare="$(echo "scale=2; 144.60 * 12" | bc)"'
: IFS==
echo _medicare '"$(echo "scale' '2; 144.60 * 12" | bc)"'
set -- _medicare '"$(echo' '"scale' '2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
'[' 13 -gt 1 ']'
_wd=_medicare

Here is the
echo $line >> $_tmp_fn
which did/has the * junk/substitution

echo '_medicare="$(echo' '"scale=2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'

How can I prevent the * substitution and still be able to use the line
modification like line 6 in the example snippet?

Thanks in advance for any advice.

Re: while read -r line ; do problem

<svuai1$5qj$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4991&group=comp.unix.shell#4991
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Sat, 5 Mar 2022 01:23:29 +0100
Organization: A noiseless patient Spider
Lines: 49
Message-ID: <svuai1$5qj$1@dont-email.me>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 5 Mar 2022 00:23:29 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="99f7674ea622918535218b39888730e5";
logging-data="5971"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19x70JtQszdGirFsWzEuUtZ"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:zjfsytbugunMBEmqwSSPgRHwI3o=
In-Reply-To: <slrnt255id.1avaq.BitTwister@wb.home.test>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Sat, 5 Mar 2022 00:23 UTC

On 04.03.2022 23:44, Bit Twister wrote:
> while read -r line ; do problem
>
> $ bash --version
> GNU bash, version 5.1.4(1)-release (x86_64-mageia-linux-gnu)
>
> I have a bash script which reads a script file and updates variables.
> contents of some lines are modified without my script intervention.
>
> Code snippet
> 1 while read -r line; do
> 2 _t=$line
> 3 set -- $(IFS='=' ; echo $_t)
> 4 _wd=$1
> 5 case "$_wd" in
> 6 _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
> 7 <big if/case snip none of which modify _medicare line>
> 8 echo $line >> $_tmp_fn
> 9 done < $_taxes_paid_fn
>
> If you look at the following results from set -vx
> You'll notice the _medicare line * was converted to file names used in the script
>
> read -r line
> _t='_medicare="$(echo "scale=2; 144.60 * 12" | bc)"'
> : IFS==
> echo _medicare '"$(echo "scale' '2; 144.60 * 12" | bc)"'
> set -- _medicare '"$(echo' '"scale' '2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
> '[' 13 -gt 1 ']'
> _wd=_medicare
>
> Here is the
> echo $line >> $_tmp_fn
> which did/has the * jumk/substitution
>
> echo '_medicare="$(echo' '"scale=2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
>
> How can I prevent the * substitution and still be use the line modification
> like line 6 in the example snippet??

If you quote your variables on expansion ("$var") the * as part of your
variable value will not expand to file names.
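
A minimal sketch of the difference, run in a throwaway directory created just for the demonstration (the directory and the file names aa and bb are made up, they are not from the original script):

```shell
#!/bin/sh
# Work in a scratch directory with two files so that * has something to match.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch aa bb

line='144.60 * 12'

# Unquoted: the value is split into words and * is expanded to file names.
unquoted=$(echo $line)

# Quoted: the value is passed through unchanged.
quoted=$(echo "$line")

printf 'unquoted: %s\n' "$unquoted"   # unquoted: 144.60 aa bb 12
printf 'quoted:   %s\n' "$quoted"     # quoted:   144.60 * 12
```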

Janis

>
> Thanks in advance for any advice.
>

Re: while read -r line ; do problem

<svuc2t$egk$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4992&group=comp.unix.shell#4992
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: mortons...@gmail.com (Ed Morton)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Fri, 4 Mar 2022 18:49:33 -0600
Organization: A noiseless patient Spider
Lines: 62
Message-ID: <svuc2t$egk$1@dont-email.me>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 5 Mar 2022 00:49:33 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="53e368214246228b1db15d7f2028d07f";
logging-data="14868"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/TZK0pFzhZk9k+8JnNOZ15"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.1
Cancel-Lock: sha1:R2iX0UNzJ76J37CNeSVd1/gs1c0=
In-Reply-To: <slrnt255id.1avaq.BitTwister@wb.home.test>
X-Antivirus-Status: Clean
Content-Language: en-US
X-Antivirus: Avast (VPS 220304-6, 3/4/2022), Outbound message
 by: Ed Morton - Sat, 5 Mar 2022 00:49 UTC

On 3/4/2022 4:44 PM, Bit Twister wrote:
> while read -r line ; do problem
>
> $ bash --version
> GNU bash, version 5.1.4(1)-release (x86_64-mageia-linux-gnu)
>
> I have a bash script which reads a script file and updates variables.
> contents of some lines are modified without my script intervention.
>
> Code snippet
> 1 while read -r line; do
> 2 _t=$line
> 3 set -- $(IFS='=' ; echo $_t)
> 4 _wd=$1
> 5 case "$_wd" in
> 6 _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
> 7 <big if/case snip none of which modify _medicare line>
> 8 echo $line >> $_tmp_fn
> 9 done < $_taxes_paid_fn
>
> If you look at the following results from set -vx
> You'll notice the _medicare line * was converted to file names used in the script
>
> read -r line
> _t='_medicare="$(echo "scale=2; 144.60 * 12" | bc)"'
> : IFS==
> echo _medicare '"$(echo "scale' '2; 144.60 * 12" | bc)"'
> set -- _medicare '"$(echo' '"scale' '2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
> '[' 13 -gt 1 ']'
> _wd=_medicare
>
> Here is the
> echo $line >> $_tmp_fn
> which did/has the * jumk/substitution
>
> echo '_medicare="$(echo' '"scale=2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
>
> How can I prevent the * substitution and still be use the line modification
> like line 6 in the example snippet??
>
> Thanks in advance for any advice.
>

1) Always quote your shell variables unless you have an explicit reason
not to, see https://mywiki.wooledge.org/Quotes.

2) If you're going to use a shell read loop then always use both `IFS=`
and `-r`:

while IFS= read -r line

unless you have an explicit reason not to, see
https://mywiki.wooledge.org/BashFAQ/001.

3) Don't use a shell loop just to manipulate text as you seem to be
doing, see
https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.
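
To illustrate point 2, here is a small sketch of what `IFS=` and `-r` each protect against. The sample input line is invented for the demonstration:

```shell
#!/bin/sh
# One sample input line with leading blanks and a backslash.
input='   path=C:\temp'

# Plain read: leading whitespace is stripped and the backslash is
# consumed as an escape character.
plain=$(printf '%s\n' "$input" | { read line; printf '%s' "$line"; })

# IFS= keeps the leading whitespace, -r keeps the backslash.
safe=$(printf '%s\n' "$input" | { IFS= read -r line; printf '%s' "$line"; })

printf '[%s]\n' "$plain"   # [path=C:temp]
printf '[%s]\n' "$safe"    # [   path=C:\temp]
```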

Regards,

Ed.

Re: Solution while read -r line ; do problem

<slrnt25e2c.1b8t8.BitTwister@wb.home.test>

https://www.novabbs.com/devel/article-flat.php?id=4993&group=comp.unix.shell#4993
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: BitTwis...@mouse-potato.com (Bit Twister)
Newsgroups: comp.unix.shell
Subject: Re: Solution while read -r line ; do problem
Date: Fri, 4 Mar 2022 19:09:27 -0600
Organization: A noiseless patient Spider
Lines: 55
Message-ID: <slrnt25e2c.1b8t8.BitTwister@wb.home.test>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuai1$5qj$1@dont-email.me>
Injection-Info: reader02.eternal-september.org; posting-host="cc404c070094ead3813f7beae44748c3";
logging-data="21878"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/DxiY/jFVSoJ1y3kz9zjaPV2uE/bBAfnk="
User-Agent: slrn/pre1.0.4-6 (Linux)
Cancel-Lock: sha1:+e8YP7p9SjqZA2F3OyaKnRLwnSI=
 by: Bit Twister - Sat, 5 Mar 2022 01:09 UTC

On Sat, 5 Mar 2022 01:23:29 +0100, Janis Papanagnou wrote:
> On 04.03.2022 23:44, Bit Twister wrote:
>> while read -r line ; do problem
>>
>> $ bash --version
>> GNU bash, version 5.1.4(1)-release (x86_64-mageia-linux-gnu)
>>
>> I have a bash script which reads a script file and updates variables.
>> contents of some lines are modified without my script intervention.
>>
>> Code snippet
>> 1 while read -r line; do
>> 2 _t=$line
>> 3 set -- $(IFS='=' ; echo $_t)
>> 4 _wd=$1
>> 5 case "$_wd" in
>> 6 _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
>> 7 <big if/case snip none of which modify _medicare line>
>> 8 echo $line >> $_tmp_fn
>> 9 done < $_taxes_paid_fn
>>
>> If you look at the following results from set -vx
>> You'll notice the _medicare line * was converted to file names used in the script
>>
>> read -r line
>> _t='_medicare="$(echo "scale=2; 144.60 * 12" | bc)"'
>> : IFS==
>> echo _medicare '"$(echo "scale' '2; 144.60 * 12" | bc)"'
>> set -- _medicare '"$(echo' '"scale' '2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
>> '[' 13 -gt 1 ']'
>> _wd=_medicare
>>
>> Here is the
>> echo $line >> $_tmp_fn
>> which did/has the * jumk/substitution
>>
>> echo '_medicare="$(echo' '"scale=2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
>>
>> How can I prevent the * substitution and still be use the line modification
>> like line 6 in the example snippet??
>
> If you quote your variables on expansion ("$var") the * as part of your
> variable value will not expand to file names.
>
> Janis
>
echo "$line" >> $_tmp_fn
was/is the solution/fix.
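
The unquoted expansion on line 3 of the snippet (set -- $(IFS='=' ; echo $_t)) globs in the same way, as the set -vx trace shows. It is harmless there only because just $1 is used afterwards. One possible way to take the name before the first = without any word splitting or globbing, offered as a sketch and not as what the script actually does:

```shell
#!/bin/sh
line='_medicare="$(echo "scale=2; 144.60 * 12" | bc)"'

# Everything before the first '=': no word splitting, no globbing.
_wd=${line%%=*}

# printf with a quoted argument writes the line back verbatim.
out=$(printf '%s\n' "$line")

printf '%s\n' "$_wd"   # _medicare
```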

--
The warranty and liability expired as you read this message.
If the above breaks your system, it's yours and you keep both pieces.
Practice safe computing. Backup the file before you change it.
Do a, man command_here or cat command_here, before using it.

Re: while read -r line ; do problem

<85ilsravit.fsf@gmail.com>

https://www.novabbs.com/devel/article-flat.php?id=4994&group=comp.unix.shell#4994
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: nsivaram...@gmail.com (Sivaram Neelakantan)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Sun, 06 Mar 2022 22:05:54 +0530
Organization: A noiseless patient Spider
Lines: 32
Message-ID: <85ilsravit.fsf@gmail.com>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="fc3a089b8e5fbc3a48e7e9ecd1ac2988";
logging-data="4175"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+v8SPUT2lf8xwENsZEOVvMWOSW8ljPJwQ="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (windows-nt)
Cancel-Lock: sha1:2lQI5yptL38qUuJjN0ZoPIYjiJU=
sha1:XcMdO8IbFCSpYCV34xhH4N6An3c=
User-Mail-Address: nsivaram.net@gmail.com
 by: Sivaram Neelakantan - Sun, 6 Mar 2022 16:35 UTC

On Fri, Mar 04 2022,Ed Morton wrote:

[snipped 37 lines]

>
> 1) Always quote your shell variables unless you have an explicit reason
> not to, see https://mywiki.wooledge.org/Quotes.
>
> 2) If you're going to use a shell read loop then always use both `IFS=`
> and `-r`:
>
> while IFS= read -r line
>
> unless you have an explicit reason not to, see
> https://mywiki.wooledge.org/BashFAQ/001.
>
> 3) Don't use a shell loop just to manipulate text as you seem to be
> doing, see
> https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.

[snipped 6 lines]

what then, would be a better way to use the shell for line by line
processing? The stackexchange answer clearly says, people are
mimicking C lang style and other issues, which I agree with. What
should a novice do then? Pretty sure, they wouldn't know about
paste/join/cut/comm etc which sort of makes them do all this.

sivaram
--

Re: while read -r line ; do problem

<t02sl3$gr5$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4995&group=comp.unix.shell#4995
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Sun, 6 Mar 2022 18:56:50 +0100
Organization: A noiseless patient Spider
Lines: 33
Message-ID: <t02sl3$gr5$1@dont-email.me>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me> <85ilsravit.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 6 Mar 2022 17:56:51 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="6fffe00221408327fd08e04cdf67ff9d";
logging-data="17253"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/emg2bnmu9G+hZisg0jwqS"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:L8t3y1oedKfYbE0DOtrvHuda5RY=
In-Reply-To: <85ilsravit.fsf@gmail.com>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Sun, 6 Mar 2022 17:56 UTC

On 06.03.2022 17:35, Sivaram Neelakantan wrote:
> On Fri, Mar 04 2022,Ed Morton wrote:
>>
>> 3) Don't use a shell loop just to manipulate text as you seem to be
>> doing, see
>> https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.
>
> what then, would be a better way to use the shell for line by line
> processing? The stackexchange answer clearly says, people are
> mimicking C lang style and other issues, which I agree with. What
> should a novice do then? Pretty sure, they wouldn't know about
> paste/join/cut/comm etc which sort of makes them do all this.

If a novice wants to manipulate data files I'd suggest to not use
the shell "hammer" but an appropriate tool for data manipulation,
e.g. awk. (I agree that a novice not knowing all the Unix tools
will have a harder job learning them - or rather, getting to know
about their existence in the first place -, but awk is simple to
learn and makes a lot of the Unix tools just unnecessary.) In case
of having to do a lot of typical shell process handling based on
that data there's also the possibility to transform the data and
build the data-specific shell commands in awk and pipe it to 'sh':
    awk '...create data-based shell commands...' data-file | sh
If the data is actually just within a few hundreds (or even a few
thousands) of lines I also wouldn't care much using a shell. That
depends on the data, its transformation, and application, though.
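
A rough sketch of the awk-to-sh pattern described above. The data file, its two-column layout and the generated commands are all hypothetical. Because the data ends up being executed by sh, this only suits input you trust:

```shell
#!/bin/sh
# Hypothetical data file: one "name value" record per line.
work=$(mktemp -d)
printf '%s\n' 'alpha 10' 'beta 20' > "$work/data.txt"

# awk turns each record into one shell command; sh then runs them.
awk -v dir="$work" '{ printf "echo %s > \"%s/%s.out\"\n", $2, dir, $1 }' \
    "$work/data.txt" | sh

cat "$work/alpha.out"   # 10
cat "$work/beta.out"    # 20
```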

Janis

>
> sivaram
>

Re: while read -r line ; do problem

<t02uml$1ji$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4996&group=comp.unix.shell#4996
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Sun, 6 Mar 2022 19:31:48 +0100
Organization: A noiseless patient Spider
Lines: 14
Message-ID: <t02uml$1ji$1@dont-email.me>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me> <85ilsravit.fsf@gmail.com>
<t02sl3$gr5$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 6 Mar 2022 18:31:49 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="6fffe00221408327fd08e04cdf67ff9d";
logging-data="1650"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX180MJf+FdN5KFdsGHu4ttAR"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:Z83MIDxFw39oRA1DxUR8TaQ+iiU=
In-Reply-To: <t02sl3$gr5$1@dont-email.me>
 by: Janis Papanagnou - Sun, 6 Mar 2022 18:31 UTC

On 06.03.2022 18:56, Janis Papanagnou wrote:
>
> If the data is actually just within a few hundreds (or even a few
> thousands) of lines I also wouldn't care much using a shell. That
> depends on the data, its transformation, and application, though.

Oops - I think the semantics of this was wrongly formulated. I meant:

If the data is actually just within a few hundreds (or even a few
thousands) of lines I also wouldn't care much _and use_ a shell.
[and of course, if appropriate]

Janis

Re: while read -r line ; do problem

<t0328j$up3$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4997&group=comp.unix.shell#4997
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: mortons...@gmail.com (Ed Morton)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Sun, 6 Mar 2022 13:32:34 -0600
Organization: A noiseless patient Spider
Lines: 47
Message-ID: <t0328j$up3$1@dont-email.me>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me> <85ilsravit.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 6 Mar 2022 19:32:35 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8691835d21348c5a06698dfd0f64a061";
logging-data="31523"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19TSBmxo80LS2pC+7QK75DM"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.1
Cancel-Lock: sha1:ZKOzLDjG01nd19RPFUedycdJ8b4=
In-Reply-To: <85ilsravit.fsf@gmail.com>
X-Antivirus-Status: Clean
Content-Language: en-US
X-Antivirus: Avast (VPS 220306-4, 3/6/2022), Outbound message
 by: Ed Morton - Sun, 6 Mar 2022 19:32 UTC

On 3/6/2022 10:35 AM, Sivaram Neelakantan wrote:
> On Fri, Mar 04 2022,Ed Morton wrote:
>
>
> [snipped 37 lines]
>
>>
>> 1) Always quote your shell variables unless you have an explicit reason
>> not to, see https://mywiki.wooledge.org/Quotes.
>>
>> 2) If you're going to use a shell read loop then always use both `IFS=`
>> and `-r`:
>>
>> while IFS= read -r line
>>
>> unless you have an explicit reason not to, see
>> https://mywiki.wooledge.org/BashFAQ/001.
>>
>> 3) Don't use a shell loop just to manipulate text as you seem to be
>> doing, see
>> https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.
>
> [snipped 6 lines]
>
> what then, would be a better way to use the shell for line by line
> processing? The stackexchange answer clearly says, people are
> mimicking C lang style and other issues, which I agree with. What
> should a novice do then? Pretty sure, they wouldn't know about
> paste/join/cut/comm etc which sort of makes them do all this.

People shouldn't be writing shell scripts unless they do know about the
most common mandatory POSIX tools though. Doing so would be like trying
to build a house when all you know how to use is a toolbelt and you've
never heard of a hammer/screwdriver/saw/drill etc that the toolbelt is
designed to hold.

In general if you want to do small, simple operations then use tools
like sed, grep, cut, etc. but if you find yourself creating lengthy
and/or complicated pipelines of those or being tempted to write a shell
loop to process multi-line text then you should be using awk instead.
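
As an illustration, the kind of line rewriting the original poster was doing in a read loop might look like this in awk. The file contents, the variable new_worth and its value are assumptions made up for the sketch:

```shell
#!/bin/sh
# Hypothetical input resembling the file in the original post.
work=$(mktemp -d)
cat > "$work/taxes.txt" <<'EOF'
_ira_worth=0
_medicare="$(echo "scale=2; 144.60 * 12" | bc)"
EOF

# Rewrite the _ira_worth line and copy every other line untouched.
# The new value is passed through the environment so that awk does not
# interpret backslash escapes in it.
result=$(new_worth=12345 awk -F'=' '
    $1 == "_ira_worth" { print "_ira_worth=" ENVIRON["new_worth"]; next }
    { print }
' "$work/taxes.txt")

printf '%s\n' "$result"
```

No shell expansion ever touches the text, so the * in the _medicare line cannot be globbed.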

Again - the above is about manipulating text. If you find yourself
needing to manipulate (create/destroy) files or processes THEN a shell
loop may be appropriate (if xargs isn't a better solution).
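
By contrast, a sketch of a loop that manipulates files rather than text. The scratch directory and file names are invented for the example:

```shell
#!/bin/sh
work=$(mktemp -d)
touch "$work/a.txt" "$work/b c.txt"

# The loop acts on the files themselves; quoting keeps the name
# containing a space intact.
for f in "$work"/*.txt; do
    mv -- "$f" "${f%.txt}.bak"
done

ls "$work"
```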

Ed.

Re: while read -r line ; do problem

<85ee3eauii.fsf@gmail.com>

https://www.novabbs.com/devel/article-flat.php?id=4998&group=comp.unix.shell#4998
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: nsivaram...@gmail.com (Sivaram Neelakantan)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Mon, 07 Mar 2022 16:39:57 +0530
Organization: A noiseless patient Spider
Lines: 46
Message-ID: <85ee3eauii.fsf@gmail.com>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me> <85ilsravit.fsf@gmail.com>
<t0328j$up3$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="e372e95cc4de23fbbb9ed297d78e3680";
logging-data="29944"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18WRyLtVkQhYO1hHDgeF6gjREkchLdPQzI="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (windows-nt)
Cancel-Lock: sha1:koXEXaUpk2R6pVm59pXPT5Y5pzY=
sha1:gC2dgHIODh+vXVsWyUrF2sx+Xqo=
User-Mail-Address: nsivaram.net@gmail.com
 by: Sivaram Neelakantan - Mon, 7 Mar 2022 11:09 UTC

On Sun, Mar 06 2022,Ed Morton wrote:

[snipped 26 lines]

>
> People shouldn't be writing shell scripts unless they do know about the
> most common mandatory POSIX tools though. Doing so would be like trying
> to build a house when all you know how to use is a toolbelt and you've
> never heard of a hammer/screwdriver/saw/drill etc that the toolbelt is
> designed to hold.

On that standard, no one would ever get started on shell then, would
they? Me, I started with Pike's UPE on a linux bash shell till I saw
people getting politely chewed out for not being posixy/portable in
c.u.shell. And I'm not the only clown in this circus. And no, I
haven't seen/read one posix doc, though I have seen it being quoted
here.

>
> In general if you want to do small, simple operations then use tools
> like sed, grep, cut, etc. but if you find yourself creating lengthy
> and/or complicated pipelines of those or being tempted to write a shell
> loop to process multi-line text then you should be using awk instead.
>

As a low level sysadmin thrown in the deep end of a bog standard prod
support project decades ago, I have seen 5/10/15 yr scripts with the
above abused paradigm. I didn't touch or change it nor did the
retiring AT&T/Sprint/H3G/others chap who handed it over to me.
Unfortunately I used to use the template because it's been working for
so long. Talk about picking the one idea out of the 1000s of shell scripts
which was bad. :-)

> Again - the above is about manipulating text. If you find yourself
> needing to manipulate (create/destroy) files or processes THEN a shell
> loop may be appropriate (if xargs isn't a better solution).
>

I suspect that with no one telling what's the best or optimal way to
save tears down the road, it's just like the mess you described. It's a
good thing that my mistakes are generally not earth altering....so
far.

sivaram
--

Re: while read -r line ; do problem

<t04veh$tml$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=4999&group=comp.unix.shell#4999
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: mortons...@gmail.com (Ed Morton)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Mon, 7 Mar 2022 06:56:49 -0600
Organization: A noiseless patient Spider
Lines: 76
Message-ID: <t04veh$tml$1@dont-email.me>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me> <85ilsravit.fsf@gmail.com>
<t0328j$up3$1@dont-email.me> <85ee3eauii.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 7 Mar 2022 12:56:49 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="22982171315ad1a7c49cd808b4b56ae0";
logging-data="30421"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19NztC8ojCwWnafQpiUpuNA"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.1
Cancel-Lock: sha1:DKzUIi1So7RS8xzl81+YKwH9zSs=
In-Reply-To: <85ee3eauii.fsf@gmail.com>
X-Antivirus-Status: Clean
Content-Language: en-US
X-Antivirus: Avast (VPS 220307-0, 3/6/2022), Outbound message
 by: Ed Morton - Mon, 7 Mar 2022 12:56 UTC

On 3/7/2022 5:09 AM, Sivaram Neelakantan wrote:
> On Sun, Mar 06 2022,Ed Morton wrote:
>
>
> [snipped 26 lines]
>
>>
>> People shouldn't be writing shell scripts unless they do know about the
>> most common mandatory POSIX tools though. Doing so would be like trying
>> to build a house when all you know how to use is a toolbelt and you've
>> never heard of a hammer/screwdriver/saw/drill etc that the toolbelt is
>> designed to hold.
>
> On that standard, no one would ever get started on shell then, would
> they?

You're suggesting people are out there learning to write:

while IFS= read -r line; do
    if [[ $line =~ foo ]]; then
        echo "$line"
    fi
done < file

before they've learned:

grep 'foo' file

I don't buy it.

> Me, I started with Pike's UPE on a linux bash shell till I saw
> people getting politely chewed out for not being posixy/portable in
> c.u.shell. And I'm not the only clown in this circus. And no, I
> haven't seen/read one posix doc, though I have seen it being quoted
> here.

I'm not saying you need to read POSIX docs, I'm saying you need to have
heard of the most common tools that are required by POSIX to exist on
all Unix systems.

It's been 40+ years since I learned about Unix but as I recall the
starting point was "here's how to find text in a file" followed by grep
and ditto for when/how to call sed, cut, head, tail, etc. It was only
later we learned how to write shell scripts to glue them together and I
can't imagine how it'd make any sense to learn how to write the glue first.

Ed.

>
>>
>> In general if you want to do small, simple operations then use tools
>> like sed, grep, cut, etc. but if you find yourself creating lengthy
>> and/or complicated pipelines of those or being tempted to write a shell
>> loop to process multi-line text then you should be using awk instead.
>>
>
> As a low level sysadmin thrown in the deep end of a bog standard prod
> support project decades ago, I have seen 5/10/15 yr scripts with the
> above abused paradigm. I didn't touch or change it nor did the
> retiring AT&T/Sprint/H3G/others chap who handed it over to me.
> Unfortunately I used to use the template because it's been working for
> so long. Talk about picking the one idea of the 1000s of shell script
> which was bad. :-)
>
>> Again - the above is about manipulating text. If you find yourself
>> needing to manipulate (create/destroy) files or processes THEN a shell
>> loop may be appropriate (if xargs isn't a better solution).
>>
>
> I suspect that with no one telling what's the best or optimal way to
> save tears down the road, it's just like the mess you described. It's
> good thing that my mistakes are generally not earth altering....so
> far.
>
> sivaram

Iterating over a collection in shell (Was: while read -r line ; do problem)

<t080un$1cbv1$1@news.xmission.com>

https://www.novabbs.com/devel/article-flat.php?id=5000&group=comp.unix.shell#5000
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!xmission!nnrp.xmission!.POSTED.shell.xmission.com!not-for-mail
From: gaze...@shell.xmission.com (Kenny McCormack)
Newsgroups: comp.unix.shell
Subject: Iterating over a collection in shell (Was: while read -r line ; do problem)
Date: Tue, 8 Mar 2022 16:40:55 -0000 (UTC)
Organization: The official candy of the new Millennium
Message-ID: <t080un$1cbv1$1@news.xmission.com>
References: <slrnt255id.1avaq.BitTwister@wb.home.test> <svuc2t$egk$1@dont-email.me> <85ilsravit.fsf@gmail.com>
Injection-Date: Tue, 8 Mar 2022 16:40:55 -0000 (UTC)
Injection-Info: news.xmission.com; posting-host="shell.xmission.com:166.70.8.4";
logging-data="1454049"; mail-complaints-to="abuse@xmission.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: gazelle@shell.xmission.com (Kenny McCormack)
 by: Kenny McCormack - Tue, 8 Mar 2022 16:40 UTC

In article <85ilsravit.fsf@gmail.com>,
Sivaram Neelakantan <nsivaram.net@gmail.com> wrote:
....
>what then, would be a better way to use the shell for line by line
>processing? The stackexchange answer clearly says, people are mimicking
>C lang style and other issues, which I agree with. What should a novice
>do then? Pretty sure, they wouldn't know about paste/join/cut/comm etc
>which sort of makes them do all this.

I usually use mapfile (in bash) for this. mapfile reads an entire file or
the output of a process into an array (MAPFILE by default). Then you can
iterate over the array. So, you end up with:

mapfile -t < file
for i in "${MAPFILE[@]}"
do
    ...
done

Or, to do it with a process (the more common case):

mapfile -t < <(process)
for i in "${MAPFILE[@]}"
do
    ...
done
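
A self-contained sketch of the same idea, using a named array instead of the default MAPFILE. The array name lines and the sample input are arbitrary choices for the demonstration, and mapfile needs bash 4 or later:

```shell
#!/bin/bash
# Read the output of a process into the array "lines", one element per line.
# -t strips the trailing newline from each element.
mapfile -t lines < <(printf '%s\n' 'first line' '144.60 * 12' '  indented')

count=${#lines[@]}

# Quoting "${lines[@]}" keeps each line as one word and prevents globbing.
for i in "${lines[@]}"; do
    printf '[%s]\n' "$i"
done
```

Each element comes out exactly as it was read, including the * and the leading blanks.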

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/FiftyPercent

Re: Iterating over a collection in shell (Was: while read -r line ; do problem)

<t084c1$bn1$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=5001&group=comp.unix.shell#5001
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: Iterating over a collection in shell (Was: while read -r line ; do problem)
Date: Tue, 8 Mar 2022 18:39:13 +0100
Organization: A noiseless patient Spider
Lines: 40
Message-ID: <t084c1$bn1$1@dont-email.me>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me> <85ilsravit.fsf@gmail.com>
<t080un$1cbv1$1@news.xmission.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 8 Mar 2022 17:39:13 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="6c67cd80f66ada3c342cff97dfd101e8";
logging-data="12001"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/U/0Rc8anctBMbCU+kp0BY"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:AI5flGRbLdK4B2YuthVEIkwCqi0=
In-Reply-To: <t080un$1cbv1$1@news.xmission.com>
 by: Janis Papanagnou - Tue, 8 Mar 2022 17:39 UTC

On 08.03.2022 17:40, Kenny McCormack wrote:
> In article <85ilsravit.fsf@gmail.com>,
> Sivaram Neelakantan <nsivaram.net@gmail.com> wrote:
> ...
>> what then, would be a better way to use the shell for line by line
>> processing? The stackexchange answer clearly says, people are mimicking
>> C lang style and other issues, which I agree with. What should a novice
>> do then? Pretty sure, they wouldn't know about paste/join/cut/comm etc
>> which sort of makes them do all this.
>
> I usually use MAPFILE (in bash) for this. MAPFILE reads an entire file or
> process into an array. Then you can iterate the array. So, you end up
> with:
>
> mapfile -t < file
> for i in "${MAPFILE[@]}"
> do
> ...
> done
>

Nice bash feature, didn't know it.

In other shells (like the ksh I use) I'd have to do something like

IFS=$'\n' MAPFILE=( $(< mapfile-data) )

to populate an array.
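
For comparison, here is a sketch of the word-splitting approach written so that it leaves the shell state as it found it. It is shown in bash syntax (I believe ksh93 accepts it too, but that is an assumption); the sample file and the array name "lines" are made up for the example. Two caveats: pathname expansion has to be switched off (set -f) or a line such as "*" is replaced by file names, and because newline is IFS whitespace, empty lines are dropped, whereas mapfile would have kept them.

```shell
#!/usr/bin/env bash
# Sample data: a line with a space, an empty line, and a glob character.
datafile=$(mktemp)
printf '%s\n' 'alpha beta' '' '*' > "$datafile"

old_ifs=$IFS        # remember the current field separators
set -f              # disable pathname expansion while splitting
IFS=$'\n'           # split on newlines only
lines=( $(< "$datafile") )
IFS=$old_ifs        # restore the previous splitting behaviour
set +f              # re-enable globbing (assumes it was enabled before)

rm -f "$datafile"

printf 'got %d elements\n' "${#lines[@]}"
for l in "${lines[@]}"; do
    printf '[%s]\n' "$l"
done
```

This prints "got 2 elements", then "[alpha beta]" and "[*]"; the empty line is gone.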

Janis

> Or, to do it with a process (the more common case):
>
> mapfile -t < <(process)
> for i in "${MAPFILE[@]}"
> do
> ...
> done
>

Re: Iterating over a collection in shell

<85a6dzc06c.fsf@gmail.com>

https://www.novabbs.com/devel/article-flat.php?id=5002&group=comp.unix.shell#5002
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: nsivaram...@gmail.com (Sivaram Neelakantan)
Newsgroups: comp.unix.shell
Subject: Re: Iterating over a collection in shell
Date: Wed, 09 Mar 2022 08:16:51 +0530
Organization: A noiseless patient Spider
Lines: 27
Message-ID: <85a6dzc06c.fsf@gmail.com>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me> <85ilsravit.fsf@gmail.com>
<t080un$1cbv1$1@news.xmission.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="f01d4f86b8bd79b68e878de18016b797";
logging-data="29545"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Jephn5+qtQ3bD5Qvo1ldBB80oErDOdno="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (windows-nt)
Cancel-Lock: sha1:Cymbli7beOK2UyEA2RWmxKR8fpo=
sha1:xBDkKfInHkbQTo+Rw3Bg7fsL8MQ=
User-Mail-Address: nsivaram.net@gmail.com
 by: Sivaram Neelakantan - Wed, 9 Mar 2022 02:46 UTC

On Tue, Mar 08 2022, Kenny McCormack wrote:

[snipped 9 lines]

> I usually use MAPFILE (in bash) for this. MAPFILE reads an entire file or
> process into an array. Then you can iterate the array. So, you end up
> with:
>
> mapfile -t < file
> for i in "${MAPFILE[@]}"
> do
> ...
> done
>
> Or, to do it with a process (the more common case):
>
> mapfile -t < <(process)
> for i in "${MAPFILE[@]}"
> do
> ...
> done

Thanks for this, this is news to me.

sivaram
--

Re: while read -r line ; do problem

<q0hlfi-9pj.ln1@wilbur.25thandClement.com>

https://www.novabbs.com/devel/article-flat.php?id=5003&group=comp.unix.shell#5003
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!buffer2.nntp.dca1.giganews.com!buffer1.nntp.dca1.giganews.com!news.giganews.com.POSTED!not-for-mail
NNTP-Posting-Date: Tue, 08 Mar 2022 21:00:01 -0600
Message-ID: <q0hlfi-9pj.ln1@wilbur.25thandClement.com>
From: will...@25thandClement.com (William Ahern)
Subject: Re: while read -r line ; do problem
Newsgroups: comp.unix.shell
References: <slrnt255id.1avaq.BitTwister@wb.home.test> <svuc2t$egk$1@dont-email.me>
User-Agent: tin/2.4.4-20191224 ("Millburn") (OpenBSD/7.0 (amd64))
Date: Tue, 8 Mar 2022 18:46:18 -0800
Lines: 75
X-Usenet-Provider: http://www.giganews.com
X-Trace: sv3-ASSWHTzCk3h3867vaUmJtx9+KlpOfFd0hLUeMTHDRYaSTRl0zjC7PINFttwhJQiZanIHWz+0qjAz8oe!FnOh+onU3sTLjYglbRjitLpW+STlesbCcPvZN+eIyHoN5WKeihmOb4j0mOealIcwkg==
X-Complaints-To: abuse@giganews.com
X-DMCA-Notifications: http://www.giganews.com/info/dmca.html
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
X-Original-Bytes: 4937
 by: William Ahern - Wed, 9 Mar 2022 02:46 UTC

Ed Morton <mortonspam@gmail.com> wrote:
> On 3/4/2022 4:44 PM, Bit Twister wrote:
>> while read -r line ; do problem
>>
>> $ bash --version
>> GNU bash, version 5.1.4(1)-release (x86_64-mageia-linux-gnu)
>>
>> I have a bash script which reads a script file and updates variables.
>> contents of some lines are modified without my script intervention.
>>
>> Code snippet
>> 1 while read -r line; do
>> 2 _t=$line
>> 3 set -- $(IFS='=' ; echo $_t)
>> 4 _wd=$1
>> 5 case "$_wd" in
>> 6 _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
>> 7 <big if/case snip none of which modify _medicare line>
>> 8 echo $line >> $_tmp_fn
>> 9 done < $_taxes_paid_fn
<snip>
> 3) Don't use a shell loop just to manipulate text as you seem to be
> doing, see
> https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.
>

IMO, this is not great advice.

1) If raw throughput matters, you shouldn't be using shell text processing
in the first place; use sed or awk, at the very least, instead.

2) If the input is coming from a pipe, what the read loop buys you is
concurrency and parallelism. If the process generating the input has high
latency, the concurrency can help tremendously. If either side uses a lot of
CPU, the parallelism might help performance, overcoming the byte-by-byte
issue.

Case example: last year I downloaded a company engineer-managed script that
updated routing tables, created as a workaround for a poorly managed IPSec
VPN configuration deployed on company laptops. When I ran the script it
seemed to hang, so I'd kill it and run it again. After a few minutes I
decided to dive into the script to figure out what was happening. The
fundamental problem was that they had one routine generating a list of
addresses and another routine consuming the list in a loop. Crucially, the
latter, second routine was using the Bash'ism to slurp the input into an
array for processing using a for-loop rather than a while-read-loop. The
address-generating routine was doing network I/O to download and preprocess
the lists, which was taking considerable time. Meanwhile, the second loop
was completely idle waiting for the first to finish. The second loop also
incurred some surprisingly high latency per address (IIRC, might have been
every invocation of route(1) doing reverse DNS or some such). Long story
short, the entire script took much longer to complete than if they had used
a simple while-read-loop, permitting both loops to run concurrently. Plus,
the script would have provided immediate feedback that things were actually
progressing.
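
A rough sketch of the difference, with made-up names: slow_producer stands in for the address-generating routine (the sleep plays the part of the network I/O), and the log file only records the order of events. This is not the original script, just an illustration of the two consumption styles in bash.

```shell
#!/usr/bin/env bash
log=$(mktemp)

# Hypothetical producer: notes each item in the log, emits it, then
# pauses to imitate slow network I/O (fractional sleep is an extension
# supported by GNU and BSD sleep).
slow_producer() {
    local n
    for n in 1 2 3; do
        echo "produced $n" >> "$log"
        echo "$n"
        sleep 0.3
    done
}

# Streaming: the loop handles each item as soon as it arrives.
: > "$log"
slow_producer | while IFS= read -r item; do
    echo "consumed $item" >> "$log"
done
stream_order=$(paste -s -d ',' "$log")

# Slurping: mapfile returns only after the producer has finished.
: > "$log"
mapfile -t items < <(slow_producer)
for item in "${items[@]}"; do
    echo "consumed $item" >> "$log"
done
slurp_order=$(paste -s -d ',' "$log")

rm -f "$log"
echo "stream: $stream_order"
echo "slurp:  $slurp_order"
```

On an otherwise idle machine the streaming run should interleave the produced and consumed entries, while the slurping run lists every produced entry before the first consumed one.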

You see something similar with the widespread adoption of map-filter-reduce
functional patterns in languages like JavaScript. The current popularity
seems to have been kicked off by admiration for Haskell-style algorithms,
which once upon a time were popular blog fodder. But Haskell uses lazy list
evaluation, unlike languages like JavaScript. The result is that the new
preferred pattern results in a tremendous amount of memory usage and churn,
as every transformation step requires constructing and populating a whole
new array. It makes for some horribly inefficient programs; inefficient in a
quite opaque way, whereas with traditional patterns the unnecessary array
duplications would be immediately obvious, particularly if reading the code
with an eye toward improving performance. (You also aren't creating a bunch of
closures, which can create barriers to JIT optimization.)

Some of the old patterns--e.g. shell pipes--are far more sophisticated than
people give them credit for today. See, e.g., this 2014 paper by Doug
McIlroy, inventor of the Unix pipe, describing the equivalency between
coroutines, pipes, and lazy lists:

https://www.cs.dartmouth.edu/~doug/sieve/sieve.pdf

Re: while read -r line ; do problem

<t0a756$446$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=5004&group=comp.unix.shell#5004
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: mortons...@gmail.com (Ed Morton)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Wed, 9 Mar 2022 06:39:02 -0600
Organization: A noiseless patient Spider
Lines: 91
Message-ID: <t0a756$446$1@dont-email.me>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me> <q0hlfi-9pj.ln1@wilbur.25thandClement.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 9 Mar 2022 12:39:03 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1055e4d3632c5a1ea90c5770534a976c";
logging-data="4230"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18AVixyPNcz3UWoPn4ygOx+"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.6.2
Cancel-Lock: sha1:SqR3W3gVYmOdusFlKSvfCzqKzBc=
In-Reply-To: <q0hlfi-9pj.ln1@wilbur.25thandClement.com>
X-Antivirus-Status: Clean
Content-Language: en-US
X-Antivirus: Avast (VPS 220308-2, 3/8/2022), Outbound message
 by: Ed Morton - Wed, 9 Mar 2022 12:39 UTC

On 3/8/2022 8:46 PM, William Ahern wrote:
> Ed Morton <mortonspam@gmail.com> wrote:
>> On 3/4/2022 4:44 PM, Bit Twister wrote:
>>> while read -r line ; do problem
>>>
>>> $ bash --version
>>> GNU bash, version 5.1.4(1)-release (x86_64-mageia-linux-gnu)
>>>
>>> I have a bash script which reads a script file and updates variables.
>>> contents of some lines are modified without my script intervention.
>>>
>>> Code snippet
>>> 1 while read -r line; do
>>> 2 _t=$line
>>> 3 set -- $(IFS='=' ; echo $_t)
>>> 4 _wd=$1
>>> 5 case "$_wd" in
>>> 6 _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
>>> 7 <big if/case snip none of which modify _medicare line>
>>> 8 echo $line >> $_tmp_fn
>>> 9 done < $_taxes_paid_fn
> <snip>
>> 3) Don't use a shell loop just to manipulate text as you seem to be
>> doing, see
>> https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.
>>
>
> IMO, this is not great advice.
>
> 1) If raw throughput matters, you shouldn't be using shell text processing
> in the first place; use sed or awk, at the very least, instead.

That's the same advice.

I'm not sure what you're saying below. It sounds like you're discussing
some bad software you came across that you improved by replacing a
couple of for loops with a while loop but obviously that doesn't mean it
couldn't have been further improved by using, say, awk instead. Can you
provide a concise sample shell script that clearly and simply just
demonstrates what you're describing below and some way of generating
sample input to help us understand what you're describing and so we can
test it?

Ed.

>
> 2) If the input is coming from a pipe, what the read loop buys you is
> concurrency and parallelism. If the process generating the input has high
> latency, the concurrency can help tremendously. If either side uses a lot of
> CPU, the parallelism might help performance, overcoming the byte-by-byte
> issue.
>
> Case example: last year I downloaded a company engineer-managed script that
> updated routing tables, created as a workaround for a poorly managed IPSec
> VPN configuration deployed on company laptops. When I ran the script it
> seemed to hang, so I'd kill it and run it again. After a few minutes I
> decided to dive into the script to figure out what was happening. The
> fundamental problem was that they had one routine generating a list of
> addresses and another routine consuming the list in a loop. Crucially, the
> latter, second routine was using the Bash'ism to slurp the input into an
> array for processing using a for-loop rather than a while-read-loop. The
> address-generating routine was doing network I/O to download and preprocess
> the lists, which was taking considerable time. Meanwhile, the second loop
> was completely idle waiting for the first to finish. The second loop also
> incurred some surprisingly high latency per address (IIRC, might have been
> every invocation of route(1) doing reverse DNS or some such). Long story
> short, the entire script took much longer to complete than if they had used
> a simple while-read-loop, permitting both loops to run concurrently. Plus,
> the script would have provided immediate feedback that things were actually
> progressing.
>
> You see something similar with the widespread adoption of map-filter-reduce
> functional patterns in languages like JavaScript. The current popularity
> seems to have been kicked off by admiration for Haskell-style algorithms,
> which once upon a time were popular blog fodder. But Haskell uses lazy list
> evaluation, unlike languages like JavaScript. The result is that the new
> preferred pattern results in a tremendous amount of memory usage and churn,
> as every transformation step requires constructing and populating a whole
> new array. It makes for some horribly inefficient programs; inefficient in a
> quite opaque way, whereas with traditional patterns the unnecessary array
> duplications would be immediately obvious, particularly if reading the code
> with an eye toward improving performance. (You also aren't creating a bunch of
> closures, which can create barriers to JIT optimization.)
>
> Some of the old patterns--e.g. shell pipes--are far more sophisticated than
> people give them credit for today. See, e.g., this 2014 paper by Doug
> McIlroy, inventor of the Unix pipe, describing the equivalency between
> coroutines, pipes, and lazy lists:
>
> https://www.cs.dartmouth.edu/~doug/sieve/sieve.pdf

Re: while read -r line ; do problem

<t0b6p4$j0c$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=5005&group=comp.unix.shell#5005
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: janis_pa...@hotmail.com (Janis Papanagnou)
Newsgroups: comp.unix.shell
Subject: Re: while read -r line ; do problem
Date: Wed, 9 Mar 2022 22:38:43 +0100
Organization: A noiseless patient Spider
Lines: 129
Message-ID: <t0b6p4$j0c$1@dont-email.me>
References: <slrnt255id.1avaq.BitTwister@wb.home.test>
<svuc2t$egk$1@dont-email.me> <q0hlfi-9pj.ln1@wilbur.25thandClement.com>
<t0a756$446$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 9 Mar 2022 21:38:44 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="9c8850897d30268560bbae1a99b92388";
logging-data="19468"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+d0g7WHl367Sfgp6B2V4j1"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:fcK8V7CqqhoCFVNXh7+tIlW3zjI=
In-Reply-To: <t0a756$446$1@dont-email.me>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Wed, 9 Mar 2022 21:38 UTC

On 09.03.2022 13:39, Ed Morton wrote:
> On 3/8/2022 8:46 PM, William Ahern wrote:
>
> I'm not sure what you're saying below. It sounds like you're discussing
> some bad software you came across that you improved by replacing a
> couple of for loops with a while loop but obviously that doesn't mean it
> couldn't have been further improved by using, say, awk instead. Can you
> provide a concise sample shell script that clearly and simply just
> demonstrates what you're describing below and some way of generating
> sample input to help us understand what you're describing and so we can
> test it?

What I had associated with the described text was...

1) A for loop, which reads completely constructed data, gets replaced
by a sequential and parallelisable processing pipe, e.g.

for f in `ls` # often seen [bad] pattern
for f in * # implicitly also sorting

(note: the "lazy evaluation" concept that the poster mentioned, if
implemented in shell, could probably handle both cases as well)

vs.

ls | while read

That example does not accurately describe the intention, since 'ls'
also sorts implicitly and therefore has to store all elements first.
So replace 'ls' by 'some_arbitrary_non_buffering_data_generator'.

2)
Then there is the hypothesis that the cost of the slow (character-wise)
'while read' could be negligible [in certain cases] if the left-hand-side
process requires more execution time than the character-wise read.
So replace 'ls' by 'some_arbitrary_non_buffering_slow_data_generator'.

3)
Many processes can be parallelised on modern systems (scheduling on
multi-core or multi-CPU systems), so

x | y | z # may run in parallel

(note: the individual processes may also slow down the pipe when
storing huge amounts of data; e.g. in cases like using 'sort')

vs.

xyz # monolithic tool (hypothesis: non-parallelisable)

(note: the latter presumes that the processing steps have to be
or are implemented in a linear, sequential way in 'xyz')

4)
Finally, co-processes are mentioned (not directly related); they are
directly supported by the modern powerful shells (or use external Unix tools).
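
Point 3 can be checked with a small timing sketch. The three "sleep 1" commands are placeholders for x, y and z and exchange no data, so this only shows that the shell starts all stages of a pipeline at once. SECONDS is a bash/ksh variable with one-second resolution, hence the loose comparison rather than exact figures.

```shell
#!/usr/bin/env bash
# Three stages started as one pipeline: they run at the same time.
SECONDS=0
sleep 1 | sleep 1 | sleep 1
pipeline_elapsed=$SECONDS

# The same three commands run one after another.
SECONDS=0
sleep 1; sleep 1; sleep 1
sequential_elapsed=$SECONDS

echo "pipeline:   about ${pipeline_elapsed}s"
echo "sequential: about ${sequential_elapsed}s"
```

The pipeline should finish in roughly the time of its slowest stage, the sequential run in roughly the sum of all three.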

This is my interpretation of the text. (The author will correct any
misunderstandings, I hope.) Note: I am just interpreting, not judging
what has been said (or what I think has been said).

Janis

>
> Ed.
>
>>
>> 2) If the input is coming from a pipe, what the read loop buys you is
>> concurrency and parallelism. If the process generating the input has high
>> latency, the concurrency can help tremendously. If either side uses
>> a lot of
>> CPU, the parallelism might help performance, overcoming the byte-by-byte
>> issue.
>>
>> Case example: last year I downloaded a company engineer-managed script
>> that
>> updated routing tables, created as a workaround for a poorly managed
>> IPSec
>> VPN configuration deployed on company laptops. When I ran the script it
>> seemed to hang, so I'd kill it and run it again. After a few minutes I
>> decided to dive into the script to figure out what was happening. The
>> fundamental problem was that they had one routine generating a list of
>> addresses and another routine consuming the list in a loop. Crucially,
>> the
>> latter, second routine was using the Bash'ism to slurp the input into an
>> array for processing using a for-loop rather than a while-read-loop. The
>> address-generating routine was doing network I/O to download and
>> preprocess
>> the lists, which was taking considerable time. Meanwhile, the second loop
>> was completely idle waiting for the first to finish. The second loop also
>> incurred some surprisingly high latency per address (IIRC, might have
>> been
>> every invocation of route(1) doing reverse DNS or some such). Long story
>> short, the entire script took much longer to complete than if they had
>> used
>> a simple while-read-loop, permitting both loops to run concurrently.
>> Plus,
>> the script would have provided immediate feedback that things were
>> actually
>> progressing.
>>
>> You see something similar with the widespread adoption of
>> map-filter-reduce
>> functional patterns in languages like JavaScript. The current popularity
>> seems to have been kicked off by admiration for Haskell-style algorithms,
>> which once upon a time were popular blog fodder. But Haskell uses lazy
>> list
>> evaluation, unlike languages like JavaScript. The result is that the new
>> preferred pattern results in a tremendous amount of memory usage and
>> churn,
>> as every transformation step requires constructing and populating a whole
>> new array. It makes for some horribly inefficient programs;
>> inefficient in a
>> quite opaque way, whereas with traditional patterns the unnecessary array
>> duplications would be immediately obvious, particularly if reading the
>> code
>> with an eye toward improving performance. (You also aren't creating a
>> bunch of
>> closures, which can create barriers to JIT optimization.)
>>
>> Some of the old patterns--e.g. shell pipes--are far more sophisticated
>> than
>> people give them credit for today. See, e.g., this 2014 paper by Doug
>> McIlroy, inventor of the Unix pipe, describing the equivalency between
>> coroutines, pipes, and lazy lists:
>>
>> https://www.cs.dartmouth.edu/~doug/sieve/sieve.pdf
>
