Re: Extracting dataframe column with multiple conditions on row values

<mailman.125.1641604882.3079.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=16614&group=comp.lang.python#16614

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: PythonL...@DancesWithMice.info (dn)
Newsgroups: comp.lang.python
Subject: Re: Extracting dataframe column with multiple conditions on row values
Date: Sat, 8 Jan 2022 14:21:05 +1300
Organization: DWM
Lines: 244
Message-ID: <mailman.125.1641604882.3079.python-list@python.org>

by: dn - Sat, 8 Jan 2022 01:21 UTC

Salaam Mahmood,

On 08/01/2022 12.07, Mahmood Naderan via Python-list wrote:
> I have a csv file like this
> V0,V1,V2,V3
> 4,1,1,1
> 6,4,5,2
> 2,3,6,7
>
> And I want to search two rows for a match and find the column. For
> example, I want to search row[0] for 1 and row[1] for 5. The corresponding
> column is V2 (which is the third column). Then I want to return the value
> at row[2] and the found column. The result should be 6 then.

Not quite: isn't the "found column" also required?

> I can manually extract the specified rows (with index 0 and 1 which are
> fixed) and manually iterate over them like arrays to find a match. Then I

Perhaps this idea has been influenced by a similar solution in another
programming language. May I suggest that the better-answer you seek lies
in using Python idioms (as well as Python's tools)...

> key1 = 1
> key2 = 5

Fine, so far - excepting that this 'problem' is likely to be a small
part of some larger system. Accordingly, consider writing it as a
function. In which case, these two "keys" will become
function-parameters (and the two 'results' become return-values).

> row1 = df.iloc[0] # row=[4,1,1,1]
> row2 = df.iloc[1] # row=[6,4,5,2]

This is likely not native-Python. Let's create lists for 'everything',
just-because:

>>> headings = [ "V0","V1","V2","V3" ]
>>> row1 = [4,1,1,1]
>>> row2 = [6,4,5,2]
>>> results = [ 2,3,6,7 ]

Note how I'm using the Python REPL (in a "terminal", type "python" (as
appropriate to your OpSys) at the command-line). IMHO the REPL is a
grossly under-rated tool, and is a very good means towards
trial-and-error, and learning by example. Highly recommended!

> for i in range(len(row1)):

This construction is very much a "code smell": a sign that the code is
not "pythonic" (and perhaps the motivation for this post).

In Python (compared with many other languages) the "for" loop should
actually be pronounced "for-each". In other words when we pair the
code-construct with a list (for example):

for each item in the list the computer should perform some suite of
commands.

(the "suite" is everything 'inside' the for-each-loop - NB my
'Python-betters' will quickly point-out that this feature is not limited
to Python-lists, but will work with any "iterable" - ref:
https://docs.python.org/3/tutorial/controlflow.html#for-statements)

Thus:

>>> for item in headings: print( item )
...
V0
V1
V2
V3
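
To make the contrast concrete, here is a small sketch (my own illustration, re-using the headings list from above; the names by_index, by_item and letters are made up for the example):

```python
headings = ["V0", "V1", "V2", "V3"]

# Index-based loop: the subscript i is managed by hand.
by_index = []
for i in range(len(headings)):
    by_index.append(headings[i])

# For-each loop: Python hands over each item directly.
by_item = []
for heading in headings:
    by_item.append(heading)

print(by_index == by_item)  # True

# The same loop works with any iterable, eg the characters of a string.
letters = []
for character in "V0":
    letters.append(character)

print(letters)  # ['V', '0']
```

Both loops collect the same items; the second simply never mentions a subscript.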

The problem is that when working with matrices/matrixes, a math
background equips one with the idea of indices/indexes, eg the
ubiquitous subscript-i. Accordingly, when reading 'math' where a formula
uses the upper-case Greek "sigma" character, remember that it means "sum
over all" or "for each" value of the subscript!

So, if idiomatic Python avoids explicit indexing or "pointers", how do
we deal with the problem?

Unfortunately, at first glance, the pythonic approach may seem
more-complicated or even somewhat convoluted, but once the concepts
(and/or the Python idioms) are learned, it is quite manageable (and
applicable to many more applications than matrices/matrixes!)...

>     if row1[i] == key1:
>         for j in range(len(row2)):
>             if row2[j] == key2:
>                 res = df.iloc[:,j]
>                 print(res) # 6
>
> Is there any way to use built-in function for a more efficient code?

This is where your idea bears fruit!

There is a Python "built-in function": zip(), which will 'join' lists.
NB do not become confused between zip() and zip archive/compressed files!

Most of the time, reference-book and web-page examples show zip() being
used to zip-together two lists into a single data-construct (which is an
iterable(!)). However, zip() will actually zip-together multiple (more
than two) "iterables". As the manual says:

«zip() returns an iterator of tuples, where the i-th tuple contains the
i-th element from each of the argument iterables.»

Ah, so that's where the math-idea of subscript-i went! It has become
'hidden' in Python's workings - or putting that another way: Python
looks after the subscripting for us (and given that 'out by one' errors
in pointers are a major source of coding-error in other languages,
thank-you very much Python!)
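
A minimal sketch of that behaviour, using made-up lists (letters and numbers are not part of the original question). It also shows one detail worth knowing: zip() stops at the shortest of its arguments.

```python
letters = ["a", "b", "c"]
numbers = [1, 2, 3]

# The i-th tuple holds the i-th element of each argument.
pairs = list(zip(letters, numbers))
print(pairs)  # [('a', 1), ('b', 2), ('c', 3)]

# zip() stops as soon as the shortest iterable is exhausted.
short = list(zip(letters, [10, 20]))
print(short)  # [('a', 10), ('b', 20)]
```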

First re-state the source-data as Python lists, (per above) - except
that I recommend the names be better-chosen to be more meaningful (to
your application)!

Now, (in the REPL) try using zip():

>>> zip( headings, row1, row2, results )
<zip object at 0x7f655cca6bc0>

Does that seem a very good illustration? Not really, but re-read the
quotation from the manual (above) where it says that zip returns an
iterator. If we want to see the values an iterator will produce, then
turn it into a concrete data-structure such as a list, eg:

>>> list( zip( headings, row1, row2, results ) )
[('V0', 4, 6, 2), ('V1', 1, 4, 3), ('V2', 1, 5, 6), ('V3', 1, 2, 7)]

or, to see things more clearly, let me re-type it as:

[
('V0', 4, 6, 2),
('V1', 1, 4, 3),
('V2', 1, 5, 6),
('V3', 1, 2, 7)
]

What we now see is actually a "transpose" of the original 'matrix'
presented in the post/question!

(NB Python will perform this layout for us - read about the pprint library)
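
As a sketch of that pprint suggestion (the width of 40 is my own choice, picked only because it is narrower than the one-line display, which forces one tuple per line):

```python
from pprint import pformat, pprint

headings = ["V0", "V1", "V2", "V3"]
row1 = [4, 1, 1, 1]
row2 = [6, 4, 5, 2]
results = [2, 3, 6, 7]

transposed = list(zip(headings, row1, row2, results))

# The one-line repr is wider than 40 characters, so each tuple
# is placed on a line of its own.
pprint(transposed, width=40)

# pformat() returns the same layout as a string instead of printing it.
layout = pformat(transposed, width=40)
```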

Another method which can also be employed (and which will illustrate the
loop required to code the eventual-solution(!)) is that Python's next()
will extract the first row of the transpose:

>>> row = next( zip( headings, row1, row2, results ) )
>>> row
('V0', 4, 6, 2)

This is all-well-and-good, but that result is a tuple of four items
(corresponding to one column in the way the source-data was explained).

If we need to consider the four individual data-items, that can be
improved using a Python feature called "tuple unpacking". Instead of the
above delivering a tuple which is then assigned to "row", the tuple can
be assigned to four "identifiers", eg

>>> heading, row1_item, row2_item, result = next( zip( headings, row1, row2, results ) )

Which, to prove the case, could be printed:

>>> heading, row1_item, row2_item, result
('V0', 4, 6, 2)

(ref:
https://docs.python.org/3/tutorial/datastructures.html?highlight=tuple%20unpacking#tuples-and-sequences)
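
A small sketch of tuple unpacking on its own (my illustration: the tuple repeats the first row shown above, and the names values and message are made up):

```python
row = ("V0", 4, 6, 2)

# One identifier per item: the counts must match.
heading, row1_item, row2_item, result = row
print(heading, result)  # V0 2

# A starred identifier collects "the rest" into a list.
heading, *values = row
print(values)  # [4, 6, 2]

# Too few identifiers is an error, not a silent truncation.
try:
    first, second = row
except ValueError as error:
    message = str(error)
    print(message)
```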

Thus, if we repeatedly ask for the next() row from the zip-ped
transpose, eventually it will respond with the row starting 'V2' - which
is the desired-result, ie the row containing the 1, the 5, and the 6 -
and, if you follow-through using the REPL, it will be clearly visible.
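
One trap when trying this: next() must be asked of the same zip object each time, because every fresh call to zip() starts again from the beginning. A sketch (the names columns, restart and leftover are my own):

```python
headings = ["V0", "V1", "V2", "V3"]
row1 = [4, 1, 1, 1]
row2 = [6, 4, 5, 2]
results = [2, 3, 6, 7]

# Create the iterator once, then keep asking it for the next tuple.
columns = zip(headings, row1, row2, results)
first = next(columns)
second = next(columns)
third = next(columns)
print(third)  # ('V2', 1, 5, 6)

# A brand-new zip object starts over at the first tuple.
restart = next(zip(headings, row1, row2, results))
print(restart)  # ('V0', 4, 6, 2)

# When nothing is left, next() raises StopIteration unless a
# default value is supplied as the second argument.
next(columns)  # consumes ('V3', 1, 2, 7)
leftover = next(columns, None)
print(leftover)  # None
```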

Finally, 'all' that is required, is a for-each-loop which will iterate
across/down the zip object, one tuple (row of the transpose) at a time,
AND perform the "tuple-unpacking" all in one command, with an
if-statement to detect the correct row/column:

>>> for *tuple-unpacking* in *zip() etc*:
...     if row1_item == *what?* and row2_item == *what?*:
...         print( *which* and *which identifier* )
...
V2 6

Yes, three lines. It's as easy as that!
(when you know how)

Worse: when you become more expert, you'll be able to compress all of
that down into a single-line solution - but it won't be as "readable" as
is this!

NB this question has a 'question-smell' of 'homework', so I'll not
complete the code for you - this is something *you* asked to learn and
the best way to learn is by 'doing' (not by 'reading').

However, please respond with your solution, or any further question
(with the next version of the code so-far, per this first-post - which
we appreciate!)

Regardless, you asked 'the right question' (curiosity is the key to
learning) and in the right way/manner. Well done!

NBB the above code-outline does not consider the situation where the
search fails/the keys are not found!
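
One common way to cope with a failed search is the "else" clause of a for-loop, which runs only when the loop finishes without hitting "break". The sketch below deliberately uses unrelated, made-up data (names, ages and find_age are all hypothetical) so as not to spoil the exercise:

```python
names = ["ann", "bob", "cy"]
ages = [31, 45, 27]


def find_age(wanted_name):
    """Return the age paired with wanted_name, or None when absent."""
    for name, age in zip(names, ages):
        if name == wanted_name:
            found = age
            break
    else:
        # The loop ended without a break: the search failed.
        found = None
    return found


print(find_age("bob"))  # 45
print(find_age("zed"))  # None
```

Returning None is only one choice; raising an exception may suit some applications better.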

For further information, please review:
https://docs.python.org/3/library/functions.html?highlight=zip#zip

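Also, further to the above discussion of combining lists and loops:
https://docs.python.org/3/tutorial/datastructures.html?highlight=zip#looping-techniques

and with a similar application (to this post):
https://docs.python.org/3/faq/programming.html?highlight=zip#how-can-i-sort-one-list-by-values-from-another-list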

--
Regards,
=dn

Re: Extracting dataframe column with multiple conditions on row values

<mailman.131.1641679255.3079.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=16622&group=comp.lang.python#16622

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: avigr...@verizon.net (Avi Gross)
Newsgroups: comp.lang.python
Subject: Re: Extracting dataframe column with multiple conditions on row values
Date: Sat, 8 Jan 2022 22:00:51 +0000 (UTC)
Lines: 348
Message-ID: <mailman.131.1641679255.3079.python-list@python.org>

by: Avi Gross - Sat, 8 Jan 2022 22:00 UTC

I have to wonder whether something that looks like HOMEWORK should be answered in detail, let alone using methods beyond what is expected in class.
The goal of this particular project seems to be to find one (or perhaps more) columns in some data structure like a dataframe that match two conditions (containing a copy of two numbers in one or more places) and then KNOW what column it was in. The reason I say that is because the next fairly nonsensical request is to then explicitly return what that column has in the row called 2, meaning the third row.
Perhaps stated another way: "what is the item in row/address 2 of the column that somewhere contains two additional specified contents called key1 and key2?"
My guess is that if the instructor wanted this to be solved using methods being taught, then loops may well be a way to go. Python and numpy/pandas make it often easier to do things with columns rather than in rows across them, albeit many things allow you to specify an axis. So, yes, transposing is a way to go that transforms the problem in a way easier to solve without thinking deeply. Some other languages allow relatively easy access in both directions of horizontally versus vertically. And this may be an example where solving it as a list of lists may also be easier.
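To make the list-of-lists idea concrete, here is a minimal sketch. It assumes the loose reading of the task, in which both keys may appear anywhere in a column. The names rows, headings and find_columns are invented for this illustration; the data is copied from the CSV in the original question.

```python
# Hypothetical names; the numbers are the CSV data from the original question.
headings = ["V0", "V1", "V2", "V3"]
rows = [
    [4, 1, 1, 1],
    [6, 4, 5, 2],
    [2, 3, 6, 7],
]


def find_columns(rows, key1, key2):
    """Return (position, row-2 value) for every column holding both keys."""
    found = []
    # zip(*rows) walks the data column by column, i.e. it transposes it.
    for position, column in enumerate(zip(*rows)):
        if key1 in column and key2 in column:
            found.append((position, column[2]))
    return found


result = find_columns(rows, 1, 5)
print([(headings[position], value) for position, value in result])
```

Returning a list rather than a single value leaves room for zero or several matching columns.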
Is the solution at the bottom a solution? Before I check, I want to see if I understand the required functionality and ask if it is completely and unambiguously specified.
For completeness, the question being asked may need to deal with a uniqueness issue. Is it possible multiple columns match the request and thus more than one answer is required to be returned? Is the row called 2 allowed to participate in the match or must it be excluded and the question becomes to find one (or more) columns that contain key1 somewhere else than row 2 and key2 (which may have to be different than key1 or not) somewhere else and THEN provide the corresponding entry from row 2 and that (or those) column(s)?
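The competing readings can be written down side by side. This is only a sketch under stated assumptions: match_positional follows the original poster's wording (key1 in row 0, key2 in row 1), while match_anywhere follows the looser reading and takes a made-up flag, exclude_result_row, to decide whether row 2 may take part in the match. Both function names are hypothetical.

```python
# Data from the original question, stored as a list of rows.
rows = [
    [4, 1, 1, 1],
    [6, 4, 5, 2],
    [2, 3, 6, 7],
]


def match_positional(rows, key1, key2):
    """Strict reading: key1 must be in row 0 and key2 in row 1."""
    return [col[2] for col in zip(*rows) if col[0] == key1 and col[1] == key2]


def match_anywhere(rows, key1, key2, exclude_result_row=True):
    """Loose reading: both keys somewhere in the column."""
    results = []
    for col in zip(*rows):
        # Optionally keep row 2 (the answer row) out of the search.
        searchable = col[:2] if exclude_result_row else col
        if key1 in searchable and key2 in searchable:
            results.append(col[2])
    return results


print(match_positional(rows, 1, 5))
print(match_anywhere(rows, 1, 6, exclude_result_row=False))
print(match_anywhere(rows, 1, 6))
print(match_anywhere(rows, 1, 1))
```

With keys 1 and 6 the answer depends on whether row 2 is searchable, and with key1 equal to key2 more than one column qualifies, which is exactly the ambiguity raised above.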
So in looking at the solution offered, what exactly was this supposed to do when dft is the transpose?
idt = (dft[0] == 1) & (dft[1] == 5)
Was the code (way below in this message) tried out or just written for us to ponder? I tried it. I got an answer of:
    0  1  2
V2  1  5  6
That is not my understanding of what was requested. Row 2 (shown transposed as a column) is being shown as a whole. The request was for item "2" which would be just 6. Something more like this:
print(dft[idt][2])

But the code makes no sense to me. It seems to explicitly test the first column (0) to see if it contains a 1 and then the second column (1) to see if it contains a 5. I am not sure who cares about this hard-wired query, as this is not my understanding of the question. You want any of the original three rows (now transposed) tested to see if it contains BOTH.
I may have read the requirements wrong or it may not be explained well. Until I am sure what is being asked and whether there is a good reason someone wants a different solution, I see no reason to provide yet another solution. But just for fun, assuming dft contains the transpose of the original data, will this work?
first = dft[dft.values == key1]
second = first[first.values == key2]
print(second[2])
I get a 6 as an answer and suppose it could be done in one more complex expression if needed! LOL!
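One possible single-expression form is sketched below. It assumes dft is built with pandas as in the quoted message and uses the "both keys anywhere in the transposed row" reading; the names CSV_TEXT, has_key1, has_key2 and matches are made up for the example.

```python
import io

import pandas as pd

# Data from the original question; CSV_TEXT is a hypothetical name.
CSV_TEXT = """V0,V1,V2,V3
4,1,1,1
6,4,5,2
2,3,6,7
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))
dft = df.T  # index is V0..V3, columns are 0, 1, 2

key1, key2 = 1, 5

# One boolean per transposed row: does the row contain the key anywhere?
has_key1 = (dft == key1).any(axis=1)
has_key2 = (dft == key2).any(axis=1)

# Keep rows holding both keys and take their entry in column 2.
matches = dft.loc[has_key1 & has_key2, 2]
print(matches)
```

The result is a Series indexed by the original column name, so it reports which column matched as well as the value, and it may hold zero or several entries.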
-----Original Message-----
From: Edmondo Giovannozzi <edmondo.giovannozzi@gmail.com>
To: python-list@python.org
Sent: Sat, Jan 8, 2022 8:00 am
Subject: Re: Extracting dataframe column with multiple conditions on row values


On Saturday, 8 January 2022 at 23:01:13 UTC+1, Avi Gross wrote:
> I have to wonder whether something that looks like HOMEWORK should be answered in detail, let alone using methods beyond what is expected in class.
> The goal of this particular project seems to be to find one (or perhaps more) columns in some data structure like a dataframe that match two conditions (containing a copy of two numbers in one or more places) and then KNOW what column it was in. The reason I say that is because the next fairly nonsensical request is to then explicitly return what that column has in the row called 2, meaning the third row.
> Perhaps stated another way: "what is the item in row/address 2 of the column that somewhere contains two additional specified contents called key1 and key2?"
> My guess is that if the instructor wanted this to be solved using methods being taught, then loops may well be a way to go. Python and numpy/pandas make it often easier to do things with columns rather than in rows across them, albeit many things allow you to specify an axis. So, yes, transposing is a way to go that transforms the problem in a way easier to solve without thinking deeply. Some other languages allow relatively easy access in both directions of horizontally versus vertically. And this may be an example where solving it as a list of lists may also be easier.
> Is the solution at the bottom a solution? Before I check, I want to see if I understand the required functionality and ask if it is completely and unambiguously specified.
> For completeness, the question being asked may need to deal with a uniqueness issue. Is it possible multiple columns match the request and thus more than one answer is required to be returned? Is the row called 2 allowed to participate in the match or must it be excluded and the question becomes to find one (or more) columns that contain key1 somewhere else than row 2 and key2 (which may have to be different than key1 or not) somewhere else and THEN provide the corresponding entry from row 2 and that (or those) column(s)?
> So in looking at the solution offered, what exactly was this supposed to do when dft is the transpose?
> idt = (dft[0] == 1) & (dft[1] == 5)
> Was the code (way below in this message) tried out or just written for us to ponder? I tried it. I got an answer of:
>     0  1  2
> V2  1  5  6
> That is not my understanding of what was requested. Row 2 (shown transposed as a column) is being shown as a whole. The request was for item "2" which would be just 6. Something more like this:
> print(dft[idt][2])
>
> But the code makes no sense to me. It seems to explicitly test the first column (0) to see if it contains a 1 and then the second column (1) to see if it contains a 5. I am not sure who cares about this hard-wired query, as this is not my understanding of the question. You want any of the original three rows (now transposed) tested to see if it contains BOTH.
> I may have read the requirements wrong or it may not be explained well. Until I am sure what is being asked and whether there is a good reason someone wants a different solution, I see no reason to provide yet another solution. But just for fun, assuming dft contains the transpose of the original data, will this work?
> first = dft[dft.values == key1]
> second = first[first.values == key2]
> print(second[2])
> I get a 6 as an answer and suppose it could be done in one more complex expression if needed! LOL!
> -----Original Message-----
> From: Edmondo Giovannozzi <edmondo.g...@gmail.com>
> To: pytho...@python.org
> Sent: Sat, Jan 8, 2022 8:00 am
> Subject: Re: Extracting dataframe column with multiple conditions on row values
>
> On Saturday, 8 January 2022 at 02:21:40 UTC+1, dn wrote:
> > Salaam Mahmood,
> > On 08/01/2022 12.07, Mahmood Naderan via Python-list wrote:
> > > I have a csv file like this
> > > V0,V1,V2,V3
> > > 4,1,1,1
> > > 6,4,5,2
> > > 2,3,6,7
> > >
> > > And I want to search two rows for a match and find the column. For
> > > example, I want to search row[0] for 1 and row[1] for 5. The corresponding
> > > column is V2 (which is the third column). Then I want to return the value
> > > at row[2] and the found column. The result should be 6 then.
> > Not quite: isn't the "found column" also required?
> > > I can manually extract the specified rows (with index 0 and 1 which are
> > > fixed) and manually iterate over them like arrays to find a match. Then I
> > Perhaps this idea has been influenced by a similar solution in another
> > programming language. May I suggest that the better-answer you seek lies
> > in using Python idioms (as well as Python's tools)...
> > > key1 = 1
> > > key2 = 5
> > Fine, so far - excepting that this 'problem' is likely to be a small
> > part of some larger system. Accordingly, consider writing it as a
> > function. In which case, these two "keys" will become
> > function-parameters (and the two 'results' become return-values).
> > > row1 = df.iloc[0] # row=[4,1,1,1]
> > > row2 = df.iloc[1] # row=[6,4,5,2]
> > This is likely not native-Python. Let's create lists for 'everything',
> > just-because:
> >
> > >>> headings = [ "V0","V1","V2","V3" ]
> > >>> row1 = [4,1,1,1]
> > >>> row2 = [6,4,5,2]
> > >>> results = [ 2,3,6,7 ]
> >
> >
> > Note how I'm using the Python REPL (in a "terminal", type "python" (as
> > appropriate to your OpSys) at the command-line). IMHO the REPL is a
> > grossly under-rated tool, and is a very good means towards
> > trial-and-error, and learning by example. Highly recommended!
> >
> >
> > > for i in range(len(row1)):
> >
> > This construction is very much a "code smell" for thinking that it is
> > not "pythonic". (and perhaps the motivation for this post)
> >
> > In Python (compared with many other languages) the "for" loop should
> > actually be pronounced "for-each". In other words when we pair the
> > code-construct with a list (for example):
> >
> > for each item in the list the computer should perform some suite of
> > commands.
> >
> > (the "suite" is everything 'inside' the for-each-loop - NB my
> > 'Python-betters' will quickly point-out that this feature is not limited
> > to Python-lists, but will work with any "iterable" - ref:
> > https://docs.python.org/3/tutorial/controlflow.html#for-statements)
> >
> >
> > Thus:
> >
> > >>> for item in headings: print( item )
> > ...
> > V0
> > V1
> > V2
> > V3
> >
> >
> > The problem is that when working with matrices/matrixes, a math
> > background equips one with the idea of indices/indexes, eg the
> > ubiquitous subscript-i. Accordingly, when reading 'math' where a formula
> > uses the upper-case Greek "sigma" character, remember that it means "for
> > all" or "for each"!
> >
> > So, if Python doesn't use indexing or "pointers", how do we deal with
> > the problem?
> >
> > Unfortunately, at first glance, the pythonic approach may seem
> > more-complicated or even somewhat convoluted, but once the concepts
> > (and/or the Python idioms) are learned, it is quite manageable (and
> > applicable to many more applications than matrices/matrixes!)...
> > >     if row1[i] == key1:
> > >         for j in range(len(row2)):
> > >             if row2[j] == key2:
> > >                 res = df.iloc[:,j]
> > >                 print(res) # 6
> > >
> > > Is there any way to use built-in function for a more efficient code?
> > This is where your idea bears fruit!
> >
> > There is a Python "built-in function": zip(), which will 'join' lists.
> > NB do not become confused between zip() and zip archive/compressed files!
> >
> > Most of the time reference book and web-page examples show zip() being
> > used to zip-together two lists into a single data-construct (which is an
> > iterable(!)). However, zip() will actually zip-together multiple (more
> > than two) "iterables". As the manual says:
> >
> > «zip() returns an iterator of tuples, where the i-th tuple contains the
> > i-th element from each of the argument iterables.»
> >
> > Ah, so that's where the math-idea of subscript-i went! It has become
> > 'hidden' in Python's workings - or putting that another way: Python
> > looks after the subscripting for us (and given that 'out by one' errors
> > in pointers are a major source of coding-error in other languages,
> > thank-you very much Python!)
> >
> > First re-state the source-data as Python lists, (per above) - except
> > that I recommend the names be better-chosen to be more meaningful (to
> > your application)!
> >
> >
> > Now, (in the REPL) try using zip():
> >
> > >>> zip( headings, row1, row2, results )
> > <zip object at 0x7f655cca6bc0>
> >
> > Does that seem a very good illustration? Not really, but re-read the
> > quotation from the manual (above) where it says that zip returns an
> > iterator. If we want to see the values an iterator will produce, then
> > turn it into an iterable data-structure, eg:
> >
> > >>> list( zip( headings, row1, row2, results ) )
> > [('V0', 4, 6, 2), ('V1', 1, 4, 3), ('V2', 1, 5, 6), ('V3', 1, 2, 7)]
> >
> > or, to see things more clearly, let me re-type it as:
> >
> > [
> > ('V0', 4, 6, 2),
> > ('V1', 1, 4, 3),
> > ('V2', 1, 5, 6),
> > ('V3', 1, 2, 7)
> > ]
> >
> >
> > What we now see is actually a "transpose" of the original 'matrix'
> > presented in the post/question!
> >
> > (NB Python will perform this layout for us - read about the pprint library)
> >
> >
> > Another method which can also be employed (and which will illustrate the
> > loop required to code the eventual-solution(!)) is that Python's next()
> > will extract the first row of the transpose:
> >
> > >>> row = next( zip( headings, row1, row2, results ) )
> > >>> row
> > ('V0', 4, 6, 2)
> >
> >
> > This is all-well-and-good, but that result is a tuple of four items
> > (corresponding to one column in the way the source-data was explained).
> >
> > If we need to consider the four individual data-items, that can be
> > improved using a Python feature called "tuple unpacking". Instead of the
> > above delivering a tuple which is then assigned to "row", the tuple can
> > be assigned to four "identifiers", eg
> >
> > >>> heading, row1_item, row2_item, result= next( zip( headings, row1,
> > row2, results ) )
> >
> > (apologies about email word-wrapping - this is a single line of Python-code)
> >
> >
> > Which, to prove the case, could be printed:
> >
> > >>> heading, row1_item, row2_item, result
> > ('V0', 4, 6, 2)
> >
> >
> > (ref:
> > https://docs.python.org/3/tutorial/datastructures.html?highlight=tuple%20unpacking#tuples-and-sequences)
> >
> >
> > Thus, if we repeatedly ask for the next() row from the zip-ped
> > transpose, eventually it will respond with the row starting 'V2' - which
> > is the desired-result, ie the row containing the 1, the 5, and the 6 -
> > and if you follow-through using the REPL, will be clearly visible.
> >
> >
> > Finally, 'all' that is required, is a for-each-loop which will iterate
> > across/down the zip object, one tuple (row of the transpose) at a time,
> > AND perform the "tuple-unpacking" all in one command, with an
> > if-statement to detect the correct row/column:
> >
> > >>> for *tuple-unpacking* in *zip() etc*:
> > ...     if row1_item == *what?* and row2_item == *what?*:
> > ...         print( *which* and *which identifier* )
> > ...
> > V2 6
> >
> > Yes, three lines. It's as easy as that!
> > (when you know how)
> >
> > Worse: when you become more expert, you'll be able to compress all of
> > that down into a single-line solution - but it won't be as "readable" as
> > is this!
> >
> >
> > NB this question has a 'question-smell' of 'homework', so I'll not
> > complete the code for you - this is something *you* asked to learn and
> > the best way to learn is by 'doing' (not by 'reading').
> >
> > However, please respond with your solution, or any further question
> > (with the next version of the code so-far, per this first-post - which
> > we appreciate!)
> >
> > Regardless, you asked 'the right question' (curiosity is the key to
> > learning) and in the right way/manner. Well done!
> >
> >
> > NBB the above code-outline does not consider the situation where the
> > search fails/the keys are not found!
> >
> >
> > For further information, please review:
> > https://docs.python.org/3/library/functions.html?highlight=zip#zip
> >
> > Also, further to the above discussion of combining lists and loops:
> > https://docs.python.org/3/tutorial/datastructures.html?highlight=zip#looping-techniques
> >
> > and with a similar application (to this post):
> > https://docs.python.org/3/faq/programming.html?highlight=zip#how-can-i-sort-one-list-by-values-from-another-list
> >
> > --
> > Regards,
>
> You may also transpose your dataset. Then the index will become your column names and the column names will become your index.
> To read your dataset:
>
> import pandas as pd
> import io
>
> DN = """
> V0,V1,V2,V3
> 4,1,1,1
> 6,4,5,2
> 2,3,6,7
> """
> df = pd.read_csv(io.StringIO(DN))
>
> Transpose it:
>
> dft = df.T
>
> Find all the index entries that satisfy your condition:
>
> idt = (dft[0] == 1) & (dft[1] == 5)
>
> Print the columns that satisfy your condition:
>
> print(dft[idt])
>
> As you see, no explicit loop is needed.
> --
> https://mail.python.org/mailman/listinfo/python-list
