Re: How to replace a cell value with each of its contour cells and yield the corresponding datasets seperately in a list according to a Pandas-way?

From: mk1853...@gmail.com (marc nicole)
Newsgroups: comp.lang.python
Subject: Re: How to replace a cell value with each of its contour cells and
yield the corresponding datasets seperately in a list according to a
Pandas-way?
Date: Sun, 21 Jan 2024 19:25:17 +0100
Lines: 223
Message-ID: <mailman.98.1705861368.15798.python-list@python.org>
References: <CAGJtH9RvLNeHFQSVk4b-TuYKKLgVbvAnoiQvC37vw5HGm2YjHg@mail.gmail.com>
<a9aa6809-2842-4e7d-9024-fb2549c2eb65@tompassin.net>
<CAGJtH9Svh5Q5B68aB+WK8JaYzV=EY-pfobi=cSoNjrTxBxv5_Q@mail.gmail.com>
<86081560-08ff-42bc-ae6d-dfb03d8b2a7d@tompassin.net>
<CAGJtH9Qcqd_fBB-h7ED9QxnA0aen6HiavM7+pY00T349MzkT8A@mail.gmail.com>
 by: marc nicole - Sun, 21 Jan 2024 18:25 UTC

It is part of a larger project that aims to process data according to a
given algorithm.
Do you have any comments on the code, or recommendations for improving it?

Thanks.

On Sun, 21 Jan 2024 at 18:28, Thomas Passin via Python-list
<python-list@python.org> wrote:

> On 1/21/2024 11:54 AM, marc nicole wrote:
> > Thanks for the reply,
> >
> > I think using a Pandas (or a Numpy) approach would optimize the
> > execution of the program.
> >
> > Target cells could be up to 10% of the size of the dataset. A good
> > example to start with would have from 10 to 100 values.
>
> Thanks for the reformatted code. It's much easier to read and think about.
>
> For say 100 points, it doesn't seem that "optimization" would be much of
> an issue. On my laptop machine and Python 3.12, your example takes
> around 5 seconds to run and print(). OTOH if you think you will go to
> much larger datasets, certainly execution time could become a factor.
>
> I would think that NumPy arrays and/or matrices would have good potential.
>
> Is this some kind of a cellular automaton, or an image filtering process?
>
> > Let me know your thoughts; here's a reproducible example, which I
> > formatted:
> >
> >
> >
> > from numpy import random
> > import pandas as pd
> > import numpy as np
> > import operator
> > import math
> > from collections import deque
> > from queue import *
> > from queue import Queue
> > from itertools import product
> >
> >
> > def select_target_values(dataframe, number_of_target_values):
> >     target_cells = []
> >     for _ in range(number_of_target_values):
> >         row_x = random.randint(0, len(dataframe.columns) - 1)
> >         col_y = random.randint(0, len(dataframe) - 1)
> >         target_cells.append((row_x, col_y))
> >     return target_cells
> >
> >
> > def select_contours(target_cells):
> >     contour_coordinates = [(0, 1), (1, 0), (0, -1), (-1, 0)]
> >     contour_cells = []
> >     for target_cell in target_cells:
> >         # random contour count for each cell
> >         contour_cells_count = random.randint(1, 4)
> >         try:
> >             contour_cells.append(
> >                 [
> >                     tuple(
> >                         map(
> >                             lambda i, j: i + j,
> >                             (target_cell[0], target_cell[1]),
> >                             contour_coordinates[iteration_],
> >                         )
> >                     )
> >                     for iteration_ in range(contour_cells_count)
> >                 ]
> >             )
> >         except IndexError:
> >             continue
> >     return contour_cells
> >
> >
> > def create_zipf_distribution():
> >     zipf_dist = random.zipf(2, size=(50, 5)).reshape((50, 5))
> >
> >     zipf_distribution_dataset = pd.DataFrame(zipf_dist).round(3)
> >
> >     return zipf_distribution_dataset
> >
> >
> > def apply_contours(target_cells, contour_cells):
> >     target_cells_with_contour = []
> >     # create one single list of cells
> >     for idx, target_cell in enumerate(target_cells):
> >         target_cell_with_contour = [target_cell]
> >         target_cell_with_contour.extend(contour_cells[idx])
> >         target_cells_with_contour.append(target_cell_with_contour)
> >     return target_cells_with_contour
> >
> >
> > def create_possible_datasets(dataframe, target_cells_with_contour):
> >     all_datasets_final = []
> >     dataframe_original = dataframe.copy()
> >
> >     list_tuples_idx_cells_all_datasets = list(
> >         filter(
> >             lambda x: x,
> >             [list(tuples) for tuples in
> >              list(product(*target_cells_with_contour))],
> >         )
> >     )
> >     target_original_cells_coordinates = list(
> >         map(
> >             lambda x: x[0],
> >             [
> >                 target_and_contour_cell
> >                 for target_and_contour_cell in target_cells_with_contour
> >             ],
> >         )
> >     )
> >     for dataset_index_values in list_tuples_idx_cells_all_datasets:
> >         all_datasets = []
> >         for idx_cell in range(len(dataset_index_values)):
> >             dataframe_cpy = dataframe.copy()
> >             dataframe_cpy.iat[
> >                 target_original_cells_coordinates[idx_cell][1],
> >                 target_original_cells_coordinates[idx_cell][0],
> >             ] = dataframe_original.iloc[
> >                 dataset_index_values[idx_cell][1],
> >                 dataset_index_values[idx_cell][0]
> >             ]
> >             all_datasets.append(dataframe_cpy)
> >         all_datasets_final.append(all_datasets)
> >     return all_datasets_final
> >
> >
> > def main():
> >     zipf_dataset = create_zipf_distribution()
> >
> >     target_cells = select_target_values(zipf_dataset, 5)
> >     print(target_cells)
> >     contour_cells = select_contours(target_cells)
> >     print(contour_cells)
> >     target_cells_with_contour = apply_contours(target_cells,
> >                                                contour_cells)
> >     datasets = create_possible_datasets(zipf_dataset,
> >                                         target_cells_with_contour)
> >     print(datasets)
> >
> >
> > main()
> >
> > On Sun, 21 Jan 2024 at 16:33, Thomas Passin via Python-list
> > <python-list@python.org> wrote:
> >
> > On 1/21/2024 7:37 AM, marc nicole via Python-list wrote:
> > > Hello,
> > >
> > > I have an initial dataframe with a random list of target cells (each
> > > cell being identified by a pair (x, y)).
> > > I want to yield four different dataframes, each containing the value
> > > of one of the contour (surrounding) cells of each specified target cell.
> > >
> > > The surrounding cells to consider for a specific target cell are
> > > (x-1, y), (x, y-1), (x+1, y), and (x, y+1). Specifically, I randomly
> > > choose 1 to 4 cells from these and consider them as replacements for
> > > the target cell.
> > >
> > > I want to do that through a pandas-specific approach, without having
> > > to define the contour cells separately and then apply the changes to
> > > the dataframe
> >
> > 1. Why do you want a Pandas-specific approach? Many people would rather
> > keep code independent of special libraries if possible;
> >
> > 2. How big can these collections of target cells be, roughly speaking?
> > The size could make a big difference in picking a design;
> >
> > 3. You really should work on formatting code for this list. Your code
> > below is very complex and would take a lot of work to reformat to the
> > point where it is readable, especially with the nearly impenetrable
> > arguments in some places. Probably all that is needed is to replace all
> > tabs by (say) three spaces, and to make sure you intentionally break
> > lines well before they might get word-wrapped. Here is one example I
> > have reformatted (I hope I got this right):
> >
> > list_tuples_idx_cells_all_datasets = list(filter(
> >     lambda x: utils_tuple_list_not_contain_nan(x),
> >     [list(tuples) for tuples in list(
> >         itertools.product(*target_cells_with_contour))
> >     ]))
> >
> > 4. As an aside, it doesn't look like you need to convert all those
> > sequences and iterators to lists all over the place;
> >
> >
> > > (but rather using an all in one approach):
> > > for now I have written this example which I think is not Pandas
> > > specific:
> > [snip]
> >
> > --
> > https://mail.python.org/mailman/listinfo/python-list
> >
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>
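
The replacement scheme discussed in the quoted messages can be sketched
without any third-party library, in line with point 1 above. This is only a
minimal sketch under stated assumptions, not the poster's method: the names
contour_cells, pick_contours and possible_grids are hypothetical, the data is
a plain list of rows rather than a DataFrame, the grid is assumed to have at
least two cells, and each combination of choices yields one grid with all of
its replacements applied together (the quoted code instead builds one
dataframe per target per combination). Neighbours outside the grid are
dropped, and grids are yielded lazily instead of being collected into lists.
Note that random.Random.randint includes its upper bound, whereas
numpy.random.randint, used in the quoted code, excludes it.

```python
import random
from itertools import product

# Offsets of the four contour cells named in the thread:
# (x-1, y), (x, y-1), (x+1, y), (x, y+1), with x = column and y = row.
OFFSETS = [(-1, 0), (0, -1), (1, 0), (0, 1)]


def contour_cells(cell, n_cols, n_rows):
    """Return the in-bounds neighbours of cell = (x, y)."""
    x, y = cell
    return [
        (x + dx, y + dy)
        for dx, dy in OFFSETS
        if 0 <= x + dx < n_cols and 0 <= y + dy < n_rows
    ]


def pick_contours(cell, n_cols, n_rows, rng):
    """Randomly keep between 1 and all of the in-bounds neighbours."""
    candidates = contour_cells(cell, n_cols, n_rows)
    return rng.sample(candidates, rng.randint(1, len(candidates)))


def possible_grids(grid, targets, rng):
    """Lazily yield one modified copy of grid per combination of choices.

    For each target, the choices are the target itself (no change) plus its
    selected contour cells, as in the product() call of the quoted code.
    Replacement values are always read from the original grid.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    choices = [[t] + pick_contours(t, n_cols, n_rows, rng) for t in targets]
    for combo in product(*choices):
        new_grid = [row[:] for row in grid]  # copy rows; original untouched
        for (tx, ty), (sx, sy) in zip(targets, combo):
            new_grid[ty][tx] = grid[sy][sx]
        yield new_grid


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make runs repeatable
    grid = [[10 * r + c for c in range(5)] for r in range(4)]
    for new_grid in possible_grids(grid, [(0, 0), (2, 1)], rng):
        print(new_grid)
```

With a DataFrame, the same loop could read and write cells through
DataFrame.iat[row, column] on a copy of the frame, as the quoted code does.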

