Selenium script - stuck - could someone take a look?

<s8t26j$gh5$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=13416&group=comp.lang.python#13416
From: vee...@foo.com (Veek M)
Newsgroups: comp.lang.python
Subject: Selenium script - stuck - could someone take a look?
Date: Sat, 29 May 2021 09:40:35 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 120
Message-ID: <s8t26j$gh5$1@dont-email.me>
Reply-To: Veek M <vek.m12@foo.com>
 by: Veek M - Sat, 29 May 2021 09:40 UTC

Script: http://paste.debian.net/1199271/

It mostly works, but line 78 is supposed to extract
<span class="price-unit">100 pieces / lot</span>. No matter what I
try it fails, and I DON'T KNOW WHY. It's a simple div.classname
match.

Could someone take a look and figure it out? I'm stuck.

--------------------------------------------------------
import re, sys, time

from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import StaleElementReferenceException

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

url = 'https://www.aliexpress.com'

caps = DesiredCapabilities().FIREFOX
caps["pageLoadStrategy"] = 'eager'
ignored_exceptions = (NoSuchElementException, StaleElementReferenceException,)

fh = open('/tmp/log.html', 'w')
fh.write('<!doctype html> <title>parts</title><body>\n<table>\n')

def convert(m):
    # multiply each matched amount by the hard-coded rate 72.4
    money = m.group()
    return str(round(float(money) * 72.4, 3))

def process_fields(txt):
    # strip the currency decoration, then convert every decimal amount
    if '$' in txt:
        txt = txt.replace('+', '')
        txt = txt.replace('$', '')
        txt = txt.replace('US', '')
        txt = txt.replace('Shipping:', '')
    r = re.sub(r'(\s*[0-9]+\.[0-9]+)', convert, txt)
    return str(r)

def ali_search(url, txt):
    driver.get(url)
    assert 'AliExpress' in driver.title

    try:
        srch_elem = WebDriverWait(driver, 3600,
            ignored_exceptions=ignored_exceptions).until(
                EC.presence_of_element_located(
                    (By.XPATH, '//div[@class="search-key-box"]')))
        print('search')
        x = driver.find_element_by_id('search-key')
        if 'input' in x.tag_name:
            print('success')
    finally:
        # type the search text one character per second, then submit
        for c in list(txt):
            time.sleep(1)
            x.send_keys(c)
        x.send_keys(Keys.RETURN)

    try:
        element = WebDriverWait(driver, 3600,
            ignored_exceptions=ignored_exceptions).until(
                EC.presence_of_element_located(
                    (By.XPATH, '//div[@class="product-container"]')))
    finally:
        print('product-container')
        x = driver.find_element_by_xpath('//body')
        x.send_keys(Keys.HOME)

    # page down a few times so the result list gets loaded
    for i in range(1, 10):
        print('send END')
        time.sleep(1)
        x.send_keys(Keys.PAGE_DOWN)
        time.sleep(1)
    #driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # EC.presence_of_element_located((By.XPATH, '//div[contains(@class, " product-list")]')))
    divs = element.find_elements_by_xpath('//li[@class="list-item packaging_sale"]')
    for c, div in enumerate(divs):
        fh.write('<tr>')
        for param in ['price-current', 'item-price-row packaging-sale',
                      'shipping-value', 'store-name']:
            try:
                if 'store' in param:
                    fh.write('<td>' +
                        div.find_elements_by_class_name(param)[0].text + '</td>')
                elif 'sale' in param:
                    print(param)
                    lot = div.find_elements_by_class_name(param)
                    fh.write('<td>' + str(lot) + '</td>')
                else:
                    # find_elements_* returns a list, so take the first match
                    fh.write('<td>' +
                        process_fields(div.find_elements_by_class_name(param)[0].text) + '</td>')
            except Exception as e:
                fh.write('<td>' + str(e) + '</td>')
        fh.write('</tr>\n')
    fh.write('\n</table></body>')
    fh.close()

def part_lookup():
    global driver
    with webdriver.Firefox(executable_path=r'/mnt/sdb1/root/geckodriver',
            firefox_binary='/mnt/sdb1/firefox/firefox-bin', capabilities=caps) as driver:
        if len(sys.argv) == 2:
            ali_search(url, sys.argv[1])
        time.sleep(3600)

part_lookup()

Re: Selenium script - stuck - could someone take a look?

<s8tggo$c3j$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=13417&group=comp.lang.python#13417
From: vee...@foo.com (Veek M)
Newsgroups: comp.lang.python
Subject: Re: Selenium script - stuck - could someone take a look?
Date: Sat, 29 May 2021 13:44:56 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 3
Message-ID: <s8tggo$c3j$1@dont-email.me>
References: <s8t26j$gh5$1@dont-email.me>
Reply-To: Veek M <vek.m12@foo.com>
 by: Veek M - Sat, 29 May 2021 13:44 UTC

On 2021-05-29, Veek M <veekm@foo.com> wrote:

Fixed it: the XPath lookups on div needed './/' rather than '//'.

Re: Selenium script - stuck - could someone take a look?

<mailman.410.1622314000.3087.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=13421&group=comp.lang.python#13421
From: wlfr...@ix.netcom.com (Dennis Lee Bieber)
Newsgroups: comp.lang.python
Subject: Re: Selenium script - stuck - could someone take a look?
Date: Sat, 29 May 2021 14:24:53 -0400
Organization: IISS Elusive Unicorn
Lines: 23
Message-ID: <mailman.410.1622314000.3087.python-list@python.org>
References: <s8t26j$gh5$1@dont-email.me>
<3315bgp4j6dkrcn10esa55n4j7iflnq5d5@4ax.com>
 by: Dennis Lee Bieber - Sat, 29 May 2021 18:24 UTC

On Sat, 29 May 2021 09:40:35 -0000 (UTC), Veek M <veekm@foo.com> declaimed
the following:

>
>It mostly works but line 78 is supposed to extract
><span class="price-unit">100 pieces / lot</span> No matter what I
>try it's failed and I DON'T KNOW WHY? It's a simple div.classname
>match..

The only thing that jumps out is that "price-unit" is NOT in the list
on line 72.
for param in ['price-current', 'item-price-row packaging-sale',
'shipping-value', 'store-name']:

But then... I don't do selenium; I also don't plan to try to connect to
that URL for purposes of saving the page source, in order to examine it as
a local file...
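
On the class names in that list: Selenium's class-name locator expects a single class token, so an entry with a space in it, such as 'item-price-row packaging-sale', will not match the way it reads. Below is a minimal sketch of one workaround, turning the space-separated string into a CSS selector. The helper name css_for_classes is made up for illustration, and the thread does not confirm which classes the real page uses.

```python
def css_for_classes(class_attr, tag=''):
    """Build a CSS selector that requires every class listed in class_attr."""
    tokens = class_attr.split()
    if not tokens:
        raise ValueError('no class names given')
    # '.a.b' matches an element carrying both classes, in any order
    return tag + ''.join('.' + t for t in tokens)


print(css_for_classes('item-price-row packaging-sale'))  # .item-price-row.packaging-sale
print(css_for_classes('price-unit', tag='span'))         # span.price-unit
```

The resulting string would go to a CSS lookup on the row, for example div.find_elements_by_css_selector(...) in the Selenium 3 API the script uses.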

--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

Re: Selenium script - stuck - could someone take a look?

<s8v085$a89$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=13431&group=comp.lang.python#13431
From: vee...@foo.com (Veek M)
Newsgroups: comp.lang.python
Subject: Re: Selenium script - stuck - could someone take a look?
Date: Sun, 30 May 2021 03:19:33 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 35
Message-ID: <s8v085$a89$1@dont-email.me>
References: <s8t26j$gh5$1@dont-email.me>
<3315bgp4j6dkrcn10esa55n4j7iflnq5d5@4ax.com>
<mailman.410.1622314000.3087.python-list@python.org>
Reply-To: Veek M <vek.m12@foo.com>
 by: Veek M - Sun, 30 May 2021 03:19 UTC

On 2021-05-29, Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
> On Sat, 29 May 2021 09:40:35 -0000 (UTC), Veek M <veekm@foo.com> declaimed
> the following:
>

Ah, yeah - man, that took me a while to do (save to a local file and
load it via file:///). It's working now. It was basically an XPath
mistake because I've forgotten stuff, but the script is almost complete
and cleaned up. I forgot that an XPath starting with '//' run on div
doesn't use div as the root: the lookup starts from the very top of the
document. You have to write './/' instead.
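
A small sketch of that difference, runnable without a browser. It uses the standard library's ElementTree as a stand-in for the page DOM, and the markup is invented for the example. ElementTree itself refuses absolute paths on an element, so the document-wide case is emulated by searching from the root, which is where XPath 1.0 starts a leading '//' in the browser.

```python
import xml.etree.ElementTree as ET

# Invented markup standing in for the result list; not the real AliExpress page.
PAGE = """
<ul>
  <li class="list-item"><span class="price-unit">100 pieces / lot</span></li>
  <li class="list-item"><span class="price-unit">10 pieces / lot</span></li>
</ul>
""".strip()

root = ET.fromstring(PAGE)
rows = root.findall('.//li')

# '//span' semantics: the search starts at the document root,
# so every row reports the first span on the page.
doc_wide = [root.findall('.//span[@class="price-unit"]')[0].text for _ in rows]

# './/span' semantics: the search starts at the row itself.
scoped = [row.find('.//span[@class="price-unit"]').text for row in rows]

print(doc_wide)  # ['100 pieces / lot', '100 pieces / lot']
print(scoped)    # ['100 pieces / lot', '10 pieces / lot']
```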

My "Next Page" is creating problems - asked here as well:
https://www.reddit.com/r/selenium/comments/nnrxuu/what_is_overlay_backdrop_how_does_it_block_a/

def page_next():
    tmp = driver.find_element_by_xpath('//button[contains(@class, " next-next")]')
    # the backdrop named in the error below; kept in its own name so tmp stays the button
    overlay = driver.find_element_by_xpath('.//div[@class="next-overlay-backdrop"]')
    # e = WebDriverWait(driver, 200).until(EC.element_to_be_clickable(
    #     (By.XPATH, "//button[contains(@class, ' next-next')]"))).click()
    # e.click()

    if tmp:
        tmp.click()
    else:
        raise SystemExit('no next')

This is the error I get
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementClickInterceptedException: Message:
Element <button class="next-btn next-medium next-btn-normal
next-pagination-item next-next" type="button"> is not clickable at
point (699,649) because another element <div class="next-overlay-backdrop">
obscures it
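
One thing that may be worth trying, assuming the backdrop is transient and goes away on its own: wait until no visible element matches it, and only then click. Selenium ships WebDriverWait with expected_conditions.invisibility_of_element_located for this. The sketch below hand-rolls the same polling instead, so the logic can be followed and run without a browser. The function name click_when_clear is made up, and the strings 'css selector' and 'xpath' are the values behind By.CSS_SELECTOR and By.XPATH.

```python
import time


def click_when_clear(driver, button_xpath, overlay_css, timeout=10.0, poll=0.5):
    """Click the button once no displayed element matches overlay_css.

    Returns True after clicking, False if the overlay is still visible
    when the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        overlays = driver.find_elements('css selector', overlay_css)
        if not any(o.is_displayed() for o in overlays):
            driver.find_element('xpath', button_xpath).click()
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

With the script's global driver this would be called as click_when_clear(driver, '//button[contains(@class, " next-next")]', 'div.next-overlay-backdrop'). If the backdrop belongs to a dialog that stays open, waiting will not help and the dialog has to be dismissed first.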

Re: Selenium script - stuck - could someone take a look?

<mailman.431.1622401033.3087.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=13448&group=comp.lang.python#13448
From: wlfr...@ix.netcom.com (Dennis Lee Bieber)
Newsgroups: comp.lang.python
Subject: Re: Selenium script - stuck - could someone take a look?
Date: Sun, 30 May 2021 13:03:13 -0400
Organization: IISS Elusive Unicorn
Lines: 33
Message-ID: <mailman.431.1622401033.3087.python-list@python.org>
References: <s8t26j$gh5$1@dont-email.me>
<3315bgp4j6dkrcn10esa55n4j7iflnq5d5@4ax.com>
<mailman.410.1622314000.3087.python-list@python.org>
<s8v085$a89$1@dont-email.me>
<crg7bgpdtv9m61rl8ja90jho9n7aoh3jat@4ax.com>
 by: Dennis Lee Bieber - Sun, 30 May 2021 17:03 UTC

On Sun, 30 May 2021 03:19:33 -0000 (UTC), Veek M <veekm@foo.com> declaimed
the following:

>On 2021-05-29, Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
>> On Sat, 29 May 2021 09:40:35 -0000 (UTC), Veek M <veekm@foo.com> declaimed
>> the following:
>>
>
>ah, yeah - man that took me a while to do (save to local file and use
>file:///). It's working now, basically xpath mistake because I've

Basic page in Firefox is Tools/Browser Tools/Page Source -- select all,
cut, paste in editor <G>

> This is the error I get
> raise exception_class(message, screen, stacktrace)
>selenium.common.exceptions.ElementClickInterceptedException: Message:
>Element <button class="next-btn next-medium next-btn-normal
>next-pagination-item next-next" type="button"> is not clickable at
>point (699,649) because another element <div class="next-overlay-backdrop">
>obscures it

Not much help, but...

https://www.w3schools.com/howto/howto_css_overlay.asp
https://stackoverflow.com/questions/23302279/how-to-overlay-a-new-div-over-entire-body-of-html-using-jquery/23302409

--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/
