
Subject (Author)
* I can't believe it, the baidu spider is scanning def3 on tor (trw)
`* Re: I can't believe it, the baidu spider is scanning def3 on tor (AnonUser)
 `* Re: I can't believe it, the baidu spider is scanning def3 on tor (AnonUser)
  `* Re: I can't believe it, the baidu spider is scanning def3 on tor (AnonUser)
   `* Re: Re I cant believe it the baidu spider is scanning def3 on tor (trw)
    `* Re: Re I cant believe it the baidu spider is scanning def3 on tor (AnonUser)
     +* Re: Re I cant believe it the baidu spider is scanning def3 on tor (AnonUser)
     |`- Re: Re I cant believe it the baidu spider is scanning def3 on tor (Guest)
     `- Re: Re I cant believe it the baidu spider is scanning def3 on tor (trw)

Subject: I can't believe it, the baidu spider is scanning def3 on tor
From: trw
Newsgroups: rocksolid.shared.offtopic
Organization: Dancing elephants
Date: Fri, 18 Oct 2019 23:18 UTC
Seems like a strange thing, and maybe it is a fake: baidu shows no hit when searching for the name or the onion address.
Or maybe they just harvest the info, and censor it right after...

trw
Posted on def3


Subject: Re: I can't believe it, the baidu spider is scanning def3 on tor
From: AnonUser
Newsgroups: rocksolid.shared.offtopic
Organization: Rocksolid Light
Date: Fri, 18 Oct 2019 23:34 UTC
References: 1
trw wrote:

> Seems like a strange thing, and maybe it is a fake: baidu shows no hit when
> searching for the name or the onion address.
> Or maybe they just harvest the info, and censor it right after...

I see baidu search in my logs also. I assume it's someone running the spider themselves (or spoofing the user-agent), so it may not really be Baidu. Often they disable delays between page requests and ignore robots.txt, which most real search engine crawlers won't do.
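
A rough way to check that suspicion against your own logs is sketched below. It assumes you have already pulled the requests for one claimed user-agent out of the access log as (unix_time, path) pairs; the robots.txt content, the function name audit_bot and the 5 second crawl delay are made up for illustration and are not taken from def3 or any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; substitute your own file.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def audit_bot(requests, agent="Baiduspider", robots_txt=ROBOTS_TXT):
    """Check logged requests of one claimed crawler against robots.txt.

    requests: list of (unix_time, path) tuples for a single user-agent.
    Returns (disallowed_paths, too_fast_count).
    """
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())

    # Crawl-delay that applies to this agent (0 if none is declared).
    delay = rp.crawl_delay(agent) or 0

    # Paths the agent fetched although robots.txt disallows them.
    disallowed = [path for _, path in requests if not rp.can_fetch(agent, path)]

    # Count consecutive requests that arrived faster than the crawl delay.
    times = sorted(t for t, _ in requests)
    too_fast = sum(1 for a, b in zip(times, times[1:]) if b - a < delay)

    return disallowed, too_fast
```

Many disallowed hits or very short gaps do not prove the user-agent is spoofed, but they fit the pattern described above better than the behaviour of a well-behaved search engine crawler.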


--
Posted on Rocksolid Light



Subject: Re: I can't believe it, the baidu spider is scanning def3 on tor
From: AnonUser
Newsgroups: rocksolid.shared.offtopic
Organization: RetroBBS
Date: Sat, 19 Oct 2019 19:15 UTC
References: 1
  To: AnonUser
this seems most likely.

people think nothing of it if googlebot or any of the major search engine bots is crawling their site without a delay, or even ignoring robots.txt. so people who run their own crawler spoof one of those user agents, or they simply don't want any attention.

source: i've written crawlers in the past and spoofed the user agent because of those reasons.
--
Posted on RetroBBS



Subject: Re: I can't believe it, the baidu spider is scanning def3 on tor
From: AnonUser
Newsgroups: rocksolid.shared.offtopic
Organization: RetroBBS
Date: Sat, 19 Oct 2019 19:17 UTC
References: 1
  To: AnonUser
googlebot and the other well-known bots usually also got preferential treatment.
--
Posted on RetroBBS



Subject: Re: Re I cant believe it the baidu spider is scanning def3 on tor
From: trw
Newsgroups: rocksolid.shared.offtopic
Organization: Dancing elephants
Date: Sat, 19 Oct 2019 20:08 UTC
References: 1
> source: i've written crawlers in the past and spoofed the user agent because of those reasons.

were those for clearnet or for darknets (or both)? i don't mind bots as long as they do not consume too many resources...

cheers

trw
Posted on def3


Subject: Re: Re I cant believe it the baidu spider is scanning def3 on tor
From: AnonUser
Newsgroups: rocksolid.shared.offtopic
Organization: RetroBBS
Date: Sat, 19 Oct 2019 22:10 UTC
References: 1
  To: trw
mainly clearnet. i did also crawl the darknet without a delay back then, simply because the network was really slow.

years ago the darknets were even slower than they are now. one could even say they are "fast" nowadays.

you can always rate limit your eepsite if bots become an issue. it generally isn't worth it for any crawler to re-create tunnels very often, as that is computationally expensive and new tunnels also take a bit to warm up. i also do not think most people would bother with that.
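
As a rough illustration of what rate limiting can look like in the site's own code, here is a minimal token bucket sketch. The class name TokenBucket and its parameters are hypothetical and not part of any i2p or web server API; how you key the buckets (per tunnel, per session, or one global bucket) depends on what your server can actually see.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.burst = burst          # maximum number of stored tokens
        self.clock = clock          # injectable clock, handy for testing
        self.tokens = float(burst)  # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if a request may proceed now, False if it should be refused."""
        now = self.clock()
        # Refill according to the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A request handler would call allow() once per incoming request and answer with an error page (for example HTTP 429) when it returns False.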
--
Posted on RetroBBS



Subject: Re: Re I cant believe it the baidu spider is scanning def3 on tor
From: AnonUser
Newsgroups: rocksolid.shared.offtopic
Organization: RetroBBS
Date: Sat, 19 Oct 2019 22:13 UTC
References: 1
  To: AnonUser
it just came to my mind that back then there were clearnet websites which provided access to darknet websites, so clearnet search engines did index the darknets for a while.

i do not know if any such website is still active, though i would seriously doubt it because of the possible legal issues and constant dmca/takedown requests.
--
Posted on RetroBBS



Subject: Re: Re I cant believe it the baidu spider is scanning def3 on tor
From: Guest
Newsgroups: rocksolid.shared.offtopic
Organization: Dancing elephants
Date: Sat, 19 Oct 2019 22:24 UTC
References: 1
> i do not know if any such website is still active,

oh yes, there are many of them that are active. best to be avoided, imo, both as a client and as a server.
Posted on def3


Subject: Re: Re I cant believe it the baidu spider is scanning def3 on tor
From: trw
Newsgroups: rocksolid.shared.offtopic
Organization: Dancing elephants
Date: Sat, 19 Oct 2019 22:31 UTC
References: 1
> you can always rate limit your eepsite if bots become an issue

on i2p everything is dandy, and the java package allows fine-tuning of these service settings to a very high extent.
it is the tor side that usually makes trouble, and that one has no such measures (unless you add your own code to the service or you play around with iptables).
anyway, bots and spiders are just something that any service has to deal with somehow. it is kind of a stress test sometimes, so it helps to find weak spots in the server.
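
For the "add your own code to the service" route, one simple option is a cap on how many requests are handled at the same time. This matters on the tor side because an onion service typically sees every visitor as a local connection from the tor daemon, so per-address limits cannot tell visitors apart. The sketch below is only an illustration; the class name ConnectionCap and its methods are hypothetical.

```python
import threading


class ConnectionCap:
    """Refuse new work once `max_active` requests are already in progress."""

    def __init__(self, max_active):
        # BoundedSemaphore raises ValueError if released more often than acquired.
        self._slots = threading.BoundedSemaphore(max_active)

    def try_enter(self):
        """Take a slot without blocking; return False if the service is full."""
        return self._slots.acquire(blocking=False)

    def leave(self):
        """Give the slot back once the request has been handled."""
        self._slots.release()
```

A handler would call try_enter() first, reply with a "busy" error if it returns False, and call leave() in a finally block after serving the page.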

i did some crawling of both tor and i2p some years ago, only by using wget in a script. this was semi-successful...
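
For comparison, the basic idea of such a crawl (start page, follow same-site links, wait between requests) can be sketched in a few lines of Python. This is not the wget script mentioned above, just an illustration; the fetch function is left as a parameter because how pages are fetched over tor or i2p (proxy settings and so on) is an assumption that depends on the local setup.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=50, delay=0.0, sleep=time.sleep):
    """Breadth-first crawl of one host.

    fetch(url) must return the page HTML as a string (caller-supplied).
    Returns the list of visited URLs in visiting order.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop fragments.
            link = urljoin(url, href).split("#")[0]
            # Stay on the starting host and skip pages already queued.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
        if delay:
            sleep(delay)  # be polite: pause between page requests
    return visited
```

Setting a non-zero delay and a small max_pages keeps the load on the target low, which is the kind of behaviour a site operator would want to see from a bot.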


cheers

trw
Posted on def3

