
How can we block AI software from crawling our websites?

Post by mrki444 » 05 Feb 2023, 19:31

We can see how popular ChatGPT has become. Its power and output come from millions of pieces of data crawled from the Internet. Many authors now complain that OpenAI used their data to build ChatGPT without their permission. Some think AI developers should request permission from the author before crawling data, with royalty-free content as the exception.

Now some programmers are thinking about how to block AI crawlers. We can block Google or other search engines with a few lines of code.
This is an example of a robots.txt file that blocks Bing's crawler.
Code:
# Rules for all crawlers
User-agent: *
Disallow: /*.axd
Disallow: /cgi-bin/
Disallow: /member

# Block these two crawlers from the whole site
User-agent: bingbot
User-agent: ia_archiver
Disallow: /
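
To see what rules like these actually permit, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URLs and the is_allowed helper are made up for illustration, and the wildcard .axd rule is left out because the standard-library parser only does plain prefix matching.

```python
from urllib.robotparser import RobotFileParser

# Rules modelled on the robots.txt example in this post.
# The wildcard rule for .axd files is omitted on purpose: the
# standard-library parser only matches plain path prefixes.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /member

User-agent: bingbot
User-agent: ia_archiver
Disallow: /
"""

# Parse the rules from text instead of fetching them over the network.
_parser = RobotFileParser()
_parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, url: str) -> bool:
    """Hypothetical helper: may this crawler fetch this URL under the rules?"""
    return _parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(is_allowed("bingbot", "https://example.com/articles/1"))        # False
    print(is_allowed("Googlebot", "https://example.com/articles/1"))      # True
    print(is_allowed("Googlebot", "https://example.com/member/profile"))  # False
```

Keep in mind that this only shows what a well-behaved crawler would do. robots.txt is a request, and a crawler can ignore it.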


Do you think solutions like this will become common for AI software, and will they affect the future of AI?

Post by friendociate » 08 Feb 2023, 21:35

I think the attribute that tells spiders not to 'count a link' (if it's a link you want to link to but don't want search engines to associate with your site) is rel="nofollow". By contrast, target="_blank" only opens the link in a new tab.
Code:
rel="nofollow" (within the <a href="..."> tag)

Last edited by friendociate on 15 Feb 2023, 19:41, edited 1 time in total.

Post by Mika » 09 Feb 2023, 16:30

If you want organic traffic, you need search engine bots to crawl your pages. As long as search engines have access to your site, AI software will also have access to it.

Post by mrki444 » 09 Feb 2023, 21:06

I know how to block search engines, and I know that is a bad decision, so I don't do it. My question is how to stop AI software like ChatGPT from crawling my site. If we can block a specific search engine, why can't we block AI?

Post by friendociate » 09 Feb 2023, 22:05

mrki444 wrote:I know how to block search engines, and I know that is a bad decision, so I don't do it. My question is how to stop AI software like ChatGPT from crawling my site. If we can block a specific search engine, why can't we block AI?

If you mean 'AI posting on your site,' I think that's why they have email verification and CAPTCHA (prove you're not a robot) when you sign up.

Post by mrki444 » 13 Feb 2023, 18:22

friendociate wrote:If you mean 'AI posting on your site,' I think that's why they have email verification and CAPTCHA (prove you're not a robot) when you sign up.


Not posting, but crawling. Crawling is a technique where a tool accesses your site and copies all or part of its content. AI does that and then generates new content from what it copied.

Post by ptrikha21 » 05 Mar 2023, 13:37

This could become a game of hide-and-seek between programming solutions that block AI crawlers and countermeasures that bypass them!
By the way, is the robots.txt script written in JavaScript, VBScript, or something else?

Post by mrki444 » 05 Mar 2023, 15:41

ptrikha21 wrote:By the way, is the robots.txt script written in JavaScript, VBScript, or something else?


As far as I know, it is not a programming language at all. It is closer to metadata, that is, data about data. robots.txt is a plain-text file in which we define rules for crawlers, SEO tools, and other bot programs.
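
To illustrate that robots.txt is plain data rather than a script, here is a rough sketch that reads it as simple "field: value" lines and groups them by crawler. The parse_records function is a made-up helper for illustration, not part of any library, and it only understands the User-agent, Allow, and Disallow fields.

```python
from __future__ import annotations


def parse_records(text: str) -> list[dict]:
    """Hypothetical helper: group robots.txt lines into per-crawler records."""
    records: list[dict] = []
    current = None
    last_was_agent = False
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue  # blank or malformed line: nothing to record
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # Consecutive User-agent lines share one group of rules.
            if not last_was_agent:
                current = {"agents": [], "rules": []}
                records.append(current)
            current["agents"].append(value)
            last_was_agent = True
        elif field in ("allow", "disallow") and current is not None:
            current["rules"].append((field, value))
            last_was_agent = False
    return records


if __name__ == "__main__":
    sample = "User-agent: bingbot\nUser-agent: ia_archiver\nDisallow: /\n"
    for record in parse_records(sample):
        print(record)
```

Nothing in the file is executed. A crawler just reads these records and decides for itself whether to follow them.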

Post by Netherrealmer » 05 Mar 2023, 23:08

I may not mind them as long as the person using them is not spamming and flooding sites with nonsense. ChatGPT can be used to rewrite your content in the active voice and fix grammar. So it is not as bad as people paint it. It's just that some people with no sense use it wrongly.

Post by mrki444 » 12 Mar 2023, 20:11

It looks like there is currently no way to block AI.
You can block the Common Crawl crawler (a crawler used by most ad companies, and by search engines too), but then you may have problems with advertisers, since they use it to show relevant ads on your site.

Here is how you can block it if you really want to. Add this to robots.txt:
Code:
User-agent: CCBot
Disallow: /


Source: Search Engine Journal
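
If you want to check that a rule like this blocks only the crawler you name, here is a small sketch using Python's standard urllib.robotparser module. The block_agents helper and the example.com URL are made up for illustration. It only confirms how a compliant crawler would read the rule, because following robots.txt is voluntary.

```python
from urllib.robotparser import RobotFileParser


def block_agents(agents):
    """Hypothetical helper: robots.txt text that disallows the whole site
    for each named crawler and says nothing about anyone else."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in agents]
    return "\n".join(groups)


def can_crawl(robots_txt, user_agent, url):
    """Ask the standard-library parser whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = block_agents(["CCBot"])
    print(rules)
    # example.com is a placeholder domain.
    print(can_crawl(rules, "CCBot", "https://example.com/any-page"))      # False
    print(can_crawl(rules, "Googlebot", "https://example.com/any-page"))  # True
```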