robots.txt



User-agent: *                   # all bots
Disallow:                       # empty value: nothing is disallowed (the entire website may be crawled)

User-agent: googlebot           # Google's search crawler (not every Google service)
Disallow: /                     # disallow the root path (the entire website)

User-agent: GPTBot              # OpenAI's web crawler
Disallow: /                     # disallow the root path (the entire website)

User-agent: Bytespider
Disallow: /

# on its website, Google states that the instructions in a robots.txt
# file are not enough to keep a website out of its index:
# 'it is not a mechanism for keeping a web page out of Google.'
#
# if you want to block all search indexing
# you need to add the following tag to the <head> of every page:
# <meta name="robots" content="noindex" />
# although Google stipulates that '[f]or the noindex rule to be effective,
# the page or resource must not be blocked by a robots.txt file'
# sounds like a catch-22: a crawler that is disallowed here never fetches
# the page, so it never sees the noindex tag. idk.
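#
# one possible way out (a sketch, not verified against Google's behaviour;
# it assumes the goal is de-indexing rather than blocking crawling):
# let googlebot crawl so it can read the noindex rule on each page,
# i.e. replace its group above with
#
# User-agent: googlebot
# Disallow:
#
# the same rule can also be sent as an HTTP response header
# instead of a meta tag:
# X-Robots-Tag: noindex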