#www.abacho.com - Search from Germany's Abacho portal, with translator and search in 8 European Union countries
User-agent: AbachoBOT
Disallow:

#Alexa - Needs no introduction
User-agent: ia_archiver
Disallow:

#ATN Worldwide - The ATN robot is used to build the database for the AllThatNet search service (http://www.allthatnet.com/), operated across the network. The robot runs weekly and visits sites in random order.
User-agent: ATN_Worldwire
Disallow:

#www.altavista.co.uk - United Kingdom. Uses the Yahoo database.
User-agent: AltaVista-Intranet
Disallow:

#www.alltheweb.com - Yahoo database
User-agent: FAST-WebCrawler
Disallow:

#ArchitextSpider - ArchitextSpider collects information for the Excite and WebCrawler search engines
User-agent: ArchitextSpider
Disallow:

#www.acoon.de - German, Europe
User-agent: Acoon Robot
Disallow:

#www.antisearch.net - France
User-agent: antibot
Disallow:

#www.atomz.com - USA
User-agent: Atomz
Disallow:

#www.axmo.com - USA
User-agent: AxmoRobot
Disallow:

#www.abcdatos.com - Programs and tutorials in Castilian Spanish
User-agent: abcdatos_botlink
Disallow:

#www.aesop.com - Spider Search Engine
User-agent: AESOP_com_SpiderMan
Disallow:

#www.ah-ha.com redirects to www.enhance.com - Pay-for-performance search engine for marketing and services.
User-agent: ah-ha.com crawler
Disallow:

#www.altavista.com
User-agent: Scooter
Disallow:

#www.asc.com - search, images, news, blogs, video, maps and directions, local search and shopping
User-agent: Teoma
Disallow:

# AdSense
User-agent: Mediapartners-Google*
Disallow:

#www.amfibi.com - Search engine run by BCN Telecom from its offices within Spain add site br
User-agent: Amfibibot
Disallow:

# Amidalla
User-agent: amibot
Disallow:

# ASPseek.com
User-agent: ASPseek
Disallow:

# BDNcentral
User-agent: BDNcentral
Disallow:

# Become.com
User-agent: BecomeBot
Disallow:

# BigClique
User-agent: BigCliqueBOT
Disallow:

# Boitho
User-agent: boitho.com-dc
Disallow:

#www.buscaplus.com
User-agent: Buscaplus Robi
Disallow:

# Convera
User-agent: ConveraCrawler
Disallow:

#www.canseek.ca
User-agent: CanSeek/
Disallow:

#www.christcrawler.com/search.cfm
User-agent: ChristCRAWLER
Disallow:

#Chinese language
User-agent: robot-response@openfind.com.tw
Disallow:

#www.clush.com
User-agent: Clushbot
Disallow:

# Daypop
User-agent: Daypopbot
Disallow:

#Dir.com
User-agent: Pompos
Disallow:

#www.ditto.com
User-agent: DittoSpyder
Disallow:

#www.domanova.co.uk
User-agent: Jack
Disallow:

#deleuze.infobee.ne.jp
User-agent: gazz/1.0
Disallow:

#www.daadle.com
User-agent: DaAdLe.com ROBOT/
Disallow:

#www.daum.net
User-agent: RaBot
Disallow:

#ext-gw.trd.fast.no
User-agent: Wget
Disallow:

#www.en.deepindex.com
User-agent: DeepIndex
Disallow:

#www.earthcom.info
User-agent: EARTHCOM.info
Disallow:

#www.entireweb.com
User-agent: Speedy Spider
Disallow:

#www.excite.com
User-agent: Architext Spider
Disallow:

#www.excite.com
User-agent: ArchitectSpider
Disallow:

#www.eurip.com
User-agent: EuripBot
Disallow:

#www.ezresults.com
User-agent: EZResult
Disallow:

#www.euroseek.net
User-agent: Arachnoidea
Disallow:

# FyberSearch
User-agent: FyberSpider
Disallow:

#www.fastsearch.net
User-agent: Fast PartnerSite Crawler
Disallow:

# FAST Data
User-agent: FAST Data Search Crawler
Disallow:

# FAST Data
User-agent: FAST Data Search Document Retriever
Disallow:

#www.fireball.de
User-agent: KIT-Fireball
Disallow:

#http://france.misesajour.com/
User-agent: france.misesajour.com
Disallow:

#www.fybersearch.com
User-agent: FyberSearch
Disallow:

#www.galaxy.com
User-agent: GalaxyBot
Disallow:

#www.geckobot.com
User-agent: Geckobot
Disallow:

#www.gendoor.com - Genealogy search
User-agent: GenCrawler
Disallow:

#www.geona.com
User-agent: GeonaBot
Disallow:

#www.getrax.com
User-agent: getRAX
Disallow:

#www.google.com
User-agent: ooglebot
Disallow:

#www.goo.ne.jp
User-agent: moget/2.0
Disallow:

#www.girafa.com
User-agent: Aranha
Disallow:

# Google
User-agent: googlebot
Disallow:

# Gullive
User-agent: gulliver
Disallow:

# Girafa.com
User-agent: Girafabot
Disallow:

# GoForIt
User-agent: Goforitbot
Disallow:

#http://hoppa.com/
User-agent: Toutatis 2.5-2
Disallow:

#www.hubat.com
User-agent: Hubater
Disallow:

# IBM
User-agent: Crawler
Disallow:

# Inelegant
User-agent: nelaBot
Disallow:

# IRL
User-agent: IRLbot
Disallow:

#Inktomi Slurp
User-agent: Slurp/2.0
Disallow:

# InfoSeek
User-agent: InfoSeek Robot 1.0
Disallow:

#www.iltrovatore.it
User-agent: IlTrovatore-Setaccio
Disallow:

#www.incywincy.com
User-agent: IncyWincy
Disallow:

#www.infoseek.com
User-agent: UltraSeek
Disallow:

#www.ip3000.com
User-agent: C-PBWF-ip3000.com-crawler
Disallow:

#www.ipselon.com
User-agent: Ipselonbot
Disallow:

#www.intags.de
User-agent: InfoSeek Sidewinder
Disallow:

#www.joocer.com
User-agent: JoocerBot
Disallow:

#www.kuloko.com
User-agent: kuloko-bot/0.2
Disallow:

# Local.com
User-agent: LocalcomBot
Disallow:

# Loopimprovements
User-agent: NetResearchServer
Disallow:

# Lycos
User-agent: Lycos/x.x
Disallow:

#www.lexis-nexis.com
User-agent: LNSpiderguy
Disallow:

#www.lapozz.com
User-agent: LapozzBot/
Disallow:

#www.linknz.co.nz
User-agent: Linknzbot
Disallow:

#www.look.com
User-agent: lookbot
Disallow:

#www.looksmart.com
User-agent: MantraAgent
Disallow:

#www.lycos.com
User-agent: Lycos_Spider_(T-Rex)
Disallow:

# Majestic-12
User-agent: MJ12bot
Disallow:

#http://mp3bot.de/
User-agent: MP3Bot
Disallow:

#Mercator.pa-x.dec.com
User-agent: Mercator
Disallow:

#Mercator.pa-x.dec.com
User-agent: Scooter2_Mercator_3-1.0
Disallow:

#http://search.msn.com/
User-agent: MSNBOT/0.1
Disallow:

# MSN
User-agent: msnbot
Disallow:

#www.mirago.co.uk
User-agent: HenryTheMiragoRobot
Disallow:

#http://mapper.teradex.com
User-agent: Teradex_Mapper
Disallow:

#www.mojeek.com
User-agent: MojeekBot
Disallow:

#www.maxbot.com
User-agent: Spider/maxbot.com
Disallow:

#www.mousefish.com
User-agent: MouseBOT/
Disallow:

#www.mozdex.com
User-agent: mozDex/
Disallow:

#www.navadoo.com
User-agent: Navadoo Crawler
Disallow:

#http://navi.ocn.ne.jp/
User-agent: nttdirectory_robot
Disallow:

#www.northernlight.com
User-agent: Gulliver
Disallow:

#www.nationaldirectory.com
User-agent: NationalDirectory-SuperSpider
Disallow:

#www.naver.com
User-agent: dloader(NaverRobot)/
Disallow:

#www.noxtrum.com
User-agent: noxtrumbot/
Disallow:

# Nameprotect
User-agent: NPBot
Disallow:

# ObjectsSearch
User-agent: ObjectsSearch
Disallow:

#www.objectssearch.com
User-agent: ObjectsSearch/0.01
Disallow:

#www.openfind.com
User-agent: Openfind piranha,Shark
Disallow:

# Omni-Explorer
User-agent: OmniExplorer_Bot
Disallow:

#www.picsearch.org
User-agent: psbot
Disallow:

#www.pinpoint.com
User-agent: CrawlerBoy Pinpoint.com
Disallow:

#www.petersnews.com
User-agent: user.ip3000.com
Disallow:

# Pipeline
User-agent: pipeLiner
Disallow:

#www.picosearch.com
User-agent: PicoSearch/
Disallow:

#www.portaljuice.com
User-agent: PJspider
Disallow:

#www.powerinter.net
User-agent: DIIbot
Disallow:

#www.qweery.nl
User-agent: QweeryBot
Disallow:

#www.rambler.ru
User-agent: StackRambler/
Disallow:

#inktomi
User-agent: Slurp/2.0j
Disallow:

#inktomi
User-agent: Slurp.so/1.0
Disallow:

#inktomi
User-agent: Slurp/2.0-KiteHourly
Disallow:

#inktomi
User-agent: Slurp/2.0-OwlWeekly
Disallow:

#inktomi
User-agent: Slurp/3.0-AU
Disallow:

# SEVENtwentyfour
User-agent: LinkWalker
Disallow:

# SharewarePlaza.com
User-agent: Agent-SharewarePlazaFileCheckBot
Disallow:

# SitiDi.net
User-agent: SitiDiBot
Disallow:

#http://szukaj.onet.pl/
User-agent: OnetSzukaj/
Disallow:

#http://search.privacybird.com/
User-agent: PrivacyFinder
Disallow:

# SpiderMonkey
User-agent: SpiderMonkey
Disallow:

#http://search.Market-UK.com
User-agent: ScollSpider
Disallow:

#www.seznam.cz
User-agent: SeznamBot
Disallow:

#www.search-10.com
User-agent: Search-10
Disallow:

#www.searchhippo.com
User-agent: Fluffy the spider
Disallow:

#www.scrubtheweb.com
User-agent: Scrubby/
Disallow:

#www.singingfish.com
User-agent: asterias
Disallow:

#www.speedfind.de
User-agent: speedfind ramBot xtreme
Disallow:

#www.s.u-tokyo.ac.jp
User-agent: Kototoi/0.1
Disallow:

#www.searchbyusa.com
User-agent: SearchByUsa
Disallow:

#www.searchspider.com
User-agent: Searchspider/
Disallow:

#www.sightquest.com
User-agent: SightQuestBot/
Disallow:

#www.spidermonkey.ca
User-agent: Spider_Monkey/
Disallow:

#www.surfnomore.com
User-agent: Surfnomore Spider v1.1
Disallow:

#www.supersnooper.com
User-agent: Robot@SuperSnooper.Com
Disallow:

# Turnitin.com
User-agent: TurnitinBot
Disallow:

# TutorGig.com
User-agent: TutorGigBot
Disallow:

#tv.sv.av.com
User-agent: Tv_Merc_resh_26_1_D-1.0
Disallow:

#www.travel-finder.com
User-agent: ESISmartSpider
Disallow:

#www.traficdublu.ro
User-agent: Spider TraficDublu
Disallow:

#www.tutorgig.com
User-agent: Tutorial Crawler
Disallow:

#www.teoma.com
User-agent: teoma_agent1
Disallow:

#Ultraseek
User-agent: Ultraseek
Disallow:

#www.updated.com
User-agent: updated/0.1beta
Disallow:

#www.uksearcher.co.uk
User-agent: UK Searcher Spider
Disallow:

# Voila.fr
User-agent: VoilaBot
Disallow:

#www.vestris.com/alkaline
User-agent: AlkalineBOT
Disallow:

# Walhello
User-agent: Appie
Disallow:

# WebArchive
User-agent: BruinBot
Disallow:

# Yandex
User-agent: Yandex
Disallow:

#www.walhello.com
User-agent: appie
Disallow:

#www.websmostlinked.com
User-agent: Nazilla
Disallow:

#www.webseek.de
User-agent: marvin/infoseek
Disallow:

#www.webtop.com
User-agent: MuscatFerret
Disallow:

#Walhello Appie
User-agent: appie/1.1
Disallow:

#www.whizbanglabs.com
User-agent: WhizBang! Lab
Disallow:

#W3T
User-agent: W3T_SE
Disallow:

#www.wisenut.com
User-agent: ZyBorg
Disallow:

#www.wire.co.uk
User-agent: WIRE WebRefiner:
Disallow:

#w3.org
User-agent: W3C-gsa
Disallow:

#www.worldsearchcenter.com
User-agent: WSCbot
Disallow:

# Yahoo!
User-agent: Slurp
Disallow:

#www.yellowpet.com - pet-based search engine
User-agent: Yellopet-Spider
Disallow:

#www.yelo.no
User-agent: Findexa Crawler
Disallow:

#www.yourbettersearch.com
User-agent: YBSbot search engine indexer
Disallow:

#http://verno.ueda.info.waseda.ac.jp/
User-agent: Iron33
Disallow:

# All
# Where the files to be excluded are listed, THERE MUST BE NO BLANK LINES BETWEEN THE LINES
# Example of exclusion by directory
User-agent: *
Disallow: /cgi-bin/
Disallow: /teste/
Disallow: /rascunhos/
Disallow: /particular/
# Example of exclusion by file extension
# The dollar sign "$" is required in this position; it marks the end of the file name.
# Example of exclusion by name and extension
#Disallow: /foto-projeto.jpeg$
#Disallow: /image/logo-velho.png$
# Example of exclusion by page name
#Disallow: /pagina-01.html
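# Illustrative sketch of exclusion by file extension (not part of the original list).
# It assumes the crawler supports the "*" and "$" wildcards; parsers without wildcard support may ignore the rule.
# The ".pdf" extension is a hypothetical example. Remove the leading "#" to activate the rule.
#Disallow: /*.pdf$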