Robot list – internet search – web crawlers

Search engine robots and others

The following table lists the search engines that spider the web, the IP addresses that they use, and the robot names they send out to visit your site. Version numbers are usually included in the robot names, but are omitted here except where they imply a visit from a different IP address or (as with Inktomi) a different search engine.

Often multiple IP addresses are used, in which case we just give a flavour of the names or numbers. Inktomi is a company that offers search engine technology, which is used by a number of sites (e.g. www.snap.com and www.hotbot.com).

Wherever a placeholder appears in a robot name, it indicates that a number of different digits may be used.
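
Since version numbers are mostly left out of the table, a log entry such as "MSNBOT/0.1" has to be reduced to its bare name before it can be compared with the list. The following is a minimal Python sketch of that step. It assumes the version follows the first "/" in the identifier, which holds for many entries here but not all (for example "Surfnomore Spider v1.1"). The function name `strip_version` is hypothetical.

```python
import re

# Assumption: the version, when present, follows the first "/" in the name,
# as in "MSNBOT/0.1" or "Slurp/2.0-KiteHourly".
_VERSION_SUFFIX = re.compile(r"/.*$")


def strip_version(identifier: str) -> str:
    """Return the robot name with any trailing "/version" part removed."""
    return _VERSION_SUFFIX.sub("", identifier.strip())


if __name__ == "__main__":
    for name in ("MSNBOT/0.1", "Slurp/2.0-KiteHourly", "kuloko-bot/0.2", "Googlebot"):
        print(name, "->", strip_version(name))
```

Note that this also discards variant suffixes such as "-KiteHourly", so it is only suitable where the bare robot name is all that matters.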



Home page/search engine Robot identifier
www.abacho.com AbachoBOT
www.abcdatos.com abcdatos_botlink http://www.abcdatos.com/botlink/
www.aesop.com AESOP_com_SpiderMan
www.ah-ha.com ah-ha.com crawler (crawler@ah-ha.com)
www.alexa.com ia_archiver
www.altavista.com Scooter Mercator Scooter2_Mercator_3-1.0 roach.smo.av.com-1.0 Tv_Merc_resh_26_1_D-1.0
www.altavista.co.uk AltaVista-Intranet jan.gelin@av.com
www.alltheweb.com FAST-WebCrawler crawler@fast.no
www.fast.no/faq/faqfastwebsearch/faqfastwebcrawler.html
Wget
www.acoon.de Acoon Robot
www.antisearch.net antibot
www.atomz.com Atomz
www.axmo.com AxmoRobot
www.buscaplus.com Buscaplus Robi http://www.buscaplus.com/robi/
www.canseek.ca CanSeek/ support@canseek.ca
www.christcrawler.com/search.cfm ChristCRAWLER http://www.christcrawler.com/
www.clush.com Clushbot http://www.clush.com/bot.html
www.crawler.de Crawler admin@crawler.de
www.daadle.com DaAdLe.com ROBOT/
www.daum.net RaBot Agent-admin/ phortse@hanmail.net contact/jylee@kies.co.kr
RaBot Agent-admin/ webmaster@kisco.go.kr
www.en.deepindex.com DeepIndex
www.ditto.com DittoSpyder
domanova.co.uk Jack
www.earthcom.info EARTHCOM.info
www.entireweb.com Speedy Spider
www.excite.com ArchitextSpider
(excite) ArchitectSpider
www.eurip.com EuripBot
www.euroseek.net Arachnoidea arachnoidea@euroseek.net
www.ezresults.com EZResult
www.fastsearch.net Fast PartnerSite Crawler FAST Data Search Crawler FAST Data Search Document Retriever
www.fireball.de KIT-Fireball
http://france.misesajour.com/ france.misesajour.com
www.fybersearch.com FyberSearch
www.galaxy.com GalaxyBot http://www.galaxy.com/galaxybot.html
www.geckobot.com geckobot
www.gendoor.com (Genealogical Search Engine) GenCrawler
www.geona.com GeonaBot
www.getrax.com getRAX
www.google.com Googlebot googlebot@googlebot.com http://googlebot.com/
www.goo.ne.jp moget/2.0 moget@goo.ne.jp
www.girafa.com Aranha
(inktomi) Slurp.so/1.0 slurp@inktomi.com
(inktomi) Slurp/2.0j slurp@inktomi.com www.inktomisearch.com
(inktomi) Slurp/2.0-KiteHourly slurp@inktomi.com; www.inktomi.com/slurp.html
(inktomi) Slurp/2.0-OwlWeekly spider@aeneid.com www.inktomi.com/slurp.html
(inktomi) Slurp/3.0-AU slurp@inktomi.com
http://hoppa.com/ (need V5 browsers to view) Toutatis 2.5-2
www.hubat.com Hubater
www.almaden.ibm.com (research centre) http://www.almaden.ibm.com/cs/crawler
www.iltrovatore.it IlTrovatore-Setaccio
www.incywincy.com IncyWincy
www.infoseek.com UltraSeek InfoSeek Sidewinder
www.intags.de Mole2/1.0 webmaster@intags.de
http://mp3bot.de/ MP3Bot
www.ip3000.com C-PBWF-ip3000.com-crawler ip3000.com-crawler
www.istarthere.com http://www.istarthere.com spider@istarthere.com
www.knowledge.com Knowledge.com/
www.kuloko.com kuloko-bot/0.2
www.lexis-nexis.com LNSpiderguy
www.linknz.co.nz Linknzbot
www.look.com lookbot
www.looksmart.com MantraAgent
www.loopimprovements.com (see also www.incywincy.com) NetResearchServer www.loopimprovements.com/robot.html
www.lycos.com Lycos_Spider_(T-Rex)
www.joocer.com JoocerBot
www.mirago.co.uk HenryTheMiragoRobot
www.mojeek.com MojeekBot
www.mozdex.com mozDex/
http://search.msn.com/ MSNBOT/0.1 http://search.msn.com/msnbot.htm
www.navadoo.com Navadoo Crawler
www.northernlight.com Gulliver
www.objectssearch.com ObjectsSearch/0.01
http://szukaj.onet.pl/ OnetSzukaj/
www.picosearch.com PicoSearch/
www.portaljuice.com PJspider
www.powerinter.net but it won’t let us in :-( DIIbot
http://navi.ocn.ne.jp/ nttdirectory_robot super-robot@super.navi.ocn.ne.jp griffon griffon@super.navi.ocn.ne.jp
www.maxbot.com Spider/maxbot.com admin@maxbot.com
??? gazz/1.0 gazz@nttrd.com
www.nationaldirectory.com NationalDirectory-SuperSpider
www.naver.com dloader(NaverRobot)/ dumrobo(NaverRobot)/
www.noxtrum.com noxtrumbot/
www.openfind.com (Chinese language) Openfind piranha,Shark robot-response@openfind.com.tw Openbot/
www.picsearch.org psbot www.picsearch.org/bot.html
www.pinpoint.com CrawlerBoy Pinpoint.com
www.petersnews.com user.ip3000.com
www.qweery.nl QweeryBot http://qweerybot.qweery.com
www.vestris.com/alkaline AlkalineBOT
www.rambler.ru StackRambler/
www.seznam.cz SeznamBot
www.search-10.com Search-10
www.searchhippo.com Fluffy the spider info@searchhippo.com
www.scrubtheweb.com Scrubby/
www.singingfish.com asterias
www.speedfind.de speedfind ramBot xtreme
www.s.u-tokyo.ac.jp Kototoi/0.1
www.searchbyusa.com SearchByUsa
www.searchspider.com Searchspider/
www.sightquest.com SightQuestBot/ http://www.sightquest.com/bot.htm
www.spidermonkey.ca Spider_Monkey/
www.surfnomore.com Surfnomore Spider v1.1
www.supersnooper.com Robot@SuperSnooper.Com
www.teoma.com teoma_agent1 teoma_admin@hawkholdings.com
http://mapper.teradex.com Teradex_Mapper mapper@teradex.com
www.travel-finder.com ESISmartSpider
www.traficdublu.ro Spider TraficDublu
www.tutorgig.com Tutorial Crawler http://www.tutorgig.com/crawler
www.updated.com updated/0.1beta crawler@updated.com
www.uksearcher.co.uk UK Searcher Spider
www.vivante.com (coming soon) Vivante Link Checker
www.walhello.com appie
www.websmostlinked.com Nazilla
www.webwombat.com.au www.WebWombat.com.au
www.webseek.de marvin/infoseek marvin-team@webseek.de
www.webtop.com MuscatFerret
www.whizbanglabs.com WhizBang! Lab
www.wisenut.com ZyBorg (info@WISEnut.com)
www.wire.co.uk WIRE WebRefiner: webrefiner@wire.co.uk
www.worldsearchcenter.com WSCbot
www.yandex.com Yandex
www.yellowpet.com pet-based search engine Yellopet-Spider
www.yelo.no Findexa Crawler
www.yourbettersearch.com YBSbot search engine indexer
libwww-perl
http://verno.ueda.info.waseda.ac.jp/
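
A list like this is typically used to recognise robots in web server logs. As a rough sketch, and not a definitive method, the Python below checks a User-Agent string against a handful of identifiers copied from the table, using a case-insensitive substring match. The names `KNOWN_ROBOTS` and `identify_robot` are hypothetical, and the sample strings in the usage lines are made up for illustration.

```python
from typing import Optional

# A small, hand-picked subset of the table: robot identifier -> home page.
KNOWN_ROBOTS = {
    "Googlebot": "www.google.com",
    "Slurp": "(inktomi)",
    "ia_archiver": "www.alexa.com",
    "Scooter": "www.altavista.com",
    "MSNBOT": "http://search.msn.com/",
    "FAST-WebCrawler": "www.alltheweb.com",
}


def identify_robot(user_agent: str) -> Optional[str]:
    """Return the home page of the first known robot named in user_agent."""
    ua = user_agent.lower()
    for name, home in KNOWN_ROBOTS.items():
        # Case-insensitive substring match; crude, but tolerant of
        # version suffixes and contact details around the name.
        if name.lower() in ua:
            return home
    return None


if __name__ == "__main__":
    print(identify_robot("Slurp/2.0j (slurp@inktomi.com)"))
    print(identify_robot("Mozilla/4.0 (compatible; MSIE 6.0)"))
```

Substring matching can give false positives for short names, so a stricter comparison may be preferable when the list grows.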

Browsers

Most browsers identify themselves with a string that begins “Mozilla…”. I’ve chosen not to document those (as yet). Here are a few of the rarer browser identifiers that I’ve seen.

Browser identifier Information
AmigaVoyager http://v3.vapor.com/ Voyager browser for the Amiga
xChaos_Arachne http://browser.arachne.cz/ (DOS-compatible browser. Linux version under development)
IBrowse www.hisoft.co.uk (search for IBrowse) Amiga-based browser
ICab www.icab.de/index.html (Macintosh-only)
JustView http://www3.justsystem.co.jp/download/justview/3.01win1a.html (I think this is a browser. Site is in Japanese)
KMeleon http://kmeleon.sourceforge.net/ (Light browser based on the Mozilla code base)
Konqueror www.konqueror.org/konq-browser.html (Linux KDE browser)
Lynx http://lynx.browser.org/ (Cross-platform text based browser)
OmniWeb www.omnigroup.com/products/omniweb/ (Macintosh-only)
Opera www.opera.com (Cross-platform, small, efficient and standards-led browser)
Plucker www.plkr.org/index.pl/faq#1.1 (Palm handhelds. Written in Python)
pwWebSpeak www.prodworks.com/issound/catalog/catalog_pwwebspeak.html Audio Browser
QWeb http://sunsite.auc.dk/qweb/ (Linux browser) (see also http://browswerwatch.internet.com/news/story/qweb8.html)
retawq http://retawq.sourceforge.net/ Text-based browser for text terminals. Runs under Linux
SlimBrowser www.flashpeak.com/sbrowser/sbrowser.htm Freeware tabbed browser
Sleipnir http://sleipnir.pos.to/software/sleipnir/index.html (Japanese) Japanese browser, apparently with an English version available.
VMS_Mosaic http://vaxa.wvnet.edu/vmswww/vms_mosaic.html (OpenVMS only version of Mosaic, a pre-Netscape browser)
WannaBe http://mindstory.com/wb2/ (Macintosh text-only browser)
w3m http://w3m.sourceforge.net/ (text-based browser)
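
To separate these rarer browsers from the "Mozilla…" majority, one possible approach is to look for a known browser name first and fall back to the "Mozilla" prefix only if none is found. The Python sketch below does this for a subset of the browser list. `RARE_BROWSERS` and `classify_browser` are hypothetical names, and the example User-Agent strings are illustrative rather than taken from real logs.

```python
from typing import Optional

# A subset of the browser identifiers listed in the Browsers table.
RARE_BROWSERS = (
    "AmigaVoyager", "IBrowse", "ICab", "Konqueror",
    "Lynx", "OmniWeb", "Opera", "w3m",
)


def classify_browser(user_agent: str) -> Optional[str]:
    """Return a rare browser name, "Mozilla-style", or None if unrecognised."""
    ua = user_agent.lower()
    # Check the rare names first, because some of these browsers may also
    # send a string that begins with "Mozilla".
    for name in RARE_BROWSERS:
        if name.lower() in ua:
            return name
    if user_agent.startswith("Mozilla"):
        return "Mozilla-style"
    return None


if __name__ == "__main__":
    print(classify_browser("Lynx/2.8.4rel.1"))
    print(classify_browser("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
```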