3 2a"@sXddlZddlZddlZdgZejddZGdddZGdddZGdd d Z dS) NRobotFileParser RequestRatezrequests secondsc@sfeZdZdddZddZddZdd Zd d Zd d ZddZ ddZ ddZ ddZ ddZ dS)rcCs,g|_d|_d|_d|_|j|d|_dS)NFr)entries default_entry disallow_all allow_allset_url last_checked)selfurlr 0/opt/alt/python36/lib64/python3.6/robotparser.py__init__s  zRobotFileParser.__init__cCs|jS)N)r )r r r rmtime$szRobotFileParser.mtimecCsddl}|j|_dS)Nr)timer )r rr r rmodified-szRobotFileParser.modifiedcCs&||_tjj|dd\|_|_dS)N)r urllibparseurlparsehostpath)r r r r rr 5szRobotFileParser.set_urlcCsytjj|j}WnRtjjk rd}z2|jdkr:d|_n|jdkrT|jdkrTd|_WYdd}~XnX|j }|j |j dj dS)NTiizutf-8)rr) rZrequestZurlopenr errorZ HTTPErrorcoderrreadrdecode splitlines)r ferrrawr r rr:s zRobotFileParser.readcCs,d|jkr|jdkr(||_n |jj|dS)N*) useragentsrrappend)r entryr r r _add_entryGs  zRobotFileParser._add_entrycCs6d}t}|jx|D]}|sT|dkr8t}d}n|dkrT|j|t}d}|jd}|dkrr|d|}|j}|sq|jdd}t|dkr|djj|d<tj j |dj|d<|ddkr|dkr|j|t}|j j |dd}q|ddkr4|dkr|j j t|ddd}q|dd krh|dkr|j j t|dd d}q|dd kr|dkr|djjrt|d|_d}q|dd kr|dkr|djd }t|dkr|djjr|djjrtt|dt|d|_d}qW|dkr2|j|dS)Nrr#:z user-agentZdisallowFZallowTz crawl-delayz request-rate/)Entryrr(findstripsplitlenlowerrrunquoter%r& rulelinesRuleLineisdigitintdelayrreq_rate)r linesstater'lineiZnumbersr r rrPsd             zRobotFileParser.parsecCs|jr dS|jrdS|jsdStjjtjj|}tjjdd|j|j |j |j f}tjj |}|sfd}x"|j D]}|j|rn|j|SqnW|jr|jj|SdS)NFTrr,)rrr rrrr3 urlunparserparamsZqueryZfragmentquoter applies_to allowancer)r useragentr Z parsed_urlr'r r r can_fetchs$    zRobotFileParser.can_fetchcCs4|js dSx|jD]}|j|r|jSqW|jjS)N)rrrAr8r)r rCr'r r r crawl_delays    zRobotFileParser.crawl_delaycCs4|js dSx|jD]}|j|r|jSqW|jjS)N)rrrAr9r)r rCr'r r r request_rates    zRobotFileParser.request_ratecCs0|j}|jdk r||jg}djtt|dS)N )rrjoinmapstr)r rr r r__str__s  zRobotFileParser.__str__N)r)__name__ __module__ __qualname__rrrr rr(rrDrErFrKr r r rrs    Cc@s$eZdZddZddZddZdS)r5cCs>|dkr| rd}tjjtjj|}tjj||_||_dS)NrT)rrr>rr@rrB)r rrBr r rrs zRuleLine.__init__cCs|jdkp|j|jS)Nr$)r startswith)r filenamer r rrAszRuleLine.applies_tocCs|jr dndd|jS)NZAllowZDisallowz: )rBr)r r r rrKszRuleLine.__str__N)rLrMrNrrArKr r r rr5sr5c@s,eZdZddZddZddZddZd S) r-cCsg|_g|_d|_d|_dS)N)r%r4r8r9)r r r rrszEntry.__init__cCsg}x|jD]}|jd|q W|jdk r@|jd|j|jdk rj|j}|jd|jd|j|jtt|j |jddj |S)Nz User-agent: z Crawl-delay: zRequest-rate: r,rrG) r%r&r8r9ZrequestsZsecondsextendrIrJr4rH)r retagentZrater r rrKs    z Entry.__str__cCsF|jddj}x.|jD]$}|dkr*dS|j}||krdSqWdS)Nr,rr$TF)r0r2r%)r rCrSr r rrAs zEntry.applies_tocCs$x|jD]}|j|r|jSqWdS)NT)r4rArB)r rPr<r r rrBs   zEntry.allowanceN)rLrMrNrrKrArBr r r rr-s  r-) collectionsZ urllib.parserZurllib.request__all__ namedtuplerrr5r-r r r r s 2