3 Pfn@sXddlmZddlmZddlmZmZmZddlm Z m Z m Z m Z GdddeZ dS)) CharSetProber)CodingStateMachine)LanguageFilter ProbingState MachineState) HZ_SM_MODELISO2022CN_SM_MODELISO2022JP_SM_MODELISO2022KR_SM_MODELcsVeZdZdZdfdd ZfddZeddZed d Zd d Z d dZ Z S)EscCharSetProberz This CharSetProber uses a "code scheme" approach for detecting encodings, whereby easily recognizable escape or shift sequences are relied on to identify these encodings. Ncstt|j|dg|_|jtj@rD|jjtt |jjtt |jtj @r`|jjtt |jtj @r||jjttd|_d|_d|_d|_|jdS)N) lang_filter)superr __init__ coding_smr rZCHINESE_SIMPLIFIEDappendrrrZJAPANESEr ZKOREANr active_sm_count_detected_charset_detected_language_statereset)selfr ) __class__/usr/lib/python3.6/escprober.pyr*s   zEscCharSetProber.__init__csNtt|jx"|jD]}|s qd|_|jqWt|j|_d|_d|_dS)NT) r r rractivelenrrr)rr)rrrr:s   zEscCharSetProber.resetcCs|jS)N)r)rrrr charset_nameEszEscCharSetProber.charset_namecCs|jS)N)r)rrrrlanguageIszEscCharSetProber.languagecCs|jr dSdSdS)NgGz?g)r)rrrrget_confidenceMszEscCharSetProber.get_confidencecCsx|D]}x|jD]}| s|j r&q|j|}|tjkrhd|_|jd8_|jdkrtj|_|j Sq|tj krtj |_|j |_ |j|_|j SqWqW|j S)NFr)rrZ next_staterZERRORrrZNOT_MErstateZITS_MEZFOUND_ITZget_coding_state_machinerrr)rZbyte_strcrZ coding_staterrrfeedSs"       zEscCharSetProber.feed)N) __name__ __module__ __qualname____doc__rrpropertyrrrr" __classcell__rr)rrr #s  r N)Z charsetproberrZcodingstatemachinerZenumsrrrZescsmrrr r r rrrrs