bg0'dddlmZmZddlmZmZmZddlmZm Z m Z ddl m Z m Z mZddlmZmZmZddlmZmZmZddlmZGd d ZGd d eZGd deZGddeZGddeZGddeZGddeZGddeZ dS))TupleUnion)BIG5_CHAR_TO_FREQ_ORDERBIG5_TABLE_SIZEBIG5_TYPICAL_DISTRIBUTION_RATIO)EUCKR_CHAR_TO_FREQ_ORDEREUCKR_TABLE_SIZE EUCKR_TYPICAL_DISTRIBUTION_RATIO)EUCTW_CHAR_TO_FREQ_ORDEREUCTW_TABLE_SIZE EUCTW_TYPICAL_DISTRIBUTION_RATIO)GB2312_CHAR_TO_FREQ_ORDERGB2312_TABLE_SIZE!GB2312_TYPICAL_DISTRIBUTION_RATIO)JIS_CHAR_TO_FREQ_ORDERJIS_TABLE_SIZEJIS_TYPICAL_DISTRIBUTION_RATIO)JOHAB_TO_EUCKR_ORDER_TABLEceZdZdZdZdZdZddZddZd e e e fd e ddfd Z defd Zdefd Zde e e fde fdZdS)CharDistributionAnalysisigGz?g{Gz?returnNct|_d|_d|_d|_d|_d|_|dS)NrgF)tuple_char_to_freq_order _table_sizetypical_distribution_ratio_done _total_chars _freq_charsresetselfs O/opt/cloudlinux/venv/lib64/python3.11/site-packages/chardet/chardistribution.py__init__z!CharDistributionAnalysis.__init__@sJ5:GG  +.'  c0d|_d|_d|_dS)zreset analyser, clear any stateFrN)rr r!r#s r%r"zCharDistributionAnalysis.resetOs! r'charchar_lenc|dkr||}nd}|dkr>|xjdz c_||jkr%d|j|kr|xjdz c_dSdSdSdS)z"feed a character with known lengthrriN) get_orderr rrr!)r$r)r*orders r%feedzCharDistributionAnalysis.feedXs q==NN4((EEE A::    "  t'''1%888$$)$$$$ :('88r'c|jdks|j|jkr|jS|j|jkr,|j|j|jz |jzz }||jkr|S|jS)z(return confidence based on existing datar)r r!MINIMUM_DATA_THRESHOLDSURE_NOrSURE_YES)r$rs r%get_confidencez'CharDistributionAnalysis.get_confidencefs|   ! !T%59T%T%T<    0 0 0 "T%559XXA4=  }r'c"|j|jkSN)r ENOUGH_DATA_THRESHOLDr#s r%got_enough_dataz(CharDistributionAnalysis.got_enough_dataws 4#===r'_cdS)Nr-)r$r;s r%r.z"CharDistributionAnalysis.get_order|s rr'rN)__name__ __module__ __qualname__r9r4r3r2r&r"rbytes bytearrayintr0floatr6boolr:r.r=r'r%rr:s HG     *ui/0 *C *D * * * *">>>>> 5 !12sr'rc@eZdZdfd ZdeeefdefdZxZ S)EUCTWDistributionAnalysisrNctt|_t|_t |_dSr8)superr&r rr rrrr$ __class__s r%r&z"EUCTWDistributionAnalysis.__init__7 #; +*J'''r'byte_strcJ|d}|dkrd|dz z|dzdz SdS)Nr^rr-r=r$rN first_chars r%r.z#EUCTWDistributionAnalysis.get_order; a[   d*+hqk9D@ @rr'r> r?r@rAr&rrBrCrDr. __classcell__rLs@r%rHrHoKKKKKK %y(8"9cr'rHc@eZdZdfd ZdeeefdefdZxZ S)EUCKRDistributionAnalysisrNctt|_t|_t |_dSr8rJr&r rr rr rrKs r%r&z"EUCKRDistributionAnalysis.__init__rMr'rNcJ|d}|dkrd|dz z|dzdz SdS)NrrQrrRr-r=rSs r%r.z#EUCKRDistributionAnalysis.get_orderrUr'r>rVrXs@r%r[r[rYr'r[c@eZdZdfd ZdeeefdefdZxZ S)JOHABDistributionAnalysisrNctt|_t|_t |_dSr8r]rKs r%r&z"JOHABDistributionAnalysis.__init__rMr'rNc||d}d|cxkrdkr&nn#|dz|dz}tj|dSdS)Nrrr-)rget)r$rNrTcodes r%r.z#JOHABDistributionAnalysis.get_ordersXa[ : $ $ $ $ $ $ $ $ $#hqk1D-1$;; ;rr'r>rVrXs@r%rarasoKKKKKK %y(8"9cr'rac@eZdZdfd ZdeeefdefdZxZ S)GB2312DistributionAnalysisrNctt|_t|_t |_dSr8)rJr&rrrrrrrKs r%r&z#GB2312DistributionAnalysis.__init__s7 #< ,*K'''r'rNcZ|d|d}}|dkr|dkrd|dz z|zdz SdS)Nrrr_rRrQr-r=r$rNrT second_chars r%r.z$GB2312DistributionAnalysis.get_ordersI #+1+x{K $  [D%8%8d*+k9D@ @rr'r>rVrXs@r%rjrjsoLLLLLL %y(8"9cr'rjc@eZdZdfd ZdeeefdefdZxZ S)Big5DistributionAnalysisrNctt|_t|_t |_dSr8)rJr&rrrrrrrKs r%r&z!Big5DistributionAnalysis.__init__s7 #: **I'''r'rNc||d|d}}|dkr%|dkrd|dz z|zdz dzSd|dz z|zdz SdS) NrrrR?@r-r=rms r%r.z"Big5DistributionAnalysis.get_ordersi #+1+x{K   d""j4/0;>EJJ*t+,{:TA Arr'r>rVrXs@r%rprpsoJJJJJJ %y(8"9 c        r'rpc@eZdZdfd ZdeeefdefdZxZ S)SJISDistributionAnalysisrNctt|_t|_t |_dSr8rJr&rrrrrrrKs r%r&z!SJISDistributionAnalysis.__init__7 #9 )*H'''r'rNc|d|d}}d|cxkrdkr nn d|dz z}nd|cxkrdkrnn d|dz dzz}nd S||zd z }|d krd }|S) Nrrr-rvr=)r$rNrTrnr/s r%r.z"SJISDistributionAnalysis.get_orders #+1+x{K : % % % % % % % % %:,-EE Z ' ' ' '4 ' ' ' ' ':,r12EE2 #d*   E r'r>rVrXs@r%rxrxsoIIIIII %y(8"9cr'rxc@eZdZdfd ZdeeefdefdZxZ S)EUCJPDistributionAnalysisrNctt|_t|_t |_dSr8rzrKs r%r&z"EUCJPDistributionAnalysis.__init__r{r'rNcJ|d}|dkrd|dz z|dzdz SdS)NrrQrRrr-r=)r$rNr)s r%r.z#EUCJPDistributionAnalysis.get_orders8 { 4<<% 3d: :rr'r>rVrXs@r%rrsoIIIIII %y(8"9cr'rN)!typingrrbig5freqrrr euckrfreqr r r euctwfreqr r r gb2312freqrrrjisfreqrrr johabfreqrrrHr[rarjrprxrr=r'r%rs8      211111GGGGGGGGT 8$ 8$      8   !9$7(72 8r'