3 Jsz*SchedulerUtils.__init__..css|]}tt||fVqdS)N)r(r))r*r,rrrr-Ls)dict_dict_schedcfg2schedconstitems_dict_schedcfg2numvalues_dict_num2schedconst)r rrrrHszSchedulerUtils.__init__cCs |jj|S)N)r1get)r str_schedulerrrrsched_cfg_to_numNszSchedulerUtils.sched_cfg_to_numcCs |jj|S)N)r3r4)r r rrrsched_num_to_constRsz!SchedulerUtils.sched_num_to_constcCs tj|S)N)r)sched_getscheduler)r pidrrr get_schedulerUszSchedulerUtils.get_schedulercCstj||tj|dS)N)r)sched_setscheduler sched_param)r r9schedpriorrr set_schedulerXszSchedulerUtils.set_schedulercCs tj|S)N)r)sched_getaffinity)r r9rrr get_affinity[szSchedulerUtils.get_affinitycCstj||dS)N)r)sched_setaffinity)r r9r rrr set_affinity^szSchedulerUtils.set_affinitycCs tj|jS)N)r)sched_getparamsched_priority)r r9rrr get_priorityaszSchedulerUtils.get_prioritycCs tj|S)N)r)sched_get_priority_min)r r=rrrget_priority_mindszSchedulerUtils.get_priority_mincCs tj|S)N)r)sched_get_priority_max)r r=rrrget_priority_maxgszSchedulerUtils.get_priority_maxN)rrr__doc__r/rr6r7r:r?rArCrFrHrJrrrrr;s rc@sPeZdZdZddZddZddZdd Zd d Zd d Z ddZ ddZ dS)SchedulerUtilsSchedutilszE Class encapsulating scheduler implementation in schedutils module cCs8tdd|jjD|_tdd|jjD|_dS)Ncss |]\}}|tt|fVqdS)N)r( schedutils)r*r+r,rrrr-psz4SchedulerUtilsSchedutils.__init__..css|]}tt||fVqdS)N)r(rM)r*r,rrrr-rs)r.r/r0r1r2r3)r rrrrnsz!SchedulerUtilsSchedutils.__init__cCs tj|S)N)rMr:)r r9rrrr:tsz&SchedulerUtilsSchedutils.get_schedulercCstj|||dS)N)rMr?)r r9r=r>rrrr?wsz&SchedulerUtilsSchedutils.set_schedulercCs tj|S)N)rMrA)r r9rrrrAzsz%SchedulerUtilsSchedutils.get_affinitycCstj||dS)N)rMrC)r r9r rrrrC}sz%SchedulerUtilsSchedutils.set_affinitycCs tj|S)N)rMrF)r r9rrrrFsz%SchedulerUtilsSchedutils.get_prioritycCs tj|S)N)rMrH)r r=rrrrHsz)SchedulerUtilsSchedutils.get_priority_mincCs tj|S)N)rMrJ)r r=rrrrJsz)SchedulerUtilsSchedutils.get_priority_maxN) rrrrKrr:r?rArCrFrHrJrrrrrLjsrLcseZdZdZfddZddZddZdd Zed d Z d d Z ddZ ddZ ddZ 
ddZddZddZddZddZddZdd!d"Zd#d$Zd%d&Zd'd(Zdd)d*Zd+d,Zd-d.Zd/d0Zd1d2Zd3d4Zd5d6Zd7d8Zd9d:Z d;d<Z!d=d>Z"dd?d@Z#dAdBZ$dCdDZ%fdEdFZ&dGdHZ'dIdJZ(dKdLZ)e*j+ffdMdN Z,dOdPZ-dQdRZ.fdSdTZ/dUdVZ0dWdXZ1dYdZZ2e3d[d d\d]d^Z4e3d_d d\d`daZ5e3dbd d\dcddZ6e3ded d\dfdgZ7e3dhd d\didjZ8e3dkd d\dldmZ9dndoZ:dpdqZ;drdsZdxdyZ?dzd{Z@d|d}ZAd~dZBddZCddZDddZEddZFe3dd ddddZGddZHddZIdddZJeKdddZLeMdddZNeKdddZOeMdddZPeKdddZQeMdddZReKdddZSeMdddZTeKdddZUeMdddZVeKdddZWeMdddZXeKdddZYeMdddZZeKdddZ[eMdddZ\eKdddZ]eMdddZ^eKddd„Z_eMdddĄZ`ZaS)SchedulerPlugina]- `scheduler`:: Allows tuning of scheduling priorities, process/thread/IRQ affinities, and CPU isolation. + To prevent processes/threads/IRQs from using certain CPUs, use the [option]`isolated_cores` option. It changes process/thread affinities, IRQs affinities and it sets `default_smp_affinity` for IRQs. The CPU affinity mask is adjusted for all processes and threads matching [option]`ps_whitelist` option subject to success of the `sched_setaffinity()` system call. The default setting of the [option]`ps_whitelist` regular expression is `.*` to match all processes and thread names. To exclude certain processes and threads use [option]`ps_blacklist` option. The value of this option is also interpreted as a regular expression and process/thread names (`ps -eo cmd`) are matched against that expression. Profile rollback allows all matching processes and threads to run on all CPUs and restores the IRQ settings prior to the profile application. + Multiple regular expressions for [option]`ps_whitelist` and [option]`ps_blacklist` options are allowed and separated by `;`. Quoted semicolon `\;` is taken literally. + .Isolate CPUs 2-4 ==== ---- [scheduler] isolated_cores=2-4 ps_blacklist=.*pmd.*;.*PMD.*;^DPDK;.*qemu-kvm.* ---- Isolate CPUs 2-4 while ignoring processes and threads matching `ps_blacklist` regular expressions. ==== The [option]`irq_process` option controls whether the scheduler plugin applies the `isolated_cores` parameter to IRQ affinities. 
The default value is `true`, which means that the scheduler plugin will move all possible IRQs away from the isolated cores. When `irq_process` is set to `false`, the plugin will not change any IRQ affinities.
====
The [option]`default_irq_smp_affinity` option controls the values *TuneD* writes to `/proc/irq/default_smp_affinity`. The file specifies the default affinity mask that applies to all non-active IRQs. Once an IRQ is allocated or activated, its affinity bitmask is set to the default mask.
+
The following values are supported:
+
--
`calc`::
Content of `/proc/irq/default_smp_affinity` is calculated from the `isolated_cores` parameter. The non-isolated cores are calculated as the complement of `isolated_cores`. Then the intersection of the non-isolated cores and the previous content of `/proc/irq/default_smp_affinity` is written to `/proc/irq/default_smp_affinity`. If the intersection is an empty set, only the non-isolated cores are written to `/proc/irq/default_smp_affinity`. This behavior is the default if the parameter `default_irq_smp_affinity` is omitted.
`ignore`::
*TuneD* will not touch `/proc/irq/default_smp_affinity`.
explicit cpulist::
The cpulist (such as 1,3-4) is unpacked and written directly to `/proc/irq/default_smp_affinity`.
--
+
.An explicit CPU list to set the default IRQ smp affinity to CPUs 0 and 2
====
----
[scheduler]
isolated_cores=1,3
default_irq_smp_affinity=0,2
----
====
To adjust scheduling policy, priority and affinity for a group of processes/threads, use the following syntax.
+
[subs="+quotes,+macros"]
----
group.__groupname__=__rule_prio__:__sched__:__prio__:__affinity__:__regex__
----
+
where `__rule_prio__` defines the internal *TuneD* priority of the rule. Rules are sorted by priority. This is needed so that inheritance can reorder previously defined rules. Rules with equal `__rule_prio__` should be processed in the order they were defined. However, this is Python interpreter dependent.
To disable an inherited rule for `__groupname__` use:
+
[subs="+quotes,+macros"]
----
group.__groupname__=
----
+
`__sched__` must be one of: *`f`* for FIFO, *`b`* for batch, *`r`* for round robin, *`o`* for other, *`*`* for no change.
+
`__affinity__` is the CPU affinity in hexadecimal. Use `*` for no change.
+
`__prio__` is the scheduling priority (see `chrt -m`).
+
`__regex__` is a Python regular expression. It is matched against the output of
+
[subs="+quotes,+macros"]
----
ps -eo cmd
----
+
Any given process name may match more than one group. In such a case, the priority and scheduling policy are taken from the last matching `__regex__`.
+
.Setting scheduling policy and priorities to kernel threads and watchdog
====
----
[scheduler]
group.kthreads=0:*:1:*:\[.*\]$
group.watchdog=0:f:99:*:\[watchdog.*\]
----
====
+
The scheduler plug-in uses a perf event loop to catch newly created processes. By default it listens to `perf.RECORD_COMM` and `perf.RECORD_EXIT` events. When the [option]`perf_process_fork` option is set to `true`, `perf.RECORD_FORK` events are listened to as well. In other words, child processes created by the `fork()` system call will be processed. Since child processes inherit CPU affinity from their parents, the scheduler plug-in usually does not need to explicitly process these events. As processing perf events can pose a significant CPU overhead, the [option]`perf_process_fork` option is set to `false` by default. Due to this, child processes are not processed by the scheduler plug-in.
+
The CPU overhead of the scheduler plugin can be mitigated by setting the scheduler [option]`runtime` option to `0`. This completely disables the dynamic scheduler functionality, and the perf events will not be monitored or acted upon. The disadvantage of this approach is that process/thread tuning will be done only at profile application.
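+
The `group.` rule syntax described above can be illustrated with a minimal Python sketch. It is not the plug-in's own code: the names `GroupRule`, `hex_to_cpulist`, `parse_group_rule`, and `effective_rule` are hypothetical, and the sketch assumes that the value is split on the first four colons (so that `__regex__` may itself contain colons) and that "last matching" refers to the order after sorting by `__rule_prio__`. Only a hexadecimal `__affinity__` or `*` is handled here.
+
```python
import re
from typing import List, NamedTuple, Optional

# Policy letters as listed in the documentation; "*" means "do not change".
SCHED_NAMES = {"f": "SCHED_FIFO", "b": "SCHED_BATCH", "r": "SCHED_RR", "o": "SCHED_OTHER"}


class GroupRule(NamedTuple):
    rule_prio: int
    sched: Optional[str]           # None means "do not change"
    prio: int
    affinity: Optional[List[int]]  # None means "do not change"
    regex: "re.Pattern"


def hex_to_cpulist(mask: str) -> List[int]:
    """Turn a hexadecimal affinity mask into a list of CPU numbers."""
    value = int(mask.replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def parse_group_rule(value: str) -> GroupRule:
    """Parse 'rule_prio:sched:prio:affinity:regex' (assumed split on 4 colons)."""
    rule_prio, sched, prio, affinity, regex = value.split(":", 4)
    return GroupRule(
        int(rule_prio),
        None if sched == "*" else SCHED_NAMES[sched],
        int(prio),
        None if affinity == "*" else hex_to_cpulist(affinity),
        re.compile(regex),
    )


def effective_rule(rules: List[GroupRule], cmdline: str) -> Optional[GroupRule]:
    """Return the last rule, in rule_prio order, whose regex matches cmdline."""
    matched = None
    for rule in sorted(rules, key=lambda r: r.rule_prio):  # stable sort
        if rule.regex.search(cmdline):
            matched = rule
    return matched
```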
+
.Disabling the scheduler dynamic functionality
====
----
[scheduler]
runtime=0
isolated_cores=1,3
----
====
+
NOTE: For perf events, a memory-mapped buffer is used. Under heavy load the buffer may overflow. In such cases the `scheduler` plug-in may start missing events and fail to process some newly created processes. Increasing the buffer size may help. The buffer size can be set with the [option]`perf_mmap_pages` option. The value of this parameter has to be expressed as a power of 2. If it is not a power of 2, the nearest higher power of 2 is calculated from it and this calculated value is used. If the [option]`perf_mmap_pages` option is omitted, the default kernel value is used.
+
The scheduler plug-in supports process/thread confinement using cgroups v1.
+
The [option]`cgroup_mount_point` option specifies the path where the cgroup filesystem is to be mounted, or where *TuneD* expects it to be mounted. If unset, `/sys/fs/cgroup/cpuset` is expected.
+
If the [option]`cgroup_groups_init` option is set to `1`, *TuneD* will create (and remove) all cgroups defined with the `cgroup*` options. This is the default behavior. If it is set to `0`, the cgroups need to be preset by other means.
+
If the [option]`cgroup_mount_point_init` option is set to `1`, *TuneD* will create (and remove) the cgroup mount point. It implies `cgroup_groups_init = 1`. If set to `0`, the cgroups mount point needs to be preset by other means. This is the default behavior.
+
The [option]`cgroup_for_isolated_cores` option is the cgroup name used for the [option]`isolated_cores` option functionality. For example, if a system has 4 CPUs, `isolated_cores=1` means that all processes/threads will be moved to CPUs 0,2-3. The scheduler plug-in will isolate the specified core by writing the calculated CPU affinity to the `cpuset.cpus` control file of the specified cgroup and move all the matching processes/threads to this group. If this option is unset, classic cpuset affinity using `sched_setaffinity()` will be used.
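+
The affinity calculation in the 4-CPU example above (`isolated_cores=1` giving CPUs 0,2-3) can be sketched in Python as follows. This is an illustration under stated assumptions, not the plug-in's implementation: the helper names `unpack_cpulist`, `pack_cpulist`, and `non_isolated` are hypothetical, and the sketch assumes that CPUs are numbered from 0 up to the number of present CPUs minus one.
+
```python
from typing import List, Set


def unpack_cpulist(cpulist: str) -> Set[int]:
    """Expand a cpulist such as '1,3-4' into the set {1, 3, 4}."""
    cpus: Set[int] = set()
    for part in cpulist.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pack_cpulist(cpus: Set[int]) -> str:
    """Collapse a set of CPU numbers back into cpulist notation."""
    ordered = sorted(cpus)
    ranges: List[str] = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next CPU number is consecutive.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        if i == j:
            ranges.append(str(ordered[i]))
        else:
            ranges.append("%d-%d" % (ordered[i], ordered[j]))
        i = j + 1
    return ",".join(ranges)


def non_isolated(isolated_cores: str, present_cpus: int) -> str:
    """Return the cpulist of CPUs that remain after removing the isolated ones."""
    remaining = set(range(present_cpus)) - unpack_cpulist(isolated_cores)
    return pack_cpulist(remaining)
```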
+
The [option]`cgroup.__cgroup_name__` option defines affinities for arbitrary cgroups. Even hierarchic cgroups can be used, but the hierarchy needs to be specified in the correct order. Also, *TuneD* does not do any sanity checks here, with the exception that it forces the cgroup to be under [option]`cgroup_mount_point`.
+
The syntax of the scheduler option starting with `group.` has been augmented to use `cgroup.__cgroup_name__` instead of the hexadecimal `__affinity__`. The matching processes will be moved to the cgroup `__cgroup_name__`. It is also possible to use cgroups which have not been defined by the [option]`cgroup.` option as described above, i.e. cgroups not managed by *TuneD*.
+
All cgroup names are sanitized by replacing all dots (`.`) with slashes (`/`). This is to prevent the plug-in from writing outside [option]`cgroup_mount_point`.
+
.Using cgroups v1 with the scheduler plug-in
====
----
[scheduler]
cgroup_mount_point=/sys/fs/cgroup/cpuset
cgroup_mount_point_init=1
cgroup_groups_init=1
cgroup_for_isolated_cores=group
cgroup.group1=2
cgroup.group2=0,2
group.ksoftirqd=0:f:2:cgroup.group1:ksoftirqd.*
ps_blacklist=ksoftirqd.*;rcuc.*;rcub.*;ktimersoftd.*
isolated_cores=1
----
Cgroup `group1` has the affinity set to CPU 2 and the cgroup `group2` to CPUs 0,2. Given a 4 CPU setup, the [option]`isolated_cores=1` option causes all processes/threads to be moved to CPU cores 0,2-3. Processes/threads that are blacklisted by the [option]`ps_blacklist` regular expression will not be moved. The scheduler plug-in will isolate the specified core by writing the CPU affinity 0,2-3 to the `cpuset.cpus` control file of the `group` and move all the matching processes/threads to this cgroup.
====
Option [option]`cgroup_ps_blacklist` allows excluding processes which belong to the blacklisted cgroups. The regular expression specified by this option is matched against cgroup hierarchies from `/proc/PID/cgroup`.
Cgroups v1 hierarchies from `/proc/PID/cgroup` are separated by commas (`,`) prior to regular expression matching. The following is an example of content against which the regular expression is matched: `10:hugetlb:/,9:perf_event:/,8:blkio:/`
+
Multiple regular expressions can be separated by a semicolon (`;`). The semicolon represents a logical 'or' operator.
+
.Cgroup-based exclusion of processes from the scheduler
====
----
[scheduler]
isolated_cores=1
cgroup_ps_blacklist=:/daemons\b
----
The scheduler plug-in will move all processes away from core 1 except processes which belong to cgroup '/daemons'. The '\b' is a regular expression metacharacter that matches a word boundary.
----
[scheduler]
isolated_cores=1
cgroup_ps_blacklist=\b8:blkio:
----
The scheduler plug-in will exclude all processes which belong to a cgroup with hierarchy-ID 8 and controller-list blkio.
====
Recent kernels moved some `sched_` and `numa_balancing_` kernel run-time parameters from `/proc/sys/kernel`, managed by the `sysctl` utility, to `debugfs`, typically mounted under `/sys/kernel/debug`. TuneD provides an abstraction mechanism for the following parameters via the scheduler plug-in: [option]`sched_min_granularity_ns`, [option]`sched_latency_ns`, [option]`sched_wakeup_granularity_ns`, [option]`sched_tunable_scaling`, [option]`sched_migration_cost_ns`, [option]`sched_nr_migrate`, [option]`numa_balancing_scan_delay_ms`, [option]`numa_balancing_scan_period_min_ms`, [option]`numa_balancing_scan_period_max_ms` and [option]`numa_balancing_scan_size_mb`. Based on the kernel used, TuneD will write the specified value to the correct location.
+
.Set tasks' "cache hot" value for migration decisions.
====
----
[scheduler]
sched_migration_cost_ns=500000
----
On old kernels, this is equivalent to:
----
[sysctl]
kernel.sched_migration_cost_ns=500000
----
that is, value `500000` will be written to `/proc/sys/kernel/sched_migration_cost_ns`.
However, on more recent kernels, the value `500000` will be written to `/sys/kernel/debug/sched/migration_cost_ns`. ==== c stt|j||||||||d|_tj|_ttj|_ |dk rh|j tj tj|_t|j tj tj|_ t|_d|_i|_d|_d|_d|_tj|_|jdd|_d|_|jdd|_d|_y t|_Wntk rt |_YnXdS)NTz.*r )Z command_nameirq)!superrNrZ_has_dynamic_optionsconstsZCFG_DEF_DAEMON_daemonintZCFG_DEF_SLEEP_INTERVAL_sleep_intervalget_boolZ CFG_DAEMONr4ZCFG_SLEEP_INTERVALrr_secure_boot_hint_sched_knob_paths_cache _ps_whitelist _ps_blacklist_cgroup_ps_blacklist_reperfZcpu_map_cpusZ _storage_key_scheduler_storage_key _irq_process_irq_storage_key_evlistr_scheduler_utilsAttributeErrorrL) r Zmonitor_repositoryZstorage_factoryZhardware_inventoryZdevice_matcherZdevice_matcher_udevZplugin_instance_factoryZ global_cfg variables) __class__rrrs0     zSchedulerPlugin.__init__cCsT|dkr dSy t|}Wntk r,dSX|dkr:dStdtjtj|dS)Nr)rT ValueErrormathZceillog)r Z mmap_pagesZmprrr_calc_mmap_pagess z SchedulerPlugin._calc_mmap_pagescsd|_d|_d|_d|_jjji_tjdkr^t j dj i_jj jt _d_d_d_tjfdd|jjD_|j|_jj|jd}j|}|dkrt jd|d}|dk rt||krt j d ||fx(|jD]}jj|j||j|<qWjj|jjd d d krHd|_tj |_!j"r|jryt#j$|_%t#j&t#j't#j(d d ddd d t#j)t#j*Bd }|j+j,|j%dt#j-j,|j%|_|jj.||dkr|jj/n|jj/|dWnd|_YnXdS)NFTrz0recovering scheduling settings from previous runcsJg|]B\}}|dddkrt|dkrj|ddjj|fqS)Nzcgroup.)len_sanitize_cgroup_path _variablesexpand)r*optionr )r rr sz2SchedulerPlugin._instance_init..perf_mmap_pageszKInvalid 'perf_mmap_pages' value specified: '%s', using default kernel valuezL'perf_mmap_pages' value has to be power of two, specified: '%s', using: '%d'Zruntimer0) typeconfigZtaskcommmmapZfreqZ wakeup_eventsZ watermarkZ sample_type)Zcpusthreads)Zpages)0raZ_has_dynamic_tuningZ_has_static_tuning_runtime_tuning_storager4r^_scheduler_originalrlriinfo_restore_ps_affinityunsetr._cgroups_original_affinityr_cgroup_affinity_initialized_cgroup collections OrderedDictoptionsr0_cgroups _schedulerrnrorjerrorstrrrV threadingZEvent _terminaterSr\Z thread_mapZ_threadsevselZ TYPE_SOFTWAREZCOUNT_SW_DUMMYZ 
SAMPLE_TIDZ SAMPLE_CPUopenr]Zevlistaddrw)r instanceZperf_mmap_pages_rawrrr+rr)r r_instance_inits^          zSchedulerPlugin._instance_initcCs*|jr&x|jjD]}tj|jqWdS)N)ra get_pollfdr)closer,)r rfdrrr_instance_cleanupsz!SchedulerPlugin._instance_cleanupcCs4dtjdddddddddddddddddddddS)NFTcalcZfalse)isolated_corescgroup_mount_pointcgroup_mount_point_initcgroup_groups_initcgroup_for_isolated_corescgroup_ps_blacklist ps_whitelist ps_blacklist irq_processdefault_irq_smp_affinityrrperf_process_forksched_min_granularity_nssched_latency_nssched_wakeup_granularity_nssched_tunable_scalingsched_migration_cost_nssched_nr_migratenuma_balancing_scan_delay_ms!numa_balancing_scan_period_min_ms!numa_balancing_scan_period_max_msnuma_balancing_scan_size_mb)rRZDEF_CGROUP_MOUNT_POINT)clsrrr_get_config_optionss,z#SchedulerPlugin._get_config_optionscCs|dk rt|jddSdS)N./)rreplace)r rrrrrm9sz%SchedulerPlugin._sanitize_cgroup_pathcCs>t|tjs|}tj|}tj|}|j|r:d|d}|S)N[]) isinstanceprocfsprocessZprocess_cmdline _is_kthread)r rr9rrrr _get_cmdline=s     zSchedulerPlugin._get_cmdlinecCstj}|ji}x|jD]}yN|j|}|d}|||<d|krnx&|djD]}|j|}|||<qTWWqttfk r}z$|jtj ks|jtj krwnWYdd}~XqXqW|S)Nr9rx) rpidstatsreload_threadsr2rkeysOSErrorIOErrorerrnoENOENTESRCH)r ps processesprocrr9errr get_processesGs$    zSchedulerPlugin.get_processescCs@|jj|}|jj|}|jj|}tjd|||f||fS)Nz8Read scheduler policy '%s' and priority '%d' of PID '%d')rbr:r7rFridebug)r r9r sched_strr rrr_get_rt`s    zSchedulerPlugin._get_rtcCs|jj|}tjd|||fyB|jj|}|jj|}||ksJ||kr`tjd||||fWn4ttfk r}ztjd|WYdd}~XnXy|jj |||Wn`ttfk r }z>t |dr|j t j krtjd|ntjd||fWYdd}~XnXdS)NzBSetting scheduler policy to '%s' and priority to '%d' of PID '%d'.z9Priority for %s must be in range %d - %d. 
'%d' was given.z(Failed to get allowed priority range: %srzAFailed to set scheduling parameters of PID %d, the task vanished.z1Failed to set scheduling parameters of PID %d: %s) rbr7rirrHrJr SystemErrorrr?hasattrrr)r r9r=r>rZprio_minZprio_maxrrrr_set_rths*    zSchedulerPlugin._set_rtcCs|ddtjj@dkS)Nstatflagsr)rZpidstatZ PF_KTHREAD)r rrrrrszSchedulerPlugin._is_kthreadcCs yjtj|}|djrd|dddkr8tjd|n(|j|rRtjd|ntjd|dSdSWnttfk r}zF|j t j ks|j t j krtjd |d Stj d ||fd SWYdd}~Xn8t tfk r}ztj d ||fdSd}~XnXdS)NrstateZzYAffinity of zombie task with PID %d cannot be changed, the task's affinity mask is fixed.z[Affinity of kernel thread with PID %d cannot be changed, the task's affinity mask is fixed.zRAffinity of task with PID %d cannot be changed, the task's affinity mask is fixed.rrz6Failed to get task info for PID %d, the task vanished.z&Failed to get task info for PID %d: %srfr)rrZis_bound_to_cpurirrwarnrrrrrrrcKeyError)r r9rrrrr_affinity_changeables2       z$SchedulerPlugin._affinity_changeablec Cs\y|j|}Wn(tk r6t|j}||j|<YnX|jdkrX|jdkrX||_||_dS)N)r{rrrr r )r r9r r paramsrrr_store_orig_process_rts z&SchedulerPlugin._store_orig_process_rtcCsd}|dkr|dkr|Sy:|j|\}}|dkr4|}|j||||j|||Wntttfk r}zTt|dr|jtjkrtj d|||j kr|j |=d}ntj d||fWYdd}~XnX|S)NTrz=Failed to read scheduler policy of PID %d, the task vanished.FzcRefusing to set scheduler and priority of PID %d, reading original scheduling parameters failed: %s) rrrrrrrrrirr{r)r r9r=r>contZ prev_schedZ prev_priorrrr_tune_process_rts& z SchedulerPlugin._tune_process_rtcCst|dddkS)Nrkzcgroup.)r)r r rrr_is_cgroup_affinitysz#SchedulerPlugin._is_cgroup_affinityFc Csby|j|}Wn(tk r6t|j}||j|<YnX|jdkr^|jdkr^|rX||_n||_dS)N)r{rrrr r )r r9r is_cgrouprrrr_store_orig_process_affinitys z,SchedulerPlugin._store_orig_process_affinityc Cspxj|jjdtjt|dfddjdD]@}y&|jdddd}|dkrP|Sd Stk rfYq(Xq(Wd S) Nz%s/%s/%sr T)no_error z:cpuset:rrOr)r read_filerRZPROCFS_MOUNT_POINTrsplit IndexError)r r9lr rrr_get_cgroup_affinitys, 
z$SchedulerPlugin._get_cgroup_affinitycCsB|j|}|j}|dkr$d||f}|jjd|t|dddS)Nrz%s/%sz%s/tasksT)r)rm_cgroup_mount_pointr write_to_filer)r r9r pathrrr _set_cgroups   zSchedulerPlugin._set_cgroupcCs,|dd}t|t o"t|dk}||fS)Nrkr)rlistrl)r r rrrr_parse_cgroup_affinitys z&SchedulerPlugin._parse_cgroup_affinityc Csd}|dkr|Syd|j|\}}|r<|j|}|j||n(|j|}|rX|j|||}|j|||j|||Wntttfk r}zTt |dr|j t j krt j d|||jkr|j|=d}nt jd||fWYdd}~XnX|S)NTrz5Failed to read affinity of PID %d, the task vanished.FzLRefusing to set CPU affinity of PID %d, reading original affinity failed: %s)rrr _get_affinity_get_intersect_affinity _set_affinityrrrrrrrirr{r) r r9r intersectrrr prev_affinityrrrr_tune_process_affinitys4     z&SchedulerPlugin._tune_process_affinitycCsF|j|||}|sdS|j||}| s2||jkr6dS||j|_dS)N)rrr{r)r r9rr=r>r rrrr _tune_processs zSchedulerPlugin._tune_processc Csf|jj|}|dkr.|dkr.tjd|dSy t|}Wn"tk r\tjd|dSX||fS)Nrz>Invalid scheduler: %s. Scheduler and priority will be ignored.z=Invalid priority: %s. Scheduler and priority will be ignored.)NN)NN)rbr6rirrTrg)r r5Z str_priorityr r rrr_convert_sched_paramss  z%SchedulerPlugin._convert_sched_paramscCsD|dkrd}n2|j|r|}n"|jj|}|s@tjd|d}|S)Nrz)Invalid affinity: %s. 
It will be ignored.)rr hex2cpulistrir)r Z str_affinityr rrr_convert_affinity+s  z!SchedulerPlugin._convert_affinitycCs6|\}}}}}|j||\}}|j|}|||||fS)N)rr)r vals rule_prior r r regexrrr_convert_sched_cfg8s   z"SchedulerPlugin._convert_sched_cfgcCsd|j|f}ytj|tjWn4tk rT}ztjd||fWYdd}~XnX|jj d|df|jj d|jdfddddstjd|dS)Nz%s/%sz Unable to create cgroup '%s': %sz cpuset.memsT)rz3Unable to initialize 'cpuset.mems ' for cgroup '%s') rr)mkdirrRDEF_CGROUP_MODErrirrrr)r r rrrrr_cgroup_create_group?s$z$SchedulerPlugin._cgroup_create_groupcCs@|jdk r"|j|jkr"|j|jx|jD]}|j|q*WdS)N)rrr)r cgrrr_cgroup_initialize_groupsJs  z)SchedulerPlugin._cgroup_initialize_groupscCstjdytj|jtjWn0tk rN}ztjd|WYdd}~XnX|j j dddddd|jg\}}|dkrtjd |jdS) NzInitializing cgroups settingsz'Unable to create cgroup mount point: %sZmountz-tr z-oZcpusetrzUnable to mount '%s') rirr)makedirsrrRrrrrexecute)r rretoutrrr_cgroup_initializePs   z"SchedulerPlugin._cgroup_initializecCsHytj|Wn4tk rB}ztjd||fWYdd}~XnXdS)Nz#Unable to remove directory '%s': %s)r)rmdirrrir)r r rrrr _remove_dirZszSchedulerPlugin._remove_dircCsXx&t|jD]}|jd|j|fq W|jdk rT|j|jkrT|jd|j|jfdS)Nz%s/%s)reversedrrrr)r rrrr_cgroup_finalize_groups`sz'SchedulerPlugin._cgroup_finalize_groupscCsltjd|jjd|jg\}}|dkr.cs6g|].\}}tjd|rt|dkr|j|fqS)zgroup\.)rematchrlr)r*rpr)r rrrqs cSs |ddS)Nrrr)Z option_valsrrrsz8SchedulerPlugin._instance_apply_static..)keyz(error compiling regular expression: '%s'cs(g|] \}}tj|dk r||fqS)N)rsearch)r*r9r)r%rrrqsc s$g|]\}}||ffqSrr)r*r9r)r rpr rr rrrqsz(?}ztjd|dSd}~XnXx|jjD]x\}}||ksL|||jkrlqL|jdk r|j dk r|j ||j|j |j dk r|j ||j qL|j dk rL|j||j qLWi|_|jj|jdS)NzKerror unapplying tuning, cannot get information about running processes: %s)rrrrirr{r0rr r rr rr rrzr~r^)r rrr9Z orig_paramsrrrr}s&      z$SchedulerPlugin._restore_ps_affinitycCsttj}d}xr|dkr|dkr|jjd|j|dfddd}|d krvx.|jdD] }|jjd |jdf|dd qRW|d 8}qW|dkrtj d |dS)N rOrz%s/%s/%sZtasksT)rrrz%s/%s)rrz(Unable to cleanup tasks from cgroup '%s')rOr#) rTrRZCGROUP_CLEANUP_TASKS_RETRYrrrrrrir)r r 
Zcntdatarrrr_cgroup_cleanup_tasks_ones    z)SchedulerPlugin._cgroup_cleanup_tasks_onecCs@|jdk r"|j|jkr"|j|jx|jD]}|j|q*WdS)N)rrr%)r rrrr_cgroup_cleanup_taskss  z%SchedulerPlugin._cgroup_cleanup_taskscsptt|j|||jr2|jr2|jj|jj|j |j |j |j sV|j r^|j|j rl|jdS)N)rQrN_instance_unapply_staticrSryrrr!joinr}r r&rrrr)r rZrollback)rerrr's    z(SchedulerPlugin._instance_unapply_staticcCstjd|d|j|df}|jj|ddd}|dkr|jtjks<|jtjkrLtjd|ntjd||fdSd}~XnX|j j |j ||}|dk r||j krtjd||t |f|\}}} |j||||| |jj|j|j dS)Nz3Failed to get cmdline of PID %d, the task vanished.z#Failed to get cmdline of PID %d: %sz-tuning new process '%s' with PID '%d' by '%s')rrrrrrrirrrZ re_lookuprr{rrrzrr^) r rr9r%rrvr=r>r rrr_add_pid$s$       zSchedulerPlugin._add_pidcCs6||jkr2|j|=tjd||jj|j|jdS)Nz)removed PID %d from the rollback database)r{rirrzrr^)r rr9rrr _remove_pid9s   zSchedulerPlugin._remove_pidc Cs|jj|j}tj}|jj}x|D]}|j|q&Wx|jj st |j|j ddkr:|jj r:d}x|rd}xt|j D]j}|jj |}|r~d}|jtjks|jr|jtjkr|j|t|j|q~|jtjkr~|j|t|jq~WqnWq:WdS)NirTF)rZre_lookup_compilerselectpollrarregisterrZis_setrlrUr]Z read_on_cpurtr\Z RECORD_COMM_perf_process_fork_valueZ RECORD_FORKr2rTtidZ RECORD_EXITr3) r rr%r5ZfdsrZ read_eventsZcpuZeventrrrr @s&   $    zSchedulerPlugin._thread_coder) per_devicecCs:|rdS|r6|dk r6djddtjdt|D|_dS)N|cSsg|] }d|qS)z(%s)r)r*r1rrrrq_sz8SchedulerPlugin._cgroup_ps_blacklist..z(?.z(?.z(?)_default_irq_smp_affinity_valuercpulist_unpack)r r;rr<r0rrr_default_irq_smp_affinityys  z)SchedulerPlugin._default_irq_smp_affinityrcCs*|rdS|r&|dk r&|jj|dk|_dS)Nr )rrVr7)r r;rr<r0rrr_perf_process_forks z"SchedulerPlugin._perf_process_forkcCs"|jj|}tjd||f|S)NzRead affinity '%s' of PID %d)rbrArir)r r9resrrrrs zSchedulerPlugin._get_affinitycCstjd||fy|jj||dSttfk r}zXt|dr`|jtjkr`tjd|n.|j |}|dksz|d krtj d|||fdSd}~XnXdS) Nz'Setting CPU affinity of PID %d to '%s'.Trz4Failed to set affinity of PID %d, the task vanished.rrfz,Failed to set affinity of PID %d to '%s': %sFr) rirrbrCrrrrrrr)r r9r rrCrrrrs  zSchedulerPlugin._set_affinitycCs"t|jt|}|rt|S|S)N)r 
intersectionr)r Z affinity1Z affinity2Z affinity3Zaffrrrrsz'SchedulerPlugin._get_intersect_affinityc s>fdd|D}jdkr.fdd|D}jdkrJfdd|D}tdd|D}x|D]}yj||}Wnbttfk r}zB|jtjks|jtjkrt j d|nt j d||fwbWYdd}~XnXj ||d d } | sqb|j kr |j |_| rbd ||krbj||d j|d qbWdS) Ncs(g|] }tjjj|dk r|qS)N)rrrY_get_stat_comm)r*r1)r rrrqs z9SchedulerPlugin._set_all_obj_affinity..rOcs(g|] }tjjj|dkr|qS)N)rrrZrE)r*r1)r rrrqs cs(g|] }tjjj|dkr|qS)N)rrr[_get_stat_cgroup)r*r1)r rrrqs cSsg|]}|j|fqSr)r9)r*r1rrrrqsz3Failed to get cmdline of PID %d, the task vanished.zARefusing to set affinity of PID %d, failed to get its cmdline: %sT)rrx)rZr[r.rrrrrrrirrrr{r_set_all_obj_affinityr2) r Zobjsr rxZpslZpsdr9rrrr)r rrGs6         z%SchedulerPlugin._set_all_obj_affinityc Cs(y|dStttfk r"dSXdS)NZcgroupsrO)rrr)r r&rrrrFsz SchedulerPlugin._get_stat_cgroupc Cs,y |ddStttfk r&dSXdS)NrrvrO)rrr)r r&rrrrEs zSchedulerPlugin._get_stat_commcCs`y&tj}|j|j|j|dWn4ttfk rZ}ztjd|WYdd}~XnXdS)NFzIerror applying tuning, cannot get information about running processes: %s) rrrrGr2rrrir)r r rrrrr_set_ps_affinitysz SchedulerPlugin._set_ps_affinitycCsyJ|jj|}tjd||fd|}t|d}|j|WdQRXdSttfk r}zLt|dr|j t j kr| rtjd|d Stj d|||fd SWYdd}~XnXdS) Nz&Setting SMP affinity of IRQ %s to '%s'z/proc/irq/%s/smp_affinitywrrz/Setting SMP affinity of IRQ %s is not supportedrfz0Failed to set SMP affinity of IRQ %s to '%s': %srrr) r cpulist2hexrirrwriterrrrZEIOr)r rPr Z restoring affinity_hexfilenamer#rrrr_set_irq_affinitys"   z!SchedulerPlugin._set_irq_affinitycCs|y>|jj|}tjd|tdd}|j|WdQRXWn8ttfk rv}ztjd||fWYdd}~XnXdS)Nz(Setting default SMP IRQ affinity to '%s'z/proc/irq/default_smp_affinityrIz2Failed to set default SMP IRQ affinity to '%s': %s) rrJrirrrKrrr)r r rLr#rrrr_set_default_irq_affinitys  z)SchedulerPlugin._set_default_irq_affinityc Cs"t}tj}x|jD]}y"||d}tjd||fWntk rTwYnX|j|||}t|t|krvq|j ||d}|dkr||j |<q|d kr|j j |qW|j jd}|j j|}|jdkr|j|||}n|jdkr|j}|jdkr|j|||_|jj|j|dS) Nr zRead affinity of IRQ '%s': '%s'Frrfz/proc/irq/default_smp_affinityrr>r)rr 
interruptsrrirrrrrNrrappendrrrr?rOrrzr`) r r irq_originalrrPrrrCZprev_affinity_hexrrr_set_all_irq_affinity s6        z%SchedulerPlugin._set_all_irq_affinitycCsn|jj|jd}|dkrdSx$|jjD]\}}|j||dq(W|jdkr\|j}|j||jj |jdS)NTr>) rzr4r`rr0rNr?rrOr~)r rRrPr rrr_restore_all_irq_affinity)s  z)SchedulerPlugin._restore_all_irq_affinitycCsFt|jt|}|r,tjtj||fntjtj|||f|S)N)rissubsetrir|rRr*rr+)r irq_descriptioncorrect_affinityr,rCrrr_verify_irq_affinity4s z$SchedulerPlugin._verify_irq_affinityc Cs|jj|jd}tj}d}x|jD]}||jkrR|rRd|}tjt j |q&y<||d}tj d||fd|} |j | ||sd}Wq&t k rw&Yq&Xq&W|jjd} |jj| }|jdkr|j d ||jd kr|n|j rd}|S) NTz-IRQ %s does not support changing SMP affinityr z#Read SMP affinity of IRQ '%s': '%s'zSMP affinity of IRQ %sFz/proc/irq/default_smp_affinityr>zdefault IRQ SMP affinityr)rzr4r`rrPrrrir|rRZ STR_VERIFY_PROFILE_VALUE_MISSINGrrXrrrrr?) r rWr0rRrrCrP descriptionr,rVZcurrent_affinity_hexrrr_verify_all_irq_affinity@s8     z(SchedulerPlugin._verify_all_irq_affinityr )r9r c Csd}d|_|dk rrt|jj|}t|j}|j|rRt||}|jj||_n |jj|j}tj d||f|sz|r|dkrdS|r|j r|j ||SdS|r|j r|j d|j } n|} |j| |j r|j|n|j r|jdS)NzJInvalid isolated_cores specified, '%s' does not match available cores '%s'Tz cgroup.%s)rrrr@r]rUrr)rirr_rZrr rHrSrT) r r;rr<r0r isolatedZpresentZstr_cpusZ ps_affinityrrr_isolated_cores_s6        zSchedulerPlugin._isolated_corescCsd|||f}|jj|}|r"|Sd||f}tjj|sv|dkrPd||f}nd|||f}d|}|jdkrvd|_||j|<|S)Nz%s_%s_%sz/proc/sys/kernel/%s_%srOz%s/%sz%s/%s/%sz/sys/kernel/debug/%sT)rXr4r)rexistsrW)r prefix namespaceknobrrrrr_get_sched_knob_paths     z$SchedulerPlugin._get_sched_knob_pathcCsJ|jj|j|||dd}|dkrFtjd||jrFtjdd|_|S)N)rzError reading '%s'zUThis may not work with Secure Boot or kernel_lockdown (this hint is logged only once)F)rrrbrirrW)r r_r`rar$rrr_get_sched_knobs zSchedulerPlugin._get_sched_knobcCsN|dkr dS|sJ|jj|j|||||r0tjgnddsJtjd||f|S)NF)rz Error writing value '%s' to '%s')rrrbrrrir)r r_r`rarsimremoverrr_set_sched_knobszSchedulerPlugin._set_sched_knobrcCs|jdddS)NrOr=min_granularity_ns)rc)r 
rrr_get_sched_min_granularity_nssz-SchedulerPlugin._get_sched_min_granularity_nscCs|jddd|||S)NrOr=rg)rf)r rrdrerrr_set_sched_min_granularity_nssz-SchedulerPlugin._set_sched_min_granularity_nsrcCs|jdddS)NrOr= latency_ns)rc)r rrr_get_sched_latency_nssz%SchedulerPlugin._get_sched_latency_nscCs|jddd|||S)NrOr=rj)rf)r rrdrerrr_set_sched_latency_nssz%SchedulerPlugin._set_sched_latency_nsrcCs|jdddS)NrOr=wakeup_granularity_ns)rc)r rrr _get_sched_wakeup_granularity_nssz0SchedulerPlugin._get_sched_wakeup_granularity_nscCs|jddd|||S)NrOr=rm)rf)r rrdrerrr _set_sched_wakeup_granularity_nssz0SchedulerPlugin._set_sched_wakeup_granularity_nsrcCs|jdddS)NrOr=tunable_scaling)rc)r rrr_get_sched_tunable_scalingsz*SchedulerPlugin._get_sched_tunable_scalingcCs|jddd|||S)NrOr=rp)rf)r rrdrerrr_set_sched_tunable_scalingsz*SchedulerPlugin._set_sched_tunable_scalingrcCs|jdddS)NrOr=migration_cost_ns)rc)r rrr_get_sched_migration_cost_nssz,SchedulerPlugin._get_sched_migration_cost_nscCs|jddd|||S)NrOr=rs)rf)r rrdrerrr_set_sched_migration_cost_nssz,SchedulerPlugin._set_sched_migration_cost_nsrcCs|jdddS)NrOr= nr_migrate)rc)r rrr_get_sched_nr_migratesz%SchedulerPlugin._get_sched_nr_migratecCs|jddd|||S)NrOr=rv)rf)r rrdrerrr_set_sched_nr_migratesz%SchedulerPlugin._set_sched_nr_migratercCs|jdddS)Nr=numa_balancing scan_delay_ms)rc)r rrr!_get_numa_balancing_scan_delay_mssz1SchedulerPlugin._get_numa_balancing_scan_delay_mscCs|jddd|||S)Nr=ryrz)rf)r rrdrerrr!_set_numa_balancing_scan_delay_mssz1SchedulerPlugin._set_numa_balancing_scan_delay_msrcCs|jdddS)Nr=ryscan_period_min_ms)rc)r rrr&_get_numa_balancing_scan_period_min_mssz6SchedulerPlugin._get_numa_balancing_scan_period_min_mscCs|jddd|||S)Nr=ryr})rf)r rrdrerrr&_set_numa_balancing_scan_period_min_mssz6SchedulerPlugin._set_numa_balancing_scan_period_min_msrcCs|jdddS)Nr=ryscan_period_max_ms)rc)r rrr&_get_numa_balancing_scan_period_max_mssz6SchedulerPlugin._get_numa_balancing_scan_period_max_mscCs|jddd|||S)Nr=ryr)rf)r 
rrdrerrr&_set_numa_balancing_scan_period_max_mssz6SchedulerPlugin._set_numa_balancing_scan_period_max_msrcCs|jdddS)Nr=ry scan_size_mb)rc)r rrr _get_numa_balancing_scan_size_mbsz0SchedulerPlugin._get_numa_balancing_scan_size_mbcCs|jddd|||S)Nr=ryr)rf)r rrdrerrr _set_numa_balancing_scan_size_mbsz0SchedulerPlugin._set_numa_balancing_scan_size_mb)F)F)F)F)F)brrrrKrrjrr classmethodrrmrrrrrrrrrrrrrrrrrrrrrrrrr r r rr}r%r&rRZ ROLLBACK_SOFTr'r-r.r/r2r3r Zcommand_customr=rYrZr_rArBrrrrGrFrErHrNrOrSrTrXrZr]rbrcrfZ command_getrhZ command_setrirkrlrnrorqrrrtrurwrxr{r|r~rrrrr __classcell__rr)rerrNs(  ?             <      "    $ rN) rOrZ decoratorsZ tuned.logsZtunedr subprocessrr\r4Z tuned.constsrRrZtuned.utils.commandsrrr)rrhrrcrMZlogsr4riobjectrrrrLZPluginrNrrrrs0     /