
AMDGPU Linux Driver Parameters

30 Oct 2016

AMDGPU Linux Module Parameters

I'm using an AMD A10-8700P (Carrizo) with the AMDGPU kernel module. Failing to find a complete list of what each parameter does or any documentation, I've decided to put a list together.

If anyone has any good links for AMDGPU documentation, please comment / email.

The parameter references in the source are all prefixed with "amdgpu_" (for example, the aspm parameter is the variable amdgpu_aspm). The source is maintained under the directory "linux/drivers/gpu/drm/amd/amdgpu".

Current values of parameters can be found in the following directory:

$ ls -1 /sys/module/amdgpu/parameters/
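To see the values as well as the names, a small loop over that directory is enough. This is a minimal sketch: the function name show_params is my own, and the directory argument only exists so the function can be pointed at another module or a test directory.

```shell
#!/bin/sh
# Print "name = value" for every parameter file in a module's
# sysfs parameter directory. show_params is a hypothetical helper
# name; the default path is the amdgpu directory listed above.
show_params() {
    dir="${1:-/sys/module/amdgpu/parameters}"
    for f in "$dir"/*; do
        # Skip anything that is not a regular file (including an
        # unmatched glob when the module is not loaded).
        [ -f "$f" ] || continue
        printf '%s = %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Calling show_params with no argument reads the amdgpu directory; it prints nothing if the module is not loaded.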

How to Change Parameter Values

Any of these parameters can be set at boot time via module options in modprobe.d. (Note: unlike some modules, amdgpu's parameters can't be edited via sysfs.)

mike@mike-laptop4:~$ cat /etc/modprobe.d/amdgpu.conf
options amdgpu powerplay=1 bapm=1
blacklist powernow_k8
...
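After a reboot it is worth checking that the options in the conf file actually reached the module. The sketch below compares each key=value pair on an "options amdgpu" line with the matching file under sysfs. The function name check_options is hypothetical, and it assumes sysfs prints the value in the same textual form used in the conf file (true for plain integers, not necessarily for other types).

```shell
#!/bin/sh
# check_options CONF [PARAM_DIR]
# Compare "options amdgpu key=value ..." entries in CONF with the
# values the loaded module reports. Hypothetical helper; assumes
# values are written identically in the conf file and in sysfs.
check_options() {
    conf="$1"
    dir="${2:-/sys/module/amdgpu/parameters}"
    # Emit one key=value token per line, from amdgpu option lines only.
    awk '$1 == "options" && $2 == "amdgpu" {
             for (i = 3; i <= NF; i++) print $i
         }' "$conf" |
    while IFS='=' read -r key want; do
        if [ -r "$dir/$key" ]; then
            have=$(cat "$dir/$key")
        else
            have="(missing)"
        fi
        if [ "$have" = "$want" ]; then status=ok; else status=MISMATCH; fi
        printf '%s: wanted %s, found %s [%s]\n' "$key" "$want" "$have" "$status"
    done
}
```

For example, check_options /etc/modprobe.d/amdgpu.conf prints one line per option; lines such as "blacklist powernow_k8" are ignored.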

For reference, you can find my latest module options here:

https://github.com/mikejonesey/hp-envy-15-ah150sa/tree/master/etc/modprobe.d

As an alternative to modprobe.d, some options can be changed via the kernel command line (for example, amdgpu.powerplay=1 can be added to "GRUB_CMDLINE_LINUX_DEFAULT").
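To confirm which amdgpu options the running kernel was actually booted with, the command line can be read back from /proc/cmdline. A minimal sketch follows; amdgpu_cmdline_opts is a name I made up, and the file argument is only there so the function can be tried against a saved copy.

```shell
#!/bin/sh
# Print every "amdgpu.<param>=<value>" token from a kernel command
# line, one per line. Hypothetical helper; defaults to /proc/cmdline.
amdgpu_cmdline_opts() {
    file="${1:-/proc/cmdline}"
    # Split on spaces, keep only tokens addressed to the amdgpu module.
    # "|| true" keeps the exit status at 0 when nothing matches.
    tr ' ' '\n' < "$file" | grep '^amdgpu\.' || true
}
```

An empty result means no amdgpu options were passed on the command line, so any non-default values must be coming from modprobe.d.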

aspm

ASPM support (1 = enable, 0 = disable, -1 = auto)

Active State Power Management enables power management of PCIe devices. The code only checks for a value of 0 and returns early if it finds it, so ASPM appears to be switched on by default.

        if (amdgpu_aspm == 0)
                return;

referenced in: amdgpu_drv.c, vi.c, cik.c

audio

Audio enable (-1 = auto, 0 = disable, 1 = enable)

Support for audio pins.

referenced in: amdgpu_drv.c, amdgpu_connectors.c, amdgpu_display.c, atombios_encoders.c, dce_v8_0.c, dce_v10_0.c, dce_v11_0.c

bapm

BAPM support (1 = enable, 0 = disable, -1 = auto)

Bidirectional Application Power Management (BAPM) is an algorithm that enables fine-grained power transfers between the CPU cores and the GPU.

It will also edit the device TDP (enabling "Max Turbo Core Speed"), provided the device supports cTDP.

"Configurable TDP (cTDP) provides flexibility to AMD APUs, traditionally defined to be at fixed nominal TDPs, to fit well in platforms that have thermal solutions designed for lower than nominal TDP. For example, a 35W OPN with the cTDP feature will be able to function and perform well in platforms designed for 30W."

On my CPU the default TDP is 15 W. A higher TDP enables a higher clock speed. The default clock speed on my A10 is 1.8 GHz, with a max of 3.2 GHz. Note that the higher the TDP, the higher the temperature.

BAPM functionality will balance the power with the current temperature. Laptops tend to heat up faster, so you can expect a reduced maximum value (use a laptop cooler to increase it).

Note: initially I could not notably increase the TDP, and cpufreq-aperf did not seem to be compatible.

(Update: now working; the CPU can be scaled from 1.3 GHz to 3.2 GHz.)

Requirements for bapm:

Kernel Config:

CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_X86_ACPI_CPUFREQ=m
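A quick way to check these options against a kernel config file is sketched below. The function name check_kconfig is hypothetical, and the location of the config file is an assumption: many distributions ship it as /boot/config-$(uname -r), but yours may differ.

```shell
#!/bin/sh
# check_kconfig CONFIG_FILE OPTION...
# For each option, print its "OPTION=value" line from the config
# file, or "OPTION is not set" if no such line exists.
# Hypothetical helper; the config file path is supplied by the caller.
check_kconfig() {
    cfg="$1"
    shift
    for opt in "$@"; do
        line=$(grep "^$opt=" "$cfg" || true)
        if [ -n "$line" ]; then
            printf '%s\n' "$line"
        else
            printf '%s is not set\n' "$opt"
        fi
    done
}
```

For example: check_kconfig "/boot/config-$(uname -r)" CONFIG_CPU_FREQ CONFIG_X86_ACPI_CPUFREQ CONFIG_CPU_FREQ_GOV_ONDEMAND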

Kernel modules for CPU frequency scaling (I had to perform some DSDT cleaning for this):

cpufreq_conservative    16384  0
cpufreq_powersave      16384  0
cpufreq_userspace      16384  0
acpi_cpufreq           20480  0
processor              36864  5 acpi_cpufreq
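The listing above can be checked mechanically by looking for each module name in the first column of /proc/modules. This is a rough sketch with a made-up function name (modules_loaded); note that anything built into the kernel (=y) never appears in /proc/modules, so "not loaded" does not always mean "missing".

```shell
#!/bin/sh
# modules_loaded MODULE_LIST_FILE MODULE...
# Report whether each named module appears in the first column of
# the given file (normally /proc/modules). Hypothetical helper.
modules_loaded() {
    list="$1"
    shift
    for m in "$@"; do
        if awk -v m="$m" '$1 == m { found = 1 } END { exit !found }' "$list"; then
            printf '%s: loaded\n' "$m"
        else
            printf '%s: not loaded\n' "$m"
        fi
    done
}
```

For example: modules_loaded /proc/modules acpi_cpufreq processor cpufreq_powersave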

(Also enable the module parameter via modprobe.d.)

Sample from cpupower:

mike@mike-laptop4:~$ sudo cpupower monitor
    |Mperf               || Idle_Stats         
CPU | C0   | Cx   | Freq || POLL | C1   | C2   
   0|  2.55| 97.45|  2260||  0.00|  4.08| 93.64
   1|  1.62| 98.38|  2258||  0.00|  1.24| 97.25
   2|  2.09| 97.91|  2493||  0.00|  6.23| 91.62
   3| 97.80|  2.20|  3021||  0.00|  0.00|  0.00

Sample from cpufreq:

mike@mike-laptop4:~$ sudo cpufreq-aperf -o
CPU    Average freq(KHz)    Time in C0    Time in Cx    C0 percentage
000    3006000            00 sec 962 ms    00 sec 037 ms    96
001    2754000            00 sec 010 ms    00 sec 989 ms    01
002    1296000            00 sec 072 ms    00 sec 927 ms    07
003    1314000            00 sec 037 ms    00 sec 962 ms    03
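If neither tool is installed, the same frequencies can be read straight from the cpufreq files in sysfs, which report values in kHz. A minimal sketch, assuming a cpufreq driver is loaded; cpu_freq_range is a hypothetical name, and the reported maximum may not include boost frequencies on every driver.

```shell
#!/bin/sh
# Print the minimum, current and maximum frequency of each CPU in MHz.
# Hypothetical helper; the base directory argument defaults to the
# standard sysfs location and exists mainly for testing.
cpu_freq_range() {
    base="${1:-/sys/devices/system/cpu}"
    for d in "$base"/cpu[0-9]*; do
        # CPUs without a cpufreq directory (no driver loaded) are skipped.
        [ -r "$d/cpufreq/cpuinfo_min_freq" ] || continue
        min=$(cat "$d/cpufreq/cpuinfo_min_freq")
        max=$(cat "$d/cpufreq/cpuinfo_max_freq")
        cur=$(cat "$d/cpufreq/scaling_cur_freq")
        # sysfs reports kHz; integer division converts to MHz.
        printf '%s: min %d MHz, cur %d MHz, max %d MHz\n' \
            "$(basename "$d")" $((min / 1000)) $((cur / 1000)) $((max / 1000))
    done
}
```

Running cpu_freq_range with no argument covers every CPU that exposes cpufreq data.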

referenced in: amdgpu_drv.c, amdgpu_pm.c, ci_dpm.c, cz_dpm.c, kv_dpm.c, kv_smc.c

benchmark

Run benchmark

 

cg_mask

Clockgating flags mask (0 = disable clock gating)

 

deep_color

Deep Color support (1 = enable, 0 = disable (default))

 

disable_cu

Disable CUs (se.sh.cu,...)

 

disp_priority

Display Priority (0 = auto, 1 = normal, 2 = high)

 

dpm

DPM support (1 = enable, 0 = disable, -1 = auto)

Dynamic Power Management is enabled by default. DPM dynamically changes clock speeds and voltage based on GPU load. It also enables clock and power gating.

referenced in: amdgpu_drv.c, amdgpu_dpm.c, amdgpu_pm.c, amdgpu_powerplay.c, amdgpu_uvd.c, amdgpu_vce.c, ci_dpm.c, cz_dpm.c, dce_v10_0.c, dce_v11_0.c, dce_v8_0.c, fiji_dpm.c, iceland_dpm.c, kv_dpm.c, tonga_dpm.c
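When DPM is active, the driver exposes its state through attributes on the DRM device in sysfs. The sketch below reads two of them if they exist. The attribute names power_dpm_state and power_dpm_force_performance_level come from my own understanding of the driver rather than from the notes above, and card0 is an assumption about which card is the AMD GPU; show_dpm_state is a made-up name.

```shell
#!/bin/sh
# Print the DPM-related sysfs attributes of a DRM device, if present.
# Hypothetical helper; the device path (card0) and attribute names
# are assumptions and may differ between kernels and machines.
show_dpm_state() {
    dev="${1:-/sys/class/drm/card0/device}"
    for attr in power_dpm_state power_dpm_force_performance_level; do
        if [ -r "$dev/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$dev/$attr")"
        else
            printf '%s: not available\n' "$attr"
        fi
    done
}
```

If both lines say "not available", either the card path is wrong or DPM is not running on this device.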

exp_hw_support

experimental hw support (1 = enable, 0 = disable (default))

This is required to use "GCN 1.2+" (Graphics Core Next).

gartsize

Size of PCIE/IGP gart to setup in megabytes (32, 64, etc., -1 = auto)

 

hw_i2c

hw i2c engine enable (0 = disable)

 

ip_block_mask

IP Block Mask (all blocks enabled (default))

 

lockup_timeout

GPU lockup timeout in ms (default 0 = disable)

 

msi

MSI support (1 = enable, 0 = disable, -1 = auto)

 

pcie_gen2

PCIE Gen2 mode (-1 = auto, 0 = disable, 1 = enable)

 

pcie_gen_cap

PCIE Gen Caps (0: autodetect (default))

 

pcie_lane_cap

PCIE Lane Caps (0: autodetect (default))

 

pg_mask

Powergating flags mask (0 = disable power gating)

 

powercontainment

Power Containment (1 = enable (default), 0 = disable)

 

powerplay

Powerplay component (1 = enable, 0 = disable, -1 = auto (default))

Required kernel config:

CONFIG_DRM_AMD_POWERPLAY=y

runpm

PX runtime pm (1 = force enable, 0 = disable, -1 = PX only default)

 

sched_hw_submission

the max number of HW submissions (default 2)

 

sched_jobs

the max number of jobs supported in the sw queue (default 32)

 

smc_load_fw

SMC firmware loading (1 = enable, 0 = disable)

 

test

Run tests

 

vm_block_size

VM page table size in bits (default depending on vm_size)

 

vm_debug

Debug VM handling (0 = disabled (default), 1 = enabled)

 

vm_fault_stop

Stop on VM fault (0 = never (default), 1 = print first, 2 = always)

 

vm_size

VM address space size in gigabytes (default 64GB)

 

vramlimit

Restrict VRAM for testing, in megabytes

 

CPU Features

=== 3dnowprefetch ===
#define X86_FEATURE_3DNOWPREFETCH ( 6*32+ 8) /* 3DNow prefetch instructions */
=== abm ===
#define X86_FEATURE_ABM        ( 6*32+ 5) /* Advanced bit manipulation */
=== acc_power ===
#define X86_FEATURE_ACC_POWER    ( 3*32+19) /* AMD Accumulated Power Mechanism */
=== aes ===
#define X86_FEATURE_AES        ( 4*32+25) /* AES instructions */
=== aperfmperf ===
#define X86_FEATURE_APERFMPERF    ( 3*32+28) /* APERFMPERF */
=== apic ===
#define X86_FEATURE_APIC    ( 0*32+ 9) /* Onboard APIC */
=== arat ===
#define X86_FEATURE_ARAT    (14*32+ 2) /* Always Running APIC Timer */
=== avic ===
#define X86_FEATURE_AVIC    (15*32+13) /* Virtual Interrupt Controller */
=== avx ===
#define X86_FEATURE_AVX        ( 4*32+28) /* Advanced Vector Extensions */
=== avx2 ===
#define X86_FEATURE_AVX2    ( 9*32+ 5) /* AVX2 instructions */
=== bmi1 ===
#define X86_FEATURE_BMI1    ( 9*32+ 3) /* 1st group bit manipulation extensions */
=== bmi2 ===
#define X86_FEATURE_BMI2    ( 9*32+ 8) /* 2nd group bit manipulation extensions */
=== bpext ===
#define X86_FEATURE_BPEXT    (6*32+26) /* data breakpoint extension */
=== clflush ===
#define X86_FEATURE_CLFLUSH    ( 0*32+19) /* CLFLUSH instruction */
=== cmov ===
#define X86_FEATURE_CMOV    ( 0*32+15) /* CMOV instructions */
=== cmp_legacy ===
#define X86_FEATURE_CMP_LEGACY    ( 6*32+ 1) /* If yes HyperThreading not valid */
=== constant_tsc ===
#define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
=== cpb ===
#define X86_FEATURE_CPB        ( 7*32+ 2) /* AMD Core Performance Boost */
=== cr8_legacy ===
#define X86_FEATURE_CR8_LEGACY    ( 6*32+ 4) /* CR8 in 32-bit mode */
=== cx16 ===
#define X86_FEATURE_CX16    ( 4*32+13) /* CMPXCHG16B */
=== cx8 ===
#define X86_FEATURE_CX8        ( 0*32+ 8) /* CMPXCHG8 instruction */
=== de ===
#define X86_FEATURE_DE        ( 0*32+ 2) /* Debugging Extensions */
=== decodeassists ===
#define X86_FEATURE_DECODEASSISTS (15*32+ 7) /* Decode Assists support */
=== eagerfpu ===
#define X86_FEATURE_EAGER_FPU    ( 3*32+29) /* "eagerfpu" Non lazy FPU restore */
=== extapic ===
#define X86_FEATURE_EXTAPIC    ( 6*32+ 3) /* Extended APIC space */
=== extd_apicid ===
#define X86_FEATURE_EXTD_APICID    ( 3*32+26) /* has extended APICID (8 bits) */
=== f16c ===
#define X86_FEATURE_F16C    ( 4*32+29) /* 16-bit fp conversions */
=== flushbyasid ===
#define X86_FEATURE_FLUSHBYASID (15*32+ 6) /* flush-by-ASID support */
=== fma ===
#define X86_FEATURE_FMA        ( 4*32+12) /* Fused multiply-add */
=== fma4 ===
#define X86_FEATURE_FMA4    ( 6*32+16) /* 4 operands MAC instructions */
=== fpu ===
#define X86_FEATURE_FPU        ( 0*32+ 0) /* Onboard FPU */
=== fsgsbase ===
#define X86_FEATURE_FSGSBASE    ( 9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
=== fxsr ===
#define X86_FEATURE_FXSR    ( 0*32+24) /* FXSAVE/FXRSTOR, CR4.OSFXSR */
=== fxsr_opt ===
#define X86_FEATURE_FXSR_OPT    ( 1*32+25) /* FXSAVE/FXRSTOR optimizations */
=== ht ===
#define X86_FEATURE_HT        ( 0*32+28) /* Hyper-Threading */
=== hw_pstate ===
#define X86_FEATURE_HW_PSTATE    ( 7*32+ 8) /* AMD HW-PState */
=== ibs ===
#define X86_FEATURE_IBS        ( 6*32+10) /* Instruction Based Sampling */
=== lahf_lm ===
#define X86_FEATURE_LAHF_LM    ( 6*32+ 0) /* LAHF/SAHF in long mode */
=== lbrv ===
#define X86_FEATURE_LBRV    (15*32+ 1) /* LBR Virtualization support */
=== lm ===
#define X86_FEATURE_LM        ( 1*32+29) /* Long Mode (x86-64) */
=== lwp ===
#define X86_FEATURE_LWP        ( 6*32+15) /* Light Weight Profiling */
=== mca ===
#define X86_FEATURE_MCA        ( 0*32+14) /* Machine Check Architecture */
=== mce ===
#define X86_FEATURE_MCE        ( 0*32+ 7) /* Machine Check Exception */
=== misalignsse ===
#define X86_FEATURE_MISALIGNSSE ( 6*32+ 7) /* Misaligned SSE mode */
=== mmx ===
#define X86_FEATURE_MMX        ( 0*32+23) /* Multimedia Extensions */
=== mmxext ===
#define X86_FEATURE_MMXEXT    ( 1*32+22) /* AMD MMX extensions */
=== monitor ===
#define X86_FEATURE_MWAIT    ( 4*32+ 3) /* "monitor" Monitor/Mwait support */
=== movbe ===
#define X86_FEATURE_MOVBE    ( 4*32+22) /* MOVBE instruction */
=== msr ===
#define X86_FEATURE_MSR        ( 0*32+ 5) /* Model-Specific Registers */
=== mtrr ===
#define X86_FEATURE_MTRR    ( 0*32+12) /* Memory Type Range Registers */
=== mwaitx ===
#define X86_FEATURE_MWAITX    ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */
=== nodeid_msr ===
#define X86_FEATURE_NODEID_MSR    ( 6*32+19) /* NodeId MSR */
=== nonstop_tsc ===
#define X86_FEATURE_NONSTOP_TSC    ( 3*32+24) /* TSC does not stop in C states */
=== nopl ===
#define X86_FEATURE_NOPL    ( 3*32+20) /* The NOPL (0F 1F) instructions */
=== npt ===
#define X86_FEATURE_NPT        (15*32+ 0) /* Nested Page Table support */
=== nrip_save ===
#define X86_FEATURE_NRIPS    (15*32+ 3) /* "nrip_save" SVM next_rip save */
=== nx ===
#define X86_FEATURE_NX        ( 1*32+20) /* Execute Disable */
=== osvw ===
#define X86_FEATURE_OSVW    ( 6*32+ 9) /* OS Visible Workaround */
=== overflow_recov ===
#define X86_FEATURE_OVERFLOW_RECOV (17*32+0) /* MCA overflow recovery support */
=== pae ===
#define X86_FEATURE_PAE        ( 0*32+ 6) /* Physical Address Extensions */
=== pat ===
#define X86_FEATURE_PAT        ( 0*32+16) /* Page Attribute Table */
=== pausefilter ===
#define X86_FEATURE_PAUSEFILTER (15*32+10) /* filtered pause intercept */
=== pclmulqdq ===
#define X86_FEATURE_PCLMULQDQ    ( 4*32+ 1) /* PCLMULQDQ instruction */
=== pdpe1gb ===
#define X86_FEATURE_GBPAGES    ( 1*32+26) /* "pdpe1gb" GB pages */
=== perfctr_core ===
#define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter extensions */
=== perfctr_nb ===
#define X86_FEATURE_PERFCTR_NB  ( 6*32+24) /* NB performance counter extensions */
=== pfthreshold ===
#define X86_FEATURE_PFTHRESHOLD (15*32+12) /* pause filter threshold */
=== pge ===
#define X86_FEATURE_PGE        ( 0*32+13) /* Page Global Enable */
=== pni ===
#define X86_FEATURE_XMM3    ( 4*32+ 0) /* "pni" SSE-3 */
=== popcnt ===
#define X86_FEATURE_POPCNT      ( 4*32+23) /* POPCNT instruction */
=== pse ===
#define X86_FEATURE_PSE        ( 0*32+ 3) /* Page Size Extensions */
=== pse36 ===
#define X86_FEATURE_PSE36    ( 0*32+17) /* 36-bit PSEs */
=== ptsc ===
#define X86_FEATURE_PTSC    ( 6*32+27) /* performance time-stamp counter */
=== rdrand ===
#define X86_FEATURE_RDRAND    ( 4*32+30) /* The RDRAND instruction */
=== rdtscp ===
#define X86_FEATURE_RDTSCP    ( 1*32+27) /* RDTSCP */
=== rep_good ===
#define X86_FEATURE_REP_GOOD    ( 3*32+16) /* rep microcode works well */
=== sep ===
#define X86_FEATURE_SEP        ( 0*32+11) /* SYSENTER/SYSEXIT */
=== skinit ===
#define X86_FEATURE_SKINIT    ( 6*32+12) /* SKINIT/STGI instructions */
=== smep ===
#define X86_FEATURE_SMEP    ( 9*32+ 7) /* Supervisor Mode Execution Protection */
=== sse ===
#define X86_FEATURE_XMM        ( 0*32+25) /* "sse" */
=== sse2 ===
#define X86_FEATURE_XMM2    ( 0*32+26) /* "sse2" */
=== sse4_1 ===
#define X86_FEATURE_XMM4_1    ( 4*32+19) /* "sse4_1" SSE-4.1 */
=== sse4_2 ===
#define X86_FEATURE_XMM4_2    ( 4*32+20) /* "sse4_2" SSE-4.2 */
=== sse4a ===
#define X86_FEATURE_SSE4A    ( 6*32+ 6) /* SSE-4A */
=== ssse3 ===
#define X86_FEATURE_SSSE3    ( 4*32+ 9) /* Supplemental SSE-3 */
=== svm ===
#define X86_FEATURE_SVM        ( 6*32+ 2) /* Secure virtual machine */
=== svm_lock ===
#define X86_FEATURE_SVML    (15*32+ 2) /* "svm_lock" SVM locking MSR */
=== syscall ===
#define X86_FEATURE_SYSCALL    ( 1*32+11) /* SYSCALL/SYSRET */
=== tbm ===
#define X86_FEATURE_TBM        ( 6*32+21) /* trailing bit manipulations */
=== tce ===
#define X86_FEATURE_TCE        ( 6*32+17) /* translation cache extension */
=== topoext ===
#define X86_FEATURE_TOPOEXT    ( 6*32+22) /* topology extensions CPUID leafs */
=== tsc ===
#define X86_FEATURE_TSC        ( 0*32+ 4) /* Time Stamp Counter */
=== tsc_scale ===
#define X86_FEATURE_TSCRATEMSR  (15*32+ 4) /* "tsc_scale" TSC scaling support */
=== vmcb_clean ===
#define X86_FEATURE_VMCBCLEAN   (15*32+ 5) /* "vmcb_clean" VMCB clean bits support */
=== vme ===
#define X86_FEATURE_VME        ( 0*32+ 1) /* Virtual Mode Extensions */
=== vmmcall ===
#define X86_FEATURE_VMMCALL     ( 8*32+15) /* Prefer vmmcall to vmcall */
=== wdt ===
#define X86_FEATURE_WDT        ( 6*32+13) /* Watchdog timer */
=== xop ===
#define X86_FEATURE_XOP        ( 6*32+11) /* extended AVX instructions */
=== xsave ===
#define X86_FEATURE_XSAVE    ( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
=== xsaveopt ===
#define X86_FEATURE_XSAVEOPT    (10*32+ 0) /* XSAVEOPT */
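Each define above encodes a feature as word*32+bit: the first number selects a 32-bit capability word and the second selects a bit inside it. The arithmetic can be tried in the shell; the two function names below are my own and are not part of the kernel.

```shell
#!/bin/sh
# feature_index WORD BIT: combine a capability word and bit number
# into the single feature number used by the X86_FEATURE_* defines.
feature_index() {
    echo $(( $1 * 32 + $2 ))
}

# feature_word_bit INDEX: split a feature number back into
# "word bit" (the inverse of feature_index).
feature_word_bit() {
    echo "$(( $1 / 32 )) $(( $1 % 32 ))"
}
```

For example, X86_FEATURE_CPB is ( 7*32+ 2), so feature_index 7 2 prints 226 and feature_word_bit 226 prints "7 2".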

Notes

The acpi_cpufreq module checks for "X86_FEATURE_CPB".
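Whether the running CPU advertises a given feature can be checked from userspace by looking for its flag name (for example "cpb") on the flags line of /proc/cpuinfo. A minimal sketch; has_cpu_flag is a hypothetical name and the file argument is only there for testing.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO_FILE]
# Exit status 0 if FLAG appears as a whole word on the first "flags"
# line of the cpuinfo file, 1 otherwise. Hypothetical helper.
has_cpu_flag() {
    flag="$1"
    file="${2:-/proc/cpuinfo}"
    # Fields 1 and 2 are "flags" and ":"; the flag names start at field 3.
    awk -v f="$flag" '
        /^flags/ { for (i = 3; i <= NF; i++) if ($i == f) found = 1; exit }
        END { exit !found }
    ' "$file"
}
```

For example: if has_cpu_flag cpb; then echo "Core Performance Boost is advertised"; fi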

References:

http://products.amd.com/en-gb/search/APU/AMD-A-Series-Processors/AMD-A10...

https://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf

https://www.x.org/wiki/RadeonFeature/#index3h2
