Old QEMU/KVM VFIO VM Setup Notes
2022-11-20
================================

I found my old VFIO QEMU/KVM setup notes recently while organizing my files. The same script also worked for Linux and FreeBSD guests, so it is not limited to Windows guests. At the time, Looking Glass for Linux guests was not finished yet, so I do not have notes on that. The markdown formatting is left as is.

# VFIO Install Notes

Last modified: Oct. 17, 2020

Ignore poorly written notes as this is meant for me

Other people reading this, go look at [the Arch Wiki on it](https://wiki.archlinux.org/index.php/PCI%20passthrough%20via%20OVMF) or [Yuri Alek's guide on Single GPU passthrough - warning: need jabbascript for captcha](https://gitlab.com/YuriAlek/vfio) or [4chan's /g/ wiki on it - warning: cuckflare](https://wiki.installgentoo.com/index.php/PCI_passthrough).

# Prerequisites

## UEFI Options

Enable VT-d and VT-x (or AMD equivalent)

## Kernel Config

Enable KVM and VFIO

> you can set VFIO as builtin, but building it as a module is more flexible

Also add `"iommu=pt intel_iommu=on"` to your kernel command line (or in CONFIG\_CMDLINE)

### Current Options

```
...
CONFIG_IOMMU_IOVA=y
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y
CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y
# use the respective AMD options if using an AMD CPU
CONFIG_INTEL_IOMMU=y
CONFIG_INTEL_IOMMU_SVM=y
CONFIG_INTEL_IOMMU_DEFAULT_ON=y
CONFIG_INTEL_IOMMU_FLOPPY_WA=y
CONFIG_KVM_VFIO=y
CONFIG_VFIO_IOMMU_TYPE1=m
CONFIG_VFIO_VIRQFD=m
CONFIG_VFIO=m
CONFIG_VFIO_PCI=m
CONFIG_VFIO_PCI_VGA=y
CONFIG_VFIO_PCI_MMAP=y
CONFIG_VFIO_PCI_INTX=y
CONFIG_VFIO_PCI_IGD=y
CONFIG_VFIO_MDEV=m
CONFIG_VFIO_MDEV_DEVICE=m
...
```

## Packages Required

```
app-emulation/qemu (actual program)
sys-firmware/edk2-ovmf (UEFI firmware for Nvidia GPU)
media-sound/scream (audio)
looking-glass-client (compile from source if no package, or make your own)
```

`app-emulation/libvirt` can be used as well for easier configuration and autostart but I have had problems with it:

- Service not starting properly, workaround is restarting service after it starts (Gentoo)
- Networks and domains not autostarting, workaround is starting them manually (CRUX)

### Gentoo USE Flags

```
app-emulation/qemu gtk opengl sdl sdl-image usb # (spice, ssh, vhost-user-fs, virgl, and virtfs are optional I think)
media-libs/libsdl2 X gles opengl # for Looking Glass
```

note to self (2020-10-17): check how minimal you can make qemu to run vfio

# IOMMU

Run `dmesg | grep -E 'DMAR'` and see if `DMAR: IOMMU enabled` or something similar is in output

# QEMU Script

All code blocks in this section go in the qemu script file unless specified otherwise

## Environment Variables

```sh
IMG=/path/to/windows-image-file
VIRTIO=/path/to/virtio-iso
WINDOWS=/path/to/windows-install-iso
OVMF=/usr/share/edk2-ovmf/OVMF_CODE.fd
RAM=16G
ULIMIT=$(ulimit -l)
ULIMIT_TARGET=$(( $(echo $RAM | tr -d 'G')*1048576+100000 ))
GPU_VIDEO=01:00.0
GPU_AUDIO=01:00.1
VIDEOID="10de 13c0"
AUDIOID="10de 0fbb"
VIDEOBUSID="0000:${GPU_VIDEO}"
AUDIOBUSID="0000:${GPU_AUDIO}"
```

## VFIO Detaching and Attaching

```sh
vfio_on() {
    # for nvidia card with proprietary drivers
    rmmod nvidia_drm
    rmmod nvidia_modeset
    rmmod nvidia
    # disable bumblebee service or use bbswitch to detach card if using bumblebee
    modprobe vfio-pci
    echo $VIDEOID > /sys/bus/pci/drivers/vfio-pci/new_id
    echo $VIDEOBUSID > /sys/bus/pci/devices/$VIDEOBUSID/driver/unbind
    echo $VIDEOBUSID > /sys/bus/pci/drivers/vfio-pci/bind
    echo $VIDEOID > /sys/bus/pci/drivers/vfio-pci/remove_id
    echo $AUDIOID > /sys/bus/pci/drivers/vfio-pci/new_id
    echo $AUDIOBUSID > /sys/bus/pci/devices/$AUDIOBUSID/driver/unbind
    echo $AUDIOBUSID > /sys/bus/pci/drivers/vfio-pci/bind
    echo $AUDIOID > /sys/bus/pci/drivers/vfio-pci/remove_id
    # add rest of gpu devices if they are in the same group (I think 4 devices in 1000 or 2000 series nvidia)
}

vfio_off() {
    rmmod vfio_iommu_type1
    rmmod vfio_pci
    rmmod vfio_virqfd
    rmmod vfio
    modprobe nvidia
}
```

## Networking

```sh
net_on() {
    ip tuntap add dev tap0 mode tap group kvm
    ip link set dev tap0 up promisc on
    ip addr add 0.0.0.0 dev tap0
    ip link add br0 type bridge
    ip link set br0 up
    ip link set tap0 master br0
    echo 0 > /sys/class/net/br0/bridge/stp_state
    ip addr add 192.168.123.1/24 dev br0
    sysctl net.ipv4.conf.tap0.proxy_arp=1 > /dev/null
    sysctl net.ipv4.conf.enp0s31f6.proxy_arp=1 > /dev/null
    sysctl net.ipv4.ip_forward=1 > /dev/null
    iptables -t nat -A POSTROUTING -o enp0s31f6 -j MASQUERADE > /dev/null
    iptables -A FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT > /dev/null
    iptables -A FORWARD -i br0 -o enp0s31f6 -j ACCEPT > /dev/null
}

net_off() {
    sysctl net.ipv4.conf.tap0.proxy_arp=0 > /dev/null
    sysctl net.ipv4.conf.enp0s31f6.proxy_arp=0 > /dev/null
    sysctl net.ipv4.ip_forward=0 > /dev/null
    ip link set dev br0 down
    ip link del br0
    ip link set dev tap0 down
    ip tuntap del mode tap name tap0
}
```

Also add this to /etc/conf.d/net if using Gentoo ([source](https://wiki.gentoo.org/wiki/QEMU/Options#Network_bridge))

> replace `enp0s31f6` with the host/master interface

```sh
...
tuntap_tap0="tap"
config_tap0="null"
bridge_br0="enp0s31f6 tap0"
config_br0="192.168.123.2 netmask 255.255.255.0"
routes_br0="default via 192.168.123.1"
bridge_forward_delay_br0=0
bridge_hello_time_br0=10
depend_br0() {
    need net.enp0s31f6
    need net.tap0
}
...
```

## Hugepages

```sh
hugepages_on() {
    PAGES=$(( $(echo $RAM | tr -d 'G') * 1048576 / 2048))
    mkdir -p /dev/hugepages
    mount -t hugetlbfs hugetlbfs /dev/hugepages
    echo $PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
}

hugepages_off() {
    echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
    umount /dev/hugepages
}
```

## QEMU Command

### Before installing guest OS (Windows 10 used as example)

```sh
ulimit -l $ULIMIT_TARGET
qemu-system-x86_64 \
    -name 'vfio-vm' \
    -vga qxl \
    -nodefaults -enable-kvm -machine q35 \
    -m $RAM -mem-path /dev/hugepages \
    -cpu host,kvm=off,svm=off,topoext,hv_relaxed,hv_spinlocks=0x1fff,hv_time,hv_vapic,hv_vendor_id=novideobad43,hv_vpindex,hv_synic,hv_stimer,hv_frequencies \
    -smp 8,sockets=1,cores=4,threads=2 \
    -rtc clock=host,base=localtime \
    -boot menu=on -boot d \
    -nic tap,ifname=tap0,script=no,downscript=no,model=virtio-net-pci \
    -drive if=pflash,format=raw,readonly=on,file=$OVMF \
    -drive file="$VIRTIO",id=cd1,media=cdrom \
    -drive file="$WINDOWS",id=cd2,media=cdrom \
    -device virtio-scsi-pci,id=scsi0 \
    -device scsi-hd,bus=scsi0.0,drive=rootfs \
    -drive file="$IMG",id=rootfs,index=0,format=qcow2,media=disk,if=none
ulimit -l $ULIMIT
```

### After installing guest OS

```sh
ulimit -l $ULIMIT_TARGET
qemu-system-x86_64 \
    -name 'vfio-vm' \
    -vga none -nographic \
    -nodefaults -enable-kvm -machine q35 \
    -m $RAM -mem-path /dev/hugepages \
    -cpu host,kvm=off,svm=off,topoext,hv_relaxed,hv_spinlocks=0x1fff,hv_time,hv_vapic,hv_vendor_id=novideobad43,hv_vpindex,hv_synic,hv_stimer,hv_frequencies \
    -smp 8,sockets=1,cores=4,threads=2 \
    -rtc clock=host,base=localtime \
    -boot menu=on -boot c \
    -nic tap,ifname=tap0,script=no,downscript=no,model=virtio-net-pci \
    -device vfio-pci,host=$GPU_VIDEO,multifunction=on,x-vga=on \
    -device vfio-pci,host=$GPU_AUDIO \
    -device ivshmem-plain,memdev=ivshmem,bus=pcie.0 \
    -object memory-backend-file,id=ivshmem,share=on,mem-path=/dev/shm/looking-glass,size=32M \
    -device virtio-keyboard-pci \
    -device virtio-mouse-pci \
    -object input-linux,id=kbd0,evdev=/dev/input/by-id/usb-Corsair_Corsair_K70R_Gaming_Keyboard-if02-event-kbd,grab_all=on,repeat=on \
    -object input-linux,id=mouse0,evdev=/dev/input/by-id/usb-Logitech_Gaming_Mouse_G502_0E5F335C3236-event-mouse \
    -object input-linux,id=mouse1,evdev=/dev/input/by-id/usb-Logitech_Gaming_Mouse_G502_0E5F335C3236-if01-event-kbd,grab_all=on,repeat=on \
    -drive if=pflash,format=raw,readonly=on,file=$OVMF \
    -drive file="$VIRTIO",id=cd1,media=cdrom \
    -device virtio-scsi-pci,id=scsi0 \
    -device scsi-hd,bus=scsi0.0,drive=rootfs \
    -drive file="$IMG",id=rootfs,index=0,format=qcow2,media=disk,if=none
ulimit -l $ULIMIT
```

# Extra

## Adding USB Devices

Get the vendor and product id from `lsusb` and add them to your QEMU command arguments:

```sh
-device qemu-xhci,id=xhci0 -device usb-host,bus=xhci0.0,vendorid=0x,productid=0x
```

Example for my USB bluetooth receiver:

```
$ lsusb
...
Bus 001 Device 004: ID 0b05:17cb ASUSTek Computer, Inc. Broadcom BCM20702A0 Bluetooth
...
```

My vendorid is `0x0b05` and productid is `0x17cb`, so in QEMU it would be:

```sh
-device qemu-xhci,id= -device usb-host,bus=.0,vendorid=0x0b05,productid=0x17cb
```

## Set CPU Affinity

While libvirt makes this simpler, it appears we need a script/function to do it in bare QEMU

Borrowed from [here](https://null-src.com/posts/qemu-optimization/post.php#taskset)

> note: uses bash-isms so that's why I put it in a separate file

```bash
#!/bin/bash
THREAD_LIST="0,4,1,5,2,6,3,7"
NAME="vfio-vm"

# give QEMU time to start and spawn its vCPU threads
sleep 20

# PID of the QEMU process that was started with -name $NAME
QEMU_PID=$(pstree -pa $(pidof qemu-system-x86_64) | grep "$NAME" | cut -d',' -f2 | cut -d' ' -f1)

HOST_THREAD=0
# for each vCPU thread PID
for PID in $(pstree -pa $QEMU_PID | grep CPU | sort | awk -F',' '{print $2}')
do
    let HOST_THREAD+=1
    # set each vCPU thread PID to next host CPU thread in THREAD_LIST
    taskset -pc "$(echo $THREAD_LIST | cut -d',' -f$HOST_THREAD)" "$PID"
done
```

## Additional Disk

You can add another disk by copying the arguments used for the rootfs and slightly modifying them.

Example for a qcow2 image:

```
-device virtio-scsi-pci,id= \
-device scsi-hd,bus=.0,drive= \
-drive file=,id=,index=0,format=qcow2,media=disk,if=none
```

## No Drives During Installation

Make sure the virtio driver is loaded:

- Click Load Driver
- Choose virtio-cd disc > amd64 > w10 and press enter
- Load Red Hat Virtio SCSI driver

## Looking Glass Not Starting

Make sure no virtual display such as QXL is loaded as well (`-nographic -vga none` in QEMU)

## JACK Support

To use JACK instead of Scream, you can use these QEMU arguments

```sh
-audiodev jack,id=snd0,in.client-name=default,out.client-name=default,in.start-server=off,out.start-server=off,in.exact-name=on,out.exact-name=on,in.connect-ports=system,out.connect-ports=system,in.frequency=48000,out.frequency=48000,timer-period=2048,out.buffer-length=5120 \
-device ich9-intel-hda \
-device hda-output,audiodev=snd0 \
```

You might need to change `timer-period` and `out.buffer-length` if you experience crackling. You might also have to change the controller (ich9-intel-hda) and codec (hda-output) to something else. To list controllers and codecs, run:

```sh
qemu-system-x86_64 -device help | grep hda
```
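
## Listing IOMMU Groups

The comment in `vfio_on()` says to add the rest of the GPU devices if they are in the same group. A minimal sketch for checking that, assuming the kernel exposes the groups under `/sys/kernel/iommu_groups` (it does when the IOMMU is enabled). The function name `list_iommu_groups` and its optional directory argument are my own additions for illustration and are not part of the original script:

```sh
#!/bin/sh
# Print one line per PCI device: "group <N>: <pci address>".
# $1 is a hypothetical override for the sysfs directory, only there so the
# function can be pointed at a test tree; it defaults to the real location.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    if [ ! -d "$root" ]; then
        echo "no IOMMU groups under $root (is the IOMMU enabled?)" >&2
        return 1
    fi
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob matched nothing
        group=${dev#"$root"/}          # -> N/devices/ADDR
        group=${group%%/*}             # -> N
        echo "group $group: ${dev##*/}"
    done
}
```

Calling `list_iommu_groups` with no argument reads the real sysfs tree. Any device that shows up in the same group as `$GPU_VIDEO` generally has to be bound to vfio-pci as well, the same way `vfio_on()` handles `$GPU_AUDIO`.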