ABSTRACT
Virtualization in computing is the creation of a virtual version of a hardware platform, operating system, storage device, or network resource. Scheduling refers to the way processes are assigned to run on the available CPUs, since there are typically many more runnable processes than available CPUs. This assignment is carried out by software known as a scheduler. The scheduler is concerned mainly with: throughput - the number of processes that complete their execution per unit time; latency - the delay in processing; turnaround time - the total time between submission of a process and its completion; response time - the time from when a request is submitted until the first response is produced; and fairness / waiting time - giving equal CPU time to each process.
The scheduling policy of the host operating system affects the performance of the Kernel-based Virtual Machine (KVM). The KVM driver is added to the Linux kernel, and the Linux kernel is thereby made to act as a virtual machine monitor. By adding virtualization capabilities to a standard Linux kernel, all the fine-tuning work that has gone into the kernel is reused, bringing that benefit into a virtualized environment. In this model, every virtual machine is a regular Linux process scheduled by the standard Linux scheduler, and its memory is allocated by the Linux memory allocator. Virtualization has led to the creation of hypervisors. A hypervisor, also called a virtual machine monitor (VMM), is a hardware virtualization technique that allows multiple operating systems, termed guests, to run concurrently on a host computer. The hypervisor presents the guest operating systems with a virtual operating platform and manages their execution. In this report an optimized scheduling policy is proposed to improve the performance of KVM. In the first stage, a special process queue for virtual machines is added in the host operating system and is scheduled before the normal processes. The same number of time slices is given to each virtual machine to keep their load balanced. Next, the virtual machine process queue is periodically sorted according to the remaining time slices of each virtual machine, and a special state is added to identify I/O-intensive virtual machines, which ensures that an I/O-intensive virtual machine is treated specially and receives an earlier scheduling opportunity. Finally, an experiment in a KVM environment is designed and executed to show the effectiveness of this optimized scheduling policy. In a KVM environment, the scheduler within the host OS plays a key role in determining the overall fairness and performance characteristics of the whole virtualized system.
KEYWORDS: Kernel-based virtual machine, scheduling policy, host operating system performance, response latency, native virtualization, hypervisor, QEMU process, KVM driver, virtualization.
1. INTRODUCTION
Virtualization has gained widespread use in cloud computing, server consolidation, and information security for its many benefits, such as flexibility, isolation, high resource utilization, easy IT infrastructure management, and power saving. In virtualization systems, the resource virtualization of the underlying hardware and the concurrent execution of virtual machines are handled by software called a virtual machine monitor (VMM) or hypervisor. By presenting the same view of the underlying hardware and platform APIs from different vendors, the virtual machine monitor enables virtual machines to run on any available computer. This not only eases the many applications of desktop computers, but also reduces the hardware cost of distributed environments. However, these benefits are not always free: the existence of the virtual machine monitor layer degrades the performance of some specific operations. As one of the core components, the virtual machine monitor affects the performance of a virtualization system to a great extent, so it is important to measure and analyze the performance of virtual machine monitors. Kernel-based Virtual Machine (KVM) is a virtualization infrastructure for the Linux kernel. KVM supports native virtualization on processors with hardware virtualization extensions. Native virtualization is a platform virtualization approach that enables efficient full virtualization with help from hardware capabilities, primarily the host processors. Full virtualization is used to simulate a complete hardware environment, or virtual machine, in which an unmodified guest operating system (using the same instruction set as the host machine) executes in complete isolation. Full virtualization is a virtualization technique used to provide a certain kind of virtual machine environment, namely, one that is a complete simulation of the underlying hardware. It requires that every salient feature of the hardware be reflected into each of the virtual machines, including the full instruction set, input/output operations, interrupts, memory access, and whatever other elements are used by the software that runs on the bare machine and is intended to run in a virtual machine. In such an environment, any software capable of execution on the raw hardware can be run in the virtual machine, in particular any operating system. The obvious test of virtualization is whether an operating system intended for stand-alone use can successfully run inside a virtual machine. KVM is a virtualization solution based on the Linux kernel which requires x86 hardware virtualization support. KVM has two components: the KVM driver and the QEMU process. The KVM driver is a loadable module that is part of the Linux kernel and provides the core virtualization infrastructure, including virtual CPUs (VCPUs) and virtual memory for virtual machines (VMs). The other component is a lightly modified QEMU process, which simulates PC hardware in user space and provides an I/O device model for the virtual machine. In conjunction with CPU emulation, QEMU also provides a set of device models, allowing it to run a variety of unmodified guest operating systems; it can thus be viewed as a hosted virtual machine monitor. It also provides an accelerated mode supporting a mixture of binary translation (for kernel code) and native execution (for user code), in the same fashion as VMware Workstation and VirtualBox.
QEMU can also be used purely for CPU emulation of user-level processes, allowing applications compiled for one architecture to run on another. User space reaches the virtualization of hardware resources through the /dev/kvm device and its ioctl commands. With /dev/kvm, each guest operating system may have its own address space.
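The interaction between a user-space monitor such as QEMU and the KVM driver can be illustrated through the /dev/kvm interface. The following is a minimal sketch (not part of the original report), assuming a Linux host whose kernel exposes /dev/kvm; it only opens the driver, creates a VM and a single VCPU, and maps the shared kvm_run area, omitting guest memory setup, guest code loading, and error handling.

```c
/* Minimal sketch of how user space (e.g. QEMU) talks to the KVM driver
 * through /dev/kvm and its ioctl interface. Error handling is omitted. */
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int kvm = open("/dev/kvm", O_RDWR);               /* handle to the KVM driver        */
    int version = ioctl(kvm, KVM_GET_API_VERSION, 0); /* stable API version is 12        */
    int vm   = ioctl(kvm, KVM_CREATE_VM, 0);          /* new VM with its own address space */
    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);         /* first virtual CPU of the guest  */

    /* Each VCPU has a shared kvm_run area describing exits back to user space. */
    long run_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0);
    struct kvm_run *run = mmap(NULL, run_size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpu, 0);

    printf("KVM API version %d, run area %ld bytes, exit reason %u\n",
           version, run_size, run->exit_reason);

    /* A real monitor would set up guest memory with KVM_SET_USER_MEMORY_REGION,
     * load guest code, and then loop on ioctl(vcpu, KVM_RUN, 0). */
    munmap(run, run_size);
    close(vcpu);
    close(vm);
    close(kvm);
    return 0;
}
```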
1.1 MOTIVATION
Linux treats each virtual machine as a normal process. Hence, virtual machines have the same states as normal processes and take advantage of Linux features. When a virtual machine is running a high-priority application and a normal process arrives whose priority is higher than that of the virtual machine but lower than that of the application inside the virtual machine, the Linux scheduler, being unaware of the situation, forces the virtual machine to give up the processor and schedules the newly arrived process. As a result, the high-priority application in the virtual machine is not executed, and an unexpected process switch occurs, increasing the virtual machine switching overhead. Because the Linux kernel does not treat each virtual machine fairly, when KVM is used to run a network monitoring application it cannot guarantee that each virtual machine receives the same network packets or a balanced workload; therefore it cannot meet the basic requirement of network monitoring. If a kernel-based virtual machine runs as a system virtual machine on the host operating system, the Linux scheduler is not good enough to meet the requirement. In order to resolve this problem, an optimized scheduling policy is proposed.
1.1.1 VISION
To address the drawbacks of the Linux scheduling policy with respect to KVM, an optimized scheduling policy is proposed to improve the performance of KVM.
1.1.2 MISSION
Improving the efficiency of scheduling in the kernel-based virtual machine. Reducing the response latency of I/O-intensive virtual machines.
1.1.3 OBJECTIVES
1. A process queue for scheduling KVM processes is added to the host, and a VM in this queue has two states, HAVE and OVER, depending on whether it has time slices remaining. A VM in the HAVE state always runs before one in the OVER state, using a first-in, first-out (FIFO) discipline, and the KVM process queue has higher priority than the normal process queue, which prevents a running VM from being preempted by a normal process.
1.3 TAXONOMY
1. Kernel-based virtual machine: Kernel-based Virtual Machine (KVM) is a virtualization infrastructure for the Linux kernel which supports native virtualization on processors with hardware virtualization extensions.
2. Hypervisor: A hypervisor, also called a virtual machine monitor (VMM), is a hardware virtualization technique that allows multiple operating systems, termed guests, to run concurrently on a host computer.
3. Scheduling policy: A scheduling policy is the set of decisions made regarding scheduling priorities, goals, and objectives.
4. Native virtualization: Native virtualization is a platform virtualization approach that enables efficient full virtualization with help from hardware capabilities, primarily the host processors.
5. KVM driver: The KVM driver is a loadable module that is part of the Linux kernel and provides the core virtualization infrastructure, including virtual CPUs (VCPUs) and virtual memory for virtual machines (VMs).
2. PROPOSED TECHNOLOGY
Linux treats each virtual machine as a normal process, so a virtual machine has the same states as a normal process, such as TASK_RUNNING, TASK_INTERRUPTIBLE, and TASK_STOPPED. When a virtual machine is created, Linux sets its state to TASK_RUNNING and puts it into the CPU's process queue, where it waits to be scheduled. Each CPU has a process queue made up of 140 priority lists that are serviced in FIFO order. Processes that are scheduled to execute are added to the end of their respective priority list. Each process has a time slice that determines how much time it is permitted to execute. The first 100 priority lists of the process queue are reserved for real-time processes, and the last 40 are used for normal processes. The figure below depicts the CPU process queue used for scheduling.
Figure: the per-CPU process queue, consisting of priority lists Queue[0] through Queue[139]; empty lists point to NULL, and non-empty lists hold their tasks (Task 0, Task 1, ...) in FIFO order.
Usually, a find-first-bit-set instruction is used to find the highest-priority set bit in one of five 32-bit words (covering the 140 priorities). The time it takes to find a process to execute therefore depends not on the number of active processes but on the number of priorities. Linux processes are preemptible. If a process enters the TASK_RUNNING state, the kernel checks whether its dynamic priority is greater than the priority of the currently running process. If it is, the execution of the current process is interrupted and the scheduler is invoked to select another process to run (usually the process that just became runnable). Of course, a process may also be preempted when its time quantum expires; when this occurs, the need_resched flag of the current process is set, so the scheduler is invoked when the timer interrupt handler terminates. In a kernel-based virtual machine environment, the host operating system is the scheduler of the virtual machines, and each virtual machine is treated as a normal process by Linux, which lets it take advantage of Linux features. The KVM code, which is rather small (about 10,000 lines), turns a Linux kernel into a hypervisor by loading a kernel module. Instead of writing a hypervisor and the necessary components, such as a scheduler, memory manager, I/O stack, and device drivers, KVM leverages the ongoing development of the Linux kernel. The kernel module exports a device called /dev/kvm, which enables a guest mode of the kernel (in addition to the traditional kernel and user modes). With /dev/kvm, a virtual machine has a unique address space. Devices in the device tree (/dev) are normally common to all user-space processes, but /dev/kvm is different because each process sees a different device map, in order to support isolation of the virtual machines. KVM takes advantage of hardware-based virtualization extensions to run an unmodified OS.
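To make the bitmap lookup described above concrete, the following is an illustrative user-space sketch (not actual kernel code, and the data layout is an assumption): one bit per priority list is kept in five 32-bit words, and a find-first-bit-set scan locates the highest-priority non-empty list, so the cost depends on the number of priority words rather than on the number of runnable processes.

```c
/* Illustrative sketch of the priority-bitmap lookup: 140 priorities are tracked
 * in five 32-bit words, and a find-first-bit-set scan picks the lowest set bit,
 * i.e. the most important non-empty priority list. */
#include <stdint.h>
#include <strings.h>   /* ffs() */

#define NUM_PRIO 140
#define WORDS    ((NUM_PRIO + 31) / 32)   /* five 32-bit words */

struct prio_bitmap {
    uint32_t bits[WORDS];   /* bit p set => priority list p is non-empty */
};

/* Returns the highest-priority (lowest-numbered) non-empty list, or -1. */
static int find_highest_priority(const struct prio_bitmap *bm)
{
    for (int w = 0; w < WORDS; w++) {
        if (bm->bits[w])                           /* at most five word tests...   */
            return w * 32 + ffs(bm->bits[w]) - 1;  /* ...then one find-first-set   */
    }
    return -1;
}

/* Mark priority list 'prio' as non-empty when a task becomes runnable. */
static void mark_runnable(struct prio_bitmap *bm, int prio)
{
    bm->bits[prio / 32] |= 1u << (prio % 32);
}
```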
KVM THREADS PRIORITY
The two most important kinds of KVM threads are the QEMU threads and the VCPU threads. QEMU threads do the actual I/O and emulate the devices. The VCPU threads execute guest code by instruction emulation or direct execution. When a QEMU thread has a higher priority, an I/O request will be served in a shorter time. But QEMU finishes I/O through emulation, which has additional overhead; if I/O is emulated too frequently, performance will also suffer. For example, when the guest does network I/O, the network speed of the guest will be slowed down if QEMU emulates I/O for every packet. KVM thread priorities should therefore be configured according to the workload type to improve virtualization performance.
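Since QEMU and VCPU threads are ordinary Linux threads, their priority can be tuned from user space. The following is a hedged sketch (the thread ID is hypothetical and would in practice be read from /proc/<pid>/task/): it lowers the nice value of a QEMU I/O thread so that I/O emulation is serviced sooner.

```c
/* Hedged sketch: adjusting the nice value of one KVM-related thread.
 * Lowering the nice value requires CAP_SYS_NICE. */
#include <stdio.h>
#include <sys/resource.h>
#include <sys/types.h>

/* Give a thread (addressed by its kernel TID) a better nice value. */
static int boost_kvm_thread(pid_t tid, int nice_value)
{
    if (setpriority(PRIO_PROCESS, tid, nice_value) != 0) {
        perror("setpriority");
        return -1;
    }
    return 0;
}

int main(void)
{
    pid_t qemu_io_tid = 12345;   /* hypothetical TID of a QEMU I/O thread */
    if (boost_kvm_thread(qemu_io_tid, -5) == 0)
        printf("thread %d boosted to nice -5\n", qemu_io_tid);
    return 0;
}
```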
KVM THREAD ALLOCATION MECHANISM
The thread allocation mechanism decides on which physical CPU a thread is placed at initialization or migration. In the worst case, multiple VCPUs of one guest virtual machine are placed onto the same physical core, which may cause serious contention. This can happen in practice because the load may still appear balanced in such a situation. The hypervisor model consists of a software layer which multiplexes the hardware among several guest operating systems. The hypervisor performs basic scheduling and memory management, and typically delegates management and I/O functions to a special, privileged guest.
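One way to avoid the worst case described above is to pin each VCPU thread of a guest to a different physical CPU. The sketch below uses the standard sched_setaffinity interface; the VCPU thread IDs are hypothetical and would normally be obtained from the QEMU process.

```c
/* Hedged sketch of explicit VCPU placement: each VCPU thread of one guest is
 * pinned to a different physical CPU so they do not contend on the same core. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

static int pin_thread_to_cpu(pid_t tid, int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(tid, sizeof(set), &set);   /* 0 on success */
}

int main(void)
{
    pid_t vcpu_tids[] = { 23001, 23002, 23003, 23004 }; /* hypothetical VCPU TIDs */
    int ncpus = 4;                                      /* physical CPUs available */

    for (int i = 0; i < 4; i++) {
        if (pin_thread_to_cpu(vcpu_tids[i], i % ncpus) != 0)
            perror("sched_setaffinity");
        else
            printf("VCPU thread %d -> CPU %d\n", vcpu_tids[i], i % ncpus);
    }
    return 0;
}
```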
Figure: the hypervisor model, with a privileged I/O proxy guest running on top of the hypervisor.
Today's hardware, however, is becoming increasingly complex. Basic scheduling has to take into account multiple hardware threads on a core, multiple cores on a socket, and multiple sockets in a system. Similarly, on-chip memory controllers require that memory management take into account the Non-Uniform Memory Access (NUMA) characteristics of a system. While great effort is invested in adding these capabilities to hypervisors, a mature scheduler and memory management system that handle these issues very well already exist in the Linux kernel. When virtualization capabilities are added to a standard Linux kernel, all the fine-tuning work that has gone (and is going) into the kernel is reused, bringing that benefit into a virtualized environment. Under this model, every virtual machine is a regular Linux process scheduled by the standard Linux scheduler, and its memory is allocated by the Linux memory allocator, with its knowledge of NUMA, integrated with the scheduler.
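The NUMA awareness mentioned above can be illustrated with explicit placement of guest memory. The sketch below uses libnuma (link with -lnuma) and is only an illustration; the KVM model described in this report relies on the Linux memory allocator's own NUMA awareness rather than explicit calls like these.

```c
/* Hedged sketch of NUMA-aware allocation: keep a block of guest memory on one
 * NUMA node. Build with -lnuma. */
#include <numa.h>
#include <stddef.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA not supported on this system\n");
        return 1;
    }

    size_t guest_ram = 256UL * 1024 * 1024;   /* 256 MiB of guest memory */
    int node = 0;                             /* keep it on NUMA node 0  */
    void *mem = numa_alloc_onnode(guest_ram, node);
    if (!mem)
        return 1;

    printf("allocated %zu bytes of guest memory on node %d\n", guest_ram, node);
    numa_free(mem, guest_ram);
    return 0;
}
```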
LINUX SCHEDULER
Linux has a well-designed scheduling framework, which includes three scheduler classes: the Real-Time class, the Completely Fair Scheduler (CFS) class, and the idle class. CFS models an ideal, precise multi-tasking CPU on real hardware that can run each task at equal speed in parallel. CFS uses a red-black tree to sort the tasks according to their virtual running time; a higher-priority task's virtual running time increases more slowly. Each task has a priority from 0 to 139 in the kernel, where the range from 0 to 99 is reserved for real-time processes. In user space, the nice value of a task lies in [-20, 19], which is mapped to the range from 100 to 139. A smaller priority number means the task is more important. Load balancing is responsible for balancing tasks among the available CPUs. CFS has both passive balancing and active balancing. Passive balancing tries to give the CPUs in the system the same load, but it may fail if all the tasks on the busiest CPU have a higher priority. Active balancing moves exactly one task from the busiest CPU's run queue to the initiator; it is more likely to succeed because it does not perform the priority comparison. CFS may even balance tasks when the local CPU becomes busier than the busiest CPU; it performs the balancing as long as the absolute value of the imbalance between these two CPUs does not become larger. KVM threads should not be load-balanced too frequently, because frequent migration may result in many cache misses and low performance.
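The priority mapping and the virtual-running-time idea above can be summarized in a short sketch. It is illustrative only: the nice-to-priority mapping follows the ranges stated in the text, while the weight formula is a simplification, not the kernel's actual weight table.

```c
/* Simplified sketch of two facts stated above: (1) nice values [-20, 19] map
 * onto kernel priorities [100, 139], and (2) a heavier (more important) task
 * accumulates virtual running time more slowly. */
#include <stdio.h>

/* nice -20..19 maps onto kernel priorities 100..139; 0..99 are real-time. */
static int nice_to_kernel_prio(int nice)
{
    return 120 + nice;
}

/* Toy vruntime update: heavier tasks advance their virtual time more slowly. */
static double vruntime_delta(double exec_time_ms, double weight)
{
    const double nice0_weight = 1024.0;   /* reference weight of a nice-0 task */
    return exec_time_ms * nice0_weight / weight;
}

int main(void)
{
    printf("nice -20 -> prio %d, nice 0 -> prio %d, nice 19 -> prio %d\n",
           nice_to_kernel_prio(-20), nice_to_kernel_prio(0),
           nice_to_kernel_prio(19));
    /* A task twice as heavy as a nice-0 task gains vruntime at half the rate. */
    printf("10 ms at weight 1024 -> +%.1f, at weight 2048 -> +%.1f\n",
           vruntime_delta(10, 1024), vruntime_delta(10, 2048));
    return 0;
}
```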
Figure: KVM architecture, with guests running in guest mode, QEMU performing I/O in user space, and the KVM driver inside the Linux kernel.
A virtual machine process, being a system virtual machine, is special compared with a normal system process, so it needs a higher priority to run its own applications. Moreover, whenever a virtual machine tries to access a privileged system resource, a VM exit and a VM context switch result, and the Linux kernel then completes the real work on behalf of the virtual machine. Since a virtual machine context switch brings system overhead, a VM being preempted by a normal process increases the overhead of the whole system, which is undesirable. When a virtual machine is running a high-priority application and a normal process arrives whose priority is higher than that of the virtual machine but lower than that of the application inside it, the Linux scheduler, unaware of this situation, forces the virtual machine to give up the processor and runs the new process. As a result, the high-priority application in the virtual machine is not executed, and an unexpected process switch occurs, increasing the virtual machine switching overhead. Because the Linux kernel does not treat each virtual machine fairly, when a kernel-based virtual machine runs a network monitoring application, it cannot guarantee that each virtual machine receives the same network packets or a balanced workload.
Figure: the proposed design, in which a VM process queue and a FIFO-based VM process scheduler are added to Linux alongside KVM, built on the Linux task_struct.
Figure: flow of the optimized scheduler: each VM process (with time quantum tqi) has its Linux task_struct checked for its average sleep time (sleep_avg) and remaining time quantum, is assigned a priority and a state (HAVE, OVER, or URGENT), and is then scheduled to run by the FCFS scheduler, with prioritized VM processes scheduled first.
3. DEVELOPING TECHNIQUES
Each virtual machine is given a certain number of time slices when it starts and is then put into the virtual machine process queue. The overall objective of the optimized scheduler is to allocate processor resources fairly, weighted by the time slices each virtual machine is allocated. Therefore, each virtual machine is given the same number of time slices so that each of them gets an equal fraction of processor resources. A virtual machine in the process queue can be in one of two states: HAVE or OVER. If it is in the HAVE state, it has time slices remaining; if it is in the OVER state, it has used up its time-slice allocation. Time slices are accounted at periodic scheduler interrupts, which occur every 10 ms; at each scheduler interrupt, the currently running virtual machine consumes 100 slices. When the time slices of all virtual machines in the system have gone negative, all VMs are given new time slices. Scheduling decisions are made such that virtual machines in the HAVE state run before virtual machines in the OVER state; a virtual machine whose allocation is OVER is executed only if there are no virtual machines in the HAVE state that are ready to run. When making scheduling decisions, the kernel-based virtual machine scheduler considers only whether a virtual machine is in the OVER or HAVE state; the amount of time slices it has remaining is irrelevant. Instead, virtual machines in the same state are scheduled in a FIFO manner: a virtual machine is inserted into the process queue after all other VMs in the queue that are in the same state, and the scheduler selects the virtual machine at the front of the process queue to execute. When a VM reaches the front of the process queue, it is allowed to run for three scheduling intervals (for a total of 30 ms) as long as it has sufficient time slices to do so. When a virtual machine's time slices are used up, it enters the OVER state.

As shown in the figure below, the kernel-based virtual machine is loaded into the Linux kernel. KVM provides n virtual machines, each of which can accommodate a guest operating system, shown as the top-layer rectangles in the diagram. The KVM process queue is isolated from the normal process queue and is scheduled separately. The scheduler control takes its input from the system clock, which lets it adapt the scheduling to the time quantum. Scheduling is done in an FCFS manner depending on the state of each virtual machine, which in turn depends on the time quantum it holds. The scheduler selects the virtual machine at the front of the queue for execution if its state is HAVE; if the virtual machine is in the OVER state, it is sent back to the tail of the queue and its state is changed to HAVE. An I/O-intensive virtual machine is given an additional state and prioritized, as described below; a simplified sketch of this policy follows the figure.
Figure: the optimized architecture, in which the Linux kernel maintains both the normal run queue (processes P1 ... Pi) and a separate KVM run queue (VM ... VMi) under scheduler control, above the hardware (CPU, memory, clock).
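The following is a minimal sketch of the HAVE/OVER policy described above. It is an interpretation of the description, not the report's actual implementation: every VM receives the same slice budget, each 10 ms tick charges the running VM 100 slices, VMs in the HAVE state are served FIFO ahead of OVER VMs, and budgets are replenished once every VM has run out; the 30 ms run quantum and queue rotation are omitted for brevity.

```c
/* Simplified HAVE/OVER time-slice scheduler for VM processes. */
#include <stdio.h>

enum vm_state { VM_HAVE, VM_OVER };

struct vm {
    int id;
    int slices;               /* remaining time slices               */
    enum vm_state state;
};

#define NVMS         3
#define SLICE_BUDGET 1000     /* equal budget given to every VM      */
#define TICK_COST    100      /* slices consumed per 10 ms tick      */

static struct vm vms[NVMS];

/* Replenish every VM's budget once all budgets are exhausted. */
static void replenish_if_all_over(void)
{
    for (int i = 0; i < NVMS; i++)
        if (vms[i].slices > 0)
            return;
    for (int i = 0; i < NVMS; i++) {
        vms[i].slices = SLICE_BUDGET;
        vms[i].state  = VM_HAVE;
    }
}

/* FIFO pick: the first HAVE VM in queue order; an OVER VM runs only if
 * no HAVE VM is ready. */
static struct vm *pick_next(void)
{
    for (int i = 0; i < NVMS; i++)
        if (vms[i].state == VM_HAVE)
            return &vms[i];
    return &vms[0];
}

int main(void)
{
    for (int i = 0; i < NVMS; i++)
        vms[i] = (struct vm){ .id = i, .slices = SLICE_BUDGET, .state = VM_HAVE };

    for (int tick = 0; tick < 40; tick++) {   /* 40 ticks = 400 ms of CPU time */
        replenish_if_all_over();
        struct vm *cur = pick_next();
        cur->slices -= TICK_COST;             /* charge the running VM         */
        if (cur->slices <= 0)
            cur->state = VM_OVER;             /* budget used up                */
    }

    for (int i = 0; i < NVMS; i++)
        printf("VM%d: %d slices left, state %s\n", vms[i].id, vms[i].slices,
               vms[i].state == VM_HAVE ? "HAVE" : "OVER");
    return 0;
}
```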
In virtualized data centers, I/O performance problems are caused by running numerous virtual machines on one server. In early server virtualization implementations, the number of virtual machines per server was typically limited to six or fewer. It was later found that seven or more applications per server could safely be run, often using 80 percent of total server capacity, an improvement over the average 5 to 15 percent utilization of non-virtualized servers.
When an I/O-intensive virtual machine is far away from the front of the queue, it has to wait until the computation-intensive virtual machines ahead of it are finished, which results in high response latency. In order to resolve this problem, an additional state, URGENT, is added. A virtual machine in this state has a higher priority than virtual machines in the HAVE and OVER states. The sleep_avg field of the Linux task_struct structure records the past behavior of the process, in particular its average sleep time. The Linux kernel uses a circular doubly-linked list of task_struct structures to store the process descriptors; this structure is declared in linux/sched.h. When a virtual machine process reaches the maximum sleep_avg value, it is considered an I/O-intensive virtual machine and is set to the URGENT state. Once a virtual machine enters the URGENT state, it no longer has to enter the process queue at the tail and wait for all other active virtual machines before it is executed; it preempts the current virtual machine and starts running. By raising a virtual machine's priority in this fashion, the response latency of I/O-intensive virtual machines can be reduced. In the optimized scheduler, when a virtual machine becomes runnable, its remaining time slices have only a limited effect on its place in the process queue: the remaining time slices determine only the virtual machine's state, and the virtual machine is always inserted into the queue after the last virtual machine in the same state. In fact, an I/O-intensive virtual machine will not be charged any time slices if it happens to block before the periodic scheduler interrupt, so I/O-intensive VMs often consume their time slices more slowly than CPU-intensive VMs. Therefore, by sorting the process queue periodically according to each virtual machine's remaining time slices, the latency before an I/O-intensive virtual machine is executed can be reduced. The overall process of the new KVM scheduler is depicted in the following flow chart.
Figure: flow chart of the new KVM scheduler.
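The URGENT-state promotion and the periodic sorting can be made concrete with the following hedged sketch. The sleep_avg threshold, the data layout, and the ordering rule are assumptions used for illustration, not details taken from the report: a VM whose average sleep time reaches the threshold is marked URGENT and placed at the head of the queue, and the remaining VMs are ordered by their remaining time slices.

```c
/* Sketch of I/O-intensive VM detection (via sleep_avg) and periodic re-sorting
 * of the VM process queue by remaining time slices. */
#include <stdio.h>
#include <stdlib.h>

enum vm_state { VM_HAVE, VM_OVER, VM_URGENT };

struct vm {
    int id;
    int slices;          /* remaining time slices                          */
    int sleep_avg;       /* average sleep time, as tracked in task_struct  */
    enum vm_state state;
};

#define SLEEP_AVG_MAX 1000   /* illustrative "maximum sleep_avg" threshold */

/* I/O-intensive detection: long average sleep time => URGENT priority. */
static void classify(struct vm *v)
{
    if (v->sleep_avg >= SLEEP_AVG_MAX)
        v->state = VM_URGENT;
}

/* Periodic re-sort: URGENT VMs first, then by remaining slices (descending). */
static int cmp(const void *a, const void *b)
{
    const struct vm *x = a, *y = b;
    if ((x->state == VM_URGENT) != (y->state == VM_URGENT))
        return (x->state == VM_URGENT) ? -1 : 1;
    return y->slices - x->slices;
}

int main(void)
{
    struct vm queue[] = {
        { 0, 200,  100, VM_HAVE },   /* CPU-bound, nearly out of slices */
        { 1, 900, 1200, VM_HAVE },   /* sleeps a lot: I/O-intensive     */
        { 2, 600,  300, VM_HAVE },
    };
    int n = 3;

    for (int i = 0; i < n; i++)
        classify(&queue[i]);
    qsort(queue, n, sizeof(queue[0]), cmp);

    for (int i = 0; i < n; i++)
        printf("pos %d: VM%d state=%d slices=%d\n", i, queue[i].id,
               queue[i].state, queue[i].slices);
    return 0;
}
```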
FIGURE 3.3 CPU UTILIZATION WHEN NETSERVER IS RUN IN ONE VM
The figure above shows the total CPU utilization over 120 seconds when netserver is run in one virtual machine, before and after optimization. From the figure it can be seen that, after optimization, the CPU utilization of the virtual machine is reduced by almost 50%.
FIGURE 3.4 CPU UTILIZATION WHEN NETSERVER IS RUN IN TWO VMs
The figure above shows the total CPU utilization over 120 seconds when netserver is run in two virtual machines, before and after optimization. From the figure, it is observed that after optimization each virtual machine's CPU utilization is reduced by nearly 50% and becomes more stable, and the virtual machines tend to balance the workload between them. Comparing the two figures, it is also seen that the sum of the two virtual machines' CPU utilization is lower than the one-VM CPU utilization.
4. CONCLUSION
In this report, the problem of scheduling kernel-based virtual machines is analyzed, and an optimized scheduling policy that handles both normal virtual machines and I/O-intensive virtual machines is proposed to improve performance.