A LOW-COMPLEXITY, NEAR-OPTIMAL SCHEDULING POLICY FOR SOLVING A RESTLESS MULTI-ARMED BANDIT PROBLEM OCCURRING IN A SINGLE-HOP WIRELESS NETWORK
By: ÖMER MELIH GÜL
Power resources and battery lifetime are important issues for wireless networks such as wireless sensor networks (WSNs). To extend the battery lifetime, the recent advances in energy harvesting (EH) techniques propose an effective solution. EH nodes can harvest energy from environmental sources (e.g. solar, wind, vibrational, thermal) to power their sensing, computing and communication functions. In this thesis, we develop a solution to a scheduling problem under three scheduling scenarios. Firstly, we consider a single-hop wireless network where the fusion center (FC) collects data from a set of m EH nodes (e.g. nodes of a WSN). In each time slot, k of m nodes can be scheduled by the FC for transmission over k orthogonal channels. FC has no direct knowledge of battery states of nodes, or EH processes; it only has causal information of the outcomes of transmission attempts. The objective is to find a low complexity scheduling policy whereby the fusion center can collect the maximum amount of throughput in this data backlogged system, where transmission is limited by harvested energy. Energy is assumed to be stored losslessly in the batteries of nodes, up to a storage capacity (infinite capacity case is also considered). The problem is treated in finite and infinite problem horizons. Secondly, we consider the case where the infinite data backlog assumption is lifted. Thirdly, we consider a dual problem of the first scheduling problem. A low-complexity policy, UROP (Uniformizing Random Ordered Policy) is proposed, whose near optimality is shown under general energy harvesting and data arrival processes (uniform, non-uniform, independent, Markovian). Numerical examples indicate that under a reasonable-sized battery and buffer capacity, UROP uses the arriving energy and data with almost perfect efficiency. As the problem is a restless multi-armed bandit (RMAB) problem with an average reward criterion, UROP may have a wider application area than communication network.
communication networks, decision theory, energy harvesting, scheduling algorithms, wireless sensor networks, wireless networks, restless multi-armed bandit