In this paper, an energy-efficient scheduling problem for multiple unmanned aerial vehicle (UAV) assisted mobile edge computing (MEC) is studied. In the considered model, UAVs act as mobile edge servers that provide computing services to end users with task offloading requests. Unlike existing works, we allow UAVs to determine not only their trajectories but also whether to return to the depot to replenish energy and update their application placements (due to their limited battery and storage capacities). With the aim of maximizing the long-term energy efficiency of all UAVs, i.e., the total amount of offloaded tasks computed by all UAVs over their total energy consumption, a joint optimization of UAV trajectory planning, energy renewal, and application placement is formulated. Taking into account the underlying cooperation and competition among intelligent UAVs, we reformulate this optimization problem as three coupled multi-agent stochastic games. Since prior environment information is unavailable to the UAVs, we propose a novel triple-learner-based reinforcement learning (TLRL) approach, integrating a trajectory learner, an energy learner, and an application learner, to reach equilibria. Moreover, we analyze the convergence and the complexity of the proposed solution. Simulations are conducted to evaluate the performance of the proposed TLRL approach and demonstrate its superiority over counterparts.