In enterprise networks, companies interact on a temporary basis through client–server relationships between order agents (clients) and resource agents (servers) acting as autonomic managers. In this work, the autonomic manufacturing execution system (@MES) proposed by Rolón and Martinez (2012) is extended to allow selfish behavior and adaptive decision‐making in distributed execution control and emergent scheduling. Agent learning in the @MES is addressed by rewarding order agents so that they continuously optimize their processing routes based on the cost and reliability of alternative resource agents. Service providers (resource agents) are in turn rewarded so that they learn the quality level corresponding to each task, which is used to define the processing time and cost of each client request. Two reinforcement learning algorithms have been implemented to simulate learning curves of client–server relationships in the @MES. Emergent behaviors obtained through generative simulation in a case study show that, despite selfish behavior and policy adaptation in order and resource agents, the @MES is able to reject significant disturbances and handle unplanned events successfully.