Abstract: A recommender system aims to build a model from user-item interactions and recommend content of interest to users, thereby improving the user experience. However, most user-item sequences are not strictly sequential; they exhibit more flexible orderings and even noise. To address this problem, a deep reinforcement learning sequential recommendation algorithm based on policy memory is proposed. The algorithm stores the user's historical interactions in a memory network, then uses a policy network to classify the user's current behavior pattern as a short-term, long-term, or global preference, and introduces an attention mechanism to generate the corresponding user memory vector. Deep reinforcement learning is used to identify items with high expected future reward. The policy of the reinforcement learning network is continuously updated through the interaction between users and items to improve recommendation accuracy. Experiments on two public datasets show that the proposed algorithm improves the Recall metric by 8.87% and 11.20%, respectively, compared with the state-of-the-art baseline algorithm.