Language and gesture are highly interdependent systems that reciprocally influence each other. For example, performing a gesture when learning a word or a phrase enhances its retrieval compared to purely verbal learning. Although the enhancing effects of co-speech gestures on memory are known to be robust, the underlying neural mechanisms are still unclear. Here, we summarize the results of behavioral and neuroscientific studies. They indicate that the neural representation of words consists of complex multimodal networks connecting perception and motor acts that occur during learning. In this context, gestures can reinforce the sensorimotor representation of a word or a phrase, making it resistant to decay. Gestures can also favor the embodiment of abstract words by creating this embodiment from scratch. Thus, we propose the use of gesture as a facilitating educational tool that integrates body and mind.