We used a long short-term memory (LSTM) network as the base architecture. The inputs were the text, its author, and phonetic features obtained with special heuristics. The network built vector representations of letters, words, sounds, authors and their poems in order to predict each next word.
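As a rough illustration of how such inputs could be combined, the following minimal sketch concatenates a word vector, an author vector and a sound vector at every step, feeds the result to a single LSTM cell, and produces a probability distribution over the next word. Everything specific here is an assumption made for illustration and is not taken from our system: the toy vocabulary, the vector sizes, the random untrained weights, and in particular the `last_vowel` function, which is only a crude stand-in for the phonetic heuristics. Letter-level and poem-level representations are left out for brevity.

```python
import math
import random

rng = random.Random(0)

# Illustrative sizes only; the real dimensions are not given in the text.
WORD_DIM, AUTHOR_DIM, SOUND_DIM, HIDDEN = 8, 4, 3, 16
INPUT = WORD_DIM + AUTHOR_DIM + SOUND_DIM


def rand_vec(n):
    return [rng.uniform(-0.1, 0.1) for _ in range(n)]


def rand_matrix(rows, cols):
    return [rand_vec(cols) for _ in range(rows)]


# Hypothetical toy vocabulary, authors and sound inventory.
vocab = ["noch", "ulitsa", "fonar", "apteka"]
authors = ["blok", "pushkin"]
vowels = "aeiouy"

# One (untrained) vector per word, author and sound.
word_emb = {w: rand_vec(WORD_DIM) for w in vocab}
author_emb = {a: rand_vec(AUTHOR_DIM) for a in authors}
sound_emb = {v: rand_vec(SOUND_DIM) for v in vowels}


def last_vowel(word):
    """Assumed stand-in for the phonetic heuristics: the last vowel of a word."""
    for ch in reversed(word):
        if ch in vowels:
            return ch
    return "a"  # arbitrary fallback for words without a vowel


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(matrix, bias, vec):
    return [sum(w * v for w, v in zip(row, vec)) + b_j
            for row, b_j in zip(matrix, bias)]


# LSTM parameters: one weight matrix and bias per gate.
GATES = ("input", "forget", "output", "candidate")
W = {g: rand_matrix(HIDDEN, INPUT + HIDDEN) for g in GATES}
b = {g: [0.0] * HIDDEN for g in GATES}

# Output layer mapping the hidden state to one score per vocabulary word.
W_out = rand_matrix(len(vocab), HIDDEN)
b_out = [0.0] * len(vocab)


def lstm_step(x, h, c):
    """One standard LSTM cell update for input x and state (h, c)."""
    z = x + h  # list concatenation of input and previous hidden state
    i = [sigmoid(v) for v in affine(W["input"], b["input"], z)]
    f = [sigmoid(v) for v in affine(W["forget"], b["forget"], z)]
    o = [sigmoid(v) for v in affine(W["output"], b["output"], z)]
    g = [math.tanh(v) for v in affine(W["candidate"], b["candidate"], z)]
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_distribution(context, author):
    """Probability of each vocabulary word following the given context."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for word in context:
        # Concatenate word, author and sound vectors into one input vector.
        x = word_emb[word] + author_emb[author] + sound_emb[last_vowel(word)]
        h, c = lstm_step(x, h, c)
    return softmax(affine(W_out, b_out, h))


probs = next_word_distribution(["noch", "ulitsa"], "blok")
```

Because the weights are random, `probs` is close to uniform; training would adjust all vectors and matrices so that the words actually following each context receive higher probability.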
The training dataset consisted of classic Russian poetry and Russian song lyrics, which together comprised 130 MB of text. The network read the poems in random order, seeing each poem 10 to 15 times on average. It learned approximately 400,000 words that occurred in the poems, and it also picked up certain patterns of word compatibility that could be regarded as a kind of morphology model. The network also tried to learn the specific features of each author: the more poems by a given author there were in the dataset, the better that author was "understood" by the network.
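The data handling described above can be sketched as follows. This is only an illustration under stated assumptions: the three-poem corpus, whitespace tokenization, the helper names `build_vocabulary` and `reading_order`, and the fixed number of 12 passes (a value picked from the 10 to 15 range) are all hypothetical, and the actual sampling procedure may have differed, since the figure in the text is an average.

```python
import random
from collections import Counter

# Hypothetical toy corpus standing in for the poems and song lyrics.
poems = [
    "noch ulitsa fonar apteka",
    "fonar apteka noch",
    "ulitsa noch",
]


def build_vocabulary(texts):
    """Count every distinct word form (assumed: lowercase, whitespace split)."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def reading_order(texts, passes, seed=0):
    """Yield the texts in a fresh random order on every pass."""
    rng = random.Random(seed)
    indices = list(range(len(texts)))
    for _ in range(passes):
        rng.shuffle(indices)  # new random order for this pass
        for i in indices:
            yield texts[i]


vocabulary = build_vocabulary(poems)

# Assumed number of passes, chosen from the 10 to 15 range in the text.
PASSES = 12
schedule = list(reading_order(poems, PASSES))
```

In a real training loop each poem drawn from `reading_order` would be turned into next-word prediction examples; with a fixed number of passes every poem is seen exactly `PASSES` times.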