Computing Rhyme

One of the basic pieces of information that the phonetics we derived allow us to obtain is rhyme—taken here as patterns of repetition at the end of lines (that is, no internal rhymes as of yet).

Phonetic patterns are easier to predict in Russian than in English, but automatically detecting rhyme was nonetheless rather challenging. A particularly interesting problem presented by Russian rhyme is the range of possibilities as to which parts of the ends of final words are meaningful.

The tonic vowel is always central, though insufficient in and of itself to determine rhyme; the minimum requirement for rhyme is the repetition of the tonic vowel plus an adjacent consonant, whether preceding or following. Many rhymes do not stop at these minimum requirements, incorporating more elements of the word before or after.
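The minimum requirement described above can be written as a small test. The following is only a sketch in Python rather than in the project's XSLT; the `Ending` record and the function name are hypothetical, and it assumes that each line end has already been reduced to its tonic vowel and the single consonant on either side.

```python
from typing import NamedTuple


class Ending(NamedTuple):
    """Hypothetical record for the end of a line (not the project's data model)."""
    before: str  # consonant immediately before the tonic vowel, "" if none
    tonic: str   # the stressed (tonic) vowel
    after: str   # consonant immediately after the tonic vowel, "" if none


def minimal_rhyme(a: Ending, b: Ending) -> bool:
    """Return True if two endings share the tonic vowel plus an adjacent consonant."""
    if a.tonic != b.tonic:
        # The tonic vowel is always required.
        return False
    # An empty slot does not count as a shared consonant.
    shares_before = a.before != "" and a.before == b.before
    shares_after = a.after != "" and a.after == b.after
    return shares_before or shares_after
```

Under these assumptions, two endings with the same tonic vowel but no shared neighboring consonant do not count as a rhyme, which matches the point that the vowel alone is insufficient.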

The initial strategy of the XSLT stylesheet designed to derive rhyme from the phonetics tier was to divide the end of every line into four elements (the tonic vowel, the preceding consonants, the immediately following elements, and the elements that continue further) and compare them to those of other lines. (A more finely tuned stylesheet would break these elements down further, into component characters.) As the search iterated through the poem, it strung together reference numbers based on the number of lines preceding the matching line. Hence, a line with a rhyme that occurred in the first and third lines was given a reference attribute of “1,3”.
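To make the strategy concrete, here is a rough sketch in Python, not the stylesheet itself. Several things are assumptions made for illustration: the function names, the Latin-letter vowel set, the representation of a final word as a list of one-character phones with a known tonic index, and the decision to include a line's own number in its reference string. The comparison is also simplified to the tonic vowel plus everything after it, so it ignores rhymes that rest on the preceding consonant.

```python
VOWELS = set("aeiouy")  # assumption: a simple Latin-letter transcription


def split_ending(phones: list[str], tonic_index: int) -> tuple[str, str, str, str]:
    """Split a final word into (preceding consonants, tonic vowel,
    immediately following element, remaining elements)."""
    start = tonic_index
    # Walk back over the consonants directly before the tonic vowel.
    while start > 0 and phones[start - 1] not in VOWELS:
        start -= 1
    onset = "".join(phones[start:tonic_index])
    tonic = phones[tonic_index]
    following = "".join(phones[tonic_index + 1:tonic_index + 2])
    tail = "".join(phones[tonic_index + 2:])
    return onset, tonic, following, tail


def rhyme_refs(endings: list[tuple[str, str, str, str]]) -> list[str]:
    """Compare every line to every other line and build the reference strings."""
    # Simplified rhyme key: tonic vowel plus everything after it.
    keys = [tonic + following + tail for _, tonic, following, tail in endings]
    refs = []
    for key in keys:
        # 1-based numbers of all lines sharing this key, the line itself included.
        matches = [str(n) for n, other in enumerate(keys, start=1) if other == key]
        refs.append(",".join(matches))
    return refs


# Made-up transcriptions with the index of the tonic vowel.
words = [("zimoj", 3), ("dom", 1), ("domoj", 3), ("lom", 1)]
endings = [split_ending(list(word), index) for word, index in words]
refs = rhyme_refs(endings)  # ["1,3", "2,4", "1,3", "2,4"]
```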

Unfortunately, the stylesheet is not mature enough to be left on its own to handle prepped poems. As with many of the components of this project, from adding stresses from the dictionary to calculating meter, we still have to manually check and correct the results. Additionally, the current O(n²) method of comparing every line to every other line is inefficient; this might be addressed by creating a reference database on the fly. Still, the stylesheet is on its way, and anyone is welcome to give feedback or to develop it further from where we have gotten so far.
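One way to read "a reference database on the fly" is as a lookup table built in a single pass over the poem. The sketch below is in Python and is an assumption about how such a table might work, not a description of the stylesheet; the function name is hypothetical, and each line is assumed to have already been reduced to a rhyme key such as "oj".

```python
from collections import defaultdict


def rhyme_refs_indexed(keys: list[str]) -> list[str]:
    """Build reference strings with one pass to index and one pass to look up."""
    index: dict[str, list[str]] = defaultdict(list)
    # First pass: record the 1-based line number under each rhyme key.
    for n, key in enumerate(keys, start=1):
        index[key].append(str(n))
    # Second pass: each line reads its matches straight from the table.
    return [",".join(index[key]) for key in keys]
```

Because each line is visited twice rather than compared with every other line, the number of key lookups grows linearly with the length of the poem instead of quadratically.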

The first thing we did with this information was create an SVG-driven visualization; one appears at the bottom of each of the poems listed in the menu bar to the left. This particular visualization is heavily indebted to the work of Wendell Piez at The Sonneteer.

What remains to be done is to note interesting deviations from patterns. Of the poems available so far, mere eyeballing (with the help of the visual, especially) would tell one that there is something interesting in the poem “Пушкин,” where the rhyme is relatively impoverished, with only two sounds over six lines, but shifts from couplets to alternating to a new couplet, or “hooks” two sets of single-sound poems together. Similarly, the unexpected continuity of “oj” across stanzas in “Лилии” might be worth analyzing. Fortunately, as we set up the means of analysis, our ability to count, pinpoint, and analyze such moments will only become more nuanced.