I want to enter a text, then compare subsequent texts to it to see whether they are similar enough to warrant the same response.

Count the words in each text and report the similarity as the number of words they share?

Suppose an array of reference texts linked to potential output texts.

For each word in the new input:
    If that word is in the reference text, add 1 to the similarity total.
    If a word in the input word's equivalence relations is in the equivalence relations of a word in the reference text, add 1 to the similarity total.
Store the final similarity total for that reference text, go on to the next reference text, and repeat the above three lines.
Choose the reference text with the highest similarity total to generate an output.

---

Reference texts are trained, or copied and pasted, from the parents of my Reddit posts. My responses are linked to the reference texts that provoked them.

Run the similarity algorithm on all the initial reference texts to see if some can be consolidated. (How to "consolidate"?)

---

March 3, 2017:

From https://github.com/sergey-alekseev/lita-answers/blob/master/lib/nlp.rb

    class Nlp
      class << self
        # TODO: use real NLP, not this silly chunk of code
        def closest_sentence(string, sentences)
          string_tokens = split_into_tokens(string)
          string_tokens_size = string_tokens.size
          closest_sentence = nil
          closest_sentence_size = 0
          sentences.each do |sentence|
            sentence_tokens = split_into_tokens(sentence)
            common_tokens = string_tokens & sentence_tokens
            bigger_sentence_size = [sentence_tokens.size, string_tokens_size].max
            if bigger_sentence_size - common_tokens.size == 1 && bigger_sentence_size > closest_sentence_size
              closest_sentence = sentence
              closest_sentence_size = bigger_sentence_size
            end
          end
          closest_sentence
        end

        def split_into_tokens(string)
          string.gsub('?', '').downcase.split
        end
      end
    end
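
---

A rough Ruby sketch of the word-counting idea from the first note, for comparison with the lita-answers snippet. Everything named here is my own placeholder, not from any library: the `EQUIVALENTS` table, `tokenize`, `equivalence_class`, `similarity`, and `best_response`. It assumes equivalence relations are stored as a hash from a word to a list of equivalent words, and that each input word scores at most 1 point (either an exact match or an equivalence match, not both). The notes do not settle either of those, so treat them as guesses.

```ruby
require 'set'

# Hypothetical equivalence table: each word maps to words treated as equivalent.
EQUIVALENTS = {
  'hi'    => %w[hello hey],
  'hello' => %w[hi hey],
  'movie' => %w[film],
  'film'  => %w[movie]
}.freeze

# Lowercase, replace punctuation with spaces, split on whitespace.
def tokenize(text)
  text.downcase.gsub(/[^a-z0-9\s']/, ' ').split
end

# The word itself plus whatever the table lists as equivalent to it.
def equivalence_class(word, equivalents)
  Set.new([word]).merge(equivalents.fetch(word, []))
end

# Similarity total of one input against one reference text.
# Assumption: each input word contributes at most 1 point.
def similarity(input, reference, equivalents = EQUIVALENTS)
  reference_words = tokenize(reference).to_set
  reference_classes = reference_words.map { |w| equivalence_class(w, equivalents) }

  tokenize(input).sum do |word|
    if reference_words.include?(word)
      1 # exact match
    elsif reference_classes.any? { |cls| cls.intersect?(equivalence_class(word, equivalents)) }
      1 # match through the equivalence relations
    else
      0
    end
  end
end

# responses_by_reference: hash of reference text => linked output text.
# Returns the output linked to the highest-scoring reference (nil if empty).
def best_response(input, responses_by_reference, equivalents = EQUIVALENTS)
  best = responses_by_reference.max_by do |reference, _output|
    similarity(input, reference, equivalents)
  end
  best && best.last
end
```

Usage would look something like `best_response("Which movie is your favorite?", { "what is your favorite film" => "Probably Alien.", "how is the weather today" => "Rainy." })`. Unlike `closest_sentence`, this always picks the top-scoring reference, even when the top score is 0, so a minimum-score cutoff is probably needed before trusting the answer.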