The «t» sounds in the words «tea,» «tree,» and «but,» for instance, might be classified as separate phones, but a speech recognition system has to transcribe all of them using the letter «t.» And indeed, Belinkov and Glass found that lower levels of the network were better at
recognizing phones than higher levels, where, presumably, the distinction is less important.