With this refinement in place, the OCR was finally ready to read some texts on its own. The team decided to feed it some documents from the Vatican Registers, a more than 18,000-page subset of the Secret Archives consisting of letters to European kings, rulings on legal matters, and other correspondence.
The initial results were mixed. In texts transcribed so far, a full one-third of the words contained one or more typos, places where the OCR guessed the wrong letter. If yov were tryinj to read those lnies in a bock, that would gct very aiiiioying. (The most common typos involved m/n/i confusion and another commonly confused pair: the letter f and an archaic, elongated form of s.) Still, the software got 96 percent of all handwritten letters correct. And even “imperfect transcriptions can provide enough information and context about the manuscript at hand” to be useful, says Merialdo.
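Those two figures can look contradictory, but a short calculation shows how they fit together. The sketch below is only an illustration: it assumes every letter is misread independently with the same probability, which the article does not claim and which real handwriting errors (such as the clustered m/n/i confusions) likely violate. The function name `word_error_rate` and the chosen word lengths are hypothetical.

```python
def word_error_rate(letter_accuracy: float, word_length: int) -> float:
    """Chance that a word has at least one wrong letter.

    Assumes each letter is read correctly with the same probability,
    independently of its neighbours (an illustrative simplification).
    """
    return 1.0 - letter_accuracy ** word_length


LETTER_ACCURACY = 0.96  # per-letter figure reported in the article

if __name__ == "__main__":
    # Word lengths chosen arbitrarily for illustration.
    for length in (4, 6, 8, 10):
        rate = word_error_rate(LETTER_ACCURACY, length)
        print(f"{length}-letter word: {rate:.0%} chance of at least one typo")
```

Under that simplified assumption, a ten-letter word has roughly a one-in-three chance of containing a typo, so a high per-letter accuracy and a high share of flawed words are not in conflict.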
Like all artificial intelligence, the software will improve over time, as it digests more text. Even more exciting, the general strategy of In Codice Ratio—jigsaw segmentation, plus crowdsourced training of the software—could easily be adapted to read texts in other languages. This could potentially do for handwritten documents what Google Books did...