An Exact A* Method for Deciphering Letter-Substitution Ciphers
Corlett, Eric and Penn, Gerald

Article Structure

Abstract

Letter-substitution ciphers encode a document from a known or hypothesized language into an unknown writing system or an unknown encoding of a known writing system.

Introduction

Letter-substitution ciphers encode a document from a known language into an unknown writing system or an unknown encoding of a known writing system.

Terminology

Substitution ciphers are ciphers that are defined by some permutation of a plaintext alphabet.
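Concretely, enciphering with such a permutation and inverting it takes only a few lines of Python; the shift-by-one permutation below is a made-up example for illustration, not one from the paper:

```python
import string

# A substitution cipher is a permutation of the plaintext alphabet.
# Illustrative permutation: map each letter to the next one, wrapping z -> a.
plaintext_alphabet = string.ascii_lowercase
cipher_alphabet = plaintext_alphabet[1:] + plaintext_alphabet[0]

encipher = str.maketrans(plaintext_alphabet, cipher_alphabet)
decipher = str.maketrans(cipher_alphabet, plaintext_alphabet)

ciphertext = "hello".translate(encipher)    # enciphered text
recovered = ciphertext.translate(decipher)  # applying the inverse permutation
```

Deciphering is exactly the problem of recovering the inverse permutation when only `ciphertext` is observed.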

The Algorithm

In order to efficiently search for the most likely solution for a ciphertext C, we conduct a search of the partial solutions using their trigram probabilities as a heuristic, where the trigram probability of a partial solution σ of length k is the maximum trigram probability over all strings consistent with it, meaning, in particular, that ciphertext letters not in its range can be mapped to any plaintext letter, and do not even need to be consistently mapped to the same plaintext letter in every instance.
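The search described above can be sketched as a best-first loop over partial solutions. The following is an illustrative Python skeleton, not the authors' implementation: `score` stands in for the optimistic trigram bound computed by the generalized Viterbi sweep, and all names are hypothetical:

```python
import heapq

def a_star_decipher(ciphertext, cipher_letters, plain_letters, score):
    """Best-first (A*) search over partial solutions, represented here as
    cipher-to-plaintext dictionaries.

    `score(partial, ciphertext)` is an assumed interface: it must return an
    optimistic (admissible) log-probability bound, i.e. the best trigram
    score over all plaintexts consistent with `partial`.
    """
    # Fix the most frequent cipher letters first: they constrain the most.
    order = sorted(cipher_letters, key=ciphertext.count, reverse=True)
    start = {}
    # Heap entries: (negated score, tie-breaker, partial solution).
    heap = [(-score(start, ciphertext), 0, start)]
    tick = 1
    while heap:
        _, _, partial = heapq.heappop(heap)
        if len(partial) == len(order):
            # A full solution popped first is optimal, by admissibility.
            return partial
        c = order[len(partial)]          # next cipher letter to fix
        for p in plain_letters:
            if p in partial.values():
                continue                 # the mapping must stay one-to-one
            ext = dict(partial, **{c: p})
            s = score(ext, ciphertext)
            if s > float("-inf"):        # prune zero-probability extensions
                heapq.heappush(heap, (-s, tick, ext))
                tick += 1
    return None
```

With an admissible `score`, the first full solution popped from the heap is exact, which is the point of the paper's method as opposed to greedy or sampling approaches.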

Experiment

The above algorithm is designed for application to the transliteration of electronic documents, specifically, the transliteration of websites, and it has been tested with this in mind.

Results

In our first set of tests, we measured the time consumption and accuracy of our algorithm over 10 ciphers taken from random texts that were 6000 characters long.

Conclusions

In the above paper, we have presented an algorithm for solving letter-substitution ciphers, with an eye towards discovering unknown encoding standards in electronic documents on the fly.

Topics

language model

Appears in 17 sentences as: language model (16) language models (2) language model’s (1)
In An Exact A* Method for Deciphering Letter-Substitution Ciphers
  1. If the text from which a language model is trained is of a different genre than the plaintext of a cipher, the unigraph letter frequencies may differ substantially from those of the language model, and so frequency counting will be misleading.
    Page 1, “Introduction”
  2. Such inefficiency indicates that integer programming may simply be the wrong tool for the job, possibly because language model probabilities computed from empirical data are not smoothly distributed enough over the space in which a cutting-plane method would attempt to compute a linear relaxation of this problem.
    Page 2, “Introduction”
  3. This difference in difficulty, while real, is not inherent, but rather an artefact of the character-level n-gram language models that they (and we) use, in which preponderant evidence of differences in short character sequences is necessary for the model to clearly favour one letter-substitution mapping over another.
    Page 2, “Introduction”
  4. In the present method, the role of the language model can be acutely perceived; both the time complexity of the algorithm and the accuracy of the results depend crucially on this characteristic of the language model.
    Page 2, “Introduction”
  5. Every possible full solution to a cipher C will produce a plaintext string with some associated language model probability, and we will consider the best possible solution to be the one that gives the highest probability.
    Page 3, “Terminology”
  6. For the sake of concreteness, we will assume here that the language model is a character-level trigram model.
    Page 3, “Terminology”
  7. Backpointers are necessary to reference one of the two language model probabilities.
    Page 4, “The Algorithm”
  8. Cells that would produce inconsistencies are left at zero, and these as well as cells that the language model assigns zero to can only produce zero entries in later columns.
    Page 4, “The Algorithm”
  9. The np × np cells of every column i do not depend on each other, but only on the cells of the previous two columns i − 1 and i − 2, as well as the language model.
    Page 4, “The Algorithm”
  10. In the event that a trigram or bigram found in the plaintext was not counted in the language model, add-one smoothing was used.
    Page 5, “Experiment”
  11. The character-level language model we used was developed from the first 1.5 million characters of the Wall Street Journal section of the Penn Treebank corpus.
    Page 5, “Experiment”
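A character-level trigram model with the add-one smoothing mentioned in item 10 can be sketched as follows; the class name and interface are illustrative assumptions, not the authors' code:

```python
from collections import Counter
from math import log

class CharTrigramLM:
    """Character-level trigram model with add-one smoothing (illustrative)."""

    def __init__(self, text, alphabet):
        self.alphabet = alphabet
        # Raw trigram and bigram counts from the training text.
        self.tri = Counter(text[i:i + 3] for i in range(len(text) - 2))
        self.bi = Counter(text[i:i + 2] for i in range(len(text) - 1))

    def logprob(self, trigram):
        # Add-one smoothing: unseen trigrams get a small nonzero probability,
        # so a single unattested trigram cannot zero out a whole solution.
        num = self.tri[trigram] + 1
        den = self.bi[trigram[:2]] + len(self.alphabet)
        return log(num / den)

    def score(self, text):
        # Log-probability of a string as a sum of trigram log-probabilities.
        return sum(self.logprob(text[i:i + 3]) for i in range(len(text) - 2))
```

A frequent trigram from the training text should score higher than an unseen one, which is what lets the model favour one letter-substitution mapping over another.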


Viterbi

Appears in 8 sentences as: Viterbi (9)
In An Exact A* Method for Deciphering Letter-Substitution Ciphers
  1. In this paper, we introduce an exact method for deciphering messages using a generalization of the Viterbi algorithm.
    Page 1, “Abstract”
  2. Our algorithm operates by running a search over the n- gram probabilities of possible solutions to the cipher, using a generalization of the Viterbi algorithm that is wrapped in an A* search, which determines at each step which partial solutions to expand.
    Page 1, “Introduction”
  3. Given a partial solution σ of length n, we can extend σ by choosing a ciphertext letter c not in the range of σ, and then use our generalization of the Viterbi algorithm to find, for each p not in the domain of σ, a score to rank the choice of p for c, namely the trigram probability of the extension σp of σ.
    Page 3, “The Algorithm”
  4. Our generalization of the Viterbi algorithm, depicted in Figure 1, uses dynamic programming to score every immediate extension of a given partial solution in tandem, by finding, in a manner consistent with the real Viterbi algorithm, the most probable input string given a set of output symbols, which in this case is the cipher C.
    Page 3, “The Algorithm”
  5. Unlike the real Viterbi algorithm, we must also observe the constraints of the input partial solution’s mapping.
    Page 3, “The Algorithm”
  6. The real Viterbi algorithm lacks these final two constraints, and would only store a single cell at G(i, 1).
    Page 4, “The Algorithm”
  7. Algorithm 2 Generalized Viterbi Algorithm GVit(σ, c, r) Input: partial solution σ, ciphertext character c, and index r into C.
    Page 6, “Experiment”
  8. Asymptotically, the time required for each sweep of the Viterbi algorithm increases, but this is more than offset by the decrease in the number of required sweeps.
    Page 7, “Results”
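The dynamic program sketched in items 4 to 6 can be illustrated as follows. This is a simplified sketch, not the paper's Algorithm 2: it writes the partial solution as a cipher-to-plaintext dictionary, assumes a `logprob` function giving log P(p3 | p1 p2), and omits the backpointers and the separate bigram/unigram treatment of the first two positions:

```python
from math import inf

def gvit(cipher, partial, alphabet, logprob):
    """Upper bound on the trigram log-probability of any plaintext consistent
    with `partial` (cipher->plain), in the spirit of the generalized Viterbi
    sweep.  Assumes len(cipher) >= 2.

    Unmapped cipher letters may take any plaintext value, independently at
    each position; relaxing consistency is what makes the bound admissible.
    """
    def allowed(c):
        return [partial[c]] if c in partial else list(alphabet)

    # chart[(p1, p2)] = best score of a prefix whose last two plaintext
    # letters are p1, p2 (first two positions contribute no trigram here).
    chart = {(p1, p2): 0.0
             for p1 in allowed(cipher[0]) for p2 in allowed(cipher[1])}
    for c in cipher[2:]:
        nxt = {}
        for (p1, p2), s in chart.items():
            for p3 in allowed(c):
                cand = s + logprob(p1, p2, p3)
                key = (p2, p3)
                if cand > nxt.get(key, -inf):
                    nxt[key] = cand
        chart = nxt  # columns depend only on the previous chart, as in item 9
    return max(chart.values(), default=-inf)
```

Fixing more cipher letters can only shrink the set of consistent plaintexts, so the bound tightens monotonically as the A* search extends a partial solution.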


error rate

Appears in 3 sentences as: error rate (3)
In An Exact A* Method for Deciphering Letter-Substitution Ciphers
  1. We judged the quality of the decoding by measuring the percentage of characters in the cipher alphabet that were correctly guessed, and also the word error rate of the plaintext generated by our solution.
    Page 5, “Experiment”
  2. We did not count the accuracy or word error rate for unfinished ciphers.
    Page 5, “Experiment”
  3. We can also observe that even when there are errors (e.g., in the size 1000 cipher), the word error rate is very small.
    Page 6, “Results”
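The two metrics in item 1, character accuracy over the cipher alphabet and word error rate, can be computed as follows (a standard sketch; representing keys as cipher-to-plaintext dictionaries is an assumption):

```python
def character_accuracy(true_key, guessed_key):
    """Fraction of cipher-alphabet letters mapped to the correct plaintext
    letter.  Keys are cipher->plain dictionaries (illustrative)."""
    return sum(guessed_key.get(c) == p for c, p in true_key.items()) / len(true_key)

def word_error_rate(reference, hypothesis):
    """Standard WER: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j].
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)
```

Note that a single wrong letter in the key can leave most words intact, which is why the word error rate can stay very small even when character accuracy is below 100%.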
