Just how does naive XOR encryption show its weaknesses?
Basically, it does very little to thwart
frequency analysis. Suppose we use 8-byte blocks of
plain text and a corresponding 8-byte encryption key. (It
doesn't make much difference if blocks are longer; the same
argument applies, although longer blocks require more known cipher text.)
Take some cipher text and, for the moment, ignore
everything except bytes 1, 9, 17, etc., of the cipher text.
The plain text corresponding to this first-of-each-block
cipher text will still have the same frequency regularities
as the whole plain text. And because every one of these bytes
is XORed with the same key byte, identical plain text bytes
will always be transformed into the same cipher text byte. So by
knowing that the letter "E" makes up about 13% of plain text
(assuming it is English prose), all we need do is look for
a cipher text byte value occurring at this same frequency (we
simplify here by ignoring case and punctuation, but this is
not important for the concept). Once we find these
corresponding plain text and cipher text bytes, the key byte
is given instantly by KB = PB XOR CB, or in
the example: KB = 'E' XOR 'q'. Once we know
the key byte for this position, we can decipher every other cipher
text byte at that position, including those whose plain text is not
an "E", without further work. Repeat
the procedure for cipher text bytes [2,10,18,...] and
[3,11,19,...] and so on.
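
The procedure above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not a definitive
implementation: the function names xor_repeating and guess_key are
hypothetical, the key length is assumed to be known, and the most
frequent plain text byte is assumed to be an upper-case "E" (the same
simplification of ignoring case and punctuation made above). Note that
Python counts from 0, so bytes 1, 9, 17, ... become indexes 0, 8, 16, ...

```python
from collections import Counter
from itertools import cycle


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """Naive XOR cipher: XOR each data byte with the key, repeating the key."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def guess_key(cipher_text: bytes, key_len: int = 8, common: bytes = b"E") -> bytes:
    """Guess each key byte from the most frequent cipher text byte in its column.

    Assumes (hypothetically) that `common` is the most frequent plain text
    byte at every key position; the guess is wrong wherever that fails.
    """
    if len(cipher_text) < key_len:
        raise ValueError("need at least one full block of cipher text")
    key = bytearray()
    for pos in range(key_len):
        # Every key_len-th byte, starting at pos, shares one key byte.
        column = cipher_text[pos::key_len]
        cb, _count = Counter(column).most_common(1)[0]
        key.append(common[0] ^ cb)  # KB = PB XOR CB
    return bytes(key)
```

Because XOR is its own inverse, calling xor_repeating(cipher_text,
guess_key(cipher_text)) then deciphers the whole message. The guess is
only as good as the frequency assumption: with too little cipher text,
or with plain text in which some other byte dominates (in unprocessed
English text the space character is usually more frequent than any
letter), a different value for the common parameter would be needed.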