[ un ] theoretical

Notes Research About

ASCII and XOR Ciphers Don't Mix Well

Summary

I was given a CTF challenge to find the key of a given ciphertext. The key length was given to be 6 - not crazy. Since the key size was known (although guessing the key size isn't too difficult), I partitioned the ciphertext into rows of 6 bytes - the last byte (00) was padded just to make the dimensions consistent. The hexdump is below.

	
2e	19	1f	43	16	5b		. . . C . [ 
1c	17	1f	11	17	5c		. . . . . \ 
19	14	5a	01	17	46		. . Z . . F 
0d	14	1f	0d	52	46		. . . . R F 
12	14	15	11	0b	12		. . . . . . 
1b	1f	1e	43	02	40		. . . C . @ 
1b	12	0e	0a	11	57		. . . . . W 
5a	18	09	43	06	5a		Z . . C . Z 
1b	05	5a	0a	1c	12		. . Z . . . 
0e	19	1f	0c	00	4b		. . . . . K 
56	51	0e	0b	17	40		V Q . . . @ 
1f	51	13	10	52	5c		. Q . . R \ 
15	51	1e	0a	14	54		. Q . . . T 
1f	03	1f	0d	11	57		. . . . . W 
5a	13	1f	17	05	57		Z . . . . W 
1f	1f	5a	17	1a	57		. . Z . . W 
15	03	03	43	13	5c		. . . C . \ 
1e	51	0a	11	13	51		. Q . . . Q 
0e	18	19	06	5c	00		. . . . \ . 
	
	

Then I stared at the binary for a while until some things jumped out.


00101110	00011001	00011111	01000011	00010110	01011011	
00011100	00010111	00011111	00010001	00010111	01011100	
00011001	00010100	01011010	00000001	00010111	01000110	
00001101	00010100	00011111	00001101	01010010	01000110	
00010010	00010100	00010101	00010001	00001011	00010010	
00011011	00011111	00011110	01000011	00000010	01000000	
00011011	00010010	00001110	00001010	00010001	01010111	
01011010	00011000	00001001	01000011	00000110	01011010	
00011011	00000101	01011010	00001010	00011100	00010010	
00001110	00011001	00011111	00001100	00000000	01001011	
01010110	01010001	00001110	00001011	00010111	01000000	
00011111	01010001	00010011	00010000	01010010	01011100	
00010101	01010001	00011110	00001010	00010100	01010100	
00011111	00000011	00011111	00001101	00010001	01010111	
01011010	00010011	00011111	00010111	00000101	01010111	
00011111	00011111	01011010	00010111	00011010	01010111	
00010101	00000011	00000011	01000011	00010011	01011100	
00011110	01010001	00001010	00010001	00010011	01010001	
00001110	00011000	00011001	00000110	01011100	00000000	

Among the first things that I noticed was the fact that none of the bytes had it's 8th bit set. This implied that the character encoding was ascii. Second, in the last column, all of the 7th bits were flipped to 1. After looking at an ascii code table, it would seem likely that the last key byte is probably not a letter.

I made a guess that the first byte of text would be an uppercase letter. If the first byte in the key were a lowercase character, the 6th bit would be flipped in the first byte of ciphertext. This seemed like a reasonable assumption given that no other bytes in the first column had their 6th bit set. In fact, this seemed reasonable for all the remaining columns up to the last one.

Under these assumptions, the fact that the 4th byte in the ciphertext had its 7th bit set drew my attention. Again, after looking back at my ascii chart, it seemed reasonable to think that the 4th byte could be whitespace. If that is the case, then for ciphertext $c_i$ and key $k_i$, then $c_i \oplus k_i = p_i$ where $p_i$ is the $ith$ byte of plaintext.

Doing so, give me 01100011 which is actually just the letter "c". As it happens, if the key is a lowercase character, XORing a space will simply flip the case of the key. Looking back at the canonical representation of the hexdump, its very likely that the first 5 bytes of the key are "zqzcr". Trying that out gives us


54	68	65	02	64	7a		T h e . d z 
66	66	65	72	65	7d		f f e r e } 
63	65	02	62	65	67		c e . b e g 
77	65	65	6e	02	67		w e e n . g 
68	65	6f	72	79	33		h e o r y 3 
61	6e	64	02	07	61		a n d . p a 
61	63	74	69	63	76		a c t i c v 
02	69	73	02	74	7b		. i s . t { 
61	74	02	69	6e	33		a t . i n 3 
74	68	65	6f	72	6a		t h e o r j 
2c	02	74	68	65	61		, . t h e a 
65	02	69	73	02	7d		e . i s . } 
6f	02	64	69	66	75		o . d i f u 
65	72	65	6e	63	76		e r e n c v 
02	62	65	74	77	76		. b e t w v 
65	6e	02	74	68	76		e n . t h v 
6f	72	79	02	61	7d		o r y . a } 
64	02	07	72	61	07		d . p r a p 
74	69	63	65	2e	21		t i c e . !

Its clear that we need to map the 6th byte to an "i". Using $c_i \oplus k_i = p_i$ is trivial to find the last key byte - producing the remaining byte "2". Thus, the key is "zqzcr2".


54	68	65	02	64	69		T h e . d i 
66	66	65	72	65	6e		f f e r e n 
63	65	02	62	65	74		c e . b e t 
77	65	65	6e	02	74		w e e n . t 
68	65	6f	72	79	02		h e o r y . 
61	6e	64	02	07	72		a n d . p r 
61	63	74	69	63	65		a c t i c e 
02	69	73	02	74	68		. i s . t h 
61	74	02	69	6e	02		a t . i n . 
74	68	65	6f	72	79		t h e o r y 
2c	02	74	68	65	72		, . t h e r 
65	02	69	73	02	6e		e . i s . n 
6f	02	64	69	66	66		o . d i f f 
65	72	65	6e	63	65		e r e n c e 
02	62	65	74	77	65		. b e t w e 
65	6e	02	74	68	65		e n . t h e 
6f	72	79	02	61	6e		o r y . a n 
64	02	07	72	61	63		d . p r a c 
74	69	63	65	2e	32		t i c e . 2