Data Communications and Computer Networks Purdue University HAVING FUN WITH CODES

HAVING FUN WITH CODES1. Read about Cesar Ciphers.
Visit the Cryptography Tutorial and decode the three strings shown.
Cut and paste the decoded answer and record the value of the shift key for each.
1a ___________________
_____
1b ___________________
_____
1c ___________________
_____
2. Take the encoded text below and perform character frequency analysis on it
using the provided tool.
ZWLS XVWSO NRA XOTOR UONSX NBW WLS ZNEIOSX HSWLBIE ZWSEI WR EIPX
VWREPRORE N ROF RNEPWR, VWRVOPTOA PR JPHOSEU, NRA AOAPVNEOA EW EIO
DSWDWXPEPWR EINE NJJ QOR NSO VSONEOA OKLNJ. RWF FO NSO ORBNBOA PR N
BSONE VPTPJ FNS, EOXEPRB FIOEIOS EINE RNEPWR, WS NRU RNEPWR XW
VWRVOPTOA NRA XW AOAPVNEOA, VNR JWRB ORALSO. FO NSO QOE WR N BSONE
HNEEJOZPOJA WZ EINE FNS. FO INTO VWQO EW AOAPVNEO N DWSEPWR WZ EINE
ZPOJA, NX N ZPRNJ SOXEPRB DJNVO ZWS EIWXO FIW IOSO BNTO EIOPS JPTOX EINE
EINE RNEPWR QPBIE JPTO. PE PX NJEWBOEIOS ZPEEPRB NRA DSWDOS EINE FO
XIWLJA AW EIPX. HLE, PR N JNSBOS XORXO, FO VNRRWE AOAPVNEO, FO VNRRWE
VWRXOVSNEO, FO VNRRWE INJJWF EIPX BSWLRA. EIO HSNTO QOR, JPTPRB NRA
AONA, FIW XESLBBJOA IOSO, INTO VWRXOVSNEOA PE, ZNS NHWTO WLS DWWS
DWFOS EW NAA WS AOESNVE. EIO FWSJA FPJJ JPEEJO RWEO, RWS JWRB
SOQOQHOS FINE FO XNU IOSO, HLE PE VNR ROTOS ZWSBOE FINE EIOU APA IOSO. PE
PX ZWS LX EIO JPTPRB, SNEIOS, EW HO AOAPVNEOA IOSO EW EIO LRZPRPXIOA
FWSC FIPVI EIOU FIW ZWLBIE IOSO INTO EILX ZNS XW RWHJU NATNRVOA. PE PX
SNEIOS ZWS LX EW HO IOSO AOAPVNEOA EW EIO BSONE ENXC SOQNPRPRB
HOZWSO LX—EINE ZSWQ EIOXO IWRWSOA AONA FO ENCO PRVSONXOA AOTWEPWR
EW EINE VNLXO ZWS FIPVI EIOU BNTO EIO JNXE ZLJJ QONXLSO WZ AOTWEPWR—
EINE FO IOSO IPBIJU SOXWJTO EINE EIOXO AONA XINJJ RWE INTO APOA PR TNPR—
EINE EIPX RNEPWR, LRAOS BWA, XINJJ INTO N ROF HPSEI WZ ZSOOAWQ—NRA EINE
BWTOSRQORE WZ EIO DOWDJO, HU EIO DOWDJO, ZWS EIO DOWDJO, XINJJ RWE
DOSPXI ZSWQ EIO ONSEI.
2a.Take a screen snapshot and cut and paste the result to your report.
3. Next, take the encrypted text and change the replacement letters using this tool and try to
decode the text. Some of the characters will be wrong. You just have to experiment a little to
find the right match.
Here are some hints:
Only the first two characters from the letter frequency analysis are correct. The others
are close but not exact.
Experiment with specific letter substitutions to figure out the words. For instance:



One letter words are likely: a or I
Two letter words might be: if it of an or on in at as to
Three letter words might include: the (the most common) for and can but
not
If you get stuck, here are some more hints:
This is a partial decryption set showing 9 of the 6 characters of the alphabet.
Figure 1: A partial decryption set showing 9 of the 6 characters of the alphabet.
Once you have deciphered the text complete the following:
3a. Cut and paste the decoded text to your report.
3b. Cut and paste the resultant substitution matrix into your report as well.
4. For an extra challenge, try this text. It is trickier:
GHRX AOLMMLT, RES GCB XMLGCZ GNYBX SLS TZOB RES TLVAMB LE GCB HRAB:
RMM VLVXZ HBOB GCB ANONTNYBX, RES GCB VNVB ORGCX NPGTORAB. ABHROB
GCB URAABOHNJD, VZ XNE! GCB URHX GCRG ALGB, GCB JMRHX GCRG JRGJC!
ABHROB GCB UPAUPA ALOS, RES XCPE GCB KOPVLNPX ARESBOXERGJC! CB GNND
CLX YNOWRM XHNOS LE CRES: MNET GLVB GCB VREQNVB KNB CB XNPTCG XN
OBXGBS CB AZ GCB GPVGPV GOBB, RES XGNNS RHCLMB LE GCNPTCG. RES, RX LE
PKKLXC GCNPTCG CB XGNNS, GCB URAABOHNJD, HLGC BZBX NK KMRVB, JRVB
HCLKKMLET GCONPTC GCB GPMTBZ HNNS RES APOAMBS RX LG JRVB! NEB, GHN!
NEB, GHN! RES GCONPTC RES GCONPTC GCB YNOWRM AMRSB HBEG XELJDBOXERJD CB MBKG LG SBRS, RES HLGC LGX CBRS CB HBEG TRMPVWCLET ARJD. RES,
CRX GCNP XMRLE GCB URAABOHNJD? JNVB GN VZ ROVX, VZ ABRVLXC ANZ N
KORAUNPX SRZ! JRMMNNC! JRMMRZ! CB JCNOGMBS LE CLX UNZ.
Use the same strategy as before to decode the words.
Hint…some words may not look right.
Here is a partial decode hint showing 10 characters.
Figure 2: A partial decode hint showing 10 characters.
4a. Cut and paste the decoded text to your report.
4b. Cut and paste the resultant substitution matrix into your report as well.
5. Write a one page summary of what you have done and what you think of these ciphers and
tools.
IT 543
CRYPTOGRAPHY CONCEPTS AND TECHNIQUES
UNIT 1 SEMINAR
Logistics
• My background
• Seminars
• What I expect to see in this course.
• What the course is and what it isn’t
• Do you have the book?
• Do you understand the
announcement about AI?
• Why are you in this course?
Course outcomes

IT543-1: Examine the historical development and basic principles
of cryptography.

IT543-2: Evaluate various cryptographic methods.

IT543-3: Develop secure communications using cryptographic
methods.

IT543-4: Design an implementation of cryptographic methods for
an organization.
Fundamental Cryptographic Concepts
Basic Terminology

Plaintext



Process of converting from plaintext to
ciphertext
Deciphering or decryption


The coded message

Enciphering or encryption


The original message
Ciphertext


Restoring the plaintext from the
ciphertext
Cryptography

Study of encryption

Cryptographic system or cipher
– Schemes used for encryption
Cryptanalysis
– Techniques used for
deciphering a message
without any knowledge of the
enciphering details
Cryptology
– Areas of cryptography and
cryptanalysis together
Fundamental Cryptographic Concepts
Substitution Technique
§ Is one in which the letters of plaintext are
replaced by other letters or by numbers or
symbols
§ If the plaintext is viewed as a sequence of bits,
then substitution involves replacing plaintext bit
patterns with ciphertext bit patterns
Fundamental Cryptographic Concepts
Caesar Cipher
§ Simplest and earliest known use of a substitution
cipher
§ Used by Julius Caesar
§ Involves replacing each letter of the alphabet with the
letter standing three places further down the alphabet
§ Alphabet is wrapped around so that the letter
following Z is A
plain: meet me after
the toga party
cipher: PHHW PH DIWHU WKH WRJD SDUWB
Walk Through
Fundamental Cryptographic Concepts
Caesar Cipher Algorithm
§ Can define transformation as:
abcdefghijklmnopqrstuvwxyz
DEFGHIJKLMNOPQRSTUVWXYZABC
§ Mathematically give each letter a number
abcdefghij k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
§ Algorithm can be expressed as:
c = E(3, p) = (p + 3) mod (26)
– A shift may be of any amount, so that the general Caesar algorithm is:
C = E(k , p ) = (p + k ) mod 26
§ Where k takes on a value in the range 1 to 25; the decryption algorithm is
simply:
p = D(k , C ) = (C – k ) mod 26
Frequency Analysis
1. Look at the Letters
2. Compare to Normal Letters
3. Match the Letters
4. Guess the Key
5. Try It Out
6. Read the Secret Messages
Fundamental Cryptographic Concepts
Symmetric Encryption
§ Also referred to as conventional encryption
or single-key encryption
§ Was the only type of encryption in use prior
to the development of public-key encryption
in the 1970s
§ Remains by far the most widely used of the
two types of encryption
Fundamental Cryptographic Concepts
Symmetric Encryption
§ Also referred to as conventional encryption
or single-key encryption
§ Was the only type of encryption in use prior
to the development of public-key encryption
in the 1970s
§ Remains by far the most widely used of the
two types of encryption
Fundamental Cryptographic Concepts
Simplified Model of Symmetric Encryption
Fundamental Cryptographic Concepts
Model of Symmetric Cryptosystem
Fundamental Cryptographic Concepts
Cryptographic Systems
Characterized along three independent dimensions:
The type of operations
used for transforming
plaintext to ciphertext
The number of keys
used
The way in which the
plaintext is processed
Substitution
Symmetric,
single-key, secretkey, conventional
encryption
Block cipher
Transposition
Asymmetric, twokey, or public-key
encryption
Stream cipher
Fundamental Cryptographic Concepts
Cryptanalysis and Brute-Force Attack
Cryptanalysis
Brute-force attack
• Attack relies on the nature of the
algorithm plus some knowledge
of the general characteristics of
the plaintext
• Attack exploits the characteristics
of the algorithm to attempt to
deduce a specific plaintext or to
deduce the key being used
• Attacker tries every possible
key on a piece of ciphertext
until an intelligible translation
into plaintext is obtained
• On average, half of all possible
keys must be tried to achieve
success
Fundamental Cryptographic Concepts
Types of Attacks on Encrypted Messages
Czesław Kościelny · Mirosław Kurkowski
Marian Srebrny
Modern
Cryptography
Primer
Theoretical Foundations
and Practical Applications
Modern Cryptography Primer
lwilliams4@kaplan.edu
Czesław Kościelny r Mirosław Kurkowski r
Marian Srebrny
Modern
Cryptography
Primer
Theoretical Foundations
and Practical Applications
lwilliams4@kaplan.edu
Czesław Kościelny
Faculty of Information Technology
Wrocław School of Information Technology
Wrocław, Poland
Mirosław Kurkowski
Inst. of Computer and Information Sciences
Czestochowa University of Technology
Czestochowa, Poland
and
European University of Information
Technology and Economics
Warsaw, Poland
Marian Srebrny
Institute of Computer Science
Polish Academy of Sciences
Warsaw, Poland
and
Section of Informatics
University of Commerce
Kielce, Poland
ISBN 978-3-642-41385-8
ISBN 978-3-642-41386-5 (eBook)
DOI 10.1007/978-3-642-41386-5
Springer Heidelberg New York Dordrecht London
Library of Congress Control Number: 2013955235
© Springer-Verlag Berlin Heidelberg 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of
this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publisher’s location, in its current version, and permission for use must always be obtained from Springer.
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations
are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any
errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect
to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
lwilliams4@kaplan.edu
Preface
For centuries, the need to ensure confidentiality of some gathered or transmitted information got a lot of attention in various political or military contexts. Nowadays,
in the era of a general necessity for privacy, and the conscious awareness of one’s
rights to it, cryptography is found useful in a wide range of practical applications.
For the most part, it is used for securing confidentiality in interpersonal computerized communication. The turn of the 21st century is sometimes called the Internet
age, computer era; communication takes place instantly and without hindrance. Obviously, no one can imagine the functioning of various types of communication and
telecommunication networks without the appropriate security measures against undesirable listening in on our information.
Modern cryptography would not exist without solid mathematical foundations,
especially in number theory. The recent and most advanced security algorithms are
built on such arithmetic constructs as integer arithmetic divisibility, modulo operations, prime numbers, or the Euler function.
Today’s societies depend to a large extent on computers which process huge
amounts of information, often transferred via telecommunication networks, and
stored in databases. This information often needs adequate security protection
against being read by unauthorized users of computer systems and networks, particularly illegal users. Cryptography provides economical means that enable this
protection. The experts in cryptography work on more and more efficient methods
of ensuring the secrecy of information and electronic documents which require it.
Striking advances in the proliferation of electronic data storage, linkage, and transmission have created significant new challenges in maintaining confidentiality and
developing adequate methods of authentication. The ambition of cryptanalysis and
cryptanalysts is to break the security codes and forge encrypted messages in such a
way that they look authentic.
Until quite recently, cryptography was applied only in the area of military forces
and diplomacy. This is also why cryptographers usually worked in agencies dealing
with state security, and all research work concerning cryptography, as well as cryptanalysis, was classified. It was not until the late 1960s that a multinational group
of scholars, who were not controlled by security agencies, became interested in the
v
lwilliams4@kaplan.edu
vi
Preface
problems of cryptology and started to publish their research papers on this subject,
thanks to which cryptographic data protection was found useful also in various civilian fields. The new paradigm requires that the cryptographic algorithms be publicly
known, whereas only the private keys must be secret. Nowadays, the public access
to the algorithms is treated as a safeguard of their security, assurance that there are
no flaws due either to poor, unprofessional work by their designers or to deliberate
insertion of so-called hidden backdoors (e.g., collecting copies of private keys).
Cryptographic methods are the most efficient ways of secure protection of modern telecommunication network users against computer break-ins, which have by
now become a plague. That is why business promotes the use of cryptography since
a basic requirement for worldwide economic growth is the development of secure
worldwide computer networks underlying the information society economic infrastructure. In this context, possible administrative limitations on the use of cryptography are considered responsible for a substantial decline in a country’s attractiveness
in the eyes of foreign investors. Cryptographic security means are inevitable in order to improve trading and legal proceedings in the electronic economy, as well as
to ensure at least the minimum of civil privacy and freedom.
The aim of this book is to introduce the currently most interesting and most
important issues of modern applied cryptography in the technological practice of
telecommunication networks, along with the necessary basic mathematics. Cryptography is an area on the edge of mathematics and practical software engineering.
Like no other, it combines immense, challenging unsolved mathematical problems
with the issues of authentic use in practical security tools in currently deployed vital
data communication systems.
We present all the best known and most often used technologies, algorithms and
protocols, and methods of their design and analysis. The algorithms are presented
in readable pseudocode; i.e., written in English with some mathematical or programming symbols, or simple graphics and diagrams. We will not go into details
on finding implementation bugs or methods of program engineering depending on
the features of a particular programming environment, specification and implementation in any favorite programming language.
We bring particular attention in this book to performance analysis of the presented algorithms and protocols because since the late 1980s efficiency has essentially become the central concept to understanding modern cryptographic mechanisms, their usage, and many related problems, especially the problems of breaking
the codes.
There are many very good publications on the market devoted to cryptography
and/or its usage. However, only very few of them can serve as course textbooks.
The material they present seems to us either extensively broad or too narrow for a
graduate course, often too mathematical, and therefore very difficult for the majority
of student readers with no deep mathematical background.
This book is written at the level of a graduate lecture course textbook for students
of any technical university in the European Union or North America. As prerequisites it requires only some very basic elementary mathematical experience in algebra, number theory, probability, data structures, as well as the design and efficient
lwilliams4@kaplan.edu
Preface
vii
analysis of algorithms. The material presented in the book can constitute a one-year
graduate course, as well as providing material for shorter courses on selected topics
to be used without the need to search other parts of the book. Each chapter contains
all the necessary background information concerning the problems being discussed.
Selected chapters can constitute a reasonable basis for further studies of the subject,
e.g., in the form of seminar or term credit papers, etc. The references provided will
definitely be of help in completing such tasks. For the same reason, this book can be
treated as a useful source of information in the field of data and network transactions
security for practitioners and researchers just after their studies.
Today’s cryptography is a very broad and lively field. We are aware that many
areas need much broader treatment. For example, the elliptic curve algorithms,
quantum cryptography, secret sharing, and various cryptanalytic techniques. Cryptographic hash algorithms are limited in this book to signaling the basic approaches
and challenges, with no coverage of the most recent very interesting advances. These
areas have got a lot of attention in the last few years, with many different methods
and their own challenges. These topics will be covered in full detail in our follow-up
textbook to appear soon.
This book consists of nine chapters discussing today’s actual practice in applied
cryptography from the very basics to strong state-of-the-art security algorithms and
protocols.
The first chapter introduces the basic concepts of cryptography. The general diagram of encryption/decryption, as well as the notion of a cryptographic algorithm
and the definition of cryptographic keys are discussed. The rules for building strong
cryptographic codes are introduced. The chapter also presents the fundamental notions of theoretical and practical computational complexity, and discusses its meaning for determining the difficulty of breaking cryptosystems. Next, we introduce
codes known from history such as Caesar’s ancient code, and Playfair and Enigma,
which were applied during the World Wars.
Modern cryptography would not exist without solid mathematical foundations,
therefore in Chap. 2 we recollect and present mathematical concepts and properties
required for continuing the course. Elements of the theory of algebraic structures, as
well as elements of number theory, are presented. Also, we present simple arithmetic
algorithms applied in cryptography. The chapter ends with a discussion on currently
applied algorithms for testing integer primality, and computationally hard problems
in number theory.
In Chap. 3 the most important symmetric ciphers are presented, among them the
standards of symmetric encryption applied in widespread practice today. The DES
(Data Encryption Standard) algorithm, its modifications and modes of operation
are given and discussed in detail. A lot of attention is focused on the most recent
American standard for symmetric cipher, the AES (Advanced Encryption Standard)
algorithm. The IDEA algorithm, as well as the algorithms of the RC family are presented. As an interesting detail illustrating the resistance of encryption algorithms
against attempts to break them, we present the process of the global competition in
breaking RC algorithms, and the results.
In Chap. 4 the reader will find exact descriptions of asymmetric algorithms, beginning with the Diffie-Hellman scheme, through the ElGamal algorithm. Next, the
lwilliams4@kaplan.edu
viii
Preface
well known RSA algorithm, and various issues concerning unceasing attempts to
break it are discussed. An interesting detail is the discussion on the results of the
RSA factorization challenge, illustrating the cryptographic power of the RSA code.
Chapter 5 presents one of the most important modern applications of cryptography, namely the electronic signature. The general scheme, as well as several of the
currently most essential and most interesting applied algorithms for generation and
verification of the validity of e-signature are covered. We present various algorithms
of digital signature and hash functions. We discuss the current issues concerning the
usage of these functions, and their security.
In Chap. 6 the reader will find the exact description of the popular cryptosystem
PGP (Pretty Good Privacy). The overall scheme of the system and the algorithms
used in it are surveyed. The installation and the usage of PGP are described, encryption and signing documents (messages, e-mails, files) among others. In this chapter,
the authors introduce other, non-commercial solutions enabling the application of
strong cryptography by any computer user.
Chapter 7 is devoted to the public key infrastructure as a solution enabling application of the electronic signature for business and legal proceedings in the form
required by legislation in most countries. The role of the so-called trusted third party
in contemporary solutions, as well as the issues concerning certification of cryptographic keys, are presented.
Another important feature of cryptography in day-to-day reality is the cryptographic protocols applied often in mass scale in all kinds of communication via computer networks, especially for entity authentication and preventing identity theft.
The goals to be achieved by the cryptographic protocols, as well as their examples,
are presented. Issues and problems of their specification, design and application,
methods of complexity analysis as well as methods of verification of correctness,
and the security of cryptographic protocols are introduced and covered more broadly
than in any other textbook available so far.
In Chap. 9 the remaining aspects of the application of cryptography in data and
transaction security are taken up. The problems and solutions of preserving the secrecy and privacy of electronic mail, as well as secure exchange of documents in
electronic form are discussed. The commonly applied SSH (Secure SHell) and SSL
(Secure Socket Layer) protocols are also studied.
Like every book, ours is surely not flawless. In case of any errors, mistakes or inaccuracies in this publication, we would appreciate if the reader could kindly submit
them to us via e-mail at cryptobook@icis.pcz.pl. Any feedback will be appreciated.
In return, we promise an up-to-date list of corrections, a constantly revised corrigendum.
Our thanks for help and support in various stages of the process of writing and
editing this book are due to many of our friends and collaborators, as well as our
students, audience and participants in lectures and seminars given by each of us.
Our special thanks must be given to Professor Leonard Bolc (1934–2013) of the
Institute of Computer Science of the Polish Academy of Sciences, without whose
kind and gentle but tenaciously ongoing systematic encouragement this book would
most definitely never have come into existence. Very special acknowledgments go
lwilliams4@kaplan.edu
Preface
ix
to Professor Andrzej Borzyszkowski for his fruitful cooperation on the early versions of the materials for the chapter on the security protocols, their specification
and verification of correctness. Similarly to Maciej Orzechowski. The third author
gratefully acknowledges many useful conversations and discussions with Professors
Paweł Morawiecki, Stanisław Spież, and Jerzy Urbanowicz (1951–2012). The latter
was entirely responsible for dragging the third author in a friendly manner into the
world of cryptologic research and practice, and for educating him on the field’s special beauty and problems, splendors and shadows. The first author acknowledges
support from Wrocław School of Information Technology. The second author acknowledges support from Czestochowa University of Technology and the European
University of Information Technology and Economics, Warsaw. We would also like
to thank Kasia Grygiel, Gosia Berezowska, Ewelina Gajek and Janek Jay Halicki
for their help in the preparation of the English version of our book. Last but not
least, the authors thank the copyeditor for his excellent careful work, and Springer’s
Ronan Nugent for successfully and nicely driving us through the whole editorial and
production process.
Poland
August 2013
Czesław Kościelny
Mirosław Kurkowski
Marian Srebrny
lwilliams4@kaplan.edu
Contents
1
Basic Concepts and Historical Overview . . . . . . . . . . . . . . .
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1.1 Encryption . . . . . . . . . . . . . . . . . . . . . . . . .
1.1.2 Algorithms and Keys . . . . . . . . . . . . . . . . . . .
1.1.3 Strong Cryptosystems Design Principles . . . . . . . . .
1.1.4 Computational Complexity of Algorithms . . . . . . . .
1.2 Simple Stream Ciphers . . . . . . . . . . . . . . . . . . . . . .
1.2.1 Caesar Cipher . . . . . . . . . . . . . . . . . . . . . . .
1.2.2 XOR Encryption (Vernam Cipher) . . . . . . . . . . . .
1.3 Simple Block Ciphers . . . . . . . . . . . . . . . . . . . . . . .
1.3.1 Permutations . . . . . . . . . . . . . . . . . . . . . . . .
1.3.2 Transpositions . . . . . . . . . . . . . . . . . . . . . . .
1.3.3 Example of a Simple Transposition Cipher . . . . . . . .
1.3.4 Example of a Substitution Block Cipher . . . . . . . . .
1.3.5 Example of a Product Cipher . . . . . . . . . . . . . . .
1.3.6 Generalized Substitutions—Bigrams . . . . . . . . . . .
1.3.7 Polyalphabetic Substitutions . . . . . . . . . . . . . . . .
1.3.8 Vigenère Cipher . . . . . . . . . . . . . . . . . . . . . .
1.4 Wheel Cipher and Rotor Machines . . . . . . . . . . . . . . . .
1.4.1 Wheel Cipher . . . . . . . . . . . . . . . . . . . . . . .
1.4.2 Rotor Machines . . . . . . . . . . . . . . . . . . . . . .
1.5 Enigma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.5.1 History of the Enigma . . . . . . . . . . . . . . . . . . .
1.5.2 Construction of the Enigma . . . . . . . . . . . . . . . .
1.5.3 Enigma Operation . . . . . . . . . . . . . . . . . . . . .
1.5.4 Breaking the Enigma Cipher . . . . . . . . . . . . . . .
1
1
1
2
4
5
10
10
11
13
13
13
14
17
17
18
20
20
21
21
22
23
24
26
29
31
xi
lwilliams4@kaplan.edu
xii
Contents
2
Mathematical Foundations of Cryptography . . . . . . . . . . . . .
2.1 Basic Concepts in the Theory of Algebraic Structures . . . . . .
2.1.1 Groups . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.1.2 Rings and Fields . . . . . . . . . . . . . . . . . . . . . .
2.1.3 Finite Fields . . . . . . . . . . . . . . . . . . . . . . . .
2.1.4 Polynomial Ring . . . . . . . . . . . . . . . . . . . . . .
2.1.5 Applications of Galois Fields . . . . . . . . . . . . . . .
2.2 Elements of Number Theory . . . . . . . . . . . . . . . . . . . .
2.2.1 Divisibility . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.2 Prime Numbers and Their Properties . . . . . . . . . . .
2.2.3 Euler’s Function . . . . . . . . . . . . . . . . . . . . . .
2.2.4 Modular Congruences . . . . . . . . . . . . . . . . . . .
2.2.5 Simple Modular Equations . . . . . . . . . . . . . . . .
2.2.6 Euler’s Theorem . . . . . . . . . . . . . . . . . . . . . .
2.3 Sieve of Eratosthenes, Euclidean Algorithms . . . . . . . . . . .
2.3.1 Sieve of Eratosthenes . . . . . . . . . . . . . . . . . . .
2.3.2 Euclidean Algorithm . . . . . . . . . . . . . . . . . . . .
2.3.3 Extended Euclidean Algorithm . . . . . . . . . . . . . .
2.4 Tests for Primality . . . . . . . . . . . . . . . . . . . . . . . . .
2.4.1 Fermat’s Test . . . . . . . . . . . . . . . . . . . . . . . .
2.4.2 Fermat’s Primality Test . . . . . . . . . . . . . . . . . .
2.4.3 Miller-Rabin Test . . . . . . . . . . . . . . . . . . . . .
2.4.4 Algorithm AKS . . . . . . . . . . . . . . . . . . . . . .
2.5 Computationally Hard Problems in Number Theory . . . . . . .
2.5.1 Factorization . . . . . . . . . . . . . . . . . . . . . . . .
2.5.2 Discrete Logarithm Problem . . . . . . . . . . . . . . . .
37
37
38
40
44
45
49
50
50
52
55
55
57
59
59
59
60
64
67
67
68
69
70
71
72
75
3
Foundations of Symmetric Cryptography . . . . . . . . . . . . . .
3.1 Idea of Symmetric Cryptography . . . . . . . . . . . . . . . . .
3.1.1 The Feistel Network . . . . . . . . . . . . . . . . . . . .
3.2 The DES Algorithm . . . . . . . . . . . . . . . . . . . . . . . .
3.2.1 S-Boxes . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2.2 Description of the DES Algorithm . . . . . . . . . . . .
3.2.3 Breaking DES . . . . . . . . . . . . . . . . . . . . . . .
3.3 Extensions of the DES Algorithm . . . . . . . . . . . . . . . . .
3.3.1 Triple DES . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.2 DESX . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4 Modes of Operation of the DES Algorithm . . . . . . . . . . . .
3.4.1 Electronic Codebook Mode of Operation . . . . . . . . .
3.4.2 Cipher Block-Chaining Mode of Operation . . . . . . . .
3.4.3 Cipher Feedback Mode of Operation . . . . . . . . . . .
3.5 The IDEA Algorithm . . . . . . . . . . . . . . . . . . . . . . .
3.6 RC Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.6.1 RC4 Algorithm . . . . . . . . . . . . . . . . . . . . . .
3.6.2 RC5 Algorithm . . . . . . . . . . . . . . . . . . . . . .
3.6.3 RC5-Breaking Project . . . . . . . . . . . . . . . . . . .
3.6.4 RC6 Algorithm . . . . . . . . . . . . . . . . . . . . . .
77
77
78
79
79
80
85
86
86
87
87
87
87
89
90
92
92
94
96
99
lwilliams4@kaplan.edu
Contents
xiii
3.7 AES—The Successor to DES . . . . . . . . . . . . . . . . . . .
3.7.1 Mathematical Foundations of AES . . . . . . . . . . . .
3.7.2 Description of the Algorithm . . . . . . . . . . . . . . .
3.7.3 Key Expansion . . . . . . . . . . . . . . . . . . . . . . .
3.7.4 Encryption Algorithm . . . . . . . . . . . . . . . . . . .
3.7.5 Decryption Algorithm . . . . . . . . . . . . . . . . . . .
3.8 Generalizations and Refinements of DES, IDEA and AES . . . .
3.8.1 Algorithms DES-768, IDEA-832, AES-1408, AES-1664,
and AES-1920 . . . . . . . . . . . . . . . . . . . . . . .
3.8.2 Generalized DES and AES Ciphers . . . . . . . . . . . .
100
100
108
111
113
114
117
4
Foundations of Asymmetric Cryptography . . . . . . . . . . . . . .
4.1 Idea of Asymmetric Cryptography . . . . . . . . . . . . . . . .
4.2 The Diffie-Hellman Algorithm . . . . . . . . . . . . . . . . . .
4.3 The ElGamal Algorithm . . . . . . . . . . . . . . . . . . . . . .
4.4 The RSA Algorithm . . . . . . . . . . . . . . . . . . . . . . . .
4.4.1 Key Generation . . . . . . . . . . . . . . . . . . . . . .
4.4.2 Encryption and Decryption . . . . . . . . . . . . . . . .
119
119
120
121
123
123
124
5
An Electronic Signature and Hash Functions . . . . . . . . . . . .
5.1 Digital Signature Algorithms . . . . . . . . . . . . . . . . . . .
5.1.1 A Digital Signature . . . . . . . . . . . . . . . . . . . .
5.1.2 The RSA Signature . . . . . . . . . . . . . . . . . . . .
5.1.3 The ElGamal Signature . . . . . . . . . . . . . . . . . .
5.1.4 DSA Signature . . . . . . . . . . . . . . . . . . . . . . .
5.2 Cryptographic Hash Functions . . . . . . . . . . . . . . . . . .
5.2.1 Classification of Hash Functions . . . . . . . . . . . . .
5.2.2 Birthday Paradox and Brute Force . . . . . . . . . . . . .
5.2.3 MD5 Algorithm . . . . . . . . . . . . . . . . . . . . . .
5.2.4 SHA-1 Algorithm . . . . . . . . . . . . . . . . . . . . .
5.2.5 Keccak/SHA-3 . . . . . . . . . . . . . . . . . . . . . . .
127
127
128
129
130
131
132
134
135
136
140
142
6
PGP Systems and TrueCrypt . . . . . . . . . . . . . . . . . . . . .
6.1 PGP System . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.1.1 The Idea and the History of PGP . . . . . . . . . . . . .
6.1.2 PGP Algorithms . . . . . . . . . . . . . . . . . . . . . .
6.1.3 The Use of PGP . . . . . . . . . . . . . . . . . . . . . .
6.1.4 Web of Trust and Key Certification . . . . . . . . . . . .
6.2 FireGPG and Enigmail . . . . . . . . . . . . . . . . . . . . . . .
6.3 TrueCrypt . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.3.1 Formating the TrueCrypt Volume . . . . . . . . . . . . .
6.3.2 Encrypting a Partition . . . . . . . . . . . . . . . . . . .
6.3.3 Forming a Hidden Volume . . . . . . . . . . . . . . . . .
6.3.4 Work with Hidden Volumes . . . . . . . . . . . . . . . .
6.3.5 The Usage of Keyfiles . . . . . . . . . . . . . . . . . . .
6.3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . .
147
147
147
149
152
161
162
164
165
169
170
171
171
172
lwilliams4@kaplan.edu
117
118
xiv
Contents
7
Public Key Infrastructure . . . . . . . . . . . . . . . . . . . . . . .
7.1 Public Key Infrastructure and Its Services . . . . . . . . . . . . .
7.2 Modern Web Threats . . . . . . . . . . . . . . . . . . . . . . . .
7.3 Trusted Third Party, Certification Process . . . . . . . . . . . . .
7.4 PKI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7.5 Certificates, Keys and Management . . . . . . . . . . . . . . . .
7.5.1 Generating and Installing the Certificates . . . . . . . . .
7.5.2 Configuration of Certificate . . . . . . . . . . . . . . . .
7.5.3 Cancellation of Certificates . . . . . . . . . . . . . . . .
175
175
175
176
180
183
183
184
190
8
Cryptographic Protocols . . . . . . . . . . . . . . . . . . . . . . . .
8.1 Examples of Cryptographic Protocols . . . . . . . . . . . . . . .
8.2 Reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8.2.1 The Needham-Schroeder Protocol . . . . . . . . . . . . .
8.3 Needham-Schroeder Symmetric Key Protocol . . . . . . . . . .
8.4 Timestamps . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8.5 Key Exchange Public-Key Protocol . . . . . . . . . . . . . . . .
8.6 Kerberos System . . . . . . . . . . . . . . . . . . . . . . . . . .
8.6.1 Description of Kerberos Components . . . . . . . . . . .
8.6.2 Example of Application of Kerberos . . . . . . . . . . .
8.7 Verification of Correctness of Cryptographic Protocols . . . . . .
8.7.1 Axiomatic (Deductive) Method . . . . . . . . . . . . . .
8.7.2 Model Checking . . . . . . . . . . . . . . . . . . . . . .
8.7.3 Inductive Method . . . . . . . . . . . . . . . . . . . . .
8.7.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . .
8.7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . .
193
194
195
196
199
201
202
203
204
206
207
208
209
209
210
211
9
Cryptographic Applications for Network Security . . . . . . . . . .
9.1 Application of Cryptography to Internet Mail Systems Security .
9.1.1 PEM . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9.1.2 S/MIME . . . . . . . . . . . . . . . . . . . . . . . . . .
9.1.3 MOSS . . . . . . . . . . . . . . . . . . . . . . . . . . .
9.2 Security of Document Interchange . . . . . . . . . . . . . . . .
9.2.1 EDI . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9.2.2 OpenEDI . . . . . . . . . . . . . . . . . . . . . . . . . .
9.2.3 OBI . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9.2.4 Swift, Edifact . . . . . . . . . . . . . . . . . . . . . . .
9.2.5 EDI in Practice . . . . . . . . . . . . . . . . . . . . . . .
9.3 Computer Network Security—SSH and SSL Protocols . . . . . .
9.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . .
9.3.2 Idea of the SSH Protocol . . . . . . . . . . . . . . . . .
9.3.3 Using the SSH Protocol . . . . . . . . . . . . . . . . . .
9.3.4 Construction of SSL Protocol . . . . . . . . . . . . . . .
9.3.5 The Use of SSL in Practice . . . . . . . . . . . . . . . .
9.4 Wireless Network Security . . . . . . . . . . . . . . . . . . . .
9.4.1 WEP Protocol . . . . . . . . . . . . . . . . . . . . . . .
9.4.2 WPA Protocol and Its Modifications . . . . . . . . . . .
213
213
213
214
216
216
217
217
218
218
219
220
220
221
224
225
227
229
229
230
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
233
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
237
lwilliams4@kaplan.edu
Chapter 1
Basic Concepts and Historical Overview
1.1 Introduction
Cryptography is the science of transforming, or encoding, information into a form
non-comprehensible for anyone who does not know an appropriate key. In such
forms information can be securely transferred via any communication channel or
stored in data archives with its access restricted or even forbidden (for one reason or another). Cryptography is a part of a broader discipline called cryptology,
which includes also so-called cryptanalysis—the art of breaking codes (ciphers),
i.e., regaining the content of encrypted messages without an authorized access to
the decryption keys.
1.1.1 Encryption
Cryptography is the art of providing confidentiality of information (messages, documents) through encryption, whenever required, together with means of information
security, data integrity, entity and data authentication.
Let us suppose someone (a sender) wishes to deliver some information to someone else (a receiver) via a public channel, e.g., the Internet. Moreover, the sender
would like to make sure that no one else, but the intended receiver, can get the
content being transmitted. The sender can do so by hiding the information content
according to the following scheme. The information content being transferred is
called a plaintext, or a cleartext. The procedure of hiding the content is called encryption, and the encrypted message is called its ciphertext or cryptogram. The reverse procedure of recapturing the content from its cryptogram is called decryption.
The encryption and decryption algorithms together form a cipher. These concepts
are illustrated in Fig. 1.1.
Depending on the encryption algorithm, a plaintext can be any information formulated in any way as a sequence of bits, text file, sequence of voice samples, digital video, et cetera. The examples listed come from the pervasive digital world, but
C. Kościelny et al., Modern Cryptography Primer,
DOI 10.1007/978-3-642-41386-5_1, © Springer-Verlag Berlin Heidelberg 2013
lwilliams4@kaplan.edu
1
2
1 Basic Concepts and Historical Overview
Fig. 1.1 Encryption and decryption
clearly in general one can encrypt information presented in any form whatsoever—it
requires only an encryption algorithm to be applied or designed for this purpose. In
this book a cipher’s input is considered as binary data.
We usually denote a plaintext message by the letter M and its ciphertext by C.
Computer output ciphertext is a binary data sequence as well, often of the same size
as M, sometimes longer. (In the case of combining encryption with compression,
C may turn out smaller than M; encryption itself does not give this effect, usually.)
One can view encryption as a function E associating with each given plaintext data
its ciphertext data. The encryption procedure can then be written as the mathematical
formula:
E(M) = C.
Similarly, the decryption procedure can be thought of as the function
D(C) = M,
which takes a cipher text C and outputs its plaintext M.
The goal of decrypting an encrypted message is to recapture the input plaintext;
hence the following is required:
D(E(M)) = M.
1.1.2 Algorithms and Keys
Historically, the security offered by a cipher was to a large extent based on keeping secret its encryption/decryption algorithm. Modern cryptography considers that
such ciphers do not provide an adequate level of security. For instance, they cannot be used by a larger group of users. A problem arises when someone would like
to leave the group, the others would have to change the algorithm. A similar procedure would apply when someone reveals the algorithm. Another serious concern and
source of doubt about secret ciphers is due to the impossibility of having the quality
of the algorithms, their standardization and implementations, checked by external
experts.
A secret cipher algorithm would have to be uniquely designed for each group of
users, which excludes the possibility of ready-to-use software or hardware implementations. Otherwise, an adversary would be able to purchase an identical product and run the same encryption/decryption algorithms. Each group of users would
have to design and implement their own cipher. If in such a group there was no good
cryptographer and cryptanalyst, the group would not know if its cipher was reliable
enough.
lwilliams4@kaplan.edu
1.1 Introduction
3
Fig. 1.2 Encryption and decryption with one key
Fig. 1.3 Encryption and decryption with two keys
Modern cryptography solves the above security concerns in such a way that usually the cipher used is publicly known but its encryption/decryption execution uses
an extra private piece of information, called a cryptographic key, which is another
input parameter. A key is usually denoted by the letter K. It can take one of a wide
range or keyspace of possible values, usually numbers.
The central idea is that both encryption and decryption functions use a key, and
their outputs depend on the keys used, with the following formulae:
E(K, M) = C and D(K, C) = M.
In the literature often the following notation appears:
EK (M) = C and DK (C) = M
where the subscripts indicate the key. Note the following property (see Fig. 1.2):
DK (EK (M)) = M.
Some ciphers use different encryption and decryption keys (Fig. 1.3). This means
that the encryption key K1 is different from the corresponding decryption key K2 .
In this case we have the following properties:
EK1 (M) = C,
DK2 (C) = M,
DK2 (EK1 (M)) = M.
As pointed out above, the security of good ciphers is based on the secrecy of the
keys. The algorithms are publicly known and can be analyzed by the best experts.
Software and hardware implementations or partial components of the ciphers can
be produced and distributed on an industrial scale. Any potential intruder can have
access to the algorithms. As long as she does not know our private key and the cipher
is good enough, she will not be able to read our cryptograms.
By a cipher or cryptosystem we shall mean the two algorithms of encryption and
decryption together with (the space of) all the possible plaintexts, cryptograms and
keys. There are two general types of ciphers which use keys: symmetric ciphers and
public-key ciphers.
Symmetric ciphers, often also called traditional ciphers, secret-key ciphers,
single-key algorithms or one-key algorithms, use the same key for encryption and
decryption. Here, the same means that each of the two keys can be practically determined (computed) from the other. The keys used in such ciphers have to be kept
lwilliams4@kaplan.edu
4
1 Basic Concepts and Historical Overview
secret as long as the communication is supposed to be kept secret. Prior to use these
keys have to be exchanged over a secure channel between the sender and the receiver. Compromising such a key would enable intruders to encipher and decipher
messages and documents.
The basic idea of public key cryptographic algorithms is that encryption and
decryption use two different keys, matched in such a way that it is not possible in
practice to reconstruct one of them from the other. In such a cryptosystem each user
has a unique pair of keys—public and private. The first of them is publicly available.
Everybody can use it to encrypt messages. But only the corresponding private key
allows decryption. Thus the only person able to run decryption is the one who has
the private key.
1.1.3 Strong Cryptosystems Design Principles
An encryption procedure transforms a given plaintext document into its enciphered
form (cryptogram). An encryption algorithm input consists of a plaintext and a key.
It outputs the cryptogram. The associated decryption algorithm input consists of a
cryptogram and a key, and it outputs the original plaintext.
The basic step in any cryptosystem design is a kind of evaluation of the level of
security offered by the resulting system. It can be measured in terms of the computational resources required to break—by any known or foreseeable method—the
ciphertexts generated by the system. In the course of many years of research the
following conditions have been developed as basic requirements on a strong cryptographic algorithm:
• it should be infeasible to find the plaintext from its cryptogram without knowing
the key used;
• reconstructing the secret key should be infeasible.
A good cipher should meet the above criteria also when the cryptanalyst trying
to break it has access to some relatively large number of sample plaintexts together
with their corresponding cryptograms and knows all the details of the cipher algorithm. It is generally assumed that the cryptanalyst has all the resources (space,
devices, technology) feasible currently and in the foreseeable future. Given these
assumptions, it should be emphasized strongly that a strong cryptographic system’s
robustness is based on the secrecy of the private keys.
We do not require that there does not exist a way to break such a cryptosystem. We only require that there is no currently known feasible method to do so.
The strong cryptographic algorithms that correspond to the above definition could
theoretically be broken, althought in practice it happens very rarely.
The most important rules to date for constructing difficult-to-crack cryptographic
code systems were formulated by Claude Elwood Shannon [94] in 1949 (Fig. 1.4).
He defined breaking a cipher as finding a method to determine the key and/or cleartext on the basis of its cryptogram. The cryptanalyst can obtain great help from
information about certain statistical characteristics of the possible plaintexts such as
lwilliams4@kaplan.edu
1.1 Introduction
5
Fig. 1.4 Claude E. Shannon
the frequency of occurrences of various characters. On this basis one can determine
whether the plaintext is a program written in C, a fragment of prose in Japanese, or
an audio file. In each of these cases in every plaintext there is an apparent redundancy of information which can greatly facilitate cryptanalysis. By now many statistical tests based on information theory have been developed, which effectively help
in the breaking of ciphers whenever the plaintext statistics parameters are known.
According to Shannon a cryptographic system that allows excellent protection
against unauthorized access must not provide any statistical information about the
encrypted plaintext at all. Shannon proved that this is the case when the number
of cryptographic keys is at least as large as the number of possible plaintexts. The
key should therefore be of roughly the same or more bits, characters or bytes as
the plaintext, with the assumption that no key can be used twice. Shannon’s perfect
encryption system is called the single-key system or one-time pad.
According to Shannon to define a mathematical model of a reliable system of
strong cryptography it is necessary to be able to reduce the redundancy of plaintext
information, so that the redundancy is not carried into the cryptograms. Shannon
proposed techniques of diffusion and confusion, which in practice have been reduced
by many crypto designers to some kind of alternation of combining block cipher
components with substitutions and permutations.
Claude Elwood Shannon (30 April 1916–24 February 2001) was an eminent
American mathematician, founder of information theory, one of the many scholars working during World War II with US military and government agencies as a
consultant in the field of cryptology.
1.1.4 Computational Complexity of Algorithms
In this section we introduce the basic concepts of theoretical and practical computational complexity, to the minimum extent that is necessary to understand modern
cryptography and the next chapters of this book.1
1 For more on this background topic the reader is referred to [27, 68].
lwilliams4@kaplan.edu
6
1 Basic Concepts and Historical Overview
Modern cryptography uses publicly known algorithms. Before their deployment
they are subject to objective analysis by independent experts. The private keys are
the closely guarded secrets, not the algorithms. Without knowing the appropriate
keys no one can encrypt/decrypt documents in any good cryptosystem. According
to Shannon’s principles the algorithms must be designed in such a way that the
complexity of the two tasks previously described as infeasible makes breaking the
ciphers practically impossible. Often, however, the high complexity is an estimated
upper bound on the performance of one algorithm, with the additional argument that
currently no efficient cipher-breaking algorithm is known.
An important part of the analysis of encryption and decryption algorithms, and
algorithms in general, is their computational complexity, the efficiency of calculations and of solving algorithmic and computational problems and problem instances.
In this context, by a computational problem we mean a function of the input data
into the output data, that we want to calculate using the analyzed algorithm.
As motivating examples, consider the problem of computing the determinant of
an integer-valued matrix and the problem of integer factorization. A given matrix or
an integer are called instances of these problems, respectively. The bigger the matrix
or integer, the more computational resources are needed to calculate the determinant
or the prime factors. In general, the bigger the size of the input data, the more resources (time, space, processors) are needed to compute such a problem instance.
The complexity of an algorithm is a function of the instance input data size.
One can define the complexity of an algorithm in various ways, however in general it expresses the amount of resources needed by a machine (computer) to perform the algorithm. The resources considered most often are time and space. For
obvious reasons, the amount of these resources can differ greatly from instance to
instance depending on several parameters. The time complexity of an algorithm is a
function that indicates how much time is needed for its execution. The time here is
not measured in seconds or minutes, but in the number of calculation steps, or of bit
operations. How many seconds it takes, depends largely on the equipment on which
the calculations are performed. Independently from equipment and from the rapid
development of computing devices, the complexity of an algorithm is defined as a
function expressing how many elementary operations on individual bits have to be
performed, in the course of each run of the algorithm, depending on the size of the
input data. The time complexity of a given computational problem is the function of
the input data size expressing the time complexity of the best algorithm solving the
problem.
The time complexity of a given algorithm is defined as the number of elementary operations on input data during one run of the algorithm. By the elementary
operations one usually understands the simplest nondecomposable instructions in a
certain programming language or in a certain abstract model of computation; e.g.,
a Turing machine or a (single-core) Random Access Machine, RAM. It does not
matter what language it is, because what matters is just the proportional order of
magnitude of the number of operations, up to a possible (finite) constant multiplicative factor. Without loss of generality, it can be the language Java. Alternatively, one
can treat the single instructions (lines) of the algorithm’s pseudocode as the elementary operations. In this book we do not refer to anything like that, nor to any abstract
lwilliams4@kaplan.edu
1.1 Introduction
7
machine model. Instead, as elementary we define the arithmetic bit-operations of
primary school addition and subtraction of the binary representations of nonnegative integers.
The addition of binary numbers is calculated in the same way as traditional column addition of decimal numbers. You do it by writing one number below the other
and adding one column at a time: add up the digits (binary, bits), then write down the
resulting 0 or 1, and write down the carry to the next column on the left. Subtraction
is calculated just as addition, but instead of adding the binary digits you deduct one
bit from the other, and instead of the carry operation you take a borrow from the
next column.
Multiplication can be done using only the above elementary bit operations. The
school division algorithm is much more complicated, but it also refers only to the
same elementary operations on single bits. Just add the divisor to some extra parameter (initialized to zero) until it gets bigger than the dividend. Then the number
of additions made so far (minus one) makes the resulting quotient of the division,
while the dividend minus the final value of the extra parameter (minus the divisor)
makes the resulting remainder left over.
In computer architecture the elementary bit operations are implemented in a special section of the central processing unit called the arithmetic logic unit, ALU.
In complexity analysis of a given algorithm it suffices to care about the time of
the dominating operation only. That is, the operation performed much longer than
all the other operations of the algorithm taken together. In this book, indeed in cryptography in general, the multiplication of two integers is dominating in almost all
cases. Sometimes it is integer addition, subtraction, or division. In cryptography,
these four basic arithmetic operations can be taken as elementary, the more so as
more and more often these operations get hardware implementations in the arithmetic and logical processors, and each of them can be executed in a single clock
tick.
Computational complexity is thus a kind of device-independent simplification,
an approximation that allows comparisons of the hardness of algorithms (programs)
regardless of the machines that can run them. It also neglects a lot of details such
as, for instance, the generally much shorter time required for a variety of auxiliary actions, for example, memory register access operations, (sub)procedure calls,
passing parameters, etc. It omits all the technical features of the computer on which
the calculations are performed. It is generally accepted that the algorithm execution
time is proportional to the number of elementary operations performed on the input
bits.
Adding two integers m and n written as |m| and |n| binary digits (bits zero or
one), respectively, can be done in max(|m|, |n|) bit-operations. Subtraction has the
same estimate. The primary-school multiplication of two |n|-bit integers requires at
most |n|2 elementary bit-operations. Dividing an |m|-bit integer by an |n|-bit integer
requires at most |m| · |n| time.
In general, the larger the input data, the more resources needed for their processing. However, an analyzed algorithm can do this in a nonuniform, not necessarily
monotone, way. The time and space it needs can vary considerably on different input
lwilliams4@kaplan.edu
8
1 Basic Concepts and Historical Overview
instances of the same size. We distinguish: pessimistic complexity, that is the case
of input data on which the analyzed algorithm requires the most resources over all
data of that size; expected, or average complexity; and asymptotic complexity, i.e.,
the limit of the complexity function values on arbitrarily large inputs.
Space complexity refers to how much space on the computer is required. Space
complexity of an algorithm (program) is a measure of the amount of memory
needed, for example, the number of cells visited on a Turing machine tape. In this
book, indeed in modern cryptography in general, the required space is expressed in
bits—as the maximum number of bits simultaneously written down (stored) in the
course of the analyzed algorithm run.
It is generally considered that a superpolynomial performance of an algorithm,
i.e., expressed by a function with asymptotic growth faster than all polynomials with
integer coefficients, is infeasible (or intractable) on sufficiently large input data.
For example, when we say that there is no known feasible algorithm for integer
factorization, we usually mean: no algorithm running in polynomial time.
Theoretical polynomial complexity is not a sufficient criterion for practicality.
For example, in 2002 [4] published a polynomial time algorithm checking whether
a given natural number is prime (that is, divisible only by 1 and itself). However, the
degree of this polynomial is too high for practical use in testing primality of numbers
of the size currently interesting in practical applications. These are approximately
1000-bit integers.
One more concept of computational complexity is often called practical complexity and measured in seconds on currently available computers and on input data
of the size of current interest. Algorithm AKS, mentioned above, has too high practical complexity. Similarly, the currently (August 2012) best attacks on the SHA-1
hash function standard are treated as merely theoretical, because the best of them
gives a chance of finding a collision (i.e., two different messages with the same
hash) in time corresponding to over 260 SHA-1 evaluations. Nobody can have that
much time on currently available computers (without very high extra financial and
organizational effort).
The concepts introduced above have been extensively studied in computational
complexity theory. The standard reference textbooks are [27, 77].
In modern cryptology the strength of a cipher (in general, a cryptosystem) is usually expressed in terms of the computational complexity of the problem of breaking
the analyzed cipher—how much time or space is required to find the secret key or
recover the plaintext from its encrypted version with no prior knowledge of the appropriate secret key, even when the cryptanalyst has possibly a large number of pairs
plaintext/ciphertext. How much time is required refers to the fastest currently known
algorithm performing this task. Cryptography can be called cryptocomplexity.
Similarly to time complexity, space complexity is defined as the amount of space
required for running an algorithm. It can be measured either by the maximum number of cells in an abstract machine model of the algorithm execution or by the size
of actual physical memory space expressed in bits or bytes.
lwilliams4@kaplan.edu
1.1 Introduction
9
A specific big-oh notation has been introduced for comparison of the rate of
growth of functions describing the computational complexity of algorithms. The
expression
f (n) ∈ O(g(n))
is read and defined as: function f is at most of order g if and only if there exist
positive real c and natural no such that for every n greater than or equal to no , the
value f (n) is at most equal to the product c · g(n). In symbols it can be written as
follows:
f (n) = O(g(n)) ⇔ ∃c∈R+ ∃n0 ∈N (n ≥ n0 ⇒ f (n) ≤ c · g(n)).
By way of a simple illustration, we give an estimate of the time cost (computational time complexity) of the algorithms for grade-school addition and multiplication of binary integers.
Consider two binary numbers x and y of bit-length k. Adding these binary numbers is usually realized as k additions of single bits. So, the time complexity is
O(k). Multiplying x by y is in the worst case (when y has all ones in the binary
notation) k − 1 additions of x to x shifted by one bit to the left each time. It requires
(k − 1) · k = k 2 − k additions of single bits. So, the time complexity of integer
multiplication is O(k 2 ), i.e., quadratic.2
Basic Complexity Classes
• O(1)—constant complexity—the algorithm performs a constant number of steps
no matter what size its input data is.
• O(n)—linear complexity—for each input data size |n|, the algorithm performs a
number of steps proportional to |n|. The growth rate of the running time is linear
w.r.t. the input data size.
• O(n2 )—quadratic complexity—the algorithm’s running time is proportional to
the square of the input data size |n|.
• O(n3 ), O(n4 ), . . .—polynomial time complexity.
• O(log n)—logarithmic complexity.
• O(2n )—exponential complexity—the algorithm performs a constant number of
operations for every subset of the size n input data.
• O(n!)—the algorithm performs a constant number of steps on each permutation
of the size n input data.
Algorithms of exponential or higher complexity are infeasible. Their running
time grows rapidly with increasing n. On large input data size, their running time
gets monstrously, inconceivably long. If we imagine that we have two computers,
one of which can perform a million operations per second, and the second is a million times faster (which gives the performance of 1012 operations per second), the
time required by an exponential algorithm (with time complexity O(2n )) is shown
in Table 1.1.
2 See Table 2.1 in [68].
lwilliams4@kaplan.edu
10
Table 1.1 Running time of
an algorithm of class O(2n )
1 Basic Concepts and Historical Overview
Input size n 20
50
100
106 op./s
roughly 1 s
about 35 years
about 4 · 1016 years
1012 op./s
about 10−6 s about 18.5 min about 4 · 1010 years
Table 1.2 Enumerating
letters of the Latin alphabet
A B C D … W X Y Z
↓ ↓ ↓ ↓ … ↓ ↓ ↓ ↓
1 2 3 4 . . . 23 24 25 26
The table clearly shows that even a big increase of hardware computational power
cannot beat the device-independent complexity of an algorithm. The time required
for its execution can become unimaginably large. For example, for n = 100, in both
cases probably we would never see the results.
Here we are talking about asymptotic complexity bounds. This does not exclude
the possibility of particular input data instances of even very large size that can be
computed very fast by an exponential time complexity algorithm. In practice an algorithm’s running time, measured in seconds, behaves irregularly, with many downs
and ups. Over the last decade or two a whole domain of research has arisen with
significant successful real-life applications of some asymptotic exponential time algorithms, often on some industrial-scale input sizes.
1.2 Simple Stream Ciphers
1.2.1 Caesar Cipher
One of the simplest encryption algorithms is the so-called Caesar cipher, already
used by the Roman army.3 Let us assume that the texts we want to encrypt are written in the 26-letter Latin alphabet excluding capitalization. We assign consecutive
positive integers to symbols of the alphabet (see Table 1.2).
The idea of the algorithm consists in replacing each symbol of a plaintext with
the symbol whose number is greater by three computing modulo 26 (the last three
letters of the alphabet X, Y, Z are replaced with, respectively, A, B, C). If we denote
the integer assigned to a letter x by Lx , then we can write the replacement operation
mathematically in the following way (addition is performed modulo 26, however
we do not replace 26 with 0):
C(Lx ) = Lx + 3.
3 One can find comprehensive information about this and many other ciphers used in the past in
[54]. The history of contemporary cryptography is well discussed in [29]. See also Sect. 7.3 in
[68].
lwilliams4@kaplan.edu
1.2 Simple Stream Ciphers
11
An important feature of the Caesar cipher is the distributivity of encryption over
the sequence of symbols that forms a plaintext. Formally, we can present it as follows:
C(xy) = C(x)C(y).
Obviously, it is easy to notice that the algorithm can be modified by changing the
number of positions by which symbols are shifted.
Although very simple, the Caesar cipher is a symmetric key algorithm. The key
in the original version of the Caesar cipher is equal to 3 (the shift parameter). In the
next section we will generalize Caesar’s method to all possible permutations of the
alphabet.
What is interesting is that the Caesar cipher was used even during World War I
by the Russian army. The applied shift parameter was equal to 13.
From the theoretical point of view the cost of encrypting a k-bit ciphertext is
linear—it equals O(k). Of course, nowadays it is very easy to break the Caesar
cipher even with an unknown shift parameter. Actually, it can be broken without
using a computer—a piece of paper and a pen are enough.
1.2.2 XOR Encryption (Vernam Cipher)
Now we are going to present an encryption algorithm known in the literature as the
Vernam cipher or simply XOR. This algorithm, which requires some mathematical
knowledge, uses the XOR function. Formally, the latter is a Boolean function (i.e.,
a function f : {0, 1} × {0, 1} → {0, 1}) that satisfies the following conditions:
f (0, 0) = f (1, 1) = 0
and f (0, 1) = f (1, 0) = 1.
One can easily notice some of its properties. For arbitrary x, y, z ∈ {0, 1} the
following equations hold:
1. f (x, y) = f (y, x) (commutativity),
2. f (x, x) = 0,
3. f (x, 0) = x (0 is a neutral element of f ).
It can be proved that the function is associative, i.e.,
4. f (x, f (y, z)) = f (f (x, y), z).
When we consider the function XOR as an operation defined on the set {0, 1},
then the above equations can be expressed as follows:
(a) 0 XOR 0 = 1 XOR 1 = 0,
(b) 0 XOR 1 = 1 XOR 0 = 1,
(c) x XOR y = y XOR x,
(d) x XOR x = 0,
(e) x XOR 0 = x,
(f) (x XOR (y XOR z)) = ((x XOR y) XOR z).
lwilliams4@kaplan.edu
12
1 Basic Concepts and Historical Overview
If we take two sequences of bits X = (x1 , x2 , . . . , xn ) and Y = (y1 , y2 , . . . , yn ),
then by X XOR Y we mean the sequence of values of the XOR operation on consecutive entries of sequences X and Y :
X XOR Y = (x1 , x2 , . . . , xn ) XOR (y1 , y2 , . . . , yn )
= (x1 XOR y1 , x2 XOR y2 , . . . , xn XOR yn ).
The above properties allow us to execute the following encryption algorithm. In
order to encrypt a k-bit sequence we divide it into blocks with n elements each (in
case there are not enough bits for the last block, we fill it with, e.g., zeroes). As a
key we take an arbitrary (random) bit sequence of length n: K = (k1 , k2 , . . . , kn ).
The ciphertext for a block A = (a1 , a2 , . . . , an ) is simply expressed by:
C = K XOR A = (k1 XOR a1 , k2 XOR a2 , . . . , kn XOR an )
= (c1 , c2 , . . . , cn ).
The above-mentioned features of the XOR operation enable us to specify the
method by which the ciphertext is decrypted. In order to reconstruct the block A it
is sufficient simply to execute:
K XOR C = K XOR (K XOR A) = A.
Indeed, for any entry of the ciphertext ci (i = 1, . . . , n) we have:
ki XOR ci = ki XOR (ki XOR ai ) = (ki XOR ki ) XOR ai
= 0 XOR ai = ai .
The second equation in the above notation follows from the associativity of XOR,
while the third and the fourth ones follow from properties 2 and 3, respectively.
The cost of this encryption is very low and, as in the case of the Caesar cipher,
is equal to O(k) for a k-bit ciphertext. Let us notice that if the key applied is appropriately long (for instance of several bits), then the cipher provides a high security
level. Due to its construction it is not vulnerable to any known attacks but the brute
force technique. The latter, however, is actually unfeasible in the case of long keys
because of its time complexity. Moreover, if the key length is as long as the length
of the ciphertext and the key is used only once, then the Vernam cipher turns out
to be an ideal cipher that cannot be broken. It can easily be seen that one can adjust a key for a ciphertext of a given length and any plaintext of the same length.
The complexity of the brute force method for breaking encryption with a k-bit key
equals O(2k ).
Despite its simplicity, the Vernam cipher is still applied: WordPerfect, a very
popular text editor, uses it in an only slightly modified version. This encryption
scheme is used in many other ciphers, e.g., DES and AES, as well. It is also applied
in secure communication with the use of the first quantum communication networks.
lwilliams4@kaplan.edu
1.3 Simple Block Ciphers
13
1.3 Simple Block Ciphers
1.3.1 Permutations
The Caesar cipher is one of the simplest substitution ciphers. Replacing each alphabet letter with another one is performed in a regular manner—it depends on the
alphabetical order. One can easily see that this assignment may be done arbitrarily.
Let us now recall some elementary mathematical facts. Given a finite set X, any
one-to-one function f : X → X is called a permutation.
If the cardinality of X is equal to n, then there are n! permutations (one-to-one
functions) on X.
The Caesar cipher can be generalized to all possible permutations of the alphabet.
In this situation a key is given by a 26-element non-repetitive sequence of integers
from 1 to 26. Such a sequence determines the substitution that has to be applied in
order to encrypt a message.
It can be seen that for an alphabet with 26 characters the number of all permutations equals 26!, which amounts to about 4 · 1026 , a number that is large even for
modern computers. Verification of all possible keys (sequences) would take a great
deal of time.
It turns out, however, that substitution ciphers can easily be broken using socalled frequency analysis. Certain letters and combinations of letters occur more
often than others. Therefore, it is easy to check which symbols appear in a given
ciphertext and with what frequency (for obvious reasons it is better to work with
suitably long ciphertext messages).
1.3.2 Transpositions
Another simple method of encryption consists of inserting a plaintext in adequately
defined (e.g., as geometrical shapes) blocks. Such a block may be represented as
an array with a given number of rows and columns. A plaintext is placed in the
matrix row-wise. Then the matrix is transposed (rows are replaced with columns).
The ciphertext should be read row-wise after such a transposition.
Example 1.1 Let us consider the following plaintext: ITISASIMPLEEXAMPLE. If
we decide to encrypt the message with a simple transposition cipher using, for instance, a 3 × 6-matrix, then we obtain the matrix with the plaintext (Table 1.3).
After performing the transposition, we get the matrix presented in Table 1.4:
Finally, we get the ciphertext: ISILXPTAMEALISPEME.
In this system, the key is given by the matrix dimensions. For example, in the
case of a 6 × 3-matrix the ciphertext is as follows: IIXTMAIPMSLPAELSEE.
lwilliams4@kaplan.edu
14
1 Basic Concepts and Historical Overview
Table 1.3 Matrix with the
plaintext
Table 1.4 Transposition
Table 1.5 Table for
constructing encryption
templates
I
T
I
S
A
S
I
M
P
L
E
E
X
A
M
P
L
E
I
S
I
L
X
P
T
A
M
E
A
L
I
S
P
E
M
E
0
1
1
2
3
3
0
3
1
3
2
2
3
0
0
3
3
0
3
2
0
1
0
3
2
1
0
2
1
2
3
2
1
2
1
1
0
3
1
2
2
1
0
1
0
0
2
3
1.3.3 Example of a Simple Transposition Cipher
We are going to explain the principle of constructing such a cipher by a simple,
however non-trivial, example. Now, a plaintext and a ciphertext are expressed over
a 32-symbol alphabet:
ABCDEFGHIJKLMNOPQRSTUVWXYZ . : , ; which consists of 26 capital letters of the Latin alphabet, a space, a period, a colon,
a comma, a semicolon, and a hyphen. We assume that a block of a plaintext contains
48 symbols from the above alphabet which are arranged in 6 rows. Given these
assumptions, we can apply the following encryption algorithm for a transposition
cipher:
1. Create a table with 6 rows and 8 columns and fill it with integers 0, 1, 2 and
3 in such a way that the number of zeroes, ones, twos and threes is the same
and equals 12. At the same time, try to minimize the number of adjacent cells
containing the same integers. These conditions are satisfied in Table 1.5. Such a
table can easily be found, for instance by means of the following program written
in Pascal:
var i, k, r: Byte;
l: array[0..3] of Byte;
begin
repeat
Randomize;
lwilliams4@kaplan.edu
1.3 Simple Block Ciphers
15
Table 1.6 Block of the
plaintext
A D I U N K T L E C T U R E R
A F O R Y Z M A P H O R I S M
A K T O R K A A C T R E S S .
for i:=0 to 3 do l[i]:=0;
for i:=1 to 6 do
begin
for k:=1 to 8 do
begin
r:=Random(4); Inc(l[r]);
Write(r:2);
end;
WriteLn
end;
WriteLn;
until (l[0]=l[1]) and (l[0]=l[2]) and(l[0]=l[3])
end.
2. Cut four rectangular pieces of paper that have the same size as the tables from the
first step of the algorithm. Then create four templates, numbered 0, 1, 2 and 3.
In each template cut 12 holes covering the cells which contain the number of the
template.
3. The cryptogram will be written on a blank piece of paper that is exactly the
size of the template. Put this piece of paper consecutively against the templates
prepared in the second step and insert 48 letters of the plaintext into empty cells
(4 times 12 letters).
The decryption algorithm is easier and consists of only one step:
1. Put the templates consecutively against the ciphertext and read the symbols of
the plaintext from the holes and write them down in the matrix of six rows and
eight columns.
Let us take the part of the Polish-English dictionary shown in Table 1.6 as a plaintext. After encrypting this text according to the algorithm we obtain the cryptogram
given in Table 1.7. It can easily be verified that by decrypting this ciphertext according to the algorithm presented above one obtains the proper plaintext.
The just-described substitution block cipher, whose keys are given by encryption
templates and the order of their use, can be specified in a more formal manner. If
we enumerate 48 symbols of a plaintext consecutively, then the encryption process
obtained by applying templates takes the form of the table presented in Table 1.8,
from which it follows that the first symbol of the ciphertext corresponds to the first
symbol of the plaintext, the second symbol corresponds to the 13th symbol of the
lwilliams4@kaplan.edu
16
1 Basic Concepts and Historical Overview
Table 1.7 Block of the
cryptogram of the
transposition cipher
A U R A R K D A
E – P H A I U C
T N R O K R T E
R A – I F S S M
O A R Y L S Z K
T M E – C T O .
Table 1.8 Encryption
permutation
1 13 14 25 37 38
15 40 26 27 41
43
5 44 28
29 17
2 39
3 4 42
6 16
7 45
8 30 18 31 46 32
19 33 20 21
9 47 22 34
35 23 10 24 11 12 36 48
Table 1.9 Decryption
permutation
1
7 14 15 18 21 23 27
37 43 45 46
2
3 9 22
26 29 33 35 36 39 42 44
4 11 12 20 25 28 30 32
34 40 41 47
5
6 8 10
13 16 17 19 24 31 38 48
plaintext, the third to the 14th, etc. Using this table, which represents a permutation
of the set {1, 2, . . . , 48}, one may thus determine the encryption procedure more
precisely than by means of encryption templates. What is more, considering this
permutation as an encryption key, it is possible to specify the exact number of all
possible keys for the above cipher, which is equal to
48! = 12413915592536072670862289047373375038521486354677760000000000,
which amounts to about 1.241391559 · 1062 . Not a small number, especially when
compared to the number of all atoms on our planet which is estimated to be 1051 .
In order to decrypt ciphertexts of the cipher in question one has to apply the
permutation that is inverse to the encryption permutation presented in Table 1.9. At
first sight it seems that the cipher may be broken by trying 48! permutations one
by one. Assuming that we would be able to test a million permutations per second
(even such an assumption is too optimistic for the current state of technology), it
would take around 1047 years to break a ciphertext. On the other hand, the age of
the universe is estimated to be 1011 years. However, if cryptanalysts apply statistical
tests, then breaking such a cipher takes them just a couple of seconds.
lwilliams4@kaplan.edu
1.3 Simple Block Ciphers
17
Table 1.10 Symbols of the
plaintext and corresponding
symbols of the cryptogram
A B C D E F G H I J K L M N O P
X H Y U Z R . V Q A : W T , B C
Q R S T U V W X Y Z
.
:
,
; –
S D ; E I G L F J K – M P N 0
Table 1.11 Table with
ciphertext symbols and
plaintext symbols
corresponding to them
A B C D E F G H I J K L M N O P
J O P R T X V B U Y Z W .
,
; :
Q R S T U V W X Y Z
,
; –
.
:
I F Q M D H L A C E – G K N S
Table 1.12 The block of the
substitution cipher
cryptogram obtained using
Table 1.10 which corresponds
to the block of the plaintext
presented in Table 1.6
X U Q I , : E
W Z Y E I D Z D
X R B D J K T
X C V B D Q ; T
X : E B D : X
X Y E D Z ; ; M
1.3.4 Example of a Substitution Block Cipher
In the easiest case the operation of substitution block ciphers consists in replacing
symbols of a plaintext with other symbols one by one. Hence, when performing
the encryption algorithm we have to apply a table which contains the rule of this
substitution, i.e., the table represents some permutation of the alphabet. Of course,
in order to decrypt messages one uses the inverse permutation.
Let us assume that in the considered case the same alphabet as in the example of a
transposition cipher is used, the 48-symbol block of a plaintext is the same as previously, and the substitution table presented in Table 1.10 is applied during encryption.
Then one obtains immediately a table for use during decryption (Table 1.11).
Now, we can illustrate the process of creating a cryptogram using the substitution cipher. If the plaintext is given by the block presented in Table 1.6, then, after
applying Table 1.10 and executing 48 symbol substitution operations, one gets the
cryptogram shown in Table 1.12. The obtained cryptogram may yet be easily decrypted since its symbols occur in the same order as the corresponding symbols in
the plaintext. For this reason cryptanalysts break substitution ciphers very quickly.
1.3.5 Example of a Product Cipher
A product cipher applies two encryption algorithms sequentially: it starts by encrypting a plaintext by means of the first algorithm, then the obtained cryptogram
lwilliams4@kaplan.edu
18
1 Basic Concepts and Historical Overview
Table 1.13 Block of the
product cipher cryptogram
obtained by encrypting a
block of the plaintext by
means of the substitution
cipher and then using the
transposition cipher
X I D X D : U X
Z
C V X Q I Y
E , D B : D E Z
D X
Q Q ; ; T
B X D J W ; K :
E T Z
Y E B M
is encrypted with the second one, which results in the ciphertext of the product cipher. This method can be explained by encrypting the plaintext given in Table 1.6
with the use of the substitution cipher described above and then by considering the
resulting ciphertext (Table 1.7) as an input for the transposition cipher encryption
algorithm. The cryptogram obtained in this way is presented in Table 1.13. One may
easily check that decryption of the ciphertext presented in Table 1.13 should be performed in the reverse order to the encryption process: the decryption algorithm for
the substitution cipher followed by the decryption algorithm for the transposition
cipher have to be executed.
Although the product encryption algorithm presented above consists of two weak
ciphers, breaking it is a non-trivial task—even for an advanced cryptanalyst.
As indicated by Claude Shannon, such an alternate use of transposition and substitution ciphers results in break-resistant ciphers. A slight modification of this principle is applied in many practically used cryptographic block encryption systems
operating on a two-element alphabet, such as DES, IDEA or AES.
1.3.6 Generalized Substitutions—Bigrams
Simple substitution ciphers, which apply single symbols of an alphabet, can easily
be generalized to ciphers that apply substitutions of blocks (sequences) of symbols.
Instead of permutations f : X → X (where X is a set of alphabet symbols), one
can consider, in the simplest case, permutations f : X × X → X × X (f : X 2 →
X 2 ) or, more generally, f : X n → X n . Of course, the number of possible keys
significantly increases in this approach. Indeed, in the simplest case when n = 2 the
number of permutations reaches 676!, while for an arbitrary n it increases to (26n )!.
A problem for the cryptographer is to represent a key (a permutation)—let us
notice that even in the simplest case when n = 2 we have 676 values of the given
permutation. For the sake of simplicity, geometric methods are used in order to
represent the key.
The Playfair cipher, used by British forces during World War I, is an example of
such systems. It uses a 25-symbol alphabet (the letter J is substituted by the letter I).
A key is given by an arbitrary expression contained in a square matrix of size 5 × 5.
We demonstrate execution of the algorithm by the following example.
Example 1.2 Let us consider as a key the following phrase: CRYPTOGRAPHYISOK. Now, let us construct a square matrix of size 5 × 5 filling it in with letters
lwilliams4@kaplan.edu
1.3 Simple Block Ciphers
19
of the key without repetitions. The empty entries of the matrix are filled with the
remaining letters of the alphabet (i.e., those that do not appear in the key).
We obtain the following matrix:
C R Y P T
O G A H I
S K B D E
F L M N Q
U V W X Z
The cryptogram is created from a plaintext by appropriate, i.e., with respect to the
matrix, substitutions of pairs of letters (if the text has an odd number of symbols,
then it is completed with any symbol).
Let us consider the following plaintext: ENCRYPTIONKEYS. At the first stage
of encryption, the sequence of letters is divided into pairs EN CR YP TI ON KE YS.
Each pair is transformed with respect to the rectangle contained in the matrix determined by the letters that form the pair (according to the row-wise order). For
instance, the pair EN is converted to the pair QD (as these two letters form the two
remaining corners of the rectangle defined by the digraph EN). If encrypted letters
are placed in the same row/column or they are equal, then we choose the symbols
to their right, e.g., AW is converted to HX, while FL is converted to LM.
D E
N Q
A H
B D
M N
W X
The whole ciphertext is as follows: QDRYPTCOFHBSBC.
Let us present another example of a bigram system.
Example 1.3 Keys are given by two independent permutations of a 25-letter alphabet. We place them in two square matrices (of dimension 5 × 5).
A K N Y E
R D U O I
Q S W B G
H C X T Z
V M L P F
E R T B O
W I U M K
N D A S F
Q X G Z V
H Y P L C
A plaintext, for instance TODAYISABEAUTIFULDAY, is divided into several rows
of a fixed length:
T O D A Y I S A B E
A U T I F U L D A Y
lwilliams4@kaplan.edu
20
1 Basic Concepts and Historical Overview
The encryption process consists in replacing columns of the plaintext with entries of
matrices (keys) that are determined by the corresponding rectangle, similarly to the
Playfair system. The upper symbol of the bigram is marked in the left matrix and
the lower symbol in the right one. Thus, in our example, the bigram TA is replaced
with BG, and so on.
The whole plaintext is transformed into the following cryptogram:
BG IM KU RR BO RM MS QR GS FR.
Both systems presented above are substitution block ciphers. Let us recall that
cryptanalysis of simple substitutions is hindered by the fact that now all operations
are performed on blocks of letters. The number of functions that map blocks into
blocks is very much greater than the number of functions that map single symbols.
Applying statistical methods is much more complex, as well.
1.3.7 Polyalphabetic Substitutions
Besides frequency analysis, a successful attack on simple or complex substitutions
can be carried out due to invertibility of applied transformations. It is easy to notice
that letters or blocks of letters of a plaintext correspond to fixed letters or blocks of
letters in the ciphertext, regardless of their positions. This causes specific patterns
to be preserved and, with the additional use of statistical methods that are easy to
apply nowadays, facilitates breaking ciphers.
The idea of polyalphabetic substitutions is based on applying substitution rules
that depend on the current position in the text. Pioneering ideas were developed
already in the second half of the fifteenth century. Let us now present probably the
best-known example of a polyalphabetic cipher: the Vigenère cipher.
1.3.8 Vigenère Cipher
In order to present one of the simplest methods using polyalphabetic substitutions,
let us recall the Caesar cipher. Each symbol (encoded as an integer in the range 1
to 26) is mapped into one fixed value. Let us assume that a plaintext is divided into
blocks of four letters. Polyalphabetic encryption may be defined as a transformation
of consecutive letters of each block by different (but fixed) values. For instance, the
first letter of a block is shifted by 2, the second by 4, the third by 3 and the fourth
by 10 (remember that each shift is executed modulo 26). In this system, the key is
given as the length of a block and values of all shifts.
And so, with the assumed data, the plaintext: OLDHABITSDIEHARD is transformed into the cryptogram: QPGRCFLDUHLOJEUN. The encryption process is
shown in Table 1.14. When using blocks of length n, the above algorithm is nothing
else but an n-fold convolution of the Caesar cipher. It is clear that the same letters
lwilliams4@kaplan.edu
1.4 Wheel Cipher and Rotor Machines
21
Table 1.14 Example of applying the Vigenère cipher
Plain text
O L D H
A B I T
S
D
I
E
H
A
R
Letter numbers
11
20
15
16
25
20
1
14
9
5
2
12
1
4
26
9
Key
2
4
3
10
2
4
3
10
2
4
3
10
2
4
3
10
Cipher numbers
13
24
18
26
1
24
4
24
11
9
5
22
3
8
3
19
Cipher
Q
P
G
R
C
F
L
D
U
H
L
O
J
E
U
N
D
Fig. 1.5 Idea of wheel cipher
of a plaintext are not necessarily mapped into the same symbols in a cryptogram. In
general, this hinders cryptanalysis that applies statistical methods.
However, even this cipher is not very hard to break. If only the length of blocks
is known, then given cryptograms that are sufficiently long one can focus on each
n-th symbol and analyze the obtained set with the use of statistical methods. In case
the parameter n is not known, the problem is a bit more complex, nonetheless, it is
still not very difficult to solve. It is enough to analyze the text for consecutive values
of the parameter n, starting with 2. Applying computers the cipher may quickly be
broken.
1.4 Wheel Cipher and Rotor Machines
1.4.1 Wheel Cipher
Upon the development of computers, the substitution and polyalphabetic ciphers
described in previous sections were no longer difficult for cryptanalysts to break. As
long as encryption using these methods was done manually, they were considered
an extremely hard problem of cryptanalysis.
The invention of the wheel cipher constituted an upgrade of polyalphabetic methods (Fig. 1.5).
The cipher was introduced by Thomas Jefferson in 1795, however at this time it
did not become well known. It was independently discovered by Étienne Bazeries
lwilliams4@kaplan.edu
22
1 Basic Concepts and Historical Overview
Fig. 1.6 Cipher wheel—M94
a century later. A wheel cipher consists of a set of several disks with permutations
of alphabet symbols arranged around their edges. The permutations are different for
all disks. With a dozen or even tens of such disks stacked on one axle, it is possible
to place them in such a way that the number of letters of a plaintext corresponds to
the number of disks set in one row. The encryption process is based on reading a
text placed one or more rows above or below. A person decrypting the text arranges
consecutive letters of the cryptogram in one row of the wheel and reads the plaintext
from the appropriate row (for instance, in the case of 30 disks it gives 30!, i.e., about
2.6 · 1032 possibilities).
According to current criteria, this method does not provide a sufficient security
level. An intruder may come into possession of the device used for encryption. They
could then quite easily find the appropriate arrangement of disks.
A wheel cipher known as the M-94 [54] was used by the United States Army
from 1923 until 1942 (Fig. 1.6).
1.4.2 Rotor Machines
Rotor machines were a step forward with respect to cipher cylinders—they were the
first encryption devices that used electricity. Below, we present how they work.
Let us consider a disk containing an alphabet permutation. A simplified version
of this situation (for a 4-letter alphabet) is depicted in Fig. 1.7.
Encryption by means of such a disk corresponds to a simple substitution cipher,
thus its encryption power is not very impressive. Let us notice, however, that this
disk may rotate around its axis in a specific way during the encryption process. The
cipher obtained in this way constitutes the Vigenère cipher described before.
Now, let us consider several disks containing different alphabet permutations
stacked on one axle (Fig. 1.8). Composing the appropriate number of such disks
(permutations) gives a unique permutation, however, it turns out that such a scheme
allows us to rotate all disks during encryption. This increases the encryption power
significantly and complicates analysis of ciphertexts.
lwilliams4@kaplan.edu
1.5 Enigma
23
Fig. 1.7 Permutation
Fig. 1.8 Idea of rotor machines
The algorithm for rotating the disks has to be specified precisely and both parties
exchanging messages need to know it, as well as how to arrange the disks to form
the whole device. The cipher key is given by the output positions and the order of
disks (the initial configuration of the machine).
Machines constructed in this way are called rotor machines. Patents for such
machines were filed around 1920 by several inventors concurrently. Disks of these
machines are called drums or rotors.
Physically, rotors were thick disks made of an insulator. Every disk had 26 contacts equidistant and symmetric from the center on both its sides. Each contact on
the left side was connected to exactly one contact on the right side—the pattern
of connection (some permutation of the alphabet) was of course to be kept secret.
There were 26 circuit terminals (corresponding to letters of the alphabet) adjacent
to each side of the disk. If voltage is applied to one of the terminals, then the current
flows through one of the contacts on the right side and lights one of 26 bulbs. In this
way, the machine performs the afore mentioned permutation of the alphabet.
The most famous machine of this type was Enigma, developed in Germany and
used before and during the World War II by the Axis powers.
1.5 Enigma
The name Enigma comes from Greek and literally means secret. The designation
Enigma was used to name a number of encryption machines with the same root
lwilliams4@kaplan.edu
24
1 Basic Concepts and Historical Overview
and use. Before Enigma reached its final form, various models of the machine were
produced and deployed. This section presents a description of the origin of these
machines and their principles of operation.
1.5.1 History of the Enigma
There is no single father of Enigma (Fig. 1.9). Several teams of cryptographers
from the Netherlands, Germany, Sweden and the United States worked to develop
its ancestors. Enigma’s history dates back to the beginning of World War I, when in
1915 two Dutch officers Theo van Hengel and R. P. C. Spengler used two connected
typewriters for the purpose of encryption. In the same year, another Dutchman Hugo
Koch received a patent for an identical machine. Also in 1915, in the United States,
Edward Hebern came upon the idea of connecting two typewriters with a bundle of
wires, where the connections between them could be freely changed. Striking a key
of the first typewriter resulted in the corresponding letter on the printing head of the
other machine [48].
A set of machines connected in this way, or later with radio or microwave connection, realized a simple monoalphabetic substitution cipher, thus implementing
one of the many permutations of the alphabet used. It was not a great invention, of
course, from the point of view of today’s cryptography. It must be emphasized, however, that the encryption was running in a completely automatic manner. The ability
to change the wiring in the unit was then expanded to include a rotor (a wheel with
electrical wires) between the keyboard and the printer, just as in the rotor machines
described in the previous section. From here, it was a short way to the application
of multiple rotors. Also added was the idea of simultaneous turning of the rotors by
different angles during the machine’s operation. Each particular position of the rotors provides an electrical path through all of the rotors. The current passes through
the interconnections, defined by a relative position arrangement of the rotors, to a
printer at the other end and outputs the ciphertext letter substituting an input plaintext letter. This realized a polyalphabetic substitution cipher, of possibly very high
complexity. Frequency analysis no longer works for cryptanalysis of such a cipher.
Soon after the Hebern inventions, the idea of a cipher machine was developed in
Europe by a Swede, Arvid Gerhard Damm, and a German, Arthur Scherbius. The
company they founded developed and promoted the first machine to be given the
working name Enigma. This ancestor of the whole subsequent family of encryption
machines was equipped with four rotors placed on a common spindle in a fixed
order. The fact that each of the rotors could turn through a different angle, when the
machine was in operation, was an additional obstacle to potential cryptanalysis of
the implemented cipher [48].
In 1926, the company released a version designated by the letter C, which had
two important innovations. The typewriter was abandoned, because it was cumbersome to use when operating the machine, though the keyboard remained part of the
machine. Instead, Enigma C was equipped with a board for a set of light bulbs of
lwilliams4@kaplan.edu
1.5 Enigma
25
Fig. 1.9 Enigma
number corresponding to the number of characters of the alphabet used. Pressing a
key would cause the corresponding light to come on.
An equally important innovation introduced in the C model was an idea from
one of the designers of the team, Willi Korn. He suggested additional equipment
for Enigma, a special drum, called the reflector (Fig. 1.11). It was designed in such
a way that it would send each signal back through the rotors. When an electrical
signal reached it, after negotiating three movable rotors, the reflector sent it back
through all three rotors in the same position. Thus, if, for example, letter G was
converted to letter P, then using the reflector and the same rotor arrangement, letter
P would be mapped back to G. Note that in this way the transition path through
the rotors’ contacts was doubled in length. The reason for the introduction of the
reflector was an issue in Enigma decryption. Until that time, separate keys had been
needed for encryption and decryption. Indeed, it had been necessary to reverse the
rotor mounting to decrypt the ciphertext. In this way, the decryption process could
be carried out without rotor reversal. It turned out that this seemingly wonderful
idea made it easier for analysts to break the Enigma code, since the introduction of
the reflector meant that no letter could be mapped to itself. This information became
very helpful later on in cracking the cipher.
In 1927, Enigma model D was developed. In this version, the reflector did not
rotate during operation of the machine and it could be assembled in any of the possible positions. In 1925, the German navy purchased several of these machines. Over
the subsequent years, Enigma kept evolving. A version was created for the German
army (Wehrmacht), and for the Navy (Kriegsmarine). The Enigma system was also
made available to Japan, which built its cipher machines based on it. In February
1942, the German navy introduced a version of Enigma with a fourth rotor for messages to and from Atlantic U-boats. In this way the number of possible settings was
multiplied by 26. This necessitated 26 times greater cryptanalytic effort. In mid1942 a much bigger 12-rotor version, called Lorenz, was introduced for messages
between the German High Command and field commanders. In the Lorenz cipher
each letter of the alphabet was represented by five electrical impulses. Messages
lwilliams4@kaplan.edu
26
1 Basic Concepts and Historical Overview
were enciphered by combining randomly generated letters with the original text
using a variant of the XOR function, character by character. At the receiving end
Lorenz applied exactly the same obscuring letters back to the ciphertext to recover
the plaintext.
In its time, Enigma undoubtedly presented a significant technological advance
in the field of encryption. The machine had a very large key space, and thus, at
that time, it was not possible to break the code by brute force. However, as safe
as Enigma would seem before the war, it was effectively broken, first by Polish
intelligence, and then in later versions by the Allied intelligence services. This fact
significantly influenced the result of World War II.
1.5.2 Construction of the Enigma
Below, we present the construction and operation of the most popular version of
the Enigma machine—the so-called Wehrmacht Enigma model, widely used during
WWII by the German land forces and the air force (Luftwaffe).
As indicated above, the starting point for the construction of the Enigmas was
an electric typewriter. Pressing a key would close the electric circuit, and then the
flow of the electric current would cause the corresponding letter printing and advancement of the print head by one position on the printing machine. In the first
version of the device, Hebern connected two typewriters with a bundle of 26 cables.
This machine, of course, generated a simple permutation cipher code. One could
complicate its function by introducing additional encryption rotors.
The introduction of the stationary rotor between the contacts leading to the keyboard and the print head does not increase the strength of the cipher, which remains
a simple permutation. However, the same mechanism which is used to move the
print head can also turn the rotor on its spindle by one position. In this case, each
additional letter of the cryptogram will be encrypted by a different substitution (permutation). After encrypting 26 letters, the machine returns to the initial substitution,
and the cycle of the used permutations is repeated. One can, of course, introduce
between the keyboar…

Save Time On Research and Writing
Hire a Pro to Write You a 100% Plagiarism-Free Paper.
Get My Paper

Order a unique copy of this paper

600 words
We'll send you the first draft for approval by September 11, 2018 at 10:52 AM
Total price:
$26
Top Academic Writers Ready to Help
with Your Research Proposal

Order your essay today and save 25% with the discount code GREEN