An AES-Based Elliptic Curve Integrated Encryption Scheme for Hardware Design

Guard Kanda1, Alexander O. A. Antwi1 and Kwangki Ryoo1

1 Department of Information and Communication Engineering, Hanbat National University, 125 Dongseodaero, Yuseong-gu, Daejeon, 34158, Republic of Korea

{guardkanda, alex.o.a.antwi}@gmail.com , [email protected]

Abstract. Communication channels, especially those in wireless environments, need to be secured. However, software cipher mechanisms are of limited use on hardware and mobile devices because of their resource constraints. This paper focuses on the implementation of the Elliptic Curve Integrated Encryption Scheme (ECIES) cryptosystem over an elliptic curve of 163-bit key length, with an AES block cipher and a shared key based on the Elliptic Curve Diffie-Hellman (ECDH) key exchange protocol.

Keywords: ECDH, ECIES, Cryptosystem, ECC

1 Introduction

Data integrity and privacy, achieved through techniques such as cryptography and password authentication, are vital for everyday communication [1]. Because Elliptic Curve Cryptography (ECC) offers smaller key sizes and cheaper computation than other public-key schemes such as RSA and DSA, it is quickly becoming the preferred option for encryption and decryption in many applications. For instance, [2] used the ECDSA algorithm together with the IEEE 1609.2 vehicular ad hoc network standard to secure broadcast Vehicle-to-Vehicle (V2V) communication and so help avoid vehicular accidents. In addition, [3] proposed an implementation of the American National Standards Institute (ANSI) X9.62 ECDSA over the prime elliptic curve F192. Furthermore, a variant of the curve, the Hyperelliptic Curve Cryptosystem (HECC) [4], is suitable for embedded processor architectures with very heavy resource constraints. This paper presents a hardware implementation of the elliptic curve integrated encryption scheme that adopts the ECDH protocol to generate a shared key for communication between parties. The shared key is in turn used as the key of a block cipher such as AES or DES to encrypt and decrypt messages.

1.1 Elliptic Curve Diffie-Hellman Algorithm

Parties communicating under a key agreement scheme are each required to contribute some data to the creation of a shared session key. This is the case for the ECDH algorithm. Two parties, popularly referred to as Alice and Bob, agree on an elliptic curve E over a finite field P and on a base point G (x, y). The ECDH key exchange proceeds in the four main stages listed in Table 1.

Table 1. Shared key generating sequence in ECDH.

| No | Algorithm sequence |
|----|--------------------|
| 1 | Alice and Bob each randomly generate an integer between 1 and n (the order of the subgroup), dA and dB respectively, as their private keys. |
| 2 | Each party then generates its public key, HA = dA.G and HB = dB.G, where G is the base point on the elliptic curve. |
| 3 | Alice and Bob exchange their public keys HA and HB. |
| 4 | Alice and Bob can both now calculate the shared secret key: Alice computes dA.HB and Bob computes dB.HA, so that S = dA.HB = dA (dB.G) = dB (dA.G) = dB.HA. |
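The four stages of Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a tiny textbook curve over the prime field F_17 rather than the 163-bit curve of this paper, and the names `point_add`, `scalar_mul` and `ecdh_demo` are hypothetical helpers, not part of the proposed design.

```python
import secrets

# Toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative assumption, NOT the
# paper's 163-bit curve). G generates a group of prime order N = 19.
P, A = 17, 2
G, N = (5, 1), 19

def point_add(p, q):
    """Add two affine points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p + (-p) = point at infinity
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def scalar_mul(k, pt):
    """Double-and-add computation of k.pt."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result

def ecdh_demo():
    d_a = secrets.randbelow(N - 1) + 1   # stage 1: private keys in [1, N-1]
    d_b = secrets.randbelow(N - 1) + 1
    h_a = scalar_mul(d_a, G)             # stage 2: public keys
    h_b = scalar_mul(d_b, G)
    # stage 3: h_a and h_b are exchanged over the channel
    s_a = scalar_mul(d_a, h_b)           # stage 4: Alice's shared key
    s_b = scalar_mul(d_b, h_a)           # stage 4: Bob's shared key
    return s_a, s_b
```

Calling `ecdh_demo()` returns two points that are equal whichever private keys were drawn, which is the identity S = dA.HB = dB.HA from stage 4.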

1.2 Random Number Generator

Random numbers are needed in everyday applications, so the way they are generated and tested is critical to their use in an application [5]. The private keys of each communicating party are randomly generated. Two random number generator modules, the AKARI-X [6] and the Linear Feedback Shift Register (LFSR), were designed during this research. Their performances were compared and the better one was chosen for the final implementation. The LFSR was implemented using the primitive polynomial of degree 32 given in equation (1). An m-bit LFSR-based PRNG always requires at least m clock cycles to generate an output. The AKARI-II, on the other hand, requires a fixed 64 clock cycles. The LFSR operated at a frequency of 383 MHz with an LUT slice count of 480, whereas the AKARI-X operated at a maximum frequency of 215 MHz with an LUT slice count of 1314, making the LFSR the more efficient PRNG.

$$x^{32} + x^{28} + x^{19} + x^{18} + x^{16} + x^{14} + x^{11} + x^{10} + x^{9} + x^{6} + x^{5} + x + 1 \qquad (1)$$
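A software model of an LFSR driven by the polynomial in equation (1) might look like the sketch below. It assumes a Fibonacci-style register that shifts left and takes its feedback from bit positions one less than each exponent; the paper does not state which LFSR structure or bit ordering the hardware uses, so `TAPS`, `lfsr_step` and `lfsr_word` are illustrative names only.

```python
# Exponents of the non-constant terms of the degree-32 polynomial (1).
TAPS = (32, 28, 19, 18, 16, 14, 11, 10, 9, 6, 5, 1)
MASK = (1 << 32) - 1

def lfsr_step(state):
    """Advance the 32-bit register by one clock cycle."""
    feedback = 0
    for t in TAPS:
        feedback ^= (state >> (t - 1)) & 1   # XOR of the tapped bits
    return ((state << 1) | feedback) & MASK

def lfsr_word(state, nbits=32):
    """Clock the register nbits times, mirroring the m-cycle cost of an m-bit output."""
    for _ in range(nbits):
        state = lfsr_step(state)
    return state

seed = 0x1            # any non-zero seed; the all-zero state is a fixed point
word = lfsr_word(seed)
```

Because the update is invertible, a non-zero seed never reaches the all-zero state, which is why the seed must be non-zero.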

1.3 Montgomery Ladder Point Multiplication

The core of the ECIES is the ECDH shared key exchange protocol. The protocol is computationally intensive because of the field inversion and the multiplication of very large numbers that it involves. These issues are handled with the Montgomery scalar multiplication algorithm. The inversion is replaced with multiplications by transforming the coordinates from the affine domain to the projective domain using the López and Dahab transformation in equation (2).

$$(X, Y, Z),\; Z \neq 0 \;\longmapsto\; (X/Z,\; Y/Z^{2}) \qquad (2)$$
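The control flow of the Montgomery ladder can be illustrated as follows. This sketch makes simplifying assumptions: it works in affine coordinates on a tiny textbook curve over F_17, whereas the design in this paper uses projective coordinates over a 163-bit binary field. The function names are hypothetical.

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 with base point G of order 19
# (illustrative assumption; the paper's curve and field are different).
P, A = 17, 2
G = (5, 1)

def point_add(p, q):
    """Affine point addition; None is the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def montgomery_ladder(k, pt):
    """Compute k.pt with one addition and one doubling per key bit."""
    r0, r1 = None, pt                 # invariant: r1 - r0 == pt
    for bit in bin(k)[2:]:            # scan the scalar from the MSB down
        if bit == "1":
            r0, r1 = point_add(r0, r1), point_add(r1, r1)
        else:
            r0, r1 = point_add(r0, r0), point_add(r0, r1)
    return r0
```

Every key bit triggers the same pair of operations, an addition and a doubling, which gives the ladder a regular schedule that suits a fixed hardware datapath.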

As shown in the architecture in Figure 1, the algorithm relies on squaring [7], addition, multiplication [8] and division [9] modules, all operating in a Galois field. The Montgomery multiplier was implemented on a Virtex-5 device with an LUT slice count of 3677 and operated at a maximum frequency of 500 MHz.

Fig. 1. Proposed Point Multiplier Architecture.
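The Galois field operations named in this section can be modelled in software as below. The sketch assumes the binary field GF(2^163) with the reduction polynomial x^163 + x^7 + x^6 + x^3 + 1 used by the NIST 163-bit binary curves; the paper does not name its reduction polynomial, so this choice is an assumption. Inversion by repeated squaring is likewise just one possible way to realise the division module.

```python
M = 163
# Assumed reduction polynomial: x^163 + x^7 + x^6 + x^3 + 1
POLY = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1

def gf_add(a, b):
    """Addition in GF(2^m) is a bitwise XOR (no carries)."""
    return a ^ b

def gf_mul(a, b):
    """Shift-and-add multiplication with interleaved modular reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> M:          # degree reached m: subtract (XOR) the polynomial
            a ^= POLY
    return result

def gf_sqr(a):
    return gf_mul(a, a)

def gf_inv(a):
    """Inverse as a^(2^m - 2), the product of a^(2^i) for i = 1..m-1."""
    result, power = 1, gf_sqr(a)
    for _ in range(M - 1):
        result = gf_mul(result, power)
        power = gf_sqr(power)
    return result

def gf_div(a, b):
    return gf_mul(a, gf_inv(b))
```

For any non-zero element `a`, `gf_mul(a, gf_inv(a))` evaluates to 1, which is the property the division module depends on.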

1.4 Secure Hash Algorithm 1

The Secure Hash Algorithm 1 (SHA-1) hash function designed in this paper is based on the FIPS 180-2 Secure Hash Standard [10]. SHA-1 processes each 512-bit data block as 32-bit words to generate a 160-bit message digest. The SHA-1 digest is used within the Keyed-Hash Message Authentication Code (HMAC). The HMAC takes a key and part of the sender's message and creates an authentication tag.
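In software, the same tag computation can be reproduced with Python's standard `hashlib` and `hmac` modules. The key and message below are placeholder values, and `make_tag` and `verify_tag` are hypothetical helper names for this sketch.

```python
import hashlib
import hmac

def make_tag(mac_key: bytes, message: bytes) -> bytes:
    """Return the 160-bit (20-byte) HMAC-SHA-1 authentication tag."""
    return hmac.new(mac_key, message, hashlib.sha1).digest()

def verify_tag(mac_key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(mac_key, message), tag)

key = b"placeholder-mac-key"           # assumed example key
msg = b"part of the sender's message"  # assumed example message
tag = make_tag(key, msg)
```

A receiver holding the same key accepts the message only if `verify_tag(key, msg, tag)` is true. Changing a single byte of the message or the tag makes the check fail.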

1.5 AES Hardware Architecture

The proposed AES architecture was modeled as a round-pipelined architecture. Processing is done by iterating over the round modules ten times in pipelined fashion. This design approach significantly increases the operating frequency. The Mix Columns unit operates on one column of the state matrix and multiplies it by a fixed matrix whose entries are 0x01, 0x02 or 0x03. The design also uses a BRAM-based S-box and a pipelined inner round to maximize the operating frequency.
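The Mix Columns multiplication can be made concrete with a short software model. The sketch below follows the standard AES definition, with multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1; it is not a description of the hardware unit itself, and the function names are illustrative.

```python
def xtime(b):
    """Multiply a byte by 0x02 in GF(2^8) with the AES reduction polynomial."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b

def mul3(b):
    """Multiply by 0x03, i.e. (0x02 * b) XOR b."""
    return xtime(b) ^ b

def mix_column(col):
    """Multiply one state column by the fixed matrix with rows (02 03 01 01) rotated."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ mul3(a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ mul3(a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ mul3(a3),
        mul3(a0) ^ a1 ^ a2 ^ xtime(a3),
    ]

column = [0xDB, 0x13, 0x53, 0x45]
mixed = mix_column(column)   # [0x8E, 0x4D, 0xA1, 0xBC]
```

Since the only constants are 0x01, 0x02 and 0x03, each output byte needs just a conditional XOR for the doubling plus a few further XORs, which maps directly onto LUTs.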

Fig. 2. Simulation Waveform.

2 Proposed ECIES Hardware Architecture

The main focus of this research is to implement the ECIES standard in hardware while improving upon the ECDH key exchange scheme. As stated in section 1.3, the core module of this scheme is the Diffie-Hellman key exchange which is computationally expensive to design in hardware.

Figure 3 shows the complete proposed architecture for the encryption phase of the communication. The controller generates the enable signals that trigger and schedule the execution of the individual modules in the architecture. Each module returns a done signal to the controller when its operation completes. The controller of the proposed architecture was modelled as a finite state machine (FSM).
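The enable/done handshake between the controller and the modules can be pictured with a small behavioural model. The state names, their order and the latencies below are hypothetical; the paper does not list the actual FSM states, so this is only a sketch of the scheduling idea.

```python
from enum import Enum, auto

class State(Enum):      # hypothetical state names, one per scheduled module
    IDLE = auto()
    PRNG = auto()
    PUBLIC_KEY = auto()
    SHARED_KEY = auto()
    HASH = auto()
    ENCRYPT = auto()
    DONE = auto()

ORDER = list(State)     # members in definition order

class Controller:
    def __init__(self):
        self.state = State.IDLE

    def clock(self, start=False, done=False):
        """Advance one clock cycle; return the module being enabled, or None."""
        if self.state is State.IDLE:
            if start:
                self.state = State.PRNG
        elif self.state is State.DONE:
            self.state = State.IDLE
        elif done:      # the active module reported completion
            self.state = ORDER[ORDER.index(self.state) + 1]
        active = self.state not in (State.IDLE, State.DONE)
        return self.state if active else None

def run(latencies):
    """Drive the controller with modules that each take a fixed number of cycles."""
    ctrl, trace = Controller(), []
    enabled = ctrl.clock(start=True)
    while enabled is not None:
        trace.append(enabled.name)
        for _ in range(latencies[enabled.name] - 1):
            ctrl.clock()                 # module busy, done still low
        enabled = ctrl.clock(done=True)  # done pulse moves to the next module
    return trace

# Made-up latencies in clock cycles, for illustration only.
trace = run({"PRNG": 2, "PUBLIC_KEY": 5, "SHARED_KEY": 5, "HASH": 3, "ENCRYPT": 4})
```

The resulting `trace` lists the modules in the order their enable signals were raised, showing that each module starts only after the previous one has signalled done.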

As the simulation waveform in Figure 2 shows, the generated random number is the private key of the sender. This private key is then input to the ECC processor core. On an enable signal from the controller, the public key is generated with the Montgomery ladder algorithm using the base point G (x, y) defined on the elliptic curve E agreed on by both parties. The Montgomery algorithm was implemented in projective coordinates. The shared session key can then be generated from the private key of the sender and the public key of the recipient.

The Key Derivation Function (KDF) takes the shared key as input and, based on a cryptographic hash, generates a pair of keys (ENC_key and MAC_key). The ENC_key is used for the block cipher encryption and the MAC_key is used to create the message authentication tag, which is a one-time message authentication code. The HMAC block computes the authentication tag (MAC_tag) from the MAC_key and the plain text. The recipient of the encrypted text generates its own copy of the MAC_tag and compares it with the one received in the cryptogram. If the two are equal, the recipient goes on to decrypt the text; otherwise the whole text is discarded. The cryptogram sent at the end of the process consists of the encrypted text, the sender's public key and the MAC_tag.
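The data flow of this paragraph can be sketched end to end in Python. Several parts are assumptions of the sketch rather than features of the design: the key derivation is a simple counter-suffixed SHA-1, a hash-based XOR keystream stands in for the AES block (the Python standard library has no AES), and the tag is computed over the encrypted text so that it can be checked before decryption. All function names are hypothetical.

```python
import hashlib
import hmac

def derive_keys(shared_key: bytes):
    """Assumed KDF: hash the shared key with two different suffixes."""
    enc_key = hashlib.sha1(shared_key + b"\x01").digest()[:16]
    mac_key = hashlib.sha1(shared_key + b"\x02").digest()
    return enc_key, mac_key

def stand_in_cipher(key: bytes, data: bytes) -> bytes:
    """XOR keystream used here INSTEAD of AES; applying it twice decrypts."""
    out = bytearray()
    for i in range(0, len(data), 20):
        block = hashlib.sha1(key + i.to_bytes(4, "big")).digest()
        out.extend(d ^ k for d, k in zip(data[i:i + 20], block))
    return bytes(out)

def ecies_encrypt(shared_key, sender_public, plaintext):
    enc_key, mac_key = derive_keys(shared_key)
    ciphertext = stand_in_cipher(enc_key, plaintext)
    mac_tag = hmac.new(mac_key, ciphertext, hashlib.sha1).digest()
    return sender_public, ciphertext, mac_tag      # the cryptogram

def ecies_decrypt(shared_key, cryptogram):
    _sender_public, ciphertext, mac_tag = cryptogram
    enc_key, mac_key = derive_keys(shared_key)
    expected = hmac.new(mac_key, ciphertext, hashlib.sha1).digest()
    if not hmac.compare_digest(mac_tag, expected):
        return None                                # tags differ: discard
    return stand_in_cipher(enc_key, ciphertext)

shared = b"example shared key bytes"               # placeholder ECDH output
cryptogram = ecies_encrypt(shared, b"sender-public-key", b"hello ECIES")
recovered = ecies_decrypt(shared, cryptogram)
```

A recipient who derives the same shared key recovers the message, while any modification of the encrypted text or the tag causes `ecies_decrypt` to return `None`.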

3 Experiment and Discussions

To evaluate the performance and efficiency of the designed system, the throughput and efficiency parameters defined by [7] are used. The design was implemented on a Virtex-5 FPGA device to enable comparison with other designs [11], and also with the design of [7], which was implemented on a Virtex-4.

Fig. 3. Encryption Phase of the Proposed Hardware Architecture.

$$\text{Throughput} = \frac{\text{Operating frequency} \times \text{Number of bits}}{\text{Number of clock cycles}}, \qquad \text{Efficiency} = \frac{\text{Throughput (Mbps)}}{\text{Area (slices)}} \qquad (2)$$
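As a worked check of these two formulas, the short sketch below recomputes two rows of Table 2. The bit widths are assumptions inferred from the text: 163 bits for the point multiplier (the key length stated in the abstract) and 128 bits for AES (the standard AES block size, not stated explicitly in the paper).

```python
def throughput_mbps(freq_mhz, bits, cycles):
    """Throughput in Mbps: operating frequency (MHz) x bits / clock cycles."""
    return freq_mhz * bits / cycles

def efficiency(throughput, area_slices):
    """Efficiency in Mbps per slice."""
    return throughput / area_slices

# ECC point multiplier: 500 MHz, 49580 cycles, 3637 slice LUTs, 163 bits assumed
ecc_tp = throughput_mbps(500, 163, 49580)
ecc_eff = efficiency(ecc_tp, 3637)

# AES: 350 MHz, 45 cycles, 1455 slice LUTs, 128-bit block assumed
aes_tp = throughput_mbps(350, 128, 45)
aes_eff = efficiency(aes_tp, 1455)

print(f"ECC: {ecc_tp:.2f} Mbps, {ecc_eff:.7f} Mbps/slice")
print(f"AES: {aes_tp:.1f} Mbps, {aes_eff:.2f} Mbps/slice")
```

Under these assumed bit widths the computed values agree with the corresponding table entries to within rounding.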

Table 2. FPGA implementation performance of the proposed ECIES processor compared with [7] and [11].

| Module | Area (Slice LUTs) | Max. Frequency (MHz) | # Cycles | Throughput (Mbps) | Efficiency (Mbps/slice) |
|--------|-------------------|----------------------|----------|-------------------|-------------------------|
| ECC point multiplier (proposed) | 3637 | 500 | 49580 | 1.64 | 0.0004519 |
| ECC point multiplier [7] | 14203 | 263 | 3404 | 12.5 | 0.0008800 |
| ECC point multiplier [11] | 4815 | 550 | 52012 | 1.72 | 0.0003513 |
| HMAC (SHA-1) | 1320 | 206 | 193 | 170.78 | 0.1293000 |
| PRNG | 495 | 382 | 164 | 379.6 | 0.7600000 |
| AES | 1455 | 350 | 45 | 995.5 | 0.6800000 |
| ECIES | 10190 | 206 | 159600 | 1.32 | 0.0001200 |

Table 2 shows the implementation and performance results. In the comparison with [7] and [11], the proposed point multiplier occupies the smallest area. Although its operating frequency of 500 MHz is lower than that of [11], its efficiency is slightly higher because of the reduction in area. The performance of the other individual modules in Table 2 is determined using equation (2). The complete ECIES design produces a 1024-bit output and therefore achieves an efficiency of 0.00012.

4 Conclusion

In this paper we have proposed an ECIES hardware design. The design was implemented on Virtex-5 and Virtex-7 FPGA devices, with the performance figures shown in Table 2. The current design is presented without the Key Derivation Function (KDF), which will be implemented in a future design. In place of the KDF, the original shared key and its hash are used as the MAC_key and the ENC_key respectively.

Acknowledgments. This research was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the Global IT Talent support program (IITP-2017-0-01681) and the Human Resource Development Project for Brain scouting program (IITP-2016-0-00352) supervised by the IITP (Institute for Information and Communication Technology Promotion).
