Randomness and Security: A Primer on Entropy and DRBGs

August 5, 2016

When it comes to systems security engineering, randomness is everything. So many exploit mitigation technologies and cryptographic primitives depend on statistically sound randomness that getting it wrong undermines the system as a whole. That said, many people hold common misunderstandings about both randomness and entropy, and a great deal about the subject generally goes untaught.

Hopefully this primer, written largely from a FIPS 140 and Common Criteria perspective, will address those gaps and provide a clear view of the subject without getting too far into the weeds.

Where Do We Use Randomness?

Random numbers show up in various places in a properly implemented, modern computer system. Some places where unpredictability is either an asset for, or key to, systems security are:

  • TCP/UDP source port numbers
  • TCP Sequence Numbers
  • Process IDs
  • Stack Canaries
  • Stack base and heap allocation offsets in ASLR

And, of course, randomness is incredibly important to cryptographic operations. Some of the places where random numbers materialize in that context include:

  • Cryptographic key material
  • Initialization Vectors
  • Salts

True Randomness vs Pseudo-Randomness

True randomness is incredibly hard to measure. A bit string such as 11011101110111011101 might legitimately be produced by a true random number generator, yet it clearly contains a repeating pattern. Proving that a source of entropy (i.e., randomness) is actually producing random values is notoriously difficult. Many sources have a bias which renders them ineffective as a source of random material in and of themselves.

Not every true random source has a bias, but when dealing with ‘true’ randomness, how would you know? Functionally, this is a problem.  However, for most purposes, pseudorandom numbers are sufficient, assuming they are generated in a safe way.  That means running the raw entropy through a whitening algorithm of some sort.
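
As a small illustration of what whitening can mean, the sketch below applies the classic von Neumann debiasing trick to a simulated biased source. This is only one simple technique and is not taken from the standards discussed here; it assumes the raw bits are independent of each other, and the 80% bias and the seed value are made up for the example.

```python
import random


def von_neumann_debias(bits):
    """Whiten a bit sequence: pair 01 -> 0, pair 10 -> 1, pairs 00 and 11 are dropped."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out


# Simulate a biased source (assumption: independent bits, 80% ones).
# random.Random is NOT cryptographically secure; it only stands in for a noisy source.
rng = random.Random(2016)
raw = [1 if rng.random() < 0.8 else 0 for _ in range(100_000)]
whitened = von_neumann_debias(raw)

print("raw ones fraction:     ", sum(raw) / len(raw))
print("whitened ones fraction:", sum(whitened) / len(whitened))
print("bits kept:             ", len(whitened), "of", len(raw))
```

The price of removing the bias is throughput: most of the raw bits are thrown away. In practice, systems tend to condition raw entropy with cryptographic functions instead, which is where the DRBGs described next come in.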

Deterministic Random Bit Generators

The National Institute of Standards and Technology (NIST) Special Publication 800-90A originally laid out four approved DRBG models. Three of these are still considered to be cryptographically secure, whereas the notorious Dual_EC_DRBG is now well known to be kleptographic in nature. Dual_EC_DRBG was removed in Revision 1 of SP 800-90A due to the inherent insecurity of the mechanism.

The currently approved mechanisms are:

  • HASH_DRBG
  • HMAC_DRBG
  • CTR_DRBG

In a HASH_DRBG implementation, raw entropy (discussed later) is collected and pushed through an approved Secure Hash Standard (SHS) hashing algorithm. These are currently in the SHA-2 family, and include SHA-256, SHA-384 and SHA-512. The bit stream of the hash value is considered to appear sufficiently statistically “random” in nature.
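
The sketch below shows only the core idea of pushing raw samples through a hash; it is not the Hash_DRBG of SP 800-90A, which also keeps internal state between calls. The function name `condition` and the deliberately poor input (one unpredictable byte padded with zeros) are assumptions made for the illustration.

```python
import hashlib


def condition(raw: bytes) -> bytes:
    """Compress raw samples into a 32-byte SHA-256 digest."""
    return hashlib.sha256(raw).digest()


# Hypothetical worst case: the "raw entropy" is a single unpredictable byte
# padded with constant zeros, so only 256 different inputs are possible.
outputs = {condition(bytes([b]) + b"\x00" * 63) for b in range(256)}

# One output on its own has no visible structure...
print(condition(b"\x2a" + b"\x00" * 63).hex())
# ...but an attacker who knows the input format can enumerate every output.
print(len(outputs), "possible outputs in total")
```

The point of the example is that hashing makes the output look statistically random without adding any entropy: the output is only as unpredictable as the input that went in.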

In an HMAC_DRBG, a Hash-based Message Authentication Code (HMAC) algorithm is applied to the entropy pool in order to generate a pseudorandom bit stream. This requires pulling a larger number of raw entropy bits, as some of those bits serve as the key for the HMAC function, which is applied to another input bit stream to produce the result.
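
For readers who want to see the moving parts, here is a minimal sketch of an HMAC_DRBG with SHA-256, following the update/generate structure in SP 800-90A as I understand it. In that structure the key K and the value V are both internal state derived from the seed material. The class name is hypothetical, and the sketch leaves out the reseed counter, request-size limits, additional input on generate, and health tests, so it should not be used for real keys.

```python
import hashlib
import hmac
import os


class HmacDrbgSketch:
    """Simplified HMAC_DRBG (SHA-256). Illustrative only, not a validated implementation."""

    def __init__(self, entropy: bytes, nonce: bytes = b"", personalization: bytes = b""):
        self._k = b"\x00" * 32  # initial key: all zero bytes
        self._v = b"\x01" * 32  # initial value: all 0x01 bytes
        self._update(entropy + nonce + personalization)

    @staticmethod
    def _hmac(key: bytes, data: bytes) -> bytes:
        return hmac.new(key, data, hashlib.sha256).digest()

    def _update(self, provided: bytes = b"") -> None:
        # Mix the provided data (possibly empty) into K and V.
        self._k = self._hmac(self._k, self._v + b"\x00" + provided)
        self._v = self._hmac(self._k, self._v)
        if provided:
            self._k = self._hmac(self._k, self._v + b"\x01" + provided)
            self._v = self._hmac(self._k, self._v)

    def reseed(self, entropy: bytes) -> None:
        self._update(entropy)

    def generate(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            self._v = self._hmac(self._k, self._v)
            out += self._v
        self._update()  # advance the state so earlier output cannot be recomputed from it
        return out[:n]


# Seeded from the operating system's generator for the demo.
drbg = HmacDrbgSketch(os.urandom(32), os.urandom(16))
print(drbg.generate(16).hex())
```

Two instances built from the same entropy and nonce produce identical output, which is exactly the deterministic behavior the name promises.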

CTR_DRBG requires pulling three input streams:

  • One to use as the Initialization Vector (IV) for the AES counter-mode cipher
  • One to use as the AES key
  • One to use as the “plaintext” value to feed into the AES_CTR cipher block.

Entropy: The Obvious Weakness with DRBGs

The key word in Deterministic Random Bit Generator is “Deterministic.” Looking at the approved models, you can see that they are all based on cryptographic primitives and modes of operation which will ALWAYS produce the same output given the same input. Obviously, then, the value of the DRBG’s output rests not only on the correctness of the implementation, but on the value of the input (i.e., raw entropy).

Raw entropy is not necessarily truly random, but is close to it. There are many sources of raw entropy on a given system and, generally speaking, no one of them is good enough in and of itself. For example, in the FreeBSD kernel source, we can see that ‘raw’ entropy is derived from the following sources:

static const char *(random_source_descr[]) = {
        "CACHED",
        "ATTACH",
        "KEYBOARD",
        "MOUSE",
        "NET_TUN",
        "NET_ETHER",
        "NET_NG",
        "INTERRUPT",
        "SWI",
        "FS_ATIME",
        "UMA", /* ENVIRONMENTAL_END */
        "PURE_OCTEON",
        "PURE_SAFE",
        "PURE_GLXSB",
        "PURE_UBSEC",
        "PURE_HIFN",
        "PURE_RDRAND",
        "PURE_NEHEMIAH",
        "PURE_RNDTEST",
        /* "ENTROPYSOURCE" */
};

That is, 11 sources of environmental randomness, including keyboard/mouse events, interrupts, filesystem access time updates, and network packets. There is also the ability to pull from a hardware “true” random number generator/noise source. On Intel processors, this is the PURE_RDRAND source. Note, of course, that calls to RDRAND actually return the output of a CTR_DRBG implementation inside the processor, and not raw noise such as that from the free-running ring oscillators in the Secure Enclave Processor (SEP) of an iPhone.

Any one of the environmental noise sources could be gamed. If you were only using NET_ETHER as a noise source, for instance, a remote attacker would be able to control the entropy on your system just by sending packets over the network. Taken together, though, having 11 sources of environmental randomness plus a hardware noise source gives enough variety to significantly raise the bar for an attacker looking to compromise the validity of a cryptographic system.
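
The benefit of combining sources can be sketched with a toy entropy pool that hashes labeled samples together. This is not how FreeBSD's random device is implemented; the `EntropyPool` class, the helper `pool_with`, and the sample values are all hypothetical, and only the source names are borrowed from the list above.

```python
import hashlib


class EntropyPool:
    """Toy pool: every sample is length-prefixed, labeled, and fed into SHA-256."""

    def __init__(self):
        self._h = hashlib.sha256()

    def add(self, source: str, sample: bytes) -> None:
        label = source.encode()
        # Length prefixes keep ("AB", "C") distinct from ("A", "BC").
        self._h.update(len(label).to_bytes(2, "big") + label)
        self._h.update(len(sample).to_bytes(4, "big") + sample)

    def digest(self) -> bytes:
        return self._h.copy().digest()


def pool_with(keyboard: bytes, interrupt: bytes) -> bytes:
    pool = EntropyPool()
    # Assume the attacker fully controls what arrives over the network.
    pool.add("NET_ETHER", b"attacker-chosen packet timing")
    pool.add("KEYBOARD", keyboard)
    pool.add("INTERRUPT", interrupt)
    return pool.digest()


# Same attacker input, slightly different local events: unrelated digests.
a = pool_with(b"\x13\x37", b"\x00\x9c")
b = pool_with(b"\x13\x38", b"\x00\x9c")
print(a.hex())
print(b.hex())
```

Controlling NET_ETHER alone is no longer enough here: to predict the digest, the attacker would also have to know every keyboard and interrupt sample that went into the pool.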

Measuring Entropy

NIST SP 800-90B defines not only what entropy is, particularly in the context of cryptographic systems, but also all of the tests which one needs to run in order to prove that the entropy collection on the system is sufficient to seed a DRBG without compromising downstream randomness with predictable output.

For those who are interested in the details, they can be found in the NIST publication. For purposes here, we’ll say that what is really important is measuring the “worst case scenario,” which is the “min-entropy value.” When measuring raw entropy on a system such as the one described above, you might expect to come in around 5-6 bits per byte of min-entropy. If one were to run the same tests on data pulled from the output of the DRBG (i.e., pulling 1MB of random data out of /dev/random), you would expect around 7.5 bits per byte of min-entropy.
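
To give a feel for what “min-entropy” measures, the sketch below computes the simplest possible estimate, based on the most common byte value: the negative base-2 logarithm of that value's observed frequency. This is a simplification of one estimator only; the SP 800-90B procedure adds a confidence bound and runs a whole battery of estimators, so the numbers printed here will generally be more optimistic than what the NIST tools report. The function name is hypothetical.

```python
import math
import os
from collections import Counter


def mcv_min_entropy(samples: bytes) -> float:
    """Naive min-entropy estimate in bits per byte from the most common value."""
    if not samples:
        raise ValueError("need at least one sample")
    p_max = max(Counter(samples).values()) / len(samples)
    return -math.log2(p_max)


# 1 MB from the operating system's generator, and a heavily biased stand-in
# for a poor raw source (three of every four bytes are zero).
good = os.urandom(1 << 20)
poor = bytes(b if i % 4 == 0 else 0 for i, b in enumerate(good))

print("OS generator:", round(mcv_min_entropy(good), 2), "bits per byte")
print("biased data: ", round(mcv_min_entropy(poor), 2), "bits per byte")
```

Because the estimate depends only on the single most likely value, it captures the worst case for a guessing attacker rather than the average case.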

The tools for measuring entropy can be found here. Gathering raw entropy from the kernel pools is beyond the scope of this article.

Applicability

For engineers working on products pursuing FIPS 140 or Common Criteria certifications, entropy is of prime importance. As stated, the entire strength of a cryptographic system hinges on the entropy source, as weak entropy will lead to predictable input to the DRBG, leading to predictable output of the DRBG. This means your cryptographic key material will be predictable as well.

When dealing with either FIPS or CC, your entropy source must be described in detail in the documents submitted to the government. In CC, this document is called an Entropy Assessment Report (EAR) and is submitted at the beginning of a validation. It describes how entropy is collected and how the DRBG works, and it requires sampling raw entropy from the kernel pools and running the statistical tests provided by NIST.

NSA’s Information Assurance Directorate (IAD) reviews the EAR and the raw entropy files in order to determine whether the entropy source is sufficiently strong to provide assurance that, assuming correct implementation of the DRBG and other cryptographic primitives, the cryptographic system in use in the target of evaluation will provide the necessary security.

Even if you are not a product engineer and do not work at a test lab (as I do), understanding where the “randomness” on your system is derived from, and how it is leveraged through various security functions in the computers you are responsible for, is still highly relevant. And if you are a product engineer or involved in SDLC, ensuring that you understand the fundamental building blocks of system security will help you avoid or detect some of the more egregious mistakes and/or sabotages of your systems which some vendors have experienced in recent months.
