Here I am documenting issues with X.509 certificates and Public Key Infrastructure I have encountered.
In the grand tradition of true geeks I use the most compatible format, one that alien civilizations might still be able to read in millions of years - a simple text file (in a pre tag).
PKI Issues
Random collection by Elke Stangl, elke@punktwissen.at
------------------------------------------------------------------------
Certificate path validation

* Ambiguous chains and chains sent in the SSL handshake. The web server
  sends the chain it prefers. If there are two valid chains, such as a
  shorter chain associated with an internal root CA and a longer chain
  connected to a cross-certificate issued by a public CA, AND the server
  is available on 'internal' and 'external' networks (via a reverse
  proxy), it will send the untrusted internal chain to external relying
  parties as well.

* Some embedded devices cannot deal with chains - including earlier
  versions of Cisco PIX and Apple's iOS SCEP client. In order to get
  validation working you might need to import the subordinate CA to the
  root / 'CA' store, or add the thumbprint of the sub CA where one would
  expect that of the root CA, or vice versa.

* Some apps / devices cannot deal with a 'renewed' CA, that is: two CA
  certificates with the same subject name but different keys imported
  to the same CA cert. store. Unfortunately this is the default state
  of affairs if CA lifetimes are nested according to the shell model
  (e.g. CA certificates renewed at half of their validity period).
  Cisco fixed a related bug some years ago.

------------------------------------------------------------------------
Names and encoding

* CAs may change the encoding of subject names in the certificates
  they issue, relative to the encoding used in the request. The
  subscriber may not be happy with that - and it can be quite a
  challenge to track this down if the client is a custom-made device /
  black box.

* CAs may reorder the X.500 components (should we go O --> CN or
  CN --> O?), and again apps that rely on the binary name blob could
  fail.

* Details of the validation depend on the browser (version) used.
  I can't recall the versions, unfortunately, but some years ago one
  browser was happy to match certificates on names (neglecting
  encoding) while another did a binary check of names plus
  cross-checking AKI versus SKI fields.

* I was surprised to see that Windows clients fall back on name-only
  matching if they are not able to match on SKI / AKI. This gives the
  user a nice picture of a certificate chain; however, an error message
  tells you that the certificates may be corrupt.

------------------------------------------------------------------------
Revocation checking

* Devices may have size limits for CRLs - I recall 256 kB for some of
  the older (?) ones. This would cause VPN and the like to fail if you
  used, say, current CAcert certificates or those issued by the
  Austrian public CA, A-Trust.

* I have often seen Outlook fail when trying to download such large
  CRLs as well - although the CRL servers were accessible. Fortunately
  there are some registry keys that allow for tuning the way Outlook
  deals with CRLs and related errors. Unfortunately you cannot manage
  the registry keys of the e-mail clients that receive your e-mail.

* OCSP is a solution to overcome the size issue, but not necessarily
  the issue of current revocation information. The Windows OCSP server
  retrieves information from a CRL, and the validity period of OCSP
  responses is either that of the CRL used or that of the OCSP signing
  certificate (the latter is two weeks by default). Sure, the caching
  behavior can be configured so that the OCSP server consults the CRL
  more often. Yet the responses sent to relying parties are still
  'long-lived'. As I understood the options, the only way to really
  purge responses at the client earlier is to use an HTTP Expires
  header at the OCSP server and hope that the OCSP client respects it.

* Deleting CRLs regularly should be a built-in option of PKI-enabled
  servers. VPN servers (Cisco, Nortel, Juniper) have been able to do
  this for a long time.
  Then you can configure CRLs in a way that allows for reasonable
  operations (that is, solving the issue: what happens if the CA runs
  into an issue while the CEO gives the yearly motivation speech on
  Dec. 24, 11:30 - when will you be able to spot the problem?). CRLs
  would be allowed to live for, say, a week, but are purged at the
  validating server every, say, 3 hours. With Windows, you can do this
  in principle, since Vista / Server 2008 provide a supported option to
  delete CRLs - but you need to create scripts to do it.

------------------------------------------------------------------------
How apps use certificates for authorisation (in probably unexpected ways)

* Certificates might be used as files to be parsed for name-value
  pairs. I found something like an 'authorisation scheme' coded into
  X.500 name fields.

* So-called LDAP group memberships: while some devices understand
  memberOf attributes, some so-called groups are based on parsing
  X.500 names, such as putting everybody with OU=External in the
  'external group', 'external VLAN' etc. It can be a challenge to
  reconcile this with a concept of real groups in LDAP directories
  such as Active Directory.

------------------------------------------------------------------------
How users don't expect PKI-enabled apps to work.
(This could probably be used as a title for anything in this file)

* CRLs are blacklists, but they are not only used for blacklisting in
  the way admins expect. Often people are surprised that network logon
  etc. will fail simply because the CRL is not accessible or has
  expired.

* Sent items of encrypted e-mails in Outlook are encrypted. This comes
  as a painful surprise to users who had used smartcards (e.g. the
  Austrian National ID certificates issued by A-Trust) to encrypt their
  mails and whose card, used basically for other purposes (health
  insurance), has been retired / cut in two pieces.
  Ironically, it does not help that new cards are issued with the same
  keys, because Outlook tries to find the associated certificate in the
  store first before 'accessing' the key (via the CSP).

* CRLs cannot necessarily be pre-fetched - though this is what admins
  would like to do whose internal AD logon depends on certificates and
  CRLs issued by an external provider. Of course you can build all
  sorts of hacks, such as mirroring an external LDAP server,
  periodically polling for CRLs etc.

* Windows NTAuth store and the number 1 misconception of how
  certificates are used for logging on to AD: UPNs in the SAN are
  automatically mapped to UPNs in AD (DNS names for machines). This is
  a string-based mapping - not a binary comparison of certificates or
  hashes - and the security hinges on the fact that the issuing CA's
  certificate has been distributed via an attribute in the so-called
  NTAuth object in AD's configuration container. This means that if you
  somehow manage to get a highly privileged admin's UPN into a
  certificate issued by an NTAuth-entitled CA, you could impersonate
  that admin (logging in using a smartcard, for example). That's why it
  is a really bad idea to 'delegate' management of an enterprise CA AND
  management of certificate templates (the definitions of how cert.
  content is constructed and how certs. are issued - such as allowing
  for arbitrary names in requests) to the administrators of a child
  domain who in principle only want to issue certificates to their own
  users or machines.

* Certificates are not necessarily more secure than machine logon in a
  Windows environment - comparing EAP-TLS using certificates configured
  as non-exportable (as per cert. template) and PEAP-TLS. Hacking the
  latter would require transferring / extracting the machine's
  password / Kerberos secrets / system state. 'Hacking' the former is
  not hacking at all, as the 'not exportable' option can be overruled
  by a local administrator at enrolment.
  Since Vista / 2008 this can be done in the GUI (certmgr.msc); before
  that you needed to craft your key and request with certreq and submit
  it to the CA in a separate step.

* The advantage of certificates over PEAP-TLS is that they are more
  standards-compatible - but still the process can be painful (to equip
  print server boxes with certificates, for example). To let iPhones do
  802.1x logon (to AD) via WLAN you need to add host/machine.domain.com
  to the subject CN (so that the device sends the correct string) and
  machine.domain.com to the SAN (so that AD-based mapping against the
  dnsHostName attribute works). And of course you need a dummy /
  shadow object in AD with that DNS name and a service principal name
  of host/machine.domain.com.

* Accessing 'public' CAs' CRLs is more difficult than expected - in
  particular if the validation is done by machine entities. Servers
  (such as an Exchange server that should check CRLs for e-mail
  certificates on behalf of a web access user, or 'internal' web
  servers that should validate users' logon certificates) often cannot
  access 'the internet', and/or a proxy server is used in the context
  of users but not in the context of machines.

------------------------------------------------------------------------
Processes and the human factor

* It is always the seemingly simple processes and logistics that go
  wrong - that is: scheduling CA renewal, or issuing a CRL signed by an
  offline CA, which happens only infrequently. This is also true for
  well-managed environments.

* Offline CAs escape the usual monitoring processes. There is an inside
  joke about carefully naming an offline CA (e.g. the virtual machine)
  so that it does not get deleted accidentally because 'it is never
  online'. Since I have encountered such an incident - a classical
  unfortunate chain of events - I don't laugh anymore.

* Freshly minted PKI consultants often take a very academic, PKI
  theological ((C) Peter Gutmann) approach. I was no exception.
  But who really needs three tiers for an internal, "device /
  infrastructure" PKI?

* Eternal CRL as fall-back solution. I have seen processes re HSM
  management go wrong too often. Thus I recommend creating a CRL that
  will be valid until the related CA's certificate expires. In case an
  HSM is rendered inaccessible, this CRL will provide business
  continuity.

------------------------------------------------------------------------
CA Operations

* CRL publication can fail due to the CA's issues with writing the CRL
  file to the file system. A virus scanner once locked the temporary
  .tmp file, and a (Windows) CA was not able to rename it to .crl.

------------------------------------------------------------------------
Law and politics

* Digital signatures on invoices transmitted electronically were
  mandatory in Austria for a few years before the law was changed. I
  wonder how agencies will ever check the signatures applied in these
  years by wildly varying technologies - XML signatures, signed PDFs
  (including CRLs or not, including time stamps or not), signatures
  stored on / provided by server-side components such as the 'mobile
  signature'...

* I wonder how cross-country checks of signatures on PDFs are ever
  going to work. Legal cross-certification does not imply technical
  compliance. For validating Austrian Qualified signatures (ECC) with
  Adobe Reader you need to install a plug-in AND know how to configure
  advanced security settings. Otherwise error messages are misleading.

* Time stamps were not mandatory for digitally signed invoices in AT.
  Yet, Adobe Reader will report signatures as invalid in the future if
  only the computer's clock time has been embedded. Fortunately some
  PDF signers allow for embedding CRLs or OCSP responses.
* My impression is that (in Central Europe) governmental organizations
  or organizations closely related to agencies are 'motivated' to use
  PKI-based technology provided by those CA operators that were
  originally founded to bring PKI and digital signatures to the masses.

------------------------------------------------------------------------
Enigmatic stuff to be investigated

* For some Windows 2008 R2 CAs built from scratch with a software-based
  key I saw the CA 'suddenly' losing access to its keys after some
  service restart, after it had run properly for some days. I thought
  it was some issue with DPAPI protection of system keys, probably when
  some unsupported virtualization software is used. Now I rather think
  it is due to a 'confusion' of chains: at the CA, its own certificate
  is present in different cert. stores, the Personal store being
  associated with the private key, the CA store not so. But then I have
  also seen some private keys being indicated for certificates in a
  non-Personal store - causing some of the chains (in case of renewed
  CAs) to fail while others still work.

------------------------------------------------------------------------
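A note on the CRL purging idea from the revocation checking section (CRLs
valid for, say, a week, but purged at the validating server every, say,
3 hours): the decision a validating server has to make can be sketched as
follows. This is only a minimal illustration in Python, not how any of
the products mentioned implement it. The function names, the constants
and the example dates are hypothetical and chosen for this sketch.

```python
from datetime import datetime, timedelta

# Assumed values, taken from the "say, a week" / "say, 3 hours" example.
CRL_LIFETIME = timedelta(days=7)
PURGE_INTERVAL = timedelta(hours=3)


def must_refetch(fetched_at, next_update, now, purge_interval=PURGE_INTERVAL):
    """Return True if the cached CRL should be dropped and downloaded again.

    The cached copy is dropped when it has expired (now >= next_update)
    or when it has been cached for at least purge_interval.
    """
    expired = now >= next_update
    stale = now - fetched_at >= purge_interval
    return expired or stale


def max_cache_age(crl_lifetime, purge_interval):
    """Longest time a validator keeps relying on one cached CRL copy."""
    return min(crl_lifetime, purge_interval)


# Hypothetical timeline: CRL issued in the morning, fetched an hour later.
issued = datetime(2024, 12, 24, 8, 0)
next_update = issued + CRL_LIFETIME
fetched = datetime(2024, 12, 24, 9, 0)

print(must_refetch(fetched, next_update, datetime(2024, 12, 24, 11, 30)))  # False
print(must_refetch(fetched, next_update, datetime(2024, 12, 24, 12, 0)))   # True
print(max_cache_age(CRL_LIFETIME, PURGE_INTERVAL))                         # 3:00:00
```

The point of the two separate numbers: the long lifetime gives operators
time to notice and repair a CA that has stopped publishing, while the
short purge interval limits how long a validator keeps relying on a
cached, possibly outdated copy.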
Personal website of Elke Stangl, Zagersdorf, Austria, c/o punktwissen.
elkement [at] subversiv [dot] at. Contact
(Not sure if I will ever update this.)