DB2 Field Procedures (FieldProcs) were introduced in V7R1 and have greatly simplified encryption, often without requiring any application changes. Now you can quickly encrypt sensitive data on the IBM i including PII, PCI, PHI data in your physical files and tables.
Watch this webinar to learn how you can quickly implement encryption on IBM i. During the webinar, security expert Robin Tatam shows you how to:
- Use Field Procedures to automate encryption and decryption
- Restrict and mask field level access by user or group
- Meet compliance requirements with effective key management and audit trails
In the webinar, you’ll see our Encryption for IBM i [formerly Crypto Complete] solution. Encryption for IBM i will automatically create and manage the FieldProcs needed for encrypting your database fields. Our software also includes the security controls, key management, and detailed logging needed to pass audits and meet privacy regulations.
There is no better time to encrypt sensitive data on your IBM i. Register and watch today.
Alright, well, good day, everybody. Welcome to our session on encryption on IBM i, with a focus on a newer technology, I shouldn't say completely new, but a newer technology called Field Procedures that assist us with encrypting far more easily than we've been able to do in the past.
My name is Robin Tatam, I'm the Director of Security Technologies here at HelpSystems. I'm going to be your guide today as we discuss some of these features, the pros and cons of different things, and certainly invite you to ask any questions that you have in the chat window of the WebEx. If you don't see that, click the little speech bubble icon, then it should be displayed for you.
So, not really an agenda per se, but these are the primary topics that I want to touch on in some capacity in the limited time that we have today to give you some ideas and insight into how we approach encryption on the IBM i platform. So, we're going to talk about a basic introduction to encryption, first of all of course, and then we're going to talk about how we've done encryption in the past, some of the options that were at your disposal, and then we're going to introduce these DB2 Field Procedures or FieldProcs as they're typically referred to and what they can do for you, in order to simplify encryption, how you get started with them, some considerations when you use them, and I'm also going to share with you a little bit of information about a product within our portfolio that leverages Field Procedures as one of several options in order to simplify even further the compliance regulation that many of us have to start encrypting data.
So, let's talk about data risk initially. Certainly, there is a lot of information stored on IBM Power Systems servers. We've been running these boxes for many years, and we have various database functions that are inherent in most of our applications. We have people that are accessing this information, static information, dynamic information, through applications like RPG and COBOL applications. We have people that are doing dynamic SQL. We have a number of system utilities like DFU and DBU that are commonplace for accessing and manipulating data within files. We have Query/400, of course, the staple that we all use to view and extract data from our files using a green screen interface, and then if we want to do it from a PC, we have connections over ODBC and JDBC. So there are lots of different ways the databases are accessed with tools. Now typically, we hope that this is legitimate access from authorized users, but, of course, we have to set the expectation that there could be an occurrence where it's an external hacker, or potentially a disgruntled or rogue employee, that is making that access, and they're not using the approved access methodology, and we've got to make sure that that database is protected, so that even if they are able to obtain the data, it's in a form that is going to have limited use for them.
Now, in addition to just the database, we also potentially have the movement of data from one point to another over things like FTP, or email, we want to make sure that that data transmission is encrypted as well so that people can't just what we call sniff the traffic off the network.
We use protocols like TLS and other types to ensure that that tunnel is in an encrypted form and is not detectable or readable to anybody else. Then we also have to consider backups; a lot of people will have protection from IBM i security on their objects, and then they back those objects up, of course, for disaster recovery purposes, forgetting that that tape could potentially be restored to a different system where the authorities are different or the user privileges are greater, and all of a sudden their data is visible or accessible to somebody. And, it still surprises me the number of breaches that are attributed to lost or misplaced backup tapes or media, and we want to make sure that that doesn't become our point of risk.
If you've heard of the Ponemon Institute, you're probably familiar with their study published every year that speaks predominantly to the cost associated with data breaches. So, they break down the breaches for a year. They look at the type of breach that it is, the number of records that are breached, and they tally up some of the costs that are involved, including of course notifications, the damage to public perception if it is a business-to-consumer type organization, instigating credit monitoring, and then, of course, the trail of lost business, regaining the trust of their customers, and potentially lawsuits. I don't know if any of you are impacted by the recent breach of the credit monitoring service here in the US. I actually found out about 10 minutes ago that my data was included in that, so now I've got to start making sure that I have good observation into anything that they're offering with regards to monitoring that credit report. Certainly there's a lot of tangible and intangible cost, but the interesting thing is that it did actually decrease in 2017 over what it was previously.
And people always take that as a positive sign (“Okay, we may be breached, but the cost is coming down and really 3 million dollars is not that big of a deal in the scheme of things”), but what we've got to do is break that down and look at what caused that reduction in cost. It certainly wasn’t a reduction in the number of breaches; those were actually up, probably not surprisingly. But the cost per record is down, and what I would say is that there are several factors that they have published that influence that. Not least of which is the increased use of encryption at the database level, so even though we may have seen additional data being lost, the cost per record actually is coming down because the mitigation costs overall have been reduced, thanks to things like encryption limiting the ability of the perpetrators to read and process that data. It's also worth noting the other two causes: the incident response team, meaning people, were positioned better and were able to respond in a more timely manner, which is absolutely crucial, and just something as simple as employee training. I know in the IT field, I think we all sit here and think, “Well, surely nobody would actually click that link in the email,” but the number of users who actually do is quite staggering, and it simply comes down to the fact that employees need to be counseled and educated and shown the way when it comes to recognizing a phishing attack or some type of email-based intrusion on their system to ensure that they're not falling victim to that type of activity.
Oftentimes, I get the question about what type of data should we be looking for? What should we protect? The first knee-jerk reaction I think people have is, “Well, let's just encrypt everything. If the entire system is encrypted, then I don't have to worry about all this nitty-gritty stuff or finding out where this data may reside.”
Unfortunately, that is not the answer.
In fact, if you encrypt everything, or more than you need to, it actually has a tendency to start generating patterns in the data.
In fact, the Enigma Machine encryption in World War II was broken based on the fact that somebody was transmitting the same sentence at the end of all their transmissions, and that gave the people trying to break that encryption a pattern to work off of. So, you generally don't want to encrypt more than necessary, but what should you encrypt? Well, I've provided a list here of some of the most common things. PCI data or credit card information is something mandated by the Standards Council if you are processing credit cards, which is not overly common on IBM i. I think a lot of people have outsourced that, but if you are storing credit card data, then it needs to be in an encrypted form. If you have any personally identifiable information, which is more common – social security numbers, birth dates, driver's license numbers in some instances – something that can be deemed PII data, it should be in an encrypted form, to make sure that if it is somehow subjected to a breach it is not usable information.
Health-related information has certainly bubbled up as being very profitable for hackers. That's partly because the credit card numbers have become more difficult now that we've gone to at least chip and signature in the US, if not chip and pin elsewhere in the world.
But the thing about credit card data is, it has a short life span. If somebody takes your credit card and you see charges showing up, you report it to your credit card company, and they cancel your card immediately and issue a new one, so that credit card number is now dead. With healthcare information, especially here in the US where we have a lot of folks without insurance and have high cost healthcare, this is very valuable information. And if it is being abused, there’s really not much you can do to stop it. You can't cancel your health information, so if somebody is using that type of data, it becomes very difficult, first of all to discover, but also to do anything about it.
Other things like bank account numbers, PINs, payroll data, and financial data about your organization are all the type of data that we should be protecting. And I think that's pretty obvious.
There may also be things that are unique to your business and that's a statement to those of you that are thinking in the back of your mind, “there is nothing on my system that would be deemed really sensitive.”
Everybody has something, and it may or may not warrant encryption, but it generally does warrant some type of authority protection, even if it's a matter of just limiting the permissions that are granted to the users. So, we want to make sure that that's there, but encryption is just a great way of saying that if that fails for some reason, if those layers are traversed in some way, then ultimately the data is not going to be usable.
Alright, so let's start with a very, very basic introduction to encryption as really the process of encoding information so that it is protected in the instance that it is accessed from some unauthorized mechanism or user. The goal of encryption is not to hide the existence of the data, but to obfuscate the meaning of the message, and that's been true since the beginning of time. Encryption dates back thousands of years, in order to be able to create a message that only the intended recipient was able to read.
Now this data is encrypted using an algorithm and a key, and the key is the key, right? So, that is the most sensitive part. We want to make sure that that key is protected. Now, we start with plaintext data, which in my example is just an English phrase from a nursery rhyme.
The output that comes from that is known as Ciphertext. The data is still there, it's still visible, but it's not readable, unless you have the ability to decrypt it back to its plain text form.
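The plaintext-to-ciphertext step described here can be sketched in a few lines of Python. This is a toy illustration only: it XORs the data with a stream derived from the key using SHA-256, which is not AES and not something to use in production. The phrase and the key values are placeholders chosen for the example, not the ones shown in the webinar.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(data: bytes, key: bytes) -> bytes:
    """XOR the data with the key-derived stream."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

# XOR is its own inverse, so decryption is the same operation with the same key.
decrypt = encrypt

plaintext = b"Mary had a little lamb"             # placeholder nursery-rhyme phrase
ciphertext = encrypt(plaintext, b"correct key")    # still present, no longer readable
recovered = decrypt(ciphertext, b"correct key")    # the right key restores the plaintext
garbage = decrypt(ciphertext, b"wrong key")        # the wrong key does not
```

The point of the sketch is that the ciphertext has the same length and is plainly visible, yet only the holder of the key can turn it back into the original phrase.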
Some common terminology in encryption includes Cipher, which is actually a pair of algorithms, maybe the same algorithm, designed to perform the encryption and decryption functions. The most common Cipher in the US at the moment is AES, the Advanced Encryption Standard. That's a little bit newer than some of the others that are out there, and it's supported and indicated by NIST, the National Institute of Standards and Technology, as being kind of the de facto standard, certainly in the US Government at the moment.
Be careful, some of the older Ciphers that are out there have indeed been broken. DES, the forerunner to Triple DES, has actually been broken: as PCs and attacking methodologies became stronger, it became possible to break these encryption algorithms. So, you want to make sure you stay current, that you're staying on a strong Cipher, such as AES, and then that way you have a good chance of your encryption being able to withstand an attack.
Now I mentioned the key, and this is what controls the actual operations of that algorithm. The output is manipulated by the key, which is a bit representation, and it is the secret, just like in the physical world, where a key can open or close a lock. That's why we have to put so much emphasis on key management and the protection of those keys. Bear in mind, if you encrypt data and you lose the key, assuming you used strong encryption, there’s a good chance you're not getting it back. This is where ransomware takes hold. Ransomware is the encryption of a file or files on your server or your PC, and the hackers in that instance, the criminals, hold the key, and that's what, in essence, they're offering to sell you back for a Bitcoin payment. So, we want to make sure that we protect the key in this instance, where we're using encryption legitimately.
There’s two basic types of Key Cryptology. One is a Symmetric Key, where the sender and the receiver share the same key. You, of course, now have to make sure that that key is kept secret; it makes it a little bit difficult if you're in different places because how do I get the key from point A to point B without somebody potentially gaining access to that key? So that becomes a challenge, but this is known as secret key cryptology for that reason, again, using the same key to encrypt and decrypt. And that's typically what we're using when we're doing things like encryption of data at rest, where the data needs to be encrypted and decrypted on the same server. We also have Asymmetric Key Cryptology, and that's where we have a key-pair. There's a public key and a private key; the sender will encrypt the data using the recipient's public key, and then the recipient is going to use the matching private key to decrypt that information, so it's typically referred to as Public Key Cryptology. If you're doing transactional transmissions, then these are typically going to use an asymmetric key.
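To make the key-pair idea concrete, here is a minimal sketch of asymmetric encryption using textbook RSA with deliberately tiny primes. The numbers are illustrative assumptions and far too small to be secure; real systems use vetted libraries, large keys, and padding. In contrast to a symmetric cipher, the value used to encrypt (`e`) is different from the value used to decrypt (`d`).

```python
# Toy RSA key pair built from two tiny primes (illustration only).
p, q = 61, 53
n = p * q                   # modulus, part of both the public and the private key
phi = (p - 1) * (q - 1)
e = 17                      # public exponent: anyone may know (n, e)
d = pow(e, -1, phi)         # private exponent: modular inverse (Python 3.8+)

def encrypt_with_public(m: int) -> int:
    """What a sender does with the recipient's public key."""
    return pow(m, e, n)

def decrypt_with_private(c: int) -> int:
    """What the recipient does with the matching private key."""
    return pow(c, d, n)

message = 65                # a small number standing in for the data
ciphertext = encrypt_with_public(message)
recovered = decrypt_with_private(ciphertext)
```

Because only the private exponent can reverse the operation, the public key can be handed out freely, which avoids the point A to point B key-sharing problem of the symmetric case.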
Now, we talk about encryption on IBM i as pre 7.1 and post 7.1. Prior to 7.1, you really had two basic database options. One was to use API calls inside your application. These APIs were provided by IBM and allowed you to encrypt data prior to a write or update and to perform a decryption prior to displaying or printing that data. This required you to modify your database to support the extended character set and length typically involved in encrypted forms of data, and it also required you to go through your entire application looking for everywhere that file was read and those fields were accessed, so that you could perform the encryption and decryption function. I always described this as being similar to a "Y2K project”. It wasn't difficult, but it was very intensive and took a lot of work to do and maintain.
I was one of the authors of the IBM Redbook on data encryption, and I liken the API approach to being something very beneficial for people that write once and deploy many, such as software vendors who could put these API calls into their application because, of course, they own the source code and could do that, and then it was deployed to many installed bases, and that was really beneficial. It's very efficient and effective, but for the average shop, you and I, it's not very beneficial. We don't always own the source code, and we're not overly familiar with the application and don't want to tear through it making such extensive modifications.
Okay, another option is that you can use column triggers to automatically encrypt data on a write or an update, which is less invasive, meaning that the application doesn't really know what's happening. It sends the data out to the file, and just before it hits the file, that trigger will launch, encrypt the data, and then write the information to the file. That means we can limit the amount of modification necessary to the application, which is great.
Now, there are some performance considerations there, using triggers on basically every file I/O, and the big downfall is that column triggers work when you write and update data but not when you read data. Now, there's certainly a read trigger, but there is a hardwired restriction in the operating system that prevents a trigger program fired on a read operation from changing the data stream. And that's a protection mechanism because, if you think about it, if somebody did put a trigger on a file and they altered the data every time you read it, no matter how you accessed that data, it would always come back in a form that is not necessarily what is stored in the file. So, you could have a value of blue in the file, and no matter what you did to read that data, it would come back as yellow, and it would also hide the fact that the data wasn't actually accurate. So, IBM put that restriction in there, and that limits our use of triggers.
Now, one thing I didn't list here is the idea of doing Disk Encryption, and disk encryption is often seen as the silver bullet because it encrypts the entire disk unit. And, I didn't list it here because it's really not intended to be done at a field level. It's really more of a system level function, and it's really beneficial if you're using remote disks because it protects the flow of data from the hardware card inside the cache to the disk unit in the end location. The problem is, when most of us have local or internal disk, it doesn't really accomplish much because any type of access, any file I/O we do, will be presented with the plain text version of that information. Regardless of whether you do an FTP, an ODBC connection, a copy file, an RPG read, a DFU, you name it, it's all presented with the plain text version. And, if that's a hacker running that type of activity, then they've just obtained the plain text version of your data, which is what you were trying to protect against. When you have local disk, the only thing the disk encryption is really going to protect you from is somebody pulling the disk unit out of the server and walking away with it, which is not very realistic in the average data center, and thanks to single level store technology within your IBM i server, pulling a drive out of IBM i is not going to allow you to simply insert that drive into another system and start reading it.
It scatters that data across all the drives that are in the system, and they're not going to be readable without re-initializing the drive. So this is not like a Windows Server, where we're able to just yank the drive, insert it, and read it. Disk encryption has its place, but it is not the silver bullet when it comes to encrypting your data for protection.
Alright, now I mentioned there is a database change. That's because encrypted data is typically going to use an extended character set and have an extended length in comparison to normal fields. So, for example, if I have a six numeric field or a nine numeric field, it may end up being a 16 alphanumeric field that holds that data. That was part of the reason that, prior to 7.1, we had so many issues: we had to change the structure of the database, or we had to manually maintain some type of shadow file that contained the encrypted version of the data and some type of token to match the two together. There were also some limitations on the type of data that could be encrypted in that format, and so it was a real challenge.
Now, post 7.1, we had a new option. It was called Field Procedures, typically referred to as FieldProcs, and these were born in 7.1, in essence, as an enabling technology that allows us to simplify the process of encryption. It doesn't actually perform any task right out of the box, it's simply there to allow you to facilitate this activity.
In many instances, it's able to minimize or potentially even eliminate any application changes.
How does it work?
Well, two-fold. One is, at the database level, the database manager was enhanced so that it could store the encoded value for the field behind the scenes. It's not really hidden from us, in fact that's the only thing we see, but it means that we don't have to change our database structure, we don't have to change the types of fields, we don't have to change the lengths of the fields; the database handles all that behind the scenes. That's supported under DDS-described physical files, the traditional DB2 files that many of us still have, as well as SQL-defined tables. It is possible to indicate that a file has a field procedure on one of the fields in it, and that will indicate that, much like a trigger at a field level, there's some type of action that needs to be performed when that field is read or written to in the file. Field procedures also handle multi-member files, so if you're still using multi-member physical files, there is still support for that, which is very helpful.
Now field procedures are not overly different from the concept of triggers in that once the I/O happens outside of the application, the operating system will invoke the field procedure, which can actually perform any function. For most of us, the intent and the use of a field procedure is for encryption and decryption, but it is actually a user written program and, therefore, it can do any task that you want. If you wish to add a field procedure to a file, you actually use SQL to do so. The syntax for it is shown at the top here, it's the ALTER TABLE directive and what we tell the system to do is alter the table or the file that we're interested in, we're looking to alter the column which is another way of giving the name of the field and then we assign the field procedure program to that field.
We can't have any other locks active on the file while that statement runs, so certainly we have to consider that when we do this for the first time. We also have to have object alter permission to the file as well as use permission to the field procedure program.
So in essence, you're probably going to be somebody who has some level of authority on the system. This is not something the end user would suddenly pop on to the field. This is done at a security officer type level. What happens when you assign a field procedure is that it will go through and invoke the field procedure for that field in every record of the file. So, if this is an existing file with a lot of data, you want to consider that from a performance or impact perspective.
In essence, what's happening is the operating system is recognizing that we want the field in this table to be encrypted, and step one is to encrypt the existing data, so it does a mass encoding function of the necessary field values. Of course, we recommend sending this to batch; it can take some time, and if there are other locks that try to go on to the file as it runs, then you're going to run into issues. So if you're a 24-7 shop, this is certainly something that we would work with you on and give you some advice on how best to implement it.
Likewise, if you wish to remove a field procedure, you simply run another ALTER TABLE statement. This time, you're going to indicate that you wish to drop the field procedure, and the operating system will immediately start to perform a mass decode function of the field that was protected by that field procedure. So in essence, you're doing a mass encryption when you turn field procedures on, and you're doing a mass decryption when you turn the field procedure off for some reason.
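For readers who want to see the shape of the two statements just described, the sketch below assembles them as text in Python. The statement form (`ALTER TABLE ... ALTER COLUMN ... SET FIELDPROC` and `... DROP FIELDPROC`) follows the Db2 for i syntax as I understand it; the library, file, column, and program names are hypothetical and only there for illustration. Check the SQL reference for your release before running anything like this.

```python
def set_fieldproc_sql(library: str, table: str, column: str, program: str) -> str:
    """Statement text that attaches a field procedure program to one column.

    Running it triggers the mass encode of existing values described above.
    """
    return (
        f"ALTER TABLE {library}.{table} "
        f"ALTER COLUMN {column} SET FIELDPROC {library}.{program}"
    )

def drop_fieldproc_sql(library: str, table: str, column: str) -> str:
    """Statement text that removes the field procedure (mass decode follows)."""
    return f"ALTER TABLE {library}.{table} ALTER COLUMN {column} DROP FIELDPROC"

# Hypothetical names: library PAYLIB, file EMPMAST, column SSN, program SSNFPROC.
attach = set_fieldproc_sql("PAYLIB", "EMPMAST", "SSN", "SSNFPROC")
detach = drop_fieldproc_sql("PAYLIB", "EMPMAST", "SSN")
```

Either statement needs exclusive use of the file while it runs, so in practice it would be submitted to batch during a quiet window.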
So, I mentioned that when you encrypt data, it typically has a different format and often a different length than the original field. Here we have an example from a display file field description command, a bit of a mouthful there. This one happens to be a social security number, it's a PACKED field that is nine numeric, and that is not going to support encrypted data.
So, what do we do? Well, in the old world, we had to modify this file, change it, recompile it, copy the data back in, and then modify our application anywhere that this field was referenced. Now, what we see is that there's actually a field procedure associated with this field in the file. That means the operating system is maintaining a space for the encoded value. So, it still looks to us as if it is nine numeric, but the reality is, the encoded space behind the scenes is an alphanumeric field 16 long.
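One plausible reason a nine-digit packed field ends up in a 16-byte encoded space is the block size of the cipher. The sketch below works through that arithmetic under two assumptions that the talk does not state: that the cipher is AES, which works on 16-byte blocks, and that PKCS#7-style padding is used. The actual layout depends on the field procedure program.

```python
BLOCK_SIZE = 16  # AES operates on 16-byte blocks

def packed_decimal_bytes(digits: int) -> int:
    """Storage for a packed-decimal field: two digits per byte plus a sign nibble."""
    return digits // 2 + 1

def padded_cipher_length(plain_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Length after PKCS#7-style padding, which always adds 1..block_size bytes."""
    return (plain_bytes // block_size + 1) * block_size

ssn_plain = packed_decimal_bytes(9)           # nine numeric, packed
ssn_encoded = padded_cipher_length(ssn_plain)
six_encoded = padded_cipher_length(packed_decimal_bytes(6))
```

Under these assumptions both the six-digit and the nine-digit field fit in a single block, which matches the 16-long encoded space shown in the field description.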
Alright, now the nice thing about using field procedures is that, first of all, it masks the complexity associated with this change, and it also does not cause a record format level check. So if you're familiar with creating physical files and, say, RPG or COBOL programs, you know they are sensitive to the order in which objects are created: you typically have to create the file first and then the program that references the file. In this instance, we can change the file without causing a level check that would require all the RPG or COBOL programs to be recompiled. So, this is huge, just in and of itself.
Lots of different encode events will occur. In other words, these are the things that cause the field procedure to be invoked. First, native accesses to the table: if you are updating or writing data to the file, that will invoke the field procedure to perform, presumably, the encryption task. If you're using SQL inserts or updates, it will also do that, so that will handle any type of script, or if you have embedded SQL, it will take care of that. If you're using certain CL commands, like CPYF, RGZPFM, or STRDFU, that will also invoke the field procedure, so you don't have to worry that certain mechanisms will and others won't invoke this engine. If you're still using Query/400, or if you run a SQL statement that has a select in it, then be aware that that will potentially invoke an encode event. Now, why would it encode if you're just reading data from the file? Well, in this instance we have a WHERE clause indicating that we want to select the social security number and the name field anywhere that the social security number matches the given value.
Now there's two choices here. One is to decrypt the entire file so that it can find that match. Well, that seems like a lot of work and performance overhead, so the operating system is a little smarter than that. It actually takes the given value, encrypts that using the field procedure, and now it can compare the encrypted value of the given setting to the encrypted value in the database, so it can do a match using encryption, and that will have a negligible impact from a performance perspective, which is great. If we have certain database operations, and these are some keyword examples from RPG, then we may also get that benefit. For example, with CHAIN and READE, it will encode the given value and then match that to the database. If, however, we do a set lower limit (SETLL) or set greater than (SETGT), we do have to consider that it's potentially going to have to decode all of the values in the file to find out which ones are greater than the given setting.
We have a similar list for the decode events, again being handled by the field procedure. From a native perspective, anything that is pulling data from the file potentially can decode that data. Again, if you're doing a READE or a CHAIN, it's not going to be a big performance impact, but if we are doing other types of database activities, that's certainly a consideration. If we're doing SQL select and fetch, if we're doing query, it's the same type of thing here. In this example, I did give an instance where the social security number is supposed to be greater than a given value, in which case it's not going to be as streamlined because it has to decrypt the stored values to compare them. Other types of things, like file transfer utilities, even if you're accessing data from your PC, or using a system command like DSPPFM or CPYF, are all going to be subject to the processing of the FieldProc.
A strong consideration is whether your application is sorting on data that is about to be encrypted, because the sort order of that information in encrypted form may not fall in the same order, and probably won't, as it did in the plain text version, right? So here we have an employee master file. We've got a file layout with an employee ID, employee name, and social security number, and it is sorted by the employee ID. In this instance, if we decide that the employee ID is something that we wish to encrypt, and we put the field procedure onto the file using our ALTER TABLE directive here, then when we go to read the file in what we believe to be the order of the employee ID, we're actually potentially going to get a misordered list, because it's now reading the encrypted field values, which may not be in the same order, and when we subsequently decrypt those values, they may appear to be coming in a random order. So we have to consider that. If you're using CHAINs or READEs, versus what we're doing here, which is just to read through the file, then you probably don't have an issue, but you may also want to consider using a SELECT statement in some embedded SQL with the ORDER BY clause, and this will actually decrypt that value before sorting the data and reading it. So, there is a performance implication, but it prevents the ordering problem. Ideally, I don't recommend sorting on encrypted data.
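The difference between an equality match and a greater-than comparison can be sketched by counting how often a stand-in "field procedure" is called. Everything here is a simplified model: `encode` and `decode` are toy functions (a fixed XOR stream from a hypothetical key), the stored list plays the role of the file, and the real database engine is of course far more involved.

```python
import hashlib

KEY = b"demo-key"  # hypothetical key, for illustration only
calls = {"encode": 0, "decode": 0}

def _stream(length: int) -> bytes:
    return hashlib.sha256(KEY).digest()[:length]

def encode(value: str) -> bytes:
    """Toy deterministic encoding: the same input always gives the same output."""
    calls["encode"] += 1
    data = value.encode("ascii")
    return bytes(b ^ k for b, k in zip(data, _stream(len(data))))

def decode(stored: bytes) -> str:
    calls["decode"] += 1
    return bytes(b ^ k for b, k in zip(stored, _stream(len(stored)))).decode("ascii")

# Load four made-up social security numbers, then reset the counters.
stored = [encode(s) for s in ["123456789", "987654321", "555443333", "111223333"]]
calls.update(encode=0, decode=0)

def find_equal(target: str) -> list:
    """Equality: encode the search value once, compare encoded values directly."""
    wanted = encode(target)
    return [i for i, s in enumerate(stored) if s == wanted]

def find_greater(target: str) -> list:
    """Range: encoded values do not preserve order, so every row is decoded."""
    return [i for i, s in enumerate(stored) if decode(s) > target]

equal_rows = find_equal("555443333")
calls_after_equal = dict(calls)
greater_rows = find_greater("500000000")
calls_after_greater = dict(calls)
```

In this model the equality search costs one encode regardless of file size, while the range search costs one decode per row, which is the trade-off described for CHAIN and READE versus SETLL and SETGT.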
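The ordering problem is easy to reproduce with a toy model. Below, a simple reversible scramble (multiplication modulo a small prime) stands in for the field procedure; the modulus, multiplier, and employee IDs are made-up values, and a real cipher would scramble the order in a less predictable way. Reading "by key" walks the stored, encoded values in order and decodes each one, while the ORDER BY approach sorts the decoded values.

```python
MODULUS = 101      # small prime, toy value
MULTIPLIER = 37    # hypothetical stand-in for the key

def encode(emp_id: int) -> int:
    """Toy reversible scramble standing in for the encrypting field procedure."""
    return (emp_id * MULTIPLIER) % MODULUS

def decode(stored: int) -> int:
    return (stored * pow(MULTIPLIER, -1, MODULUS)) % MODULUS

employee_ids = [1, 2, 3, 4, 5]
stored_keys = [encode(e) for e in employee_ids]

# Keyed sequential read: the access path is ordered by the *encoded* values.
keyed_read = [decode(k) for k in sorted(stored_keys)]

# ORDER BY on the column: values are decoded first, then sorted.
order_by_read = sorted(decode(k) for k in stored_keys)
```

With these numbers the keyed read returns the employees as 3, 1, 4, 2, 5, while the ORDER BY read returns 1 through 5, at the cost of decoding every row before the sort.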
Some other things that you probably need to know. If you use a CRTDUPOBJ, it will duplicate any field procedure definitions on that file, so it'll create a new file, and it'll still point to that same field procedure, alright. If you use the CPYF, which typically just moves the data from file A to file B, it will first decode the values in the from file and then move them to the to file, and if that happens to have a field procedure, then it will do a mass encryption on that data as well, so be cognizant of that if you're doing CPYFs, say, in a CL program.
As a user, you have to have authority to the field procedure, okay? You only need “use” permission, but without that permission, the database can't access that data, so it will fail out. You'll get a CPF4236 and some text indicating that the user is not authorized to open that file. Very importantly, you need to make sure that you back up your field procedure programs with your files because they're not automatically included. That's important because if you back up a file that has encrypted information in it and a field procedure attached, and you restore that file to a different system, you most likely won't be able to decrypt the data because the field procedure program will not be there to be invoked to perform that decrypt function. So, consider that when you are planning your save and restore strategy. There is a select statement that you can run through interactive SQL that will show the contents of your file fields, and you will see information on there as to whether there is a field procedure attached to that field. So, this is a nice way of establishing “am I even using field procedures today?” and later on determining which ones they are.
One of the most common questions I get is performance. There is definitely consideration that needs to be paid here. I'm not going to say it's a performance issue because, for many people it's not, but it's certainly a good consideration. There's two parts of this, of which we did some testing. Test one is the actual addition of the field procedure. Now, why would that take some time? Well, because if you remember, it does a mass encode of the existing data. So, what we did was add a field procedure to a file that had a million records in it. If we added multiple field procedures, we saw an exponential increase in the amount of time it took to add each field procedure. Not surprising because what happens is that when IBM i adds that second and third field procedure, it first decrypts the data associated with the first, and then re-encrypts it all again.
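As a rough way to picture why adding field procedures one at a time gets expensive, here is a small Python cost model. It is only a sketch under a stated assumption, not measured data: it assumes that adding the i-th field procedure decodes the columns that are already encoded and then encodes all of them again, and it compares that with a hypothetical single change that adds every field procedure together. The function names are made up for illustration.

```python
def column_passes_one_at_a_time(num_fieldprocs: int) -> int:
    """Full-column passes when field procedures are added one by one.

    Assumed model: adding the i-th field procedure decodes the (i - 1)
    columns that are already encoded, then encodes all i columns.
    """
    return sum((i - 1) + i for i in range(1, num_fieldprocs + 1))


def column_passes_all_at_once(num_fieldprocs: int) -> int:
    """Full-column passes when every field procedure is added together.

    Nothing is encoded yet, so each column is simply encoded once.
    """
    return num_fieldprocs


for n in (1, 2, 3):
    print(n, column_passes_one_at_a_time(n), column_passes_all_at_once(n))
```

Under this model the one-at-a-time cost grows with the square of the number of field procedures, while the all-at-once cost grows linearly. Each "pass" touches every record, so on a million-record file the gap widens quickly.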
So our recommendation is to spend a little bit of time and design what you want to encrypt and if there are multiple columns or fields in the file that you wish to have managed by a field procedure, add all the field procs at once because then it is a relatively minor increase in the processing time than it would be if you added each field procedure individually. The second test was to read encrypted data and what we're doing here is pulling data in and decrypting it so it's not just the reading of the data, but it's the invocation of the field procedure.
If you have no field-procs associated with the file, what we saw was an RPG-embedded SQL statement was much slower than a native RPG read, but once we started adding field procedures, notice that that flip-flopped. All of a sudden, now the RPG-embedded SQL was more efficient and faster, so you may want to consider that as you're designing new applications. As we add more field procedures, again, there is an increase in each time because we have to decrypt multiple columns or fields now, with each database read.
And the challenge that people run into when they embark on an encryption project, especially when they're encrypting themselves, is that it can be very time consuming. If you have ever embarked on using the IBM-supplied APIs, you will discover that these are relatively complex, they take some research, you need to understand them, and then it's a matter of tearing through your application looking for anywhere that data is used and adding the calls to the APIs. As I said earlier, this is probably one of the better approaches from a performance standpoint, and would be conducive to somebody like a software vendor adding encryption to their own applications, less so to you and me doing it to either a home-grown application or potentially to a third-party application that the vendor is no longer going to support for us.
Part of this is because the application changes are pretty impactful. First of all, you have to change the database type. If it is a numeric field, it has to be changed to alphanumeric, and then we have to expand the field sizes to accommodate the encrypted data.
You also have to then tear through your application, as I mentioned earlier, to add those cipher functions into it. Sometimes the encrypted data can mess with your 5250 emulator because it's going to try to display characters that are not part of the normal emulator character set, so you may get some funky activity there on your screen with DSPPFM, for example. The encryption function itself is relatively straightforward, if time-consuming, but we have to recognize that the key is so critical here that handling key management is a big part of what we need to consider. Unfortunately, a lot of times we see that the keys are stored inside source code. That was actually part of the reason that there was a documented AS/400 hack, quote, unquote, according to Verizon. When people broke into their web services, they discovered that there were IBM i user IDs and passwords embedded in scripts, and they were able to access the system. So, storing keys in plain view, inside source code, is not recommended.
We also have to control who can create and manage those keys, potentially supporting split keys, so that no one person has that full responsibility. And it's quite difficult to rotate keys without decrypting and then re-encrypting all the data, which as we saw from the initial load can be quite time-intensive and require an exclusive lock on the file.
Some regulations require key rotation, meaning that five years in you can't be using the same key because if it was broken somehow it's going to continue to be so. We want to rotate those keys typically on a six-monthly or annual basis, and we want to use a function that supports that without having to do a mass decryption.
If you're writing your own encryption/decryption process, make sure that audit trails are included. Often that's not the case. Knowing when somebody is using the decryption function is probably the obvious one, but any type of key activity would also be pertinent to an auditor. Sadly, the programmers also end up knowing too much about this solution, and I'm not saying we are not a trustworthy group because, as a programmer myself, I certainly hold myself to good ethics. But at the end of the day, from a business perspective, we have to recognize that programmers should not reside on production systems unless there is a fire call, in which case we need to track their actions, and having them as the authors of your own encryption solution opens the door to potential abuse.
I've had a couple of you mention in the chat window that you're using Powertech Encryption for IBM i, which is great. That's a solution provided by HelpSystems that came to us a couple of years ago, or a year ago now, through an acquisition of Linoma Software. They have a great solution that we were excited to bring into the fold that simplifies the task by enabling the system to automatically create field procedure programs. I mentioned earlier that's an enabling technology. It doesn't provide the field-procs to you, so what we've done is do that on your behalf.
In essence, when we indicate in the software that we want a field to be encrypted, it will automatically generate the field procedure, it will attach it to the file and then it will support key management and decryption functions at multiple granular levels. It's no longer just a blanket "you either see the data or you don't." We also support the idea of masking, which is very beneficial if you still need limited access to the data. In the case of a social security number, seeing the last four digits, or a credit card number, the same thing, we can still allow that without necessarily giving the user full visibility.
We use IBM i authorization lists to decide how much the user is able to see, if anything, which makes it a very simple IBM i supported type of object to allow for decision-making inside that field procedure.
There is a global policy setting in crypto, meaning that we can do separation of duties and require multiple keys, which is good for PCI compliance, and we have key management built in. If you have an existing key manager such as TKLM or SKLM, which is an IBM-supplied key management solution, we can tie into that. So you don't have to reinvent the wheel. But for many of us, and I know during the initial discussion before this session, some of you said you hadn't started on an encryption project yet, you may not have a key manager. It's all built in. You don't have to buy anything else. Now, the security controls are there, so you can define who's allowed to create and manage the keys, who can encrypt and decrypt the data and what level of masking, if any, they're able to see. It supports strong encryption.
My preference is AES256. That's the strongest AES that we support and is far in excess of what most regulations call for and is going to keep you pretty well protected there.
The field encryption function is very simple. There is a field registry inside the software, where you indicate the name of the file and the field that you wish to have encrypted, and it will automatically encrypt the data using a generated field procedure, so you don't have to do anything. If you're not yet at 7.1, it can still do it using a column trigger, but there are some restrictions, so hopefully everybody’s at 7.1, bearing in mind that 7.1 is no longer supported in 2018, so you need to have a road map defined to get you onto 7.2 or preferably 7.3. But, assuming that you are at least at version 7, it will generate those field procedures for you. If you're not yet at version 7, we can generate a trigger program for you, but it does have some impact on your application for the reasons given earlier. We support pretty much all of the standard field types, so you don't have to worry about that. And again, our integration with IBM i means all of this happens behind the scenes. You don't have to do anything. It also supports IFS and backup encryption, so it's not just about DB2 protection, although that's obviously the primary content here. If you need to encrypt stream files in the IFS, you can do that, you can decrypt selectively, and if you are concerned with your backups not being encrypted, we do support that as well.
Now, here's how it works at the DB2 level.
Here's the crypto main menu, and I'm going to focus on just one option here, which is the field encryption menu, option 4, and then from there, I have the ability to work with that Field Encryption Registry, that's option 1 for me. So, I'm going to take option one here, and you'll see the list of the existing six fields that have been defined to the system indicating that they contain data that I wish to have protected.
Now, if I want to add a field to that, all I have to do is hit the F6 key to add. From there, I'm asked for a unique field identifier. My preference is to use a combination of the library, file and field name because I know that will be unique, and then I indicate a little bit of information about the field that I wish to protect. In this case, the name of the field and the file that it resides in. When that field is added, it will come back in a non-active status. That's because we've defined it in the registry, but we still have to actually activate it.
That's the process where the field procedure will be generated, it will be attached to the file field definition, and the first mass encryption will be performed, so that's why we don't activate it instantly because most people need to plan around that. But, there is an option to say, I want to activate that, and it will perform all of those background tasks for you automatically.
So how do we control who can see this data? Well, I mentioned that we use authorization lists, and we actually have a couple of them. In this case, we have the authorization list for those users that are allowed to see the full value. If a user exists on this authorization list with at least *USE permission, they're going to see the full, unrestricted version of that data element.
You also have a masked value authorization list, and if a user exists on here with at least *USE permission, then they're going to be able to decrypt the data in a limited view, so we'll be able to see partial data but not the full value.
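The two-list decision described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the product's implementation: membership in a Python set stands in for "is on the authorization list with at least *USE", the user names are made up, and giving the full-value list precedence over the masked list is an assumption.

```python
from enum import Enum
from typing import Set


class Visibility(Enum):
    FULL = "full"      # user sees the unrestricted value
    MASKED = "masked"  # user sees a partially masked value
    NONE = "none"      # user sees no part of the value


def visibility_for(user: str, full_list: Set[str], masked_list: Set[str]) -> Visibility:
    """Decide how much of an encrypted field a user may see.

    Assumption: the full-value list wins if a user is on both lists.
    """
    if user in full_list:
        return Visibility.FULL
    if user in masked_list:
        return Visibility.MASKED
    return Visibility.NONE


# Hypothetical list contents for illustration.
full_value_list = {"PAYROLLMGR"}
masked_value_list = {"CALLCENTER", "PAYROLLMGR"}

for user in ("PAYROLLMGR", "CALLCENTER", "DEVELOPER"):
    print(user, visibility_for(user, full_value_list, masked_value_list).value)
```

Keeping the decision in one small function like this is what lets the same rule apply no matter which tool or program triggers the read.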
The performance of this, of course, is paramount. So, we do support authorization list caching, which means that if a check is performed, we’ll record that check and keep it in memory, and we can reuse it without having to go out to the authorization list or use the operating system to tell us how much authority the user has. Now there is a trade-off here, of course. If we use caching, then performance is increased, but we do now have to recognize that an addition to or a removal from that authorization list may not have an immediate effect. So you have the option of choosing what makes sense for your application.
We also have the ability to define a field mask. So in the case of, let's say, a social security number, we have defined here that it has an option to mask, and there are multiple different masks based on the data type. In this case, option 2 allows me to specify the ability to see leading and trailing portions of the masked information. In this case, it's going to be a credit card number. We’re going to allow the user to see the first two characters and the final four characters of that credit card number, and any other characters are going to be overlaid with a hashtag or pound sign character.
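The leading-and-trailing mask is easy to picture with a short Python sketch. The function name, its parameters, and the rule for very short values are assumptions made for illustration, and the card number is a made-up test value.

```python
def mask_value(value: str, lead: int = 2, trail: int = 4, mask_char: str = "#") -> str:
    """Keep the first `lead` and last `trail` characters; overlay the rest."""
    if lead + trail >= len(value):
        # Assumption: values too short to mask meaningfully are fully overlaid.
        return mask_char * len(value)
    hidden = len(value) - lead - trail
    # Slice from an explicit index so that trail=0 yields an empty suffix.
    return value[:lead] + mask_char * hidden + value[len(value) - trail:]


card_number = "4111111111111111"  # made-up test value

print(mask_value(card_number))          # first two and last four visible
print(mask_value(card_number, lead=0))  # only the last four visible
```

Changing `lead` and `trail` per field gives the different mask shapes, for example showing only the trailing four digits.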
If we're not authorized to the value we're going to get an asterisk there, indicating that we're not allowed to see that data. Now, here is a view using a simple application of a customer master file. This is the one we were working with; it has a customer ID, a customer name, a credit card number, Social Security, bank ID, and a credit limit.
This is the ultimate file for a hacker to obtain information about. So, we've thrown all of it into one file. Hopefully you don't have files that are quite this sensitive, but I'm sure some of you probably do.
Now, with the field procedure active and the user not on either of the authorization lists, this is what we get to see. The application is providing us visibility to the data only for those fields that I have authority to, namely those that are not encrypted. This makes it difficult for me to abuse the data, but I can still see the data that perhaps I need access to. In this case, just the customer name. If I set the limitation to say, "This user is allowed to see the data, but only in a limited view,” then I can enact my masking simply by adding the user to the masked authorization list. I can hit F5 on my data, and the data will change to this.
You can see from the credit card number, I'm now able to see the leading two characters and the trailing four characters and anything else has been overlaid with the hash symbol, which is great. So now I have the ability to use the data but not abuse it by having the full information.
"This is great in the application," I hear you say, "but what about if the user simply DFUs the file or runs a query against it? Am I now going to see either garbage, or am I going to see the full value?"
Well, the answer is that in tools like Query and DFU, we’re still able to utilize those field procedures, so it's fully integrated with IBM i. In this case, the query has again returned the leading two digits and the trailing four digits of the credit card number and everything else masked, using our masking character. When we get out to the PC, same thing. Now this one had a slightly different masking applied to it, so we don't see the leading two characters, but we do see the trailing four. This is using an SQL connection, which happens to be Linoma’s great little utility called Surveyor/400, and then also IBM i’s own Data File Utility, or DFU, to access the data, so it's all compliant, regardless of the tools that you use.
I also mentioned that keys were critical. And so, we include key management here, which gives you a policy on how those keys can be created and managed. We assign who's allowed to do that, we support key rotation, so you can change those keys on a select basis without having to do a mass decryption, and we're also now able to store those keys in what's called a key store, so you can have numerous keys. Now, that key store is actually a Validation List Object, and if you're familiar with that, you'll know that it does support encrypting of data, so it's intended to store this type of information. Access to the key store is controlled through IBM i’s own object authority, and we can limit to a certain extent what can be done with even an All Object (*ALLOBJ) user ID. So, if you have too many privileged users, we don't love it, and we'd prefer you not, just for best practices, but we can set some limitations on those types of users. If you're not sure how to create a key, we can generate them for you randomly. So you get these strong encryption keys without having to get too deep into the weeds of cryptology and things that perhaps you're not familiar with, and then we can control through those authorization lists which users can encrypt and decrypt the data.
We're also auditing these activities, so if you want to know when somebody uses the decryption function, maybe it's a very infrequently used data element, and you simply want to know that somebody tried to decrypt it, then we can log that and notify you that it occurred.
Now, the key hierarchy is something I'm not going to get into too much depth here. If you're interested in cryptology, you're probably familiar with the idea of having multiple key levels, but it really breaks down to think of this: You have a key to your car. If somebody gets access to that key, they have access to the system, so we have to make sure that those keys are protected. So, think about taking your car perhaps to somewhere where they will park your car for you. Those car keys are kept in a box that has a master key associated with it. Without that master key you can't open the box, and you can’t access all the car keys inside.
Same concept here. We have a master encryption key that is designed to protect the data encryption keys that are actually doing the encryption and decryption functions. We also have a product encryption key, which is behind the scenes that protects that master key, so we take it even further.
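The layering can be sketched in Python as one key "wrapping" (protecting) another. This is a conceptual sketch only: the XOR-with-a-hash keystream below is a toy stand-in so the example runs with the standard library, not a vetted key-wrap algorithm and not how the product does it, and all function and variable names are hypothetical.

```python
import hashlib
import secrets
from typing import Tuple


def _keystream(wrapping_key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream for illustration only; not a vetted key-wrap algorithm."""
    if length > 32:
        raise ValueError("this toy sketch only handles keys up to 32 bytes")
    return hashlib.sha256(wrapping_key + nonce).digest()[:length]


def wrap_key(wrapping_key: bytes, key_to_protect: bytes) -> Tuple[bytes, bytes]:
    """Protect one key with another; returns (nonce, wrapped_key)."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(wrapping_key, nonce, len(key_to_protect))
    return nonce, bytes(a ^ b for a, b in zip(key_to_protect, stream))


def unwrap_key(wrapping_key: bytes, nonce: bytes, wrapped: bytes) -> bytes:
    """Recover a protected key; only works with the right wrapping key."""
    stream = _keystream(wrapping_key, nonce, len(wrapped))
    return bytes(a ^ b for a, b in zip(wrapped, stream))


# Three tiers, mirroring the description above (32 bytes = AES-256 key size).
product_key = secrets.token_bytes(32)  # protects the master key
master_key = secrets.token_bytes(32)   # protects the data encryption keys
data_key = secrets.token_bytes(32)     # actually encrypts field values

# Only the wrapped forms would ever be stored.
mk_nonce, wrapped_master = wrap_key(product_key, master_key)
dk_nonce, wrapped_data = wrap_key(master_key, data_key)

# To use the data key, walk the chain from the top down.
recovered_master = unwrap_key(product_key, mk_nonce, wrapped_master)
recovered_data = unwrap_key(recovered_master, dk_nonce, wrapped_data)
print(recovered_data == data_key)
```

The point of the structure is that a stored data encryption key is useless on its own: without the key one level up, the box stays locked.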
Again, don't get concerned with the layers here. For most of us, this is pretty transparent, just know that we do support very strong encryption and good key practices.
I mentioned that you can do backups with crypto, so you can save libraries and objects and encrypt them as you go, and that's whether you're sending them to disk or tape or other types of offline media. We can support key-based and password protection, and if you're using a tool like BRMS (Backup, Recovery and Media Services), which HelpSystems is now in charge of developing on behalf of IBM, then we have support in that, and that support is likely just going to continue to grow because of our authoring of those tools.
There are some native IBM i commands that you can embed in schedulers or CL programs, if you want to save and encrypt or decrypt as you go.
I also mentioned the IFS. This is a critical part of an encryption project if you are encrypting documents for storage in document management, for example. You want to make sure that all of those documents are protected and that they're not accessible to somebody from outside or somebody who doesn't have a business reason to be there.
We generate comprehensive audit trails that are stored in an IBM journal, which is a tamper-proof repository, meaning that you can't just simply go in and alter or remove entries. There are lots of different activities that you can choose to have audited, ranging from when the key settings are changed to when the keys are used, or when any encryption/decryption functions are performed, and then we can of course generate reports based on the type of audit record, the user or the date range.
So, as a summary, this product installs as a licensed product and uses less than 100 MB of disk space. You can install and start encrypting data in less than a couple of hours. And I would offer that if you're doing this on a test file, of which we do include some test data, you could do this in probably less than 20 minutes. So, it's very easy in a test environment to get this set up, to see how it works, to play with trying to DFU or query the file, and we'll help you get that set up. We do include good documentation and online help text to spell it all out, but our goal is to make that as simple as possible, and we’d be more than happy to help you, giving you a 30-day free trial of the full version of the software. Encryption is just a part of our overall security solution set, ranging from policy management to virus protection (yes, folks, you can get a virus on IBM i) to compliance reporting. We also have vulnerability assessment capabilities on not only IBM i but also AIX, and a number of our solutions, while predominantly IBM i, are also compatible with AIX, Linux, and Windows. So, as your data center is filling out with these other technologies, as I'm sure it is, just know that HelpSystems is continuing to expand into that business line, so that we're not only your go-to for IBM i security, but we also have experts on the other platforms as well.
I mentioned the free security scan or at least the assessment. There is a free security scan as part of that, and I certainly invite you to take advantage. I don't have a slide here, but I'm also going to throw out that we have a data analysis tool as well that can help search your system looking for data that matches certain patterns. So if you're not sure if you have sensitive data on your system, let us take a look. We can scan through your database and seek out some of that information that would help justify perhaps either encrypting the information or making some changes to your security configuration. Both of those tools are free to use, and we would be more than happy to facilitate that for you.
In fact, I'm going to go ahead and open up a poll right now. One of the questions here is, if you're interested in the scan, let us know. If you’d want more information on that before you make a decision, that's no problem. Let us know that, and we’d be happy to provide that to you.
Alright, well, I kind of whizzed through there, but that's the end of the content. Hopefully that's given you at least a good basic understanding of the options that you have. If you're at version 7.1, field procedures is almost always the absolute way to go, although not every time. So talk to us and we'd be happy to help you with it.
If you do have any questions use the chat window. We’ll be more than happy to answer those questions, and a few of you did that as we went. And I have a couple of minutes here, we'll try to get through as many as we can, but if we run out of time, I'll make sure we follow up with you after the fact.
Alright John, I'm going to give you the award for the longest question that anybody has ever asked me in eight years of webinars, so I'd be more than happy to get with you on that, but please bear with me, let me do that off-line. Otherwise, we'll use the whole time here just reading the question. Alright, let's see, a couple of you have told me that the WHERE clause on my SQL statement is incorrect, so I appreciate that, I will make sure I correct that. Yeah, it looks like I had some quotes around it, but we'll get that fixed. So Justin, you had a great question here, “What do field procedures protect against? It seems like anything you do, the system decrypts it for you.” That's actually not the case. That's very true of disk encryption because everything we do there, the disk controller will decrypt the data, meaning there’s no protection from data access. With field procedures, because it is actually an application program, it is going to be selective. Now in our case, with that crypto product, we use authorization lists that the field procedure will interrogate to decide what type of visibility the user has. If the user is not on either of the two authorization lists specified for that field procedure, they won't see anything.
And that's true of a system tool like DFU, as well as the application. So, the field procedure is able to make an intelligent decision as to who can see the data and how much of that data they can see, so it solves the issue of selective decryption. So hopefully that answers your question.
Alright, hopefully I’m pronouncing your name right, there. Once files are encrypted, will the DR test or actual DR take more time? It depends. If the data is encrypted at rest in the physical file and you move that data across, your DR test shouldn't be any different.
Now bear in mind a big caveat here. If you are encrypting data and you are doing high availability and/or have a DR plan, make sure you accommodate the idea that your key management solution, be it the internal one in crypto or an external one, is accessible to the backup machine. If you get that encrypted data replicated to your backup machine, but you don't have the field procedures or you don't have the keys, you're not going to be able to decrypt that data, so that is absolutely a consideration. It shouldn't be impactful from a performance or time perspective, but certainly when you're doing a role swap you want to make sure that you have that accommodated.
Roger: “The screen would have to change if the social security number was defined as a numeric field, and we're masking with the pound symbols.” Correct, yes. Now, if it's defined as a numeric, I probably wouldn't mask with the pound symbol, but I would mask with a numeric value. Alright, so sometimes that's a little tricky. If you expect a bunch of nines in that field, then that may be difficult, but I don't know if there are any reserved numbers in a social security number. I'm actually British, not American. So I don't know the answer to that, but I'm sure a quick Google would find out. But yes, you would potentially then change the hash symbol to something else to support it.
Data saved to a save file and restored on a different IBM i, what issues? I think maybe I addressed that, hopefully. Again, the save of the data is not the relevant part, the restore is not even the relevant part, the access to the file on the backup system is the relevant part, so you would need to make sure that you have access to the keys there as well. So again, it's just part of the planning. It's not difficult, it's just a new consideration that you haven't dealt with before, and it would just get added to a DR plan or backup recovery plan, and no big deal.
Alright, Roger, you’re noting that your ODBC or web services users are typically coming in as a generic user, and how can you make a distinction there? Yeah, that's a good question. One of the challenges with using service accounts behind the scenes when there is an application, and I know JD Edwards tends to do that, is that you can't differentiate, of course, based on the user ID, so that is certainly something that you'll want to look at. I don't have a quick and easy answer for you. There are potentially some things that could be done inside the field procedures. Now bear in mind, I mentioned that we auto-generate those field procedures for you because most people don't want to deal with them.
That's not to say that you can't use our APIs in a field procedure.
So in those instances, you could potentially interrogate something else, maybe the IP address the user is coming from, or something about the job, so that you can differentiate between those different users. So it's not a game-over-type situation, but it does make it a little bit more complex. You'd have to take one of the off-the-shelf field procedures and put a little code in there to differentiate between those users.
Alright, I know we've got a couple more questions that are not answered, but we’re well past the top of the hour, so I appreciate you hanging with me. If you do have any follow-up questions that you think of after the fact, feel free to reach out, and I'll be happy to get you aligned with one of our encryption experts. If you're interested in kicking the tires and seeing how this works in your environment, take advantage of that trial. It's a great way to just run it over some test data and see how it works, and I think you'll be pretty impressed at how seamless this whole deal is.
Thanks again for joining me today, I appreciate it, I will distribute the recording after the event, so feel free to use that.
And again, here's the contact info if you do have any additional questions. Thanks everybody, have a great remainder of your day.
See for yourself how easy IBM i database encryption can be. Request your free Powertech Encryption for IBM i trial today.