

Cryptographic functions

Comments and questions related to the Enterprise Control Language

Tue Dec 30, 2014 6:28 pm

Hello!

Are there any cryptographic hashing functions in ECL?
I know of HASH and its sibling functions, but as far as I can tell (correct me if I'm wrong) they are all non-cryptographic (including FNV).

What I'm trying to accomplish here is hashing sensitive information (some fields, not all) before sending the data over to the production Roxie cluster for web delivery of that same data. For example, NIST approves only SHA-type hashing functions.


Thanks!
Luc.
lpezet
 
Posts: 64
Joined: Wed Sep 10, 2014 3:14 am

Tue Dec 30, 2014 8:33 pm

...well MD5 is a crypto function but still...SHA is the minimum I need here :(
lpezet
 
Posts: 64
Joined: Wed Sep 10, 2014 3:14 am

Wed Dec 31, 2014 3:17 pm

Hi Luc,

As you found, there aren't any built-in cryptographic functions beyond MD5 (which is really a hash). The platform, however, does support OpenSSL and if it is enabled (which I believe is the default) then the low-level functionality does exist. I'll take a look at an implementation. You're strictly looking for symmetric encryption/decryption, right?

Cheers,

Dan
DSC
Community Advisory Board Member
 
Posts: 566
Joined: Tue Oct 18, 2011 4:45 pm

Wed Dec 31, 2014 4:02 pm

Hi Dan!

I'm actually looking for secure hash functions, and if I may be picky I'd even say SHA-256.
In the meantime, should I just use PIPE to output, say, "key,hashed key" pairs, JOIN them with my data, and keep only the hashed key from that point on?

Thanks!
Luc.
lpezet
 
Posts: 64
Joined: Wed Sep 10, 2014 3:14 am

Wed Dec 31, 2014 4:18 pm

Hi Luc,

I should read things more closely. I thought you were looking for encryption/decryption, not hashing. Hashing is considerably easier, as you don't have to deal with quickly and securely generating IVs and such.

PIPE should work, provided you know that you have a binary to call (and where it is) on each node. There is a performance hit though, as the system will do an internal fork-and-execv set of steps for each call to the binary. If you're dealing with a large number of records, you will certainly want something with a little more performance.

I'll keep looking at this, focusing on hash functions this time. An ECL code module would be a much cleaner solution, I think.

Cheers,

Dan
DSC
Community Advisory Board Member
 
Posts: 566
Joined: Tue Oct 18, 2011 4:45 pm

Wed Dec 31, 2014 4:24 pm

Sorry for the confusion.

Thanks a lot Dan!
lpezet
 
Posts: 64
Joined: Wed Sep 10, 2014 3:14 am

Wed Dec 31, 2014 5:29 pm

OK. First, here is some working code specifically for SHA-256:

Code:
DATA DoHash(STRING s) := BEGINC++
    #include <string.h>
    #include <openssl/evp.h>
    #include <openssl/crypto.h>

    #body

    // Default to an empty result so the outputs are defined if any step fails
    __lenResult = 0;
    __result = NULL;

    EVP_MD_CTX*         mdctxPtr = EVP_MD_CTX_create();
    const unsigned int  kMaxDigestSize = EVP_MAX_MD_SIZE;
    unsigned int        finalDigestSize = 0;
    unsigned char       digest[kMaxDigestSize];

    if (mdctxPtr)
    {
        // Each EVP call returns 1 on success
        if (EVP_DigestInit_ex(mdctxPtr, EVP_sha256(), NULL) == 1)
        {
            // s and lenS are the string argument and its length, as passed in by ECL
            if (EVP_DigestUpdate(mdctxPtr, s, lenS) == 1)
            {
                if (EVP_DigestFinal_ex(mdctxPtr, digest, &finalDigestSize) == 1)
                {
                    // Copy the digest into a buffer that ECL takes ownership of
                    __lenResult = finalDigestSize;
                    __result = reinterpret_cast<char*>(rtlMalloc(__lenResult));
                    memcpy(__result,digest,__lenResult);
                }
            }
        }

        EVP_MD_CTX_destroy(mdctxPtr);
    }
ENDC++;

//--------------------------------------

DoHash('this is a test');

// hex result = 2E99758548972A8E8822AD47FA1017FF72F06F3FF6A016851F45C398732BC50C
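The hex result in that comment is simply the 32 raw digest bytes written as uppercase hex pairs. As a rough illustration of that mapping only, here is a minimal standalone C++ sketch. Note that toHex is a hypothetical helper name of my own; it is not part of HPCC or OpenSSL, and the function above does not call it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an HPCC or OpenSSL function):
// renders raw bytes, such as a digest, as uppercase hex pairs.
std::string toHex(const unsigned char* bytes, std::size_t len)
{
    static const char kHexDigits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(len * 2);
    for (std::size_t i = 0; i < len; ++i)
    {
        out.push_back(kHexDigits[bytes[i] >> 4]);    // high nibble
        out.push_back(kHexDigits[bytes[i] & 0x0F]);  // low nibble
    }
    return out;
}
```

Fed the first four digest bytes from the example above (0x2E, 0x99, 0x75, 0x85), this yields "2E997585", matching the start of the hex result shown.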

There is a caveat, however. My earlier thinking regarding OpenSSL availability was possibly incorrect. Most of the 4.x and 5.x clusters I tested this against did not have OpenSSL installed. Or, if it was installed, it was in a different location and I didn't find it (in this case you will see a compile error along the lines of "openssl/evp.h: no such file"). Maybe someone else will come along and tell me I'm wrong about all this. Anyway, I did successfully test the code against an HPCC 5.x cluster (CentOS) that I built from source and that had OpenSSL installed in its normal location (headers in /usr/include/openssl/).

(Edit: What may actually be happening is that HPCC itself uses OpenSSL and is linked against the appropriate libraries, but when you write embedded C++ code you also need access to the OpenSSL header files. An RPM installation of HPCC requires the libraries but won't install those header files; you will have to install them yourself.)

If you're dealing with a large number of hashes and can ensure that OpenSSL is installed, I would highly recommend going with a function like this for performance reasons. Otherwise, you can use the PIPE command and execute an external binary to compute the hash.

Final note: It's possible to run an HPCC cluster that was not built with OpenSSL on an OS that has OpenSSL installed. In that situation, the OpenSSL libraries won't be linked into the HPCC executable and you'll have to link against those libraries yourself. The way to do that is to include this line somewhere in the top-level Thor or Roxie code:

Code:
#OPTION('linkOptions','-lssl -lcrypto');

Hope this helps.

Dan
DSC
Community Advisory Board Member
 
Posts: 566
Joined: Tue Oct 18, 2011 4:45 pm

