10 Ways To Destroy A SQL Database

A database is the core asset of most online or internet-based companies. Everyone looks at how to improve and secure their databases to protect or grow the business. While everyone is searching for remedies or enhancement pills, some companies (especially the small to mid-sized ones) make simple mistakes that might just destroy their business. Rather than looking at how to protect a database, this article looks at ways to destroy it instead! (By mistake, of course.)

Don't Monitor Error Log

The error log is the first line of defense any database has. It may record the first occurrence of a problem, or warnings that your database is heading for trouble. These troubles can be easily avoided or easily missed, depending on what you do. Be my guest: ignoring the error log will definitely help destroy your database.

Many company databases are designed for availability, so there will usually be a primary database and one or more slave databases. These databases keep error logs too. If you would like your slave databases to drift out of sync with the primary, be sure to ignore those logs and give it some time. Depending on the size of your company, the data lost through broken synchronization might cost you dearly when some hero DBA shuts down the primary without first issuing "stop slave" on the slave, or when some errant SQL comes down the line. This approach may take some time to destroy your database, but it's worth considering.

Don't Fine Tune Queries

You have a big server, lots of memory and fast disks, so there is nothing to worry about. Keep up this attitude and you are on your way to success (destroying it!). Developers writing bad code that causes full table scans, doing their best to trash your query cache and overload your InnoDB buffer pool with useless blocks? Hitting disk instead of main memory as often as possible? Well, there won't be ANY problem, since every piece of hardware is the latest, fastest and most powerful! Let's just wait and see until your database falls to its knees, especially as the data gets larger. That's the best time to watch it happen!

Don't Document Procedures and Configurations

Ah! No documentation won't cause a single problem! No problem at all, my friend! Why enforce such a tedious job when 'we' can maintain everything ourselves? Just you wait until the 'we' becomes 'I' and the 'I' becomes 'who?'. Employees come and go nowadays. People always look for a better future, and whatever happens to you afterwards is really none of their concern. Man! What's so difficult? Let's just hire an expert to take over the job. Oh boy! That's a great solution! Let's see what happens when he presses the wrong button 🙂

Don't Backup

Another great way to destroy your database is to avoid making backups! Hardware failure is common in a data center. Hard disks fail, power supplies go down, plugs get pulled, basically anything you can imagine. Not backing up regularly can do the job all by itself!

It's not only hardware that can assist you in your task. Developers or DBAs who accidentally delete data can also help. Deleting columns, tables, rows or even the whole database: these things do happen, and they happen quite frequently. Besides deleted data, a bug in a program might just write the wrong data to the wrong place. All of this can happen to any IT firm. Well, for some of you there is always the "it has never happened before" situation. Just follow your instinct (backups suck!) on this and you will do just fine.

Don't Use Memory Wisely

Servers nowadays come with huge amounts of memory installed. Technology has made everything more powerful than before, and it is so affordable that many companies can buy big, fast memory! With such powerful memory backing us up, we can be sure that whatever the developers throw in, the server and database will be able to take it! We can safely assume MySQL knows our database's memory requirements! We just have to run the installation wizard and voilà! Everything is automatic nowadays! There is no such thing as misallocation of memory. The system is perfect.

Good to know. You have everything required (even the attitude) to blow up your database.

Don't Worry About Indexes

Indexes are the most effective way to destroy your database. However, you must know the trick. There are two ways to succeed. The first requires you to do absolutely nothing: no indexes at all, purely full table scans. This needs certain conditions to be met (enough data, for a start), but it should do the trick. If that doesn't suit your taste, you can try a faster way: create useless or unwanted indexes and make sure your tables have tons of records, so that every write has to maintain all of them. This will surely speed up the process of destroying your database.

Don't Normalize Your Database Design

If you are just starting to build a system, consider skipping normalization in your database design. Skipping normalization contributes to bad database design, which is part of the plan to destroy a database. Furthermore, without normalization there is a good chance your system will be inaccurate, slow and inefficient, and it might not even produce the data you expect.

Don't Make Policies For Database Patch

It's true that the majority of companies don't update their databases immediately after a new security patch is released. It would be irresponsible for a company to deploy a patch to production without first running it through quality assurance. However, some companies don't even bother to have a policy for updating their databases. If you think that databases are a little more isolated than desktops, that there is less of a security concern, and that your databases are more secure because they sit behind firewalls and good perimeter security, you are on the right track to destroying a database.

Don't Bother Caching

My database can take any crap thrown at it. It's the fastest computer on the planet (why bother with technological advancement when we already have the fastest?). Cache or no cache won't destroy my database. It's the fastest (ya ya, I get it). The dramatic performance gain from using a cache table might not interest you. Scalability, flexibility, availability and performance are just some of the benefits caching can give. Multitier architecture, what bullshit. Your server will NEVER go down and you will NEVER need a cache table to be available. Millions of hits on the database server might just help you achieve what this article is about. It works even better when your table has a few million records and a few lousy queries. (It might not even take millions of hits to kill it.)

Don't Use Fast or Reliable Disk

Using something like a single disk or a simple mirror will definitely make I/O the main bottleneck of your system. With a single disk, you can expect the OS and your database to fight for resources, serving one user at a time while the others wait their turn. To make things worse, you can use RAID-5 instead of RAID-10. Maybe you already are! To compare the two: RAID-5 only performs reasonably well on reads, while RAID-10 is almost twice as fast as RAID-5 on writes. RAID-5 can only survive one failed drive, and any drive dying causes roughly a 64% degradation in read performance until the faulty drive is discovered. Furthermore, during recovery, read performance for a RAID-5 array is degraded by as much as 80%, compared to RAID-10, which only degrades performance on the faulty disk itself. There are more 'advantages' to RAID-5 but I will stop here. RAID-5 seems better at destroying than RAID-10, don't you think?

Conclusion

The points discussed here mostly bite sites with large data or heavy traffic (well, your site will eventually grow to have big data, hopefully). However, the conclusion behind all these jokes is more valuable: learn to save your ass. No one else will.

10 PHP Micro Optimization Tips

There are many ways to improve the way you write PHP code, and we can easily increase the efficiency of our code just by putting in some effort during development. However, there may be things about PHP you are not aware of that can help improve your code. In this article, I will try to provide some tips that can serve as micro optimizations and add to what you already know about PHP. We will also look at benchmarks for these tips where possible!

Loop

Every program needs a certain amount of looping. Loops are considered efficiency killers when they are nested (a loop inside a loop): one loop runs n times, and with one nested loop the inner body runs n² times. Well, I think you can do the math. But that is not what we are talking about here. There is something more interesting about loops in PHP. There are many ways to write a loop in PHP, so how do you know which one is the best way to loop over your data? Apparently, a for loop is faster than foreach and while loops if the loop limit is pre-calculated outside the for loop! What do I mean? Basically this.

# Worse than foreach and while loops
for ($i = 0; $i < count($array); $i++) {
    echo 'This is bad, my friend';
}

# Better than foreach and while loops
$total = count($array);
for ($i = 0; $i < $total; $i++) {
    echo 'This is great, my friend';
}

The above shows two different ways of writing a for loop. The first calls count() inside the loop condition, while the second pre-calculates the number of iterations outside it. The difference is that the second doesn't run the count operation n times while the first one does. There is a VERY interesting benchmark on loops in PHP that shows this.
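If you would rather measure this on your own machine than trust someone else's numbers, a rough sketch like the one below can help. The helper name time_it and the array size are my own choices for illustration, and the timings will vary between PHP versions and hardware, so treat the output as an indication only.

```php
<?php
// Hypothetical helper: returns how many seconds a callable takes to run.
function time_it(callable $fn): float
{
    $start = microtime(true);
    $fn();
    return microtime(true) - $start;
}

$array = range(1, 100000);

// count() is evaluated again on every iteration.
$slow = time_it(function () use ($array) {
    $sum = 0;
    for ($i = 0; $i < count($array); $i++) {
        $sum += $array[$i];
    }
});

// count() is evaluated once, before the loop starts.
$fast = time_it(function () use ($array) {
    $sum = 0;
    $total = count($array);
    for ($i = 0; $i < $total; $i++) {
        $sum += $array[$i];
    }
});

printf("count() in condition: %.6fs\n", $slow);
printf("count() hoisted:      %.6fs\n", $fast);
```

Run it a few times before drawing conclusions, since a single run of a micro benchmark is noisy.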

Single Vs Double Quotes

Since I mentioned that benchmarking page on loops, it also includes a benchmark for single (') and double (") quotes. Between these two, which is the best one to use? It really doesn't make much difference. But I prefer single quotes because I don't have to press shift. Just kidding (not). That's one of the reasons I use single quotes over double quotes. The other reason is that PHP scans double-quoted strings for variables (an additional operation), and I usually don't mix variables into my strings, so I use single quotes instead. However, you might also be aware that declaring an empty string with single quotes seems to carry a small performance pitfall. You or I might want to take note of that. Basically, if there is a dollar ($) symbol in your string, avoid double quotes unless it really is a variable.
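To make the difference concrete, here is a minimal sketch of how the two quote styles treat a variable. The variable names are mine, chosen only for this example.

```php
<?php
$name = 'PHP';

// Single quotes: the string is taken literally, so $name is not expanded.
$single = 'Hello $name';

// Double quotes: PHP parses the string and substitutes the value of $name.
$double = "Hello $name";

echo $single, "\n"; // Hello $name
echo $double, "\n"; // Hello PHP
```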

Pre increment vs Post increment

Well, even incrementing a value leaves some room for improvement. We all know there are many ways to increment an integer, such as

$i++;
++$i;
$i += 1;
$i = $i + 1;

Out of all these, which is the most efficient? In PHP, it seems that pre-increment is better than the other ways of performing an increment, reportedly around 10% better than post-increment. The reason? Some say post-increment has to make a copy of the old value while pre-increment does not. I couldn't find a benchmark for PHP, but I found one for C++, which should behave similarly. Without a proper benchmark, I can't really confirm this. Furthermore, it doesn't make a big difference for everyday programming and only matters to those chasing micro optimizations. Nonetheless, many people do suggest pre-increment over post-increment for optimization.
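Speed aside, the two forms differ in the value the expression itself produces, which is where the "copy" comes from. A small sketch (variable names are mine):

```php
<?php
$i = 5;
$a = $i++; // post-increment: $a receives the old value 5, then $i becomes 6

$j = 5;
$b = ++$j; // pre-increment: $j becomes 6 first, then $b receives 6

echo "$a $i $b $j\n"; // 5 6 6 6
```

When the result of the expression is not used, as in the third clause of a for loop, both forms leave the variable in the same state.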

Absolute Path VS Relative Path

An absolute path (also known as a full path) versus a relative path: which is better for PHP? Surprisingly, it seems the absolute path is better. A relative path might just screw up your include and require operations in PHP; an absolute path doesn't. Well, that's the reason I use absolute paths. But the real reason is that an absolute path eliminates the need for the server to resolve the path for you. Simply put, would you know where a file is located just by looking at a relative path, or is it faster if I hand you the full path?
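One common way to get an absolute path without hard-coding it is to build it from __DIR__. The sketch below assumes a hypothetical file lib/config.php next to the current script; the file name is only for illustration, and the snippet merely prints the path if the file isn't there.

```php
<?php
// 'lib/config.php' is a hypothetical file name used only for illustration.
$relative = 'lib/config.php';

// __DIR__ is the directory of the current script, so this path does not
// depend on include_path or on the current working directory.
$absolute = __DIR__ . '/' . $relative;

if (is_file($absolute)) {
    require_once $absolute;
} else {
    echo "Would include: $absolute\n";
}
```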

Echo Vs Print

Yes! I know, echo is better. But how much better? Interested to know? I am. So I dug around on the internet a bit and found some useful benchmark figures for these two! echo is around 12%-20% faster than print when there is no $ symbol in the printed string, and around 40%-80% faster when there is a $ symbol in it! This really demonstrates how much difference the $ symbol makes in PHP.

Dot Vs Commas Concatenation

Between dots and commas, which do you use to concatenate two strings or variables? I personally use the dot to concatenate my stuff, as shown below

$a = '10 PHP programming ';
$b = 'Improvement Tips';
# 10 PHP programming Improvement Tips
echo $a . $b;

I usually do the above. Instead of this,

$a = '10 PHP programming ';
$b = 'Improvement Tips';
# 10 PHP programming Improvement Tips
echo $a, $b;

Well, between these two, which is more efficient? If you visited the benchmark for echo and print, you might have noticed that the same test also covers dots and commas. The result shows that the dot is preferable when no variables or $ symbols are involved, where it is around 200% faster. On the other hand, commas are around 20%-35% more efficient when dealing with $ symbols.

str_replace vs preg_replace vs ereg_replace

Ok! We have 3 string replace functions in PHP. Out of these three, which do you think runs the fastest? Some of you might already know: str_replace. The reason? str_replace doesn't evaluate any complex expression, unlike preg_replace and ereg_replace. Well, many of you might know that, but it is not necessarily always str_replace that runs fastest. If you have to call str_replace 5 times compared to a single preg_replace, which will run faster? (str_replace, of course.) Wrong! preg_replace runs 86.99% faster than 5 str_replace function calls! I had the same doubt and went searching for such a benchmark. The benchmark really clears up some of the doubts we have about these functions.
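To show what is being compared, here is a rough sketch of the two approaches producing the same result. The sample sentence and word list are mine. The last variant is a side note: str_replace also accepts an array of search strings, so five separate calls are not the only non-regex option.

```php
<?php
$text = 'The quick brown fox jumps over the lazy dog';
$words = ['quick', 'brown', 'fox', 'lazy', 'dog'];

// Five separate str_replace() calls, one per word.
$a = $text;
foreach ($words as $word) {
    $a = str_replace($word, '*', $a);
}

// One preg_replace() call with an alternation pattern.
$b = preg_replace('/quick|brown|fox|lazy|dog/', '*', $text);

// One str_replace() call with an array of search strings.
$c = str_replace($words, '*', $text);

echo $b, "\n"; // The * * * jumps over the * *
```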

Find Timestamp

Say you want the time at which your script started running so you can store that timestamp in your database. The first thing you do is fire up Google and search for some PHP function. Well, since PHP 5.1 you do not have to do that. You can easily retrieve the timestamp of the start of the request by using

$_SERVER['REQUEST_TIME']

This could really save some time digging for something that is already within your reach.
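A minimal sketch of using it: the fallback to time() and the date format are my own assumptions, chosen because a DATETIME-style column usually expects that layout.

```php
<?php
// Unix timestamp of the start of the request.
// The ?? time() fallback is a defensive assumption for unusual environments.
$startedAt = (int) ($_SERVER['REQUEST_TIME'] ?? time());

// Format it the way a DATETIME column typically expects.
$forDatabase = date('Y-m-d H:i:s', $startedAt);

echo $forDatabase, "\n";
```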

explode Vs preg_split

Well, when you want to split a string, what do you use in PHP? I usually use explode because it is supported even in PHP 4.0, and that's also what my ex-colleagues taught me. In terms of efficiency, the answer is explode. preg_split supports regular expressions, which makes this much the same comparison as str_replace versus preg_replace: anything with regular expression support will usually be a bit slower than something without it. In the benchmark, explode was around 20.563% faster.
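As a quick illustration, both functions give the same result for a fixed delimiter, and the regular expression only earns its cost when the delimiter varies. The sample strings are mine.

```php
<?php
$csv = 'apple,banana,cherry';

// Fixed delimiter: explode() is enough.
$viaExplode = explode(',', $csv);
$viaPregSplit = preg_split('/,/', $csv);

// Varying delimiter (comma or semicolon, optional spaces): needs a pattern.
$messy = 'apple, banana ;cherry';
$cleaned = preg_split('/\s*[,;]\s*/', $messy);

print_r($cleaned);
```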

Other Benchmarks

I believe this list could go on forever with the great benchmarking site I found. It covers most of the comparisons between PHP functions, including the ones from articles that claim one thing is better than another in PHP without providing any real evidence. Hopefully this article has given you the information you need to perform some micro optimizations, and the benchmarking site will point you to more of them.

Summary

I believe many of you have seen all this information floating around the internet. The only difference I see is that those sources don't give you the really interesting part, which is the benchmark behind each tip. The figures and tests on these great benchmarking sites really helped me a lot. Hopefully they do the same for you.

P.S.: Most of these benchmark sites run their tests in real time.

Integrate Paypal Express Checkout Solution

As a web developer, there will surely come a day when you wish to integrate Paypal into one of your products or services. The most appropriate way is to read the documentation provided by Paypal. But reading it doesn't mean you will understand it in one shot, and it takes a lot of research and digging before your Paypal integration works. I went through this process over the past few days, which is why not many articles were written in the meantime. It wasn't really difficult, but reading everything first before looking at their sample code wasn't the right way to approach it. Looking at the sample code first will definitely shed light on integrating Paypal Express Checkout (well, you still have to read a bit). In this article, I will try to demonstrate Paypal Express Checkout as simply as possible so that you can DIY.

Paypal Express Checkout

What is Paypal Express Checkout? Paypal Express Checkout makes it easier for your customers to pay and allows you to accept PayPal while retaining control of the buyer and the overall checkout flow. This means you can integrate a payment solution with Paypal that keeps most of the interaction on your website, apart from the user logging in and confirming the product they are purchasing. Paypal Express Checkout also lets you create recurring payments, which removes the need to repurchase the same service or product every single time. However, Paypal Express Checkout does not let your users pay by credit card. Your customers must have a Paypal account in order to purchase with this solution. Credit card payment is only available alongside Paypal in the Website Payments Pro solution. Hopefully this clears up some doubts and helps you select the solution you really need.

Integrate Paypal Express Checkout Solution - Step 1

Firstly, you might wonder where exactly the correct documentation is among all the places on the Paypal site. You can get the documentation and the sample from their respective links. The sample is in the section PayPal API: Name-Value Pair Interface, which I believe gives a better understanding of the flow of Paypal Express Checkout. You will need to put the sample files on your server and run them (open the URL of the folder in your browser), as they simulate some of the payment flows you might want. Then you can look into the code and see how they are achieved. Note that it might not work on localhost, since the sample requires curl to be installed.

Integrate Paypal Express Checkout Solution - Step 2

Once the sample is on your server and you have played around with it, the next thing you might wonder is exactly which files you need to run your own Paypal Express Checkout. Here are the only files you will need.

  • APIError - displays errors
  • CallerService - main player that initiates the conversation with Paypal
  • constants - all the required variables
  • SetExpressCheckout - display for step 1 of the process
  • GetExpressCheckoutDetails - display for step 2 of the process
  • DoExpressCheckoutPayment - display for step 3 of the process + sends the final request to Paypal
  • ReviewOrder - request handler for step 1, responsible for redirecting to step 2

The files I am looking at are all PHP files. The function of each file above should be self-explanatory. The first 3 files (APIError, CallerService and constants) are imported into the ReviewOrder and DoExpressCheckoutPayment files, as they are required to talk to Paypal. Once we understand this, it is time to move on to more complicated stuff.

Integrate Paypal Express Checkout Solution - Step 3

To illustrate what is going on in the sample file, we will look at the following diagram provided by Paypal.

From left to right, there are 5 interfaces the user will see. Two of them, colored in blue (the 2nd and 3rd), are displayed by Paypal. That leaves 3 interfaces, SetExpressCheckout, GetExpressCheckoutDetails and DoExpressCheckoutPayment, which are the 1st, 4th and 5th respectively. So we are clear on the display files now. Next we need to know where ReviewOrder comes in. There are altogether 4 callouts marked on the diagram. ReviewOrder is triggered at the 2nd and 3rd callouts, where the SetExpressCheckout API and the GetExpressCheckoutDetails API are fired. Don't worry about what these APIs mean at the moment. Just treat them as methods that tell Paypal what you want to do.

Integrate Paypal Express Checkout Solution - Step 4

I guess everyone should understand how Paypal works by now from the sample files and the explanation above. Next I will explain some of the important things you need to know, since reproducing all the code here is pointless: it is the same in every sample file and would only make this more confusing to read. Firstly, for each request made to Paypal, you will always see a line like the following in the sample files.

$resArray=hash_call("SetExpressCheckout",$nvpstr);

where $nvpstr is the name-value pair string passed into the function hash_call. What hash_call does is send the request to Paypal to tell them which action you are performing. In this case, the SetExpressCheckout API is being called. There are also the other APIs mentioned previously, the GetExpressCheckoutDetails API and the DoExpressCheckoutPayment API. These are the three APIs you need to talk to Paypal at each stage shown in the previous diagram. So we should all be clear now about what the "API" written all over the Paypal documentation means. The next important step is to know which name-value pairs each API requires you to send in order for Paypal to understand you.
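If the name-value pair format is new to you, the sketch below shows how such a string can be turned into an array like $resArray. This is not the sample's own code: the response string is made up for illustration, and the ACK and TOKEN field names are my assumption of what a reply looks like.

```php
<?php
// Made-up response body in name-value pair form, for illustration only.
$response = 'TOKEN=EC%2dEXAMPLETOKEN&ACK=Success';

// parse_str() URL-decodes each value and fills $resArray by name.
parse_str($response, $resArray);

if (strtoupper($resArray['ACK'] ?? '') === 'SUCCESS') {
    echo 'Token: ', $resArray['TOKEN'], "\n"; // Token: EC-EXAMPLETOKEN
}
```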

Integrate Paypal Express Checkout Solution - Step 5

Here we will see what each API in the Express Checkout process requires. For SetExpressCheckout, you are required to have the following name-value pairs in your string.

  • AMT
  • CURRENCYCODE
  • RETURNURL
  • CANCELURL
  • PAYMENTACTION

That is all! The sample gives you more than just the above, which is good for understanding what else can be put into the nvp string to control what is displayed on the Paypal website your user gets redirected to.

The GetExpressCheckoutDetails API is pretty simple. It only requires a token in the nvp string, and this token can be retrieved via $_GET, because Paypal sends it back that way.
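As a rough sketch of putting those five pairs together, you could build the string with http_build_query instead of concatenating by hand. Every value below is a placeholder of mine, and how the resulting string is handed to hash_call should follow the sample you downloaded.

```php
<?php
// All values below are placeholders, not taken from the Paypal sample.
$params = [
    'AMT' => '19.99',
    'CURRENCYCODE' => 'USD',
    'RETURNURL' => 'https://example.com/return.php',
    'CANCELURL' => 'https://example.com/cancel.php',
    'PAYMENTACTION' => 'Sale',
];

// http_build_query() URL-encodes each value and joins the pairs with '&'.
$nvpstr = http_build_query($params, '', '&');

echo $nvpstr, "\n";
```

Letting the function do the URL encoding avoids broken requests when a value contains characters such as ':' or '/'.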

Lastly, for DoExpressCheckoutPayment API, you will need to provide the following nvp for it to work.

  • TOKEN
  • PAYERID
  • AMT
  • CURRENCYCODE
  • PAYMENTACTION

And that's it! The values for the GetExpressCheckoutDetails and DoExpressCheckoutPayment APIs are provided by Paypal during the process, while the SetExpressCheckout data is supplied by you.
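The final step could be sketched like this. The query parameter names token and PayerID, and all the values, are assumptions of mine for illustration; check what your own return URL actually receives before relying on them.

```php
<?php
// Hypothetical values standing in for what Paypal appends to RETURNURL.
// In a real web request this would be: $query = $_GET;
$query = ['token' => 'EC-EXAMPLETOKEN', 'PayerID' => 'EXAMPLEPAYER'];

$params = [
    'TOKEN' => $query['token'] ?? '',
    'PAYERID' => $query['PayerID'] ?? '',
    'AMT' => '19.99',          // placeholder, must match your order
    'CURRENCYCODE' => 'USD',   // placeholder
    'PAYMENTACTION' => 'Sale', // placeholder
];

$nvpstr = http_build_query($params, '', '&');

echo $nvpstr, "\n";
```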

Summary

I believe the above explanations were pretty clear. But I still spent a hell of a lot of time working on it *SLAP MYSELF*! This article is intended to help any newbie get the hang of integrating Paypal without spending time reading and learning everything about Paypal integration. However, the sample provided by Paypal is not secure and only serves as a demonstration of 'how integration can be made easy'. I believe this article will be pretty useful for understanding how Paypal works, rather than reading a few thousand words from Paypal that never direct you to the right sources or code (other than more documentation). Guess what? I found the Paypal Integration Wizard, a wizard that creates all the above code for you! :[

Enhance Security Hash Function For Web Development

Previously I wrote the articles Better Hashing Password in PHP and PHP Secure Login Tips And Tricks. Both are closely linked to hash functions, and I find that I didn't explain hash functions clearly enough in either of them. This is especially true of Better Hashing Password in PHP, where my readers got confused by some of the security terms used, which caused some misunderstanding. Hence, I decided to write an article covering some questions about hash functions and the improvements any developer can make to further secure their website. By the end you should have a better understanding of secure hash functions and how to apply them in your web development.

Stop Using MD5 or SHA-1 in the future

As mentioned in Better Hashing Password in PHP, in 2008 US-CERT of the U.S. Department of Homeland Security said that MD5 should be considered cryptographically broken and unsuitable for further use. Wikipedia explains the MD5 vulnerabilities in good detail, which will surely make you switch from md5 to a better hash function. SHA-1, on the other hand, is more secure than MD5. However, collision attacks have also been found against SHA-1. This is no surprise, as both SHA-1 and MD5 are descended from MD4. SHA-1 is still strong enough for now, but not for the future, looking at how computing power has been advancing. Furthermore, a faster way of creating a collision has been reported that requires only 2^52 attempts, a significant reduction from 2^63. What does this mean? It means that the SHA-1 attacks affect collision resistance, not pre-image resistance. In short, after 2^52 operations the researchers are able to generate two different messages that hash to the same digest value, which would take approximately a few months on dedicated hardware. Obtaining a SHA-1 collision via brute force would still require 2^80 operations. By the way, for people who have no clue what SHA means, it stands for Secure Hash Algorithm, which is required by law for use in certain U.S. Government applications, including use within other cryptographic algorithms and protocols, for the protection of sensitive unclassified information. Thus, the SHA family of algorithms should be pretty solid for anyone who wishes to secure their web system, unless a weakness has been found, as with SHA-1.

Why Collision Is Bad

Any hash function will face collisions eventually due to the pigeonhole principle, because members of a very large set are mapped to a relatively short bit string. Collisions of any kind are undesirable. A collision means that two different messages generate the same hash value, so one hash value can be produced by more than one message. Simply put, the hash function does not give a one-to-one mapping from message to hash value. What this means for security is that there exists a message n, different from m, that can be used to access the same system. Therefore, the more collision resistant a hash function is, the better it is for security. Let me give you an example of how bad a collision can be for a system. Assume a system has had its hashed passwords compromised. Instead of the password 'hello', suppose another value 'bye' generated the same hash (which is not actually true), so the hacker could use it to access the system. Now, instead of a 5-letter word, a matching hash value can be found with a 3-letter word. This means a shorter time to crack your hashed value and an easier job for our hacker, although it won't always work out this way. Obviously this is just an example, and security-conscious people will do much more than this to prevent birthday or brute force attacks.
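The pigeonhole principle is easy to see with a deliberately tiny hash. The sketch below is a toy of my own, not a real attack: it keeps only the first 2 hex characters (8 bits) of SHA-1, so there are just 256 possible outputs and any 257 distinct inputs must contain a collision.

```php
<?php
// Toy hash: keep only the first 2 hex characters (8 bits) of SHA-1.
function tiny_hash(string $message): string
{
    return substr(sha1($message), 0, 2);
}

$seen = [];        // digest => first message that produced it
$collision = null; // [first message, second message, shared digest]

for ($i = 0; $i < 257 && $collision === null; $i++) {
    $message = "message-$i";
    $digest = tiny_hash($message);
    if (isset($seen[$digest])) {
        $collision = [$seen[$digest], $message, $digest];
    }
    $seen[$digest] = $message;
}

printf("'%s' and '%s' both hash to %s\n", ...$collision);
```

A full-length hash works the same way in principle; the output space is just so large that stumbling on a collision takes an impractical amount of work.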

Moving To SHA-2 Hash Function

As I mentioned in Better Hashing Password in PHP, it is time to start moving towards a more solid hash function in which no weakness has been found yet. Most U.S. government applications will be required to move to the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384 and SHA-512) by 2010, as stated by NIST (National Institute of Standards and Technology) in their hash function policy. Hence, as web security conscious people, we should start moving towards SHA-2 too. PHP 5.1.2 and later offer this through the hash function:

$phrase = 'This is my password';
$sha1a = base64_encode(sha1($phrase));
$sha1b = hash('sha1', $phrase);
$sha256 = hash('sha256', $phrase);
$sha384 = hash('sha384', $phrase);
$sha512 = hash('sha512', $phrase);

If you are on a version lower than PHP 5.1.2, you can try mhash, an open source library that is available as a PHP extension.

$phrase = 'This is my password';
$sha1a =  base64_encode(sha1($phrase));
$sha1b =  base64_encode(bin2hex(mhash(MHASH_SHA1,$phrase)));
$sha256= base64_encode(bin2hex(mhash(MHASH_SHA256,$phrase)));
$sha384= base64_encode(bin2hex(mhash(MHASH_SHA384,$phrase)));
$sha512= base64_encode(bin2hex(mhash(MHASH_SHA512,$phrase)));

SHA-2 should be used if you are a web security minded person. SHA-1 can definitely still be used, but additional measures would have to be in place to strengthen it against the various attacks on its weaknesses. On the other hand, you may want to avoid MD5 for very security-sensitive data.
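One practical consequence of switching is that the digests get longer, so the database column holding them has to grow too. A small sketch to check the lengths on your own installation:

```php
<?php
$phrase = 'This is my password';

// hash() returns lowercase hex by default, so each byte becomes 2 characters.
foreach (['sha1', 'sha256', 'sha384', 'sha512'] as $algo) {
    $digest = hash($algo, $phrase);
    printf("%-6s %3d hex chars (%d bits)\n", $algo, strlen($digest), strlen($digest) * 4);
}
```

If an algorithm is missing on your build, hash_algos() lists the ones that are available.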

Is SHA-2 Totally Secure?

As mentioned previously, no hash function can avoid collisions entirely. It is the same as saying that any system can be compromised given enough time and resources. The objective of a hash function is not to provide a totally uncrackable solution but to delay the process of breaking it, so that it might take a few hundred years to achieve.

Currently, the best public attacks on SHA-2 break 24 of the 64 or 80 rounds, where 64 and 80 are the number of rounds in SHA-256 and SHA-512 respectively. Hence, it is safe to say that no collision has been found in the SHA-2 family yet, and the SHA-2 family can be said to have high collision resistance at the moment.

Attacks On Hash Function

An attack can be sped up given enough resources. One way is to divide the cracking process across different computers, or to exploit the CPU architecture and take advantage of multiple cores on one processor or multiple processors in a single machine. This makes cracking a password much easier. The attacks below might just be among the ones used in such a cracking process.

Birthday Attack

Most attacks on hash functions target collision resistance, as a collision attack is easier to launch. At the same time, practical preimage attacks have yet to be found on established hash functions. The typical attack against a hash function is therefore a collision attack, which is also known as a birthday attack. We mentioned earlier that every hash function faces the collision problem, so theoretically it is always possible to find two messages that produce the same hash value.

The birthday attack is a statistical probability problem. The method used to find a collision is simply to evaluate the function ƒ for different input values, chosen randomly or pseudorandomly, until the same result is found more than once. Because of the birthday problem, this method can be rather efficient.

Let's see how efficient it is. Given n inputs and k possible outputs, there are n(n-1)/2 pairs of inputs. For each pair, there is a probability of 1/k that both inputs produce the same output. So, as a rough estimate, once you have about k/2 pairs the chance of finding a matching pair is in the region of 50%. In other words, once n is greater than sqrt(k), there is a good chance of finding a collision.
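The rough estimate can be checked numerically. The sketch below computes the exact probability of at least one collision, assuming all k outputs are equally likely; the function name and the toy 32-bit output size are my own choices for illustration.

```php
<?php
// Exact probability that n random inputs into k equally likely outputs
// contain at least one collision: 1 - prod_{i=0}^{n-1} (1 - i/k).
function collision_probability(int $n, float $k): float
{
    $noCollision = 1.0;
    for ($i = 0; $i < $n; $i++) {
        $noCollision *= 1.0 - $i / $k;
    }
    return 1.0 - $noCollision;
}

// Classic birthday case: k = 365 days, n = 23 people.
printf("n=23, k=365: %.4f\n", collision_probability(23, 365));

// Toy 32-bit hash: with n = sqrt(k) inputs the chance is already substantial.
$k = 2 ** 32;
$n = (int) sqrt($k);
printf("n=%d, k=2^32: %.4f\n", $n, collision_probability($n, $k));
```

The takeaway is the scale: the work needed grows with the square root of the output space, not with the output space itself.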

Brute Force Attack

A brute force attack is the most fundamental attack, yet it is often used. Brute force is often used against hash values because a hash function generates a one-way value. The hash value is irreversible, which is what makes it such a secure function. The only option is to try each and every word until one matches. However, this method is very time consuming, as the search space may be huge.
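To get a feel for how huge the search space gets, here is a small sketch that counts the candidates for a given alphabet and length. The function name is mine, and the lengths are kept small so the result fits in a 64-bit integer.

```php
<?php
// Number of candidate passwords of exactly $length characters
// drawn from an alphabet of $alphabetSize symbols.
function search_space(int $alphabetSize, int $length): int
{
    return $alphabetSize ** $length;
}

echo search_space(26, 8), "\n"; // lowercase letters only: 208827064576
echo search_space(62, 8), "\n"; // letters and digits:     218340105584896
```

Every extra character multiplies the work by the alphabet size, which is why longer passwords with a wider character set hold up better against brute force.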

Rainbow Table Attack

Another common attack is to use a rainbow table to figure out the original message from a hash value. Unlike brute force, which attempts every single combination of characters, a rainbow table attack uses precomputed tables of chains that offer a time-memory trade-off when looking up a particular message. The table content does not depend on the hash value to be inverted. It is created once and then repeatedly used for lookups, unmodified. Increasing the length of the chains decreases the size of the table, but it also increases the time required to perform lookups; this is the time-memory trade-off of the rainbow table. In the simple case of one-item chains, the lookup is very fast, but the table is very big. Once chains get longer, the lookup slows down, but the table size goes down.

The table here refers to the rainbow table in which all kinds of passwords have been hashed and stored. Building it usually takes a while, as the table has to be extremely large to have a chance of containing the hash you are looking for, but as stated above it is built once and reused for every lookup. Once you understand this, you can visit kestas for an explanation of how a rainbow table attack works.
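The chain idea is easier to see in code. Below is a toy Python sketch, assuming MD5 as the hash, 3-letter lowercase passwords, and a made-up reduction function; all names (hash_password, reduce_hash, build_table, lookup) are hypothetical and only for illustration.

```python
import hashlib

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
PW_LEN = 3  # toy password space: 3 lowercase letters


def hash_password(password):
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def reduce_hash(hex_digest, column):
    """Map a hash back into the password space. The column number is
    mixed in so that every chain position uses a different reduction."""
    n = int(hex_digest, 16) + column
    chars = []
    for _ in range(PW_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def build_table(start_passwords, chain_len):
    """Store only the endpoint -> start pair of every chain."""
    table = {}
    for start in start_passwords:
        p = start
        for column in range(chain_len):
            p = reduce_hash(hash_password(p), column)
        table[p] = start
    return table


def lookup(table, target_hash, chain_len):
    """Guess the column the target hash sits in, walk to the chain end,
    and if the endpoint is known, replay that chain from its start."""
    for start_col in range(chain_len - 1, -1, -1):
        p = reduce_hash(target_hash, start_col)
        for column in range(start_col + 1, chain_len):
            p = reduce_hash(hash_password(p), column)
        if p in table:
            q = table[p]
            for column in range(chain_len):
                if hash_password(q) == target_hash:
                    return q
                q = reduce_hash(hash_password(q), column)
    return None  # hash is not covered by the table
```

Each chain covers chain_len passwords but costs only one table entry, so longer chains shrink the table while making every lookup do more hashing.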

Defensive Strategies

There are known ways to attack hash functions. Likewise, there are defensive strategies that web developers can apply to better protect their systems.

Using SALT

A common way of enhancing a hash function is to use a SALT. A salt strengthens user passwords in case the system table is ever compromised. A SALT is just an additional string appended to the password before it is hashed and stored in the database table. The exact same SALT is required during verification, when the password entered by the user is compared against the stored one. However, some of us are confused about how much a SALT can really help in a web system.
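As a minimal sketch of hash(password + salt), assuming SHA-256 and a random 16-byte salt per user (both are choices made for this example, and the function names are hypothetical):

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """hash(password + salt), as described above."""
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()


def create_record(password):
    """Return the (salt, hash) pair that would be stored in the table."""
    salt = os.urandom(16)  # 128 random bits, generated per user
    return salt, hash_password(password, salt)


def verify(password, salt, stored_hash):
    """Re-hash the entered password with the stored salt and compare."""
    return hmac.compare_digest(hash_password(password, salt), stored_hash)


salt, stored = create_record("secret")
print(verify("secret", salt, stored))  # True
print(verify("guess", salt, stored))   # False
```

Because each user gets a different salt, two users with the same password end up with different hashes in the table.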

Before I explain the SALT criteria, let's see what a SALT can do for us. Suppose your system database was compromised and no SALT was used for your password hashing. What would you do? Basically, you would send an email to everyone and ask them to change their password, like Reddit did (well, the only difference is that they did not even hash their passwords). On the other hand, if a SALT is being used, you will just have to modify the SALT algorithm without causing so much chaos. Furthermore, the compromised passwords will be much harder to crack.

Let's look at a brute force attack. Assume the hacker knows the length of your SALT (since the table is compromised) and you did not change that length (say it is 1000 characters). The hacker begins to attack your system using the password hashes they obtained. Assume there are around 94 characters you can use in your SALT or password (roughly the printable ASCII characters). Hence, a brute force run needs up to 94 tries to get a single-character password with no salt. Now, we have 1000 characters in our SALT. This means that our hacker will have to try up to 94^1000 combinations, which is roughly 10^1973, to crack just one password on the list of stolen hashes! I don't think one year is enough for them to crack one password using a single computer, but correct me if I'm wrong. A table published by Lockdown.co.uk - The Home Computer Security Centre (written on Friday 10th July 2009 04:01) shows that the longer a password is and the more kinds of characters it uses, the longer a simple brute force attack needs to crack it. Try to imagine how long 1000 characters of mixed combinations would take after looking at that table.
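The size of that search space is easy to check with Python's arbitrary-precision integers. This is just the arithmetic, assuming an alphabet of 94 printable characters; the helper names are made up for the example.

```python
def search_space(alphabet_size, length):
    """Number of distinct strings of exactly `length` characters."""
    return alphabet_size ** length


def decimal_digits(n):
    """How many decimal digits a positive integer has."""
    return len(str(n))


print(search_space(94, 1))                      # 94
print(decimal_digits(search_space(94, 8)))      # 16
print(decimal_digits(search_space(94, 1000)))   # 1974
```

A number with 1974 digits is on the order of 10^1973, so even an 8-character string (a 16-digit count) is tiny by comparison.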

Next, let's look at a rainbow table attack against such passwords. As we now understand, a rainbow table precomputes hash chains. The goal is to precompute a data structure that, given any output h of the hash function, can either locate the p in P such that H(p) = h, or determine that there is no such p in P. Since all our passwords are salted uniquely, the attacker has to generate a separate rainbow table for each salt, multiplying the effort required. Furthermore, given the size of the SALT, such an attack is infeasible: because of the sizable investment in computing power, rainbow tables beyond fourteen places in length are not yet common. So choosing a password that is longer than fourteen characters, or that contains non-alphanumeric symbols, may force an attacker to resort to brute-force methods.

However, OUR TABLE WAS COMPROMISED! This doesn't mean our system is compromised as well. It really depends on the SALT function you create, as mentioned in Better Hashing Password in PHP. A paranoid SALT implementation might meet the following criteria:

  • A SALT more than 64 bits long, where 64 bits is the minimum SALT length required to be secure as specified in PKCS #5
  • Randomly populated ASCII value for each SALT character
  • Validate the SALT to ensure it contains symbols, uppercase, lowercase and numbers
  • BASE64_encode it before storing into database (decode it when used)
  • Use other dynamic field in the table that might change by users as part of the SALT such as email address, telephone numbers, last login
  • Validate that no such password hash exist on the table.

Basically, the above criteria should be enough for a very solid SALT implementation that will not compromise your system even if your database is compromised. Let me explain the reason behind each criterion, in the same order.

  • A longer salt means more values to brute force and a bigger rainbow table is required.
  • This ensures that any kind of character can appear in the salt, generating a different hash for each password.
  • This is just a validation to ensure every kind of character is actually being used in the salt.
  • BASE64-encode the salt value before it goes into the database, so that a careless hacker will use the stored value directly.
  • This is the criterion that makes the system secure. Instead of doing hash(password + salt), we use data from this table or another table and hash it this way: hash(password + salt + username + secret answer). Take note that the variables in that example are all static data (they don't change). The hacker will not know the sequence and data used to generate the password hash unless the SALT function itself is also compromised, which is possible (developers can be hackers). The criterion above, however, says dynamic data. This means the SALT function generates a new SALT or password hash and updates the table whenever the dynamic data changes. Since the last login time only changes upon login, we can use it and generate the hash in this sequence: hash(password + salt + last login time). During user validation we use hash(password + salt + stored last login time), and upon a successful login we write the current time and the newly generated password hash into the database. Similarly, the SALT itself can be regenerated on every login, so a new salt is used for every single login request. Now, if a hacker compromises your database, unless they can crack it before every user logs in again, this is pretty solid I would say.
  • In case of a collision, always validate that the hash value cannot be found within the table before inserting it.
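The dynamic-data idea, hash(password + salt + last login time) refreshed on every login, could look roughly like this in Python. It is only a sketch under assumptions: an in-memory dict stands in for the database row, SHA-256 is the hash, and the function names are hypothetical.

```python
import hashlib
import os


def make_hash(password, salt, last_login):
    """hash(password + salt + last login time)."""
    data = password.encode("utf-8") + salt + str(last_login).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def register(password, now):
    """Create the row that would be stored for a new user."""
    salt = os.urandom(16)
    return {"salt": salt, "last_login": now,
            "hash": make_hash(password, salt, now)}


def login(user, password, now):
    """Validate against the stored last login time, then rotate the
    salt, the last login time and the stored hash."""
    expected = make_hash(password, user["salt"], user["last_login"])
    if expected != user["hash"]:
        return False
    user["salt"] = os.urandom(16)  # new salt for every login
    user["last_login"] = now
    user["hash"] = make_hash(password, user["salt"], now)
    return True


user = register("secret", now=1000)
old_hash = user["hash"]
print(login(user, "secret", now=2000))  # True
print(user["hash"] != old_hash)         # True: the stored hash was rotated
```

The plain-text password is only available at login time, which is why the rotation has to happen inside the login routine.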

Like I said, this is a totally paranoid solution that some of you might like to have. But a basic hash(password + salt + username + secret answer) will be quite enough to protect your system, provided you follow the PKCS #5 way of constructing your salt. You may want to visit PHP Secure Login Tips And Tricks for login security.

Iterative Hashing Technique

One way of enhancing your hash function is through iterative hashing. This technique takes the output of a particular hash function and rehashes it many times (around 1000-10,000), making it incredibly hard to crack. In cryptography this is also known as key stretching or key strengthening. Assuming a hacker manages to crack a hash value within 8 hours, rehashing it that many times might change this from 8 hours to 8000-80,000 hours per password. You can imagine how an attack fares against such a technique, where every cracked hash just reveals another hash to be cracked. It's like opening a birthday present and finding a new box to open again (makes me cry). Now, we all know that the SHA family of hash functions is considered fast. The SHA-2 family might run somewhat slower than SHA-1, but the difference is not big.

What most of us will worry about is the time required for that many rounds of hashing each time a user logs in or creates an account. This depends on how secure your system needs to be. You don't expect a bank web portal to compromise security for speed, do you? (hacker: HELL YEAH! SPEED!! SPEED!!) Hence, you may want to tune this according to your needs and use at least 1000 iterations, as something between 2 and 100 iterations is equivalent to nothing: such hashes might already exist in a rainbow table. On the other hand, according to Wiki, the slowest personal computers in use today (2009) can do about 65000 SHA-1 hashes in one second using compiled code. Thus a program that uses key strengthening can use 65000 rounds of hashes and delay the user for at most one second. The standard mentioned in PKCS #5 is around 1000 and above. And in case you have never heard of PKCS: in cryptography, PKCS refers to a group of Public Key Cryptography Standards devised and published by RSA Security.

A SALT can be appended in each iteration to make your iteration algorithm more secure, but it really depends on how you are going to design this.
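Here is a minimal Python sketch of iterative hashing with the salt mixed into every round, assuming SHA-256 (the function name stretch is made up). For comparison it also calls hashlib.pbkdf2_hmac, the standard library's implementation of the PBKDF2 scheme defined in PKCS #5; the two constructions are different and do not produce the same output.

```python
import hashlib


def stretch(password, salt, rounds):
    """Hand-rolled key stretching: hash, then keep rehashing the
    previous digest together with the salt."""
    digest = hashlib.sha256(salt + password).digest()
    for _ in range(rounds - 1):
        digest = hashlib.sha256(salt + digest).digest()
    return digest


def stretch_pbkdf2(password, salt, rounds):
    """The same idea using PBKDF2 (PKCS #5) from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, rounds)


salt = b"per-user-salt"
print(stretch(b"secret", salt, 1000).hex())
print(stretch_pbkdf2(b"secret", salt, 1000).hex())
```

The rounds value is the tuning knob discussed above: the attacker pays that many hash computations per guess, while a legitimate login pays it only once.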

Is Double Hashing Better?

Double hashing here is different from iterative hashing. Double hashing uses two different hash functions instead of one. The term is confusing because some people also use "double hashing" to mean iterative hashing. An example is


sha1(md5($pass))

Here your system is at least as weak as the weakest of the hash algorithms you are using (md5). It is true that both iterative and double hashing reduce the search space, but with iterative hashing that reduction is insignificant compared to what you gain. Double hashing, on the other hand, is bad because it weakens the system: the hacker only needs to break the weaker hash function. Therefore, try to avoid double hashing and go for iterative hashing instead. Furthermore, hashing just two times with the same algorithm is considered suboptimal.

Conclusion

Enhancing the security of hash functions is a pretty interesting topic. Understanding the strengths and weaknesses of the hash function you use is necessary to better protect your web system. The hash function is the one method most of us web developers rely on to secure our systems. Using a hash function without fully understanding it is like being a man on a battlefield with a gun but with little to no understanding of its capabilities and weaknesses. And that's bad. (not that gun which you bring along with you to the toilet)

Using WordPress dbDelta Function

Many of us who develop WordPress plugins might have come across the dbDelta function. dbDelta is usually used when you wish to create a table for your WordPress plugin. However, this function might not be that easy to deal with, since it is not an officially documented function in WordPress. Nonetheless, it is a powerful function that the majority of us would want to utilize. In this article, we will talk about the dbDelta function and how we can ensure that it performs what it is made to do.

dbDelta Function

Like I mentioned before in one of my articles, the dbDelta function has the ability to examine the current table structure, compare it to the desired table structure, and either add or modify the table as necessary, so it can be very handy for updates of our plugin. However, unlike many WordPress functions, dbDelta is the most picky and troublesome one. In order for dbDelta to work, a few criteria have to be met.

  1. You have to put each field on its own line in your SQL statement.
  2. You have to have two spaces between the words PRIMARY KEY and the definition of your primary key.
  3. You must use the key word KEY rather than its synonym INDEX and you must include at least one KEY.

Well, the above criteria seem easy to achieve. But wait till it hits you.

Strengths and Weaknesses of dbDelta Function

The strength of this function is that we are assured any modification to our table structure will be applied on the user's side. Hence, we won't have to worry about our users' tables not being updated whenever we change our table structure to accommodate new features. This function, which was built by the WordPress community, is definitely much more secure than a function that an individual came up with to solve the same problem, so the function itself is much more reliable. Using dbDelta also removes the need to execute each individual instruction separately. The statements can be combined and handed to dbDelta in one go for it to run.

On the other hand, like I mentioned earlier, this can be a real pain in the ass. The dbDelta function is not very tolerant of mistakes, so any mistake in your SQL query might just make it fail. Furthermore, certain restrictions apply when you use this function, and if you accidentally break one of them, the function will fail. Moreover, no documentation is provided for this function, which makes it much more time consuming to get the hang of it. In case you haven't noticed, dbDelta will only add or update fields and keys. This means that if you decide to remove a particular field or key from your table and hope dbDelta will help you out with it, you are wrong. And if dbDelta fails to work for you, debugging it can be just as much of a headache, since printing out messages around dbDelta might not work well for you. To make things worse, WordPress will mark an error on your plugin if you try to exit(0) in some part of the script, instead of stopping and displaying the printed message.

Using dbDelta function

Initially, using the dbDelta function isn't that bad. We just have to be very careful with the spacing. An example given by WordPress is shown below,

$sql = "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id)
	);";

require_once(ABSPATH . 'wp-admin/includes/upgrade.php');
dbDelta($sql);

Well, if you copy this directly and change the field names, it should work nicely for you. But take note of the spacing. Here are a few examples that will cause the dbDelta function to fail.

$sql = "CREATE  TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id)
	);";

require_once(ABSPATH . 'wp-admin/includes/upgrade.php');
dbDelta($sql);

The above contains an extra space between CREATE and TABLE. Instead of one space we have two, and dbDelta fails. The same thing might happen if there is an extra space between TABLE and your table name. Another good example occurs at the key level.

$sql = "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id, time)
	);";

require_once(ABSPATH . 'wp-admin/includes/upgrade.php');
dbDelta($sql);

The above fails because of this line:

UNIQUE KEY id (id, time)

The correct way to write it is this:

UNIQUE KEY id (id,time)

where there is no space after the comma. Likewise, try to avoid putting spaces around the commas that separate the fields, such as these

$sql = "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT , 
	  time bigint(11) DEFAULT '0' NOT NULL , 
	  name tinytext NOT NULL , 
	  text text NOT NULL , 
	  url VARCHAR(55) NOT NULL , 
	  UNIQUE KEY id (id,time)
	);";

require_once(ABSPATH . 'wp-admin/includes/upgrade.php');
dbDelta($sql);

It is always safest to ensure that all keywords are separated by exactly one space and that there is no spacing around the commas. Another thing to take note of is that every table creation must have a KEY in order for it to work. And, as the criteria state, each field should be on its own line, like in the WordPress example. To be safe, there should also be a semicolon at the end of each instruction!

The above are the things you should be cautious about when using dbDelta. However, I did learn some tricks when reading the code of the dbDelta function. If you are creating multiple tables or queries with dbDelta, it can be done in one call.

// Build all three statements into one string, then call dbDelta once.
$table_name = 'test1';
$sql = "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id,time)
	);";
$table_name = 'test2';
$sql .= "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id,time)
	);";
$table_name = 'test3';
// Append (.=) so the first two statements are not overwritten.
$sql .= "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id,time)
	);";
require_once(ABSPATH . 'wp-admin/includes/upgrade.php');
dbDelta($sql);

The above is similar to having one instruction per dbDelta such as this:

require_once(ABSPATH . 'wp-admin/includes/upgrade.php');
// One dbDelta call per statement; $sql is reassigned (=) each time.
$table_name = 'test1';
$sql = "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id,time)
	);";
dbDelta($sql);
$table_name = 'test2';
$sql = "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id,time)
	);";
dbDelta($sql);
$table_name = 'test3';
$sql = "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id,time)
	);";
dbDelta($sql);

Hence, you might want to use the first approach to make your code run more efficiently. Another interesting thing to take note of is that the very last instruction is not required to end with a semicolon. Hence,

$table_name = 'test2';
$sql = "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id,time)
	);";
$table_name = 'test3';
// Last statement: appended with .= and left without a trailing semicolon.
$sql .= "CREATE TABLE " . $table_name . " (
	  id mediumint(9) NOT NULL AUTO_INCREMENT,
	  time bigint(11) DEFAULT '0' NOT NULL,
	  name tinytext NOT NULL,
	  text text NOT NULL,
	  url VARCHAR(55) NOT NULL,
	  UNIQUE KEY id (id,time)
	)";
require_once(ABSPATH . 'wp-admin/includes/upgrade.php');
dbDelta($sql);

will work. This means that if you have only one SQL instruction for dbDelta to run, you can safely remove the semicolon. But if you have many SQL instructions, only the last one can omit the semicolon, since dbDelta uses the semicolon as the delimiter for splitting the instructions and removes the last array element if it is empty. Hence, leaving out the last semicolon saves dbDelta that one step.
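The splitting behaviour described above can be modelled in a few lines. This is a rough Python model of the description only, not WordPress's actual code, and split_queries is a made-up name.

```python
def split_queries(sql):
    """Split a batch of statements on ';' and drop the empty last
    element that a trailing semicolon leaves behind."""
    parts = sql.split(";")
    if parts and parts[-1].strip() == "":
        parts.pop()  # the extra step a trailing semicolon causes
    return [part.strip() for part in parts]


with_semicolon = "CREATE TABLE a (id int);CREATE TABLE b (id int);"
without_semicolon = "CREATE TABLE a (id int);CREATE TABLE b (id int)"

print(split_queries(with_semicolon))
print(split_queries(with_semicolon) == split_queries(without_semicolon))  # True
```

Both inputs end up as the same two statements, which is why the final semicolon is optional while the ones in between are not.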

Conclusion

dbDelta in WordPress can be really useful. We just need to be careful not to make those silly mistakes that cost us precious time debugging. Although it can't remove fields for us, it definitely helps us save a lot of time by adding new fields!