Simple Html Dom is a DOM manipulation library for PHP, and one of the easiest and most powerful ones available. In fact, you can even use it to build your own web crawler, as I have done. However, the Simple Html Dom library isn't perfect. Although you can do almost everything with it without a problem, the biggest source of trouble in a complex program is the sheer variety of URLs it has to handle. The possible combinations are endless, and some of them cause Simple Html Dom to fail.
I ran into this with my PHP crawler: a fatal error kept stopping it. The error was always along the lines of "call to a member function on a non-object at....", yet when I looked at the URL being processed at the time, it was perfectly fine. In PHP, we cannot really stop a fatal error without some black magic that doesn't always work. As many people would say, prevention is better than cure. So checking whether the variable is an object before proceeding should fix the problem. If you think like most people out there, this is probably what you would write, betting that it will fix your problem:
    $html = file_get_html($url);
    if (is_object($html)) {
        foreach ($html->find('img') as $img) {
            // bla bla bla..
        }
    }
Well, the above might work in some cases, but not all. When file_get_html fails, it returns false, whether the server on the other side answered with a 404 or a 500. Hence, you might try this as well:
    $html = file_get_html($url);
    if ($html) {
        foreach ($html->find('img') as $img) {
            // bla bla bla..
        }
    }
But if that still doesn't solve your problem (and it really can't take care of every case), you might turn to the following:
    $html = file_get_html($url);
    if ($html && is_object($html)) {
        foreach ($html->find('img') as $img) {
            // bla bla bla..
        }
    }
Well, if this still doesn't work and you are stuck, you can count yourself lucky to have landed on this blog of mine.
Simple_Html_Dom Fatal Error Solution
The solution is actually quite simple, just not obvious. I have tested it against a few hundred thousand different URLs, so I am confident it gets rid of the "call to a member function on a non-object" fatal error, especially the one raised when execution reaches "find". Building on the example above, we only have to handle one more case: the load doesn't fail and the simple html dom class does create the object, but the object doesn't contain any nodes! For that case, you should write something like the following:
    $html = file_get_html($url);
    if ($html && is_object($html) && isset($html->nodes)) {
        foreach ($html->find('img') as $img) {
            // bla bla bla..
        }
    }
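If the same three checks are needed in several places, they could be wrapped in a small helper. This is only a sketch: is_usable_dom is a made-up name, not part of Simple Html Dom, and the stdClass objects below are stand-ins so the snippet runs without the library or any network access.

```php
<?php
// Hypothetical helper (not part of Simple Html Dom): returns true only when
// the value is truthy, is an object, and has a non-null "nodes" property.
function is_usable_dom($html)
{
    return $html && is_object($html) && isset($html->nodes);
}

// Stand-ins for what a loader might hand back (assumptions for illustration).
$failed = false;                  // the load failed
$empty  = new stdClass();         // an object without a "nodes" property
$parsed = new stdClass();         // an object that does have nodes
$parsed->nodes = array('root');

var_dump(is_usable_dom($failed)); // bool(false)
var_dump(is_usable_dom($empty));  // bool(false)
var_dump(is_usable_dom($parsed)); // bool(true)
```

In a real crawler you would pass the result of file_get_html($url) to the helper and only call find when it returns true.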
In the example above I test for the conditions being true, whereas my real program tests for them being false and skips the URL when any check fails. It should work just the same. Testing this took me a few days, because the bot had to run for many hours before it banged into the fatal error again. That is a real waste of time, but the solution is rewarding. I hope this helps some fellow out there 🙂
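For reference, the "is false" form of the check might look roughly like the sketch below. It is not my actual crawler code: collect_image_sources and the $loader parameter are hypothetical names, introduced so the loading function can be swapped out. With the real library you would pass 'file_get_html' as the loader.

```php
<?php
// Hypothetical sketch of the negated checks: skip a URL as soon as any
// check fails, so find() is only called on an object that has nodes.
function collect_image_sources(array $urls, $loader)
{
    $sources = array();
    foreach ($urls as $url) {
        $html = call_user_func($loader, $url);

        // Same three conditions as before, tested for failure instead.
        if (!$html || !is_object($html) || !isset($html->nodes)) {
            continue; // nothing usable for this URL, move on to the next one
        }

        foreach ($html->find('img') as $img) {
            $sources[] = $img->src;
        }
    }
    return $sources;
}

// With Simple Html Dom loaded, usage would be along these lines:
// $sources = collect_image_sources($urls, 'file_get_html');
```

Skipping early keeps the loop body flat, and a bad URL costs one skipped iteration instead of stopping the whole bot.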