Simple HTML DOM is a DOM manipulation library for PHP, and one of the easiest and most powerful ones available. You can even use it to build your own web crawler, as I have done. However, the library isn't perfect. Although you can do almost everything with it without trouble, a complex program such as a crawler has to cope with an endless variety of URLs, and some of them will cause Simple HTML DOM to fail.
I ran into this problem when a fatal error kept stopping my PHP crawler, which uses Simple HTML DOM. The error was always along the lines of "call to a member function on a non-object at....", yet when I looked at the URL being processed, it was perfectly fine. In PHP, we cannot really recover from a fatal error without some black magic that does not always work. As many people say, prevention is better than cure. Hence, checking whether the variable is an object before proceeding should fix this problem. If you think like most people out there, this is probably what you would have written, betting that it would fix your problem:
$html = file_get_html($url);
if (is_object($html)) {
    foreach ($html->find('img') as $img) {
        // bla bla bla..
    }
}
Well, the above might work in some cases but not all. When file_get_html fails, it returns false, regardless of whether the remote server answered with a 404 or a 500. Hence, you may try this as well:
$html = file_get_html($url);
if ($html) {
    foreach ($html->find('img') as $img) {
        // bla bla bla..
    }
}
But if that still doesn't solve your problem (and it really can't take care of every case), you might turn to the following:
$html = file_get_html($url);
if ($html && is_object($html)) {
    foreach ($html->find('img') as $img) {
        // bla bla bla..
    }
}
Well, if this still doesn't work and you are stuck, you may be in luck that you found this blog of mine.
Simple_Html_Dom Fatal Error Solution
The solution is actually quite simple, though not obvious. I have tested it against a few hundred thousand different URLs, so I am confident that it gets rid of the "call to a member function on a non-object" fatal error, especially when it occurs at "find". Using the example above, we just have to handle one more case: the load did not fail and the object was created by the Simple HTML DOM class, but it doesn't contain any nodes! In this case, you should write something like the following:
$html = file_get_html($url);
if ($html && is_object($html) && isset($html->nodes)) {
    foreach ($html->find('img') as $img) {
        // bla bla bla..
    }
}
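If the same guard is needed in many places in a crawler, the three checks can be bundled into one small function. The following is only a sketch: `is_usable_dom()` is a hypothetical name, not something Simple HTML DOM provides, and it relies on nothing more than the `nodes` property already used in the condition above.

```php
<?php
/**
 * Hypothetical helper (not part of Simple HTML DOM).
 * Returns true only when $html passes the three checks from the post.
 */
function is_usable_dom($html): bool
{
    return $html                    // not false or null, so the load did not fail
        && is_object($html)         // really an object, not a string or an array
        && isset($html->nodes);     // the nodes property exists and is not null
}

// Usage, assuming simple_html_dom.php has been included and $url is set:
// $html = file_get_html($url);
// if (is_usable_dom($html)) {
//     foreach ($html->find('img') as $img) {
//         // work with $img
//     }
// }
```

Keeping the checks in one place means a later change to the guard only has to be made once.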
In the above example I wrote every condition in its positive (true) form, whereas my real program uses the negated (false) form, but both should work. Testing took a few days, because the bot had to run for many hours before it hit a fatal error. That is a real waste of time, but the solution is rewarding. I hope this helps someone out there :)
Awesome post! I've spent hours trying to resolve this. However, would it make more sense to use "||" (or) instead of "&&" (and)? Wouldn't you want your script to bail out if any of those conditions failed?
Because the condition was isset($html) instead of !isset($html). With !isset($html), everything would have to be joined with "or" instead. My program uses the "or" form, but in the example up there I just used the opposite :)
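The two forms are equivalent by De Morgan's law: negating an "and" of conditions gives an "or" of the negated conditions. A minimal sketch of the bail-out style is shown here; `should_skip()` is a hypothetical name used only for illustration and is not part of the library.

```php
<?php
/**
 * Hypothetical helper: the negated ("bail out") form of the guard.
 * !(A && B && C) is the same as (!A || !B || !C).
 */
function should_skip($html): bool
{
    return !$html || !is_object($html) || !isset($html->nodes);
}

// Usage inside a crawl loop, assuming simple_html_dom.php has been included:
// $html = file_get_html($url);
// if (should_skip($html)) {
//     continue; // move on to the next URL
// }
// foreach ($html->find('img') as $img) {
//     // work with $img
// }
```

With the negated form the script leaves early as soon as any single check fails, which keeps the main loop body unindented.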
Your cure of checking that you received the $html works; however, I still get the fatal error when doing a find(). It seems that if the find() function fails to find what you are looking for, it throws a fatal error which stops the script from running. Huge pain... Is there a way to further check the $html to see whether, let's say, a certain table exists before calling the find() function?
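One possible cause of the error in the comment above is chaining. In Simple HTML DOM, find() with only a selector returns an array (empty when nothing matches), while find() with an index as the second argument returns a single element, or null when nothing matches. A chain such as `$html->find('table', 0)->find('tr')` therefore calls find() on null when the page has no table. Whether this is what the commenter hit is an assumption, and `find_rows_in_first_table()` below is a hypothetical helper written only to illustrate the check.

```php
<?php
/**
 * Hypothetical helper: returns the rows of the first table, or an empty
 * array when the document has no table. $html is assumed to be an object
 * loaded by Simple HTML DOM that already passed the checks from the post.
 */
function find_rows_in_first_table($html): array
{
    $table = $html->find('table', 0); // a single element, or null if no match
    if (!is_object($table)) {
        return array();               // no table, so there is nothing to chain on
    }
    return $table->find('tr');        // array of rows, possibly empty
}
```

The same pattern applies to any chained lookup: store the result of the indexed find() in a variable and test it before calling anything on it.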
Thank you SO MUCH
I have also spent way too long trying to figure this out. I had figured out the isset and is_object tests but had not gotten to the ->nodes. AWESOME :)
No problem Derek, glad it helps :)
Very helpful - been struggling with this for the past 48 hours. Managed to get the checking of $html and is_object in there, but the isset nodes thing must have been the key to it all. Thanks for doing this blog post man, it's people like you who make the internet invaluable!
You're awesome, man!
This is very helpful, I love you!
I'm facing this problem, please help?
"I still get the fatal error when doing a find(). It seems that if the find() function fails to find what you are looking for, it throws a fatal error which stops the script from running. Huge pain… Is there a way to further check the $html for let's say a certain table exists before calling the find() function?"