Dan Frost's blog

A problem with robots.txt & other minor issues
Posted by Dan Frost on Fri, 27/07/2007 - 15:09
We've noticed a bug when changing the setting for whether or not to use robots.txt. We're changing this so that the crawler uses robots.txt by default; the fix should be done on Monday. A couple of sites have also completed the crawl, only for all the crawl data to be cleared afterwards. If you have this problem please re-crawl. We're looking into it.

There's some interesting sites out there
Posted by Dan Frost on Wed, 25/07/2007 - 16:23
We've been watching sites go through the crawler and the first, most immediate issue is that our crawler isn't coping with malformed HTML as well as it might. There's a fix on the way and the crawler should be able to handle those issues shortly. We'll re-crawl and notify those users that had that problem.

Crawler now stable...
Posted by Dan Frost on Fri, 13/07/2007 - 17:29
It looks like we've cracked the crawler problem. We're running tests right now and all seems to be well - apart from our internet connection, which is extremely ropey. For the tests we're running the crawler from a supposedly "high capacity" ADSL line, so perhaps the ISP is freaking out.

Website live & crawler blows a fuse
Posted by Dan Frost on Tue, 10/07/2007 - 14:25
This website went live on 10th July, but the usually well-behaved crawler decided now was a good time to throw a wobbly. We've decided to make some fairly drastic changes (that we'd had planned anyway), so we're a little behind schedule, but we're going to roll with the punches.

Ok, so we've been busy
Posted by Dan Frost on Wed, 04/07/2007 - 10:12
Not a great start to this blog - nearly two months between blog posts, but we've been busy... honest. The idea was to post fairly regular progress updates here, but we've had our noses to the grindstone getting the software ready for launch. It's looking like we're going to go live next week as a beta with a few known issues. There's a huge list of other improvements, but the issues below are things we want to get sorted next week. They are: 1) Documents such as PDF, DOC, etc. are not being properly identified, so they appear in the duplicate/no titles reports incorrectly.

We love crawlscore!
Posted by Dan Frost on Thu, 17/05/2007 - 14:36
When we dreamt up crawlscore it sounded like a good idea; now that it's here and we've used it, it's a brilliant idea. Google Webmaster Tools is useful and webmasters should certainly use it, but we feel that it only tells part of the story.

crawlscore is go
Posted by Dan Frost on Thu, 17/05/2007 - 14:23
Hello world. Sorry to be so unoriginal, but I thought it was fitting for the first of what we hope will be many blog posts. You can read about what crawlscore is elsewhere on the site, but we hope to have a version live during June/July. We're in the classic catch-22 situation of trying to get as many features in there as possible but at the same time get it launched once and for all!