Tuesday, May 31, 2005

Did Google's Success Overwhelm DMOZ?

With the recent closing of the Open Directory Project's Status Check Forum, search engine experts like Danny Sullivan have suggested that perhaps it's time to change its name to the Closed Directory Project.

Transparency issues aside, the ODP (aka DMOZ) has gone through some rocky times. Ironically, its fate has been tied to Google's decision in 2000 to mirror the ODP's content on their directory.google.com subdomain.

Initially the Google version of the directory was the only way to view PageRank ™, until the introduction of the Google ToolBar a year later in 2001.

Part of the benefit of having a site listed in DMOZ was, plain and simple, the dozens of mirror sites across the web, including Google's own version, which sent inbound links to listed sites. In the quest for better rankings, a DMOZ listing was a definite edge.

However, as more and more promoters got wind of Google's Achilles' heel, things got out of hand. In what could be considered a five-year-long "slashdotting", DMOZ was flooded with Google-powered submissions, and both legitimate and "overly commercial" sites jammed the submission queues. In today's DMOZ, waits of 18-24 months from submission to listing are not uncommon in some categories, and unfortunately, "never listed" is more the rule than the exception.

In part, these enormous backlogs and the flood of "low quality" submissions are Google's fault. It was Google that emphasized the importance of DMOZ by making a mirror of the directory; it was Google that put a link to the directory above their search box; it was Google that added the Directory icon to their Toolbar advanced features options; it was Google that used ODP editor descriptions of sites in their search results.

It stands to reason that search engine promoters would notice this and do their best to get their sites listed, by hook or by crook. Although listing criteria have become increasingly strict, and rumors of corruption at DMOZ are rampant, the problems continue to grow.

This has led, in my opinion, to Google's gradual de-emphasis of the ODP on their site.

Last year, the directory search was relegated to the ignoble more >> tab. Google stopped using DMOZ editors' descriptions for sites in favor of the "ransom note" snippets, and the new Google 3.0 Toolbar doesn't have the directory icon as an option.

DMOZ detractors rejoiced on forums like WebmasterWorld. "Google hates the ODP now!", they said with tears of joy in their eyes.

However, I believe this was a doubleblind bluff by Google.

They've diminished the public visibility of the ODP on their site, and by doing so have tried to relieve some of the enormous pressure on the Open Directory Project. Yet at the same time they remain dependent on the human-edited results as a vital crosscheck of the quality of their own algorithmically generated results. Google cannot afford to ignore the valuable data that can be mined from a gargantuan directory that quickly outpaced Yahoo's directory, which had a 5 year head start.

Recently, in order to combat the "scraper site" phenomenon, Google has returned to using DMOZ descriptions in their SERPs, probably because the DMOZ descriptions are less keyword-laden than Google's machine-generated "ransom notes".

It is unfortunate that the minimal transparency afforded to DMOZ through Resource-Zone's site status forum is gone. In my opinion DMOZ remains a vital component in the success of a "white hat" site. This is not to say a site cannot make it without a listing - of course it can - but a DMOZ listing is a giant shortcut.

There is only one productive approach when attempting to get a DMOZ listing.

1) Carefully make sure that your site complies with all editorial guidelines.

2) Carefully choose the most appropriate category for your site, and submit a concise, accurate description.

3) Click submit, and never think about it again. If you get in, you're in; if not, there are plenty of other places to get links from.

Sunday, May 29, 2005

Google Update Bourbon

The Bourbon update started around May 20th, 2005, and 10 days later has yet to settle down. This officially makes it the second longest update since Florida, which lasted almost 20 days.

What has the Bourbon filter/algo adjustment gone after? So far it looks like:

  • Non-thematic Linking

  • Duplicate Content*

  • Fraternal Linking

  • Run of Site Links

  • Low Quality Reciprocal Links


* Bourbon seems to have screwed up Google's duplicate content detector, and there are many examples of Google assigning "authority" to the duplicate page, and restricting the original page to the "similar results" filter.
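For anyone wondering how near-duplicate pages can be spotted at all, here is a minimal sketch using word shingles and Jaccard similarity, a common textbook approach. It is not Google's detector, whose workings aren't public; the function names, the 3-word shingle size and the 0.8 threshold are all assumptions chosen for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Too short for a full window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.8, k=3):
    """Hypothetical rule: flag two texts whose shingle sets overlap heavily."""
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold
```

Note that a similarity score like this only says two pages match; it says nothing about which copy is the original, which is exactly the part Bourbon appears to be getting wrong.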

The victims of the Bourbon Update are in panic mode - some hypothesize that Google misidentified their site as a "scraper site" due to their AdSense placement and penalized it; others are going as far as to speculate that Google is somehow detecting their AdSense ads on scraper sites and treating them as links from "bad neighborhoods". That is patently incorrect, as none of the search engine bots will actually trigger the JavaScript that loads the ads.

However, the few sites filtered by Bourbon that I've looked at have a number of things in common - basically 2 or more of the factors listed above.

Lesson to be learned? Don't reciprocal link with any old site - viagra-casino-porn.biz is not a good link exchange partner for a florist's website.

Think like Google - what "signals of quality" does your site have? How can you increase and expand those signals? What can Google do to automatically detect both high-quality and low-quality signals?

Google is trying to present their users with the best search results - concentrate on what criteria, apart from the basic on-page factors, they are using to judge the quality of a given page. Google wants quality results - if your site isn't ranking well in the post-update aftermath of Bourbon, reassess your strategy and look into what factors Google's algo might use to rank sites now, and in the future.

Saturday, May 28, 2005

Google PageRank Issues

Yesterday afternoon, Google's cached pages servers went down. Shortly thereafter, all pages stopped displaying PageRank ™ - which, according to Google, is "the heart of our software".

Dozens of worried webmasters, fearful that their "green fix" was gone and would never come back, posted on forums like WebmasterWorld, to express their anxiety at not seeing Page Rank on their indicator bar in a virtual equivalent of heroin withdrawal.

But what is PageRank? Many experienced promoters have been stressing that the PageRank meter on the Google Toolbar is wildly inaccurate at best and essentially useless at worst. What they really mean is that PageRank used to trump other factors, that it used to be one of the most important factors of the ranking algorithm. This eventually changed, as PageRank was abused, bought, swapped and sold like any other commodity, and Google eventually shifted the emphasis to anchor text.
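As a rough illustration of the idea behind the number, here is a minimal sketch of the simplified PageRank calculation as it is usually described publicly: every page splits its score among the pages it links to, and the process repeats until the scores settle. This is not what Google runs in production, and it is not the 0-10 figure the Toolbar shows; the `pagerank` function, the 0.85 damping value and the 50 iterations are assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    if not pages:
        return {}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

In a toy graph such as `{"a": ["c"], "b": ["c"], "c": ["a"]}`, page `c` ends up with the highest score because two pages link to it, while `b`, which nobody links to, keeps only the baseline. That is also why links became a commodity: every inbound link passes along a share of score.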

Of course, like Page Rank, anchor text was quickly abused, as evidenced by the massive flood of guestbook and blog spamming that created thousands of anchor text links to affiliate sites, allowing them to rank for incredibly competitive terms within 48 hours.

Now it seems that Google is attempting to fight anchor text spamming with their "sandbox" algorithm, combined with algo adjustments like Google Update Bourbon, which appear to have shifted the power of anchor text so that it counts only when it comes from on-theme pages and on-theme sites.

What this means is that a real estate site will benefit more from a link from another real estate site than a Britney Spears fan page.

What does this all have to do with Page Rank disappearing? I think that PageRank is just temporarily offline - perhaps they are going to implement an uncrackable encoding system to defeat the myriad unofficial PageRank tools and sites. If not, Google is going to have to do a lot of updating:

Page Rank Display
Wondering whether a new website is worth your time? Use the Toolbar’s PageRank ™ display to tell you how Google's algorithms assess the importance of the page you're viewing.


As Brett Tabke, founder of WebmasterWorld said, "If they were going to [get rid of PageRank], the feature would just be removed from the toolbar."

Friday, May 27, 2005

Google Dupes Its Directory Content

The Google version of the Open Directory - DMOZ.org - has always "lived" at directory.google.com, which is why I was surprised to notice today that Google has in fact duplicated its directory contents via the WWW subdomain:

A Google search for site:directory.google.com yields approximately 1,300,000 results, while site:www.google.com/Top/ pulls up a respectable 995,000 pages.

Google Directory with duplicate content


Recent speculation has suggested that Google is trying to nullify PR transfer from directories - but could the answer be as simple as a duplicate content penalty?

Thursday, May 26, 2005

Star Wars Torrents - With Google Ads

The new BitTorrent Search is live and is showing Google contextual advertising syndicated through Ask Jeeves.
Star Wars Torrents with Adwords Advertisers

As I predicted in an earlier post, Jeeves Embraces Piracy To Expand Reach, a search for "star wars" pulls up over 100 different download options for the latest Star Wars film, Revenge of the Sith.

Advertisers for this term include the Washington Post. I believe that this site is in direct violation of the Adsense publisher program policies, namely:

Site may not include:...
- Hacking/cracking content
- Any other content that promotes illegal activity or infringes on the legal rights of others

Violating copyright certainly covers the second point. In case you haven't checked, the Bit Torrent searches will also find tons of cracked software.

Wednesday, May 25, 2005

Jeeves Embraces Piracy To Expand Reach

According to The Street, Ask has announced plans to provide contextual advertising for the upcoming Bit Torrent search engine.

Currently Jeeves is syndicating Ads from Google on its properties - and it is assumed that they will start showing Adwords advertisers on these searches.

Imagine, sites like starwars.com can have their ads shown contextually right next to the link to download the pirated version of Episode III! Way to go Jeeves!

Discussion on Webmaster World

Attribution, Hotlinking and Courtesy

When I broke the story about the adsense URL hijacking, I posted the news in a number of prominent forums in order to get the word out.

This led to additional stories on several prominent blogs. Every one of them credited me as the source for the original story... with the exception of one - Mike's List (I am not linking to it).

Mike Elgan, proprietor of mikeslist.com has not only omitted the fact that I broke the story - he hotlinked *my* screen cap of the SERP.

The picture you see here is none other than Mr. Elgan himself - I hotlinked his pic. Mr. Elgan, according to elgan.com, is a professional journalist, yet he omitted proper credit for the story.

Thanks for leeching my bandwidth Mike, and thanks for teaching me that even this little blog isn't safe from content stealing scumbags.

I have a mod_rewrite .htaccess rule in place for future incidents.
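The rule itself isn't reproduced here, but the logic that hotlink protection usually encodes is simple: look at the Referer header on each image request and refuse the ones coming from pages on other sites. Below is a minimal sketch of that check written as Python rather than as rewrite directives; the `is_hotlink` function and the `example.com` host names are hypothetical, and allowing an empty Referer is an assumed policy choice.

```python
from urllib.parse import urlparse

# Hypothetical list of hosts that are allowed to embed the images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True when an image request appears to come from another site."""
    if not referer:
        # No Referer at all: direct visits and some proxies send nothing,
        # so this sketch lets those through rather than block real readers.
        return False
    host = urlparse(referer).hostname
    if host is None:
        # Unparseable Referer: treated the same as a missing one.
        return False
    return host not in allowed_hosts
```

A request flagged this way would typically get a 403 or a substitute image instead of the real file. The Referer header is trivially forged, so this stops casual leeching rather than a determined copier.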

Monday, May 23, 2005

Google's Bourbon Hangover

A funny thing happened when I did a regular search for Adsense on Google so I could check my channel tracking: I ended up in the 72.14.207.104 data center, where I saw this:

Google SERP Update: Bourbon


When has a Google page ever been beaten by another site, let alone by someone who is redirecting to the regular Google Adsense page with a meta refresh?
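For anyone who wants to check whether a page they found in the SERPs is doing this kind of redirect, here is a minimal sketch that pulls the target out of a meta refresh tag using only Python's standard library. The `MetaRefreshFinder` class and `meta_refresh_target` helper are names made up for this example, and the parsing only covers the common `content="0; url=..."` form.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Records the URL of the first <meta http-equiv="refresh"> tag seen."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/".
        _, _, rest = attr.get("content", "").partition(";")
        key, _, url = rest.strip().partition("=")
        if key.strip().lower() == "url":
            self.target = url.strip().strip("'\"")


def meta_refresh_target(html):
    """Return the meta refresh destination found in html, or None."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.target
```

Feeding it the source of a suspect page returns the address the visitor is bounced to, or `None` if the page has no such tag. It will not catch redirects done in JavaScript or at the HTTP level.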

Using this bug, someone could be #1 in Google for Yahoo, #1 for, well, pretty much anything with an affiliate program :) .

Fluke? Bug? Quirk? You decide, but if this guy can beat Google for Adsense, there's a big f'ing hole in their ranking algo. Enjoy.

Googleguy has publicly stated on several occasions that the hijacking problem does not exist; however, this site could now redirect every adsense search to any page they want - they *own* the term "adsense" in Google.

It isn't clear if this is a side effect of the Bourbon Update, or just a symptom of Google's ongoing troubles with redirects.

Sunday, May 22, 2005

Google Update Bourbon

WebmasterWorld has named the latest Google update "Bourbon".

To be honest, though several datacenters are showing different results, I have yet to see them propagate into the regular google.com results.

What does this mean? Hard to guess. Usually when changes are in the data centers but not in the regular results, the changes in the DCs fluctuate as Google tweaks the results. But in the few SERPs I checked, they've remained the same in the data centers for several days without appearing in the regular search results.

However, several posters on WebmasterWorld are claiming "75% drops" in Google traffic and the like - so it could just be that my "pet" keywords haven't been affected.

Though the general mood seems to be one of panic, I'd counsel patience - even the "Florida" update went through a number of iterations (nearly 3 weeks) before they got it right.

Friday, May 20, 2005

Personalized Google Home Page Launched

During the recent Google Factory Tour webcast, after I rushed out to check if Google Earth domain names were available, I started playing with Google's "IG" - I cannot call it "My Google" - the url is http://www.google.com/ig/ - IG stands for "interface graft" perhaps? What a boring, uninspired, Frankensteinian approach to personalization - slapping a few headlines below the fold of the "normal" Google search box doesn't cut it.

During the webcast Marissa Mayer was literally waiting for the thumbs up from the dev team as she was presenting it: "The Google personalized interface... which.... is live... as of..... Now!... No... Okay... Now!"

What are my feed choices - BBC, NYT, Slashdot, Wired and Google News? Big f'ing deal. Long way to go before it will compete with *my* "my yahoo" page - which I've had, incidentally, for 7 years.

The Google "interface graft" project reeks of last minutism. Honestly, in my opinion, it isn't even good enough for Labs, but fear not, Marissa promised that a googol of other features for it are coming "real soon now".

Sunday, May 15, 2005

Google Web Accelerator

Earlier this month, Google introduced the Google Web Accelerator, a tool that speeds up the user's web surfing experience by:

  • Sending page requests through Google servers dedicated to handling Google Web Accelerator traffic.

  • Storing copies of frequently looked at pages to make them quickly accessible.

  • Downloading only the updates if a web page has changed slightly since you last viewed it.

  • Prefetching certain pages onto your computer in advance.

  • Managing your Internet connection to reduce delays.

  • Compressing data before sending it to your computer.


By default, the web accelerator is supposed to automatically check for new pages, and not show the stored copies - however, I've found that not to be the case.

For instance, yesterday, I was making changes in the website I keep in my webmaster world profile, www.patrick.com.mx, and I kept uploading my CSS file and refreshing the page to see changes, and I wouldn't see them. I kept uploading and refreshing, until I finally remembered that the Web Accelerator was active. As soon as I stopped it, of course, the changes were shown instantly.

Now, strangely enough, if I visit the site today, with the accelerator on, it shows the cached version of the site, not the new template. If I refresh, the new version will snap into place.

Makes me wonder how many sites I've surfed are actually staler versions based on the Google cache.

A bug that has arisen is that the web accelerator has gotten confused as to whose session belongs to whom.

As reported on Threadwatch and elsewhere, forum users with WA suddenly found themselves in other people's accounts. This actually happened to me for an instant when I went to the Google home page, and it announced that I was signed in to Google's services as "pliebrand@imaxa.com". Just out of curiosity, I attempted to go to "my account", and it requested that I sign in. This was during the first few hours of the accelerator, and it hasn't happened again, so perhaps they've ironed these bugs out.

However, WA breaks geotargeting. I live in Mexico, and have several adwords campaigns that are restricted to US only - yet with the Accelerator on, suddenly I see my adwords in the serps. I assume it is doing the same for Adsense ads. Not very fair for advertisers.

Of course the much larger issue is what is Google gaining by sending all this traffic through their own servers?

Human usage metrics. Far more accurate than the toolbar. This tool literally keeps track of every move you make, even tells Google how their competitors' search results are performing.

Could the Web Accelerator be part of Google's TrustRank? Very likely. Google loves automation, and what better way to judge the value of sites ranking in their SERPs than having the usage data of thousands of users on a 24 hour basis?

Interesting times ahead.